If the user reads exactly the number of bytes in the response, then
drops the Read object, streams will never get returned to the pool since
the user never triggered a read past the end of the LimitedRead.
This fixes that by making PoolReturnRead aware of the level below it, so
it can ask if a stream is "done" without actually doing a read.
Also, a refactorign:
Previously, Response contained an Option<Box<Unit>> because the testing
method `from_str()` would construct a Response with no associated Unit.
However, this increased code complexity with no corresponding test
benefit. Instead, construct a fake Unit in from_str().
Also, instead of taking an `Option<Box<Unit>>`, PoolReturnRead now takes
a URL (to figure out host and port for the PoolKey) and an &Agent where
it will return the stream. This cuts interconnectedness somewhat:
PoolReturnRead doesn't need to know about Unit anymore.
3rd party TLS connections are currently required to provide a Sync
guarantee. Since ureq will only ever use the connection on a single
thread, this is not necessary.
The reason we end up with this bound is because we want Response and
Error to be Sync, in which case Rust's automatic inferral of Sync
fails.
This change "masks" the Stream in a wrapper making it Sync.
Close#474
Tests use `Response::as_write_vec` to inspect the outgoing HTTP/1.1
request line and headers. The current version has two problems:
1. Called `as_write_vec` when it actually returns a `&[u8]`.
2. Inspects/uses the `Response::stream` without consuming `Response`.
The first problem is trivial, but the second is subtle. Currently all
calls on `Response` that works with the internal `Response::stream`
consumes `self` (`into_string`, `into_reader`).
`Response` is by itself `Send + Sync`, and must be so because the
nested Stream is `Read + Write + Send + Sync`. However for
implementors of `TLSStream`, it would be nice to relax the `Sync`
requirement.
Assumption: If all fields in Response are `Sync` except
`Response::stream`, but any access to `stream` consumes `Response`, we
can consider the entire `Response` `Sync`.
This assumption can help us relax the `TlsStream` `Sync` requirement
in a later PR.
The newline becomes part of the response body, even though it was not passed in
as `body`, which makes `Response::new` difficult to use with anything that
checksums data contents for example. Arguably there should also be a mechanism
for passing in `[u8]` bodies, but that's for a separate feature request.
This prevents a malicious server from running the client out of memory
by returning a very large body.
Limit into_string() to 1 megabyte.
Limit individual header fields to 100 kilobytes.
Limit number of header fields to 100.
Part of #267
Idiomatic rust organizes unit tests into `mod tests { }` blocks
using conditional compilation `[cfg(test)]` to decide whether to
compile the code in that block.
This commit moves "bare" test functions into such blocks, and puts
the block at the bottom of respective file.
The HTTP spec allows for non-ascii values both in the status line and
in headers. ureq does not handle that, we can however provide better
context for when it happens.
The spec says the reason phrase at least must be a space. However in
the wild, there are sites that just ends after the status code.
To be more compatible, this commit relaxes ureq's parsing.
Close#316
Status code must be exactly three digits.
HTTP version must be "HTTP/", digit, dot, digit.
https://tools.ietf.org/html/rfc7230#section-3.1.2
status-line = HTTP-version SP status-code SP reason-phrase CRLF
This allows Error to report both the URL that caused an error, and the
original URL that was requested.
Change unit::connect to use the Response history for tracking number of
redirects, instead of passing the count as a separate parameter.
Incidentally, move handling of the `stream` fully inside `Response`.
Instead of `do_from_read` + `set_stream`, we now have `do_from_stream`,
which takes ownership of the stream and keeps it. We also have
`do_from_request`, which does all of `do_from_stream`, but also sets the
`previous` field.
Now that Responses with non-2xx statuses get turned into `Error`,
there is less need for these. Also, surveying the set of public crates
that depend on ureq, none of them use these methods. It seems that
users tend to prefer checking the status code directly.
Here is my thinking on each of these individually:
.ok() -- With the new Result API, any Request you get back will be
.ok(). Also, I think the name .ok() is a little confusing with
Result::ok().
.error() - with the new Result API, this is an exact overlap with
anything that would return Error. People will just check for whether a
Result is Err(...) rather than call .error().
.client_error() - most of the time, if someone wants to specially handle
a 4xx error, they want to handle specific ones, because the response to
them is different. For instance a specialized response to a 404 would be
"delete this from the list of URLs to check in the future," where a
specialized response to a 401 would be "try and load updated
credentials." For instance:
4200edb9ed/healthchecks/src/manage.rs (L70-L84)75d4b363b6/src/lib.rs (L59-L63)1d7daea38b/src/netlify.rs (L101-L112)
.server_error() - I don't have as much objection to this one, since it's
reasonable to want to treat all server errors (500, 502, 503) more or
less the same. Although even at that, 501 Not Implemented seems like
people would want to handle it differently. I guess that doesn't come up
much in practice - I've never seen a 501 in the wild.
.redirect() - Usually redirects are handled under the hood, unless
someone disables automatic redirect handling. I'm not terribly opposed
to this one, but given that no-one's using it and it's just as easy to
do 300..399.contains(resp.status()), I'm mildly inclined towards
deletion.
Also, remove Response::{ok, error, client_error, server_error,
redirect}. The idea is that you would access these through the
Error object instead.
I fetched all the reverse dependencies of ureq on crates.io and looked
for uses of the methods being removed. I found none.
I'm also considering removing the error_on_non_2xx method entirely. If
it's easy to get the underlying response for errors, it would be nice to
make that the single way to do things rather than support two separate
ways of handling HTTP errors.
This moves Stream's enum into an `Inner` enum, and wraps a single
BufReader around the whole thing. This makes it easier to consistently
treat the contents of Stream as being wrapped in a BufReader.
Also, implement BufRead for DeadlineStream. This means when a timeout
is set, we don't have to set that timeout on the socket with every
small read, just when we fill up the buffer. This reduces the number
of syscalls.
Remove the `Cursor` variant from Stream. It was strictly less powerful
than that `Test` variant, so I've replaced the handful of Cursor uses
with `Test`. Because some of those cases weren't test, I removed the
`#[cfg(test)]` param on the `Test` variant.
Now that all inputs to `do_from_read` are `impl BufRead`, add that
as a type constraint. Change `read_next_line` to take advantage of
`BufRead::read_line`, which may be somewhat faster (though I haven't
benchmarked).
Stream now has an `Inner` enum, and wraps an instance of that enum in a
BufReader. This allows Stream itself to implement BufRead trivially, and
simplify some of the match dispatching. Having Stream implement BufRead
means we can make use of `read_line` instead of our own `read_next_line`
(not done in this PR yet).
Also, removes the `Cursor` variant of the Inner enum in favor of using
the `Test` variant everywhere, since it's strictly more powerful.
This adds a source field to keep track of upstream errors and allow
backtraces, plus a URL field to indicate what URL an error was
associated with.
The enum variants we used to use for Error are now part of a new
ErrorKind type. For convenience within ureq, ErrorKinds can be turned
into an Error with `.new()` or `.msg("some additional information")`.
Error acts as a builder, so additional information can be added after
initial construction. For instance, we return a DnsFailed error when
name resolution fails. When that error bubbles up to Request's
`do_call`, Request adds the URL.
Fixes#232.