This is more correct, since all gzip streams can consist of multiple members. It
has the happy side effect that it causes gzipped responses to reliably return
their streams to the pool.
The mbedtls example has caused problem in the main build a number of
times. By making it a standalone `cargo new --bin`, we can keep it in
the source tree as a good example but avoid having it break the main
build.
Also, fix some clippy lints.
* Response: build reader at construction time
* Remove unwrapping/rewrapping DeadlineStream
* Let into_reader() provide Sync + 'static
* Prebuffer Vec use known capacity
Co-authored-by: Martin Algesten <martin@algesten.se>
Previously, ReadWrite had methods `is_poolable` and `written_bytes`, which
were solely for the use of unittests.
This replaces `written_bytes` and `TestStream` with a `struct Recorder`
that implements `ReadWrite` and allows unittests to access its recorded
bytes via an `Arc<Mutex<Vec<u8>>>`. It eliminates `is_poolable`; it's fine
to pool a Stream of any kind.
The new `Recorder` also has some convenience methods that abstract away
boilerplate code from many of our unittests.
I got rid of `Stream::from_vec` and `Stream::from_vec_poolable` because
they depended on `TestStream`. They've been replaced by `NoopStream` for
the pool.rs tests, and `ReadOnlyStream` for constructing `Response`s from
`&str` and some test cases.
We unwrap the stream exactly once per response, and we know that case
will be uncontended for the same reason `SyncWrapper` works:
`into_reader()` takes `self`, so it must have exclusive ownership.
Uncontended mutexes are extremely cheap. This saves us a dependency
at a trivial performance cost.
If the user reads exactly the number of bytes in the response, then
drops the Read object, streams will never get returned to the pool since
the user never triggered a read past the end of the LimitedRead.
This fixes that by making PoolReturnRead aware of the level below it, so
it can ask if a stream is "done" without actually doing a read.
Also, a refactorign:
Previously, Response contained an Option<Box<Unit>> because the testing
method `from_str()` would construct a Response with no associated Unit.
However, this increased code complexity with no corresponding test
benefit. Instead, construct a fake Unit in from_str().
Also, instead of taking an `Option<Box<Unit>>`, PoolReturnRead now takes
a URL (to figure out host and port for the PoolKey) and an &Agent where
it will return the stream. This cuts interconnectedness somewhat:
PoolReturnRead doesn't need to know about Unit anymore.
3rd party TLS connections are currently required to provide a Sync
guarantee. Since ureq will only ever use the connection on a single
thread, this is not necessary.
The reason we end up with this bound is because we want Response and
Error to be Sync, in which case Rust's automatic inferral of Sync
fails.
This change "masks" the Stream in a wrapper making it Sync.
Close#474
Tests use `Response::as_write_vec` to inspect the outgoing HTTP/1.1
request line and headers. The current version has two problems:
1. Called `as_write_vec` when it actually returns a `&[u8]`.
2. Inspects/uses the `Response::stream` without consuming `Response`.
The first problem is trivial, but the second is subtle. Currently all
calls on `Response` that works with the internal `Response::stream`
consumes `self` (`into_string`, `into_reader`).
`Response` is by itself `Send + Sync`, and must be so because the
nested Stream is `Read + Write + Send + Sync`. However for
implementors of `TLSStream`, it would be nice to relax the `Sync`
requirement.
Assumption: If all fields in Response are `Sync` except
`Response::stream`, but any access to `stream` consumes `Response`, we
can consider the entire `Response` `Sync`.
This assumption can help us relax the `TlsStream` `Sync` requirement
in a later PR.
The newline becomes part of the response body, even though it was not passed in
as `body`, which makes `Response::new` difficult to use with anything that
checksums data contents for example. Arguably there should also be a mechanism
for passing in `[u8]` bodies, but that's for a separate feature request.
This prevents a malicious server from running the client out of memory
by returning a very large body.
Limit into_string() to 1 megabyte.
Limit individual header fields to 100 kilobytes.
Limit number of header fields to 100.
Part of #267
Idiomatic rust organizes unit tests into `mod tests { }` blocks
using conditional compilation `[cfg(test)]` to decide whether to
compile the code in that block.
This commit moves "bare" test functions into such blocks, and puts
the block at the bottom of respective file.
The HTTP spec allows for non-ascii values both in the status line and
in headers. ureq does not handle that, we can however provide better
context for when it happens.
The spec says the reason phrase at least must be a space. However in
the wild, there are sites that just ends after the status code.
To be more compatible, this commit relaxes ureq's parsing.
Close#316
Status code must be exactly three digits.
HTTP version must be "HTTP/", digit, dot, digit.
https://tools.ietf.org/html/rfc7230#section-3.1.2
status-line = HTTP-version SP status-code SP reason-phrase CRLF
This allows Error to report both the URL that caused an error, and the
original URL that was requested.
Change unit::connect to use the Response history for tracking number of
redirects, instead of passing the count as a separate parameter.
Incidentally, move handling of the `stream` fully inside `Response`.
Instead of `do_from_read` + `set_stream`, we now have `do_from_stream`,
which takes ownership of the stream and keeps it. We also have
`do_from_request`, which does all of `do_from_stream`, but also sets the
`previous` field.
Now that Responses with non-2xx statuses get turned into `Error`,
there is less need for these. Also, surveying the set of public crates
that depend on ureq, none of them use these methods. It seems that
users tend to prefer checking the status code directly.
Here is my thinking on each of these individually:
.ok() -- With the new Result API, any Request you get back will be
.ok(). Also, I think the name .ok() is a little confusing with
Result::ok().
.error() - with the new Result API, this is an exact overlap with
anything that would return Error. People will just check for whether a
Result is Err(...) rather than call .error().
.client_error() - most of the time, if someone wants to specially handle
a 4xx error, they want to handle specific ones, because the response to
them is different. For instance a specialized response to a 404 would be
"delete this from the list of URLs to check in the future," where a
specialized response to a 401 would be "try and load updated
credentials." For instance:
4200edb9ed/healthchecks/src/manage.rs (L70-L84)75d4b363b6/src/lib.rs (L59-L63)1d7daea38b/src/netlify.rs (L101-L112)
.server_error() - I don't have as much objection to this one, since it's
reasonable to want to treat all server errors (500, 502, 503) more or
less the same. Although even at that, 501 Not Implemented seems like
people would want to handle it differently. I guess that doesn't come up
much in practice - I've never seen a 501 in the wild.
.redirect() - Usually redirects are handled under the hood, unless
someone disables automatic redirect handling. I'm not terribly opposed
to this one, but given that no-one's using it and it's just as easy to
do 300..399.contains(resp.status()), I'm mildly inclined towards
deletion.