Introduce PoolReturner, a handle on an agent and a PoolKey that is
capable of returning a Stream to a Pool. Make Streams keep track of
their own PoolReturner, instead of having PoolReturnRead keep track of
that information.
For the LimitedRead code path, get rid of PoolReturnRead. Instead,
LimitedRead is responsible for returning its Stream to the Pool after
its second-to-last read. In other words, LimitedRead will return the
stream if the next read is guaranteed to return Ok(0).
Constructing a LimitedRead of size 0 is always wrong, because we could
always just return the stream immediately. Change the size argument to
NonZeroUsize to enforce that.
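A minimal sketch of the idea, with illustrative names (the real LimitedRead
hands the stream to its PoolReturner rather than a callback):

```rust
use std::io::{self, Read};
use std::num::NonZeroUsize;

// Sketch only: a body reader that knows how many bytes remain and hands its
// stream back (here via a callback standing in for the pool) as soon as the
// next read is guaranteed to return Ok(0).
struct LimitedRead<R, F: FnMut(R)> {
    reader: Option<R>,
    limit: usize,
    position: usize,
    return_to_pool: F,
}

impl<R: Read, F: FnMut(R)> LimitedRead<R, F> {
    // A LimitedRead of size 0 is never useful, so the size is NonZeroUsize.
    fn new(reader: R, size: NonZeroUsize, return_to_pool: F) -> Self {
        LimitedRead { reader: Some(reader), limit: size.get(), position: 0, return_to_pool }
    }
}

impl<R: Read, F: FnMut(R)> Read for LimitedRead<R, F> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let reader = match self.reader.as_mut() {
            Some(r) => r,
            None => return Ok(0), // stream already returned to the pool
        };
        let remaining = self.limit - self.position;
        let n = reader.read(&mut buf[..remaining.min(buf.len())])?;
        self.position += n;
        // Second-to-last read: the next read must return Ok(0), so the
        // stream can go back to the pool right away.
        if self.position == self.limit {
            if let Some(r) = self.reader.take() {
                (self.return_to_pool)(r);
            }
        }
        Ok(n)
    }
}
```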
Remove the Done trait, which was only used for LimitedRead. It was used
to try and make sure we returned the stream to the pool on exact reads,
but was not reliable.
This does not yet move the ChunkDecoder code path away from
PoolReturnRead. That requires a little more work.
Part 1 of #559. Fixes #555.
The mbedtls example has caused problems in the main build a number of
times. By making it a standalone `cargo new --bin`, we can keep it in
the source tree as a good example but avoid having it break the main
build.
Also, fix some clippy lints.
Previously, ReadWrite had methods `is_poolable` and `written_bytes`, which
were solely for the use of unittests.
This replaces `written_bytes` and `TestStream` with a `struct Recorder`
that implements `ReadWrite` and allows unittests to access its recorded
bytes via an `Arc<Mutex<Vec<u8>>>`. It eliminates `is_poolable`; it's fine
to pool a Stream of any kind.
The new `Recorder` also has some convenience methods that abstract away
boilerplate code from many of our unittests.
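A rough sketch of what that recorder looks like (method names approximate):

```rust
use std::io::{self, Read, Write};
use std::sync::{Arc, Mutex};

// Sketch of a test-only stream that records everything written to it. The
// shared Vec lets a unittest inspect the bytes after the request is sent.
#[derive(Clone)]
struct Recorder {
    contents: Arc<Mutex<Vec<u8>>>,
}

impl Recorder {
    fn new() -> Self {
        Recorder { contents: Arc::new(Mutex::new(vec![])) }
    }

    // Convenience for assertions: copy out what has been written so far.
    fn bytes(&self) -> Vec<u8> {
        self.contents.lock().unwrap().clone()
    }
}

impl Read for Recorder {
    fn read(&mut self, _buf: &mut [u8]) -> io::Result<usize> {
        Ok(0) // nothing to read back; tests only inspect the written bytes
    }
}

impl Write for Recorder {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.contents.lock().unwrap().extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```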
I got rid of `Stream::from_vec` and `Stream::from_vec_poolable` because
they depended on `TestStream`. They've been replaced by `NoopStream` for
the pool.rs tests, and `ReadOnlyStream` for constructing `Response`s from
`&str` and some test cases.
Tests use `Response::as_write_vec` to inspect the outgoing HTTP/1.1
request line and headers. The current version has two problems:
1. It is called `as_write_vec`, but it actually returns a `&[u8]`.
2. It inspects/uses `Response::stream` without consuming the `Response`.
The first problem is trivial, but the second is subtle. Currently, all
calls on `Response` that work with the internal `Response::stream`
consume `self` (`into_string`, `into_reader`).
`Response` is by itself `Send + Sync`, and must be so because the
nested Stream is `Read + Write + Send + Sync`. However for
implementors of `TLSStream`, it would be nice to relax the `Sync`
requirement.
Assumption: If all fields in Response are `Sync` except
`Response::stream`, but any access to `stream` consumes `Response`, we
can consider the entire `Response` `Sync`.
This assumption can help us relax the `TlsStream` `Sync` requirement
in a later PR.
The auth header stripping happened in the wrong place (when serializing the
request) rather than where it belongs: in the construction of the Unit.
This also makes redirect header retention testable.
This makes us less likely to try and reuse a closed connection, which
produces problems in particular for requests that can't be retried.
Fixes #361. Fixes #124.
- Don't panic on the mutex in all tests if a single test fails
- Give a more helpful message if a test handler wasn't registered
- Enable env_logger for tests
Change "Bad" to "Invalid" in error names, mimicking io::ErrorKind.
Change InvalidProxyCreds to ProxyUnauthorized.
Change DnsFailed to just Dns (the fact that there was a failure is implicit
in the fact that this was an error).
Now that Responses with non-2xx statuses get turned into `Error`,
there is less need for these. Also, surveying the set of public crates
that depend on ureq, none of them use these methods. It seems that
users tend to prefer checking the status code directly.
Here is my thinking on each of these individually:
.ok() - With the new Result API, any Response you get back will be
.ok(). Also, I think the name .ok() is a little confusing with
Result::ok().
.error() - with the new Result API, this is an exact overlap with
anything that would return Error. People will just check for whether a
Result is Err(...) rather than call .error().
.client_error() - most of the time, if someone wants to specially handle
a 4xx error, they want to handle specific ones, because the response to
them is different. For instance a specialized response to a 404 would be
"delete this from the list of URLs to check in the future," where a
specialized response to a 401 would be "try and load updated
credentials." For instance:
4200edb9ed/healthchecks/src/manage.rs (L70-L84)
75d4b363b6/src/lib.rs (L59-L63)
1d7daea38b/src/netlify.rs (L101-L112)
.server_error() - I don't have as much objection to this one, since it's
reasonable to want to treat all server errors (500, 502, 503) more or
less the same. Although even then, 501 Not Implemented seems like something
people would want to handle differently. I guess that doesn't come up
much in practice - I've never seen a 501 in the wild.
.redirect() - Usually redirects are handled under the hood, unless
someone disables automatic redirect handling. I'm not terribly opposed
to this one, but given that no-one's using it and it's just as easy to
do `(300..=399).contains(&resp.status())`, I'm mildly inclined towards
deletion.
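For the 404/401 cases above, checking specific statuses directly against the
Result-based API stays short. A sketch (error variant names may differ):

```rust
fn check(url: &str) -> Result<(), ureq::Error> {
    match ureq::get(url).call() {
        Ok(resp) => {
            println!("success: {}", resp.status());
            Ok(())
        }
        // e.g. drop this URL from the list of URLs to check in the future
        Err(ureq::Error::Status(404, _resp)) => Ok(()),
        // e.g. try to load updated credentials and retry
        Err(ureq::Error::Status(401, _resp)) => Ok(()),
        Err(other) => Err(other),
    }
}
```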
Also, remove Response::{ok, error, client_error, server_error,
redirect}. The idea is that you would access these through the
Error object instead.
I fetched all the reverse dependencies of ureq on crates.io and looked
for uses of the methods being removed. I found none.
I'm also considering removing the error_on_non_2xx method entirely. If
it's easy to get the underlying response for errors, it would be nice to
make that the single way to do things rather than support two separate
ways of handling HTTP errors.
Stream now has an `Inner` enum, and wraps an instance of that enum in a
BufReader. This allows Stream itself to implement BufRead trivially, and
simplifies some of the match dispatching. Having Stream implement BufRead
means we can make use of `read_line` instead of our own `read_next_line`
(not done in this PR yet).
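Roughly the shape this takes (a sketch with only a couple of variants shown):

```rust
use std::io::{BufRead, BufReader, Read};
use std::net::TcpStream;

// The enum does the per-transport dispatch...
enum Inner {
    Http(TcpStream),
    Test(Box<dyn Read + Send>),
    // ... Https, etc.
}

impl Read for Inner {
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        match self {
            Inner::Http(s) => s.read(buf),
            Inner::Test(r) => r.read(buf),
        }
    }
}

// ...and a single BufReader around it gives Stream a trivial BufRead impl.
struct Stream {
    inner: BufReader<Inner>,
}

impl Read for Stream {
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        self.inner.read(buf)
    }
}

impl BufRead for Stream {
    fn fill_buf(&mut self) -> std::io::Result<&[u8]> {
        self.inner.fill_buf()
    }
    fn consume(&mut self, amt: usize) {
        self.inner.consume(amt)
    }
}
```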
Also, removes the `Cursor` variant of the Inner enum in favor of using
the `Test` variant everywhere, since it's strictly more powerful.
This adds a source field to keep track of upstream errors and allow
backtraces, plus a URL field to indicate what URL an error was
associated with.
The enum variants we used to use for Error are now part of a new
ErrorKind type. For convenience within ureq, ErrorKinds can be turned
into an Error with `.new()` or `.msg("some additional information")`.
Error acts as a builder, so additional information can be added after
initial construction. For instance, we return a DnsFailed error when
name resolution fails. When that error bubbles up to Request's
`do_call`, Request adds the URL.
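A sketch of that shape (field and helper names are approximate, not the exact
ureq definitions):

```rust
// Sketch only: kinds are a small enum, and Error is a builder that can pick
// up context (message, source, URL) as it bubbles up the call stack.
#[derive(Debug, Clone, Copy)]
enum ErrorKind {
    Dns,
    // ... other kinds
}

#[derive(Debug)]
struct Error {
    kind: ErrorKind,
    message: Option<String>,
    url: Option<String>,
    source: Option<Box<dyn std::error::Error + Send + Sync>>,
}

impl ErrorKind {
    // ErrorKind::Dns.new() -> Error with no extra context.
    fn new(self) -> Error {
        Error { kind: self, message: None, url: None, source: None }
    }
    // ErrorKind::Dns.msg("some additional information") -> Error with a message.
    fn msg(self, m: &str) -> Error {
        Error { message: Some(m.to_string()), ..self.new() }
    }
}

impl Error {
    // Called higher up the stack, e.g. when Request's do_call knows the URL.
    fn url(mut self, url: &str) -> Error {
        self.url = Some(url.to_string());
        self
    }
}
```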
Fixes #232.
Doctests run against a normally-built copy of the crate, i.e. one
without #[cfg(test)] set, so we can't use the conditional compilation
feature.
Instead, define a static var that indicates whether the library is
running in test mode or not. For each doctest, insert a hidden call that
sets this var to true. Then, when ureq::agent() is called, it returns a
test_agent instead.
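A sketch of the mechanism; `TEST_MODE` and `enable_test_mode` are illustrative
names, and the agent types are stubbed out:

```rust
use std::sync::atomic::{AtomicBool, Ordering};

pub struct Agent;
impl Agent {
    pub fn new() -> Agent { Agent }
}
fn test_agent() -> Agent {
    Agent // would be wired up to the in-process testserver
}

// Doctests can't see #[cfg(test)], so use a runtime flag instead.
static TEST_MODE: AtomicBool = AtomicBool::new(false);

// Each doctest starts with a hidden line along the lines of:
//     # ureq::enable_test_mode();
pub fn enable_test_mode() {
    TEST_MODE.store(true, Ordering::SeqCst);
}

pub fn agent() -> Agent {
    if TEST_MODE.load(Ordering::SeqCst) {
        test_agent()
    } else {
        Agent::new()
    }
}
```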
This required moving testserver out of the test mod and into src/, so
that it can be included unconditionally (i.e. when cfg(test) is false).
This PR converts one doctest as an example. If we land this PR, I'll
send a followup to convert the rest.
Instead, rely on Url's built-in query parameter handling. A Request now
accumulates a list of query param pairs, and joins them with a parsed
URL at the time do_call is called.
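A sketch of the flow (only the URL-joining part is shown, and names are
approximate):

```rust
use url::Url;

// Sketch: Request keeps the raw pairs and only merges them into the parsed
// URL when the request is executed.
struct Request {
    url: String,
    query_params: Vec<(String, String)>,
}

impl Request {
    fn query(mut self, param: &str, value: &str) -> Self {
        self.query_params.push((param.to_string(), value.to_string()));
        self
    }

    // The URL-joining step that do_call performs, in isolation.
    fn resolved_url(&self) -> Result<Url, url::ParseError> {
        let mut url = Url::parse(&self.url)?;
        {
            // Url's serializer handles percent-encoding and joining for us.
            let mut pairs = url.query_pairs_mut();
            for (k, v) in &self.query_params {
                pairs.append_pair(k, v);
            }
        }
        Ok(url)
    }
}
```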
In the process, remove some getters that rely on parsing the URL.
Adapting these getters was going to be awkward, and they mostly
duplicate things people can readily get by parsing the URL.
* Remove Request::build
* All mutations on Request follow builder pattern
The previous `build()` on request was necessary because mutating
functions did not follow a proper builder pattern (taking `&mut self`
instead of `mut self`). With a proper builder pattern, the need for
`.build()` goes away.
* All Request body and call methods consume self
Anything which "executes" the request will now consume the `Request`
to produce a `Result<Response>`.
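In sketch form, the difference this makes at the call site (illustrative, not
the exact ureq signatures):

```rust
struct Response;

struct Request {
    headers: Vec<(String, String)>,
}

impl Request {
    // Builder-style: takes `mut self` and returns `Self`, so calls chain
    // without needing a separate .build() step.
    fn set(mut self, header: &str, value: &str) -> Self {
        self.headers.push((header.to_string(), value.to_string()));
        self
    }

    // Anything that executes the request consumes it.
    fn call(self) -> Result<Response, String> {
        Ok(Response)
    }
}

fn example() -> Result<Response, String> {
    Request { headers: vec![] }
        .set("User-Agent", "sketch/0.1")
        .call()
}
```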
* Move all config from request to agent builder
Timeouts, redirect config, proxy settings and TLS config are now on
`AgentBuilder`.
* Rename max_pool_connections -> max_idle_connections
* Rename max_pool_connections_per_host -> max_idle_connections_per_host
Consistent internal and external naming.
* Introduce new AgentConfig for static config created by builder.
`Agent` can be seen as having two parts: static config and mutable
state shared between all clones of the agent. The static config goes into
`AgentConfig` and the mutable shared state into `AgentState`.
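A sketch of that split (fields heavily simplified):

```rust
use std::sync::{Arc, Mutex};
use std::time::Duration;

// Static configuration, fixed once the builder has run.
struct AgentConfig {
    timeout_connect: Option<Duration>,
    redirects: u32,
}

// Mutable state shared by everything the agent hands out.
struct AgentState {
    pool: Mutex<Vec<String>>, // stand-in for the connection pool
}

#[derive(Clone)]
struct Agent {
    config: Arc<AgentConfig>,
    state: Arc<AgentState>,
}
```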
* Replace all use of `Default` for `new`.
Deriving or implementing `Default` makes for a secondary instantiation
API. It is useful in some cases, but gets very confusing when there
is both `new` _and_ a `Default`. It's especially devious for derived
values where a reasonable default is not `0`, `false` or `None`.
* Remove the native_tls feature; we want only rustls.
This feature made for very clunky handling throughout the code. From a
security point of view, it's better to stick with one single TLS API.
Rustls recently got an official audit (very positive).
https://github.com/ctz/rustls/tree/master/audit
Rustls deliberately omits support for older, insecure TLS such as TLS
1.1 or RC4. This might be a problem for a user of ureq, but on balance
not considered important enough to keep native_tls.
* Remove auth and support for basic auth.
The API just wasn't enough. A future reintroduction should at least
also provide a `Bearer` mechanism and possibly more.
* Rename jar -> cookie_store
* Rename jar -> cookie_tin
Just make some field names sync up with the type.
* Drop "cookies" as default feature
The need for handling cookies is probably rare, let's not enable it by
default.
* Change all feature checks for "cookie" to "cookies"
The outward facing feature is "cookies" and I think it's better form
that the code uses the official feature name instead of the optional
library "cookies".
* Keep `set` on Agent level as well as AgentBuilder.
The idea is that an auth exchange might result in a header that needs
to be set _after_ the agent has been built.
This feature was broken in #67, which reset timeouts on the
stream before passing it to set_stream.
As part of this change, refactor the internal storage of
timeouts on the Request object to use Option<Duration>.
Remove the deadline field on Response. It wasn't used. The
deadline field on unit was used instead.
Add a unittest.
Previously, Agent stored most of its state in one big
Arc<Mutex<AgentState>>. This separates the Arc from the Mutexes.
Now, Agent is a thin wrapper around an Arc<AgentState>. The individual
components that need locking, ConnectionPool and CookieStore, now are
responsible for their own locking.
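Sketched, the new arrangement looks roughly like this (types heavily
simplified):

```rust
use std::sync::{Arc, Mutex};

// Each component that needs locking owns its own lock.
struct ConnectionPool {
    idle: Mutex<Vec<String>>, // stand-in for the real pooled streams
}

impl ConnectionPool {
    // &self rather than &mut self: interior mutability hides the lock.
    fn add(&self, conn: String) {
        self.idle.lock().unwrap().push(conn);
    }
}

struct AgentState {
    pool: ConnectionPool,
    // the cookie store (when enabled) carries its own lock the same way
}

// Agent is now just a cheaply clonable handle around the shared state.
#[derive(Clone)]
struct Agent {
    state: Arc<AgentState>,
}
```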
There were a couple of reasons for this. Internal components that needed
an Agent were often instead carrying around an Arc<Mutex<AgentState>>.
This felt like the components were too intertwined: those other
components shouldn't have to care quite so much about how Agent is
implemented. Also, this led to compromises of convenience: the Proxy on
Agent wound up stored inside the `Arc<Mutex<AgentState>>` even though it
didn't need locking. It was more convenient that way because that was
what Request and Unit had access to.
The other reason to push things down like this is that it can reduce
lock contention. Mutations to the cookie store don't need to lock the
connection pool, and vice versa. This was a secondary concern, since I
haven't actually profiled these things and found them to be a problem,
but it's a happy result of the refactoring.
Now all the components outside of Agent take an Agent instead of
AgentState.
In the process I removed `Agent.cookie()`. Its API was hard to use
correctly, since it didn't distinguish between cookies on different
hosts. And it would have required updates as part of this refactoring.
I'm open to reinstating some similar functionality with a refreshed API.
I kept `Agent.set_cookie`, but updated its method signature to take a
URL as well as a cookie.
Many of ConnectionPool's methods went from `&mut self` to `&self`,
because ConnectionPool is now using interior mutability.
This is a step towards allowing our tests to run without network access,
which will make them more resilient and faster.
Replace the URL in one instance of an HTTPS test that didn't need HTTPS.
In the process, rename set_foo methods to just foo, since methods on the
builder will always be setters.
Adds a new() method on ConnectionPool so it can be constructed directly
with the desired limits. Removes the setter methods on ConnectionPool
for those limits. This means that connection limits can only be set when
an Agent is built.
There were two tests that verify Send and Sync implementations, one for
Agent and one for Request. This PR moves the Request test to request.rs,
and changes both tests to more directly verify the traits. There may be
another way to do this, I'm not sure.
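One common way to check the traits directly is a bound-only helper; a sketch,
placed where the crate's `Agent` and `Request` are in scope:

```rust
use crate::{Agent, Request};

// Compiles only if T is Send + Sync; the body is empty because the check
// happens entirely at the type level.
fn assert_send_sync<T: Send + Sync>() {}

#[test]
fn agent_and_request_are_send_and_sync() {
    assert_send_sync::<Agent>();
    assert_send_sync::<Request>();
}
```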
This adds validation of header values on receive, and of both header
names and header values on send. This doesn't change the return
type of `set` to be a `Result`; it just validates when the request is
sent. Also removes the section in the README describing handling
of invalid headers, and updates a test that verified acceptance of
non-ASCII headers so that it verifies rejection of them instead.
Gets rid of synthetic_error, and makes the various send_* methods return `Result<Response, Error>`.
Introduces a new error type "HTTP", which represents an error due to status codes 4xx or 5xx.
The HTTP error type contains a boxed Response, so users can read the actual response if they want.
Adds an `error_for_status` setting to disable the functionality of treating 4xx and 5xx as errors.
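In use, that looks roughly like the sketch below; the `HTTP` variant and the
boxed Response follow the description above and may not match the final
public API:

```rust
fn status_or_error(url: &str) -> Result<u16, ureq::Error> {
    match ureq::get(url).call() {
        Ok(resp) => Ok(resp.status()),
        Err(ureq::Error::HTTP(resp)) => {
            // 4xx/5xx: the boxed Response is still here if we want the body.
            Ok(resp.status())
        }
        Err(other) => Err(other), // transport-level failure
    }
}
```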
Adds .unwrap() to a lot of tests.
Fixes #128.
CookieJar doesn't support the path-match and domain-match algorithms from [RFC 6265](https://tools.ietf.org/html/rfc6265#section-5.1.3), while cookie_store does.
This fixes some issues with the cookie matching algorithm currently in ureq. For instance,
the domain-match uses substring matching rather than the RFC 6265 algorithm.
This deletes two tests:
- match_cookies_returns_nothing_when_no_cookies didn't test much.
- agent_cookies was failing because cookie_store rejects cookies on the
  `test:` scheme. The way around this is to set up a testserver - but it
  turns out cookies_on_redirect already does that, and covers the same
  cases and more.
This changes some cookie-related behavior:
- Cookies could previously be sent to a wrong domain - e.g. a cookie set on `example.com`
could go to `example.com.evil.com` or `evilexample.com`. Probably no one was relying on
this, since it's quite broken.
- A cookie with a path of `/foo` could be sent on a request to `/foobar`, but now it can't.
- Cookies could previously be set on IP addresses, but now they can't.
- Cookies could previously be set for domains other than the one on the request (or its
parents), but now they can't.
- When a cookie had no domain attribute, it would previously get the domain from the
request, and subsequently be sent to that domain and all subdomains. Now, it will only
be sent to that exact domain (host-only).
That last one is probably the most likely to break people, since someone could depend
on it without realizing it was broken behavior.