Previously, ReadWrite had methods `is_poolable` and `written_bytes`, which
were solely for the use of unittests.
This replaces `written_bytes` and `TestStream` with a `struct Recorder`
that implements `ReadWrite` and allows unittests to access its recorded
bytes via an `Arc<Mutex<Vec<u8>>>`. It eliminates `is_poolable`; it's fine
to pool a Stream of any kind.
The new `Recorder` also has some convenience methods that abstract away
boilerplate code from many of our unittests.
I got rid of `Stream::from_vec` and `Stream::from_vec_poolable` because
they depended on `TestStream`. They've been replaced by `NoopStream` for
the pool.rs tests, and `ReadOnlyStream` for constructing `Response`s from
`&str` and some test cases.
If the user reads exactly the number of bytes in the response, then
drops the Read object, streams will never get returned to the pool since
the user never triggered a read past the end of the LimitedRead.
This fixes that by making PoolReturnRead aware of the level below it, so
it can ask if a stream is "done" without actually doing a read.
Also, a refactorign:
Previously, Response contained an Option<Box<Unit>> because the testing
method `from_str()` would construct a Response with no associated Unit.
However, this increased code complexity with no corresponding test
benefit. Instead, construct a fake Unit in from_str().
Also, instead of taking an `Option<Box<Unit>>`, PoolReturnRead now takes
a URL (to figure out host and port for the PoolKey) and an &Agent where
it will return the stream. This cuts interconnectedness somewhat:
PoolReturnRead doesn't need to know about Unit anymore.
Idiomatic rust organizes unit tests into `mod tests { }` blocks
using conditional compilation `[cfg(test)]` to decide whether to
compile the code in that block.
This commit moves "bare" test functions into such blocks, and puts
the block at the bottom of respective file.
Stream now has an `Inner` enum, and wraps an instance of that enum in a
BufReader. This allows Stream itself to implement BufRead trivially, and
simplify some of the match dispatching. Having Stream implement BufRead
means we can make use of `read_line` instead of our own `read_next_line`
(not done in this PR yet).
Also, removes the `Cursor` variant of the Inner enum in favor of using
the `Test` variant everywhere, since it's strictly more powerful.
Replace `this` idiom with `mut self`.
Move idle connections constants from pool.rs to agent.rs.
Remove Agent.set and some convenience request methods
(leaving get, post, and put).
Move max_idle_connections setting from AgentConfig to AgentBuilder
(since the builder passes these to ConnectionPool and the Agent
doesn't subsequently need them).
Eliminate duplicate copy of proxy in AgentState; use the one in
AgentConfig.
* Remove Request::build
* All mutations on Request follow builder pattern
The previous `build()` on request was necessary because mutating
functions did not follow a proper builder pattern (taking `&mut self`
instead of `mut self`). With a proper builder pattern, the need for
`.build()` goes away.
* All Request body and call methods consume self
Anything which "executes" the request will now consume the `Request`
to produce a `Result<Response>`.
* Move all config from request to agent builder
Timeouts, redirect config, proxy settings and TLS config are now on
`AgentBuilder`.
* Rename max_pool_connections -> max_idle_connections
* Rename max_pool_connections_per_host -> max_idle_connections_per_host
Consistent internal and external naming.
* Introduce new AgentConfig for static config created by builder.
`Agent` can be seen as having two parts. Static config and a mutable
shared state between all states. The static config goes into
`AgentConfig` and the mutable shared state into `AgentState`.
* Replace all use of `Default` for `new`.
Deriving or implementing `Default` makes for a secondary instantiation
API. It is useful in some cases, but gets very confusing when there
is both `new` _and_ a `Default`. It's especially devious for derived
values where a reasonable default is not `0`, `false` or `None`.
* Remove feature native_tls, we want only native rustls.
This feature made for very clunky handling throughout the code. From a
security point of view, it's better to stick with one single TLS API.
Rustls recently got an official audit (very positive).
https://github.com/ctz/rustls/tree/master/audit
Rustls deliberately omits support for older, insecure TLS such as TLS
1.1 or RC4. This might be a problem for a user of ureq, but on balance
not considered important enough to keep native_tls.
* Remove auth and support for basic auth.
The API just wasn't enough. A future reintroduction should at least
also provide a `Bearer` mechanism and possibly more.
* Rename jar -> cookie_store
* Rename jar -> cookie_tin
Just make some field names sync up with the type.
* Drop "cookies" as default feature
The need for handling cookies is probably rare, let's not enable it by
default.
* Change all feature checks for "cookie" to "cookies"
The outward facing feature is "cookies" and I think it's better form
that the code uses the official feature name instead of the optional
library "cookies".
* Keep `set` on Agent level as well as AgentBuilder.
The idea is that an auth exchange might result in a header that need
to be set _after_ the agent has been built.
Previously, Agent stored most of its state in one big
Arc<Mutex<AgentState>>. This separates the Arc from the Mutexes.
Now, Agent is a thin wrapper around an Arc<AgentState>. The individual
components that need locking, ConnectionPool and CookieStore, now are
responsible for their own locking.
There were a couple of reasons for this. Internal components that needed
an Agent were often instead carrying around an Arc<Mutex<AgentState>>.
This felt like the components were too intertwined: those other
components shouldn't have to care quite so much about how Agent is
implemented. Also, this led to compromises of convenience: the Proxy on
Agent wound up stored inside the `Arc<Mutex<AgentState>>` even though it
didn't need locking. It was more convenient that way because that was
what Request and Unit had access too.
The other reason to push things down like this is that it can reduce
lock contention. Mutations to the cookie store don't need to lock the
connection pool, and vice versa. This was a secondary concern, since I
haven't actually profiled these things and found them to be a problem,
but it's a happy result of the refactoring.
Now all the components outside of Agent take an Agent instead of
AgentState.
In the process I removed `Agent.cookie()`. Its API was hard to use
correctly, since it didn't distinguish between cookies on different
hosts. And it would have required updates as part of this refactoring.
I'm open to reinstating some similar functionality with a refreshed API.
I kept `Agent.set_cookie`, but updated its method signature to take a
URL as well as a cookie.
Many of ConnectionPool's methods went from `&mut self` to `&self`,
because ConnectionPool is now using interior mutability.
In the process, rename set_foo methods to just foo, since methods on the
builder will always be setters.
Adds a new() method on ConnectionPool so it can be constructed directly
with the desired limits. Removes the setter methods on ConnectionPool
for those limits. This means that connection limits can only be set when
an Agent is built.
There were two tests that verify Send and Sync implementations, one for
Agent and one for Request. This PR moves the Request test to request.rs,
and changes both tests to more directly verify the traits. There may be
another way to do this, I'm not sure.
When running tests locally, this error can surface.
```
---- test::agent_test::custom_resolver stdout ----
thread 'test::agent_test::custom_resolver' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 22, kind: InvalidInput, message: "Invalid argument" }', src/stream.rs:60:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```
The problem is that setting the timeouts might fail, and this is done
in a From trait where there is not possibility to "bubble" the
io::Error.
```
socket.set_read_timeout(None).unwrap();
socket.set_write_timeout(None).unwrap();
```
This commit moves the resetting of timers to an explicit `Stream::reset()` fn
that must be called every time we're unwrapping the inner stream.
In a few places we relied on "localhost" as a default if a URL's host
was not set, but I think it's better to error out in these cases.
In general, there are a few places in Unit that assumed there is a
host as part of the URL. I've made that explicit by doing a check
at the beginning of `connect()`. I've also tried to plumb through
the semantics of "host is always present" by changing the parameter
types of some of the functions that use the hostname.
I considered a more thorough way to express this with types - for
instance implementing an `HttpUrl` struct that embeds a `Url`, and
exports most of the same methods, but guarantees that host is always
present. However, that was more invasive than this so I did a smaller
change to start.
We only want streams in the pool that are ready for the next request,
and don't have any remaining bytes of the previous request. That's the
case when we return a stream due to EOF on the underlying `Read`.
However, there was also an `impl Drop` that returned the stream to the
pool on drop of a PoolReturnRead. That resulted in a bug: If a user
called `response.into_reader()`, but did not read the entire body,
a dirty stream would be returned to the pool. The next time a request
was made to the same host, that stream would be reused, and instead of
reading off "HTTP/1.1 200 OK", ureq would read a remnant of the previous
request, resulting in a BadStatus error.
This broke during some recent refactorings. The special case branch for
max_connections == 0 wasn't actually setting the max_idle_connections
field.
This change simply deletes that branch, since the existing code that
whittles down the pool size one element at a time suffices. In the
common case this will be set on an empty pool anyhow.
This defines a new trait `Resolver`, which turns an address into a
Vec<SocketAddr>. It also provides an implementation of Resolver for
`Fn(&str)` so it's easy to define simple resolvers with a closure.
Fixes#82
Co-authored-by: Ulrik <ulrikm@spotify.com>
Instead of cloning most of `Request`'s fields individually when
creating a `Unit`, this PR switches to just cloning `Request` and
stuffing it in `Unit`, and changes references to `unit.[field]` to
`unit.req.[field]` where appropriate.
Fixes#155
Now that we allow multiple connections for the same PoolKey, update our
invariants accordingly. Also provide a couple of helper functions for
removing the first or last match of an entry in a VecDeque.
This also changes which entry from `lru` gets removed when a stream is
removed from the pool. Previously it was the oldest matching one. Now it's
the newest matching one, which matches the semantics we are applying to
`recycle[K]`.
Followup to #133
/cc @cfal
`Some(AgentState)` seem to be assumed pretty much everywhere. I could not
find any test or piece of code hinting at how `None` should be interpreted,
or even see how a state of `None` could even be constructed.
Co-authored-by: Ulrik <ulrikm@spotify.com>
The auto-derived implementation would have unexpectedly initialized the
`usize` fields to 0. Having a `::default()` leading to an unuseable value
is counter-intuitive.
Adds set_max_idle_connections and set_max_idle_connections_per_host.
This turns the values of Pool.recycle into a VecDeque of Streams for the same PoolKey.
The freshest stream (most recently used) is at the back; the stalest stream is at the front.
This also removes the invariant "Each PoolKey exists in recycle at most once and lru at
most once," replacing it with "each PoolKey has the same number of entries in lru as in
recycle."
Fixes#110
Adds limit of 100 connections to ConnectionPool. This is implemented
with a VecDeque acting as an LRU. When the pool becomes full, the oldest
stream is popped off the back from the VecDeque and also removed from
the map of PoolKeys to Streams.
Fixes#77
PoolKey calls unwrap() on an option that can be None. Specifically, the
local variable `port` can be None when PoolKey is constructed with a Url
whose scheme is unrecognized in url.port_or_known_default().
To fix that, make port an Option. Also, make scheme part of the PoolKey.
This prevents, for instance, a stream opened for `https://example.com:9999`
being reused on a request for `http://example.com:9999`.
Remove the test-only pool.get() accessor. This was used in only one test,
agent_pool in range.rs. This seemed like it was testing the agent more
than it was testing ranges, so I moved it to agent.rs and edited to
remove the range-testing parts.
Also, reject unrecognized URLs earlier in connect_socket so they don't
reach try_get_connection.
Adds some feature guards, and removes an unnecessary feature guard
around a call to connect_https (there's an implementation available for
non-TLS that returns UnknownScheme).
Also, remove unnecessary agent.state() method that was only available in
TLS builds. The state field is directly accessible within the crate, and
can be used in both TLS and non-TLS builds.
Co-authored-by: Martin Algesten <martin@algesten.se>