It turns out Headers is actually an internal-only API. None of the
user-facing types use it.
Unfortunately, making it unexported also required deleting the doctests,
since doctests can only run against a public interface.
This adds a source field to keep track of upstream errors and allow
backtraces, plus a URL field to indicate what URL an error was
associated with.
The enum variants we used to use for Error are now part of a new
ErrorKind type. For convenience within ureq, ErrorKinds can be turned
into an Error with `.new()` or `.msg("some additional information")`.
Error acts as a builder, so additional information can be added after
initial construction. For instance, we return a DnsFailed error when
name resolution fails. When that error bubbles up to Request's
`do_call`, Request adds the URL.
Fixes#232.
Instead, rely on Url's built-in query parameter handling. A Request now
accumulates a list of query param pairs, and joins them with a parsed
URL at the time do_call is called.
In the process, remove some getters that rely on parsing the URL.
Adapting these getters was going to be awkward, and they mostly
duplicate things people can readily get by parsing the URL.
* Remove Request::build
* All mutations on Request follow builder pattern
The previous `build()` on request was necessary because mutating
functions did not follow a proper builder pattern (taking `&mut self`
instead of `mut self`). With a proper builder pattern, the need for
`.build()` goes away.
* All Request body and call methods consume self
Anything which "executes" the request will now consume the `Request`
to produce a `Result<Response>`.
* Move all config from request to agent builder
Timeouts, redirect config, proxy settings and TLS config are now on
`AgentBuilder`.
* Rename max_pool_connections -> max_idle_connections
* Rename max_pool_connections_per_host -> max_idle_connections_per_host
Consistent internal and external naming.
* Introduce new AgentConfig for static config created by builder.
`Agent` can be seen as having two parts. Static config and a mutable
shared state between all states. The static config goes into
`AgentConfig` and the mutable shared state into `AgentState`.
* Replace all use of `Default` for `new`.
Deriving or implementing `Default` makes for a secondary instantiation
API. It is useful in some cases, but gets very confusing when there
is both `new` _and_ a `Default`. It's especially devious for derived
values where a reasonable default is not `0`, `false` or `None`.
* Remove feature native_tls, we want only native rustls.
This feature made for very clunky handling throughout the code. From a
security point of view, it's better to stick with one single TLS API.
Rustls recently got an official audit (very positive).
https://github.com/ctz/rustls/tree/master/audit
Rustls deliberately omits support for older, insecure TLS such as TLS
1.1 or RC4. This might be a problem for a user of ureq, but on balance
not considered important enough to keep native_tls.
* Remove auth and support for basic auth.
The API just wasn't enough. A future reintroduction should at least
also provide a `Bearer` mechanism and possibly more.
* Rename jar -> cookie_store
* Rename jar -> cookie_tin
Just make some field names sync up with the type.
* Drop "cookies" as default feature
The need for handling cookies is probably rare, let's not enable it by
default.
* Change all feature checks for "cookie" to "cookies"
The outward facing feature is "cookies" and I think it's better form
that the code uses the official feature name instead of the optional
library "cookies".
* Keep `set` on Agent level as well as AgentBuilder.
The idea is that an auth exchange might result in a header that need
to be set _after_ the agent has been built.
Previously, Agent stored most of its state in one big
Arc<Mutex<AgentState>>. This separates the Arc from the Mutexes.
Now, Agent is a thin wrapper around an Arc<AgentState>. The individual
components that need locking, ConnectionPool and CookieStore, now are
responsible for their own locking.
There were a couple of reasons for this. Internal components that needed
an Agent were often instead carrying around an Arc<Mutex<AgentState>>.
This felt like the components were too intertwined: those other
components shouldn't have to care quite so much about how Agent is
implemented. Also, this led to compromises of convenience: the Proxy on
Agent wound up stored inside the `Arc<Mutex<AgentState>>` even though it
didn't need locking. It was more convenient that way because that was
what Request and Unit had access too.
The other reason to push things down like this is that it can reduce
lock contention. Mutations to the cookie store don't need to lock the
connection pool, and vice versa. This was a secondary concern, since I
haven't actually profiled these things and found them to be a problem,
but it's a happy result of the refactoring.
Now all the components outside of Agent take an Agent instead of
AgentState.
In the process I removed `Agent.cookie()`. Its API was hard to use
correctly, since it didn't distinguish between cookies on different
hosts. And it would have required updates as part of this refactoring.
I'm open to reinstating some similar functionality with a refreshed API.
I kept `Agent.set_cookie`, but updated its method signature to take a
URL as well as a cookie.
Many of ConnectionPool's methods went from `&mut self` to `&self`,
because ConnectionPool is now using interior mutability.
Gets rid of synthetic_error, and makes the various send_* methods return `Result<Response, Error>`.
Introduces a new error type "HTTP", which represents an error due to status codes 4xx or 5xx.
The HTTP error type contains a boxed Response, so users can read the actual response if they want.
Adds an `error_for_status` setting to disable the functionality of treating 4xx and 5xx as errors.
Adds .unwrap() to a lot of tests.
Fixes#128.
CookieJar doesn't support the path-match and domain-match algorithms from [RFC 6265](https://tools.ietf.org/html/rfc6265#section-5.1.3), while cookie_store does.
This fixes some issues with the cookie matching algorithm currently in ureq. For instance,
the domain-match uses substring matching rather than the RFC 6265 algorithm.
This deletes two tests:
match_cookies_returns_nothing_when_no_cookies didn't test much
agent_cookies was failing because cookie_store rejects cookies on the `test:` scheme.
The way around this is to set up a testserver - but it turns out cookies_on_redirect already
does that, and covers the same cases and more.
This changes some cookie-related behavior:
- Cookies could previously be sent to a wrong domain - e.g. a cookie set on `example.com`
could go to `example.com.evil.com` or `evilexample.com`. Probably no one was relying on
this, since it's quite broken.
- A cookie with a path of `/foo` could be sent on a request to `/foobar`, but now it can't.
- Cookies could previously be set on IP addresses, but now they can't.
- Cookies could previously be set for domains other than the one on the request (or its
parents), but now they can't.
- When a cookie had no domain attribute, it would previously get the domain from the
request, and subsequently be sent to that domain and all subdomains. Now, it will only
be sent to that exact domain (host-only).
That last one is probably the most likely to break people, since someone could depend
on it without realizing it was broken behavior.
When running tests locally, this error can surface.
```
---- test::agent_test::custom_resolver stdout ----
thread 'test::agent_test::custom_resolver' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 22, kind: InvalidInput, message: "Invalid argument" }', src/stream.rs:60:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```
The problem is that setting the timeouts might fail, and this is done
in a From trait where there is not possibility to "bubble" the
io::Error.
```
socket.set_read_timeout(None).unwrap();
socket.set_write_timeout(None).unwrap();
```
This commit moves the resetting of timers to an explicit `Stream::reset()` fn
that must be called every time we're unwrapping the inner stream.
Previously we had a special case for BadStatusRead that would happen
only when we got a ConnectionAborted error reading the status line.
However, sometimes we get ConnectionReset instead. Also the HTTP
spec says that idempotent requests may be retried anytime a connection
is closed prematurely.
The change treats as retryable any ConnectionAborted OR ConnectionReset
error while reading the status line and headers. It removes the special
case BadStatusRead error.
Fixes#165 (I think).
In a few places we relied on "localhost" as a default if a URL's host
was not set, but I think it's better to error out in these cases.
In general, there are a few places in Unit that assumed there is a
host as part of the URL. I've made that explicit by doing a check
at the beginning of `connect()`. I've also tried to plumb through
the semantics of "host is always present" by changing the parameter
types of some of the functions that use the hostname.
I considered a more thorough way to express this with types - for
instance implementing an `HttpUrl` struct that embeds a `Url`, and
exports most of the same methods, but guarantees that host is always
present. However, that was more invasive than this so I did a smaller
change to start.
Previously, using `.send()` on `Request` would require to set either the
Transfer-Encoding or the Content-Length header.
In an effort to provide better ergonomics for this library and to avoid
making users fall in a not-so obvious pitfall, the library should build a
valid Request without asking the user to mess around with the headers.
This commit attempts to fix this issue: if the user use `.send()` to
provide an unknown sized reader, the chunked Transfer-Encoding will be
used. Of course, there are prior checks to ensure we do not override the
user wish, like if the user already set a Content-Length or a
Transfer-Encoding.
This commit fix an edge case where the Content-Length would not be
automatically set if the Transfer-Encoding is set but not chunked.
This defines a new trait `Resolver`, which turns an address into a
Vec<SocketAddr>. It also provides an implementation of Resolver for
`Fn(&str)` so it's easy to define simple resolvers with a closure.
Fixes#82
Co-authored-by: Ulrik <ulrikm@spotify.com>
Instead of cloning most of `Request`'s fields individually when
creating a `Unit`, this PR switches to just cloning `Request` and
stuffing it in `Unit`, and changes references to `unit.[field]` to
`unit.req.[field]` where appropriate.
Fixes#155
`Some(AgentState)` seem to be assumed pretty much everywhere. I could not
find any test or piece of code hinting at how `None` should be interpreted,
or even see how a state of `None` could even be constructed.
Co-authored-by: Ulrik <ulrikm@spotify.com>
* Put the crate version in the User-Agent header.
https://tools.ietf.org/html/rfc7231#section-5.5.3
Each product identifier consists of a name and optional version.
product = token ["/" product-version]
product-version = token
* User-Agent of rust-ureq
* Update src/unit.rs
Co-authored-by: Martin Algesten <martin@algesten.se>
PoolKey calls unwrap() on an option that can be None. Specifically, the
local variable `port` can be None when PoolKey is constructed with a Url
whose scheme is unrecognized in url.port_or_known_default().
To fix that, make port an Option. Also, make scheme part of the PoolKey.
This prevents, for instance, a stream opened for `https://example.com:9999`
being reused on a request for `http://example.com:9999`.
Remove the test-only pool.get() accessor. This was used in only one test,
agent_pool in range.rs. This seemed like it was testing the agent more
than it was testing ranges, so I moved it to agent.rs and edited to
remove the range-testing parts.
Also, reject unrecognized URLs earlier in connect_socket so they don't
reach try_get_connection.
This also reverts a change to send_body that was originally added to
return the number of bytes written. It's no longer needed now that we
check the size of the reader in advance.
Fixes#76.
Adds some feature guards, and removes an unnecessary feature guard
around a call to connect_https (there's an implementation available for
non-TLS that returns UnknownScheme).
Also, remove unnecessary agent.state() method that was only available in
TLS builds. The state field is directly accessible within the crate, and
can be used in both TLS and non-TLS builds.
Co-authored-by: Martin Algesten <martin@algesten.se>
This deprecates timeout_read() and timeout_write() in favor of
timeout(). The new timeout method on Request takes a Duration instead
of a number of milliseconds, and is measured against overall request
time, not per-read time.
Once a request is started, the timeout is turned into a deadline
specific to that call. The deadline is used in conjunction with the
new DeadlineStream class, which sets a timeout on each read according
to the remaining time for the request. Once the request is done,
the DeadlineStream is unwrapped via .into::<Stream>() to become
an undecorated Stream again for return to the pool. Timeouts on the
stream are unset at this point.
Still to be done:
Add a setting on Agent for default timeout.
Change header-writing code to apply overall deadline rather than
per-write timeout.
Fixes#28.
Fix up cfg attributes to work on an xor basis.
Previously, the cfg(any()) attributes would cause issues when
both native-tls and tls features were enabled. Now, https functions
and enum variants will only be created when tls xor native-tls are
enabled. Additionally, a compile error has been added for when
both tls and native-tls features are enabled.
This builds on 753d61b. Before we send a request, we can do a 1-byte
nonblocking peek on the connection. If the server has closed the
connection, this will give us an EOF, and we can take the connection out
of the pool before sending any request on it. This will reduce the
likelihood that we send a non-retryable POST on an already-closed
connection.
The server could still potentially close the connection between when we
make this check and when we finish sending the request, but this should
handle the majority of cases.
This is not a perfect solution. It works as long as we are not sending
any body bytes. We discover the error first when attempting to read
the response status line. That means we discover the error after
sending body bytes. To be able to re-send the body, we would need to
introduce a buffer to be able to replay the body on the next
request. We don't currently do that.