When a process is running and another calls `uv cache clean` or `uv
cache prune` we currently deadlock - sometimes until the CI timeout
(https://github.com/astral-sh/setup-uv/issues/588). To avoid this, we
add a default 5 min timeout waiting for a lock. 5 min balances allowing
in-progress builds to finish, especially with larger native
dependencies, while also giving timely errors for deadlocks on (remote)
systems.
Commit 1 is a refactoring.
This branch also fixes a problem with the logging where acquired and
released resources currently mismatch:
```
DEBUG Acquired lock for `https://github.com/tqdm/tqdm`
DEBUG Using existing Git source `https://github.com/tqdm/tqdm`
DEBUG Released lock at `C:\Users\Konsti\AppData\Local\uv\cache\git-v0\locks\16bb813afef8edd2`
```
Alternative to #15105
Instead of building a `BaseClientBuilder` from `NetworkSettings` each
time we need a client, we instead build a single `BaseClientBuilder` and
pass it around. The `RegistryClientBuilder` then uses
`BaseClientBuilder` exclusively for configuration. This removes a chunk
of copy-and-paste code, and also moves the fallible `retries_from_env`
into a single place
Borrow vs. clone is mostly ad-hoc, we can change it in either direction
if it matters.
Closes#15105
Prior to this PR, there were numerous places where uv would leak
credentials in logs. We had a way to mask credentials by calling methods
or a recently-added `redact_url` function, but this was not secure by
default. There were a number of other types (like `GitUrl`) that would
leak credentials on display.
This PR adds a `DisplaySafeUrl` newtype to prevent leaking credentials
when logging by default. It takes a maximalist approach, replacing the
use of `Url` almost everywhere. This includes when first parsing config
files, when storing URLs in types like `GitUrl`, and also when storing
URLs in types that in practice will never contain credentials (like
`DirectorySourceUrl`). The idea is to make it easy for developers to do
the right thing and for the compiler to support this (and to minimize
ever having to manually convert back and forth). Displaying credentials
now requires an active step. Note that despite this maximalist approach,
the use of the newtype should be zero cost.
One conspicuous place this PR does not use `DisplaySafeUrl` is in the
`uv-auth` crate. That would require new clones since there are calls to
`request.url()` that return a `&Url`. One option would have been to make
`DisplaySafeUrl` wrap a `Cow`, but this would lead to lifetime
annotations all over the codebase. I've created a separate PR based on
this one (#13576) that updates `uv-auth` to use `DisplaySafeUrl` with
one new clone. We can discuss the tradeoffs there.
Most of this PR just replaces `Url` with `DisplaySafeUrl`. The core is
`uv_redacted/lib.rs`, where the newtype is implemented. To make it
easier to review the rest, here are some points of note:
* `DisplaySafeUrl` has a `Display` implementation that masks
credentials. Currently, it will still display the username when there is
both a username and password. If we think is the wrong choice, it can
now be changed in one place.
* `DisplaySafeUrl` has a `remove_credentials()` method and also a
`.to_string_with_credentials()` method. This allows us to use it in a
variety of scenarios.
* `IndexUrl::redacted()` was renamed to
`IndexUrl::removed_credentials()` to make it clearer that we are not
masking.
* We convert from a `DisplaySafeUrl` to a `Url` when calling `reqwest`
methods like `.get()` and `.head()`.
* We convert from a `DisplaySafeUrl` to a `Url` when creating a
`uv_auth::Index`. That is because, as mentioned above, I will be
updating the `uv_auth` crate to use this newtype in a separate PR.
* A number of tests (e.g., in `pip_install.rs`) that formerly used
filters to mask tokens in the test output no longer need those filters
since tokens in URLs are now masked automatically.
* The one place we are still knowingly writing credentials to
`pyproject.toml` is when a URL with credentials is passed to `uv add`
with `--raw`. Since displaying credentials is no longer automatic, I
have added a `to_string_with_credentials()` method to the `Pep508Url`
trait. This is used when `--raw` is passed. Adding it to that trait is a
bit weird, but it's the simplest way to achieve the goal. I'm open to
suggestions on how to improve this, but note that because of the way
we're using generic bounds, it's not as simple as just creating a
separate trait for that method.
As per
https://matklad.github.io/2021/02/27/delete-cargo-integration-tests.html
Before that, there were 91 separate integration tests binary.
(As discussed on Discord — I've done the `uv` crate, there's still a few
more commits coming before this is mergeable, and I want to see how it
performs in CI and locally).
## Summary
We now track the discovered `IndexCapabilities` for each `IndexUrl`. If
we learn that an index doesn't support range requests, we avoid doing
any batch prefetching.
Closes https://github.com/astral-sh/uv/issues/7221.
## Summary
This PR revives https://github.com/astral-sh/uv/pull/4944, which I think
was a good start towards adding `--trusted-host`. Last night, I tried to
add `--trusted-host` with a custom verifier, but we had to vendor a lot
of `reqwest` code and I eventually hit some private APIs. I'm not
confident that I can implement it correctly with that mechanism, and
since this is security, correctness is the priority.
So, instead, we now use two clients and multiplex between them.
Closes https://github.com/astral-sh/uv/issues/1339.
## Test Plan
Created self-signed certificate, and ran `python3 -m http.server --bind
127.0.0.1 4443 --directory . --certfile cert.pem --keyfile key.pem` from
the packse index directory.
Verified that `cargo run pip install
transitive-yanked-and-unyanked-dependency-a-0abad3b6 --index-url
https://127.0.0.1:8443/simple-html` failed with:
```
error: Request failed after 3 retries
Caused by: error sending request for url (https://127.0.0.1:8443/simple-html/transitive-yanked-and-unyanked-dependency-a-0abad3b6/)
Caused by: client error (Connect)
Caused by: invalid peer certificate: Other(OtherError(CaUsedAsEndEntity))
```
Verified that `cargo run pip install
transitive-yanked-and-unyanked-dependency-a-0abad3b6 --index-url
'https://127.0.0.1:8443/simple-html' --trusted-host '127.0.0.1:8443'`
failed with the expected error (invalid resolution) and made valid
requests.
Verified that `cargo run pip install
transitive-yanked-and-unyanked-dependency-a-0abad3b6 --index-url
'https://127.0.0.1:8443/simple-html' --trusted-host '127.0.0.2' -n` also
failed.
We now use the getters and setters everywhere.
There were some places where we wanted to build a `MarkerEnvironment`
out of whole cloth, usually in tests. To facilitate those use cases, we
add a `MarkerEnvironmentBuilder` that provides a convenient constructor.
It's basically like a `MarkerEnvironment::new`, but with named
parameters. That's useful here because there are so many fields (and
they many have the same type).
In #2976 I made some changes that led to regressions:
- We stopped tracking URLs that we had not seen credentials for in the
cache
- This means the cache no longer returns a value to indicate we've seen
a realm before
- We stopped seeding the cache with URLs
- Combined with the above, this means we no longer had a list of
locations that we would never attempt to fetch credentials for
- We added caching of credentials found on requests
- Previously the cache was only populated from the seed or credentials
found in the netrc or keyring
- This meant that the cache was populated for locations that we
previously did not cache, i.e. GitHub artifacts(?)
Unfortunately this unveiled problems with the granularity of our cache.
We cache credentials per realm (roughly the hostname) but some realms
have mixed authentication modes i.e. different credentials per URL or
URLs that do not require credentials. Applying credentials to a URL that
does not require it can lead to a failed request, as seen in #3123 where
GitHub throws a 401 when receiving credentials.
To resolve this, the cache is expanded to supporting caching at two
levels:
- URL, cached URL must be a prefix of the request URL
- Realm, exact match required
When we don't have URL-level credentials cached, we attempt the request
without authentication first. On failure, we'll search for realm-level
credentials or fetch credentials from external services. This avoids
providing credentials to new URLs unless we know we need them.
Closes https://github.com/astral-sh/uv/issues/3123
<!--
Thank you for contributing to uv! To help us out with reviewing, please
consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Closes#2564
## Test Plan
1. Changed existing linehaul tests to leverage insta.
2. Ran tests in various linux distros (Debian, Ubuntu, Centos, Fedora,
Alpine) to ensure they also pass locally again.
---------
Co-authored-by: konstin <konstin@mailbox.org>
This test was introduced in 42973cd9cb. It
looks like it compares some values against some platform specific code
that attempts to find the OS version. But the comparisons made some
assumptions about what kind of data is available. In this commit, we try
to make the test a little more flexible on Linux by not assuming that
`Option` values are `Some`.
## Summary
Closes#1958
This adds linehaul metadata to uv's user-agent when pep 508 markers are
provided to the RegistryClientBuilder. Thanks to #2381, we were able to
leverage most information from markers and avoid inconsistency.
Linehaul is meant to be accompanying metadata pip sends in it's user
agent when talking to registries. You can see this output by running
something like `python -c 'from pip._internal.network.session import
user_agent; print(user_agent())'`.
In PyPI, this metadata processed by the
[linehaul-cloud-function](https://github.com/pypi/linehaul-cloud-function).
More info about linehaul can be found in #1958.
Below are some examples from pip:
* Linux GHA: `pip/24.0
{"ci":true,"cpu":"x86_64","distro":{"id":"jammy","libc":{"lib":"glibc","version":"2.35"},"name":"Ubuntu","version":"22.04"},"implementation":{"name":"CPython","version":"3.12.2"},"installer":{"name":"pip","version":"24.0"},"openssl_version":"OpenSSL
3.0.2 15 Mar
2022","python":"3.12.2","rustc_version":"1.76.0","system":{"name":"Linux","release":"6.5.0-1016-azure"}}`
* Windows GHA: `pip/24.0
{"ci":true,"cpu":"AMD64","implementation":{"name":"CPython","version":"3.12.2"},"installer":{"name":"pip","version":"24.0"},"openssl_version":"OpenSSL
3.0.13 30 Jan
2024","python":"3.12.2","rustc_version":"1.76.0","system":{"name":"Windows","release":"2022Server"}}`
* OSX GHA: `pip/24.0
{"ci":true,"cpu":"arm64","distro":{"name":"macOS","version":"14.2.1"},"implementation":{"name":"CPython","version":"3.12.2"},"installer":{"name":"pip","version":"24.0"},"openssl_version":"OpenSSL
3.0.13 30 Jan
2024","python":"3.12.2","rustc_version":"1.76.0","system":{"name":"Darwin","release":"23.2.0"}}`
Here's how uv results look like (sorry for the keys not having the same
order):
* Linux GHA: `uv/0.1.21
{"installer":{"name":"uv","version":"0.1.21"},"python":"3.12.2","implementation":{"name":"CPython","version":"3.12.2"},"distro":{"name":"Ubuntu","version":"22.04","id":"jammy","libc":null},"system":{"name":"Linux","release":"6.5.0-1016-azure"},"cpu":"x86_64","openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}`
* Windows GHA: `uv/0.1.21
{"installer":{"name":"uv","version":"0.1.21"},"python":"3.12.2","implementation":{"name":"CPython","version":"3.12.2"},"distro":null,"system":{"name":"Windows","release":"2022Server"},"cpu":"AMD64","openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}`
* OSX GHA: `uv/0.1.21
{"installer":{"name":"uv","version":"0.1.21"},"python":"3.12.2","implementation":{"name":"CPython","version":"3.12.2"},"distro":{"name":"macOS","version":"14.2.1","id":null,"libc":null},"system":{"name":"Darwin","release":"23.2.0"},"cpu":"arm64","openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}`
Distro information (such as the one pip uses `from pip._vendor import
distro` to retrieve instead of `platform` module) was not retrieved from
markers. Instead, the linux release codename/name/version uses
`sys-info` crate, adding about 50us of extra overhead on linux. The
distro osx version re-used the [mac_os version
implementation](99c992e38b/crates/platform-host/src/mac_os.rs)
from #2381 which adds about 20us of overhead on osx. I tried to use
other crates to avoid re-introducing `mac_os.rs` but most of them didn't
yield satisfactory performance (40ms-60ms~) or had the wrong values
needed (e.g. darwin version vs osx version).
I also didn't add libc retrieval or rustc retrieval as those seem to add
substantial overhead due to querying `ldd` or `rustc`. PyPy version
detection was also not added to avoid adding extra overhead to [support
PyPy for
linehaul](https://github.com/pypa/pip/blob/24.0/src/pip/_internal/network/session.py#L123).
All other behavior was kept 1-1 to match what pip's linehaul
implementation does (as of 24.0). This also aligns with what was
discussed in #1958.
## Test Plan
Added new integration test to uv-client.
---------
Co-authored-by: konstin <konstin@mailbox.org>
## Summary
Add netrc support to the uv-client.
closes#1405
## Test Plan
I've added a corresponding test case to validate the correct header.
Furthermore a tested it against a real world private repository.
## Summary
Closes#1977
This allows us to send uv's version in the `uv-client` User Agent
header.
Here's how request headers look like to a server now:
```
...
Accept: application/vnd.pypi.simple.v1+json, application/vnd.pypi.simple.v1+html;q=0.2, text/html;q=0.01
User-Agent: uv/0.1.13
...
```
~~I went for a mix of Option 1 and 2 from #1977.~~ Open to alternative
naming as well, not tied too strongly here to the names picked.
~~Another possibility for this new crate is that we can use it to
consolidate metadata that exists across crates to ultimately be able to
create linehaul information described in #1958, but I haven't looked
into what those changes might look like.~~
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
<!-- How was it tested? -->
Added initial tests in the new crate to exercise its public API and
added a new test to uv-client to validate the headers using a 1-time
disposable server.
First, replace all usages in files in-place. I used my editor for this.
If someone wants to add a one-liner that'd be fun.
Then, update directory and file names:
```
# Run twice for nested directories
find . -type d -print0 | xargs -0 rename s/puffin/uv/g
find . -type d -print0 | xargs -0 rename s/puffin/uv/g
# Update files
find . -type f -print0 | xargs -0 rename s/puffin/uv/g
```
Then add all the files again
```
# Add all the files again
git add crates
git add python/uv
# This one needs a force-add
git add -f crates/uv-trampoline
```