## Summary
Follow up to https://github.com/astral-sh/uv/pull/15563
Closes https://github.com/astral-sh/uv/issues/13485
This is a first-pass at adding support for conditional support for Git
LFS between git sources, initial feedback welcome.
e.g.
```
[tool.uv.sources]
test-lfs-repo = { git = "https://github.com/zanieb/test-lfs-repo.git", lfs = true }
```
For context previously a user had to set `UV_GIT_LFS` to have uv fetch
lfs objects on git sources. This env var was all or nothing, meaning you
must always have it set to get consistent behavior and it applied to all
git sources. If you fetched lfs objects at a revision and then turned
off lfs (or vice versa), the git db, corresponding checkout lfs
artifacts would not be updated properly. Similarly, when git source
distributions were built, there would be no distinction between sources
with lfs and without lfs. Hence, it could corrupt the git, sdist, and
archive caches.
In order to support some sources being LFS enabled and other not, this
PR adds a stateful layer roughly similar to how `subdirectory` works but
for `lfs` since the git database, the checkouts and the corresponding
caching layers needed to be LFS aware (requested vs installed). The
caches also had to isolated and treated entirely separate when handling
LFS sources.
Summary
* Adds `lfs = true` or `lfs = false` to git sources in pyproject.toml
* Added `lfs=true` query param / fragments to most relevant url structs
(not parsed as user input)
* In the case of uv add / uv tool, `--lfs` is supported instead
* `UV_GIT_LFS` environment variable support is still functional for
non-project entrypoints (e.g. uv pip)
* `direct-url.json` now has an custom `git_lfs` entry under VcsInfo
(note, this is not in the spec currently -- see caveats).
* git database and checkouts have an different cache key as the sources
should be treated effectively different for the same rev.
* sdists cache also differ in the cache key of a built distribution if
it was built using LFS enabled revisions to distinguish between non-LFS
same revisions. This ensures the strong assumption for archive-v0 that
an unpacked revision "doesn't change sources" stays valid.
Caveats
* `pylock.toml` import support has not been added via git_lfs=true,
going through the spec it wasn't clear to me it's something we'd support
outside of the env var (for now).
* direct-url struct was modified by adding a non-standard `git_lfs`
field under VcsInfo which may be undersirable although the PEP 610 does
say `Additional fields that would be necessary to support such VCS
SHOULD be prefixed with the VCS command name` which could be interpret
this change as ok.
* There will be a slight lockfile and cache churn for users that use
`UV_GIT_LFS` as all git lockfile entries will get a `lfs=true` fragment.
The cache version does not need an update, but LFS sources will get
their own namespace under git-v0 and sdist-v9/git hence a cache-miss
will occur once but this can be sufficient to label this as breaking for
workflows always setting `UV_GIT_LFS`.
## Test Plan
Some initial tests were added. More tests likely to follow as we reach
consensus on a final approach.
For IT test, we may want to move to use a repo under astral namespace in
order to test lfs functionality.
Manual testing was done for common pathological cases like killing LFS
fetch mid-way, uninstalling LFS after installing an sdist with it and
reinstalling, fetching LFS artifacts in different commits, etc.
PSA: Please ignore the docker build failures as its related to depot
OIDC issues.
---------
Co-authored-by: Zanie Blue <contact@zanie.dev>
Co-authored-by: konstin <konstin@mailbox.org>
## Summary
Now that we perform this fast-path in
`crates/uv-distribution/src/source/mod.rs`, I _think_ the fast-path here
is no longer used? In my testing, we only actually took this path when
the fast-path _already_ failed (and thus it would fail again, wasting
time).
Prior to this PR, there were numerous places where uv would leak
credentials in logs. We had a way to mask credentials by calling methods
or a recently-added `redact_url` function, but this was not secure by
default. There were a number of other types (like `GitUrl`) that would
leak credentials on display.
This PR adds a `DisplaySafeUrl` newtype to prevent leaking credentials
when logging by default. It takes a maximalist approach, replacing the
use of `Url` almost everywhere. This includes when first parsing config
files, when storing URLs in types like `GitUrl`, and also when storing
URLs in types that in practice will never contain credentials (like
`DirectorySourceUrl`). The idea is to make it easy for developers to do
the right thing and for the compiler to support this (and to minimize
ever having to manually convert back and forth). Displaying credentials
now requires an active step. Note that despite this maximalist approach,
the use of the newtype should be zero cost.
One conspicuous place this PR does not use `DisplaySafeUrl` is in the
`uv-auth` crate. That would require new clones since there are calls to
`request.url()` that return a `&Url`. One option would have been to make
`DisplaySafeUrl` wrap a `Cow`, but this would lead to lifetime
annotations all over the codebase. I've created a separate PR based on
this one (#13576) that updates `uv-auth` to use `DisplaySafeUrl` with
one new clone. We can discuss the tradeoffs there.
Most of this PR just replaces `Url` with `DisplaySafeUrl`. The core is
`uv_redacted/lib.rs`, where the newtype is implemented. To make it
easier to review the rest, here are some points of note:
* `DisplaySafeUrl` has a `Display` implementation that masks
credentials. Currently, it will still display the username when there is
both a username and password. If we think is the wrong choice, it can
now be changed in one place.
* `DisplaySafeUrl` has a `remove_credentials()` method and also a
`.to_string_with_credentials()` method. This allows us to use it in a
variety of scenarios.
* `IndexUrl::redacted()` was renamed to
`IndexUrl::removed_credentials()` to make it clearer that we are not
masking.
* We convert from a `DisplaySafeUrl` to a `Url` when calling `reqwest`
methods like `.get()` and `.head()`.
* We convert from a `DisplaySafeUrl` to a `Url` when creating a
`uv_auth::Index`. That is because, as mentioned above, I will be
updating the `uv_auth` crate to use this newtype in a separate PR.
* A number of tests (e.g., in `pip_install.rs`) that formerly used
filters to mask tokens in the test output no longer need those filters
since tokens in URLs are now masked automatically.
* The one place we are still knowingly writing credentials to
`pyproject.toml` is when a URL with credentials is passed to `uv add`
with `--raw`. Since displaying credentials is no longer automatic, I
have added a `to_string_with_credentials()` method to the `Pep508Url`
trait. This is used when `--raw` is passed. Adding it to that trait is a
bit weird, but it's the simplest way to achieve the goal. I'm open to
suggestions on how to improve this, but note that because of the way
we're using generic bounds, it's not as simple as just creating a
separate trait for that method.
This PR redacts credentials in displayed URLs.
It mostly relies on a `redacted_url` function (and where possible
`IndexUrl::redacted`). This is a quick way to prevent leaked credentials
but it's prone to programmer error when adding new trace statements. A
better follow-on would use a `RedactedUrl` type with the appropriate
`Display` implementation. This would allow us to still extract
credentials from the URL while displaying it securely. On the plus side,
the sites where the `redacted_url` function are used serve as easy
signposts for where to use the new type in a future PR.
Closes#1714.
## Summary
closes#12234
[fetch_with_cli](e0f81f0d4a/crates/uv-git/src/git.rs (L573))
doesn't respect the registry client's [connectivity
setting](e0f81f0d4a/crates/uv-client/src/registry_client.rs (L1009))
- this pr updates `fetch_with_cli` to set `GIT_ALLOW_PROTOCOL=file` when
the client's connectivity setting is `Connectivity::Offline`
## Test Plan
E2E
```sh
cargo run add "pycurl @ git+https://github.com/pycurl/pycurl.git" --directory ~/src/offline-test/ --offline
```
```sh
Compiling uv-cli v0.0.1 (/Users/justinchapman/src/uv/crates/uv-cli)
Compiling uv v0.6.11 (/Users/justinchapman/src/uv/crates/uv)
Finished `dev` profile [unoptimized + debuginfo] target(s) in 4.47s
Running `target/debug/uv add 'pycurl @ git+https://github.com/pycurl/pycurl.git' --directory /Users/justinchapman/src/offline-test/ --offline`
Updating https://github.com/pycurl/pycurl.git (HEAD) × Failed to download and build `pycurl @ git+https://github.com/pycurl/pycurl.git`
├─▶ Git operation failed
├─▶ failed to fetch into: /Users/justinchapman/.cache/uv/git-v0/db/9a596e5213c3162d
╰─▶ process didn't exit successfully: `/usr/bin/git fetch --force --update-head-ok 'https://github.com/pycurl/pycurl.git' '+HEAD:refs/remotes/origin/HEAD'` (exit status: 128)
--- stderr
fatal: transport 'https' not allowed
help: If you want to add the package regardless of the failed resolution, provide the `--frozen` flag to skip locking and syncing.
```
---------
Co-authored-by: Zanie Blue <contact@zanie.dev>
## Summary
* Upgrade the rust toolchain to 1.85.0. This does not increase the MSRV.
* Update windows trampoline to 1.86 nightly beta (previously in 1.85
nightly beta).
## Test Plan
Existing tests
We want to build `uv-build` without depending on the network crates. In
preparation for that, we split uv-git into uv-git and uv-git-types,
where only uv-git depends on reqwest, so that uv-build can use
uv-git-types.
When stderr is not a tty, we currently don't show any messages for build
or large downloads, since indicatif is hidden. We can improve this by
showing a message for:
* Starting and finishing a large download (>1MB)
* Starting and finishing a build
Downloads are limited to 1MB or unknown size to keep the logs concise
and not scroll the entire terminal away for a download that finishes
almost immediately.
These messages are not captured in the tests since their order is
non-deterministic (downloads and builds race to finish).
There are no "tick" messages for large downloads yet, we could e.g. show
an update on runnning downloads every n seconds.
Part of #11121
**Test Plan**
```
$ uv venv && FORCE_COLOR=1 cargo run -q pip install numpy --no-binary :all: --no-cache 2>&1 | tee a.txt
Using CPython 3.13.0
Creating virtual environment at: .venv
Activate with: source .venv/bin/activate
Resolved 1 package in 221ms
Building numpy==2.2.2
Built numpy==2.2.2
Prepared 1 package in 2m 34s
Installed 1 package in 6ms
+ numpy==2.2.2
```

```
$ uv venv && FORCE_COLOR=1 cargo run -q pip install torch --no-cache 2>&1 | tee b.txt
Using CPython 3.13.0
Creating virtual environment at: .venv
Activate with: source .venv/bin/activate
Resolved 24 packages in 648ms
Downloading setuptools (1.2MiB)
Downloading nvidia-cuda-cupti-cu12 (13.2MiB)
Downloading torch (731.1MiB)
Downloading nvidia-nvjitlink-cu12 (20.1MiB)
Downloading nvidia-cufft-cu12 (201.7MiB)
Downloading nvidia-cuda-nvrtc-cu12 (23.5MiB)
Downloading nvidia-curand-cu12 (53.7MiB)
Downloading nvidia-nccl-cu12 (179.9MiB)
Downloading nvidia-cudnn-cu12 (634.0MiB)
Downloading nvidia-cublas-cu12 (346.6MiB)
Downloading sympy (5.9MiB)
Downloading nvidia-cusparse-cu12 (197.8MiB)
Downloading nvidia-cusparselt-cu12 (143.1MiB)
Downloading networkx (1.6MiB)
Downloading nvidia-cusolver-cu12 (122.0MiB)
Downloading triton (241.4MiB)
Downloaded setuptools
Downloaded networkx
Downloaded sympy
Downloaded nvidia-cuda-cupti-cu12
Downloaded nvidia-nvjitlink-cu12
Downloaded nvidia-cuda-nvrtc-cu12
Downloaded nvidia-curand-cu12
[...]
```

## Summary
Sort of undecided on this. These are already stored as `dyn Reporter` in
each struct, so we're already using dynamic dispatch in that sense. But
all the methods take `impl Reporter`. This is sometimes nice (the
callsites are simpler?), but it also means that in practice, you often
_can't_ pass `None` to these methods that accept `Option<impl
Reporter>`, because Rust can't infer the generic type.
Anyway, this adds more consistency and simplifies the setup by using
`Arc<dyn Reporter>` everywhere.
## Summary
This is part of making
https://github.com/astral-sh/uv/issues/7299#issuecomment-2385286341
better. You can now use `tool.uv.dependency-metadata` for direct URL
requirements. Unfortunately, you _must_ include a version, since we need
one to perform resolution.
## Summary
We retain them if you use `--raw-sources`, but otherwise they're
removed. We still respect them in the subsequent `uv.lock` via an
in-process store.
Closes#6056.
## Summary
It turns out that the Git fetch implementation is initializing its own
client, which can be really expensive on macOS (due to loading native
certificates) _and_ bypasses any of our middleware. This PR modifies the
Git implementation to accept a shared client.
## Summary
We currently rely on libgit2 for most git-related functionality.
However, libgit2 has long-standing performance issues, as well as lags
significantly behind git in terms of new features. For these reasons we
now use the git CLI by default for fetching repositories
(https://github.com/astral-sh/uv/pull/1781). This PR completely drops
libgit2 in favor of the git CLI for all git-related functionality, which
should allow us to use features such as partial clones and sparse
checkouts in the future for performance.
There is also a lot of technical debt in the current git code as it's
mostly taken from Cargo. Switching to the git CLI *vastly* simplifies
the `uv-git` codebase.
Eventually we might want to look into switching to
[`gitoxide`](https://github.com/Byron/gitoxide), but it's currently too
immature for our use case.
The sync scenarios script is broken, so i did the updates manually
```
$ ./scripts/sync_scenarios.sh
Setting up a temporary environment...
Using Python 3.12.1 interpreter at: /home/konsti/projects/uv/.venv/bin/python3
Creating virtualenv at: .venv
Activate with: source .venv/bin/activate
× No solution found when resolving dependencies:
╰─▶ Because docutils==0.21.post1 is unusable because the package metadata was inconsistent and you require docutils==0.21.post1, we can conclude that the requirements are unsatisfiable.
hint: Metadata for docutils==0.21.post1 was inconsistent:
Package metadata version `0.21` does not match given version `0.21.post1`
```
---------
Co-authored-by: Zanie Blue <contact@zanie.dev>
## Summary
I noticed in #2769 that I was now stripping `.git` suffixes from Git
URLs after resolving to a precise commit. This PR cleans up the internal
caching to use a better canonical representation: a `RepositoryUrl`
along with a `GitReference`, instead of a `GitUrl` which can contain
non-canonical data. This gives us both better fidelity (preserving the
`.git`, along with any casing that the user provided when defining the
URL) and is overall cleaner and more robust.
Closes https://github.com/astral-sh/uv/issues/1775
Closes https://github.com/astral-sh/uv/issues/1452
Closes https://github.com/astral-sh/uv/issues/1514
Follows https://github.com/astral-sh/uv/pull/1717
libgit2 does not support host names with extra identifiers during SSH
lookup (e.g. [`github.com-some_identifier`](
https://docs.github.com/en/authentication/connecting-to-github-with-ssh/managing-deploy-keys#using-multiple-repositories-on-one-server))
so we use the `git` command instead for fetching. This is required for
`pip` parity.
See the [Cargo
documentation](https://doc.rust-lang.org/nightly/cargo/reference/config.html#netgit-fetch-with-cli)
for more details on using the `git` CLI instead of libgit2. We may want
to try to use libgit2 first in the future, as it is more performant
(#1786).
We now support authentication with:
```
git+ssh://git@<hostname>/...
git+ssh://git@<hostname>-<identifier>/...
```
Tested with a deploy key e.g.
```
cargo run -- \
pip install uv-private-pypackage@git+ssh://git@github.com-test-uv-private-pypackage/astral-test/uv-private-pypackage.git \
--reinstall --no-cache -v
```
and
```
cargo run -- \
pip install uv-private-pypackage@git+ssh://git@github.com/astral-test/uv-private-pypackage.git \
--reinstall --no-cache -v
```
with a ssh config like
```
Host github.com
Hostname github.com
IdentityFile=/Users/mz/.ssh/id_ed25519
Host github.com-test-uv-private-pypackage
Hostname github.com
IdentityFile=/Users/mz/.ssh/id_ed25519
```
It seems quite hard to add test coverage for this to the test suite, as
we'd need to add the SSH key and I don't know how to isolate that from
affecting other developer's machines.
First, replace all usages in files in-place. I used my editor for this.
If someone wants to add a one-liner that'd be fun.
Then, update directory and file names:
```
# Run twice for nested directories
find . -type d -print0 | xargs -0 rename s/puffin/uv/g
find . -type d -print0 | xargs -0 rename s/puffin/uv/g
# Update files
find . -type f -print0 | xargs -0 rename s/puffin/uv/g
```
Then add all the files again
```
# Add all the files again
git add crates
git add python/uv
# This one needs a force-add
git add -f crates/uv-trampoline
```