When a process is running and another calls `uv cache clean` or `uv
cache prune` we currently deadlock - sometimes until the CI timeout
(https://github.com/astral-sh/setup-uv/issues/588). To avoid this, we
add a default 5 min timeout waiting for a lock. 5 min balances allowing
in-progress builds to finish, especially with larger native
dependencies, while also giving timely errors for deadlocks on (remote)
systems.
Commit 1 is a refactoring.
This branch also fixes a problem with the logging where acquired and
released resources currently mismatch:
```
DEBUG Acquired lock for `https://github.com/tqdm/tqdm`
DEBUG Using existing Git source `https://github.com/tqdm/tqdm`
DEBUG Released lock at `C:\Users\Konsti\AppData\Local\uv\cache\git-v0\locks\16bb813afef8edd2`
```
## Summary
Follow up to https://github.com/astral-sh/uv/pull/15563
Closes https://github.com/astral-sh/uv/issues/13485
This is a first-pass at adding support for conditional support for Git
LFS between git sources, initial feedback welcome.
e.g.
```
[tool.uv.sources]
test-lfs-repo = { git = "https://github.com/zanieb/test-lfs-repo.git", lfs = true }
```
For context previously a user had to set `UV_GIT_LFS` to have uv fetch
lfs objects on git sources. This env var was all or nothing, meaning you
must always have it set to get consistent behavior and it applied to all
git sources. If you fetched lfs objects at a revision and then turned
off lfs (or vice versa), the git db, corresponding checkout lfs
artifacts would not be updated properly. Similarly, when git source
distributions were built, there would be no distinction between sources
with lfs and without lfs. Hence, it could corrupt the git, sdist, and
archive caches.
In order to support some sources being LFS enabled and other not, this
PR adds a stateful layer roughly similar to how `subdirectory` works but
for `lfs` since the git database, the checkouts and the corresponding
caching layers needed to be LFS aware (requested vs installed). The
caches also had to isolated and treated entirely separate when handling
LFS sources.
Summary
* Adds `lfs = true` or `lfs = false` to git sources in pyproject.toml
* Added `lfs=true` query param / fragments to most relevant url structs
(not parsed as user input)
* In the case of uv add / uv tool, `--lfs` is supported instead
* `UV_GIT_LFS` environment variable support is still functional for
non-project entrypoints (e.g. uv pip)
* `direct-url.json` now has an custom `git_lfs` entry under VcsInfo
(note, this is not in the spec currently -- see caveats).
* git database and checkouts have an different cache key as the sources
should be treated effectively different for the same rev.
* sdists cache also differ in the cache key of a built distribution if
it was built using LFS enabled revisions to distinguish between non-LFS
same revisions. This ensures the strong assumption for archive-v0 that
an unpacked revision "doesn't change sources" stays valid.
Caveats
* `pylock.toml` import support has not been added via git_lfs=true,
going through the spec it wasn't clear to me it's something we'd support
outside of the env var (for now).
* direct-url struct was modified by adding a non-standard `git_lfs`
field under VcsInfo which may be undersirable although the PEP 610 does
say `Additional fields that would be necessary to support such VCS
SHOULD be prefixed with the VCS command name` which could be interpret
this change as ok.
* There will be a slight lockfile and cache churn for users that use
`UV_GIT_LFS` as all git lockfile entries will get a `lfs=true` fragment.
The cache version does not need an update, but LFS sources will get
their own namespace under git-v0 and sdist-v9/git hence a cache-miss
will occur once but this can be sufficient to label this as breaking for
workflows always setting `UV_GIT_LFS`.
## Test Plan
Some initial tests were added. More tests likely to follow as we reach
consensus on a final approach.
For IT test, we may want to move to use a repo under astral namespace in
order to test lfs functionality.
Manual testing was done for common pathological cases like killing LFS
fetch mid-way, uninstalling LFS after installing an sdist with it and
reinstalling, fetching LFS artifacts in different commits, etc.
PSA: Please ignore the docker build failures as its related to depot
OIDC issues.
---------
Co-authored-by: Zanie Blue <contact@zanie.dev>
Co-authored-by: konstin <konstin@mailbox.org>
Part of https://github.com/astral-sh/uv/issues/4392
We shouldn't link to PyPI, and dropping the workspace-level
documentation link should mean that we get the auto-generated `docs.rs`
links.
Currently, `uv cache clean` and `uv cache prune` can cause crashes in
other uv processes running in parallel by removing their in-use files.
We can solve this by using a shared (read) lock on the cache directory,
while the `uv cache` operations use an exclusive (write) lock. The
drawback is that this is always one extra lock, and that we assume that
all platforms support shared locks.
Once Rust 1.89 fulfills our N-2 policy, we can add support for these
methods in fs_err and switch to
https://doc.rust-lang.org/std/fs/struct.File.html#platform-specific-behavior-2.
**Test Plan**
Open one terminal, run:
```
uv venv -c -p 3.13
UV_CACHE_DIR=cache uv cache clean
UV_CACHE_DIR=cache uv pip install numpy==2.0.0
```
Open another terminal, run:
```
UV_CACHE_DIR=cache uv cache clean
```
Fixes#15704
Part of #13883
## Summary
This PR adds support for the `application/vnd.pyx.simple.v1` content
type, similar to `application/vnd.pypi.simple.v1` with the exception
that it can also include core metadata for package-versions directly.
As a frontend to Ruff's formatter.
There are some interesting choices here, some of which may just be
temporary:
1. We pin a default version of Ruff, so `uv format` is stable for a
given uv version
2. We install Ruff from GitHub instead of PyPI, which means we don't
need a Python interpreter or environment
3. We do not read the Ruff version from the dependency tree
See https://github.com/astral-sh/ruff/pull/19665 for a prototype of the
LSP integration.
## Summary
Make the use of `Self` consistent. Mostly done by running `cargo clippy
--fix -- -A clippy::all -W clippy::use_self`.
## Test Plan
<!-- How was it tested? -->
No need.
Adds a cache bucket for Python installs and uses it by default during
tests, extending the opt-in cache added in
https://github.com/astral-sh/uv/pull/12175
Updates the `python_install` tests to use a shared cache for Python
installs. This reduces the `python_install` test runtime on my machine
from 23s -> 17s. The difference should be much larger on machines with
slower internet and less cores for test workers :) This should also
improve stability in CI by reducing reliance on the network during test
runs, see #14327
## Summary
This should reduce the number of filesystem operations fairly
dramatically:
- Only query actual symlinks.
- Don't recurse into package bodies (huge).
- Only traverse once (rather than twice).
Closes https://github.com/astral-sh/uv/issues/13667.
Prior to this PR, there were numerous places where uv would leak
credentials in logs. We had a way to mask credentials by calling methods
or a recently-added `redact_url` function, but this was not secure by
default. There were a number of other types (like `GitUrl`) that would
leak credentials on display.
This PR adds a `DisplaySafeUrl` newtype to prevent leaking credentials
when logging by default. It takes a maximalist approach, replacing the
use of `Url` almost everywhere. This includes when first parsing config
files, when storing URLs in types like `GitUrl`, and also when storing
URLs in types that in practice will never contain credentials (like
`DirectorySourceUrl`). The idea is to make it easy for developers to do
the right thing and for the compiler to support this (and to minimize
ever having to manually convert back and forth). Displaying credentials
now requires an active step. Note that despite this maximalist approach,
the use of the newtype should be zero cost.
One conspicuous place this PR does not use `DisplaySafeUrl` is in the
`uv-auth` crate. That would require new clones since there are calls to
`request.url()` that return a `&Url`. One option would have been to make
`DisplaySafeUrl` wrap a `Cow`, but this would lead to lifetime
annotations all over the codebase. I've created a separate PR based on
this one (#13576) that updates `uv-auth` to use `DisplaySafeUrl` with
one new clone. We can discuss the tradeoffs there.
Most of this PR just replaces `Url` with `DisplaySafeUrl`. The core is
`uv_redacted/lib.rs`, where the newtype is implemented. To make it
easier to review the rest, here are some points of note:
* `DisplaySafeUrl` has a `Display` implementation that masks
credentials. Currently, it will still display the username when there is
both a username and password. If we think is the wrong choice, it can
now be changed in one place.
* `DisplaySafeUrl` has a `remove_credentials()` method and also a
`.to_string_with_credentials()` method. This allows us to use it in a
variety of scenarios.
* `IndexUrl::redacted()` was renamed to
`IndexUrl::removed_credentials()` to make it clearer that we are not
masking.
* We convert from a `DisplaySafeUrl` to a `Url` when calling `reqwest`
methods like `.get()` and `.head()`.
* We convert from a `DisplaySafeUrl` to a `Url` when creating a
`uv_auth::Index`. That is because, as mentioned above, I will be
updating the `uv_auth` crate to use this newtype in a separate PR.
* A number of tests (e.g., in `pip_install.rs`) that formerly used
filters to mask tokens in the test output no longer need those filters
since tokens in URLs are now masked automatically.
* The one place we are still knowingly writing credentials to
`pyproject.toml` is when a URL with credentials is passed to `uv add`
with `--raw`. Since displaying credentials is no longer automatic, I
have added a `to_string_with_credentials()` method to the `Pep508Url`
trait. This is used when `--raw` is passed. Adding it to that trait is a
bit weird, but it's the simplest way to achieve the goal. I'm open to
suggestions on how to improve this, but note that because of the way
we're using generic bounds, it's not as simple as just creating a
separate trait for that method.
## Summary
If a set of wheel tags includes a dot, this code is treating the part
_after_ the dot as an extension, and thereby failing to detect that the
entry is a symlink to an archive (and thereby removing the archive).
This is all an optimization, so this code just makes it a little
targeted: we skip specific known extensions, rather than anything with
any extension.
Closes https://github.com/astral-sh/uv/issues/13270.
In #13302, there was an IO error without context. This error seems to be
caused by a symlink error. Switching as symlinking to `fs_err` ensures
these errors will carry context in the future.
Part of #11834
Currently, all Python installation are a streaming download-and-extract.
With this PR, we add the `UV_PYTHON_CACHE_DIR` variable. When set, the
installation is split into downloading the interpreter into
`UV_PYTHON_CACHE_DIR` and extracting it there from a second step. If the
archive is already present in `UV_PYTHON_CACHE_DIR`, we skip the
download.
The feature can be used to speed up tests and CI. Locally for me, `cargo
test -p uv -- python_install` goes from 43s to 7s (1,7s in release mode)
when setting `UV_PYTHON_CACHE_DIR`. It can also be used for offline
installation of Python interpreter, by copying the archives to a
directory in the offline machine, while the path rewriting is still
performed on the target machine on installation.
## Summary
I don't know if I actually want to commit this, but I did it on the
plane last time and just polished it off (got it to compile) while
waiting to board.
## Summary
This ended up being more involved than expected. The gist is that we
setup all the packages we want to reinstall upfront (they're passed in
on the command-line); but at that point, we don't have names for all the
packages that the user has specified. (Consider, e.g., `uv pip install
.` -- we don't have a name for `.`, so we can't add it to the list of
`Reinstall` packages.)
Now, `Reinstall` also accepts paths, so we can augment `Reinstall` based
on the user-provided paths.
Closes#12038.
<!--
Thank you for contributing to uv! To help us out with reviewing, please
consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Closes#2410
<!-- What's the purpose of the change? What does it do, and why? -->
This changes the name of files in `wheels` bucket to use a hash instead
of the wheel name as to not exceed maximum file length limit on various
systems.
This only addresses the primary concern of #2410. It still does _not_
address:
- Path limit of 260 on windows:
https://github.com/astral-sh/uv/issues/2410#issuecomment-2062020882
To solve this we need to opt-in to longer path limits on windows
([ref](https://github.com/astral-sh/uv/issues/2410#issuecomment-2150532658)),
but I think that is a separate issue and should be a separate MR.
- Exceeding filename limit while building a wheel from source
distribution
As per my understanding, this is out of uv's control. Name of the output
wheel will be decided by build-backend used by the project. For wheels
built from source distribution, pip also uses the wheel names in cache.
So I have not touched `sdists` cache.
I have added a `filename: WheelFileName` field in `Archive`, so we can
use it while indexing instead of relying on the filename on disk.
Another way to do this was to read `.dist-info/WHEEL` and
`.dist-info/METADATA` and build `WheelFileName` but that seems less
robust and will be slower.
## Test Plan
<!-- How was it tested? -->
Tested by installing `yt-dlp`, `httpie` and `sqlalchemy` and verifying
that cache files in `wheels` bucket use hash.
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
* Upgrade the rust toolchain to 1.85.0. This does not increase the MSRV.
* Update windows trampoline to 1.86 nightly beta (previously in 1.85
nightly beta).
## Test Plan
Existing tests
Revert #11601 for now
We run Python interpreter discovery with `-I` (#2500) which means these
environments variables are ignored when determining `sys.path`. Unless
we decide to remove the `-I` flag from the `sys.path` query, we
shouldn't release these changes to interpreter discovery caching.
We want to use `sys.path` for package discovery (#2500, #9849). For
that, we need to know the correct value of `sys.path`. `sys.path` is a
runtime-changeable value, which gets influenced from a lot of different
sources: Environment variables, CLI arguments, `.pth` files with
scripting, `sys.path.append()` at runtime, a distributor patching
Python, etc. We cannot capture them all accurately, especially since
it's possible to change `sys.path` mid-execution. Instead, we do a best
effort attempt at matching the user's expectation.
The assumption is that package installation generally happens in venv
site-packages, system/user site-packages (including pypy shipping
packages with std), and `PYTHONPATH`. Specifically, we reuse
`PYTHONPATH` as dedicated way for users to tell uv to include specific
directories in package discovery.
A common way to influence `sys.path` that is not using venvs is setting
`PYTHONPATH`. To support this we're capturing `PYTHONPATH` as part of
the cache invalidation, i.e. we refresh the interpreter metadata if it
changed. For completeness, we're also capturing other environment
variables documented as influencing `sys.path` or other fields in the
interpreter info.
This PR does not include reading registry values for `sys.path`
additions on Windows as documented in
https://docs.python.org/3.11/using/windows.html#finding-modules. It
notably also does not include parsing of python CLI arguments, we only
consider their environment variable versions for package installation
and listing. We could try parsing CLI flags in `uv run python`, but we'd
still miss them when Python is launched indirectly through a script, and
it's more consistent to only consider uv's own arguments and environment
variables, similar to uv's behavior in other places.
Instead of using junctions, we can just write files that contain (as the
file contents) the target path. This requires a little more finesse in
that, as readers, we need to know where to expect these. But it also
means we get to avoid junctions, which have led to a variety of
confusing behaviors. Further, `replace_symlink` should now be on atomic
on Windows.
Closes#11263.
We've never bumped the version of this bucket, and we may never do so...
But it's still incorrect for us to omit it from these serialized structs
in the cache. Specifically, these structs include a pointer into the
archive bucket (namely, the ID). But we don't include the bucket
version! So, in theory, we could end up pointing to archives that don't
match the current bucket version expected in the code.
## Summary
Today, scripts use `CachedEnvironment`, which results in a different
virtual environment path every time the interpreter changes _or_ the
project requirements change. This makes it impossible to provide users
with a stable path to the script that they can use for (e.g.) directing
their editor.
This PR modifies `uv run` to use a stable path for local scripts (we
continue to use `CachedEnvironment` for remote scripts and scripts from
`stdin`). The logic now looks a lot more like it does for projects: we
`get_or_init` an environment, etc.
For now, the path to the script is like:
`environments-v1/4485801245a4732f`, where `4485801245a4732f` is a SHA of
the absolute path to the script. But I'm not picky on that :)
## Summary
If we fail to deserialize cached metadata in the cache, we should just
ignore it, rather than failing.
Ideally, this never happens. If it does, it means we missed a cache
version bump. But if it does happen, it should still be non-fatal.
Closes https://github.com/astral-sh/uv/issues/11043.
Closes https://github.com/astral-sh/uv/issues/11101.
## Test Plan
Prior to this PR, the following would fail:
- `uvx uv@0.5.25 venv --python 3.12 --cache-dir foo`
- `uvx uv@0.5.25 pip install ./scripts/packages/hatchling_dynamic
--no-deps --python 3.12 --cache-dir foo`
- `uvx uv@0.5.18 venv --python 3.12 --cache-dir foo`
- `uvx uv@0.5.18 pip install ./scripts/packages/hatchling_dynamic
--no-deps --python 3.12 --cache-dir foo`
We can't go back and fix 0.5.18, but this will prevent such regressions
in the future.
## Summary
On Windows, we have a lot of issues with atomic replacement and such.
There are a bunch of different failure modes, but they generally
involve: trying to persist a fail to a path at which the file already
exists, trying to replace or remove a file while someone else is reading
it, etc.
This PR adds locks to all of the relevant database paths. We already use
these advisory locks when building source distributions; now we use them
when unzipping wheels, storing metadata, etc.
Closes#11002.
## Test Plan
I ran the following script:
```shell
# Define the cache directory path
$cacheDir = "C:\Users\crmar\workspace\uv\cache"
# Clear the cache directory if it exists
if (Test-Path $cacheDir) {
Remove-Item -Recurse -Force $cacheDir
}
# Create the cache directory again
New-Item -ItemType Directory -Force -Path $cacheDir
# Define the command to run with --cache-dir flag
$command = {
param ($venvPath)
# Create a virtual environment in the specified path with --python
uv venv $venvPath
# Run the pip install command with --cache-dir flag
C:\Users\crmar\workspace\uv\target\profiling\uv.exe pip install flask==1.0.4 --no-binary flask --cache-dir C:\Users\crmar\workspace\uv\cache -v --python $venvPath
}
# Define the paths for the different virtual environments
$venv1 = "C:\Users\crmar\workspace\uv\venv1"
$venv2 = "C:\Users\crmar\workspace\uv\venv2"
$venv3 = "C:\Users\crmar\workspace\uv\venv3"
$venv4 = "C:\Users\crmar\workspace\uv\venv4"
$venv5 = "C:\Users\crmar\workspace\uv\venv5"
# Start the command in parallel five times using Start-Job, each with a different venv
$job1 = Start-Job -ScriptBlock $command -ArgumentList $venv1
$job2 = Start-Job -ScriptBlock $command -ArgumentList $venv2
$job3 = Start-Job -ScriptBlock $command -ArgumentList $venv3
$job4 = Start-Job -ScriptBlock $command -ArgumentList $venv4
$job5 = Start-Job -ScriptBlock $command -ArgumentList $venv5
# Wait for all jobs to complete
$jobs = @($job1, $job2, $job3, $job4, $job5)
$jobs | ForEach-Object { Wait-Job $_ }
# Retrieve the results (optional)
$jobs | ForEach-Object { Receive-Job -Job $_ }
# Clean up the jobs
$jobs | ForEach-Object { Remove-Job -Job $_ }
```
And ensured it succeeded in five straight invocations (whereas on
`main`, it consistently fails with a variety of different traces).
## Summary
This PR extends the thinking in #10525 to platform tags, and then uses
the structured tag enums everywhere, rather than passing around strings.
I think this is a big improvement! It means we're no longer doing ad hoc
tag parsing all over the place.
## Summary
A lot of good new lints, and most importantly, error stabilizations. I
tried to find a few usages of the new stabilizations, but I'm sure there
are more.
IIUC, this _does_ require bumping our MSRV.