## Summary
Follow up to https://github.com/astral-sh/uv/pull/15563
Closes https://github.com/astral-sh/uv/issues/13485
This is a first-pass at adding support for conditional support for Git
LFS between git sources, initial feedback welcome.
e.g.
```
[tool.uv.sources]
test-lfs-repo = { git = "https://github.com/zanieb/test-lfs-repo.git", lfs = true }
```
For context previously a user had to set `UV_GIT_LFS` to have uv fetch
lfs objects on git sources. This env var was all or nothing, meaning you
must always have it set to get consistent behavior and it applied to all
git sources. If you fetched lfs objects at a revision and then turned
off lfs (or vice versa), the git db, corresponding checkout lfs
artifacts would not be updated properly. Similarly, when git source
distributions were built, there would be no distinction between sources
with lfs and without lfs. Hence, it could corrupt the git, sdist, and
archive caches.
In order to support some sources being LFS enabled and other not, this
PR adds a stateful layer roughly similar to how `subdirectory` works but
for `lfs` since the git database, the checkouts and the corresponding
caching layers needed to be LFS aware (requested vs installed). The
caches also had to isolated and treated entirely separate when handling
LFS sources.
Summary
* Adds `lfs = true` or `lfs = false` to git sources in pyproject.toml
* Added `lfs=true` query param / fragments to most relevant url structs
(not parsed as user input)
* In the case of uv add / uv tool, `--lfs` is supported instead
* `UV_GIT_LFS` environment variable support is still functional for
non-project entrypoints (e.g. uv pip)
* `direct-url.json` now has an custom `git_lfs` entry under VcsInfo
(note, this is not in the spec currently -- see caveats).
* git database and checkouts have an different cache key as the sources
should be treated effectively different for the same rev.
* sdists cache also differ in the cache key of a built distribution if
it was built using LFS enabled revisions to distinguish between non-LFS
same revisions. This ensures the strong assumption for archive-v0 that
an unpacked revision "doesn't change sources" stays valid.
Caveats
* `pylock.toml` import support has not been added via git_lfs=true,
going through the spec it wasn't clear to me it's something we'd support
outside of the env var (for now).
* direct-url struct was modified by adding a non-standard `git_lfs`
field under VcsInfo which may be undersirable although the PEP 610 does
say `Additional fields that would be necessary to support such VCS
SHOULD be prefixed with the VCS command name` which could be interpret
this change as ok.
* There will be a slight lockfile and cache churn for users that use
`UV_GIT_LFS` as all git lockfile entries will get a `lfs=true` fragment.
The cache version does not need an update, but LFS sources will get
their own namespace under git-v0 and sdist-v9/git hence a cache-miss
will occur once but this can be sufficient to label this as breaking for
workflows always setting `UV_GIT_LFS`.
## Test Plan
Some initial tests were added. More tests likely to follow as we reach
consensus on a final approach.
For IT test, we may want to move to use a repo under astral namespace in
order to test lfs functionality.
Manual testing was done for common pathological cases like killing LFS
fetch mid-way, uninstalling LFS after installing an sdist with it and
reinstalling, fetching LFS artifacts in different commits, etc.
PSA: Please ignore the docker build failures as its related to depot
OIDC issues.
---------
Co-authored-by: Zanie Blue <contact@zanie.dev>
Co-authored-by: konstin <konstin@mailbox.org>
Part of https://github.com/astral-sh/uv/issues/4392
We shouldn't link to PyPI, and dropping the workspace-level
documentation link should mean that we get the auto-generated `docs.rs`
links.
## Summary
Right now, we only list changes if the _version_ differs. This PR takes
the SHA into account. We may want to list changes to _any_ sources, but
that gets more complicated (e.g., if the user swaps the index URL, we'd
have to show _all_ changes to the index URL).
Closes#15810.
## Summary
Make the use of `Self` consistent. Mostly done by running `cargo clippy
--fix -- -A clippy::all -W clippy::use_self`.
## Test Plan
<!-- How was it tested? -->
No need.
Prior to this PR, there were numerous places where uv would leak
credentials in logs. We had a way to mask credentials by calling methods
or a recently-added `redact_url` function, but this was not secure by
default. There were a number of other types (like `GitUrl`) that would
leak credentials on display.
This PR adds a `DisplaySafeUrl` newtype to prevent leaking credentials
when logging by default. It takes a maximalist approach, replacing the
use of `Url` almost everywhere. This includes when first parsing config
files, when storing URLs in types like `GitUrl`, and also when storing
URLs in types that in practice will never contain credentials (like
`DirectorySourceUrl`). The idea is to make it easy for developers to do
the right thing and for the compiler to support this (and to minimize
ever having to manually convert back and forth). Displaying credentials
now requires an active step. Note that despite this maximalist approach,
the use of the newtype should be zero cost.
One conspicuous place this PR does not use `DisplaySafeUrl` is in the
`uv-auth` crate. That would require new clones since there are calls to
`request.url()` that return a `&Url`. One option would have been to make
`DisplaySafeUrl` wrap a `Cow`, but this would lead to lifetime
annotations all over the codebase. I've created a separate PR based on
this one (#13576) that updates `uv-auth` to use `DisplaySafeUrl` with
one new clone. We can discuss the tradeoffs there.
Most of this PR just replaces `Url` with `DisplaySafeUrl`. The core is
`uv_redacted/lib.rs`, where the newtype is implemented. To make it
easier to review the rest, here are some points of note:
* `DisplaySafeUrl` has a `Display` implementation that masks
credentials. Currently, it will still display the username when there is
both a username and password. If we think is the wrong choice, it can
now be changed in one place.
* `DisplaySafeUrl` has a `remove_credentials()` method and also a
`.to_string_with_credentials()` method. This allows us to use it in a
variety of scenarios.
* `IndexUrl::redacted()` was renamed to
`IndexUrl::removed_credentials()` to make it clearer that we are not
masking.
* We convert from a `DisplaySafeUrl` to a `Url` when calling `reqwest`
methods like `.get()` and `.head()`.
* We convert from a `DisplaySafeUrl` to a `Url` when creating a
`uv_auth::Index`. That is because, as mentioned above, I will be
updating the `uv_auth` crate to use this newtype in a separate PR.
* A number of tests (e.g., in `pip_install.rs`) that formerly used
filters to mask tokens in the test output no longer need those filters
since tokens in URLs are now masked automatically.
* The one place we are still knowingly writing credentials to
`pyproject.toml` is when a URL with credentials is passed to `uv add`
with `--raw`. Since displaying credentials is no longer automatic, I
have added a `to_string_with_credentials()` method to the `Pep508Url`
trait. This is used when `--raw` is passed. Adding it to that trait is a
bit weird, but it's the simplest way to achieve the goal. I'm open to
suggestions on how to improve this, but note that because of the way
we're using generic bounds, it's not as simple as just creating a
separate trait for that method.
Rustfmt introduces a lot of formatting changes in the 2024 edition. To
not break everything all at once, we split out the set of formatting
changes compatible with both the 2021 and 2024 edition by first
formatting with the 2024 style, and then again with the currently used
2021 style.
Notable changes are the formatting of derive macro attributes and lines
with overly long strings and adding trailing semicolons after statements
consistently.
This PR redacts credentials in displayed URLs.
It mostly relies on a `redacted_url` function (and where possible
`IndexUrl::redacted`). This is a quick way to prevent leaked credentials
but it's prone to programmer error when adding new trace statements. A
better follow-on would use a `RedactedUrl` type with the appropriate
`Display` implementation. This would allow us to still extract
credentials from the URL while displaying it securely. On the plus side,
the sites where the `redacted_url` function are used serve as easy
signposts for where to use the new type in a future PR.
Closes#1714.
## Summary
This is the pattern I see in a variety of crates, and I believe this is
preferred if you don't _need_ an owned `String`, since you can avoid the
allocation. This could be pretty impactful for us?
Initially, we were limiting Git schemes to HTTPS and SSH as only
supported schemes. We lost this validation in #3429. This incidentally
allowed file schemes, which apparently work with Git out of the box.
A caveat for this is that in tool.uv.sources, we parse the git field
always as URL. This caused a problem with #11425: repo = { git =
'c:\path\to\repo', rev = "xxxxx" } was parsed as a URL where c: is the
scheme, causing a bad error message down the line.
This PR:
* Puts Git URL validation back in place. It bans everything but HTTPS,
SSH, and file URLs. This could be a breaking change, if users were using
a git transport protocol were not aware of, even though never
intentionally supported.
* Allows file: URL in Git: This seems to be supported by Git and we were
supporting it albeit unintentionally, so it's reasonable to continue to
support it.
* It does not allow relative paths in the git field in tool.uv.sources.
Absolute file URLs are supported, whether we want relative file URLs for
Git too should be discussed separately.
Closes#3429: We reject the input with a proper error message, while
hinting the user towards file:. If there's still desire for relative
path support, we can keep it open.
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
We want to build `uv-build` without depending on the network crates. In
preparation for that, we split uv-git into uv-git and uv-git-types,
where only uv-git depends on reqwest, so that uv-build can use
uv-git-types.