Commit Graph

16 Commits

Author SHA1 Message Date
Tom Parker-Shemilt 042432b200 Strip sources out of pubgrub 2024-04-26 21:42:23 +01:00
Tom Parker-Shemilt 6db1eec1ab Merge branch 'main' into annotate-requirements-sources 2024-04-25 21:23:26 +01:00
Andrew Gallant 0b84eb0140
once-map: avoid hard-coding `Arc` (#3242)
The only thing a `OnceMap` really needs to be able to do with the value
is to clone it. All extant uses benefited from having this done for them
by automatically wrapping values in an `Arc`. But this isn't necessarily
true for all things. For example, a value might have an `Arc` internally
to making cloning cheap in other contexts, and it doesn't make sense to
re-wrap it in an `Arc` just to use it with a `OnceMap`. Or
alternatively, cloning might just be cheap enough on its own that an
`Arc` isn't worth it.
2024-04-24 11:11:46 -04:00
Charlie Marsh 14f05f27b3
Add ticks around error messages more consistently (#3004)
## Summary

I found some of these too bare (e.g., when they _just_ show a package
name with no other information). For me, this makes it easier to
differentiate error message copy from data. But open to other opinions.
Take a look at the fixture changes and LMK!
2024-04-22 23:58:36 +00:00
Charlie Marsh 2e88bb6f1b
Add a proxy layer for extras (#3100)
Given requirements like:

```
black==23.1.0
black[colorama]
```

The resolver will (on `main`) add a dependency on Black, and then try to
use the most recent version of Black to satisfy `black[colorama]`. For
sake of example, assume `black==24.0.0` is the most recent version. Once
the selects this most recent version, it'll fetch the metadata, then
return the dependencies for `black==24.0.0` with the `colorama` extra
enabled. Finally, it will tack on `black==24.0.0` (a dependency on the
base package). The resolver will then detect a conflict between
`black==23.1.0` and `black==24.0.0`, and throw out
`black[colorama]==24.0.0`, trying to next most-recent version.

This is both wasteful and can cause problems, since we're fetching
metadata for versions that will _never_ satisfy the resolver. In the
`apache-airflow[all]` case, I also ran into an issue whereby we were
attempting to build very old versions of `apache-airflow` due to
`apache-airflow[pandas]`, which in turn led to resolution failures.

The solution proposed here is that we create a new proxy package with
exactly two dependencies: one on `black` and one of `black[colorama]`.
Both of these packages must be at the same version as the proxy package,
so the resolver knows much _earlier_ that (in the above example) the
extra variant _must_ match `23.1.0`.
2024-04-19 01:04:59 +00:00
Charlie Marsh 1f3b5bb093
Add hash-checking support to `install` and `sync` (#2945)
## Summary

This PR adds support for hash-checking mode in `pip install` and `pip
sync`. It's a large change, both in terms of the size of the diff and
the modifications in behavior, but it's also one that's hard to merge in
pieces (at least, with any test coverage) since it needs to work
end-to-end to be useful and testable.

Here are some of the most important highlights:

- We store hashes in the cache. Where we previously stored pointers to
unzipped wheels in the `archives` directory, we now store pointers with
a set of known hashes. So every pointer to an unzipped wheel also
includes its known hashes.
- By default, we don't compute any hashes. If the user runs with
`--require-hashes`, and the cache doesn't contain those hashes, we
invalidate the cache, redownload the wheel, and compute the hashes as we
go. For users that don't run with `--require-hashes`, there will be no
change in performance. For users that _do_, the only change will be if
they don't run with `--generate-hashes` -- then they may see some
repeated work between resolution and installation, if they use `pip
compile` then `pip sync`.
- Many of the distribution types now include a `hashes` field, like
`CachedDist` and `LocalWheel`.
- Our behavior is similar to pip, in that we enforce hashes when pulling
any remote distributions, and when pulling from our own cache. Like pip,
though, we _don't_ enforce hashes if a distribution is _already_
installed.
- Hash validity is enforced in a few different places:
1. During resolution, we enforce hash validity based on the hashes
reported by the registry. If we need to access a source distribution,
though, we then enforce hash validity at that point too, prior to
running any untrusted code. (This is enforced in the distribution
database.)
2. In the install plan, we _only_ add cached distributions that have
matching hashes. If a cached distribution is missing any hashes, or the
hashes don't match, we don't return them from the install plan.
3. In the downloader, we _only_ return distributions with matching
hashes.
4. The final combination of "things we install" are: (1) the wheels from
the cache, and (2) the downloaded wheels. So this ensures that we never
install any mismatching distributions.
- Like pip, if `--require-hashes` is provided, we require that _all_
distributions are pinned with either `==` or a direct URL. We also
require that _all_ distributions have hashes.

There are a few notable TODOs:

- We don't support hash-checking mode for unnamed requirements. These
should be _somewhat_ rare, though? Since `pip compile` never outputs
unnamed requirements. I can fix this, it's just some additional work.
- We don't automatically enable `--require-hashes` with a hash exists in
the requirements file. We require `--require-hashes`.

Closes #474.

## Test Plan

I'd like to add some tests for registries that report incorrect hashes,
but otherwise: `cargo test`
2024-04-10 19:09:03 +00:00
Charlie Marsh 7ae06b3b46
Surface invalid metadata as hints in error reports (#2850)
## Summary

Closes #2847.
2024-04-09 23:12:10 -04:00
Charlie Marsh 34341bd6e9
Allow package lookups across multiple indexes via explicit opt-in (#2815)
## Summary

This partially revives https://github.com/astral-sh/uv/pull/2135 (with
some modifications) to enable users to opt-in to looking for packages
across multiple indexes.

The behavior is such that, in version selection, we take _any_
compatible version from a "higher-priority" index over the compatible
versions of a "lower-priority" index, even if that means we might accept
an "older" version.

Closes https://github.com/astral-sh/uv/issues/2775.
2024-04-03 23:23:37 +00:00
Zanie Blue e1878c8359
Consider installed packages during resolution (#2596)
Previously, we did not consider installed distributions as candidates
while performing resolution. Here, we update the resolver to use
installed distributions that satisfy requirements instead of pulling new
distributions from the registry.

The implementation details are as follows:

- We now provide `SitePackages` to the `CandidateSelector`
- If an installed distribution satisfies the requirement, we prefer it
over remote distributions
- We do not want to allow installed distributions in some cases, i.e.,
upgrade and reinstall
- We address this by introducing an `Exclusions` type which tracks
installed packages to ignore during selection
- There's a new `ResolvedDist` wrapper with `Installed(InstalledDist)`
and `Installable(Dist)` variants
- This lets us pass already installed distributions throughout the
resolver

The user-facing behavior is thoroughly covered in the tests, but
briefly:

- Installing a package that depends on an already-installed package
prefers the local version over the index
- Installing a package with a name that matches an already-installed URL
package does not reinstall from the index
- Reinstalling (--reinstall) a package by name _will_ pull from the
index even if an already-installed URL package is present
- To reinstall the URL package, you must specify the URL in the request

Closes https://github.com/astral-sh/uv/issues/1661

Addresses:

- https://github.com/astral-sh/uv/issues/1476
- https://github.com/astral-sh/uv/issues/1856
- https://github.com/astral-sh/uv/issues/2093
- https://github.com/astral-sh/uv/issues/2282
- https://github.com/astral-sh/uv/issues/2383
- https://github.com/astral-sh/uv/issues/2560

## Test plan

- [x] Reproduction at `charlesnicholson/uv-pep420-bug` passes
- [x] Unit test for editable package
([#1476](https://github.com/astral-sh/uv/issues/1476))
- [x] Unit test for previously installed package with empty registry
- [x] Unit test for local non-editable package
- [x] Unit test for new version available locally but not in registry
([#2093](https://github.com/astral-sh/uv/issues/2093))
- ~[ ] Unit test for wheel not available in registry but already
installed locally
([#2282](https://github.com/astral-sh/uv/issues/2282))~ (seems
complicated and not worthwhile)
- [x] Unit test for install from URL dependency then with matching
version ([#2383](https://github.com/astral-sh/uv/issues/2383))
- [x] Unit test for install of new package that depends on installed
package does not change version
([#2560](https://github.com/astral-sh/uv/issues/2560))
- [x] Unit test that `pip compile` does _not_ consider installed
packages
2024-03-28 13:49:17 -05:00
konsti 33bde826a0
Update pubgrub to use a dependency provider (#2648)
With https://github.com/pubgrub-rs/pubgrub/pull/190, pubgrub attaches
all types to a dependency provider to reduce the number of generics. We
need a dummy dependency provider now to emulate this. On the plus side,
pep440_rs drops its pubgrub dependency.
2024-03-25 15:51:31 +01:00
Charlie Marsh 5a95f50619
Add support for PyTorch-style local version semantics (#2430)
## Summary

This PR adds limited support for PEP 440-compatible local version
testing. Our behavior is _not_ comprehensively in-line with the spec.
However, it does fix by _far_ the biggest practical limitation, and
resolves all the issues that've been raised on uv related to local
versions without introducing much complexity into the resolver, so it
feels like a good tradeoff for me.

I'll summarize the change here, but for more context, see [Andrew's
write-up](https://github.com/astral-sh/uv/issues/1855#issuecomment-1967024866)
in the linked issue.

Local version identifiers are really tricky because of asymmetry.
`==1.2.3` should allow `1.2.3+foo`, but `==1.2.3+foo` should not allow
`1.2.3`. It's very hard to map them to PubGrub, because PubGrub doesn't
think of things in terms of individual specifiers (unlike the PEP 440
spec) -- it only thinks in terms of ranges.

Right now, resolving PyTorch and friends fails, because...

- The user provides requirements like `torch==2.0.0+cu118` and
`torchvision==0.15.1+cu118`.
- We then match those exact versions.
- We then look at the requirements of `torchvision==0.15.1+cu118`, which
includes `torch==2.0.0`.
- Under PEP 440, this is fine, because `torch @ 2.0.0+cu118` should be
compatible with `torch==2.0.0`.
- In our model, though, it's not, because these are different versions.
If we change our comparison logic in various places to allow this, we
risk breaking some fundamental assumptions of PubGrub around version
continuity.
- Thus, we fail to resolve, because we can't accept both `torch @ 2.0.0`
and `torch @ 2.0.0+cu118`.

As compared to the solutions we explored in
https://github.com/astral-sh/uv/issues/1855#issuecomment-1967024866, at
a high level, this approach differs in that we lie about the
_dependencies_ of packages that rely on our local-version-using package,
rather than lying about the versions that exist, or the version we're
returning, etc.

In short:

- When users specify local versions upfront, we keep track of them. So,
above, we'd take note of `torch` and `torchvision`.
- When we convert the dependencies of a package to PubGrub ranges, we
check if the requirement matches `torch` or `torchvision`. If it's
an`==`, we check if it matches (in the above example) for
`torch==2.0.0`. If so, we _change_ the requirement to
`torch==2.0.0+cu118`. (If it's `==` some other version, we return an
incompatibility.)

In other words, we selectively override the declared dependencies by
making them _more specific_ if a compatible local version was specified
upfront.

The net effect here is that the motivating PyTorch resolutions all work.
And, in general, transitive local versions work as expected.

The thing that still _doesn't_ work is: imagine if there were _only_
local versions of `torch` available. Like, `torch @ 2.0.0` didn't exist,
but `torch @ 2.0.0+cpu` did, and `torch @ 2.0.0+gpu` did, and so on.
`pip install torch==2.0.0` would arbitrarily choose one one `2.0.0+cpu`
or `2.0.0+gpu`, and that's correct as per PEP 440 (local version
segments should be completely ignored on `torch==2.0.0`). However, uv
would fail to identify a compatible version. I'd _probably_ prefer to
fix this, although candidly I think our behavior is _ok_ in practice,
and it's never been reported as an issue.

Closes https://github.com/astral-sh/uv/issues/1855.

Closes https://github.com/astral-sh/uv/issues/2080.

Closes https://github.com/astral-sh/uv/issues/2328.
2024-03-16 10:24:50 -04:00
Charlie Marsh 3799862f5d
Trim injected `python_version` marker to (major, minor) (#2395)
## Summary

Per [PEP 508](https://peps.python.org/pep-0508/), `python_version` is
just major and minor:

![Screenshot 2024-03-12 at 5 15
09 PM](https://github.com/astral-sh/uv/assets/1309177/cc3b8d65-dab3-4229-aed7-c6fe590b8da0)

Right now, we're using the provided version directly, so if it's, e.g.,
`-p 3.11.8`, we'll inject the wrong marker. This was causing `pandas` to
omit `numpy` when `-p 3.11.8` was provided, since its markers look like:

```
Requires-Dist: numpy<2,>=1.22.4; python_version < "3.11"
Requires-Dist: numpy<2,>=1.23.2; python_version == "3.11"
Requires-Dist: numpy<2,>=1.26.0; python_version >= "3.12"
```

Closes https://github.com/astral-sh/uv/issues/2392.
2024-03-13 00:11:50 +00:00
Charlie Marsh 79ac3a2a7e
Wait for request stream to flush before returning resolution (#2374)
## Summary

This is a more robust fix for
https://github.com/astral-sh/uv/issues/2300.

The basic issue is:

- When we resolve, we attempt to pre-fetch the distribution metadata for
candidate packages.
- It's possible that the resolution completes _without_ those pre-fetch
responses. (In the linked issue, this was mainly because we were running
with `--no-deps`, but the pre-fetch was causing us to attempt to build a
package to get its dependencies. The resolution would then finish before
the build completed.)
- In that case, the `Index` will be marked as "waiting" for that
response -- but it'll never come through.
- If there's a subsequent call to the `Index`, to see if we should fetch
or are waiting for that response, we'll end up waiting for it forever,
since it _looks_ like it's in-flight (but isn't). (In the linked issue,
we had to build the source distribution for the install phase of `pip
install`, but `setuptools` was in this bad state from the _resolve_
phase.)

This PR modifies the resolver to ensure that we flush the stream of
requests before returning. Specifically, we now `join` rather than
`select` between the resolution and request-handling futures.

This _could_ be wasteful, since we don't _need_ those requests, but it
at least ensures that every `.wait` is followed by ` .done`. In
practice, I expect this not to have any significant effect on
performance, since we end up using the pre-fetched distributions almost
every time.

## Test Plan

I ran through the test plan from
https://github.com/astral-sh/uv/pull/2373, but ran the build 10 times
and ensured it never crashed. (I reverted
https://github.com/astral-sh/uv/pull/2373, since that _also_ fixes the
issue in the proximate case, by never fetching `setuptools` during the
resolve phase.)

I also added logging to verify that requests are being handled _after_
the resolution completes, as expected.

I also introduced an arbitrary error in `fetch` to ensure that the error
was immediately propagated.
2024-03-12 10:13:57 -04:00
danieleades 8d721830db
Clippy pedantic (#1963)
Address a few pedantic lints

lints are separated into separate commits so they can be reviewed
individually.

I've not added enforcement for any of these lints, but that could be
added if desirable.
2024-02-25 14:04:05 -05:00
Charlie Marsh 7eaed07f6c
Move conflicting dependencies into PubGrub (#1796)
## Summary

This revives a PR from long ago
(https://github.com/astral-sh/uv/pull/383 and
https://github.com/zanieb/pubgrub/pull/24) that modifies how we deal
with dependencies that are declared multiple times within a single
package.

To quote from the originating PR:

> Uses an experimental pubgrub branch (#370) that allows us to handle
multiple version ranges for a single dependency to the solver which
results in better error messages because the derivation tree contains
all of the relevant versions. Previously, the version ranges were merged
(by us) in the resolver before handing them to pubgrub since only one
range could be provided per package. Since we don't merge the versions
anymore, we no longer give the solver an empty range for conflicting
requirements; instead the solver comes to that conclusion from the
provided versions. You can see the improved error message for direct
dependencies in [this
snapshot](https://github.com/astral-sh/puffin/pull/383/files#diff-a0437f2c20cde5e2f15199a3bf81a102b92580063268417847ec9c793a115bd0).

The main issue with that PR was around its handling of URL dependencies,
so this PR _also_ refactors how we handle those. Previously, we stored
URL dependencies on `PubGrubPackage`, but they were omitted from the
hash and equality implementations of `PubGrubPackage`. This led to some
really careful codepaths wherein we had to ensure that we always visited
URLs before non-URL packages, so that the URL-inclusive versions were
included in any hashmaps, etc. I considered preserving this approach,
but it would require us to rely on lots of internal details of PubGrub
(since we'd now be relying on PubGrub to merge those packages in the
"right" order).

So, instead, we now _always_ set the URL on a given package, whenever
that package was _given_ a URL upfront. I think this is easier to reason
about: if the user provided a URL for `flask`, then we should just
always add the URL for `flask`. If we see some other URL for `flask`, we
error, like before. If we see some unknown URL for `flask`, we error,
like before.

Closes https://github.com/astral-sh/uv/issues/1522.

Closes https://github.com/astral-sh/uv/issues/1821.

Closes https://github.com/astral-sh/uv/issues/1615.
2024-02-21 21:27:58 -05:00
Zanie Blue 2586f655bb
Rename to `uv` (#1302)
First, replace all usages in files in-place. I used my editor for this.
If someone wants to add a one-liner that'd be fun.

Then, update directory and file names:

```
# Run twice for nested directories
find . -type d -print0 | xargs -0 rename s/puffin/uv/g
find . -type d -print0 | xargs -0 rename s/puffin/uv/g

# Update files
find . -type f -print0 | xargs -0 rename s/puffin/uv/g
```

Then add all the files again

```
# Add all the files again
git add crates
git add python/uv

# This one needs a force-add
git add -f crates/uv-trampoline
```
2024-02-15 11:19:46 -06:00