Commit Graph

24 Commits

Author SHA1 Message Date
Charlie Marsh b7d77c04cc
Add Git resolver in lieu of static hash map (#3954)
## Summary

This PR removes the static resolver map:

```rust
static RESOLVED_GIT_REFS: Lazy<Mutex<FxHashMap<RepositoryReference, GitSha>>> =
    Lazy::new(Mutex::default);
```

With a `GitResolver` struct that we now pass around on the
`BuildContext`. There should be no behavior changes here; it's purely an
internal refactor with an eye towards making it cleaner for us to
"pre-populate" the list of resolved SHAs.
2024-05-31 22:44:42 -04:00
konsti 081f20c53e
Add support for `tool.uv` into distribution building (#3904)
With the change, we remove the special casing of workspace dependencies
and resolve `tool.uv` for all git and directory distributions. This
gives us support for non-editable workspace dependencies and path
dependencies in other workspaces. It removes a lot of special casing
around workspaces. These changes are the groundwork for supporting
`tool.uv` with dynamic metadata.

The basis for this change is moving `Requirement` from
`distribution-types` to `pypi-types` and the lowering logic from
`uv-requirements` to `uv-distribution`. This changes should be split out
in separate PRs.

I've included an example workspace `albatross-root-workspace2` where
`bird-feeder` depends on `a` from another workspace `ab`. There's a
bunch of failing tests and regressed error messages that still need
fixing. It does fix the audited package count for the workspace tests.
2024-05-31 02:42:03 +00:00
Andrew Gallant 17c043536b uv-resolver: thread markers through the resolver and into the lock file
This addresses the lack of marker support in prior commits.
Specifically, we add them as a new field to `AnnotatedDist`, and from
there, they get added to a `Distribution` in a `Lock`.
2024-05-30 14:23:14 -04:00
Andrew Gallant f865406ab4 uv-resolver: implement merging of forked resolutions
This commit is a pretty invasive change that implements the merging
of resolutions created by each fork of the resolver.

The main idea here is that each `SolveState` is converted into a
`Resolution` (a new type) and stored on the heap after its fork
completes. When all forks complete, they are all merged into a single
`Resolution`. This `Resolution` is then used to build a `ResolutionGraph`.

Construction of `ResolutionGraph` mostly stays the same (despite the
gnarly diff due to an indent change) with one exception: the code to
extract dependency edges out of PubGrub's state has been moved to
`SolveState::into_resolution`. The idea here is that once a fork
completes, we extract what we need from the PubGrub state and then
throw it away. We store these edges in our own intermediate type which
is then converted into petgraph edges in the `ResolutionGraph`
constructor.

One interesting change we make here is that our edge
data is now a `Version` instead of a `Range<Version>`. I don't think
`Range<Version>` was actually being used anywhere, so this seems okay?
In any case, I think `Version` here is correct because a resolution
corresponds to specific dependencies of each package. Moreover, I didn't
see an easy way to make things work with `Range<Version>`. Notably,
since we no longer have the guarantee that there is only one version of
each package, we need to use `(PackageName, Version)` instead of just
`PackageName` for inverted lookups in `ResolutionGraph::from_state`.

Finally, the main resolver loop itself is changed a bit to track all
forked resolutions and then merge them at the end.

Note that we don't really have any dealings with markers in this commit.
We'll get to that in a subsequent commit.
2024-05-30 14:23:14 -04:00
Andrew Gallant 9e977aa1be uv-resolver: slightly simplify ResolutionGraph::from_state
This changes the constructor to just take an `InMemoryIndex`
directly instead of the constituent parts. No real reason other
than it seems a little simpler.
2024-05-30 14:23:14 -04:00
Charlie Marsh 4859a27948
Add extra dependency annotations to lockfile and sync commands (#3913)
## Summary

This PR adds extras to the lockfile, and enables users to selectively
sync extras in `uv sync` and `uv run`. The end result here was fairly
simple, though it required a few refactors to get here. The basic idea
is that `DistributionId` now includes `extra: Option<ExtraName>`, so we
effectively treat extras as separate packages. Generating the lockfile,
and generating the resolution from the lockfile, fall out of this
naturally with no special-casing or additional changes.

The main downside here is that it bloats the lockfile significantly.
Specifically:

- We include _all_ distribution URLs and hashes for _every_ extra
variant.
- We include all dependencies for the extra variant, even though that
are dependencies of the base package.

We could normalize this representation by changing each distribution
have an `optional-dependencies` hash map that keys on extras, but we
actually don't have the information we need to create that right now
(specifically, we can't differentiate between dependencies that
_require_ the extra and dependencies on the base package).

Closes #3700.
2024-05-29 19:25:58 +00:00
Charlie Marsh 1bd5d8bc34
Include all extras when generating lockfile (#3912)
## Summary

This PR just ensures that when running `uv lock` (or `uv run`), we lock
with all extras. When we later install, we'll also _install_ with all
extras, but that will be changed in a future PR.
2024-05-29 15:08:20 -04:00
Charlie Marsh ed7f55606d
Split `requirements.txt`-style resolution distribution into its own type (#3911) 2024-05-29 16:48:28 +00:00
Charlie Marsh 19c91e7dac
Create distinct graph nodes for each package extra (#3908)
## Summary

Today, we represent each package as a single node in the graph, and
combine all the extras. This is helpful for the `requirements.txt`-style
resolution, in which we want to show each a single line for each package
with the extras combined into a single array.

This PR modifies the representation to instead use a separate node for
each (package, extra) pair. We then reduce into the previous format when
printing in the `requirements.txt`-style format, so there shouldn't be
any user-facing changes here.
2024-05-29 15:42:49 +00:00
Charlie Marsh 42b1ba04ec
Remove unused `PetGraph` weight (#3906) 2024-05-29 15:16:25 +00:00
Charlie Marsh cedd18e4c6
Remove some unused `pub` functions (#3872)
## Summary

I wrote a bad Python script to find these.
2024-05-28 15:58:13 +00:00
Charlie Marsh 1fc6a59707
Remove special-casing for editable requirements (#3869)
## Summary

There are a few behavior changes in here:

- We now enforce `--require-hashes` for editables, like pip. So if you
use `--require-hashes` with an editable requirement, we'll reject it. I
could change this if it seems off.
- We now treat source tree requirements, editable or not (e.g., both `-e
./black` and `./black`) as if `--refresh` is always enabled. This
doesn't mean that we _always_ rebuild them; but if you pass
`--reinstall`, then yes, we always rebuild them. I think this is an
improvement and is close to how editables work today.

Closes #3844.

Closes #2695.
2024-05-28 15:49:34 +00:00
konsti 4db468e27f
Use `VerbatimParsedUrl` in `pep508_rs` (#3758)
When parsing requirements from any source, directly parse the url parts
(and reject unsupported urls) instead of parsing url parts at a later
stage. This removes a bunch of error branches and concludes the work
parsing url parts once and passing them around everywhere.

Many usages of the assembled `VerbatimUrl` remain, but these can be
removed incrementally.

Please review commit-by-commit.
2024-05-23 19:52:47 +00:00
Charlie Marsh 79fecdf251
Add a diagnostic trait (#3777) 2024-05-22 19:44:37 -04:00
Charlie Marsh 74c494d7dd
Report yanks for cached and resolved packages (#3772)
## Summary

We now show yanks as part of the resolution diagnostics, so they now
appear for `sync`, `install`, `compile`, and any other operations.
Further, they'll also appear for cached packages (but not packages that
are _already_ installed).

Closes https://github.com/astral-sh/uv/issues/3768.

Closes #3766.
2024-05-22 21:21:37 +00:00
konsti 76418f5bdf
Arc-wrap `PubGrubPackage` for cheap cloning in pubgrub (#3688)
Pubgrub stores incompatibilities as (package name, version range)
tuples, meaning it needs to clone the package name for each
incompatibility, and each non-borrowed operation on incompatibilities.
https://github.com/astral-sh/uv/pull/3673 made me realize that
`PubGrubPackage` has gotten large (expensive to copy), so like `Version`
and other structs, i've added an `Arc` wrapper around it.

It's a pity clippy forbids `.deref()`, it's less opaque than `&**` and
has IDE support (clicking on `.deref()` jumps to the right impl).

## Benchmarks

It looks like this matters most for complex resolutions which, i assume
because they carry larger `PubGrubPackageInner::Package` and
`PubGrubPackageInner::Extra` types.

```bash
hyperfine --warmup 5 "./uv-main pip compile -q ./scripts/requirements/jupyter.in" "./uv-branch pip compile -q ./scripts/requirements/jupyter.in"
hyperfine --warmup 5 "./uv-main pip compile -q ./scripts/requirements/airflow.in" "./uv-branch pip compile -q ./scripts/requirements/airflow.in"
hyperfine --warmup 5 "./uv-main pip compile -q ./scripts/requirements/boto3.in" "./uv-branch pip compile -q ./scripts/requirements/boto3.in"
```

```
Benchmark 1: ./uv-main pip compile -q ./scripts/requirements/jupyter.in
  Time (mean ± σ):      18.2 ms ±   1.6 ms    [User: 14.4 ms, System: 26.0 ms]
  Range (min … max):    15.8 ms …  22.5 ms    181 runs

Benchmark 2: ./uv-branch pip compile -q ./scripts/requirements/jupyter.in
  Time (mean ± σ):      17.8 ms ±   1.4 ms    [User: 14.4 ms, System: 25.3 ms]
  Range (min … max):    15.4 ms …  23.1 ms    159 runs

Summary
  ./uv-branch pip compile -q ./scripts/requirements/jupyter.in ran
    1.02 ± 0.12 times faster than ./uv-main pip compile -q ./scripts/requirements/jupyter.in
```

```
Benchmark 1: ./uv-main pip compile -q ./scripts/requirements/airflow.in
  Time (mean ± σ):     153.7 ms ±   3.5 ms    [User: 165.2 ms, System: 157.6 ms]
  Range (min … max):   150.4 ms … 163.0 ms    19 runs

Benchmark 2: ./uv-branch pip compile -q ./scripts/requirements/airflow.in
  Time (mean ± σ):     123.9 ms ±   4.6 ms    [User: 152.4 ms, System: 133.8 ms]
  Range (min … max):   118.4 ms … 138.1 ms    24 runs

Summary
  ./uv-branch pip compile -q ./scripts/requirements/airflow.in ran
    1.24 ± 0.05 times faster than ./uv-main pip compile -q ./scripts/requirements/airflow.in
```

```
Benchmark 1: ./uv-main pip compile -q ./scripts/requirements/boto3.in
  Time (mean ± σ):     327.0 ms ±   3.8 ms    [User: 344.5 ms, System: 71.6 ms]
  Range (min … max):   322.7 ms … 334.6 ms    10 runs

Benchmark 2: ./uv-branch pip compile -q ./scripts/requirements/boto3.in
  Time (mean ± σ):     311.2 ms ±   3.1 ms    [User: 339.3 ms, System: 63.1 ms]
  Range (min … max):   307.8 ms … 317.0 ms    10 runs

Summary
  ./uv-branch pip compile -q ./scripts/requirements/boto3.in ran
    1.05 ± 0.02 times faster than ./uv-main pip compile -q ./scripts/requirements/boto3.in
```

<!--
Thank you for contributing to uv! To help us out with reviewing, please
consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
2024-05-21 13:49:35 +02:00
Andrew Gallant 776a7e47f3 uv-resolver: add `Option<MarkerTree>` to `PubGrubPackage`
This just adds the field to the type and always sets it to `None`. There
are semantic changes in this commit.

Closes #3359
2024-05-20 19:56:24 -04:00
Andrew Gallant eac8221718 uv-resolver: use named fields for some PubGrubPackage variants
I'm planning to add another field here (markers), which puts a lot of
stress on the positional approach. So let's just switch over to named
fields.
2024-05-20 19:56:24 -04:00
konsti 95c9621541
Refactor editables for supporting them in bluejay commands (#3639)
This is split out from workspaces support, which needs editables in the
bluejay commands. It consists mainly of refactorings:

* Move the `editable` module one level up.
* Introduce a `BuiltEditableMetadata` type for `(LocalEditable,
Metadata23, Requirements)`.
* Add editables to `InstalledPackagesProvider` so we can use
`EmptyInstalledPackages` for them.
2024-05-20 16:22:12 +00:00
Ibraheem Ahmed 0f67a6ceea
Use `FxHasher` in resolver (#3641)
## Summary

We can use `FxHasher` in a few more places for string and version keys.
This gives a consistent ~2% improvement to warm resolves.
2024-05-17 15:04:22 -04:00
Ibraheem Ahmed 39af09f09b
Parallelize resolver (#3627)
## Summary

This PR introduces parallelism to the resolver. Specifically, we can
perform PubGrub resolution on a separate thread, while keeping all I/O
on the tokio thread. We already have the infrastructure set up for this
with the channel and `OnceMap`, which makes this change relatively
simple. The big change needed to make this possible is removing the
lifetimes on some of the types that need to be shared between the
resolver and pubgrub thread.

A related PR, https://github.com/astral-sh/uv/pull/1163, found that
adding `yield_now` calls improved throughput. With optimal scheduling we
might be able to get away with everything on the same thread here.
However, in the ideal pipeline with perfect prefetching, the resolution
and prefetching can run completely in parallel without depending on one
another. While this would be very difficult to achieve, even with our
current prefetching pattern we see a consistent performance improvement
from parallelism.

This does also require reverting a few of the changes from
https://github.com/astral-sh/uv/pull/3413, but not all of them. The
sharing is isolated to the resolver task.

## Test Plan

On smaller tasks performance is mixed with ~2% improvements/regressions
on both sides. However, on medium-large resolution tasks we see the
benefits of parallelism, with improvements anywhere from 10-50%.

```
./scripts/requirements/jupyter.in
Benchmark 1: ./target/profiling/baseline (resolve-warm)
  Time (mean ± σ):      29.2 ms ±   1.8 ms    [User: 20.3 ms, System: 29.8 ms]
  Range (min … max):    26.4 ms …  36.0 ms    91 runs
 
Benchmark 2: ./target/profiling/parallel (resolve-warm)
  Time (mean ± σ):      25.5 ms ±   1.0 ms    [User: 19.5 ms, System: 25.5 ms]
  Range (min … max):    23.6 ms …  27.8 ms    99 runs
 
Summary
  ./target/profiling/parallel (resolve-warm) ran
    1.15 ± 0.08 times faster than ./target/profiling/baseline (resolve-warm)
```
```
./scripts/requirements/boto3.in   
Benchmark 1: ./target/profiling/baseline (resolve-warm)
  Time (mean ± σ):     487.1 ms ±   6.2 ms    [User: 464.6 ms, System: 61.6 ms]
  Range (min … max):   480.0 ms … 497.3 ms    10 runs
 
Benchmark 2: ./target/profiling/parallel (resolve-warm)
  Time (mean ± σ):     430.8 ms ±   9.3 ms    [User: 529.0 ms, System: 77.2 ms]
  Range (min … max):   417.1 ms … 442.5 ms    10 runs
 
Summary
  ./target/profiling/parallel (resolve-warm) ran
    1.13 ± 0.03 times faster than ./target/profiling/baseline (resolve-warm)
```
```
./scripts/requirements/airflow.in 
Benchmark 1: ./target/profiling/baseline (resolve-warm)
  Time (mean ± σ):     478.1 ms ±  18.8 ms    [User: 482.6 ms, System: 205.0 ms]
  Range (min … max):   454.7 ms … 508.9 ms    10 runs
 
Benchmark 2: ./target/profiling/parallel (resolve-warm)
  Time (mean ± σ):     308.7 ms ±  11.7 ms    [User: 428.5 ms, System: 209.5 ms]
  Range (min … max):   287.8 ms … 323.1 ms    10 runs
 
Summary
  ./target/profiling/parallel (resolve-warm) ran
    1.55 ± 0.08 times faster than ./target/profiling/baseline (resolve-warm)
```
2024-05-17 11:47:30 -04:00
Charlie Marsh d76b023cd6
Remove `resolve_cli.rs` and `resolve_many.rs` (#3606)
## Summary

These depend on some APIs that I want to be internal-only for the
resolver crate, and we're no longer using them.
2024-05-15 16:04:47 +00:00
Charlie Marsh 85c71d6987
Rename `pinned_package` to `dist` (#3598) 2024-05-15 00:42:27 +00:00
Charlie Marsh 9a18d4ff46
Split `resolution.rs` into multiple files (#3597)
## Summary

No code changes.
2024-05-15 00:16:04 +00:00