mirror of https://github.com/astral-sh/uv
16 Commits

### ac3a085084: Respect `requires-python` when prefetching (#4900)

This is fallout from https://github.com/astral-sh/uv/pull/4705: we need to respect `requires-python` in the prefetch code to avoid building unsupported distributions.

Closes https://github.com/astral-sh/uv/issues/4898.
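
A minimal sketch of the idea (all names here are invented stand-ins, not uv's types): filter candidate versions against the resolution's `requires-python` bound before prefetching, so unsupported distributions are never fetched or built.

```rust
// All names invented for illustration; uv's real types differ.
#[derive(Clone, Copy)]
struct RequiresPython {
    /// Minimum supported CPython 3.x minor version, e.g. 8 for ">=3.8".
    minor: u32,
}

struct Candidate {
    version: &'static str,
    /// The candidate's own `Requires-Python` lower bound, if declared.
    requires_python_minor: Option<u32>,
}

/// Keep only candidates whose declared `requires-python` covers the target,
/// so the prefetcher never fetches or builds unsupported distributions.
fn prefetch_targets(candidates: &[Candidate], target: RequiresPython) -> Vec<&Candidate> {
    candidates
        .iter()
        .filter(|candidate| match candidate.requires_python_minor {
            Some(minimum) => minimum <= target.minor,
            None => true, // no declared bound: assume compatible
        })
        .collect()
}

fn main() {
    let candidates = [
        Candidate { version: "2.0.0", requires_python_minor: Some(8) },
        Candidate { version: "3.0.0", requires_python_minor: Some(12) },
    ];
    // Project declares `requires-python = ">=3.10"`.
    let target = RequiresPython { minor: 10 };
    for candidate in prefetch_targets(&candidates, target) {
        println!("would prefetch {}", candidate.version);
    }
}
```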

### d9dbb8a4af: Support conflicting URL in separate forks (#4435)

Downstack PR: #4481

We support forking the dependency resolution to handle conflicting registry requirements across platforms: say, one version range is required for an older Python version while a newer one is required for newer Python versions, or the dependencies differ per platform. We need to extend this support to direct URL requirements:

```toml
dependencies = [
    "iniconfig @
```
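
To illustrate the forking model (a sketch with made-up types and placeholder URLs, not uv's implementation): each fork pairs a marker condition with its own requirement set, so two forks can pin the same package to different sources without conflict.

```rust
// Illustrative only: `Source` and `Fork` are made-up types, and the URLs are
// placeholders, not the (truncated) ones from the PR description.
enum Source {
    Registry(&'static str),
    DirectUrl(&'static str),
}

struct Fork {
    /// Stand-in for a real marker expression.
    marker: &'static str,
    requirements: Vec<(&'static str, Source)>,
}

fn main() {
    // Each fork resolves independently, so a URL chosen in one fork cannot
    // conflict with a different URL for the same package in another fork.
    let forks = vec![
        Fork {
            marker: "python_version < \"3.12\"",
            requirements: vec![
                ("iniconfig", Source::DirectUrl("https://example.invalid/iniconfig-1.tar.gz")),
                ("packaging", Source::Registry(">=23")),
            ],
        },
        Fork {
            marker: "python_version >= \"3.12\"",
            requirements: vec![
                ("iniconfig", Source::DirectUrl("https://example.invalid/iniconfig-2.tar.gz")),
                ("packaging", Source::Registry(">=23")),
            ],
        },
    ];
    for fork in &forks {
        println!("fork `{}`:", fork.marker);
        for (name, source) in &fork.requirements {
            match source {
                Source::Registry(spec) => println!("  {name}{spec} (registry)"),
                Source::DirectUrl(url) => println!("  {name} @ {url}"),
            }
        }
    }
}
```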

### 40f852687b: Add context to unregistered task name to error context (#4471)

I caused this error during development, and having the name of the task in it is helpful for debugging. Split out from #4435.

### 0db1bf4df7: Avoid pre-fetching for unbounded minimum versions (#4149)

I think we should be able to model PubGrub such that this isn't necessary (at least for the case described in the issue), but for now, let's just avoid attempting to build very old distributions in prefetching.

Closes https://github.com/astral-sh/uv/issues/4136.
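
The guard itself is tiny. A hypothetical sketch, assuming the version range exposes its lower bound:

```rust
// Hypothetical sketch: `Bound` and `Range` are stand-ins for the resolver's
// version-range types.
enum Bound {
    Unbounded,
    Inclusive(&'static str),
}

struct Range {
    lower: Bound,
}

/// Prefetch only when the range is bounded below; with an unbounded minimum,
/// the "next" candidates are arbitrarily old and building them is likely
/// wasted (or failing) work.
fn prefetch_from(range: &Range) -> Option<&str> {
    match range.lower {
        Bound::Unbounded => None,
        Bound::Inclusive(version) => Some(version),
    }
}

fn main() {
    let bounded = Range { lower: Bound::Inclusive("1.21") };
    let unbounded = Range { lower: Bound::Unbounded };
    println!("bounded:   {:?}", prefetch_from(&bounded)); // Some("1.21")
    println!("unbounded: {:?}", prefetch_from(&unbounded)); // None
}
```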

### bcfe88dfdc: Track `Markers` via a PubGrub package variant (#4123)

This PR adds a lowering similar to that seen in https://github.com/astral-sh/uv/pull/3100, but this time for markers. Like `PubGrubPackageInner::Extra`, we now have `PubGrubPackageInner::Marker`. The dependencies of the `Marker` are `PubGrubPackageInner::Package` with and without the marker.

As an example of why this is useful: assume we have `urllib3>=1.22.0` as a direct dependency, and later we see `urllib3 ; python_version > '3.7'` as a transitive dependency. As-is, we might (for some reason) pick a very old version of `urllib3` to satisfy `urllib3 ; python_version > '3.7'`, then attempt to fetch its dependencies, which could even involve building that very old version. Only once we had fetched the dependencies would we see that `urllib3` at the same version is _also_ a dependency (because we tack it on). In the new scheme, as soon as we "choose" the very old version of `urllib3 ; python_version > '3.7'`, we see that `urllib3` (the base package) is also a dependency, so we detect the conflict before we even fetch the dependencies of the old variant.

With this, I can successfully resolve the case in #4099.

Closes https://github.com/astral-sh/uv/issues/4099.
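
A toy version of the lowering (types pared down from the real `PubGrubPackageInner`): the synthetic `Marker` package's only dependencies are the marker-qualified and base variants of the same package, which is what surfaces the conflict early.

```rust
// Pared-down model of the lowering; the real `PubGrubPackageInner` has more
// variants and richer marker types.
enum PubGrubPackage {
    Package { name: String, marker: Option<String> },
    Marker { name: String, marker: String },
}

/// A `Marker` package depends on the same package both with and without the
/// marker (version ranges elided), so choosing a version for the marker
/// variant immediately constrains the base package too.
fn dependencies(package: &PubGrubPackage) -> Vec<PubGrubPackage> {
    match package {
        PubGrubPackage::Marker { name, marker } => vec![
            PubGrubPackage::Package { name: name.clone(), marker: Some(marker.clone()) },
            PubGrubPackage::Package { name: name.clone(), marker: None },
        ],
        // A concrete package's dependencies come from its metadata; elided.
        PubGrubPackage::Package { .. } => Vec::new(),
    }
}

fn main() {
    let marker_pkg = PubGrubPackage::Marker {
        name: "urllib3".to_string(),
        marker: "python_version > '3.7'".to_string(),
    };
    for dep in dependencies(&marker_pkg) {
        if let PubGrubPackage::Package { name, marker } = dep {
            println!("depends on {name} (marker: {marker:?})");
        }
    }
}
```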

### 0acae9bd9c: Add support for development dependencies (#4036)

Externally, development dependencies are currently structured as a flat list of PEP 508-compatible requirements:

```toml
[tool.uv]
dev-dependencies = ["werkzeug"]
```

When locking, we lock all development dependencies; when syncing, users can provide `--dev`.

Internally, though, we model them as dependency groups, similar to Poetry, PDM, and [PEP 735](https://peps.python.org/pep-0735). This enables us to change out the user-facing frontend without changing the internal implementation, once we've decided how these should be exposed to users.

A few important decisions are encoded in the implementation (which we can change later):

1. Groups are enabled globally, for all dependencies. This differs from extras, which are enabled on a per-requirement basis. Note, however, that we'll only discover groups for uv-enabled packages anyway.
2. Installing a group requires installing the base package. We rely on this in PubGrub to ensure that we resolve to the same version (even though we only expect groups to come from workspace dependencies anyway, which are unique). But anyway, that's encoded in the resolver right now, just as it is for extras.
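
A sketch of the internal model described above (names are guesses, not uv's real API): the flat `dev-dependencies` list becomes a single named group, and syncing selects which groups to include. The base-package rule lives in the resolver and isn't shown here.

```rust
// Names are guesses for illustration, not uv's real API.
use std::collections::BTreeMap;

type Requirement = String; // stand-in for a parsed PEP 508 requirement

struct DependencyGroups {
    groups: BTreeMap<String, Vec<Requirement>>,
}

impl DependencyGroups {
    /// The flat `[tool.uv] dev-dependencies` list becomes one `dev` group.
    fn from_dev_dependencies(requirements: Vec<Requirement>) -> Self {
        let mut groups = BTreeMap::new();
        groups.insert("dev".to_string(), requirements);
        Self { groups }
    }

    /// When syncing, `--dev` opts into the `dev` group's requirements; other
    /// groups (none yet, in the current frontend) are always included.
    fn requirements_for_sync(&self, with_dev: bool) -> Vec<&Requirement> {
        self.groups
            .iter()
            .filter(|(name, _)| with_dev || name.as_str() != "dev")
            .flat_map(|(_, requirements)| requirements)
            .collect()
    }
}

fn main() {
    let groups = DependencyGroups::from_dev_dependencies(vec!["werkzeug".to_string()]);
    println!("default sync: {:?}", groups.requirements_for_sync(false));
    println!("--dev sync:   {:?}", groups.requirements_for_sync(true));
}
```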

### 17c043536b: uv-resolver: thread markers through the resolver and into the lock file

This addresses the lack of marker support in prior commits. Specifically, we add markers as a new field on `AnnotatedDist`, and from there they get added to a `Distribution` in a `Lock`.
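
The plumbing in miniature (both structs heavily simplified): the marker rides along on the annotated distribution and is copied into the lock entry.

```rust
// Both structs are heavily simplified; the real types carry much more.
struct AnnotatedDist {
    name: String,
    version: String,
    /// The new field: the marker under which this distribution applies.
    marker: Option<String>,
}

/// A lock-file entry.
struct Distribution {
    name: String,
    version: String,
    marker: Option<String>,
}

impl From<&AnnotatedDist> for Distribution {
    fn from(dist: &AnnotatedDist) -> Self {
        Distribution {
            name: dist.name.clone(),
            version: dist.version.clone(),
            marker: dist.marker.clone(), // carried through into the lock
        }
    }
}

fn main() {
    let annotated = AnnotatedDist {
        name: "urllib3".to_string(),
        version: "2.2.1".to_string(),
        marker: Some("python_version > '3.7'".to_string()),
    };
    let locked = Distribution::from(&annotated);
    println!("{} {} ; {:?}", locked.name, locked.version, locked.marker);
}
```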

### 76418f5bdf: Arc-wrap `PubGrubPackage` for cheap cloning in pubgrub (#3688)

pubgrub stores incompatibilities as (package name, version range) tuples, meaning it needs to clone the package name for each incompatibility, and for each non-borrowed operation on incompatibilities. https://github.com/astral-sh/uv/pull/3673 made me realize that `PubGrubPackage` has gotten large (expensive to copy), so, as with `Version` and other structs, I've added an `Arc` wrapper around it.

It's a pity clippy forbids `.deref()`; it's less opaque than `&**` and has IDE support (clicking on `.deref()` jumps to the right impl).

**Benchmarks**

It looks like this matters most for complex resolutions, I assume because they carry larger `PubGrubPackageInner::Package` and `PubGrubPackageInner::Extra` variants.

```bash
hyperfine --warmup 5 "./uv-main pip compile -q ./scripts/requirements/jupyter.in" "./uv-branch pip compile -q ./scripts/requirements/jupyter.in"
hyperfine --warmup 5 "./uv-main pip compile -q ./scripts/requirements/airflow.in" "./uv-branch pip compile -q ./scripts/requirements/airflow.in"
hyperfine --warmup 5 "./uv-main pip compile -q ./scripts/requirements/boto3.in" "./uv-branch pip compile -q ./scripts/requirements/boto3.in"
```

```
Benchmark 1: ./uv-main pip compile -q ./scripts/requirements/jupyter.in
  Time (mean ± σ):      18.2 ms ±  1.6 ms    [User: 14.4 ms, System: 26.0 ms]
  Range (min … max):    15.8 ms … 22.5 ms    181 runs

Benchmark 2: ./uv-branch pip compile -q ./scripts/requirements/jupyter.in
  Time (mean ± σ):      17.8 ms ±  1.4 ms    [User: 14.4 ms, System: 25.3 ms]
  Range (min … max):    15.4 ms … 23.1 ms    159 runs

Summary
  ./uv-branch pip compile -q ./scripts/requirements/jupyter.in ran
    1.02 ± 0.12 times faster than ./uv-main pip compile -q ./scripts/requirements/jupyter.in
```

```
Benchmark 1: ./uv-main pip compile -q ./scripts/requirements/airflow.in
  Time (mean ± σ):     153.7 ms ±  3.5 ms    [User: 165.2 ms, System: 157.6 ms]
  Range (min … max):   150.4 ms … 163.0 ms    19 runs

Benchmark 2: ./uv-branch pip compile -q ./scripts/requirements/airflow.in
  Time (mean ± σ):     123.9 ms ±  4.6 ms    [User: 152.4 ms, System: 133.8 ms]
  Range (min … max):   118.4 ms … 138.1 ms    24 runs

Summary
  ./uv-branch pip compile -q ./scripts/requirements/airflow.in ran
    1.24 ± 0.05 times faster than ./uv-main pip compile -q ./scripts/requirements/airflow.in
```

```
Benchmark 1: ./uv-main pip compile -q ./scripts/requirements/boto3.in
  Time (mean ± σ):     327.0 ms ±  3.8 ms    [User: 344.5 ms, System: 71.6 ms]
  Range (min … max):   322.7 ms … 334.6 ms    10 runs

Benchmark 2: ./uv-branch pip compile -q ./scripts/requirements/boto3.in
  Time (mean ± σ):     311.2 ms ±  3.1 ms    [User: 339.3 ms, System: 63.1 ms]
  Range (min … max):   307.8 ms … 317.0 ms    10 runs

Summary
  ./uv-branch pip compile -q ./scripts/requirements/boto3.in ran
    1.05 ± 0.02 times faster than ./uv-main pip compile -q ./scripts/requirements/boto3.in
```
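
The pattern in miniature, with the inner enum reduced to two variants: cloning the wrapper bumps a reference count instead of deep-copying the strings inside.

```rust
use std::ops::Deref;
use std::sync::Arc;

enum PubGrubPackageInner {
    Root,
    Package { name: String, extra: Option<String> },
}

#[derive(Clone)]
struct PubGrubPackage(Arc<PubGrubPackageInner>);

impl Deref for PubGrubPackage {
    type Target = PubGrubPackageInner;
    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

fn main() {
    let root = PubGrubPackage(Arc::new(PubGrubPackageInner::Root));
    let pkg = PubGrubPackage(Arc::new(PubGrubPackageInner::Package {
        name: "airflow".to_string(),
        extra: None,
    }));
    // Storing a package in an incompatibility now costs a refcount bump, not
    // a deep copy of the strings inside.
    let stored = pkg.clone();
    let _root_again = root.clone();
    if let PubGrubPackageInner::Package { name, extra } = &*stored {
        println!("stored: {name} (extra: {extra:?})");
    }
}
```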

### 776a7e47f3: uv-resolver: add `Option<MarkerTree>` to `PubGrubPackage`

This just adds the field to the type and always sets it to `None`; there are no semantic changes in this commit.

Closes #3359.

### eac8221718: uv-resolver: use named fields for some PubGrubPackage variants

I'm planning to add another field here (markers), which puts a lot of stress on the positional approach. So let's just switch over to named fields.
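
A before/after sketch of the refactor (toy enums, not the real variants): with named fields, adding a future `marker` field won't silently shift the meaning of the other positions.

```rust
// Toy enums; the real variants are `PubGrubPackageInner::Package`,
// `::Extra`, and friends.

// Before: which String is the name and which the extra?
enum Positional {
    Extra(String, String),
}

// After: self-describing, and a future `marker` field slots in without
// renumbering anything.
enum Named {
    Extra { name: String, extra: String },
}

fn main() {
    let positional = Positional::Extra("flask".to_string(), "dotenv".to_string());
    let named = Named::Extra { name: "flask".to_string(), extra: "dotenv".to_string() };
    match (positional, named) {
        (Positional::Extra(p_name, p_extra), Named::Extra { name, extra }) => {
            println!("positional: {p_name}[{p_extra}] / named: {name}[{extra}]");
        }
    }
}
```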

### 39af09f09b: Parallelize resolver (#3627)

This PR introduces parallelism to the resolver. Specifically, we can perform PubGrub resolution on a separate thread, while keeping all I/O on the tokio thread. We already have the infrastructure set up for this with the channel and `OnceMap`, which makes this change relatively simple. The big change needed to make this possible is removing the lifetimes on some of the types that need to be shared between the resolver and the PubGrub thread.

A related PR, https://github.com/astral-sh/uv/pull/1163, found that adding `yield_now` calls improved throughput. With optimal scheduling we might be able to get away with everything on the same thread here. However, in the ideal pipeline with perfect prefetching, the resolution and prefetching can run completely in parallel without depending on one another. While this would be very difficult to achieve, even with our current prefetching pattern we see a consistent performance improvement from parallelism.

This does also require reverting a few of the changes from https://github.com/astral-sh/uv/pull/3413, but not all of them. The sharing is isolated to the resolver task.

**Test Plan**

On smaller tasks performance is mixed, with ~2% improvements/regressions on both sides. However, on medium-to-large resolution tasks we see the benefits of parallelism, with improvements anywhere from 10-50%.

```
./scripts/requirements/jupyter.in
Benchmark 1: ./target/profiling/baseline (resolve-warm)
  Time (mean ± σ):      29.2 ms ±  1.8 ms    [User: 20.3 ms, System: 29.8 ms]
  Range (min … max):    26.4 ms … 36.0 ms    91 runs

Benchmark 2: ./target/profiling/parallel (resolve-warm)
  Time (mean ± σ):      25.5 ms ±  1.0 ms    [User: 19.5 ms, System: 25.5 ms]
  Range (min … max):    23.6 ms … 27.8 ms    99 runs

Summary
  ./target/profiling/parallel (resolve-warm) ran
    1.15 ± 0.08 times faster than ./target/profiling/baseline (resolve-warm)
```

```
./scripts/requirements/boto3.in
Benchmark 1: ./target/profiling/baseline (resolve-warm)
  Time (mean ± σ):     487.1 ms ±  6.2 ms    [User: 464.6 ms, System: 61.6 ms]
  Range (min … max):   480.0 ms … 497.3 ms    10 runs

Benchmark 2: ./target/profiling/parallel (resolve-warm)
  Time (mean ± σ):     430.8 ms ±  9.3 ms    [User: 529.0 ms, System: 77.2 ms]
  Range (min … max):   417.1 ms … 442.5 ms    10 runs

Summary
  ./target/profiling/parallel (resolve-warm) ran
    1.13 ± 0.03 times faster than ./target/profiling/baseline (resolve-warm)
```

```
./scripts/requirements/airflow.in
Benchmark 1: ./target/profiling/baseline (resolve-warm)
  Time (mean ± σ):     478.1 ms ± 18.8 ms    [User: 482.6 ms, System: 205.0 ms]
  Range (min … max):   454.7 ms … 508.9 ms    10 runs

Benchmark 2: ./target/profiling/parallel (resolve-warm)
  Time (mean ± σ):     308.7 ms ± 11.7 ms    [User: 428.5 ms, System: 209.5 ms]
  Range (min … max):   287.8 ms … 323.1 ms    10 runs

Summary
  ./target/profiling/parallel (resolve-warm) ran
    1.55 ± 0.08 times faster than ./target/profiling/baseline (resolve-warm)
```
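
The architecture in miniature, using std threads and channels where uv uses tokio and `OnceMap` (all names invented): the solver thread does pure CPU work and asks the I/O side for metadata over a channel, so fetching overlaps with resolution.

```rust
// Sketch only: uv keeps I/O on the tokio runtime and shares results through
// a `OnceMap`; std threads and a channel show the same division of labor.
use std::sync::mpsc;
use std::thread;

enum Request {
    /// The solver needs metadata for a package before it can continue.
    FetchMetadata(String),
    Done,
}

fn main() {
    let (tx, rx) = mpsc::channel();

    // "PubGrub thread": pure CPU work; it never blocks on the network, it
    // just asks the I/O side for what it will need.
    let solver = thread::spawn(move || {
        for package in ["werkzeug", "flask", "jinja2"] {
            tx.send(Request::FetchMetadata(package.to_string())).unwrap();
            // ... unit propagation continues while fetches are in flight ...
        }
        tx.send(Request::Done).unwrap();
    });

    // "I/O side": in uv this is async; here we just log what would be fetched.
    for request in rx {
        match request {
            Request::FetchMetadata(package) => println!("fetching metadata for {package}"),
            Request::Done => break,
        }
    }
    solver.join().unwrap();
}
```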

### 018a7150d6: uv-distribution: include all wheels in distribution types (#3595)

Our current flow of data from "simple registry package" to "final resolved distribution" goes through a number of types:

* `SimpleMetadata` is the API response from a registry that includes all published versions for a package. Each version has an assortment of metadata associated with it.
* `VersionFiles` is the aforementioned metadata. It is split in two: a group of files for source distributions and a group of files for wheels.
* `PrioritizedDist` collects a subset of the files from `VersionFiles` to form a selection of the "best" sdist and the "best" wheel for the current environment.
* `CompatibleDist` is created from a borrowed `PrioritizedDist` and, perhaps among other things, encapsulates the decision of whether to pick an sdist or a wheel. (This decision depends both on compatibility and on the action being performed. e.g., when doing installation, a `CompatibleDist` will sometimes select an sdist over a wheel.)
* `ResolvedDistRef` is like a `ResolvedDist`, but borrows a `Dist`.
* `ResolvedDist` is the almost-final form of a distribution in a resolution and is created from a `ResolvedDistRef`.
* `AnnotatedResolvedDist` is a new data type that is the actual final form of a distribution that a universal lock file cares about. It bundles a `ResolvedDist` with some metadata needed to generate a lock file.

One of the requirements of a universal lock file is that we include all wheels (and maybe all source distributions? but at least one if it's present) associated with a distribution. But the above flow of data (in the step from `VersionFiles` to `PrioritizedDist`) drops all wheels except for the best one.

To remedy this, in this PR, we rejigger `PrioritizedDist`, `CompatibleDist`, and `ResolvedDistRef` so that all wheel data is preserved. When a `ResolvedDistRef` is finally turned into a `ResolvedDist`, we copy all of the wheel data. And finally, we adjust the `Lock` constructor to read this new data and include it in the lock file. To make this work, we also modify `RegistryBuiltDist` so that it can contain one or more wheels instead of just one.

One shortcoming here (called out in the code as a FIXME) is that if a source distribution is selected as the "best" thing to use (perhaps there are no compatible wheels), then the wheels won't end up in the lock file. I plan to fix this in a follow-up PR.

We also aren't totally consistent on source distribution naming. Sometimes we use `sdist`. Sometimes `source`. Sometimes `source_dist`. I think it'd be nice to just use `sdist` everywhere, but I do prefer the type names to be `SourceDist`. And sometimes you want function names to match the type names (i.e., `from_source_dist`), which in turn leads to an appearance of inconsistency. I'm open to ideas.

Closes #3351.
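
The shape of the change, with drastically simplified types: `RegistryBuiltDist` keeps every wheel plus the index of the preferred one, so installation can still pick the best wheel while the lock file lists them all.

```rust
// Drastically simplified: the real change touches `PrioritizedDist`,
// `CompatibleDist`, and `ResolvedDistRef` as well.
struct RegistryBuiltWheel {
    filename: String,
}

struct RegistryBuiltDist {
    /// All wheels for this version, not just the preferred one.
    wheels: Vec<RegistryBuiltWheel>,
    /// Which wheel installation should prefer for the current environment.
    best_wheel_index: usize,
}

impl RegistryBuiltDist {
    /// What installation wants: the single best wheel.
    fn best_wheel(&self) -> &RegistryBuiltWheel {
        &self.wheels[self.best_wheel_index]
    }

    /// What the universal lock file wants: every wheel.
    fn all_wheels(&self) -> &[RegistryBuiltWheel] {
        &self.wheels
    }
}

fn main() {
    let dist = RegistryBuiltDist {
        wheels: vec![
            RegistryBuiltWheel { filename: "pkg-1.0-py3-none-any.whl".to_string() },
            RegistryBuiltWheel { filename: "pkg-1.0-cp312-manylinux_2_17_x86_64.whl".to_string() },
        ],
        best_wheel_index: 1,
    };
    println!("install: {}", dist.best_wheel().filename);
    for wheel in dist.all_wheels() {
        println!("lock:    {}", wheel.filename);
    }
}
```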

### d1b07a3f49: Log versions tried from batch prefetch (#3090)

This is required for evaluating #3087. This also removes the tracking of extras' virtual packages from the batch prefetcher (we only track real packages). Let's look at some stats:

* jupyter: Tried 100 versions: anyio 1, argon2-cffi 1, argon2-cffi-bindings 1, arrow 1, asttokens 1, async-lru 1, attrs 1, babel 1, beautifulsoup4 1, bleach 1, certifi 1, cffi 1, charset-normalizer 1, comm 1, debugpy 1, decorator 1, defusedxml 1, exceptiongroup 1, executing 1, fastjsonschema 1, fqdn 1, h11 1, httpcore 1, httpx 1, idna 1, ipykernel 1, ipython 1, ipywidgets 1, isoduration 1, jedi 1, jinja2 1, json5 1, jsonpointer 1, jsonschema 1, jsonschema-specifications 1, jupyter 1, jupyter-client 1, jupyter-console 1, jupyter-core 1, jupyter-events 1, jupyter-lsp 1, jupyter-server 1, jupyter-server-terminals 1, jupyterlab 1, jupyterlab-pygments 1, jupyterlab-server 1, jupyterlab-widgets 1, markupsafe 1, matplotlib-inline 1, mistune 1, nbclient 1, nbconvert 1, nbformat 1, nest-asyncio 1, notebook 1, notebook-shim 1, overrides 1, packaging 1, pandocfilters 1, parso 1, pexpect 1, platformdirs 1, prometheus-client 1, prompt-toolkit 1, psutil 1, ptyprocess 1, pure-eval 1, pycparser 1, pygments 1, python-dateutil 1, python-json-logger 1, pyyaml 1, pyzmq 1, qtconsole 1, qtpy 1, referencing 1, requests 1, rfc3339-validator 1, rfc3986-validator 1, root 1, rpds-py 1, send2trash 1, six 1, sniffio 1, soupsieve 1, stack-data 1, terminado 1, tinycss2 1, tomli 1, tornado 1, traitlets 1, types-python-dateutil 1, typing-extensions 1, uri-template 1, urllib3 1, wcwidth 1, webcolors 1, webencodings 1, websocket-client 1, widgetsnbextension 1
* boto3: botocore 1697, boto3 849, urllib3 2, jmespath 1, python-dateutil 1, root 1, s3transfer 1, six 1
* transformers-extras: Tried 1191 versions: sagemaker 152, hypothesis 67, tensorflow 21, jsonschema 19, tensorflow-cpu 18, multiprocess 10, pathos 10, tensorflow-text 10, chex 8, tf-keras 8, tf2onnx 8, aiohttp 6, aiosignal 6, alembic 6, annotated-types 6, apscheduler 6, attrs 6, backoff 6, binaryornot 6, black 6, boto3 6, click 6, coloredlogs 6, colorlog 6, dash 6, dash-bootstrap-components 6, dlinfo 6, exceptiongroup 6, execnet 6, fire 6, frozenlist 6, gitdb 6, google-auth 6, google-auth-oauthlib 6, hjson 6, iniconfig 6, jinja2-time 6, markdown 6, markdown-it-py 6, markupsafe 6, mpmath 6, namex 6, nbformat 6, ninja 6, nvidia-nvjitlink-cu12 6, onnxconverter-common 6, pandas 6, plac 6, platformdirs 6, pluggy 6, portalocker 6, poyo 6, protobuf3-to-dict 6, py-cpuinfo 6, py3nvml 6, pyarrow 6, pyarrow-hotfix 6, pydantic-core 6, pygments 6, pynvml 6, pypng 6, python-slugify 6, responses 6, smdebug-rulesconfig 6, soupsieve 6, sqlalchemy 6, tensorboard-data-server 6, tensorboard-plugin-wit 6, tensorboardx 6, threadpoolctl 6, tomli 6, wasabi 6, wcwidth 6, werkzeug 6, wheel 6, xxhash 6, zipp 6, etils 5, tensorboard 5, beautifulsoup4 4, cffi 4, clldutils 4, codecarbon 4, datasets 4, dill 4, evaluate 4, gitpython 4, hf-doc-builder 4, kenlm 4, librosa 4, llvmlite 4, nest-asyncio 4, nltk 4, optuna 4, parameterized 4, phonemizer 4, psutil 4, pyctcdecode 4, pytest 4, pytest-timeout 4, pytest-xdist 4, ray 4, rjieba 4, rouge-score 4, ruff 4, sacrebleu 4, sacremoses 4, sigopt 4, sortedcontainers 4, tensorstore 4, timeout-decorator 4, toolz 4, torchaudio 4, accelerate 3, audioread 3, cookiecutter 3, decorator 3, deepspeed 3, faiss-cpu 3, flax 3, fugashi 3, ipadic 3, isort 3, jax 3, jaxlib 3, joblib 3, keras-nlp 3, lazy-loader 3, numba 3, optax 3, pooch 3, pydantic 3, pygtrie 3, rhoknp 3, scikit-learn 3, segments 3, soundfile 3, soxr 3, sudachidict-core 3, sudachipy 3, torch 3, unidic 3, unidic-lite 3, urllib3 3, absl-py 2, arrow 2, astunparse 2, async-timeout 2, botocore 2, cachetools 2, certifi 2, chardet 2, charset-normalizer 2, csvw 2, dash-core-components 2, dash-html-components 2, dash-table 2, diffusers 2, dm-tree 2, fastjsonschema 2, flask 2, flatbuffers 2, fsspec 2, ftfy 2, gast 2, google-pasta 2, greenlet 2, grpcio 2, h5py 2, humanfriendly 2, idna 2, importlib-metadata 2, importlib-resources 2, jinja2 2, jmespath 2, jupyter-core 2, kagglehub 2, keras 2, keras-core 2, keras-preprocessing 2, libclang 2, mako 2, mdurl 2, ml-dtypes 2, msgpack 2, multidict 2, mypy-extensions 2, networkx 2, nvidia-cublas-cu12 2, nvidia-cuda-cupti-cu12 2, nvidia-cuda-nvrtc-cu12 2, nvidia-cuda-runtime-cu12 2, nvidia-cudnn-cu12 2, nvidia-cufft-cu12 2, nvidia-curand-cu12 2, nvidia-cusolver-cu12 2, nvidia-cusparse-cu12 2, nvidia-nccl-cu12 2, nvidia-nvtx-cu12 2, onnx 2, onnxruntime 2, onnxruntime-tools 2, opencv-python 2, opt-einsum 2, orbax-checkpoint 2, pathspec 2, plotly 2, pox 2, ppft 2, pyasn1-modules 2, pycparser 2, pyrsistent 2, python-dateutil 2, pytz 2, requests-oauthlib 2, retrying 2, rich 2, rsa 2, s3transfer 2, scipy 2, setuptools 2, six 2, smmap 2, sympy 2, tabulate 2, tensorflow-estimator 2, tensorflow-hub 2, tensorflow-io-gcs-filesystem 2, termcolor 2, text-unidecode 2, traitlets 2, triton 2, typing-extensions 2, tzdata 2, tzlocal 2, wrapt 2, xmltodict 2, yarl 2, Python 1, av 1, babel 1, bibtexparser 1, blinker 1, colorama 1, decord 1, filelock 1, huggingface-hub 1, isodate 1, itsdangerous 1, language-tags 1, lxml 1, numpy 1, oauthlib 1, packaging 1, pillow 1, protobuf 1, pyasn1 1, pylatexenc 1, pyparsing 1, pyyaml 1, rdflib 1, regex 1, requests 1, rfc3986 1, root 1, safetensors 1, sentencepiece 1, tenacity 1, timm 1, tokenizers 1, torchvision 1, tqdm 1, transformers 1, types-python-dateutil 1, uritemplate 1

You can reproduce these numbers with Python 3.10 and:

```
RUST_LOG=uv_resolver=debug cargo run pip compile -o /dev/null -q scripts/requirements/<input>.in 2>&1 | tail -n 1
```

Closes #2270. This is less invasive than the other PR; we can revisit tracking the number of network/build requests later.
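
A guess at the tally behind these log lines (not the actual prefetcher code): count each attempted version per real package, and skip the virtual packages that extras expand into.

```rust
// Assumed structure, not the actual prefetcher code.
use std::collections::HashMap;

enum Package {
    Real(String),
    /// Virtual package that an extra expands into; excluded from the tally.
    Extra,
}

struct TriedVersions(HashMap<String, u64>);

impl TriedVersions {
    fn record(&mut self, package: &Package) {
        if let Package::Real(name) = package {
            *self.0.entry(name.clone()).or_default() += 1;
        }
    }

    fn total(&self) -> u64 {
        self.0.values().sum()
    }
}

fn main() {
    let mut tried = TriedVersions(HashMap::new());
    tried.record(&Package::Real("botocore".to_string()));
    tried.record(&Package::Real("botocore".to_string()));
    tried.record(&Package::Real("boto3".to_string()));
    tried.record(&Package::Extra); // ignored
    println!("Tried {} versions: {:?}", tried.total(), tried.0);
}
```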

### 96c3c2e774: Support unnamed requirements in `--require-hashes` (#2993)

This PR enables `--require-hashes` with unnamed requirements. The key change is that `PackageId` becomes `VersionId` (since it refers to a package at a specific version), and the new `PackageId` consists of _either_ a package name _or_ a URL. The hashes are keyed by `PackageId`, so we can generate the `RequiredHashes` before we have names for all packages, and enforce them throughout.

Closes #2979.
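
A simplified sketch of the new keying (`describe` is invented for the printout): because a `PackageId` can be a URL, hashes for unnamed requirements can be registered and enforced before any name is known.

```rust
// Simplified: real hashes are typed digests, and the old name-plus-version
// key (`VersionId`) is elided. `describe` is invented for the printout.
use std::collections::HashMap;

#[derive(Hash, PartialEq, Eq)]
enum PackageId {
    Name(String),
    Url(String),
}

impl PackageId {
    fn describe(&self) -> &str {
        match self {
            PackageId::Name(name) => name,
            PackageId::Url(url) => url,
        }
    }
}

#[derive(Default)]
struct RequiredHashes(HashMap<PackageId, Vec<String>>);

impl RequiredHashes {
    fn insert(&mut self, id: PackageId, sha256: &str) {
        println!("registering hash for {}", id.describe());
        self.0.entry(id).or_default().push(format!("sha256:{sha256}"));
    }

    fn get(&self, id: &PackageId) -> Option<&[String]> {
        self.0.get(id).map(Vec::as_slice)
    }
}

fn main() {
    let mut hashes = RequiredHashes::default();
    // Named requirement:
    hashes.insert(PackageId::Name("idna".to_string()), "abc123");
    // Unnamed requirement, keyed by URL until its name is known:
    hashes.insert(PackageId::Url("https://example.invalid/pkg.whl".to_string()), "def456");
    assert!(hashes.get(&PackageId::Name("idna".to_string())).is_some());
}
```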

### f9c0632953: Ignore direct URL distributions in prefetcher (#2943)

The prefetcher tallies the number of times we tried a given package, and once we hit a threshold, grabs the version map, assuming it's already been fetched. For direct URL distributions, though, we don't have a version map, and there's no need to prefetch.

Closes https://github.com/astral-sh/uv/issues/2941.
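
The guard in sketch form (types simplified): only registry requirements are tallied, because only they have a version map to prefetch from.

```rust
// Simplified guard; the real prefetcher works on uv's requirement types.
enum RequirementSource {
    Registry { name: String },
    DirectUrl { url: String },
}

fn maybe_prefetch(source: &RequirementSource) {
    match source {
        RequirementSource::Registry { name } => {
            // Registry packages have a version map of candidates to pull
            // from once the try-count threshold is hit.
            println!("tallying {name} for batch prefetch");
        }
        RequirementSource::DirectUrl { url } => {
            // A pinned URL has no version map (and exactly one candidate),
            // so there is nothing to prefetch.
            println!("skipping prefetch for {url}");
        }
    }
}

fn main() {
    maybe_prefetch(&RequirementSource::Registry { name: "flask".to_string() });
    maybe_prefetch(&RequirementSource::DirectUrl {
        url: "https://example.invalid/flask-3.0-py3-none-any.whl".to_string(),
    });
}
```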

### fb4ba2bbc2: Speed up cold cache urllib3/boto3/botocore with batched prefetching (#2452)

Speed up cold cache urllib3/boto3/botocore with batched prefetching (#2452)
With pubgrub being fast for complex ranges, we can now compute the next n candidates without taking a performance hit. This speeds up cold cache `urllib3<1.25.4` `boto3` from maybe 40s - 50s to ~2s. See docstrings for details on the heuristics. **Before**  **After**  --- We need two parts of the prefetching, first looking for compatible version and then falling back to flat next versions. After we selected a boto3 version, there is only one compatible botocore version remaining, so when won't find other compatible candidates for prefetching. We see this as a pattern where we only prefetch boto3 (stack bars), but not botocore (sequential requests between the stacked bars).  The risk is that we're completely wrong with the guess and cause a lot of useless network requests. I think this is acceptable since this mechanism only triggers when we're already on the bad path and we should simply have fetched all versions after some seconds (assuming a fast index like pypi). --- It would be even better if the pubgrub state was copy-on-write so we could simulate more progress than we actually have; currently we're guessing what the next version is which could be completely wrong, but i think this is still a valuable heuristic. Fixes #170. |