## Summary
This PR restructures the flat index fetching in a few ways:
1. It now lives in its own `FlatIndexClient`, since it felt a bit
awkward (in my opinion) for it to live in `RegistryClient`.
2. We now fetch the `FlatIndex` outside of the resolver. This has a few
benefits: (1) the resolver construct is no longer `async` and no longer
returns `Result`, which feels better for a resolver; and (2) we can
share the `FlatIndex` across resolutions rather than re-fetching it for
every source distribution build.
## Summary
`FlatIndex` is now the thing that's keyed on `PackageName`, while
`FlatDistributions` is what used to be called `FlatIndex` (a map from
version to `PrioritizedDistribution`, for a single package). I find this
a bit clearer, since we can also remove the `from_files` that doesn't
return `Self`, which I had trouble following.
## Summary
I'm running into some annoyances converting `&Version` to
`&PubGrubVersion` (which is just a wrapper type around `Version`), and I
realized... We don't even need `PubGrubVersion`?
The reason we "need" it today is due to the orphan trait rule: `Version`
is defined in `pep440_rs`, but we want to `impl
pubgrub::version::Version for Version` in the resolver crate.
Instead of introducing a new type here, which leads to a lot of
awkwardness around conversion and API isolation, what if we instead just
implement `pubgrub::version::Version` in `pep440_rs` via a feature? That
way, we can just use `Version` everywhere without any confusion and
conversion for the wrapper type.
Add directory `--find-links` support for local paths to pip-compile.
It seems that pip joins all sources and then picks the best package. We
explicitly give find links packages precedence if the same exists on an
index and locally by prefilling the `VersionMap`, otherwise they are
added as another index and the existing rules of precedence apply.
Internally, the feature is called _flat index_, which is more meaningful
than _find links_: We're not looking for links, we're picking up local
directories, and (TBD) support another index format that's just a flat
list of files instead of a nested index.
`RegistryBuiltDist` and `RegistrySourceDist` now use `WheelFilename` and
`SourceDistFilename` respectively. The `File` inside `RegistryBuiltDist`
and `RegistrySourceDist` gained the ability to represent both a url and
a path so that `--find-links` with a url and with a path works the same,
both being locked as `<package_name>@<version>` instead of
`<package_name> @ <url>`. (This is more of a detail, this PR in general
still work if we strip that and have directory find links represented as
`<package_name> @ file:///path/to/file.ext`)
`PrioritizedDistribution` and `FlatIndex` have been moved to locations
where we can use them in the upstack PR.
I added a `scripts/wheels` directory with stripped down wheels to use
for testing.
We're lacking tests for correct tag priority precedence with flat
indexes, i only confirmed this manually since it is not covered in the
pip-compile or pip-sync output.
Closes#876
There is no guarantee that indexes provide hashes at all or the sha256
we support specifically. [PEP
503](https://peps.python.org/pep-0503/#specification):
> The URL SHOULD include a hash in the form of a URL fragment with the
following syntax: #<hashname>=<hashvalue>, where <hashname> is the
lowercase name of the hash function (such as sha256) and <hashvalue> is
the hex encoded digest.
We instead use the url as input to generate a hash when caching.
## Summary
We always normalize extra names in our requirements (e.g., `cuda12_pip`
to `cuda12-pip`), but we weren't normalizing within PEP 508 markers,
which meant we ended up comparing `cuda12-pip` (normalized) against
`cuda12_pip` (unnormalized).
Closes https://github.com/astral-sh/puffin/issues/911.
Previously, the url on file could either be a relative or an absolute
url, depending on the index, and we would finalize it lazily. Now we
finalize the url when converting `pypi_types::File` to
`distribution_types::File`. This change is required to make the hashes
on `File` optional (https://github.com/astral-sh/puffin/pull/910), which
are currently the only unique field usable for caching.
## Summary
This PR ensures that when the user passes in `--python-version`, we
adjust the _markers_ to match the target version, thus forcing us to
select compatible wheels for the `--python-version`, rather than the
installed version.
## Context
Let's call Python 3.10 the "installed" environment and Python 3.12 the
"target" environment. For each version, we have _both_ a Python version
(to match against `Requires-Python`) and a set of tags (to match against
wheels).
The rules for resolution are as follows...
- For each package, for each version, we try to find the "best
candidate" for resolution and installation.
- We first look for a wheel that's compatible with the _target_
environment. This requires testing against both the `Requires-Python`
and the markers. (We won't have to build or run this code, so the
_installed_ version is irrelevant.) **(This PR corrects _this_ bullet --
previously, we validated against the _installed_ markers, rather than
the target markers.)**
- If we can't find a compatible wheel, we accept any _incompatible_
wheel as long as there's a source distribution. The source distribution
_must_ be compatible with the target environment. (We won't have to
build or run this code, so the _installed_ version is irrelevant.)
- If there are no wheels, then the source distribution must be
compatible with _both_ the installed and target environments, since we
need to build it.
This is all true for the top-level resolution. When we perform a
sub-resolution (when resolving the build dependencies of a source
distribution), we should _only_ use the installed environment, and
ignore the target environment, since we assume that the dependencies
will be the same in both environments once built -- so our goal is
"just" to build the distribution, without concern for which build
dependencies it uses.
Closes https://github.com/astral-sh/puffin/issues/883.
Remove a test case from the `install_editable` that slows it down from
3.6s to 6.5s while providing low test coverage. It also seems to block
other tests sometimes, `cargo nextest run -E "test(editable)"
--all-features` has more consistent and lower runtimes. Surprisingly
this seems to have bigger effect than switching from pyo3 to cffi.
Used test commands:
```
rm -rf scripts/editable-installs/maturin_editable/target/ && time cargo nextest run -E "test(=install_editable)" --all-features
rm -rf scripts/editable-installs/maturin_editable/target/ && time cargo nextest run -E "test(editable)" --all-features
```
Part of #878
Replace the DTLSsocket test with a dummy package that does nothing but
contain the build system specs that we need. This should speed up one of
the slowest tests.
Part of #878
## Summary
Now that `get_or_build_wheel` will often _also_ handle the unzip step,
we need to move our per-target locking (`OnceMap`) up a level.
Previously, it was only applied to the unzip step, to prevent us from
attempting to unzip into the same target concurrently; now, it's applied
at the `get_wheel` level, which includes both downloading and unzipping.
## Test Plan
It seems like none of our existing tests catch this -- perhaps because
they're too "simple"? You need to run into a situation in which you're
doing multiple source distribution builds concurrently (since they'll
all try to download `setuptools`):
```
rm -rf foo && virtualenv --clear .venv && cargo run -p puffin-cli -- pip-compile ./scripts/requirements/pydantic.in --verbose --cache-dir foo
```
Reduces the number of implementation branches handling `Range:full`,
deferring it to `PackageRange`.
Improves some user-facing messages, e.g. saying `all versions of
<package>` instead of `<package>*`.
Changes the member names of the `PackageRangeKind` enum — they were not
very clear.
## Summary
Installs the seed packages you get with `virtualenv`, but opt-in rather
than opt-out.
Closes https://github.com/astral-sh/puffin/issues/852.
## Test Plan
```
❯ ./scripts/benchmarks/venv.sh
+ hyperfine --runs 20 --warmup 3 --prepare 'rm -rf .venv' './target/release/puffin venv' --prepare 'rm -rf .venv' 'virtualenv --without-pip .venv' --prepare 'rm -rf .venv' 'python -m venv --without-pip .venv'
Benchmark 1: ./target/release/puffin venv
Time (mean ± σ): 4.6 ms ± 0.2 ms [User: 2.4 ms, System: 3.6 ms]
Range (min … max): 4.3 ms … 4.9 ms 20 runs
Warning: Command took less than 5 ms to complete. Note that the results might be inaccurate because hyperfine can not calibrate the shell startup time much more precise than this limit. You can try to use the `-N`/`--shell=none` option to disable the shell completely.
Benchmark 2: virtualenv --without-pip .venv
Time (mean ± σ): 73.3 ms ± 0.3 ms [User: 57.4 ms, System: 14.2 ms]
Range (min … max): 72.8 ms … 74.0 ms 20 runs
Benchmark 3: python -m venv --without-pip .venv
Time (mean ± σ): 22.5 ms ± 0.3 ms [User: 17.0 ms, System: 4.9 ms]
Range (min … max): 22.0 ms … 23.2 ms 20 runs
Summary
'./target/release/puffin venv' ran
4.92 ± 0.20 times faster than 'python -m venv --without-pip .venv'
16.00 ± 0.63 times faster than 'virtualenv --without-pip .venv'
+ hyperfine --runs 20 --warmup 3 --prepare 'rm -rf .venv' './target/release/puffin venv --seed' --prepare 'rm -rf .venv' 'virtualenv .venv' --prepare 'rm -rf .venv' 'python -m venv .venv'
Benchmark 1: ./target/release/puffin venv --seed
Time (mean ± σ): 20.2 ms ± 0.4 ms [User: 8.6 ms, System: 15.7 ms]
Range (min … max): 19.7 ms … 21.2 ms 20 runs
Benchmark 2: virtualenv .venv
Time (mean ± σ): 135.1 ms ± 2.4 ms [User: 66.7 ms, System: 65.7 ms]
Range (min … max): 133.2 ms … 142.8 ms 20 runs
Benchmark 3: python -m venv .venv
Time (mean ± σ): 1.656 s ± 0.014 s [User: 1.447 s, System: 0.186 s]
Range (min … max): 1.641 s … 1.697 s 20 runs
Summary
'./target/release/puffin venv --seed' ran
6.67 ± 0.17 times faster than 'virtualenv .venv'
81.79 ± 1.70 times faster than 'python -m venv .venv'
```
## Summary
Our current setup uses the legacy `setup.py`-based builds if a
`pyproject.toml` file isn't present. This matches pip's behavior.
However, `pypa/build` uses PEP 517-based builds in such cases, and it
looks like pip plans to make that the default
(https://github.com/pypa/pip/issues/9175), with the limiting factor
being performance issues related to isolated builds.
This is now the default behavior, but the `--legacy-setup-py` flag
allows users to opt-in to using `setup.py` directly for distributions
that lack a `pyproject.toml`.
## Summary
This PR adds support for `prepare_metadata_for_build_wheel`, which
allows us to determine source distribution metadata without building the
source distribution. This represents an optimization for the resolver,
as we can skip the expensive build phase for build backends that support
it.
For reference, `prepare_metadata_for_build_wheel` seems to be supported
by:
- `hatchling` (as of
[1.0.9](https://hatch.pypa.io/latest/history/hatchling/#hatchling-v1.9.0)).
- `flit`
- `setuptools`
In fact, it seems to work for every backend _except_ those using legacy
`setup.py`.
Closes#599.
Refactoring split out from find links support: Find links files can be
represented as `Dist`, but not really as `File`, they don't have url nor
hashes.
`DistRequiresPython` is somewhat odd as an in between type.
Changes `File::size` from a `usize` to a `u64`.
The motivations are that with tensorflow wheels being 475 MB
(https://pypi.org/project/tensorflow/2.15.0.post1/#files), we're already
only one order of magnitude away and to avoid target dependent failures.
## Summary
If pre-releases are available for a package that we otherwise couldn't
resolve, we now show a hint that includes one of the example versions.
Closes https://github.com/astral-sh/puffin/issues/811.
Instead of trying to fixup _all_ the invalid version specifiers on pypi
and elsewhere, this filters out distributions with invalid
`requires-python` version specifiers that even
`LenientVersionSpecifiers` couldn't parse, as opposed to failing
entirely, which we currently do.
I would be nicer to model through an invalid distribution pubgrub type,
together with e.g. source dists with an unknown extension, so that the
version itself still shows up in the error trace.
At the same time, we reduce the log level for fixups from warning to
trace, as they are not actionable for the user.
Adjusts display of "no versions available" in error messages to be
consistent with other package/range pairings i.e. we usually display
"<package-name><range>".
## Summary
Right now, both the callback _and_ the "We have no compatible wheel"
paths have a lot of repeated code. This PR changes the callback to
_just_ remove all the wheels and handle the download, and the rest of
the method following the callback is responsible for finding and
building any wheels.
## Summary
I'm unable to run `puffin-cli` on `main` as the
`tracing-durations-export` is marked as optional, but the crate actually
depends on it to compile. Further, without `tracing-durations-export`,
there are `Option` types that can't resolve to a concrete type.
This PR fixes compilation with and without the feature.
Noticed these when working on something unrelated. Generally:
- Prefer `entry.file_type()` over `entry.path().is_file()` or similar,
as the former is almost always free on Unix.
- Call `entry.path()` once, since it allocates internally (returns a
`PathBuf`).
In the past, I moved us to `owo-colors`
(https://github.com/astral-sh/puffin/pull/121); then, we moved back,
because we ran into issues with overriding the settings to force-disable
colors. But `anstream` solved those problems, so I'm moving us _back_ to
`owo-colors`, since it's what `anstream` recommends, and it's already
used by many of our dependencies (`miette`, `configparser`).
---------
Co-authored-by: konstin <konstin@mailbox.org>
## Summary
We can use `anstream` for all color control, rather than going through
`colored`. Note that we still need the `colored` crate, since `colored`
and `anstream` solve different problems. (`anstream` recommends using
`owo-colors` alongside it, but `colored` seems to work fine?)
Resolves the issue raised in
https://github.com/astral-sh/puffin/pull/742 via `anstream` rather than
`colored`.
Closes https://github.com/astral-sh/puffin/issues/782.
Looking at the profile for tf-models-nightly after #789,
`compare_release` is the single biggest item. Adding a fast path, we
avoid paying the cost for padding releases with 0s when they are the
same length, resulting in a 16% for this pathological case. Note that
this mainly happens because tf-models-nightly is almost all large dev
releases that hit the slow path.
**Before**

**After**

```
$ hyperfine --warmup 1 --runs 3 "target/profiling/main pip-compile -q scripts/requirements/tf-models-nightly.txt"
"target/profiling/puffin pip-compile -q scripts/requirements/tf-models-nightly.txt"
Benchmark 1: target/profiling/main pip-compile -q scripts/requirements/tf-models-nightly.txt
Time (mean ± σ): 11.963 s ± 0.225 s [User: 11.478 s, System: 0.451 s]
Range (min … max): 11.747 s … 12.196 s 3 runs
Benchmark 2: target/profiling/puffin pip-compile -q scripts/requirements/tf-models-nightly.txt
Time (mean ± σ): 10.317 s ± 0.720 s [User: 9.885 s, System: 0.404 s]
Range (min … max): 9.501 s … 10.860 s 3 runs
Summary
target/profiling/puffin pip-compile -q scripts/requirements/tf-models-nightly.txt ran
1.16 ± 0.08 times faster than target/profiling/main pip-compile -q scripts/requirements/tf-models-nightly.txt
```
This PR builds on #780 by making both version parsing faster, and
perhaps more importantly, making version comparisons much faster.
Overall, these changes result in a considerable improvement for the
`boto3.in` workload. Here's the status quo:
```
$ time puffin pip-compile --no-build --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/requirements/boto3.in
Resolved 31 packages in 34.56s
real 34.579
user 34.004
sys 0.413
maxmem 2867 MB
faults 0
```
And now with this PR:
```
$ time puffin pip-compile --no-build --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/requirements/boto3.in
Resolved 31 packages in 9.20s
real 9.218
user 8.919
sys 0.165
maxmem 463 MB
faults 0
```
This particular workload gets stuck in pubgrub doing resolution, and
thus benefits mightily from a faster `Version::cmp` routine. With that
said, this change does also help a fair bit with "normal" runs:
```
$ hyperfine -w10 \
"puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in" \
"puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in"
Benchmark 1: puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
Time (mean ± σ): 337.5 ms ± 3.9 ms [User: 310.5 ms, System: 73.2 ms]
Range (min … max): 333.6 ms … 343.4 ms 10 runs
Benchmark 2: puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
Time (mean ± σ): 189.8 ms ± 3.0 ms [User: 168.1 ms, System: 78.4 ms]
Range (min … max): 185.0 ms … 196.2 ms 15 runs
Summary
puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in ran
1.78 ± 0.03 times faster than puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
```
There is perhaps some future work here (detailed in the commit
messages), but I suspect it would be more fruitful to explore ways of
making resolution itself and/or deserialization faster.
Fixes#373, Closes#396
Uses new metadata added in https://github.com/zanieb/packse/pull/61 to
assert that resolution succeeded or failed _and_ that the installed
package versions match the expected result.
The semantics are a bit unintuitive because `--python-version` is a
preference when looking for a python version without a venv, but if we
don't find that exact version we'll take `python3` and patch the
markers. This will make more sense once we start provisioning python
builds.
We can now resolve black with both python 3.8 and 3.12, with or without
that python version being in scope. In the example below,
`PATH=$HOME/.cargo/bin:/usr/bin` removes the pyenv builds and leaves
only `python3`, which is python 3.11.
```console
$ RUST_LOG=puffin::commands=debug cargo run --bin puffin -q -- pip-compile -v scripts/benchmarks/requirements/black.in --python-version py38
0.004108s DEBUG puffin::commands::pip_compile Using Python 3.8 at /home/konsti/.local/bin/python3.8
Resolved 8 packages in 44ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
# puffin pip-compile -v scripts/benchmarks/requirements/black.in --python-version py38
black==23.11.0
[...]
platformdirs==4.0.0
# via black
tomli==2.0.1
# via black
typing-extensions==4.8.0
# via black
$ PATH=$HOME/.cargo/bin:/usr/bin RUST_LOG=puffin::commands=debug cargo run --bin puffin -q -- pip-compile -v scripts/benchmarks/requirements/black.in --python-version py38
0.004315s DEBUG puffin::commands::pip_compile Using Python 3.11 at /usr/bin/python3
Resolved 8 packages in 43ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
# puffin pip-compile -v scripts/benchmarks/requirements/black.in --python-version py38
black==23.11.0
[...]
platformdirs==4.0.0
# via black
tomli==2.0.1
# via black
typing-extensions==4.8.0
# via black
```
```console
$ RUST_LOG=puffin::commands=debug cargo run --bin puffin -q -- pip-compile -v scripts/benchmarks/requirements/black.in --python-version py312
0.004216s DEBUG puffin::commands::pip_compile Using Python 3.12 at /home/konsti/.local/bin/python3.12
Resolved 6 packages in 37ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
# puffin pip-compile -v scripts/benchmarks/requirements/black.in --python-version py312
black==23.11.0
[...]
platformdirs==4.0.0
# via black
$ PATH=$HOME/.cargo/bin:/usr/bin RUST_LOG=puffin::commands=debug cargo run --bin puffin -q -- pip-compile -v scripts/benchmarks/requirements/black.in --python-version py312
0.004190s DEBUG puffin::commands::pip_compile Using Python 3.11 at /usr/bin/python3
Resolved 6 packages in 39ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
# puffin pip-compile -v scripts/benchmarks/requirements/black.in --python-version py312
black==23.11.0
[...]
platformdirs==4.0.0
# via black
```
Fixes#235.
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
One of the most common ways source dists fail to build (on linux) is
when the linker fails because the shared library of a native dependency
is not installed. These errors are hard to understand when you're not a
c programmer:
```
In file included from /usr/include/python3.10/unicodeobject.h:1046,
from /usr/include/python3.10/Python.h:83,
from Modules/3.x/readline.c:8:
Modules/3.x/readline.c: In function ‘on_completion’:
/usr/include/python3.10/cpython/unicodeobject.h:744:29: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
744 | #define _PyUnicode_AsString PyUnicode_AsUTF8
| ^~~~~~~~~~~~~~~~
Modules/3.x/readline.c:842:23: note: in expansion of macro ‘_PyUnicode_AsString’
842 | char *s = _PyUnicode_AsString(r);
| ^~~~~~~~~~~~~~~~~~~
Modules/3.x/readline.c: In function ‘readline_until_enter_or_signal’:
Modules/3.x/readline.c:1044:9: warning: ‘sigrelse’ is deprecated: Use the sigprocmask function instead [-Wdeprecated-declarations]
1044 | sigrelse(SIGINT);
| ^~~~~~~~
In file included from Modules/3.x/readline.c:10:
/usr/include/signal.h:359:12: note: declared here
359 | extern int sigrelse (int __sig) __THROW
| ^~~~~~~~
Modules/3.x/readline.c: In function ‘PyInit_readline’:
Modules/3.x/readline.c:1179:34: warning: assignment to ‘char * (*)(FILE *, FILE *, const char *)’ from incompatible pointer type ‘char * (*)(FILE *, FILE *, char *)’ [-Wincompatible-pointer-types]
1179 | PyOS_ReadlineFunctionPointer = call_readline;
| ^
In file included from /usr/include/string.h:535,
from /usr/include/python3.10/Python.h:30,
from Modules/3.x/readline.c:8:
In function ‘strncpy’,
inlined from ‘call_readline’ at Modules/3.x/readline.c:1124:9:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:95:10: warning: ‘__builtin_strncpy’ output truncated before terminating nul copying as many bytes from a string as its length [-Wstringop-truncation]
95 | return __builtin___strncpy_chk (__dest, __src, __len,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
96 | __glibc_objsize (__dest));
| ~~~~~~~~~~~~~~~~~~~~~~~~~
Modules/3.x/readline.c: In function ‘call_readline’:
Modules/3.x/readline.c:1099:9: note: length computed here
1099 | n = strlen(p);
| ^~~~~~~~~
/usr/bin/ld: cannot find -lncurses: No such file or directory
collect2: error: ld returned 1 exit status
error: command '/usr/bin/x86_64-linux-gnu-gcc' failed with exit code 1
---
```
We parse these errors out, tell the user about the missing shared
library and even the most likely debian/ubuntu package name:
```
This error likely indicates that you need to install the library that provides a shared library for ncurses for pygraphviz-1.11 (e.g. libncurses-dev)
```
Previously, we just pulled the latest commit from `main` on every
update. This causes problems when you do not intend to update the
scenarios as in #787.
This bumps to the latest `packse` commit without new scenarios.
Adds support for a `PUFFIN_NO_WRAP` environment variable which disables
line wrapping in `miette` output.
We set this variable in the scenario tests to improve the readability of
snapshots.
I contributed the ability to disable line wrapping upstream at
https://github.com/zkat/miette/pull/328
This PR does a bit of refactoring to the pep440 crate, and in
particular around the erorr types. This PR is meant to be a precursor
to another PR that does some surgery (both in parsing and in `Version`
representation) that benefits somewhat from this refactoring.
As usual, please review commit-by-commit.
This fixes a compilation error with tests on current `main`. I didn't
track down the exact provenance, but I'd guess it's the result of a
botched merge. (i.e., Two or more PRs that pass CI independently, but
when merged cause failures.)
I'm still confused about it, but this seems to do the right thing?
`HierarchicalLayer` internally has [`let ansi =
io::stderr().is_terminal();`](fcd9eed252/src/lib.rs (L74)),
so the logging itself is already correctly uncolored, but errors in the
log weren't.
Test command, ran with network deactivated:
```shell
RUST_LOG=debug cargo run --bin puffin -- pip-compile -v ./scripts/popular_packages/pypi_8k_downloads.txt 2> log.txt
```
**Before**
```
[1;31merror[0m: Request error: error sending request for url (https://pypi.org/simple/apache-airflow-providers-dbt-cloud/): error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
[1;31mCaused by[0m: error sending request for url (https://pypi.org/simple/apache-airflow-providers-dbt-cloud/): error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
[1;31mCaused by[0m: error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
[1;31mCaused by[0m: dns error: failed to lookup address information: Temporary failure in name resolution
[1;31mCaused by[0m: failed to lookup address information: Temporary failure in name resolution
```
**After**
```
error: Request error: error sending request for url (https://pypi.org/simple/fissix/): error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
Caused by: error sending request for url (https://pypi.org/simple/fissix/): error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
Caused by: error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
Caused by: dns error: failed to lookup address information: Temporary failure in name resolution
Caused by: failed to lookup address information: Temporary failure in name resolution
```
Fixup for
7349527ceadde8fc265a33e6a4e662/boto3-1.2.0-py2.py3-none-any.whl:
```
botocore>=1.3.0,<1.4.0',
```
Note that neither the quote nor the comma are right.
Following #757, improves the script for generating scenario test cases
with:
- A requirements file
- Support for downloading packse scenarios from GitHub dynamically
- Running rustfmt on the generated test file
- Updating snapshots / running tests
As mentioned in #746, instead of just installing the scenario root we
will unpack the root dependencies into the install command to allow
better coverage of direct user requests with scenarios.
I added display of the package tree provided by each scenario.
Use a mustache template for iterative replacements.
Adds tests using packse test scenarios! Uses `test.pypi.org` as a
backing index.
Tests are generated by a simple Python script. Requires
https://github.com/zanieb/packse/pull/49.
This opens us to a slight attack surface, as we cannot force use of
`test.pypi.org` only and someone could register these package names on
the real `pypi.org` index with malicious content. I could publish these
packages there too.
`simplify_set` can itself simplify to the full range, so it seems like
we should be checking if the set is `Range::full` _after_ simplifying
rather than before.
## Summary
This PR modifies the resolver to treat the Python version as a package,
which allows for better error messages (since we no longer treat
incompatible packages as if they "don't exist at all").
There are a few tricky pieces here...
First, we need to track both the interpreter's Python version and the
_target_ Python version, because we support resolving for other versions
via `--python 3.7`.
Second, we allow using incompatible wheels during resolution, as long as
there's a compatible source distribution. So we still need to test for
`requires-python` compatibility when selecting distributions.
This could use more testing, but it feels like an area where `packse`
would be more productive than writing PyPI tests.
Closes https://github.com/astral-sh/puffin/issues/406.
This PR fixes our prefetching logic to ensure that we always attempt to
prefetch the "best-guess" distribution for all dependencies. This logic
already existed, but because we only attempted to prefetch when package
metadata was available, it almost never triggered. Now, we wait for the
package metadata to become available, _then_ kick off the "best-guess"
prefetch (for every package).
In my testing, this dramatically improves performance (like 2x). I'm
wondering if this regressed at some point?
Closes#743.
Co-authored-by: konsti <konstin@mailbox.org>
I've tried to investigate puffin's performance wrt to builds and
parallelism in general, but found the previous instrumentation to
granular. I've tried to add spans to every function that either needs
noticeable io or cpu resources without creating duplication. This also
fixes some wrong tracing usage on async functions
(https://docs.rs/tracing/latest/tracing/struct.Span.html#in-asynchronous-code)
and some spans that weren't actually entered.
This PR adds a dedicated error message for resolutions that fail, but
might've succeeded if pre-releases were allowed. Specifically, if we see
a failed resolution, and failed to find a version for a package that
included a pre-release marker, we add a hint nudging the user to
explicitly enable all pre-releases.
We'd prefer a solution like
https://github.com/astral-sh/puffin/pull/666, but believe that it will
break some assumptions in PubGrub, so this is the lighter-weight
solution.
Closes https://github.com/astral-sh/puffin/issues/659.
The `async fn` and return-position `impl Trait` in traits improve
`BuildContext` ergonomics. The traits use `impl Future` over `async fn`
to make the send bound explicit
(https://blog.rust-lang.org/2023/12/21/async-fn-rpit-in-traits.html).
The remaining changes are due to clippy.
This PR combines three small changes to finish up the install-many
testing.
* Download pypi_10k_most_dependents.txt in script I'd like to have the
setup process of the large scale checks automated.
* Some install-many dev script improvements
* Fix mkl_fft-1.3.6-58-cp310-cp310-manylinux2014_x86_64.whl:
mkl_fft-1.3.6-58-cp310-cp310-manylinux2014_x86_64.whl has multiple
Wheel-Version entries, we have to ignore that like pip
Apart from the mkl-fft fix the only other errors i've seen showing up
are
https://github.com/astral-sh/puffin/issues/520#issuecomment-1869625642.
## Summary
This PR adds support for relative URLs in the simple JSON responses. We
already support relative URLs for HTML responses, but the handling has
been consolidated between the two. Similar to index URLs, we now store
the base alongside the metadata, and use the base when resolving the
URL.
Closes#455.
## Test Plan
`cargo test` (to test HTML indexes). Separately, I also ran `cargo run
-p puffin-cli -- pip-compile requirements.in -n
--index-url=http://localhost:3141/packages/pypi/+simple` on the
`zb/relative` branch with `packse` running, and forced both HTML and
JSON by limiting the `accept` header.
## Summary
This PR makes the `pypi_types::File` a response-only type (i.e., a type
that's only used when deserializing over the wire), and adds a separate
internal `File` type. Right now, the representations are similar, but
already, we can avoid the "lenient" deserialization on our internal
`File` type, and avoid the special-casing of the property names that's
required in the JSON. Over time, we can evolve this representation
entirely separately from the representation we receive from PyPI and
other indexes.
This crate started off as generic caching utilities, but we started
adding a lot of Puffin-specific stuff (like the cache buckets
abstraction that knows about Git vs. direct URL vs. indexes and so on).
This PR moves the generic stuff into a new `cache-key` crate.
I don't have a good testing strategy here (I'm manually testing against
`devpi` via `packse`), but the HTML index uses (e.g.)
`data-requires-python=">=3.8"`, so we need to decode.
From manual inspection, this dataset generated through the [libraries.io
API](https://libraries.io/api#project-search) seems more mainstream than
the current 8k one, which is also preserved. I've added the dataset to
the repo because the API requires an API key.
We lock git checkout directories and the virtualenv to avoid two puffin
instances running in parallel changing files at the same time and
leading to a broken state. When one instance is blocking another, we
need to inform the user (why is the program hanging?) and also add some
information for them to debug the situation.
The new messages will print
```
Waiting to acquire lock for /home/konsti/projects/puffin/.venv (lockfile: /home/konsti/projects/puffin/.venv/.lock)
```
or
```
Waiting to acquire lock for git+https://github.com/pydantic/pydantic-extra-types@0ce9f207a1e09a862287ab77512f0060c1625223 (lockfile: /home/konsti/projects/puffin/cache-all-kinds/git-v0/locks/f157fd329a506a34)
```
The messages aren't perfect but clear enough to see what the contention
is and in the worst case to delete the lockfile.
Fixes#714
Easier than i expected: We simply never construct the pubgrub error
variants since we have our own main loop. The `unreachable!()`s can be
removed when never is stabilized
Otherwise, when a server does not support HTTP range requests we throw
an error instead of downloading without range requests.
---------
Co-authored-by: konstin <konstin@mailbox.org>
This allows the default index URL to be easily overridden with a local
index e.g. a `packse` server
```
export PUFFIN_INDEX_URL="http://localhost:3141/packages/all/+simple"
```
For the install tests, i need the ability to ignore failures in the
`DistFinder`. To avoid just copy&pasting a version that collects errors
separately, i followed
https://gendignoux.com/blog/2021/04/01/rust-async-streams-futures-part1.html
and switched the custom channel over to an async stream yielding
`Result` items.
I like the async streams mirror the normal iterator api.
The high level goal here is to improve the tests for the version parser.
Namely, we now check not just that version strings parse successfully,
but that they parse to the expected result.
We also do a few other cleanups. Most notably, `Version` is now an
opaque type so that we can more easily change its representation going
forward.
Reviewing commit-by-commit is suggested. :-)
The test creates a cache from multiple sources and injects faults (once
using invalid data and once by making the files unreadable on the fs
level), then resolves again.
I didn't test git because it has its own locking and correctness logic.
The main drawback is that this test is slow (2.5s for me), we could
`#[ignore]` it.
This is a pure refactor to follow-up #690, to separate the metadata that
we know upfront about distributions (like the version, for
registry-based distributions) vs. the metadata that requires building
(like the version, for URL-based distributions).
We now show the fully-resolved URL, rather than the URL as given by the
user, _everywhere_ except for the output resolution file (which should
retain relative paths, unexpanded environment variables, etc.).
Closes https://github.com/astral-sh/puffin/issues/687.
With `Option<T>` and `.unwrap_or_default()` later, the default of `T`
isn't shown in the help output.
Old:
```
--link-mode <LINK_MODE>
The method to use when installing packages from the global cache
Possible values:
- clone: Clone (i.e., copy-on-write) packages from the wheel into the site packages
- copy: Copy packages from the wheel into the site packages
- hardlink: Hard link packages from the wheel into the site packages
-q, --quiet
Do not print any output
--resolution <RESOLUTION>
Possible values:
- highest: Resolve the highest compatible version of each package
- lowest: Resolve the lowest compatible version of each package
- lowest-direct: Resolve the lowest compatible version of any direct dependencies, and the highest compatible version of any transitive dependencies
--prerelease <PRERELEASE>
Possible values:
- disallow: Disallow all pre-release versions
- allow: Allow all pre-release versions
- if-necessary: Allow pre-release versions if all versions of a package are pre-release
- explicit: Allow pre-release versions for first-party packages with explicit pre-release markers in their version requirements
- if-necessary-or-explicit: Allow pre-release versions if all versions of a package are pre-release, or if the package has an explicit pre-release marker in its version requirements
```

New:
```
--link-mode <LINK_MODE>
The method to use when installing packages from the global cache
[default: hardlink]
Possible values:
- clone: Clone (i.e., copy-on-write) packages from the wheel into the site packages
- copy: Copy packages from the wheel into the site packages
- hardlink: Hard link packages from the wheel into the site packages
-q, --quiet
Do not print any output
--resolution <RESOLUTION>
[default: highest]
Possible values:
- highest: Resolve the highest compatible version of each package
- lowest: Resolve the lowest compatible version of each package
- lowest-direct: Resolve the lowest compatible version of any direct dependencies, and the highest compatible version of any transitive dependencies
--prerelease <PRERELEASE>
[default: if-necessary-or-explicit]
Possible values:
- disallow: Disallow all pre-release versions
- allow: Allow all pre-release versions
- if-necessary: Allow pre-release versions if all versions of a package are pre-release
- explicit: Allow pre-release versions for first-party packages with explicit pre-release markers in their version requirements
- if-necessary-or-explicit: Allow pre-release versions if all versions of a package are pre-release, or if the package has an explicit pre-release marker in its version requirements
```

This PR uses borrowed data in `BuildDispatch` which makes creating a
`BuildDispatch` extremely cheap (only one allocation, for the Python
executable). I can be talked out of this, it will have no measurable
impact.
Separate branch for rebasing #677 onto main because i don't trust the
rebase enough to force push.
Closes#677.
---
If you install `black` from PyPI, then `-e ../black`, we need to
uninstall the existing `black`. This sounds simple, but that in turn
requires that we _know_ `-e ../black` maps to the package `black`, so
that we can mark it for uninstallation in the install plan. This, in
turn, means that we need to build editable dependencies prior to the
install plan.
This is just a bunch of reorganization to fix that specific bug
(installing multiple versions of `black` if you run through the above
workflow): we now run through the list of editables upfront, mark those
that are already installed, build those that aren't, and then ensure
that `InstallPlan` correctly removes those that need to be removed, etc.
Closes#676.
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
Per the title: adds support for `-e` installs to `puffin pip-install`.
There were some challenges here around threading the editable installs
to the right places. Namely, we want to build _once_, then reuse the
editable installs from the resolution. At present, we were losing the
`editable: true` flag on the `Dist` that came back through the
resolution, so it required some changes to the resolver.
Closes https://github.com/astral-sh/puffin/issues/672.
This PR modifies `SitePackages` to store all distributions in a flat
vector, and maintain two indexes (hash maps) from "per-element data for
an element in the vector" to "index of that element". This enables us to
maintain a map on both package name and editable URL.
`pip` supports installing packages without names (e.g.,
`git+https://github.com/pallets/flask.git`), but it doesn't adhere to
the PEP grammar, and we don't yet support it (and may never) (#313).
This PR adds a dedicated error path for such cases, to ensure that we
can give meaningful feedback to the user:
```
error: Couldn't parse requirement in requirements.in position 0 to 18
Caused by: URL requirement is missing a package name; expected: `package_name @ https://google.com`
https://google.com
^^^^^^^^^^^^^^^^^^
```
Closes https://github.com/astral-sh/puffin/issues/650.
This gives a 1.23 speedup on transformers-extras. We could change to
msgpack for the entire cache if we want. I only tried this format and
postcard so far, where postcard was much slower (like 1.6s).
I don't actually want to merge it like this, i wanted to figure out the
ballpark of improvement for switching away from json.
```
hyperfine --warmup 3 --runs 10 "target/profiling/puffin pip-compile --cache-dir cache-msgpack scripts/requirements/transformers-extras.in" "target/profiling/branch pip-compile scripts/requirements/transformers-extras.in"
Benchmark 1: target/profiling/puffin pip-compile --cache-dir cache-msgpack scripts/requirements/transformers-extras.in
Time (mean ± σ): 179.1 ms ± 4.8 ms [User: 157.5 ms, System: 48.1 ms]
Range (min … max): 174.9 ms … 188.1 ms 10 runs
Benchmark 2: target/profiling/branch pip-compile scripts/requirements/transformers-extras.in
Time (mean ± σ): 221.1 ms ± 6.7 ms [User: 208.1 ms, System: 46.5 ms]
Range (min … max): 213.5 ms … 235.5 ms 10 runs
Summary
target/profiling/puffin pip-compile --cache-dir cache-msgpack scripts/requirements/transformers-extras.in ran
1.23 ± 0.05 times faster than target/profiling/branch pip-compile scripts/requirements/transformers-extras.in
```
Disadvantage: We can't manually look into the cache anymore to debug
things
- [ ] Check more formats, i currently only tested json, msgpack and
postcard, there should be other formats, too
- [x] Switch over `CachedByTimestamp` serialization (for the interpreter
caching)
- [x] Switch over error handling and make sure puffin is still resilient
to cache failure
This enables us to remove a number of allocations (in particular,
`peek_while` and `take_while` no longer allocate). It also makes it
trivial to move the cursor to a new location, since you can just slice
and call `.chars()`. At present, moving to a new location would require
converting the iterator to a string, then back to a character iterator.
Two low-hanging fruits as optimizations for version parsing: A fast path
for release only versions and removing the regex from version specifiers
(still calling into version's parsing regex if required). This enables
optimizing the serde format since we now see the serde part instead of
only PEP 440 parsing. I intentionally didn't rewrite the full PEP 440 at
this step.
```console
$ hyperfine --warmup 5 --runs 50 "target/profiling/puffin pip-compile scripts/requirements/transformers-extras.in" "target/profiling/main pip-compile scripts/requirements/transformers-extras.in"
Benchmark 1: target/profiling/puffin pip-compile scripts/requirements/transformers-extras.in
Time (mean ± σ): 217.1 ms ± 3.2 ms [User: 194.0 ms, System: 55.1 ms]
Range (min … max): 211.0 ms … 228.1 ms 50 runs
Benchmark 2: target/profiling/main pip-compile scripts/requirements/transformers-extras.in
Time (mean ± σ): 276.7 ms ± 5.7 ms [User: 252.4 ms, System: 54.6 ms]
Range (min … max): 268.9 ms … 303.5 ms 50 runs
Summary
target/profiling/puffin pip-compile scripts/requirements/transformers-extras.in ran
1.27 ± 0.03 times faster than target/profiling/main pip-compile scripts/requirements/transformers-extras.in
```
---------
Co-authored-by: Andrew Gallant <andrew@astral.sh>
## Summary
This PR ensures that we re-use the resolution to install the build
dependencies when building a source distribution. Currently, we only
pass along the list of requirements, and then use the `Finder` to map
each requirement to a distribution. But we already determine the correct
distribution when resolving!
Closes https://github.com/astral-sh/puffin/issues/655.
## Summary
This is more of a hypothetical problem, but the cache manifest could in
theory get out-of-sync with the contents on disk. This PR modifies the
`BuiltWheelMetadata` lookup to warn (but not fail) if the manifest
includes a wheel that no longer exists on disk. You can mimic this by
removing a wheel from the `built-wheels-v0` cache without modifying the
manifest correspondingly.
## Summary
This PR modifies `source_dist.rs` to store source distributions (from
remote URLs) in the cache. The cache structure for registries now looks
like:
<img width="1053" alt="Screen Shot 2023-12-14 at 10 43 43 PM"
src="https://github.com/astral-sh/puffin/assets/1309177/3c2dbf6b-5926-41f2-b69b-74031741aba8">
(I will update the docs prior to merging, if approved.)
The benefit here is that we can reuse the source distribution (avoid
download + unzipping it) if we need to build multiple wheels. In the
future, it will be even more relevant, since we'll need to reuse the
source distribution to support
https://github.com/astral-sh/puffin/issues/599.
I also included some misc. refactors to DRY up repeated operations and
add some more abstraction to `source_dist.rs`.
## Summary
This PR enables users to express relative dependencies via environment
variables. Like pip, PDM, Hatch, Rye, and others, we now allow users to
express dependencies like:
```text
flask @ file://${PROJECT_ROOT}/flask-3.0.0-py3-none-any.whl
```
In the compiled requirements file, we'll also preserve the unexpanded
environment variable.
Closes https://github.com/astral-sh/puffin/issues/592.
## Summary
This PR adds a `VerbatimUrl` struct to preserve verbatim URLs throughout
the resolution and installation pipeline. In short, alongside the parsed
`Url`, we also keep the URL as written by the user. This enables us to
display the URL exactly as written by the user, rather than the
serialized path that we use internally.
This will be especially useful once we start expanding environment
variables since, at that point, we'll be able to write the version of
the URL that includes the _unexpected_ environment variable to the
output file.
We have some shared utilities beyond `puffin-build` and
`puffin-distribution`, and further, I want to be able to access the
sdist archive extraction logic from `puffin-distribution`. This is
really generic, so moving into its own crate.
This allows us to enforce type safety within the resolver. For example,
in the index, we can remove `String` as a key type and enforce that
callers _must_ present us with a `PackageId`. (This actually caught one
bug, where we were using the SHA rather than the package ID. That bug
shouldn't have had any effect given where it was, since those are 1:1,
but it's still problematic.)
## Summary
When resolving `transformers[tensorboard]`, the `[tensorboard]` extra
doesn't exist. Previously, we returned "unknown" dependencies for this
variant, which leads the resolution to try all versions, then fail. This
PR instead warns, but returns the base dependencies for the package,
which matches `pip`. (Poetry doesn't even warn, it just proceeds as
normal.)
Arguably, it would be better to return a custom incompatibility here and
then propagate... But this PR is better than the status quo, and I don't
know if we have support for that behavior yet...? (\cc @zanieb)
Closes#386.
Closes https://github.com/astral-sh/puffin/issues/423.
## Summary
This PR enables overrides to be passed to `pip-compile` and
`pip-install` via a new `--overrides` flag.
When overrides are provided, we effectively replace any requirements
that are overridden with the overridden versions. This is applied at all
depths of the tree.
The merge semantics are such that we replace _all_ requirements of a
package with _all_ requirements from the overrides files. So, for
example, if a package declares:
```
foo >= 1.0; python_version < '3.11'
foo < 1.0; python_version >= '3.11'
```
And the user provides an override like:
```
foo >= 2.0
```
Then _both_ of the `foo` requirements in the package will be replaced
with the override.
If instead, the user provided an override like:
```
foo >= 2.0; python_version < '3.11'
foo < 3.0; python_version >= '3.11'
```
Then we'd replace _both_ of the original `foo` requirements with both of
these overrides. (In technical terms, for each package in the
requirements file, we flat-map over its overrides.)
Closes https://github.com/astral-sh/puffin/issues/511.
## Summary
Now, `puffin_warnings::warn_once` and `puffin_warnings::warn` will go to
`stderr`, as long as the user isn't running under `--quiet`. Previously,
these went through `tracing`, and so were only visible when running
under `--verbose`.
Uses https://github.com/pubgrub-rs/pubgrub/pull/156 to consolidate
version ranges in error reports using the actual available versions for
each package.
Alternative to https://github.com/zanieb/pubgrub/pull/8 which implements
this behavior as a method in the `Reporter` — here it's implemented in
our custom report formatter (#521) instead which requires no upstream
changes.
Requires https://github.com/zanieb/pubgrub/pull/11 to only retrieve the
versions for packages that will be used in the report.
This is a work in progress. Some things to do:
- ~We may want to allow lazy retrieval of the version maps from the
formatter~
- [x] We should probably create a separate error type for no solution
instead of mixing them with other resolve errors
- ~We can probably do something smarter than creating vectors to hold
the versions~
- [x] This degrades error messages when a single version is not
available, we'll need to special case that
- [x] It seems safer to coerce the error type in `resolve` instead of
`solve` if feasible
Make `prepare_metadata_for_build_wheel` accessible across the puffin
codebase by splitting the built call into a setup, a metadata and a
wheel call. This does not actually use the hook yet, but it's the
required refactoring for it.
Part of #599.
I don't know why, but this seems to resolve
https://github.com/astral-sh/puffin/issues/619. The Tokio docs also say
that using Tokio's Mutex is _not_ recommended unless you need to hold
the Mutex across an `.await`, which we don't.
Since this is a non-deterministic failure, I just ran it a bunch of
times and ensured it didn't hang (whereas it did hang occasionally prior
to this PR).
Closes https://github.com/astral-sh/puffin/issues/619
## Summary
Now, after running `pip-install`, we validate that the set of installed
packages is consistent -- that is, that we don't have any packages that
are missing dependencies, or incompatible versions of installed
dependencies.
## Summary
At present, when performing a `pip-install`, we first do a resolution,
then take the set of requirements and basically run them through our
`pip-sync`, which itself includes re-resolving the dependencies to get a
specific `Dist` for each package. (E.g., the set of requirements might
say `flask==3.0.0`, but the installer needs a specific _wheel_ or source
distribution to install.)
This PR removes this second resolution by exposing the set of pinned
packages from the resolution. The main challenge here is that we have an
optimization in the resolver such that we let the resolver read metadata
from an incompatible wheel as long as a source distribution exists for a
given package. This lets us avoid building source distributions in the
resolver under the assumption that we'll be able to install the package
later on, if needed. As such, the resolver now needs to track the
resolution and installation filenames separately.
## Summary
When running `puffin pip-install`, we should respect versions that are
already installed in the environment. For example, if you run `puffin
pip-install flask==2.0.0` and then `puffin pip-install flask`, we should
avoid upgrading Flask. The most natural way to model this is to mark
them as "preferences".
(It's not enough to just filter those requirements out prior to
resolving, since we may not have the _dependencies_ of those packages
installed. We _could_ recursively verify this across the
`site-packages`, but that would be a larger PR.)
## Summary
This PR adds a `pip-install` command that operates like, well, `pip
install`. In short, it resolves the provided dependency, then makes sure
they're all installed in the environment. The primary differences with
`pip-sync` are that (1) `pip-sync` ignores dependencies, and assumes
that the packages represent a complete set; and (2) `pip-sync`
uninstalls any unlisted packages.
There are a bunch of TODOs that I'll resolve in subsequent PRs.
Closes https://github.com/astral-sh/puffin/issues/129.
First step, sufficient to run
```shell
cargo run --bin puffin-dev -- build --editable -w target/editables/ scripts/editable-installs/poetry_editable/
```
and check the wheel to confirm its working. Tests will be added with the
pip-sync integration.
## Summary
At present, we have two separate phases within the installation pipeline
related to populating wheels into the cache. The first phase downloads
the distribution, and then builds any source distributions into wheels;
the second phase unzips all the built wheels into the cache.
This PR merges those two phases into one, such that we seamlessly
download, build, and unzip wheels in one pass. This is more efficient,
since we can start unzipping while we build. It also ensures that if the
install _fails_ partway through, we don't end up with a bunch of
downloaded wheels that we never had a chance to unzip. The code is also
much simpler.
The main downside is that the user-facing feedback isn't as granular,
since we only have one phase and one progress bar for what was
originally three distinct phases.
Closes https://github.com/astral-sh/puffin/issues/571.
## Test Plan
I ran the benchmark script on two separate requirements files, and saw a
7% and 31% speedup respectively:
```text
+ TARGET=./scripts/benchmarks/requirements.txt
+ hyperfine --runs 100 --warmup 10 --prepare 'virtualenv --clear .venv' './target/release/main pip-sync ./scripts/benchmarks/requirements.txt --no-cache' --prepare 'virtualenv --clear .venv' './target/release/puffin pip-sync ./scripts/benchmarks/requirements.txt --no-cache'
Benchmark 1: ./target/release/main pip-sync ./scripts/benchmarks/requirements.txt --no-cache
Time (mean ± σ): 269.4 ms ± 33.0 ms [User: 42.4 ms, System: 117.5 ms]
Range (min … max): 221.7 ms … 446.7 ms 100 runs
Benchmark 2: ./target/release/puffin pip-sync ./scripts/benchmarks/requirements.txt --no-cache
Time (mean ± σ): 250.6 ms ± 28.3 ms [User: 41.5 ms, System: 127.4 ms]
Range (min … max): 207.6 ms … 336.4 ms 100 runs
Summary
'./target/release/puffin pip-sync ./scripts/benchmarks/requirements.txt --no-cache' ran
1.07 ± 0.18 times faster than './target/release/main pip-sync ./scripts/benchmarks/requirements.txt --no-cache'
```
```text
+ TARGET=./scripts/benchmarks/requirements-large.txt
+ hyperfine --runs 100 --warmup 10 --prepare 'virtualenv --clear .venv' './target/release/main pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache' --prepare 'virtualenv --clear .venv' './target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache'
Benchmark 1: ./target/release/main pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache
Time (mean ± σ): 5.053 s ± 0.354 s [User: 1.413 s, System: 6.710 s]
Range (min … max): 4.584 s … 6.333 s 100 runs
Benchmark 2: ./target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache
Time (mean ± σ): 3.845 s ± 0.225 s [User: 1.364 s, System: 6.970 s]
Range (min … max): 3.482 s … 4.715 s 100 runs
Summary
'./target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache' ran
```
Currently, `dbg!` is hard to read because versions are verbose, showing
all optional fields, and we have a lot of versions. Changing debug
formatting to displaying the version number (which can be losslessly
converted to the struct and back) makes this more readable.
See e.g.
https://gist.github.com/konstin/38c0f32b109dffa73b3aa0ab86b9662b
**Before**
```text
version: Version {
epoch: 0,
release: [
1,
2,
3,
],
pre: None,
post: None,
dev: None,
local: None,
},
```
**After**
```text
version: "1.2.3",
```
I saw warnings when we were e.g. unzipping wheel and setuptools in two
tasks at the same time. We now keep track of in flight unzips.
This introduces a `OnceMap` abstraction which we also use in the
resolver.
## Summary
This PR adds two flags to `pip-sync`: `--reinstall`, and
`--reinstall-package [PACKAGE]`. The former reinstalls all packages in
the requirements, while the latter can be repeated and reinstalls all
specified packages.
For our purposes, a reinstall includes (1) purging the cache, and (2)
marking any already-installed versions as extraneous.
Closes#572.
Closes https://github.com/astral-sh/puffin/issues/271.
## Summary
This PR enables `puffin clean` to accept package names as command line
arguments, and selectively purge entries from the cache tied to the
given package.
Relate to #572.
## Test Plan
Modified all the caching tests to run an additional step to (1) purge
the cache, and (2) re-install the package.
## Summary
If someone else beats us to the unzip, we should let them win.
We already have a check for this at the top of the unzip method, but
it's also possible that two source distributions get built in parallel
that both try to unpack the same build dependency.
## Summary
This PR modifies the Git wheel cache to: (1) use a shorter version of
the SHA, to save space; and (2) include the package name, for
consistency with all other buckets.
I considered removing the URL hash entirely, and _just_ using the SHA,
which would be even _more_ consistent with other buckets. But if we
remove the URL, then we won't have separate directories for
subdirectories (which are part of the URL).
Before:
<img width="1035" alt="Screen Shot 2023-12-07 at 7 23 42 PM"
src="https://github.com/astral-sh/puffin/assets/1309177/86afce67-682f-464f-9ba1-0b60d5b7f19f">
After:
<img width="1232" alt="Screen Shot 2023-12-07 at 8 09 23 PM"
src="https://github.com/astral-sh/puffin/assets/1309177/eda42a19-974f-47fe-8c83-54a602ddfd2d">
Extends #517 with a suggestion from @konstin to parse the `SimpleJson`
into an intermediate type `SimpleMetadata(BTreeMap<Version,
VersionFiles>)` before converting to a `VersionMap`. This reduces the
number of times we need to parse the response. Additionally, we cache
the parsed response now instead of `SimpleJson`.
`VersionFiles` stores two vectors with
`WheelFilename`/`SourceDistFilename` and `File` tuples. These can be
iterated over together or separately. A new enum `DistFilename` was
added to capture the `SourceDistFilename` and `WheelFilename` variants
allowing iteration over both vectors.
Also ensures that we filter out any incompatible requirements when
building the install plan. In general, we assume that requirements were
generated by `pip-compile`, in which case all requirements should be
compatible and there should be no duplicates; but we should handle this
case gracefully.
Closes https://github.com/astral-sh/puffin/issues/582.
This PR adds caching support for built wheels in the installer.
Specifically, the `RegistryWheelIndex` now indexes both downloaded and
built wheels (from registries), and we have a new `BuiltWheelIndex` that
takes a subdirectory and returns the "best-matching" compatible wheel.
Closes#570.
Adds a few more tests for re-installs with various kinds of source
distributions, and changes the tests to use packages that we can safely
import (via `check_command`) for extra validation.
Once we properly respect cached built wheels, we should expect these
snapshots to change, since we'll no longer download and re-build
unnecessarily.
I'm working off of @konstin's commit here to implement arbitrary unsat
test cases for the resolver.
The entirety of the resolver's io are two functions: Get the version map
for a package (PEP 440 version -> distribution) and get the metadata for
a distribution. A new trait `ResolverProvider` abstracts these two away and
allows replacing the real network requests e.g. with stored responses
(https://github.com/pradyunsg/pip-resolver-benchmarks/blob/main/scenarios/pyrax_198.json).
---------
Co-authored-by: konsti <konstin@mailbox.org>
Parse `-e` for editable installs in `requirements.txt`.
Unlike all the other requirements, editable installs don't have the name
of the package specified.
Path distribution cache reading errors are no longer fatal.
We now invalidate the path file source dists if its modification
timestamp changed, and invalidate path dir source dists if
`pyproject.toml` or alternatively `setup.py` changed, which seems good
choices since changing pyproject.toml should trigger a rebuild and the
user can `touch` the file as part of their workflow.
`CachedByTimestamp` is now a shared util. It doesn't have methods as i
don't think it's worth it yet for two users.
Closes#478
TODO(konstin): Write a test. This is probably twice as much work as that
fix itself, so i made that PR without one for now.
## Summary
When installing a local wheel, we need to avoid removing the zipped
wheel (since it lives outside of the cache), _and_ need to ensure that
we unzip the wheel into the cache (rather than replacing the zipped
wheel, which may even live outside of the project).
Closes https://github.com/astral-sh/puffin/issues/553.
This PR modifies the source distribution building to replace any
existing targets after building the new wheel. In some cases, the
existence of an existing target may be indicative of a bug, so we warn.
It's partially a workaround for some (but not all) of the errors in
https://github.com/astral-sh/puffin/issues/554.
## Summary
This is hard to reproduce, but if you run a long installation process
that errors part-way through, you can end up with zipped wheels in the
`Wheels` cache, which is intended to contain only unzipped wheels. This
PR avoids returning those entries from the registry, which will then
lead to errors downstream when we treat them as directories.
## Summary
This enables users to rely on yanked versions via explicit `==` markers,
which is necessary in some projects (and, in my opinion, reasonable).
Closes#551.
## Summary
Allows, e.g., `--python-version 3.7` or `--python-version 3.7.9`. This
was also feedback I received in the original PR.
Closes https://github.com/astral-sh/puffin/issues/533.