## Summary
Like https://github.com/astral-sh/puffin/pull/1180, this PR adds logic
for `requirements.txt` parsing whereby if a requirement _looks like_ a
local requirements file or an editable directory, we prompt the user to
correct the error (typically, by adding `-r`).
Lacking windows compatible aarch64 hardware, i cross compiled the
trampoline from x86_64 linux to aarch64-pc-windows-msvc; I added the
instructions to the puffin-trampoline readme. With some testing on an
aarch64 windows machine, this should be sufficient to build working
win_arm64 tagged wheels.
i686-pc-windows-msvc is failing with an error:
```
error: linking with `lld-link` failed: exit status: 1
= note: lld-link: error: undefined symbol: __aulldiv
>>> referenced by libcompiler_builtins-2fb09dee087e9f64.rlib(compiler_builtins-2fb09dee087e9f64.compiler_builtins.597f0152646f1b8-cgu.0.rcgu.o):(compiler_builtins::int::specialized_div_rem::u128_div_rem::h06aed1e23a3f8f5c)
>>> referenced by libcompiler_builtins-2fb09dee087e9f64.rlib(compiler_builtins-2fb09dee087e9f64.compiler_builtins.597f0152646f1b8-cgu.0.rcgu.o):(compiler_builtins::int::specialized_div_rem::u128_div_rem::h06aed1e23a3f8f5c)
>>> referenced by libcompiler_builtins-2fb09dee087e9f64.rlib(compiler_builtins-2fb09dee087e9f64.compiler_builtins.597f0152646f1b8-cgu.0.rcgu.o):(compiler_builtins::int::specialized_div_rem::u128_div_rem::h06aed1e23a3f8f5c)
>>> referenced 4 more times
lld-link: error: undefined symbol: __aullrem
>>> referenced by libcompiler_builtins-2fb09dee087e9f64.rlib(compiler_builtins-2fb09dee087e9f64.compiler_builtins.597f0152646f1b8-cgu.0.rcgu.o):(compiler_builtins::int::specialized_div_rem::u128_div_rem::h06aed1e23a3f8f5c)
>>> referenced by libcompiler_builtins-2fb09dee087e9f64.rlib(compiler_builtins-2fb09dee087e9f64.compiler_builtins.597f0152646f1b8-cgu.0.rcgu.o):(compiler_builtins::int::specialized_div_rem::u128_div_rem::h06aed1e23a3f8f5c)
>>> referenced by libcompiler_builtins-2fb09dee087e9f64.rlib(compiler_builtins-2fb09dee087e9f64.compiler_builtins.597f0152646f1b8-cgu.0.rcgu.o):(compiler_builtins::int::specialized_div_rem::u128_div_rem::h06aed1e23a3f8f5c)
>>> referenced 4 more times
```
Instrument the main function as anchor span for checking overhead and
update tracing-durations-export to 0.2.0 for differentiating
blocking/non-blocking tasks.
Add a `jupyter.in` requirement since `pip install jupyter` is a common
operation. I tried `jupyterlab` too but there is no difference in
performance (1.00 ± 0.07).
Use `virtualenv` consistently, remove unused error variants and hint the
user towards installing missing python versions.
I didn't touch the Readme but i replaced `virtualenv environment` with
`virtualenv` in the strings i found.
Fixes https://github.com/astral-sh/puffin/issues/1167
## Summary
See: https://github.com/astral-sh/puffin/issues/1181.
## Test Plan
```
❯ cargo run -- pip install packse@../../zanieb/packse
Finished dev [unoptimized + debuginfo] target(s) in 0.15s
Running `target/debug/puffin pip install 'packse@../../zanieb/packse'`
error: Distribution not found at: file:///Users/crmarsh/zanieb/packse
```
Make the test `compile_python_37` pass whether python 3.7 is installed
or not by muting the warning for a missing 3.7. The resolution error is
independent of whether 3.7 is installed or not.
## Summary
This PR adds support for `--find-links`, `--index-url`, and
`--extra-index-url` arguments when specified in a `requirements.txt`.
It's a mostly-straightforward change. The only uncertain piece is what
to do when multiple files include these flags, and/or when we include
them on the CLI and in other files.
In general:
- If _anything_ specifies `--no-index`, we respect it.
- We combine all `--extra-index-url` and `--find-links` across all
sources, since those are just vectors.
- If we see multiple `--index-url` in requirements files, we error.
- We respect the `--index-url` from the command line over any provided
in a requirements file.
(`pip-compile` seems to just pick one semi-arbitrarily when multiple are
provided.)
Closes https://github.com/astral-sh/puffin/issues/1143.
This adds what is effectively an owned wrapper around
`Archived<SimpleMetadata>`. Normally, an `Archived<SimpleMetadata>`
has to be used behind a pointer (since it has a lifetime
attached to its underlying byte buffer), but we create a
wrapper around it that owns the underlying buffer and provides
free access to the archived type.
This in effect creates an anchor point for the archived type
and lets us pass it around easily. (There has to be an anchor
point for it somewhere.)
An alternative to this approach would be to store it as a file
backed memory map. But in practice, we're dealing with small
files, and just reading them on to the heap is likely to be
faster. (Memory maps also have wildly different perf characteristics
across platforms.)
Note that this commit just defines the type. It isn't actually
used anywhere yet.
Less verbose span fields for `Dist`s by using the display impl and no
more min length in the tracing durations plot config for comparability
(we lose spans due to a speedup otherwise). Both wait points in the
solver loop are now instrumented so we can inspect what we're waiting
for to progress in the solver.
This PR migrates our source distribution downloads to unzip as we
stream, similar to our approach for wheels.
In my testing, this showed a consistent speedup (e.g., 6% here for a few
representative source distributions):
```text
❯ python -m scripts.bench --puffin-path ./target/release/main --puffin-path ./target/release/puffin --benchmark install-cold requirements.in
Benchmark 1: ./target/release/main (install-cold)
Time (mean ± σ): 1.503 s ± 0.039 s [User: 1.479 s, System: 0.537 s]
Range (min … max): 1.466 s … 1.605 s 10 runs
Benchmark 2: ./target/release/puffin (install-cold)
Time (mean ± σ): 1.421 s ± 0.024 s [User: 1.505 s, System: 0.593 s]
Range (min … max): 1.381 s … 1.454 s 10 runs
Summary
'./target/release/puffin (install-cold)' ran
1.06 ± 0.03 times faster than './target/release/main (install-cold)'
```
This PR adds initial support for [rkyv] to puffin. In particular,
the main aim here is to make puffin-client's `SimpleMetadata` type
possible to deserialize from a `&[u8]` without doing any copies. This
PR **stops short of actuallying doing that zero-copy deserialization**.
Instead, this PR is about adding the necessary trait impls to a variety
of types, along with a smattering of small refactorings to make rkyv
possible to use.
For those unfamiliar, rkyv works via the interplay of three traits:
`Archive`, `Serialize` and `Deserialize`. The usual flow of things is
this:
* Make a type `T` implement `Archive`, `Serialize` and `Deserialize`.
rkyv
helpfully provides `derive` macros to make this pretty painless in most
cases.
* The process of implementing `Archive` for `T` *usually* creates an
entirely
new distinct type within the same namespace. One can refer to this type
without naming it explicitly via `Archived<T>` (where `Archived` is a
clever
type alias defined by rkyv).
* Serialization happens from `T` to (conceptually) a `Vec<u8>`. The
serialization format is specifically designed to reflect the in-memory
layout
of `Archived<T>`. Notably, *not* `T`. But `Archived<T>`.
* One can then get an `Archived<T>` with no copying (albeit, we will
likely
need to incur some cost for validation) from the previously created
`&[u8]`.
This is quite literally [implemented as a pointer cast][rkyv-ptr-cast].
* The problem with an `Archived<T>` is that it isn't your `T`. It's
something
else. And while there is limited interoperability between a `T` and an
`Archived<T>`, the main issue is that the surrounding code generally
demands
a `T` and not an `Archived<T>`. **This is at the heart of the tension
for
introducing zero-copy deserialization, and this is mostly an intrinsic
problem to the technique and not an rkyv-specific issue.** For this
reason,
given an `Archived<T>`, one can get a `T` back via an explicit
deserialization step. This step is like any other kind of
deserialization,
although generally faster since no real "parsing" is required. But it
will
allocate and create all necessary objects.
This PR largely proceeds by deriving the three aforementioned traits
for `SimpleMetadata`. And, of course, all of its type dependencies. But
we stop there for now.
The main issue with carrying this work forward so that rkyv is actually
used to deserialize a `SimpleMetadata` is figuring out how to deal
with `DataWithCachePolicy` inside of the cached client. Ideally, this
type would itself have rkyv support, but adding it is difficult. The
main difficulty lay in the fact that its `CachePolicy` type is opaque,
not easily constructable and is internally the tip of the iceberg of
a rat's nest of types found in more crates such as `http`. While one
"dumb"-but-annoying approach would be to fork both of those crates
and add rkyv trait impls to all necessary types, it is my belief that
this is the wrong approach. What we'd *like* to do is not just use
rkyv to deserialize a `DataWithCachePolicy`, but we'd actually like to
get an `Archived<DataWithCachePolicy>` and make actual decisions used
the archived type directly. Doing that will require some work to make
`Archived<DataWithCachePolicy>` directly useful.
My suspicion is that, after doing the above, we may want to mush
forward with a similar approach for `SimpleMetadata`. That is, we want
`Archived<SimpleMetadata>` to be as useful as possible. But right
now, the structure of the code demands an eager conversion (and thus
deserialization) into a `SimpleMetadata` and then into a `VersionMap`.
Getting rid of that eagerness is, I think, the next step after dealing
with `DataWithCachePolicy` to unlock bigger wins here.
There are many commits in this PR, but most are tiny. I still encourage
review to happen commit-by-commit.
[rkyv]: https://rkyv.org/
[rkyv-ptr-cast]:
https://docs.rs/rkyv/latest/src/rkyv/util/mod.rs.html#63-68
## Summary
This is my guess as to the source of the resolver flake, based on
information and extensive debugging from @zanieb. In short, if we rely
on `self.index.packages` as a source of truth during error reporting, we
open ourselves up to a source of non-determinism, because we fetch
package metadata asynchronously in the background while we solve -- so
packages _could_ be included in or excluded from the index depending on
the order in which those requests are returned.
So, instead, we now track the set of packages that _were_ visited by the
solver. Visiting a package _requires_ that we wait for its metadata to
be available. By limiting analysis to those packages that were visited
during solving, we are faithfully representing the state of the solver
at the time of failure.
Closes#863
## Summary
We have this optimization in `wheel.rs`, in the installer, but it makes
a huge difference for zips with many small files:
```
Benchmarking file_reader/Django-5.0.1-py3-none-any.whl: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 74.2s, or reduce sample count to 10.
file_reader/Django-5.0.1-py3-none-any.whl
time: [751.63 ms 757.78 ms 764.27 ms]
change: [-1.0290% +0.0841% +1.2289%] (p = 0.88 > 0.05)
No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild
Benchmarking buffered_reader/Django-5.0.1-py3-none-any.whl: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 53.4s, or reduce sample count to 10.
buffered_reader/Django-5.0.1-py3-none-any.whl
time: [529.86 ms 536.44 ms 543.35 ms]
change: [+0.0293% +1.5543% +3.1426%] (p = 0.05 > 0.05)
No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild
```
That's almost 30% faster...
In Rust, `fs::copy` automatically preserves permissions (see:
https://doc.rust-lang.org/std/fs/fn.copy.html).
Elsewhere, when copying from the zip archive out to the cache, we can
set permissions during file creation, rather than as a separate call.
Both of these should be slightly more efficient.
## Summary
When we migrated to an "unzip while we stream" solution, we lost the
logic to set permissions on the extracted files, so executables in
wheels were no longer executable. It turns out this is a little tricky,
since the permissions metadata is in the central directory at the _end_
of the zip file, and the async ZIP reader explicitly stops iteration
once it hits the central directory. (Specifically, it goes 4 bytes into
the central directory, since it sees the 4-byte signature header and
then stops.)
So, to solve that, I've added a `CentralDirectoryReader` that continues
where that iterator left off. This required forking the async zip crate:
https://github.com/charliermarsh/rs-async-zip/pull/1. It took a lot of
fiddling but I'm quite confident in the code now, especially since the
async zip crate validates the signature kind on every read.
The central directory is typically quite small (even for the Zig wheel,
which is enormous, it's just around 1MB), so I don't expect this to have
a high cost.
Closes https://github.com/astral-sh/puffin/issues/1148.
## Summary
This ensures that we warn when redundant options are passed (like
`--allow-unsafe`, which is really common for forwards compatibility
since it's going to be the default in a future release), and errors when
known variants are passed that we _don't_ support (like
`--resolver=backtracking`).
Closes https://github.com/astral-sh/puffin/issues/1127.
In https://github.com/astral-sh/puffin/pull/1040 we broke the pip
compile scenarios designed to test failure when a required Python
version is not available — resolution succeeded because all of the
Python versions were available in CI. Following #1105 we have the
ability to isolate tests from Python versions available in the system.
Here, we limit the scenarios to only the Python version in the current
environment, restoring our ability to test the error messages.
With https://github.com/zanieb/packse/pull/95, we will be able to
specify scenarios with access to additional system Python versions. This
will allow us to include test coverage where resolution can succeed by
using a version available elsewhere on the system. See #1111 for this
follow-up.
Replaces https://github.com/astral-sh/puffin/pull/1068 and #1070 which
were more complicated than I wanted.
- Introduces a `.python-versions` file which defines the Python versions
needed for development
- Adds a Bash script at `scripts/bootstrap/install` which installs the
required Python versions from `python-build-standalone` to `./bin`
- Checks in a `versions.json` file with metadata about available
versions on each platform and a `fetch-version` Python script derived
from `rye` for updating the versions
- Updates CI to use these Python builds instead of the `setup-python`
action
- Updates to the latest packse scenarios which require Python 3.8+
instead of 3.7+ since we cannot use 3.7 anymore and includes new test
coverage of patch Python version requests
- Adds a `PUFFIN_PYTHON_PATH` variable to prevent lookup of system
Python versions for isolation during development
Tested on Linux (via CI) and macOS (locally) — presumably it will be a
bit more complicated to do proper Windows support.
## Background
In virtual environments, we want to install python programs as console
commands, e.g. `black .` over `python -m black .`. They may be called
[entrypoints](https://packaging.python.org/en/latest/specifications/entry-points/)
or scripts. For entrypoints, we're given a module name and function to
call in that module.
On Unix, we generate a minimal python script launcher. Text files are
runnable on unix by adding a shebang at their top, e.g.
```python
#!/usr/bin/env python
```
will make the operating system run the file with the current python
interpreter. A venv launcher for black in `/home/ferris/colorize/.venv`
(module name: `black`, function to call: `patched_main`) would look like
this:
```python
#!/home/ferris/colorize/.venv/bin/python
# -*- coding: utf-8 -*-
import re
import sys
from black import patched_main
if __name__ == "__main__":
sys.argv[0] = re.sub(r"(-script\.pyw|\.exe)?$", "", sys.argv[0])
sys.exit(patched_main())
```
On windows, this doesn't work, we can only rely on launching `.exe`
files.
## Summary
We use posy's rust implementation of a trampoline, which is based on
distlib's c++ implementation. We pre-build a minimal exe and append the
launcher script as stored zip archive behind it. The exe will look for
the venv python interpreter next to it and use it to execute the
appended script.
The changes in this PR make the `black` entrypoint work:
```powershell
cargo run -- venv .venv
cargo run -q -- pip install black
.\.venv\Scripts\black --version
```
Integration with our existing tests will be done in follow-up PRs.
## Implementation and Details
I've vendored the posy trampoline crate. It is a formatted, renamed and
slightly changed for embedding version of
https://github.com/njsmith/posy/pull/28.
The posy launchers are smaller than the distlib launchers, 16K vs 106K
for black. Currently only `x86_64-pc-windows-msvc` is supported. The
crate requires a nightly compiler for its no-std binary size tricks.
On windows, an application can be launched with a console or without (to
create windows instead), which needs two different launchers. The gui
launcher will subsequently use `pythonw.exe` while the console launcher
uses `python.exe`.
## Summary
Rather than checking cache freshness in the install plan, it's a lot
simple to have the install plan _never_ return cached data when the
refresh policy is in place, and then rely on the distribution database
to check for freshness. The original implementation didn't support this,
since the distribution database was rebuilding things too often. Now, it
rarely rebuilds (it's much better about this), so it seems conceptually
much simpler to split up the responsibilities like this.
## Summary
This ensures that (like Cargo) we don't suffer from
https://github.com/advisories/GHSA-r5w3-xm58-jv6j, by way of checking
known hosts when fetching via `libgit2`.
The implementation is taken from Cargo itself, modified to remove all
configuration, since we don't yet support configuration for known hosts,
etc.
Closes#285.
## Summary
Use a single error type in `puffin_distribution`, rather than two
confusingly similar types between `DistributionDatabase` and the source
distribution module.
Also removes the `#[from]` for IO errors and replaces with explicit
wrapping, which is verbose but removes a bunch of incorrect error
messages.
This PR changes the error type to be boxed internally so that it uses
less size on the stack. This makes functions returning `Result<T,
Error>`, in particular, return something much smaller.
The specific thing that motivated this was Clippy lints firing when I
tried to refactor code in this crate.
I chose to achieve boxing by splitting the enum out into a separate
type, and then wiring up the necessary `From` impl to make error
conversions easy, and then making `Error` itself opaque. We could expose
the `Box`, but there isn't a ton of benefit in doing so because one
cannot pattern match through a `Box`.
This required using more explicit error conversions in several places.
And as a result, I was able to remove all `#[from]` attributes on
non-transparent error variants.
Our existing detection doesn't work on Windows, because we canoncalize
the interpreter path but not `info.sys_executable`, so the former
includes the UNC prefix, etc. This is cross-platform and gets at the
intent of the check.
## Summary
This PR adds a `NormalizedDisplay` trait that we can use for user-facing
paths, to strip the UNC prefix on Windows.
On other platforms, the implementation is a no-op (vs. `Display`).
I audited all usages of `.display()`, and changed any that were
user-facing, either via `println!` or `eprintln!`, or by way of being
included in error messages. I did _not_ change uses that were only in
tests or only went to tracing.
Closes https://github.com/astral-sh/puffin/issues/1084.
Windows uses `;` instead of `:` to separate `PATH` entries. This pull
request switches from manually using `:` to the `std::env` functions.
This fixes
```
puffin pip install -e scripts/editable-installs/maturin_editable
```
on windows.
## Summary
When we unzip wheels in the cache, we write the directories out to an
`archive-v0` bucket, and then symlink into that bucket from the
`wheels-v0` and `built-wheels-v0` buckets.
On Windows, symlinks are not well supported. Specifically, they need to
be explicitly enabled by the user. So, instead of symlinks, we now use
junctions, which are well-supported on Windows, and allow you to
(effectively) symlink a directory to another directory. This PR
implements said junction support, which gets the core installer working
on Windows.
In the past, we also used symlinks to implement another primitive: we
wanted to be able to replace a directory "atomically" (I put
"atomically" in quotes because I don't know if it's actually a
guaranteed atomic operation), in case someone was trying to use the
directory while we were replacing it (as opposed to deleting the
directory, then moving it into place).
On Windows, it doesn't appear to be possible to atomically replace a
junction. So instead, I'm using a new design, whereby the cache always
returns canonicalized paths. We know these canonicalized paths are
unique and won't be replaced, so they're safe for writers to rely on. In
general, when we write new data to the cache, we now return the
canonicalized path. When we read from the cache, and try to identify
(e.g.) the set of wheels available to us, we canonicalize the links
immediately and consider them non-existent if that operation fails.
Closes#1085.
---------
Co-authored-by: konstin <konstin@mailbox.org>
Requires https://github.com/zanieb/pubgrub/pull/20
In short, `UnusableDependencies` can be generalized into `Unavailable`
which encompasses incompatibilities where a package range which is
unusable for some inherent reason as well as when its dependencies are
unusable. We can eventually use this to track more incompatibilities in
the solver. I made the reason string required because I can't see a case
where we should leave it out.
Additionally, this improves the display of conflicts in the root
requirements.
## Summary
It turns out this is significantly faster when reading (e.g.) _just_ the
`METADATA` file from a zipped wheel.
I audited other `File::open` usages, and everything else seems to be
using a buffered reader already (directly, or in whatever third-party
crate it's passed to) _or_ is read immediately in full.
See the criterion benchmark:
```
file_reader/numpy-1.26.3-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
time: [6.9618 ms 6.9664 ms 6.9713 ms]
Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild
file_reader/flask-3.0.1-py3-none-any.whl
time: [237.50 µs 238.25 µs 239.13 µs]
Found 7 outliers among 100 measurements (7.00%)
3 (3.00%) high mild
4 (4.00%) high severe
buffered_reader/numpy-1.26.3-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
time: [648.92 µs 653.85 µs 660.09 µs]
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
buffered_reader/flask-3.0.1-py3-none-any.whl
time: [39.578 µs 39.712 µs 39.869 µs]
Found 8 outliers among 100 measurements (8.00%)
3 (3.00%) high mild
5 (5.00%) high severe
```
Follow-up to https://github.com/astral-sh/puffin/pull/1040 adding a
user-facing warning when we cannot build with their requested version.
e.g.
```
❯ cargo run -- pip compile requirements.in --python-version 3.11.4 --no-build
Resolved 8 packages in 483ms
❯ cargo run -- pip compile requirements.in --python-version 3.11.4
warning: The requested Python version 3.11.4 is not available; 3.11.7 will be used to build dependencies instead.
Resolved 8 packages in 71ms
❯ cargo run -- pip compile requirements.in --python-version 3.11
Resolved 8 packages in 71ms
```
## Summary
This PR uses `ctime` consistently on Unix as a more conservative
approach to change detection. It also ensures that our timestamp
abstraction is entirely internal, so we can change the representation
and logic easily across the codebase in the future.
## Summary
First batch of changes for windows support. Notable changes:
* Fixes all compile errors and added windows specific paths.
* Working venv creation on windows, both from a base interpreter and
from a venv. This requires querying `stdlib` from the sysconfig paths to
find the launcher.
* Basic url/path conversion handling for windows.
* `if cfg!(...)` instead of `#[cfg()]`. This should make it easier to
keep everything compiling across platforms.
## Outlook
Test summary: 402 tests run: 299 passed (15 slow), 103 failed, 1 skipped
There are various reason for the remaining test failure:
* Windows-specific colorama and tzdata dependencies that change the
snapshot slightly. This is by far the biggest batch.
* Some url-path handling issues. I fixed some in the PR, some remain.
* Lack of the latest python patch versions for older pythons on my
machine, since there are no builds for windows and we need to register
them in the registry for them to be picked up for `py --list-paths` (CC
@zanieb RE #1070).
* Lack of entrypoint launchers.
* ... likely more
Extends #1029
Closes https://github.com/astral-sh/puffin/issues/1038
Instead of always using the current Python version for builds when a
target version is provided, we will do our best to use a compatible
Python version for builds.
Removes behavior where Python versions without patch versions were
always assumed to be the latest known patch version (previously
discussed in https://github.com/astral-sh/puffin/pull/534). While this
was convenient for resolutions which include packages which require
minimum patch versions e.g. `requires-python=">=3.7.4"`, it conflicts
with the idea that the target Python version you provide is the
_minimum_ compatible version. Additionally, it complicates interpreter
lookup as we cannot tell if the user has asked for that specific patch
version or not.
In windows, `python3.9` and `python3.11` are not in `PATH`. Instead, we
should pass only the python version to `puffin venv -p` in packse
scenarios (#1039).
This PR replaces a few uses of hash maps/sets with btree maps/sets and
index maps/sets. This has the benefit of guaranteeing a deterministic
order of iteration.
I made these changes as part of looking into a flaky test.
Unfortunately, I'm not optimistic that anything here will actually fix
the flaky test, since I don't believe anything was actually dependent
on the order of iteration.
## Summary
This PR is an alternative approach to #949 which should be much safer.
As in #949, we add a `Refresh` policy to the cache. However, instead of
deleting entries from the cache the first time we read them, we now
check if the entry is sufficiently new (created after the start of the
command) if the refresh policy applies. If the entry is stale, then we
avoid reading it and continue onward, relying on the cache to
appropriately overwrite based on "new" data. (This relies on the
preceding PRs, which ensure the cache is append-only, and ensure that we
can atomically overwrite.)
Unfortunately, there are just a lot of paths through the cache, and
didn't data is handled with different policies, so I really had to go
through and consider the "right" behavior for each case. For example,
the HTTP requests can use `max-age=0, must-revalidate`. But for the
routes that are based on filesystem modification, we need to do
something slightly different.
Closes#945.
## Summary
This PR ensures that we store HTTP caching information for wheels.
Previously, we only stored these for source distributions. This will be
helpful for refresh, since we can avoid re-downloading wheels that are
unchanged per HTTP caching semantics.
There should be zero performance hit here for warm installs, and only an
extremely small hit for cold installs (writing the HTTP cache data to
disk). The hyperfine benchmarks reflect this.
## Summary
If you send a revalidation request to a resource that returns an
`immutable` directive, the server apparently returns a 200 instead of a
304? In other words, the server can ignore the revalidation request.
This PR adds handling on top of the HTTP cache semantics to respect
immutable resources, which is especially useful since all PyPI files are
immutable.
## Summary
One problem we have in the cache today is that we can't overwrite
entries atomically, because we store unzipped _directories_ in the cache
(which makes installation _much_ faster than storing zipped
directories). So, if you ignore the existing contents of the cache when
writing, you might run into an error, because you might attempt to write
a directory where a directory already exists.
This is especially annoying for cache refresh, because in order to
refresh the cache, we have to purge it (i.e., delete a bunch of stuff),
which is also highly unsafe if Puffin is running across multiple threads
or multiple processes.
The solution I'm proposing here is that whenever we persist a
_directory_ to the cache, we persist it to a special "archive" bucket.
Then, within the other buckets, directory entries are actually symlinks
into that "archive" bucket. With symlinks, we can atomically replace,
which means we can easily overwrite cache entries without having to
delete from the cache.
The main downside is that we'll now accumulate dangling entries in the
"archive" bucket, and so we'll need to implement some form of garbage
collection to ensure that we remove entries with no symlinks. Another
downside is that cache reads and writes will be a bit slower, since we
need to deal with creating and resolving these symlinks.
As an example... after this change, the cache entry for this unzipped
wheel is actually a symlink:

Then, within the archive directory, we actually have two unique entries
(since I intentionally ran the command twice to ensure overwrites were
safe):

Per https://apenwarr.ca/log/20181113, `ctime` should be a lot more
conservative, and should detect things like the issue we see with the
python-build-standalone builds, where the `mtime` is identical across
builds.
On Windows, I'm just using `last_write_time`. But we should probably add
`volume_serial_number` and other attributes via
[`winapi_util`](https://docs.rs/winapi-util/latest/winapi_util/index.html).
## Summary
This is a refactor of the source distribution cache that again aims to
make the cache purely additive. Instead of deleting all built wheels
when the cache gets invalidated (e.g., because the source distribution
changed on PyPI or something), we now treat each invalidation as its own
cache directory. The manifest inside of the source distribution
directory now becomes a pointer to the "latest" version of the source
distribution cache.
Here's a visual example:

With this change, we avoid deleting built distributions that might be
relied on elsewhere and maintain our invariant that the cache is purely
additive. The cost is that we now preserve stale wheels, but we should
add a garbage collection mechanism to deal with that.
## Summary
This PR gets rid of the manifest that we store for source distributions.
Historically, that manifest included the source distribution metadata,
plus a list of built wheels.
The problem with the manifest is that it duplicates state, since we now
have to look at both the manifest and the filesystem to understand the
cache state. Instead, I think we should treat the cache as the source of
truth, and get rid of the duplicated state in the manifest.
Now, we store the manifest (which is merely used to check for cache
freshness -- in future PRs, I will repurpose it though, so I left it
around), then the distribution metadata as its own file, then any
distributions in the same directory. When we want to see if there are
any valid distributions, we `readdir` on the directory. This is also
much more consistent with how the install plan works.
Mirroring `virtualenv -p` and driven by the lack of `pythonx.y` in
`PATH` on windows, this PR adds `-p x.y` support to `puffin venv` (first
commit).
Supported formats:
* NEW: `-p 3.10` searches for an installed Python 3.10 (Looking for
`python3.10` on linux/mac).
Specifying a patch version is not supported
* `-p python3.10` or `-p python.exe` looks for a binary in `PATH`
* `-p /home/ferris/.local/bin/python3.10` uses this exact Python
In the second commit, we add python interpreter search on windows using
`py --list-paths`. On windows, all python are called `python.exe` so the
unix trick of looking for `python{}.{}` in `PATH` doesn't work. Instead,
we ask the python launcher for windows to tell us about all installed
packages. We should eventually migrate this to [PEP
514](https://peps.python.org/pep-0514/) by reading the registry entries
ourselves.
Extends #1048 interface providing a more general interface that I think
should be standard.
Allows forcing colors to be on _or_ off. e.g. `NO_COLOR=1 pip install
pip-tools --color always` would be colored.
Hides the `--no-color` option as it only exists for compatibility (and
seems better than throwing an error when people assume it will exist).
Has a nice side-effect of documenting our coloring behaviors e.g.
```
--color <COLOR>
Control colors in output
[default: auto]
Possible values:
- auto: Enables colored output only when the output is going to a terminal or TTY with support
- always: Enables colored output regardless of the detected environment
- never: Disables colored output
```
If the executable is a symbolic link, checking the modified time will
not reflect changes to the source file e.g.
```
❯ touch foo
❯ ln -s foo foobar
❯ gstat -c %Y foo
1705958431
❯ gstat -c %Y foobar
1705958438
❯ touch foo
❯ gstat -c %Y foobar
1705958438
```
This can result in a stale cache being treated as fresh; for example,
when Rye changes the interpreter linked in a virtual environment.
In https://github.com/astral-sh/puffin/pull/986 there was some confusion
about what these values are set to and I noticed that we never actually
display the target version being used for a resolution.
- Consistently display the Python interpreter being used, i.e. make it
clear that we are referring the the interpreter/installed Python version
and always show the version number
- Display the target Python version during solving
## Summary
This PR adds support for PyPy wheels by changing the compatible tags
based on the implementation name and version of the current interpreter.
For now, we only support CPython and PyPy, and explicitly error out when
given other interpreters. (Is this right? Should we just fallback to
CPython tags...? Or skip the ABI-specific tags for unknown
interpreters?)
The logic is based on
4d85340613/src/packaging/tags.py (L247).
Note, however, that `packaging` uses the `EXT_SUFFIX` variable from
`sysconfig`... Instead, I looked at the way that PyPy formats the tags,
and recreated them based on the Python and implementation version. For
example, PyPy wheels look like
`cchardet-2.1.7-pp37-pypy37_pp73-win_amd64.whl` -- so that's `pp37` for
PyPy with Python version 3.7, and then `pypy37_pp73` for PyPy with
Python version 3.7 and PyPy version 7.3.
Closes https://github.com/astral-sh/puffin/issues/1013.
## Test Plan
I tested this manually, but I couldn't find macOS universal PyPy
wheels... So instead I added `cchardet` to a `requirements.in`, ran
`cargo run pip sync requirements.in --index-url
https://pypy.kmtea.eu/simple --verbose`, and added logging to verify
that the platform tags matched (even if the architecture didn't).
This PR attempts to fix a common footgun in `requirements.txt` files.
Previously, to provide a file, you had to use `package_name @
file:///Users/crmarsh/...` -- in other words, an absolute path.
Now, these requirements follow the exact same rules as editables, so you
can do:
```
package_name @ ./file.zip
```
And similar.
The way the parsing is setup, this is intentionally _not_ supported when
reading metadata -- only when parsing `requirements.txt` directly.
Closes#984.
## Summary
`interpreter.version()` returns the `python_full_version`, but the
marker variant uses `python_version` instead of `python_full_version` --
so it's omitting the patch.
## Summary
Based on user feedback. Calling it a "parse error" is misleading, since
this is really something we don't support, but that users can work
around.
e.g. for scenarios that test resolution _without_ installation.
This refactors the `update` script to generate scenario test files for
`pip compile` _and_ `pip install`. We don't overlap scenarios to save
time. We only generate `pip compile` test cases for scenarios we cannot
represent with `pip install` e.g. a `--python-version` override.
The _one_ scenario I added happened to reveal a bug in our resolver
where we were incorrectly filtering versions by the installed version
when wheels were available. Per the comment at
https://github.com/astral-sh/puffin/issues/883#issuecomment-1890773112,
we should _only_ need to check for a compatible installed Python version
when using a different _target_ Python version if we need to build a
source distribution.
53bce68400
resolves this by removing the excessive constraints — the correct Python
version incompatibilities are applied elsewhere.
Adds support for disabling installation from pre-built wheels i.e. the
package must be built from source locally.
We will still always use pre-built wheels for metadata during
resolution.
Available via `--no-binary` and `--no-binary-package <name>` flags in
`pip install` and `pip sync`. There is no flag for `pip compile` since
no installation happens there.
```
--no-binary
Don't install pre-built wheels.
When enabled, all installed packages will be installed from a source distribution.
The resolver will still use pre-built wheels for metadata.
--no-binary-package <NO_BINARY_PACKAGE>
Don't install pre-built wheels for a specific package.
When enabled, the specified packages will be installed from a source distribution.
The resolver will still use pre-built wheels for metadata.
```
When packages are already installed, the `--no-binary` flag will have no
affect without the `--reinstall` flag. In the future, I'd like to change
this by tracking if a local distribution is from a pre-built wheel or a
locally-built wheel. However, this is significantly more complex and
different than `pip`'s behavior so deferring for now.
For reference, `pip`'s flag works as follows:
```
--no-binary <format_control>
Do not use binary packages. Can be supplied multiple times, and each time adds to the
existing value. Accepts either ":all:" to disable all binary packages, ":none:" to empty the
set (notice the colons), or one or more package names with commas between them (no colons).
Note that some packages are tricky to compile and may fail to install when this option is
used on them.
```
Note we are not matching the exact `pip` interface here because it seems
complicated to use. I think we may want to consider adjusting our
interface for this behavior since we're not entirely compatible anyway
e.g. I think `--force-build` and `--force-build-package` are clearer
names. We could also consider matching the `pip` interface or only
allowing `--no-binary <package>` for compatibility. We can of course do
whatever we want in our _own_ install interfaces later.
Additionally, we may want to further consider the semantics of
`--no-binary`. For example, if I run `pip install pydantic --no-binary`
I expect _just_ Pydantic to be installed without binaries but by default
we will build all of Pydantic's dependencies too.
This work was prompted by #895, as it is much easier to measure
performance gains from building source distributions if we have a flag
to ensure we actually build source distributions. Additionally, this is
a flag I have used frequently in production to debug packages that ship
Cythonized wheels.
Improves some of the "no versions of <package> are available" messages
by showing the complement or inversion of the package.
Does not address cases like
```
Because there are no versions of crow that satisfy any of:
crow>1.0.0,<2.0.0a5
crow>2.0.0a7,<2.0.0b1
crow>2.0.0b1,<2.0.0b5
...
```
which are a bit more complicated; I'll focus on those cases in a
follow-up.
## Summary
I don't know if this is actually a good change, but it tries to make the
editable install experience more consistent. Specifically, we now
support...
```
# Use a relative path with a `file://` prefix.
# Prior to this PR, we supported `file:../foo`, but not `file://../foo`, which felt inconsistent.
-e file://../foo
# Use environment variables with paths, not just URLs.
# Prior to this PR, we supported `file://${PROJECT_ROOT}/../foo`, but not the below.
-e ${PROJECT_ROOT}/../foo
```
Importantly, `-e file://../foo` is actually not supported by pip... `-e
file:../foo` _is_ supported though. We support both, as of this PR. Open
to feedback.
On top of https://github.com/astral-sh/puffin/pull/947, we can also box
`PrioritizedDistribution`.
In a simple benchmark, this seems to slightly improve performance when
comparing only this commit to main, even though the benchmark is too
noisy to establish significance:
```
$ hyperfine --warmup 30 --runs 300 "target/profiling/main-dev resolve meine_stadt_transparent" "target/profiling/puffin-dev resolve meine_stadt_transparent"
Benchmark 1: target/profiling/main-dev resolve meine_stadt_transparent
Time (mean ± σ): 83.6 ms ± 2.0 ms [User: 77.7 ms, System: 20.0 ms]
Range (min … max): 81.4 ms … 98.2 ms 300 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Benchmark 2: target/profiling/puffin-dev resolve meine_stadt_transparent
Time (mean ± σ): 80.8 ms ± 2.2 ms [User: 75.4 ms, System: 19.5 ms]
Range (min … max): 78.6 ms … 98.6 ms 300 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Summary
target/profiling/puffin-dev resolve meine_stadt_transparent ran
1.03 ± 0.04 times faster than target/profiling/main-dev resolve meine_stadt_transparent
```
The effect on type sizes however is considerable ([downstack
PR](https://gist.github.com/konstin/38e6c774db541db46d61f1d4ea6b498f)
vs. [this
PR](https://gist.github.com/konstin/003a77fe7d7d246b0d535e3fc843cb36)):
```patch
--- branch.txt 2024-01-17 14:26:01.826085176 +0100
+++ boxed-prioritized-dist.txt 2024-01-17 14:25:57.101900963 +0100
@@ -1,19 +1,3 @@
-9264 alloc::collections::btree::node::InternalNode<pep440_rs::version::Version, distribution_types::PrioritizedDistribution> align=8
- 9168 data
- 96 edges
-
-9264 alloc::collections::btree::node::InternalNode<pep440_rs::Version, distribution_types::PrioritizedDistribution> align=8
- 9168 data
- 96 edges
-
-9168 alloc::collections::btree::node::LeafNode<pep440_rs::version::Version, distribution_types::PrioritizedDistribution> align=8
- 9064 vals
- 88 keys
-
-9168 alloc::collections::btree::node::LeafNode<pep440_rs::Version, distribution_types::PrioritizedDistribution> align=8
- 9064 vals
- 88 keys
-
8992 tokio::sync::mpsc::block::Block<hyper::client::dispatch::Envelope<http::request::Request<reqwest::async_impl::body::ImplStream>, http::response::Response<hyper::body::body::Body>>> align=8
8960 values
32 header
@@ -74,10 +58,23 @@
40 __tracing_attr_span
64 variant Unresumed, Returned, Panicked
+5648 {async fn body@crates/puffin-client/src/registry_client.rs:224:5: 224:30} align=8
+ 5647 variant Suspend0
+ 5576 __awaitee align=8
+ 40 __tracing_attr_span
```
This is https://github.com/astral-sh/puffin/pull/947 again but this time
merging into main instead of downstack, sorry for the noise.
---
Windows has a default stack size of 1MB, which makes puffin often fail
with stack overflows. The PR reduces stack size by three changes:
* Boxing `File` in `Dist`, reducing the size from 496 to 240.
* Boxing the largest futures.
* Boxing `CachePolicy`
## Method
Debugging happened on linux using
https://github.com/astral-sh/puffin/pull/941 to limit the stack size to
1MB. Used ran the command below.
```
RUSTFLAGS=-Zprint-type-sizes cargo +nightly build -p puffin-cli -j 1 > type-sizes.txt && top-type-sizes -w -s -h 10 < type-sizes.txt > sizes.txt
```
The main drawback is top-type-sizes not saying what the `__awaitee` is,
so it requires manually looking up with a future with matching size.
When the `brotli` features on `reqwest` is active, a lot of brotli types
show up. Toggling this feature however seems to have no effect. I assume
they are false positives since the `brotli` crate has elaborate control
about allocation. The sizes are therefore shown with the feature off.
## Results
The largest future goes from 12208B to 6416B, the largest type
(`PrioritizedDistribution`, see also #948) from 17448B to 9264B. Full
diff: https://gist.github.com/konstin/62635c0d12110a616a1b2bfcde21304f
For the second commit, i iteratively boxed the largest file until the
tests passed, then with an 800KB stack limit looked through the
backtrace of a failing test and added some more boxing.
Quick benchmarking showed no difference:
```console
$ hyperfine --warmup 2 "target/profiling/main-dev resolve meine_stadt_transparent" "target/profiling/puffin-dev resolve meine_stadt_transparent"
Benchmark 1: target/profiling/main-dev resolve meine_stadt_transparent
Time (mean ± σ): 49.2 ms ± 3.0 ms [User: 39.8 ms, System: 24.0 ms]
Range (min … max): 46.6 ms … 63.0 ms 55 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Benchmark 2: target/profiling/puffin-dev resolve meine_stadt_transparent
Time (mean ± σ): 47.4 ms ± 3.2 ms [User: 41.3 ms, System: 20.6 ms]
Range (min … max): 44.6 ms … 60.5 ms 62 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Summary
target/profiling/puffin-dev resolve meine_stadt_transparent ran
1.04 ± 0.09 times faster than target/profiling/main-dev resolve meine_stadt_transparent
```
By default, windows has a stack size limit of 1MB which we run against
in debug without any explicit culprit. A new environment variable
`PUFFIN_STACK_SIZE` allows setting an artificially smaller stack size.
## Summary
I got confused by why `VerbatimUrl` was on `Path`. Since it's directly
computed from it, I think we should just compute it as-needed. I think
it's also possibly-buggy because the URL is the URL of the _directory_,
not the artifact itself, which differs from other distributions.
Missing piece for the release.
## Test Plan
Built the image locally:
```shell
❯ docker run 99956098e1f8f04e209dcfc4a0afcee67df1fe8a726c164884e67f035b1a0f42
Usage: puffin [OPTIONS] <COMMAND>
Commands:
pip Resolve and install Python packages
venv Create a virtual environment
clean Clear the cache
help Print this message or the help of the given subcommand(s)
Options:
-q, --quiet Do not print any output
-v, --verbose Use verbose output
-n, --no-cache Avoid reading from or writing to the cache
--cache-dir <CACHE_DIR> Path to the cache directory [env: PUFFIN_CACHE_DIR=]
-h, --help Print help
-V, --version Print version
```
## Summary
This PR adds a release workflow powered by `cargo-dist`. It's similar to
the version that's PR'd in Ruff
(https://github.com/astral-sh/ruff/pull/9559), with the exception that
it doesn't include the Docker build or the "update dependents" step for
pre-commit.
## Summary
This PR is like #957, but for validating the virtual environment, rather
than the cache. So, if you have a local wheel, and you rebuild it, we'll
now correctly uninstall and reinstall it in the virtual environment.
## Summary
- This was inherited from
d719988323/src/metadata.rs (LL78C2-L91C26)
- ...which introduced this code here:
9cd1d43f7c
- ...with the originating issue here:
https://github.com/PyO3/maturin/issues/612
- ...and the upstream issue here:
https://github.com/staktrace/mailparse/issues/50
It seems like the goal was to support Unicode in certain header fields,
but I don't think this is necessary for us. We only use
`get_first_value` for `Requires-Python`, which has to be ASCII, doesn't
it?
In my testing, it seems like the `charset` hack can also be removed. The
tests I copied over actually work without it, which makes me a bit
skeptical.
The main benefit here is that we get to a remove a _big_ dependency
stack, including Chumsky and Stacker and psm which have limited
cross-platform support.
## Summary
This is a small correctness improvement that ensures that we avoid using
stale cache entries for local dependencies in the install plan. We
already have some logic like this in the source distribution builder,
but it didn't apply in the install plan, and so we'd end up using stale
wheels.
Specifically, now, if you create a new local wheel, and run `pip sync`,
we'll mark the cache entries as stale and make sure we unzip it and
install it. (If the wheel is _already_ installed, we won't reinstall it
though, which will be a separate change. This is just about reading from
the cache, not the environment.)
Fixes#965
We have to canonicalize the interpreter path, otherwise the home is set
to the venv dir instead of the real root. This would make
python-build-standalone fail with the encodings module not being found
because its home is wrong.
The `InstallPlan` does a lot of work in the constructor, which I tend to
feel is an anti-pattern. With cache refresh, it's also going to need to
be made `async`, so it really feels like it should be a clearer method
rather than an async, fallible constructor that does a bunch of IO. This
PR splits into a `Planner` (with a `build` method) and a `Plan`.
## Summary
It turns out that storing an absolute URL for every file caused a
significant performance regression. This PR attempts to address the
regression with two changes.
The first is that we now store the raw string if the URL is an absolute
URL. If the URL is relative, we store the base URL alongside the raw
relative string. As such, we avoid serializing and deserializing URLs
until we need them (later on), except for the base URL.
The second is that we now use the internal `Url` crate methods for
serializing and deserializing. If you look inside `Url`, its standard
serializer and deserialization actually convert it to a string, then
parse the string. But the crate exposes some other methods for faster
serialization and deserialization (with fewer guarantees). I think this
is totally fine since the cache is entirely internal.
If we _just_ change the `Url` serialization (and no other code -- so
continue to store URLs for every file), then the regression goes down to
about 5%:
```shell
❯ python -m scripts.bench \
--puffin-path ./target/release/main \
--puffin-path ./target/release/relative --puffin-path ./target/release/puffin \
scripts/requirements/home-assistant.in --benchmark resolve-warm
Benchmark 1: ./target/release/main (resolve-warm)
Time (mean ± σ): 496.3 ms ± 4.3 ms [User: 452.4 ms, System: 175.5 ms]
Range (min … max): 487.3 ms … 502.4 ms 10 runs
Benchmark 2: ./target/release/relative (resolve-warm)
Time (mean ± σ): 284.8 ms ± 2.1 ms [User: 245.8 ms, System: 165.6 ms]
Range (min … max): 280.3 ms … 288.0 ms 10 runs
Benchmark 3: ./target/release/puffin (resolve-warm)
Time (mean ± σ): 300.4 ms ± 3.2 ms [User: 255.5 ms, System: 178.1 ms]
Range (min … max): 295.4 ms … 305.1 ms 10 runs
Summary
'./target/release/relative (resolve-warm)' ran
1.05 ± 0.01 times faster than './target/release/puffin (resolve-warm)'
1.74 ± 0.02 times faster than './target/release/main (resolve-warm)'
```
So I considered _just_ making that change. But 5% is kind of
borderline...
With both of these changes, the regression is down to 1-2%:
```
Benchmark 1: ./target/release/relative (resolve-warm)
Time (mean ± σ): 282.6 ms ± 7.4 ms [User: 244.6 ms, System: 181.3 ms]
Range (min … max): 275.1 ms … 318.5 ms 30 runs
Benchmark 2: ./target/release/puffin (resolve-warm)
Time (mean ± σ): 286.8 ms ± 2.2 ms [User: 247.0 ms, System: 169.1 ms]
Range (min … max): 282.3 ms … 290.7 ms 30 runs
Summary
'./target/release/relative (resolve-warm)' ran
1.01 ± 0.03 times faster than './target/release/puffin (resolve-warm)'
```
It's consistently ~2%-ish, but at this point it's unclear if that's due
to the URL change or something other change between now and then.
Closes#943.
On ubuntu and python 3.10,
```
cargo run -q -- pip-install --find-links https://storage.googleapis.com/jax-releases/jax_cuda_releases.html "jax[cuda12_pip]==0.4.23"
```
non-deterministically but for me consistently fails to install with
messages such as
```
error: Failed to install: nvidia_nccl_cu12-2.19.3-py3-none-manylinux1_x86_64.whl (nvidia-nccl-cu12==2.19.3)
Caused by: failed to remove file `/home/konsti/projects/puffin/.venv/lib/python3.10/site-packages/nvidia/__init__.py`
Caused by: No such file or directory (os error 2)
```
```
error: Failed to install: nvidia_cublas_cu12-12.3.4.1-py3-none-manylinux1_x86_64.whl (nvidia-cublas-cu12==12.3.4.1)
Caused by: Replacing an existing file or directory failed
```
```
error: Failed to install: nvidia_cuda_nvcc_cu12-12.3.107-py3-none-manylinux1_x86_64.whl (nvidia-cuda-nvcc-cu12==12.3.107)
Caused by: failed to hardlink file from /home/konsti/.cache/puffin/wheels-v0/pypi/nvidia-cuda-nvcc-cu12/nvidia_cuda_nvcc_cu12-12.3.107-py3-none-manylinux1_x86_64/nvidia/__init__.py to /home/konsti/projects/puffin/.venv/lib/python3.10/site-packages/nvidia/__init__.py
Caused by: File exists (os error 17)
```
We install a lot of nvidia package, that all contain
`nvidia/__init__.py`, since they all install themselves into the
`nvidia` module:
```
nvidia-cublas-cu12==12.3.4.1
nvidia-cuda-cupti-cu12==12.3.101
nvidia-cuda-nvcc-cu12==12.3.107
nvidia-cuda-nvrtc-cu12==12.3.107
nvidia-cuda-runtime-cu12==12.3.101
nvidia-cudnn-cu12==8.9.7.29
nvidia-cufft-cu12==11.0.12.1
nvidia-cusolver-cu12==11.5.4.101
nvidia-cusparse-cu12==12.2.0.103
nvidia-nccl-cu12==2.19.3
nvidia-nvjitlink-cu12==12.3.101
```
```
$ tree -L 1 .venv/lib/python3.10/site-packages/nvidia
.venv/lib/python3.10/site-packages/nvidia
├── cublas
├── cuda_cupti
├── cuda_nvcc
├── cuda_nvrtc
├── cuda_runtime
├── cudnn
├── cufft
├── cusolver
├── cusparse
├── __init__.py
├── nccl
└── nvjitlink
```
When installing we get a race condition, each package installation is
its own thread:
* Installer Thread 1 creates `nvidia/__init__.py`
* Installer Thread 2 sees an existing `nvidia/__init__.py`
* Installer Thread 3 sees an existing `nvidia/__init__.py`
* Installer Thread 2 removes `nvidia/__init__.py`
* Installer Thread 3 tries to remove `nvidia/__init__.py`, it doesn't
exist anymore -> failure.
We switch to a new strategy: When the target files exists, we don't
remove it, but instead hardlink the source file to a tempfile first,
then renaming the tempfile to the target file. Renaming is considered an
atomic operation.
I've put the logging on debug level because they cases indicate a
conflict between two packages while being rare.
Closes#925
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
This PR uses a single `Index` that's shared between the top-level
resolver and any sub-resolutions happen in the course of that top-level
resolution (namely, to resolve build dependencies for any source
distributions).
In theory it's an optimization, since (e.g.) if we have two packages
that both need the `flit-core` build system, and we attempt to build
them both at once, we'll only fetch its metadata _once_, and share it
across the two resolutions. In practice, I haven't been able to get this
to show up in benchmarks. I suspect you'd need a _lot_ of source
distributions for it to matter... Though it may still be worth doing, it
strikes me as a cleaner design.
Closes#200.
Closes#541.
## Summary
This fixes an extremely subtle bug in `pip install --reinstall`, whereby
if you depend on `setuptools` at the top level, we end up uninstalling
it after resolving, which breaks some cached state. If we have
`--reinstall`, we need to reset that cached state between resolving and
installing.
## Test Plan
Running `pip install --reinstall` with:
```txt
setuptools
devpi @ e334eb4dc9bb023329e4b610e4515b/devpi-2.2.0.tar.gz
```
Fails on `main`, but passes.
## Summary
This PR fixes a subtle bug in `pip install` when using `--reinstall`. If
a package depends on a build system directly (e.g., `waitress` depends
on `setuptools`), and then you have other packages that also need the
build system to build a source distribution, right now, we don't share
the `OnceMap` between those cases.
This lifts the `InFlight` tracking up a level, so that it's initialized
once per command, then shared everywhere.
## Test Plan
I'm having trouble coming up with an identical test-case and hesitant to
add this slow test to the suite... But if you run `pip install
--reinstall` with:
```
waitress @ git+https://github.com/zanieb/waitress
devpi-server @ git+https://github.com/zanieb/devpi#subdirectory=server
```
It fails consistently on `main` and passes here.
## Summary
This makes the separation clearer between the legacy `pip` API and the
API we'll add in the future for the package manager itself. It also
enables seamless `puffin pip` aliasing for those that want it.
Closes#918.
## Summary
This PR restructures the flat index fetching in a few ways:
1. It now lives in its own `FlatIndexClient`, since it felt a bit
awkward (in my opinion) for it to live in `RegistryClient`.
2. We now fetch the `FlatIndex` outside of the resolver. This has a few
benefits: (1) the resolver construct is no longer `async` and no longer
returns `Result`, which feels better for a resolver; and (2) we can
share the `FlatIndex` across resolutions rather than re-fetching it for
every source distribution build.
## Summary
`FlatIndex` is now the thing that's keyed on `PackageName`, while
`FlatDistributions` is what used to be called `FlatIndex` (a map from
version to `PrioritizedDistribution`, for a single package). I find this
a bit clearer, since we can also remove the `from_files` that doesn't
return `Self`, which I had trouble following.
## Summary
I'm running into some annoyances converting `&Version` to
`&PubGrubVersion` (which is just a wrapper type around `Version`), and I
realized... We don't even need `PubGrubVersion`?
The reason we "need" it today is due to the orphan trait rule: `Version`
is defined in `pep440_rs`, but we want to `impl
pubgrub::version::Version for Version` in the resolver crate.
Instead of introducing a new type here, which leads to a lot of
awkwardness around conversion and API isolation, what if we instead just
implement `pubgrub::version::Version` in `pep440_rs` via a feature? That
way, we can just use `Version` everywhere without any confusion and
conversion for the wrapper type.
Add directory `--find-links` support for local paths to pip-compile.
It seems that pip joins all sources and then picks the best package. We
explicitly give find links packages precedence if the same exists on an
index and locally by prefilling the `VersionMap`, otherwise they are
added as another index and the existing rules of precedence apply.
Internally, the feature is called _flat index_, which is more meaningful
than _find links_: We're not looking for links, we're picking up local
directories, and (TBD) support another index format that's just a flat
list of files instead of a nested index.
`RegistryBuiltDist` and `RegistrySourceDist` now use `WheelFilename` and
`SourceDistFilename` respectively. The `File` inside `RegistryBuiltDist`
and `RegistrySourceDist` gained the ability to represent both a url and
a path so that `--find-links` with a url and with a path works the same,
both being locked as `<package_name>@<version>` instead of
`<package_name> @ <url>`. (This is more of a detail, this PR in general
still work if we strip that and have directory find links represented as
`<package_name> @ file:///path/to/file.ext`)
`PrioritizedDistribution` and `FlatIndex` have been moved to locations
where we can use them in the upstack PR.
I added a `scripts/wheels` directory with stripped down wheels to use
for testing.
We're lacking tests for correct tag priority precedence with flat
indexes, i only confirmed this manually since it is not covered in the
pip-compile or pip-sync output.
Closes#876
There is no guarantee that indexes provide hashes at all or the sha256
we support specifically. [PEP
503](https://peps.python.org/pep-0503/#specification):
> The URL SHOULD include a hash in the form of a URL fragment with the
following syntax: #<hashname>=<hashvalue>, where <hashname> is the
lowercase name of the hash function (such as sha256) and <hashvalue> is
the hex encoded digest.
We instead use the url as input to generate a hash when caching.
## Summary
We always normalize extra names in our requirements (e.g., `cuda12_pip`
to `cuda12-pip`), but we weren't normalizing within PEP 508 markers,
which meant we ended up comparing `cuda12-pip` (normalized) against
`cuda12_pip` (unnormalized).
Closes https://github.com/astral-sh/puffin/issues/911.
Previously, the url on file could either be a relative or an absolute
url, depending on the index, and we would finalize it lazily. Now we
finalize the url when converting `pypi_types::File` to
`distribution_types::File`. This change is required to make the hashes
on `File` optional (https://github.com/astral-sh/puffin/pull/910), which
are currently the only unique field usable for caching.
## Summary
This PR ensures that when the user passes in `--python-version`, we
adjust the _markers_ to match the target version, thus forcing us to
select compatible wheels for the `--python-version`, rather than the
installed version.
## Context
Let's call Python 3.10 the "installed" environment and Python 3.12 the
"target" environment. For each version, we have _both_ a Python version
(to match against `Requires-Python`) and a set of tags (to match against
wheels).
The rules for resolution are as follows...
- For each package, for each version, we try to find the "best
candidate" for resolution and installation.
- We first look for a wheel that's compatible with the _target_
environment. This requires testing against both the `Requires-Python`
and the markers. (We won't have to build or run this code, so the
_installed_ version is irrelevant.) **(This PR corrects _this_ bullet --
previously, we validated against the _installed_ markers, rather than
the target markers.)**
- If we can't find a compatible wheel, we accept any _incompatible_
wheel as long as there's a source distribution. The source distribution
_must_ be compatible with the target environment. (We won't have to
build or run this code, so the _installed_ version is irrelevant.)
- If there are no wheels, then the source distribution must be
compatible with _both_ the installed and target environments, since we
need to build it.
This is all true for the top-level resolution. When we perform a
sub-resolution (when resolving the build dependencies of a source
distribution), we should _only_ use the installed environment, and
ignore the target environment, since we assume that the dependencies
will be the same in both environments once built -- so our goal is
"just" to build the distribution, without concern for which build
dependencies it uses.
Closes https://github.com/astral-sh/puffin/issues/883.
Remove a test case from the `install_editable` that slows it down from
3.6s to 6.5s while providing low test coverage. It also seems to block
other tests sometimes, `cargo nextest run -E "test(editable)"
--all-features` has more consistent and lower runtimes. Surprisingly
this seems to have bigger effect than switching from pyo3 to cffi.
Used test commands:
```
rm -rf scripts/editable-installs/maturin_editable/target/ && time cargo nextest run -E "test(=install_editable)" --all-features
rm -rf scripts/editable-installs/maturin_editable/target/ && time cargo nextest run -E "test(editable)" --all-features
```
Part of #878
Replace the DTLSsocket test with a dummy package that does nothing but
contain the build system specs that we need. This should speed up one of
the slowest tests.
Part of #878
## Summary
Now that `get_or_build_wheel` will often _also_ handle the unzip step,
we need to move our per-target locking (`OnceMap`) up a level.
Previously, it was only applied to the unzip step, to prevent us from
attempting to unzip into the same target concurrently; now, it's applied
at the `get_wheel` level, which includes both downloading and unzipping.
## Test Plan
It seems like none of our existing tests catch this -- perhaps because
they're too "simple"? You need to run into a situation in which you're
doing multiple source distribution builds concurrently (since they'll
all try to download `setuptools`):
```
rm -rf foo && virtualenv --clear .venv && cargo run -p puffin-cli -- pip-compile ./scripts/requirements/pydantic.in --verbose --cache-dir foo
```
Reduces the number of implementation branches handling `Range:full`,
deferring it to `PackageRange`.
Improves some user-facing messages, e.g. saying `all versions of
<package>` instead of `<package>*`.
Changes the member names of the `PackageRangeKind` enum — they were not
very clear.
## Summary
Installs the seed packages you get with `virtualenv`, but opt-in rather
than opt-out.
Closes https://github.com/astral-sh/puffin/issues/852.
## Test Plan
```
❯ ./scripts/benchmarks/venv.sh
+ hyperfine --runs 20 --warmup 3 --prepare 'rm -rf .venv' './target/release/puffin venv' --prepare 'rm -rf .venv' 'virtualenv --without-pip .venv' --prepare 'rm -rf .venv' 'python -m venv --without-pip .venv'
Benchmark 1: ./target/release/puffin venv
Time (mean ± σ): 4.6 ms ± 0.2 ms [User: 2.4 ms, System: 3.6 ms]
Range (min … max): 4.3 ms … 4.9 ms 20 runs
Warning: Command took less than 5 ms to complete. Note that the results might be inaccurate because hyperfine can not calibrate the shell startup time much more precise than this limit. You can try to use the `-N`/`--shell=none` option to disable the shell completely.
Benchmark 2: virtualenv --without-pip .venv
Time (mean ± σ): 73.3 ms ± 0.3 ms [User: 57.4 ms, System: 14.2 ms]
Range (min … max): 72.8 ms … 74.0 ms 20 runs
Benchmark 3: python -m venv --without-pip .venv
Time (mean ± σ): 22.5 ms ± 0.3 ms [User: 17.0 ms, System: 4.9 ms]
Range (min … max): 22.0 ms … 23.2 ms 20 runs
Summary
'./target/release/puffin venv' ran
4.92 ± 0.20 times faster than 'python -m venv --without-pip .venv'
16.00 ± 0.63 times faster than 'virtualenv --without-pip .venv'
+ hyperfine --runs 20 --warmup 3 --prepare 'rm -rf .venv' './target/release/puffin venv --seed' --prepare 'rm -rf .venv' 'virtualenv .venv' --prepare 'rm -rf .venv' 'python -m venv .venv'
Benchmark 1: ./target/release/puffin venv --seed
Time (mean ± σ): 20.2 ms ± 0.4 ms [User: 8.6 ms, System: 15.7 ms]
Range (min … max): 19.7 ms … 21.2 ms 20 runs
Benchmark 2: virtualenv .venv
Time (mean ± σ): 135.1 ms ± 2.4 ms [User: 66.7 ms, System: 65.7 ms]
Range (min … max): 133.2 ms … 142.8 ms 20 runs
Benchmark 3: python -m venv .venv
Time (mean ± σ): 1.656 s ± 0.014 s [User: 1.447 s, System: 0.186 s]
Range (min … max): 1.641 s … 1.697 s 20 runs
Summary
'./target/release/puffin venv --seed' ran
6.67 ± 0.17 times faster than 'virtualenv .venv'
81.79 ± 1.70 times faster than 'python -m venv .venv'
```
## Summary
Our current setup uses the legacy `setup.py`-based builds if a
`pyproject.toml` file isn't present. This matches pip's behavior.
However, `pypa/build` uses PEP 517-based builds in such cases, and it
looks like pip plans to make that the default
(https://github.com/pypa/pip/issues/9175), with the limiting factor
being performance issues related to isolated builds.
This is now the default behavior, but the `--legacy-setup-py` flag
allows users to opt-in to using `setup.py` directly for distributions
that lack a `pyproject.toml`.
## Summary
This PR adds support for `prepare_metadata_for_build_wheel`, which
allows us to determine source distribution metadata without building the
source distribution. This represents an optimization for the resolver,
as we can skip the expensive build phase for build backends that support
it.
For reference, `prepare_metadata_for_build_wheel` seems to be supported
by:
- `hatchling` (as of
[1.0.9](https://hatch.pypa.io/latest/history/hatchling/#hatchling-v1.9.0)).
- `flit`
- `setuptools`
In fact, it seems to work for every backend _except_ those using legacy
`setup.py`.
Closes#599.
Refactoring split out from find links support: Find links files can be
represented as `Dist`, but not really as `File`, they don't have url nor
hashes.
`DistRequiresPython` is somewhat odd as an in between type.
Changes `File::size` from a `usize` to a `u64`.
The motivations are that with tensorflow wheels being 475 MB
(https://pypi.org/project/tensorflow/2.15.0.post1/#files), we're already
only one order of magnitude away and to avoid target dependent failures.
## Summary
If pre-releases are available for a package that we otherwise couldn't
resolve, we now show a hint that includes one of the example versions.
Closes https://github.com/astral-sh/puffin/issues/811.
Instead of trying to fixup _all_ the invalid version specifiers on pypi
and elsewhere, this filters out distributions with invalid
`requires-python` version specifiers that even
`LenientVersionSpecifiers` couldn't parse, as opposed to failing
entirely, which we currently do.
I would be nicer to model through an invalid distribution pubgrub type,
together with e.g. source dists with an unknown extension, so that the
version itself still shows up in the error trace.
At the same time, we reduce the log level for fixups from warning to
trace, as they are not actionable for the user.
Adjusts display of "no versions available" in error messages to be
consistent with other package/range pairings i.e. we usually display
"<package-name><range>".
## Summary
Right now, both the callback _and_ the "We have no compatible wheel"
paths have a lot of repeated code. This PR changes the callback to
_just_ remove all the wheels and handle the download, and the rest of
the method following the callback is responsible for finding and
building any wheels.
## Summary
I'm unable to run `puffin-cli` on `main` as the
`tracing-durations-export` is marked as optional, but the crate actually
depends on it to compile. Further, without `tracing-durations-export`,
there are `Option` types that can't resolve to a concrete type.
This PR fixes compilation with and without the feature.
Noticed these when working on something unrelated. Generally:
- Prefer `entry.file_type()` over `entry.path().is_file()` or similar,
as the former is almost always free on Unix.
- Call `entry.path()` once, since it allocates internally (returns a
`PathBuf`).
In the past, I moved us to `owo-colors`
(https://github.com/astral-sh/puffin/pull/121); then, we moved back,
because we ran into issues with overriding the settings to force-disable
colors. But `anstream` solved those problems, so I'm moving us _back_ to
`owo-colors`, since it's what `anstream` recommends, and it's already
used by many of our dependencies (`miette`, `configparser`).
---------
Co-authored-by: konstin <konstin@mailbox.org>
## Summary
We can use `anstream` for all color control, rather than going through
`colored`. Note that we still need the `colored` crate, since `colored`
and `anstream` solve different problems. (`anstream` recommends using
`owo-colors` alongside it, but `colored` seems to work fine?)
Resolves the issue raised in
https://github.com/astral-sh/puffin/pull/742 via `anstream` rather than
`colored`.
Closes https://github.com/astral-sh/puffin/issues/782.
Looking at the profile for tf-models-nightly after #789,
`compare_release` is the single biggest item. Adding a fast path, we
avoid paying the cost for padding releases with 0s when they are the
same length, resulting in a 16% for this pathological case. Note that
this mainly happens because tf-models-nightly is almost all large dev
releases that hit the slow path.
**Before**

**After**

```
$ hyperfine --warmup 1 --runs 3 "target/profiling/main pip-compile -q scripts/requirements/tf-models-nightly.txt"
"target/profiling/puffin pip-compile -q scripts/requirements/tf-models-nightly.txt"
Benchmark 1: target/profiling/main pip-compile -q scripts/requirements/tf-models-nightly.txt
Time (mean ± σ): 11.963 s ± 0.225 s [User: 11.478 s, System: 0.451 s]
Range (min … max): 11.747 s … 12.196 s 3 runs
Benchmark 2: target/profiling/puffin pip-compile -q scripts/requirements/tf-models-nightly.txt
Time (mean ± σ): 10.317 s ± 0.720 s [User: 9.885 s, System: 0.404 s]
Range (min … max): 9.501 s … 10.860 s 3 runs
Summary
target/profiling/puffin pip-compile -q scripts/requirements/tf-models-nightly.txt ran
1.16 ± 0.08 times faster than target/profiling/main pip-compile -q scripts/requirements/tf-models-nightly.txt
```
This PR builds on #780 by making both version parsing faster, and
perhaps more importantly, making version comparisons much faster.
Overall, these changes result in a considerable improvement for the
`boto3.in` workload. Here's the status quo:
```
$ time puffin pip-compile --no-build --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/requirements/boto3.in
Resolved 31 packages in 34.56s
real 34.579
user 34.004
sys 0.413
maxmem 2867 MB
faults 0
```
And now with this PR:
```
$ time puffin pip-compile --no-build --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/requirements/boto3.in
Resolved 31 packages in 9.20s
real 9.218
user 8.919
sys 0.165
maxmem 463 MB
faults 0
```
This particular workload gets stuck in pubgrub doing resolution, and
thus benefits mightily from a faster `Version::cmp` routine. With that
said, this change does also help a fair bit with "normal" runs:
```
$ hyperfine -w10 \
"puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in" \
"puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in"
Benchmark 1: puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
Time (mean ± σ): 337.5 ms ± 3.9 ms [User: 310.5 ms, System: 73.2 ms]
Range (min … max): 333.6 ms … 343.4 ms 10 runs
Benchmark 2: puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
Time (mean ± σ): 189.8 ms ± 3.0 ms [User: 168.1 ms, System: 78.4 ms]
Range (min … max): 185.0 ms … 196.2 ms 15 runs
Summary
puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in ran
1.78 ± 0.03 times faster than puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
```
There is perhaps some future work here (detailed in the commit
messages), but I suspect it would be more fruitful to explore ways of
making resolution itself and/or deserialization faster.
Fixes#373, Closes#396
Uses new metadata added in https://github.com/zanieb/packse/pull/61 to
assert that resolution succeeded or failed _and_ that the installed
package versions match the expected result.
The semantics are a bit unintuitive because `--python-version` is a
preference when looking for a python version without a venv, but if we
don't find that exact version we'll take `python3` and patch the
markers. This will make more sense once we start provisioning python
builds.
We can now resolve black with both python 3.8 and 3.12, with or without
that python version being in scope. In the example below,
`PATH=$HOME/.cargo/bin:/usr/bin` removes the pyenv builds and leaves
only `python3`, which is python 3.11.
```console
$ RUST_LOG=puffin::commands=debug cargo run --bin puffin -q -- pip-compile -v scripts/benchmarks/requirements/black.in --python-version py38
0.004108s DEBUG puffin::commands::pip_compile Using Python 3.8 at /home/konsti/.local/bin/python3.8
Resolved 8 packages in 44ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
# puffin pip-compile -v scripts/benchmarks/requirements/black.in --python-version py38
black==23.11.0
[...]
platformdirs==4.0.0
# via black
tomli==2.0.1
# via black
typing-extensions==4.8.0
# via black
$ PATH=$HOME/.cargo/bin:/usr/bin RUST_LOG=puffin::commands=debug cargo run --bin puffin -q -- pip-compile -v scripts/benchmarks/requirements/black.in --python-version py38
0.004315s DEBUG puffin::commands::pip_compile Using Python 3.11 at /usr/bin/python3
Resolved 8 packages in 43ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
# puffin pip-compile -v scripts/benchmarks/requirements/black.in --python-version py38
black==23.11.0
[...]
platformdirs==4.0.0
# via black
tomli==2.0.1
# via black
typing-extensions==4.8.0
# via black
```
```console
$ RUST_LOG=puffin::commands=debug cargo run --bin puffin -q -- pip-compile -v scripts/benchmarks/requirements/black.in --python-version py312
0.004216s DEBUG puffin::commands::pip_compile Using Python 3.12 at /home/konsti/.local/bin/python3.12
Resolved 6 packages in 37ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
# puffin pip-compile -v scripts/benchmarks/requirements/black.in --python-version py312
black==23.11.0
[...]
platformdirs==4.0.0
# via black
$ PATH=$HOME/.cargo/bin:/usr/bin RUST_LOG=puffin::commands=debug cargo run --bin puffin -q -- pip-compile -v scripts/benchmarks/requirements/black.in --python-version py312
0.004190s DEBUG puffin::commands::pip_compile Using Python 3.11 at /usr/bin/python3
Resolved 6 packages in 39ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
# puffin pip-compile -v scripts/benchmarks/requirements/black.in --python-version py312
black==23.11.0
[...]
platformdirs==4.0.0
# via black
```
Fixes#235.
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
One of the most common ways source dists fail to build (on linux) is
when the linker fails because the shared library of a native dependency
is not installed. These errors are hard to understand when you're not a
c programmer:
```
In file included from /usr/include/python3.10/unicodeobject.h:1046,
from /usr/include/python3.10/Python.h:83,
from Modules/3.x/readline.c:8:
Modules/3.x/readline.c: In function ‘on_completion’:
/usr/include/python3.10/cpython/unicodeobject.h:744:29: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
744 | #define _PyUnicode_AsString PyUnicode_AsUTF8
| ^~~~~~~~~~~~~~~~
Modules/3.x/readline.c:842:23: note: in expansion of macro ‘_PyUnicode_AsString’
842 | char *s = _PyUnicode_AsString(r);
| ^~~~~~~~~~~~~~~~~~~
Modules/3.x/readline.c: In function ‘readline_until_enter_or_signal’:
Modules/3.x/readline.c:1044:9: warning: ‘sigrelse’ is deprecated: Use the sigprocmask function instead [-Wdeprecated-declarations]
1044 | sigrelse(SIGINT);
| ^~~~~~~~
In file included from Modules/3.x/readline.c:10:
/usr/include/signal.h:359:12: note: declared here
359 | extern int sigrelse (int __sig) __THROW
| ^~~~~~~~
Modules/3.x/readline.c: In function ‘PyInit_readline’:
Modules/3.x/readline.c:1179:34: warning: assignment to ‘char * (*)(FILE *, FILE *, const char *)’ from incompatible pointer type ‘char * (*)(FILE *, FILE *, char *)’ [-Wincompatible-pointer-types]
1179 | PyOS_ReadlineFunctionPointer = call_readline;
| ^
In file included from /usr/include/string.h:535,
from /usr/include/python3.10/Python.h:30,
from Modules/3.x/readline.c:8:
In function ‘strncpy’,
inlined from ‘call_readline’ at Modules/3.x/readline.c:1124:9:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:95:10: warning: ‘__builtin_strncpy’ output truncated before terminating nul copying as many bytes from a string as its length [-Wstringop-truncation]
95 | return __builtin___strncpy_chk (__dest, __src, __len,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
96 | __glibc_objsize (__dest));
| ~~~~~~~~~~~~~~~~~~~~~~~~~
Modules/3.x/readline.c: In function ‘call_readline’:
Modules/3.x/readline.c:1099:9: note: length computed here
1099 | n = strlen(p);
| ^~~~~~~~~
/usr/bin/ld: cannot find -lncurses: No such file or directory
collect2: error: ld returned 1 exit status
error: command '/usr/bin/x86_64-linux-gnu-gcc' failed with exit code 1
---
```
We parse these errors out, tell the user about the missing shared
library and even the most likely debian/ubuntu package name:
```
This error likely indicates that you need to install the library that provides a shared library for ncurses for pygraphviz-1.11 (e.g. libncurses-dev)
```
Previously, we just pulled the latest commit from `main` on every
update. This causes problems when you do not intend to update the
scenarios as in #787.
This bumps to the latest `packse` commit without new scenarios.
Adds support for a `PUFFIN_NO_WRAP` environment variable which disables
line wrapping in `miette` output.
We set this variable in the scenario tests to improve the readability of
snapshots.
I contributed the ability to disable line wrapping upstream at
https://github.com/zkat/miette/pull/328
This PR does a bit of refactoring to the pep440 crate, and in
particular around the erorr types. This PR is meant to be a precursor
to another PR that does some surgery (both in parsing and in `Version`
representation) that benefits somewhat from this refactoring.
As usual, please review commit-by-commit.
This fixes a compilation error with tests on current `main`. I didn't
track down the exact provenance, but I'd guess it's the result of a
botched merge. (i.e., Two or more PRs that pass CI independently, but
when merged cause failures.)
I'm still confused about it, but this seems to do the right thing?
`HierarchicalLayer` internally has [`let ansi =
io::stderr().is_terminal();`](fcd9eed252/src/lib.rs (L74)),
so the logging itself is already correctly uncolored, but errors in the
log weren't.
Test command, ran with network deactivated:
```shell
RUST_LOG=debug cargo run --bin puffin -- pip-compile -v ./scripts/popular_packages/pypi_8k_downloads.txt 2> log.txt
```
**Before**
```
[1;31merror[0m: Request error: error sending request for url (https://pypi.org/simple/apache-airflow-providers-dbt-cloud/): error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
[1;31mCaused by[0m: error sending request for url (https://pypi.org/simple/apache-airflow-providers-dbt-cloud/): error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
[1;31mCaused by[0m: error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
[1;31mCaused by[0m: dns error: failed to lookup address information: Temporary failure in name resolution
[1;31mCaused by[0m: failed to lookup address information: Temporary failure in name resolution
```
**After**
```
error: Request error: error sending request for url (https://pypi.org/simple/fissix/): error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
Caused by: error sending request for url (https://pypi.org/simple/fissix/): error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
Caused by: error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
Caused by: dns error: failed to lookup address information: Temporary failure in name resolution
Caused by: failed to lookup address information: Temporary failure in name resolution
```
Fixup for
7349527ceadde8fc265a33e6a4e662/boto3-1.2.0-py2.py3-none-any.whl:
```
botocore>=1.3.0,<1.4.0',
```
Note that neither the quote nor the comma are right.
Following #757, improves the script for generating scenario test cases
with:
- A requirements file
- Support for downloading packse scenarios from GitHub dynamically
- Running rustfmt on the generated test file
- Updating snapshots / running tests
As mentioned in #746, instead of just installing the scenario root we
will unpack the root dependencies into the install command to allow
better coverage of direct user requests with scenarios.
I added display of the package tree provided by each scenario.
Use a mustache template for iterative replacements.