Commit Graph

591 Commits

Author SHA1 Message Date
Charlie Marsh
55f2be72e2 Default to PEP 517-based builds (#843)
## Summary

Our current setup uses the legacy `setup.py`-based builds if a
`pyproject.toml` file isn't present. This matches pip's behavior.
However, `pypa/build` uses PEP 517-based builds in such cases, and it
looks like pip plans to make that the default
(https://github.com/pypa/pip/issues/9175), with the limiting factor
being performance issues related to isolated builds.

This is now the default behavior, but the `--legacy-setup-py` flag
allows users to opt-in to using `setup.py` directly for distributions
that lack a `pyproject.toml`.
2024-01-10 01:27:06 +00:00
Charlie Marsh
e26dc8e33d Add support for prepare_metadata_for_build_wheel (#842)
## Summary

This PR adds support for `prepare_metadata_for_build_wheel`, which
allows us to determine source distribution metadata without building the
source distribution. This represents an optimization for the resolver,
as we can skip the expensive build phase for build backends that support
it.

For reference, `prepare_metadata_for_build_wheel` seems to be supported
by:

- `hatchling` (as of
[1.0.9](https://hatch.pypa.io/latest/history/hatchling/#hatchling-v1.9.0)).
- `flit`
- `setuptools`

In fact, it seems to work for every backend _except_ those using legacy
`setup.py`.

Closes #599.
2024-01-10 00:07:37 +00:00
konsti
858d5584cc Use Dist in VersionMap (#851)
Refactoring split out from find links support: Find links files can be
represented as `Dist`, but not really as `File`, they don't have url nor
hashes.

`DistRequiresPython` is somewhat odd as an in between type.
2024-01-10 00:14:42 +01:00
konsti
1203f8f9e8 Gourgeist updates (#862)
* Use caching again
* Make clap feature only required for the cli/bin optional
2024-01-09 23:04:15 +00:00
bojanserafimov
e67b7858e6 Use zlib-ng for faster decompression (#859) 2024-01-09 16:13:36 -05:00
Zanie Blue
34d548de21 Improve error messages when there are no versions of a singleton range (#855) 2024-01-09 15:09:52 -06:00
Charlie Marsh
33982efb25 Remove a TOCTOU read in build (#860)
We should just read and handle the not-found case, rather than checking
if the file doesn't exist first.
2024-01-09 20:33:08 +00:00
Charlie Marsh
31139aa88d Add derive feature to gourgeist (#854)
Needed to build `gourgeist` directly, probably dropped during a
refactor.
2024-01-09 17:46:16 +00:00
Charlie Marsh
7f6aa1a236 Add a basic venv benchmark (#853) 2024-01-09 17:43:16 +00:00
konsti
ee6d809b60 Remove unused Result (#849)
Remove some dead code, seems to be a refactoring oversight
2024-01-09 16:35:10 +00:00
konsti
643e5e4a49 Use pdm for black editable as PEP 621 test case (#848)
This gives us a PEP 621 test package in tree and increases the diversity
for the editable tests a bit.
2024-01-09 16:33:05 +00:00
konsti
5b0b072e3c Allow files >4GB on 32-bit platforms (#847)
Changes `File::size` from a `usize` to a `u64`.

The motivations are that with tensorflow wheels being 475 MB
(https://pypi.org/project/tensorflow/2.15.0.post1/#files), we're already
only one order of magnitude away and to avoid target dependent failures.
2024-01-09 17:31:49 +01:00
Charlie Marsh
ee3a6431c7 Show available pre-releases in error hints (#844)
## Summary

If pre-releases are available for a package that we otherwise couldn't
resolve, we now show a hint that includes one of the example versions.

Closes https://github.com/astral-sh/puffin/issues/811.
2024-01-09 09:58:38 -05:00
konsti
b1edecdf1f Filter out files with invalid requires python specifiers (#775)
Instead of trying to fixup _all_ the invalid version specifiers on pypi
and elsewhere, this filters out distributions with invalid
`requires-python` version specifiers that even
`LenientVersionSpecifiers` couldn't parse, as opposed to failing
entirely, which we currently do.

I would be nicer to model through an invalid distribution pubgrub type,
together with e.g. source dists with an unknown extension, so that the
version itself still shows up in the error trace.

At the same time, we reduce the log level for fixups from warning to
trace, as they are not actionable for the user.
2024-01-09 02:46:27 +00:00
Zanie Blue
64da1f0306 Always pair package names with ranges in error messages (#838)
Adjusts display of "no versions available" in error messages to be
consistent with other package/range pairings i.e. we usually display
"<package-name><range>".
2024-01-08 22:11:10 +00:00
Charlie Marsh
19c6d655b5 Avoid duplicated source distribution handling in url (#841)
## Summary

Right now, both the callback _and_ the "We have no compatible wheel"
paths have a lot of repeated code. This PR changes the callback to
_just_ remove all the wheels and handle the download, and the rest of
the method following the callback is responsible for finding and
building any wheels.
2024-01-08 16:19:54 -05:00
Charlie Marsh
cc9140643e Rename metadata to built_wheel in source/mod.rs (#840) 2024-01-08 19:20:20 +00:00
Charlie Marsh
df254087d9 Break source_dist.rs into a module (#839)
## Summary

Finding this file hard to edit and work in since it's gotten quite
large.
2024-01-08 19:14:45 +00:00
Zanie Blue
2b0c2e294b Fix formatting of negated singleton versions in error messages (#836)
Closes #805 
Requires https://github.com/zanieb/pubgrub/pull/17
2024-01-08 12:33:01 -06:00
Zanie Blue
c3d37db85b Check locked dependencies in CI (#837) 2024-01-08 18:18:40 +00:00
Charlie Marsh
aeefe65227 Fix tracing-duration-export compilation (#835)
## Summary

I'm unable to run `puffin-cli` on `main` as the
`tracing-durations-export` is marked as optional, but the crate actually
depends on it to compile. Further, without `tracing-durations-export`,
there are `Option` types that can't resolve to a concrete type.

This PR fixes compilation with and without the feature.
2024-01-08 18:04:23 +00:00
Charlie Marsh
c06bf658bb Remove some filesystem calls from the installer (#834)
Noticed these when working on something unrelated. Generally:

- Prefer `entry.file_type()` over `entry.path().is_file()` or similar,
as the former is almost always free on Unix.
- Call `entry.path()` once, since it allocates internally (returns a
`PathBuf`).
2024-01-08 12:59:01 -05:00
bojanserafimov
cd6265a66d Fix typo in bench docs (#833) 2024-01-08 12:35:26 -05:00
konsti
004147d441 Add tracing_durations_export feature to puffin-cli (#830)
The optional `tracing-durations-export` feature allows creating
parallelism plots from all puffin-cli commands without affecting
production builds.

Usage:

```
virtualenv --clear -p 3.10 .venv310 && TRACING_DURATIONS_FILE=target/traces/jupyter-no-cache.ndjson RUST_LOG=puffin=info VIRTUAL_ENV=.venv310 cargo run --bin puffin --profile profiling --features tracing-durations-export -- pip-install -v --no-cache jupyter
virtualenv --clear -p 3.10 .venv310 && TRACING_DURATIONS_FILE=target/traces/jupyter.ndjson RUST_LOG=puffin=info VIRTUAL_ENV=.venv310 cargo run --bin puffin --profile profiling --features tracing-durations-export -- pip-install -v jupyter
 ```

Output, plotted in collapsed mode for readability:

Cached jupyter:

![jupyter](https://github.com/astral-sh/puffin/assets/6826232/f7e03c68-0438-4cf4-bceb-9a4a146cc506)

Uncached jupyter:

![image](https://github.com/astral-sh/puffin/assets/6826232/cfdd3383-7a9d-43d6-b8d0-201f64611596)
2024-01-08 16:20:45 +01:00
konsti
b6338b5e4a Use tracing-durations-export to visualize parallelism bottlenecks (dev commands) (#816)
Example usage:

```
# Cached
TRACING_DURATIONS_FILE=target/traces/black.ndjson RUST_LOG=puffin=info cargo run --bin puffin-dev --profile profiling -- resolve black
TRACING_DURATIONS_FILE=target/traces/meine_stadt_transparent.ndjson RUST_LOG=puffin=info cargo run --bin puffin-dev --profile profiling -- resolve meine_stadt_transparent
TRACING_DURATIONS_FILE=target/traces/jupyter.ndjson RUST_LOG=puffin=info cargo run --bin puffin-dev --profile profiling -- resolve jupyter

# No cache
TRACING_DURATIONS_FILE=target/traces/black-no-cache.ndjson RUST_LOG=puffin=info cargo run --bin puffin-dev --profile profiling -- resolve --no-cache black
TRACING_DURATIONS_FILE=target/traces/meine_stadt_transparent-no-cache.ndjson RUST_LOG=puffin=info cargo run --bin puffin-dev --profile profiling -- resolve --no-cache meine_stadt_transparent
TRACING_DURATIONS_FILE=target/traces/jupyter-no-cache.ndjson RUST_LOG=puffin=info cargo run --bin puffin-dev --profile profiling -- resolve --no-cache jupyter
```

Uncached black output example:


![black-no-cache](https://github.com/astral-sh/puffin/assets/6826232/38497b89-7214-453b-9456-c9d9cbf7d2d5)
2024-01-08 16:20:38 +01:00
konsti
243392f718 cargo run run puffin by default (#831)
`cargo run` now runs `puffin` by default. `cargo run --bin puffin-dev`
remains working.
2024-01-08 12:49:06 +00:00
konsti
3f587156ec Improve install instrumentation (#829)
Add tracing spans to different phases of the wheel installation.
2024-01-08 10:13:59 +00:00
konsti
2d4743d782 Update compare_with_pip.py (#828)
Preparing for #817
2024-01-08 09:46:30 +00:00
konsti
60ba7dd14f Use std::io::read_to_string (#826)
The `std::io::read_to_string` shorthand was stabilized in 1.65.
2024-01-08 09:15:38 +00:00
Charlie Marsh
54838914be Migrate back to owo-colors (#824)
In the past, I moved us to `owo-colors`
(https://github.com/astral-sh/puffin/pull/121); then, we moved back,
because we ran into issues with overriding the settings to force-disable
colors. But `anstream` solved those problems, so I'm moving us _back_ to
`owo-colors`, since it's what `anstream` recommends, and it's already
used by many of our dependencies (`miette`, `configparser`).

---------

Co-authored-by: konstin <konstin@mailbox.org>
2024-01-08 08:54:57 +00:00
Charlie Marsh
17452e3e64 Simplify ranges in pre-release hints (#825)
Closes https://github.com/astral-sh/puffin/issues/807.
2024-01-07 12:40:22 -05:00
Charlie Marsh
e6fcb9c4d3 Use anstream for all color control (#823)
## Summary

We can use `anstream` for all color control, rather than going through
`colored`. Note that we still need the `colored` crate, since `colored`
and `anstream` solve different problems. (`anstream` recommends using
`owo-colors` alongside it, but `colored` seems to work fine?)

Resolves the issue raised in
https://github.com/astral-sh/puffin/pull/742 via `anstream` rather than
`colored`.

Closes https://github.com/astral-sh/puffin/issues/782.
2024-01-06 20:44:05 -05:00
Charlie Marsh
fed492831a Inline some format placeholders (#822) 2024-01-06 23:13:44 +00:00
Charlie Marsh
77c3a67029 Remove pub(crate) from RegistryClient fields (#821) 2024-01-06 22:05:18 +00:00
Charlie Marsh
9ded337870 Remove unused proxy field from client (#820) 2024-01-06 17:02:35 -05:00
Charlie Marsh
5f98210083 Use a single hyperfine command for each benchmark (#819)
## Summary

Refactors the benchmark script such that we use a single `hyperfine`
invocation per benchmark, and thus get the comparative summary, which is
_way_ nicer:

```
Benchmark 1: ./target/release/puffin (install-cold)
  Time (mean ± σ):     410.3 ms ±  19.9 ms    [User: 173.7 ms, System: 1314.5 ms]
  Range (min … max):   389.7 ms … 452.1 ms    10 runs

Benchmark 2: ./target/release/baseline (install-cold)
  Time (mean ± σ):     418.2 ms ±  14.4 ms    [User: 210.7 ms, System: 1246.0 ms]
  Range (min … max):   397.3 ms … 445.7 ms    10 runs

Summary
  './target/release/puffin (install-cold)' ran
    1.02 ± 0.06 times faster than './target/release/baseline (install-cold)'
```
2024-01-06 20:14:22 +00:00
Charlie Marsh
0817a0d0d4 Use a dedicated flag for each tool in bench script (#818)
Taking some of Zanie's suggestions to make the custom-path API simpler
in the benchmark script. Each tool is now a dedicated argument, like:

```
python -m scripts.bench --pip-sync --poetry requirements.in
```

To provide custom binaries:

```
python -m scripts.bench \
    --puffin-path ./target/release/puffin \
    --puffin-path ./target/release/baseline \
    requirements.in
```
2024-01-06 19:45:00 +00:00
Charlie Marsh
3b43515262 Misc. refactors to benchmark script (#814)
- Use separate suites for `pip-sync` and `pip-compile`
- DRY up some properties
- Improve documentation
- Reorder classes to match enum
2024-01-06 03:33:50 +00:00
Charlie Marsh
063dd00542 Enable self-benchmarking for Puffin branches in bench.py (#804)
## Summary

This PR enables the use of the `bench.py` script to benchmark Puffin
itself. This is something I often do by via a process like:

- Checkout the `main` branch (or any other baseline branch)
- Run: `cargo build --release`
- Run: `mv ./target/release/puffin ./target/release/baseline`
- Checkout a development branch
- Run: `cargo build --release`
- (New) Run: `python bench.py --tool puffin --path
./target/release/puffin --tool puffin --path ./target/release/baseline
requirements.in`
2024-01-06 03:23:19 +00:00
Charlie Marsh
d2d87db7a3 Add Poetry support to bench.py (#803)
## Summary

Enables benchmarking against Poetry for resolution and installation:

```
Benchmark 1: pip-tools (resolve-cold)
  Time (mean ± σ):     962.7 ms ± 241.9 ms    [User: 322.8 ms, System: 80.5 ms]
  Range (min … max):   714.9 ms … 1459.4 ms    10 runs

Benchmark 1: puffin (resolve-cold)
  Time (mean ± σ):     193.2 ms ±   8.2 ms    [User: 31.3 ms, System: 22.8 ms]
  Range (min … max):   179.8 ms … 206.4 ms    14 runs

Benchmark 1: poetry (resolve-cold)
  Time (mean ± σ):     900.7 ms ±  21.2 ms    [User: 371.6 ms, System: 92.1 ms]
  Range (min … max):   855.7 ms … 933.4 ms    10 runs

Benchmark 1: pip-tools (resolve-warm)
  Time (mean ± σ):     386.0 ms ±  19.1 ms    [User: 255.8 ms, System: 46.2 ms]
  Range (min … max):   368.7 ms … 434.5 ms    10 runs

Benchmark 1: puffin (resolve-warm)
  Time (mean ± σ):       8.1 ms ±   0.4 ms    [User: 4.4 ms, System: 5.1 ms]
  Range (min … max):     7.5 ms …  11.1 ms    183 runs

Benchmark 1: poetry (resolve-warm)
  Time (mean ± σ):     336.3 ms ±   0.6 ms    [User: 283.6 ms, System: 44.7 ms]
  Range (min … max):   335.0 ms … 337.3 ms    10 runs
```
2024-01-06 02:52:55 +00:00
Charlie Marsh
ca2e3d7073 Remove outdated Cargo.toml comment (#813) 2024-01-06 02:50:52 +00:00
Zanie Blue
88adba83a0 Add scenarios with unresolvable dependencies due to excluded versions (#801)
Scenarios added in https://github.com/zanieb/packse/pull/71
2024-01-05 16:21:47 -06:00
Charlie Marsh
fc76f979e6 Fix context managers in bench.py (#802)
I'm sloppy and didn't test this last change prior to merging.
2024-01-05 21:24:45 +00:00
Charlie Marsh
ac385c80ab Add a script to benchmark against other tools (#800)
## Summary

Enables us to benchmark Puffin against `pip-tools` on a variety of
tasks. In subsequent PRs, I'll add support for Poetry and perhaps other
tools too.

Example usage:

```
❯ python scripts/bench.py -f requirements.in
2024-01-05 15:05:39 INFO Benchmarks: resolve-cold, resolve-warm
2024-01-05 15:05:39 INFO Tools: pip-tools, puffin
2024-01-05 15:05:39 INFO Reading requirements from: /Users/crmarsh/workspace/puffin/requirements.in
2024-01-05 15:05:39 INFO ```
2024-01-05 15:05:39 INFO black
2024-01-05 15:05:39 INFO ```
Benchmark 1: pip-tools (resolve-cold)
  Time (mean ± σ):     758.4 ms ±  15.1 ms    [User: 317.8 ms, System: 68.4 ms]
  Range (min … max):   738.1 ms … 786.7 ms    10 runs

Benchmark 1: puffin (resolve-cold)
  Time (mean ± σ):     213.5 ms ±  25.6 ms    [User: 34.6 ms, System: 27.9 ms]
  Range (min … max):   184.6 ms … 270.6 ms    12 runs

Benchmark 1: pip-tools (resolve-warm)
  Time (mean ± σ):     384.3 ms ±   6.7 ms    [User: 259.2 ms, System: 47.0 ms]
  Range (min … max):   376.0 ms … 399.6 ms    10 runs

Benchmark 1: puffin (resolve-warm)
  Time (mean ± σ):       8.0 ms ±   0.4 ms    [User: 4.4 ms, System: 5.0 ms]
  Range (min … max):     7.4 ms …  10.8 ms    209 runs
```
2024-01-05 21:08:20 +00:00
Zanie Blue
9a75703973 Bump packse to hide requires-python in docstrings when not relevant (#797) 2024-01-05 20:49:09 +00:00
Zanie Blue
def7f79f20 Add pre-release test scenario reproducing hint simplification bug (#796)
A reproduction of #751 

Scenarios added in https://github.com/zanieb/packse/pull/68
2024-01-05 14:41:40 -06:00
konsti
65efee1d76 Add compare_release fast path (#799)
Looking at the profile for tf-models-nightly after #789,
`compare_release` is the single biggest item. Adding a fast path, we
avoid paying the cost for padding releases with 0s when they are the
same length, resulting in a 16% for this pathological case. Note that
this mainly happens because tf-models-nightly is almost all large dev
releases that hit the slow path.

**Before**


![image](https://github.com/astral-sh/puffin/assets/6826232/0d2b4553-da69-4cdb-966b-0894a6dd5d94)

**After**


![image](https://github.com/astral-sh/puffin/assets/6826232/6d484808-9d16-408d-823e-a12d321802a5)

```
$ hyperfine --warmup 1 --runs 3 "target/profiling/main pip-compile -q scripts/requirements/tf-models-nightly.txt"
 "target/profiling/puffin pip-compile -q scripts/requirements/tf-models-nightly.txt"
Benchmark 1: target/profiling/main pip-compile -q scripts/requirements/tf-models-nightly.txt
  Time (mean ± σ):     11.963 s ±  0.225 s    [User: 11.478 s, System: 0.451 s]
  Range (min … max):   11.747 s … 12.196 s    3 runs

Benchmark 2: target/profiling/puffin pip-compile -q scripts/requirements/tf-models-nightly.txt
  Time (mean ± σ):     10.317 s ±  0.720 s    [User: 9.885 s, System: 0.404 s]
  Range (min … max):    9.501 s … 10.860 s    3 runs

Summary
  target/profiling/puffin pip-compile -q scripts/requirements/tf-models-nightly.txt ran
    1.16 ± 0.08 times faster than target/profiling/main pip-compile -q scripts/requirements/tf-models-nightly.txt
```
2024-01-05 15:14:11 -05:00
Andrew Gallant
6c98ae9d77 pep440: rewrite the parser and make version comparisons cheaper (#789)
This PR builds on #780 by making both version parsing faster, and
perhaps more importantly, making version comparisons much faster.
Overall, these changes result in a considerable improvement for the
`boto3.in` workload. Here's the status quo:

```
$ time puffin pip-compile --no-build --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/requirements/boto3.in
Resolved 31 packages in 34.56s

real    34.579
user    34.004
sys     0.413
maxmem  2867 MB
faults  0
```

And now with this PR:

```
$ time puffin pip-compile --no-build --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/requirements/boto3.in
Resolved 31 packages in 9.20s

real    9.218
user    8.919
sys     0.165
maxmem  463 MB
faults  0
```

This particular workload gets stuck in pubgrub doing resolution, and
thus benefits mightily from a faster `Version::cmp` routine. With that
said, this change does also help a fair bit with "normal" runs:

```
$ hyperfine -w10 \
    "puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in" \
    "puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in"
Benchmark 1: puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
  Time (mean ± σ):     337.5 ms ±   3.9 ms    [User: 310.5 ms, System: 73.2 ms]
  Range (min … max):   333.6 ms … 343.4 ms    10 runs

Benchmark 2: puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
  Time (mean ± σ):     189.8 ms ±   3.0 ms    [User: 168.1 ms, System: 78.4 ms]
  Range (min … max):   185.0 ms … 196.2 ms    15 runs

Summary
  puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in ran
    1.78 ± 0.03 times faster than puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
```

There is perhaps some future work here (detailed in the commit
messages), but I suspect it would be more fruitful to explore ways of
making resolution itself and/or deserialization faster.

Fixes #373, Closes #396
2024-01-05 11:57:32 -05:00
Zanie Blue
74777c01ea Improve documentation for scenario tests (#795)
- Fix documentation of scenario test module
- Add instructions to scenario update script for local development
2024-01-05 16:51:25 +00:00
konsti
5820a9d937 Update dependencies (#794)
Pull in a bunch of updates so they get some testing before we announce
the project. textwrap 0.16 is blocked on miette updating, http 1.0 on
reqwest.
2024-01-05 11:40:12 -05:00