940 Commits

Author SHA1 Message Date
konsti 765e3175e1
Make windows compile (#1035)
Minimal changes to make `cargo check`/`cargo run` work to unblock the
remaining PR stacking
2024-01-22 13:11:20 +00:00
Charlie Marsh b9bee013ce
Use full Python version for installed version (#1033)
## Summary

`interpreter.version()` returns the `python_full_version`, but the
marker variant uses `python_version` instead of `python_full_version` --
so it's omitting the patch.
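
For illustration, a minimal sketch of the difference between the two marker values. The helper name `python_version` is hypothetical and not part of Puffin's API; it only shows that `python_version` is the major.minor prefix of `python_full_version`:

```rust
/// Hypothetical helper: derive the `python_version` marker value
/// (major.minor) from a `python_full_version` string such as "3.12.1".
fn python_version(full_version: &str) -> String {
    full_version
        .split('.')
        .take(2) // keep only the major and minor components
        .collect::<Vec<_>>()
        .join(".")
}
```

So a marker evaluated against `python_version` cannot distinguish between patch releases of the same minor version.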
2024-01-22 00:44:39 -06:00
Zanie Blue 6202c9e1b5
Use current and requested Python versions in `requires-python` incompatibility errors (#986)
Closes https://github.com/astral-sh/puffin/issues/806
2024-01-22 00:32:02 -06:00
Charlie Marsh 23f73592b1
Add test to avoid invalidating virtualenv (#1031)
## Summary

I think if we used symlinks (instead of hardlinks), this test would fail
-- so it's worth including.
2024-01-21 19:53:58 -05:00
Charlie Marsh 540442b8de
Treat missing package name error as an unsupported requirement (#1025)
## Summary

Based on user feedback. Calling it a "parse error" is misleading, since
this is really something we don't support, but that users can work
around.
2024-01-21 19:53:10 -05:00
Zanie Blue 4026710189
Add scenario tests for `pip-compile` (#1011)
e.g. for scenarios that test resolution _without_ installation.

This refactors the `update` script to generate scenario test files for
`pip compile` _and_ `pip install`. We don't overlap scenarios to save
time. We only generate `pip compile` test cases for scenarios we cannot
represent with `pip install` e.g. a `--python-version` override.

The _one_ scenario I added happened to reveal a bug in our resolver
where we were incorrectly filtering versions by the installed version
when wheels were available. Per the comment at
https://github.com/astral-sh/puffin/issues/883#issuecomment-1890773112,
we should _only_ need to check for a compatible installed Python version
when using a different _target_ Python version if we need to build a
source distribution.
53bce68400
resolves this by removing the excessive constraints — the correct Python
version incompatibilities are applied elsewhere.
2024-01-21 17:47:42 -06:00
Charlie Marsh d9cc9dbf88
Improve error message when editable requirement doesn't exist (#1024)
Making these a lot clearer in the common case by reducing the depth of
the error.
2024-01-20 12:59:18 -05:00
Charlie Marsh 69d2791a43
Remove URL clone in requirements-txt parser (#1020) 2024-01-19 17:30:17 -05:00
Charlie Marsh b3954f2449
Enable PowerPC builds (#1017)
Closes #1015.
2024-01-19 17:29:11 -05:00
Charlie Marsh 459c2abc81
Avoid canonicalizing paths in `requirements-txt` (#1019)
## Summary

When you specify an editable that doesn't exist, it should error, but
not in the parser -- the error should be downstream.
2024-01-19 16:28:04 -05:00
Charlie Marsh d55e34c310
Make editable URL parsing more robust (#1018)
This just generalizes the parsing to handle arbitrary schemes instead of
encoding a fixed list.
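
A rough sketch of what scheme-agnostic parsing could look like, assuming the scheme grammar from RFC 3986. The function name `split_scheme` is made up for this example and is not the parser's actual code:

```rust
/// Split `input` into `(scheme, rest)` if it starts with a URL scheme as
/// defined by RFC 3986: ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ) ":".
/// Illustrative only; not Puffin's actual parser.
fn split_scheme(input: &str) -> Option<(&str, &str)> {
    let (scheme, rest) = input.split_once(':')?;
    let mut chars = scheme.chars();
    // The first character must be a letter.
    let first = chars.next()?;
    if !first.is_ascii_alphabetic() {
        return None;
    }
    // The remaining characters may be letters, digits, '+', '-' or '.'.
    if chars.all(|c| c.is_ascii_alphanumeric() || matches!(c, '+' | '-' | '.')) {
        Some((scheme, rest))
    } else {
        None
    }
}
```

Note that a Windows drive letter such as `C:` would also match this grammar, so a real parser needs an extra check for that case.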
2024-01-19 16:01:33 -05:00
Charlie Marsh c66395977d
Rename `pep440-rs` to `Readme.md` (#1014)
This is due to a bug in Maturin
(https://github.com/PyO3/maturin/pull/1915), so I'll just fix our setup
to work with existing versions.

Closes https://github.com/astral-sh/puffin/issues/991.
2024-01-19 15:16:12 -05:00
Zanie Blue 33b35f7020
Add support for disabling installation from pre-built wheels (#956)
Adds support for disabling installation from pre-built wheels i.e. the
package must be built from source locally.
We will still always use pre-built wheels for metadata during
resolution.

Available via `--no-binary` and `--no-binary-package <name>` flags in
`pip install` and `pip sync`. There is no flag for `pip compile` since
no installation happens there.

```
--no-binary

    Don't install pre-built wheels.
    
    When enabled, all installed packages will be installed from a source distribution. 
    The resolver will still use pre-built wheels for metadata.


--no-binary-package <NO_BINARY_PACKAGE>

    Don't install pre-built wheels for a specific package.
    
    When enabled, the specified packages will be installed from a source distribution. 
    The resolver will still use pre-built wheels for metadata.
```

When packages are already installed, the `--no-binary` flag will have no
effect without the `--reinstall` flag. In the future, I'd like to change
this by tracking if a local distribution is from a pre-built wheel or a
locally-built wheel. However, this is significantly more complex and
different than `pip`'s behavior so deferring for now.

For reference, `pip`'s flag works as follows:

```
--no-binary <format_control>

    Do not use binary packages. Can be supplied multiple times, and each time adds to the
    existing value. Accepts either ":all:" to disable all binary packages, ":none:" to empty the
    set (notice the colons), or one or more package names with commas between them (no colons).
    Note that some packages are tricky to compile and may fail to install when this option is
    used on them.
```

Note we are not matching the exact `pip` interface here because it seems
complicated to use. I think we may want to consider adjusting our
interface for this behavior since we're not entirely compatible anyway
e.g. I think `--force-build` and `--force-build-package` are clearer
names. We could also consider matching the `pip` interface or only
allowing `--no-binary <package>` for compatibility. We can of course do
whatever we want in our _own_ install interfaces later.

Additionally, we may want to further consider the semantics of
`--no-binary`. For example, if I run `pip install pydantic --no-binary`
I expect _just_ Pydantic to be installed without binaries but by default
we will build all of Pydantic's dependencies too.

This work was prompted by #895, as it is much easier to measure
performance gains from building source distributions if we have a flag
to ensure we actually build source distributions. Additionally, this is
a flag I have used frequently in production to debug packages that ship
Cythonized wheels.
2024-01-19 11:24:27 -06:00
Zanie Blue 8b49d900bd
Refer to the user instead of "root" when mentioning direct dependencies (#982)
Closes https://github.com/astral-sh/puffin/issues/857
2024-01-19 11:17:42 -06:00
Zanie Blue ae7a2cddc2
Avoid showing negations of ranges in error messages (#981)
Closes https://github.com/astral-sh/puffin/issues/980
2024-01-19 11:07:14 -06:00
Zanie Blue 02ed195982
Improve simple no version messages using complement of range (#979)
Improves some of the "no versions of <package> are available" messages
by showing the complement or inversion of the package.

Does not address cases like

```
Because there are no versions of crow that satisfy any of:
    crow>1.0.0,<2.0.0a5
    crow>2.0.0a7,<2.0.0b1
    crow>2.0.0b1,<2.0.0b5
...
```

which are a bit more complicated; I'll focus on those cases in a
follow-up.
2024-01-19 16:48:20 +00:00
Zanie Blue 7bb4fda8af
Say "depend on" instead of "depends on" when proper in error messages (#968)
I would like to spend some additional time working on the package range
display abstractions, but maybe that is best done _after_ I've done a
good bit of fiddling with the error messages.

Addresses
https://github.com/astral-sh/puffin/pull/868#discussion_r1447593081
2024-01-19 16:08:17 +00:00
Zanie Blue 5fe3444e5a
Use more realistic names in scenario snapshots (#978)
This is helpful to make the error messages more realistic and the names
are indisputably cuter.
2024-01-19 10:01:34 -06:00
Charlie Marsh 5adb08a304
Allow relative paths and environment variables in all editable representations (#1000)
## Summary

I don't know if this is actually a good change, but it tries to make the
editable install experience more consistent. Specifically, we now
support...

```
# Use a relative path with a `file://` prefix.
# Prior to this PR, we supported `file:../foo`, but not `file://../foo`, which felt inconsistent.
-e file://../foo

# Use environment variables with paths, not just URLs.
# Prior to this PR, we supported `file://${PROJECT_ROOT}/../foo`, but not the below.
-e ${PROJECT_ROOT}/../foo
```

Importantly, `-e file://../foo` is actually not supported by pip... `-e
file:../foo` _is_ supported though. We support both, as of this PR. Open
to feedback.
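
As a sketch of the `${VAR}` expansion described above (not the actual implementation): the function name `expand_env_vars` is hypothetical, variables come from a map rather than the process environment so the example is deterministic, and leaving unknown variables untouched is an assumption:

```rust
use std::collections::HashMap;

/// Expand `${NAME}` references in `input` using `vars`.
/// Unknown variables and unterminated references are left untouched.
fn expand_env_vars(input: &str, vars: &HashMap<String, String>) -> String {
    let mut output = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(start) = rest.find("${") {
        // Copy everything before the reference verbatim.
        output.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find('}') {
            Some(end) => {
                let name = &after[..end];
                match vars.get(name) {
                    Some(value) => output.push_str(value),
                    // Keep the reference verbatim if the variable is unset.
                    None => output.push_str(&rest[start..start + 2 + end + 1]),
                }
                rest = &after[end + 1..];
            }
            None => {
                // No closing brace: copy the remainder as-is.
                output.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    output.push_str(rest);
    output
}
```

With `PROJECT_ROOT=/work/project`, this turns `${PROJECT_ROOT}/../foo` into `/work/project/../foo`, which can then be handled like any other path.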
2024-01-19 09:00:37 -05:00
konsti cd2fb6fd60
Box `PrioritizedDistribution` (#948)
On top of https://github.com/astral-sh/puffin/pull/947, we can also box
`PrioritizedDistribution`.

In a simple benchmark, this seems to slightly improve performance when
comparing only this commit to main, even though the benchmark is too
noisy to establish significance:

```
$ hyperfine --warmup 30 --runs 300 "target/profiling/main-dev resolve meine_stadt_transparent" "target/profiling/puffin-dev resolve meine_stadt_transparent"
  Benchmark 1: target/profiling/main-dev resolve meine_stadt_transparent
    Time (mean ± σ):      83.6 ms ±   2.0 ms    [User: 77.7 ms, System: 20.0 ms]
    Range (min … max):    81.4 ms …  98.2 ms    300 runs

    Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

  Benchmark 2: target/profiling/puffin-dev resolve meine_stadt_transparent
    Time (mean ± σ):      80.8 ms ±   2.2 ms    [User: 75.4 ms, System: 19.5 ms]
    Range (min … max):    78.6 ms …  98.6 ms    300 runs

    Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

  Summary
    target/profiling/puffin-dev resolve meine_stadt_transparent ran
      1.03 ± 0.04 times faster than target/profiling/main-dev resolve meine_stadt_transparent
```

The effect on type sizes however is considerable ([downstack
PR](https://gist.github.com/konstin/38e6c774db541db46d61f1d4ea6b498f)
vs. [this
PR](https://gist.github.com/konstin/003a77fe7d7d246b0d535e3fc843cb36)):

```patch
--- branch.txt  2024-01-17 14:26:01.826085176 +0100
+++ boxed-prioritized-dist.txt  2024-01-17 14:25:57.101900963 +0100
@@ -1,19 +1,3 @@
-9264 alloc::collections::btree::node::InternalNode<pep440_rs::version::Version, distribution_types::PrioritizedDistribution> align=8
-   9168 data
-     96 edges
-
-9264 alloc::collections::btree::node::InternalNode<pep440_rs::Version, distribution_types::PrioritizedDistribution> align=8
-   9168 data
-     96 edges
-
-9168 alloc::collections::btree::node::LeafNode<pep440_rs::version::Version, distribution_types::PrioritizedDistribution> align=8
-   9064 vals
-     88 keys
-
-9168 alloc::collections::btree::node::LeafNode<pep440_rs::Version, distribution_types::PrioritizedDistribution> align=8
-   9064 vals
-     88 keys
-
 8992 tokio::sync::mpsc::block::Block<hyper::client::dispatch::Envelope<http::request::Request<reqwest::async_impl::body::ImplStream>, http::response::Response<hyper::body::body::Body>>> align=8
    8960 values
      32 header
@@ -74,10 +58,23 @@
          40 __tracing_attr_span
      64 variant Unresumed, Returned, Panicked

+5648 {async fn body@crates/puffin-client/src/registry_client.rs:224:5: 224:30} align=8
+   5647 variant Suspend0
+       5576 __awaitee align=8
+         40 __tracing_attr_span
```
2024-01-19 10:44:41 +01:00
konsti 47fc90d1b3
Reduce stack usage by boxing `File` in `Dist`, `CachePolicy` and large futures (#1004)
This is https://github.com/astral-sh/puffin/pull/947 again but this time
merging into main instead of downstack, sorry for the noise.

---

Windows has a default stack size of 1MB, which makes puffin often fail
with stack overflows. The PR reduces stack size by three changes:

* Boxing `File` in `Dist`, reducing the size from 496 to 240.
* Boxing the largest futures.
* Boxing `CachePolicy`
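
To illustrate why boxing helps, a small self-contained sketch. The types `Large`, `Inline` and `Boxed` are stand-ins invented for this example (the 256-byte payload is arbitrary), not the real `File` or `Dist`:

```rust
use std::mem::size_of;

/// Stand-in for a large value such as `File`; the payload size is arbitrary.
#[allow(dead_code)]
struct Large {
    payload: [u8; 256],
}

/// Stores the large value inline, so every `Inline` is at least 256 bytes,
/// wherever it lives (including on the stack and inside futures).
#[allow(dead_code)]
enum Inline {
    Built(Large),
    Source(Large),
}

/// Stores the large value behind a heap pointer, so the enum itself stays small.
#[allow(dead_code)]
enum Boxed {
    Built(Box<Large>),
    Source(Box<Large>),
}

/// Returns the sizes of the inline and boxed variants, in bytes.
fn sizes() -> (usize, usize) {
    (size_of::<Inline>(), size_of::<Boxed>())
}
```

The trade-off is one heap allocation and one pointer indirection per boxed value.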

## Method

Debugging happened on linux using
https://github.com/astral-sh/puffin/pull/941 to limit the stack size to
1MB. I ran the command below.

```
RUSTFLAGS=-Zprint-type-sizes cargo +nightly build -p puffin-cli -j 1 > type-sizes.txt && top-type-sizes -w -s -h 10 < type-sizes.txt > sizes.txt
```

The main drawback is that top-type-sizes doesn't say what the `__awaitee` is,
so it requires manually looking up a future with a matching size.

When the `brotli` feature on `reqwest` is active, a lot of brotli types
show up. Toggling this feature however seems to have no effect. I assume
they are false positives since the `brotli` crate has elaborate control
over allocation. The sizes are therefore shown with the feature off.

## Results

The largest future goes from 12208B to 6416B, the largest type
(`PrioritizedDistribution`, see also #948) from 17448B to 9264B. Full
diff: https://gist.github.com/konstin/62635c0d12110a616a1b2bfcde21304f

For the second commit, I iteratively boxed the largest file until the
tests passed, then with an 800KB stack limit looked through the
backtrace of a failing test and added some more boxing.

Quick benchmarking showed no difference:

```console
$ hyperfine --warmup 2 "target/profiling/main-dev resolve meine_stadt_transparent" "target/profiling/puffin-dev resolve meine_stadt_transparent" 
Benchmark 1: target/profiling/main-dev resolve meine_stadt_transparent
  Time (mean ± σ):      49.2 ms ±   3.0 ms    [User: 39.8 ms, System: 24.0 ms]
  Range (min … max):    46.6 ms …  63.0 ms    55 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Benchmark 2: target/profiling/puffin-dev resolve meine_stadt_transparent
  Time (mean ± σ):      47.4 ms ±   3.2 ms    [User: 41.3 ms, System: 20.6 ms]
  Range (min … max):    44.6 ms …  60.5 ms    62 runs
 
  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
 
Summary
  target/profiling/puffin-dev resolve meine_stadt_transparent ran
    1.04 ± 0.09 times faster than target/profiling/main-dev resolve meine_stadt_transparent
```
2024-01-19 09:38:36 +00:00
konsti 66e651901e
Add an env var to artificially limit the stack size (#941)
By default, Windows has a stack size limit of 1MB, which we run up against
in debug builds without any explicit culprit. A new environment variable
`PUFFIN_STACK_SIZE` allows setting an artificially smaller stack size.
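
One way such a limit could be applied, as a sketch only: run the work on a thread created with an explicit stack size. The helper names are hypothetical, and interpreting the value as a byte count is an assumption, not something stated here:

```rust
use std::thread;

/// Parse a stack size from the value of an environment variable.
/// Assumes the value is a plain byte count; returns `None` if the
/// variable is unset or not a valid number.
fn parse_stack_size(value: Option<&str>) -> Option<usize> {
    value?.trim().parse().ok()
}

/// Run `f` on a new thread, using `stack_size` bytes of stack if given,
/// or Rust's default for spawned threads otherwise.
fn run_with_stack_size<T, F>(stack_size: Option<usize>, f: F) -> T
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let mut builder = thread::Builder::new();
    if let Some(size) = stack_size {
        builder = builder.stack_size(size);
    }
    builder
        .spawn(f)
        .expect("failed to spawn thread")
        .join()
        .expect("thread panicked")
}
```

The size would then come from something like `parse_stack_size(std::env::var("PUFFIN_STACK_SIZE").ok().as_deref())`.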
2024-01-19 09:34:46 +00:00
Charlie Marsh 69c72b6fa1
Validate wheel metadata against filename (#1002)
Closes #983.
2024-01-19 05:48:55 +00:00
Charlie Marsh f86d9b1c31
Add tests for missing file errors (#1001) 2024-01-19 05:47:25 +00:00
Charlie Marsh c8285cb5ef
Bump version to v0.0.3 (#999) 2024-01-18 23:39:35 -05:00
Charlie Marsh 9b24fcd306
Remove verbatim URL from path file location (#998)
## Summary

I got confused by why `VerbatimUrl` was on `Path`. Since it's directly
computed from it, I think we should just compute it as-needed. I think
it's also possibly-buggy because the URL is the URL of the _directory_,
not the artifact itself, which differs from other distributions.
2024-01-18 22:40:48 -05:00
Charlie Marsh 732ef7adb7
Bump version to v0.0.2 (#987)
Bumping the version so that I can test the release process again
(including PyPI publish).
2024-01-18 20:56:09 -05:00
Charlie Marsh fe180804b5
Avoid encoding current version in test output (#988) 2024-01-19 01:50:23 +00:00
Charlie Marsh 3a1cd44fc6
Add Puffin Docker image (#985)
Missing piece for the release.

## Test Plan

Built the image locally:

```shell
❯ docker run 99956098e1f8f04e209dcfc4a0afcee67df1fe8a726c164884e67f035b1a0f42
Usage: puffin [OPTIONS] <COMMAND>

Commands:
  pip    Resolve and install Python packages
  venv   Create a virtual environment
  clean  Clear the cache
  help   Print this message or the help of the given subcommand(s)

Options:
  -q, --quiet                  Do not print any output
  -v, --verbose                Use verbose output
  -n, --no-cache               Avoid reading from or writing to the cache
      --cache-dir <CACHE_DIR>  Path to the cache directory [env: PUFFIN_CACHE_DIR=]
  -h, --help                   Print help
  -V, --version                Print version
```
2024-01-18 20:21:31 -05:00
Charlie Marsh 5e2b715366
Rename `puffin-cli` crate to `puffin` (#976)
## Summary

Like in Ruff, this simplifies a few things.
2024-01-18 19:02:52 -05:00
Charlie Marsh 6cad0f609c
Mark `puffin-dev` as `publish = false` (#975) 2024-01-18 17:20:44 -05:00
Charlie Marsh 8eadca4f8d
Remove unused path method (#974) 2024-01-18 21:59:12 +00:00
Charlie Marsh a262936366
Allow file:-relative paths in editable installs (#970)
Supports editable install via (e.g.) `puffin pip install -e file:.`,
which pip seems to support.

Closes #964.
2024-01-18 21:15:42 +00:00
Charlie Marsh f9154e8297
Add release workflow (#961)
## Summary

This PR adds a release workflow powered by `cargo-dist`. It's similar to
the version that's PR'd in Ruff
(https://github.com/astral-sh/ruff/pull/9559), with the exception that
it doesn't include the Docker build or the "update dependents" step for
pre-commit.
2024-01-18 15:44:11 -05:00
Charlie Marsh a883de4fb0
Enforce modification freshness checks against virtual environment (#959)
## Summary

This PR is like #957, but for validating the virtual environment, rather
than the cache. So, if you have a local wheel, and you rebuild it, we'll
now correctly uninstall and reinstall it in the virtual environment.
2024-01-18 20:21:16 +00:00
Charlie Marsh 96a61fb351
Remove RFC2047 decoder (#967)
## Summary

- This was inherited from
d719988323/src/metadata.rs (LL78C2-L91C26)
- ...which introduced this code here:
9cd1d43f7c
- ...with the originating issue here:
https://github.com/PyO3/maturin/issues/612
- ...and the upstream issue here:
https://github.com/staktrace/mailparse/issues/50

It seems like the goal was to support Unicode in certain header fields,
but I don't think this is necessary for us. We only use
`get_first_value` for `Requires-Python`, which has to be ASCII, doesn't
it?

In my testing, it seems like the `charset` hack can also be removed. The
tests I copied over actually work without it, which makes me a bit
skeptical.

The main benefit here is that we get to remove a _big_ dependency
stack, including Chumsky and Stacker and psm which have limited
cross-platform support.
2024-01-18 15:09:45 -05:00
Charlie Marsh f17bad0a75
Mark path-based cache entries as stale during install plan (#957)
## Summary

This is a small correctness improvement that ensures that we avoid using
stale cache entries for local dependencies in the install plan. We
already have some logic like this in the source distribution builder,
but it didn't apply in the install plan, and so we'd end up using stale
wheels.

Specifically, now, if you create a new local wheel, and run `pip sync`,
we'll mark the cache entries as stale and make sure we unzip it and
install it. (If the wheel is _already_ installed, we won't reinstall it
though, which will be a separate change. This is just about reading from
the cache, not the environment.)
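
The freshness check boils down to a timestamp comparison. A minimal sketch, assuming the cache entry records when it was written; the function names are hypothetical and this is not the actual cache code:

```rust
use std::{fs, io, path::Path, time::SystemTime};

/// Returns `true` if the source was modified after the cache entry was
/// written, i.e. the cached copy can no longer be trusted.
fn is_stale(cached_at: SystemTime, source_modified: SystemTime) -> bool {
    source_modified > cached_at
}

/// Compare a local file's modification time against a cache timestamp.
fn is_path_stale(path: &Path, cached_at: SystemTime) -> io::Result<bool> {
    let modified = fs::metadata(path)?.modified()?;
    Ok(is_stale(cached_at, modified))
}
```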
2024-01-18 19:13:29 +00:00
konsti a11744e438
Normalize base python in venv creation (#966)
Fixes #965

We have to canonicalize the interpreter path, otherwise the home is set
to the venv dir instead of the real root. This would make
python-build-standalone fail with the encodings module not being found
because its home is wrong.
2024-01-18 15:32:30 +00:00
konsti 7acde5a9a0
Fix `pep508_rs` doc test (#963)
Since nextest does not run doctests, this did not show up on CI.
2024-01-18 14:24:30 +00:00
konsti 5ec5a3243c
Set miette hook in all of puffin-cli (#962)
Fixes #938
2024-01-18 08:37:26 -05:00
Charlie Marsh 8ae8ddc7d9
Fix 3-to-2 reference in pip sync test (#958) 2024-01-18 04:33:46 +00:00
Charlie Marsh fbe70f4218
Split install plan into builder and struct (#955)
The `InstallPlan` does a lot of work in the constructor, which I tend to
feel is an anti-pattern. With cache refresh, it's also going to need to
be made `async`, so it really feels like it should be a clearer method
rather than an async, fallible constructor that does a bunch of IO. This
PR splits into a `Planner` (with a `build` method) and a `Plan`.
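
The shape of that split, as a simplified sketch. Only the names `Planner`, `Plan` and `build` come from this PR; the fields and the logic are invented for illustration, and `build` is synchronous here for brevity:

```rust
/// The finished plan: plain data, no IO. Field names are illustrative.
#[derive(Debug, PartialEq)]
struct Plan {
    to_install: Vec<String>,
    already_installed: Vec<String>,
}

/// Collects the inputs. The fallible work happens in `build`, not in a constructor.
struct Planner<'a> {
    requirements: &'a [&'a str],
    installed: &'a [&'a str],
}

impl<'a> Planner<'a> {
    /// Cheap and infallible: only stores references to the inputs.
    fn new(requirements: &'a [&'a str], installed: &'a [&'a str]) -> Self {
        Self { requirements, installed }
    }

    /// Partition requirements into those to install and those already present.
    /// Returns an error for an empty requirement name.
    fn build(self) -> Result<Plan, String> {
        let mut plan = Plan { to_install: vec![], already_installed: vec![] };
        for &name in self.requirements {
            if name.is_empty() {
                return Err("empty requirement name".to_string());
            }
            if self.installed.contains(&name) {
                plan.already_installed.push(name.to_string());
            } else {
                plan.to_install.push(name.to_string());
            }
        }
        Ok(plan)
    }
}
```

Call sites then read `Planner::new(..).build()?`, which makes it visible where the fallible work happens.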
2024-01-17 15:28:46 -05:00
Charlie Marsh 055fd64eb1
Add an `--update-package` setting to allow individual package upgrades (#953)
Closes #950.
2024-01-17 14:31:52 -05:00
Zanie Blue a4204d00c1
Bump to latest packse version with "extras" scenarios (#935)
Includes:

- https://github.com/zanieb/packse/pull/83 (replaces some of the
post-processing here)
- https://github.com/zanieb/packse/pull/82
- https://github.com/zanieb/packse/pull/81
2024-01-17 13:25:48 -06:00
Charlie Marsh a0420114c3
Avoid storing absolute URLs for files (#944)
## Summary

It turns out that storing an absolute URL for every file caused a
significant performance regression. This PR attempts to address the
regression with two changes.

The first is that we now store the raw string if the URL is an absolute
URL. If the URL is relative, we store the base URL alongside the raw
relative string. As such, we avoid serializing and deserializing URLs
until we need them (later on), except for the base URL.
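
Roughly, the stored representation looks like the sketch below. The type name `FileLocation`, the `://` check and the naive string join are assumptions made for this example, not the actual code:

```rust
/// How a file's location is stored in the cache; URL parsing is deferred.
/// The type and variant names here are illustrative.
#[derive(Debug, Clone, PartialEq)]
enum FileLocation {
    /// An absolute URL, kept as the raw string from the index.
    Absolute(String),
    /// A relative reference, kept alongside the base URL it was found under.
    Relative { base: String, path: String },
}

impl FileLocation {
    /// Classify a raw `href` from an index page served at `base`.
    /// Uses a simple "contains ://" heuristic instead of a real URL parser.
    fn new(base: &str, href: &str) -> Self {
        if href.contains("://") {
            Self::Absolute(href.to_string())
        } else {
            Self::Relative { base: base.to_string(), path: href.to_string() }
        }
    }

    /// Produce the full URL string only when it is needed.
    /// Naive join that assumes `path` has no leading slash; real code
    /// should resolve the reference with a URL library.
    fn to_url(&self) -> String {
        match self {
            Self::Absolute(url) => url.clone(),
            Self::Relative { base, path } => {
                format!("{}/{}", base.trim_end_matches('/'), path)
            }
        }
    }
}
```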

The second is that we now use the internal `Url` crate methods for
serializing and deserializing. If you look inside `Url`, its standard
serializer and deserialization actually convert it to a string, then
parse the string. But the crate exposes some other methods for faster
serialization and deserialization (with fewer guarantees). I think this
is totally fine since the cache is entirely internal.

If we _just_ change the `Url` serialization (and no other code -- so
continue to store URLs for every file), then the regression goes down to
about 5%:

```shell
❯ python -m scripts.bench \
        --puffin-path ./target/release/main \
        --puffin-path ./target/release/relative --puffin-path ./target/release/puffin \
        scripts/requirements/home-assistant.in --benchmark resolve-warm
Benchmark 1: ./target/release/main (resolve-warm)
  Time (mean ± σ):     496.3 ms ±   4.3 ms    [User: 452.4 ms, System: 175.5 ms]
  Range (min … max):   487.3 ms … 502.4 ms    10 runs

Benchmark 2: ./target/release/relative (resolve-warm)
  Time (mean ± σ):     284.8 ms ±   2.1 ms    [User: 245.8 ms, System: 165.6 ms]
  Range (min … max):   280.3 ms … 288.0 ms    10 runs

Benchmark 3: ./target/release/puffin (resolve-warm)
  Time (mean ± σ):     300.4 ms ±   3.2 ms    [User: 255.5 ms, System: 178.1 ms]
  Range (min … max):   295.4 ms … 305.1 ms    10 runs

Summary
  './target/release/relative (resolve-warm)' ran
    1.05 ± 0.01 times faster than './target/release/puffin (resolve-warm)'
    1.74 ± 0.02 times faster than './target/release/main (resolve-warm)'
```

So I considered _just_ making that change. But 5% is kind of
borderline...

With both of these changes, the regression is down to 1-2%:

```
Benchmark 1: ./target/release/relative (resolve-warm)
  Time (mean ± σ):     282.6 ms ±   7.4 ms    [User: 244.6 ms, System: 181.3 ms]
  Range (min … max):   275.1 ms … 318.5 ms    30 runs

Benchmark 2: ./target/release/puffin (resolve-warm)
  Time (mean ± σ):     286.8 ms ±   2.2 ms    [User: 247.0 ms, System: 169.1 ms]
  Range (min … max):   282.3 ms … 290.7 ms    30 runs

Summary
  './target/release/relative (resolve-warm)' ran
    1.01 ± 0.03 times faster than './target/release/puffin (resolve-warm)'
```

It's consistently ~2%-ish, but at this point it's unclear if that's due
to the URL change or some other change between now and then.

Closes #943.
2024-01-17 09:15:21 -05:00
Charlie Marsh b8fbd529a1
Move `OnceMap` into its own crate (#946)
## Summary

This is extremely generic (like `WaitMap`), and I want to use it in the
cache.
2024-01-17 04:09:15 +00:00
konsti 5051b2c004
Use tempfile to prevent install io race crashes (#929)
On ubuntu and python 3.10,

```
cargo run -q -- pip-install --find-links https://storage.googleapis.com/jax-releases/jax_cuda_releases.html "jax[cuda12_pip]==0.4.23"
```

non-deterministically but for me consistently fails to install with
messages such as

```
error: Failed to install: nvidia_nccl_cu12-2.19.3-py3-none-manylinux1_x86_64.whl (nvidia-nccl-cu12==2.19.3)
  Caused by: failed to remove file `/home/konsti/projects/puffin/.venv/lib/python3.10/site-packages/nvidia/__init__.py`
  Caused by: No such file or directory (os error 2)
```

```
error: Failed to install: nvidia_cublas_cu12-12.3.4.1-py3-none-manylinux1_x86_64.whl (nvidia-cublas-cu12==12.3.4.1)
  Caused by: Replacing an existing file or directory failed
```

```
error: Failed to install: nvidia_cuda_nvcc_cu12-12.3.107-py3-none-manylinux1_x86_64.whl (nvidia-cuda-nvcc-cu12==12.3.107)
  Caused by: failed to hardlink file from /home/konsti/.cache/puffin/wheels-v0/pypi/nvidia-cuda-nvcc-cu12/nvidia_cuda_nvcc_cu12-12.3.107-py3-none-manylinux1_x86_64/nvidia/__init__.py to /home/konsti/projects/puffin/.venv/lib/python3.10/site-packages/nvidia/__init__.py
  Caused by: File exists (os error 17)
```

We install a lot of nvidia packages that all contain
`nvidia/__init__.py`, since they all install themselves into the
`nvidia` module:

```
nvidia-cublas-cu12==12.3.4.1
nvidia-cuda-cupti-cu12==12.3.101
nvidia-cuda-nvcc-cu12==12.3.107
nvidia-cuda-nvrtc-cu12==12.3.107
nvidia-cuda-runtime-cu12==12.3.101
nvidia-cudnn-cu12==8.9.7.29
nvidia-cufft-cu12==11.0.12.1
nvidia-cusolver-cu12==11.5.4.101
nvidia-cusparse-cu12==12.2.0.103
nvidia-nccl-cu12==2.19.3
nvidia-nvjitlink-cu12==12.3.101
```

```
$  tree -L 1 .venv/lib/python3.10/site-packages/nvidia
.venv/lib/python3.10/site-packages/nvidia
├── cublas
├── cuda_cupti
├── cuda_nvcc
├── cuda_nvrtc
├── cuda_runtime
├── cudnn
├── cufft
├── cusolver
├── cusparse
├── __init__.py
├── nccl
└── nvjitlink
```

When installing, we get a race condition, since each package installation
runs in its own thread:
* Installer Thread 1 creates `nvidia/__init__.py`
* Installer Thread 2 sees an existing  `nvidia/__init__.py`
* Installer Thread 3 sees an existing  `nvidia/__init__.py`
* Installer Thread 2 removes `nvidia/__init__.py`
* Installer Thread 3 tries to remove `nvidia/__init__.py`, it doesn't
exist anymore -> failure.

We switch to a new strategy: When the target file exists, we don't
remove it, but instead hardlink the source file to a tempfile first,
then rename the tempfile to the target file. Renaming is considered an
atomic operation.
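
A minimal sketch of this link-then-rename strategy using only the standard library. The function name and the temporary file naming are hypothetical; a real implementation would use a unique tempfile per call (e.g. from the `tempfile` crate):

```rust
use std::{fs, io, path::Path, process};

/// Hardlink `src` to `dst`, replacing `dst` if it already exists.
/// Instead of removing `dst` first, link to a temporary name in the same
/// directory and rename it over `dst`.
fn hardlink_replace(src: &Path, dst: &Path) -> io::Result<()> {
    match fs::hard_link(src, dst) {
        Ok(()) => Ok(()),
        Err(err) if err.kind() == io::ErrorKind::AlreadyExists => {
            let parent = dst.parent().unwrap_or_else(|| Path::new("."));
            // Hypothetical naming scheme, not unique across threads.
            let tmp = parent.join(format!(".tmp-link-{}", process::id()));
            fs::hard_link(src, &tmp)?;
            let result = fs::rename(&tmp, dst);
            // Clean up if the rename failed, or was a no-op because both
            // names already referred to the same file.
            let _ = fs::remove_file(&tmp);
            result
        }
        Err(err) => Err(err),
    }
}
```

Because the target is never removed, a concurrent installer never observes a missing file between the "exists" check and the replacement.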

I've put the logging on debug level because these cases indicate a
conflict between two packages while being rare.

Closes #925

---------

Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2024-01-16 21:07:39 +00:00
Charlie Marsh b50e5fcbc5
Fetch `--find-links` indexes in parallel (#934)
## Summary

Removes a TODO.

## Test Plan

Tested manually with:

```shell
cargo run -p puffin-cli -- \
    pip compile requirements.in -n \
    --find-links 'https://download.pytorch.org/whl/torch_stable.html' \
    --find-links 'https://storage.googleapis.com/jax-releases/jax_cuda_releases.html' \
    --verbose
```

And inspecting the logs to ensure that the two requests were kicked off
concurrently.
2024-01-16 11:37:35 +01:00
Charlie Marsh 2f8f126f2f
Share a single `Index` across resolutions (#906)
## Summary

This PR uses a single `Index` that's shared between the top-level
resolver and any sub-resolutions that happen in the course of that top-level
resolution (namely, to resolve build dependencies for any source
distributions).

In theory it's an optimization, since (e.g.) if we have two packages
that both need the `flit-core` build system, and we attempt to build
them both at once, we'll only fetch its metadata _once_, and share it
across the two resolutions. In practice, I haven't been able to get this
to show up in benchmarks. I suspect you'd need a _lot_ of source
distributions for it to matter... Though it may still be worth doing, it
strikes me as a cleaner design.
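
The idea, reduced to a sketch: the type below is a simplified stand-in for the shared `Index`, with an invented `get_or_fetch` method. It stores plain strings and holds a lock while fetching, which the real async code would not do:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// A metadata cache that can be cloned and shared between resolutions.
/// All clones point at the same underlying map.
#[derive(Clone, Default)]
struct SharedIndex {
    metadata: Arc<Mutex<HashMap<String, String>>>,
}

impl SharedIndex {
    /// Return the cached metadata for `package`, calling `fetch` only on a miss.
    fn get_or_fetch(&self, package: &str, fetch: impl FnOnce() -> String) -> String {
        let mut metadata = self.metadata.lock().unwrap();
        metadata
            .entry(package.to_string())
            .or_insert_with(fetch)
            .clone()
    }
}
```

A sub-resolution that receives a clone of the index sees entries populated by the top-level resolver, and vice versa.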

Closes #200.

Closes #541.
2024-01-16 05:37:15 +00:00
Charlie Marsh 0f592b67bb
Remove clone from `RegistryWheelIndex` (#937)
Doesn't need to own the package names.
2024-01-15 16:18:12 -05:00
Charlie Marsh 2a69b273ce
Use a standalone error type for `--find-links` registry (#936) 2024-01-15 19:48:48 +00:00
Charlie Marsh e71e3e8dd1
Refresh `BuildDispatch` when running pip install with `--reinstall` (#933)
## Summary

This fixes an extremely subtle bug in `pip install --reinstall`, whereby
if you depend on `setuptools` at the top level, we end up uninstalling
it after resolving, which breaks some cached state. If we have
`--reinstall`, we need to reset that cached state between resolving and
installing.

## Test Plan

Running `pip install --reinstall` with:

```txt
setuptools
devpi @ e334eb4dc9bb023329e4b610e4515b/devpi-2.2.0.tar.gz
```

Fails on `main`, but passes here.
2024-01-15 18:56:18 +00:00
Charlie Marsh 116da6b7de
Share in-flight map across resolutions (#932)
## Summary

This PR fixes a subtle bug in `pip install` when using `--reinstall`. If
a package depends on a build system directly (e.g., `waitress` depends
on `setuptools`), and then you have other packages that also need the
build system to build a source distribution, right now, we don't share
the `OnceMap` between those cases.

This lifts the `InFlight` tracking up a level, so that it's initialized
once per command, then shared everywhere.

## Test Plan

I'm having trouble coming up with an identical test-case and hesitant to
add this slow test to the suite... But if you run `pip install
--reinstall` with:

```
waitress @ git+https://github.com/zanieb/waitress
devpi-server @ git+https://github.com/zanieb/devpi#subdirectory=server
```

It fails consistently on `main` and passes here.
2024-01-15 13:11:22 -05:00
Charlie Marsh 249ca10765
Move Puffin subcommands to a pip namespace (#921)
## Summary

This makes the separation clearer between the legacy `pip` API and the
API we'll add in the future for the package manager itself. It also
enables seamless `puffin pip` aliasing for those that want it.

Closes #918.
2024-01-15 16:36:45 +00:00
Charlie Marsh e54fdea93f
Continue to respect `--find-links` with `--no-index` (#931)
Like `pip`, we should allow `--find-links` with `--no-index`.
2024-01-15 16:19:27 +00:00
Charlie Marsh 42888a9609
Share flat index across resolutions (#930)
## Summary

This PR restructures the flat index fetching in a few ways:

1. It now lives in its own `FlatIndexClient`, since it felt a bit
awkward (in my opinion) for it to live in `RegistryClient`.
2. We now fetch the `FlatIndex` outside of the resolver. This has a few
benefits: (1) the resolver constructor is no longer `async` and no longer
returns `Result`, which feels better for a resolver; and (2) we can
share the `FlatIndex` across resolutions rather than re-fetching it for
every source distribution build.
2024-01-15 11:02:02 -05:00
Charlie Marsh e6d7124147
Add an extra struct around the package-to-flat index map (#923)
## Summary

`FlatIndex` is now the thing that's keyed on `PackageName`, while
`FlatDistributions` is what used to be called `FlatIndex` (a map from
version to `PrioritizedDistribution`, for a single package). I find this
a bit clearer, since we can also remove the `from_files` that doesn't
return `Self`, which I had trouble following.
2024-01-15 14:48:10 +00:00
Charlie Marsh 9a3f3d385c
Remove `PubGrubVersion` (#924)
## Summary

I'm running into some annoyances converting `&Version` to
`&PubGrubVersion` (which is just a wrapper type around `Version`), and I
realized... We don't even need `PubGrubVersion`?

The reason we "need" it today is due to the orphan trait rule: `Version`
is defined in `pep440_rs`, but we want to `impl
pubgrub::version::Version for Version` in the resolver crate.

Instead of introducing a new type here, which leads to a lot of
awkwardness around conversion and API isolation, what if we instead just
implement `pubgrub::version::Version` in `pep440_rs` via a feature? That
way, we can just use `Version` everywhere without any confusion and
conversion for the wrapper type.
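
A minimal sketch of the idea, using stand-in names: `ResolverVersion` plays the role of `pubgrub::version::Version`, and the `Version` struct plays the role of the `pep440_rs` type (both are simplified and hypothetical here). Because the impl sits in the crate that defines the type, the orphan rule is satisfied and no wrapper is needed. In the real crate the impl would presumably be gated behind a Cargo feature, e.g. `#[cfg(feature = "pubgrub")]`.

```rust
use std::fmt::Debug;

/// Stand-in for the foreign trait (hypothetical, simplified).
pub trait ResolverVersion: Clone + Ord + Debug {
    /// The smallest possible version.
    fn lowest() -> Self;
    /// The next version after `self`.
    fn bump(&self) -> Self;
}

/// Stand-in for the version type owned by this crate (hypothetical, simplified).
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Version {
    release: Vec<u64>,
}

impl Version {
    pub fn new(release: Vec<u64>) -> Self {
        Self { release }
    }
}

// Assumption: in the real crate this impl would be feature-gated so that
// users who do not need the resolver trait do not pay for the dependency.
impl ResolverVersion for Version {
    fn lowest() -> Self {
        Self::new(vec![0])
    }

    fn bump(&self) -> Self {
        // Increment the last release segment (or add one if there is none).
        let mut release = self.release.clone();
        match release.last_mut() {
            Some(last) => *last += 1,
            None => release.push(1),
        }
        Self::new(release)
    }
}
```

With this layout, callers pass `&Version` straight to resolver code, with no `&Version` to `&PubGrubVersion` conversion.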
2024-01-15 08:51:12 -05:00
konsti 8860a9c29e
Add flat index urls to registry wheel index (#928)
Previously, we were missing flat index wheels in the cache.
2024-01-15 10:21:59 +00:00
konsti 95f3cca28d
Use fs_err in more places (#926)
Before:

```
error: Failed to download distributions
  Caused by: Failed to fetch wheel: jaxlib==0.4.23+cuda12.cudnn89
  Caused by: Directory not empty (os error 39)
```

After:

```
error: Failed to download distributions
  Caused by: Failed to fetch wheel: jaxlib==0.4.23+cuda12.cudnn89
  Caused by: failed to rename file from /home/konsti/.cache/puffin/.tmpcG7tVP/jaxlib-0.4.23+cuda12.cudnn89-cp310-cp310-manylinux2014_x86_64.whl to /home/konsti/.cache/puffin/wheels-v0/index/9ff50b883297fa9d/jaxlib/jaxlib-0.4.23+cuda12.cudnn89-cp310-cp310-manylinux2014_x86_64
  Caused by: Directory not empty (os error 39)
```
2024-01-15 09:39:33 +00:00
konsti 82ff136a74
Add find links support to pip-sync (#914)
Closes #877
2024-01-15 03:04:55 +00:00
konsti f63776b894
Support HTML indexes in `--find-links` (#913)
The simple html format parser luckily seems to work for find links too,
at least it can parse
https://storage.googleapis.com/jax-releases/jax_cuda_releases.html.
2024-01-15 02:54:34 +00:00
konsti e9b6b6fa36
Implement `--find-links` as flat indexes (directories in pip-compile) (#912)
Add directory `--find-links` support for local paths to pip-compile.

It seems that pip joins all sources and then picks the best package. We
explicitly give find links packages precedence if the same exists on an
index and locally by prefilling the `VersionMap`, otherwise they are
added as another index and the existing rules of precedence apply.

Internally, the feature is called _flat index_, which is more meaningful
than _find links_: We're not looking for links, we're picking up local
directories, and (TBD) support another index format that's just a flat
list of files instead of a nested index.

`RegistryBuiltDist` and `RegistrySourceDist` now use `WheelFilename` and
`SourceDistFilename` respectively. The `File` inside `RegistryBuiltDist`
and `RegistrySourceDist` gained the ability to represent both a url and
a path so that `--find-links` with a url and with a path works the same,
both being locked as `<package_name>@<version>` instead of
`<package_name> @ <url>`. (This is more of a detail; this PR in general
would still work if we stripped that and had directory find links represented as
`<package_name> @ file:///path/to/file.ext`.)

`PrioritizedDistribution` and `FlatIndex` have been moved to locations
where we can use them in the upstack PR.

I added a `scripts/wheels` directory with stripped down wheels to use
for testing.

We're lacking tests for correct tag priority precedence with flat
indexes; I only confirmed this manually since it is not covered in the
pip-compile or pip-sync output.

Closes #876
2024-01-15 02:04:10 +00:00
konsti 5ffbfadf66
Make hashes optional (#910)
There is no guarantee that indexes provide hashes at all or the sha256
we support specifically. [PEP
503](https://peps.python.org/pep-0503/#specification):

> The URL SHOULD include a hash in the form of a URL fragment with the
following syntax: #<hashname>=<hashvalue>, where <hashname> is the
lowercase name of the hash function (such as sha256) and <hashvalue> is
the hex encoded digest.

We instead use the url as input to generate a hash when caching.
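
A rough sketch of deriving a cache key from a URL rather than from an index-provided hash. The function name `cache_key` is hypothetical, and `DefaultHasher` is only a stand-in: the commit does not say which hasher is used, and `DefaultHasher` output is not guaranteed stable across Rust releases, so a real on-disk cache would want a hasher with a fixed algorithm.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Derive a fixed-width hex cache key from a file URL (hypothetical helper).
fn cache_key(url: &str) -> String {
    let mut hasher = DefaultHasher::new();
    url.hash(&mut hasher);
    // 64-bit digest rendered as 16 zero-padded hex characters.
    format!("{:016x}", hasher.finish())
}
```

The key depends only on the URL, so it is available even when the index omits the optional `#<hashname>=<hashvalue>` fragment.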
2024-01-14 16:32:55 -05:00
Zanie Blue 9ad19b7e54
Bump to the latest packse version (#916) 2024-01-14 12:49:23 -06:00
konsti a53bdeba4c
Remove `base` from `RegistryBuiltDist` and `RegistrySourceDist` (#919)
Follow-up to https://github.com/astral-sh/puffin/pull/917 that I found while
rebasing the find-links PRs: this field became unused through the
absolute URLs.
2024-01-14 17:46:16 +00:00
Charlie Marsh 0374000ec0
Normalize extras when evaluating PEP 508 markers (#915)
## Summary

We always normalize extra names in our requirements (e.g., `cuda12_pip`
to `cuda12-pip`), but we weren't normalizing within PEP 508 markers,
which meant we ended up comparing `cuda12-pip` (normalized) against
`cuda12_pip` (unnormalized).
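
To illustrate the mismatch, here is a small sketch of name normalization in the style used for Python package and extra names (lowercase, with runs of `-`, `_` and `.` collapsed to a single `-`). The function name `normalize_extra` is hypothetical and this is not the project's actual implementation.

```rust
/// Normalize an extra name: lowercase it and collapse runs of `-`, `_`
/// and `.` into a single `-` (hypothetical helper).
fn normalize_extra(name: &str) -> String {
    let mut normalized = String::with_capacity(name.len());
    let mut last_was_separator = false;
    for c in name.chars() {
        if matches!(c, '-' | '_' | '.') {
            // Emit at most one `-` per run of separators.
            if !last_was_separator {
                normalized.push('-');
            }
            last_was_separator = true;
        } else {
            normalized.push(c.to_ascii_lowercase());
            last_was_separator = false;
        }
    }
    normalized
}
```

Applying the same function to both sides of a marker comparison such as `extra == "cuda12_pip"` makes the two spellings compare equal.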

Closes https://github.com/astral-sh/puffin/issues/911.
2024-01-14 17:16:54 +00:00
konsti a99e5e00f2
Use absolute urls in `distribution_types::File` (#917)
Previously, the url on file could either be a relative or an absolute
url, depending on the index, and we would finalize it lazily. Now we
finalize the url when converting `pypi_types::File` to
`distribution_types::File`. This change is required to make the hashes
on `File` optional (https://github.com/astral-sh/puffin/pull/910), which
are currently the only unique field usable for caching.
2024-01-14 17:15:24 +00:00
Charlie Marsh 6e18e56789
Adjust markers to match target Python version (#909)
## Summary

This PR ensures that when the user passes in `--python-version`, we
adjust the _markers_ to match the target version, thus forcing us to
select compatible wheels for the `--python-version`, rather than the
installed version.

## Context

Let's call Python 3.10 the "installed" environment and Python 3.12 the
"target" environment. For each version, we have _both_ a Python version
(to match against `Requires-Python`) and a set of tags (to match against
wheels).

The rules for resolution are as follows...

- For each package, for each version, we try to find the "best
candidate" for resolution and installation.
- We first look for a wheel that's compatible with the _target_
environment. This requires testing against both the `Requires-Python`
and the markers. (We won't have to build or run this code, so the
_installed_ version is irrelevant.) **(This PR corrects _this_ bullet --
previously, we validated against the _installed_ markers, rather than
the target markers.)**
- If we can't find a compatible wheel, we accept any _incompatible_
wheel as long as there's a source distribution. The source distribution
_must_ be compatible with the target environment. (We won't have to
build or run this code, so the _installed_ version is irrelevant.)
- If there are no wheels, then the source distribution must be
compatible with _both_ the installed and target environments, since we
need to build it.
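
The rules above can be sketched as a small decision function. All names here (`SourceDist`, `Candidate`, `pick_candidate`) are hypothetical, and compatibility checks are reduced to booleans; the real resolver works with tags, markers and `Requires-Python` specifiers.

```rust
/// Compatibility of a source distribution with each environment (hypothetical).
#[derive(Clone, Copy)]
struct SourceDist {
    target_compatible: bool,
    installed_compatible: bool,
}

/// What would be selected for one package version (hypothetical).
#[derive(Debug, PartialEq)]
enum Candidate {
    Wheel,
    SourceDist,
    Unavailable,
}

fn pick_candidate(
    has_target_compatible_wheel: bool,
    has_any_wheel: bool,
    sdist: Option<SourceDist>,
) -> Candidate {
    // 1. Prefer a wheel that matches the target environment.
    if has_target_compatible_wheel {
        return Candidate::Wheel;
    }
    match sdist {
        // 2. Only incompatible wheels exist: the source distribution must
        //    match the target environment.
        Some(sdist) if has_any_wheel && sdist.target_compatible => Candidate::SourceDist,
        // 3. No wheels at all: the source distribution must match both the
        //    installed and the target environment, since it has to be built.
        Some(sdist)
            if !has_any_wheel && sdist.target_compatible && sdist.installed_compatible =>
        {
            Candidate::SourceDist
        }
        _ => Candidate::Unavailable,
    }
}
```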

This is all true for the top-level resolution. When we perform a
sub-resolution (when resolving the build dependencies of a source
distribution), we should _only_ use the installed environment, and
ignore the target environment, since we assume that the dependencies
will be the same in both environments once built -- so our goal is
"just" to build the distribution, without concern for which build
dependencies it uses.

Closes https://github.com/astral-sh/puffin/issues/883.
2024-01-14 15:39:15 +00:00
Charlie Marsh 8187c05d8a
Use `DashMap` for redirects (#908)
## Summary

We don't need to wait on these, so it's simpler to use a standard
concurrent hash map.
2024-01-13 20:36:02 +00:00
Charlie Marsh f527f2add9
Remove erroneous local `Index` in resolver (#907) 2024-01-13 15:19:00 -05:00
Charlie Marsh 231686e71b
Remove `incompatibilities` from index (#905)
This isn't really part of the "index", it's part of the resolution.
2024-01-13 02:57:15 +00:00
Charlie Marsh 477186dcb3
Remove `ResolutionGraph#requirements` (#903) 2024-01-12 20:09:19 +00:00
Charlie Marsh d3f65c317d
Avoid some additional clones for `PackageName` (#896) 2024-01-12 17:54:40 +00:00
konsti aee6aed684
Make install_editable test faster (#901)
Remove a test case from the `install_editable` that slows it down from
3.6s to 6.5s while providing low test coverage. It also seems to block
other tests sometimes, `cargo nextest run -E "test(editable)"
--all-features` has more consistent and lower runtimes. Surprisingly
this seems to have a bigger effect than switching from pyo3 to cffi.

Used test commands:
```
rm -rf scripts/editable-installs/maturin_editable/target/ && time cargo nextest run -E "test(=install_editable)" --all-features
rm -rf scripts/editable-installs/maturin_editable/target/ && time cargo nextest run -E "test(editable)" --all-features
 ```

Part of #878
2024-01-12 18:50:27 +01:00
konsti 878bc4bf8d
Stub out DTLSsocket test (#900)
Replace the DTLSsocket test with a dummy package that does nothing but
contain the build system specs that we need. This should speed up one of
the slowest tests.

Part of #878
2024-01-12 18:50:16 +01:00
Charlie Marsh 06039e1293
Add hashes to `pip-compile` output (#894)
## Summary

Adds hashes to `pip-compile` output, though we don't actually check
those hashes in `pip-sync` yet.

Closes https://github.com/astral-sh/puffin/issues/131.
2024-01-12 12:44:19 -05:00
konsti 0cc98c771e
Fix a tracing panic (#899) 2024-01-12 14:47:58 +00:00
Charlie Marsh 11b11d04a7
Ignore installed version when determining wheel compatibility (#890) 2024-01-12 08:57:00 -05:00
Charlie Marsh 5fd2c380a7
Add `into_cached_dist` to `LocalWheel` (#893)
Simplifies `unzip_wheel` a bit and avoids unnecessarily cloning in the
common case.
2024-01-12 09:01:30 +00:00
Charlie Marsh 35c1faa575
Move in-flight tracking to the download level (#892)
## Summary

Now that `get_or_build_wheel` will often _also_ handle the unzip step,
we need to move our per-target locking (`OnceMap`) up a level.
Previously, it was only applied to the unzip step, to prevent us from
attempting to unzip into the same target concurrently; now, it's applied
at the `get_wheel` level, which includes both downloading and unzipping.
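
A simplified, synchronous sketch of per-target in-flight tracking. The `InFlight` type is a hypothetical stand-in for `OnceMap` (which is async in the project): the first caller for a key runs the work, covering both download and unzip, and concurrent callers for the same key block and reuse the result.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::{Arc, Mutex, OnceLock};

/// Tracks work per key so that each key is processed at most once
/// (hypothetical, blocking stand-in for an async once-map).
struct InFlight<K, V> {
    slots: Mutex<HashMap<K, Arc<OnceLock<V>>>>,
}

impl<K: Eq + Hash, V: Clone> InFlight<K, V> {
    fn new() -> Self {
        Self {
            slots: Mutex::new(HashMap::new()),
        }
    }

    /// Run `work` for `key` unless another caller already did (or is doing) so.
    fn get_or_run<F: FnOnce() -> V>(&self, key: K, work: F) -> V {
        // Hold the map lock only long enough to fetch or create the slot.
        let slot = {
            let mut slots = self.slots.lock().unwrap();
            slots.entry(key).or_default().clone()
        };
        // `get_or_init` runs `work` once; other threads wait for the value.
        slot.get_or_init(work).clone()
    }
}
```

Placing the guard around the whole "fetch wheel" step, rather than only the unzip step, is what prevents two concurrent builds from unpacking the same target at once.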

## Test Plan

It seems like none of our existing tests catch this -- perhaps because
they're too "simple"? You need to run into a situation in which you're
doing multiple source distribution builds concurrently (since they'll
all try to download `setuptools`):

```
rm -rf foo && virtualenv --clear .venv && cargo run -p puffin-cli -- pip-compile ./scripts/requirements/pydantic.in  --verbose --cache-dir foo
```
2024-01-12 09:52:22 +01:00
Charlie Marsh 60cea0f07d
Use consistent parse terminology in pyproject error (#891)
We use `parse` for the other file types.
2024-01-11 21:25:47 -05:00
bojanserafimov 4c047f858f
Remove InMemoryWheel and dead code (#879) 2024-01-11 10:11:07 -05:00
bojanserafimov 10227a74f8
Unzip while downloading (#856) 2024-01-11 09:41:46 -05:00
konsti 0dfbddd275
Shorten resolve many dev output (#885) 2024-01-11 13:53:13 +00:00
konsti 8c2b7d55af
Cleanup deps and docs (#882)
Fix warnings from `cargo +nightly udeps` and `cargo doc`.

Removes all mentions of regex from pep440_rs.
2024-01-11 10:43:40 +00:00
Zanie Blue d6fa628e11
Fix failing test (#880) 2024-01-11 00:41:37 +00:00
Zanie Blue 811332eacc
Improve handling of "full" version ranges (#868)
Reduces the number of implementation branches handling `Range:full`,
deferring it to `PackageRange`.
Improves some user-facing messages, e.g. saying `all versions of
<package>` instead of `<package>*`.
Changes the member names of the `PackageRangeKind` enum — they were not
very clear.
2024-01-10 21:03:55 +00:00
Zanie Blue a65c55ff4a
Say "cannot be used" and "must be used" instead of "forbidden" and "mandatory" (#867)
Closes #858
2024-01-10 20:49:40 +00:00
Zanie Blue 845ba6801d
Improve formatting of incompatible terms when there are two items (#866) 2024-01-10 20:36:54 +00:00
Zanie Blue 93d3093a2a
Improve formatting of package ranges in error messages (#864)
Closes #810
Closes https://github.com/astral-sh/puffin/issues/812
Requires https://github.com/zanieb/pubgrub/pull/19 and
https://github.com/zanieb/pubgrub/pull/18

- Always pair package ranges with names e.g. `... of a matching a<1.0`
instead of `... of a matching <1.0`
- Split range segments onto multiple lines when not a singleton as
suggested in
[#850](https://github.com/astral-sh/puffin/pull/850#discussion_r1446419610)
- Improve formatting when ranges are split across multiple lines e.g. by
avoiding extra spaces and improving wording

Note review will require expanding the hidden files as there are
significant changes to the report formatter and snapshots.

Bear with me here as these are definitely not perfect still.

The following changes build on top of this independently for further
improvements:
- #868 
- #867 
- #866 
- #871
2024-01-10 14:16:23 -06:00
konsti 4d8bfd7f61
Split source dist error type into error and kind (#872)
It's a better, less redundant error type. It will come in handy when
adding a second parse function.
2024-01-10 17:42:54 +00:00
Charlie Marsh fbb57b24dd
Add `--seed` flag to `venv` to allow seed package environments (#865)
## Summary

Installs the seed packages you get with `virtualenv`, but opt-in rather
than opt-out.

Closes https://github.com/astral-sh/puffin/issues/852.

## Test Plan

```
❯ ./scripts/benchmarks/venv.sh
+ hyperfine --runs 20 --warmup 3 --prepare 'rm -rf .venv' './target/release/puffin venv' --prepare 'rm -rf .venv' 'virtualenv --without-pip .venv' --prepare 'rm -rf .venv' 'python -m venv --without-pip .venv'
Benchmark 1: ./target/release/puffin venv
  Time (mean ± σ):       4.6 ms ±   0.2 ms    [User: 2.4 ms, System: 3.6 ms]
  Range (min … max):     4.3 ms …   4.9 ms    20 runs

  Warning: Command took less than 5 ms to complete. Note that the results might be inaccurate because hyperfine can not calibrate the shell startup time much more precise than this limit. You can try to use the `-N`/`--shell=none` option to disable the shell completely.

Benchmark 2: virtualenv --without-pip .venv
  Time (mean ± σ):      73.3 ms ±   0.3 ms    [User: 57.4 ms, System: 14.2 ms]
  Range (min … max):    72.8 ms …  74.0 ms    20 runs

Benchmark 3: python -m venv --without-pip .venv
  Time (mean ± σ):      22.5 ms ±   0.3 ms    [User: 17.0 ms, System: 4.9 ms]
  Range (min … max):    22.0 ms …  23.2 ms    20 runs

Summary
  './target/release/puffin venv' ran
    4.92 ± 0.20 times faster than 'python -m venv --without-pip .venv'
   16.00 ± 0.63 times faster than 'virtualenv --without-pip .venv'
+ hyperfine --runs 20 --warmup 3 --prepare 'rm -rf .venv' './target/release/puffin venv --seed' --prepare 'rm -rf .venv' 'virtualenv .venv' --prepare 'rm -rf .venv' 'python -m venv .venv'
Benchmark 1: ./target/release/puffin venv --seed
  Time (mean ± σ):      20.2 ms ±   0.4 ms    [User: 8.6 ms, System: 15.7 ms]
  Range (min … max):    19.7 ms …  21.2 ms    20 runs

Benchmark 2: virtualenv .venv
  Time (mean ± σ):     135.1 ms ±   2.4 ms    [User: 66.7 ms, System: 65.7 ms]
  Range (min … max):   133.2 ms … 142.8 ms    20 runs

Benchmark 3: python -m venv .venv
  Time (mean ± σ):      1.656 s ±  0.014 s    [User: 1.447 s, System: 0.186 s]
  Range (min … max):    1.641 s …  1.697 s    20 runs

Summary
  './target/release/puffin venv --seed' ran
    6.67 ± 0.17 times faster than 'virtualenv .venv'
   81.79 ± 1.70 times faster than 'python -m venv .venv'
```
2024-01-09 20:45:56 -05:00
Charlie Marsh 55f2be72e2
Default to PEP 517-based builds (#843)
## Summary

Our current setup uses the legacy `setup.py`-based builds if a
`pyproject.toml` file isn't present. This matches pip's behavior.
However, `pypa/build` uses PEP 517-based builds in such cases, and it
looks like pip plans to make that the default
(https://github.com/pypa/pip/issues/9175), with the limiting factor
being performance issues related to isolated builds.

This is now the default behavior, but the `--legacy-setup-py` flag
allows users to opt-in to using `setup.py` directly for distributions
that lack a `pyproject.toml`.
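
The resulting decision can be sketched as follows. `BuildKind` and `choose_build_kind` are hypothetical names, and the presence of `pyproject.toml` is passed in as a boolean to keep the example self-contained.

```rust
/// How a source distribution gets built (hypothetical enum).
#[derive(Debug, PartialEq)]
enum BuildKind {
    /// Build through the PEP 517 hooks.
    Pep517,
    /// Invoke `setup.py` directly.
    LegacySetupPy,
}

/// PEP 517 is the default; `--legacy-setup-py` only applies to
/// distributions that lack a `pyproject.toml`.
fn choose_build_kind(has_pyproject_toml: bool, legacy_setup_py: bool) -> BuildKind {
    if !has_pyproject_toml && legacy_setup_py {
        BuildKind::LegacySetupPy
    } else {
        BuildKind::Pep517
    }
}
```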
2024-01-10 01:27:06 +00:00
Charlie Marsh e26dc8e33d
Add support for `prepare_metadata_for_build_wheel` (#842)
## Summary

This PR adds support for `prepare_metadata_for_build_wheel`, which
allows us to determine source distribution metadata without building the
source distribution. This represents an optimization for the resolver,
as we can skip the expensive build phase for build backends that support
it.

For reference, `prepare_metadata_for_build_wheel` seems to be supported
by:

- `hatchling` (as of
[1.9.0](https://hatch.pypa.io/latest/history/hatchling/#hatchling-v1.9.0)).
- `flit`
- `setuptools`

In fact, it seems to work for every backend _except_ those using legacy
`setup.py`.

Closes #599.
2024-01-10 00:07:37 +00:00
konsti 858d5584cc
Use `Dist` in `VersionMap` (#851)
Refactoring split out from find links support: Find links files can be
represented as `Dist`, but not really as `File`, since they have neither a
url nor hashes.

`DistRequiresPython` is somewhat odd as an in-between type.
2024-01-10 00:14:42 +01:00
konsti 1203f8f9e8
Gourgeist updates (#862)
* Use caching again
* Make the clap feature, which is only required for the cli/bin, optional
2024-01-09 23:04:15 +00:00
Zanie Blue 34d548de21
Improve error messages when there are no versions of a singleton range (#855) 2024-01-09 15:09:52 -06:00
Charlie Marsh 33982efb25
Remove a TOCTOU read in build (#860)
We should just read and handle the not-found case, rather than checking
if the file doesn't exist first.
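
A minimal sketch of the pattern, using only the standard library. The helper name `read_if_present` is hypothetical: instead of calling `path.exists()` and then reading (which leaves a window in which the file can disappear), it reads directly and treats `NotFound` as the "missing" case.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Read a file, returning `Ok(None)` if it does not exist (hypothetical helper).
fn read_if_present(path: &Path) -> io::Result<Option<String>> {
    match fs::read_to_string(path) {
        Ok(contents) => Ok(Some(contents)),
        // The file is absent: report that instead of failing.
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(None),
        // Any other error (permissions, invalid UTF-8, ...) is propagated.
        Err(err) => Err(err),
    }
}
```

This also saves one filesystem call in the common case where the file exists.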
2024-01-09 20:33:08 +00:00
Charlie Marsh 31139aa88d
Add derive feature to `gourgeist` (#854)
Needed to build `gourgeist` directly, probably dropped during a
refactor.
2024-01-09 17:46:16 +00:00
konsti ee6d809b60
Remove unused `Result` (#849)
Remove some dead code, seems to be a refactoring oversight
2024-01-09 16:35:10 +00:00
konsti 643e5e4a49
Use pdm for black editable as PEP 621 test case (#848)
This gives us a PEP 621 test package in tree and increases the diversity
for the editable tests a bit.
2024-01-09 16:33:05 +00:00
konsti 5b0b072e3c
Allow files >4GB on 32-bit platforms (#847)
Changes `File::size` from a `usize` to a `u64`.

The motivations are that, with tensorflow wheels being 475 MB
(https://pypi.org/project/tensorflow/2.15.0.post1/#files), we're already
only one order of magnitude away from the 4 GB limit of a 32-bit `usize`,
and that this avoids target-dependent failures.
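
A tiny illustration of the limit, using `u32` as a stand-in for `usize` on a 32-bit target (the helper name is hypothetical):

```rust
use std::convert::TryFrom;

/// Whether a size in bytes fits in 32 bits, the width of `usize` on
/// 32-bit targets (hypothetical helper).
fn fits_in_32_bits(size: u64) -> bool {
    u32::try_from(size).is_ok()
}
```

A 475 MB file fits, but a file ten times that size already exceeds `u32::MAX` (4,294,967,295 bytes), whereas a `u64` field represents it on every target.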
2024-01-09 17:31:49 +01:00
Charlie Marsh ee3a6431c7
Show available pre-releases in error hints (#844)
## Summary

If pre-releases are available for a package that we otherwise couldn't
resolve, we now show a hint that includes one of the example versions.

Closes https://github.com/astral-sh/puffin/issues/811.
2024-01-09 09:58:38 -05:00
konsti b1edecdf1f
Filter out files with invalid requires python specifiers (#775)
Instead of trying to fixup _all_ the invalid version specifiers on pypi
and elsewhere, this filters out distributions with invalid
`requires-python` version specifiers that even
`LenientVersionSpecifiers` couldn't parse, as opposed to failing
entirely, which we currently do.

It would be nicer to model this through an invalid distribution pubgrub type,
together with e.g. source dists with an unknown extension, so that the
version itself still shows up in the error trace.

At the same time, we reduce the log level for fixups from warning to
trace, as they are not actionable for the user.
2024-01-09 02:46:27 +00:00
Zanie Blue 64da1f0306
Always pair package names with ranges in error messages (#838)
Adjusts display of "no versions available" in error messages to be
consistent with other package/range pairings i.e. we usually display
"<package-name><range>".
2024-01-08 22:11:10 +00:00
Charlie Marsh 19c6d655b5
Avoid duplicated source distribution handling in url (#841)
## Summary

Right now, both the callback _and_ the "We have no compatible wheel"
paths have a lot of repeated code. This PR changes the callback to
_just_ remove all the wheels and handle the download, and the rest of
the method following the callback is responsible for finding and
building any wheels.
2024-01-08 16:19:54 -05:00
Charlie Marsh cc9140643e
Rename `metadata` to `built_wheel` in `source/mod.rs` (#840) 2024-01-08 19:20:20 +00:00
Charlie Marsh df254087d9
Break `source_dist.rs` into a module (#839)
## Summary

Finding this file hard to edit and work in since it's gotten quite
large.
2024-01-08 19:14:45 +00:00
Zanie Blue 2b0c2e294b
Fix formatting of negated singleton versions in error messages (#836)
Closes #805 
Requires https://github.com/zanieb/pubgrub/pull/17
2024-01-08 12:33:01 -06:00
Charlie Marsh aeefe65227
Fix `tracing-durations-export` compilation (#835)
## Summary

I'm unable to run `puffin-cli` on `main` as the
`tracing-durations-export` is marked as optional, but the crate actually
depends on it to compile. Further, without `tracing-durations-export`,
there are `Option` types that can't resolve to a concrete type.

This PR fixes compilation with and without the feature.
2024-01-08 18:04:23 +00:00
Charlie Marsh c06bf658bb
Remove some filesystem calls from the installer (#834)
Noticed these when working on something unrelated. Generally:

- Prefer `entry.file_type()` over `entry.path().is_file()` or similar,
as the former is almost always free on Unix.
- Call `entry.path()` once, since it allocates internally (returns a
`PathBuf`).
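
A short sketch of both points, using only the standard library (the function name `list_files` is hypothetical). Note one behavioral difference: `DirEntry::file_type()` does not follow symlinks, whereas `Path::is_file()` does.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Collect the regular files directly inside `dir` (hypothetical helper).
fn list_files(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // `file_type()` can usually reuse data from the directory listing,
        // while `entry.path().is_file()` allocates a `PathBuf` and stats it.
        if entry.file_type()?.is_file() {
            // Call `path()` once and keep the allocated `PathBuf`.
            files.push(entry.path());
        }
    }
    files.sort();
    Ok(files)
}
```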
2024-01-08 12:59:01 -05:00
konsti 004147d441
Add tracing_durations_export feature to puffin-cli (#830)
The optional `tracing-durations-export` feature allows creating
parallelism plots from all puffin-cli commands without affecting
production builds.

Usage:

```
virtualenv --clear -p 3.10 .venv310 && TRACING_DURATIONS_FILE=target/traces/jupyter-no-cache.ndjson RUST_LOG=puffin=info VIRTUAL_ENV=.venv310 cargo run --bin puffin --profile profiling --features tracing-durations-export -- pip-install -v --no-cache jupyter
virtualenv --clear -p 3.10 .venv310 && TRACING_DURATIONS_FILE=target/traces/jupyter.ndjson RUST_LOG=puffin=info VIRTUAL_ENV=.venv310 cargo run --bin puffin --profile profiling --features tracing-durations-export -- pip-install -v jupyter
 ```

Output, plotted in collapsed mode for readability:

Cached jupyter:

![jupyter](https://github.com/astral-sh/puffin/assets/6826232/f7e03c68-0438-4cf4-bceb-9a4a146cc506)

Uncached jupyter:

![image](https://github.com/astral-sh/puffin/assets/6826232/cfdd3383-7a9d-43d6-b8d0-201f64611596)
2024-01-08 16:20:45 +01:00
konsti b6338b5e4a
Use tracing-durations-export to visualize parallelism bottlenecks (dev commands) (#816)
Example usage:

```
# Cached
TRACING_DURATIONS_FILE=target/traces/black.ndjson RUST_LOG=puffin=info cargo run --bin puffin-dev --profile profiling -- resolve black
TRACING_DURATIONS_FILE=target/traces/meine_stadt_transparent.ndjson RUST_LOG=puffin=info cargo run --bin puffin-dev --profile profiling -- resolve meine_stadt_transparent
TRACING_DURATIONS_FILE=target/traces/jupyter.ndjson RUST_LOG=puffin=info cargo run --bin puffin-dev --profile profiling -- resolve jupyter

# No cache
TRACING_DURATIONS_FILE=target/traces/black-no-cache.ndjson RUST_LOG=puffin=info cargo run --bin puffin-dev --profile profiling -- resolve --no-cache black
TRACING_DURATIONS_FILE=target/traces/meine_stadt_transparent-no-cache.ndjson RUST_LOG=puffin=info cargo run --bin puffin-dev --profile profiling -- resolve --no-cache meine_stadt_transparent
TRACING_DURATIONS_FILE=target/traces/jupyter-no-cache.ndjson RUST_LOG=puffin=info cargo run --bin puffin-dev --profile profiling -- resolve --no-cache jupyter
```

Uncached black output example:


![black-no-cache](https://github.com/astral-sh/puffin/assets/6826232/38497b89-7214-453b-9456-c9d9cbf7d2d5)
2024-01-08 16:20:38 +01:00
konsti 243392f718
`cargo run` runs `puffin` by default (#831)
`cargo run` now runs `puffin` by default. `cargo run --bin puffin-dev`
remains working.
2024-01-08 12:49:06 +00:00
konsti 3f587156ec
Improve install instrumentation (#829)
Add tracing spans to different phases of the wheel installation.
2024-01-08 10:13:59 +00:00
konsti 60ba7dd14f
Use `std::io::read_to_string` (#826)
The `std::io::read_to_string` shorthand was stabilized in 1.65.
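
For illustration, a small wrapper (the name `slurp` is hypothetical) showing the shorthand next to the longer form it replaces:

```rust
use std::io::{self, Read};

/// Read everything from `reader` into a `String` (hypothetical helper).
fn slurp<R: Read>(reader: R) -> io::Result<String> {
    // Longer equivalent:
    //   let mut buf = String::new();
    //   reader.read_to_string(&mut buf)?;
    //   Ok(buf)
    io::read_to_string(reader)
}
```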
2024-01-08 09:15:38 +00:00
Charlie Marsh 54838914be
Migrate back to `owo-colors` (#824)
In the past, I moved us to `owo-colors`
(https://github.com/astral-sh/puffin/pull/121); then, we moved back,
because we ran into issues with overriding the settings to force-disable
colors. But `anstream` solved those problems, so I'm moving us _back_ to
`owo-colors`, since it's what `anstream` recommends, and it's already
used by many of our dependencies (`miette`, `configparser`).

---------

Co-authored-by: konstin <konstin@mailbox.org>
2024-01-08 08:54:57 +00:00
Charlie Marsh 17452e3e64
Simplify ranges in pre-release hints (#825)
Closes https://github.com/astral-sh/puffin/issues/807.
2024-01-07 12:40:22 -05:00
Charlie Marsh e6fcb9c4d3
Use `anstream` for all color control (#823)
## Summary

We can use `anstream` for all color control, rather than going through
`colored`. Note that we still need the `colored` crate, since `colored`
and `anstream` solve different problems. (`anstream` recommends using
`owo-colors` alongside it, but `colored` seems to work fine?)

Resolves the issue raised in
https://github.com/astral-sh/puffin/pull/742 via `anstream` rather than
`colored`.

Closes https://github.com/astral-sh/puffin/issues/782.
2024-01-06 20:44:05 -05:00
Charlie Marsh fed492831a
Inline some format placeholders (#822) 2024-01-06 23:13:44 +00:00
Charlie Marsh 77c3a67029
Remove `pub(crate)` from `RegistryClient` fields (#821) 2024-01-06 22:05:18 +00:00
Charlie Marsh 9ded337870
Remove unused `proxy` field from client (#820) 2024-01-06 17:02:35 -05:00
Zanie Blue 88adba83a0
Add scenarios with unresolvable dependencies due to excluded versions (#801)
Scenarios added in https://github.com/zanieb/packse/pull/71
2024-01-05 16:21:47 -06:00
Zanie Blue 9a75703973
Bump packse to hide `requires-python` in docstrings when not relevant (#797) 2024-01-05 20:49:09 +00:00
Zanie Blue def7f79f20
Add pre-release test scenario reproducing hint simplification bug (#796)
A reproduction of #751 

Scenarios added in https://github.com/zanieb/packse/pull/68
2024-01-05 14:41:40 -06:00
konsti 65efee1d76
Add compare_release fast path (#799)
Looking at the profile for tf-models-nightly after #789,
`compare_release` is the single biggest item. Adding a fast path, we
avoid paying the cost for padding releases with 0s when they are the
same length, resulting in a 16% speedup for this pathological case. Note that
this mainly happens because tf-models-nightly is almost all large dev
releases that hit the slow path.
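
A sketch of what such a fast path might look like, assuming release segments are stored as a slice of integers. The signature is an assumption and not necessarily that of the real `compare_release`.

```rust
use std::cmp::Ordering;

/// Compare two release tuples, treating missing trailing segments as 0,
/// so that `1.2` and `1.2.0` compare equal (assumed signature).
fn compare_release(this: &[u64], other: &[u64]) -> Ordering {
    // Fast path: equal lengths need no padding, so plain lexicographic
    // slice comparison is enough.
    if this.len() == other.len() {
        return this.cmp(other);
    }
    // Slow path: compare segment by segment, padding the shorter side with 0.
    for i in 0..this.len().max(other.len()) {
        let a = this.get(i).copied().unwrap_or(0);
        let b = other.get(i).copied().unwrap_or(0);
        match a.cmp(&b) {
            Ordering::Equal => {}
            unequal => return unequal,
        }
    }
    Ordering::Equal
}
```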

**Before**


![image](https://github.com/astral-sh/puffin/assets/6826232/0d2b4553-da69-4cdb-966b-0894a6dd5d94)

**After**


![image](https://github.com/astral-sh/puffin/assets/6826232/6d484808-9d16-408d-823e-a12d321802a5)

```
$ hyperfine --warmup 1 --runs 3 "target/profiling/main pip-compile -q scripts/requirements/tf-models-nightly.txt"
 "target/profiling/puffin pip-compile -q scripts/requirements/tf-models-nightly.txt"
Benchmark 1: target/profiling/main pip-compile -q scripts/requirements/tf-models-nightly.txt
  Time (mean ± σ):     11.963 s ±  0.225 s    [User: 11.478 s, System: 0.451 s]
  Range (min … max):   11.747 s … 12.196 s    3 runs

Benchmark 2: target/profiling/puffin pip-compile -q scripts/requirements/tf-models-nightly.txt
  Time (mean ± σ):     10.317 s ±  0.720 s    [User: 9.885 s, System: 0.404 s]
  Range (min … max):    9.501 s … 10.860 s    3 runs

Summary
  target/profiling/puffin pip-compile -q scripts/requirements/tf-models-nightly.txt ran
    1.16 ± 0.08 times faster than target/profiling/main pip-compile -q scripts/requirements/tf-models-nightly.txt
```
2024-01-05 15:14:11 -05:00
Andrew Gallant 6c98ae9d77
pep440: rewrite the parser and make version comparisons cheaper (#789)
This PR builds on #780 by making version parsing faster and,
perhaps more importantly, making version comparisons much faster.
Overall, these changes result in a considerable improvement for the
`boto3.in` workload. Here's the status quo:

```
$ time puffin pip-compile --no-build --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/requirements/boto3.in
Resolved 31 packages in 34.56s

real    34.579
user    34.004
sys     0.413
maxmem  2867 MB
faults  0
```

And now with this PR:

```
$ time puffin pip-compile --no-build --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/requirements/boto3.in
Resolved 31 packages in 9.20s

real    9.218
user    8.919
sys     0.165
maxmem  463 MB
faults  0
```

This particular workload gets stuck in pubgrub doing resolution, and
thus benefits mightily from a faster `Version::cmp` routine. With that
said, this change does also help a fair bit with "normal" runs:

```
$ hyperfine -w10 \
    "puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in" \
    "puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in"
Benchmark 1: puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
  Time (mean ± σ):     337.5 ms ±   3.9 ms    [User: 310.5 ms, System: 73.2 ms]
  Range (min … max):   333.6 ms … 343.4 ms    10 runs

Benchmark 2: puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
  Time (mean ± σ):     189.8 ms ±   3.0 ms    [User: 168.1 ms, System: 78.4 ms]
  Range (min … max):   185.0 ms … 196.2 ms    15 runs

Summary
  puffin-cmparc pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in ran
    1.78 ± 0.03 times faster than puffin-base pip-compile --cache-dir ~/astral/tmp/cache/ -o /dev/null ./scripts/benchmarks/requirements.in
```

There is perhaps some future work here (detailed in the commit
messages), but I suspect it would be more fruitful to explore ways of
making resolution itself and/or deserialization faster.

Fixes #373, Closes #396
2024-01-05 11:57:32 -05:00
Zanie Blue 74777c01ea
Improve documentation for scenario tests (#795)
- Fix documentation of scenario test module
- Add instructions to scenario update script for local development
2024-01-05 16:51:25 +00:00
konsti 5820a9d937
Update dependencies (#794)
Pull in a bunch of updates so they get some testing before we announce
the project. textwrap 0.16 is blocked on miette updating, http 1.0 on
reqwest.
2024-01-05 11:40:12 -05:00
Zanie Blue 08edbc9f60
Add assertions of expected scenario results (#791)
Uses new metadata added in https://github.com/zanieb/packse/pull/61 to
assert that resolution succeeded or failed _and_ that the installed
package versions match the expected result.
2024-01-05 10:32:37 -06:00
konsti 673bece595
Allow `pip-compile` without a venv (#494)
The semantics are a bit unintuitive because `--python-version` is a
preference when looking for a python version without a venv, but if we
don't find that exact version we'll take `python3` and patch the
markers. This will make more sense once we start provisioning python
builds.

We can now resolve black with both python 3.8 and 3.12, with or without
that python version being in scope. In the example below,
`PATH=$HOME/.cargo/bin:/usr/bin` removes the pyenv builds and leaves
only `python3`, which is python 3.11.

```console
$ RUST_LOG=puffin::commands=debug cargo run --bin puffin -q -- pip-compile -v scripts/benchmarks/requirements/black.in --python-version py38
    0.004108s DEBUG puffin::commands::pip_compile Using Python 3.8 at /home/konsti/.local/bin/python3.8
Resolved 8 packages in 44ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
#    puffin pip-compile -v scripts/benchmarks/requirements/black.in --python-version py38
black==23.11.0
[...]
platformdirs==4.0.0
    # via black
tomli==2.0.1
    # via black
typing-extensions==4.8.0
    # via black
$ PATH=$HOME/.cargo/bin:/usr/bin RUST_LOG=puffin::commands=debug cargo run --bin puffin -q -- pip-compile -v scripts/benchmarks/requirements/black.in --python-version py38
    0.004315s DEBUG puffin::commands::pip_compile Using Python 3.11 at /usr/bin/python3
Resolved 8 packages in 43ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
#    puffin pip-compile -v scripts/benchmarks/requirements/black.in --python-version py38
black==23.11.0
[...]
platformdirs==4.0.0
    # via black
tomli==2.0.1
    # via black
typing-extensions==4.8.0
    # via black
```

```console
$ RUST_LOG=puffin::commands=debug cargo run --bin puffin -q -- pip-compile -v scripts/benchmarks/requirements/black.in --python-version py312
    0.004216s DEBUG puffin::commands::pip_compile Using Python 3.12 at /home/konsti/.local/bin/python3.12
Resolved 6 packages in 37ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
#    puffin pip-compile -v scripts/benchmarks/requirements/black.in --python-version py312
black==23.11.0
[...]
platformdirs==4.0.0
    # via black
$ PATH=$HOME/.cargo/bin:/usr/bin RUST_LOG=puffin::commands=debug cargo run --bin puffin -q -- pip-compile -v scripts/benchmarks/requirements/black.in --python-version py312
    0.004190s DEBUG puffin::commands::pip_compile Using Python 3.11 at /usr/bin/python3
Resolved 6 packages in 39ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
#    puffin pip-compile -v scripts/benchmarks/requirements/black.in --python-version py312
black==23.11.0
[...]
platformdirs==4.0.0
    # via black
```
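
The fallback described above can be sketched roughly as follows. This is only an illustration, not the actual discovery code: `Interpreter`, `Selection` and `select_interpreter` are hypothetical names, and "patching the markers" is reduced to carrying the requested version next to the interpreter that was actually picked.

```rust
/// A discovered Python interpreter (hypothetical type for this sketch).
#[derive(Debug, Clone, PartialEq, Eq)]
struct Interpreter {
    path: String,
    version: (u8, u8),
}

/// The interpreter we run, plus the version the markers should report.
#[derive(Debug, PartialEq, Eq)]
struct Selection {
    interpreter: Interpreter,
    marker_version: (u8, u8),
}

/// Prefer an interpreter matching the requested version; otherwise fall back
/// to the plain `python3` and keep the requested version for the markers.
fn select_interpreter(
    requested: (u8, u8),
    candidates: &[Interpreter],
    fallback: &Interpreter,
) -> Selection {
    let interpreter = candidates
        .iter()
        .find(|candidate| candidate.version == requested)
        .unwrap_or(fallback)
        .clone();
    Selection {
        interpreter,
        marker_version: requested,
    }
}
```

With only `/usr/bin/python3` (3.11) in scope and `--python-version py38`, this picks the 3.11 interpreter while the markers still say 3.8, which matches the second run in each console block.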

Fixes #235.

Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2024-01-05 15:01:06 +00:00
Charlie Marsh 76064cdec2
Document Python interpreter discovery in README (#792) 2024-01-05 09:44:06 -05:00
Zanie Blue 0cd57a6cd8
Add pre-release scenarios (#790)
Scenarios added in https://github.com/zanieb/packse/pull/58
2024-01-05 03:10:43 +00:00
Zanie Blue 3d6ea7809a
Update scenario tests to include `requires-python` coverage (#769)
Includes creating a virtual env with the relevant environment python
version.

Scenarios added in https://github.com/zanieb/packse/pull/55
2024-01-04 14:15:13 -06:00
konsti 57c96df288
Explain ld errors (#773)
One of the most common ways source dists fail to build (on linux) is
when the linker fails because the shared library of a native dependency
is not installed. These errors are hard to understand when you're not a
C programmer:

```
       In file included from /usr/include/python3.10/unicodeobject.h:1046,
                        from /usr/include/python3.10/Python.h:83,
                        from Modules/3.x/readline.c:8:
       Modules/3.x/readline.c: In function ‘on_completion’:
       /usr/include/python3.10/cpython/unicodeobject.h:744:29: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
         744 | #define _PyUnicode_AsString PyUnicode_AsUTF8
             |                             ^~~~~~~~~~~~~~~~
       Modules/3.x/readline.c:842:23: note: in expansion of macro ‘_PyUnicode_AsString’
         842 |             char *s = _PyUnicode_AsString(r);
             |                       ^~~~~~~~~~~~~~~~~~~
       Modules/3.x/readline.c: In function ‘readline_until_enter_or_signal’:
       Modules/3.x/readline.c:1044:9: warning: ‘sigrelse’ is deprecated: Use the sigprocmask function instead [-Wdeprecated-declarations]
        1044 |         sigrelse(SIGINT);
             |         ^~~~~~~~
       In file included from Modules/3.x/readline.c:10:
       /usr/include/signal.h:359:12: note: declared here
         359 | extern int sigrelse (int __sig) __THROW
             |            ^~~~~~~~
       Modules/3.x/readline.c: In function ‘PyInit_readline’:
       Modules/3.x/readline.c:1179:34: warning: assignment to ‘char * (*)(FILE *, FILE *, const char *)’ from incompatible pointer type ‘char * (*)(FILE *, FILE *, char *)’ [-Wincompatible-pointer-types]
        1179 |     PyOS_ReadlineFunctionPointer = call_readline;
             |                                  ^
       In file included from /usr/include/string.h:535,
                        from /usr/include/python3.10/Python.h:30,
                        from Modules/3.x/readline.c:8:
       In function ‘strncpy’,
           inlined from ‘call_readline’ at Modules/3.x/readline.c:1124:9:
       /usr/include/x86_64-linux-gnu/bits/string_fortified.h:95:10: warning: ‘__builtin_strncpy’ output truncated before terminating nul copying as many bytes from a string as its length [-Wstringop-truncation]
          95 |   return __builtin___strncpy_chk (__dest, __src, __len,
             |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          96 |                                   __glibc_objsize (__dest));
             |                                   ~~~~~~~~~~~~~~~~~~~~~~~~~
       Modules/3.x/readline.c: In function ‘call_readline’:
       Modules/3.x/readline.c:1099:9: note: length computed here
        1099 |     n = strlen(p);
             |         ^~~~~~~~~
       /usr/bin/ld: cannot find -lncurses: No such file or directory
       collect2: error: ld returned 1 exit status
       error: command '/usr/bin/x86_64-linux-gnu-gcc' failed with exit code 1
       ---
```

We parse these errors out, tell the user about the missing shared
library and even the most likely debian/ubuntu package name:

```
This error likely indicates that you need to install the library that provides a shared library for ncurses for pygraphviz-1.11 (e.g. libncurses-dev)
```
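
A minimal sketch of the parsing step, assuming the linker line always has the `ld: cannot find -l<name>` shape shown above. The function names and the `lib<name>-dev` guess are illustrative assumptions, not necessarily how the real implementation maps libraries to packages.

```rust
/// Extract the library name from a linker error such as
/// `/usr/bin/ld: cannot find -lncurses: No such file or directory`.
fn missing_library(line: &str) -> Option<&str> {
    let (_, rest) = line.split_once("ld: cannot find -l")?;
    // The library name ends at the next colon or whitespace.
    let end = rest
        .find(|c: char| c == ':' || c.is_whitespace())
        .unwrap_or(rest.len());
    let name = &rest[..end];
    if name.is_empty() {
        None
    } else {
        Some(name)
    }
}

/// Guess the Debian/Ubuntu development package (assumed naming heuristic).
fn likely_package(library: &str) -> String {
    format!("lib{library}-dev")
}
```

For the readline failure above, this yields `ncurses` and the hint `libncurses-dev`.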
2024-01-04 20:56:38 +01:00
Zanie Blue 8ac6f9a198
Wrap scenario descriptions in docstrings (#787)
Otherwise, the lines can get kind of long.
2024-01-04 19:43:50 +00:00
Zanie Blue f89c6456e3
Explicitly pin scenarios to a packse commit (#788)
Previously, we just pulled the latest commit from `main` on every
update. This causes problems when you do not intend to update the
scenarios as in #787.

This bumps to the latest `packse` commit without new scenarios.
2024-01-04 19:38:48 +00:00
Zanie Blue 5e04a95c45
Disable line wrapping during scenario tests (#784)
Adds support for a `PUFFIN_NO_WRAP` environment variable which disables
line wrapping in `miette` output.

We set this variable in the scenario tests to improve the readability of
snapshots.

I contributed the ability to disable line wrapping upstream at
https://github.com/zkat/miette/pull/328
2024-01-04 19:07:16 +00:00
Andrew Gallant d7c9b151fb
pep440: some minor refactoring, mostly around error types (#780)
This PR does a bit of refactoring to the pep440 crate, and in
particular around the error types. This PR is meant to be a precursor
to another PR that does some surgery (both in parsing and in `Version`
representation) that benefits somewhat from this refactoring.

As usual, please review commit-by-commit.
2024-01-04 12:28:36 -05:00
Andrew Gallant 1cc3250e76
puffin-cli: fix botched merge (#785)
This fixes a compilation error with tests on current `main`. I didn't
track down the exact provenance, but I'd guess it's the result of a
botched merge. (i.e., Two or more PRs that pass CI independently, but
when merged cause failures.)
2024-01-04 17:03:45 +00:00
Charlie Marsh c6bdc43f37
Add missing feature to `Cargo.toml` (#777) 2024-01-04 11:39:11 -05:00
Zanie Blue e75fde7bfe
Filter prefixes from scenario snapshots to improve readability (#779)
I'm a _little_ unsure since this could be confusing but the prefixes can
be pretty long and this is much easier to read.
2024-01-04 09:57:41 -06:00
konsti 9b77a8873e
Disable color output when redirecting stderr (#742)
I'm still confused about it, but this seems to do the right thing?

`HierarchicalLayer` internally has [`let ansi =
io::stderr().is_terminal();`](fcd9eed252/src/lib.rs (L74)),
so the logging itself is already correctly uncolored, but errors in the
log weren't.

Test command, ran with network deactivated:

```shell
RUST_LOG=debug cargo run --bin puffin -- pip-compile -v ./scripts/popular_packages/pypi_8k_downloads.txt 2> log.txt
```

**Before**

```
error: Request error: error sending request for url (https://pypi.org/simple/apache-airflow-providers-dbt-cloud/): error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
  Caused by: error sending request for url (https://pypi.org/simple/apache-airflow-providers-dbt-cloud/): error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
  Caused by: error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
  Caused by: dns error: failed to lookup address information: Temporary failure in name resolution
  Caused by: failed to lookup address information: Temporary failure in name resolution
```

**After**

```
error: Request error: error sending request for url (https://pypi.org/simple/fissix/): error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
  Caused by: error sending request for url (https://pypi.org/simple/fissix/): error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
  Caused by: error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
  Caused by: dns error: failed to lookup address information: Temporary failure in name resolution
  Caused by: failed to lookup address information: Temporary failure in name resolution
```
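
For reference, the terminal check itself is available in the standard library. The sketch below only shows the idea of gating ANSI codes on it; `render_error` and `stderr_supports_color` are hypothetical helpers, not the code in this PR.

```rust
use std::io::{self, IsTerminal};

/// Render an error line, with ANSI styling only when `color` is set.
fn render_error(message: &str, color: bool) -> String {
    if color {
        // Bold red "error", then reset.
        format!("\x1b[1;31merror\x1b[0m: {message}")
    } else {
        format!("error: {message}")
    }
}

/// Color is only appropriate when stderr is an interactive terminal,
/// not when it is redirected to a file such as `log.txt`.
fn stderr_supports_color() -> bool {
    io::stderr().is_terminal()
}
```

A caller would pass `stderr_supports_color()` as the `color` argument, so `2> log.txt` produces plain text.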
2024-01-04 16:43:44 +01:00
konsti 92c780ec2f
Run custom insta filters before generic filters (#781)
I've noticed some non-deterministic test failures when a temp dir looks
like a timestamp
(https://github.com/astral-sh/puffin/actions/runs/7410022542/job/20161416805).
Running the custom filters for e.g. the temp dirs before the generic
time filters should fix that.
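
A toy reproduction of the ordering problem, using plain string handling instead of the regex filters insta applies. The filter functions and the temp dir name are made up for this sketch.

```rust
/// Generic filter: replace every `<digits>ms` with `[TIME]`.
fn filter_times(input: &str) -> String {
    let chars: Vec<char> = input.chars().collect();
    let mut out = String::new();
    let mut i = 0;
    while i < chars.len() {
        if chars[i].is_ascii_digit() {
            let start = i;
            while i < chars.len() && chars[i].is_ascii_digit() {
                i += 1;
            }
            if i + 1 < chars.len() && chars[i] == 'm' && chars[i + 1] == 's' {
                out.push_str("[TIME]");
                i += 2;
            } else {
                // Digits not followed by `ms` are kept unchanged.
                out.extend(chars[start..i].iter());
            }
        } else {
            out.push(chars[i]);
            i += 1;
        }
    }
    out
}

/// Custom filter: replace the exact temp dir path with a placeholder.
fn filter_temp_dir(input: &str, temp_dir: &str) -> String {
    input.replace(temp_dir, "[TEMP_DIR]")
}
```

If the random temp dir happens to be `/tmp/.tmp42ms`, running `filter_times` first rewrites it to `/tmp/.tmp[TIME]`, and the temp dir filter no longer matches. Running `filter_temp_dir` first avoids that.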
2024-01-04 16:40:28 +01:00
Charlie Marsh b2230e7f4d
Make index URLs insensitive to trailing slashes (#771)
Closes https://github.com/astral-sh/puffin/issues/770.
2024-01-04 08:45:50 -05:00
konsti 7d6e6fae25
Requirement fixup for trailing comma after trailing quote (#776)
Fixup for
7349527ceadde8fc265a33e6a4e662/boto3-1.2.0-py2.py3-none-any.whl:

```
botocore>=1.3.0,<1.4.0',
```

Note that neither the quote nor the comma is right.
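
Roughly, the fixup amounts to dropping the stray `',` suffix before parsing. The helper below is a hypothetical sketch of that idea, not the actual lenient parser.

```rust
/// Strip a stray trailing `',` (quote followed by comma) from a requirement.
fn fixup_requirement(requirement: &str) -> &str {
    let trimmed = requirement.trim_end();
    trimmed.strip_suffix("',").unwrap_or(trimmed)
}
```

Well-formed requirements pass through unchanged.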
2024-01-04 08:45:41 -05:00
konsti 0c5ca1cdd8
Delete unused file (#772)
This is a duplicate that's not used anymore, probably a refactoring
artifact.
2024-01-04 11:32:12 +00:00
Zanie Blue e18a6a0c03
Include permalink to scenarios used to generate test cases (#767) 2024-01-03 20:41:14 -06:00
Zanie Blue 0d5252580c
Improve scenario update script (#759)
Following #757, improves the script for generating scenario test cases
with:

- A requirements file
- Support for downloading packse scenarios from GitHub dynamically
- Running rustfmt on the generated test file
- Updating snapshots / running tests
2024-01-03 20:13:11 -06:00
Charlie Marsh bf9e9daa39
Make editable installs their own test feature flag (#766)
For whatever reason these fail for me with mold, and it's not worth it
to me to disable the linker.
2024-01-03 20:33:22 -05:00
Charlie Marsh 252d53e83a
Make environment validation a `--strict` flag (#765)
I don't necessarily want users to pay this cost every time. We could
consider making this `true` by default.

Closes https://github.com/astral-sh/puffin/issues/763.
2024-01-04 01:29:06 +00:00
Charlie Marsh ae8c7d11e3
Use `create_venv_py312` in pip-uninstall tests (#764) 2024-01-04 01:16:13 +00:00
Charlie Marsh 286145bc7f
Add a dedicated error for missing RECORD files (#762)
Related to: https://github.com/astral-sh/puffin/issues/716
2024-01-04 00:28:50 +00:00
Charlie Marsh 2d1d6ac0dd
Add context and diagnostics for missing `METADATA` (#761)
Closes https://github.com/astral-sh/puffin/issues/717.
2024-01-03 19:09:23 -05:00
Zanie Blue 1f2112191f
Unpack scenario root requirements in test cases (#757)
As mentioned in #746, instead of just installing the scenario root we
will unpack the root dependencies into the install command to allow
better coverage of direct user requests with scenarios.

I added display of the package tree provided by each scenario.

Use a mustache template for iterative replacements.
2024-01-03 17:31:29 -06:00
Charlie Marsh 02b157085e
Add INSTALLER file to install-wheel-rs (#760)
See:
https://packaging.python.org/en/latest/specifications/recording-installed-packages/#the-installer-file
2024-01-03 17:30:54 -05:00
Zanie Blue c9f43e915c
Add packse scenario tests (#746)
Adds tests using packse test scenarios! Uses `test.pypi.org` as a
backing index.

Tests are generated by a simple Python script. Requires
https://github.com/zanieb/packse/pull/49.

This opens us to a slight attack surface, as we cannot force use of
`test.pypi.org` only and someone could register these package names on
the real `pypi.org` index with malicious content. I could publish these
packages there too.
2024-01-03 15:50:06 -06:00
Charlie Marsh aa75d264cd
Clean up the Puffin CLI (#755)
- Rename to `puffin pip-freeze` for consistency.
- Add a `virtualenv` alias to `venv`.
- Hide the `add` and `remove` commands.
2024-01-03 21:22:06 +00:00
Charlie Marsh cfffcbb269
Cancel waiting tasks on resolution error (#753)
## Summary

I don't understand why this works (because I don't understand why it's
erroring) but it does. See:
https://github.com/astral-sh/puffin/pull/746#issuecomment-1875722454.

## Test Plan

```
cargo run --bin puffin pip-install requires-transitive-incompatible-with-transitive-8329cfc0 --extra-index-url https://test.pypi.org/simple -n
```
2024-01-03 20:18:27 +00:00
Charlie Marsh a8e52d2899
Split `resolver.rs` into a module (#752)
This is just getting hard to navigate. No code changes, just moving
stuff around.
2024-01-03 14:02:30 -05:00
Charlie Marsh 48c7359622
Always simplify dependency sets (#748)
`simplify_set` can itself simplify to the full range, so it seems like
we should be checking if the set is `Range::full` _after_ simplifying
rather than before.
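
To illustrate why the order matters, here is a toy model. This `Range` is a stand-in with integer versions and is not PubGrub's type: simplifying drops any bound that excludes none of the known versions, so a bounded range can become full only after simplification.

```rust
/// Toy version range: optional inclusive lower and exclusive upper bound.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Range {
    min: Option<u32>,
    max_exclusive: Option<u32>,
}

impl Range {
    /// A range without bounds admits every version.
    fn is_full(&self) -> bool {
        self.min.is_none() && self.max_exclusive.is_none()
    }

    /// Drop any bound that excludes none of the known versions.
    fn simplify(&self, known: &[u32]) -> Range {
        let min = self.min.filter(|&min| known.iter().any(|&v| v < min));
        let max_exclusive = self
            .max_exclusive
            .filter(|&max| known.iter().any(|&v| v >= max));
        Range { min, max_exclusive }
    }
}
```

With known versions `[1, 2, 3]`, the range `>=1` is not full before simplifying but is full afterwards, so a check made before `simplify` would miss it.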
2024-01-03 13:21:03 -05:00
Charlie Marsh 607a5bee6d
Use `register_owned` in prefetch path (#750) 2024-01-03 17:31:23 +00:00
Charlie Marsh fd556ccd44
Model Python version as a PubGrub package (#745)
## Summary

This PR modifies the resolver to treat the Python version as a package,
which allows for better error messages (since we no longer treat
incompatible packages as if they "don't exist at all").

There are a few tricky pieces here...

First, we need to track both the interpreter's Python version and the
_target_ Python version, because we support resolving for other versions
via `--python 3.7`.

Second, we allow using incompatible wheels during resolution, as long as
there's a compatible source distribution. So we still need to test for
`requires-python` compatibility when selecting distributions.

This could use more testing, but it feels like an area where `packse`
would be more productive than writing PyPI tests.

Closes https://github.com/astral-sh/puffin/issues/406.
2024-01-03 15:20:45 +00:00
Charlie Marsh 5a98add54e
Always pre-fetch distribution metadata (#744)
This PR fixes our prefetching logic to ensure that we always attempt to
prefetch the "best-guess" distribution for all dependencies. This logic
already existed, but because we only attempted to prefetch when package
metadata was available, it almost never triggered. Now, we wait for the
package metadata to become available, _then_ kick off the "best-guess"
prefetch (for every package).

In my testing, this dramatically improves performance (like 2x). I'm
wondering if this regressed at some point?

Closes #743.

Co-authored-by: konsti <konstin@mailbox.org>
2024-01-03 11:37:45 +01:00
Charlie Marsh ba23115465
Remove some package clones (#749) 2024-01-02 23:21:46 -05:00
Charlie Marsh 94076d6000
Use dependency package when simplifying dependency set (#747)
This manifested itself here:
https://github.com/astral-sh/puffin/pull/745/files#r1439912440.
2024-01-02 20:26:56 -06:00
konsti 26f597a787
Add spans to all significant tasks (#740)
I've tried to investigate puffin's performance with respect to builds and
parallelism in general, but found the previous instrumentation too
granular. I've tried to add spans to every function that either needs
noticeable io or cpu resources without creating duplication. This also
fixes some wrong tracing usage on async functions
(https://docs.rs/tracing/latest/tracing/struct.Span.html#in-asynchronous-code)
and some spans that weren't actually entered.
2024-01-02 16:17:03 +00:00
konsti a3d8b3d9ca
Don't install incompatible path and url wheels (#739)
Add early tag checking for path and url wheels.

This check is not applied during resolution, for consistency with index wheels.
2024-01-02 15:00:50 +00:00
konsti cd43708369
Flag to force latest version in resolve-many (#741)
Also fixes color when redirecting puffin-dev to a log file.
2024-01-02 11:04:26 +00:00
konsti 35f6ea204b
Remove Box::pin usages (#738)
Rust 1.75 update follow-up, simplifies the code.
2023-12-29 15:49:12 +00:00
Charlie Marsh 2cfa4a3574
Add a dedicated error message to hint users towards enabling pre-releases (#697)
This PR adds a dedicated error message for resolutions that fail, but
might've succeeded if pre-releases were allowed. Specifically, if we see
a failed resolution, and failed to find a version for a package that
included a pre-release marker, we add a hint nudging the user to
explicitly enable all pre-releases.

We'd prefer a solution like
https://github.com/astral-sh/puffin/pull/666, but believe that it will
break some assumptions in PubGrub, so this is the lighter-weight
solution.

Closes https://github.com/astral-sh/puffin/issues/659.
2023-12-28 21:44:35 -05:00
konsti 2d4cb1ebf2
Rust 1.75 (#736)
The `async fn` and return-position `impl Trait` in traits improve
`BuildContext` ergonomics. The traits use `impl Future` over `async fn`
to make the send bound explicit
(https://blog.rust-lang.org/2023/12/21/async-fn-rpit-in-traits.html).

The remaining changes are due to clippy.
2023-12-28 16:08:35 -04:00
konsti 7bf2790a25
Remove all quotes from (lenient) version specifiers (#735)
Found in
https://pypi.org/simple/algoliasearch/?format=application/vnd.pypi.simple.v1+json
and
https://pypi.org/simple/okta/?format=application/vnd.pypi.simple.v1+json
2023-12-28 14:40:42 +00:00
konsti 0ebff943e4
Finish install-many with pypi 10k most dependents (#732)
This PR combines three small changes to finish up the install-many
testing.

* Download pypi_10k_most_dependents.txt in the script: I'd like to have the
setup process of the large-scale checks automated.
* Some install-many dev script improvements.
* Fix mkl_fft-1.3.6-58-cp310-cp310-manylinux2014_x86_64.whl: the wheel has
multiple Wheel-Version entries, which we have to ignore like pip does.

Apart from the mkl-fft fix, the only other errors I've seen showing up
are
https://github.com/astral-sh/puffin/issues/520#issuecomment-1869625642.
2023-12-27 09:42:51 -05:00
Charlie Marsh 007f52bb4e
Add support for relative URLs in simple metadata responses (#721)
## Summary

This PR adds support for relative URLs in the simple JSON responses. We
already support relative URLs for HTML responses, but the handling has
been consolidated between the two. Similar to index URLs, we now store
the base alongside the metadata, and use the base when resolving the
URL.

Closes #455.

## Test Plan

`cargo test` (to test HTML indexes). Separately, I also ran `cargo run
-p puffin-cli -- pip-compile requirements.in -n
--index-url=http://localhost:3141/packages/pypi/+simple` on the
`zb/relative` branch with `packse` running, and forced both HTML and
JSON by limiting the `accept` header.
2023-12-27 08:53:21 -05:00
Charlie Marsh ae83a74309
Review feedback for HTML indexes (#733)
See: https://github.com/astral-sh/puffin/pull/719
2023-12-26 21:57:20 +00:00
Charlie Marsh bbe0246205
Change internal representation of `CacheEntry` to avoid allocations (#730)
Removes a TODO.
2023-12-26 02:10:30 +00:00
Charlie Marsh 188ab75769
Split `File` into internal and external type (#729)
## Summary

This PR makes the `pypi_types::File` a response-only type (i.e., a type
that's only used when deserializing over the wire), and adds a separate
internal `File` type. Right now, the representations are similar, but
already, we can avoid the "lenient" deserialization on our internal
`File` type, and avoid the special-casing of the property names that's
required in the JSON. Over time, we can evolve this representation
entirely separately from the representation we receive from PyPI and
other indexes.
2023-12-25 15:42:28 -05:00
Charlie Marsh 6ff21374dc
Split `puffin-cache` into Puffin-specific and generic utilities (#728)
This crate started off as generic caching utilities, but we started
adding a lot of Puffin-specific stuff (like the cache buckets
abstraction that knows about Git vs. direct URL vs. indexes and so on).
This PR moves the generic stuff into a new `cache-key` crate.
2023-12-25 14:38:56 +00:00
Charlie Marsh 187ccef4e1
Cache `Tags` on `Interpreter` (#726) 2023-12-25 13:41:10 +00:00
Charlie Marsh 5b2e381f87
Remove `platform-tags` dependency on `puffin-interpreter` (#725)
Cuts off a large internal dependency chain from what is otherwise a very
general crate.
2023-12-24 23:06:50 +00:00
Charlie Marsh ad34bb02a9
Modify some inconsistent exports (#724) 2023-12-24 22:30:03 +00:00
Charlie Marsh 343880820b
Un-escape HTML entities when decoding (#723)
I don't have a good testing strategy here (I'm manually testing against
`devpi` via `packse`), but the HTML index uses (e.g.)
`data-requires-python="&gt;=3.8"`, so we need to decode.
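
A minimal sketch of the decoding, covering only the handful of named entities likely to show up in a `data-requires-python` attribute. `unescape_html` is a hypothetical helper; a real implementation would more likely use an HTML-aware library.

```rust
/// Decode a small set of HTML entities in an attribute value.
fn unescape_html(input: &str) -> String {
    input
        .replace("&lt;", "<")
        .replace("&gt;", ">")
        .replace("&quot;", "\"")
        .replace("&#39;", "'")
        // Decode `&amp;` last so that `&amp;gt;` becomes `&gt;`, not `>`.
        .replace("&amp;", "&")
}
```

This turns `&gt;=3.8` into the specifier `>=3.8`.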
2023-12-24 16:35:45 -05:00
Charlie Marsh 2d721a497e
Add a `SimpleHttp` abstraction similar to `SimpleJson` (#722)
Just an internal refactor to turn some standalone functions into
associated methods (and reduce the diff in the next PR).
2023-12-24 20:55:57 +00:00
konsti e23292641f
Add pypi 10k packages with most dependents dataset (#711)
From manual inspection, this dataset generated through the [libraries.io
API](https://libraries.io/api#project-search) seems more mainstream than
the current 8k one, which is also preserved. I've added the dataset to
the repo because the API requires an API key.
2023-12-24 18:31:52 +00:00
Charlie Marsh 5bce699ee1
Add support for HTML indexes (#719)
## Summary

This PR adds support for HTML index responses (as with
`--index-url=https://download.pytorch.org/whl`).

Closes https://github.com/astral-sh/puffin/issues/412.
2023-12-24 16:04:00 +00:00
Charlie Marsh 9e6cb706a0
Update test fixtures (#720) 2023-12-24 15:50:10 +00:00
konsti b7ad97a823
Show resource and lockfile when waiting (#715)
We lock git checkout directories and the virtualenv to avoid two puffin
instances running in parallel changing files at the same time and
leading to a broken state. When one instance is blocking another, we
need to inform the user (why is the program hanging?) and also add some
information for them to debug the situation.

The new messages will print

```
Waiting to acquire lock for /home/konsti/projects/puffin/.venv (lockfile: /home/konsti/projects/puffin/.venv/.lock)
```

or

```
Waiting to acquire lock for git+https://github.com/pydantic/pydantic-extra-types@0ce9f207a1e09a862287ab77512f0060c1625223 (lockfile: /home/konsti/projects/puffin/cache-all-kinds/git-v0/locks/f157fd329a506a34)
```

The messages aren't perfect but clear enough to see what the contention
is and in the worst case to delete the lockfile.

Fixes #714
2023-12-21 00:05:49 +01:00
konsti e60f0ec732
Update pubgrub (#713)
Easier than I expected: we simply never construct the pubgrub error
variants since we have our own main loop. The `unreachable!()`s can be
removed when the never type (`!`) is stabilized.
2023-12-20 23:56:59 +01:00
Zanie Blue e705267dac
Fix fallback download when index does not support HTTP range requests (#702)
Otherwise, when a server does not support HTTP range requests we throw
an error instead of downloading without range requests.

---------

Co-authored-by: konstin <konstin@mailbox.org>
2023-12-20 10:55:23 +00:00
Zanie Blue 665a59dae6
Fix deserialization of index response when `requires_python` field is missing (#708)
Closes https://github.com/astral-sh/puffin/issues/707
2023-12-20 11:53:48 +01:00
Zanie Blue 4e437ba7e5
Allow the default index url to be configured with `PUFFIN_INDEX_URL` (#704)
This allows the default index URL to be easily overridden with a local
index, e.g. a `packse` server:

```
export PUFFIN_INDEX_URL="http://localhost:3141/packages/all/+simple"
```
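
The lookup can be sketched like this. The default URL constant and the choice to treat an empty value as unset are assumptions for illustration; the real CLI may wire the variable up differently.

```rust
use std::env;

/// Assumed default index for this sketch.
const DEFAULT_INDEX_URL: &str = "https://pypi.org/simple";

/// Pick the override when present and non-empty, else the default.
fn index_url_from(value: Option<String>) -> String {
    match value {
        Some(url) if !url.is_empty() => url,
        _ => DEFAULT_INDEX_URL.to_string(),
    }
}

/// Read the override from the environment.
fn index_url() -> String {
    index_url_from(env::var("PUFFIN_INDEX_URL").ok())
}
```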
2023-12-20 11:52:00 +01:00
Zanie Blue ab15b08cbe
Perform 3 retries by default instead of 0 on failed index requests (#710)
As a user, I'd expect retries to occur by default.

We should also expose this via a setting probably.
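
For clarity on what "3 retries" means here: one initial attempt plus up to three more. A generic sketch of that policy, without backoff and not tied to the HTTP client actually used:

```rust
/// Run `request`, retrying up to `max_retries` times after a failure.
fn with_retries<T, E>(
    max_retries: u32,
    mut request: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match request() {
            Ok(value) => return Ok(value),
            // Failure with retries left: count it and try again.
            Err(_) if attempt < max_retries => attempt += 1,
            // Out of retries: surface the last error.
            Err(err) => return Err(err),
        }
    }
}
```

With `max_retries = 3`, a request that always fails is attempted four times in total.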
2023-12-20 11:51:24 +01:00
konsti 9f8b7e7e12
Refactor `DistFinder` to allow handling errors (#709)
For the install tests, I need the ability to ignore failures in the
`DistFinder`. To avoid just copy-pasting a version that collects errors
separately, I followed
https://gendignoux.com/blog/2021/04/01/rust-async-streams-futures-part1.html
and switched the custom channel over to an async stream yielding
`Result` items.

I like that async streams mirror the normal iterator API.
2023-12-20 04:07:55 +00:00
Zanie Blue 12eedb1c12
Include `Accept` header specifying that we can only parse JSON responses (#701)
Otherwise, when an index does not support the query variable we get an
HTML response and a JSON parse error.
2023-12-19 12:22:53 -06:00
Zanie Blue 52ba65aa9c
Derive `Debug` for `CachedClientError` (#703)
Discovered while debugging https://github.com/astral-sh/puffin/pull/702
2023-12-19 12:22:39 -06:00
Andrew Gallant aa9f47bbde
improve tests for version parser (#696)
The high level goal here is to improve the tests for the version parser.
Namely, we now check not just that version strings parse successfully,
but that they parse to the expected result.

We also do a few other cleanups. Most notably, `Version` is now an
opaque type so that we can more easily change its representation going
forward.

Reviewing commit-by-commit is suggested. :-)
2023-12-19 12:25:32 -05:00
Charlie Marsh 6f90edda78
Reduce visibility of `PubGrubReportFormatter` (#699) 2023-12-19 08:53:38 -06:00
konsti 9e2bbee7f0
Name the directory whose lock we're waiting on (#700) 2023-12-19 12:19:27 +00:00
konsti 114548d945
Test that cache errors are non-fatal (#685)
The test creates a cache from multiple sources and injects faults (once
using invalid data and once by making the files unreadable on the fs
level), then resolves again.

I didn't test git because it has its own locking and correctness logic.

The main drawback is that this test is slow (2.5s for me); we could
`#[ignore]` it.
2023-12-19 12:02:49 +00:00
Charlie Marsh 878bb5c035
Remove remaining snapshot files from resolver test (#698) 2023-12-19 05:41:50 +00:00
Charlie Marsh 3660d8a08e
Introduce separate traits for ahead-of-time and installed metadata (#692)
This is a pure refactor to follow-up #690, to separate the metadata that
we know upfront about distributions (like the version, for
registry-based distributions) vs. the metadata that requires building
(like the version, for URL-based distributions).
2023-12-18 22:37:45 +00:00
Charlie Marsh 31afb39a10
Show URLs and version together for installed, URL-based dependencies (#690)
The snapshot test changes will give you a sense for the impact of the
change and the output formatting.

Closes https://github.com/astral-sh/puffin/issues/686.
2023-12-18 22:21:37 +00:00
Charlie Marsh 365c860e27
Show fully-resolved URLs in non-resolution contexts (#689)
We now show the fully-resolved URL, rather than the URL as given by the
user, _everywhere_ except for the output resolution file (which should
retain relative paths, unexpanded environment variables, etc.).

Closes https://github.com/astral-sh/puffin/issues/687.
2023-12-18 22:10:24 +00:00
konsti 43c837f7bb
Show enum defaults in `--help` output (#693)
With `Option<T>` and `.unwrap_or_default()` later, the default of `T`
isn't shown in the help output.

Old:

```
      --link-mode <LINK_MODE>
          The method to use when installing packages from the global cache

          Possible values:
          - clone:    Clone (i.e., copy-on-write) packages from the wheel into the site packages
          - copy:     Copy packages from the wheel into the site packages
          - hardlink: Hard link packages from the wheel into the site packages

      -q, --quiet
      Do not print any output

      --resolution <RESOLUTION>
          Possible values:
          - highest:       Resolve the highest compatible version of each package
          - lowest:        Resolve the lowest compatible version of each package
          - lowest-direct: Resolve the lowest compatible version of any direct dependencies, and the highest compatible version of any transitive dependencies

      --prerelease <PRERELEASE>
          Possible values:
          - disallow:                 Disallow all pre-release versions
          - allow:                    Allow all pre-release versions
          - if-necessary:             Allow pre-release versions if all versions of a package are pre-release
          - explicit:                 Allow pre-release versions for first-party packages with explicit pre-release markers in their version requirements
          - if-necessary-or-explicit: Allow pre-release versions if all versions of a package are pre-release, or if the package has an explicit pre-release marker in its version requirements
```

![Screenshot from 2023-12-18
21-04-16](https://github.com/astral-sh/puffin/assets/6826232/6b3cb47a-f224-408a-8d7a-186ebeb88ecd)

New:

```
      --link-mode <LINK_MODE>
          The method to use when installing packages from the global cache

          [default: hardlink]

          Possible values:
          - clone:    Clone (i.e., copy-on-write) packages from the wheel into the site packages
          - copy:     Copy packages from the wheel into the site packages
          - hardlink: Hard link packages from the wheel into the site packages

  -q, --quiet
          Do not print any output

      --resolution <RESOLUTION>
          [default: highest]

          Possible values:
          - highest:       Resolve the highest compatible version of each package
          - lowest:        Resolve the lowest compatible version of each package
          - lowest-direct: Resolve the lowest compatible version of any direct dependencies, and the highest compatible version of any transitive dependencies

      --prerelease <PRERELEASE>
          [default: if-necessary-or-explicit]

          Possible values:
          - disallow:                 Disallow all pre-release versions
          - allow:                    Allow all pre-release versions
          - if-necessary:             Allow pre-release versions if all versions of a package are pre-release
          - explicit:                 Allow pre-release versions for first-party packages with explicit pre-release markers in their version requirements
          - if-necessary-or-explicit: Allow pre-release versions if all versions of a package are pre-release, or if the package has an explicit pre-release marker in its version requirements
```


![image](https://github.com/astral-sh/puffin/assets/6826232/26c2c391-d959-4769-999d-481b3f179502)
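
The underlying issue, in plain Rust: when the argument is an `Option<T>`, the default only exists inside `T`'s `Default` impl and is applied late, so the help generator never sees it. The enum below mirrors the `--link-mode` values, but `effective_link_mode` is a hypothetical helper for this sketch.

```rust
/// Mirrors the `--link-mode` values from the help output.
#[derive(Debug, Default, PartialEq, Eq)]
enum LinkMode {
    Clone,
    Copy,
    #[default]
    Hardlink,
}

/// The default is applied only here, after parsing, which is why a
/// help generator looking at `Option<LinkMode>` cannot display it.
fn effective_link_mode(flag: Option<LinkMode>) -> LinkMode {
    flag.unwrap_or_default()
}
```

Declaring the default on the argument itself, rather than via `unwrap_or_default()`, is what makes `[default: hardlink]` show up.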
2023-12-18 21:50:47 +00:00
Charlie Marsh 98fcb76015
Lock entire virtualenv during modifying commands (#695)
These commands all assume that the `site-packages` are constant
throughout.

Closes #691.
2023-12-18 16:44:45 -05:00
Charlie Marsh 207bb83a1c
Rename puffin-warnings macros to avoid tracing collision (#694)
Also more consistent with Ruff.
2023-12-18 21:33:21 +00:00
Charlie Marsh e98804141c
Re-add tf-models-nightly filter in `resolve_many.rs` (#688)
I accidentally resolved this in a prior PR.
2023-12-18 16:56:04 +00:00
Charlie Marsh dbf055fe6f
Use borrowed data in `BuildDispatch` (#679)
This PR uses borrowed data in `BuildDispatch` which makes creating a
`BuildDispatch` extremely cheap (only one allocation, for the Python
executable). I can be talked out of this; it will have no measurable
impact.
2023-12-18 16:43:03 +00:00
Charlie Marsh c400ab7d07
Add support for `file://` URLs in editable requirements (#680) 2023-12-18 14:55:37 +00:00
Charlie Marsh 74ca9128b4
Canonicalize virtualenv path once (#678)
This avoids filesystem calls when creating a `BuildDispatch`.

Co-authored-by: konsti <konstin@mailbox.org>
2023-12-18 14:42:58 +00:00
konsti 89ca0d68b9
`exclude_newer` in puffin-dev resolve-cli (#684)
Internal dev tool change.
2023-12-18 14:06:54 +00:00
konsti 7926749296
Fixup for `>=2.7,!=3.0.*,!=3.1.*,<3.4.*` (#683)
Found in
https://pypi.org/simple/wincertstore/?format=application/vnd.pypi.simple.v1+json
2023-12-18 12:56:48 +00:00
konsti f4f67ebde0
Rebase: Uninstall existing non-editable versions when installing editable requirements bug (#682)
Separate branch for rebasing #677 onto main because I don't trust the
rebase enough to force push.

Closes #677.

---

If you install `black` from PyPI, then `-e ../black`, we need to
uninstall the existing `black`. This sounds simple, but that in turn
requires that we _know_ `-e ../black` maps to the package `black`, so
that we can mark it for uninstallation in the install plan. This, in
turn, means that we need to build editable dependencies prior to the
install plan.

This is just a bunch of reorganization to fix that specific bug
(installing multiple versions of `black` if you run through the above
workflow): we now run through the list of editables upfront, mark those
that are already installed, build those that aren't, and then ensure
that `InstallPlan` correctly removes those that need to be removed, etc.

Closes #676.

Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2023-12-18 09:28:14 +00:00
Charlie Marsh 0bb2c92246
Add editable install support to `pip-install` (#675)
Per the title: adds support for `-e` installs to `puffin pip-install`.
There were some challenges here around threading the editable installs
to the right places. Namely, we want to build _once_, then reuse the
editable installs from the resolution. At present, we were losing the
`editable: true` flag on the `Dist` that came back through the
resolution, so it required some changes to the resolver.

Closes https://github.com/astral-sh/puffin/issues/672.
2023-12-18 09:52:32 +01:00
konsti 8c6463d220
Allow identical `VIRTUAL_ENV` and `CONDA_PREFIX` env vars (#681)
Port of https://github.com/PyO3/maturin/pull/1879 for
https://github.com/PyO3/maturin/issues/1878
2023-12-18 08:42:31 +00:00
Charlie Marsh 77c6e6fa6c
Add support for `reinstall` to editable packages (#674)
Closes https://github.com/astral-sh/puffin/issues/673.
2023-12-17 15:41:57 +00:00
Charlie Marsh 00e1c33af4
Add an editable index to the site-packages registry (#671)
This PR modifies `SitePackages` to store all distributions in a flat
vector, and maintain two indexes (hash maps) from "per-element data for
an element in the vector" to "index of that element". This enables us to
maintain a map on both package name and editable URL.
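
A minimal sketch of the layout described above, assuming a simplified `InstalledDist` type (hypothetical; the real distribution type carries more data). All distributions live in one flat vector, and each hash map only stores positions in that vector:

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an installed distribution.
#[derive(Debug, Clone, PartialEq)]
pub struct InstalledDist {
    pub name: String,
    pub version: String,
    /// Set only for editable installs.
    pub editable_url: Option<String>,
}

/// Flat storage plus two indexes that map a key to a position in `dists`.
#[derive(Debug, Default)]
pub struct SitePackages {
    dists: Vec<InstalledDist>,
    by_name: HashMap<String, usize>,
    by_url: HashMap<String, usize>,
}

impl SitePackages {
    pub fn insert(&mut self, dist: InstalledDist) {
        // The new element will live at the current end of the vector.
        let idx = self.dists.len();
        self.by_name.insert(dist.name.clone(), idx);
        if let Some(url) = &dist.editable_url {
            self.by_url.insert(url.clone(), idx);
        }
        self.dists.push(dist);
    }

    pub fn get_by_name(&self, name: &str) -> Option<&InstalledDist> {
        self.by_name.get(name).map(|&idx| &self.dists[idx])
    }

    pub fn get_by_editable_url(&self, url: &str) -> Option<&InstalledDist> {
        self.by_url.get(url).map(|&idx| &self.dists[idx])
    }
}
```

Storing indexes rather than references keeps the struct free of self-borrows, at the cost of having to rebuild the maps if elements are ever removed.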
2023-12-17 03:44:36 +00:00
Charlie Marsh 08edd173db
Add support for editable packages in `pip-uninstall` (#670) 2023-12-17 02:56:37 +00:00
konsti f059c6e6a6
Support editable in pip-sync and pip-compile (#587)
Support `-e path/to/dir` in pip-sync and pip-compile.
2023-12-16 22:37:34 +00:00
Charlie Marsh f62458f600
Add explicit error message for URLs without package names (#669)
`pip` supports installing packages without names (e.g.,
`git+https://github.com/pallets/flask.git`), but it doesn't adhere to
the PEP grammar, and we don't yet support it (and may never) (#313).

This PR adds a dedicated error path for such cases, to ensure that we
can give meaningful feedback to the user:

```
error: Couldn't parse requirement in requirements.in position 0 to 18
  Caused by: URL requirement is missing a package name; expected: `package_name @ https://google.com`
https://google.com
^^^^^^^^^^^^^^^^^^
```

Closes https://github.com/astral-sh/puffin/issues/650.
2023-12-16 21:14:34 +00:00
konsti 71964ec7a8
Switch to msgpack in the cached client (#662)
This gives a 1.23x speedup on transformers-extras. We could change to
msgpack for the entire cache if we want. I only tried this format and
postcard so far, where postcard was much slower (like 1.6s).

I don't actually want to merge it like this; I wanted to figure out the
ballpark improvement from switching away from JSON.

```
hyperfine --warmup 3 --runs 10 "target/profiling/puffin pip-compile --cache-dir cache-msgpack scripts/requirements/transformers-extras.in" "target/profiling/branch pip-compile scripts/requirements/transformers-extras.in"
Benchmark 1: target/profiling/puffin pip-compile --cache-dir cache-msgpack scripts/requirements/transformers-extras.in
  Time (mean ± σ):     179.1 ms ±   4.8 ms    [User: 157.5 ms, System: 48.1 ms]
  Range (min … max):   174.9 ms … 188.1 ms    10 runs

Benchmark 2: target/profiling/branch pip-compile scripts/requirements/transformers-extras.in
  Time (mean ± σ):     221.1 ms ±   6.7 ms    [User: 208.1 ms, System: 46.5 ms]
  Range (min … max):   213.5 ms … 235.5 ms    10 runs

Summary
  target/profiling/puffin pip-compile --cache-dir cache-msgpack scripts/requirements/transformers-extras.in ran
    1.23 ± 0.05 times faster than target/profiling/branch pip-compile scripts/requirements/transformers-extras.in
```

Disadvantage: We can't manually look into the cache anymore to debug
things.

- [ ] Check more formats; I currently only tested JSON, msgpack and
postcard, and there should be other formats, too
- [x] Switch over `CachedByTimestamp` serialization (for the interpreter
caching)
- [x] Switch over error handling and make sure puffin is still resilient
to cache failure
2023-12-16 21:01:35 +00:00
Charlie Marsh e4673a0c52
Modify PEP 508 cursor to use byte offsets (#668)
This enables us to remove a number of allocations (in particular,
`peek_while` and `take_while` no longer allocate). It also makes it
trivial to move the cursor to a new location, since you can just slice
and call `.chars()`. At present, moving to a new location would require
converting the iterator to a string, then back to a character iterator.
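
To illustrate the idea, here is a rough sketch of a byte-offset cursor. The method names `peek_while` and `take_while` come from the description above; the `pos` and `seek` methods and the exact signatures are assumptions, not the crate's actual API:

```rust
/// A cursor that tracks a byte offset into the input instead of owning a
/// character iterator.
pub struct Cursor<'a> {
    input: &'a str,
    pos: usize,
}

impl<'a> Cursor<'a> {
    pub fn new(input: &'a str) -> Self {
        Self { input, pos: 0 }
    }

    /// Byte offset of the next unread character.
    pub fn pos(&self) -> usize {
        self.pos
    }

    /// Returns the longest prefix whose characters satisfy `f`, without
    /// consuming it. The result is a slice of the input, so nothing is allocated.
    pub fn peek_while(&self, f: impl Fn(char) -> bool) -> &'a str {
        let rest = &self.input[self.pos..];
        let len = rest.find(|c: char| !f(c)).unwrap_or(rest.len());
        &rest[..len]
    }

    /// Like `peek_while`, but advances the cursor past the returned slice.
    pub fn take_while(&mut self, f: impl Fn(char) -> bool) -> &'a str {
        let taken = self.peek_while(f);
        self.pos += taken.len();
        taken
    }

    /// Moving the cursor only means storing a new offset.
    pub fn seek(&mut self, pos: usize) {
        assert!(self.input.is_char_boundary(pos));
        self.pos = pos;
    }
}
```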
2023-12-15 22:05:28 +00:00
Charlie Marsh 875c9a635e
Rename `CharIter` to `Cursor` (#667)
This better aligns with the analogous struct that we have in Ruff.
2023-12-15 21:57:59 +00:00
konsti 620f73b38b
Speed up version parsing for a 1.27±0.03 speedup in transformers-extras with conservative changes (#660)
Two low-hanging fruits as optimizations for version parsing: A fast path
for release only versions and removing the regex from version specifiers
(still calling into version's parsing regex if required). This enables
optimizing the serde format since we now see the serde part instead of
only PEP 440 parsing. I intentionally didn't rewrite the full PEP 440 at
this step.

```console
$ hyperfine --warmup 5 --runs 50 "target/profiling/puffin pip-compile scripts/requirements/transformers-extras.in" "target/profiling/main pip-compile scripts/requirements/transformers-extras.in"
  Benchmark 1: target/profiling/puffin pip-compile scripts/requirements/transformers-extras.in
    Time (mean ± σ):     217.1 ms ±   3.2 ms    [User: 194.0 ms, System: 55.1 ms]
    Range (min … max):   211.0 ms … 228.1 ms    50 runs

  Benchmark 2: target/profiling/main pip-compile scripts/requirements/transformers-extras.in
    Time (mean ± σ):     276.7 ms ±   5.7 ms    [User: 252.4 ms, System: 54.6 ms]
    Range (min … max):   268.9 ms … 303.5 ms    50 runs

  Summary
    target/profiling/puffin pip-compile scripts/requirements/transformers-extras.in ran
      1.27 ± 0.03 times faster than target/profiling/main pip-compile scripts/requirements/transformers-extras.in
```
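
A sketch of what a release-only fast path can look like. This is an illustration under simplifying assumptions (the function name and return type are hypothetical), not the code from this PR: anything that is not purely dot-separated digits returns `None`, and the caller would fall back to the full PEP 440 parser.

```rust
/// Fast path: parse a release-only version such as `1.2.3`.
/// Returns `None` for anything else (epochs, pre/post/dev releases, local
/// versions), so the caller can fall back to the slower, complete parser.
pub fn parse_release_only(version: &str) -> Option<Vec<u64>> {
    if version.is_empty() {
        return None;
    }
    let mut release = Vec::new();
    for segment in version.split('.') {
        // Reject empty segments and anything that is not purely ASCII digits.
        if segment.is_empty() || !segment.bytes().all(|b| b.is_ascii_digit()) {
            return None;
        }
        // `parse` can still fail on overflow, which also takes the slow path.
        release.push(segment.parse::<u64>().ok()?);
    }
    Some(release)
}
```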

---------

Co-authored-by: Andrew Gallant <andrew@astral.sh>
2023-12-15 14:03:35 -05:00
Charlie Marsh 305b9b080a
Show resolution error once on pip-install failure (#665)
Closes https://github.com/astral-sh/puffin/issues/664.
2023-12-15 18:43:23 +00:00
Charlie Marsh 47290f784e
Add fixup for invalid double quotes (#663)
Closes https://github.com/astral-sh/puffin/issues/658.
2023-12-15 18:11:22 +00:00
Charlie Marsh 9470c20e7a
Avoid double resolution during source builds (#656)
## Summary

This PR ensures that we re-use the resolution to install the build
dependencies when building a source distribution. Currently, we only
pass along the list of requirements, and then use the `Finder` to map
each requirement to a distribution. But we already determine the correct
distribution when resolving!

Closes https://github.com/astral-sh/puffin/issues/655.
2023-12-15 17:27:16 +00:00
Charlie Marsh 1129661a22
Ignore missing manifest entries in the built wheel cache (#654)
## Summary

This is more of a hypothetical problem, but the cache manifest could in
theory get out-of-sync with the contents on disk. This PR modifies the
`BuiltWheelMetadata` lookup to warn (but not fail) if the manifest
includes a wheel that no longer exists on disk. You can mimic this by
removing a wheel from the `built-wheels-v0` cache without modifying the
manifest correspondingly.
2023-12-15 17:24:09 +00:00
Charlie Marsh 84093773ef
Store source distribution sources in the cache (#653)
## Summary

This PR modifies `source_dist.rs` to store source distributions (from
remote URLs) in the cache. The cache structure for registries now looks
like:

<img width="1053" alt="Screen Shot 2023-12-14 at 10 43 43 PM"
src="https://github.com/astral-sh/puffin/assets/1309177/3c2dbf6b-5926-41f2-b69b-74031741aba8">

(I will update the docs prior to merging, if approved.)

The benefit here is that we can reuse the source distribution (avoid
download + unzipping it) if we need to build multiple wheels. In the
future, it will be even more relevant, since we'll need to reuse the
source distribution to support
https://github.com/astral-sh/puffin/issues/599.

I also included some misc. refactors to DRY up repeated operations and
add some more abstraction to `source_dist.rs`.
2023-12-15 17:19:33 +00:00
Charlie Marsh a361ccfbb3
Remove additional metadata call in `source_dist.rs` (#652) 2023-12-14 19:45:31 +00:00
Charlie Marsh 22c7057b35
Expand environment variables in URLs (#640)
## Summary

This PR enables users to express relative dependencies via environment
variables. Like pip, PDM, Hatch, Rye, and others, we now allow users to
express dependencies like:

```text
flask @ file://${PROJECT_ROOT}/flask-3.0.0-py3-none-any.whl
```

In the compiled requirements file, we'll also preserve the unexpanded
environment variable.
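
As a rough illustration of the expansion step, the sketch below replaces `${NAME}` references using an injected lookup function. The function name is hypothetical, and leaving unset or unterminated references untouched is an assumption of this sketch, not necessarily what the PR does:

```rust
/// Expands `${NAME}` references using `lookup`. Unknown or unterminated
/// references are left untouched (an assumption of this sketch).
pub fn expand_env_vars(input: &str, lookup: impl Fn(&str) -> Option<String>) -> String {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(start) = rest.find("${") {
        // Copy everything before the reference as-is.
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find('}') {
            Some(end) => {
                let name = &after[..end];
                match lookup(name) {
                    Some(value) => out.push_str(&value),
                    // Keep `${NAME}` verbatim if the variable is unset.
                    None => out.push_str(&rest[start..start + 2 + end + 1]),
                }
                rest = &after[end + 1..];
            }
            None => {
                // No closing brace: emit the remainder unchanged.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

In real use, the lookup would typically be `|name| std::env::var(name).ok()`; injecting it keeps the function easy to test.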

Closes https://github.com/astral-sh/puffin/issues/592.
2023-12-14 15:09:12 +00:00
Charlie Marsh ed8dfbfcf7
Preserve verbatim URLs (#639)
## Summary

This PR adds a `VerbatimUrl` struct to preserve verbatim URLs throughout
the resolution and installation pipeline. In short, alongside the parsed
`Url`, we also keep the URL as written by the user. This enables us to
display the URL exactly as written by the user, rather than the
serialized path that we use internally.

This will be especially useful once we start expanding environment
variables since, at that point, we'll be able to write the version of
the URL that includes the _unexpanded_ environment variable to the
output file.
2023-12-14 15:03:39 +00:00
Charlie Marsh eef9612719
Allow reporters to take `dyn Metadata` (#645) 2023-12-14 12:36:28 +01:00
Charlie Marsh 1a62ca0c62
Move source dist extraction into extract crate (#649) 2023-12-14 05:56:49 +00:00
Charlie Marsh 402b728bf7
Use `fs_err` with `AutoStream` (#648) 2023-12-14 04:56:55 +00:00
Charlie Marsh db7e2dedbb
Move archive extraction into its own crate (#647)
We have some shared utilities beyond `puffin-build` and
`puffin-distribution`, and further, I want to be able to access the
sdist archive extraction logic from `puffin-distribution`. This is
really generic, so moving into its own crate.
2023-12-14 04:49:09 +00:00
Charlie Marsh 388641643d
Remove `SourceDistDownload` struct (#646)
This is created in one place, then immediately destructed into fields.
2023-12-14 02:34:50 +00:00
Charlie Marsh e0127581b6
Use `fs_err` in more places (#644) 2023-12-14 01:11:45 +00:00
Charlie Marsh 4fd69c74b6
Use URL rather than String in direct URL types (#643) 2023-12-14 01:01:27 +00:00
Charlie Marsh 8071a23863
Add dedicated ID types to avoid opaque strings (#642)
This allows us to enforce type safety within the resolver. For example,
in the index, we can remove `String` as a key type and enforce that
callers _must_ present us with a `PackageId`. (This actually caught one
bug, where we were using the SHA rather than the package ID. That bug
shouldn't have had any effect given where it was, since those are 1:1,
but it's still problematic.)
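
A small sketch of the newtype pattern this relies on. `PackageId` is named above; the `Index` type and its methods are hypothetical stand-ins for illustration:

```rust
use std::collections::HashMap;

/// An opaque identifier. Wrapping the string means a SHA or any other
/// plain `String` cannot be passed where a `PackageId` is expected.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct PackageId(String);

impl PackageId {
    pub fn new(id: impl Into<String>) -> Self {
        Self(id.into())
    }
}

/// Hypothetical index keyed by `PackageId` rather than `String`.
#[derive(Debug, Default)]
pub struct Index {
    versions: HashMap<PackageId, Vec<String>>,
}

impl Index {
    pub fn insert(&mut self, id: PackageId, version: &str) {
        self.versions.entry(id).or_default().push(version.to_string());
    }

    /// Callers must present a `PackageId`; `index.get("flask")` does not compile.
    pub fn get(&self, id: &PackageId) -> Option<&[String]> {
        self.versions.get(id).map(Vec::as_slice)
    }
}
```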
2023-12-14 00:53:33 +00:00
Charlie Marsh 3549d9638e
Inline all snapshot files (#641)
Right now, we're inconsistent between checking in and inlining these.
The outputs are small in Puffin, so let's just inline them in all cases.
2023-12-14 00:35:38 +00:00
Charlie Marsh 2da6563a64
Use `Manifest::simple` in tests (#638) 2023-12-13 17:41:29 +00:00
Charlie Marsh eb1a630db2
Avoid hard-error for non-existent extras (#627)
## Summary

When resolving `transformers[tensorboard]`, the `[tensorboard]` extra
doesn't exist. Previously, we returned "unknown" dependencies for this
variant, which leads the resolution to try all versions, then fail. This
PR instead warns, but returns the base dependencies for the package,
which matches `pip`. (Poetry doesn't even warn, it just proceeds as
normal.)

Arguably, it would be better to return a custom incompatibility here and
then propagate... But this PR is better than the status quo, and I don't
know if we have support for that behavior yet...? (\cc @zanieb)

Closes #386.

Closes https://github.com/astral-sh/puffin/issues/423.
2023-12-13 17:36:27 +00:00
konsti 5c38825b93
Don't preserve tar mtime to work around tar-rs bug. (#634)
Don't preserve mtime to work around
https://github.com/alexcrichton/tar-rs/issues/349 to fix #579.
2023-12-13 15:11:02 +00:00
Charlie Marsh 69581c03c3
Enable package overrides in `pip-compile` (#631)
## Summary

This PR enables overrides to be passed to `pip-compile` and
`pip-install` via a new `--overrides` flag.

When overrides are provided, we effectively replace any requirements
that are overridden with the overridden versions. This is applied at all
depths of the tree.

The merge semantics are such that we replace _all_ requirements of a
package with _all_ requirements from the overrides files. So, for
example, if a package declares:

```
foo >= 1.0; python_version < '3.11'
foo < 1.0; python_version >= '3.11'
```

And the user provides an override like:
```
foo >= 2.0
```

Then _both_ of the `foo` requirements in the package will be replaced
with the override.

If instead, the user provided an override like:
```
foo >= 2.0; python_version < '3.11'
foo < 3.0; python_version >= '3.11'
```

Then we'd replace _both_ of the original `foo` requirements with both of
these overrides. (In technical terms, for each package in the
requirements file, we flat-map over its overrides.)
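
The flat-map can be sketched as follows, assuming a simplified `Requirement` type that keeps the specifier and marker as one opaque string (hypothetical; the real type is structured). Note that a literal flat-map emits the override list once per original requirement, so duplicates are possible in the output:

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Requirement {
    pub name: String,
    /// Specifier and marker, kept as an opaque string for this sketch.
    pub spec: String,
}

/// Replaces every requirement that has at least one override for its
/// package with all overrides for that package.
pub fn apply_overrides(
    requirements: &[Requirement],
    overrides: &[Requirement],
) -> Vec<Requirement> {
    requirements
        .iter()
        .flat_map(|requirement| {
            let replacements: Vec<Requirement> = overrides
                .iter()
                .filter(|candidate| candidate.name == requirement.name)
                .cloned()
                .collect();
            if replacements.is_empty() {
                // No override for this package: keep the original.
                vec![requirement.clone()]
            } else {
                replacements
            }
        })
        .collect()
}
```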

Closes https://github.com/astral-sh/puffin/issues/511.
2023-12-13 15:03:38 +00:00
konsti e51b397779
Typo: editable -> editables (#637)
Split out to minimize the diff in #587
2023-12-13 13:24:28 +00:00
konsti 0b20f6a25a
Proper unzip error type (#636)
Move the `Unzip` trait from anyhow to `ZipError|io::Error`.
2023-12-13 12:55:59 +00:00
konsti 0dde84dd27
Fix main (#635)
Seems to be a PR timing error
2023-12-13 13:55:06 +01:00
Charlie Marsh ea920e22d1
Validate environment after `pip-sync` (#629)
Not 100% sure that we actually want to do this, it seems reasonable
though.

Closes https://github.com/astral-sh/puffin/issues/410.
2023-12-13 09:13:43 +01:00
Charlie Marsh cbfd39093e
Clean up some function signatures (#633) 2023-12-13 06:21:47 +00:00
Charlie Marsh 920e10fc8f
Use `FxHash` consistently (#632) 2023-12-13 05:36:03 +00:00
Charlie Marsh edd741bf13
Add a diagnostic to detect invalid Python versions (#630)
Related to: https://github.com/astral-sh/puffin/issues/410.
2023-12-13 03:45:02 +00:00
Charlie Marsh a24eb57e93
Make warnings user-facing (#628)
## Summary

Now, `puffin_warnings::warn_once` and `puffin_warnings::warn` will go to
`stderr`, as long as the user isn't running under `--quiet`. Previously,
these went through `tracing`, and so were only visible when running
under `--verbose`.
2023-12-12 21:24:38 -05:00
Zanie Blue 490fb55ac5
Use available versions to simplify unsat error reports (#547)
Uses https://github.com/pubgrub-rs/pubgrub/pull/156 to consolidate
version ranges in error reports using the actual available versions for
each package.

Alternative to https://github.com/zanieb/pubgrub/pull/8 which implements
this behavior as a method in the `Reporter` — here it's implemented in
our custom report formatter (#521) instead which requires no upstream
changes.

Requires https://github.com/zanieb/pubgrub/pull/11 to only retrieve the
versions for packages that will be used in the report.

This is a work in progress. Some things to do:
- ~We may want to allow lazy retrieval of the version maps from the
formatter~
- [x] We should probably create a separate error type for no solution
instead of mixing them with other resolve errors
- ~We can probably do something smarter than creating vectors to hold
the versions~
- [x] This degrades error messages when a single version is not
available, we'll need to special case that
- [x] It seems safer to coerce the error type in `resolve` instead of
`solve` if feasible
2023-12-12 23:25:16 +00:00
Charlie Marsh a8512d7d51
Remove one string clone (#626) 2023-12-12 20:56:15 +00:00
konsti a24a681db9
Towards using `prepare_metadata_for_build_wheel` in the resolver (#616)
Make `prepare_metadata_for_build_wheel` accessible across the puffin
codebase by splitting the build call into a setup, a metadata and a
wheel call. This does not actually use the hook yet, but it's the
required refactoring for it.

Part of #599.
2023-12-12 20:45:37 +00:00
Charlie Marsh 85c37b2b9c
Add extra to debug logging (#625) 2023-12-12 20:09:09 +00:00
Charlie Marsh f459e1ee50
Use a non-async `Mutex` in `OnceMap` (#624)
I don't know why, but this seems to resolve
https://github.com/astral-sh/puffin/issues/619. The Tokio docs also say
that using Tokio's Mutex is _not_ recommended unless you need to hold
the Mutex across an `.await`, which we don't.

Since this is a non-deterministic failure, I just ran it a bunch of
times and ensured it didn't hang (whereas it did hang occasionally prior
to this PR).

Closes https://github.com/astral-sh/puffin/issues/619
2023-12-12 14:59:45 -05:00
Charlie Marsh 4fb2e0955e
Add a fast-path to skip resolution when installation is complete (#613)
For a very large resolution (a few hundred packages), I see 13ms vs.
400ms for a no-op. It's worth optimizing this case, in my opinion.
2023-12-12 17:43:12 +00:00
Charlie Marsh 3aaab32a9d
Omit extra in resolver progress (#623)
Closes #621.
2023-12-12 12:41:18 -05:00
Charlie Marsh 6c7f5cb846
Validate installed packages in virtual environment (#611)
## Summary

Now, after running `pip-install`, we validate that the set of installed
packages is consistent -- that is, that we don't have any packages that
are missing dependencies, or incompatible versions of installed
dependencies.
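
A simplified sketch of such a consistency check. All type and function names here are hypothetical, and dependencies are modeled as optional exact version pins rather than full version specifiers:

```rust
use std::collections::HashMap;

pub struct Installed {
    pub name: String,
    pub version: String,
    /// Dependencies as `(name, required exact version)`; `None` accepts any version.
    pub requires: Vec<(String, Option<String>)>,
}

#[derive(Debug, PartialEq)]
pub enum Diagnostic {
    MissingDependency { package: String, dependency: String },
    IncompatibleVersion { package: String, dependency: String, required: String, installed: String },
}

pub fn validate(installed: &[Installed]) -> Vec<Diagnostic> {
    // Map each installed package name to its installed version.
    let versions: HashMap<&str, &str> = installed
        .iter()
        .map(|dist| (dist.name.as_str(), dist.version.as_str()))
        .collect();
    let mut diagnostics = Vec::new();
    for dist in installed {
        for (dependency, required) in &dist.requires {
            match (versions.get(dependency.as_str()), required) {
                (None, _) => diagnostics.push(Diagnostic::MissingDependency {
                    package: dist.name.clone(),
                    dependency: dependency.clone(),
                }),
                (Some(found), Some(required)) if *found != required.as_str() => {
                    diagnostics.push(Diagnostic::IncompatibleVersion {
                        package: dist.name.clone(),
                        dependency: dependency.clone(),
                        required: required.clone(),
                        installed: (*found).to_string(),
                    })
                }
                _ => {}
            }
        }
    }
    diagnostics
}
```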
2023-12-12 17:33:38 +00:00
Charlie Marsh c764155988
Avoid double-resolving during `pip-install` (#610)
## Summary

At present, when performing a `pip-install`, we first do a resolution,
then take the set of requirements and basically run them through our
`pip-sync`, which itself includes re-resolving the dependencies to get a
specific `Dist` for each package. (E.g., the set of requirements might
say `flask==3.0.0`, but the installer needs a specific _wheel_ or source
distribution to install.)

This PR removes this second resolution by exposing the set of pinned
packages from the resolution. The main challenge here is that we have an
optimization in the resolver such that we let the resolver read metadata
from an incompatible wheel as long as a source distribution exists for a
given package. This lets us avoid building source distributions in the
resolver under the assumption that we'll be able to install the package
later on, if needed. As such, the resolver now needs to track the
resolution and installation filenames separately.
2023-12-12 17:29:09 +00:00
Charlie Marsh a0b3815d84
Respect existing versions when pip-installing (#608)
## Summary

When running `puffin pip-install`, we should respect versions that are
already installed in the environment. For example, if you run `puffin
pip-install flask==2.0.0` and then `puffin pip-install flask`, we should
avoid upgrading Flask. The most natural way to model this is to mark
them as "preferences".

(It's not enough to just filter those requirements out prior to
resolving, since we may not have the _dependencies_ of those packages
installed. We _could_ recursively verify this across the
`site-packages`, but that would be a larger PR.)
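
In rough terms, a preference biases version selection without constraining it. A toy sketch, assuming `candidates` already contains only versions that satisfy the requirement, ordered newest first (the function name is hypothetical):

```rust
/// Picks the preferred version if it is still a valid candidate,
/// otherwise falls back to the newest candidate.
pub fn select_version<'a>(candidates: &[&'a str], preference: Option<&str>) -> Option<&'a str> {
    preference
        .and_then(|preferred| candidates.iter().copied().find(|c| *c == preferred))
        .or_else(|| candidates.first().copied())
}
```

With `flask==2.0.0` installed, a later request for plain `flask` would keep `2.0.0`; a request that excludes `2.0.0` would drop it from the candidates and resolve normally.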
2023-12-12 17:22:47 +00:00
Charlie Marsh 974cb4cc15
Add a `pip-install` subcommand (#607)
## Summary

This PR adds a `pip-install` command that operates like, well, `pip
install`. In short, it resolves the provided dependency, then makes sure
they're all installed in the environment. The primary differences with
`pip-sync` are that (1) `pip-sync` ignores dependencies, and assumes
that the packages represent a complete set; and (2) `pip-sync`
uninstalls any unlisted packages.

There are a bunch of TODOs that I'll resolve in subsequent PRs.

Closes https://github.com/astral-sh/puffin/issues/129.
2023-12-12 12:16:00 -05:00
Charlie Marsh 3e837da5b8
Avoid unicode decoding in name normalization (#617) 2023-12-12 10:01:02 -05:00
konsti 5ae4023cda
Activate venv before source dist build (#567)
Fixes #552
2023-12-12 15:46:37 +01:00
konsti 7c1dd71f66
Implement editable installs in dev command (#566)
First step, sufficient to run
```shell
cargo run --bin puffin-dev -- build --editable -w target/editables/ scripts/editable-installs/poetry_editable/
```
and check the wheel to confirm it's working. Tests will be added with the
pip-sync integration.
2023-12-12 15:45:55 +01:00
Charlie Marsh c25d5240f1
Remove regular expressions for package name normalization (#614)
Very random but the hand-written version is about 3-4x faster
(benchmarked in a standalone repo).
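
For reference, a hand-written normalizer along these lines might look like the sketch below. It follows the PEP 503 rule (lowercase, with runs of `-`, `_` and `.` collapsed to a single `-`); the function name is hypothetical, and the sketch skips validation of the input:

```rust
/// Normalizes a package name per PEP 503 without a regular expression.
pub fn normalize_name(name: &str) -> String {
    let mut normalized = String::with_capacity(name.len());
    let mut last_was_separator = false;
    for c in name.chars() {
        if matches!(c, '-' | '_' | '.') {
            // Emit one `-` for the whole run of separators.
            if !last_was_separator {
                normalized.push('-');
            }
            last_was_separator = true;
        } else {
            normalized.push(c.to_ascii_lowercase());
            last_was_separator = false;
        }
    }
    normalized
}
```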
2023-12-12 05:50:48 +00:00
Charlie Marsh edcb71b1be
Remove some unused fields from `SimpleJson` (#612) 2023-12-11 23:01:37 -05:00
Charlie Marsh 1181288078
Download, build, and install in a single pipeline phase (#605)
## Summary

At present, we have two separate phases within the installation pipeline
related to populating wheels into the cache. The first phase downloads
the distribution, and then builds any source distributions into wheels;
the second phase unzips all the built wheels into the cache.

This PR merges those two phases into one, such that we seamlessly
download, build, and unzip wheels in one pass. This is more efficient,
since we can start unzipping while we build. It also ensures that if the
install _fails_ partway through, we don't end up with a bunch of
downloaded wheels that we never had a chance to unzip. The code is also
much simpler.

The main downside is that the user-facing feedback isn't as granular,
since we only have one phase and one progress bar for what was
originally three distinct phases.

Closes https://github.com/astral-sh/puffin/issues/571.

## Test Plan

I ran the benchmark script on two separate requirements files, and saw a
7% and 31% speedup respectively:

```text
+ TARGET=./scripts/benchmarks/requirements.txt
+ hyperfine --runs 100 --warmup 10 --prepare 'virtualenv --clear .venv' './target/release/main pip-sync ./scripts/benchmarks/requirements.txt --no-cache' --prepare 'virtualenv --clear .venv' './target/release/puffin pip-sync ./scripts/benchmarks/requirements.txt --no-cache'
Benchmark 1: ./target/release/main pip-sync ./scripts/benchmarks/requirements.txt --no-cache
  Time (mean ± σ):     269.4 ms ±  33.0 ms    [User: 42.4 ms, System: 117.5 ms]
  Range (min … max):   221.7 ms … 446.7 ms    100 runs

Benchmark 2: ./target/release/puffin pip-sync ./scripts/benchmarks/requirements.txt --no-cache
  Time (mean ± σ):     250.6 ms ±  28.3 ms    [User: 41.5 ms, System: 127.4 ms]
  Range (min … max):   207.6 ms … 336.4 ms    100 runs

Summary
  './target/release/puffin pip-sync ./scripts/benchmarks/requirements.txt --no-cache' ran
    1.07 ± 0.18 times faster than './target/release/main pip-sync ./scripts/benchmarks/requirements.txt --no-cache'
```

```text
+ TARGET=./scripts/benchmarks/requirements-large.txt
+ hyperfine --runs 100 --warmup 10 --prepare 'virtualenv --clear .venv' './target/release/main pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache' --prepare 'virtualenv --clear .venv' './target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache'
Benchmark 1: ./target/release/main pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache
  Time (mean ± σ):      5.053 s ±  0.354 s    [User: 1.413 s, System: 6.710 s]
  Range (min … max):    4.584 s …  6.333 s    100 runs

Benchmark 2: ./target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache
  Time (mean ± σ):      3.845 s ±  0.225 s    [User: 1.364 s, System: 6.970 s]
  Range (min … max):    3.482 s …  4.715 s    100 runs

Summary
  './target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache' ran
```
2023-12-11 15:42:29 +00:00
konsti b84fbb86b2
Impl Version debug as display (#606)
Currently, `dbg!` is hard to read because versions are verbose, showing
all optional fields, and we have a lot of versions. Changing debug
formatting to displaying the version number (which can be losslessly
converted to the struct and back) makes this more readable.

See e.g.
https://gist.github.com/konstin/38c0f32b109dffa73b3aa0ab86b9662b

**Before**

```text
version: Version {
    epoch: 0,
    release: [
        1,
        2,
        3,
    ],
    pre: None,
    post: None,
    dev: None,
    local: None,
},
```

**After**

```text
version: "1.2.3",
```
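
The change amounts to delegating `Debug` to `Display`. A sketch with a cut-down `Version` that only has an epoch and release segments (the real struct has the additional fields shown in the "Before" output):

```rust
use std::fmt;

/// Cut-down version type for illustration only.
pub struct Version {
    pub epoch: u64,
    pub release: Vec<u64>,
}

impl fmt::Display for Version {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        if self.epoch != 0 {
            write!(f, "{}!", self.epoch)?;
        }
        let release: Vec<String> = self.release.iter().map(u64::to_string).collect();
        write!(f, "{}", release.join("."))
    }
}

impl fmt::Debug for Version {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Print the quoted display form instead of the derived field dump.
        write!(f, "\"{}\"", self)
    }
}
```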
2023-12-11 16:38:14 +01:00
Charlie Marsh 00f1703111
Avoid storing partial wheels in the cache (#604)
Closes https://github.com/astral-sh/puffin/issues/603.
2023-12-09 19:11:30 -05:00
Charlie Marsh 32f54a5947
Use async `Command` for wheel build operations (#601)
Incredibly, this speeds up the install on a large project from 2m6s to
50s.
2023-12-09 16:20:52 +00:00
Charlie Marsh f1c05dcd66
Buffer streamed file writes (#602) 2023-12-09 16:20:31 +00:00
Charlie Marsh 0499fe0613
Fix incorrect unknown size marker in traces (#600)
It said `(unknown size)` for _all_ disk-based wheels.
2023-12-09 04:46:01 +00:00
Charlie Marsh 24d81912cf
Use consistent change event order (#598)
Closes #591.
2023-12-09 04:12:40 +00:00
Charlie Marsh 714a64549b
Use a progress bar for the build phase (#597)
I think this might've been an oversight when copying over the build
reporting during the source distribution refactor.
2023-12-09 04:05:13 +00:00
Charlie Marsh 5878f8dde7
Misc. tweaks to puffin-lib's `lib.rs` (#596) 2023-12-09 03:37:47 +00:00
Charlie Marsh 600c5d072f
Avoid walrus operator in PEP 517 scripts (#595)
I believe this unnecessarily puts a Python 3.8+ requirement on these
scripts.
2023-12-09 01:25:22 +00:00
Charlie Marsh a24534b0ce
Use `rustc-hash` instead of `fxhash` crate (#594)
`fxhash` is the old, less maintained version of this crate
(`rustc-hash`). We use the latter in Ruff.
2023-12-08 20:27:49 +00:00
konsti 6005d7a552
Keep track of in flight unzips using `OnceMap` (#544)
I saw warnings when we were e.g. unzipping wheel and setuptools in two
tasks at the same time. We now keep track of in flight unzips.

This introduces a `OnceMap` abstraction which we also use in the
resolver.
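
To give an idea of the abstraction, here is a blocking analogue built on `std::sync` only. The real `OnceMap` is used from async code, so its API and internals differ; the method names and the `Condvar`-based waiting below are assumptions of this sketch:

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::{Condvar, Mutex};

/// Tracks work that should run at most once per key.
pub struct OnceMap<K, V> {
    // `None` marks a key whose work is in flight; `Some` holds the result.
    state: Mutex<HashMap<K, Option<V>>>,
    ready: Condvar,
}

impl<K: Eq + Hash, V: Clone> OnceMap<K, V> {
    pub fn new() -> Self {
        Self { state: Mutex::new(HashMap::new()), ready: Condvar::new() }
    }

    /// Returns `true` if the caller is the first to claim `key` and should do the work.
    pub fn register(&self, key: K) -> bool {
        let mut state = self.state.lock().unwrap();
        if state.contains_key(&key) {
            false
        } else {
            state.insert(key, None);
            true
        }
    }

    /// Publishes the result and wakes up everyone waiting on it.
    pub fn done(&self, key: K, value: V) {
        self.state.lock().unwrap().insert(key, Some(value));
        self.ready.notify_all();
    }

    /// Blocks until the result for a registered key is available.
    /// Returns `None` if the key was never registered.
    pub fn wait(&self, key: &K) -> Option<V> {
        let mut state = self.state.lock().unwrap();
        loop {
            let current = state.get(key).cloned();
            match current {
                None => return None,
                Some(Some(value)) => return Some(value),
                Some(None) => state = self.ready.wait(state).unwrap(),
            }
        }
    }
}
```

A task that gets `false` from `register` skips the unzip and calls `wait` instead, so the same wheel is never unpacked twice concurrently.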
2023-12-08 20:18:11 +00:00
Charlie Marsh ffb8480087
Add `--reinstall` flag to `pip-sync` (#590)
## Summary

This PR adds two flags to `pip-sync`: `--reinstall`, and
`--reinstall-package [PACKAGE]`. The former reinstalls all packages in
the requirements, while the latter can be repeated and reinstalls all
specified packages.

For our purposes, a reinstall includes (1) purging the cache, and (2)
marking any already-installed versions as extraneous.

Closes #572.

Closes https://github.com/astral-sh/puffin/issues/271.
2023-12-08 19:58:42 +00:00
Charlie Marsh 4b8642c6f7
Enable selective cache purging in `puffin clean` (#589)
## Summary

This PR enables `puffin clean` to accept package names as command line
arguments, and selectively purge entries from the cache tied to the
given package.

Related to #572.

## Test Plan

Modified all the caching tests to run an additional step to (1) purge
the cache, and (2) re-install the package.
2023-12-08 19:51:32 +00:00
Charlie Marsh cbe1cb4229
Avoid race when unpacking wheels (#593)
## Summary

If someone else beats us to the unzip, we should let them win.

We already have a check for this at the top of the unzip method, but
it's also possible that two source distributions get built in parallel
that both try to unpack the same build dependency.
2023-12-08 17:46:19 +00:00
Charlie Marsh 5ae3a8b1cb
Restructure Git cache to include package name (#588)
## Summary

This PR modifies the Git wheel cache to: (1) use a shorter version of
the SHA, to save space; and (2) include the package name, for
consistency with all other buckets.

I considered removing the URL hash entirely, and _just_ using the SHA,
which would be even _more_ consistent with other buckets. But if we
remove the URL, then we won't have separate directories for
subdirectories (which are part of the URL).

Before:

<img width="1035" alt="Screen Shot 2023-12-07 at 7 23 42 PM"
src="https://github.com/astral-sh/puffin/assets/1309177/86afce67-682f-464f-9ba1-0b60d5b7f19f">

After:

<img width="1232" alt="Screen Shot 2023-12-07 at 8 09 23 PM"
src="https://github.com/astral-sh/puffin/assets/1309177/eda42a19-974f-47fe-8c83-54a602ddfd2d">
2023-12-07 20:17:41 -05:00
Zanie Blue ef7be9103c
Parse `SimpleJson` into categorized data in the client (#522)
Extends #517 with a suggestion from @konstin to parse the `SimpleJson`
into an intermediate type `SimpleMetadata(BTreeMap<Version,
VersionFiles>)` before converting to a `VersionMap`. This reduces the
number of times we need to parse the response. Additionally, we cache
the parsed response now instead of `SimpleJson`.

`VersionFiles` stores two vectors with
`WheelFilename`/`SourceDistFilename` and `File` tuples. These can be
iterated over together or separately. A new enum `DistFilename` was
added to capture the `SourceDistFilename` and `WheelFilename` variants
allowing iteration over both vectors.
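
A sketch of the shape of these types. The type names come from the description above, but the fields are simplified assumptions: filenames are plain strings, the version key is a vector of release segments, and the `insert` and `all` helpers are hypothetical.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
pub struct File {
    pub url: String,
}

/// Either kind of filename, for iterating over both vectors at once.
#[derive(Debug, Clone, PartialEq)]
pub enum DistFilename {
    Wheel(String),
    SourceDist(String),
}

#[derive(Debug, Default)]
pub struct VersionFiles {
    pub wheels: Vec<(String, File)>,
    pub source_dists: Vec<(String, File)>,
}

impl VersionFiles {
    /// Iterates over wheels first, then source distributions.
    pub fn all(&self) -> impl Iterator<Item = (DistFilename, &File)> + '_ {
        let wheels = self.wheels.iter().map(|(name, file)| (DistFilename::Wheel(name.clone()), file));
        let sdists = self
            .source_dists
            .iter()
            .map(|(name, file)| (DistFilename::SourceDist(name.clone()), file));
        wheels.chain(sdists)
    }
}

/// Files grouped by version; the `BTreeMap` keeps versions sorted.
#[derive(Debug, Default)]
pub struct SimpleMetadata(pub BTreeMap<Vec<u64>, VersionFiles>);

impl SimpleMetadata {
    /// Categorizes a file by extension (a simplification for this sketch).
    pub fn insert(&mut self, version: Vec<u64>, filename: &str, file: File) {
        let files = self.0.entry(version).or_default();
        if filename.ends_with(".whl") {
            files.wheels.push((filename.to_string(), file));
        } else {
            files.source_dists.push((filename.to_string(), file));
        }
    }
}
```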
2023-12-07 11:04:47 -06:00
Charlie Marsh 5d3ce963b2
Raise an error when `pip-sync` manifest contains duplicates (#584)
Also ensures that we filter out any incompatible requirements when
building the install plan. In general, we assume that requirements were
generated by `pip-compile`, in which case all requirements should be
compatible and there should be no duplicates; but we should handle this
case gracefully.

Closes https://github.com/astral-sh/puffin/issues/582.
2023-12-07 05:26:42 +00:00
Charlie Marsh a825b2db06
Shard the registry cache by package (#583)
## Summary

This PR modifies the cache structure in a few ways. Most notably, we now
shard the set of registry wheels by package, and index them lazily when
computing the install plan.

This applies both to built wheels:

<img width="989" alt="Screen Shot 2023-12-06 at 4 42 19 PM"
src="https://github.com/astral-sh/puffin/assets/1309177/0e8a306f-befd-4be9-a63e-2303389837bb">

And remote wheels:

<img width="836" alt="Screen Shot 2023-12-06 at 4 42 30 PM"
src="https://github.com/astral-sh/puffin/assets/1309177/7fd908cd-dd86-475e-9779-07ed067b4a1a">

For other distributions, we now consistently cache using the package
name, which is really just for clarity and debuggability (we could
consider omitting these):

<img width="955" alt="Screen Shot 2023-12-06 at 4 58 30 PM"
src="https://github.com/astral-sh/puffin/assets/1309177/3e8d0f99-df45-429a-9175-d57b54a72e56">

Obliquely closes https://github.com/astral-sh/puffin/issues/575.
2023-12-07 05:02:46 +00:00
Charlie Marsh aa065f5c97
Modify install plan to support all distribution types (#581)
This PR adds caching support for built wheels in the installer.
Specifically, the `RegistryWheelIndex` now indexes both downloaded and
built wheels (from registries), and we have a new `BuiltWheelIndex` that
takes a subdirectory and returns the "best-matching" compatible wheel.

Closes #570.
2023-12-07 04:43:34 +00:00
Charlie Marsh edaeb9b0e8
Add tests for repeated installs with source distributions (#580)
Adds a few more tests for re-installs with various kinds of source
distributions, and changes the tests to use packages that we can safely
import (via `check_command`) for extra validation.

Once we properly respect cached built wheels, we should expect these
snapshots to change, since we'll no longer download and re-build
unnecessarily.
2023-12-06 20:02:32 +00:00
Zanie Blue 2bb04771ce
Allow switching out the resolver's IO (#517)
I'm working off of @konstin's commit here to implement arbitrary unsat
test cases for the resolver.

The entirety of the resolver's IO is two functions: get the version map
for a package (PEP 440 version -> distribution) and get the metadata for
a distribution. A new trait `ResolverProvider` abstracts these two away and
allows replacing the real network requests e.g. with stored responses
(https://github.com/pradyunsg/pip-resolver-benchmarks/blob/main/scenarios/pyrax_198.json).
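
As a rough illustration of the idea, the two IO operations can be modelled as a trait with an in-memory implementation. This is a simplified, synchronous sketch: the type aliases, the `Metadata` struct, `StoredResponses`, `last_listed_requirements` and the method signatures are assumptions made for the example, and only the trait name `ResolverProvider` comes from the description above.

```rust
use std::collections::BTreeMap;

// Simplified stand-ins for illustration; the real types are assumed to be richer.
type PackageName = String;
type Version = String;
type DistFilename = String;

/// Hypothetical, minimal metadata: just the requirements of a distribution.
#[derive(Debug, Clone, PartialEq)]
struct Metadata {
    requires_dist: Vec<String>,
}

/// The two IO operations the resolver needs, as a synchronous sketch.
trait ResolverProvider {
    /// Map each available version of `package` to a distribution.
    fn version_map(&self, package: &str) -> Option<&BTreeMap<Version, DistFilename>>;
    /// Fetch the metadata for a single distribution.
    fn metadata(&self, dist: &str) -> Option<&Metadata>;
}

/// An in-memory provider backed by stored responses instead of network requests.
#[derive(Default)]
struct StoredResponses {
    versions: BTreeMap<PackageName, BTreeMap<Version, DistFilename>>,
    metadata: BTreeMap<DistFilename, Metadata>,
}

impl ResolverProvider for StoredResponses {
    fn version_map(&self, package: &str) -> Option<&BTreeMap<Version, DistFilename>> {
        self.versions.get(package)
    }

    fn metadata(&self, dist: &str) -> Option<&Metadata> {
        self.metadata.get(dist)
    }
}

/// Code written against the trait works with any provider.
/// Note: keys are plain strings here, so "last" is lexicographic, not PEP 440 order.
fn last_listed_requirements<P: ResolverProvider>(
    provider: &P,
    package: &str,
) -> Option<Vec<String>> {
    let versions = provider.version_map(package)?;
    let (_, dist) = versions.iter().next_back()?;
    Some(provider.metadata(dist.as_str())?.requires_dist.clone())
}
```

A test can fill `StoredResponses` by hand and pass it wherever a `ResolverProvider` is expected, without touching the network.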

---------

Co-authored-by: konsti <konstin@mailbox.org>
2023-12-06 11:53:16 -06:00
konsti 366c389385
Parse editable installs (#564)
Parse `-e` for editable installs in `requirements.txt`.

Unlike all the other requirements, editable installs don't have the name
of the package specified.
2023-12-06 18:21:15 +01:00
konsti 3f4d7b7826
Improve path source dist caching (#578)
Path distribution cache reading errors are no longer fatal.

We now invalidate path file source dists if their modification
timestamp changed, and invalidate path dir source dists if
`pyproject.toml` or alternatively `setup.py` changed, which seem like good
choices since changing `pyproject.toml` should trigger a rebuild and the
user can `touch` the file as part of their workflow.

`CachedByTimestamp` is now a shared util. It doesn't have methods as I
don't think it's worth it yet for two users.

Closes #478

TODO(konstin): Write a test. This is probably twice as much work as that
fix itself, so i made that PR without one for now.
2023-12-06 11:47:01 -05:00
konsti 1bf754556f
Add test for cache source dist installing (#545)
The code changes are outdated, now it's only adding a test
2023-12-06 11:37:55 +00:00
Charlie Marsh 218894375a
Avoid removing existing directories when unzipping and building (#577)
Now that we don't store zipped and unzipped wheels at the same location,
we can avoid these safeguards that entail removing existing directories
when writing. This supersedes
https://github.com/astral-sh/puffin/pull/545.

Closes https://github.com/astral-sh/puffin/issues/554.
2023-12-06 02:36:12 +00:00
Charlie Marsh 5fec63bff5
Add caching for path source distributions (#576)
Follows the strategy that we use for other source distributions.

Closes https://github.com/astral-sh/puffin/issues/557.
2023-12-06 01:33:52 +00:00
Charlie Marsh 5370484307
Remove `.whl` extension for cached, unzipped wheels (#574)
## Summary

This PR uses the wheel stem (e.g., `foo-1.2.3-py3-none-any`) instead of
the wheel name (e.g., `foo-1.2.3-py3-none-any.whl`) when storing
unzipped wheels in the cache, which removes a class of confusing issues
around overwrites and directory-vs.-file collisions.

For now, we retain _both_ the zipped and unzipped wheels in the cache,
though we can easily change this by storing the zipped wheels in a
temporary directory.

Closes https://github.com/astral-sh/puffin/issues/573.

## Test Plan

Some examples from my local cache:

<img width="835" alt="Screen Shot 2023-12-05 at 4 09 55 PM"
src="https://github.com/astral-sh/puffin/assets/1309177/784146aa-b080-416e-9767-40c843fe5d6a">
<img width="847" alt="Screen Shot 2023-12-05 at 4 12 14 PM"
src="https://github.com/astral-sh/puffin/assets/1309177/4bc7f30f-bef3-47f1-b4e8-da9cabf87f28">
<img width="637" alt="Screen Shot 2023-12-05 at 4 09 50 PM"
src="https://github.com/astral-sh/puffin/assets/1309177/25ca4944-4a06-4a08-ac85-c6f7d8b5c8ea">
2023-12-05 22:41:22 +00:00
Charlie Marsh a15da36d74
Avoid removing local wheels when unzipping (#560)
## Summary

When installing a local wheel, we need to avoid removing the zipped
wheel (since it lives outside of the cache), _and_ need to ensure that
we unzip the wheel into the cache (rather than replacing the zipped
wheel, which may even live outside of the project).

Closes https://github.com/astral-sh/puffin/issues/553.
2023-12-05 17:50:08 +00:00
Charlie Marsh 6f055ecf3b
Remove existing built wheels when building source distributions (#559)
This PR modifies the source distribution building to replace any
existing targets after building the new wheel. In some cases, the
existence of an existing target may be indicative of a bug, so we warn.
It's partially a workaround for some (but not all) of the errors in
https://github.com/astral-sh/puffin/issues/554.
2023-12-05 12:45:24 -05:00
Charlie Marsh f99e3560e8
Avoid returning zipped wheels from registry and URL indexes (#558)
## Summary

This is hard to reproduce, but if you run a long installation process
that errors part-way through, you can end up with zipped wheels in the
`Wheels` cache, which is intended to contain only unzipped wheels. This
PR avoids returning those entries from the registry, which would otherwise
lead to errors downstream when we treat them as directories.
2023-12-05 09:53:45 +01:00
Charlie Marsh 2d1e19e474
Allow yanked versions when specified via `==` (#561)
## Summary

This enables users to rely on yanked versions via explicit `==` specifiers,
which is necessary in some projects (and, in my opinion, reasonable).
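
A minimal sketch of the rule, assuming a heavily simplified model: the `Specifier` enum, the `Candidate` struct and `is_selectable` are hypothetical names for illustration, not the resolver's actual types. Only the yank filtering is modelled; whether the candidate otherwise satisfies the specifiers is out of scope here.

```rust
/// Hypothetical, simplified specifier: either an exact `==` pin or anything else.
#[derive(Debug, Clone, PartialEq)]
enum Specifier {
    Exact(String),
    Other,
}

/// A candidate version as listed by the index, with its yanked flag.
struct Candidate<'a> {
    version: &'a str,
    yanked: bool,
}

/// A yanked candidate is only selectable if the user pinned exactly that version.
fn is_selectable(candidate: &Candidate, specifiers: &[Specifier]) -> bool {
    if !candidate.yanked {
        return true;
    }
    specifiers
        .iter()
        .any(|s| matches!(s, Specifier::Exact(v) if v.as_str() == candidate.version))
}
```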

Closes #551.
2023-12-05 09:44:06 +01:00
Charlie Marsh c3a917bbf6
Support granular target Python versions (#534)
## Summary

Allows, e.g., `--python-version 3.7` or `--python-version 3.7.9`. This
was also feedback I received in the original PR.

Closes https://github.com/astral-sh/puffin/issues/533.
2023-12-05 02:38:49 +00:00
Charlie Marsh 06ee321e9c
Use `u64` instead of `u32` in `Version` fields (#555)
It turns out that it's not uncommon to use timestamps as patch versions
(e.g., `20230628214621`). I believe this is the ISO 8601 "basic format".
These can't be represented by a `u32`, so I think it makes sense to just
bump to `u64` to remove this limitation.
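
To make the limitation concrete, here is a small sketch (the helper names are hypothetical, not the actual version parser) showing that the timestamp above overflows a `u32` but parses as a `u64`:

```rust
/// Parse a single release segment (e.g. the patch component) as a `u64`.
/// Hypothetical helper for illustration only.
fn parse_segment(segment: &str) -> Option<u64> {
    segment.parse::<u64>().ok()
}

/// Returns true if the segment would also have fit into a `u32`.
fn fits_in_u32(segment: &str) -> bool {
    segment.parse::<u32>().is_ok()
}
```

`u32::MAX` is 4294967295, which is ten digits, so a fourteen-digit timestamp such as `20230628214621` cannot be represented, while `u64` has ample room.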
2023-12-04 21:00:55 -05:00
Charlie Marsh af13c83177
Overwrite individual files when reflinking (#556)
Similar to #516, but for individual files.

## Test Plan

Ran:

```sh
cargo run -p puffin-cli -- pip-uninstall plaid-python
mkdir -p /Users/crmarsh/workspace/puffin/.venv/lib/python3.10/site-packages/tests
echo "x=1" > /Users/crmarsh/workspace/puffin/.venv/lib/python3.10/site-packages/tests/__init__.py
cargo run -p puffin-cli -- pip-sync requirements.txt --no-cache --verbose
```
2023-12-04 23:59:35 +00:00
Charlie Marsh 5fddcc362e
Improve error messages for 'file not found' case (#550)
Right now, if you specify a wheel that doesn't exist, you get: `no such
file or directory` with no additional context. Oops!
2023-12-04 22:01:51 +00:00
Charlie Marsh 4e05cd5dfd
Show build progress for path source distributions (#549) 2023-12-04 20:56:56 +00:00
konsti d5abd33813
Use atomic writes for the cache consistently (#546)
Ensure we're using atomic writes everywhere in our cache to avoid broken
cache records and errors with parallel puffin actions
(https://github.com/astral-sh/puffin/pull/544#issuecomment-1838841581).

All JSON files that are written to the cache are written atomically, and
the built wheels are written to a temp dir and then moved atomically. I
didn't touch venv creation though; I don't think that's worth it since
Python does not support atomic package installation by design.
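
The general write-then-rename pattern can be sketched with the standard library alone. This is an illustration under assumptions, not the code from this PR: `write_atomic` and the temp-file naming scheme are made up for the example.

```rust
use std::ffi::OsString;
use std::fs;
use std::io::{self, Write};
use std::path::Path;

/// Write `contents` to `path` by writing a sibling temp file and renaming it.
/// Readers see either the old file or the complete new one, never a partial write.
fn write_atomic(path: &Path, contents: &[u8]) -> io::Result<()> {
    let dir = path.parent().unwrap_or_else(|| Path::new("."));
    let file_name = path
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "path has no file name"))?;

    // Hypothetical naming scheme: ".<name>.<pid>.tmp" in the same directory,
    // so that the rename does not cross a filesystem boundary.
    let mut tmp_name = OsString::from(".");
    tmp_name.push(file_name);
    tmp_name.push(format!(".{}.tmp", std::process::id()));
    let tmp_path = dir.join(tmp_name);

    {
        let mut file = fs::File::create(&tmp_path)?;
        file.write_all(contents)?;
        file.sync_all()?;
    }
    // The rename replaces any existing file at `path`.
    fs::rename(&tmp_path, path)
}
```

Keeping the temp file in the same directory as the target matters, since a rename across filesystems is not atomic.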
2023-12-04 12:02:01 -05:00
konsti e9c9e9718e
Use version in `RegistryIndex` (#543)
When building up the `RegistryIndex`, index by both package name and
version to fix #537.
2023-12-04 17:26:14 +01:00
Charlie Marsh 95b8316023
Preserve seed packages for non-Puffin-created virtualenvs (#535)
## Summary

This PR modifies the install plan to avoid removing seed packages if the
virtual environment was created by anyone other than Puffin.

Closes https://github.com/astral-sh/puffin/issues/414.

## Test Plan

- Ran: `virtualenv .venv`.
- Ran: `cargo run -p puffin-cli -- pip-sync
scripts/benchmarks/requirements.txt --verbose --no-cache`.
- Verified that `pip` et al were not removed, and that the logging
included a message around preserving seed packages.
2023-12-04 09:31:00 -05:00
konsti 77b3921b7a
Fix cargo warning (#542)
It's odd that `dev-dependencies` don't default to `dependencies` for
workspace versions.
2023-12-04 11:10:36 +00:00
Charlie Marsh 0ac4254a7e
Enforce target and interpreter `requires-python` versions (#532)
## Summary

This PR modifies the behavior of our `--python-version` override in two
ways:

1. First, we always use the "real" interpreter in the source
distribution builder. I think this is correct. We don't need to use the
fake markers for recursive builds, because all we care about is the
top-level resolution, and we already assume that a single source
distribution will always return the same metadata regardless of its
build environment.
2. Second, we require that source distributions are compatible with
_both_ the "real" interpreter version and the marker environment. This
ensures that we don't try to build source distributions that are
compatible with our interpreter, but incompatible with the target
version.
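
The second rule can be sketched as a single predicate. This is a deliberately reduced model that only handles a lower bound such as `requires-python >= 3.8`; the `PyVersion` alias and `sdist_is_buildable` are hypothetical names for the example.

```rust
/// (major, minor, patch); tuples compare lexicographically, which is what we want.
type PyVersion = (u64, u64, u64);

/// A source distribution is only considered if its `requires-python` lower bound
/// is satisfied by BOTH the real interpreter and the `--python-version` target.
fn sdist_is_buildable(
    requires_python_min: PyVersion,
    interpreter: PyVersion,
    target: PyVersion,
) -> bool {
    interpreter >= requires_python_min && target >= requires_python_min
}
```

For example, with a 3.12 interpreter and a 3.7 target, a source distribution that requires Python 3.8 or later is rejected even though the interpreter itself could build it.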

Closes https://github.com/astral-sh/puffin/issues/407.
2023-12-04 11:27:36 +01:00
Charlie Marsh d96c18b3a8
Respect `requires` for non-`build-backend` PEP 517 builds (#530)
## Summary

This PR modifies `puffin-build` to be closer in behavior to
[pip](a15dd75d98/src/pip/_internal/pyproject.py (L53))
and
[build](de5b44b0c2/src/build/__init__.py (L94)).

Specifically, if a project contains a `[build-system]` field, but no
`build-backend`, we now perform a PEP 517 build (instead of using
`setup.py` directly) _and_ respect the `requires` of the
`[build-system]`. Without this change, we were failing to build source
distributions for packages like `ujson`.

Closes #527.

---------

Co-authored-by: konstin <konstin@mailbox.org>
2023-12-04 10:13:42 +00:00
konsti 6dc8ebcb90
Test interpreter cache invalidation (#540)
Add missing test for #529/#508.
2023-12-04 10:03:43 +00:00
konsti 811c088603
Improve wheel cache docs: Unzipping is lazy (#539)
Also sneaking `fs_err::rename(staging.into_path(), &normalized_path)?`
in here, for a better resolution of
https://github.com/astral-sh/puffin/pull/524#discussion_r1412459016
2023-12-04 10:01:35 +00:00
Charlie Marsh ee009ace86
Remove target directory prior to unzipping (#538)
## Summary

This is not a _fix_ for https://github.com/astral-sh/puffin/issues/537,
but it does ensure that we avoid hard-failing on what's really an
optimization and caching case.
2023-12-04 05:18:45 +00:00
Charlie Marsh fc20d01593
Ignore empty `VIRTUAL_ENV` variables (#536)
I'm not sure how my interpreter gets into this state, but it's certainly
wrong to respect these.
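
A minimal sketch of the check, assuming a hypothetical helper `virtual_env_from` that takes the raw value (for example from `std::env::var_os("VIRTUAL_ENV")`) so that the logic stays testable without touching the process environment:

```rust
use std::ffi::OsString;
use std::path::PathBuf;

/// Interpret a raw `VIRTUAL_ENV` value, treating both "unset" and "empty"
/// as "no virtual environment".
fn virtual_env_from(value: Option<OsString>) -> Option<PathBuf> {
    value.filter(|v| !v.is_empty()).map(PathBuf::from)
}
```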
2023-12-04 04:53:26 +00:00
Charlie Marsh 3b55d0b295
Deduplicate various `.dist-info/METADATA` read implementations (#531)
Closes https://github.com/astral-sh/puffin/issues/484.
2023-12-03 21:29:00 -05:00
Charlie Marsh fa3107b173
Use full Python version when determining compatibility (#528)
## Summary

When resolving with Python 3.7.13, I was failing to find a matching
distribution that required Python 3.7.9 or later.
2023-12-04 01:02:24 +00:00
Charlie Marsh 2613382747
Invalidate interpreter marker cache (#529)
In a refactor, we lost the cache invalidation behavior for interpreter
markers, leading to stale interpreter errors for me when creating
environments with different Python versions. Specifically, the
modification timestamp used to be part of the _cache key_ when we used
`cacache`. Now it's not -- but it's stored within the cache. So we need
to validate the key after-the-fact.
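
The after-the-fact validation might look roughly like the following. The name `CachedByTimestamp` appears in related commits, but the field layout and the `get_if_fresh` method shown here are assumptions for illustration only.

```rust
use std::time::SystemTime;

/// Hypothetical shape: cached data stored next to the modification timestamp
/// of the file (e.g. the interpreter executable) it was computed from.
#[derive(Debug, Clone, PartialEq)]
struct CachedByTimestamp<T> {
    timestamp: SystemTime,
    data: T,
}

impl<T> CachedByTimestamp<T> {
    /// Return the cached data only if the file's current modification time
    /// still matches the one recorded when the entry was written.
    fn get_if_fresh(&self, current: SystemTime) -> Option<&T> {
        (self.timestamp == current).then_some(&self.data)
    }
}
```

Because the timestamp is no longer part of the cache key, a stale entry is still found on lookup; the comparison is what turns it into a cache miss.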
2023-12-03 22:44:43 +00:00
Charlie Marsh ee2fca3a48
Add CACHEDIR and .gitignore tags to cache directories (#526)
## Summary

Even if this will typically be in the user's application folder (rather
than a local directory), it's still a good practice.

Closes https://github.com/astral-sh/puffin/issues/280.
2023-12-02 00:37:51 +00:00
konsti 9806901a16
Consolidate wheel caches (#524)
After this change, two wheel caches remain: `built-wheels-v0` and
`wheels-v0`, docs screenshots below. Each contains both the wheel
metadata, cache policy and zipped or unzipped wheels under the same name.

The zipped/unzipped strategy is as follows: In `pip-compile`, when we
build a wheel, we store it zipped. When `pip-sync` or a source dist
build in `pip-compile` need to install the wheel, we unzip it, remove
the file and replace it with the unzipped wheel.

This removes `WheelCache` and `UrlIndex` in favor of `Cache` plus
`WheelCache`. The non-built wheel cache now considers index urls and the
url for url wheels.

I'm unsure if we need the `Unzipper` type, this could just be a
function.

I moved `no_index` into `IndexUrls` and started using `IndexUrl` up to
the clap level.

I left a number of TODOs in the code, namely performing the actual
invalidation of unzipped wheels and making the `InstallPlan` understand
cache invalidation (i.e. uninstall wheels when their remote changed).


![image](https://github.com/astral-sh/puffin/assets/6826232/c4d45979-485b-4954-848d-fd3347ee2510)
2023-12-01 20:16:33 +00:00
konsti 4551994b7d
Clear built wheels when remote changed (#519)
Remove built wheels alongside their metadata when their index source
dist or url source dist changed. For git source dists, we currently
don't clear the previous build but use a new directory (not sure what's
right here - are there any generic cache GC approaches out there? I've
seen that e.g. spotify keeps its cache at 10GB max, but i also haven't
seen any reusable, well tested approaches for this). Path distributions
are unchanged (#478).

I like the structure of metadata alongside the wheel for cache
invalidation, i'll try to do that for `wheels-v0`/`wheel-metadata-v0`
too. (The unzipped wheels afaik currently lack cache invalidation when
the remote changed.) This should give us roughly the same structure for
wheel and built wheels and a very similar pattern of invalidation.
2023-12-01 14:56:47 -05:00
Zanie Blue 2a8544df9e
Use a custom pubgrub report formatter (#521)
Uses https://github.com/zanieb/pubgrub/pull/10 to drastically simplify
our reporter implementation. This will allow us to make use of upstream
improvements to the reporter e.g.
https://github.com/zanieb/pubgrub/pull/8 without multiple duplicative
pull requests.
2023-12-01 13:36:12 -06:00
Zanie Blue 5f1f207628
Recursively merge existing package directories on installation (#516)
Previously, when installing a package we would delete the target
directory before copying (or linking) the contents of the package.
However, this means that we do not properly support namespace packages
which can share a target directory. Instead the last package to be
installed would override existing packages. Since we install packages
in parallel, this could result in a race condition where the target
directory already exists which is not allowed when using `clonefile`.
See example error in #515.
c7e63d2dce
provides a regression test for this — it fails on `main`.

Here, we implement a recursive merge when the target directory already
exists. Both packages will be installed into the same directory. We no
longer delete the target directory, which seems okay since we uninstall
packages before installing now.

When files conflict, we will likely throw an error still. The correct
behavior to implement in this case is unclear, as if we just take "first
write wins" or "last write wins" we could end up with some files from
one package and some from another resulting in two broken packages. A
possible solution here is to lock the target directories while copying.
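
A recursive merge along these lines can be sketched with the standard library. This is an illustration, not the installer's code: `merge_dir` is a hypothetical name, and it uses a plain `fs::copy` where the installer would link or reflink. Conflicting files produce an error, matching the behavior described above.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Recursively copy `src` into `dst`, merging into directories that already
/// exist instead of deleting them. Fails if a file already exists at the target.
fn merge_dir(src: &Path, dst: &Path) -> io::Result<()> {
    // Succeeds if `dst` already exists, which is the namespace package case.
    fs::create_dir_all(dst)?;
    for entry in fs::read_dir(src)? {
        let entry = entry?;
        let target = dst.join(entry.file_name());
        if entry.file_type()?.is_dir() {
            merge_dir(&entry.path(), &target)?;
        } else if target.exists() {
            return Err(io::Error::new(
                io::ErrorKind::AlreadyExists,
                format!("conflicting file: {}", target.display()),
            ));
        } else {
            fs::copy(entry.path(), &target)?;
        }
    }
    Ok(())
}
```

Note that the `exists` check followed by `copy` is itself racy under parallel installs, which is why locking the target directory is mentioned as a possible follow-up.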
2023-11-30 10:14:51 -06:00
konsti 6841c06e2d
Show error paths in install-wheel-rs (#514)
Ensure that we consistently show a path for all io errors in
install-wheel-rs either (preferred) through `fs_err`, or as fallback by
a custom error type. For zip reading errors, we rely on the caller to
add the name and/or location of the wheel.
2023-11-29 20:14:34 +01:00
konsti 2539f00952
Better tracing span (#513)
This will help us get better insight into what is happening and how long
it takes. I'm particularly interested in how long the different source
dist steps take (download, extract, build step(s)), to make better
decisions about their caching, which i want to report through tracing.

Example output:

```console
$ RUST_LOG=puffin=info cargo run --bin puffin -q -- pip-compile -v --no-cache scripts/requirements/all-kinds.in > /dev/null
  puffin_distribution::source_dist::download_source_dist filename="werkzeug-3.0.1.tar.gz", source_dist=werkzeug @ ff1904eb5e2853bf83db817a7dd53d/werkzeug-3.0.1.tar.gz
  puffin_dispatch::build_source source_dist="werkzeug @ ff1904eb5e2853bf83db817a7dd53d/werkzeug-3.0.1.tar.gz", subdirectory=None
    puffin_build::extract_archive sdist="werkzeug-3.0.1.tar.gz"
    puffin_dispatch::resolve requirements="flit-core <4"
    puffin_dispatch::install requirements="flit-core ==3.9.0", venv="/tmp/.tmpgZAEAh/.venv"
    puffin_build::get_requires_for_build_wheel name="build_wheel", python_version=3.12
    puffin_build::build package_id="werkzeug @ ff1904eb5e2853bf83db817a7dd53d/werkzeug-3.0.1.tar.gz"
      puffin_build::run_python_script name="build_wheel", python_version=3.12
  puffin_dispatch::build_source source_dist="pydantic-extra-types @ git+https://github.com/pydantic/pydantic-extra-types.git@843b753e9e8cb74e83cac55598719b39a4d5ef1f", subdirectory=None
    puffin_dispatch::resolve requirements="hatchling"
    puffin_dispatch::install requirements="hatchling ==1.18.0, trove-classifiers ==2023.11.22, editables ==0.5, pathspec ==0.11.2, pluggy ==1.3.0, packaging ==23.2", venv="/tmp/.tmpJjweUn/.venv"
    puffin_build::get_requires_for_build_wheel name="build_wheel", python_version=3.12
    puffin_build::build package_id="pydantic-extra-types @ git+https://github.com/pydantic/pydantic-extra-types.git@843b753e9e8cb74e83cac55598719b39a4d5ef1f"
      puffin_build::run_python_script name="build_wheel", python_version=3.12
  puffin_distribution::source_dist::download_source_dist filename="django-allauth-0.51.0.tar.gz", source_dist=django-allauth==0.51.0
  puffin_dispatch::build_source source_dist="django-allauth==0.51.0", subdirectory=None
    puffin_build::extract_archive sdist="django-allauth-0.51.0.tar.gz"
    puffin_dispatch::resolve requirements="wheel, setuptools, pip"
    puffin_dispatch::install requirements="setuptools ==69.0.2, pip ==23.3.1, wheel ==0.42.0", venv="/tmp/.tmplSZisu/.venv"
    puffin_build::build package_id="django-allauth==0.51.0"
 Resolved 35 packages in 11.71s
```
2023-11-29 10:34:18 +00:00
konsti 929df586fb
Skip tf-models-nightly in resolve-many dev script for now (#510)
`tf-models-nightly` has pathological backtracking behaviour, skip it for
now so we can benchmark the rest.
2023-11-28 18:25:32 +00:00
konsti d89fbeb642
Migrate interpreter query to custom caching (#508)
This removes the last usage of cacache by replacing it with a custom,
flat json caching keyed by the digest of the executable path.


![image](https://github.com/astral-sh/puffin/assets/6826232/8f777c4c-1f1b-4656-ba7b-002175270556)

A step towards #478. I've made `CachedByTimestamp<T>` generic over `T`
but intentionally not moved it to `puffin-cache` yet.
2023-11-28 17:14:59 +00:00
konsti 5435d44756
Introduce `Cache`, `CacheBucket` and `CacheEntry` (#507)
This is mostly a mechanical refactor that moves 80% of our code to the
same cache abstraction.

It introduces `Cache`, which abstracts away the path of the cache
and the temp dir drop and is passed throughout the codebase. To get a
specific cache bucket, you need to request your `CacheBucket` from
`Cache`. `CacheBucket` centralizes the names of all cache
buckets, moving them away from the string constants spread throughout
the crates.

Specifically for working with the `CachedClient`, there is a
`CacheEntry`. I'm not sure yet if that is a strict improvement over
`cache_dir: PathBuf, cache_file: String`, i may have to rotate that
later.

The interpreter cache moved into `interpreter-v0`.

We can use the `CacheBucket` page to document the cache structure in
each bucket:


![image](https://github.com/astral-sh/puffin/assets/6826232/b023fdfb-e34d-4c2d-8663-b5f73937a539)
2023-11-28 17:11:14 +00:00
Charlie Marsh 3d47d2b1da
Error when `ldd` is not in path (#506)
Closes https://github.com/astral-sh/puffin/issues/493.
2023-11-28 05:55:04 +00:00
konsti 8855f44b5f
Move simple index queries to `CachedClient` (#504)
Replaces the usage of `http-cache-reqwest` for simple index queries with
our custom cached client, removing `http-cache-reqwest` altogether.

The new cache paths are `<cache>/simple-v0/<index>/<package_name>.json`.
I could not test with a non-pypi index since i'm not aware of any other
json indices (jax and torch are both html indices).

In a future step, we can transform the response to be a
`HashMap<Version, {source_dists: Vec<(SourceDistFilename, File)>,
wheels: Vec<(WheelFilename, File)>}` (independent of python version, this
cache is used by all environments together). This should speed up cache
deserialization a bit, since we don't need to try source dist and wheel
anymore and drop incompatible dists, and it should make building the
`VersionMap` simpler. We can speed this up even further by splitting
into a version list and the info for each version. I'm mentioning this
because deserialization was a major bottleneck in the rust part of the
old python prototype.

Fixes #481
2023-11-28 00:11:03 +00:00
konsti 1142a14f4d
Check compatibility for cached unzipped wheels (#501)
**Motivation** Previously, we would install any wheel with the correct
package name and version from the cache, even if it doesn't match the
current python interpreter.

**Summary** The unzipped wheel cache for registries now uses the entire
wheel filename over the name-version (`editables-0.5-py3-none-any.whl`
over `editables-0.5`).

Built wheels are not stored in the `wheels-v0` unzipped wheels cache
anymore. For each source distribution, there can be multiple built
wheels (with different compatibility tags), so i argue that we need a
different cache structure for them (follow up PR).

For `all-kinds.in` with

```bash
rm -rf cache-all-kinds
virtualenv --clear -p 3.12 .venv
cargo run --bin puffin -- pip-sync --cache-dir cache-all-kinds target/all-kinds.txt
```

we get:

**Before**
```
cache-all-kinds/wheels-v0/
├── registry
│   ├── annotated_types-0.6.0
│   ├── asgiref-3.7.2
│   ├── blinker-1.7.0
│   ├── certifi-2023.11.17
│   ├── cffi-1.16.0
│   ├── [...]
│   ├── tzdata-2023.3
│   ├── urllib3-2.1.0
│   └── wheel-0.42.0
└── url
    ├── 4b8be67c801a7ecb
    │   ├── flask
    │   └── flask-3.0.0.dist-info
    ├── 6781bd6440ae72c2
    │   ├── werkzeug
    │   └── werkzeug-3.0.1.dist-info
    └── a67db8ed076e3814
        ├── pydantic_extra_types
        └── pydantic_extra_types-2.1.0.dist-info

48 directories, 0 files
```

**After**

```
cache-all-kinds/wheels-v0/
├── registry
│   ├── annotated_types-0.6.0-py3-none-any.whl
│   ├── asgiref-3.7.2-py3-none-any.whl
│   ├── blinker-1.7.0-py3-none-any.whl
│   ├── certifi-2023.11.17-py3-none-any.whl
│   ├── cffi-1.16.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
│   ├── [...]
│   ├── tzdata-2023.3-py2.py3-none-any.whl
│   ├── urllib3-2.1.0-py3-none-any.whl
│   └── wheel-0.42.0-py3-none-any.whl
└── url
    └── 4b8be67c801a7ecb
        └── flask-3.0.0-py3-none-any.whl

39 directories, 0 files
```

**Outlook** Part of #477 "Fix wheel caching". Further tasks:
* Replace the `CacheShard` with `WheelMetadataCache` which handles urls
properly.
* Delete unzipped wheels when their remote wheel changed
* Store built wheels next to the `metadata.json` in the source dist
directory; delete built wheels when their source dist changed (different
cache bucket, but it's the same problem of fixing wheel caching) I'll
make stacked PRs for those
2023-11-27 16:03:58 -08:00
konsti 71295702bf
Reduce pip_sync test duplication (#502)
Move venv creation and running python to check the installation into
functions instead of copy-pasting them every time
2023-11-27 10:21:40 +00:00
Charlie Marsh afda835544
Avoid clone for `WheelMetadataCache` (#500)
This doesn't need to own the underlying data which allows us to remove a
number of clones.
2023-11-25 23:33:59 +00:00
Charlie Marsh 3eb0a43995
Perform a single Git fetch when building source distributions (#499)
## Summary

We need to pass in the distribution with the "precise" URL to avoid
refetching.

## Test Plan

Ran `cargo run -p puffin-cli -- pip-compile requirements.in --verbose`
with `flask @ git+https://github.com/pallets/flask.git` and verified
that we only checked out Flask once.
2023-11-25 23:29:41 +00:00
konsti d54e780843
Source dist metadata refactor (#468)
## Summary and motivation

For a given source dist, we store the metadata of each wheel built
through it in `built-wheel-metadata-v0/pypi/<source dist
filename>/metadata.json`. During resolution, we check the cache status
of the source dist. If it is fresh, we check `metadata.json` for a
matching wheel. If there is one we use that metadata, if there isn't, we
build one. If the source is stale, we build a wheel and override
`metadata.json` with that single wheel. This PR thereby ties the local
built wheel metadata cache to the freshness of the remote source dist.
This functionality is available through `SourceDistCachedBuilder`.
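
The decision logic described in this paragraph can be sketched as follows. It is a simplified model under assumptions: the cache is a plain map from wheel filename to a metadata string, "matching" is reduced to an exact filename lookup, and `metadata_for` is a hypothetical name rather than the `SourceDistCachedBuilder` API.

```rust
use std::collections::BTreeMap;

/// Look up (or build) the metadata for `wanted_wheel`.
///
/// * Fresh source dist, wheel present: reuse the cached metadata.
/// * Fresh source dist, wheel absent: build and add it to the cache.
/// * Stale source dist: drop everything, build, and keep only the new wheel.
fn metadata_for(
    cache: &mut BTreeMap<String, String>,
    source_is_fresh: bool,
    wanted_wheel: &str,
    build: impl FnOnce() -> String,
) -> String {
    if !source_is_fresh {
        // Entries built from an outdated source dist must not be reused.
        cache.clear();
    }
    if let Some(metadata) = cache.get(wanted_wheel) {
        return metadata.clone();
    }
    let built = build();
    cache.insert(wanted_wheel.to_string(), built.clone());
    built
}
```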

`puffin_installer::Builder`, `puffin_installer::Downloader` and
`Fetcher` are removed; instead there is now `FetchAndBuild`, which calls
into the also new `SourceDistCachedBuilder`. `FetchAndBuild` is the new
main high-level abstraction: it spawns parallel fetching/building; for
wheel metadata it calls into the registry client, for wheel files it
fetches them, and for source dists it calls `SourceDistCachedBuilder`. It
handles locks around builds and, newly added, inter-process file
locking for git operations.

Fetching and building source distributions now happen in parallel in
`pip-sync`, i.e. we don't have to wait for the largest wheel to be
downloaded to start building source distributions.

In a follow-up PR, I'll also clear built wheels when they've become
stale.

Another effect is that in a fully cached resolution, we need neither zip
reading nor email parsing.

Closes #473

## Source dist cache structure 

Entries by supported sources:
* `<build wheel metadata cache>/pypi/foo-1.0.0.zip/metadata.json`
* `<build wheel metadata
cache>/<sha256(index-url)>/foo-1.0.0.zip/metadata.json`
* `<build wheel metadata
cache>/url/<sha256(url)>/foo-1.0.0.zip/metadata.json`

But the url filename does not need to be a valid source dist filename
(<https://github.com/search?q=path%3A**%2Frequirements.txt+master.zip&type=code>),
so it could also be the following and we have to take any string as
filename:
* `<build wheel metadata
cache>/url/<sha256(url)>/master.zip/metadata.json`

Example:
```text
# git source dist
pydantic-extra-types @ git+https://github.com/pydantic/pydantic-extra-types.git
# pypi source dist
django_allauth==0.51.0
# url source dist
werkzeug @ ff1904eb5e2853bf83db817a7dd53d/werkzeug-3.0.1.tar.gz
```
will be stored as
```text
built-wheel-metadata-v0
├── git
│   └── 5c56bc1c58c34c11
│       └── 843b753e9e8cb74e83cac55598719b39a4d5ef1f
│           └── metadata.json
├── pypi
│   └── django-allauth-0.51.0.tar.gz
│       └── metadata.json
└── url
    └── 6781bd6440ae72c2
        └── werkzeug-3.0.1.tar.gz
            └── metadata.json
```

The inside of a `metadata.json`:
```json
{
  "data": {
    "django_allauth-0.51.0-py3-none-any.whl": {
      "metadata-version": "2.1",
      "name": "django-allauth",
      "version": "0.51.0",
      ...
    }
  }
}
```
2023-11-24 17:47:58 +00:00
konsti 8d247fe95b
Add `Tags::from_interpreter` (#498)
Small refactoring
2023-11-24 11:36:01 +00:00
konsti f7976ce5cc
Write docs for distribution types (#495)
Document the type hierarchy, excluding the traits.
2023-11-23 13:39:39 +00:00
konsti 1c0e03f807
puffin_interpreter cleanup ahead of #235 (#492)
Preparing for #235, some refactoring to `puffin_interpreter`.

* Added a dedicated error type instead of anyhow
* `InterpreterInfo` -> `Interpreter`
* `detect_virtual_env` now returns an option so it can be chained for
#235
2023-11-23 08:57:33 +00:00
Charlie Marsh 9d35128840
Use Clippy lint table over Cargo config (#490)
Closes https://github.com/astral-sh/puffin/issues/482.
2023-11-22 15:10:27 +00:00
Charlie Marsh 443a0a9df2
Use a sparse Metadata 2.1 representation (#488)
This is an optimization to avoid parsing the entire Metadata 2.1 when we
only need a small subset of the fields.

Closes #175.
2023-11-22 13:25:35 +00:00
konsti a030a466e6
Error before download with no_build (#487)
This fixes a performance regression where, when `--no-build` was set,
the fetcher would still download the source dist only to error
afterwards.
2023-11-22 10:38:10 +00:00
konsti e1dafe7203
Allow applying multiple fixups for version specifiers (#486)
Allow applying multiple fixups for version specifiers, remove the
duplication from the code and add another test case.
2023-11-22 10:26:12 +00:00
konsti ff1100a1ab
Fixup for `>= '2.7'` (#485)
Fixup to allow parsing
https://pypi.org/simple/shellingham/?format=application/vnd.pypi.simple.v1+json
2023-11-22 10:00:12 +00:00
konsti 7c7daa8f83
Consistent Cargo.toml syntax (#483)
Remove the last Cargo.toml inconsistencies, see
1526b3458a (r1401083681).
Now all `[dependencies]` are workspace dependencies.
2023-11-22 08:34:08 +00:00
konsti 934e32ea98
Remove outdated todos (#476) 2023-11-21 13:57:40 +00:00
Charlie Marsh 17228ba04e
Add support for path dependencies (#471)
## Summary

This PR adds support for local path dependencies. The approach mostly
just falls out of our existing approach and infrastructure for Git and
URL dependencies.

Closes https://github.com/astral-sh/puffin/issues/436. (We'll open a
separate issue for editable installs.)

## Test Plan

Added `pip-compile` tests that pre-download a wheel or source
distribution, then install it via local path.
2023-11-21 11:49:42 +00:00
Charlie Marsh f1aa70d9d3
Refactor distribution types to return `Result` (#470)
## Summary

A variety of small refactors to the distribution types crate to (1)
return `Result` if we find an invalid wheel, rather than treating it as
a source distribution with a `.whl` suffix, and (2) DRY up some repeated
code around URLs.
2023-11-20 23:08:54 +00:00