Commit Graph

313 Commits

Author SHA1 Message Date
Zanie Blue 5e04a95c45
Disable line wrapping during scenario tests (#784)
Adds support for a `PUFFIN_NO_WRAP` environment variable which disables
line wrapping in `miette` output.

We set this variable in the scenario tests to improve the readability of
snapshots.

I contributed the ability to disable line wrapping upstream at
https://github.com/zkat/miette/pull/328
2024-01-04 19:07:16 +00:00
Andrew Gallant 1cc3250e76
puffin-cli: fix botched merge (#785)
This fixes a compilation error with tests on current `main`. I didn't
track down the exact provenance, but I'd guess it's the result of a
botched merge. (i.e., Two or more PRs that pass CI independently, but
when merged cause failures.)
2024-01-04 17:03:45 +00:00
Charlie Marsh c6bdc43f37
Add missing feature to `Cargo.toml` (#777) 2024-01-04 11:39:11 -05:00
Zanie Blue e75fde7bfe
Filter prefixes from scenario snapshots to improve readability (#779)
I'm a _little_ unsure since this could be confusing but the prefixes can
be pretty long and this is much easier to read.
2024-01-04 09:57:41 -06:00
konsti 9b77a8873e
Disable color output when redirecting stderr (#742)
I'm still confused about it, but this seems to do the right thing?

`HierarchicalLayer` internally has [`let ansi =
io::stderr().is_terminal();`](fcd9eed252/src/lib.rs (L74)),
so the logging itself is already correctly uncolored, but errors in the
log weren't.

Test command, ran with network deactivated:

```shell
RUST_LOG=debug cargo run --bin puffin -- pip-compile -v ./scripts/popular_packages/pypi_8k_downloads.txt 2> log.txt
```

**Before**

```
error: Request error: error sending request for url (https://pypi.org/simple/apache-airflow-providers-dbt-cloud/): error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
  Caused by: error sending request for url (https://pypi.org/simple/apache-airflow-providers-dbt-cloud/): error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
  Caused by: error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
  Caused by: dns error: failed to lookup address information: Temporary failure in name resolution
  Caused by: failed to lookup address information: Temporary failure in name resolution
  ```

  **After**

  ```
  error: Request error: error sending request for url (https://pypi.org/simple/fissix/): error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
    Caused by: error sending request for url (https://pypi.org/simple/fissix/): error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
    Caused by: error trying to connect: dns error: failed to lookup address information: Temporary failure in name resolution
    Caused by: dns error: failed to lookup address information: Temporary failure in name resolution
    Caused by: failed to lookup address information: Temporary failure in name resolution
```
2024-01-04 16:43:44 +01:00
konsti 92c780ec2f
Run custom insta filters before generic filters (#781)
I've noticed some non-deterministic test failures when a temp dir looks
like a timestamp
(https://github.com/astral-sh/puffin/actions/runs/7410022542/job/20161416805).
Running the custom filters for e.g. the temp dirs before the generic
time filters should fix that.
2024-01-04 16:40:28 +01:00
Charlie Marsh b2230e7f4d
Make index URLs insensitive to trailing slashes (#771)
Closes https://github.com/astral-sh/puffin/issues/770.
2024-01-04 08:45:50 -05:00
Zanie Blue e18a6a0c03
Include permalink to scenarios used to generate test cases (#767) 2024-01-03 20:41:14 -06:00
Zanie Blue 0d5252580c
Improve scenario update script (#759)
Following #757, improves the script for generating scenario test cases
with:

- A requirements file
- Support for downloading packse scenarios from GitHub dynamically
- Running rustfmt on the generated test file
- Updating snapshots / running tests
2024-01-03 20:13:11 -06:00
Charlie Marsh bf9e9daa39
Make editable installs their own test feature flag (#766)
For whatever reason these fail for me with mold, and it's not worth it
to me to disable the linker.
2024-01-03 20:33:22 -05:00
Charlie Marsh 252d53e83a
Make environment validation a `--strict` flag (#765)
I don't necessarily want users to pay this cost every time. We could
consider making this `true` by default.

Closes https://github.com/astral-sh/puffin/issues/763.
2024-01-04 01:29:06 +00:00
Charlie Marsh ae8c7d11e3
Use `create_venv_py312` in pip-uninstall tests (#764) 2024-01-04 01:16:13 +00:00
Charlie Marsh 286145bc7f
Add a dedicated error for missing RECORD files (#762)
Related to: https://github.com/astral-sh/puffin/issues/716
2024-01-04 00:28:50 +00:00
Zanie Blue 1f2112191f
Unpack scenario root requirements in test cases (#757)
As mentioned in #746, instead of just installing the scenario root we
will unpack the root dependencies into the install command to allow
better coverage of direct user requests with scenarios.

I added display of the package tree provided by each scenario.

Use a mustache template for iterative replacements.
2024-01-03 17:31:29 -06:00
Zanie Blue c9f43e915c
Add packse scenario tests (#746)
Adds tests using packse test scenarios! Uses `test.pypi.org` as a
backing index.

Tests are generated by a simple Python script. Requires
https://github.com/zanieb/packse/pull/49.

This opens us to a slight attack surface, as we cannot force use of
`test.pypi.org` only and someone could register these package names on
the real `pypi.org` index with malicious content. I could publish these
packages there too.
2024-01-03 15:50:06 -06:00
Charlie Marsh aa75d264cd
Clean up the Puffin CLI (#755)
- Rename to `puffin pip-freeze` for consistency.
- Add a `virtualenv` alias to `venv`.
- Hide the `add` and `remove` commands.
2024-01-03 21:22:06 +00:00
Charlie Marsh fd556ccd44
Model Python version as a PubGrub package (#745)
## Summary

This PR modifies the resolver to treat the Python version as a package,
which allows for better error messages (since we no longer treat
incompatible packages as if they "don't exist at all").

There are a few tricky pieces here...

First, we need to track both the interpreter's Python version and the
_target_ Python version, because we support resolving for other versions
via `--python 3.7`.

Second, we allow using incompatible wheels during resolution, as long as
there's a compatible source distribution. So we still need to test for
`requires-python` compatibility when selecting distributions.

This could use more testing, but it feels like an area where `packse`
would be more productive than writing PyPI tests.

Closes https://github.com/astral-sh/puffin/issues/406.
2024-01-03 15:20:45 +00:00
konsti a3d8b3d9ca
Don't install incompatible path and url wheels (#739)
Add early tag checking for path and url wheels.

This does not check for resolve for consistency with index wheels.
2024-01-02 15:00:50 +00:00
Charlie Marsh 188ab75769
Split `File` into internal and external type (#729)
## Summary

This PR makes the `pypi_types::File` a response-only type (i.e., a type
that's only used when deserializing over the wire), and adds a separate
internal `File` type. Right now, the representations are similar, but
already, we can avoid the "lenient" deserialization on our internal
`File` type, and avoid the special-casing of the property names that's
required in the JSON. Over time, we can evolve this representation
entirely separately from the representation we receive from PyPI and
other indexes.
2023-12-25 15:42:28 -05:00
Charlie Marsh 187ccef4e1
Cache `Tags` on `Interpreter` (#726) 2023-12-25 13:41:10 +00:00
Charlie Marsh 5b2e381f87
Remove `platform-tags` dependency on `puffin-interpreter` (#725)
Cuts off a large internal dependency chain from what is otherwise a very
general crate.
2023-12-24 23:06:50 +00:00
Charlie Marsh 5bce699ee1
Add support for HTML indexes (#719)
## Summary

This PR adds support for HTML index responses (as with
`--index-url=https://download.pytorch.org/whl`).

Closes https://github.com/astral-sh/puffin/issues/412.
2023-12-24 16:04:00 +00:00
Charlie Marsh 9e6cb706a0
Update test fixtures (#720) 2023-12-24 15:50:10 +00:00
Zanie Blue 4e437ba7e5
Allow the default index url to be configured with `PUFFIN_INDEX_URL` (#704)
This allows the default index URL to be easily overridden with a local
index e.g. a `packse` server

```
export PUFFIN_INDEX_URL="http://localhost:3141/packages/all/+simple"
```
2023-12-20 11:52:00 +01:00
Andrew Gallant aa9f47bbde
improve tests for version parser (#696)
The high level goal here is to improve the tests for the version parser.
Namely, we now check not just that version strings parse successfully,
but that they parse to the expected result.

We also do a few other cleanups. Most notably, `Version` is now an
opaque type so that we can more easily change its representation going
forward.

Reviewing commit-by-commit is suggested. :-)
2023-12-19 12:25:32 -05:00
konsti 114548d945
Test that cache errors are non-fatal (#685)
The test creates a cache from multiple sources and injects faults (once
using invalid data and once by making the files unreadable on the fs
level), then resolves again.

I didn't test git because it has its own locking and correctness logic.

The main drawback is that this test is slow (2.5s for me), we could
`#[ignore]` it.
2023-12-19 12:02:49 +00:00
Charlie Marsh 3660d8a08e
Introduce separate traits for ahead-of-time and installed metadata (#692)
This is a pure refactor to follow-up #690, to separate the metadata that
we know upfront about distributions (like the version, for
registry-based distributions) vs. the metadata that requires building
(like the version, for URL-based distributions).
2023-12-18 22:37:45 +00:00
Charlie Marsh 31afb39a10
Show URLs and version together for installed, URL-based dependencies (#690)
The snapshot test changes will give you a sense for the impact of the
change and the output formatting.

Closes https://github.com/astral-sh/puffin/issues/686.
2023-12-18 22:21:37 +00:00
Charlie Marsh 365c860e27
Show fully-resolved URLs in non-resolution contexts (#689)
We now show the fully-resolved URL, rather than the URL as given by the
user, _everywhere_ except for the output resolution file (which should
retain relative paths, unexpanded environment variables, etc.).

Closes https://github.com/astral-sh/puffin/issues/687.
2023-12-18 22:10:24 +00:00
konsti 43c837f7bb
Show enum defaults in `--help` output (#693)
With `Option<T>` and `.unwrap_or_default()` later, the default of `T`
isn't shown in the help output.

Old:

```
      --link-mode <LINK_MODE>
          The method to use when installing packages from the global cache

          Possible values:
          - clone:    Clone (i.e., copy-on-write) packages from the wheel into the site packages
          - copy:     Copy packages from the wheel into the site packages
          - hardlink: Hard link packages from the wheel into the site packages

      -q, --quiet
      Do not print any output

      --resolution <RESOLUTION>
          Possible values:
          - highest:       Resolve the highest compatible version of each package
          - lowest:        Resolve the lowest compatible version of each package
          - lowest-direct: Resolve the lowest compatible version of any direct dependencies, and the highest compatible version of any transitive dependencies

      --prerelease <PRERELEASE>
          Possible values:
          - disallow:                 Disallow all pre-release versions
          - allow:                    Allow all pre-release versions
          - if-necessary:             Allow pre-release versions if all versions of a package are pre-release
          - explicit:                 Allow pre-release versions for first-party packages with explicit pre-release markers in their version requirements
          - if-necessary-or-explicit: Allow pre-release versions if all versions of a package are pre-release, or if the package has an explicit pre-release marker in its version requirements
```

![Screenshot from 2023-12-18
21-04-16](https://github.com/astral-sh/puffin/assets/6826232/6b3cb47a-f224-408a-8d7a-186ebeb88ecd)

New:

```
      --link-mode <LINK_MODE>
          The method to use when installing packages from the global cache

          [default: hardlink]

          Possible values:
          - clone:    Clone (i.e., copy-on-write) packages from the wheel into the site packages
          - copy:     Copy packages from the wheel into the site packages
          - hardlink: Hard link packages from the wheel into the site packages

  -q, --quiet
          Do not print any output

      --resolution <RESOLUTION>
          [default: highest]

          Possible values:
          - highest:       Resolve the highest compatible version of each package
          - lowest:        Resolve the lowest compatible version of each package
          - lowest-direct: Resolve the lowest compatible version of any direct dependencies, and the highest compatible version of any transitive dependencies

      --prerelease <PRERELEASE>
          [default: if-necessary-or-explicit]

          Possible values:
          - disallow:                 Disallow all pre-release versions
          - allow:                    Allow all pre-release versions
          - if-necessary:             Allow pre-release versions if all versions of a package are pre-release
          - explicit:                 Allow pre-release versions for first-party packages with explicit pre-release markers in their version requirements
          - if-necessary-or-explicit: Allow pre-release versions if all versions of a package are pre-release, or if the package has an explicit pre-release marker in its version requirements
```


![image](https://github.com/astral-sh/puffin/assets/6826232/26c2c391-d959-4769-999d-481b3f179502)
2023-12-18 21:50:47 +00:00
Charlie Marsh 98fcb76015
Lock entire virtualenv during modifying commands (#695)
These commands all assume that the `site-packages` are constant
throughout.

Closes #691.
2023-12-18 16:44:45 -05:00
Charlie Marsh dbf055fe6f
Use borrowed data in `BuildDispatch` (#679)
This PR uses borrowed data in `BuildDispatch` which makes creating a
`BuildDispatch` extremely cheap (only one allocation, for the Python
executable). I can be talked out of this, it will have no measurable
impact.
2023-12-18 16:43:03 +00:00
Charlie Marsh c400ab7d07
Add support for `file://` URLs in editable requirements (#680) 2023-12-18 14:55:37 +00:00
Charlie Marsh 74ca9128b4
Canonicalize virtualenv path once (#678)
This avoids filesystem calls when creating a `BuildDispatch`.

Co-authored-by: konsti <konstin@mailbox.org>
2023-12-18 14:42:58 +00:00
konsti f4f67ebde0
Rebase: Uninstall existing non-editable versions when installing editable requirements bug (#682)
Separate branch for rebasing #677 onto main because i don't trust the
rebase enough to force push.

Closes #677.

---

If you install `black` from PyPI, then `-e ../black`, we need to
uninstall the existing `black`. This sounds simple, but that in turn
requires that we _know_ `-e ../black` maps to the package `black`, so
that we can mark it for uninstallation in the install plan. This, in
turn, means that we need to build editable dependencies prior to the
install plan.

This is just a bunch of reorganization to fix that specific bug
(installing multiple versions of `black` if you run through the above
workflow): we now run through the list of editables upfront, mark those
that are already installed, build those that aren't, and then ensure
that `InstallPlan` correctly removes those that need to be removed, etc.

Closes #676.

Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2023-12-18 09:28:14 +00:00
Charlie Marsh 0bb2c92246
Add editable install support to `pip-install` (#675)
Per the title: adds support for `-e` installs to `puffin pip-install`.
There were some challenges here around threading the editable installs
to the right places. Namely, we want to build _once_, then reuse the
editable installs from the resolution. At present, we were losing the
`editable: true` flag on the `Dist` that came back through the
resolution, so it required some changes to the resolver.

Closes https://github.com/astral-sh/puffin/issues/672.
2023-12-18 09:52:32 +01:00
Charlie Marsh 77c6e6fa6c
Add support for `reinstall` to editable packages (#674)
Closes https://github.com/astral-sh/puffin/issues/673.
2023-12-17 15:41:57 +00:00
Charlie Marsh 00e1c33af4
Add an editable index to the site-packages registry (#671)
This PR modifies `SitePackages` to store all distributions in a flat
vector, and maintain two indexes (hash maps) from "per-element data for
an element in the vector" to "index of that element". This enables us to
maintain a map on both package name and editable URL.
2023-12-17 03:44:36 +00:00
Charlie Marsh 08edd173db
Add support for editable packages in `pip-uninstall` (#670) 2023-12-17 02:56:37 +00:00
konsti f059c6e6a6
Support editable in pip-sync and pip-compile (#587)
Support `-e path/do/dir` in pip-sync and and pip-compile.
2023-12-16 22:37:34 +00:00
konsti 620f73b38b
Speed up version parsing for a 1.27±0.03 speedup in transformers-extras with conservative changes (#660)
Two low-hanging fruits as optimizations for version parsing: A fast path
for release only versions and removing the regex from version specifiers
(still calling into version's parsing regex if required). This enables
optimizing the serde format since we now see the serde part instead of
only PEP 440 parsing. I intentionally didn't rewrite the full PEP 440 at
this step.

```console
$ hyperfine --warmup 5 --runs 50 "target/profiling/puffin pip-compile scripts/requirements/transformers-extras.in" "target/profiling/main pip-compile scripts/requirements/transformers-extras.in"
  Benchmark 1: target/profiling/puffin pip-compile scripts/requirements/transformers-extras.in
    Time (mean ± σ):     217.1 ms ±   3.2 ms    [User: 194.0 ms, System: 55.1 ms]
    Range (min … max):   211.0 ms … 228.1 ms    50 runs

  Benchmark 2: target/profiling/main pip-compile scripts/requirements/transformers-extras.in
    Time (mean ± σ):     276.7 ms ±   5.7 ms    [User: 252.4 ms, System: 54.6 ms]
    Range (min … max):   268.9 ms … 303.5 ms    50 runs

  Summary
    target/profiling/puffin pip-compile scripts/requirements/transformers-extras.in ran
      1.27 ± 0.03 times faster than target/profiling/main pip-compile scripts/requirements/transformers-extras.in
```

---------

Co-authored-by: Andrew Gallant <andrew@astral.sh>
2023-12-15 14:03:35 -05:00
Charlie Marsh 305b9b080a
Show resolution error once on pip-install failure (#665)
Closes https://github.com/astral-sh/puffin/issues/664.
2023-12-15 18:43:23 +00:00
Charlie Marsh 9470c20e7a
Avoid double resolution during source builds (#656)
## Summary

This PR ensures that we re-use the resolution to install the build
dependencies when building a source distribution. Currently, we only
pass along the list of requirements, and then use the `Finder` to map
each requirement to a distribution. But we already determine the correct
distribution when resolving!

Closes https://github.com/astral-sh/puffin/issues/655.
2023-12-15 17:27:16 +00:00
Charlie Marsh 22c7057b35
Expand environment variables in URLs (#640)
## Summary

This PR enables users to express relative dependencies via environment
variables. Like pip, PDM, Hatch, Rye, and others, we now allow users to
express dependencies like:

```text
flask @ file://${PROJECT_ROOT}/flask-3.0.0-py3-none-any.whl
```

In the compiled requirements file, we'll also preserve the unexpanded
environment variable.

Closes https://github.com/astral-sh/puffin/issues/592.
2023-12-14 15:09:12 +00:00
Charlie Marsh ed8dfbfcf7
Preserve verbatim URLs (#639)
## Summary

This PR adds a `VerbatimUrl` struct to preserve verbatim URLs throughout
the resolution and installation pipeline. In short, alongside the parsed
`Url`, we also keep the URL as written by the user. This enables us to
display the URL exactly as written by the user, rather than the
serialized path that we use internally.

This will be especially useful once we start expanding environment
variables since, at that point, we'll be able to write the version of
the URL that includes the _unexpected_ environment variable to the
output file.
2023-12-14 15:03:39 +00:00
Charlie Marsh eef9612719
Allow reporters to take `dyn Metadata` (#645) 2023-12-14 12:36:28 +01:00
Charlie Marsh 402b728bf7
Use `fs_err` with `AutoStream` (#648) 2023-12-14 04:56:55 +00:00
Charlie Marsh e0127581b6
Use `fs_err` in more places (#644) 2023-12-14 01:11:45 +00:00
Charlie Marsh 3549d9638e
Inline all snapshot files (#641)
Right now, we're inconsistent between checking in and inlining these.
The outputs are small in Puffin, so let's just inline them in all cases.
2023-12-14 00:35:38 +00:00
Charlie Marsh eb1a630db2
Avoid hard-error for non-existent extras (#627)
## Summary

When resolving `transformers[tensorboard]`, the `[tensorboard]` extra
doesn't exist. Previously, we returned "unknown" dependencies for this
variant, which leads the resolution to try all versions, then fail. This
PR instead warns, but returns the base dependencies for the package,
which matches `pip`. (Poetry doesn't even warn, it just proceeds as
normal.)

Arguably, it would be better to return a custom incompatibility here and
then propagate... But this PR is better than the status quo, and I don't
know if we have support for that behavior yet...? (\cc @zanieb)

Closes #386.

Closes https://github.com/astral-sh/puffin/issues/423.
2023-12-13 17:36:27 +00:00
Charlie Marsh 69581c03c3
Enable package overrides in `pip-compile` (#631)
## Summary

This PR enables overrides to be passed to `pip-compile` and
`pip-install` via a new `--overrides` flag.

When overrides are provided, we effectively replace any requirements
that are overridden with the overridden versions. This is applied at all
depths of the tree.

The merge semantics are such that we replace _all_ requirements of a
package with _all_ requirements from the overrides files. So, for
example, if a package declares:

```
foo >= 1.0; python_version < '3.11'
foo < 1.0; python_version >= '3.11'
```

And the user provides an override like:
```
foo >= 2.0
```

Then _both_ of the `foo` requirements in the package will be replaced
with the override.

If instead, the user provided an override like:
```
foo >= 2.0; python_version < '3.11'
foo < 3.0; python_version >= '3.11'
```

Then we'd replace _both_ of the original `foo` requirements with both of
these overrides. (In technical terms, for each package in the
requirements file, we flat-map over its overrides.)

Closes https://github.com/astral-sh/puffin/issues/511.
2023-12-13 15:03:38 +00:00
konsti 0dde84dd27
Fix main (#635)
Seems to be a PR timing error
2023-12-13 13:55:06 +01:00
Charlie Marsh ea920e22d1
Validate environment after `pip-sync` (#629)
Not 100% sure that we actually want to do this, it seems reasonable
though.

Closes https://github.com/astral-sh/puffin/issues/410.
2023-12-13 09:13:43 +01:00
Charlie Marsh cbfd39093e
Clean up some function signatures (#633) 2023-12-13 06:21:47 +00:00
Charlie Marsh 920e10fc8f
Use `FxHash` consistently (#632) 2023-12-13 05:36:03 +00:00
Charlie Marsh edd741bf13
Add a diagnostic to detect invalid Python versions (#630)
Related to: https://github.com/astral-sh/puffin/issues/410.
2023-12-13 03:45:02 +00:00
Charlie Marsh a24eb57e93
Make warnings user-facing (#628)
## Summary

Now, `puffin_warnings::warn_once` and `puffin_warnings::warn` will go to
`stderr`, as long as the user isn't running under `--quiet`. Previously,
these went through `tracing`, and so were only visible when running
under `--verbose`.
2023-12-12 21:24:38 -05:00
Zanie Blue 490fb55ac5
Use available versions to simplify unsat error reports (#547)
Uses https://github.com/pubgrub-rs/pubgrub/pull/156 to consolidate
version ranges in error reports using the actual available versions for
each package.

Alternative to https://github.com/zanieb/pubgrub/pull/8 which implements
this behavior as a method in the `Reporter` — here it's implemented in
our custom report formatter (#521) instead which requires no upstream
changes.

Requires https://github.com/zanieb/pubgrub/pull/11 to only retrieve the
versions for packages that will be used in the report.

This is a work in progress. Some things to do:
- ~We may want to allow lazy retrieval of the version maps from the
formatter~
- [x] We should probably create a separate error type for no solution
instead of mixing them with other resolve errors
- ~We can probably do something smarter than creating vectors to hold
the versions~
- [x] This degrades error messages when a single version is not
available, we'll need to special case that
- [x] It seems safer to coerce the error type in `resolve` instead of
`solve` if feasible
2023-12-12 23:25:16 +00:00
Charlie Marsh 4fb2e0955e
Add a fast-path to skip resolution when installation is complete (#613)
For a very large resolution (a few hundred packages), I see 13ms vs.
400ms for a no-op. It's worth optimizing this case, in my opinion.
2023-12-12 17:43:12 +00:00
Charlie Marsh 3aaab32a9d
Omit extra in resolver progress (#623)
Closes #621.
2023-12-12 12:41:18 -05:00
Charlie Marsh 6c7f5cb846
Validate installed packages in virtual environment (#611)
## Summary

Now, after running `pip-install`, we validate that the set of installed
packages is consistent -- that is, that we don't have any packages that
are missing dependencies, or incompatible versions of installed
dependencies.
2023-12-12 17:33:38 +00:00
Charlie Marsh c764155988
Avoid double-resolving during `pip-install` (#610)
## Summary

At present, when performing a `pip-install`, we first do a resolution,
then take the set of requirements and basically run them through our
`pip-sync`, which itself includes re-resolving the dependencies to get a
specific `Dist` for each package. (E.g., the set of requirements might
say `flask==3.0.0`, but the installer needs a specific _wheel_ or source
distribution to install.)

This PR removes this second resolution by exposing the set of pinned
packages from the resolution. The main challenge here is that we have an
optimization in the resolver such that we let the resolver read metadata
from an incompatible wheel as long as a source distribution exists for a
given package. This lets us avoid building source distributions in the
resolver under the assumption that we'll be able to install the package
later on, if needed. As such, the resolver now needs to track the
resolution and installation filenames separately.
2023-12-12 17:29:09 +00:00
Charlie Marsh a0b3815d84
Respect existing versions when pip-installing (#608)
## Summary

When running `puffin pip-install`, we should respect versions that are
already installed in the environment. For example, if you run `puffin
pip-install flask==2.0.0` and then `puffin pip-install flask`, we should
avoid upgrading Flask. The most natural way to model this is to mark
them as "preferences".

(It's not enough to just filter those requirements out prior to
resolving, since we may not have the _dependencies_ of those packages
installed. We _could_ recursively verify this across the
`site-packages`, but that would be a larger PR.)
2023-12-12 17:22:47 +00:00
Charlie Marsh 974cb4cc15
Add a `pip-install` subcommand (#607)
## Summary

This PR adds a `pip-install` command that operates like, well, `pip
install`. In short, it resolves the provided dependency, then makes sure
they're all installed in the environment. The primary differences with
`pip-sync` are that (1) `pip-sync` ignores dependencies, and assumes
that the packages represent a complete set; and (2) `pip-sync`
uninstalls any unlisted packages.

There are a bunch of TODOs that I'll resolve in subsequent PRs.

Closes https://github.com/astral-sh/puffin/issues/129.
2023-12-12 12:16:00 -05:00
Charlie Marsh 1181288078
Download, build, and install in a single pipeline phase (#605)
## Summary

At present, we have two separate phases within the installation pipeline
related to populating wheels into the cache. The first phase downloads
the distribution, and then builds any source distributions into wheels;
the second phase unzips all the built wheels into the cache.

This PR merges those two phases into one, such that we seamlessly
download, build, and unzip wheels in one pass. This is more efficient,
since we can start unzipping while we build. It also ensures that if the
install _fails_ partway through, we don't end up with a bunch of
downloaded wheels that we never had a chance to unzip. The code is also
much simpler.

The main downside is that the user-facing feedback isn't as granular,
since we only have one phase and one progress bar for what was
originally three distinct phases.

Closes https://github.com/astral-sh/puffin/issues/571.

## Test Plan

I ran the benchmark script on two separate requirements files, and saw a
7% and 31% speedup respectively:

```text
+ TARGET=./scripts/benchmarks/requirements.txt
+ hyperfine --runs 100 --warmup 10 --prepare 'virtualenv --clear .venv' './target/release/main pip-sync ./scripts/benchmarks/requirements.txt --no-cache' --prepare 'virtualenv --clear .venv' './target/release/puffin pip-sync ./scripts/benchmarks/requirements.txt --no-cache'
Benchmark 1: ./target/release/main pip-sync ./scripts/benchmarks/requirements.txt --no-cache
  Time (mean ± σ):     269.4 ms ±  33.0 ms    [User: 42.4 ms, System: 117.5 ms]
  Range (min … max):   221.7 ms … 446.7 ms    100 runs

Benchmark 2: ./target/release/puffin pip-sync ./scripts/benchmarks/requirements.txt --no-cache
  Time (mean ± σ):     250.6 ms ±  28.3 ms    [User: 41.5 ms, System: 127.4 ms]
  Range (min … max):   207.6 ms … 336.4 ms    100 runs

Summary
  './target/release/puffin pip-sync ./scripts/benchmarks/requirements.txt --no-cache' ran
    1.07 ± 0.18 times faster than './target/release/main pip-sync ./scripts/benchmarks/requirements.txt --no-cache'
```

```text
+ TARGET=./scripts/benchmarks/requirements-large.txt
+ hyperfine --runs 100 --warmup 10 --prepare 'virtualenv --clear .venv' './target/release/main pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache' --prepare 'virtualenv --clear .venv' './target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache'
Benchmark 1: ./target/release/main pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache
  Time (mean ± σ):      5.053 s ±  0.354 s    [User: 1.413 s, System: 6.710 s]
  Range (min … max):    4.584 s …  6.333 s    100 runs

Benchmark 2: ./target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache
  Time (mean ± σ):      3.845 s ±  0.225 s    [User: 1.364 s, System: 6.970 s]
  Range (min … max):    3.482 s …  4.715 s    100 runs

Summary
  './target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache' ran
```
2023-12-11 15:42:29 +00:00
Charlie Marsh 24d81912cf
Use consistent change event order (#598)
Closes #591.
2023-12-09 04:12:40 +00:00
Charlie Marsh 714a64549b
Use a progress bar for the build phase (#597)
I think this might've been an oversight when copying over the build
reporting during the source distribution refactor.
2023-12-09 04:05:13 +00:00
konsti 6005d7a552
Keep track of in flight unzips using `OnceMap` (#544)
I saw warnings when we were e.g. unzipping wheel and setuptools in two
tasks at the same time. We now keep track of in flight unzips.

This introduces a `OnceMap` abstraction which we also use in the
resolver.
2023-12-08 20:18:11 +00:00
Charlie Marsh ffb8480087
Add `--reinstall` flag to `pip-sync` (#590)
## Summary

This PR adds two flags to `pip-sync`: `--reinstall`, and
`--reinstall-package [PACKAGE]`. The former reinstalls all packages in
the requirements, while the latter can be repeated and reinstalls all
specified packages.

For our purposes, a reinstall includes (1) purging the cache, and (2)
marking any already-installed versions as extraneous.

Closes #572.

Closes https://github.com/astral-sh/puffin/issues/271.
2023-12-08 19:58:42 +00:00
Charlie Marsh 4b8642c6f7
Enable selective cache purging in `puffin clean` (#589)
## Summary

This PR enables `puffin clean` to accept package names as command line
arguments, and selectively purge entries from the cache tied to the
given package.

Relate to #572.

## Test Plan

Modified all the caching tests to run an additional step to (1) purge
the cache, and (2) re-install the package.
2023-12-08 19:51:32 +00:00
Charlie Marsh 5d3ce963b2
Raise an error when `pip-sync` manifest contains duplicates (#584)
Also ensures that we filter out any incompatible requirements when
building the install plan. In general, we assume that requirements were
generated by `pip-compile`, in which case all requirements should be
compatible and there should be no duplicates; but we should handle this
case gracefully.

Closes https://github.com/astral-sh/puffin/issues/582.
2023-12-07 05:26:42 +00:00
Charlie Marsh aa065f5c97
Modify install plan to support all distribution types (#581)
This PR adds caching support for built wheels in the installer.
Specifically, the `RegistryWheelIndex` now indexes both downloaded and
built wheels (from registries), and we have a new `BuiltWheelIndex` that
takes a subdirectory and returns the "best-matching" compatible wheel.

Closes #570.
2023-12-07 04:43:34 +00:00
Charlie Marsh edaeb9b0e8
Add tests for repeated installs with source distributions (#580)
Adds a few more tests for re-installs with various kinds of source
distributions, and changes the tests to use packages that we can safely
import (via `check_command`) for extra validation.

Once we properly respect cached built wheels, we should expect these
snapshots to change, since we'll no longer download and re-build
unnecessarily.
2023-12-06 20:02:32 +00:00
Zanie Blue 2bb04771ce
Allow switching out the resolver's IO (#517)
I'm working off of @konstin's commit here to implement arbitrary unsat
test cases for the resolver.

The entirety of the resolver's io are two functions: Get the version map
for a package (PEP 440 version -> distribution) and get the metadata for
a distribution. A new trait `ResolverProvider` abstracts these two away and
allows replacing the real network requests e.g. with stored responses
(https://github.com/pradyunsg/pip-resolver-benchmarks/blob/main/scenarios/pyrax_198.json).

---------

Co-authored-by: konsti <konstin@mailbox.org>
2023-12-06 11:53:16 -06:00
konsti 1bf754556f
Add test for cache source dist installing (#545)
The code changes are outdated, now it's only adding a test
2023-12-06 11:37:55 +00:00
Charlie Marsh 2d1e19e474
Allow yanked versions when specified via `==` (#561)
## Summary

This enables users to rely on yanked versions via explicit `==` markers,
which is necessary in some projects (and, in my opinion, reasonable).

Closes #551.
2023-12-05 09:44:06 +01:00
Charlie Marsh c3a917bbf6
Support granular target Python versions (#534)
## Summary

Allows, e.g., `--python-version 3.7` or `--python-version 3.7.9`. This
was also feedback I received in the original PR.

Closes https://github.com/astral-sh/puffin/issues/533.
2023-12-05 02:38:49 +00:00
Charlie Marsh 5fddcc362e
Improve error messages for 'file not found' case (#550)
Right now, if you specify a wheel that doesn't exist, you get: `no such
file or directory` with no additional context. Oops!
2023-12-04 22:01:51 +00:00
konsti d5abd33813
Use atomic writes for the cache consistently (#546)
Ensure we're using atomic writes everywhere in our cache to avoid broken
cache records and error with parallel puffin actions
(https://github.com/astral-sh/puffin/pull/544#issuecomment-1838841581).

All json files that are written to the cache are written atomically and
the build wheels are written to temp dir and then moved atomically. I
didn't touch venv creation though, i don't think that's worth it since
python does not support atomic package installation through its design.
2023-12-04 12:02:01 -05:00
Charlie Marsh 0ac4254a7e
Enforce target and interpreter `requires-python` versions (#532)
## Summary

This PR modifies the behavior of our `--python-version` override in two
ways:

1. First, we always use the "real" interpreter in the source
distribution builder. I think this is correct. We don't need to use the
fake markers for recursive builds, because all we care about is the
top-level resolution, and we already assume that a single source
distribution will always return the same metadata regardless of its
build environment.
2. Second, we require that source distributions are compatible with
_both_ the "real" interpreter version and the marker environment. This
ensures that we don't try to build source distributions that are
compatible with our interpreter, but incompatible with the target
version.

Closes https://github.com/astral-sh/puffin/issues/407.
2023-12-04 11:27:36 +01:00
Charlie Marsh d96c18b3a8
Respect `requires` for non-`build-backend` PEP 517 builds (#530)
## Summary

This PR modifies `puffin-build` to be closer in behavior to
[pip](a15dd75d98/src/pip/_internal/pyproject.py (L53))
and
[build](de5b44b0c2/src/build/__init__.py (L94)).

Specifically, if a project contains a `[build-system]` field, but no
`build-backend`, we now perform a PEP 517 build (instead of using
`setup.py` directly) _and_ respect the `requires` of the
`[build-system]`. Without this change, we were failing to build source
distributions for packages like `ujson`.

Closes #527.

---------

Co-authored-by: konstin <konstin@mailbox.org>
2023-12-04 10:13:42 +00:00
Charlie Marsh fc20d01593
Ignore empty `VIRTUAL_ENV` variables (#536)
I'm not sure how my interpreter gets into this state, but it's certainly
wrong to respect these.
2023-12-04 04:53:26 +00:00
Charlie Marsh fa3107b173
Use full Python version when determining compatibility (#528)
## Summary

When resolving with Python 3.7.13, I was failing to find a matching
distribution that required Python 3.7.9 or later.
2023-12-04 01:02:24 +00:00
Charlie Marsh ee2fca3a48
Add CACHEDIR and .gitignore tags to cache directories (#526)
## Summary

Even if this will typically be in the user's application folder (rather
than a local directory), it's still a good practice.

Closes https://github.com/astral-sh/puffin/issues/280.
2023-12-02 00:37:51 +00:00
konsti 9806901a16
Consolidate wheel caches (#524)
After this change, two wheel caches remain: `built-wheels-v0` and
`wheels-v0`, docs screenshots below. Each contains both the wheel
metadata, cache policy and zip or unzipped wheels under the same name.

The zipped/unzipped strategy is as follows: In `pip-compile`, when we
build a wheel, we store it zipped. When `pip-sync` or a source dist
build in `pip-compile` need to install the wheel, we unzip it, remove
the file and replace it with the unzipped wheel.

This removes `WheelCache` and `UrlIndex` in favor of `Cache` plus
`WheelCache`. The non-built wheel cache now considers index urls and the
url for url wheels.

I'm unsure if we need the `Unzipper` type, this could just be a
function.

I move `no_index` into `IndexUrls` and started using `IndexUrl` up to
the clap level.

I left a number of TODOs in the code, namely performing the actual
invalidation of unzipped wheels and making the `InstallPlan` understand
cache invalidation (i.e. uninstall wheels when their remote changed).


![image](https://github.com/astral-sh/puffin/assets/6826232/c4d45979-485b-4954-848d-fd3347ee2510)
2023-12-01 20:16:33 +00:00
konsti 4551994b7d
Clear built wheels when remote changed (#519)
Remove built wheels alongside their metadata when their index source
dist or url source dist changed. For git source dists, we currently
don't clear the previous build but use a new directory (not sure what's
right here - are there any generic cache GC approaches out there? I've
seen that e.g. spotify keeps its cache at 10GB max, but i also haven't
seen any reusable, well tested approaches for this). Path distributions
are unchanged (#478).

I like the structure of metadata alongside the wheel for cache
invalidation, i'll try to do that for `wheels-v0`/`wheel-metadata-v0`
too. (The unzipped wheels afaik currently lack cache invalidation when
the remote changed.) This should give is roughly the same structure for
wheel and built wheels and a very similar pattern of invalidation.
2023-12-01 14:56:47 -05:00
Zanie Blue 5f1f207628
Recursively merge existing package directories on installation (#516)
Previously, when installing a package we would delete the target
directory before copying (or linking) the contents of the package.
However, this means that we do not properly support namespace packages
which can share a target directory. Instead the last package to be
installed would be override existing packages. Since we install packages
in parallel, this could result in a race condition where the target
directory already exists which is not allowed when using `clonefile`.
See example error in #515.
c7e63d2dce
provides a regression test for this — it fails on `main`.

Here, we implement a recursive merge when the target directory already
exists. Both packages will be installed into the same directory. We no
longer delete the target directory, which seems okay since we uninstall
packages before installing now.

When files conflict, we will likely throw an error still. The correct
behavior to implement in this case is unclear, as if we just take "first
write wins" or "last write wins" we could end up with some files from
one package and some from another resulting in two broken packages. A
possible solution here is to lock the target directories while copying.
2023-11-30 10:14:51 -06:00
konsti d89fbeb642
Migrate interpreter query to custom caching (#508)
This removes the last usage of cacache by replacing it with a custom,
flat json caching keyed by the digest of the executable path.


![image](https://github.com/astral-sh/puffin/assets/6826232/8f777c4c-1f1b-4656-ba7b-002175270556)

A step towards #478. I've made `CachedByTimestamp<T>` generic over `T`
but intentionally not moved it to `puffin-cache` yet.
2023-11-28 17:14:59 +00:00
konsti 5435d44756
Introduce `Cache`, `CacheBucket` and `CacheEntry` (#507)
This is mostly a mechanical refactor that moves 80% of our code to the
same cache abstraction.

It introduces cache `Cache`, which abstracts away the path of the cache
and the temp dir drop and is passed throughout the codebase. To get a
specific cache bucket, you need to requests your `CacheBucket` from
`Cache`. `CacheBucket` is the centralizes the names of all cache
buckets, moving them away from the string constants spread throughout
the crates.

Specifically for working with the `CachedClient`, there is a
`CacheEntry`. I'm not sure yet if that is a strict improvement over
`cache_dir: PathBuf, cache_file: String`, i may have to rotate that
later.

The interpreter cache moved into `interpreter-v0`.

We can use the `CacheBucket` page to document the cache structure in
each bucket:


![image](https://github.com/astral-sh/puffin/assets/6826232/b023fdfb-e34d-4c2d-8663-b5f73937a539)
2023-11-28 17:11:14 +00:00
konsti 1142a14f4d
Check compatibility for cached unzipped wheels (#501)
**Motivation** Previously, we would install any wheel with the correct
package name and version from the cache, even if it doesn't match the
current python interpreter.

**Summary** The unzipped wheel cache for registries now uses the entire
wheel filename over the name-version (`editables-0.5-py3-none-any.whl`
over `editables-0.5`).

Built wheels are not stored in the `wheels-v0` unzipped wheels cache
anymore. For each source distribution, there can be multiple built
wheels (with different compatibility tags), so i argue that we need a
different cache structure for them (follow up PR).

For `all-kinds.in` with

```bash
rm -rf cache-all-kinds
virtualenv --clear -p 3.12 .venv
cargo run --bin puffin -- pip-sync --cache-dir cache-all-kinds target/all-kinds.txt
```

we get:

**Before**
```
cache-all-kinds/wheels-v0/
├── registry
│   ├── annotated_types-0.6.0
│   ├── asgiref-3.7.2
│   ├── blinker-1.7.0
│   ├── certifi-2023.11.17
│   ├── cffi-1.16.0
│   ├── [...]
│   ├── tzdata-2023.3
│   ├── urllib3-2.1.0
│   └── wheel-0.42.0
└── url
    ├── 4b8be67c801a7ecb
    │   ├── flask
    │   └── flask-3.0.0.dist-info
    ├── 6781bd6440ae72c2
    │   ├── werkzeug
    │   └── werkzeug-3.0.1.dist-info
    └── a67db8ed076e3814
        ├── pydantic_extra_types
        └── pydantic_extra_types-2.1.0.dist-info

48 directories, 0 files
```

**After**

```
cache-all-kinds/wheels-v0/
├── registry
│   ├── annotated_types-0.6.0-py3-none-any.whl
│   ├── asgiref-3.7.2-py3-none-any.whl
│   ├── blinker-1.7.0-py3-none-any.whl
│   ├── certifi-2023.11.17-py3-none-any.whl
│   ├── cffi-1.16.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
│   ├── [...]
│   ├── tzdata-2023.3-py2.py3-none-any.whl
│   ├── urllib3-2.1.0-py3-none-any.whl
│   └── wheel-0.42.0-py3-none-any.whl
└── url
    └── 4b8be67c801a7ecb
        └── flask-3.0.0-py3-none-any.whl

39 directories, 0 files
```

**Outlook** Part of #477 "Fix wheel caching". Further tasks:
* Replace the `CacheShard` with `WheelMetadataCache` which handles urls
properly.
* Delete unzipped wheels when their remote wheel changed
* Store built wheels next to the `metadata.json` in the source dist
directory; delete built wheels when their source dist changed (different
cache bucket, but it's the same problem of fixing wheel caching) I'll
make stacked PRs for those
2023-11-27 16:03:58 -08:00
konsti 71295702bf
Reduce pip_sync test duplication (#502)
Move venv creation and running python to check the installation into
function instead of copy&pasting them every time
2023-11-27 10:21:40 +00:00
konsti d54e780843
Source dist metadata refactor (#468)
## Summary and motivation

For a given source dist, we store the metadata of each wheel built
through it in `built-wheel-metadata-v0/pypi/<source dist
filename>/metadata.json`. During resolution, we check the cache status
of the source dist. If it is fresh, we check `metadata.json` for a
matching wheel. If there is one we use that metadata, if there isn't, we
build one. If the source is stale, we build a wheel and override
`metadata.json` with that single wheel. This PR thereby ties the local
built wheel metadata cache to the freshness of the remote source dist.
This functionality is available through `SourceDistCachedBuilder`.

`puffin_installer::Builder`, `puffin_installer::Downloader` and
`Fetcher` are removed, instead there are now `FetchAndBuild` which calls
into the also new `SourceDistCachedBuilder`. `FetchAndBuild` is the new
main high-level abstraction: It spawns parallel fetching/building, for
wheel metadata it calls into the registry client, for wheel files it
fetches them, for source dists it calls `SourceDistCachedBuilder`. It
handles locks around builds, and newly added also inter-process file
locking for git operations.

Fetching and building source distributions now happens in parallel in
`pip-sync`, i.e. we don't have to wait for the largest wheel to be
downloaded to start building source distributions.

In a follow-up PR, I'll also clear built wheels when they've become
stale.

Another effect is that in a fully cached resolution, we need neither zip
reading nor email parsing.

Closes #473

## Source dist cache structure 

Entries by supported sources:
 * `<build wheel metadata cache>/pypi/foo-1.0.0.zip/metadata.json`
* `<build wheel metadata
cache>/<sha256(index-url)>/foo-1.0.0.zip/metadata.json`
* `<build wheel metadata
cache>/url/<sha256(url)>/foo-1.0.0.zip/metadata.json`
But the url filename does not need to be a valid source dist filename

(<https://github.com/search?q=path%3A**%2Frequirements.txt+master.zip&type=code>),
so it could also be the following and we have to take any string as
filename:
* `<build wheel metadata
cache>/url/<sha256(url)>/master.zip/metadata.json`

Example:
```text
# git source dist
pydantic-extra-types @ git+https://github.com/pydantic/pydantic-extra-types.git
# pypi source dist
django_allauth==0.51.0
# url source dist
werkzeug @ ff1904eb5e2853bf83db817a7dd53d/werkzeug-3.0.1.tar.gz
```
will be stored as
```text
built-wheel-metadata-v0
├── git
│   └── 5c56bc1c58c34c11
│       └── 843b753e9e8cb74e83cac55598719b39a4d5ef1f
│           └── metadata.json
├── pypi
│   └── django-allauth-0.51.0.tar.gz
│       └── metadata.json
└── url
    └── 6781bd6440ae72c2
        └── werkzeug-3.0.1.tar.gz
            └── metadata.json
```

The inside of a `metadata.json`:
```json
{
  "data": {
    "django_allauth-0.51.0-py3-none-any.whl": {
      "metadata-version": "2.1",
      "name": "django-allauth",
      "version": "0.51.0",
      ...
    }
  }
}
```
2023-11-24 17:47:58 +00:00
konsti 8d247fe95b
Add `Tags::from_interpreter` (#498)
Small refactoring
2023-11-24 11:36:01 +00:00
konsti 1c0e03f807
puffin_interpreter cleanup ahead of #235 (#492)
Preparing for #235, some refactoring to `puffin_interpreter`.

* Added a dedicated error type instead of anyhow
* `InterpreterInfo` -> `Interpreter`
* `detect_virtual_env` now returns an option so it can be chained for
#235
2023-11-23 08:57:33 +00:00
Charlie Marsh 9d35128840
Use Clippy lint table over Cargo config (#490)
Closes https://github.com/astral-sh/puffin/issues/482.
2023-11-22 15:10:27 +00:00
konsti 7c7daa8f83
Consistent Cargo.toml syntax (#483)
Remove the last Cargo.toml inconsistencies, see
1526b3458a (r1401083681).
Now all `[dependencies]` are workspace dependencies.
2023-11-22 08:34:08 +00:00
Charlie Marsh 17228ba04e
Add support for path dependencies (#471)
## Summary

This PR adds support for local path dependencies. The approach mostly
just falls out of our existing approach and infrastructure for Git and
URL dependencies.

Closes https://github.com/astral-sh/puffin/issues/436. (We'll open a
separate issue for editable installs.)

## Test Plan

Added `pip-compile` tests that pre-download a wheel or source
distribution, then install it via local path.
2023-11-21 11:49:42 +00:00
Charlie Marsh 35fd86631b
Unify distribution operations into a single crate (#460)
## Summary

This PR unifies the behavior that lived in the resolver's `distribution`
crates with the behaviors that were spread between the various structs
in the installer crate into a single `Fetcher` struct that is intended
to manage all interactions with distributions. Specifically, the
interface of this struct is such that it can access distribution
metadata, download distributions, return those downloads, etc., all with
a common cache.

Overall, this is mostly just DRYing up code that was repeated between
the two crates, and putting it behind a reasonable shared interface.
2023-11-20 11:22:52 +00:00
Charlie Marsh 6fd582f8b9
Rename `puffin-distribution` to `distribution-types` (#458)
## Summary

This crate only contains types, and I want to introduce a new crate for
all _operations_ on distributions, so this feels like a more natural
name given we also have `pypi-types`.
2023-11-20 09:40:26 +01:00
Charlie Marsh 380030bb5c
Pin all resolver tests using `--exclude-newer` (#456)
Uses yesterday's date, which should make it much less likely that our
tests become stale over time.

Closes https://github.com/astral-sh/puffin/issues/449.
2023-11-19 15:10:57 +00:00