Commit Graph

199 Commits

Author SHA1 Message Date
Charlie Marsh 1181288078
Download, build, and install in a single pipeline phase (#605)
## Summary

At present, we have two separate phases within the installation pipeline
related to populating wheels into the cache. The first phase downloads
the distribution, and then builds any source distributions into wheels;
the second phase unzips all the built wheels into the cache.

This PR merges those two phases into one, such that we seamlessly
download, build, and unzip wheels in one pass. This is more efficient,
since we can start unzipping while we build. It also ensures that if the
install _fails_ partway through, we don't end up with a bunch of
downloaded wheels that we never had a chance to unzip. The code is also
much simpler.

The main downside is that the user-facing feedback isn't as granular,
since we only have one phase and one progress bar for what was
originally three distinct phases.

Closes https://github.com/astral-sh/puffin/issues/571.

## Test Plan

I ran the benchmark script on two separate requirements files, and saw a
7% and 31% speedup respectively:

```text
+ TARGET=./scripts/benchmarks/requirements.txt
+ hyperfine --runs 100 --warmup 10 --prepare 'virtualenv --clear .venv' './target/release/main pip-sync ./scripts/benchmarks/requirements.txt --no-cache' --prepare 'virtualenv --clear .venv' './target/release/puffin pip-sync ./scripts/benchmarks/requirements.txt --no-cache'
Benchmark 1: ./target/release/main pip-sync ./scripts/benchmarks/requirements.txt --no-cache
  Time (mean ± σ):     269.4 ms ±  33.0 ms    [User: 42.4 ms, System: 117.5 ms]
  Range (min … max):   221.7 ms … 446.7 ms    100 runs

Benchmark 2: ./target/release/puffin pip-sync ./scripts/benchmarks/requirements.txt --no-cache
  Time (mean ± σ):     250.6 ms ±  28.3 ms    [User: 41.5 ms, System: 127.4 ms]
  Range (min … max):   207.6 ms … 336.4 ms    100 runs

Summary
  './target/release/puffin pip-sync ./scripts/benchmarks/requirements.txt --no-cache' ran
    1.07 ± 0.18 times faster than './target/release/main pip-sync ./scripts/benchmarks/requirements.txt --no-cache'
```

```text
+ TARGET=./scripts/benchmarks/requirements-large.txt
+ hyperfine --runs 100 --warmup 10 --prepare 'virtualenv --clear .venv' './target/release/main pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache' --prepare 'virtualenv --clear .venv' './target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache'
Benchmark 1: ./target/release/main pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache
  Time (mean ± σ):      5.053 s ±  0.354 s    [User: 1.413 s, System: 6.710 s]
  Range (min … max):    4.584 s …  6.333 s    100 runs

Benchmark 2: ./target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache
  Time (mean ± σ):      3.845 s ±  0.225 s    [User: 1.364 s, System: 6.970 s]
  Range (min … max):    3.482 s …  4.715 s    100 runs

Summary
  './target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache' ran
```
2023-12-11 15:42:29 +00:00
Charlie Marsh 24d81912cf
Use consistent change event order (#598)
Closes #591.
2023-12-09 04:12:40 +00:00
Charlie Marsh 714a64549b
Use a progress bar for the build phase (#597)
I think this might've been an oversight when copying over the build
reporting during the source distribution refactor.
2023-12-09 04:05:13 +00:00
konsti 6005d7a552
Keep track of in flight unzips using `OnceMap` (#544)
I saw warnings when we were e.g. unzipping wheel and setuptools in two
tasks at the same time. We now keep track of in flight unzips.

This introduces a `OnceMap` abstraction which we also use in the
resolver.
2023-12-08 20:18:11 +00:00
Charlie Marsh ffb8480087
Add `--reinstall` flag to `pip-sync` (#590)
## Summary

This PR adds two flags to `pip-sync`: `--reinstall`, and
`--reinstall-package [PACKAGE]`. The former reinstalls all packages in
the requirements, while the latter can be repeated and reinstalls all
specified packages.

For our purposes, a reinstall includes (1) purging the cache, and (2)
marking any already-installed versions as extraneous.

Closes #572.

Closes https://github.com/astral-sh/puffin/issues/271.
2023-12-08 19:58:42 +00:00
Charlie Marsh 4b8642c6f7
Enable selective cache purging in `puffin clean` (#589)
## Summary

This PR enables `puffin clean` to accept package names as command line
arguments, and selectively purge entries from the cache tied to the
given package.

Relate to #572.

## Test Plan

Modified all the caching tests to run an additional step to (1) purge
the cache, and (2) re-install the package.
2023-12-08 19:51:32 +00:00
Charlie Marsh 5d3ce963b2
Raise an error when `pip-sync` manifest contains duplicates (#584)
Also ensures that we filter out any incompatible requirements when
building the install plan. In general, we assume that requirements were
generated by `pip-compile`, in which case all requirements should be
compatible and there should be no duplicates; but we should handle this
case gracefully.

Closes https://github.com/astral-sh/puffin/issues/582.
2023-12-07 05:26:42 +00:00
Charlie Marsh aa065f5c97
Modify install plan to support all distribution types (#581)
This PR adds caching support for built wheels in the installer.
Specifically, the `RegistryWheelIndex` now indexes both downloaded and
built wheels (from registries), and we have a new `BuiltWheelIndex` that
takes a subdirectory and returns the "best-matching" compatible wheel.

Closes #570.
2023-12-07 04:43:34 +00:00
Charlie Marsh edaeb9b0e8
Add tests for repeated installs with source distributions (#580)
Adds a few more tests for re-installs with various kinds of source
distributions, and changes the tests to use packages that we can safely
import (via `check_command`) for extra validation.

Once we properly respect cached built wheels, we should expect these
snapshots to change, since we'll no longer download and re-build
unnecessarily.
2023-12-06 20:02:32 +00:00
Zanie Blue 2bb04771ce
Allow switching out the resolver's IO (#517)
I'm working off of @konstin's commit here to implement arbitrary unsat
test cases for the resolver.

The entirety of the resolver's io are two functions: Get the version map
for a package (PEP 440 version -> distribution) and get the metadata for
a distribution. A new trait `ResolverProvider` abstracts these two away and
allows replacing the real network requests e.g. with stored responses
(https://github.com/pradyunsg/pip-resolver-benchmarks/blob/main/scenarios/pyrax_198.json).

---------

Co-authored-by: konsti <konstin@mailbox.org>
2023-12-06 11:53:16 -06:00
konsti 1bf754556f
Add test for cache source dist installing (#545)
The code changes are outdated, now it's only adding a test
2023-12-06 11:37:55 +00:00
Charlie Marsh 2d1e19e474
Allow yanked versions when specified via `==` (#561)
## Summary

This enables users to rely on yanked versions via explicit `==` markers,
which is necessary in some projects (and, in my opinion, reasonable).

Closes #551.
2023-12-05 09:44:06 +01:00
Charlie Marsh c3a917bbf6
Support granular target Python versions (#534)
## Summary

Allows, e.g., `--python-version 3.7` or `--python-version 3.7.9`. This
was also feedback I received in the original PR.

Closes https://github.com/astral-sh/puffin/issues/533.
2023-12-05 02:38:49 +00:00
Charlie Marsh 5fddcc362e
Improve error messages for 'file not found' case (#550)
Right now, if you specify a wheel that doesn't exist, you get: `no such
file or directory` with no additional context. Oops!
2023-12-04 22:01:51 +00:00
konsti d5abd33813
Use atomic writes for the cache consistently (#546)
Ensure we're using atomic writes everywhere in our cache to avoid broken
cache records and error with parallel puffin actions
(https://github.com/astral-sh/puffin/pull/544#issuecomment-1838841581).

All json files that are written to the cache are written atomically and
the build wheels are written to temp dir and then moved atomically. I
didn't touch venv creation though, i don't think that's worth it since
python does not support atomic package installation through its design.
2023-12-04 12:02:01 -05:00
Charlie Marsh 0ac4254a7e
Enforce target and interpreter `requires-python` versions (#532)
## Summary

This PR modifies the behavior of our `--python-version` override in two
ways:

1. First, we always use the "real" interpreter in the source
distribution builder. I think this is correct. We don't need to use the
fake markers for recursive builds, because all we care about is the
top-level resolution, and we already assume that a single source
distribution will always return the same metadata regardless of its
build environment.
2. Second, we require that source distributions are compatible with
_both_ the "real" interpreter version and the marker environment. This
ensures that we don't try to build source distributions that are
compatible with our interpreter, but incompatible with the target
version.

Closes https://github.com/astral-sh/puffin/issues/407.
2023-12-04 11:27:36 +01:00
Charlie Marsh d96c18b3a8
Respect `requires` for non-`build-backend` PEP 517 builds (#530)
## Summary

This PR modifies `puffin-build` to be closer in behavior to
[pip](a15dd75d98/src/pip/_internal/pyproject.py (L53))
and
[build](de5b44b0c2/src/build/__init__.py (L94)).

Specifically, if a project contains a `[build-system]` field, but no
`build-backend`, we now perform a PEP 517 build (instead of using
`setup.py` directly) _and_ respect the `requires` of the
`[build-system]`. Without this change, we were failing to build source
distributions for packages like `ujson`.

Closes #527.

---------

Co-authored-by: konstin <konstin@mailbox.org>
2023-12-04 10:13:42 +00:00
Charlie Marsh fc20d01593
Ignore empty `VIRTUAL_ENV` variables (#536)
I'm not sure how my interpreter gets into this state, but it's certainly
wrong to respect these.
2023-12-04 04:53:26 +00:00
Charlie Marsh fa3107b173
Use full Python version when determining compatibility (#528)
## Summary

When resolving with Python 3.7.13, I was failing to find a matching
distribution that required Python 3.7.9 or later.
2023-12-04 01:02:24 +00:00
Charlie Marsh ee2fca3a48
Add CACHEDIR and .gitignore tags to cache directories (#526)
## Summary

Even if this will typically be in the user's application folder (rather
than a local directory), it's still a good practice.

Closes https://github.com/astral-sh/puffin/issues/280.
2023-12-02 00:37:51 +00:00
konsti 9806901a16
Consolidate wheel caches (#524)
After this change, two wheel caches remain: `built-wheels-v0` and
`wheels-v0`, docs screenshots below. Each contains both the wheel
metadata, cache policy and zip or unzipped wheels under the same name.

The zipped/unzipped strategy is as follows: In `pip-compile`, when we
build a wheel, we store it zipped. When `pip-sync` or a source dist
build in `pip-compile` need to install the wheel, we unzip it, remove
the file and replace it with the unzipped wheel.

This removes `WheelCache` and `UrlIndex` in favor of `Cache` plus
`WheelCache`. The non-built wheel cache now considers index urls and the
url for url wheels.

I'm unsure if we need the `Unzipper` type, this could just be a
function.

I move `no_index` into `IndexUrls` and started using `IndexUrl` up to
the clap level.

I left a number of TODOs in the code, namely performing the actual
invalidation of unzipped wheels and making the `InstallPlan` understand
cache invalidation (i.e. uninstall wheels when their remote changed).


![image](https://github.com/astral-sh/puffin/assets/6826232/c4d45979-485b-4954-848d-fd3347ee2510)
2023-12-01 20:16:33 +00:00
konsti 4551994b7d
Clear built wheels when remote changed (#519)
Remove built wheels alongside their metadata when their index source
dist or url source dist changed. For git source dists, we currently
don't clear the previous build but use a new directory (not sure what's
right here - are there any generic cache GC approaches out there? I've
seen that e.g. spotify keeps its cache at 10GB max, but i also haven't
seen any reusable, well tested approaches for this). Path distributions
are unchanged (#478).

I like the structure of metadata alongside the wheel for cache
invalidation, i'll try to do that for `wheels-v0`/`wheel-metadata-v0`
too. (The unzipped wheels afaik currently lack cache invalidation when
the remote changed.) This should give is roughly the same structure for
wheel and built wheels and a very similar pattern of invalidation.
2023-12-01 14:56:47 -05:00
Zanie Blue 5f1f207628
Recursively merge existing package directories on installation (#516)
Previously, when installing a package we would delete the target
directory before copying (or linking) the contents of the package.
However, this means that we do not properly support namespace packages
which can share a target directory. Instead the last package to be
installed would be override existing packages. Since we install packages
in parallel, this could result in a race condition where the target
directory already exists which is not allowed when using `clonefile`.
See example error in #515.
c7e63d2dce
provides a regression test for this — it fails on `main`.

Here, we implement a recursive merge when the target directory already
exists. Both packages will be installed into the same directory. We no
longer delete the target directory, which seems okay since we uninstall
packages before installing now.

When files conflict, we will likely throw an error still. The correct
behavior to implement in this case is unclear, as if we just take "first
write wins" or "last write wins" we could end up with some files from
one package and some from another resulting in two broken packages. A
possible solution here is to lock the target directories while copying.
2023-11-30 10:14:51 -06:00
konsti d89fbeb642
Migrate interpreter query to custom caching (#508)
This removes the last usage of cacache by replacing it with a custom,
flat json caching keyed by the digest of the executable path.


![image](https://github.com/astral-sh/puffin/assets/6826232/8f777c4c-1f1b-4656-ba7b-002175270556)

A step towards #478. I've made `CachedByTimestamp<T>` generic over `T`
but intentionally not moved it to `puffin-cache` yet.
2023-11-28 17:14:59 +00:00
konsti 5435d44756
Introduce `Cache`, `CacheBucket` and `CacheEntry` (#507)
This is mostly a mechanical refactor that moves 80% of our code to the
same cache abstraction.

It introduces cache `Cache`, which abstracts away the path of the cache
and the temp dir drop and is passed throughout the codebase. To get a
specific cache bucket, you need to requests your `CacheBucket` from
`Cache`. `CacheBucket` is the centralizes the names of all cache
buckets, moving them away from the string constants spread throughout
the crates.

Specifically for working with the `CachedClient`, there is a
`CacheEntry`. I'm not sure yet if that is a strict improvement over
`cache_dir: PathBuf, cache_file: String`, i may have to rotate that
later.

The interpreter cache moved into `interpreter-v0`.

We can use the `CacheBucket` page to document the cache structure in
each bucket:


![image](https://github.com/astral-sh/puffin/assets/6826232/b023fdfb-e34d-4c2d-8663-b5f73937a539)
2023-11-28 17:11:14 +00:00
konsti 1142a14f4d
Check compatibility for cached unzipped wheels (#501)
**Motivation** Previously, we would install any wheel with the correct
package name and version from the cache, even if it doesn't match the
current python interpreter.

**Summary** The unzipped wheel cache for registries now uses the entire
wheel filename over the name-version (`editables-0.5-py3-none-any.whl`
over `editables-0.5`).

Built wheels are not stored in the `wheels-v0` unzipped wheels cache
anymore. For each source distribution, there can be multiple built
wheels (with different compatibility tags), so i argue that we need a
different cache structure for them (follow up PR).

For `all-kinds.in` with

```bash
rm -rf cache-all-kinds
virtualenv --clear -p 3.12 .venv
cargo run --bin puffin -- pip-sync --cache-dir cache-all-kinds target/all-kinds.txt
```

we get:

**Before**
```
cache-all-kinds/wheels-v0/
├── registry
│   ├── annotated_types-0.6.0
│   ├── asgiref-3.7.2
│   ├── blinker-1.7.0
│   ├── certifi-2023.11.17
│   ├── cffi-1.16.0
│   ├── [...]
│   ├── tzdata-2023.3
│   ├── urllib3-2.1.0
│   └── wheel-0.42.0
└── url
    ├── 4b8be67c801a7ecb
    │   ├── flask
    │   └── flask-3.0.0.dist-info
    ├── 6781bd6440ae72c2
    │   ├── werkzeug
    │   └── werkzeug-3.0.1.dist-info
    └── a67db8ed076e3814
        ├── pydantic_extra_types
        └── pydantic_extra_types-2.1.0.dist-info

48 directories, 0 files
```

**After**

```
cache-all-kinds/wheels-v0/
├── registry
│   ├── annotated_types-0.6.0-py3-none-any.whl
│   ├── asgiref-3.7.2-py3-none-any.whl
│   ├── blinker-1.7.0-py3-none-any.whl
│   ├── certifi-2023.11.17-py3-none-any.whl
│   ├── cffi-1.16.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
│   ├── [...]
│   ├── tzdata-2023.3-py2.py3-none-any.whl
│   ├── urllib3-2.1.0-py3-none-any.whl
│   └── wheel-0.42.0-py3-none-any.whl
└── url
    └── 4b8be67c801a7ecb
        └── flask-3.0.0-py3-none-any.whl

39 directories, 0 files
```

**Outlook** Part of #477 "Fix wheel caching". Further tasks:
* Replace the `CacheShard` with `WheelMetadataCache` which handles urls
properly.
* Delete unzipped wheels when their remote wheel changed
* Store built wheels next to the `metadata.json` in the source dist
directory; delete built wheels when their source dist changed (different
cache bucket, but it's the same problem of fixing wheel caching) I'll
make stacked PRs for those
2023-11-27 16:03:58 -08:00
konsti 71295702bf
Reduce pip_sync test duplication (#502)
Move venv creation and running python to check the installation into
function instead of copy&pasting them every time
2023-11-27 10:21:40 +00:00
konsti d54e780843
Source dist metadata refactor (#468)
## Summary and motivation

For a given source dist, we store the metadata of each wheel built
through it in `built-wheel-metadata-v0/pypi/<source dist
filename>/metadata.json`. During resolution, we check the cache status
of the source dist. If it is fresh, we check `metadata.json` for a
matching wheel. If there is one we use that metadata, if there isn't, we
build one. If the source is stale, we build a wheel and override
`metadata.json` with that single wheel. This PR thereby ties the local
built wheel metadata cache to the freshness of the remote source dist.
This functionality is available through `SourceDistCachedBuilder`.

`puffin_installer::Builder`, `puffin_installer::Downloader` and
`Fetcher` are removed, instead there are now `FetchAndBuild` which calls
into the also new `SourceDistCachedBuilder`. `FetchAndBuild` is the new
main high-level abstraction: It spawns parallel fetching/building, for
wheel metadata it calls into the registry client, for wheel files it
fetches them, for source dists it calls `SourceDistCachedBuilder`. It
handles locks around builds, and newly added also inter-process file
locking for git operations.

Fetching and building source distributions now happens in parallel in
`pip-sync`, i.e. we don't have to wait for the largest wheel to be
downloaded to start building source distributions.

In a follow-up PR, I'll also clear built wheels when they've become
stale.

Another effect is that in a fully cached resolution, we need neither zip
reading nor email parsing.

Closes #473

## Source dist cache structure 

Entries by supported sources:
 * `<build wheel metadata cache>/pypi/foo-1.0.0.zip/metadata.json`
* `<build wheel metadata
cache>/<sha256(index-url)>/foo-1.0.0.zip/metadata.json`
* `<build wheel metadata
cache>/url/<sha256(url)>/foo-1.0.0.zip/metadata.json`
But the url filename does not need to be a valid source dist filename

(<https://github.com/search?q=path%3A**%2Frequirements.txt+master.zip&type=code>),
so it could also be the following and we have to take any string as
filename:
* `<build wheel metadata
cache>/url/<sha256(url)>/master.zip/metadata.json`

Example:
```text
# git source dist
pydantic-extra-types @ git+https://github.com/pydantic/pydantic-extra-types.git
# pypi source dist
django_allauth==0.51.0
# url source dist
werkzeug @ ff1904eb5e2853bf83db817a7dd53d/werkzeug-3.0.1.tar.gz
```
will be stored as
```text
built-wheel-metadata-v0
├── git
│   └── 5c56bc1c58c34c11
│       └── 843b753e9e8cb74e83cac55598719b39a4d5ef1f
│           └── metadata.json
├── pypi
│   └── django-allauth-0.51.0.tar.gz
│       └── metadata.json
└── url
    └── 6781bd6440ae72c2
        └── werkzeug-3.0.1.tar.gz
            └── metadata.json
```

The inside of a `metadata.json`:
```json
{
  "data": {
    "django_allauth-0.51.0-py3-none-any.whl": {
      "metadata-version": "2.1",
      "name": "django-allauth",
      "version": "0.51.0",
      ...
    }
  }
}
```
2023-11-24 17:47:58 +00:00
konsti 8d247fe95b
Add `Tags::from_interpreter` (#498)
Small refactoring
2023-11-24 11:36:01 +00:00
konsti 1c0e03f807
puffin_interpreter cleanup ahead of #235 (#492)
Preparing for #235, some refactoring to `puffin_interpreter`.

* Added a dedicated error type instead of anyhow
* `InterpreterInfo` -> `Interpreter`
* `detect_virtual_env` now returns an option so it can be chained for
#235
2023-11-23 08:57:33 +00:00
Charlie Marsh 9d35128840
Use Clippy lint table over Cargo config (#490)
Closes https://github.com/astral-sh/puffin/issues/482.
2023-11-22 15:10:27 +00:00
konsti 7c7daa8f83
Consistent Cargo.toml syntax (#483)
Remove the last Cargo.toml inconsistencies, see
1526b3458a (r1401083681).
Now all `[dependencies]` are workspace dependencies.
2023-11-22 08:34:08 +00:00
Charlie Marsh 17228ba04e
Add support for path dependencies (#471)
## Summary

This PR adds support for local path dependencies. The approach mostly
just falls out of our existing approach and infrastructure for Git and
URL dependencies.

Closes https://github.com/astral-sh/puffin/issues/436. (We'll open a
separate issue for editable installs.)

## Test Plan

Added `pip-compile` tests that pre-download a wheel or source
distribution, then install it via local path.
2023-11-21 11:49:42 +00:00
Charlie Marsh 35fd86631b
Unify distribution operations into a single crate (#460)
## Summary

This PR unifies the behavior that lived in the resolver's `distribution`
crates with the behaviors that were spread between the various structs
in the installer crate into a single `Fetcher` struct that is intended
to manage all interactions with distributions. Specifically, the
interface of this struct is such that it can access distribution
metadata, download distributions, return those downloads, etc., all with
a common cache.

Overall, this is mostly just DRYing up code that was repeated between
the two crates, and putting it behind a reasonable shared interface.
2023-11-20 11:22:52 +00:00
Charlie Marsh 6fd582f8b9
Rename `puffin-distribution` to `distribution-types` (#458)
## Summary

This crate only contains types, and I want to introduce a new crate for
all _operations_ on distributions, so this feels like a more natural
name given we also have `pypi-types`.
2023-11-20 09:40:26 +01:00
Charlie Marsh 380030bb5c
Pin all resolver tests using `--exclude-newer` (#456)
Uses yesterday's date, which should make it much less likely that our
tests become stale over time.

Closes https://github.com/astral-sh/puffin/issues/449.
2023-11-19 15:10:57 +00:00
konsti dd4347980a
Fix tests: Certifi got an update (#451) 2023-11-19 12:10:54 +00:00
Zanie Blue 5dedfeb097
Fix import of `CacheArgs` in `puffin-cli` (#447)
```
error[E0432]: unresolved imports `puffin_cache::CacheArgs`, `puffin_cache::CacheDir`
  --> crates/puffin-cli/src/main.rs:11:20
   |
11 | use puffin_cache::{CacheArgs, CacheDir};
   |                    ^^^^^^^^^  ^^^^^^^^ no `CacheDir` in the root
   |                    |
   |                    no `CacheArgs` in the root
   |
note: found an item that was configured out
  --> /Users/mz/eng/src/astral-sh/puffin/crates/puffin-cache/src/lib.rs:7:15
   |
7  | pub use cli::{CacheArgs, CacheDir};
   |               ^^^^^^^^^
   = note: the item is gated behind the `clap` feature
note: found an item that was configured out
  --> /Users/mz/eng/src/astral-sh/puffin/crates/puffin-cache/src/lib.rs:7:26
   |
7  | pub use cli::{CacheArgs, CacheDir};
   |                          ^^^^^^^^
   = note: the item is gated behind the `clap` feature

For more information about this error, try `rustc --explain E0432`.
error: could not compile `puffin-cli` (bin "puffin") due to previous error
```
2023-11-17 15:35:01 -05:00
Charlie Marsh 03599d2bb4
Split resolver inputs into manifest and options (#446)
## Summary

This is a refactor to address a TODO in the build context whereby we
aren't respecting the resolution options in recursive resolutions. Now,
the options are split out from the resolution _manifest_, and shared
across the build context tree.
2023-11-17 18:53:53 +00:00
Zanie Blue 221751487c
Use `UnusableDependencies` for URL dependency conflicts (#425)
Extends #424 with support for URL dependency incompatibilities.

Requires changes to `miette` to prevent URLs from being word wrapped;
accepted upstream in https://github.com/zkat/miette/pull/321
2023-11-17 08:28:12 -06:00
Charlie Marsh b1c29447df
Use `temp_dir` casing everywhere (#440) 2023-11-16 21:04:10 +00:00
konsti 1883dbdc21
Always¹ clear temporary directories (#437)
Always¹ clear the temporary directories we create.

* Clear source dist downloads: Previously, the temporary directories
would remain in the cache dir, now they are cleared properly
* Clear wheel file downloads: Delete the `.whl` file, we only need to
cache the unpacked wheel
* Consistent handling of cache arguments: Abstract the handling for CLI
cache args away, again making sure we remove the `--no-cache` temp dir.

There are no more `into_path()` calls that persist `TempDir`s that i
could find.

¹Assuming drop is run, and deleting the directory doesn't silently
error.
2023-11-16 20:49:48 +00:00
Zanie Blue 0d9d4f9fca
Add an `UnusableDependencies` incompatibility kind and use for conflicting versions (#424)
Addresses
https://github.com/astral-sh/puffin/issues/309#issuecomment-1792648969

Similar to #338 this throws an error when merging versions results in an
empty set. Instead of propagating that error, we capture it and return a
new dependency type of `Unusable`. Unusable dependencies are a new
incompatibility kind which includes an arbitrary "reason" string that we
present to the user. Adding a new incompatibility kind requires changes
to the vendored pubgrub crate.

We could use this same incompatibility kind for conflicting urls as in
#284 which should allow the solver to backtrack to another valid version
instead of failing (see #425).

Unlike #383 this does not require changes to PubGrub's package mapping
model. I think in the long run we'll want PubGrub to accept multiple
versions per package to solve this specific issue, but we're interested
in it being merged upstream first. This pull request is just using the
issue as a simple case to explore adding a new incompatibility type.

We may or may not be able convince them to add this new incompatibility
type upstream. As discussed in
https://github.com/pubgrub-rs/pubgrub/issues/152, we may want a more
general incompatibility kind instead which can be used for arbitrary
problems. An upstream pull request has been opened for discussion at
https://github.com/pubgrub-rs/pubgrub/pull/153.

Related to:
- https://github.com/pubgrub-rs/pubgrub/issues/152
- #338 
- #383

---------

Co-authored-by: konsti <konstin@mailbox.org>
2023-11-16 20:02:06 +00:00
Zanie Blue 832058dbba
Switch from vendored PubGrub to a fork (#438)
A fork will let us stay up to date with the upstream while replaying our
work on top of it.

I expect a similar workflow to the RustPython-Parser fork we maintained,
except that I wrote an automation to create tags for each commit on the
fork (https://github.com/zanieb/pubgrub/pull/2) so we do not need to
manually tag and document each commit.

To update with the upstream:

- Rebase our fork's `main` branch on top of the latest changes in
upstream's `dev` branch
- Force push, overwriting our `main` branch history
- Change the commit hash here to the last commit on `main` in our fork

Since we automatically tag each commit on the fork, we should never lose
the commits that are dropped from `main` during rebase.
2023-11-16 13:49:19 -06:00
konsti e41ec12239
Option to resolve at a fixed timestamp with `pip-compile --exclude-newer YYYY-MM-DD` (#434)
This works by filtering out files with a more recent upload time, so if
the index you use does not provide upload times, the results might be
inaccurate. pypi provides upload times for all files. This is, the field
is non-nullable in the warehouse schema, but the simple API PEP does not
know this field.

If you have only pypi dependencies, this means deterministic,
reproducible(!) resolution. We could try doing the same for git repos
but it doesn't seem worth the effort, i'd recommend pinning commits
since git histories are arbitrarily malleable and also if you care about
reproducibility and such you such not use git dependencies but a custom
index.

Timestamps are given either as RFC 3339 timestamps such as
`2006-12-02T02:07:43Z` or as UTC dates in the same format such as
`2006-12-02`. Dates are interpreted as including this day, i.e. until
midnight UTC that day. Date only is required to make this ergonomic and
midnight seems like an ergonomic choice.

In action for `pandas`:

```console
$ target/debug/puffin pip-compile --exclude-newer 2023-11-16 target/pandas.in
Resolved 6 packages in 679ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
#    target/debug/puffin pip-compile --exclude-newer 2023-11-16 target/pandas.in
numpy==1.26.2
    # via pandas
pandas==2.1.3
python-dateutil==2.8.2
    # via pandas
pytz==2023.3.post1
    # via pandas
six==1.16.0
    # via python-dateutil
tzdata==2023.3
    # via pandas
$ target/debug/puffin pip-compile --exclude-newer 2022-11-16 target/pandas.in
Resolved 5 packages in 655ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
#    target/debug/puffin pip-compile --exclude-newer 2022-11-16 target/pandas.in
numpy==1.23.4
    # via pandas
pandas==1.5.1
python-dateutil==2.8.2
    # via pandas
pytz==2022.6
    # via pandas
six==1.16.0
    # via python-dateutil
$ target/debug/puffin pip-compile --exclude-newer 2021-11-16 target/pandas.in
Resolved 5 packages in 594ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
#    target/debug/puffin pip-compile --exclude-newer 2021-11-16 target/pandas.in
numpy==1.21.4
    # via pandas
pandas==1.3.4
python-dateutil==2.8.2
    # via pandas
pytz==2021.3
    # via pandas
six==1.16.0
    # via python-dateutil
```
2023-11-16 19:46:17 +00:00
konsti 0d455ebd06
Always use puffin as binary name (#435)
It doesn't matter how exactly the user called puffin, the lockfile
should look the same either way.
2023-11-16 19:05:46 +01:00
konsti 751f7fa9c6
Improve PEP 691 compatibility (#428)
[PEP 691](https://peps.python.org/pep-0691/#project-detail) has slightly
different, more relaxed rules around file metadata. These changes are
now reflected in the `File` struct. This will make it easier to support
alternative indices.

I had expected that i need to introduce a separate type for that, so i'm
happy it's two `Option`s more and an alias.

Part of #412
2023-11-16 19:03:44 +01:00
konsti 3a4988f999
Small test cleanup after #431 (#433)
Remove unused filters after #431
2023-11-16 11:22:47 +00:00
konsti c0339893e7
Use `sys.executable` as python root path (#431)
Previously, we were assuming that `which <python>` return the path to
the python executable. This is not true when using pyenv shims, which
are bash scripts. Instead, we have to use `sys.executable`. Luckily,
we're already querying the python interpreter and can do it in that
pass.

We are also not allowed to cache the execution of the python interpreter
through the shim because pyenv might change the target. As a heuristic,
we check whether `sys.executable`, the real binary, is the same our
canonicalized `which` result.

---------

Co-authored-by: Zanie Blue <contact@zanie.dev>
2023-11-16 12:16:49 +01:00
Charlie Marsh d3caf9ae86
Choose most-compatible wheel in resolver and installer (#422)
## Summary

This PR implements logic to sort wheels by priority, where priority is
defined as preferring more "specific" wheels over less "specific"
wheels. For example, in the case of Black, my machine now selects
`black-23.11.0-cp311-cp311-macosx_11_0_arm64.whl`, whereas sorting by
lowest priority instead gives me `black-23.11.0-py3-none-any.whl`.

As part of this change, I've also modified the resolver to fallback to
using incompatible wheels when determining package metadata, if no
compatible wheels are available.

The `VersionMap` was also moved out of `resolver.rs` and into its own
file with a wrapper type, for clarity.

Closes https://github.com/astral-sh/puffin/issues/380.
Closes https://github.com/astral-sh/puffin/issues/421.
2023-11-15 18:22:11 +00:00