Commit Graph

143 Commits

Author SHA1 Message Date
konsti 5820a9d937
Update dependencies (#794)
Pull in a bunch of updates so they get some testing before we announce
the project. textwrap 0.16 is blocked on miette updating, http 1.0 on
reqwest.
2024-01-05 11:40:12 -05:00
konsti 673bece595
Allow `pip-compile` without a venv (#494)
The semantics are a bit unintuitive because `--python-version` is a
preference when looking for a python version without a venv, but if we
don't find that exact version we'll take `python3` and patch the
markers. This will make more sense once we start provisioning python
builds.
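
The fallback described above can be sketched roughly as follows. This is
an illustration only, not the code from this PR: `Interpreter`,
`Selection`, and `select_interpreter` are hypothetical names, and the
"patch the markers" step is modelled simply as carrying the requested
version alongside whichever interpreter was found.

```rust
/// A discovered interpreter: executable path plus its (major, minor) version.
/// Hypothetical type, for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct Interpreter {
    path: String,
    version: (u8, u8),
}

/// The interpreter to run, and the Python version the resolver should
/// assume when evaluating markers.
#[derive(Debug, PartialEq)]
struct Selection {
    interpreter: Interpreter,
    marker_version: (u8, u8),
}

/// Prefer an interpreter matching the requested version. If none matches,
/// fall back to the default (`python3`) but keep the requested version for
/// the markers.
fn select_interpreter(
    requested: Option<(u8, u8)>,
    candidates: &[Interpreter],
    default: &Interpreter,
) -> Selection {
    match requested {
        Some(version) => {
            let interpreter = candidates
                .iter()
                .find(|candidate| candidate.version == version)
                .unwrap_or(default)
                .clone();
            Selection { interpreter, marker_version: version }
        }
        // No preference: use the default interpreter and its own version.
        None => Selection {
            interpreter: default.clone(),
            marker_version: default.version,
        },
    }
}
```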

We can now resolve black with both python 3.8 and 3.12, with or without
that python version being in scope. In the example below,
`PATH=$HOME/.cargo/bin:/usr/bin` removes the pyenv builds and leaves
only `python3`, which is python 3.11.

```console
$ RUST_LOG=puffin::commands=debug cargo run --bin puffin -q -- pip-compile -v scripts/benchmarks/requirements/black.in --python-version py38
    0.004108s DEBUG puffin::commands::pip_compile Using Python 3.8 at /home/konsti/.local/bin/python3.8
Resolved 8 packages in 44ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
#    puffin pip-compile -v scripts/benchmarks/requirements/black.in --python-version py38
black==23.11.0
[...]
platformdirs==4.0.0
    # via black
tomli==2.0.1
    # via black
typing-extensions==4.8.0
    # via black
$ PATH=$HOME/.cargo/bin:/usr/bin RUST_LOG=puffin::commands=debug cargo run --bin puffin -q -- pip-compile -v scripts/benchmarks/requirements/black.in --python-version py38
    0.004315s DEBUG puffin::commands::pip_compile Using Python 3.11 at /usr/bin/python3
Resolved 8 packages in 43ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
#    puffin pip-compile -v scripts/benchmarks/requirements/black.in --python-version py38
black==23.11.0
[...]
platformdirs==4.0.0
    # via black
tomli==2.0.1
    # via black
typing-extensions==4.8.0
    # via black
```

```console
$ RUST_LOG=puffin::commands=debug cargo run --bin puffin -q -- pip-compile -v scripts/benchmarks/requirements/black.in --python-version py312
    0.004216s DEBUG puffin::commands::pip_compile Using Python 3.12 at /home/konsti/.local/bin/python3.12
Resolved 6 packages in 37ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
#    puffin pip-compile -v scripts/benchmarks/requirements/black.in --python-version py312
black==23.11.0
[...]
platformdirs==4.0.0
    # via black
$ PATH=$HOME/.cargo/bin:/usr/bin RUST_LOG=puffin::commands=debug cargo run --bin puffin -q -- pip-compile -v scripts/benchmarks/requirements/black.in --python-version py312
    0.004190s DEBUG puffin::commands::pip_compile Using Python 3.11 at /usr/bin/python3
Resolved 6 packages in 39ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
#    puffin pip-compile -v scripts/benchmarks/requirements/black.in --python-version py312
black==23.11.0
[...]
platformdirs==4.0.0
    # via black
```

Fixes #235.

Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2024-01-05 15:01:06 +00:00
Zanie Blue 5e04a95c45
Disable line wrapping during scenario tests (#784)
Adds support for a `PUFFIN_NO_WRAP` environment variable which disables
line wrapping in `miette` output.

We set this variable in the scenario tests to improve the readability of
snapshots.

I contributed the ability to disable line wrapping upstream at
https://github.com/zkat/miette/pull/328
2024-01-04 19:07:16 +00:00
konsti 2db9135c51
Update pubgrub to 78b8add6942766e5fb070bbda1de570e93d6399f (#783)
Pull in the latest perf improvements
2024-01-04 15:55:35 +00:00
konsti cd43708369
Flag to force latest version in resolve-many (#741)
Also fixes color when redirecting puffin-dev to a log file.
2024-01-02 11:04:26 +00:00
konsti 3f8dc9f5bb
Update pubgrub (#737)
Pull in https://github.com/pubgrub-rs/pubgrub/pull/170 and
https://github.com/pubgrub-rs/pubgrub/pull/171
2023-12-28 21:13:27 +00:00
Charlie Marsh 188ab75769
Split `File` into internal and external type (#729)
## Summary

This PR makes the `pypi_types::File` a response-only type (i.e., a type
that's only used when deserializing over the wire), and adds a separate
internal `File` type. Right now, the representations are similar, but
already, we can avoid the "lenient" deserialization on our internal
`File` type, and avoid the special-casing of the property names that's
required in the JSON. Over time, we can evolve this representation
entirely separately from the representation we receive from PyPI and
other indexes.
2023-12-25 15:42:28 -05:00
Charlie Marsh 6ff21374dc
Split `puffin-cache` into Puffin-specific and generic utilities (#728)
This crate started off as generic caching utilities, but we started
adding a lot of Puffin-specific stuff (like the cache buckets
abstraction that knows about Git vs. direct URL vs. indexes and so on).
This PR moves the generic stuff into a new `cache-key` crate.
2023-12-25 14:38:56 +00:00
Charlie Marsh 187ccef4e1
Cache `Tags` on `Interpreter` (#726)
2023-12-25 13:41:10 +00:00
Charlie Marsh 5b2e381f87
Remove `platform-tags` dependency on `puffin-interpreter` (#725)
Cuts off a large internal dependency chain from what is otherwise a very
general crate.
2023-12-24 23:06:50 +00:00
Charlie Marsh 343880820b
Un-escape HTML entities when decoding (#723)
I don't have a good testing strategy here (I'm manually testing against
`devpi` via `packse`), but the HTML index uses (e.g.)
`data-requires-python="&gt;=3.8"`, so we need to decode.
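
A minimal sketch of the decoding step, assuming only the five predefined
entities need handling. `unescape_html` is a hypothetical helper, not
the function added in this PR, and a real implementation would also need
numeric character references in general.

```rust
/// Decode the handful of HTML entities that commonly appear in attribute
/// values such as `data-requires-python`. Hypothetical helper.
fn unescape_html(value: &str) -> String {
    value
        .replace("&lt;", "<")
        .replace("&gt;", ">")
        .replace("&quot;", "\"")
        .replace("&#39;", "'")
        // `&amp;` goes last so that `&amp;gt;` decodes to `&gt;`, not `>`.
        .replace("&amp;", "&")
}
```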
2023-12-24 16:35:45 -05:00
konsti e23292641f
Add pypi 10k packages with most dependents dataset (#711)
From manual inspection, this dataset generated through the [libraries.io
API](https://libraries.io/api#project-search) seems more mainstream than
the current 8k one, which is also preserved. I've added the dataset to
the repo because the API requires an API key.
2023-12-24 18:31:52 +00:00
Charlie Marsh 5bce699ee1
Add support for HTML indexes (#719)
## Summary

This PR adds support for HTML index responses (as with
`--index-url=https://download.pytorch.org/whl`).

Closes https://github.com/astral-sh/puffin/issues/412.
2023-12-24 16:04:00 +00:00
konsti e60f0ec732
Update pubgrub (#713)
Easier than I expected: we simply never construct the pubgrub error
variants since we have our own main loop. The `unreachable!()`s can be
removed when the never type is stabilized.
2023-12-20 23:56:59 +01:00
Charlie Marsh 98fcb76015
Lock entire virtualenv during modifying commands (#695)
These commands all assume that the `site-packages` are constant
throughout.

Closes #691.
2023-12-18 16:44:45 -05:00
konsti 89ca0d68b9
`exclude_newer` in puffin-dev resolve-cli (#684)
Internal dev tool change.
2023-12-18 14:06:54 +00:00
konsti f059c6e6a6
Support editable in pip-sync and pip-compile (#587)
Support `-e path/to/dir` in pip-sync and pip-compile.
2023-12-16 22:37:34 +00:00
konsti 71964ec7a8
Switch to msgpack in the cached client (#662)
This gives a 1.23x speedup on transformers-extras. We could change to
msgpack for the entire cache if we want. I only tried this format and
postcard so far, where postcard was much slower (like 1.6s).

I don't actually want to merge it like this; I wanted to figure out the
ballpark of improvement for switching away from json.

```
hyperfine --warmup 3 --runs 10 "target/profiling/puffin pip-compile --cache-dir cache-msgpack scripts/requirements/transformers-extras.in" "target/profiling/branch pip-compile scripts/requirements/transformers-extras.in"
Benchmark 1: target/profiling/puffin pip-compile --cache-dir cache-msgpack scripts/requirements/transformers-extras.in
  Time (mean ± σ):     179.1 ms ±   4.8 ms    [User: 157.5 ms, System: 48.1 ms]
  Range (min … max):   174.9 ms … 188.1 ms    10 runs

Benchmark 2: target/profiling/branch pip-compile scripts/requirements/transformers-extras.in
  Time (mean ± σ):     221.1 ms ±   6.7 ms    [User: 208.1 ms, System: 46.5 ms]
  Range (min … max):   213.5 ms … 235.5 ms    10 runs

Summary
  target/profiling/puffin pip-compile --cache-dir cache-msgpack scripts/requirements/transformers-extras.in ran
    1.23 ± 0.05 times faster than target/profiling/branch pip-compile scripts/requirements/transformers-extras.in
```

Disadvantage: We can't manually look into the cache anymore to debug
things

- [ ] Check more formats. I have only tested json, msgpack and postcard
so far; there should be other formats, too
- [x] Switch over `CachedByTimestamp` serialization (for the interpreter
caching)
- [x] Switch over error handling and make sure puffin is still resilient
to cache failure
2023-12-16 21:01:35 +00:00
konsti 620f73b38b
Speed up version parsing for a 1.27±0.03 speedup in transformers-extras with conservative changes (#660)
Two low-hanging fruits as optimizations for version parsing: A fast path
for release only versions and removing the regex from version specifiers
(still calling into version's parsing regex if required). This enables
optimizing the serde format since we now see the serde part instead of
only PEP 440 parsing. I intentionally didn't rewrite the full PEP 440 at
this step.
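
The release-only fast path could look something like the sketch below.
It is an assumption about the shape of the optimization, not the code
from this PR: `parse_release_only` is a hypothetical name, and a `None`
result stands for "fall back to the full PEP 440 parser".

```rust
/// Fast path for versions made only of dot-separated numbers, e.g.
/// `23.11.0`. Returns `None` for anything else (pre-releases, epochs,
/// local versions, empty segments), which the caller would hand to the
/// slower, complete parser.
fn parse_release_only(version: &str) -> Option<Vec<u64>> {
    let simple = !version.is_empty()
        && version.bytes().all(|b| b.is_ascii_digit() || b == b'.');
    if !simple {
        return None;
    }
    // An empty segment (as in `1..2`) or an overflowing number fails to
    // parse, which turns the whole result into `None`.
    version
        .split('.')
        .map(|segment| segment.parse::<u64>().ok())
        .collect()
}
```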

```console
$ hyperfine --warmup 5 --runs 50 "target/profiling/puffin pip-compile scripts/requirements/transformers-extras.in" "target/profiling/main pip-compile scripts/requirements/transformers-extras.in"
  Benchmark 1: target/profiling/puffin pip-compile scripts/requirements/transformers-extras.in
    Time (mean ± σ):     217.1 ms ±   3.2 ms    [User: 194.0 ms, System: 55.1 ms]
    Range (min … max):   211.0 ms … 228.1 ms    50 runs

  Benchmark 2: target/profiling/main pip-compile scripts/requirements/transformers-extras.in
    Time (mean ± σ):     276.7 ms ±   5.7 ms    [User: 252.4 ms, System: 54.6 ms]
    Range (min … max):   268.9 ms … 303.5 ms    50 runs

  Summary
    target/profiling/puffin pip-compile scripts/requirements/transformers-extras.in ran
      1.27 ± 0.03 times faster than target/profiling/main pip-compile scripts/requirements/transformers-extras.in
```

---------

Co-authored-by: Andrew Gallant <andrew@astral.sh>
2023-12-15 14:03:35 -05:00
Charlie Marsh 9470c20e7a
Avoid double resolution during source builds (#656)
## Summary

This PR ensures that we re-use the resolution to install the build
dependencies when building a source distribution. Currently, we only
pass along the list of requirements, and then use the `Finder` to map
each requirement to a distribution. But we already determine the correct
distribution when resolving!

Closes https://github.com/astral-sh/puffin/issues/655.
2023-12-15 17:27:16 +00:00
Charlie Marsh ed8dfbfcf7
Preserve verbatim URLs (#639)
## Summary

This PR adds a `VerbatimUrl` struct to preserve verbatim URLs throughout
the resolution and installation pipeline. In short, alongside the parsed
`Url`, we also keep the URL as written by the user. This enables us to
display the URL exactly as written by the user, rather than the
serialized path that we use internally.
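
As a rough sketch of the idea (not the struct from this PR): the field
names, `from_given`, and `raw` are assumptions, a plain `String` stands
in for the parsed `Url`, and the trimming is a placeholder for real URL
normalization.

```rust
use std::fmt;

/// A URL together with the text the user originally wrote.
#[derive(Debug, Clone)]
struct VerbatimUrl {
    /// Normalized form used internally (stand-in for a parsed `Url`).
    url: String,
    /// The URL exactly as written by the user, if known.
    given: Option<String>,
}

impl VerbatimUrl {
    /// Keep the user's spelling and derive the internal form from it.
    fn from_given(given: &str) -> Self {
        // Placeholder normalization, for illustration only.
        let url = given.trim().trim_end_matches('/').to_string();
        Self { url, given: Some(given.to_string()) }
    }

    /// The internal form, for comparisons and cache keys.
    fn raw(&self) -> &str {
        &self.url
    }
}

impl fmt::Display for VerbatimUrl {
    /// Display prefers the verbatim text over the internal form.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.given.as_deref().unwrap_or(&self.url))
    }
}
```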

This will be especially useful once we start expanding environment
variables since, at that point, we'll be able to write the version of
the URL that includes the _unexpanded_ environment variable to the
output file.
2023-12-14 15:03:39 +00:00
Charlie Marsh db7e2dedbb
Move archive extraction into its own crate (#647)
We have some shared utilities beyond `puffin-build` and
`puffin-distribution`, and further, I want to be able to access the
sdist archive extraction logic from `puffin-distribution`. This is
really generic, so moving into its own crate.
2023-12-14 04:49:09 +00:00
Charlie Marsh 920e10fc8f
Use `FxHash` consistently (#632)
2023-12-13 05:36:03 +00:00
Charlie Marsh a24eb57e93
Make warnings user-facing (#628)
## Summary

Now, `puffin_warnings::warn_once` and `puffin_warnings::warn` will go to
`stderr`, as long as the user isn't running under `--quiet`. Previously,
these went through `tracing`, and so were only visible when running
under `--verbose`.
2023-12-12 21:24:38 -05:00
Zanie Blue 490fb55ac5
Use available versions to simplify unsat error reports (#547)
Uses https://github.com/pubgrub-rs/pubgrub/pull/156 to consolidate
version ranges in error reports using the actual available versions for
each package.

Alternative to https://github.com/zanieb/pubgrub/pull/8 which implements
this behavior as a method in the `Reporter` — here it's implemented in
our custom report formatter (#521) instead which requires no upstream
changes.

Requires https://github.com/zanieb/pubgrub/pull/11 to only retrieve the
versions for packages that will be used in the report.

This is a work in progress. Some things to do:
- ~We may want to allow lazy retrieval of the version maps from the
formatter~
- [x] We should probably create a separate error type for no solution
instead of mixing them with other resolve errors
- ~We can probably do something smarter than creating vectors to hold
the versions~
- [x] This degrades error messages when a single version is not
available, we'll need to special case that
- [x] It seems safer to coerce the error type in `resolve` instead of
`solve` if feasible
2023-12-12 23:25:16 +00:00
Charlie Marsh 1181288078
Download, build, and install in a single pipeline phase (#605)
## Summary

At present, we have two separate phases within the installation pipeline
related to populating wheels into the cache. The first phase downloads
the distribution, and then builds any source distributions into wheels;
the second phase unzips all the built wheels into the cache.

This PR merges those two phases into one, such that we seamlessly
download, build, and unzip wheels in one pass. This is more efficient,
since we can start unzipping while we build. It also ensures that if the
install _fails_ partway through, we don't end up with a bunch of
downloaded wheels that we never had a chance to unzip. The code is also
much simpler.

The main downside is that the user-facing feedback isn't as granular,
since we only have one phase and one progress bar for what was
originally three distinct phases.

Closes https://github.com/astral-sh/puffin/issues/571.

## Test Plan

I ran the benchmark script on two separate requirements files, and saw a
7% and 31% speedup respectively:

```text
+ TARGET=./scripts/benchmarks/requirements.txt
+ hyperfine --runs 100 --warmup 10 --prepare 'virtualenv --clear .venv' './target/release/main pip-sync ./scripts/benchmarks/requirements.txt --no-cache' --prepare 'virtualenv --clear .venv' './target/release/puffin pip-sync ./scripts/benchmarks/requirements.txt --no-cache'
Benchmark 1: ./target/release/main pip-sync ./scripts/benchmarks/requirements.txt --no-cache
  Time (mean ± σ):     269.4 ms ±  33.0 ms    [User: 42.4 ms, System: 117.5 ms]
  Range (min … max):   221.7 ms … 446.7 ms    100 runs

Benchmark 2: ./target/release/puffin pip-sync ./scripts/benchmarks/requirements.txt --no-cache
  Time (mean ± σ):     250.6 ms ±  28.3 ms    [User: 41.5 ms, System: 127.4 ms]
  Range (min … max):   207.6 ms … 336.4 ms    100 runs

Summary
  './target/release/puffin pip-sync ./scripts/benchmarks/requirements.txt --no-cache' ran
    1.07 ± 0.18 times faster than './target/release/main pip-sync ./scripts/benchmarks/requirements.txt --no-cache'
```

```text
+ TARGET=./scripts/benchmarks/requirements-large.txt
+ hyperfine --runs 100 --warmup 10 --prepare 'virtualenv --clear .venv' './target/release/main pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache' --prepare 'virtualenv --clear .venv' './target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache'
Benchmark 1: ./target/release/main pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache
  Time (mean ± σ):      5.053 s ±  0.354 s    [User: 1.413 s, System: 6.710 s]
  Range (min … max):    4.584 s …  6.333 s    100 runs

Benchmark 2: ./target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache
  Time (mean ± σ):      3.845 s ±  0.225 s    [User: 1.364 s, System: 6.970 s]
  Range (min … max):    3.482 s …  4.715 s    100 runs

Summary
  './target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache' ran
```
2023-12-11 15:42:29 +00:00
Charlie Marsh 32f54a5947
Use async `Command` for wheel build operations (#601)
Incredibly, this speeds up the install on a large project from 2m6s to
50s.
2023-12-09 16:20:52 +00:00
Charlie Marsh a24534b0ce
Use `rustc-hash` instead of `fxhash` crate (#594)
`fxhash` is the old, less maintained version of this crate
(`rustc-hash`). We use the latter in Ruff.
2023-12-08 20:27:49 +00:00
konsti 6005d7a552
Keep track of in flight unzips using `OnceMap` (#544)
I saw warnings when we were e.g. unzipping wheel and setuptools in two
tasks at the same time. We now keep track of in flight unzips.

This introduces a `OnceMap` abstraction which we also use in the
resolver.
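
To illustrate the idea, here is a blocking, thread-based sketch. It is
not the `OnceMap` from this PR, which is presumably async; the method
names `register`, `done`, and `wait` are assumptions made for the
example.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::{Condvar, Mutex};

/// State of one key: work has started, or the result is available.
enum Slot<V> {
    InFlight,
    Done(V),
}

/// Runs each keyed job at most once and lets other callers wait for it.
struct OnceMap<K, V> {
    slots: Mutex<HashMap<K, Slot<V>>>,
    changed: Condvar,
}

impl<K: Eq + Hash, V: Clone> OnceMap<K, V> {
    fn new() -> Self {
        Self { slots: Mutex::new(HashMap::new()), changed: Condvar::new() }
    }

    /// Returns `true` if the caller is the first to claim `key` and should
    /// therefore do the work (e.g. unzip the wheel).
    fn register(&self, key: K) -> bool {
        let mut slots = self.slots.lock().unwrap();
        if slots.contains_key(&key) {
            false
        } else {
            slots.insert(key, Slot::InFlight);
            true
        }
    }

    /// Publish the result and wake everyone waiting on the map.
    fn done(&self, key: K, value: V) {
        self.slots.lock().unwrap().insert(key, Slot::Done(value));
        self.changed.notify_all();
    }

    /// Block until the value for `key` is ready. Returns `None` if the key
    /// was never registered.
    fn wait(&self, key: &K) -> Option<V> {
        let mut slots = self.slots.lock().unwrap();
        loop {
            let ready = match slots.get(key) {
                None => return None,
                Some(Slot::Done(value)) => Some(value.clone()),
                Some(Slot::InFlight) => None,
            };
            if let Some(value) = ready {
                return Some(value);
            }
            slots = self.changed.wait(slots).unwrap();
        }
    }
}
```

A task that gets `false` from `register` would call `wait` instead of
starting a second unzip of the same wheel.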
2023-12-08 20:18:11 +00:00
Charlie Marsh 4b8642c6f7
Enable selective cache purging in `puffin clean` (#589)
## Summary

This PR enables `puffin clean` to accept package names as command line
arguments, and selectively purge entries from the cache tied to the
given package.

Related to #572.

## Test Plan

Modified all the caching tests to run an additional step to (1) purge
the cache, and (2) re-install the package.
2023-12-08 19:51:32 +00:00
Zanie Blue ef7be9103c
Parse `SimpleJson` into categorized data in the client (#522)
Extends #517 with a suggestion from @konstin to parse the `SimpleJson`
into an intermediate type `SimpleMetadata(BTreeMap<Version,
VersionFiles>)` before converting to a `VersionMap`. This reduces the
number of times we need to parse the response. Additionally, we cache
the parsed response now instead of `SimpleJson`.

`VersionFiles` stores two vectors with
`WheelFilename`/`SourceDistFilename` and `File` tuples. These can be
iterated over together or separately. A new enum `DistFilename` was
added to capture the `SourceDistFilename` and `WheelFilename` variants
allowing iteration over both vectors.
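
The shape of these types might be sketched as below. Only the type
names come from the description above; the field names, the `all`
method, and the simplified `Version`, `File`, and filename types are
stand-ins invented for the example.

```rust
use std::collections::BTreeMap;

// Simplified stand-ins for the real types.
type Version = Vec<u64>;
#[derive(Debug, Clone, PartialEq)]
struct WheelFilename(String);
#[derive(Debug, Clone, PartialEq)]
struct SourceDistFilename(String);
#[derive(Debug, Clone, PartialEq)]
struct File {
    url: String,
}

/// Either kind of filename, so both vectors can be iterated together.
#[derive(Debug, Clone, PartialEq)]
enum DistFilename {
    SourceDistFilename(SourceDistFilename),
    WheelFilename(WheelFilename),
}

/// All files for one version, already split by distribution kind.
#[derive(Debug, Default)]
struct VersionFiles {
    wheels: Vec<(WheelFilename, File)>,
    source_dists: Vec<(SourceDistFilename, File)>,
}

impl VersionFiles {
    /// Iterate over source dists first, then wheels (order is arbitrary here).
    fn all(&self) -> impl Iterator<Item = (DistFilename, &File)> + '_ {
        self.source_dists
            .iter()
            .map(|(name, file)| (DistFilename::SourceDistFilename(name.clone()), file))
            .chain(
                self.wheels
                    .iter()
                    .map(|(name, file)| (DistFilename::WheelFilename(name.clone()), file)),
            )
    }
}

/// The parsed (and cached) form of a simple index response.
#[derive(Debug, Default)]
struct SimpleMetadata(BTreeMap<Version, VersionFiles>);
```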
2023-12-07 11:04:47 -06:00
Charlie Marsh aa065f5c97
Modify install plan to support all distribution types (#581)
This PR adds caching support for built wheels in the installer.
Specifically, the `RegistryWheelIndex` now indexes both downloaded and
built wheels (from registries), and we have a new `BuiltWheelIndex` that
takes a subdirectory and returns the "best-matching" compatible wheel.

Closes #570.
2023-12-07 04:43:34 +00:00
konsti 366c389385
Parse editable installs (#564)
Parse `-e` for editable installs in `requirements.txt`.

Unlike all the other requirements, editable installs don't have the name
of the package specified.
2023-12-06 18:21:15 +01:00
konsti 3f4d7b7826
Improve path source dist caching (#578)
Path distribution cache reading errors are no longer fatal.

We now invalidate a path file source dist if its modification
timestamp changed, and invalidate a path dir source dist if
`pyproject.toml` or alternatively `setup.py` changed. These seem like
good choices, since changing `pyproject.toml` should trigger a rebuild
and the user can `touch` the file as part of their workflow.
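
A minimal sketch of the timestamp check, assuming the cached entry
stores the modification time it was built from. The field names and the
free function `fresh` are hypothetical.

```rust
use std::time::SystemTime;

/// Cached data tagged with the modification time of the file it came from.
#[derive(Debug, Clone)]
struct CachedByTimestamp<T> {
    timestamp: SystemTime,
    data: T,
}

/// Return the cached data only if the file's current modification time
/// still matches the recorded one; otherwise the entry counts as stale.
fn fresh<T>(entry: &CachedByTimestamp<T>, current: SystemTime) -> Option<&T> {
    if entry.timestamp == current {
        Some(&entry.data)
    } else {
        None
    }
}
```

In practice `current` would come from something like
`std::fs::metadata(path)?.modified()?` on the source dist file,
`pyproject.toml`, or `setup.py`.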

`CachedByTimestamp` is now a shared util. It doesn't have methods as I
don't think it's worth it yet for two users.

Closes #478

TODO(konstin): Write a test. This is probably twice as much work as the
fix itself, so I made this PR without one for now.
2023-12-06 11:47:01 -05:00
Charlie Marsh a15da36d74
Avoid removing local wheels when unzipping (#560)
## Summary

When installing a local wheel, we need to avoid removing the zipped
wheel (since it lives outside of the cache), _and_ need to ensure that
we unzip the wheel into the cache (rather than replacing the zipped
wheel, which may even live outside of the project).

Closes https://github.com/astral-sh/puffin/issues/553.
2023-12-05 17:50:08 +00:00
Charlie Marsh 6f055ecf3b
Remove existing built wheels when building source distributions (#559)
This PR modifies the source distribution building to replace any
existing targets after building the new wheel. In some cases, the
existence of an existing target may be indicative of a bug, so we warn.
It's partially a workaround for some (but not all) of the errors in
https://github.com/astral-sh/puffin/issues/554.
2023-12-05 12:45:24 -05:00
Zanie Blue 37ca2e2928
Bump pubgrub for latest upstream (#525)
https://github.com/pubgrub-rs/pubgrub/pull/157
2023-12-04 09:09:30 -06:00
konsti 6dc8ebcb90
Test interpreter cache invalidation (#540)
Add missing test for #529/#508.
2023-12-04 10:03:43 +00:00
Charlie Marsh ee2fca3a48
Add CACHEDIR and .gitignore tags to cache directories (#526)
## Summary

Even if this will typically be in the user's application folder (rather
than a local directory), it's still a good practice.
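
For reference, tagging a directory this way can be sketched as follows.
`tag_cache_dir` is a hypothetical helper, not the function from this
PR, and the `.gitignore` contents are an assumption. The signature line
is the one defined by the Cache Directory Tagging Specification.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Contents of `CACHEDIR.TAG`. Tools look for this exact first line.
const CACHEDIR_TAG: &str =
    "Signature: 8a477f597d28d172789f06886806bc55\n# This file is a cache directory tag.\n";

/// Mark `dir` as a cache directory for backup tools and for Git.
fn tag_cache_dir(dir: &Path) -> io::Result<()> {
    fs::create_dir_all(dir)?;
    fs::write(dir.join("CACHEDIR.TAG"), CACHEDIR_TAG)?;
    // Ignore everything inside the cache directory (assumed contents).
    fs::write(dir.join(".gitignore"), "*\n")?;
    Ok(())
}
```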

Closes https://github.com/astral-sh/puffin/issues/280.
2023-12-02 00:37:51 +00:00
konsti 9806901a16
Consolidate wheel caches (#524)
After this change, two wheel caches remain: `built-wheels-v0` and
`wheels-v0`, docs screenshots below. Each contains the wheel metadata,
the cache policy, and the zipped or unzipped wheels under the same name.

The zipped/unzipped strategy is as follows: In `pip-compile`, when we
build a wheel, we store it zipped. When `pip-sync` or a source dist
build in `pip-compile` need to install the wheel, we unzip it, remove
the file and replace it with the unzipped wheel.

This removes `WheelCache` and `UrlIndex` in favor of `Cache` plus
`WheelCache`. The non-built wheel cache now considers index urls and the
url for url wheels.

I'm unsure if we need the `Unzipper` type, this could just be a
function.

I moved `no_index` into `IndexUrls` and started using `IndexUrl` up to
the clap level.

I left a number of TODOs in the code, namely performing the actual
invalidation of unzipped wheels and making the `InstallPlan` understand
cache invalidation (i.e. uninstall wheels when their remote changed).


![image](https://github.com/astral-sh/puffin/assets/6826232/c4d45979-485b-4954-848d-fd3347ee2510)
2023-12-01 20:16:33 +00:00
Zanie Blue 2a8544df9e
Use a custom pubgrub report formatter (#521)
Uses https://github.com/zanieb/pubgrub/pull/10 to drastically simplify
our reporter implementation. This will allow us to make use of upstream
improvements to the reporter e.g.
https://github.com/zanieb/pubgrub/pull/8 without multiple duplicative
pull requests.
2023-12-01 13:36:12 -06:00
Zanie Blue efcc4f1409
Use upstream commit for reflink-copy requirement (#523)
https://github.com/cargo-bins/reflink-copy/pull/51 was merged
2023-12-01 10:58:24 +00:00
Zanie Blue 5f1f207628
Recursively merge existing package directories on installation (#516)
Previously, when installing a package we would delete the target
directory before copying (or linking) the contents of the package.
However, this means that we do not properly support namespace packages,
which can share a target directory. Instead, the last package to be
installed would override existing packages. Since we install packages
in parallel, this could result in a race condition where the target
directory already exists which is not allowed when using `clonefile`.
See example error in #515.
c7e63d2dce
provides a regression test for this — it fails on `main`.

Here, we implement a recursive merge when the target directory already
exists. Both packages will be installed into the same directory. We no
longer delete the target directory, which seems okay since we uninstall
packages before installing now.

When files conflict, we will likely throw an error still. The correct
behavior to implement in this case is unclear, as if we just take "first
write wins" or "last write wins" we could end up with some files from
one package and some from another resulting in two broken packages. A
possible solution here is to lock the target directories while copying.
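
A simplified sketch of such a recursive merge, using plain copies
instead of `clonefile` or links. `merge_dir` is a hypothetical name,
and erroring on a conflicting file is only one of the possible
behaviors discussed above.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Copy the tree under `src` into `dst`, reusing directories that already
/// exist and failing if a file would be overwritten.
fn merge_dir(src: &Path, dst: &Path) -> io::Result<()> {
    // Succeeds whether or not `dst` already exists.
    fs::create_dir_all(dst)?;
    for entry in fs::read_dir(src)? {
        let entry = entry?;
        let target = dst.join(entry.file_name());
        if entry.file_type()?.is_dir() {
            merge_dir(&entry.path(), &target)?;
        } else if target.exists() {
            return Err(io::Error::new(
                io::ErrorKind::AlreadyExists,
                format!("conflicting file: {}", target.display()),
            ));
        } else {
            fs::copy(entry.path(), &target)?;
        }
    }
    Ok(())
}
```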
2023-11-30 10:14:51 -06:00
konsti 929df586fb
Skip tf-models-nightly in resolve-many dev script for now (#510)
`tf-models-nightly` has pathological backtracking behaviour, skip it for
now so we can benchmark the rest.
2023-11-28 18:25:32 +00:00
konsti d89fbeb642
Migrate interpreter query to custom caching (#508)
This removes the last usage of cacache by replacing it with a custom,
flat json caching keyed by the digest of the executable path.


![image](https://github.com/astral-sh/puffin/assets/6826232/8f777c4c-1f1b-4656-ba7b-002175270556)

A step towards #478. I've made `CachedByTimestamp<T>` generic over `T`
but intentionally not moved it to `puffin-cache` yet.
2023-11-28 17:14:59 +00:00
konsti 5435d44756
Introduce `Cache`, `CacheBucket` and `CacheEntry` (#507)
This is mostly a mechanical refactor that moves 80% of our code to the
same cache abstraction.

It introduces `Cache`, which abstracts away the path of the cache
and the temp dir drop and is passed throughout the codebase. To get a
specific cache bucket, you need to request your `CacheBucket` from
`Cache`. `CacheBucket` centralizes the names of all cache
buckets, moving them away from the string constants spread throughout
the crates.
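
Roughly, the relationship between the two could be sketched like this.
The variants and the method names `to_str`, `from_path`, and `bucket`
are illustrative assumptions; the directory names are the ones
mentioned elsewhere in this log.

```rust
use std::path::PathBuf;

/// One variant per cache bucket, so directory names live in one place.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CacheBucket {
    Interpreter,
    Simple,
    Wheels,
    BuiltWheels,
}

impl CacheBucket {
    /// Versioned directory name of the bucket.
    fn to_str(self) -> &'static str {
        match self {
            Self::Interpreter => "interpreter-v0",
            Self::Simple => "simple-v0",
            Self::Wheels => "wheels-v0",
            Self::BuiltWheels => "built-wheels-v0",
        }
    }
}

/// Handle to the cache root that is passed around instead of raw paths.
struct Cache {
    root: PathBuf,
}

impl Cache {
    fn from_path(root: impl Into<PathBuf>) -> Self {
        Self { root: root.into() }
    }

    /// Directory of a specific bucket under the cache root.
    fn bucket(&self, bucket: CacheBucket) -> PathBuf {
        self.root.join(bucket.to_str())
    }
}
```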

Specifically for working with the `CachedClient`, there is a
`CacheEntry`. I'm not sure yet if that is a strict improvement over
`cache_dir: PathBuf, cache_file: String`; I may have to revisit that
later.

The interpreter cache moved into `interpreter-v0`.

We can use the `CacheBucket` page to document the cache structure in
each bucket:


![image](https://github.com/astral-sh/puffin/assets/6826232/b023fdfb-e34d-4c2d-8663-b5f73937a539)
2023-11-28 17:11:14 +00:00
konsti 8855f44b5f
Move simple index queries to `CachedClient` (#504)
Replaces the usage of `http-cache-reqwest` for simple index queries with
our custom cached client, removing `http-cache-reqwest` altogether.

The new cache paths are `<cache>/simple-v0/<index>/<package_name>.json`.
I could not test with a non-pypi index since I'm not aware of any other
json indices (jax and torch are both html indices).

In a future step, we can transform the response to be a
`HashMap<Version, {source_dists: Vec<(SourceDistFilename, File)>,
wheels: Vec<(WheelFilename, File)>}` (independent of python version, this
cache is used by all environments together). This should speed up cache
deserialization a bit, since we don't need to try source dist and wheel
anymore and drop incompatible dists, and it should make building the
`VersionMap` simpler. We can speed this up even further by splitting
into a version list and the info for each version. I'm mentioning this
because deserialization was a major bottleneck in the rust part of the
old python prototype.

Fixes #481
2023-11-28 00:11:03 +00:00
konsti d54e780843
Source dist metadata refactor (#468)
## Summary and motivation

For a given source dist, we store the metadata of each wheel built
through it in `built-wheel-metadata-v0/pypi/<source dist
filename>/metadata.json`. During resolution, we check the cache status
of the source dist. If it is fresh, we check `metadata.json` for a
matching wheel. If there is one we use that metadata, if there isn't, we
build one. If the source is stale, we build a wheel and overwrite
`metadata.json` with that single wheel. This PR thereby ties the local
built wheel metadata cache to the freshness of the remote source dist.
This functionality is available through `SourceDistCachedBuilder`.
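
The decision logic in this paragraph can be summarized with the sketch
below. It is not the `SourceDistCachedBuilder` code: `Action`, `plan`,
and the `is_compatible` callback are hypothetical, and metadata is
reduced to a string.

```rust
use std::collections::BTreeMap;

// Stand-in for parsed wheel metadata.
type Metadata = String;

/// What to do for a source dist during resolution.
#[derive(Debug, PartialEq)]
enum Action {
    /// Source dist is fresh and a matching wheel is cached.
    UseCached(Metadata),
    /// Source dist is fresh but no matching wheel: build and add an entry.
    BuildAndInsert,
    /// Source dist is stale: build and replace `metadata.json` entirely.
    BuildAndReplace,
}

/// `cached` maps built wheel filenames to their metadata, mirroring the
/// contents of one `metadata.json`.
fn plan(
    source_dist_fresh: bool,
    cached: &BTreeMap<String, Metadata>,
    is_compatible: impl Fn(&str) -> bool,
) -> Action {
    if !source_dist_fresh {
        return Action::BuildAndReplace;
    }
    match cached.iter().find(|(wheel, _)| is_compatible(wheel.as_str())) {
        Some((_, metadata)) => Action::UseCached(metadata.clone()),
        None => Action::BuildAndInsert,
    }
}
```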

`puffin_installer::Builder`, `puffin_installer::Downloader` and
`Fetcher` are removed; instead, there is now `FetchAndBuild`, which calls
into the also new `SourceDistCachedBuilder`. `FetchAndBuild` is the new
main high-level abstraction: it spawns parallel fetching/building; for
wheel metadata it calls into the registry client, for wheel files it
fetches them, and for source dists it calls `SourceDistCachedBuilder`.
It handles locks around builds and, newly added, inter-process file
locking for git operations.

Fetching and building source distributions now happens in parallel in
`pip-sync`, i.e. we don't have to wait for the largest wheel to be
downloaded to start building source distributions.

In a follow-up PR, I'll also clear built wheels when they've become
stale.

Another effect is that in a fully cached resolution, we need neither zip
reading nor email parsing.

Closes #473

## Source dist cache structure

Entries by supported sources:
* `<build wheel metadata cache>/pypi/foo-1.0.0.zip/metadata.json`
* `<build wheel metadata cache>/<sha256(index-url)>/foo-1.0.0.zip/metadata.json`
* `<build wheel metadata cache>/url/<sha256(url)>/foo-1.0.0.zip/metadata.json`

But the url filename does not need to be a valid source dist filename
(<https://github.com/search?q=path%3A**%2Frequirements.txt+master.zip&type=code>),
so it could also be the following and we have to take any string as
filename:
* `<build wheel metadata cache>/url/<sha256(url)>/master.zip/metadata.json`

Example:
```text
# git source dist
pydantic-extra-types @ git+https://github.com/pydantic/pydantic-extra-types.git
# pypi source dist
django_allauth==0.51.0
# url source dist
werkzeug @ ff1904eb5e2853bf83db817a7dd53d/werkzeug-3.0.1.tar.gz
```
will be stored as
```text
built-wheel-metadata-v0
├── git
│   └── 5c56bc1c58c34c11
│       └── 843b753e9e8cb74e83cac55598719b39a4d5ef1f
│           └── metadata.json
├── pypi
│   └── django-allauth-0.51.0.tar.gz
│       └── metadata.json
└── url
    └── 6781bd6440ae72c2
        └── werkzeug-3.0.1.tar.gz
            └── metadata.json
```

The inside of a `metadata.json`:
```json
{
  "data": {
    "django_allauth-0.51.0-py3-none-any.whl": {
      "metadata-version": "2.1",
      "name": "django-allauth",
      "version": "0.51.0",
      ...
    }
  }
}
```
2023-11-24 17:47:58 +00:00
konsti 8d247fe95b
Add `Tags::from_interpreter` (#498)
Small refactoring
2023-11-24 11:36:01 +00:00
Charlie Marsh 17228ba04e
Add support for path dependencies (#471)
## Summary

This PR adds support for local path dependencies. The approach mostly
just falls out of our existing approach and infrastructure for Git and
URL dependencies.

Closes https://github.com/astral-sh/puffin/issues/436. (We'll open a
separate issue for editable installs.)

## Test Plan

Added `pip-compile` tests that pre-download a wheel or source
distribution, then install it via local path.
2023-11-21 11:49:42 +00:00