This gives a 1.23 speedup on transformers-extras. We could change to
msgpack for the entire cache if we want. I only tried this format and
postcard so far, where postcard was much slower (like 1.6s).
I don't actually want to merge it like this, i wanted to figure out the
ballpark of improvement for switching away from json.
```
hyperfine --warmup 3 --runs 10 "target/profiling/puffin pip-compile --cache-dir cache-msgpack scripts/requirements/transformers-extras.in" "target/profiling/branch pip-compile scripts/requirements/transformers-extras.in"
Benchmark 1: target/profiling/puffin pip-compile --cache-dir cache-msgpack scripts/requirements/transformers-extras.in
Time (mean ± σ): 179.1 ms ± 4.8 ms [User: 157.5 ms, System: 48.1 ms]
Range (min … max): 174.9 ms … 188.1 ms 10 runs
Benchmark 2: target/profiling/branch pip-compile scripts/requirements/transformers-extras.in
Time (mean ± σ): 221.1 ms ± 6.7 ms [User: 208.1 ms, System: 46.5 ms]
Range (min … max): 213.5 ms … 235.5 ms 10 runs
Summary
target/profiling/puffin pip-compile --cache-dir cache-msgpack scripts/requirements/transformers-extras.in ran
1.23 ± 0.05 times faster than target/profiling/branch pip-compile scripts/requirements/transformers-extras.in
```
Disadvantage: We can't manually look into the cache anymore to debug
things
- [ ] Check more formats, i currently only tested json, msgpack and
postcard, there should be other formats, too
- [x] Switch over `CachedByTimestamp` serialization (for the interpreter
caching)
- [x] Switch over error handling and make sure puffin is still resilient
to cache failure
## Summary
This PR modifies the install plan to avoid removing seed packages if the
virtual environment was created by anyone other than Puffin.
Closes https://github.com/astral-sh/puffin/issues/414.
## Test Plan
- Ran: `virtualenv .venv`.
- Ran: `cargo run -p puffin-cli -- pip-sync
scripts/benchmarks/requirements.txt --verbose --no-cache`.
- Verified that `pip` et al were not removed, and that the logging
including a message around preserving seed packages.
Preparing for #235, some refactoring to `puffin_interpreter`.
* Added a dedicated error type instead of anyhow
* `InterpreterInfo` -> `Interpreter`
* `detect_virtual_env` now returns an option so it can be chained for
#235
Previously, we had two python interpreter metadata structs, one in
gourgeist and one in puffin. Both would spawn a subprocess to query
overlapping metadata and both would appear in the cli crate, if you
weren't careful you could even have to different base interpreters at
once. This change unifies this to one set of metadata, queried and
cached once.
Another effect of this crate is proper separation of python interpreter
and venv. A base interpreter (such as `/usr/bin/python/`, but also pyenv
and conda installed python) has a set of metadata. A venv has a root and
inherits the base python metadata except for `sys.prefix`, which unlike
`sys.base_prefix`, gets set to the venv root. From the root and the
interpreter info we can compute the paths inside the venv. We can reuse
the interpreter info of the base interpreter when creating a venv
without having to query the newly created `python`.
This is isn't ready, but it can resolve
`meine_stadt_transparent==0.2.14`.
The source distributions are currently being built serially one after
the other, i don't know if that is incidentally due to the resolution
order, because sdist building is blocking or because of something in the
resolver that could be improved.
It's a bit annoying that the thing that was supposed to do http requests
now suddenly also has to a whole download/unpack/resolve/install/build
routine, it messes up the type hierarchy. The much bigger problem though
is avoid recursive crate dependencies, it's the reason for the callback
and for splitting the builder into two crates (badly named atm)
Gets rid of the custom `DistInfo` struct in the site-packages
abstraction in favor of a new kind of distribution
(`InstalledDistribution`). No change in behavior.
This is also a lot faster. Unfortunately it copies a lot of code from
the sync cli since the `Printer` is private.
The first commit are some refactorings i made when i thought about how i
could reuse the existing code.
This PR modifies the `install-wheel-rs` (and a few other crates) to get
everything playing nicely. Specifically, CI should pass, and all these
crates now use workspace dependencies between one another.
As part of this change, I split out the wheel name parsing into its own
`wheel-filename` crate, and the compatibility tag parsing into its own
`platform-tags` crate.