842 Commits

Author SHA1 Message Date
konsti
809c6d676f Use normalized display in tests and other small windows fixes (#1228)
Split out from the large test refactoring PR. Use `normalized_display`
in tests and two more thiserror derives to match snapshots and output,
and other small windows fixes.
2024-02-01 16:12:30 +01:00
Charlie Marsh
9487378ef9 Avoid TOCTOU errors in data directory installations (#1227)
## Summary

See: https://github.com/astral-sh/puffin/issues/1224

## Test Plan

Ran `python -m scripts.bench --puffin
scripts/requirements/compiled/jupyter.txt --min-runs 100 --benchmark
install-warm --verbose` several times, which failed eventually on `main`
but not on this branch.
2024-02-01 14:55:29 +00:00
Charlie Marsh
fcf848877c Change 'duplication' to 'deduplication' (#1223) 2024-02-01 14:13:45 +00:00
konsti
ea0bfc565d Refactor pip scenario tests (#1212)
Mostly a mechanical refactor to use the `puffin_snapshot!` and
`TestContext` infrastructure in the pip compile and pip install
scenarios, in preparation for adding programmatic windows testing
filters.
2024-02-01 10:31:40 +01:00
Charlie Marsh
0757862a7a Accommodate minute-level filters in Insta (#1219)
I don't know why `compile_editable` took over a minute in this case, but
seems like it did? Hard to test this fix.


https://github.com/astral-sh/puffin/actions/runs/7734769259/job/21089338951?pr=1216
2024-02-01 09:43:09 +01:00
Charlie Marsh
631ab51d6e Use publicly available Apple Silicon runners (#1216) 2024-01-31 23:41:51 -05:00
Charlie Marsh
8cbe1d220c Remove double-download for source distributions (#1218)
## Summary

Oops -- this was using a different cache key than the route above (this
is the wheel _metadata_ route vs. the wheel build route), so we were
saving and building source distributions twice in `pip install`.
2024-02-01 04:41:29 +00:00
Charlie Marsh
51e8609ee8 Use Python 3.12 in benchmarks (#1215)
I originally used Python 3.10, since 3.10 and 3.11 are by far the most
common (at least for [Ruff](https://pypistats.org/packages/ruff)). But
3.12 should give Python tools the most favorable benchmarks.
2024-01-31 15:51:13 -05:00
Charlie Marsh
ee69fb51ea Add PDM to benchmark script (#1214)
## Summary

Overall, similar to Poetry, with some simplifications (e.g., we don't
need to translate to Poetry's dependency syntax), and the need to adjust
how we manage the cache and virtual environment.
2024-01-31 20:31:45 +00:00
Charlie Marsh
c4bfb6efee Add a BENCHMARKS.md with rendered benchmarks (#1211)
As a precursor to the release, I want to include a structured document
with detailed benchmarks.

Closes https://github.com/astral-sh/puffin/issues/1210.
2024-01-31 20:11:52 +00:00
Andrew Gallant
b9d89e7624 puffin-client: generalize SimpleMetadataRaw into OwnedArchive<A> (#1208)
It turns out that the pattern I coded up for SimpleMetadataRaw is
generally useful when working with rkyv. This commit makes it generic by
supporting any type that implements rkyv's traits, and makes a few
simplifying assumptions by picking a concrete serializer, validator and
deserializer. In effect, this lets us own any archived value.

We also rejigger the API a little bit and double-down on
`OwnedArchive<A>` just being an owned wrapper for `Archived<A>`. Namely,
we implement `Deref` and turn its inherent methods into methods that
require fully qualified syntax. (As is standard for things that
implement `Deref` to avoid ambiguity with the deref target's methods.)
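
A minimal sketch of that convention, using a hypothetical `Owned<T>` wrapper rather than the real `OwnedArchive<A>` (whose actual API is not shown in this log). The inherent function takes `this` instead of `self`, so it cannot be called with method syntax and can never shadow a method of the deref target. The standard library does the same for functions such as `Rc::clone` and `Box::leak`.

```rust
use std::ops::Deref;

/// Hypothetical owned wrapper; stands in for a type like `OwnedArchive<A>`.
pub struct Owned<T> {
    inner: T,
}

impl<T> Owned<T> {
    /// Wrap a value.
    pub fn new(inner: T) -> Self {
        Owned { inner }
    }

    /// Associated function, not a method: the receiver is named `this`,
    /// so callers must write `Owned::into_inner(value)`.
    pub fn into_inner(this: Self) -> T {
        this.inner
    }
}

impl<T> Deref for Owned<T> {
    type Target = T;

    /// Method calls on `Owned<T>` fall through to `T` via auto-deref.
    fn deref(&self) -> &T {
        &self.inner
    }
}
```

With this shape, `owned.len()` on an `Owned<String>` resolves to `String::len`, while unwrapping has to be spelled `Owned::into_inner(owned)`.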

(This PR also makes a couple small simplifications to our custom rkyv
serializer since we no longer need to use it directly. We do still need
to name the type in trait bounds, so it has to be public.)
2024-01-31 11:56:34 -05:00
konsti
234e8d0bb7 Abstract away test duplication in pip-compile (#1187)
In preparation for the new Windows handling, I want to introduce a
`TestContext` and `puffin_snapshot!` abstraction. This PR applies those
changes for pip-compile. My plan is to use those for all venv-based
integration tests and build the custom windows filters on top of
`puffin_snapshot!`.
2024-01-31 16:11:10 +00:00
Charlie Marsh
01258c1bb3 Report number of bytes deleted when clearing cache (#1203)
## Summary

This is based on Cargo's `clean` implementation, with modifications
based on some of my own preferences, and to better adhere to patterns we
use in our codebase:

![Screenshot 2024-01-31 at 1 31 10 AM](https://github.com/astral-sh/puffin/assets/1309177/38704798-b17f-4972-ab67-00484ce63d62)
2024-01-31 10:48:28 -05:00
Charlie Marsh
ec816a3322 Update Python discovery documentation (#1194)
Closes https://github.com/astral-sh/puffin/issues/1109.
2024-01-31 15:42:32 +00:00
Charlie Marsh
35113c1d06 Enable macOS checks on CI (#1193)
## Summary

Enables tests for macOS in CI, using the M1 runners (which are free in
public repos, but count against our quota in private repos). For now, I'm
just running them on `main` to save quota.

I did the math, and the M1 runners are the best value:

![Screenshot 2024-01-30 at 9 33 36 PM](https://github.com/astral-sh/puffin/assets/1309177/bd5a14b6-740c-487f-bcad-81c0fce5b62e)

Closes #1053.
2024-01-31 15:27:04 +00:00
Charlie Marsh
462fec1968 Remove readarray from install.sh (#1198)
## Summary

This isn't available on macOS (see, e.g.,
https://stackoverflow.com/questions/23842261/alternative-to-readarray-because-it-does-not-work-on-mac-os-x),
but this version works both on macOS and Linux.

Closes https://github.com/astral-sh/puffin/issues/1196. (Verified
locally and on CI.)
2024-01-31 10:22:13 -05:00
Charlie Marsh
8f9258fae3 Invert default feature for testing (#1200)
## Summary

We have some flags in Puffin that enable us to opt-in to certain tests.
To date, they've been opt-in, so we've run our tests with
`--all-features`. This PR makes them opt-out, and we now run tests with
default features.

The main motivation here is that I want to get tests working for macOS
on CI, but for unknown reasons, macOS can't compile the PyO3 features at
the same time as everything else due to strange linker issues. By
avoiding `--all-features` for tests, we thus avoid unnecessarily
including features that we don't actually use in Puffin.

I verified that the exact same number of tests (439) are run before and
after this change. For users, the primary difference is that you now
need to specify `--no-default-features --features pypi --features
python` to avoid (e.g.) including the Git tests.
2024-01-31 09:44:26 -05:00
Charlie Marsh
b2f1bbaa63 Add a Ctrl+C handler to the confirm workflow (#1202)
Fixes an issue whereby exiting the confirmation prompt can lead to your
cursor disappearing: https://github.com/console-rs/dialoguer/issues/294.

See:
b839a2c5b7/rye/src/main.rs (L36-L48).
2024-01-31 02:08:27 +00:00
Charlie Marsh
262f29b558 Add missing --exclude-newer to executable tests (#1201)
A new version of `platformdirs` came out, which broke these.
2024-01-30 20:26:11 -05:00
Charlie Marsh
b88b9e1f3d Remove dedicated flate2 features from Puffin (#1199)
We should be able to enable and disable these without crate-internal
features.
2024-01-30 19:41:08 -05:00
Andrew Gallant
b47f70917f puffin-client: simplify use of http-cache-semantics (#1197)
The `http-cache-semantics` crate is polymorphic on the types of requests
and responses it accepts. We had previously been explicitly converting
between `http` and `reqwest` types, but this isn't necessary. We can
provide impls of the traits in `http-cache-semantics` for `reqwest`'s
types (via a wrapper). This saves us from the awkward request/response
type conversions.

While this does clone the request, this is:

1. Not new. We were previously cloning the request to do the conversion.
2. An artifact (I believe) of http-cache-semantics API. (It kind of
   seems like an API bug to me?)

There is also a little bit of messiness around inter-operating between
http::uri::Uri and url::Url. But overall shouldn't be a big deal.
2024-01-30 18:20:44 -05:00
Charlie Marsh
7ae9d3c631 Remove Windows limitation from README (#1195) 2024-01-30 21:39:15 +00:00
Charlie Marsh
3f5e7306a5 Remove WaitMap dependency (#1183)
## Summary

This is an attempt to address https://github.com/astral-sh/puffin/pull/1163 by
removing the `WaitMap` and gaining more granular control over the values
that we hold over `await` boundaries.
2024-01-30 15:25:22 -05:00
Charlie Marsh
c129717b41 Add support for --no-deps to pip install (#1191)
## Summary

Closes https://github.com/astral-sh/puffin/issues/1188.
2024-01-30 19:54:57 +00:00
Charlie Marsh
8305acc584 Add a builder for resolution options (#1192) 2024-01-30 19:50:16 +00:00
Charlie Marsh
aa3b79ec63 Prompt user for missing -r and -e flags in pip install (#1180)
## Summary

If the user runs a command like `pip install requirements.txt`, we now
prompt them to ask if they meant to include the `-r` flag:

![Screenshot 2024-01-29 at 8 38 29 PM](https://github.com/astral-sh/puffin/assets/1309177/82b9f7a2-2526-4144-b200-a5015e5b8a4b)

![Screenshot 2024-01-29 at 8 38 33 PM](https://github.com/astral-sh/puffin/assets/1309177/bd8ebb51-2537-4540-a0e0-718e66a1c69c)

The specific logic is: if the requirement ends in `.txt` or `.in`, and
the file exists locally, prompt the user for `-r`. If the requirement
contains a directory separator, and the directory exists locally, prompt
the user for `-e`.
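
The heuristic above could be sketched roughly as follows. This is an illustration only: `MissingFlag` and `guess_missing_flag` are hypothetical names, not Puffin's actual code, and the filesystem checks are passed in as closures so the logic can be exercised without touching disk.

```rust
use std::path::{Path, MAIN_SEPARATOR};

/// Which flag the user probably forgot (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum MissingFlag {
    /// The argument looks like a requirements file: suggest `-r`.
    Requirements,
    /// The argument looks like a local directory: suggest `-e`.
    Editable,
}

/// Apply the described heuristic to one CLI argument.
/// `is_file` and `is_dir` are injected stand-ins for filesystem checks.
pub fn guess_missing_flag(
    arg: &str,
    is_file: impl Fn(&Path) -> bool,
    is_dir: impl Fn(&Path) -> bool,
) -> Option<MissingFlag> {
    let path = Path::new(arg);
    // Ends in `.txt` or `.in` and the file exists locally: prompt for `-r`.
    if (arg.ends_with(".txt") || arg.ends_with(".in")) && is_file(path) {
        return Some(MissingFlag::Requirements);
    }
    // Contains a directory separator and the directory exists: prompt for `-e`.
    if arg.contains(|c: char| c == '/' || c == MAIN_SEPARATOR) && is_dir(path) {
        return Some(MissingFlag::Editable);
    }
    None
}
```

In a real CLI the two closures would presumably be `Path::is_file` and `Path::is_dir`; a plain package name such as `flask` matches neither branch and is left alone.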

Closes #1166.
2024-01-30 18:58:45 +00:00
Charlie Marsh
7a937e0f60 Error when parsing requirements.txt-like packages in requirements.txt file (#1179)
## Summary

Like https://github.com/astral-sh/puffin/pull/1180, this PR adds logic
for `requirements.txt` parsing whereby if a requirement _looks like_ a
local requirements file or an editable directory, we prompt the user to
correct the error (typically, by adding `-r`).
2024-01-30 18:55:11 +00:00
konsti
4ad0dc8b9e Add windows aarch64 trampolines (#1190)
Lacking Windows-compatible aarch64 hardware, I cross-compiled the
trampoline from x86_64 Linux to aarch64-pc-windows-msvc; I added the
instructions to the puffin-trampoline readme. With some testing on an
aarch64 windows machine, this should be sufficient to build working
win_arm64 tagged wheels.

i686-pc-windows-msvc is failing with an error:

```
error: linking with `lld-link` failed: exit status: 1
  = note: lld-link: error: undefined symbol: __aulldiv
          >>> referenced by libcompiler_builtins-2fb09dee087e9f64.rlib(compiler_builtins-2fb09dee087e9f64.compiler_builtins.597f0152646f1b8-cgu.0.rcgu.o):(compiler_builtins::int::specialized_div_rem::u128_div_rem::h06aed1e23a3f8f5c)
          >>> referenced by libcompiler_builtins-2fb09dee087e9f64.rlib(compiler_builtins-2fb09dee087e9f64.compiler_builtins.597f0152646f1b8-cgu.0.rcgu.o):(compiler_builtins::int::specialized_div_rem::u128_div_rem::h06aed1e23a3f8f5c)
          >>> referenced by libcompiler_builtins-2fb09dee087e9f64.rlib(compiler_builtins-2fb09dee087e9f64.compiler_builtins.597f0152646f1b8-cgu.0.rcgu.o):(compiler_builtins::int::specialized_div_rem::u128_div_rem::h06aed1e23a3f8f5c)
          >>> referenced 4 more times

          lld-link: error: undefined symbol: __aullrem
          >>> referenced by libcompiler_builtins-2fb09dee087e9f64.rlib(compiler_builtins-2fb09dee087e9f64.compiler_builtins.597f0152646f1b8-cgu.0.rcgu.o):(compiler_builtins::int::specialized_div_rem::u128_div_rem::h06aed1e23a3f8f5c)
          >>> referenced by libcompiler_builtins-2fb09dee087e9f64.rlib(compiler_builtins-2fb09dee087e9f64.compiler_builtins.597f0152646f1b8-cgu.0.rcgu.o):(compiler_builtins::int::specialized_div_rem::u128_div_rem::h06aed1e23a3f8f5c)
          >>> referenced by libcompiler_builtins-2fb09dee087e9f64.rlib(compiler_builtins-2fb09dee087e9f64.compiler_builtins.597f0152646f1b8-cgu.0.rcgu.o):(compiler_builtins::int::specialized_div_rem::u128_div_rem::h06aed1e23a3f8f5c)
          >>> referenced 4 more times
```
2024-01-30 17:51:27 +00:00
konsti
614bb0cf52 Update async_http_range_reader to 0.5.0 (#1189)
Removes a git dep and removes itertools 0.11
2024-01-30 16:32:53 +00:00
Charlie Marsh
c479c26cab Add compatibility arguments for pip sync (#1185)
## Summary

As with `pip compile`, we can provide useful error messages and warnings
when people pass `pip sync` arguments.

Closes https://github.com/astral-sh/puffin/issues/1184.
2024-01-30 08:48:55 -05:00
konsti
ab27913f68 Instrument the main function and add jupyter.in (#1186)
Instrument the main function as anchor span for checking overhead and
update tracing-durations-export to 0.2.0 for differentiating
blocking/non-blocking tasks.

Add a `jupyter.in` requirement since `pip install jupyter` is a common
operation. I tried `jupyterlab` too but there is no difference in
performance (1.00 ± 0.07).
2024-01-30 11:03:24 +00:00
konsti
a6c4cbfe55 Cleanup puffin interpreter errors (#1169)
Use `virtualenv` consistently, remove unused error variants and hint the
user towards installing missing python versions.

I didn't touch the Readme but I replaced `virtualenv environment` with
`virtualenv` in the strings I found.

Fixes https://github.com/astral-sh/puffin/issues/1167
2024-01-30 10:52:46 +01:00
Charlie Marsh
bd934207e4 Accept relative file paths in CLI requirements (#1182)
## Summary

See: https://github.com/astral-sh/puffin/issues/1181.

## Test Plan

```
❯ cargo run -- pip install packse@../../zanieb/packse
    Finished dev [unoptimized + debuginfo] target(s) in 0.15s
     Running `target/debug/puffin pip install 'packse@../../zanieb/packse'`
error: Distribution not found at: file:///Users/crmarsh/zanieb/packse
```
2024-01-30 03:31:24 +00:00
Charlie Marsh
61a3060383 Run cargo update (#1178) 2024-01-29 21:01:37 -05:00
Charlie Marsh
fa3c9afdc1 Deduplicate pep440_rs in dependency tree (#1177)
## Summary

Closes https://github.com/astral-sh/puffin/issues/1176.

## Test Plan

`cargo tree -p puffin -i pep440_rs` runs without error. Previously, it
errored due to multiple versions.
2024-01-29 16:11:42 -05:00
konsti
d4ed5ea858 Fix the compile_python_37 test with python 3.7 installed (#1172)
Make the test `compile_python_37` pass whether python 3.7 is installed
or not by muting the warning for a missing 3.7. The resolution error is
independent of whether 3.7 is installed or not.
2024-01-29 18:59:28 +01:00
Charlie Marsh
67a09649f2 Support parsing --find-links, --index-url, and --extra-index-url in requirements.txt (#1146)
## Summary

This PR adds support for `--find-links`, `--index-url`, and
`--extra-index-url` arguments when specified in a `requirements.txt`.

It's a mostly-straightforward change. The only uncertain piece is what
to do when multiple files include these flags, and/or when we include
them on the CLI and in other files.

In general:

- If _anything_ specifies `--no-index`, we respect it.
- We combine all `--extra-index-url` and `--find-links` across all
sources, since those are just vectors.
- If we see multiple `--index-url` in requirements files, we error.
- We respect the `--index-url` from the command line over any provided
in a requirements file.
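
As a rough sketch, the precedence rules listed above might be expressed like this. `IndexOptions` and `merge` are hypothetical names used for illustration, and the sketch assumes that any second `--index-url` across requirements files is an error, even if the URL is identical; the source does not say how that case is handled.

```rust
/// Index-related options from one source (the CLI or one requirements file).
/// Hypothetical type, not Puffin's actual representation.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct IndexOptions {
    pub no_index: bool,
    pub index_url: Option<String>,
    pub extra_index_urls: Vec<String>,
    pub find_links: Vec<String>,
}

/// Combine CLI options with options parsed from requirements files.
pub fn merge(cli: &IndexOptions, files: &[IndexOptions]) -> Result<IndexOptions, String> {
    let mut merged = cli.clone();
    let mut file_index_url: Option<&String> = None;
    for file in files {
        // If anything specifies `--no-index`, respect it.
        merged.no_index |= file.no_index;
        // `--extra-index-url` and `--find-links` are just concatenated.
        merged.extra_index_urls.extend(file.extra_index_urls.iter().cloned());
        merged.find_links.extend(file.find_links.iter().cloned());
        // More than one `--index-url` across requirements files is an error.
        if let Some(url) = &file.index_url {
            if file_index_url.is_some() {
                return Err("multiple `--index-url` entries in requirements files".to_string());
            }
            file_index_url = Some(url);
        }
    }
    // The CLI `--index-url` wins over one found in a requirements file.
    if merged.index_url.is_none() {
        merged.index_url = file_index_url.cloned();
    }
    Ok(merged)
}
```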

(`pip-compile` seems to just pick one semi-arbitrarily when multiple are
provided.)

Closes https://github.com/astral-sh/puffin/issues/1143.
2024-01-29 15:06:40 +00:00
Charlie Marsh
4b9daf9604 Use tokio_tar instead of async_tar (#1170)
## Summary

`tokio_tar` is a fork of `async_tar` that uses Tokio instead of
`async-std`. Using it removes a significant dependency from our tree.

(There is an open PR
(https://github.com/dignifiedquire/async-tar/pull/41) in `async-tar` to
add Tokio support, but it's over a year old.)

See:
https://github.com/astral-sh/puffin/pull/1157#discussion_r1469190249.
2024-01-29 10:00:30 -05:00
Andrew Gallant
a42b385e9b puffin-client: add SimpleMetadataRaw (#1150)
This adds what is effectively an owned wrapper around
`Archived<SimpleMetadata>`. Normally, an `Archived<SimpleMetadata>`
has to be used behind a pointer (since it has a lifetime
attached to its underlying byte buffer), but we create a
wrapper around it that owns the underlying buffer and provides
free access to the archived type.

This in effect creates an anchor point for the archived type
and lets us pass it around easily. (There has to be an anchor
point for it somewhere.)
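
The idea can be illustrated by analogy using only the standard library. A `&str` is a borrowed, validated view over bytes, much as an `Archived<T>` is a view over a byte buffer. The hypothetical `OwnedText` below owns the buffer, validates it once up front, and hands out the view on demand. It is not the `SimpleMetadataRaw` implementation and does not use rkyv.

```rust
use std::str::Utf8Error;

/// Hypothetical analogy: owns a byte buffer and exposes a borrowed view of it.
pub struct OwnedText {
    bytes: Vec<u8>,
}

impl OwnedText {
    /// Validate the buffer once, then take ownership of it.
    pub fn new(bytes: Vec<u8>) -> Result<Self, Utf8Error> {
        std::str::from_utf8(&bytes)?;
        Ok(OwnedText { bytes })
    }

    /// Borrow the view. The buffer lives inside `self`, so the wrapper can be
    /// moved around freely and the view is always backed by live memory.
    pub fn view(&self) -> &str {
        // This sketch re-checks the bytes to stay in safe Rust; the check
        // cannot fail because `new` already validated the buffer.
        std::str::from_utf8(&self.bytes).expect("validated in `new`")
    }
}
```

The wrapper itself has no lifetime parameter, which is what makes it easy to pass around; only the view returned by `view` borrows from it.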

An alternative to this approach would be to store it as a file
backed memory map. But in practice, we're dealing with small
files, and just reading them on to the heap is likely to be
faster. (Memory maps also have wildly different perf characteristics
across platforms.)

Note that this commit just defines the type. It isn't actually
used anywhere yet.
2024-01-29 09:37:06 -05:00
Charlie Marsh
d94cf0e763 Remove specific MUSL mention from README (#1171)
See:
https://github.com/astral-sh/puffin/pull/1158#discussion_r1469603073.
2024-01-29 13:50:23 +00:00
konsti
be48200642 Small instrumentation improvements (#1164)
Less verbose span fields for `Dist`s by using the display impl and no
more min length in the tracing durations plot config for comparability
(we lose spans due to a speedup otherwise). Both wait points in the
solver loop are now instrumented so we can inspect what we're waiting
for to progress in the solver.
2024-01-29 10:55:19 +00:00
konsti
8bfc3c1b37 Trim get_cached_with_callback and send_cached down some more. (#1128)
I noticed that `get_cached_with_callback` and `send_cached` are large
both in terms of llvm lines and in terms of types (and large types can
cause stack overflows on windows). `get_cached_with_callback`
specifically is large because it's monomorphized for each callback. I've
split both functions into smaller units and boxed the callback.
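
The general technique looks something like the sketch below. The function names and the `&str`/`usize` types are invented for illustration and are unrelated to the real `CachedClient` signatures. A thin generic shim boxes the callback and forwards to a non-generic function, so the large body is compiled once instead of once per callback type.

```rust
/// Thin generic shim: monomorphized per callback type, but tiny.
pub fn get_with_callback<'a, F>(payload: &str, callback: F) -> usize
where
    F: FnOnce(&str) -> usize + 'a,
{
    get_with_boxed_callback(payload, Box::new(callback))
}

/// Non-generic body: compiled once, however many callback types exist.
/// The callback is invoked through dynamic dispatch.
fn get_with_boxed_callback<'a>(
    payload: &str,
    callback: Box<dyn FnOnce(&str) -> usize + 'a>,
) -> usize {
    // Imagine the bulk of the caching logic living here.
    let trimmed = payload.trim();
    callback(trimmed)
}
```

The trade-off is one heap allocation and an indirect call per invocation, in exchange for less generated code.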

llvm lines, before:

```
  Lines                 Copies               Function name
  -----                 ------               -------------
  909511                21625                (TOTAL)
   36026 (4.0%,  4.0%)     33 (0.2%,  0.2%)  <&mut rmp_serde::decode::Deserializer<R,C> as serde::de::Deserializer>::deserialize_any
   14688 (1.6%,  5.6%)      8 (0.0%,  0.2%)  puffin_client::cached_client::CachedClient::get_cached_with_callback::{{closure}}::{{closure}}
   13748 (1.5%,  7.1%)      5 (0.0%,  0.2%)  puffin_client::cached_client::CachedClient::send_cached::{{closure}}
   12460 (1.4%,  8.5%)     35 (0.2%,  0.4%)  alloc::raw_vec::RawVec<T,A>::grow_amortized
   10731 (1.2%,  9.6%)    122 (0.6%,  0.9%)  <alloc::boxed::Box<T,A> as core::ops::drop::Drop>::drop
    8952 (1.0%, 10.6%)      9 (0.0%,  1.0%)  core::slice::sort::partition_in_blocks
    8216 (0.9%, 11.5%)    323 (1.5%,  2.5%)  <core::result::Result<T,E> as core::ops::try_trait::Try>::branch
    7745 (0.9%, 12.4%)    205 (0.9%,  3.4%)  core::result::Result<T,E>::map_err
    6862 (0.8%, 13.1%)     54 (0.2%,  3.7%)  <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter
    6720 (0.7%, 13.9%)    133 (0.6%,  4.3%)  std::panicking::try
    6600 (0.7%, 14.6%)     45 (0.2%,  4.5%)  <alloc::sync::Weak<T,A> as core::ops::drop::Drop>::drop
    5899 (0.6%, 15.2%)     33 (0.2%,  4.6%)  rmp_serde::decode::Deserializer<R,C>::read_str_data
    5610 (0.6%, 15.9%)     33 (0.2%,  4.8%)  alloc::raw_vec::RawVec<T,A>::allocate_in
    5187 (0.6%, 16.4%)    133 (0.6%,  5.4%)  std::panicking::try::do_catch
    4740 (0.5%, 17.0%)    268 (1.2%,  6.7%)  core::ops::function::FnOnce::call_once
    4670 (0.5%, 17.5%)     40 (0.2%,  6.8%)  puffin_client::cached_client::CachedClient::get_cached_with_callback::{{closure}}::{{closure}}::{{closure}}
    4527 (0.5%, 18.0%)     54 (0.2%,  7.1%)  core::iter::traits::iterator::Iterator::try_fold
```

after:

```
  Lines                 Copies               Function name
  -----                 ------               -------------
  910275                21712                (TOTAL)
   36026 (4.0%,  4.0%)     33 (0.2%,  0.2%)  <&mut rmp_serde::decode::Deserializer<R,C> as serde::de::Deserializer>::deserialize_any
   12460 (1.4%,  5.3%)     35 (0.2%,  0.3%)  alloc::raw_vec::RawVec<T,A>::grow_amortized
   10935 (1.2%,  6.5%)    124 (0.6%,  0.9%)  <alloc::boxed::Box<T,A> as core::ops::drop::Drop>::drop
    8952 (1.0%,  7.5%)      9 (0.0%,  0.9%)  core::slice::sort::partition_in_blocks
    8714 (1.0%,  8.5%)      5 (0.0%,  0.9%)  puffin_client::cached_client::CachedClient::send_cached_handle_stale::{{closure}}
    8216 (0.9%,  9.4%)    323 (1.5%,  2.4%)  <core::result::Result<T,E> as core::ops::try_trait::Try>::branch
    8192 (0.9%, 10.3%)      8 (0.0%,  2.5%)  puffin_client::cached_client::CachedClient::get_cached_with_callback::{{closure}}::{{closure}}
    7745 (0.9%, 11.1%)    205 (0.9%,  3.4%)  core::result::Result<T,E>::map_err
    6862 (0.8%, 11.9%)     54 (0.2%,  3.7%)  <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter
    6778 (0.7%, 12.6%)      5 (0.0%,  3.7%)  puffin_client::cached_client::CachedClient::send_cached::{{closure}}
    6720 (0.7%, 13.4%)    133 (0.6%,  4.3%)  std::panicking::try
    6600 (0.7%, 14.1%)     45 (0.2%,  4.5%)  <alloc::sync::Weak<T,A> as core::ops::drop::Drop>::drop
    5899 (0.6%, 14.7%)     33 (0.2%,  4.7%)  rmp_serde::decode::Deserializer<R,C>::read_str_data
    5610 (0.6%, 15.3%)     33 (0.2%,  4.8%)  alloc::raw_vec::RawVec<T,A>::allocate_in
    5187 (0.6%, 15.9%)    133 (0.6%,  5.4%)  std::panicking::try::do_catch
    4740 (0.5%, 16.4%)    268 (1.2%,  6.7%)  core::ops::function::FnOnce::call_once
    4527 (0.5%, 16.9%)     54 (0.2%,  6.9%)  core::iter::traits::iterator::Iterator::try_fold
```

Stack sizes diff:
https://gist.github.com/konstin/a3f38276aacf1170038a756c8c49793c
2024-01-29 08:31:27 +00:00
Zanie Blue
ebd8cd425d Use large Windows runner (#1134) 2024-01-29 08:34:40 +01:00
Charlie Marsh
fe03e74669 Add platform support to the README (#1158)
Closes https://github.com/astral-sh/puffin/issues/1149.
2024-01-28 22:52:25 -05:00
Charlie Marsh
fa3f0d7a55 Rename cache purge methods to clean (#1159)
This is more consistent with the public interface.
2024-01-28 21:15:11 -05:00
Charlie Marsh
d88ce76979 Stream unpacking of source distribution downloads (#1157)
This PR migrates our source distribution downloads to unzip as we
stream, similar to our approach for wheels.

In my testing, this showed a consistent speedup (e.g., 6% here for a few
representative source distributions):

```text
❯ python -m scripts.bench --puffin-path ./target/release/main --puffin-path ./target/release/puffin --benchmark install-cold requirements.in
Benchmark 1: ./target/release/main (install-cold)
  Time (mean ± σ):      1.503 s ±  0.039 s    [User: 1.479 s, System: 0.537 s]
  Range (min … max):    1.466 s …  1.605 s    10 runs

Benchmark 2: ./target/release/puffin (install-cold)
  Time (mean ± σ):      1.421 s ±  0.024 s    [User: 1.505 s, System: 0.593 s]
  Range (min … max):    1.381 s …  1.454 s    10 runs

Summary
  './target/release/puffin (install-cold)' ran
    1.06 ± 0.03 times faster than './target/release/main (install-cold)'
```
2024-01-28 20:09:24 -05:00
Andrew Gallant
5219d37250 add initial rkyv support (#1135)
This PR adds initial support for [rkyv] to puffin. In particular,
the main aim here is to make puffin-client's `SimpleMetadata` type
possible to deserialize from a `&[u8]` without doing any copies. This
PR **stops short of actually doing that zero-copy deserialization**.
Instead, this PR is about adding the necessary trait impls to a variety
of types, along with a smattering of small refactorings to make rkyv
possible to use.

For those unfamiliar, rkyv works via the interplay of three traits:
`Archive`, `Serialize` and `Deserialize`. The usual flow of things is
this:

* Make a type `T` implement `Archive`, `Serialize` and `Deserialize`. rkyv
  helpfully provides `derive` macros to make this pretty painless in most
  cases.
* The process of implementing `Archive` for `T` *usually* creates an entirely
  new distinct type within the same namespace. One can refer to this type
  without naming it explicitly via `Archived<T>` (where `Archived` is a clever
  type alias defined by rkyv).
* Serialization happens from `T` to (conceptually) a `Vec<u8>`. The
  serialization format is specifically designed to reflect the in-memory layout
  of `Archived<T>`. Notably, *not* `T`. But `Archived<T>`.
* One can then get an `Archived<T>` with no copying (albeit, we will likely
  need to incur some cost for validation) from the previously created `&[u8]`.
  This is quite literally [implemented as a pointer cast][rkyv-ptr-cast].
* The problem with an `Archived<T>` is that it isn't your `T`. It's something
  else. And while there is limited interoperability between a `T` and an
  `Archived<T>`, the main issue is that the surrounding code generally demands
  a `T` and not an `Archived<T>`. **This is at the heart of the tension for
  introducing zero-copy deserialization, and this is mostly an intrinsic
  problem to the technique and not an rkyv-specific issue.** For this reason,
  given an `Archived<T>`, one can get a `T` back via an explicit
  deserialization step. This step is like any other kind of deserialization,
  although generally faster since no real "parsing" is required. But it will
  allocate and create all necessary objects.

This PR largely proceeds by deriving the three aforementioned traits
for `SimpleMetadata`. And, of course, all of its type dependencies. But
we stop there for now.

The main issue with carrying this work forward so that rkyv is actually
used to deserialize a `SimpleMetadata` is figuring out how to deal
with `DataWithCachePolicy` inside of the cached client. Ideally, this
type would itself have rkyv support, but adding it is difficult. The
main difficulty lies in the fact that its `CachePolicy` type is opaque,
not easily constructable and is internally the tip of the iceberg of
a rat's nest of types found in more crates such as `http`. While one
"dumb"-but-annoying approach would be to fork both of those crates
and add rkyv trait impls to all necessary types, it is my belief that
this is the wrong approach. What we'd *like* to do is not just use
rkyv to deserialize a `DataWithCachePolicy`, but we'd actually like to
get an `Archived<DataWithCachePolicy>` and make actual decisions using
the archived type directly. Doing that will require some work to make
`Archived<DataWithCachePolicy>` directly useful.

My suspicion is that, after doing the above, we may want to mush
forward with a similar approach for `SimpleMetadata`. That is, we want
`Archived<SimpleMetadata>` to be as useful as possible. But right
now, the structure of the code demands an eager conversion (and thus
deserialization) into a `SimpleMetadata` and then into a `VersionMap`.
Getting rid of that eagerness is, I think, the next step after dealing
with `DataWithCachePolicy` to unlock bigger wins here.

There are many commits in this PR, but most are tiny. I still encourage
review to happen commit-by-commit.

[rkyv]: https://rkyv.org/
[rkyv-ptr-cast]: https://docs.rs/rkyv/latest/src/rkyv/util/mod.rs.html#63-68
2024-01-28 12:14:59 -05:00
Zanie Blue
c0e7668dfa Add bootstrapped installation in Python for Windows (#1130)
A 1:1 port of the Bash script to Python for use on Windows.

Pulls some parts of #1068 but much more minimal. Avoids an additional
dependency on `requests`. Because we require `zstandard` to unzip the
distributions we unfortunately cannot be dependency free and cannot have
`bootstrap.sh` download the Python version needed to run this script
without it doing a non-trivial amount of work.

Retains the Bash script for now so you can bootstrap without Python
available. I may drop it in the future?
2024-01-28 10:24:49 -06:00
Charlie Marsh
a25a1f2958 Avoid re-creating directories in async unzip (#1155)
This PR extends the optimizations from #1154 to other unzip paths.
2024-01-28 14:30:38 +00:00
Charlie Marsh
3d10f344f3 Only include visited packages in error message derivation (#1144)
## Summary

This is my guess as to the source of the resolver flake, based on
information and extensive debugging from @zanieb. In short, if we rely
on `self.index.packages` as a source of truth during error reporting, we
open ourselves up to a source of non-determinism, because we fetch
package metadata asynchronously in the background while we solve -- so
packages _could_ be included in or excluded from the index depending on
the order in which those requests are returned.

So, instead, we now track the set of packages that _were_ visited by the
solver. Visiting a package _requires_ that we wait for its metadata to
be available. By limiting analysis to those packages that were visited
during solving, we are faithfully representing the state of the solver
at the time of failure.
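
A simplified sketch of the idea, with hypothetical names (`SolverState`, `record_metadata`, `visit`) that do not correspond to the real resolver types. The index may contain entries that arrived from background prefetches, while the visited set only contains packages the solver actually asked about, and error reporting draws on the latter.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical solver bookkeeping, for illustration only.
pub struct SolverState {
    /// Metadata received so far. Background prefetches may add entries here
    /// in an order that varies from run to run.
    index: BTreeMap<String, String>,
    /// Packages the solver actually visited; filled in deterministically.
    visited: BTreeSet<String>,
}

impl SolverState {
    pub fn new() -> Self {
        SolverState { index: BTreeMap::new(), visited: BTreeSet::new() }
    }

    /// Called whenever a metadata response arrives (possibly prefetched).
    pub fn record_metadata(&mut self, name: &str, version: &str) {
        self.index.insert(name.to_string(), version.to_string());
    }

    /// Called by the solver when it needs a package. A real solver would
    /// wait here until the metadata is available.
    pub fn visit(&mut self, name: &str) -> Option<&str> {
        self.visited.insert(name.to_string());
        self.index.get(name).map(String::as_str)
    }

    /// Error reporting only considers visited packages, not the whole index.
    pub fn packages_for_error_report(&self) -> Vec<&str> {
        self.visited.iter().map(String::as_str).collect()
    }
}
```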

Closes #863
2024-01-28 09:27:22 -05:00