Python/uv - uv - Gitea: Git with a cup of tea

Commit Graph

Author	SHA1	Message	Date
konsti	47fc90d1b3	Reduce stack usage by boxing `File` in `Dist`, `CachePolicy` and large futures (#1004 ) This is https://github.com/astral-sh/puffin/pull/947 again but this time merging into main instead of downstack, sorry for the noise. --- Windows has a default stack size of 1MB, which makes puffin often fail with stack overflows. The PR reduces stack size by three changes: * Boxing `File` in `Dist`, reducing the size from 496 to 240. * Boxing the largest futures. * Boxing `CachePolicy` ## Method Debugging happened on linux using https://github.com/astral-sh/puffin/pull/941 to limit the stack size to 1MB. I ran the command below. ``` RUSTFLAGS=-Zprint-type-sizes cargo +nightly build -p puffin-cli -j 1 > type-sizes.txt && top-type-sizes -w -s -h 10 < type-sizes.txt > sizes.txt ``` The main drawback is top-type-sizes not saying what the `__awaitee` is, so it requires manually looking up a future with matching size. When the `brotli` feature on `reqwest` is active, a lot of brotli types show up. Toggling this feature however seems to have no effect. I assume they are false positives since the `brotli` crate has elaborate control over allocation. The sizes are therefore shown with the feature off. ## Results The largest future goes from 12208B to 6416B, the largest type (`PrioritizedDistribution`, see also #948) from 17448B to 9264B. Full diff: https://gist.github.com/konstin/62635c0d12110a616a1b2bfcde21304f For the second commit, i iteratively boxed the largest file until the tests passed, then with an 800KB stack limit looked through the backtrace of a failing test and added some more boxing. 
Quick benchmarking showed no difference: ```console $ hyperfine --warmup 2 "target/profiling/main-dev resolve meine_stadt_transparent" "target/profiling/puffin-dev resolve meine_stadt_transparent" Benchmark 1: target/profiling/main-dev resolve meine_stadt_transparent Time (mean ± σ): 49.2 ms ± 3.0 ms [User: 39.8 ms, System: 24.0 ms] Range (min … max): 46.6 ms … 63.0 ms 55 runs Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options. Benchmark 2: target/profiling/puffin-dev resolve meine_stadt_transparent Time (mean ± σ): 47.4 ms ± 3.2 ms [User: 41.3 ms, System: 20.6 ms] Range (min … max): 44.6 ms … 60.5 ms 62 runs Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options. Summary target/profiling/puffin-dev resolve meine_stadt_transparent ran 1.04 ± 0.09 times faster than target/profiling/main-dev resolve meine_stadt_transparent ```	2024-01-19 09:38:36 +00:00
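
The effect of boxing on type size described in #1004 can be illustrated with a minimal sketch. The types below are hypothetical stand-ins, not puffin's real `File` or `Dist`; the 496-byte payload only echoes the figure quoted in the commit message.

```rust
use std::mem::size_of;

// Hypothetical stand-in for a large payload such as `File`.
#[allow(dead_code)]
struct LargeFile {
    bytes: [u8; 496],
}

// Every value of this enum is at least as large as its largest variant,
// so the whole payload lives wherever the enum lives (often the stack).
#[allow(dead_code)]
enum InlineDist {
    Registry(LargeFile),
    Path(u32),
}

// Boxing moves the payload to the heap; the enum only stores a pointer.
#[allow(dead_code)]
enum BoxedDist {
    Registry(Box<LargeFile>),
    Path(u32),
}

fn main() {
    println!("inline: {} bytes", size_of::<InlineDist>());
    println!("boxed:  {} bytes", size_of::<BoxedDist>());
}
```

The trade-off is one heap allocation per boxed value, which the benchmark in the commit message suggests was not measurable for this workload.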
Charlie Marsh	9b24fcd306	Remove verbatim URL from path file location (#998 ) ## Summary I got confused by why `VerbatimUrl` was on `Path`. Since it's directly computed from it, I think we should just compute it as-needed. I think it's also possibly-buggy because the URL is the URL of the _directory_, not the artifact itself, which differs from other distributions.	2024-01-18 22:40:48 -05:00
Charlie Marsh	a0420114c3	Avoid storing absolute URLs for files (#944 ) ## Summary It turns out that storing an absolute URL for every file caused a significant performance regression. This PR attempts to address the regression with two changes. The first is that we now store the raw string if the URL is an absolute URL. If the URL is relative, we store the base URL alongside the raw relative string. As such, we avoid serializing and deserializing URLs until we need them (later on), except for the base URL. The second is that we now use the internal `Url` crate methods for serializing and deserializing. If you look inside `Url`, its standard serializer and deserialization actually convert it to a string, then parse the string. But the crate exposes some other methods for faster serialization and deserialization (with fewer guarantees). I think this is totally fine since the cache is entirely internal. If we _just_ change the `Url` serialization (and no other code -- so continue to store URLs for every file), then the regression goes down to about 5%: ```shell ❯ python -m scripts.bench \ --puffin-path ./target/release/main \ --puffin-path ./target/release/relative --puffin-path ./target/release/puffin \ scripts/requirements/home-assistant.in --benchmark resolve-warm Benchmark 1: ./target/release/main (resolve-warm) Time (mean ± σ): 496.3 ms ± 4.3 ms [User: 452.4 ms, System: 175.5 ms] Range (min … max): 487.3 ms … 502.4 ms 10 runs Benchmark 2: ./target/release/relative (resolve-warm) Time (mean ± σ): 284.8 ms ± 2.1 ms [User: 245.8 ms, System: 165.6 ms] Range (min … max): 280.3 ms … 288.0 ms 10 runs Benchmark 3: ./target/release/puffin (resolve-warm) Time (mean ± σ): 300.4 ms ± 3.2 ms [User: 255.5 ms, System: 178.1 ms] Range (min … max): 295.4 ms … 305.1 ms 10 runs Summary './target/release/relative (resolve-warm)' ran 1.05 ± 0.01 times faster than './target/release/puffin (resolve-warm)' 1.74 ± 0.02 times faster than './target/release/main (resolve-warm)' ``` So I 
considered _just_ making that change. But 5% is kind of borderline... With both of these changes, the regression is down to 1-2%: ``` Benchmark 1: ./target/release/relative (resolve-warm) Time (mean ± σ): 282.6 ms ± 7.4 ms [User: 244.6 ms, System: 181.3 ms] Range (min … max): 275.1 ms … 318.5 ms 30 runs Benchmark 2: ./target/release/puffin (resolve-warm) Time (mean ± σ): 286.8 ms ± 2.2 ms [User: 247.0 ms, System: 169.1 ms] Range (min … max): 282.3 ms … 290.7 ms 30 runs Summary './target/release/relative (resolve-warm)' ran 1.01 ± 0.03 times faster than './target/release/puffin (resolve-warm)' ``` It's consistently ~2%-ish, but at this point it's unclear if that's due to the URL change or some other change between now and then. Closes #943.	2024-01-17 09:15:21 -05:00
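
The "store the raw string, resolve later" idea from #944 might look roughly like the sketch below. The type and method names are hypothetical, and the join is a deliberately naive string concatenation rather than real URL resolution (it does not handle `../` segments or root-relative paths).

```rust
// Hypothetical type: keeps URLs as raw strings until one is actually needed.
#[derive(Debug, Clone, PartialEq)]
enum FileLocation {
    // Relative URL: the base and the raw relative string are stored side by side.
    RelativeUrl { base: String, path: String },
    // Absolute URL: the raw string is stored unparsed.
    AbsoluteUrl(String),
}

impl FileLocation {
    // Classify cheaply, without parsing the URL.
    fn new(base: &str, raw: &str) -> Self {
        if raw.contains("://") {
            FileLocation::AbsoluteUrl(raw.to_string())
        } else {
            FileLocation::RelativeUrl {
                base: base.to_string(),
                path: raw.to_string(),
            }
        }
    }

    // Resolve lazily. Assumption: a naive join is enough for illustration.
    fn to_url_string(&self) -> String {
        match self {
            FileLocation::AbsoluteUrl(url) => url.clone(),
            FileLocation::RelativeUrl { base, path } => format!(
                "{}/{}",
                base.trim_end_matches('/'),
                path.trim_start_matches('/')
            ),
        }
    }
}

fn main() {
    let file = FileLocation::new("https://example.org/simple/foo/", "foo-1.0.tar.gz");
    println!("{}", file.to_url_string());
}
```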
Charlie Marsh	b50e5fcbc5	Fetch `--find-links` indexes in parallel (#934 ) ## Summary Removes a TODO. ## Test Plan Tested manually with: ```shell cargo run -p puffin-cli -- \ pip compile requirements.in -n \ --find-links 'https://download.pytorch.org/whl/torch_stable.html' \ --find-links 'https://storage.googleapis.com/jax-releases/jax_cuda_releases.html' \ --verbose ``` And inspecting the logs to ensure that the two requests were kicked off concurrently.	2024-01-16 11:37:35 +01:00
Charlie Marsh	2a69b273ce	Use a standalone error type for `--find-links` registry (#936 )	2024-01-15 19:48:48 +00:00
Charlie Marsh	e54fdea93f	Continue to respect `--find-links` with `--no-index` (#931 ) Like `pip`, we should allow `--find-links` with `--no-index`.	2024-01-15 16:19:27 +00:00
Charlie Marsh	42888a9609	Share flat index across resolutions (#930 ) ## Summary This PR restructures the flat index fetching in a few ways: 1. It now lives in its own `FlatIndexClient`, since it felt a bit awkward (in my opinion) for it to live in `RegistryClient`. 2. We now fetch the `FlatIndex` outside of the resolver. This has a few benefits: (1) the resolver construct is no longer `async` and no longer returns `Result`, which feels better for a resolver; and (2) we can share the `FlatIndex` across resolutions rather than re-fetching it for every source distribution build.	2024-01-15 11:02:02 -05:00
Charlie Marsh	e6d7124147	Add an extra struct around the package-to-flat index map (#923 ) ## Summary `FlatIndex` is now the thing that's keyed on `PackageName`, while `FlatDistributions` is what used to be called `FlatIndex` (a map from version to `PrioritizedDistribution`, for a single package). I find this a bit clearer, since we can also remove the `from_files` that doesn't return `Self`, which I had trouble following.	2024-01-15 14:48:10 +00:00
Charlie Marsh	9a3f3d385c	Remove `PubGrubVersion` (#924 ) ## Summary I'm running into some annoyances converting `&Version` to `&PubGrubVersion` (which is just a wrapper type around `Version`), and I realized... We don't even need `PubGrubVersion`? The reason we "need" it today is due to the orphan trait rule: `Version` is defined in `pep440_rs`, but we want to `impl pubgrub::version::Version for Version` in the resolver crate. Instead of introducing a new type here, which leads to a lot of awkwardness around conversion and API isolation, what if we instead just implement `pubgrub::version::Version` in `pep440_rs` via a feature? That way, we can just use `Version` everywhere without any confusion and conversion for the wrapper type.	2024-01-15 08:51:12 -05:00
konsti	f63776b894	Support HTML indexes in `--find-links` (#913 ) The simple html format parser luckily seems to work for find links too, at least it can parse https://storage.googleapis.com/jax-releases/jax_cuda_releases.html.	2024-01-15 02:54:34 +00:00
konsti	e9b6b6fa36	Implement `--find-links` as flat indexes (directories in pip-compile) (#912 ) Add directory `--find-links` support for local paths to pip-compile. It seems that pip joins all sources and then picks the best package. We explicitly give find links packages precedence if the same exists on an index and locally by prefilling the `VersionMap`, otherwise they are added as another index and the existing rules of precedence apply. Internally, the feature is called _flat index_, which is more meaningful than _find links_: We're not looking for links, we're picking up local directories, and (TBD) support another index format that's just a flat list of files instead of a nested index. `RegistryBuiltDist` and `RegistrySourceDist` now use `WheelFilename` and `SourceDistFilename` respectively. The `File` inside `RegistryBuiltDist` and `RegistrySourceDist` gained the ability to represent both a url and a path so that `--find-links` with a url and with a path works the same, both being locked as `<package_name>@<version>` instead of `<package_name> @ <url>`. (This is more of a detail, this PR in general still works if we strip that and have directory find links represented as `<package_name> @ file:///path/to/file.ext`) `PrioritizedDistribution` and `FlatIndex` have been moved to locations where we can use them in the upstack PR. I added a `scripts/wheels` directory with stripped down wheels to use for testing. We're lacking tests for correct tag priority precedence with flat indexes, i only confirmed this manually since it is not covered in the pip-compile or pip-sync output. Closes #876	2024-01-15 02:04:10 +00:00
konsti	5ffbfadf66	Make hashes optional (#910 ) There is no guarantee that indexes provide hashes at all or the sha256 we support specifically. [PEP 503](https://peps.python.org/pep-0503/#specification): > The URL SHOULD include a hash in the form of a URL fragment with the following syntax: #<hashname>=<hashvalue>, where <hashname> is the lowercase name of the hash function (such as sha256) and <hashvalue> is the hex encoded digest. We instead use the url as input to generate a hash when caching.	2024-01-14 16:32:55 -05:00
konsti	a53bdeba4c	Remove `base` from `RegistryBuiltDist` and `RegistrySourceDist` (#919 ) Follow-up to https://github.com/astral-sh/puffin/pull/917 that i found while rebasing the find-links PRs: this field became unused through the absolute URLs.	2024-01-14 17:46:16 +00:00
konsti	a99e5e00f2	Use absolute urls in `distribution_type::File` (#917 ) Previously, the url on file could either be a relative or an absolute url, depending on the index, and we would finalize it lazily. Now we finalize the url when converting `pypi_types::File` to `distribution_types::File`. This change is required to make the hashes on `File` optional (https://github.com/astral-sh/puffin/pull/910), which are currently the only unique field usable for caching.	2024-01-14 17:15:24 +00:00
konsti	b1edecdf1f	Filter out files with invalid requires python specifiers (#775 ) Instead of trying to fixup _all_ the invalid version specifiers on pypi and elsewhere, this filters out distributions with invalid `requires-python` version specifiers that even `LenientVersionSpecifiers` couldn't parse, as opposed to failing entirely, which we currently do. It would be nicer to model this through an invalid distribution pubgrub type, together with e.g. source dists with an unknown extension, so that the version itself still shows up in the error trace. At the same time, we reduce the log level for fixups from warning to trace, as they are not actionable for the user.	2024-01-09 02:46:27 +00:00
Charlie Marsh	fed492831a	Inline some format placeholders (#822 )	2024-01-06 23:13:44 +00:00
Charlie Marsh	77c3a67029	Remove `pub(crate)` from `RegistryClient` fields (#821 )	2024-01-06 22:05:18 +00:00
Charlie Marsh	9ded337870	Remove unused `proxy` field from client (#820 )	2024-01-06 17:02:35 -05:00
konsti	5820a9d937	Update dependencies (#794 ) Pull in a bunch of updates so they get some testing before we announce the project. textwrap 0.16 is blocked on miette updating, http 1.0 on reqwest.	2024-01-05 11:40:12 -05:00
Andrew Gallant	d7c9b151fb	pep440: some minor refactoring, mostly around error types (#780 ) This PR does a bit of refactoring to the pep440 crate, and in particular around the error types. This PR is meant to be a precursor to another PR that does some surgery (both in parsing and in `Version` representation) that benefits somewhat from this refactoring. As usual, please review commit-by-commit.	2024-01-04 12:28:36 -05:00
Charlie Marsh	b2230e7f4d	Make index URLs insensitive to trailing slashes (#771 ) Closes https://github.com/astral-sh/puffin/issues/770.	2024-01-04 08:45:50 -05:00
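
One way to make index URLs insensitive to trailing slashes, as #771 describes, is to compare and hash a normalized form. This is a sketch under that assumption; the wrapper below is illustrative and is not claimed to match puffin's actual `IndexUrl` implementation.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

// Illustrative wrapper around the URL string as written by the user.
#[derive(Debug, Clone)]
struct IndexUrl(String);

impl IndexUrl {
    // The form used for comparison: trailing slashes are ignored.
    fn normalized(&self) -> &str {
        self.0.trim_end_matches('/')
    }
}

impl PartialEq for IndexUrl {
    fn eq(&self, other: &Self) -> bool {
        self.normalized() == other.normalized()
    }
}

impl Eq for IndexUrl {}

// `Hash` must agree with `Eq`, so it hashes the normalized form too.
impl Hash for IndexUrl {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.normalized().hash(state);
    }
}

fn main() {
    let mut seen = HashSet::new();
    seen.insert(IndexUrl("https://pypi.org/simple/".to_string()));
    seen.insert(IndexUrl("https://pypi.org/simple".to_string()));
    println!("distinct indexes: {}", seen.len());
}
```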
konsti	26f597a787	Add spans to all significant tasks (#740 ) I've tried to investigate puffin's performance with respect to builds and parallelism in general, but found the previous instrumentation too granular. I've tried to add spans to every function that either needs noticeable io or cpu resources without creating duplication. This also fixes some wrong tracing usage on async functions (https://docs.rs/tracing/latest/tracing/struct.Span.html#in-asynchronous-code) and some spans that weren't actually entered.	2024-01-02 16:17:03 +00:00
Charlie Marsh	007f52bb4e	Add support for relative URLs in simple metadata responses (#721 ) ## Summary This PR adds support for relative URLs in the simple JSON responses. We already support relative URLs for HTML responses, but the handling has been consolidated between the two. Similar to index URLs, we now store the base alongside the metadata, and use the base when resolving the URL. Closes #455. ## Test Plan `cargo test` (to test HTML indexes). Separately, I also ran `cargo run -p puffin-cli -- pip-compile requirements.in -n --index-url=http://localhost:3141/packages/pypi/+simple` on the `zb/relative` branch with `packse` running, and forced both HTML and JSON by limiting the `accept` header.	2023-12-27 08:53:21 -05:00
Charlie Marsh	ae83a74309	Review feedback for HTML indexes (#733 ) See: https://github.com/astral-sh/puffin/pull/719	2023-12-26 21:57:20 +00:00
Charlie Marsh	bbe0246205	Change internal representation of `CacheEntry` to avoid allocations (#730 ) Removes a TODO.	2023-12-26 02:10:30 +00:00
Charlie Marsh	188ab75769	Split `File` into internal and external type (#729 ) ## Summary This PR makes the `pypi_types::File` a response-only type (i.e., a type that's only used when deserializing over the wire), and adds a separate internal `File` type. Right now, the representations are similar, but already, we can avoid the "lenient" deserialization on our internal `File` type, and avoid the special-casing of the property names that's required in the JSON. Over time, we can evolve this representation entirely separately from the representation we receive from PyPI and other indexes.	2023-12-25 15:42:28 -05:00
Charlie Marsh	6ff21374dc	Split `puffin-cache` into Puffin-specific and generic utilities (#728 ) This crate started off as generic caching utilities, but we started adding a lot of Puffin-specific stuff (like the cache buckets abstraction that knows about Git vs. direct URL vs. indexes and so on). This PR moves the generic stuff into a new `cache-key` crate.	2023-12-25 14:38:56 +00:00
Charlie Marsh	343880820b	Un-escape HTML entities when decoding (#723 ) I don't have a good testing strategy here (I'm manually testing against `devpi` via `packse`), but the HTML index uses (e.g.) `data-requires-python="&gt;=3.8"`, so we need to decode.	2023-12-24 16:35:45 -05:00
Charlie Marsh	2d721a497e	Add a `SimpleHttp` abstraction similar to `SimpleJson` (#722 ) Just an internal refactor to turn some standalone functions into associated methods (and reduce the diff in the next PR).	2023-12-24 20:55:57 +00:00
Charlie Marsh	5bce699ee1	Add support for HTML indexes (#719 ) ## Summary This PR adds support for HTML index responses (as with `--index-url=https://download.pytorch.org/whl`). Closes https://github.com/astral-sh/puffin/issues/412.	2023-12-24 16:04:00 +00:00
Zanie Blue	e705267dac	Fix fallback download when index does not support HTTP range requests (#702 ) Otherwise, when a server does not support HTTP range requests we throw an error instead of downloading without range requests. --------- Co-authored-by: konstin <konstin@mailbox.org>	2023-12-20 10:55:23 +00:00
Zanie Blue	ab15b08cbe	Perform 3 retries by default instead of 0 on failed index requests (#710 ) As a user, I'd expect retries to occur by default. We should also expose this via a setting probably.	2023-12-20 11:51:24 +01:00
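
The "3 retries by default" behavior from #710 amounts to one initial attempt plus up to three more on failure. The helper below is a minimal synchronous sketch of that policy; `with_retries` is a hypothetical name, and it is not the mechanism puffin's HTTP client uses (no backoff, no distinction between transient and permanent errors).

```rust
// Run `op`, retrying up to `max_retries` additional times on error.
// Returns the first success, or the last error once retries are exhausted.
fn with_retries<T, E>(max_retries: u32, mut op: impl FnMut() -> Result<T, E>) -> Result<T, E> {
    let mut retries_used = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(err) if retries_used >= max_retries => return Err(err),
            Err(_) => retries_used += 1,
        }
    }
}

fn main() {
    // Simulated index request that fails twice before succeeding.
    let mut calls = 0;
    let result: Result<u16, &str> = with_retries(3, || {
        calls += 1;
        if calls < 3 {
            Err("connection reset")
        } else {
            Ok(200)
        }
    });
    println!("result = {result:?} after {calls} calls");
}
```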
Zanie Blue	12eedb1c12	Include `Accept` header specifying that we can only parse JSON responses (#701 ) Otherwise, when an index does not support the query variable we get an HTML response and a JSON parse error.	2023-12-19 12:22:53 -06:00
Zanie Blue	52ba65aa9c	Derive `Debug` for `CachedClientError` (#703 ) Discovered while debugging https://github.com/astral-sh/puffin/pull/702	2023-12-19 12:22:39 -06:00
Charlie Marsh	3660d8a08e	Introduce separate traits for ahead-of-time and installed metadata (#692 ) This is a pure refactor to follow-up #690, to separate the metadata that we know upfront about distributions (like the version, for registry-based distributions) vs. the metadata that requires building (like the version, for URL-based distributions).	2023-12-18 22:37:45 +00:00
konsti	f4f67ebde0	Rebase: Uninstall existing non-editable versions when installing editable requirements bug (#682 ) Separate branch for rebasing #677 onto main because i don't trust the rebase enough to force push. Closes #677. --- If you install `black` from PyPI, then `-e ../black`, we need to uninstall the existing `black`. This sounds simple, but that in turn requires that we _know_ `-e ../black` maps to the package `black`, so that we can mark it for uninstallation in the install plan. This, in turn, means that we need to build editable dependencies prior to the install plan. This is just a bunch of reorganization to fix that specific bug (installing multiple versions of `black` if you run through the above workflow): we now run through the list of editables upfront, mark those that are already installed, build those that aren't, and then ensure that `InstallPlan` correctly removes those that need to be removed, etc. Closes #676. Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>	2023-12-18 09:28:14 +00:00
konsti	f059c6e6a6	Support editable in pip-sync and pip-compile (#587 ) Support `-e path/to/dir` in pip-sync and pip-compile.	2023-12-16 22:37:34 +00:00
konsti	71964ec7a8	Switch to msgpack in the cached client (#662 ) This gives a 1.23 speedup on transformers-extras. We could change to msgpack for the entire cache if we want. I only tried this format and postcard so far, where postcard was much slower (like 1.6s). I don't actually want to merge it like this, i wanted to figure out the ballpark of improvement for switching away from json. ``` hyperfine --warmup 3 --runs 10 "target/profiling/puffin pip-compile --cache-dir cache-msgpack scripts/requirements/transformers-extras.in" "target/profiling/branch pip-compile scripts/requirements/transformers-extras.in" Benchmark 1: target/profiling/puffin pip-compile --cache-dir cache-msgpack scripts/requirements/transformers-extras.in Time (mean ± σ): 179.1 ms ± 4.8 ms [User: 157.5 ms, System: 48.1 ms] Range (min … max): 174.9 ms … 188.1 ms 10 runs Benchmark 2: target/profiling/branch pip-compile scripts/requirements/transformers-extras.in Time (mean ± σ): 221.1 ms ± 6.7 ms [User: 208.1 ms, System: 46.5 ms] Range (min … max): 213.5 ms … 235.5 ms 10 runs Summary target/profiling/puffin pip-compile --cache-dir cache-msgpack scripts/requirements/transformers-extras.in ran 1.23 ± 0.05 times faster than target/profiling/branch pip-compile scripts/requirements/transformers-extras.in ``` Disadvantage: We can't manually look into the cache anymore to debug things - [ ] Check more formats, i currently only tested json, msgpack and postcard, there should be other formats, too - [x] Switch over `CachedByTimestamp` serialization (for the interpreter caching) - [x] Switch over error handling and make sure puffin is still resilient to cache failure	2023-12-16 21:01:35 +00:00
Charlie Marsh	84093773ef	Store source distribution sources in the cache (#653 ) ## Summary This PR modifies `source_dist.rs` to store source distributions (from remote URLs) in the cache. The cache structure for registries now looks like: <img width="1053" alt="Screen Shot 2023-12-14 at 10 43 43 PM" src="https://github.com/astral-sh/puffin/assets/1309177/3c2dbf6b-5926-41f2-b69b-74031741aba8"> (I will update the docs prior to merging, if approved.) The benefit here is that we can reuse the source distribution (avoid download + unzipping it) if we need to build multiple wheels. In the future, it will be even more relevant, since we'll need to reuse the source distribution to support https://github.com/astral-sh/puffin/issues/599. I also included some misc. refactors to DRY up repeated operations and add some more abstraction to `source_dist.rs`.	2023-12-15 17:19:33 +00:00
Charlie Marsh	ed8dfbfcf7	Preserve verbatim URLs (#639 ) ## Summary This PR adds a `VerbatimUrl` struct to preserve verbatim URLs throughout the resolution and installation pipeline. In short, alongside the parsed `Url`, we also keep the URL as written by the user. This enables us to display the URL exactly as written by the user, rather than the serialized path that we use internally. This will be especially useful once we start expanding environment variables since, at that point, we'll be able to write the version of the URL that includes the _unexpanded_ environment variable to the output file.	2023-12-14 15:03:39 +00:00
Charlie Marsh	0499fe0613	Fix incorrect unknown size marker in traces (#600 ) It said `(unknown size)` for _all_ disk-based wheels.	2023-12-09 04:46:01 +00:00
Zanie Blue	ef7be9103c	Parse `SimpleJson` into categorized data in the client (#522 ) Extends #517 with a suggestion from @konstin to parse the `SimpleJson` into an intermediate type `SimpleMetadata(BTreeMap<Version, VersionFiles>)` before converting to a `VersionMap`. This reduces the number of times we need to parse the response. Additionally, we cache the parsed response now instead of `SimpleJson`. `VersionFiles` stores two vectors with `WheelFilename`/`SourceDistFilename` and `File` tuples. These can be iterated over together or separately. A new enum `DistFilename` was added to capture the `SourceDistFilename` and `WheelFilename` variants allowing iteration over both vectors.	2023-12-07 11:04:47 -06:00
Charlie Marsh	a825b2db06	Shard the registry cache by package (#583 ) ## Summary This PR modifies the cache structure in a few ways. Most notably, we now shard the set of registry wheels by package, and index them lazily when computing the install plan. This applies both to built wheels: <img width="989" alt="Screen Shot 2023-12-06 at 4 42 19 PM" src="https://github.com/astral-sh/puffin/assets/1309177/0e8a306f-befd-4be9-a63e-2303389837bb"> And remote wheels: <img width="836" alt="Screen Shot 2023-12-06 at 4 42 30 PM" src="https://github.com/astral-sh/puffin/assets/1309177/7fd908cd-dd86-475e-9779-07ed067b4a1a"> For other distributions, we now consistently cache using the package name, which is really just for clarity and debuggability (we could consider omitting these): <img width="955" alt="Screen Shot 2023-12-06 at 4 58 30 PM" src="https://github.com/astral-sh/puffin/assets/1309177/3e8d0f99-df45-429a-9175-d57b54a72e56"> Obliquely closes https://github.com/astral-sh/puffin/issues/575.	2023-12-07 05:02:46 +00:00
Zanie Blue	2bb04771ce	Allow switching out the resolver's IO (#517 ) I'm working off of @konstin's commit here to implement arbitrary unsat test cases for the resolver. The entirety of the resolver's io are two functions: Get the version map for a package (PEP 440 version -> distribution) and get the metadata for a distribution. A new trait `ResolverProvider` abstracts these two away and allows replacing the real network requests e.g. with stored responses (https://github.com/pradyunsg/pip-resolver-benchmarks/blob/main/scenarios/pyrax_198.json). --------- Co-authored-by: konsti <konstin@mailbox.org>	2023-12-06 11:53:16 -06:00
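
The two-function IO boundary described in #517 can be sketched as a trait with an in-memory implementation. Only the name `ResolverProvider` comes from the commit message; the method names, the synchronous signatures, the `(major, minor)` version tuple and the `StoredResponses` type are simplifying assumptions (the commit concerns network requests, so the real trait is presumably asynchronous).

```rust
use std::collections::BTreeMap;

// Simplified version type: (major, minor). Real PEP 440 versions are richer.
type Version = (u32, u32);

// The resolver's entire IO surface, reduced to two lookups.
trait ResolverProvider {
    // Version map for a package: version -> distribution file name.
    fn version_map(&self, package: &str) -> Option<&BTreeMap<Version, String>>;
    // Metadata for a distribution: here, just its requirement strings.
    fn metadata(&self, dist: &str) -> Option<&Vec<String>>;
}

// Hypothetical provider backed by stored responses instead of the network.
#[derive(Default)]
struct StoredResponses {
    versions: BTreeMap<String, BTreeMap<Version, String>>,
    requirements: BTreeMap<String, Vec<String>>,
}

impl ResolverProvider for StoredResponses {
    fn version_map(&self, package: &str) -> Option<&BTreeMap<Version, String>> {
        self.versions.get(package)
    }

    fn metadata(&self, dist: &str) -> Option<&Vec<String>> {
        self.requirements.get(dist)
    }
}

// Code written against the trait works with any provider.
fn requirements_of_latest<P: ResolverProvider>(provider: &P, package: &str) -> Option<Vec<String>> {
    let versions = provider.version_map(package)?;
    let (_, dist) = versions.iter().next_back()?; // highest version
    provider.metadata(dist).cloned()
}

fn sample_responses() -> StoredResponses {
    let mut stored = StoredResponses::default();
    let flask = stored.versions.entry("flask".to_string()).or_default();
    flask.insert((2, 3), "flask-2.3.tar.gz".to_string());
    flask.insert((3, 0), "flask-3.0.tar.gz".to_string());
    stored
        .requirements
        .insert("flask-2.3.tar.gz".to_string(), vec!["werkzeug>=2.3".to_string()]);
    stored
        .requirements
        .insert("flask-3.0.tar.gz".to_string(), vec!["werkzeug>=3.0".to_string()]);
    stored
}

fn main() {
    let provider = sample_responses();
    println!("{:?}", requirements_of_latest(&provider, "flask"));
}
```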
Charlie Marsh	6f055ecf3b	Remove existing built wheels when building source distributions (#559 ) This PR modifies the source distribution building to replace any existing targets after building the new wheel. In some cases, the existence of an existing target may be indicative of a bug, so we warn. It's partially a workaround for some (but not all) of the errors in https://github.com/astral-sh/puffin/issues/554.	2023-12-05 12:45:24 -05:00
Charlie Marsh	5fddcc362e	Improve error messages for 'file not found' case (#550 ) Right now, if you specify a wheel that doesn't exist, you get: `no such file or directory` with no additional context. Oops!	2023-12-04 22:01:51 +00:00
konsti	d5abd33813	Use atomic writes for the cache consistently (#546 ) Ensure we're using atomic writes everywhere in our cache to avoid broken cache records and errors with parallel puffin actions (https://github.com/astral-sh/puffin/pull/544#issuecomment-1838841581). All json files that are written to the cache are written atomically and the built wheels are written to a temp dir and then moved atomically. I didn't touch venv creation though, i don't think that's worth it since python does not support atomic package installation through its design.	2023-12-04 12:02:01 -05:00
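
The write-to-temp-then-move pattern mentioned in #546 can be sketched with the standard library alone. `write_atomic` is a hypothetical helper, not puffin's code; it assumes the temporary file and the target are on the same filesystem, where a rename replaces the target in one step on POSIX systems.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::Path;

// Write `contents` to a sibling temp file, then rename it over `path`,
// so readers see either the old file or the complete new one.
fn write_atomic(path: &Path, contents: &[u8]) -> io::Result<()> {
    let dir = path.parent().unwrap_or_else(|| Path::new("."));
    let file_name = path
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "path has no file name"))?;

    // Temp name derived from the target name plus the process id
    // (an assumption; a random suffix would be more robust).
    let mut tmp_name = file_name.to_os_string();
    tmp_name.push(format!(".tmp-{}", std::process::id()));
    let tmp_path = dir.join(tmp_name);

    let mut tmp_file = fs::File::create(&tmp_path)?;
    tmp_file.write_all(contents)?;
    tmp_file.sync_all()?; // flush to disk before the rename publishes the file
    drop(tmp_file);

    fs::rename(&tmp_path, path)
}

fn main() -> io::Result<()> {
    let target = std::env::temp_dir().join(format!("atomic-demo-{}.json", std::process::id()));
    write_atomic(&target, b"{\"cached\": true}")?;
    println!("{}", fs::read_to_string(&target)?);
    fs::remove_file(&target)
}
```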
konsti	9806901a16	Consolidate wheel caches (#524 ) After this change, two wheel caches remain: `built-wheels-v0` and `wheels-v0`, docs screenshots below. Each contains both the wheel metadata, cache policy and zip or unzipped wheels under the same name. The zipped/unzipped strategy is as follows: In `pip-compile`, when we build a wheel, we store it zipped. When `pip-sync` or a source dist build in `pip-compile` need to install the wheel, we unzip it, remove the file and replace it with the unzipped wheel. This removes `WheelCache` and `UrlIndex` in favor of `Cache` plus `WheelCache`. The non-built wheel cache now considers index urls and the url for url wheels. I'm unsure if we need the `Unzipper` type, this could just be a function. I move `no_index` into `IndexUrls` and started using `IndexUrl` up to the clap level. I left a number of TODOs in the code, namely performing the actual invalidation of unzipped wheels and making the `InstallPlan` understand cache invalidation (i.e. uninstall wheels when their remote changed). ![image](https://github.com/astral-sh/puffin/assets/6826232/c4d45979-485b-4954-848d-fd3347ee2510)	2023-12-01 20:16:33 +00:00
konsti	d89fbeb642	Migrate interpreter query to custom caching (#508 ) This removes the last usage of cacache by replacing it with a custom, flat json caching keyed by the digest of the executable path. ![image](https://github.com/astral-sh/puffin/assets/6826232/8f777c4c-1f1b-4656-ba7b-002175270556) A step towards #478. I've made `CachedByTimestamp<T>` generic over `T` but intentionally not moved it to `puffin-cache` yet.	2023-11-28 17:14:59 +00:00
konsti	5435d44756	Introduce `Cache`, `CacheBucket` and `CacheEntry` (#507 ) This is mostly a mechanical refactor that moves 80% of our code to the same cache abstraction. It introduces `Cache`, which abstracts away the path of the cache and the temp dir drop and is passed throughout the codebase. To get a specific cache bucket, you need to request your `CacheBucket` from `Cache`. `CacheBucket` centralizes the names of all cache buckets, moving them away from the string constants spread throughout the crates. Specifically for working with the `CachedClient`, there is a `CacheEntry`. I'm not sure yet if that is a strict improvement over `cache_dir: PathBuf, cache_file: String`, i may have to rotate that later. The interpreter cache moved into `interpreter-v0`. We can use the `CacheBucket` page to document the cache structure in each bucket: ![image](https://github.com/astral-sh/puffin/assets/6826232/b023fdfb-e34d-4c2d-8663-b5f73937a539)	2023-11-28 17:11:14 +00:00


94 Commits