Python/uv - uv - Gitea: Git with a cup of tea

Commit Graph

Author	SHA1	Message	Date
Charlie Marsh	6a15950cb5	Rename `Distribution` to `Dist` in all structs and traits (#384 ) We tend to avoid abbreviations, but this one is just so long and absolutely ubiquitous.	2023-11-10 14:55:11 +00:00
Charlie Marsh	a148f9d0be	Refactor distribution types to adhere to a clear hierarchy (#369 ) ## Summary This PR refactors our `RemoteDistribution` type such that it now follows a clear hierarchy that matches the actual variants, and encodes the differences between source and built distributions: ```rust pub enum Distribution { Built(BuiltDistribution), Source(SourceDistribution), } pub enum BuiltDistribution { Registry(RegistryBuiltDistribution), DirectUrl(DirectUrlBuiltDistribution), } pub enum SourceDistribution { Registry(RegistrySourceDistribution), DirectUrl(DirectUrlSourceDistribution), Git(GitSourceDistribution), } /// A built distribution (wheel) that exists in a registry, like `PyPI`. pub struct RegistryBuiltDistribution { pub name: PackageName, pub version: Version, pub file: File, } /// A built distribution (wheel) that exists at an arbitrary URL. pub struct DirectUrlBuiltDistribution { pub name: PackageName, pub url: Url, } /// A source distribution that exists in a registry, like `PyPI`. pub struct RegistrySourceDistribution { pub name: PackageName, pub version: Version, pub file: File, } /// A source distribution that exists at an arbitrary URL. pub struct DirectUrlSourceDistribution { pub name: PackageName, pub url: Url, } /// A source distribution that exists in a Git repository. pub struct GitSourceDistribution { pub name: PackageName, pub url: Url, } ``` Most of the PR just stems downstream from this change. There are no behavioral changes, so I'm largely relying on lint, tests, and the compiler for correctness.	2023-11-10 02:45:41 +00:00
konsti	d407bbbee6	Special case missing header build errors (on linux) (#354 ) One of the most common errors i observed are build failures due to missing header files. On ubuntu, this generally means that you need to install some `<...>-dev` package that the documentation tells you about, e.g. [mysqlclient](https://github.com/PyMySQL/mysqlclient#linux) needs `default-libmysqlclient-dev`, [some psycopg versions](https://www.psycopg.org/psycopg3/docs/basic/install.html#local-installation) (i remember that this was always required at some earlier point) require `libpq-dev` and pygraphviz wants `graphviz-dev`. This is quite common for many scientific packages (where conda has an advantage because they can provide those package as a dependency). The error message can be completely inscrutable if you're just a python programmer (or user) and not a c programmer (example: pygraphviz): ``` warning: no files found matching '.png' under directory 'doc' warning: no files found matching '.txt' under directory 'doc' warning: no files found matching '.css' under directory 'doc' warning: no previously-included files matching '~' found anywhere in distribution warning: no previously-included files matching '.pyc' found anywhere in distribution warning: no previously-included files matching '.svn' found anywhere in distribution no previously-included directories found matching 'doc/build' pygraphviz/graphviz_wrap.c:3020:10: fatal error: graphviz/cgraph.h: No such file or directory 3020 \| #include "graphviz/cgraph.h" \| ^~~~~~~~~~~~~~~~~~~ compilation terminated. error: command '/usr/bin/gcc' failed with exit code 1 ``` The only relevant part is `Fatal error: graphviz/cgraph.h: No such file or directory`. Why is this file not there and how do i get it to be there? This is even harder to spot in pip's output, where it's 11 lines above the last line: ![image](https://github.com/astral-sh/puffin/assets/6826232/7a3d7279-e7b1-4511-ab22-d0a35be5e672) I've special cased missing headers and made sure that the last line tells you the important information: We're missing some header, please check the documentation of {package} {version} for what to install: ![image](https://github.com/astral-sh/puffin/assets/6826232/4bbb8923-5a82-472f-ab1f-9e1471aa2896) Scrolling up: ![image](https://github.com/astral-sh/puffin/assets/6826232/89a2495a-e188-4288-b534-ad885ee08763) The difference gets even clearer with a default ubuntu terminal with its 80 columns: ![image](https://github.com/astral-sh/puffin/assets/6826232/49fb27bc-07c6-4b10-a1a1-30ec8e112438) --- Note that the situation is better for a missing compiler, there i get: ``` [...] warning: no previously-included files matching '~' found anywhere in distribution warning: no previously-included files matching '*.pyc' found anywhere in distribution warning: no previously-included files matching '.svn' found anywhere in distribution no previously-included directories found matching 'doc/build' error: command 'gcc' failed: No such file or directory --- ``` Putting the last line into google, the first two results tell me to `sudo apt-get install gcc`, the third even tells me about `sudo apt install build-essential`	2023-11-08 15:26:39 +00:00
konsti	2ebe40b986	Add `--no-build` (#358 ) By default, we will build source distributions for both resolving and installing, running arbitrary code. `--no-build` adds an option to ban this and only install from wheels, no source distributions or git builds allowed. We also don't fetch these and instead report immediately. I've heard from users for whom this is a requirement, i'm implementing it now because it's helpful for testing. I'm thinking about adding a shared `PuffinSharedArgs` struct so we don't have to repeat each option everywhere.	2023-11-08 10:05:15 -05:00
Charlie Marsh	2c32bc5a86	Respect direct URLs in puffin installer (#345 ) We now write the `direct_url.json` when installing, and _skip_ installing if we find a package installed via the direct URL that the user is requesting. A lot of TODOs, especially around cleaning up the `Source` abstraction and its relationship to `DirectUrl`. I'm gonna keep working on these today, but this works and makes the requirements clear. Closes #332.	2023-11-07 09:11:27 -05:00
konsti	aac8ae997f	Rename source distribution build to source build (#334 ) This is less verbose and better reflects that we're building both source distributions and source trees passed into the function.	2023-11-07 03:55:23 +00:00
Charlie Marsh	2c114592bd	Only store small wheels in-memory (#348 ) Closes https://github.com/astral-sh/puffin/issues/246.	2023-11-07 00:50:00 +00:00
Charlie Marsh	24e30e6557	Split `puffin-package` into requirements.txt parser and `pypi-types` (#341 ) There are only two things left in this crate and they don't really have anything to do with one another.	2023-11-06 18:19:49 +00:00
Charlie Marsh	1f447892f3	Rename `PartitionedRequirements` to `InstallPlan` (#340 ) @konstin named this file at some point and I like it, it feels appropriate for the struct itself too.	2023-11-06 12:44:35 -05:00
Charlie Marsh	d9bcfafa16	Write `direct_url.json` in wheel installer (#337 ) ## Summary This PR just adds the logic in `install-wheel-rs` to write `direct_url.json`. We're not actually taking advantage of it yet (or wiring it through) in Puffin. Part of https://github.com/astral-sh/puffin/issues/332.	2023-11-06 17:09:28 +00:00
konsti	6f83a44fea	Improve error messages and make cache failures non fatal (#333 )	2023-11-06 15:06:27 +01:00
Charlie Marsh	6d672b8951	Add source distribution support to `pip-compile` (#323 ) ## Summary This is a first-pass at adding source distribution support to the installer. The previous installation flow was: 1. Come up with a plan. 1. Find a distribution (specific file) for every package that we'll need to download. 1. Download those distributions. 1. Unzip them (since we assumed they were all wheels). 1. Install them into the virtual environment. Now, Step (3) downloads both wheels and source distributions, and we insert a step between Steps (3) and (4) to build any source distributions into zipped wheels. There are a bunch of TODOs, the most important (IMO) is that we basically have two implementations of downloading and building, between the stuff in `puffin_installer` and `puffin_resolver` (namely in `crates/puffin-resolver/src/distribution`). I didn't attempt to clean that up here -- it's already a problem, and it's related to the overall problem we need to solve around unified caching and resource management. Closes #243.	2023-11-06 08:22:36 -05:00
Charlie Marsh	a4002fe132	Make cache non-optional in most crates (#293 ) This PR makes the cache non-optional in most of Puffin, which simplifies the code, allows us to reuse the cache within a single command (even with `--no-cache`), and also allows us to use the cache for disk storage across an invocation. I left the cache as optional for the `Virtualenv` and `InterpreterInfo` abstractions, since those are generic enough that it seems nice to have a non-cached version, but it's kind of arbitrary.	2023-11-02 13:40:20 -04:00
konsti	4adaa9a700	Wheel filename distribution package name (#278 ) The normalized name abstractions were not consistently, this PR uses them where they were previously missing: * `WheelFilename::distribution` * `Requirement::name` * `Requirement::extras` * `Metadata21::name` * `Metadata21::provides_dist` With `puffin-package` depending on `pep508_rs` this would be cyclical crate dependency, so `puffin-normalize` gets split out from `puffin-package`. `DistInfoName` has the same task and semantics as `PackageName`, so it's merged into the latter. `PackageName` and `ExtraName` documentation is moved onto the type and their constructors are called `new` instead of `normalize`. We now use these constructors rarely enough the implicit allocation by `to_string()` shouldn't matter anymore, while more actual cloning becomes visible.	2023-11-02 11:15:27 +00:00
Charlie Marsh	8123e1a8f6	Add stable hash crate (#281 ) This PR adds a `puffin-cache` crate that we can share across a variety of other crates to generate stable hashes.	2023-11-01 23:41:45 +00:00
Charlie Marsh	2652caa3e3	Add support for URL dependencies (#251 ) ## Summary This PR adds support for resolving and installing dependencies via direct URLs, like: ``` werkzeug @ `960bb4017c`4aed12b5ed8b78e0153e/Werkzeug-2.0.0-py3-none-any.whl ``` These are fairly common (e.g., with `torch`), but you most often see them as Git dependencies. Broadly, structs like `RemoteDistribution` and friends are now enums that can represent either registry-based dependencies or URL-based dependencies: ```rust /// A built distribution (wheel) that exists as a remote file (e.g., on `PyPI`). #[derive(Debug, Clone)] #[allow(clippy::large_enum_variant)] pub enum RemoteDistribution { /// The distribution exists in a registry, like `PyPI`. Registry(PackageName, Version, File), /// The distribution exists at an arbitrary URL. Url(PackageName, Url), } ``` In the resolver, we now allow packages to take on an extra, optional `Url` field: ```rust #[derive(Debug, Clone, Eq, Derivative)] #[derivative(PartialEq, Hash)] pub enum PubGrubPackage { Root, Package( PackageName, Option<DistInfoName>, #[derivative(PartialEq = "ignore")] #[derivative(PartialOrd = "ignore")] #[derivative(Hash = "ignore")] Option<Url>, ), } ``` However, for the purpose of version satisfaction, we ignore the URL. This allows for the URL dependency to satisfy the transitive request in cases like: ``` flask==3.0.0 werkzeug @ `254c3e9b5f`5941e900b71206e6313b/werkzeug-3.0.1-py3-none-any.whl ``` There are a couple limitations in the current approach: - The caching for remote URLs is done separately in the resolver vs. the installer. I decided not to sweat this too much... We need to figure out caching holistically. - We don't support any sort of time-based cache for remote URLs -- they just exist forever. This will be a problem for URL dependencies, where we need some way to evict and refresh them. But I've deferred it for now. - I think I need to redo how this is modeled in the resolver, because right now, we don't detect a variety of invalid cases, e.g., providing two different URLs for a dependency, asking for a URL dependency and a _different version_ of the same dependency in the list of first-party dependencies, etc. - (We don't yet support VCS dependencies.)	2023-11-01 09:21:44 -04:00
Charlie Marsh	079b685c8c	Use distributions for `Reporter` signatures (#266 )	2023-11-01 03:19:13 +00:00
Charlie Marsh	89dad0c9ad	Move distribution abstraction in shared crate (#258 ) This also allows us to get rid of `PinnedPackage` _and_ to remove some `Result<...>` types due to needless conversions between otherwise-identical types.	2023-10-31 15:30:06 -04:00
Charlie Marsh	16aac834ee	Move PyPI-oriented types out of `puffin-client` crate (#255 ) Just an internal change to avoid a dependency on `puffin-client` for those crates that need access to PyPI-metadata types.	2023-10-31 17:10:23 +00:00
Charlie Marsh	2f38701008	Remove unused wheel cache argument from downloader (#248 )	2023-10-31 02:23:50 +00:00
Charlie Marsh	ae203f998a	Rename `Unzipper#download` to `Unzipper#unzip` (#247 )	2023-10-31 01:19:27 +00:00
konsti	35d6bd761b	Fallback to copy if hardlinking failed (#237 )	2023-10-30 19:10:01 +00:00
Charlie Marsh	d5c3ff789a	Sort wheels by size when downloading and zipping (#210 ) I just learned about this from PackagingCon, and locally, it shows a nice speedup: ``` ❯ hyperfine --warmup 3 --prepare "rm -rf .venv && ./target/release/puffin venv .venv" "./target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache" "./target/release/main pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache" Benchmark 1: ./target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache Time (mean ± σ): 3.958 s ± 0.250 s [User: 1.323 s, System: 5.840 s] Range (min … max): 3.652 s … 4.402 s 10 runs Benchmark 2: ./target/release/main pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache Time (mean ± σ): 4.214 s ± 0.451 s [User: 1.322 s, System: 5.976 s] Range (min … max): 3.708 s … 5.268 s 10 runs Summary './target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache' ran 1.06 ± 0.13 times faster than './target/release/main pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache' ```	2023-10-26 20:50:56 +00:00
Charlie Marsh	7bce41498e	Improve debug logging in dispatcher (#206 ) Also makes the order of operations more similar to that of the `pip-compile` command.	2023-10-26 18:54:47 +00:00
konsti	889f6173cc	Unify python interpreter abstractions (#178 ) Previously, we had two python interpreter metadata structs, one in gourgeist and one in puffin. Both would spawn a subprocess to query overlapping metadata and both would appear in the cli crate, if you weren't careful you could even have to different base interpreters at once. This change unifies this to one set of metadata, queried and cached once. Another effect of this crate is proper separation of python interpreter and venv. A base interpreter (such as `/usr/bin/python/`, but also pyenv and conda installed python) has a set of metadata. A venv has a root and inherits the base python metadata except for `sys.prefix`, which unlike `sys.base_prefix`, gets set to the venv root. From the root and the interpreter info we can compute the paths inside the venv. We can reuse the interpreter info of the base interpreter when creating a venv without having to query the newly created `python`.	2023-10-25 20:11:36 +00:00
konsti	1fbe328257	Build source distributions in the resolver (#138 ) This is isn't ready, but it can resolve `meine_stadt_transparent==0.2.14`. The source distributions are currently being built serially one after the other, i don't know if that is incidentally due to the resolution order, because sdist building is blocking or because of something in the resolver that could be improved. It's a bit annoying that the thing that was supposed to do http requests now suddenly also has to a whole download/unpack/resolve/install/build routine, it messes up the type hierarchy. The much bigger problem though is avoid recursive crate dependencies, it's the reason for the callback and for splitting the builder into two crates (badly named atm)	2023-10-25 20:05:13 +00:00
konsti	b5c57ee6fe	Fix rustdoc warnings (#182 ) Changes to make `cargo doc --all --all-features` pass without warnings.	2023-10-25 11:48:24 +00:00
Charlie Marsh	0e097874f8	Add support for alternate index URLs (#169 ) As elsewhere, we just use the `pip` and `pip-compile` APIs. So we support `--index-url` to override PyPI, then `--extra-index-url` to add _additional_ indexes, and `--no-index` to avoid hitting the index at all. Closes #156.	2023-10-23 03:18:30 +00:00
Charlie Marsh	49a27ff33c	Add support for parameterized link modes (#164 ) Allows the user to select between clone, hardlink, and copy semantics for installs. (The pnpm documentation has a decent description of what these mean: https://pnpm.io/npmrc#package-import-method.) Closes #159.	2023-10-22 04:35:50 +00:00
konsti	ae9d1f7572	Add source distribution filename abstraction (#154 ) The need for this became clear when working on the source distribution integration into the resolver. While at it i also switch the `WheelFilename` version to the parsed `pep440_rs` version now that we have this crate.	2023-10-20 17:45:57 +02:00
Charlie Marsh	bcd281eb1f	Remove `async` from some filesystem-only APIs (#146 )	2023-10-20 01:08:51 +00:00
Charlie Marsh	7ef6c0315c	Unify site-packages into distribution enum (#136 ) Gets rid of the custom `DistInfo` struct in the site-packages abstraction in favor of a new kind of distribution (`InstalledDistribution`). No change in behavior.	2023-10-19 04:37:52 +00:00
Charlie Marsh	bd01fb490e	Remove packages when syncing (#135 ) `pip-sync` will now uninstall any packages that aren't necessary. Closes https://github.com/astral-sh/puffin/issues/128.	2023-10-19 00:14:20 -04:00
Charlie Marsh	7e8ffeb2df	Use `fs-err` in more crates (#100 ) Closes https://github.com/astral-sh/puffin/issues/88.	2023-10-16 13:37:58 +00:00
konsti	de9e85978b	Fix tempdir rename (#94 ) This fixes two bugs on linux: `/tmp` and `$HOME` are technically on two different partitions on my machine, which means that rename-as-atomic-dir-write doesn't work. The solution is to create the temp dir in the target directory. zip files may contain directory entries, we can't create files for them but need to create directories. We could skip them though because iirc they are not in the RECORD so they won't be uninstalled.	2023-10-12 18:47:38 +00:00
Charlie Marsh	906a482499	Separate unzip into its own install phase (#87 )	2023-10-11 15:18:23 +00:00
Charlie Marsh	85162d1111	Parallelize wheel installations with Rayon (#84 ) It looks like using _either_ async Rust with a `JoinSet` _or_ parallelizing a fixed threadpool with Rayon provide about a ~5% speed-up over our current serial approach: ```console ❯ hyperfine --runs 30 --warmup 5 --prepare "./target/release/puffin venv .venv" \ "./target/release/rayon sync ./scripts/benchmarks/requirements-large.txt" \ "./target/release/async sync ./scripts/benchmarks/requirements-large.txt" \ "./target/release/main sync ./scripts/benchmarks/requirements-large.txt" Benchmark 1: ./target/release/rayon sync ./scripts/benchmarks/requirements-large.txt Time (mean ± σ): 295.7 ms ± 16.9 ms [User: 28.6 ms, System: 263.3 ms] Range (min … max): 249.2 ms … 315.9 ms 30 runs Benchmark 2: ./target/release/async sync ./scripts/benchmarks/requirements-large.txt Time (mean ± σ): 296.2 ms ± 20.2 ms [User: 36.1 ms, System: 340.1 ms] Range (min … max): 258.0 ms … 359.4 ms 30 runs Benchmark 3: ./target/release/main sync ./scripts/benchmarks/requirements-large.txt Time (mean ± σ): 306.6 ms ± 19.5 ms [User: 25.3 ms, System: 220.5 ms] Range (min … max): 269.6 ms … 332.2 ms 30 runs Summary './target/release/rayon sync ./scripts/benchmarks/requirements-large.txt' ran 1.00 ± 0.09 times faster than './target/release/async sync ./scripts/benchmarks/requirements-large.txt' 1.04 ± 0.09 times faster than './target/release/main sync ./scripts/benchmarks/requirements-large.txt' ``` It's much easier to just parallelize with Rayon and avoid async in the underlying wheel code, so this PR takes that approach for now.	2023-10-10 23:46:30 -04:00
Charlie Marsh	c1fb698eae	Add a separate dist-info name struct (#85 )	2023-10-10 23:21:18 +00:00
Charlie Marsh	a0294a510c	Rework `puffin sync` output to summarize (#81 ) This also moves away from using `tracing` for user-facing logging, instead introducing a new `Printer` abstraction. Closes #66.	2023-10-10 03:29:09 +00:00
Charlie Marsh	ba2b200fce	Enable release builds via `cargo-dist` (#79 )	2023-10-09 20:48:55 +00:00
Charlie Marsh	b90140e1bc	Add support for wheel uninstalls (#77 ) Closes #36.	2023-10-09 14:14:33 -04:00
Charlie Marsh	485b1dceb6	Use a single requirements iterator in `sync` (#71 )	2023-10-09 03:29:38 +00:00
Charlie Marsh	ba72950546	Avoid passing cached wheels to the resolver step (#70 ) When we go to install a locked `requirements.txt`, if a wheel is already available in the local cache, and matches the version specifiers, we can just use it directly without fetching the package metadata. This speeds up the no-op case by about 33%. Closes https://github.com/astral-sh/puffin/issues/48.	2023-10-08 22:17:19 -04:00
Charlie Marsh	5b71cfdd0b	Remove Monotrail-specific code from `install-wheel-rs` (#68 ) I think this isn't necessary to support in this generic crate. If we choose to adopt Monotrail-style concepts, we'll likely need to rework them anyway.	2023-10-08 18:28:57 -04:00
Charlie Marsh	1c942ab8fe	Tweak tracing output for sync command (#64 )	2023-10-08 20:09:15 +00:00
Charlie Marsh	a53f697f62	Use `tracing` for user-facing output (#63 ) The setup is now as follows: - All user-facing logging goes through `tracing` at an `info` leve. (This excludes messages that go to `stdout`, like the compiled `requirements.txt` file.) - We have `--quiet` and `--verbose` command-line flags to set the tracing filter and format defaults. So if you use `--verbose`, we include timestamps and targets, and filter at `puffin=debug` level. - However, we always respect `RUST_LOG`. So you can override the _filter_ via `RUST_LOG`. For example: the standard setup filters to `puffin=info`, and doesn't show timestamps or targets: <img width="1235" alt="Screen Shot 2023-10-08 at 3 41 22 PM" src="https://github.com/astral-sh/puffin/assets/1309177/54ca4db6-c66a-439e-bfa3-b86dee136e45"> If you run with `--verbose`, you get debug logging, but confined to our crates: <img width="1235" alt="Screen Shot 2023-10-08 at 3 41 57 PM" src="https://github.com/astral-sh/puffin/assets/1309177/c5c1af11-7f7a-4038-a173-d9eca4c3630b"> If you want verbose logging with _all_ crates, you can add `RUST_LOG=debug`: <img width="1235" alt="Screen Shot 2023-10-08 at 3 42 39 PM" src="https://github.com/astral-sh/puffin/assets/1309177/0b5191f4-4db0-4db9-86ba-6f9fa521bcb6"> I think this is a reasonable setup, though we can see how it feels and refine over time. Closes https://github.com/astral-sh/puffin/issues/57.	2023-10-08 15:46:06 -04:00
Charlie Marsh	5eef6e9636	Store cached wheels by dist-info-like name (#52 ) Closes https://github.com/astral-sh/puffin/issues/50.	2023-10-08 04:28:04 +00:00
Charlie Marsh	2a846e76b7	Store unzipped wheels in a cache (#49 ) This PR massively speeds up the case in which you need to install wheels that already exist in the global cache. The new strategy is as follows: - Download the wheel into the content-addressed cache. - Unzip the wheel into the cache, but ignore content-addressing. It turns out that writing to `cacache` for every file in the zip added a ton of overhead, and I don't see any actual advantages to doing so. Instead, we just unzip the contents into a directory at, e.g., `~/.cache/puffin/django-4.1.5`. - (The unzip itself is now parallelized with Rayon.) - When installing the wheel, we now support unzipping from a directory instead of a zip archive. This required duplicating and tweaking a few functions. - When installing the wheel, we now use reflinks (or copy-on-write links). These have a few fantastic properties: (1) they're extremely cheap to create (on macOS, they are allegedly faster than hard links); (2) they minimize disk space, since we avoid copying files entirely in the vast majority of cases; and (3) if the user then edits a file locally, the cache doesn't get polluted. Orogene, Bun, and soon pnpm all use reflinks. Puffin is now ~15x faster than `pip` for the common case of installing cached data into a fresh environment. Closes https://github.com/astral-sh/puffin/issues/21. Closes https://github.com/astral-sh/puffin/issues/39.	2023-10-08 04:04:48 +00:00
Charlie Marsh	f3015ffc1f	Add a `clean` command to clear the cache (#41 )	2023-10-07 15:19:03 +00:00
Charlie Marsh	162952bf64	Add a content-addressed cache for wheels (#38 ) Closes https://github.com/astral-sh/puffin/issues/4.	2023-10-07 14:24:52 +00:00

1 2

55 Commits