## Summary
This was just a missing line -- we have `dependencies.remove(&package);`
in the ~identical branch above, but it must've been an oversight to omit
it here.
Closes https://github.com/astral-sh/uv/issues/1467.
## Test Plan
`cargo test`
## Summary
It turns out that it's not uncommon to end up with repeated packages in
requirements files when running `pip-sync`, e.g., you might have
`anyio==4.0.0` specified multiple times. This PR relaxes our assertions
in the install plan to allow such repeated packages, as long as the
requirement markers are exactly the same (i.e., they are truly
duplicates).
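A minimal sketch of the relaxed check, using made-up types rather than uv's actual install-plan code:
```rust
use std::collections::HashMap;

/// Repeated entries for the same package are fine, as long as their markers
/// are exactly the same (illustrative types, not uv's install plan).
fn check_duplicates(entries: &[(&str, Option<&str>)]) -> Result<(), String> {
    let mut seen: HashMap<&str, Option<&str>> = HashMap::new();
    for &(name, marker) in entries {
        if let Some(previous) = seen.insert(name, marker) {
            if previous != marker {
                return Err(format!("conflicting requirements for `{name}`"));
            }
        }
    }
    Ok(())
}

fn main() {
    // True duplicates are allowed...
    assert!(check_duplicates(&[("anyio", None), ("anyio", None)]).is_ok());
    // ...but the same package with different markers is still rejected.
    assert!(check_duplicates(&[
        ("anyio", None),
        ("anyio", Some("python_version < '3.9'")),
    ])
    .is_err());
}
```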
Closes https://github.com/astral-sh/uv/issues/1552.
## Summary
If you're developing on a package like `attrs` locally, and it has a
recursive extra like `attrs[dev]`, it turns out that we then try to find
the `attrs` in `attrs[dev]` from the registry, rather than recognizing
that it's part of the editable.
This PR fixes the issue by making editables slightly more first-class
throughout the resolver. Instead of mocking metadata, we explicitly
check for extras in various places. Part of the problem here is that we
treated editables as URL dependencies, but when we saw an _extra_ like
`attrs[dev]`, we didn't map that back to the URL. So now, we treat them
as registry dependencies, but with the appropriate guardrails
throughout.
Closes https://github.com/astral-sh/uv/issues/1447.
## Test Plan
- Cloned `attrs`.
- Ran `cargo run venv && cargo run pip install -e ".[dev]" -v`.
## Summary
This _could_ fix https://github.com/astral-sh/uv/issues/1454, but I'm
not sure. I was able to replicate by forcing a bunch of error states.
But, in short, if we fail to hardlink on the initial copy due to a file
existing, and then fail _again_, we fall back to copying. But if we copy,
then the tempfile doesn't exist, and so the `fs_err::rename(&tempfile,
&out_path)?;` will fail with "File not found".
This PR just ensures that the cases are explicitly mutually exclusive:
we only attempt to rename if the hardlink succeeded.
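A minimal sketch of that control flow (illustrative only, not uv's actual linking code):
```rust
use std::fs;
use std::io;
use std::path::Path;

/// Only rename the temporary file if the hard link actually created it;
/// otherwise fall back to copying straight to the destination.
fn link_or_copy(src: &Path, tempfile: &Path, out_path: &Path) -> io::Result<()> {
    match fs::hard_link(src, tempfile) {
        Ok(()) => {
            // The temp file exists, so renaming it into place is valid.
            fs::rename(tempfile, out_path)
        }
        Err(_) => {
            // The temp file was never created; don't try to rename it.
            fs::copy(src, out_path).map(|_| ())
        }
    }
}
```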
This PR fixes the OS detection for Alpine Linux such that the version
of musl available is correctly determined. The issue boiled down to
a regex that required 2 digits for each version component. But a
valid musl version is 1.2.4, which only has a single digit for each
component.
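A minimal sketch of the difference, matching against a musl-style `ldd --version` banner with the `regex` crate (both patterns are illustrative, not uv's exact regex):
```rust
use regex::Regex;

fn main() {
    // Requiring two digits per component misses `1.2.4`; one-or-more digits
    // per component matches it.
    let strict = Regex::new(r"Version (\d{2})\.(\d{2})\.(\d{2})").unwrap();
    let relaxed = Regex::new(r"Version (\d+)\.(\d+)\.(\d+)").unwrap();

    let banner = "musl libc (x86_64)\nVersion 1.2.4";
    assert!(!strict.is_match(banner));
    assert!(relaxed.is_match(banner));
}
```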
It's unclear how this was working for musl before this change. My
theory is that our other methods of OS detection were somehow working.
The first commit in this PR cleans up our Linux detection logic and adds
lots of tracing calls to make debugging issues like this easier in the
future. To do so, one can run:
$ RUST_LOG=trace uv pip install -v whatever
The second commit has the actual fix.
Fixes #1427
## Summary
By using the display representation of `Version` to form a `PackageId`,
we run the risk (as seen in the linked issue) of thinking that versions
like `2021.1` and `2021.1.0` are not equivalent.
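A minimal sketch of the idea, comparing by release components and ignoring trailing zeros (illustrative; not uv's actual `Version` type):
```rust
/// Compare versions by their release components, treating trailing zeros as
/// insignificant.
fn release_key(version: &str) -> Vec<u64> {
    let mut parts: Vec<u64> = version
        .split('.')
        .filter_map(|part| part.parse().ok())
        .collect();
    while parts.last() == Some(&0) {
        parts.pop();
    }
    parts
}

fn main() {
    // `2021.1` and `2021.1.0` identify the same version...
    assert_eq!(release_key("2021.1"), release_key("2021.1.0"));
    // ...while `2021.1.1` does not.
    assert_ne!(release_key("2021.1"), release_key("2021.1.1"));
}
```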
Closes https://github.com/astral-sh/uv/issues/1536
This fixes a bug where `uv pip install` failed to install `polars`:
```
$ uv pip install polars==0.14.0
error: Failed to download: polars==0.14.0
Caused by: Couldn't parse metadata of polars-0.14.0-cp37-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.whl from 749022b096cb7c1c2cc32b7f433c4f/polars-0.14.0-cp37-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Caused by: Operator >= cannot be used with a wildcard version specifier
pyarrow>=4.0.*; extra == 'pyarrow'
^^^^^^^
```
Since `pyarrow>=4.0.*; extra == 'pyarrow'` is invalid *and* it comes
from the metadata of a dependency (that isn't under the control of the
end user), we actually attempt to "fix" it. Namely, wildcard
dependency specifications are only allowed with `==` and `!=`, as per
the [Version Specifiers spec]. (They aren't explicitly forbidden in
these cases, but instead only have specified behavior for the `==` and
`!=` operators.)
This is all fine, but it turns out that when we fix the `>=4.0.*`
component, we also strip the quotes around `pyarrow`. (Because some
dependency specifications include stray quotes.) We fix this by making
our quote stripping a bit more selective. (We require that it appear
adjacent to a digit or a `*`.)
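A minimal sketch of that heuristic (illustrative; the real implementation may differ in detail):
```rust
/// Strip a stray quote only when it sits next to a digit or `*`, so quotes
/// that belong to a marker such as `extra == 'pyarrow'` are left alone.
fn strip_stray_quotes(spec: &str) -> String {
    let chars: Vec<char> = spec.chars().collect();
    let mut out = String::new();
    for (i, &c) in chars.iter().enumerate() {
        if c == '\'' || c == '"' {
            let prev = i.checked_sub(1).and_then(|j| chars.get(j));
            let next = chars.get(i + 1);
            let adjacent = [prev, next]
                .into_iter()
                .flatten()
                .any(|&ch| ch.is_ascii_digit() || ch == '*');
            if adjacent {
                continue; // drop the stray quote
            }
        }
        out.push(c);
    }
    out
}

fn main() {
    // The quote next to `*` is dropped; the marker quotes survive.
    assert_eq!(
        strip_stray_quotes("pyarrow>=4.0.*'; extra == 'pyarrow'"),
        "pyarrow>=4.0.*; extra == 'pyarrow'"
    );
}
```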
Note that #1477 also reports this error:
```
$ uv pip install 'requests>=2.30.*'
error: Failed to parse `requests>=2.30.*`
Caused by: Operator >= cannot be used with a wildcard version specifier
requests>=2.30.*
```
However, we specifically keep that error message since it's something
under the end user's control. And similarly for a dependency
specification in a `requirements.txt` file.
Fixes #1477
[Version Specifiers spec]:
https://packaging.python.org/en/latest/specifications/version-specifiers/
It turns out that /bin/ls can sometimes be a plain text file. For
example, in Rocky Linux 9:
```
$ cat /bin/ls
#!/usr/bin/coreutils --coreutils-prog-shebang=ls
```
However, `/bin/sh` is an ELF binary:
```
$ file /bin/sh
/bin/sh: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=7acbb41bf6f1b7d977f1b44675bf3ed213776835, for GNU/Linux 3.2.0, stripped
```
In a related issue (#1433), @zanieb fixed #1395 where, on NixOS,
`/bin/ls` doesn't exist but `/bin/sh` does. However, the fix attempts
`/bin/ls` first and only tries `/bin/sh` if `/bin/ls` doesn't exist. If
`/bin/ls` exists but isn't a valid ELF file, then the entire enterprise
gives up and `uv` fails to detect the version of `libc` that is
installed.
Instead of tweaking the logic to keep trying `/bin/ls` and then
`/bin/sh` after even if parsing `/bin/ls` fails, we just switch over to
reading `/bin/sh` only. It seems like a more fundamental thing to sniff
and likely less error prone.
We can adjust this heuristic as needed if it proves to be problematic.
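A minimal sketch of that kind of sniffing, i.e. checking for the four-byte ELF magic number (illustrative, not the actual detection code):
```rust
use std::fs::File;
use std::io::Read;
use std::path::Path;

/// Does the file start with the four-byte ELF magic number?
fn is_elf(path: &Path) -> std::io::Result<bool> {
    let mut magic = [0u8; 4];
    File::open(path)?.read_exact(&mut magic)?;
    Ok(&magic == b"\x7fELF")
}

fn main() -> std::io::Result<()> {
    // On Rocky Linux 9, `/bin/ls` is a coreutils shell script, but `/bin/sh`
    // is a real ELF binary, so sniffing `/bin/sh` is the safer choice.
    println!("/bin/sh is ELF: {}", is_elf(Path::new("/bin/sh"))?);
    Ok(())
}
```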
I tested this fix manually on Rocky Linux 9 via Docker:
```
$ cross b -r -p uv --target x86_64-unknown-linux-musl
$ cp target/x86_64-unknown-linux-musl/release/uv ~/astral/issues/uv/i1486/uv
$ docker run --rm -it --mount type=bind,src=/home/andrew/astral/issues/uv/i1486,dst=/host rockylinux:9 bash
[root@df2baa65d2f8 /]# /host/uv venv
Using Python 3.9.18 interpreter at /usr/bin/python3.9
Creating virtualenv at: .venv
[root@df2baa65d2f8 /]#
```
Fixes #1486, Ref #1433
I'm not sure if we should just switch to _always_ reading from sh
instead? I don't love that all these errors are strings, and if
`/bin/ls` exists but can't be parsed we still won't try `/bin/sh`. We
may want to address these things in the future.
Closes https://github.com/astral-sh/uv/issues/1395
## Summary
It looks like `devpi` might add an empty fragment (`#`) at the end of
the URL. We expect it to contain the hash; this just makes
empty-fragment map to "no hash".
Closes https://github.com/astral-sh/uv/issues/1441.
## Summary
If a distribution contains a `+`, it'll be HTML-escaped; so when we try
to identify the `#`, we'll split in the wrong location.
Closes https://github.com/astral-sh/uv/issues/1338.
Closes https://github.com/astral-sh/uv/issues/1388
Fixes incorrect handling of relative paths returned by indexes without
an explicit `<base>`.
`Url::join` will drop the last segment of a URL, e.g. `http://foo/bar` ->
`http://foo/baz`, if there is no trailing slash, but what we want is
`http://foo/bar/baz`. We don't add the trailing `/` in
`base_url_join_relative` because flat indexes are `http://foo/bar.html`
and we _want_ `bar.html` to be replaced.
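For illustration, `Url::join` from the `url` crate behaves like this:
```rust
use url::Url;

fn main() -> Result<(), url::ParseError> {
    // Without a trailing slash, the last segment is replaced...
    let base = Url::parse("http://foo/bar")?;
    assert_eq!(base.join("baz")?.as_str(), "http://foo/baz");

    // ...while a trailing slash keeps it and appends the relative path.
    let base = Url::parse("http://foo/bar/")?;
    assert_eq!(base.join("baz")?.as_str(), "http://foo/bar/baz");
    Ok(())
}
```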
## Summary
In a `requirements.txt` file, it turns out that the `-c` and `-r`
entries should be interpreted as relative to the file in which they're
declared, while the `-e` entries should be interpreted as relative to
the current working directory, no matter where they're defined.
Previously, we always used the current working directory; now, we use
the declaring file's directory for `-c` and `-r`.
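A minimal sketch of the rule (illustrative only, not the actual requirements parser):
```rust
use std::path::{Path, PathBuf};

/// `-r`/`-c` entries resolve against the directory of the declaring file,
/// while `-e` entries resolve against the current working directory.
fn resolve_entry(flag: &str, value: &str, declaring_file: &Path) -> PathBuf {
    match flag {
        "-r" | "-c" => declaring_file
            .parent()
            .unwrap_or_else(|| Path::new("."))
            .join(value),
        _ => PathBuf::from(value), // e.g. `-e`: left relative to the CWD
    }
}

fn main() {
    let file = Path::new("subdir/requirements.in");
    assert_eq!(
        resolve_entry("-r", "extra.txt", file),
        PathBuf::from("subdir/extra.txt")
    );
    assert_eq!(resolve_entry("-e", "../pkg", file), PathBuf::from("../pkg"));
}
```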
Closes https://github.com/astral-sh/uv/issues/1367.
Closes https://github.com/astral-sh/uv/issues/1416.
## Summary
Closes https://github.com/astral-sh/uv/issues/1402.
## Test Plan
Ran `cargo run pip install junos-eznc==2.6.5`, which still fails for me,
but fails identically to `pip` (and not on the `requires-python`):
```
/private/var/folders/nt/6gf2v7_s3k13zq_t3944rwz40000gn/T/.tmp7mxT9L/built-wheels-v0/pypi/ncclient/0.6.13/4vvPwmDC_CL2OUXd68Zqb/ncclient-0.6.13.tar.gz/versioneer.py:421: SyntaxWarning: invalid escape sequence '\s'
LONG_VERSION_PY['git'] = '''
Traceback (most recent call last):
File "<string>", line 10, in <module>
File "/private/var/folders/nt/6gf2v7_s3k13zq_t3944rwz40000gn/T/.tmplD5mMO/.venv/lib/python3.12/site-packages/setuptools/build_meta.py", line 366, in prepare_metadata_for_build_wheel
self.run_setup()
File "/private/var/folders/nt/6gf2v7_s3k13zq_t3944rwz40000gn/T/.tmplD5mMO/.venv/lib/python3.12/site-packages/setuptools/build_meta.py", line 480, in run_setup
super().run_setup(setup_script=setup_script)
File "/private/var/folders/nt/6gf2v7_s3k13zq_t3944rwz40000gn/T/.tmplD5mMO/.venv/lib/python3.12/site-packages/setuptools/build_meta.py", line 311, in run_setup
exec(code, locals())
File "<string>", line 45, in <module>
File "/private/var/folders/nt/6gf2v7_s3k13zq_t3944rwz40000gn/T/.tmp7mxT9L/built-wheels-v0/pypi/ncclient/0.6.13/4vvPwmDC_CL2OUXd68Zqb/ncclient-0.6.13.tar.gz/versioneer.py", line 1480, in get_version
return get_versions()["version"]
^^^^^^^^^^^^^^
File "/private/var/folders/nt/6gf2v7_s3k13zq_t3944rwz40000gn/T/.tmp7mxT9L/built-wheels-v0/pypi/ncclient/0.6.13/4vvPwmDC_CL2OUXd68Zqb/ncclient-0.6.13.tar.gz/versioneer.py", line 1412, in get_versions
cfg = get_config_from_root(root)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/private/var/folders/nt/6gf2v7_s3k13zq_t3944rwz40000gn/T/.tmp7mxT9L/built-wheels-v0/pypi/ncclient/0.6.13/4vvPwmDC_CL2OUXd68Zqb/ncclient-0.6.13.tar.gz/versioneer.py", line 342, in get_config_from_root
parser = configparser.SafeConfigParser()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'configparser' has no attribute 'SafeConfigParser'. Did you mean: 'RawConfigParser'?
```
This PR improves the error message for the problem described in
https://github.com/astral-sh/uv/issues/1376. The original output
duplicates the actual error message and includes lots of noise
(`DirEntry { inner: DirEntry(...) }`).
```
$ uv pip install hexdump==3.3
error: Failed to download and build: hexdump==3.3
Caused by: Failed to extract source distribution: The top level of the archive must only contain a list directory, but it contains: [DirEntry { inner: DirEntry("/home/robin/.cache/uv/.tmpgSvTCk/__main__.py") }, DirEntry { inner: DirEntry("/home/robin/.cache/uv/.tmpgSvTCk/hexdump.py") }, DirEntry { inner: DirEntry("/home/robin/.cache/uv/.tmpgSvTCk/data") }, DirEntry { inner: DirEntry("/home/robin/.cache/uv/.tmpgSvTCk/PKG-INFO") }, DirEntry { inner: DirEntry("/home/robin/.cache/uv/.tmpgSvTCk/setup.py") }, DirEntry { inner: DirEntry("/home/robin/.cache/uv/.tmpgSvTCk/README.txt") }]
Caused by: The top level of the archive must only contain a list directory, but it contains: [DirEntry { inner: DirEntry("/home/robin/.cache/uv/.tmpgSvTCk/__main__.py") }, DirEntry { inner: DirEntry("/home/robin/.cache/uv/.tmpgSvTCk/hexdump.py") }, DirEntry { inner: DirEntry("/home/robin/.cache/uv/.tmpgSvTCk/data") }, DirEntry { inner: DirEntry("/home/robin/.cache/uv/.tmpgSvTCk/PKG-INFO") }, DirEntry { inner: DirEntry("/home/robin/.cache/uv/.tmpgSvTCk/setup.py") }, DirEntry { inner: DirEntry("/home/robin/.cache/uv/.tmpgSvTCk/README.txt") }]
```
This PR removes the duplication and `DirEntry` internals so that the
error message is easier to grasp:
```
$ uv pip install hexdump==3.3
error: Failed to download and build: hexdump==3.3
Caused by: Failed to extract source distribution
Caused by: The top level of the archive must only contain a list directory, but it contains: ["__main__.py", "hexdump.py", "data", "PKG-INFO", "setup.py", "README.txt"]
```
It's a little picky about the value, but that seems okay.
```
❯ ./target/debug/uv pip install trio
Audited 1 package in 4ms
❯ UV_NO_CACHE=true ./target/debug/uv pip install trio
Audited 1 package in 50ms
```
Closes #1382
First, replace all usages in files in-place. I used my editor for this.
If someone wants to add a one-liner that'd be fun.
Then, update directory and file names:
```
# Run twice for nested directories
find . -type d -print0 | xargs -0 rename s/puffin/uv/g
find . -type d -print0 | xargs -0 rename s/puffin/uv/g
# Update files
find . -type f -print0 | xargs -0 rename s/puffin/uv/g
```
Then add all the files again
```
# Add all the files again
git add crates
git add python/uv
# This one needs a force-add
git add -f crates/uv-trampoline
```
Instead of dropping versions without a compatible distribution, we track
them as incompatibilities in the solver. This implementation follows
patterns established in https://github.com/astral-sh/puffin/pull/1290.
This required some significant refactoring of how we track incompatible
distributions. Notably:
- `Option<TagPriority>` is now `WheelCompatibility` which allows us to
track the reason a wheel is incompatible instead of just `None`.
- `Candidate` now has a `CandidateDist` with `Compatible` and
`Incompatible` variants instead of just `ResolvableDist`; candidates
are not strictly compatible anymore
- `ResolvableDist` was renamed to `CompatibleDist`
- `IncompatibleWheel` was given an ordering implementation so we can
track the "most compatible" (but still incompatible) wheel. This allows
us to collapse the reason a version cannot be used to a single
incompatibility.
- The filtering in the `VersionMap` is retained; we still only store one
incompatible wheel per version. This is sufficient for error reporting.
- A `TagCompatibility` type was added for tracking which part of a wheel
tag is incompatible
- `Candidate::validate_python` moved to
`PythonRequirement::validate_dist`
I am doing more refactoring in #1298 — I think a couple passes will be
necessary to clarify the relationships of these types.
Includes improved error message snapshots for multiple incompatible
Python tag types from #1285 — we should add more scenarios for coverage
of behavior when multiple tags with different levels are present.
Mostly throwing this up here as a discussion topic. Having something
like this is primarily useful for enabling use cases similar to `rye
add` where I want to use this currently. One can accomplish something
similar with `unearth` today or by abusing regular `pip install`:
```
$ ~/.rye/self/bin/pip install --no-deps --dry-run flask --report - -q | jq '.install[0].metadata | {name, version}'
{
"name": "Flask",
"version": "3.0.2"
}
```
Another option would be to have a `puffin resolve` command or similar
that works like `pip compile` without dependencies, takes the
requirements as arguments and returns a line for each resolution. That
would be a larger change.
This rolls back the optimization in the previous commit to be more
general. That is, instead of specializing the case of a range for a
singleton version, we make iteration over the distributions in a
`VersionMap` more explicitly lazy. Iteration now provides a `Version`
(like it did previously) and a _handle_ to a distribution that can be
turned into a `ResolvableDist`.
Doing things this way permits callers to iterate over the versions and
only materialize a distribution if they actually need one. In cases like
candidate selection, one can often rule out use of a distribution
through its version alone, and thus skip construction of that
distribution entirely.
In many cases, version ranges are actually just pins to a
specific and single version. And we can detect that statically
by examining the range. If we do have a range that is just one
version, then we can ask a `VersionMap` for just that version
instead of iterating over what's in the map until we find one
that satisfies the range.
I had tried this before making `VersionMap` construction lazy,
but it didn't seem to matter much. But it helps a lot more now
with a lazy `VersionMap` because it lets us avoid creating a
lot of distributions in memory that we won't ultimately use.
That is, a `PrioritizedDistribution` for a specific version of a
package is not actually materialized in memory until a corresponding
`VersionMap::get` call is made for that version. Similarly, iteration
lazily materializes distributions as it moves through the map. It
specifically does not materialize everything first.
The main reason why this is effective is that an
`OwnedArchive<SimpleMetadata>` represents a zero-copy (other than
reading the source file) version of `SimpleMetadata` that is really just
a `Vec<u8>` internally. The problem with `VersionMap` construction
previously is that it had to eagerly materialize a `SimpleMetadata` in
memory before anything else, which defeats a large part of the purpose
of zero-copy deserialization. By making more of `VersionMap`
construction itself lazy, we permit doing some parts of resolution
without necessarily fully deserializing a `SimpleMetadata` into memory.
Indeed, with this commit, in the warm cached case, a `SimpleMetadata` is
itself never materialized fully in memory.
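A minimal sketch of that shape, with illustrative types standing in for `VersionMap` and the archived metadata:
```rust
use std::cell::OnceCell;
use std::collections::BTreeMap;

/// Raw bytes are stored per version; a "distribution" is only materialized
/// (and cached) the first time a caller asks for it.
struct LazyVersionMap {
    raw: BTreeMap<String, (Vec<u8>, OnceCell<String>)>,
}

impl LazyVersionMap {
    /// Iterate over versions without materializing any distribution.
    fn versions<'a>(&'a self) -> impl Iterator<Item = &'a str> + 'a {
        self.raw.keys().map(String::as_str)
    }

    /// Materialize the distribution for a single version, on demand.
    fn get<'a>(&'a self, version: &str) -> Option<&'a str> {
        let (bytes, cell) = self.raw.get(version)?;
        Some(
            cell.get_or_init(|| String::from_utf8_lossy(bytes).into_owned())
                .as_str(),
        )
    }
}
```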
This does not completely and totally fully realize the benefits of
zero-copy deserialization. For example, we are likely still building
lots of distributions in memory that we don't actually need in some
cases. Perhaps in cases where no resolution exists, or when one needs to
iterate over large portions of the total versions published for a
package.
This commit adds some logging to candidate selection during
resolution. The idea with these logs is to get a signal on
how much "exploring" the resolver does in specific examples.
For example, these logs helped me realize that at least in
some cases, candidate selection was looking through a long list
of versions even when its range consisted of exactly one
version. We'll use this fact in a later commit.
This makes cloning and thus sharing across multiple threads much
cheaper. Since Tags is conceptually immutable once it is constructed,
this doesn't pose an issue and shouldn't introduce any additional
costs.
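A minimal sketch of the pattern (the `Tags` type here is a stand-in):
```rust
use std::sync::Arc;

/// `Tags` is immutable once constructed, so sharing it behind an `Arc` makes
/// "cloning" a cheap reference-count bump rather than a deep copy.
struct Tags(Vec<String>);

fn main() {
    let tags = Arc::new(Tags(vec!["cp312-cp312-manylinux_2_28_x86_64".into()]));
    let for_worker = Arc::clone(&tags); // no copy of the underlying tag list
    let handle = std::thread::spawn(move || for_worker.0.len());
    assert_eq!(handle.join().unwrap(), tags.0.len());
}
```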
This is really annoying, but the snapshots keep changing indentation
when updated.
I could not get insta to update them. So I added a print statement to
`main` and updated the snapshots, then removed the statement and updated
the snapshots again to force them all to refresh.
We use
- An arbitrary ABI hash: `MMMMMM` (six base64 characters)
- An unlikely Jython27 Python tag
For cases that are valid but are never going to be available during
tests.
See https://github.com/zanieb/packse/pull/109
Moves yanked version filtering from `VersionMap::from_metadata` to the
resolver and tracks it as a PubGrub unavailable incompatibility so
yanked versions are reflected in error messages.
e.g. before
```
╰─▶ Because only albatross<=0.1.0 is available and you require albatross>0.1.0,
we can conclude that the requirements are unsatisfiable.
```
after
```
╰─▶ Because only the following versions of albatross are available:
albatross<=0.1.0
albatross==1.0.0
and albatross==1.0.0 is unusable because it was yanked, we can conclude that albatross>0.1.0 cannot be used.
And because you require albatross>0.1.0, we can conclude that the requirements are unsatisfiable.
```
## Summary
This PR adds an `--offline` flag to Puffin that disables network
requests (implemented as a Reqwest middleware on our registry client).
When `--offline` is provided, we also allow the HTTP cache to return
stale data.
Closes #942.
Updates our `--no-binary` option and adds a `--only-binary` option for
compatibility with `pip` which uses `:all:`, `:none:` and `<name>` for
specifying packages.
This required adding support for `--only-binary <name>` into our
resolver; previously, it was only a boolean toggle.
Retains `--no-build`, which is equivalent to `--only-binary :all:`. This
is common enough for safety that I would prefer it is available without
pip's awkward `:all:` syntax.
---------
Co-authored-by: konsti <konstin@mailbox.org>
## Summary
For PEP 517 builds, the current working directory needs to be set to the
directory of the source distribution. It turns out that on Windows, if
you use a UNC path for the working directory, then relative paths are
interpreted relative to the root of the current drive
([source](https://www.fileside.app/blog/2023-03-17_windows-file-paths/#paths-relative-to-the-root-of-the-current-drive)).
So, when builds attempted to resolve relative paths, they always
errored...
This PR ensures that we remove the UNC prefix when setting the current
working directory.
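A minimal sketch of the idea, stripping the verbatim (`\\?\`) prefix from such a path (illustrative; the actual fix may rely on an existing crate for this):
```rust
/// Strip the Windows verbatim prefix (`\\?\`) so that relative paths resolved
/// against the working directory behave as expected.
fn strip_verbatim_prefix(path: &str) -> &str {
    path.strip_prefix(r"\\?\").unwrap_or(path)
}

fn main() {
    assert_eq!(
        strip_verbatim_prefix(r"\\?\C:\Users\ferris\attrs-23.2.0"),
        r"C:\Users\ferris\attrs-23.2.0"
    );
    assert_eq!(strip_verbatim_prefix(r"C:\plain\path"), r"C:\plain\path");
}
```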
Closes #1238.
## Test Plan
I tested this on my Windows machine by installing `ujson` with
`--no-binary ujson`. (I don't want to add that specific test, since it's
really slow to build.)
Contrary to our prior assumption, we can't reliably select a specific
patch version. With the deadsnakes PPA for example, `python3.12` is
installed into `PATH`, but `python3.12.1` isn't. Based on the assumption
(or rather, observation) that users have a single python patch version
per python minor version installed, generally the latest, we only check
if the installed patch version matches the selected patch version, if
any, instead of searching for one.
In the process, I deduplicated the python discovery logic.
Run `cargo test` on windows in CI, pulling the switch on tier 1 windows
support.
These changes make the bootstrap script virtually required for running
the tests. This gives us consistency between local runs and CI, but it also
locks our tests to python-build-standalone and an artificial `PATH`.
I've deleted the shell bootstrap script in favor of only the python one,
which also runs on windows. I've left the (sym)link creation of the
bootstrap in place, even though it is not used by the tests anymore.
I've reactivated the three tests that would previously stack overflow by
doubling their stack sizes. The stack overflows only happen in debug
mode, so this is neither a user facing problem nor an actual problem
with our code, and this workaround seems better than optimizing our code
for a case that the (release) compiler can optimize much better anyway.
The handling of patch versions will be fixed in a follow-up PR.
Closes #1160, closes #1161
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
In the process of making VersionMap construction lazy, I realized this
refactoring would be useful to me. It also simplifies a fair bit of case
analysis and does fewer BTreeMap lookups during construction. With that
said, this doesn't seem to matter for perf:
```
$ hyperfine -w10 --runs 50 \
"puffin-main pip compile --cache-dir ~/astral/tmp/cache-main ~/astral/tmp/reqs/home-assistant-reduced.in -o /dev/null" \
"puffin-test pip compile --cache-dir ~/astral/tmp/cache-test ~/astral/tmp/reqs/home-assistant-reduced.in -o /dev/null"
Benchmark 1: puffin-main pip compile --cache-dir ~/astral/tmp/cache-main ~/astral/tmp/reqs/home-assistant-reduced.in -o /dev/null
Time (mean ± σ): 146.8 ms ± 4.1 ms [User: 350.1 ms, System: 314.2 ms]
Range (min … max): 140.7 ms … 158.0 ms 50 runs
Benchmark 2: puffin-test pip compile --cache-dir ~/astral/tmp/cache-test ~/astral/tmp/reqs/home-assistant-reduced.in -o /dev/null
Time (mean ± σ): 146.8 ms ± 4.5 ms [User: 359.8 ms, System: 308.3 ms]
Range (min … max): 138.2 ms … 160.1 ms 50 runs
Summary
puffin-main pip compile --cache-dir ~/astral/tmp/cache-main ~/astral/tmp/reqs/home-assistant-reduced.in -o /dev/null ran
1.00 ± 0.04 times faster than puffin-test pip compile --cache-dir ~/astral/tmp/cache-test ~/astral/tmp/reqs/home-assistant-reduced.in -o /dev/null
```
But the simplification is still nice, and will decrease the delta
between what we have now and a lazy version map.
This PR reduces the stack sizes on windows a little further, using the
stack traces from stack overflows combined with looking at the type
sizes. Ultimately, it ignores the three remaining tests failing in debug
on windows due to stack overflows to unblock `cargo test` for windows on
CI.
444 tests run: 444 passed (39 slow), 1 skipped
We need to use the anstream print macros instead of the std print
macros, otherwise we risk wrong color behavior
(https://github.com/astral-sh/puffin/pull/1258#discussion_r1480428236).
Luckily, the `print_stderr` and `print_stdout` lints catch usages of the
std prints.
This PR switches over to anstream consistently and removes the now
redundant clippy lints. The lints should catch missing anstream usage in
the future.
Remove windows-only dependencies from the snapshot output using regex.
We now do the filtering entirely on our own, without relying on insta
settings.
435 tests run: 430 passed (30 slow), 5 failed, 1 skipped
There are no binary installers for the latest patch versions of cpython
for windows, and building them is hard. As an alternative, we download
python-build-standalone cpythons and put them into `<project
root>/bin`. On unix, we can symlink `pythonx.y.z` into this directory
and point `PUFFIN_PYTHON_PATH` to it. On windows, all pythons are called
`python.exe` and they don't like being linked. Instead, we add the path
to each directory containing a `python.exe` to `PUFFIN_PYTHON_PATH`,
similar to the regular `PATH`. The python discovery on windows was
extended to respect `PUFFIN_PYTHON_PATH` where needed.
These changes mean that we don't need to (sym)link pythons anymore and
could drop that part of the script.
435 tests run: 389 passed (21 slow), 46 failed, 1 skipped
## Summary
Open to other opinions here. We could just continue (and warn), prompt
the user with a confirmation, etc.
(The weird thing about those two options is we might need to validate
the command-line arguments _before_ we do that -- so you could get
errors for bad arguments, and then get a warning that your subcommand is
wrong. I can probably avoid that with more work if it feels like a
better outcome, though.)
Closes https://github.com/astral-sh/puffin/issues/1256.
## Summary
These add and remove dependencies from a `pyproject.toml` -- but they're
currently hidden, and don't match the rest of the workflow. We can
re-add them when the time is right.
Since unavailable packages with `--no-index` can be confusing when the
user does not also provide `--find-links`, we add a hint for this case.
Required some plumbing to get the required information to the
`NoSolution` error.
---------
Co-authored-by: konstin <konstin@mailbox.org>
(Please review this PR commit by commit.)
This PR closes an initial loop on zero-copy deserialization. That
is, it provides a way to get an `Archived<SimpleMetadata>` (spelled
`OwnedArchive<SimpleMetadata>` in the code) from a `CachedClient`. The
main benefit of zero-copy deserialization is that we can read bytes
from a file, cast those bytes to a structured representation without
cost, and then start using that type as any other Rust type. The
"catch" is that the structured representation is not the actual type
you started with, but the "archived" version of it.
In order to make all this work, we ended up needing to shave a rather
large yak: we had to re-implement HTTP cache semantics. Previously,
we were using the `http-cache-semantics` crate. While it does support
Serde, it doesn't support `rkyv`. Moreover, even simple support for
`rkyv` wouldn't be enough. What we actually want is for the HTTP cache
semantics to be implemented on the *archived* type so that we can
decide whether our cached response is stale or not without needing to
do a full deserialization into the unarchived type. This is why, in
this PR, you'll see `impl ArchivedCachePolicy { ... }` instead of
`impl CachePolicy { ... }`. (The `derive(rkyv::Archive)` macro
automatically introduces the `ArchivedCachePolicy` type into the
current namespace.)
Unfortunately, this PR does not fully realize the dream that is
zero-copy deserialization. Namely, while a `CachedClient` can now
provide an `OwnedArchive<SimpleMetadata>`, the rest of our code
doesn't really make use of it. Indeed, as soon as we go to build a
`VersionMap`, we eagerly convert our archived metadata into an owned
`SimpleMetadata` via deserialization (that *isn't* zero-copy). After
this change, a lot of the work now shifts to `rkyv` deserialization
and `VersionMap` construction. More precisely, the main thing we drop
here is `CachePolicy` deserialization (which is now truly zero-copy)
and the parsing of the MessagePack format for `SimpleMetadata`. But we
are still paying for deserialization. We're just paying for it in a
different place.
This PR does seem to bring a speed-up, but it is somewhat underwhelming.
My measurements have been pretty noisy, but I get a 1.1x speedup fairly
often:
```
$ hyperfine -w5 "puffin-main pip compile --cache-dir ~/astral/tmp/cache-main ~/astral/tmp/reqs/home-assistant-reduced.in -o /dev/null" "puffin-test pip compile --cache-dir ~/astral/tmp/cache-test ~/astral/tmp/reqs/home-assistant-reduced.in -o /dev/null" ; A kang
Benchmark 1: puffin-main pip compile --cache-dir ~/astral/tmp/cache-main ~/astral/tmp/reqs/home-assistant-reduced.in -o /dev/null
Time (mean ± σ): 164.4 ms ± 18.8 ms [User: 427.1 ms, System: 348.6 ms]
Range (min … max): 131.1 ms … 190.5 ms 18 runs
Benchmark 2: puffin-test pip compile --cache-dir ~/astral/tmp/cache-test ~/astral/tmp/reqs/home-assistant-reduced.in -o /dev/null
Time (mean ± σ): 148.3 ms ± 10.2 ms [User: 357.1 ms, System: 319.4 ms]
Range (min … max): 136.8 ms … 184.4 ms 19 runs
Summary
puffin-test pip compile --cache-dir ~/astral/tmp/cache-test ~/astral/tmp/reqs/home-assistant-reduced.in -o /dev/null ran
1.11 ± 0.15 times faster than puffin-main pip compile --cache-dir ~/astral/tmp/cache-main ~/astral/tmp/reqs/home-assistant-reduced.in -o /dev/null
```
One downside is that this does increase cache size (`rkyv`'s
serialization format is not as compact as MessagePack). On disk size
increases by about 1.8x for our `simple-v0` cache.
```
$ sort-filesize cache-main
4.0K cache-main/CACHEDIR.TAG
4.0K cache-main/.gitignore
8.0K cache-main/interpreter-v0
8.7M cache-main/wheels-v0
18M cache-main/archive-v0
59M cache-main/simple-v0
109M cache-main/built-wheels-v0
193M cache-main
193M total
$ sort-filesize cache-test
4.0K cache-test/CACHEDIR.TAG
4.0K cache-test/.gitignore
8.0K cache-test/interpreter-v0
8.7M cache-test/wheels-v0
18M cache-test/archive-v0
107M cache-test/simple-v0
109M cache-test/built-wheels-v0
242M cache-test
242M total
```
Also, while I initially intended to do a simplistic implementation of
HTTP cache semantics, I found that everything was somewhat
inter-connected. I could have written code that _specifically_ only worked
with the present behavior of PyPI, but then it would need to be special
cased and everything else would need to continue to use
`http-cache-semantics`. By implementing what we need based on what Puffin
actually is (which is still less than what `http-cache-semantics` does),
we can avoid special casing and use zero-copy deserialization for our
cache policy in _all_ cases.
Previously, whenever we encountered a missing package we would throw an
error without information about why the package was requested. This
meant that if a transitive dependency required a missing package, the
user would have no idea why it was even selected. Here, we track
`NotFound` and `NoIndex` errors as `NoVersions` incompatibilities with
an attached reason. Improves our test coverage for `--no-index` without
`--find-links`.
The
[snapshots](https://github.com/astral-sh/puffin/pull/1241/files#diff-3eea1658f165476252f1f061d0aa9f915aabdceafac21611cdf45019447f60ec)
show a nice improvement.
I think this will also enable backtracking to another version if some
version of a transitive dependency has a missing dependency. I'll write a
scenario for that next.
Requires https://github.com/zanieb/pubgrub/pull/22
Closes #884
e.g.
```
❯ cargo run -q -- pip compile --python-version 3.12 requirements.in
× No solution found when resolving dependencies:
╰─▶ Because the requested Python version (3.12) does not satisfy Python>=3.6,<3.10 and recommenders==1.0.0 depends on Python>=3.6,<3.9, we can conclude that recommenders==1.0.0 cannot be used.
And because only the following versions of recommenders are available:
recommenders<=0.7
recommenders==1.0.0
recommenders==1.1.0
recommenders==1.1.1
we can conclude that recommenders>0.7,<1.1.0 cannot be used. (1)
Because the requested Python version (3.12) does not satisfy Python>=3.6,<3.10 and recommenders>=1.1.0 depends on Python>=3.6,<3.10, we can conclude that recommenders>=1.1.0 cannot be used.
And because we know from (1) that recommenders>0.7,<1.1.0 cannot be used, we can conclude that recommenders>0.7 cannot be used.
And because you require recommenders>0.7, we can conclude that the requirements are unsatisfiable.
```
## Summary
Previously, we were blocking operations that could run in parallel. We
would send requests through our main requests channel but not yield, so
the receiver could only start processing requests much later than
necessary. We solve this by switching to the async
`tokio::sync::mpsc::channel`, where `send` is an async function that
yields.
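A minimal sketch of why the async channel helps (assuming tokio's `macros` and `rt-multi-thread` features; illustrative, not the resolver's actual channel wiring):
```rust
use tokio::sync::mpsc;

#[tokio::main]
async fn main() {
    let (tx, mut rx) = mpsc::channel::<u32>(16);
    let producer = tokio::spawn(async move {
        for request in 0..100u32 {
            // `send` is async: when the channel is full it yields to the
            // scheduler instead of blocking, so the receiver gets to run.
            tx.send(request).await.unwrap();
        }
    });
    while let Some(request) = rx.recv().await {
        let _ = request; // process the request
    }
    producer.await.unwrap();
}
```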
Due to the increased parallelism, cache deserialization and the
conversion from simple API responses to version maps became bottlenecks,
so I moved them to `spawn_blocking`. Together, these result in a 30-60%
speedup for larger warm-cache resolutions. Small cases such as black
already resolve in 5.7 ms on my machine, so there's no speedup to be
gained; refresh and no-cache runs were too noisy to get a signal from.
Note for the future: Revisit the bounded channel if we want to produce
requests from `process_request`, too, (this would be good for
prefetching) to avoid deadlocks.
## Details
We can look at the behavior change through the spans:
```
RUST_LOG=puffin=info TRACING_DURATIONS_FILE=target/traces/jupyter-warm-branch.ndjson cargo run --features tracing-durations-export --bin puffin-dev --profile profiling -- resolve jupyter 2> /dev/null
```
Below, you can see how on main, we have discrete phases: All (cached)
simple api requests in parallel, then all (cached) metadata requests in
parallel, repeat until done. The solver is mostly waiting until it has
its version map from the simple API query to be able to choose a
version. The main thread is blocked by process requests.
In the PR branch, the simple api requests succeeds much earlier,
allowing the solver to advance and also to schedule more prefetching.
Because of that, `parse_cache` and `from_metadata` became bottlenecks, so I
moved them off the main thread (green color, and their spans can now
overlap because they can run on multiple threads in parallel). The main
thread isn't blocked on `process_request` anymore, instead it has
frequent idle times. The spans are all much shorter, which indicates
that on main they could have finished much earlier, but a task didn't
yield so they weren't scheduled to finish (though I haven't dug deep
enough to understand the exact scheduling between the process request
stream and the solver here).
**main**

**PR**

## Benchmarks
```
$ hyperfine --warmup 3 "target/profiling/main-dev resolve jupyter" "target/profiling/branch-dev resolve jupyter"
Benchmark 1: target/profiling/main-dev resolve jupyter
Time (mean ± σ): 29.1 ms ± 0.7 ms [User: 22.9 ms, System: 11.1 ms]
Range (min … max): 27.7 ms … 32.2 ms 103 runs
Benchmark 2: target/profiling/branch-dev resolve jupyter
Time (mean ± σ): 18.8 ms ± 1.1 ms [User: 37.0 ms, System: 22.7 ms]
Range (min … max): 16.5 ms … 21.9 ms 154 runs
Summary
target/profiling/branch-dev resolve jupyter ran
1.55 ± 0.10 times faster than target/profiling/main-dev resolve jupyter
$ hyperfine --warmup 3 "target/profiling/main-dev resolve meine_stadt_transparent" "target/profiling/branch-dev resolve meine_stadt_transparent"
Benchmark 1: target/profiling/main-dev resolve meine_stadt_transparent
Time (mean ± σ): 37.8 ms ± 0.9 ms [User: 30.7 ms, System: 14.1 ms]
Range (min … max): 36.6 ms … 41.5 ms 79 runs
Benchmark 2: target/profiling/branch-dev resolve meine_stadt_transparent
Time (mean ± σ): 24.7 ms ± 1.5 ms [User: 47.0 ms, System: 39.3 ms]
Range (min … max): 21.5 ms … 28.7 ms 113 runs
Summary
target/profiling/branch-dev resolve meine_stadt_transparent ran
1.53 ± 0.10 times faster than target/profiling/main-dev resolve meine_stadt_transparent
$ hyperfine --warmup 3 "target/profiling/main pip compile scripts/requirements/home-assistant.in" "target/profiling/branch pip compile scripts/requirements/home-assistant.in"
Benchmark 1: target/profiling/main pip compile scripts/requirements/home-assistant.in
Time (mean ± σ): 229.0 ms ± 2.8 ms [User: 197.3 ms, System: 63.7 ms]
Range (min … max): 225.8 ms … 234.0 ms 13 runs
Benchmark 2: target/profiling/branch pip compile scripts/requirements/home-assistant.in
Time (mean ± σ): 91.4 ms ± 5.3 ms [User: 289.2 ms, System: 176.9 ms]
Range (min … max): 81.0 ms … 104.7 ms 32 runs
Summary
target/profiling/branch pip compile scripts/requirements/home-assistant.in ran
2.50 ± 0.15 times faster than target/profiling/main pip compile scripts/requirements/home-assistant.in
```
In the scenario tests, we want to make sure we're actually conforming to
the scenario's expectations, so we now have an extra assertion on
whether resolution failed or succeeded as well as that it includes the
given packages.
Closes #1112, closes #1030
We need more flexible filters than those `insta` offers, and `insta_cmd`
makes it impossible to plug in programmatic filters. At the same time we
use barely any of `insta_cmd`'s features. We can replace the subset we
need in about 50 loc.
Mostly a mechanical refactor to use the `puffin_snapshot!` and
`TestContext` infrastructure in the add, remove, venv and pip uninstall
tests, in preparation for adding programmatic windows testing filters.
There is only one remaining usage of `assert_cmd_snapshot!` now, in the
`puffin_snapshot!` macro.
Mostly a mechanical refactor to use the `puffin_snapshot!` and
`TestContext` infrastructure in the pip install and pip sync tests, in
preparation for adding programmatic windows testing filters.
Split out from the large test refactoring PR. Use `normalized_display`
in tests and two more thiserror derives to match snapshots and output,
and other small windows fixes.
## Summary
See: https://github.com/astral-sh/puffin/issues/1224
## Test Plan
Ran `python -m scripts.bench --puffin
scripts/requirements/compiled/jupyter.txt --min-runs 100 --benchmark
install-warm --verbose` several times, which failed eventually on `main`
but not on this branch.
Mostly a mechanical refactor to use the `puffin_snapshot!` and
`TestContext` infrastructure in the pip compile and pip install
scenarios, in preparation for adding programmatic windows testing
filters.
## Summary
Oops -- this was using a different cache key than the route above (this
is the wheel _metadata_ route vs. the wheel build route), so we were
saving and building source distributions twice in `pip install`.
I originally used Python 3.10, since 3.10 and 3.11 are by far the most
common (at least for [Ruff](https://pypistats.org/packages/ruff)). But
3.12 should give Python tools the most favorable benchmarks.
It turns out that the pattern I coded up for SimpleMetadataRaw is
generally useful when working with rkyv. This commit makes it generic by
supporting any type that implements rkyv's traits, and makes a few
simplifying assumptions by picking a concrete serializer, validator and
deserializer. In effect, this lets us own any archived value.
We also rejigger the API a little bit and double-down on
`OwnedArchive<A>` just being a owned wrapper for `Archived<A>`. Namely,
we implement `Deref` and turn its inherent methods into methods that
require fully qualified syntax. (As is standard for things that
implement `Deref` to avoid ambiguity with the deref target's methods.)
(This PR also makes a couple small simplifications to our custom rkyv
serializer since we no longer need to use it directly. We do still need
to name the type in trait bounds, so it has to be public.)
In preparation for the new windows handling, I want to introduce a
`TestContext` and `puffin_snapshot!` abstraction. This PR applies those
changes for pip-compile. My plan is to use those for all venv-based
integration tests and build the custom windows filters on top of
`puffin_snapshot!`.
## Summary
We have some flags in Puffin that enable us to opt-in to certain tests.
To date, they've been opt-in, so we've run our tests with
`--all-features`. This PR makes them opt-out, and we now run tests with
default features.
The main motivation here is that I want to get tests working for macOS
on CI, but for unknown reasons, macOS can't compile the PyO3 features at
the same time as everything else due to strange linker issues. By
avoiding `--all-features` for tests, we thus avoid unnecessarily
including features that we don't actually use in Puffin.
I verified that the exact same number of tests (439) are run before and
after this change. For users, the primary difference is that you now
need to specify `--no-default-features --features pypi --features
python` to avoid (e.g.) including the Git tests.
The `http-cache-semantics` crate is polymorphic on the types of requests
and responses it accepts. We had previously been explicitly converting
between `http` and `reqwest` types, but this isn't necessary. We can
provide impls of the traits in `http-cache-semantics` for `reqwest`'s
types (via a wrapper). This saves us from the awkward request/response
type conversions.
While this does clone the request, this is:
1. Not new. We were previously cloning the request to do the conversion.
2. An artifact (I believe) of http-cache-semantics API. (It kind of
seems like an API bug to me?)
There is also a little bit of messiness around inter-operating between
`http::uri::Uri` and `url::Url`. But overall it shouldn't be a big deal.
## Summary
This is an attempt to address https://github.com/astral-sh/puffin/pull/1163 by
removing the `WaitMap` and gaining more granular control over the values
that we hold over `await` boundaries.
## Summary
Like https://github.com/astral-sh/puffin/pull/1180, this PR adds logic
for `requirements.txt` parsing whereby if a requirement _looks like_ a
local requirements file or an editable directory, we prompt the user to
correct the error (typically, by adding `-r`).
Lacking windows-compatible aarch64 hardware, I cross-compiled the
trampoline from x86_64 linux to aarch64-pc-windows-msvc; I added the
instructions to the puffin-trampoline readme. With some testing on an
aarch64 windows machine, this should be sufficient to build working
win_arm64 tagged wheels.
i686-pc-windows-msvc is failing with an error:
```
error: linking with `lld-link` failed: exit status: 1
= note: lld-link: error: undefined symbol: __aulldiv
>>> referenced by libcompiler_builtins-2fb09dee087e9f64.rlib(compiler_builtins-2fb09dee087e9f64.compiler_builtins.597f0152646f1b8-cgu.0.rcgu.o):(compiler_builtins::int::specialized_div_rem::u128_div_rem::h06aed1e23a3f8f5c)
>>> referenced by libcompiler_builtins-2fb09dee087e9f64.rlib(compiler_builtins-2fb09dee087e9f64.compiler_builtins.597f0152646f1b8-cgu.0.rcgu.o):(compiler_builtins::int::specialized_div_rem::u128_div_rem::h06aed1e23a3f8f5c)
>>> referenced by libcompiler_builtins-2fb09dee087e9f64.rlib(compiler_builtins-2fb09dee087e9f64.compiler_builtins.597f0152646f1b8-cgu.0.rcgu.o):(compiler_builtins::int::specialized_div_rem::u128_div_rem::h06aed1e23a3f8f5c)
>>> referenced 4 more times
lld-link: error: undefined symbol: __aullrem
>>> referenced by libcompiler_builtins-2fb09dee087e9f64.rlib(compiler_builtins-2fb09dee087e9f64.compiler_builtins.597f0152646f1b8-cgu.0.rcgu.o):(compiler_builtins::int::specialized_div_rem::u128_div_rem::h06aed1e23a3f8f5c)
>>> referenced by libcompiler_builtins-2fb09dee087e9f64.rlib(compiler_builtins-2fb09dee087e9f64.compiler_builtins.597f0152646f1b8-cgu.0.rcgu.o):(compiler_builtins::int::specialized_div_rem::u128_div_rem::h06aed1e23a3f8f5c)
>>> referenced by libcompiler_builtins-2fb09dee087e9f64.rlib(compiler_builtins-2fb09dee087e9f64.compiler_builtins.597f0152646f1b8-cgu.0.rcgu.o):(compiler_builtins::int::specialized_div_rem::u128_div_rem::h06aed1e23a3f8f5c)
>>> referenced 4 more times
```
Instrument the main function as anchor span for checking overhead and
update tracing-durations-export to 0.2.0 for differentiating
blocking/non-blocking tasks.
Add a `jupyter.in` requirement since `pip install jupyter` is a common
operation. I tried `jupyterlab` too but there is no difference in
performance (1.00 ± 0.07).
Use `virtualenv` consistently, remove unused error variants and hint the
user towards installing missing python versions.
I didn't touch the Readme, but I replaced `virtualenv environment` with
`virtualenv` in the strings i found.
Fixes https://github.com/astral-sh/puffin/issues/1167
## Summary
See: https://github.com/astral-sh/puffin/issues/1181.
## Test Plan
```
❯ cargo run -- pip install packse@../../zanieb/packse
Finished dev [unoptimized + debuginfo] target(s) in 0.15s
Running `target/debug/puffin pip install 'packse@../../zanieb/packse'`
error: Distribution not found at: file:///Users/crmarsh/zanieb/packse
```
Make the test `compile_python_37` pass whether python 3.7 is installed
or not by muting the warning for a missing 3.7. The resolution error is
independent of whether 3.7 is installed or not.
## Summary
This PR adds support for `--find-links`, `--index-url`, and
`--extra-index-url` arguments when specified in a `requirements.txt`.
It's a mostly-straightforward change. The only uncertain piece is what
to do when multiple files include these flags, and/or when we include
them on the CLI and in other files.
In general:
- If _anything_ specifies `--no-index`, we respect it.
- We combine all `--extra-index-url` and `--find-links` across all
sources, since those are just vectors.
- If we see multiple `--index-url` in requirements files, we error.
- We respect the `--index-url` from the command line over any provided
in a requirements file.
(`pip-compile` seems to just pick one semi-arbitrarily when multiple are
provided.)
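A minimal sketch of those combination rules, with made-up types rather than uv's actual options structs:
```rust
/// Illustrative option set; not uv's actual types.
#[derive(Default)]
struct IndexOptions {
    no_index: bool,
    index_url: Option<String>,
    extra_index_urls: Vec<String>,
    find_links: Vec<String>,
}

fn combine(cli: IndexOptions, files: Vec<IndexOptions>) -> Result<IndexOptions, String> {
    let mut merged = cli;
    let mut file_index_url: Option<String> = None;
    for file in files {
        // If _anything_ specifies `--no-index`, respect it.
        merged.no_index |= file.no_index;
        // `--extra-index-url` and `--find-links` are simply accumulated.
        merged.extra_index_urls.extend(file.extra_index_urls);
        merged.find_links.extend(file.find_links);
        // Multiple `--index-url` values across requirements files is an error.
        if let Some(url) = file.index_url {
            if file_index_url.replace(url).is_some() {
                return Err("multiple `--index-url` values in requirements files".into());
            }
        }
    }
    // The command line wins over any `--index-url` from a requirements file.
    if merged.index_url.is_none() {
        merged.index_url = file_index_url;
    }
    Ok(merged)
}
```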
Closes https://github.com/astral-sh/puffin/issues/1143.
This adds what is effectively an owned wrapper around
`Archived<SimpleMetadata>`. Normally, an `Archived<SimpleMetadata>`
has to be used behind a pointer (since it has a lifetime
attached to its underlying byte buffer), but we create a
wrapper around it that owns the underlying buffer and provides
free access to the archived type.
This in effect creates an anchor point for the archived type
and lets us pass it around easily. (There has to be an anchor
point for it somewhere.)
An alternative to this approach would be to store it as a file
backed memory map. But in practice, we're dealing with small
files, and just reading them on to the heap is likely to be
faster. (Memory maps also have wildly different perf characteristics
across platforms.)
Note that this commit just defines the type. It isn't actually
used anywhere yet.
Less verbose span fields for `Dist`s by using the display impl and no
more min length in the tracing durations plot config for comparability
(we lose spans due to a speedup otherwise). Both wait points in the
solver loop are now instrumented so we can inspect what we're waiting
for to progress in the solver.
This PR migrates our source distribution downloads to unzip as we
stream, similar to our approach for wheels.
In my testing, this showed a consistent speedup (e.g., 6% here for a few
representative source distributions):
```text
❯ python -m scripts.bench --puffin-path ./target/release/main --puffin-path ./target/release/puffin --benchmark install-cold requirements.in
Benchmark 1: ./target/release/main (install-cold)
Time (mean ± σ): 1.503 s ± 0.039 s [User: 1.479 s, System: 0.537 s]
Range (min … max): 1.466 s … 1.605 s 10 runs
Benchmark 2: ./target/release/puffin (install-cold)
Time (mean ± σ): 1.421 s ± 0.024 s [User: 1.505 s, System: 0.593 s]
Range (min … max): 1.381 s … 1.454 s 10 runs
Summary
'./target/release/puffin (install-cold)' ran
1.06 ± 0.03 times faster than './target/release/main (install-cold)'
```
This PR adds initial support for [rkyv] to puffin. In particular,
the main aim here is to make puffin-client's `SimpleMetadata` type
possible to deserialize from a `&[u8]` without doing any copies. This
PR **stops short of actually doing that zero-copy deserialization**.
Instead, this PR is about adding the necessary trait impls to a variety
of types, along with a smattering of small refactorings to make rkyv
possible to use.
For those unfamiliar, rkyv works via the interplay of three traits:
`Archive`, `Serialize` and `Deserialize`. The usual flow of things is
this:
* Make a type `T` implement `Archive`, `Serialize` and `Deserialize`. rkyv
helpfully provides `derive` macros to make this pretty painless in most
cases.
* The process of implementing `Archive` for `T` *usually* creates an entirely
new distinct type within the same namespace. One can refer to this type
without naming it explicitly via `Archived<T>` (where `Archived` is a clever
type alias defined by rkyv).
* Serialization happens from `T` to (conceptually) a `Vec<u8>`. The
serialization format is specifically designed to reflect the in-memory layout
of `Archived<T>`. Notably, *not* `T`. But `Archived<T>`.
* One can then get an `Archived<T>` with no copying (albeit, we will likely
need to incur some cost for validation) from the previously created `&[u8]`.
This is quite literally [implemented as a pointer cast][rkyv-ptr-cast].
* The problem with an `Archived<T>` is that it isn't your `T`. It's something
else. And while there is limited interoperability between a `T` and an
`Archived<T>`, the main issue is that the surrounding code generally demands
a `T` and not an `Archived<T>`. **This is at the heart of the tension for
introducing zero-copy deserialization, and this is mostly an intrinsic
problem to the technique and not an rkyv-specific issue.** For this reason,
given an `Archived<T>`, one can get a `T` back via an explicit
deserialization step. This step is like any other kind of deserialization,
although generally faster since no real "parsing" is required. But it will
allocate and create all necessary objects.
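For illustration, here is that flow end to end, written against the rkyv 0.7-style API (the `Metadata` type is a stand-in, not uv's `SimpleMetadata`):
```rust
use rkyv::{Archive, Deserialize, Serialize};

/// Illustrative stand-in for `SimpleMetadata`.
#[derive(Archive, Serialize, Deserialize)]
struct Metadata {
    name: String,
    versions: Vec<String>,
}

fn main() {
    let value = Metadata {
        name: "flask".to_string(),
        versions: vec!["3.0.2".to_string()],
    };
    // Serialize into a buffer whose layout matches `Archived<Metadata>`.
    let bytes = rkyv::to_bytes::<_, 256>(&value).unwrap();
    // Reinterpret the buffer as the archived type without copying.
    // (Unchecked here only because we just produced these bytes ourselves.)
    let archived = unsafe { rkyv::archived_root::<Metadata>(&bytes) };
    assert_eq!(archived.name.as_str(), "flask");
    assert_eq!(archived.versions.len(), 1);
}
```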
This PR largely proceeds by deriving the three aforementioned traits
for `SimpleMetadata`. And, of course, all of its type dependencies. But
we stop there for now.
The main issue with carrying this work forward so that rkyv is actually
used to deserialize a `SimpleMetadata` is figuring out how to deal
with `DataWithCachePolicy` inside of the cached client. Ideally, this
type would itself have rkyv support, but adding it is difficult. The
main difficulty lay in the fact that its `CachePolicy` type is opaque,
not easily constructable and is internally the tip of the iceberg of
a rat's nest of types found in more crates such as `http`. While one
"dumb"-but-annoying approach would be to fork both of those crates
and add rkyv trait impls to all necessary types, it is my belief that
this is the wrong approach. What we'd *like* to do is not just use
rkyv to deserialize a `DataWithCachePolicy`, but we'd actually like to
get an `Archived<DataWithCachePolicy>` and make actual decisions used
the archived type directly. Doing that will require some work to make
`Archived<DataWithCachePolicy>` directly useful.
My suspicion is that, after doing the above, we may want to mush
forward with a similar approach for `SimpleMetadata`. That is, we want
`Archived<SimpleMetadata>` to be as useful as possible. But right
now, the structure of the code demands an eager conversion (and thus
deserialization) into a `SimpleMetadata` and then into a `VersionMap`.
Getting rid of that eagerness is, I think, the next step after dealing
with `DataWithCachePolicy` to unlock bigger wins here.
There are many commits in this PR, but most are tiny. I still encourage
review to happen commit-by-commit.
[rkyv]: https://rkyv.org/
[rkyv-ptr-cast]:
https://docs.rs/rkyv/latest/src/rkyv/util/mod.rs.html#63-68
## Summary
This is my guess as to the source of the resolver flake, based on
information and extensive debugging from @zanieb. In short, if we rely
on `self.index.packages` as a source of truth during error reporting, we
open ourselves up to a source of non-determinism, because we fetch
package metadata asynchronously in the background while we solve -- so
packages _could_ be included in or excluded from the index depending on
the order in which those requests are returned.
So, instead, we now track the set of packages that _were_ visited by the
solver. Visiting a package _requires_ that we wait for its metadata to
be available. By limiting analysis to those packages that were visited
during solving, we are faithfully representing the state of the solver
at the time of failure.
Closes #863
## Summary
We have this optimization in `wheel.rs`, in the installer, but it makes
a huge difference for zips with many small files:
```
Benchmarking file_reader/Django-5.0.1-py3-none-any.whl: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 74.2s, or reduce sample count to 10.
file_reader/Django-5.0.1-py3-none-any.whl
time: [751.63 ms 757.78 ms 764.27 ms]
change: [-1.0290% +0.0841% +1.2289%] (p = 0.88 > 0.05)
No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild
Benchmarking buffered_reader/Django-5.0.1-py3-none-any.whl: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 53.4s, or reduce sample count to 10.
buffered_reader/Django-5.0.1-py3-none-any.whl
time: [529.86 ms 536.44 ms 543.35 ms]
change: [+0.0293% +1.5543% +3.1426%] (p = 0.05 > 0.05)
No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild
```
That's almost 30% faster...
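The change is essentially wrapping the file in a buffered reader before handing it to the zip reader; a minimal sketch:
```rust
use std::fs::File;
use std::io::BufReader;
use std::path::Path;

/// Wrap the wheel file in a `BufReader` so the zip reader's many small reads
/// hit an in-memory buffer rather than going to the OS each time.
fn open_wheel(path: &Path) -> std::io::Result<BufReader<File>> {
    Ok(BufReader::new(File::open(path)?))
}
```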
In Rust, `fs::copy` automatically preserves permissions (see:
https://doc.rust-lang.org/std/fs/fn.copy.html).
Elsewhere, when copying from the zip archive out to the cache, we can
set permissions during file creation, rather than as a separate call.
Both of these should be slightly more efficient.
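A minimal sketch of setting the mode at creation time on Unix (illustrative, not the exact installer code):
```rust
#[cfg(unix)]
fn create_with_mode(path: &std::path::Path, mode: u32) -> std::io::Result<std::fs::File> {
    use std::os::unix::fs::OpenOptionsExt;

    // Set the permissions at creation time instead of making a separate
    // `set_permissions` call after the file is written.
    std::fs::OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .mode(mode)
        .open(path)
}
```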
## Summary
When we migrated to an "unzip while we stream" solution, we lost the
logic to set permissions on the extracted files, so executables in
wheels were no longer executable. It turns out this is a little tricky,
since the permissions metadata is in the central directory at the _end_
of the zip file, and the async ZIP reader explicitly stops iteration
once it hits the central directory. (Specifically, it goes 4 bytes into
the central directory, since it sees the 4-byte signature header and
then stops.)
So, to solve that, I've added a `CentralDirectoryReader` that continues
where that iterator left off. This required forking the async zip crate:
https://github.com/charliermarsh/rs-async-zip/pull/1. It took a lot of
fiddling but I'm quite confident in the code now, especially since the
async zip crate validates the signature kind on every read.
The central directory is typically quite small (even for the Zig wheel,
which is enormous, it's just around 1MB), so I don't expect this to have
a high cost.
Closes https://github.com/astral-sh/puffin/issues/1148.
## Summary
This ensures that we warn when redundant options are passed (like
`--allow-unsafe`, which is really common for forwards compatibility
since it's going to be the default in a future release), and errors when
known variants are passed that we _don't_ support (like
`--resolver=backtracking`).
Closes https://github.com/astral-sh/puffin/issues/1127.
In https://github.com/astral-sh/puffin/pull/1040 we broke the pip
compile scenarios designed to test failure when a required Python
version is not available — resolution succeeded because all of the
Python versions were available in CI. Following #1105 we have the
ability to isolate tests from Python versions available in the system.
Here, we limit the scenarios to only the Python version in the current
environment, restoring our ability to test the error messages.
With https://github.com/zanieb/packse/pull/95, we will be able to
specify scenarios with access to additional system Python versions. This
will allow us to include test coverage where resolution can succeed by
using a version available elsewhere on the system. See #1111 for this
follow-up.
Replaces https://github.com/astral-sh/puffin/pull/1068 and #1070 which
were more complicated than I wanted.
- Introduces a `.python-versions` file which defines the Python versions
needed for development
- Adds a Bash script at `scripts/bootstrap/install` which installs the
required Python versions from `python-build-standalone` to `./bin`
- Checks in a `versions.json` file with metadata about available
versions on each platform and a `fetch-version` Python script derived
from `rye` for updating the versions
- Updates CI to use these Python builds instead of the `setup-python`
action
- Updates to the latest packse scenarios which require Python 3.8+
instead of 3.7+ since we cannot use 3.7 anymore and includes new test
coverage of patch Python version requests
- Adds a `PUFFIN_PYTHON_PATH` variable to prevent lookup of system
Python versions for isolation during development
Tested on Linux (via CI) and macOS (locally) — presumably it will be a
bit more complicated to do proper Windows support.
## Background
In virtual environments, we want to install python programs as console
commands, e.g. `black .` over `python -m black .`. They may be called
[entrypoints](https://packaging.python.org/en/latest/specifications/entry-points/)
or scripts. For entrypoints, we're given a module name and function to
call in that module.
On Unix, we generate a minimal Python script launcher. Text files can be
made runnable on Unix by adding a shebang at their top, e.g.
```python
#!/usr/bin/env python
```
will make the operating system run the file with the current python
interpreter. A venv launcher for black in `/home/ferris/colorize/.venv`
(module name: `black`, function to call: `patched_main`) would look like
this:
```python
#!/home/ferris/colorize/.venv/bin/python
# -*- coding: utf-8 -*-
import re
import sys
from black import patched_main
if __name__ == "__main__":
sys.argv[0] = re.sub(r"(-script\.pyw|\.exe)?$", "", sys.argv[0])
sys.exit(patched_main())
```
On Windows, this doesn't work; we can only rely on launching `.exe`
files.
## Summary
We use posy's Rust implementation of a trampoline, which is based on
distlib's C++ implementation. We pre-build a minimal exe and append the
launcher script as a stored zip archive behind it. The exe will look for
the venv Python interpreter next to it and use it to execute the
appended script.
The changes in this PR make the `black` entrypoint work:
```powershell
cargo run -- venv .venv
cargo run -q -- pip install black
.\.venv\Scripts\black --version
```
Integration with our existing tests will be done in follow-up PRs.
## Implementation and Details
I've vendored the posy trampoline crate. It is a version of
https://github.com/njsmith/posy/pull/28 that has been formatted, renamed,
and slightly modified for embedding.
The posy launchers are smaller than the distlib launchers, 16K vs 106K
for black. Currently only `x86_64-pc-windows-msvc` is supported. The
crate requires a nightly compiler for its no-std binary size tricks.
On Windows, an application can be launched with a console or without one
(to create windows instead), which requires two different launchers. The
GUI launcher will subsequently use `pythonw.exe`, while the console
launcher uses `python.exe`.
## Summary
Rather than checking cache freshness in the install plan, it's a lot
simpler to have the install plan _never_ return cached data when the
refresh policy is in place, and then rely on the distribution database
to check for freshness. The original implementation didn't support this,
since the distribution database was rebuilding things too often. Now, it
rarely rebuilds (it's much better about this), so it seems conceptually
much simpler to split up the responsibilities like this.
## Summary
This ensures that (like Cargo) we don't suffer from
https://github.com/advisories/GHSA-r5w3-xm58-jv6j, by way of checking
known hosts when fetching via `libgit2`.
The implementation is taken from Cargo itself, modified to remove all
configuration, since we don't yet support configuration for known hosts,
etc.
Closes #285.
## Summary
Use a single error type in `puffin_distribution`, rather than two
confusingly similar types between `DistributionDatabase` and the source
distribution module.
Also removes the `#[from]` for IO errors and replaces with explicit
wrapping, which is verbose but removes a bunch of incorrect error
messages.
This PR changes the error type to be boxed internally so that it takes up
less space on the stack. This makes functions returning `Result<T,
Error>`, in particular, return something much smaller.
The specific thing that motivated this was Clippy lints firing when I
tried to refactor code in this crate.
I chose to achieve boxing by splitting the enum out into a separate
type, and then wiring up the necessary `From` impl to make error
conversions easy, and then making `Error` itself opaque. We could expose
the `Box`, but there isn't a ton of benefit in doing so because one
cannot pattern match through a `Box`.
This required using more explicit error conversions in several places.
And as a result, I was able to remove all `#[from]` attributes on
non-transparent error variants.
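A rough sketch of the shape this produces (type and variant names here are illustrative, not the actual `puffin_distribution` definitions):
```rust
use std::io;

/// Opaque wrapper: the error is a single pointer on the stack.
#[derive(Debug, thiserror::Error)]
#[error(transparent)]
pub struct Error(Box<ErrorKind>);

/// The actual variants live behind the box.
#[derive(Debug, thiserror::Error)]
pub enum ErrorKind {
    #[error("failed to read distribution")]
    Io(#[source] io::Error),
    // ... other variants
}

impl From<ErrorKind> for Error {
    fn from(kind: ErrorKind) -> Self {
        Self(Box::new(kind))
    }
}

// Call sites convert explicitly, e.g.:
//     std::fs::read(path).map_err(ErrorKind::Io)?;
```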
Our existing detection doesn't work on Windows, because we canonicalize
the interpreter path but not `info.sys_executable`, so the former
includes the UNC prefix, etc. This is cross-platform and gets at the
intent of the check.
## Summary
This PR adds a `NormalizedDisplay` trait that we can use for user-facing
paths, to strip the UNC prefix on Windows.
On other platforms, the implementation is a no-op (vs. `Display`).
I audited all usages of `.display()`, and changed any that were
user-facing, either via `println!` or `eprintln!`, or by way of being
included in error messages. I did _not_ change uses that were only in
tests or only went to tracing.
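A minimal sketch of the idea (the trait and method names here match the description above, but the exact signature and behavior are assumptions):
```rust
use std::path::Path;

/// Strip the Windows verbatim (`\\?\`) prefix that canonicalization adds;
/// a no-op on other platforms. Returns a `String` for simplicity.
pub trait NormalizedDisplay {
    fn normalized_display(&self) -> String;
}

impl NormalizedDisplay for Path {
    fn normalized_display(&self) -> String {
        let display = self.display().to_string();
        if !cfg!(windows) {
            return display;
        }
        match display.strip_prefix(r"\\?\") {
            Some(stripped) => stripped.to_owned(),
            None => display,
        }
    }
}
```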
Closes https://github.com/astral-sh/puffin/issues/1084.
Windows uses `;` instead of `:` to separate `PATH` entries. This pull
request switches from manually using `:` to the `std::env` functions.
This fixes
```
puffin pip install -e scripts/editable-installs/maturin_editable
```
on Windows.
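For illustration, the standard library handles the separator for us (a sketch; the function here is hypothetical):
```rust
use std::env;
use std::ffi::OsString;
use std::path::PathBuf;

// Prepend an entry to PATH without hard-coding `:` or `;`:
// `split_paths`/`join_paths` use the platform's separator.
fn prepend_to_path(entry: PathBuf) -> Result<OsString, env::JoinPathsError> {
    let existing = env::var_os("PATH").unwrap_or_default();
    let paths = std::iter::once(entry).chain(env::split_paths(&existing));
    env::join_paths(paths)
}
```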
## Summary
When we unzip wheels in the cache, we write the directories out to an
`archive-v0` bucket, and then symlink into that bucket from the
`wheels-v0` and `built-wheels-v0` buckets.
On Windows, symlinks are not well supported. Specifically, they need to
be explicitly enabled by the user. So, instead of symlinks, we now use
junctions, which are well-supported on Windows, and allow you to
(effectively) symlink a directory to another directory. This PR
implements said junction support, which gets the core installer working
on Windows.
In the past, we also used symlinks to implement another primitive: we
wanted to be able to replace a directory "atomically" (I put
"atomically" in quotes because I don't know if it's actually a
guaranteed atomic operation), in case someone was trying to use the
directory while we were replacing it (as opposed to deleting the
directory, then moving it into place).
On Windows, it doesn't appear to be possible to atomically replace a
junction. So instead, I'm using a new design, whereby the cache always
returns canonicalized paths. We know these canonicalized paths are
unique and won't be replaced, so they're safe for writers to rely on. In
general, when we write new data to the cache, we now return the
canonicalized path. When we read from the cache, and try to identify
(e.g.) the set of wheels available to us, we canonicalize the links
immediately and consider them non-existent if that operation fails.
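A rough sketch of the directory-link primitive this implies (illustrative only; the Windows arm assumes the third-party `junction` crate rather than whatever the PR actually uses):
```rust
use std::io;
use std::path::Path;

// Link a directory into place: a symlink on Unix, a junction on Windows
// (junctions don't require the user to enable symlink support).
fn link_dir(src: &Path, dst: &Path) -> io::Result<()> {
    #[cfg(unix)]
    {
        std::os::unix::fs::symlink(src, dst)
    }
    #[cfg(windows)]
    {
        junction::create(src, dst)
    }
}
```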
Closes #1085.
---------
Co-authored-by: konstin <konstin@mailbox.org>
Requires https://github.com/zanieb/pubgrub/pull/20
In short, `UnusableDependencies` can be generalized into `Unavailable`,
which encompasses incompatibilities where a package range is unusable for
some inherent reason, as well as when its dependencies are unusable. We
can eventually use this to track more incompatibilities in
the solver. I made the reason string required because I can't see a case
where we should leave it out.
Additionally, this improves the display of conflicts in the root
requirements.
## Summary
It turns out this is significantly faster when reading (e.g.) _just_ the
`METADATA` file from a zipped wheel.
I audited other `File::open` usages, and everything else seems to be
using a buffered reader already (directly, or in whatever third-party
crate it's passed to) _or_ is read immediately in full.
See the criterion benchmark:
```
file_reader/numpy-1.26.3-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
time: [6.9618 ms 6.9664 ms 6.9713 ms]
Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild
file_reader/flask-3.0.1-py3-none-any.whl
time: [237.50 µs 238.25 µs 239.13 µs]
Found 7 outliers among 100 measurements (7.00%)
3 (3.00%) high mild
4 (4.00%) high severe
buffered_reader/numpy-1.26.3-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
time: [648.92 µs 653.85 µs 660.09 µs]
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
buffered_reader/flask-3.0.1-py3-none-any.whl
time: [39.578 µs 39.712 µs 39.869 µs]
Found 8 outliers among 100 measurements (8.00%)
3 (3.00%) high mild
5 (5.00%) high severe
```
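For reference, the change amounts to wrapping the file handle before handing it to the zip reader; a minimal sketch (using the synchronous `zip` crate purely for illustration):
```rust
use std::fs::File;
use std::io::{BufReader, Read};
use std::path::Path;

// Buffer reads so that pulling just `METADATA` out of a wheel doesn't
// issue a syscall per small read.
fn read_metadata(wheel: &Path) -> anyhow::Result<String> {
    let reader = BufReader::new(File::open(wheel)?);
    let mut archive = zip::ZipArchive::new(reader)?;
    let mut contents = String::new();
    for index in 0..archive.len() {
        let mut entry = archive.by_index(index)?;
        // Wheel metadata lives at `<name>-<version>.dist-info/METADATA`.
        if entry.name().ends_with(".dist-info/METADATA") {
            entry.read_to_string(&mut contents)?;
            break;
        }
    }
    Ok(contents)
}
```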
Follow-up to https://github.com/astral-sh/puffin/pull/1040 adding a
user-facing warning when we cannot build with their requested version.
e.g.
```
❯ cargo run -- pip compile requirements.in --python-version 3.11.4 --no-build
Resolved 8 packages in 483ms
❯ cargo run -- pip compile requirements.in --python-version 3.11.4
warning: The requested Python version 3.11.4 is not available; 3.11.7 will be used to build dependencies instead.
Resolved 8 packages in 71ms
❯ cargo run -- pip compile requirements.in --python-version 3.11
Resolved 8 packages in 71ms
```
## Summary
This PR uses `ctime` consistently on Unix as a more conservative
approach to change detection. It also ensures that our timestamp
abstraction is entirely internal, so we can change the representation
and logic easily across the codebase in the future.
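As a sketch of what the Unix side might look like (an assumed helper; the real timestamp abstraction is internal):
```rust
use std::io;
use std::path::Path;

// Use the inode change time (ctime) as the change token on Unix: it is
// bumped on metadata changes too, so it is more conservative than mtime.
#[cfg(unix)]
fn change_token(path: &Path) -> io::Result<(i64, i64)> {
    use std::os::unix::fs::MetadataExt;
    let metadata = std::fs::metadata(path)?;
    Ok((metadata.ctime(), metadata.ctime_nsec()))
}
```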
## Summary
First batch of changes for windows support. Notable changes:
* Fixes all compile errors and adds Windows-specific paths.
* Working venv creation on Windows, both from a base interpreter and
from a venv. This requires querying `stdlib` from the sysconfig paths to
find the launcher.
* Basic URL/path conversion handling for Windows.
* `if cfg!(...)` instead of `#[cfg()]`. This should make it easier to
keep everything compiling across platforms.
## Outlook
Test summary: 402 tests run: 299 passed (15 slow), 103 failed, 1 skipped
There are various reasons for the remaining test failures:
* Windows-specific colorama and tzdata dependencies that change the
snapshot slightly. This is by far the biggest batch.
* Some url-path handling issues. I fixed some in the PR, some remain.
* Lack of the latest Python patch versions for older Pythons on my
machine, since there are no builds for Windows and we need to register
them in the registry for them to be picked up by `py --list-paths` (CC
@zanieb RE #1070).
* Lack of entrypoint launchers.
* ... likely more
Extends #1029
Closes https://github.com/astral-sh/puffin/issues/1038
Instead of always using the current Python version for builds when a
target version is provided, we will do our best to use a compatible
Python version for builds.
Removes behavior where Python versions without patch versions were
always assumed to be the latest known patch version (previously
discussed in https://github.com/astral-sh/puffin/pull/534). While this
was convenient for resolutions which include packages which require
minimum patch versions e.g. `requires-python=">=3.7.4"`, it conflicts
with the idea that the target Python version you provide is the
_minimum_ compatible version. Additionally, it complicates interpreter
lookup as we cannot tell if the user has asked for that specific patch
version or not.
On Windows, `python3.9` and `python3.11` are not in `PATH`. Instead, we
should pass only the Python version to `puffin venv -p` in packse
scenarios (#1039).
This PR replaces a few uses of hash maps/sets with btree maps/sets and
index maps/sets. This has the benefit of guaranteeing a deterministic
order of iteration.
I made these changes as part of looking into a flaky test.
Unfortunately, I'm not optimistic that anything here will actually fix
the flaky test, since I don't believe anything was actually dependent
on the order of iteration.
## Summary
This PR is an alternative approach to #949 which should be much safer.
As in #949, we add a `Refresh` policy to the cache. However, instead of
deleting entries from the cache the first time we read them, we now check
whether the entry is sufficiently new (created after the start of the
command) when the refresh policy applies. If the entry is stale, then we
avoid reading it and continue onward, relying on the cache to
appropriately overwrite based on "new" data. (This relies on the
preceding PRs, which ensure the cache is append-only, and ensure that we
can atomically overwrite.)
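The freshness rule itself is simple; roughly (an illustrative sketch, using mtime as a stand-in for the entry's timestamp):
```rust
use std::io;
use std::path::Path;
use std::time::SystemTime;

// Under a refresh policy, a cache entry only counts as fresh if it was
// written after the current command started.
struct Refresh {
    command_start: SystemTime,
}

impl Refresh {
    fn is_fresh(&self, entry: &Path) -> io::Result<bool> {
        let timestamp = std::fs::metadata(entry)?.modified()?;
        Ok(timestamp >= self.command_start)
    }
}
```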
Unfortunately, there are just a lot of paths through the cache, and
different data is handled with different policies, so I really had to go
through and consider the "right" behavior for each case. For example,
the HTTP requests can use `max-age=0, must-revalidate`. But for the
routes that are based on filesystem modification, we need to do
something slightly different.
Closes #945.
## Summary
This PR ensures that we store HTTP caching information for wheels.
Previously, we only stored these for source distributions. This will be
helpful for refresh, since we can avoid re-downloading wheels that are
unchanged per HTTP caching semantics.
There should be zero performance hit here for warm installs, and only an
extremely small hit for cold installs (writing the HTTP cache data to
disk). The hyperfine benchmarks reflect this.
## Summary
If you send a revalidation request to a resource that returns an
`immutable` directive, the server apparently returns a 200 instead of a
304? In other words, the server can ignore the revalidation request.
This PR adds handling on top of the HTTP cache semantics to respect
immutable resources, which is especially useful since all PyPI files are
immutable.
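Conceptually, the added handling boils down to a check like this before revalidating (a sketch; the real logic sits on top of the HTTP cache semantics layer):
```rust
// Treat a cached response as permanently fresh if its Cache-Control header
// carried the `immutable` directive, instead of sending a revalidation
// request that the server may answer with a full 200.
fn is_immutable(cache_control: &str) -> bool {
    cache_control
        .split(',')
        .any(|directive| directive.trim().eq_ignore_ascii_case("immutable"))
}
```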
## Summary
One problem we have in the cache today is that we can't overwrite
entries atomically, because we store unzipped _directories_ in the cache
(which makes installation _much_ faster than storing zipped
directories). So, if you ignore the existing contents of the cache when
writing, you might run into an error, because you might attempt to write
a directory where a directory already exists.
This is especially annoying for cache refresh, because in order to
refresh the cache, we have to purge it (i.e., delete a bunch of stuff),
which is also highly unsafe if Puffin is running across multiple threads
or multiple processes.
The solution I'm proposing here is that whenever we persist a
_directory_ to the cache, we persist it to a special "archive" bucket.
Then, within the other buckets, directory entries are actually symlinks
into that "archive" bucket. With symlinks, we can atomically replace,
which means we can easily overwrite cache entries without having to
delete from the cache.
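The replacement primitive looks roughly like this on Unix (a sketch with an assumed helper name; handling of a pre-existing temporary link is omitted):
```rust
use std::io;
use std::path::Path;

// Atomically repoint `link` at `target`: create a temporary symlink, then
// rename it over the existing link in a single step.
#[cfg(unix)]
fn replace_symlink(target: &Path, link: &Path) -> io::Result<()> {
    let staging = link.with_extension("tmp");
    std::os::unix::fs::symlink(target, &staging)?;
    std::fs::rename(&staging, link)
}
```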
The main downside is that we'll now accumulate dangling entries in the
"archive" bucket, and so we'll need to implement some form of garbage
collection to ensure that we remove entries with no symlinks. Another
downside is that cache reads and writes will be a bit slower, since we
need to deal with creating and resolving these symlinks.
As an example... after this change, the cache entry for this unzipped
wheel is actually a symlink:

Then, within the archive directory, we actually have two unique entries
(since I intentionally ran the command twice to ensure overwrites were
safe):

Per https://apenwarr.ca/log/20181113, `ctime` should be a lot more
conservative, and should detect things like the issue we see with the
python-build-standalone builds, where the `mtime` is identical across
builds.
On Windows, I'm just using `last_write_time`. But we should probably add
`volume_serial_number` and other attributes via
[`winapi_util`](https://docs.rs/winapi-util/latest/winapi_util/index.html).
## Summary
This is a refactor of the source distribution cache that again aims to
make the cache purely additive. Instead of deleting all built wheels
when the cache gets invalidated (e.g., because the source distribution
changed on PyPI or something), we now treat each invalidation as its own
cache directory. The manifest inside of the source distribution
directory now becomes a pointer to the "latest" version of the source
distribution cache.
Here's a visual example:

With this change, we avoid deleting built distributions that might be
relied on elsewhere and maintain our invariant that the cache is purely
additive. The cost is that we now preserve stale wheels, but we should
add a garbage collection mechanism to deal with that.
## Summary
This PR gets rid of the manifest that we store for source distributions.
Historically, that manifest included the source distribution metadata,
plus a list of built wheels.
The problem with the manifest is that it duplicates state, since we now
have to look at both the manifest and the filesystem to understand the
cache state. Instead, I think we should treat the cache as the source of
truth, and get rid of the duplicated state in the manifest.
Now, we store the manifest (which is merely used to check for cache
freshness -- in future PRs, I will repurpose it though, so I left it
around), then the distribution metadata as its own file, then any
distributions in the same directory. When we want to see if there are
any valid distributions, we `readdir` on the directory. This is also
much more consistent with how the install plan works.
Mirroring `virtualenv -p` and driven by the lack of `pythonx.y` in
`PATH` on Windows, this PR adds `-p x.y` support to `puffin venv` (first
commit).
Supported formats:
* NEW: `-p 3.10` searches for an installed Python 3.10 (looking for
`python3.10` on Linux/macOS). Specifying a patch version is not supported.
* `-p python3.10` or `-p python.exe` looks for a binary in `PATH`
* `-p /home/ferris/.local/bin/python3.10` uses this exact Python
In the second commit, we add Python interpreter search on Windows using
`py --list-paths`. On Windows, all Pythons are called `python.exe`, so the
Unix trick of looking for `python{}.{}` in `PATH` doesn't work. Instead,
we ask the Python launcher for Windows to tell us about all installed
interpreters. We should eventually migrate this to [PEP
514](https://peps.python.org/pep-0514/) by reading the registry entries
ourselves.
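As a rough illustration of how the `-p` forms above can be distinguished (hypothetical types and function, not the actual puffin code):
```rust
use std::path::{PathBuf, MAIN_SEPARATOR};

/// Hypothetical request kinds mirroring the supported `-p` forms.
enum PythonRequest {
    /// `-p 3.10`: search for an installed Python 3.10.
    Version(u8, u8),
    /// `-p python3.10` / `-p python.exe`: look up a binary in `PATH`.
    ExecutableName(String),
    /// `-p /home/ferris/.local/bin/python3.10`: use this exact interpreter.
    ExactPath(PathBuf),
}

fn parse_python_request(arg: &str) -> PythonRequest {
    // `x.y` where both parts are plain numbers is a version request.
    if let Some((major, minor)) = arg
        .split_once('.')
        .and_then(|(a, b)| Some((a.parse::<u8>().ok()?, b.parse::<u8>().ok()?)))
    {
        return PythonRequest::Version(major, minor);
    }
    // Anything with a path separator is an explicit interpreter path.
    if arg.contains(MAIN_SEPARATOR) || arg.contains('/') {
        return PythonRequest::ExactPath(PathBuf::from(arg));
    }
    PythonRequest::ExecutableName(arg.to_string())
}
```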
Extends #1048, providing a more general interface that I think should be
standard.
Allows forcing colors to be on _or_ off. e.g. `NO_COLOR=1 pip install
pip-tools --color always` would be colored.
Hides the `--no-color` option as it only exists for compatibility (and
seems better than throwing an error when people assume it will exist).
Has a nice side-effect of documenting our coloring behaviors e.g.
```
--color <COLOR>
Control colors in output
[default: auto]
Possible values:
- auto: Enables colored output only when the output is going to a terminal or TTY with support
- always: Enables colored output regardless of the detected environment
- never: Disables colored output
```