Commit Graph

640 Commits

Author SHA1 Message Date
Charlie Marsh 6f055ecf3b
Remove existing built wheels when building source distributions (#559)
This PR modifies source distribution building to replace any
existing targets after building the new wheel. In some cases, an
existing target may indicate a bug, so we warn.
It's partially a workaround for some (but not all) of the errors in
https://github.com/astral-sh/puffin/issues/554.
2023-12-05 12:45:24 -05:00
Charlie Marsh f99e3560e8
Avoid returning zipped wheels from registry and URL indexes (#558)
## Summary

This is hard to reproduce, but if you run a long installation process
that errors part-way through, you can end up with zipped wheels in the
`Wheels` cache, which is intended to contain only unzipped wheels. This
PR avoids returning those entries from the registry, since they would
otherwise lead to errors downstream when we treat them as directories.
2023-12-05 09:53:45 +01:00
Charlie Marsh 2d1e19e474
Allow yanked versions when specified via `==` (#561)
## Summary

This enables users to rely on yanked versions via explicit `==` markers,
which is necessary in some projects (and, in my opinion, reasonable).

Closes #551.
2023-12-05 09:44:06 +01:00
Charlie Marsh c3a917bbf6
Support granular target Python versions (#534)
## Summary

Allows, e.g., `--python-version 3.7` or `--python-version 3.7.9`. This
was also feedback I received in the original PR.

Closes https://github.com/astral-sh/puffin/issues/533.
2023-12-05 02:38:49 +00:00
Charlie Marsh 06ee321e9c
Use `u64` instead of `u32` in `Version` fields (#555)
It turns out that it's not uncommon to use timestamps as patch versions
(e.g., `20230628214621`). I believe this is the ISO 8601 "basic format".
These can't be represented by a `u32`, so I think it makes sense to just
bump to `u64` to remove this limitation.
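
To illustrate the limitation, here is a minimal sketch showing that a timestamp-style segment overflows `u32` but parses as `u64`. The helper names are hypothetical and are not the actual `Version` parsing code:

```rust
/// Parse a single numeric version segment as `u32` (hypothetical helper).
fn parse_segment_u32(segment: &str) -> Option<u32> {
    segment.parse().ok()
}

/// Parse a single numeric version segment as `u64` (hypothetical helper).
fn parse_segment_u64(segment: &str) -> Option<u64> {
    segment.parse().ok()
}
```

With these helpers, `parse_segment_u32("20230628214621")` yields `None`, because the value exceeds `u32::MAX` (4294967295), while `parse_segment_u64` accepts it.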
2023-12-04 21:00:55 -05:00
Charlie Marsh af13c83177
Overwrite individual files when reflinking (#556)
Similar to #516, but for individual files.

## Test Plan

Ran:

```sh
cargo run -p puffin-cli -- pip-uninstall plaid-python
mkdir -p /Users/crmarsh/workspace/puffin/.venv/lib/python3.10/site-packages/tests
echo "x=1" > /Users/crmarsh/workspace/puffin/.venv/lib/python3.10/site-packages/tests/__init__.py
cargo run -p puffin-cli -- pip-sync requirements.txt --no-cache --verbose
```
2023-12-04 23:59:35 +00:00
Charlie Marsh 5fddcc362e
Improve error messages for 'file not found' case (#550)
Right now, if you specify a wheel that doesn't exist, you get: `no such
file or directory` with no additional context. Oops!
2023-12-04 22:01:51 +00:00
Charlie Marsh 4e05cd5dfd
Show build progress for path source distributions (#549) 2023-12-04 20:56:56 +00:00
konsti d5abd33813
Use atomic writes for the cache consistently (#546)
Ensure we're using atomic writes everywhere in our cache to avoid broken
cache records and errors with parallel puffin actions
(https://github.com/astral-sh/puffin/pull/544#issuecomment-1838841581).

All JSON files that are written to the cache are written atomically, and
built wheels are written to a temp dir and then moved atomically. I
didn't touch venv creation though; I don't think that's worth it, since
Python does not support atomic package installation by design.
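
The write-then-rename pattern can be sketched with the standard library alone. This is an illustration under assumptions, not the code from this PR: `write_atomic` and the temp-file naming scheme are hypothetical, and the sketch does not `fsync`.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Write `data` to `path` so that readers see either the old or the new
/// contents, never a partially written file (hypothetical helper).
fn write_atomic(path: &Path, data: &[u8]) -> io::Result<()> {
    let file_name = path
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "path has no file name"))?;
    // Assumed naming scheme: a sibling temp file, unique per process.
    let mut tmp_name = file_name.to_os_string();
    tmp_name.push(format!(".{}.tmp", std::process::id()));
    let tmp_path = path.with_file_name(tmp_name);

    // Write the full contents to the temp file first...
    fs::write(&tmp_path, data)?;
    // ...then move it into place; a rename within one directory replaces the target.
    fs::rename(&tmp_path, path)
}
```

Keeping the temp file in the same directory as the target matters, since a rename across filesystems is not atomic.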
2023-12-04 12:02:01 -05:00
konsti e9c9e9718e
Use version in `RegistryIndex` (#543)
When building up the `RegistryIndex`, index by both package name and
version to fix #537.
2023-12-04 17:26:14 +01:00
Charlie Marsh 95b8316023
Preserve seed packages for non-Puffin-created virtualenvs (#535)
## Summary

This PR modifies the install plan to avoid removing seed packages if the
virtual environment was created by anyone other than Puffin.

Closes https://github.com/astral-sh/puffin/issues/414.

## Test Plan

- Ran: `virtualenv .venv`.
- Ran: `cargo run -p puffin-cli -- pip-sync
scripts/benchmarks/requirements.txt --verbose --no-cache`.
- Verified that `pip` et al were not removed, and that the logging
included a message about preserving seed packages.
2023-12-04 09:31:00 -05:00
konsti 77b3921b7a
Fix cargo warning (#542)
It's odd that `dev-dependencies` don't default to `dependencies` for
workspace versions.
2023-12-04 11:10:36 +00:00
Charlie Marsh 0ac4254a7e
Enforce target and interpreter `requires-python` versions (#532)
## Summary

This PR modifies the behavior of our `--python-version` override in two
ways:

1. First, we always use the "real" interpreter in the source
distribution builder. I think this is correct. We don't need to use the
fake markers for recursive builds, because all we care about is the
top-level resolution, and we already assume that a single source
distribution will always return the same metadata regardless of its
build environment.
2. Second, we require that source distributions are compatible with
_both_ the "real" interpreter version and the marker environment. This
ensures that we don't try to build source distributions that are
compatible with our interpreter, but incompatible with the target
version.

Closes https://github.com/astral-sh/puffin/issues/407.
2023-12-04 11:27:36 +01:00
Charlie Marsh d96c18b3a8
Respect `requires` for non-`build-backend` PEP 517 builds (#530)
## Summary

This PR modifies `puffin-build` to be closer in behavior to
[pip](a15dd75d98/src/pip/_internal/pyproject.py (L53))
and
[build](de5b44b0c2/src/build/__init__.py (L94)).

Specifically, if a project contains a `[build-system]` field, but no
`build-backend`, we now perform a PEP 517 build (instead of using
`setup.py` directly) _and_ respect the `requires` of the
`[build-system]`. Without this change, we were failing to build source
distributions for packages like `ujson`.

Closes #527.

---------

Co-authored-by: konstin <konstin@mailbox.org>
2023-12-04 10:13:42 +00:00
konsti 6dc8ebcb90
Test interpreter cache invalidation (#540)
Add missing test for #529/#508.
2023-12-04 10:03:43 +00:00
konsti 811c088603
Improve wheel cache docs: Unzipping is lazy (#539)
Also sneaking `fs_err::rename(staging.into_path(), &normalized_path)?`
in here, for a better resolution of
https://github.com/astral-sh/puffin/pull/524#discussion_r1412459016
2023-12-04 10:01:35 +00:00
Charlie Marsh ee009ace86
Remove target directory prior to unzipping (#538)
## Summary

This is not a _fix_ for https://github.com/astral-sh/puffin/issues/537,
but it does ensure that we avoid hard-failing on what's really an
optimization and caching case.
2023-12-04 05:18:45 +00:00
Charlie Marsh fc20d01593
Ignore empty `VIRTUAL_ENV` variables (#536)
I'm not sure how my interpreter gets into this state, but it's certainly
wrong to respect these.
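
A minimal sketch of the idea, assuming the raw value comes from `std::env::var_os("VIRTUAL_ENV")`; `virtual_env_from` is a hypothetical helper, not the function changed in this PR:

```rust
use std::ffi::OsString;
use std::path::PathBuf;

/// Interpret a raw `VIRTUAL_ENV` value, treating an empty string
/// the same as an unset variable (hypothetical helper).
fn virtual_env_from(value: Option<OsString>) -> Option<PathBuf> {
    value.filter(|v| !v.is_empty()).map(PathBuf::from)
}
```

Taking the value as a parameter, rather than reading the environment inside the function, keeps the check easy to test.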
2023-12-04 04:53:26 +00:00
Charlie Marsh 3b55d0b295
Deduplicate various `.dist-info/METADATA` read implementations (#531)
Closes https://github.com/astral-sh/puffin/issues/484.
2023-12-03 21:29:00 -05:00
Charlie Marsh fa3107b173
Use full Python version when determining compatibility (#528)
## Summary

When resolving with Python 3.7.13, I was failing to find a matching
distribution that required Python 3.7.9 or later.
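
One way such a mismatch can arise is comparing only `major.minor` against a three-segment bound. The sketch below shows that failure mode as an assumption, not as a description of the actual bug; `meets_minimum` is hypothetical and skips PEP 440 normalization.

```rust
/// Lexicographic comparison of release segments.
/// Simplified: a shorter version that is a prefix of the bound compares as smaller.
fn meets_minimum(version: &[u64], minimum: &[u64]) -> bool {
    version >= minimum
}
```

Here `meets_minimum(&[3, 7, 13], &[3, 7, 9])` holds, while the truncated `meets_minimum(&[3, 7], &[3, 7, 9])` does not.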
2023-12-04 01:02:24 +00:00
Charlie Marsh 2613382747
Invalidate interpreter marker cache (#529)
In a refactor, we lost the cache invalidation behavior for interpreter
markers, leading to stale interpreter errors for me when creating
environments with different Python versions. Specifically, the
modification timestamp used to be part of the _cache key_ when we used
`cacache`. Now it's not -- but it's stored within the cache. So we need
to validate the key after-the-fact.
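
Validating after the fact can be sketched as follows. The struct layout and `read_if_fresh` are hypothetical stand-ins, not the actual cache types:

```rust
/// A cache entry that stores the modification timestamp next to the data.
/// Field names are assumptions for this sketch.
struct CachedByTimestamp<T> {
    timestamp: u64,
    data: T,
}

/// Return the cached data only if the stored timestamp still matches the
/// current modification time of the interpreter executable.
fn read_if_fresh<T>(entry: CachedByTimestamp<T>, current_timestamp: u64) -> Option<T> {
    if entry.timestamp == current_timestamp {
        Some(entry.data)
    } else {
        // Stale entry: the caller should query the interpreter again.
        None
    }
}
```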
2023-12-03 22:44:43 +00:00
Charlie Marsh ee2fca3a48
Add CACHEDIR and .gitignore tags to cache directories (#526)
## Summary

Even if this will typically be in the user's application folder (rather
than a local directory), it's still a good practice.
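
A minimal sketch of tagging a directory, assuming the signature line from the Cache Directory Tagging Specification and a catch-all `.gitignore`. `tag_cache_dir` and the exact file contents are assumptions, not necessarily what this PR writes:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Signature line defined by the Cache Directory Tagging Specification.
const CACHEDIR_TAG: &str = "Signature: 8a477f597d28d172789f06886806bc55\n";

/// Mark `dir` as a cache directory (hypothetical helper).
fn tag_cache_dir(dir: &Path) -> io::Result<()> {
    fs::create_dir_all(dir)?;
    // Backup and archiving tools that honor the spec skip tagged directories.
    fs::write(dir.join("CACHEDIR.TAG"), CACHEDIR_TAG)?;
    // Ignore everything if the cache ends up inside a Git checkout.
    fs::write(dir.join(".gitignore"), "*\n")
}
```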

Closes https://github.com/astral-sh/puffin/issues/280.
2023-12-02 00:37:51 +00:00
konsti 9806901a16
Consolidate wheel caches (#524)
After this change, two wheel caches remain: `built-wheels-v0` and
`wheels-v0`, docs screenshots below. Each contains both the wheel
metadata, cache policy and zip or unzipped wheels under the same name.

The zipped/unzipped strategy is as follows: In `pip-compile`, when we
build a wheel, we store it zipped. When `pip-sync` or a source dist
build in `pip-compile` need to install the wheel, we unzip it, remove
the file and replace it with the unzipped wheel.

This removes `WheelCache` and `UrlIndex` in favor of `Cache` plus
`WheelCache`. The non-built wheel cache now considers index urls and the
url for url wheels.

I'm unsure if we need the `Unzipper` type, this could just be a
function.

I moved `no_index` into `IndexUrls` and started using `IndexUrl` up to
the clap level.

I left a number of TODOs in the code, namely performing the actual
invalidation of unzipped wheels and making the `InstallPlan` understand
cache invalidation (i.e. uninstall wheels when their remote changed).


![image](https://github.com/astral-sh/puffin/assets/6826232/c4d45979-485b-4954-848d-fd3347ee2510)
2023-12-01 20:16:33 +00:00
konsti 4551994b7d
Clear built wheels when remote changed (#519)
Remove built wheels alongside their metadata when their index source
dist or url source dist changed. For git source dists, we currently
don't clear the previous build but use a new directory (not sure what's
right here - are there any generic cache GC approaches out there? I've
seen that e.g. spotify keeps its cache at 10GB max, but i also haven't
seen any reusable, well tested approaches for this). Path distributions
are unchanged (#478).

I like the structure of metadata alongside the wheel for cache
invalidation, I'll try to do that for `wheels-v0`/`wheel-metadata-v0`
too. (The unzipped wheels afaik currently lack cache invalidation when
the remote changed.) This should give us roughly the same structure for
wheel and built wheels and a very similar pattern of invalidation.
2023-12-01 14:56:47 -05:00
Zanie Blue 2a8544df9e
Use a custom pubgrub report formatter (#521)
Uses https://github.com/zanieb/pubgrub/pull/10 to drastically simplify
our reporter implementation. This will allow us to make use of upstream
improvements to the reporter e.g.
https://github.com/zanieb/pubgrub/pull/8 without multiple duplicative
pull requests.
2023-12-01 13:36:12 -06:00
Zanie Blue 5f1f207628
Recursively merge existing package directories on installation (#516)
Previously, when installing a package we would delete the target
directory before copying (or linking) the contents of the package.
However, this means that we do not properly support namespace packages
which can share a target directory. Instead, the last package to be
installed would override existing packages. Since we install packages
in parallel, this could result in a race condition where the target
directory already exists which is not allowed when using `clonefile`.
See example error in #515.
c7e63d2dce
provides a regression test for this — it fails on `main`.

Here, we implement a recursive merge when the target directory already
exists. Both packages will be installed into the same directory. We no
longer delete the target directory, which seems okay since we uninstall
packages before installing now.

When files conflict, we will likely throw an error still. The correct
behavior to implement in this case is unclear, as if we just take "first
write wins" or "last write wins" we could end up with some files from
one package and some from another resulting in two broken packages. A
possible solution here is to lock the target directories while copying.
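
The merge strategy described above can be sketched with plain copies. This is an illustration only: `merge_dir` is a hypothetical helper, it copies rather than links or uses `clonefile`, and it takes no locks.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Recursively merge the contents of `src` into `dst`.
/// Existing directories are reused; an existing file is reported as a conflict.
fn merge_dir(src: &Path, dst: &Path) -> io::Result<()> {
    // Succeeds whether or not `dst` already exists.
    fs::create_dir_all(dst)?;
    for entry in fs::read_dir(src)? {
        let entry = entry?;
        let target = dst.join(entry.file_name());
        if entry.file_type()?.is_dir() {
            merge_dir(&entry.path(), &target)?;
        } else if target.exists() {
            return Err(io::Error::new(
                io::ErrorKind::AlreadyExists,
                format!("conflicting file: {}", target.display()),
            ));
        } else {
            fs::copy(entry.path(), &target)?;
        }
    }
    Ok(())
}
```

Two namespace packages that share a top-level directory both land in the same target, while a file provided by both is surfaced as an error instead of being silently overwritten.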
2023-11-30 10:14:51 -06:00
konsti 6841c06e2d
Show error paths in install-wheel-rs (#514)
Ensure that we consistently show a path for all io errors in
install-wheel-rs either (preferred) through `fs_err`, or as fallback by
a custom error type. For zip reading errors, we rely on the caller to
add the name and/or location of the wheel.
2023-11-29 20:14:34 +01:00
konsti 2539f00952
Better tracing span (#513)
This will help us get better insight into what is happening and how long
it takes. I'm particularly interested in how long the different source
dist steps take (download, extract, build step(s)), to make better
decisions about their caching, which i want to report through tracing.

Example output:

```console
$ RUST_LOG=puffin=info cargo run --bin puffin -q -- pip-compile -v --no-cache scripts/requirements/all-kinds.in > /dev/null
  puffin_distribution::source_dist::download_source_dist filename="werkzeug-3.0.1.tar.gz", source_dist=werkzeug @ ff1904eb5e2853bf83db817a7dd53d/werkzeug-3.0.1.tar.gz
  puffin_dispatch::build_source source_dist="werkzeug @ ff1904eb5e2853bf83db817a7dd53d/werkzeug-3.0.1.tar.gz", subdirectory=None
    puffin_build::extract_archive sdist="werkzeug-3.0.1.tar.gz"
    puffin_dispatch::resolve requirements="flit-core <4"
    puffin_dispatch::install requirements="flit-core ==3.9.0", venv="/tmp/.tmpgZAEAh/.venv"
    puffin_build::get_requires_for_build_wheel name="build_wheel", python_version=3.12
    puffin_build::build package_id="werkzeug @ ff1904eb5e2853bf83db817a7dd53d/werkzeug-3.0.1.tar.gz"
      puffin_build::run_python_script name="build_wheel", python_version=3.12
  puffin_dispatch::build_source source_dist="pydantic-extra-types @ git+https://github.com/pydantic/pydantic-extra-types.git@843b753e9e8cb74e83cac55598719b39a4d5ef1f", subdirectory=None
    puffin_dispatch::resolve requirements="hatchling"
    puffin_dispatch::install requirements="hatchling ==1.18.0, trove-classifiers ==2023.11.22, editables ==0.5, pathspec ==0.11.2, pluggy ==1.3.0, packaging ==23.2", venv="/tmp/.tmpJjweUn/.venv"
    puffin_build::get_requires_for_build_wheel name="build_wheel", python_version=3.12
    puffin_build::build package_id="pydantic-extra-types @ git+https://github.com/pydantic/pydantic-extra-types.git@843b753e9e8cb74e83cac55598719b39a4d5ef1f"
      puffin_build::run_python_script name="build_wheel", python_version=3.12
  puffin_distribution::source_dist::download_source_dist filename="django-allauth-0.51.0.tar.gz", source_dist=django-allauth==0.51.0
  puffin_dispatch::build_source source_dist="django-allauth==0.51.0", subdirectory=None
    puffin_build::extract_archive sdist="django-allauth-0.51.0.tar.gz"
    puffin_dispatch::resolve requirements="wheel, setuptools, pip"
    puffin_dispatch::install requirements="setuptools ==69.0.2, pip ==23.3.1, wheel ==0.42.0", venv="/tmp/.tmplSZisu/.venv"
    puffin_build::build package_id="django-allauth==0.51.0"
 Resolved 35 packages in 11.71s
```
2023-11-29 10:34:18 +00:00
konsti 929df586fb
Skip tf-models-nightly in resolve-many dev script for now (#510)
`tf-models-nightly` has pathologic backtracking behaviour, skip it for
now so we can benchmark the rest.
2023-11-28 18:25:32 +00:00
konsti d89fbeb642
Migrate interpreter query to custom caching (#508)
This removes the last usage of cacache by replacing it with a custom,
flat json caching keyed by the digest of the executable path.


![image](https://github.com/astral-sh/puffin/assets/6826232/8f777c4c-1f1b-4656-ba7b-002175270556)

A step towards #478. I've made `CachedByTimestamp<T>` generic over `T`
but intentionally not moved it to `puffin-cache` yet.
2023-11-28 17:14:59 +00:00
konsti 5435d44756
Introduce `Cache`, `CacheBucket` and `CacheEntry` (#507)
This is mostly a mechanical refactor that moves 80% of our code to the
same cache abstraction.

It introduces `Cache`, which abstracts away the path of the cache
and the temp dir drop and is passed throughout the codebase. To get a
specific cache bucket, you need to request your `CacheBucket` from
`Cache`. `CacheBucket` centralizes the names of all cache
buckets, moving them away from the string constants spread throughout
the crates.

Specifically for working with the `CachedClient`, there is a
`CacheEntry`. I'm not sure yet if that is a strict improvement over
`cache_dir: PathBuf, cache_file: String`, i may have to rotate that
later.

The interpreter cache moved into `interpreter-v0`.

We can use the `CacheBucket` page to document the cache structure in
each bucket:


![image](https://github.com/astral-sh/puffin/assets/6826232/b023fdfb-e34d-4c2d-8663-b5f73937a539)
2023-11-28 17:11:14 +00:00
Charlie Marsh 3d47d2b1da
Error when `ldd` is not in path (#506)
Closes https://github.com/astral-sh/puffin/issues/493.
2023-11-28 05:55:04 +00:00
konsti 8855f44b5f
Move simple index queries to `CachedClient` (#504)
Replaces the usage of `http-cache-reqwest` for simple index queries with
our custom cached client, removing `http-cache-reqwest` altogether.

The new cache paths are `<cache>/simple-v0/<index>/<package_name>.json`.
I could not test with a non-pypi index since i'm not aware of any other
json indices (jax and torch are both html indices).

In a future step, we can transform the response to be a
`HashMap<Version, {source_dists: Vec<(SourceDistFilename, File)>,
wheels: Vec<(WheelFilename, File)>}` (independent of python version, this
cache is used by all environments together). This should speed up cache
deserialization a bit, since we don't need to try source dist and wheel
anymore and drop incompatible dists, and it should make building the
`VersionMap` simpler. We can speed this up even further by splitting
into a version list and the info for each version. I'm mentioning this
because deserialization was a major bottleneck in the rust part of the
old python prototype.

Fixes #481
2023-11-28 00:11:03 +00:00
konsti 1142a14f4d
Check compatibility for cached unzipped wheels (#501)
**Motivation** Previously, we would install any wheel with the correct
package name and version from the cache, even if it doesn't match the
current python interpreter.

**Summary** The unzipped wheel cache for registries now uses the entire
wheel filename over the name-version (`editables-0.5-py3-none-any.whl`
over `editables-0.5`).
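
To show what the full filename carries that `name-version` does not, here is a simplified sketch that extracts the compatibility tags. `WheelTags` and `wheel_tags` are hypothetical and are not puffin's `WheelFilename` parser:

```rust
/// The compatibility tags at the end of a wheel filename.
#[derive(Debug, PartialEq)]
struct WheelTags<'a> {
    python: &'a str,
    abi: &'a str,
    platform: &'a str,
}

/// Split `{name}-{version}[-{build}]-{python}-{abi}-{platform}.whl` from the right.
fn wheel_tags(filename: &str) -> Option<WheelTags<'_>> {
    let stem = filename.strip_suffix(".whl")?;
    let mut parts = stem.rsplitn(4, '-');
    let platform = parts.next()?;
    let abi = parts.next()?;
    let python = parts.next()?;
    // The remaining `name-version` prefix must exist.
    parts.next()?;
    Some(WheelTags { python, abi, platform })
}
```

A cache entry named `cffi-1.16.0` cannot say whether it was built for `cp312` on `manylinux`; the full filename can, which is what makes a compatibility check against the current interpreter possible.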

Built wheels are not stored in the `wheels-v0` unzipped wheels cache
anymore. For each source distribution, there can be multiple built
wheels (with different compatibility tags), so i argue that we need a
different cache structure for them (follow up PR).

For `all-kinds.in` with

```bash
rm -rf cache-all-kinds
virtualenv --clear -p 3.12 .venv
cargo run --bin puffin -- pip-sync --cache-dir cache-all-kinds target/all-kinds.txt
```

we get:

**Before**
```
cache-all-kinds/wheels-v0/
├── registry
│   ├── annotated_types-0.6.0
│   ├── asgiref-3.7.2
│   ├── blinker-1.7.0
│   ├── certifi-2023.11.17
│   ├── cffi-1.16.0
│   ├── [...]
│   ├── tzdata-2023.3
│   ├── urllib3-2.1.0
│   └── wheel-0.42.0
└── url
    ├── 4b8be67c801a7ecb
    │   ├── flask
    │   └── flask-3.0.0.dist-info
    ├── 6781bd6440ae72c2
    │   ├── werkzeug
    │   └── werkzeug-3.0.1.dist-info
    └── a67db8ed076e3814
        ├── pydantic_extra_types
        └── pydantic_extra_types-2.1.0.dist-info

48 directories, 0 files
```

**After**

```
cache-all-kinds/wheels-v0/
├── registry
│   ├── annotated_types-0.6.0-py3-none-any.whl
│   ├── asgiref-3.7.2-py3-none-any.whl
│   ├── blinker-1.7.0-py3-none-any.whl
│   ├── certifi-2023.11.17-py3-none-any.whl
│   ├── cffi-1.16.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
│   ├── [...]
│   ├── tzdata-2023.3-py2.py3-none-any.whl
│   ├── urllib3-2.1.0-py3-none-any.whl
│   └── wheel-0.42.0-py3-none-any.whl
└── url
    └── 4b8be67c801a7ecb
        └── flask-3.0.0-py3-none-any.whl

39 directories, 0 files
```

**Outlook** Part of #477 "Fix wheel caching". Further tasks:
* Replace the `CacheShard` with `WheelMetadataCache` which handles urls
properly.
* Delete unzipped wheels when their remote wheel changed
* Store built wheels next to the `metadata.json` in the source dist
directory; delete built wheels when their source dist changed (different
cache bucket, but it's the same problem of fixing wheel caching).

I'll make stacked PRs for those.
2023-11-27 16:03:58 -08:00
konsti 71295702bf
Reduce pip_sync test duplication (#502)
Move venv creation and running python to check the installation into
functions instead of copy-pasting them every time.
2023-11-27 10:21:40 +00:00
Charlie Marsh afda835544
Avoid clone for `WheelMetadataCache` (#500)
This doesn't need to own the underlying data which allows us to remove a
number of clones.
2023-11-25 23:33:59 +00:00
Charlie Marsh 3eb0a43995
Perform a single Git fetch when building source distributions (#499)
## Summary

We need to pass in the distribution with the "precise" URL to avoid
refetching.

## Test Plan

Ran `cargo run -p puffin-cli -- pip-compile requirements.in --verbose`
with `flask @ git+https://github.com/pallets/flask.git` and verified
that we only checked out Flask once.
2023-11-25 23:29:41 +00:00
konsti d54e780843
Source dist metadata refactor (#468)
## Summary and motivation

For a given source dist, we store the metadata of each wheel built
through it in `built-wheel-metadata-v0/pypi/<source dist
filename>/metadata.json`. During resolution, we check the cache status
of the source dist. If it is fresh, we check `metadata.json` for a
matching wheel. If there is one we use that metadata, if there isn't, we
build one. If the source is stale, we build a wheel and override
`metadata.json` with that single wheel. This PR thereby ties the local
built wheel metadata cache to the freshness of the remote source dist.
This functionality is available through `SourceDistCachedBuilder`.
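
The fresh/stale decision described above can be sketched as follows. An in-memory map stands in for `metadata.json`; `Manifest`, `metadata_for`, `source_is_fresh` and `build` are hypothetical names for this illustration, not the `SourceDistCachedBuilder` API:

```rust
use std::collections::HashMap;

/// Wheel filename -> metadata, standing in for the contents of `metadata.json`.
type Manifest = HashMap<String, String>;

/// Return the metadata for `wheel`, building it when the cache cannot serve it.
fn metadata_for(
    manifest: &mut Manifest,
    source_is_fresh: bool,
    wheel: &str,
    build: impl FnOnce() -> String,
) -> String {
    if source_is_fresh {
        // Fresh source dist: reuse a matching wheel if one was already built.
        if let Some(metadata) = manifest.get(wheel) {
            return metadata.clone();
        }
    } else {
        // Stale source dist: previously built wheels no longer apply.
        manifest.clear();
    }
    let metadata = build();
    manifest.insert(wheel.to_string(), metadata.clone());
    metadata
}
```

In the stale case the manifest ends up containing only the newly built wheel, mirroring the "override `metadata.json` with that single wheel" behavior.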

`puffin_installer::Builder`, `puffin_installer::Downloader` and
`Fetcher` are removed; in their place is `FetchAndBuild`, which calls
into the also new `SourceDistCachedBuilder`. `FetchAndBuild` is the new
main high-level abstraction: It spawns parallel fetching/building; for
wheel metadata it calls into the registry client, for wheel files it
fetches them, and for source dists it calls `SourceDistCachedBuilder`. It
handles locks around builds and, newly added, inter-process file
locking for git operations.

Fetching and building source distributions now happens in parallel in
`pip-sync`, i.e. we don't have to wait for the largest wheel to be
downloaded to start building source distributions.

In a follow-up PR, I'll also clear built wheels when they've become
stale.

Another effect is that in a fully cached resolution, we need neither zip
reading nor email parsing.

Closes #473

## Source dist cache structure 

Entries by supported sources:
* `<build wheel metadata cache>/pypi/foo-1.0.0.zip/metadata.json`
* `<build wheel metadata cache>/<sha256(index-url)>/foo-1.0.0.zip/metadata.json`
* `<build wheel metadata cache>/url/<sha256(url)>/foo-1.0.0.zip/metadata.json`

But the url filename does not need to be a valid source dist filename
(<https://github.com/search?q=path%3A**%2Frequirements.txt+master.zip&type=code>),
so it could also be the following and we have to take any string as
filename:
* `<build wheel metadata cache>/url/<sha256(url)>/master.zip/metadata.json`

Example:
```text
# git source dist
pydantic-extra-types @ git+https://github.com/pydantic/pydantic-extra-types.git
# pypi source dist
django_allauth==0.51.0
# url source dist
werkzeug @ ff1904eb5e2853bf83db817a7dd53d/werkzeug-3.0.1.tar.gz
```
will be stored as
```text
built-wheel-metadata-v0
├── git
│   └── 5c56bc1c58c34c11
│       └── 843b753e9e8cb74e83cac55598719b39a4d5ef1f
│           └── metadata.json
├── pypi
│   └── django-allauth-0.51.0.tar.gz
│       └── metadata.json
└── url
    └── 6781bd6440ae72c2
        └── werkzeug-3.0.1.tar.gz
            └── metadata.json
```

The inside of a `metadata.json`:
```json
{
  "data": {
    "django_allauth-0.51.0-py3-none-any.whl": {
      "metadata-version": "2.1",
      "name": "django-allauth",
      "version": "0.51.0",
      ...
    }
  }
}
```
2023-11-24 17:47:58 +00:00
konsti 8d247fe95b
Add `Tags::from_interpreter` (#498)
Small refactoring
2023-11-24 11:36:01 +00:00
konsti f7976ce5cc
Write docs for distribution types (#495)
Document the type hierarchy, excluding the traits.
2023-11-23 13:39:39 +00:00
konsti 1c0e03f807
puffin_interpreter cleanup ahead of #235 (#492)
Preparing for #235, some refactoring to `puffin_interpreter`.

* Added a dedicated error type instead of anyhow
* `InterpreterInfo` -> `Interpreter`
* `detect_virtual_env` now returns an option so it can be chained for
#235
2023-11-23 08:57:33 +00:00
Charlie Marsh 9d35128840
Use Clippy lint table over Cargo config (#490)
Closes https://github.com/astral-sh/puffin/issues/482.
2023-11-22 15:10:27 +00:00
Charlie Marsh 443a0a9df2
Use a sparse Metadata 2.1 representation (#488)
This is an optimization to avoid parsing the entire Metadata 2.1 when we
only need a small subset of the fields.

Closes #175.
2023-11-22 13:25:35 +00:00
konsti a030a466e6
Error before download with no_build (#487)
This fixes a performance regression where, when `--no-build` was set,
the fetcher would still download the source dist only to error
afterwards.
2023-11-22 10:38:10 +00:00
konsti e1dafe7203
Allow applying multiple fixups for version specifiers (#486)
Allow applying multiple fixups for version specifiers, remove the
duplication from the code and add another test case.
2023-11-22 10:26:12 +00:00
konsti ff1100a1ab
Fixup for `>= '2.7'` (#485)
Fixup to allow parsing
https://pypi.org/simple/shellingham/?format=application/vnd.pypi.simple.v1+json
2023-11-22 10:00:12 +00:00
konsti 7c7daa8f83
Consistent Cargo.toml syntax (#483)
Remove the last Cargo.toml inconsistencies, see
1526b3458a (r1401083681).
Now all `[dependencies]` are workspace dependencies.
2023-11-22 08:34:08 +00:00
konsti 934e32ea98
Remove outdated todos (#476) 2023-11-21 13:57:40 +00:00
Charlie Marsh 17228ba04e
Add support for path dependencies (#471)
## Summary

This PR adds support for local path dependencies. The approach mostly
just falls out of our existing approach and infrastructure for Git and
URL dependencies.

Closes https://github.com/astral-sh/puffin/issues/436. (We'll open a
separate issue for editable installs.)

## Test Plan

Added `pip-compile` tests that pre-download a wheel or source
distribution, then install it via local path.
2023-11-21 11:49:42 +00:00
Charlie Marsh f1aa70d9d3
Refactor distribution types to return `Result` (#470)
## Summary

A variety of small refactors to the distribution types crate to (1)
return `Result` if we find an invalid wheel, rather than treating it as
a source distribution with a `.whl` suffix, and (2) DRY up some repeated
code around URLs.
2023-11-20 23:08:54 +00:00
konsti f0841cdb6e
Wheel metadata refactor (#462)
A consistent cache structure for remote wheel metadata:

* `<wheel metadata cache>/pypi/foo-1.0.0-py3-none-any.json`
* `<wheel metadata cache>/<digest(index-url)>/foo-1.0.0-py3-none-any.json`
* `<wheel metadata cache>/url/<digest(url)>/foo-1.0.0-py3-none-any.json`

The source dist caching will use a similar structure (#468).
2023-11-20 17:26:36 +01:00
konsti d3e9e1783f
Refactor lenient parsing (#467)
Deduplicate lenient parsing code between version specifiers and
Requirement. Use `warn_once!` since the warnings did show up multiple
times in my code. Fix the macro hygiene in `warn_once!`.
2023-11-20 15:35:38 +00:00
Charlie Marsh 60f595b469
Prefer future stream over `JoinSet` in downloader (#469)
This avoids introducing a static lifetime requirement and, in my
benchmarks, is even a little faster.
2023-11-20 13:23:30 +00:00
Charlie Marsh 8decb29bad
Use a dedicated error type for `puffin-distribution` (#466) 2023-11-20 11:38:27 +00:00
Charlie Marsh 342fc628f0
Store downloaded wheels in a local cache (#463)
This PR modifies the `Fetcher` to cache remote wheels that we _already_
store to-disk. We might read these again in the future, so we might as
well store them in the cache for consistency (rather than using a
temporary directory).
2023-11-20 11:32:22 +00:00
Charlie Marsh 35fd86631b
Unify distribution operations into a single crate (#460)
## Summary

This PR unifies the behavior that lived in the resolver's `distribution`
crates with the behaviors that were spread between the various structs
in the installer crate into a single `Fetcher` struct that is intended
to manage all interactions with distributions. Specifically, the
interface of this struct is such that it can access distribution
metadata, download distributions, return those downloads, etc., all with
a common cache.

Overall, this is mostly just DRYing up code that was repeated between
the two crates, and putting it behind a reasonable shared interface.
2023-11-20 11:22:52 +00:00
konsti 45d032dd7d
Fix wheel filename serialization (#465)
We need an underscore in the wheel filename, not a dash
2023-11-20 11:21:22 +00:00
konsti 46bb18f06e
Track file index (#452)
Track the index (or at least its url) where we got a file from across
the source code.

Fixes #448
2023-11-20 08:48:16 +00:00
Charlie Marsh 6fd582f8b9
Rename `puffin-distribution` to `distribution-types` (#458)
## Summary

This crate only contains types, and I want to introduce a new crate for
all _operations_ on distributions, so this feels like a more natural
name given we also have `pypi-types`.
2023-11-20 09:40:26 +01:00
konsti 2fed14fdc6
Optional serde feature for distribution-filename (#461)
https://github.com/astral-sh/puffin/pull/459#discussion_r1398482972
2023-11-19 19:53:32 +00:00
konsti 255edf4445
Serde support for WheelFilename through str repr (#459)
I need this later, splitting out for PR size
2023-11-19 19:43:14 +00:00
Charlie Marsh 3df3110800
Use shortened `anyhow::Result` everywhere (#457) 2023-11-19 19:26:21 +00:00
Charlie Marsh 380030bb5c
Pin all resolver tests using `--exclude-newer` (#456)
Uses yesterday's date, which should make it much less likely that our
tests become stale over time.

Closes https://github.com/astral-sh/puffin/issues/449.
2023-11-19 15:10:57 +00:00
konsti 24f00f5a33
Create cache dir before canonicalize (#454)
`fs::canonicalize` fails when the directory does not exist, which i
missed in #453
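
The ordering matters because `fs::canonicalize` returns an error for a path that does not exist. A minimal sketch of the fix, where `absolute_cache_dir` is a hypothetical helper name:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Turn a possibly relative cache directory into an absolute path.
/// The directory is created first, since canonicalizing a missing path fails.
fn absolute_cache_dir(dir: &Path) -> io::Result<PathBuf> {
    fs::create_dir_all(dir)?;
    fs::canonicalize(dir)
}
```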
2023-11-19 13:49:13 +00:00
konsti ab60233131
Use absolute cache paths (#453)
Previously, git requirements would fail when setting `--cache-dir`:

```console
$ cargo run --bin puffin -- pip-compile --cache-dir cache-all-kinds scripts/benchmarks/requirements/all-kinds.in
error: Failed to build distribution from URL: git+https://github.com/pydantic/pydantic-extra-types.git
  Caused by: Invalid path URL: cache-all-kinds/git-v0/db/b49ffcfeb6c2e9d8
```

The cause is using a relative rather than an absolute path, which `Url` needs; the solution is to turn the cache dir into an absolute path.

This never showed up in the tests since the tests use absolute temp dirs for everything.
2023-11-19 13:32:32 +00:00
konsti dd4347980a
Fix tests: Certifi got an update (#451) 2023-11-19 12:10:54 +00:00
Zanie Blue 5dedfeb097
Fix import of `CacheArgs` in `puffin-cli` (#447)
```
error[E0432]: unresolved imports `puffin_cache::CacheArgs`, `puffin_cache::CacheDir`
  --> crates/puffin-cli/src/main.rs:11:20
   |
11 | use puffin_cache::{CacheArgs, CacheDir};
   |                    ^^^^^^^^^  ^^^^^^^^ no `CacheDir` in the root
   |                    |
   |                    no `CacheArgs` in the root
   |
note: found an item that was configured out
  --> /Users/mz/eng/src/astral-sh/puffin/crates/puffin-cache/src/lib.rs:7:15
   |
7  | pub use cli::{CacheArgs, CacheDir};
   |               ^^^^^^^^^
   = note: the item is gated behind the `clap` feature
note: found an item that was configured out
  --> /Users/mz/eng/src/astral-sh/puffin/crates/puffin-cache/src/lib.rs:7:26
   |
7  | pub use cli::{CacheArgs, CacheDir};
   |                          ^^^^^^^^
   = note: the item is gated behind the `clap` feature

For more information about this error, try `rustc --explain E0432`.
error: could not compile `puffin-cli` (bin "puffin") due to previous error
```
2023-11-17 15:35:01 -05:00
Charlie Marsh 03599d2bb4
Split resolver inputs into manifest and options (#446)
## Summary

This is a refactor to address a TODO in the build context whereby we
aren't respecting the resolution options in recursive resolutions. Now,
the options are split out from the resolution _manifest_, and shared
across the build context tree.
2023-11-17 18:53:53 +00:00
konsti 9db6644be6
Test requirements script (#382)
This script can compare different requirements between pip(-compile) and
puffin across python versions, with debug and release builds.

Examples:
```shell
scripts/compare_with_pip/compare_with_pip.py
scripts/compare_with_pip/compare_with_pip.py -p 3.10
scripts/compare_with_pip/compare_with_pip.py --release -p 3.9 --target 'transformers[deepspeed-testing,dev-tensorflow]'
```

It found a bunch of fixed bugs, e.g. the lack of yanked package handling
and source dist handling, as well as #423, which is currently most of
the output.

Example output:
https://gist.github.com/konstin/9ccf8dc7c2dcca737bf705429ced4892

#443 should be merged first
2023-11-17 18:26:55 +00:00
konsti bf71e7adcf
Add graphviz output to puffin-dev resolve-cli (#443)
I added output in graphviz DOT format to `puffin-dev resolve-cli` to
help with debugging resolutions. This requires tracking the requested
ranges in the graph. I also fixed the direction of the graph.

 Output for `black`:

```dot
digraph {
    0 [ label="click\n8.1.7"]
    1 [ label="black\n23.11.0"]
    2 [ label="packaging\n23.2"]
    3 [ label="mypy-extensions\n1.0.0"]
    4 [ label="tomli\n2.0.1"]
    5 [ label="pathspec\n0.11.2"]
    6 [ label="typing-extensions\n4.8.0"]
    7 [ label="platformdirs\n4.0.0"]
    1 -> 0 [ label=">=8.0.0"]
    1 -> 3 [ label=">=0.4.3"]
    1 -> 5 [ label=">=0.9.0"]
    1 -> 4 [ label=">=1.1.0"]
    1 -> 6 [ label=">=4.0.1"]
    1 -> 2 [ label=">=22.0"]
    1 -> 7 [ label=">=2"]
}
```


![image](https://github.com/astral-sh/puffin/assets/6826232/4a440fcd-6248-4349-8e1a-c3e0363e42b1)

transformers:


![image](https://github.com/astral-sh/puffin/assets/6826232/a13a693c-a8c0-4a4f-95d9-3458431c678a)

jupyter:


![graphviz](https://github.com/astral-sh/puffin/assets/6826232/ef730033-6fd9-4ea9-ac93-8c874c19a101)
2023-11-17 18:16:24 +00:00
Zanie Blue d39e9b3499
Remove duplicate `cache_dir` argument from `puffin-dev resolve-cli` (#445) 2023-11-17 17:17:00 +00:00
Zanie Blue 221751487c
Use `UnusableDependencies` for URL dependency conflicts (#425)
Extends #424 with support for URL dependency incompatibilities.

Requires changes to `miette` to prevent URLs from being word wrapped;
accepted upstream in https://github.com/zkat/miette/pull/321
2023-11-17 08:28:12 -06:00
Charlie Marsh 2094680cdd
Add a `warn_user_once!` macro (#442)
Closes https://github.com/astral-sh/puffin/issues/429.
2023-11-17 02:34:06 +00:00
Charlie Marsh 25fcee0d9f
Avoid using incompatible wheels for source distribution-less packages (#441)
We're willing to use platform-incompatible wheels during resolution, to
quicken access to metadata... But we should avoid choosing an
incompatible wheel if the package lacks a source distribution since, in
that case, we definitely won't be able to install it.

Closes https://github.com/astral-sh/puffin/issues/439.
2023-11-17 02:10:54 +00:00
Charlie Marsh b1c29447df
Use `temp_dir` casing everywhere (#440) 2023-11-16 21:04:10 +00:00
konsti 1883dbdc21
Always¹ clear temporary directories (#437)
Always¹ clear the temporary directories we create.

* Clear source dist downloads: Previously, the temporary directories
would remain in the cache dir, now they are cleared properly
* Clear wheel file downloads: Delete the `.whl` file, we only need to
cache the unpacked wheel
* Consistent handling of cache arguments: Abstract the handling for CLI
cache args away, again making sure we remove the `--no-cache` temp dir.

There are no more `into_path()` calls that persist `TempDir`s that i
could find.

¹Assuming drop is run, and deleting the directory doesn't silently
error.
2023-11-16 20:49:48 +00:00
Zanie Blue 0d9d4f9fca
Add an `UnusableDependencies` incompatibility kind and use for conflicting versions (#424)
Addresses
https://github.com/astral-sh/puffin/issues/309#issuecomment-1792648969

Similar to #338 this throws an error when merging versions results in an
empty set. Instead of propagating that error, we capture it and return a
new dependency type of `Unusable`. Unusable dependencies are a new
incompatibility kind which includes an arbitrary "reason" string that we
present to the user. Adding a new incompatibility kind requires changes
to the vendored pubgrub crate.
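
As a rough illustration of the idea (not the PubGrub or puffin types), the sketch below merges two requested ranges and returns an "unusable" value with a reason string when the intersection is empty, instead of propagating an error. `Range`, `Dependency`, and `merge` are all hypothetical names, and ranges are simplified to inclusive integer bounds.

```rust
/// Inclusive version range as (min, max); a simplified stand-in for a real
/// version range type.
type Range = (u32, u32);

#[derive(Debug, PartialEq)]
enum Dependency {
    /// The merged range still admits at least one version.
    Known(Range),
    /// The merged range is empty; carry a reason to show to the user.
    Unusable(String),
}

/// Intersect two requested ranges for the same package.
fn merge(name: &str, a: Range, b: Range) -> Dependency {
    let lo = a.0.max(b.0);
    let hi = a.1.min(b.1);
    if lo <= hi {
        Dependency::Known((lo, hi))
    } else {
        Dependency::Unusable(format!("conflicting versions requested for {name}"))
    }
}
```

Because the conflict is a value rather than an error, a solver can record it as an incompatibility and backtrack.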

We could use this same incompatibility kind for conflicting urls as in
#284 which should allow the solver to backtrack to another valid version
instead of failing (see #425).

Unlike #383 this does not require changes to PubGrub's package mapping
model. I think in the long run we'll want PubGrub to accept multiple
versions per package to solve this specific issue, but we're interested
in it being merged upstream first. This pull request is just using the
issue as a simple case to explore adding a new incompatibility type.

We may or may not be able to convince them to add this new incompatibility
type upstream. As discussed in
https://github.com/pubgrub-rs/pubgrub/issues/152, we may want a more
general incompatibility kind instead which can be used for arbitrary
problems. An upstream pull request has been opened for discussion at
https://github.com/pubgrub-rs/pubgrub/pull/153.

Related to:
- https://github.com/pubgrub-rs/pubgrub/issues/152
- #338 
- #383

---------

Co-authored-by: konsti <konstin@mailbox.org>
2023-11-16 20:02:06 +00:00
Zanie Blue 832058dbba
Switch from vendored PubGrub to a fork (#438)
A fork will let us stay up to date with the upstream while replaying our
work on top of it.

I expect a similar workflow to the RustPython-Parser fork we maintained,
except that I wrote an automation to create tags for each commit on the
fork (https://github.com/zanieb/pubgrub/pull/2) so we do not need to
manually tag and document each commit.

To update with the upstream:

- Rebase our fork's `main` branch on top of the latest changes in
upstream's `dev` branch
- Force push, overwriting our `main` branch history
- Change the commit hash here to the last commit on `main` in our fork

Since we automatically tag each commit on the fork, we should never lose
the commits that are dropped from `main` during rebase.
2023-11-16 13:49:19 -06:00
konsti e41ec12239
Option to resolve at a fixed timestamp with `pip-compile --exclude-newer YYYY-MM-DD` (#434)
This works by filtering out files with a more recent upload time, so if
the index you use does not provide upload times, the results might be
inaccurate. pypi provides upload times for all files. That is, the field
is non-nullable in the warehouse schema, but the simple API PEP does not
know this field.

If you have only pypi dependencies, this means deterministic,
reproducible(!) resolution. We could try doing the same for git repos
but it doesn't seem worth the effort, i'd recommend pinning commits
since git histories are arbitrarily malleable and also if you care about
reproducibility and such you should not use git dependencies but a custom
index.

Timestamps are given either as RFC 3339 timestamps such as
`2006-12-02T02:07:43Z` or as UTC dates in the same format such as
`2006-12-02`. Dates are interpreted as including this day, i.e. until
midnight UTC that day. Date only is required to make this ergonomic and
midnight seems like an ergonomic choice.
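
The date-only case can be sketched as follows. This is a simplified illustration under stated assumptions: it handles only `YYYY-MM-DD`, does no validation of month or day values, and ignores full RFC 3339 timestamps. The names `Date`, `parse_date`, and `is_allowed` are hypothetical.

```rust
/// A calendar date as (year, month, day). Tuples compare lexicographically,
/// which matches chronological order.
type Date = (u32, u32, u32);

/// Parse `YYYY-MM-DD`. Illustration only: month and day are not range-checked.
fn parse_date(s: &str) -> Option<Date> {
    let mut parts = s.splitn(3, '-');
    let year = parts.next()?.parse().ok()?;
    let month = parts.next()?.parse().ok()?;
    let day = parts.next()?.parse().ok()?;
    Some((year, month, day))
}

/// Keep a file if it was uploaded on or before the cutoff day (inclusive).
fn is_allowed(upload: Date, exclude_newer: Date) -> bool {
    upload <= exclude_newer
}
```

With a cutoff of `2023-11-16`, a file uploaded on that same day is kept, while one uploaded on `2023-11-17` is filtered out.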

In action for `pandas`:

```console
$ target/debug/puffin pip-compile --exclude-newer 2023-11-16 target/pandas.in
Resolved 6 packages in 679ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
#    target/debug/puffin pip-compile --exclude-newer 2023-11-16 target/pandas.in
numpy==1.26.2
    # via pandas
pandas==2.1.3
python-dateutil==2.8.2
    # via pandas
pytz==2023.3.post1
    # via pandas
six==1.16.0
    # via python-dateutil
tzdata==2023.3
    # via pandas
$ target/debug/puffin pip-compile --exclude-newer 2022-11-16 target/pandas.in
Resolved 5 packages in 655ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
#    target/debug/puffin pip-compile --exclude-newer 2022-11-16 target/pandas.in
numpy==1.23.4
    # via pandas
pandas==1.5.1
python-dateutil==2.8.2
    # via pandas
pytz==2022.6
    # via pandas
six==1.16.0
    # via python-dateutil
$ target/debug/puffin pip-compile --exclude-newer 2021-11-16 target/pandas.in
Resolved 5 packages in 594ms
# This file was autogenerated by Puffin v0.0.1 via the following command:
#    target/debug/puffin pip-compile --exclude-newer 2021-11-16 target/pandas.in
numpy==1.21.4
    # via pandas
pandas==1.3.4
python-dateutil==2.8.2
    # via pandas
pytz==2021.3
    # via pandas
six==1.16.0
    # via python-dateutil
```
2023-11-16 19:46:17 +00:00
konsti 0d455ebd06
Always use puffin as binary name (#435)
It doesn't matter how exactly the user called puffin, the lockfile
should look the same either way.
2023-11-16 19:05:46 +01:00
konsti 751f7fa9c6
Improve PEP 691 compatibility (#428)
[PEP 691](https://peps.python.org/pep-0691/#project-detail) has slightly
different, more relaxed rules around file metadata. These changes are
now reflected in the `File` struct. This will make it easier to support
alternative indices.

I had expected that i need to introduce a separate type for that, so i'm
happy it's two `Option`s more and an alias.

Part of #412
2023-11-16 19:03:44 +01:00
konsti 3a4988f999
Small test cleanup after #431 (#433)
Remove unused filters after #431
2023-11-16 11:22:47 +00:00
konsti c0339893e7
Use `sys.executable` as python root path (#431)
Previously, we were assuming that `which <python>` returns the path to
the python executable. This is not true when using pyenv shims, which
are bash scripts. Instead, we have to use `sys.executable`. Luckily,
we're already querying the python interpreter and can do it in that
pass.

We are also not allowed to cache the execution of the python interpreter
through the shim because pyenv might change the target. As a heuristic,
we check whether `sys.executable`, the real binary, is the same as our
canonicalized `which` result.

---------

Co-authored-by: Zanie Blue <contact@zanie.dev>
2023-11-16 12:16:49 +01:00
Charlie Marsh d3caf9ae86
Choose most-compatible wheel in resolver and installer (#422)
## Summary

This PR implements logic to sort wheels by priority, where priority is
defined as preferring more "specific" wheels over less "specific"
wheels. For example, in the case of Black, my machine now selects
`black-23.11.0-cp311-cp311-macosx_11_0_arm64.whl`, whereas sorting by
lowest priority instead gives me `black-23.11.0-py3-none-any.whl`.
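
One way to picture the ordering is sketched below. It assumes the supported tags are available as a list ordered from most to least specific, and that each wheel is paired with a single tag string; both are simplifications. `most_specific` is a hypothetical name.

```rust
/// Pick the wheel whose tag appears earliest in `supported`, which is assumed
/// to be ordered from most to least specific. Wheels with no supported tag
/// are skipped.
fn most_specific<'a>(wheels: &[(&'a str, &str)], supported: &[&str]) -> Option<&'a str> {
    wheels
        .iter()
        .filter_map(|&(filename, tag)| {
            // The position in `supported` acts as the priority (lower is better).
            let rank = supported.iter().position(|s| *s == tag)?;
            Some((rank, filename))
        })
        .min_by_key(|&(rank, _)| rank)
        .map(|(_, filename)| filename)
}
```

Given both Black wheels and a supported list that starts with `cp311-cp311-macosx_11_0_arm64`, the platform-specific wheel is chosen over `py3-none-any`.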

As part of this change, I've also modified the resolver to fall back to
using incompatible wheels when determining package metadata, if no
compatible wheels are available.

The `VersionMap` was also moved out of `resolver.rs` and into its own
file with a wrapper type, for clarity.

Closes https://github.com/astral-sh/puffin/issues/380.
Closes https://github.com/astral-sh/puffin/issues/421.
2023-11-15 18:22:11 +00:00
konsti 1147a4de14
Simpler and more resilient pip compile tests (#426)
The pip compile tests now explicitly set their python version and `puffin
venv` resolves e.g. `python3.12` correctly now. The venv creation is
moved to a shared method.
2023-11-15 18:32:33 +01:00
Charlie Marsh a20325f184
Remove unnecessary clones in resolver (#420) 2023-11-13 21:00:52 -05:00
Charlie Marsh 13ba4405aa
Update README and crates manifest (#419) 2023-11-14 01:20:07 +00:00
konsti bacf1dc911
Filter out yanked files (#413)
Implement two behaviors for yanked versions:

* During `pip-compile`, yanked versions are filtered out entirely, we
currently treat them as if they don't exist. This leads to confusing
error messages because a version that does exist seems to have suddenly
disappeared.
* During `pip-sync`, we warn when we fetch a remote distribution and it
has been yanked. We currently don't warn on cached or installed
distributions that have been yanked.
2023-11-13 20:58:50 +00:00
Charlie Marsh 28ec4e79f0
Co-locate lenient requirement parsing (#418)
No behavior changes.
2023-11-13 15:46:21 -05:00
Charlie Marsh 437d4fb87e
Add trailing-comma fix to lenient requirements (#417)
Closes https://github.com/astral-sh/puffin/issues/408.
2023-11-13 20:20:57 +00:00
Charlie Marsh 582c94cec3
Add missing-dot fix to lenient requirements (#416)
Part of https://github.com/astral-sh/puffin/issues/408.
2023-11-13 20:17:01 +00:00
Charlie Marsh 0af2f7e39f
Use `anstream` to avoid writing colorized output (#415)
A more robust solution to avoiding colorized output by ensuring we write
to `stdout` and `stderr` via the
[`anstream`](https://docs.rs/anstream/latest/anstream/) crate.

Closes https://github.com/astral-sh/puffin/issues/393.
2023-11-13 20:00:12 +00:00
konsti 76a41066ac
Filter out incompatible dists (#398)
Filter out source dists and wheels whose `requires-python` from the
simple api is incompatible with the current python version.

This change showed an important problem: When we use a fake python
version for resolving, building source distributions breaks down because
we can only build with versions we actually have.

This change became surprisingly big. The tests now require python 3.7 to
be installed, but changing that would mean an even bigger change.

Fixes #388
2023-11-13 17:14:07 +01:00
konsti 81c9cd0d4a
Print url for bad json error (#409)
Split out from #382
2023-11-13 11:41:20 +00:00
konsti fa423b8751
Backend path is supported (#405)
The check is outdated now
2023-11-13 08:19:20 +00:00
Zanie Blue beadd3274a
Improve debug log version display (#403)
Follow-up to https://github.com/astral-sh/puffin/pull/346 for some debug
messages
2023-11-10 17:07:29 -06:00
Charlie Marsh 06b312de7e
Overwrite existing files when hardlinking (#402)
## Summary

Closes https://github.com/astral-sh/puffin/issues/390.

## Test Plan

Installed `jupyter_core==5.5.0`, then removed the `jupyter_core` and
`jupyter_core-5.5.0.dist-info` directories from my virtualenv manually,
but left `jupyter.py`. I then re-ran `puffin pip-compile`, and verified
that it errored on `main` but succeeded here.
2023-11-10 20:24:19 +00:00
Charlie Marsh 56a4b51eb6
Refactor hardlink fallback to use an enum (#401)
Makes an invalid state unrepresentable (`first_try_hard_linking = true`,
`use_copy_fallback = true`).
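
A sketch of the general pattern, not the actual puffin code: two booleans allow four combinations, one of which is meaningless, whereas an enum lists only the valid states. The names `Attempt` and `advance` and the exact transitions are assumptions made for illustration.

```rust
/// Hypothetical state for a linking strategy; names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Attempt {
    /// Nothing has been linked yet, so a failure may still switch to copying.
    Initial,
    /// At least one hard link succeeded; later failures are real errors.
    HardLink,
    /// Hard linking is unavailable; copy every remaining file.
    CopyFallback,
}

/// Advance the state after trying one file. `linked` reports whether the
/// hard link succeeded.
fn advance(state: Attempt, linked: bool) -> Result<Attempt, &'static str> {
    match (state, linked) {
        (Attempt::Initial, true) | (Attempt::HardLink, true) => Ok(Attempt::HardLink),
        (Attempt::Initial, false) => Ok(Attempt::CopyFallback),
        (Attempt::HardLink, false) => Err("hard link failed after an earlier success"),
        // Once copying, the hard link result is irrelevant.
        (Attempt::CopyFallback, _) => Ok(Attempt::CopyFallback),
    }
}
```

There is no value of `Attempt` that means "first try and already falling back", so that combination cannot be constructed.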
2023-11-10 15:18:51 -05:00
Andrew Gallant ff4d079dc9
pep508-rs: remove \x20 trailing whitespace hack (#400)
... we just remove the trailing whitespace from the input and that
resolves things.

Thanks @konstin for pointing this out!

Ref https://github.com/astral-sh/puffin/pull/399#discussion_r1389854823
2023-11-10 15:11:29 -05:00
Charlie Marsh e8108cb28b
Remove `__pycache__` directories when uninstalling (#397)
According to the [packaging
documentation](https://packaging.python.org/en/latest/specifications/binary-distribution-format/#binary-distribution-format),
"uninstallers should be smart enough to remove .pyc even if it is not
mentioned in RECORD". Previously, we weren't handling this case, so if
you installed via Puffin, then imported a file (to trigger bytecode
compilation), then uninstalled, we'd leave spare `__pycache__`
directories around.

Closes https://github.com/astral-sh/puffin/issues/395.
2023-11-10 14:55:33 -05:00
Andrew Gallant 63f7f65190
change global allocator to jemalloc (and mimalloc on Windows) (#399)
This copies the allocator configuration used in the Ruff project. In
particular, this gives us an instant 10% win when resolving the top 1K
PyPI packages:

    $ hyperfine \
        "./target/profiling/puffin-dev-main resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null" \
        "./target/profiling/puffin-dev resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null"
    Benchmark 1: ./target/profiling/puffin-dev-main resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null
      Time (mean ± σ):     974.2 ms ±  26.4 ms    [User: 17503.3 ms, System: 2205.3 ms]
      Range (min … max):   943.5 ms … 1015.9 ms    10 runs

    Benchmark 2: ./target/profiling/puffin-dev resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null
      Time (mean ± σ):     883.1 ms ±  23.3 ms    [User: 14626.1 ms, System: 2542.2 ms]
      Range (min … max):   849.5 ms … 916.9 ms    10 runs

    Summary
      './target/profiling/puffin-dev resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null' ran
        1.10 ± 0.04 times faster than './target/profiling/puffin-dev-main resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null'

I was moved to do this because I noticed `malloc`/`free` taking up a
fairly sizeable percentage of time during light profiling.

As is becoming a pattern, it will be easier to review this
commit-by-commit.

Ref #396 (wouldn't call this issue fixed)

-----

I did also try adding a `smallvec` optimization to the
`Version::release` field, but it didn't bear any fruit. I still think
there is more to explore since the results I observed don't quite line
up with what I expect. (So probably either my mental model is off or my
measurement process is flawed.) You can see that attempt with a little
more explanation here:
f9528b4ecd

In the course of adding the `smallvec` optimization, I also shrunk the
`Version` fields from a `usize` to a `u32`. They should at least be a
fixed size integer since version numbers aren't used to index memory,
and I shrunk it to `u32` since it seems reasonable to assume that all
version numbers will be smaller than `2^32`.
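
The limits of that assumption are easy to check with a small sketch. `parse_segment` is a hypothetical helper, not a function from the codebase. Anything up to `4294967295` fits in a `u32`, while a timestamp-style segment such as `20230628214621` (the example given in #555) does not.

```rust
/// Hypothetical helper: parse one numeric version segment into a `u32`,
/// returning `None` if it is not a number or does not fit.
fn parse_segment(segment: &str) -> Option<u32> {
    segment.parse::<u32>().ok()
}
```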
2023-11-10 14:48:59 -05:00
konsti d8408b1783
Add source to failing metadata parsing (#387)
Before:
```
cargo run --bin puffin-dev -q -- resolve-cli "transformers[accelerate, agents, all, audio, codecarbon, deepspeed, deepspeed-testing, dev, dev-tensorflow, dev-torch, docs, docs_specific, flax, flax-speech, ftfy, integrations, ja, modelcreation, onnx, onnxruntime, optuna, quality, ray, retrieval, sagemaker, sentencepiece, serving, sigopt, sklearn, speech, testing, tf, tf-cpu, tf-speech, timm, tokenizers, torch, torch-speech, torch-vision, torchhub, video, vision]"
puffin-dev failed
  Caused by: No solution found when resolving: transformers[accelerate,agents,all,audio,codecarbon,deepspeed,deepspeed-testing,dev,dev-tensorflow,dev-torch,docs,docs-specific,flax,flax-speech,ftfy,integrations,ja,modelcreation,onnx,onnxruntime,optuna,quality,ray,retrieval,sagemaker,sentencepiece,serving,sigopt,sklearn,speech,testing,tf,tf-cpu,tf-speech,timm,tokenizers,torch,torch-speech,torch-vision,torchhub,video,vision]
  Caused by: Not a valid package or extra name: ".none". Names must start and end with a letter or digit and may only contain -, _, ., and alphanumeric characters
```
After:
```
cargo run --bin puffin-dev -q -- resolve-cli "transformers[accelerate, agents, all, audio, codecarbon, deepspeed, deepspeed-testing, dev, dev-tensorflow, dev-torch, docs, docs_specific, flax, flax-speech, ftfy, integrations, ja, modelcreation, onnx, onnxruntime, optuna, quality, ray, retrieval, sagemaker, sentencepiece, serving, sigopt, sklearn, speech, testing, tf, tf-cpu, tf-speech, timm, tokenizers, torch, torch-speech, torch-vision, torchhub, video, vision]"
puffin-dev failed
  Caused by: No solution found when resolving: transformers[accelerate,agents,all,audio,codecarbon,deepspeed,deepspeed-testing,dev,dev-tensorflow,dev-torch,docs,docs-specific,flax,flax-speech,ftfy,integrations,ja,modelcreation,onnx,onnxruntime,optuna,quality,ray,retrieval,sagemaker,sentencepiece,serving,sigopt,sklearn,speech,testing,tf,tf-cpu,tf-speech,timm,tokenizers,torch,torch-speech,torch-vision,torchhub,video,vision]
  Caused by: Couldn't parse metadata in fastapi-0.10.1-py3-none-any.whl (97ac91cb7cd2baab1a50b0c7a17d83/fastapi-0.10.1-py3-none-any.whl)
  Caused by: Not a valid package or extra name: ".none". Names must start and end with a letter or digit and may only contain -, _, ., and alphanumeric characters
```
2023-11-10 18:33:49 +00:00
Charlie Marsh b3edf7c2b2
Delete any directories listed in the RECORD file (#394)
## Summary

It looks like, when you install `pip`, it includes a bunch of
`__pycache__` directories in the RECORD file (although these directories
don't exist until you run `pip`). Our uninstaller assumed that the
RECORD file only contained _files_.
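
A minimal sketch of a removal step that tolerates both cases, using only the standard library. `remove_entry` is a hypothetical name, and treating a missing entry as success is an assumption made here for illustration.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical removal step for one RECORD entry.
fn remove_entry(path: &Path) -> io::Result<()> {
    match fs::symlink_metadata(path) {
        // The entry is a directory (e.g. `__pycache__`): remove it recursively.
        Ok(meta) if meta.is_dir() => fs::remove_dir_all(path),
        // A regular file or a symlink.
        Ok(_) => fs::remove_file(path),
        // Listed in RECORD but never created: nothing to do.
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(()),
        Err(err) => Err(err),
    }
}
```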

Closes https://github.com/astral-sh/puffin/issues/389.
2023-11-10 18:17:52 +00:00
Charlie Marsh 6a15950cb5
Rename `Distribution` to `Dist` in all structs and traits (#384)
We tend to avoid abbreviations, but this one is just so long and
absolutely ubiquitous.
2023-11-10 14:55:11 +00:00
konsti 5cef40d87a
Add proper caching for pypi metadata fetching kinds (#368)
I intend this to become the main form of caching for puffin: You can
make http requests, you transform the data to what you really need, you
have control over the cache key, and the cache is always json (or
anything else much faster we want to replace it with as long as it's
serde!)
2023-11-10 11:03:40 +00:00
konsti d1b57acaa8
Implement PEP 517 backend-path (#385)
Closes #192
2023-11-10 11:54:23 +01:00
Charlie Marsh a148f9d0be
Refactor distribution types to adhere to a clear hierarchy (#369)
## Summary

This PR refactors our `RemoteDistribution` type such that it now follows
a clear hierarchy that matches the actual variants, and encodes the
differences between source and built distributions:

```rust
pub enum Distribution {
    Built(BuiltDistribution),
    Source(SourceDistribution),
}

pub enum BuiltDistribution {
    Registry(RegistryBuiltDistribution),
    DirectUrl(DirectUrlBuiltDistribution),
}

pub enum SourceDistribution {
    Registry(RegistrySourceDistribution),
    DirectUrl(DirectUrlSourceDistribution),
    Git(GitSourceDistribution),
}

/// A built distribution (wheel) that exists in a registry, like `PyPI`.
pub struct RegistryBuiltDistribution {
    pub name: PackageName,
    pub version: Version,
    pub file: File,
}

/// A built distribution (wheel) that exists at an arbitrary URL.
pub struct DirectUrlBuiltDistribution {
    pub name: PackageName,
    pub url: Url,
}

/// A source distribution that exists in a registry, like `PyPI`.
pub struct RegistrySourceDistribution {
    pub name: PackageName,
    pub version: Version,
    pub file: File,
}

/// A source distribution that exists at an arbitrary URL.
pub struct DirectUrlSourceDistribution {
    pub name: PackageName,
    pub url: Url,
}

/// A source distribution that exists in a Git repository.
pub struct GitSourceDistribution {
    pub name: PackageName,
    pub url: Url,
}
```

Most of the PR just stems downstream from this change. There are no
behavioral changes, so I'm largely relying on lint, tests, and the
compiler for correctness.
2023-11-10 02:45:41 +00:00
Andrew Gallant 33c0901a28
distribution-filename: speed up is_compatible (#367)
This PR tweaks the representation of `Tags` in order to offer a
faster implementation of `WheelFilename::is_compatible`. We now use a
nested map of tags that lets us avoid looping over every supported
platform tag. As the code comments suggest, that is the essential gain.
We still do not mind looping over the tags in each wheel name since they
tend to be quite small. And pushing our thumb on that side of things can
make things worse overall since it would likely slow down WheelFilename
construction itself.
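
The nested-map idea can be sketched roughly as below. This is an illustration under assumptions, not the actual `Tags` representation: the layering (python tag, then ABI tag, then platform tags) and the names `TagIndex`, `new`, and `is_compatible` are chosen here for clarity.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for the supported-tag index:
/// python tag -> abi tag -> set of platform tags.
struct TagIndex {
    map: HashMap<String, HashMap<String, HashSet<String>>>,
}

impl TagIndex {
    /// Build the index once from (python, abi, platform) triples.
    fn new(supported: &[(&str, &str, &str)]) -> Self {
        let mut map: HashMap<String, HashMap<String, HashSet<String>>> = HashMap::new();
        for &(python, abi, platform) in supported {
            map.entry(python.to_string())
                .or_default()
                .entry(abi.to_string())
                .or_default()
                .insert(platform.to_string());
        }
        Self { map }
    }

    /// Loop only over the few tags in the wheel name; each step is a hash
    /// lookup rather than a scan over every supported tag.
    fn is_compatible(&self, pythons: &[&str], abis: &[&str], platforms: &[&str]) -> bool {
        pythons.iter().any(|python| {
            self.map.get(*python).map_or(false, |by_abi| {
                abis.iter().any(|abi| {
                    by_abi.get(*abi).map_or(false, |set| {
                        platforms.iter().any(|platform| set.contains(*platform))
                    })
                })
            })
        })
    }
}
```

Building the index costs more up front than a flat list, which is the trade-off the `Tags::new` regression below reflects.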

For micro-benchmarks, we improve considerably for compatibility
checking:

    $ critcmp base test3
    group                                                   base                                     test3
    -----                                                   ----                                     -----
    build_platform_tags/burntsushi-archlinux                1.00      46.2±0.28µs        ? ?/sec     2.48    114.8±0.45µs        ? ?/sec
    wheelname_parsing/flyte-long-compatible                 1.00     624.8±3.31ns   174.0 MB/sec     1.01    629.4±4.30ns   172.7 MB/sec
    wheelname_parsing/flyte-long-incompatible               1.00     743.6±4.23ns   165.4 MB/sec     1.00    746.9±4.62ns   164.7 MB/sec
    wheelname_parsing/flyte-short-compatible                1.00     526.7±4.76ns    54.3 MB/sec     1.01    530.2±5.81ns    54.0 MB/sec
    wheelname_parsing/flyte-short-incompatible              1.00     540.4±4.93ns    60.0 MB/sec     1.01    545.7±5.31ns    59.4 MB/sec
    wheelname_parsing_failure/flyte-long-extension          1.00      13.6±0.13ns     3.2 GB/sec     1.01     13.7±0.14ns     3.2 GB/sec
    wheelname_parsing_failure/flyte-short-extension         1.00      14.0±0.20ns  1160.4 MB/sec     1.01     14.1±0.14ns  1146.5 MB/sec
    wheelname_tag_compatibility/flyte-long-compatible      11.33     159.8±2.79ns   680.5 MB/sec     1.00     14.1±0.23ns     7.5 GB/sec
    wheelname_tag_compatibility/flyte-long-incompatible   237.60   1671.8±37.99ns    73.6 MB/sec     1.00      7.0±0.08ns    17.1 GB/sec
    wheelname_tag_compatibility/flyte-short-compatible     16.07     223.5±8.60ns   128.0 MB/sec     1.00     13.9±0.30ns     2.0 GB/sec
    wheelname_tag_compatibility/flyte-short-incompatible  149.83     628.3±2.13ns    51.6 MB/sec     1.00      4.2±0.10ns     7.6 GB/sec

We do regress slightly on the time it takes for `Tags::new` to run, but
this is somewhat expected. And in absolute terms, 114us is perfectly
acceptable given that it's only executed ~once for each `puffin`
invocation.

Ad hoc benchmarks indicate an overall 25% perf improvement in `puffin
pip-compile` times. This roughly corresponds with how much time
`is_compatible` was taking. Indeed, profiling confirms that it has
virtually disappeared from the profile.

Fixes #157
2023-11-09 09:01:03 -05:00
konsti bdb89b4072
Allow setting num tasks in puffin-dev parallel resolve (#374) 2023-11-09 13:12:03 +00:00
Charlie Marsh 6144de0a7e
Implement some minor optimizations to version match (#371)
`Range::intersection` goes from 74.2% to 64.9%, and `sortable_tuple`
goes from 2.3% to 1.5%.
2023-11-09 02:11:40 +00:00
Charlie Marsh cfd84d6365
Support resolving for an alternate Python distribution (#364)
## Summary

Low-priority but fun thing to end the day. You can now pass
`--target-version py37`, and we'll generate a resolution for Python 3.7.

See: https://github.com/astral-sh/puffin/issues/183.
2023-11-08 23:19:16 +00:00
konsti d407bbbee6
Special case missing header build errors (on linux) (#354)
One of the most common errors i observed are build failures due to
missing header files. On ubuntu, this generally means that you need to
install some `<...>-dev` package that the documentation tells you about,
e.g. [mysqlclient](https://github.com/PyMySQL/mysqlclient#linux) needs
`default-libmysqlclient-dev`, [some psycopg
versions](https://www.psycopg.org/psycopg3/docs/basic/install.html#local-installation)
(i remember that this was always required at some earlier point) require
`libpq-dev` and pygraphviz wants `graphviz-dev`. This is quite common
for many scientific packages (where conda has an advantage because they
can provide those packages as a dependency).

The error message can be completely inscrutable if you're just a python
programmer (or user) and not a c programmer (example: pygraphviz):

```
warning: no files found matching '*.png' under directory 'doc'
warning: no files found matching '*.txt' under directory 'doc'
warning: no files found matching '*.css' under directory 'doc'
warning: no previously-included files matching '*~' found anywhere in distribution
warning: no previously-included files matching '*.pyc' found anywhere in distribution
warning: no previously-included files matching '.svn' found anywhere in distribution
no previously-included directories found matching 'doc/build'
pygraphviz/graphviz_wrap.c:3020:10: fatal error: graphviz/cgraph.h: No such file or directory
 3020 | #include "graphviz/cgraph.h"
      |          ^~~~~~~~~~~~~~~~~~~
compilation terminated.
error: command '/usr/bin/gcc' failed with exit code 1
```

The only relevant part is `fatal error: graphviz/cgraph.h: No such file
or directory`. Why is this file not there and how do i get it to be
there?

This is even harder to spot in pip's output, where it's 11 lines above
the last line:


![image](https://github.com/astral-sh/puffin/assets/6826232/7a3d7279-e7b1-4511-ab22-d0a35be5e672)

I've special cased missing headers and made sure that the last line
tells you the important information: We're missing some header, please
check the documentation of {package} {version} for what to install:


![image](https://github.com/astral-sh/puffin/assets/6826232/4bbb8923-5a82-472f-ab1f-9e1471aa2896)

Scrolling up:


![image](https://github.com/astral-sh/puffin/assets/6826232/89a2495a-e188-4288-b534-ad885ee08763)

The difference gets even clearer with a default ubuntu terminal with its
80 columns:


![image](https://github.com/astral-sh/puffin/assets/6826232/49fb27bc-07c6-4b10-a1a1-30ec8e112438)

---

Note that the situation is better for a missing compiler, there i get:

```
[...]
warning: no previously-included files matching '*~' found anywhere in distribution
warning: no previously-included files matching '*.pyc' found anywhere in distribution
warning: no previously-included files matching '.svn' found anywhere in distribution
no previously-included directories found matching 'doc/build'
error: command 'gcc' failed: No such file or directory
---
```
Putting the last line into google, the first two results tell me to
`sudo apt-get install gcc`, the third even tells me about `sudo apt
install build-essential`
2023-11-08 15:26:39 +00:00
konsti 2ebe40b986
Add `--no-build` (#358)
By default, we will build source distributions for both resolving and
installing, running arbitrary code. `--no-build` adds an option to ban
this and only install from wheels, no source distributions or git builds
allowed. We also don't fetch these and instead report immediately.

I've heard from users for whom this is a requirement, i'm implementing
it now because it's helpful for testing.

I'm thinking about adding a shared `PuffinSharedArgs` struct so we don't
have to repeat each option everywhere.
2023-11-08 10:05:15 -05:00
Charlie Marsh 4fe583257e
Use a custom PubGrub error type to always show resolution report (#365)
Closes https://github.com/astral-sh/puffin/issues/356.

The example from the issue now renders as:

```
❯ cargo run --bin puffin-dev -q -- resolve-cli tensorflow-cpu-aws
puffin-dev failed
  Caused by: No solution found when resolving build dependencies for source distribution:
  Caused by: Because there is no available version for tensorflow-cpu-aws and root depends on tensorflow-cpu-aws, version solving failed.
```
2023-11-08 09:57:26 -05:00
Charlie Marsh 3c24301193
Avoid removing progress bars (#362)
This was dumb of me. We pass out indexes when adding progress bars, but
were then removing entries on completion, so any outstanding indexes
were now _invalid_. We just shouldn't remove them. The `MultiProgress`
retains a reference anyway, IIUC.
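
The underlying hazard is the usual one with indexes into a `Vec`: removing an entry shifts or invalidates the indexes handed out earlier. A small sketch of the append-only alternative follows; `Bars`, `add`, and `label` are hypothetical names and unrelated to the real progress bar types.

```rust
/// Hypothetical append-only registry; handles are plain indexes.
struct Bars {
    labels: Vec<String>,
}

impl Bars {
    /// Register a bar and hand out its index. Entries are never removed,
    /// so an index stays valid for the lifetime of the registry.
    fn add(&mut self, label: &str) -> usize {
        self.labels.push(label.to_string());
        self.labels.len() - 1
    }

    /// Look up a bar by a previously returned index.
    fn label(&self, index: usize) -> Option<&str> {
        self.labels.get(index).map(String::as_str)
    }
}
```

If an entry were removed with `Vec::remove`, every later index would silently point at a different bar or at nothing.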

Closes https://github.com/astral-sh/puffin/issues/360.
2023-11-07 18:58:17 +00:00
Charlie Marsh 7abe141d3f
Add SSL to possible spurious errors (#361)
\cc @konstin
2023-11-07 18:53:39 +00:00
Andrew Gallant 294955ecff
fix platform detection on Linux (#359)
Rejigger Linux platform detection

This change makes some very small improvements to the Linux platform
detection logic. In particular, the existing logic did not work on my
Archlinux machine since /lib64/ld-linux-x86-64.so.2 isn't a symlink. In
that case, the detection logic should have fallen back to the slower
`ldd --version` technique, but `read_link` fails outright when its
argument isn't a symbolic link. So we tweak the logic to allow it to
fail, and if it does, we still try the `ldd --version` approach instead
of giving up completely.
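
The control flow described above might look roughly like this. It is a sketch: `resolve_loader` is a hypothetical name, and the slow `ldd --version` path is represented by a caller-supplied closure rather than an actual subprocess call.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Try the cheap symlink lookup first, and treat any failure (including
/// "not a symbolic link") as a cue to use the slower fallback.
fn resolve_loader(
    path: &Path,
    slow_fallback: impl FnOnce() -> Option<PathBuf>,
) -> Option<PathBuf> {
    match fs::read_link(path) {
        Ok(target) => Some(target),
        // `read_link` returns an error when `path` is not a symlink, so this
        // must not abort detection.
        Err(_) => slow_fallback(),
    }
}
```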

I also made some cosmetic improvements to the regex matching, as well as
ensuring that the regexes are only compiled exactly once.
2023-11-07 11:39:35 -05:00
konsti 692d2eb26f
puffin-dev resolve many improvements (#357)
Print the current step and the time taken, and also respect the cache dir arg.
2023-11-07 14:56:35 +00:00
Charlie Marsh b0286a8939
Add user feedback when building source distributions in the resolver (#347)
It looks like Cargo, notice the bold green lines at the top (which
appear during the resolution, to indicate Git fetches and source
distribution builds):

<img width="868" alt="Screen Shot 2023-11-06 at 11 28 47 PM"
src="https://github.com/astral-sh/puffin/assets/1309177/9647a480-7be7-41e9-b1d3-69faefd054ae">

<img width="868" alt="Screen Shot 2023-11-06 at 11 28 51 PM"
src="https://github.com/astral-sh/puffin/assets/1309177/6bc491aa-5b51-4b37-9ee1-257f1bc1c049">

Closes https://github.com/astral-sh/puffin/issues/287 although we can do
a lot more here.
2023-11-07 14:17:31 +00:00
Charlie Marsh 2c32bc5a86
Respect direct URLs in puffin installer (#345)
We now write the `direct_url.json` when installing, and _skip_
installing if we find a package installed via the direct URL that the
user is requesting.

A lot of TODOs, especially around cleaning up the `Source` abstraction
and its relationship to `DirectUrl`. I'm gonna keep working on these
today, but this works and makes the requirements clear.

Closes #332.
2023-11-07 09:11:27 -05:00
konsti c11586f2f0
Fix index out of bounds in SourceDistributionFilename::parse (#353)
Found this one in the top 8k pypi tests too
2023-11-07 11:44:40 +00:00
konsti c883b123ac
Allow greater than star (`torch (>=1.9.*)`) in lenient requirement (#351)
This appeared in the pypi top 8k testing.
2023-11-07 11:37:23 +00:00
konsti fbe28d3b7c
Fix mastodon-py dist-info handling (#336)
mastodon-py 1.5.1 uses a dot in its dist-info dir name, which we
previously didn't handle, causing home-assistant to fail. The new
implementation is based on
2f83540272/src/packaging/utils.py (L146-L172).

Part of #199

```
unzip -l  Mastodon.py-1.5.1-py2.py3-none-any.whl
Archive:  Mastodon.py-1.5.1-py2.py3-none-any.whl
  Length      Date    Time    Name
---------  ---------- -----   ----
   153929  2020-02-29 17:39   mastodon/Mastodon.py
     1029  2019-10-11 19:15   mastodon/__init__.py
     7357  2019-10-11 20:24   mastodon/streaming.py
       10  2020-03-14 18:14   Mastodon.py-1.5.1.dist-info/DESCRIPTION.rst
     1398  2020-03-14 18:14   Mastodon.py-1.5.1.dist-info/metadata.json
        9  2020-03-14 18:14   Mastodon.py-1.5.1.dist-info/top_level.txt
      110  2020-03-14 18:14   Mastodon.py-1.5.1.dist-info/WHEEL
     1543  2020-03-14 18:14   Mastodon.py-1.5.1.dist-info/METADATA
      753  2020-03-14 18:14   Mastodon.py-1.5.1.dist-info/RECORD
---------                     -------
   166138                     9 files
```
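
For illustration, a minimal sketch of splitting such a directory name without tripping over dots in the package name. The helper name is hypothetical and this is not the implementation referenced above; it assumes the version itself contains no dash:

```rust
/// Split a `.dist-info` directory name into `(name, version)`.
/// Splits on the last dash rather than on dots, so `Mastodon.py` stays intact.
fn parse_dist_info_dir(dir: &str) -> Option<(&str, &str)> {
    let stem = dir.strip_suffix(".dist-info")?;
    let (name, version) = stem.rsplit_once('-')?;
    if name.is_empty() || version.is_empty() {
        return None;
    }
    Some((name, version))
}
```

For the listing above, `parse_dist_info_dir("Mastodon.py-1.5.1.dist-info")` yields the name `Mastodon.py` and the version `1.5.1`.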
2023-11-07 12:36:11 +01:00
konsti aac8ae997f
Rename source distribution build to source build (#334)
This is less verbose and better reflects that we're building both source
distributions and source trees passed into the function.
2023-11-07 03:55:23 +00:00
Charlie Marsh 620afc3caf
Avoid refreshing Git repo twice (#350)
This was a bug in the Git code (that I wrote, not from Cargo) -- when we
`precise` the reference, we should store the resolved commit.
2023-11-07 02:52:15 +00:00
Charlie Marsh 243549876c
Upgrade PubGrub (#349)
Upgrades to `fe309ffb63b2f3ce9b35eb7746b2350cd704515e`, with our changes
layered on top.
2023-11-07 02:00:57 +00:00
Charlie Marsh 2c114592bd
Only store small wheels in-memory (#348)
Closes https://github.com/astral-sh/puffin/issues/246.
2023-11-07 00:50:00 +00:00
Zanie Blue e952557bf1
Improve root message when version solving fails (#344)
Matching description at
https://github.com/dart-lang/pub/blob/master/doc/solver.md#linear-error-reporting
2023-11-06 20:07:50 +00:00
Zanie Blue b0720ea5b2
Improve error message for dependencies with no versions available (#342)
Partially addresses https://github.com/astral-sh/puffin/issues/310
Addresses case at
https://github.com/astral-sh/puffin/issues/309#issuecomment-1793541558
Follow-up to #300 ensuring `PuffinExternal` is used consistently when
formatting messages

Example at
https://github.com/astral-sh/puffin/pull/342/files#diff-5c74a74ef34ef1d6e7453de8d2d19134813156e8b6a657e6b5ed71fda5a3a870
2023-11-06 14:04:29 -06:00
Zanie Blue 1748cfb522
Display dependency versions in pip-like format during solve failure (#346)
- Display `==` for exact version ranges
- Remove space between dependency and version range
2023-11-06 13:53:15 -06:00
Charlie Marsh a5e535f6fb
Remove `virtualenv` setup from gourgeist (#339)
We now only support building bare environments.
2023-11-06 18:32:45 +00:00
Charlie Marsh b013ea9c93
Move `DirectUrl` into `pypi-types` (#343)
This needs to be reused elsewhere, and there's nothing specific to wheel
installation about it.
2023-11-06 18:26:33 +00:00
Charlie Marsh 24e30e6557
Split `puffin-package` into requirements.txt parser and `pypi-types` (#341)
There are only two things left in this crate and they don't really have
anything to do with one another.
2023-11-06 18:19:49 +00:00
Charlie Marsh 1f447892f3
Rename `PartitionedRequirements` to `InstallPlan` (#340)
@konstin named this file at some point and I like it, it feels
appropriate for the struct itself too.
2023-11-06 12:44:35 -05:00
Charlie Marsh d9bcfafa16
Write `direct_url.json` in wheel installer (#337)
## Summary

This PR just adds the logic in `install-wheel-rs` to write
`direct_url.json`. We're not actually taking advantage of it yet (or
wiring it through) in Puffin.

Part of https://github.com/astral-sh/puffin/issues/332.
2023-11-06 17:09:28 +00:00
konsti 9b077f3d0f
`cargo upgrade --incompatible` (#330)
Ran `cargo upgrade --incompatible`, seems there are no changes required.

From cacache 0.12.0:
> BREAKING CHANGE: some signatures for copy have changed, and copy no
longer automatically reflinks

`which` 5.0.0 seems to have only error message changes.
2023-11-06 14:14:47 +00:00
konsti d99ca3159b
Cache the setup.py resolution (#327)
Cache the resolution for the setup.py requirements (`pip`, `setuptools`,
`wheel`) across builds.
2023-11-06 14:14:24 +00:00
konsti b2439b24a1
Fetch wheel metadata by async range requests on the remote wheel (#301)
Use range requests and async zip to extract the METADATA file from a
remote wheel.

We currently only cache when the remote declares the resource as
immutable, see
https://github.com/06chaynes/http-cache/issues/57 and
https://github.com/baszalmstra/async_http_range_reader/pull/1 . The
cache is stored as JSON with the description omitted, which improves cache
deserialization performance.
2023-11-06 15:06:49 +01:00
konsti 6f83a44fea
Improve error messages and make cache failures non fatal (#333) 2023-11-06 15:06:27 +01:00
konsti 3defe233e6
Use dist info name in cache again (#331)
Fixup for the `PackageName`/`DistInfoName` refactor that would lead to
invalid cache entries
2023-11-06 13:47:38 +00:00
Charlie Marsh 6d672b8951
Add source distribution support to `pip-compile` (#323)
## Summary

This is a first pass at adding source distribution support to the
installer.

The previous installation flow was:

1. Come up with a plan.
1. Find a distribution (specific file) for every package that we'll need
to download.
1. Download those distributions.
1. Unzip them (since we assumed they were all wheels).
1. Install them into the virtual environment.

Now, Step (3) downloads both wheels and source distributions, and we
insert a step between Steps (3) and (4) to build any source
distributions into zipped wheels.

There are a bunch of TODOs, the most important (IMO) is that we
basically have two implementations of downloading and building, between
the stuff in `puffin_installer` and `puffin_resolver` (namely in
`crates/puffin-resolver/src/distribution`). I didn't attempt to clean
that up here -- it's already a problem, and it's related to the overall
problem we need to solve around unified caching and resource management.

Closes #243.
2023-11-06 08:22:36 -05:00
konsti b79a15b458
Update pyproject-toml to 0.8.0 (#329) 2023-11-06 13:16:36 +00:00
konsti 81f380b10e
Validate package and extra name (#290)
`PackageName` and `ExtraName` can now only be constructed from valid
names. They share the same rules, so I gave them the same
implementation. Constructors are split between `new` (owned) and
`from_str` (borrowed), with the owned version avoiding allocations.

Closes #279

---------

Co-authored-by: Zanie <contact@zanie.dev>
2023-11-06 10:04:31 +00:00
Charlie Marsh ea28b3d0d3
Add a git feature to tests (#325) 2023-11-06 05:32:43 +00:00
Charlie Marsh 8463e92121
Fix bad Flask reference in tests (#324) 2023-11-06 05:20:43 +00:00
Charlie Marsh 1637f1c216
Add source distribution support to the `DistributionFinder` (#322)
## Summary

This just enables the `DistributionFinder` (previously known as the
`WheelFinder`) to select source distributions when there are no matching
wheels for a given platform. As a reminder, the `DistributionFinder` is
a simple resolver that doesn't look at any dependencies: it just takes a
set of pinned packages, and finds a distribution to install to satisfy
each requirement.
2023-11-06 00:16:04 -05:00
Charlie Marsh d785ffdbff
Move `Source` abstraction into `puffin-distribution` (#321)
No code changes, but this will allow it to be shared between the
installer and the resolver.
2023-11-06 02:31:15 +00:00
Charlie Marsh 4b83d8e949
Require URL dependencies to be declared upfront (#319)
In the resolver, our current model for solving URL dependencies requires
that we visit the URL dependency _before_ the registry-based dependency.
This PR encodes a strict requirement that all URL dependencies be
declared upfront, either as requirements or constraints.

I wrote more about how it works and why it's necessary in documentation
[here](https://github.com/astral-sh/puffin/pull/319/files#diff-2b1c4f36af0c62a2b7bebeae9473ae083588f2a6b18a3ec52393a24266adecbbR20).
I think we could relax this constraint over time, but it requires a more
sophisticated model -- and for now, I just want something that's (1)
correct, (2) easy for us to reason about, and (3) easy for users to
reason about.

As additional motivation... allowing arbitrary URL dependencies anywhere
in the tree creates some really confusing situations in which I'm not
even sure what the right answers are. For example, assume you declare a
direct dependency on `Werkzeug==2.0.0`. You then depend on a version of
Flask that depends on a version of `Werkzeug` from some arbitrary URL.
You build the source distribution at that arbitrary URL, and it turns
out it _does_ build to a declared version of 2.0.0. What should happen?
(And if it resolves to a version that _isn't_ 2.0.0, what should happen
_then_?) I suspect different tools handle this differently, but it must
lead to a lot of "silent" failures. In my testing of Poetry, it seems
like Poetry just ignores the URL dependency, which seems wrong, but is
also a behavior we could implement in the future.
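
A rough sketch of the "declared upfront" rule, assuming the declared URL dependencies are collected into a map from package name to URL before resolution starts. The function name, the map shape, and the choice to reject a mismatching URL are assumptions for illustration:

```rust
use std::collections::HashMap;

/// Hypothetical check: a URL dependency encountered during resolution is only
/// accepted if the same package was declared with the same URL upfront
/// (as a requirement or a constraint).
fn check_url_dependency(
    declared: &HashMap<String, String>,
    package: &str,
    url: &str,
) -> Result<(), String> {
    match declared.get(package) {
        Some(expected) if expected.as_str() == url => Ok(()),
        Some(expected) => Err(format!(
            "`{package}` was declared with URL `{expected}`, but a dependency requested `{url}`"
        )),
        None => Err(format!(
            "`{package}` was requested from `{url}`, but URL dependencies must be declared upfront"
        )),
    }
}
```

Under this model the Werkzeug example fails loudly: the transitive URL dependency was never declared, so the resolver reports an error instead of guessing.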

Closes https://github.com/astral-sh/puffin/issues/303.
Closes https://github.com/astral-sh/puffin/issues/284.
2023-11-05 17:09:58 +00:00
Charlie Marsh c03b4da3a2
Properly remove `.git` extension even for URLs with `@` commit markers (#320) 2023-11-04 19:45:30 +00:00
Charlie Marsh a53188cac7
Avoid unnecessarily fetching non-marker-required first-party dependencies (#318)
E.g., given:

```
flask; python_version < '3.7'
requests
```

We shouldn't request the metadata for Flask when on Python versions 3.7
or later.
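
As a sketch of the idea, with a heavily simplified marker model (only `python_version < X.Y`; the `Requirement` struct and `applicable` function are hypothetical), the requirements can be filtered before any metadata is requested:

```rust
/// A requirement with an optional, highly simplified marker:
/// `python_version < (major, minor)`.
struct Requirement {
    name: &'static str,
    python_version_below: Option<(u32, u32)>,
}

/// Return the names of the requirements whose marker holds for `python`.
/// Only these would need a metadata request.
fn applicable(requirements: &[Requirement], python: (u32, u32)) -> Vec<&'static str> {
    requirements
        .iter()
        .filter(|r| r.python_version_below.map_or(true, |bound| python < bound))
        .map(|r| r.name)
        .collect()
}
```

For the example above, Python 3.8 keeps only `requests`, while Python 3.6 keeps both `flask` and `requests`.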
2023-11-04 17:03:43 +00:00
Charlie Marsh 051188dce0
Use separate representations for canonical repository vs. commit (#317)
Given `https://github.com/pypa/package.git#subdirectory=pkg_a` and
`https://github.com/pypa/package.git#subdirectory=pkg_b`, we want these
to map to the same shared _resource_ (for locking and cloning), but
different _packages_ (for determining whether the wheel already exists
in the cache). As such, we need two distinct concepts for "canonical
equality".
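
A minimal sketch of the two notions of equality, assuming the only difference between them is the URL fragment. Any real canonicalization would do more than this, and both function names are hypothetical:

```rust
/// Key for the shared resource (cloning and locking): the URL without its fragment.
fn repository_key(url: &str) -> &str {
    url.split_once('#').map_or(url, |(repo, _)| repo)
}

/// Key for the package (wheel cache lookups): the URL including `#subdirectory=...`.
fn package_key(url: &str) -> &str {
    url
}
```

The two URLs above then share a `repository_key` but have different `package_key`s.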

Closes #316.
2023-11-04 11:46:42 -04:00
Charlie Marsh b589813e59
Enforce that built package name matches declared package name (#315)
Closes https://github.com/astral-sh/puffin/issues/306.
2023-11-03 22:58:12 +00:00
Charlie Marsh 643cf3b3aa
Unify subdirectory handling in `source.rs` (#314)
Avoids having to encode all the `git+` and `subdirectory=` logic in
multiple places.
2023-11-03 19:33:38 +00:00
Charlie Marsh edce4ccb24
Add support for subdirectories in URL dependencies (#312)
Closes https://github.com/astral-sh/puffin/issues/307.
2023-11-03 15:28:38 -04:00
Zanie Blue cbfd6af125
Error if `--all-extras` is used without a `pyproject.toml` source (#292)
Closes https://github.com/astral-sh/puffin/issues/260
2023-11-03 12:07:32 -05:00
Charlie Marsh aa9882eee8
Use locks to prevent concurrent accesses to the same Git repo (#304)
Ensures that if we need to access the same Git repo twice in a
resolution, we only have one handle to that repo at a time. (Otherwise,
`git2` panics.)
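
One way to sketch this is a map of per-repository locks. The sketch uses blocking `std::sync` primitives for brevity, and the `RepoLocks` type is hypothetical, so treat it as an illustration of the idea rather than the code in this PR:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical registry that hands out one lock per repository URL.
#[derive(Default)]
struct RepoLocks {
    locks: Mutex<HashMap<String, Arc<Mutex<()>>>>,
}

impl RepoLocks {
    /// Return the lock for `repo`, creating it on first use. Callers hold the
    /// guard from this lock for as long as they touch the repository on disk.
    fn for_repo(&self, repo: &str) -> Arc<Mutex<()>> {
        let mut locks = self.locks.lock().unwrap();
        locks.entry(repo.to_string()).or_default().clone()
    }
}
```

A caller would write `let lock = locks.for_repo(url); let _guard = lock.lock().unwrap();` before fetching, so a second access to the same repository waits while other repositories proceed in parallel.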
2023-11-03 16:33:14 +00:00
Charlie Marsh fa1bbbbe08
Write fully-precise Git SHAs to `pip-compile` output (#299)
This PR adds a mechanism by which we can ensure that we _always_ try to
refresh Git dependencies when resolving; further, we now write the fully
resolved SHA to the "lockfile". However, nothing in the code _assumes_
we do this, so the installer will remain agnostic to this behavior.

The specific approach taken here is minimally invasive. Specifically,
when we try to fetch a source distribution, we check if it's a Git
dependency; if it is, we fetch, and return the exact SHA, which we then
map back to a new URL. In the resolver, we keep track of URL
"redirects", and then we use the redirect (1) for the actual source
distribution building, and (2) when writing back out to the lockfile. As
such, none of the types outside of the resolver change at all, since
we're just mapping `RemoteDistribution` to `RemoteDistribution`, but
swapping out the internal URLs.
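
The redirect bookkeeping can be sketched as a small map from the requested URL to its "precise" counterpart. The names `precise_url` and `Redirects` are hypothetical, and the sketch assumes the reference (if any) follows an `@` in the last path segment and that the URL has no fragment:

```rust
use std::collections::HashMap;

/// Rewrite `url` so that it pins `sha`, replacing any existing `@reference`
/// in the last path segment.
fn precise_url(url: &str, sha: &str) -> String {
    let base = match url.rsplit_once('/') {
        Some((_, last)) => match last.rfind('@') {
            // Byte offset of the `@` within the full URL.
            Some(i) => &url[..url.len() - last.len() + i],
            None => url,
        },
        None => url,
    };
    format!("{base}@{sha}")
}

/// Hypothetical redirect table kept by the resolver.
#[derive(Default)]
struct Redirects(HashMap<String, String>);

impl Redirects {
    /// Remember that `original` resolved to the commit `sha`.
    fn record(&mut self, original: &str, sha: &str) {
        self.0.insert(original.to_string(), precise_url(original, sha));
    }

    /// Return the precise URL if one was recorded, else the URL unchanged.
    fn resolve<'a>(&'a self, url: &'a str) -> &'a str {
        self.0.get(url).map_or(url, String::as_str)
    }
}
```

Both the source distribution build and the lockfile writer would call `resolve`, which is why nothing outside the resolver has to change.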

There are some inefficiencies here since, e.g., we do the Git fetch,
send back the "precise" URL, then a moment later, do a Git checkout of
that URL (which will be _mostly_ a no-op -- since we have a full SHA, we
don't have to fetch anything, but we _do_ check back on disk to see if
the SHA is still checked out). A more efficient approach would be to
return the path to the checked-out revision when we do this conversion
to a "precise" URL, since we'd then only interact with the Git repo
exactly once. But this runs the risk that the checked-out SHA changes
between the time we make the "precise" URL and the time we build the
source distribution.

Closes #286.
2023-11-03 16:26:57 +00:00
Zanie Blue addcfe533a
Implement custom resolution failure reporter to hide root package versions (#300)
Extends #295 
Closes #214 

Copies some of the implementations from `pubgrub::report` so we can
implement Puffin `PubGrubPackage` specific display when explaining
failed resolutions.

Here, we just drop the dummy version number if it's a
`PubGrubPackage::Root` package. In the future, we can further customize
reporting.
2023-11-03 10:47:01 -05:00
Zanie Blue e1382cc747
Report project name instead of `root` when using `pyproject.toml` files (#295)
Part of https://github.com/astral-sh/puffin/issues/214

Adds a `project: Option<PackageName>` to the `Manifest`, `Resolver`, and
`RequirementsSpecification` to populate an optional `name` for
`PubGrubPackage::Root`.

I'll work on removing the version number next.

Should we consider using the parent directory name when a
`pyproject.toml` file is not present?
2023-11-03 10:22:10 -05:00
konsti e008c43f29
Add PackageName::as_dist_info_name (#305)
From
https://packaging.python.org/en/latest/specifications/recording-installed-packages/#recording-installed-packages

> This directory is named as {name}-{version}.dist-info, with name and
version fields corresponding to Core metadata specifications. Both
fields must be normalized (see Package name normalization and PEP 440
for the definition of normalization for each field respectively), and
replace dash (-) characters with underscore (_) characters, so the
.dist-info directory always has exactly one dash (-) character in its
stem, separating the name and version fields.
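
A sketch of what the quoted rule implies for the name field, assuming the usual name normalization (lowercase, runs of `-`, `_`, `.` collapsed to a single `-`). Version normalization is left out and the function names are hypothetical:

```rust
/// Normalize a package name: lowercase, and collapse runs of `-`, `_`, `.`
/// into a single `-`.
fn normalize(name: &str) -> String {
    let mut out = String::with_capacity(name.len());
    let mut last_was_separator = false;
    for c in name.chars() {
        if matches!(c, '-' | '_' | '.') {
            if !last_was_separator {
                out.push('-');
            }
            last_was_separator = true;
        } else {
            out.push(c.to_ascii_lowercase());
            last_was_separator = false;
        }
    }
    out
}

/// Build the `.dist-info` directory name: dashes in the normalized name become
/// underscores, so the only dash left separates name and version.
/// `version` is assumed to be normalized already.
fn dist_info_dir(name: &str, version: &str) -> String {
    format!("{}-{}.dist-info", normalize(name).replace('-', "_"), version)
}
```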

Follow up to #278
2023-11-03 08:16:44 +00:00
Charlie Marsh e47d3f1f66
Respect pip-like Git branch, tag, and commit references (#297)
We need to parse revisions out from URLs like `MyProject @
git+https://git.example.com/MyProject.git@v1.0`, per [VCS
Support](https://pip.pypa.io/en/stable/topics/vcs-support/). Cargo has
the advantage that it uses a TOML table in its configuration, so the
user has to specify whether they're fetching a commit, a tag, a branch,
etc. We have to instead assume that anything that isn't clearly a commit
is _either_ a branch or a tag.
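
A rough sketch of that classification. It assumes that "clearly a commit" means a full 40-character hexadecimal SHA and that the revision follows an `@` in the last path segment; the `GitReference` enum and `parse_git_url` function are illustrative names, not necessarily the ones used in the PR:

```rust
#[derive(Debug, PartialEq)]
enum GitReference {
    /// No `@...` suffix was given.
    DefaultBranch,
    /// Looks like a full commit SHA.
    FullCommit(String),
    /// Anything else: could be a branch or a tag, we can't tell from the URL.
    BranchOrTag(String),
}

/// Split a pip-style VCS URL into the repository URL and the requested reference.
fn parse_git_url(url: &str) -> (&str, GitReference) {
    let url = url.strip_prefix("git+").unwrap_or(url);
    // Only look for `@` in the last path segment, so that user info such as
    // `ssh://git@host/...` is not mistaken for a revision.
    let Some((_, last)) = url.rsplit_once('/') else {
        return (url, GitReference::DefaultBranch);
    };
    let Some(i) = last.rfind('@') else {
        return (url, GitReference::DefaultBranch);
    };
    let at = url.len() - last.len() + i;
    let (repo, rev) = (&url[..at], &url[at + 1..]);
    let reference = if rev.len() == 40 && rev.chars().all(|c| c.is_ascii_hexdigit()) {
        GitReference::FullCommit(rev.to_string())
    } else {
        GitReference::BranchOrTag(rev.to_string())
    };
    (repo, reference)
}
```

For `git+https://git.example.com/MyProject.git@v1.0` this yields the repository `https://git.example.com/MyProject.git` and `BranchOrTag("v1.0")`.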

Closes https://github.com/astral-sh/puffin/issues/296.
2023-11-02 15:10:02 -04:00
Charlie Marsh a4002fe132
Make cache non-optional in most crates (#293)
This PR makes the cache non-optional in most of Puffin, which simplifies
the code, allows us to reuse the cache within a single command (even
with `--no-cache`), and also allows us to use the cache for disk storage
across an invocation.

I left the cache as optional for the `Virtualenv` and `InterpreterInfo`
abstractions, since those are generic enough that it seems nice to have
a non-cached version, but it's kind of arbitrary.
2023-11-02 13:40:20 -04:00
Charlie Marsh a02bf2e415
Split `source_distribution.rs` into separate wheel and sdist fetchers (#291) 2023-11-02 16:04:51 +00:00
konsti c6f2dfd727
Use shared insta filters (#270)
Internal refactoring for consistency between tests
2023-11-02 16:42:59 +01:00
Charlie Marsh 62c474d880
Add support for Git dependencies (#283)
## Summary

This PR adds support for Git dependencies, like:

```
flask @ git+https://github.com/pallets/flask.git
```

Right now, they're only supported in the resolver (and not the
installer), since the installer doesn't yet support source distributions
at all.

The general approach here is based on Cargo's Git implementation.
Specifically, I adapted Cargo's
[`git`](23eb492cf9/src/cargo/sources/git/mod.rs)
module to perform the cloning, which is based on `libgit2`.

As compared to Cargo's implementation, I made the following changes:

- Removed any unnecessary code.
- Fixed any Clippy errors for our stricter ruleset.
- Removed the dependency on `curl`, in favor of `reqwest` which we use
elsewhere.
- Removed the ability to use `gix`. Cargo allows the use of `gix` as an
experimental flag, but it only supports a small subset of the
operations. When Cargo fully adopts `gix`, we should plan to do the
same.
- Removed Cargo's host key checking. We need to re-add this! I'll do it
shortly.
- Removed Cargo's progress bars. We should re-add this too, but we use
`indicatif` and Cargo had their own thing.

There are a few follow-ups to consider:

- Adding support in the installer.
- When we lock, we should write out the Git URL that includes the exact
SHA. This lets us cache in perpetuity and avoids dependencies changing
without re-locking.
- When we resolve, we should _always_ try to refresh Git dependencies.
(Right now, we skip if the wheel was already built.)

I'll work on the latter two in follow-up PRs.

Closes #202.
2023-11-02 15:14:55 +00:00
konsti 4adaa9a700
Wheel filename distribution package name (#278)
The normalized name abstractions were not used consistently; this PR uses
them where they were previously missing:
* `WheelFilename::distribution`
* `Requirement::name`
* `Requirement::extras`
* `Metadata21::name`
* `Metadata21::provides_dist`

With `puffin-package` depending on `pep508_rs`, this would be a cyclical
crate dependency, so `puffin-normalize` gets split out from
`puffin-package`.

`DistInfoName` has the same task and semantics as `PackageName`, so it's
merged into the latter.

`PackageName` and `ExtraName` documentation is moved onto the type and
their constructors are called `new` instead of `normalize`. We now use
these constructors rarely enough that the implicit allocation by
`to_string()` shouldn't matter anymore, while more actual cloning
becomes visible.
2023-11-02 11:15:27 +00:00
konsti 9488804024
Add docker builder (#238)
This docker container provides isolation of source distribution builds,
whether [intended to be
helpful](https://pypi.org/project/nvidia-pyindex/) or other more or less
malicious forms of host system modification.

Fixes #194

---------

Co-authored-by: Zanie Blue <contact@zanie.dev>
2023-11-02 12:03:56 +01:00
Charlie Marsh 2ee555df7b
Use `puffin_cache::digest` in another site (#289) 2023-11-02 04:48:14 +00:00
Charlie Marsh 0c9e975f75
Rename `distribution.rs` to `file.rs` in `puffin-resolver` (#288) 2023-11-01 23:52:53 -04:00
Zanie Blue b8ff32f6be
Respect markers on constraints (#282)
Closes #252
2023-11-01 20:20:32 -05:00
Charlie Marsh 8123e1a8f6
Add stable hash crate (#281)
This PR adds a `puffin-cache` crate that we can share across a variety of
other crates to generate stable hashes.
2023-11-01 23:41:45 +00:00
Zanie Blue 67e3e45839
Add support for `--all-extras` to `pip-compile` (#259)
Closes #244

Notable decision to error if `--all-extras` and `--extra <name>` are both
provided.
2023-11-01 13:39:49 -05:00
konsti c6aa1cd7a3
Only fall back to copy when the first hard linking failed (#268)
Hard linking might not be supported, but (afaik) we can't detect this
ahead of time, so we try hard linking the first file. If this succeeds,
we know that later hard linking errors are not due to a lack of OS/FS
support; if it fails, we switch to copying for the rest of the install.
Follow up to
https://github.com/astral-sh/puffin/pull/237#discussion_r1376705137
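
The strategy can be sketched as a tiny state machine. The `LinkMode` enum and `place_file` function are hypothetical names for this illustration; only `std::fs::hard_link` and `std::fs::copy` are real APIs:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// What we have learned so far about hard link support on the target.
#[derive(Debug, Clone, Copy, PartialEq)]
enum LinkMode {
    Untested,
    HardLink,
    Copy,
}

/// Place `from` at `to`, probing hard link support on the first file only.
fn place_file(mode: &mut LinkMode, from: &Path, to: &Path) -> io::Result<()> {
    match *mode {
        LinkMode::Untested => {
            if fs::hard_link(from, to).is_ok() {
                *mode = LinkMode::HardLink;
            } else {
                // Assume hard links are unsupported and copy from now on.
                *mode = LinkMode::Copy;
                fs::copy(from, to)?;
            }
            Ok(())
        }
        // Hard links worked before, so a failure here is a real error.
        LinkMode::HardLink => fs::hard_link(from, to),
        LinkMode::Copy => fs::copy(from, to).map(|_| ()),
    }
}
```

Once the first file has settled the mode, later hard link failures are reported rather than silently papered over with a copy.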
2023-11-01 18:35:52 +01:00
konsti b0678aa6fc
Show dev resolve output (#277)
Show the resolution in a concise format for puffin-dev. Note that this
doesn't affect the main puffin output, it's just more convenient for me
when developing.
2023-11-01 15:54:47 +00:00
konsti d1af90163b
Improve client reqwest errors (#276)
I had to debug a failure involving these errors and had to improve their
output.
2023-11-01 16:52:58 +01:00
konsti 997228f4be
Add resolve from cli dev command (#272)
I don't want to create a new file for every requirement I test
2023-11-01 15:46:37 +00:00
Zanie Blue 3d5f8249ef
Add validation of extra names (#257)
Extends #254 

Adds validation of extra names provided by users in `pip-compile` e.g. 

```
error: invalid value 'foo!' for '--extra <EXTRA>': Extra names must start and end with a
letter or digit and may only contain -, _, ., and alphanumeric characters
```

We'll want to add something similar to `PackageName`. I'd be curious to
improve the API, making the unvalidated nature of `::normalize` clear?
Perhaps worth pursuing later though as I don't have a better idea.
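
A minimal sketch of the rule stated in the error message above, assuming "letter or digit" means ASCII alphanumerics. The function name is hypothetical and this is not the validation code from the PR:

```rust
/// Check that an extra name starts and ends with a letter or digit and
/// otherwise only contains `-`, `_`, `.`, and alphanumeric characters.
fn validate_extra_name(name: &str) -> Result<(), String> {
    let allowed = |c: char| c.is_ascii_alphanumeric() || matches!(c, '-' | '_' | '.');
    let edge_ok = |c: Option<char>| c.map_or(false, |c| c.is_ascii_alphanumeric());
    if name.chars().all(allowed)
        && edge_ok(name.chars().next())
        && edge_ok(name.chars().last())
    {
        Ok(())
    } else {
        Err(format!("invalid extra name '{name}'"))
    }
}
```

An empty name is rejected too, since it has no first or last character to satisfy the rule.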
2023-11-01 10:40:43 -05:00
Charlie Marsh 2652caa3e3
Add support for URL dependencies (#251)
## Summary

This PR adds support for resolving and installing dependencies via
direct URLs, like:

```
werkzeug @ 960bb4017c4aed12b5ed8b78e0153e/Werkzeug-2.0.0-py3-none-any.whl
```

These are fairly common (e.g., with `torch`), but you most often see
them as Git dependencies.

Broadly, structs like `RemoteDistribution` and friends are now enums
that can represent either registry-based dependencies or URL-based
dependencies:

```rust
/// A built distribution (wheel) that exists as a remote file (e.g., on `PyPI`).
#[derive(Debug, Clone)]
#[allow(clippy::large_enum_variant)]
pub enum RemoteDistribution {
    /// The distribution exists in a registry, like `PyPI`.
    Registry(PackageName, Version, File),
    /// The distribution exists at an arbitrary URL.
    Url(PackageName, Url),
}
```

In the resolver, we now allow packages to take on an extra, optional
`Url` field:

```rust
#[derive(Debug, Clone, Eq, Derivative)]
#[derivative(PartialEq, Hash)]
pub enum PubGrubPackage {
    Root,
    Package(
        PackageName,
        Option<DistInfoName>,
        #[derivative(PartialEq = "ignore")]
        #[derivative(PartialOrd = "ignore")]
        #[derivative(Hash = "ignore")]
        Option<Url>,
    ),
}
```

However, for the purpose of version satisfaction, we ignore the URL.
This allows for the URL dependency to satisfy the transitive request in
cases like:

```
flask==3.0.0
werkzeug @ 254c3e9b5f5941e900b71206e6313b/werkzeug-3.0.1-py3-none-any.whl
```

There are a couple limitations in the current approach:

- The caching for remote URLs is done separately in the resolver vs. the
installer. I decided not to sweat this too much... We need to figure out
caching holistically.
- We don't support any sort of time-based cache for remote URLs -- they
just exist forever. This will be a problem for URL dependencies, where
we need some way to evict and refresh them. But I've deferred it for
now.
- I think I need to redo how this is modeled in the resolver, because
right now, we don't detect a variety of invalid cases, e.g., providing
two different URLs for a dependency, asking for a URL dependency and a
_different version_ of the same dependency in the list of first-party
dependencies, etc.
- (We don't yet support VCS dependencies.)
2023-11-01 09:21:44 -04:00
Zanie Blue fa9f8df396
Fix test snapshot filter when runtime is greater than 1s (#267)
Tests would sometimes flake with this locally e.g. "1.50s" was not
filtered correctly.

Verified with

```diff
diff --git a/crates/puffin-cli/src/commands/pip_compile.rs b/crates/puffin-cli/src/commands/pip_compile.rs
index 0193216..2d6f8af 100644
--- a/crates/puffin-cli/src/commands/pip_compile.rs
+++ b/crates/puffin-cli/src/commands/pip_compile.rs
@@ -150,6 +150,8 @@ pub(crate) async fn pip_compile(
         result => result,
     }?;
 
+    std::thread::sleep(std::time::Duration::from_secs(1));
+
     let s = if resolution.len() == 1 { "" } else { "s" };
     writeln!(
         printer,
```
2023-11-01 13:15:06 +00:00
Charlie Marsh 079b685c8c
Use distributions for `Reporter` signatures (#266) 2023-11-01 03:19:13 +00:00
Charlie Marsh bee1b0f5ad
Avoid re-parsing wheel filename in source distribution tree (#265) 2023-10-31 21:02:09 +00:00
Charlie Marsh aff26f2301
Reuse distribution structs in Resolver's `source_distribution.rs` (#264) 2023-10-31 20:50:34 +00:00
Zanie Blue 4be9ba483f
Remove implicit clone from `ExtraName` and document requirement in `PackageName` (#262)
per discussion in #137


https://discord.com/channels/1039017663004942429/1148719284013510676/1169000261746962473
2023-10-31 15:24:27 -05:00
Zanie Blue 0dc7e6335e
Default `puffin venv` path to `.venv` (#261)
Closes https://github.com/astral-sh/puffin/issues/236
2023-10-31 15:24:19 -05:00
Zanie Blue e00d208318
Add documentation to `PackageName::normalize` (#263) 2023-10-31 15:24:08 -05:00
Charlie Marsh 89dad0c9ad
Move distribution abstraction into shared crate (#258)
This also allows us to get rid of `PinnedPackage` _and_ to remove some
`Result<...>` types due to needless conversions between
otherwise-identical types.
2023-10-31 15:30:06 -04:00
Zanie Blue 1ddb7d2827
Add error when user requests extras that do not exist (#254)
Extends #253 
Closes #241 

Adds `extras` to `RequirementsSpecification` to track extras used to
construct the requirements so we can throw an error when not all of the
requested extras are used.
2023-10-31 19:17:36 +00:00
Zanie Blue 322532d6f9
Normalize optional dependency group names in pyproject files (#253)
Going to add some tests.

Extends #239 
Closes #245 

Normalizes optional dependency group names found in pyproject files
before comparing them to the normalized user-requested extras.
2023-10-31 14:15:00 -05:00
Charlie Marsh 3312ce30f5
Upgrade crates and remove unused dependencies (#256) 2023-10-31 13:16:58 -04:00
Charlie Marsh 16aac834ee
Move PyPI-oriented types out of `puffin-client` crate (#255)
Just an internal change to avoid a dependency on `puffin-client` for
those crates that need access to PyPI-metadata types.
2023-10-31 17:10:23 +00:00
Zanie Blue 08f09e4743
Add support for `pip-compile --extra <name>` (#239)
Adds support for `pip-compile --extra <name> ...` which includes
optional dependencies in the specified group in the resolution.

Following precedent in `pip-compile`, if a given extra is not found,
there is no error. ~We could consider warning in this case.~ We should
probably add an error but it expands scope and will be considered
separately in #241
2023-10-31 11:59:40 -05:00
Charlie Marsh 9244404102
Resolve interpreter symlinks when creating virtual environments (#250)
Closes https://github.com/astral-sh/puffin/issues/249.
2023-10-31 08:22:52 -04:00
Charlie Marsh 2f38701008
Remove unused wheel cache argument from downloader (#248) 2023-10-31 02:23:50 +00:00
Charlie Marsh ae203f998a
Rename `Unzipper#download` to `Unzipper#unzip` (#247) 2023-10-31 01:19:27 +00:00
Charlie Marsh 1f059b30dd
Remove `Box<Pin<...>>` from `process_request` (#242) 2023-10-30 16:34:57 -04:00
konsti 35d6bd761b
Fallback to copy if hardlinking failed (#237) 2023-10-30 19:10:01 +00:00
konstin 1529def563
Implement mixed PEP 517 and setup.py build
There are packages such as DTLSSocket 0.1.16 that say
```toml
[build-system]
requires = ["Cython<3", "setuptools", "wheel"]
```
In this case we need to install the `requires` entries PEP 517-style, but then call setup.py in the
legacy way.

Part of making home-assistant work
2023-10-30 19:11:52 +01:00
konsti 29bd0a4ed8
Fix musl compilation (#234)
musl (which we already use in ruff) allows statically linked binaries on
linux. This PR switches to rustls and vendors and fixes the glibc
detection. Using static musl builds makes it easier to avoid glibc
errors in docker and we'll need it later for alpine users anyway.

An alternative is using vendored openssl.
2023-10-30 18:10:17 +01:00
konsti d47dc64974
Ignore self requirements (#233)
gps3 0.33.3 depends on itself, which we can ignore. I've also added the
home assistant requirements since it occurred when testing with this.
2023-10-30 17:13:52 +01:00
Charlie Marsh 0be20a41a4
Make version selection wheel-vs.-sdist-agnostic (#232)
Closes https://github.com/astral-sh/puffin/issues/231.
2023-10-30 11:21:10 -04:00
Charlie Marsh 8d992dca3f
Fail gracefully when invalid markers are stored (#230) 2023-10-30 04:02:51 +00:00
Charlie Marsh e73d3f0ff8
Use bounded ranges rather than constructing manual ranges (#228)
I didn't realize this, but they made a bunch of improvements to how
PubGrub represents versions which lets us greatly simplify our own
PubGrub version wrapper
(https://github.com/pubgrub-rs/guide/pull/6/files).
2023-10-30 03:58:43 +00:00
Charlie Marsh fb2d4fc421
Set style before message (#229)
Prevents flickering in the resolver case.
2023-10-30 03:57:03 +00:00
Charlie Marsh ffbf6b6c16
Avoid symlinking `RECORD` file (#227)
This is the one file that gets modified during installation. Hardlinking
it is bad!
2023-10-30 03:10:38 +00:00
Charlie Marsh 1d3ea242d4
Re-export from PubGrub module (#226) 2023-10-30 02:03:52 +00:00
Charlie Marsh f2dd0d90be
Add a resolver reporter (#225)
Closes https://github.com/astral-sh/puffin/issues/223.
2023-10-30 02:00:09 +00:00
Charlie Marsh 6da9c2f534
Tweak response buffering (#224)
In my testing, we can both increase the number of concurrent requests
and remove the `ready_chunks`.
2023-10-29 21:07:46 -04:00
Charlie Marsh 1c5cdcd70a
Prioritize packages in visited order (#222) 2023-10-30 00:48:36 +00:00
Charlie Marsh 2ba85bf80e
Add PubGrub's priority queue (#221)
Pulls in https://github.com/pubgrub-rs/pubgrub/pull/104.
2023-10-29 21:16:02 +00:00
Charlie Marsh 4209e77c95
Upgrade `pubgrub-rs` version (#220)
Upgrades our PubGrub to 8951e37fe923a7edd5a78ed5f49f165b0fdc48de.
2023-10-29 20:25:55 +00:00
Charlie Marsh 1e4259a608
Make the resolver deterministic (#218)
At a minor performance cost...

Closes https://github.com/astral-sh/puffin/issues/204.
2023-10-29 18:42:25 +00:00
Charlie Marsh bae3c89ab1
Add a `--prerelease` flag to the CLI (#217) 2023-10-29 18:39:30 +00:00
Charlie Marsh 7e7e9f8a0c
Add support for pre-release versions (#216)
We now accept a pre-release if (1) all versions are pre-releases, or (2)
there was a pre-release marker in the dependency specifiers for a direct
dependency.
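
The two conditions can be sketched as a selection function over candidate versions. The `Candidate` struct is hypothetical, and the candidates are assumed to be sorted from oldest to newest:

```rust
/// A candidate version, flagged as pre-release or not.
struct Candidate {
    version: &'static str,
    prerelease: bool,
}

/// Pick the newest acceptable candidate. Pre-releases are acceptable if
/// (1) every candidate is a pre-release, or (2) a direct dependency's
/// specifier contained a pre-release marker (`requested_prerelease`).
fn select(candidates: &[Candidate], requested_prerelease: bool) -> Option<&Candidate> {
    let allow = requested_prerelease || candidates.iter().all(|c| c.prerelease);
    candidates.iter().rev().find(|c| allow || !c.prerelease)
}
```

Keeping the decision in a single `allow` flag is one way to leave room for other pre-release strategies later.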

The code is written such that we can support a variety of pre-release
strategies.

Closes https://github.com/astral-sh/puffin/issues/191.
2023-10-29 14:31:55 -04:00
konsti 6cd4650c1f
Support missing `build_system` key (#213)
Reduces the number of failing projects out of the top 1000 pypi projects
from 5 to 2.
2023-10-27 12:12:12 +02:00
Charlie Marsh 8b83385763
Support constraints in `requirements.in` files (#212)
Closes #172.
2023-10-27 00:41:02 +00:00
Charlie Marsh 58011f98b6
Revert "Add TODO around preferring local wheels" (#211)
Reverts astral-sh/puffin#208. Unclear if we actually want to do this.
2023-10-26 23:03:00 +00:00
Charlie Marsh d5c3ff789a
Sort wheels by size when downloading and zipping (#210)
I just learned about this from PackagingCon, and locally, it shows a
nice speedup:

```
❯ hyperfine --warmup 3 --prepare "rm -rf .venv && ./target/release/puffin venv .venv" "./target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache" "./target/release/main pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache"
Benchmark 1: ./target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache
  Time (mean ± σ):      3.958 s ±  0.250 s    [User: 1.323 s, System: 5.840 s]
  Range (min … max):    3.652 s …  4.402 s    10 runs

Benchmark 2: ./target/release/main pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache
  Time (mean ± σ):      4.214 s ±  0.451 s    [User: 1.322 s, System: 5.976 s]
  Range (min … max):    3.708 s …  5.268 s    10 runs

Summary
  './target/release/puffin pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache' ran
    1.06 ± 0.13 times faster than './target/release/main pip-sync ./scripts/benchmarks/requirements-large.txt --no-cache'
```
2023-10-26 20:50:56 +00:00
Charlie Marsh 12e6b46ae8
Add TODO around preferring local wheels (#208) 2023-10-26 19:09:03 +00:00
Charlie Marsh 7bce41498e
Improve debug logging in dispatcher (#206)
Also makes the order of operations more similar to that of the
`pip-compile` command.
2023-10-26 18:54:47 +00:00
konsti 5ad58474ca
Add script to check the top 8k pypi packages (#198)
To check the top 1k (current state):

```bash
scripts/resolve/get_pypi_top_8k.sh
cargo run --bin puffin-dev -- resolve-many scripts/resolve/pypi_top_8k_flat.txt --limit 1000
```

Results:
```
Errors: pywin32, geoip2, maxminddb, pypika, dirac
Success: 995, Error: 5
```
pywin32 has no solution for the build environment, 3 have no
`[build-system]` entry in pyproject.toml, `dirac` is missing cmake
2023-10-26 12:03:59 +00:00
konsti 216b6c41c2
Start puffin-dev (#193)
Currently, this is only the source distribution building feature moved.
It's intended that we can add development and test commands there
without affecting the main cli surface
2023-10-26 09:17:22 +00:00
konsti 862c1654a0
Select most recent wheel, most recent sdist (#190)
Select a compatible wheel for a version, even if we already found a source
distribution previously.

If no wheel is found, select the most recent source distribution, not
the oldest compatible one.

This fixes the resolution of `mst.in`, which I added.
2023-10-26 08:15:26 +00:00
Charlie Marsh 13e4171916
Inline manifest creations in resolver tests (#188) 2023-10-26 04:36:03 +00:00
Charlie Marsh 6faaf4bc24
Respect existing versions in "lockfile" (#187)
Like `pip-compile`, we now respect existing versions from the
`requirements.txt` provided via `--output-file`, unless you pass a
`--upgrade` flag.

Closes #166.
2023-10-26 04:28:58 +00:00
Charlie Marsh 9f894213e0
Omit colors when writing to output file (#186)
We were writing color escape codes to the file specified by `-o`.
2023-10-26 04:12:25 +00:00
Charlie Marsh 61a61db154
Filter and store all distributions upfront (#185)
Modifies the resolver to remove any incompatible distributions upfront,
and store them in an index by version. This will be necessary to support
`--upgrade` semantics.

This actually does cause a meaningful slowdown right now (since we now
iterate over all files, even if we otherwise never would've needed to
touch them), but we should be able to optimize it out later.
2023-10-26 01:06:44 +00:00
Charlie Marsh 5ed913af50
Rename `SolverCache` (#184)
Everywhere else, we use cache to refer to a filesystem cache, so this is
kind of confusing. It's really an in-memory index that we build up over
the course of the solve.
2023-10-25 23:53:31 +00:00
konsti 889f6173cc
Unify python interpreter abstractions (#178)
Previously, we had two python interpreter metadata structs, one in
gourgeist and one in puffin. Both would spawn a subprocess to query
overlapping metadata and both would appear in the cli crate. If you
weren't careful, you could even have two different base interpreters at
once. This change unifies this to one set of metadata, queried and
cached once.

Another effect of this crate is proper separation of python interpreter
and venv. A base interpreter (such as `/usr/bin/python/`, but also pyenv
and conda installed python) has a set of metadata. A venv has a root and
inherits the base python metadata except for `sys.prefix`, which unlike
`sys.base_prefix`, gets set to the venv root. From the root and the
interpreter info we can compute the paths inside the venv. We can reuse
the interpreter info of the base interpreter when creating a venv
without having to query the newly created `python`.
2023-10-25 20:11:36 +00:00
konsti 1fbe328257
Build source distributions in the resolver (#138)
This isn't ready, but it can resolve
`meine_stadt_transparent==0.2.14`.

The source distributions are currently being built serially, one after
the other. I don't know if that is incidentally due to the resolution
order, because sdist building is blocking, or because of something in the
resolver that could be improved.

It's a bit annoying that the thing that was supposed to do http requests
now suddenly also has to do a whole download/unpack/resolve/install/build
routine; it messes up the type hierarchy. The much bigger problem, though,
is avoiding recursive crate dependencies: it's the reason for the callback
and for splitting the builder into two crates (badly named atm).
2023-10-25 20:05:13 +00:00
konsti b5c57ee6fe
Fix rustdoc warnings (#182)
Changes to make `cargo doc --all --all-features` pass without warnings.
2023-10-25 11:48:24 +00:00
Charlie Marsh d0aeb2ac80
Remove vector allocation in `WheelFilename` (#177) 2023-10-24 01:23:14 +00:00
Charlie Marsh 21bb9c29cc
Add an additional requirements fixup (#174)
Also checking in a variety of different requirements inputs.
2023-10-23 19:50:39 -04:00
konstin 815c2117c8 Clippy 2023-10-23 13:54:31 +02:00
Charlie Marsh 0e097874f8
Add support for alternate index URLs (#169)
As elsewhere, we just use the `pip` and `pip-compile` APIs. So we
support `--index-url` to override PyPI, then `--extra-index-url` to add
_additional_ indexes, and `--no-index` to avoid hitting the index at
all.
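
The interaction of the three flags can be sketched as follows. The `index_urls` function is a hypothetical helper written for this example; only the flag semantics and PyPI's default simple index URL come from `pip`.

```rust
/// Default index used when `--index-url` is not given.
const PYPI_SIMPLE: &str = "https://pypi.org/simple";

/// Returns the indexes to query, in order.
///
/// `--no-index` disables all lookups; otherwise `--index-url` replaces PyPI
/// and every `--extra-index-url` is appended after it.
fn index_urls(index_url: Option<&str>, extra_index_urls: &[&str], no_index: bool) -> Vec<String> {
    if no_index {
        return Vec::new();
    }
    let mut urls = vec![index_url.unwrap_or(PYPI_SIMPLE).to_string()];
    urls.extend(extra_index_urls.iter().map(|url| url.to_string()));
    urls
}
```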

Closes #156.
2023-10-23 03:18:30 +00:00
Charlie Marsh 49a27ff33c
Add support for parameterized link modes (#164)
Allows the user to select between clone, hardlink, and copy semantics
for installs. (The pnpm documentation has a decent description of what
these mean: https://pnpm.io/npmrc#package-import-method.)

Closes #159.
2023-10-22 04:35:50 +00:00
Charlie Marsh 9bcc7fe77a
Move venv command to miette (#162) 2023-10-22 04:17:16 +00:00
Charlie Marsh 370771b28c
Make `clean` non-async (#163) 2023-10-22 03:54:13 +00:00
Charlie Marsh b665f1489a
Add tests for `puffin sync` (#161)
Closes #158.
2023-10-22 03:25:00 +00:00
Charlie Marsh 3072c3265e
Add support for lowest and lowest-direct resolution modes (#160)
Borrows terminology from pnpm by introducing three resolution modes:

- "Highest": always choose the highest compliant version (default).
- "Lowest": always choose the lowest compliant version.
- "LowestDirect": choose the lowest compliant version of direct
dependencies, and the highest compliant version of any transitive
dependencies. (This makes a bit more sense than "lowest".)
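
A minimal sketch of how the three modes could drive version selection. Versions are plain integers here for brevity, and `pick_version` is a hypothetical helper, not the resolver's real API.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ResolutionMode {
    Highest,
    Lowest,
    LowestDirect,
}

/// Picks one version out of the compliant ones for a package.
/// `is_direct` says whether the package is a direct dependency.
fn pick_version(mode: ResolutionMode, is_direct: bool, compliant: &[u32]) -> Option<u32> {
    // Decide whether this particular package should resolve to the lowest version.
    let prefer_lowest = match mode {
        ResolutionMode::Highest => false,
        ResolutionMode::Lowest => true,
        ResolutionMode::LowestDirect => is_direct,
    };
    if prefer_lowest {
        compliant.iter().copied().min()
    } else {
        compliant.iter().copied().max()
    }
}
```

With `LowestDirect`, a direct dependency resolves to its lowest compliant version while anything it pulls in transitively still resolves to the highest.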

Closes https://github.com/astral-sh/puffin/issues/142.
2023-10-21 22:58:06 -04:00
konsti ae9d1f7572
Add source distribution filename abstraction (#154)
The need for this became clear when working on the source distribution
integration into the resolver.

While at it, I also switched the `WheelFilename` version to the parsed
`pep440_rs` version now that we have this crate.
2023-10-20 17:45:57 +02:00
Charlie Marsh 6f52b5ca4d
Use index instead of current selection (#155)
We can also use `swap_remove` because we're discarding the vector.
2023-10-20 14:02:24 +00:00
Charlie Marsh 4645f79237
Use `FxHash` (#151) 2023-10-20 05:26:06 +00:00
Charlie Marsh 8001c792e7
Show requirement sources in `pip-compile` output (#149)
Builds up a complete resolved graph from PubGrub, and shows the sources
that led to each package being included in the resolution, like
`pip-compile`.

Closes https://github.com/astral-sh/puffin/issues/60.
2023-10-20 05:14:59 +00:00
Charlie Marsh e662fe341b
Short-circuit when a dependency has no matching versions (#148)
Kind of an oversight in my initial implementation. If we find that any
package has _no_ matching versions, we should select it! This lets us
short-circuit _immediately_ when top-level dependencies aren't
satisfiable.
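
The idea can be sketched roughly as below. This is an illustration under assumptions: `next_package` and the name-to-count map are hypothetical, and the fallback ordering is arbitrary.

```rust
use std::collections::BTreeMap;

/// Picks the next package to decide on. `candidates` maps a package name to
/// the number of versions that still match its constraints.
fn next_package<'a>(candidates: &BTreeMap<&'a str, usize>) -> Option<&'a str> {
    // A package with zero matching versions can never be satisfied, so
    // selecting it first surfaces the conflict immediately.
    if let Some((name, _)) = candidates.iter().find(|(_, count)| **count == 0) {
        return Some(*name);
    }
    // Otherwise fall back to the first package in iteration order
    // (an arbitrary choice for this sketch).
    candidates.keys().next().copied()
}
```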
2023-10-20 03:49:20 +00:00
Charlie Marsh 9b3405bf0e
Upgrade PubGrub to dev branch (#147)
Updates to `29c48fb9f3daa11bd02794edd55060d0b01ee705` from the
`pubgrub-rs` dev branch. This lets us reduce the number of changes we've
made to PubGrub itself (now, only changing visibility to export a few
things from the `solver.rs` module).
2023-10-20 03:23:26 +00:00
Charlie Marsh bcd281eb1f
Remove `async` from some filesystem-only APIs (#146) 2023-10-20 01:08:51 +00:00
Charlie Marsh 03101c6a5c
Add an autogeneration header to pip-compile (#145)
Closes https://github.com/astral-sh/puffin/issues/132.
2023-10-19 20:57:27 -04:00
Charlie Marsh 0b60804db6
Add support for constraints during pip-compile resolution (#144)
Closes https://github.com/astral-sh/puffin/issues/130.
2023-10-20 00:24:05 +00:00
Charlie Marsh d5105a76c5
Improve and test diagnostics for requirements-reading CLI commands (#143)
Also removes `owo_colors` because it was really painful to get it to
avoid printing colors during tests.
2023-10-19 18:13:40 -04:00
Charlie Marsh ba181eacdd
Accept dependencies from `pyproject.toml` (#141)
Doesn't support extras yet. It's also supported for `pip uninstall`,
which `pip` itself doesn't support, but whatever.

Closes #127.
2023-10-19 18:42:05 +00:00
Charlie Marsh 385345807c
Accept multiple input files in pip-sync and pip-compile (#140)
Closes https://github.com/astral-sh/puffin/issues/126.
2023-10-19 18:17:27 +00:00
Charlie Marsh 7ef6c0315c
Unify site-packages into distribution enum (#136)
Gets rid of the custom `DistInfo` struct in the site-packages
abstraction in favor of a new kind of distribution
(`InstalledDistribution`). No change in behavior.
2023-10-19 04:37:52 +00:00
Charlie Marsh bd01fb490e
Remove packages when syncing (#135)
`pip-sync` will now uninstall any packages that aren't necessary.

Closes https://github.com/astral-sh/puffin/issues/128.
2023-10-19 00:14:20 -04:00
Charlie Marsh 41ece4184b
Print to stderr by default (#134) 2023-10-18 23:30:07 -04:00
Charlie Marsh 20bb4c5c61
Avoid showing resolver progress bar when no resolution is required (#133) 2023-10-19 03:23:22 +00:00
Charlie Marsh 573f5832a3
Allow uninstall to take multiple packages and files (#125)
Moves the command to `puffin pip-uninstall` for now to separate from the
managed interface, and redoes the command output.
2023-10-18 22:30:11 -04:00
Charlie Marsh 4b91ae4769
Add CLI tests for add and remove commands (#124) 2023-10-19 01:06:48 +00:00
Charlie Marsh e15b99b911
Rename commands to `pip-sync` and `pip-compile` (#123)
To free up the rest of the interface.
2023-10-18 21:15:20 +00:00
konsti 8cc4fe0d44
Install source distribution requirements with puffin itself instead of pip (#122)
This is also a lot faster. Unfortunately it copies a lot of code from
the sync cli since the `Printer` is private.

The first commit are some refactorings i made when i thought about how i
could reuse the existing code.
2023-10-18 19:11:17 +00:00
Charlie Marsh 7bc42ca2ce
Use `owo_colors` instead of `colored` (#121)
This is what `miette` uses so seems better to avoid two coloring crates.
2023-10-18 18:57:07 +00:00
Charlie Marsh 2d14c0647e
Add a `puffin remove` command (#120) 2023-10-18 18:50:08 +00:00
Charlie Marsh 1fc03780f9
Use `miette` for `puffin add` diagnostics (#119)
Experiment in using `miette` for better user-facing diagnostics in the
CLI crate:

<img width="710" alt="Screen Shot 2023-10-18 at 2 11 54 PM"
src="https://github.com/astral-sh/puffin/assets/1309177/30299da0-da65-4972-944f-cb8cc5f72a77">

For now, only the `add` command has been migrated, and all the library
crates continue to use `anyhow`.
2023-10-18 14:24:09 -04:00
konsti fec4ee2848
Support prepare_metadata_for_build_wheel (#106)
Support calling `prepare_metadata_for_build_wheel`, which can give you
the metadata without executing the actual build if the backend supports
it.

This makes the code a lot uglier since we effectively have a state
machine:

* Setup: Either venv plus requires (PEP 517) or just a venv (setup.py)
* Get metadata (optional step): None (setup.py) or
`prepare_metadata_for_build_wheel` and saving that result
* Build: `setup.py`, `build_wheel()` or
`build_wheel(metadata_directory=metadata_directory)`, but I think I got
the general flow right.
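
The final step of that state machine could be modelled along these lines. The `Backend` and `BuildCall` enums and the `build_call` function are hypothetical names for this sketch; only the three build variants are taken from the list above.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Backend {
    Pep517,
    SetupPy,
}

#[derive(Debug, Clone, PartialEq)]
enum BuildCall {
    /// Legacy `setup.py` invocation.
    SetupPy,
    /// PEP 517 `build_wheel()` without prepared metadata.
    BuildWheel,
    /// PEP 517 `build_wheel(metadata_directory=...)`, reusing the result of
    /// `prepare_metadata_for_build_wheel`.
    BuildWheelWithMetadata { metadata_directory: String },
}

/// Chooses the build invocation from the backend kind and the optional
/// metadata directory saved in the "get metadata" step.
fn build_call(backend: Backend, metadata_directory: Option<String>) -> BuildCall {
    match (backend, metadata_directory) {
        (Backend::SetupPy, _) => BuildCall::SetupPy,
        (Backend::Pep517, Some(dir)) => BuildCall::BuildWheelWithMetadata {
            metadata_directory: dir,
        },
        (Backend::Pep517, None) => BuildCall::BuildWheel,
    }
}
```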

@charliermarsh This is a "barely works but unblocks building on top"
implementation, say if you want more polishing (I'll look at this again
tomorrow)
2023-10-18 14:48:30 +02:00
Charlie Marsh 4c87a1d42c
Add a `puffin add` command (#117)
This needs far better error handling and user-facing feedback, but it
does the basic operation (and includes discovery of the `pyproject.toml`
file, etc.).
2023-10-18 00:51:20 -04:00
Charlie Marsh 339553e228
Mark `--no-cache` as global (#116) 2023-10-17 23:15:36 -04:00
Charlie Marsh 89db5d79bc
Add support for lenient parsing (#115)
This PR enables us to make "fixups" to bad metadata. I copied over the
one fixup that @konstin made in `monotrail-resolve`, and added a few
common ones for `Requires-Python`.
2023-10-17 22:03:16 -04:00
Charlie Marsh 0d90256151
Store all distributions rather than compatible wheels (#114)
This PR reverts #109 which is actually a performance _regression_ since
we need to iterate over a bunch of wheels that we could otherwise
entirely ignore.
2023-10-17 17:09:31 -04:00
Charlie Marsh 5b046a8102
Use `select!` instead of `tokio::spawn` for network thread (#110) 2023-10-16 15:41:25 -04:00
Charlie Marsh 1b433fdcee
Only store compatible wheels in the resolver (#109)
Rather than constantly iterating over all files and testing their
compatibility with the current platform, just store wheels we can
actually consider in the solver cache.
2023-10-16 19:21:07 +00:00
Charlie Marsh 5f5788e866
Surface PubGrub derivation trees (#108)
I think the derivation trees could be stronger but this exposes
PubGrub's proof-like error messages.

Closes #102.
2023-10-16 14:14:36 -04:00
Charlie Marsh bae52d5edd
Surface request stream errors in the resolver (#107)
Closes https://github.com/astral-sh/puffin/issues/105.
2023-10-16 17:26:46 +00:00
Charlie Marsh 7e8ffeb2df
Use `fs-err` in more crates (#100)
Closes https://github.com/astral-sh/puffin/issues/88.
2023-10-16 13:37:58 +00:00
konsti fa2fd14587
Add basic sdist builder (#104)
This adds a basic sdist builder that has been tested with two source
distributions, one with a PEP 517 backend and one with setup.py.

It uses pip for requirements installation atm, lacks testing in all
directions, lacks checks for recursive requirements, can't pass in
already resolved versions, doesn't support prepare metadata for build to
allow resolution to continue without doing the actual (native) build,
error messages are mediocre, etc.

```console
$ RUST_LOG=puffin_build=debug puffin-build --wheels wheels downloads/tqdm-4.66.1.tar.gz
2023-10-16T12:28:35.503182Z DEBUG build_sdist{path="downloads/tqdm-4.66.1.tar.gz" base_python="/usr/bin/python3"}: puffin_build: Building downloads/tqdm-4.66.1.tar.gz
2023-10-16T12:28:35.521780Z  INFO build_sdist{path="downloads/tqdm-4.66.1.tar.gz" base_python="/usr/bin/python3"}:extract_archive: puffin_build: close time.busy=18.4ms time.idle=16.7µs
2023-10-16T12:28:35.845096Z DEBUG build_sdist{path="downloads/tqdm-4.66.1.tar.gz" base_python="/usr/bin/python3"}:resolve_and_install: puffin_build: Calling pip to install build dependencies
2023-10-16T12:28:37.668660Z  INFO build_sdist{path="downloads/tqdm-4.66.1.tar.gz" base_python="/usr/bin/python3"}:resolve_and_install: puffin_build: close time.busy=1.82s time.idle=13.2µs
2023-10-16T12:28:37.668744Z DEBUG build_sdist{path="downloads/tqdm-4.66.1.tar.gz" base_python="/usr/bin/python3"}: puffin_build: Calling `setuptools.build_meta.get_requires_for_build_wheel()`
2023-10-16T12:28:38.159205Z  INFO build_sdist{path="downloads/tqdm-4.66.1.tar.gz" base_python="/usr/bin/python3"}:run_python_script{python_interpreter="/tmp/.tmpm4cTra/venv/bin/python"}: puffin_build: close time.busy=490ms time.idle=13.0µs
2023-10-16T12:28:38.159304Z DEBUG build_sdist{path="downloads/tqdm-4.66.1.tar.gz" base_python="/usr/bin/python3"}: puffin_build: Calling `setuptools.build_meta.build_wheel()`
2023-10-16T12:28:38.501732Z  INFO build_sdist{path="downloads/tqdm-4.66.1.tar.gz" base_python="/usr/bin/python3"}:run_python_script{python_interpreter="/tmp/.tmpm4cTra/venv/bin/python"}: puffin_build: close time.busy=342ms time.idle=15.2µs
2023-10-16T12:28:38.522700Z  INFO build_sdist{path="downloads/tqdm-4.66.1.tar.gz" base_python="/usr/bin/python3"}: puffin_build: close time.busy=3.02s time.idle=16.2µs
Wheel built to /home/konsti/projects/puffin/crates/puffin-build/wheels/tqdm-4.66.1-py3-none-any.whl
2023-10-16T12:28:38.522772Z DEBUG puffin_build: Took 3020ms
$ puffin-build --wheels wheels downloads/geoextract-0.3.1.tar.gz
2023-10-16T12:28:40.884622Z DEBUG build_sdist{path="downloads/geoextract-0.3.1.tar.gz" base_python="/usr/bin/python3"}: puffin_build: Building downloads/geoextract-0.3.1.tar.gz
2023-10-16T12:28:40.887743Z  INFO build_sdist{path="downloads/geoextract-0.3.1.tar.gz" base_python="/usr/bin/python3"}:extract_archive: puffin_build: close time.busy=2.97ms time.idle=12.6µs
2023-10-16T12:28:41.469738Z  INFO build_sdist{path="downloads/geoextract-0.3.1.tar.gz" base_python="/usr/bin/python3"}: puffin_build: close time.busy=585ms time.idle=15.3µs
Wheel built to /home/konsti/projects/puffin/crates/puffin-build/wheels/geoextract-0.3.1-py3-none-any.whl
2023-10-16T12:28:41.469814Z DEBUG puffin_build: Took 585ms
```
2023-10-16 12:43:31 +00:00
konsti cb29c89424
Better error reporting (#95)
The main change is to print the whole error chain. We can combine this
with adding `.context` to distinct phases to be able to locate crashes
without having to use a debugger.
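
For illustration, printing a whole error chain with only the standard library might look like the sketch below. The `Context` struct is a hypothetical stand-in for what a `.context` call attaches, and `format_chain` is an invented helper, not the code from this PR.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical wrapper that adds a phase description to an underlying error.
#[derive(Debug)]
struct Context {
    message: String,
    source: Box<dyn Error + 'static>,
}

impl fmt::Display for Context {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.message)
    }
}

impl Error for Context {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(self.source.as_ref())
    }
}

/// Renders the error followed by every cause reachable through `source()`.
fn format_chain(err: &(dyn Error + 'static)) -> String {
    let mut out = err.to_string();
    let mut current = err.source();
    while let Some(cause) = current {
        out.push_str("\n  Caused by: ");
        out.push_str(&cause.to_string());
        current = cause.source();
    }
    out
}
```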
2023-10-16 02:15:10 +00:00
Charlie Marsh 471a1d657d
Migrate resolver proof-of-concept to PubGrub (#97)
## Summary

This PR enables the proof-of-concept resolver to backtrack by way of
using the `pubgrub-rs` crate.

Rather than using PubGrub as a _framework_ (implementing the
`DependencyProvider` trait, letting PubGrub call us), I've instead
copied over PubGrub's primary solver hook (which is only ~100 lines or
so) and modified it for our purposes (e.g., made it async).

There's a lot to improve here, but it's a start that will let us
understand PubGrub's appropriateness for this problem space. A few
observations:

- In simple cases, the resolver is slower than our current (naive)
resolver. I think it's just that the pipelining isn't as efficient as in
the naive case, where we can just stream package and version fetches
concurrently without any bottlenecks.
- A lot of the code here relates to bridging PubGrub with our own
abstractions -- so we need a `PubGrubPackage`, a `PubGrubVersion`, etc.
2023-10-15 22:05:44 -04:00
konsti de9e85978b
Fix tempdir rename (#94)
This fixes two bugs on linux:

`/tmp` and `$HOME` are technically on two different partitions on my
machine, which means that rename-as-atomic-dir-write doesn't work. The
solution is to create the temp dir in the target directory.
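
A minimal sketch of the idea, assuming a hypothetical `write_dir_atomically` helper: the staging directory is created next to the target, so both paths share a filesystem and the final `rename` is a same-filesystem move.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Writes `files` (name, contents) into `target` via a staging directory.
///
/// The staging directory lives in the target's parent directory rather than
/// in `/tmp`, because `rename` fails across filesystem boundaries.
fn write_dir_atomically(target: &Path, files: &[(&str, &str)]) -> io::Result<()> {
    let parent = target.parent().unwrap_or_else(|| Path::new("."));
    // The staging name is an arbitrary choice for this sketch.
    let staging = parent.join(format!(".tmp-{}", std::process::id()));
    fs::create_dir_all(&staging)?;
    for (name, contents) in files {
        fs::write(staging.join(name), contents)?;
    }
    // Move the fully populated directory into place in one step.
    fs::rename(&staging, target)
}
```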

Zip files may contain directory entries; we can't create files for them
but need to create directories. We could skip them, though, because IIRC
they are not in the RECORD, so they won't be uninstalled.
2023-10-12 18:47:38 +00:00
konsti 530edb6e39
Add output file option to compile (#93)
`pip-compile` has the same option. I need this esp. since piping doesn't
work as we write to stdout.
2023-10-12 20:42:06 +02:00
konsti 6a7954cdd0
Add `-p` base python option to venv command (#92)
This is the same option that `virtualenv` offers, except that we only
support absolute paths atm and not e.g. `-p 3.10` (which we need to
eventually).
2023-10-12 20:41:52 +02:00
Charlie Marsh a622345fbc
Replace mocked server with 'real' integration tests (#91)
We can always restore these from history, but right now, it feels a lot
more productive to just hit PyPI directly for our integration tests,
since we don't have to spend time figuring out mocks.
2023-10-12 17:34:48 +00:00
Charlie Marsh 496cb7b2ef
Migrate to `requirements_txt.rs` (#90)
Remove the parser I wrote in favor of Konsti's which is much more
complete. The only change vs. the version in `poc-monotrail` is that I
changed the tests to use insta rather than manually storing and
comparing against JSON snapshots.

Closes https://github.com/astral-sh/puffin/issues/89.
2023-10-12 17:09:00 +00:00
Charlie Marsh 906a482499
Separate unzip into its own install phase (#87) 2023-10-11 15:18:23 +00:00
Charlie Marsh 85162d1111
Parallelize wheel installations with Rayon (#84)
It looks like using _either_ async Rust with a `JoinSet` _or_
parallelizing over a fixed threadpool with Rayon provides a ~5% speed-up
over our current serial approach:

```console
❯ hyperfine --runs 30 --warmup 5 --prepare "./target/release/puffin venv .venv" \
  "./target/release/rayon sync ./scripts/benchmarks/requirements-large.txt" \
  "./target/release/async sync ./scripts/benchmarks/requirements-large.txt" \
  "./target/release/main sync ./scripts/benchmarks/requirements-large.txt"
Benchmark 1: ./target/release/rayon sync ./scripts/benchmarks/requirements-large.txt
  Time (mean ± σ):     295.7 ms ±  16.9 ms    [User: 28.6 ms, System: 263.3 ms]
  Range (min … max):   249.2 ms … 315.9 ms    30 runs

Benchmark 2: ./target/release/async sync ./scripts/benchmarks/requirements-large.txt
  Time (mean ± σ):     296.2 ms ±  20.2 ms    [User: 36.1 ms, System: 340.1 ms]
  Range (min … max):   258.0 ms … 359.4 ms    30 runs

Benchmark 3: ./target/release/main sync ./scripts/benchmarks/requirements-large.txt
  Time (mean ± σ):     306.6 ms ±  19.5 ms    [User: 25.3 ms, System: 220.5 ms]
  Range (min … max):   269.6 ms … 332.2 ms    30 runs

Summary
  './target/release/rayon sync ./scripts/benchmarks/requirements-large.txt' ran
    1.00 ± 0.09 times faster than './target/release/async sync ./scripts/benchmarks/requirements-large.txt'
    1.04 ± 0.09 times faster than './target/release/main sync ./scripts/benchmarks/requirements-large.txt'
```

It's much easier to just parallelize with Rayon and avoid async in the
underlying wheel code, so this PR takes that approach for now.
2023-10-10 23:46:30 -04:00
Charlie Marsh ed68d31e03
Add a basic test for the resolver (#86)
Mocks out the PyPI client using some checked-in fixtures. The test is
very basic, and I'm not very happy with all the ceremony around the
mocks and such, but it's an interesting experiment at least.
2023-10-11 03:30:53 +00:00
Charlie Marsh c1fb698eae
Add a separate dist-info name struct (#85) 2023-10-10 23:21:18 +00:00
Charlie Marsh d0764bdc23
Add `puffin venv` command to create virtual environments (#83)
Closes https://github.com/astral-sh/puffin/issues/58.
2023-10-10 13:46:25 -04:00
Charlie Marsh a0294a510c
Rework `puffin sync` output to summarize (#81)
This also moves away from using `tracing` for user-facing logging,
instead introducing a new `Printer` abstraction.

Closes #66.
2023-10-10 03:29:09 +00:00
Charlie Marsh 2d4a8c361b
Change puffin-cli binary to puffin (#80) 2023-10-09 17:19:33 -04:00
Charlie Marsh ba2b200fce
Enable release builds via `cargo-dist` (#79) 2023-10-09 20:48:55 +00:00
Charlie Marsh b90140e1bc
Add support for wheel uninstalls (#77)
Closes #36.
2023-10-09 14:14:33 -04:00
Charlie Marsh 239b5893d8
Fix version satisfier for unpinned dependencies (#74) 2023-10-09 11:48:39 -04:00
Charlie Marsh 485b1dceb6
Use a single requirements iterator in `sync` (#71) 2023-10-09 03:29:38 +00:00
Charlie Marsh ba72950546
Avoid passing cached wheels to the resolver step (#70)
When we go to install a locked `requirements.txt`, if a wheel is already
available in the local cache, and matches the version specifiers, we can
just use it directly without fetching the package metadata. This speeds
up the no-op case by about 33%.

Closes https://github.com/astral-sh/puffin/issues/48.
2023-10-08 22:17:19 -04:00
Charlie Marsh 5b71cfdd0b
Remove Monotrail-specific code from `install-wheel-rs` (#68)
I think this isn't necessary to support in this generic crate. If we
choose to adopt Monotrail-style concepts, we'll likely need to rework
them anyway.
2023-10-08 18:28:57 -04:00
Charlie Marsh adbee4fb32
Use recursive `clonefile` calls on macOS (#67)
It turns out that on macOS, you can pass `clonefile` a directory to
recursively copy an entire directory. This speeds up wheel installation
dramatically, by about 3x.
2023-10-08 21:44:02 +00:00
Charlie Marsh 1c942ab8fe
Tweak tracing output for sync command (#64) 2023-10-08 20:09:15 +00:00
Charlie Marsh a53f697f62
Use `tracing` for user-facing output (#63)
The setup is now as follows:

- All user-facing logging goes through `tracing` at an `info` level.
(This excludes messages that go to `stdout`, like the compiled
`requirements.txt` file.)
- We have `--quiet` and `--verbose` command-line flags to set the
tracing filter and format defaults. So if you use `--verbose`, we
include timestamps and targets, and filter at `puffin=debug` level.
- However, we always respect `RUST_LOG`. So you can override the
_filter_ via `RUST_LOG`.
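
The precedence described in this list can be sketched as a pure function. `tracing_filter` is a hypothetical helper, and mapping `--quiet` to `off` is an assumption of this sketch; the `puffin=info` and `puffin=debug` defaults and the `RUST_LOG` override come from the description above.

```rust
/// Computes the filter directive from the CLI flags and the `RUST_LOG` value.
fn tracing_filter(quiet: bool, verbose: bool, rust_log: Option<&str>) -> String {
    // `RUST_LOG` always wins over the flag-derived defaults.
    if let Some(filter) = rust_log {
        return filter.to_string();
    }
    if quiet {
        // Assumption: `--quiet` silences all output.
        "off".to_string()
    } else if verbose {
        "puffin=debug".to_string()
    } else {
        "puffin=info".to_string()
    }
}
```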

For example: the standard setup filters to `puffin=info`, and doesn't
show timestamps or targets:

<img width="1235" alt="Screen Shot 2023-10-08 at 3 41 22 PM"
src="https://github.com/astral-sh/puffin/assets/1309177/54ca4db6-c66a-439e-bfa3-b86dee136e45">

If you run with `--verbose`, you get debug logging, but confined to our
crates:

<img width="1235" alt="Screen Shot 2023-10-08 at 3 41 57 PM"
src="https://github.com/astral-sh/puffin/assets/1309177/c5c1af11-7f7a-4038-a173-d9eca4c3630b">

If you want verbose logging with _all_ crates, you can add
`RUST_LOG=debug`:

<img width="1235" alt="Screen Shot 2023-10-08 at 3 42 39 PM"
src="https://github.com/astral-sh/puffin/assets/1309177/0b5191f4-4db0-4db9-86ba-6f9fa521bcb6">

I think this is a reasonable setup, though we can see how it feels and
refine over time.

Closes https://github.com/astral-sh/puffin/issues/57.
2023-10-08 15:46:06 -04:00
Charlie Marsh 0ca17a1cf2
Use local copy of `gourgeist` (#62)
This PR gets `gourgeist` passing our local CI and integrated into the
broader workspace.

There's some duplication between concepts in `gourgeist` (like the
`InterpreterInfo`) and structs we have elsewhere, but we can tackle
those later.
2023-10-08 18:45:08 +00:00
Charlie Marsh 7caf5f42b8
Copy over `gourgeist` crate (#61)
This PR copies over the `gourgeist` crate at commit
`e64c17a263dac6933702dc8d155425c053fe885a` with no modifications.

It won't pass CI, but modifications will intentionally be confined to
later PRs.
2023-10-08 14:37:09 -04:00
Charlie Marsh d1ed41170b
Cache environment marker lookups (#55)
Closes https://github.com/astral-sh/puffin/issues/53.
2023-10-08 05:31:19 +00:00
Charlie Marsh 5eef6e9636
Store cached wheels by dist-info-like name (#52)
Closes https://github.com/astral-sh/puffin/issues/50.
2023-10-08 04:28:04 +00:00
Charlie Marsh fd5aef2c75
Avoid error when repeatedly clearing cache (#51)
Also avoid failing to clear the cache when it contains non-directories
(e.g., I had a `.DS_Store` after looking at it in Finder).
2023-10-08 04:16:48 +00:00
Charlie Marsh 2a846e76b7
Store unzipped wheels in a cache (#49)
This PR massively speeds up the case in which you need to install wheels
that already exist in the global cache.

The new strategy is as follows:

- Download the wheel into the content-addressed cache.
- Unzip the wheel into the cache, but ignore content-addressing. It
turns out that writing to `cacache` for every file in the zip added a
ton of overhead, and I don't see any actual advantages to doing so.
Instead, we just unzip the contents into a directory at, e.g.,
`~/.cache/puffin/django-4.1.5`.
- (The unzip itself is now parallelized with Rayon.)
- When installing the wheel, we now support unzipping from a directory
instead of a zip archive. This required duplicating and tweaking a few
functions.
- When installing the wheel, we now use reflinks (or copy-on-write
links). These have a few fantastic properties: (1) they're extremely
cheap to create (on macOS, they are allegedly faster than hard links);
(2) they minimize disk space, since we avoid copying files entirely in
the vast majority of cases; and (3) if the user then edits a file
locally, the cache doesn't get polluted. Orogene, Bun, and soon pnpm all
use reflinks.

Puffin is now ~15x faster than `pip` for the common case of installing
cached data into a fresh environment.

Closes https://github.com/astral-sh/puffin/issues/21.

Closes https://github.com/astral-sh/puffin/issues/39.
2023-10-08 04:04:48 +00:00
Charlie Marsh 92160e37df
Surface error when unable to find package (#45) 2023-10-07 19:43:12 +00:00
Charlie Marsh 9be02d1590
Skip already-installed dependencies during `sync` command (#43)
Closes https://github.com/astral-sh/puffin/issues/35.
2023-10-07 19:26:45 +00:00
Charlie Marsh bc1736feff
Add a `freeze` command to list installed dependencies (#42)
A pre-requisite for https://github.com/astral-sh/puffin/issues/35.
2023-10-07 18:46:09 +00:00
Charlie Marsh f3015ffc1f
Add a `clean` command to clear the cache (#41) 2023-10-07 15:19:03 +00:00
Charlie Marsh 162952bf64
Add a content-addressed cache for wheels (#38)
Closes https://github.com/astral-sh/puffin/issues/4.
2023-10-07 14:24:52 +00:00
Charlie Marsh 6c31631913
Fetch from `data-dist-info-metadata` when available (#37)
As specified in https://peps.python.org/pep-0658/#specification.
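
Under PEP 658, the metadata file is served at the distribution file's URL with `.metadata` appended. A minimal sketch of deriving that URL follows; `metadata_url` is a hypothetical helper, and stripping the `#sha256=...` fragment first is an assumption about how file links are written in the index.

```rust
/// Returns the URL of the standalone metadata file, if the index advertised
/// one through the `data-dist-info-metadata` attribute.
fn metadata_url(file_url: &str, data_dist_info_metadata: Option<&str>) -> Option<String> {
    // Attribute absent: fall back to downloading the distribution itself.
    if data_dist_info_metadata.is_none() {
        return None;
    }
    // Drop any hash fragment before appending the suffix.
    let base = file_url.split('#').next().unwrap_or(file_url);
    Some(format!("{}.metadata", base))
}
```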
2023-10-07 13:05:29 +00:00
Charlie Marsh ae28552b3a
Use local copy of `install-wheel-rs` (#34)
This PR modifies the `install-wheel-rs` (and a few other crates) to get
everything playing nicely. Specifically, CI should pass, and all these
crates now use workspace dependencies between one another.

As part of this change, I split out the wheel name parsing into its own
`wheel-filename` crate, and the compatibility tag parsing into its own
`platform-tags` crate.
2023-10-07 01:43:55 +00:00
Charlie Marsh e824fe6d2b
Copy over `install-wheel-rs` crate (#33)
This PR copies over the `install-wheel-rs` crate at commit
`10730ea1a84c58af6b35fb74c89ed0578ab042b6` with no modifications.

It won't pass CI, but modifications will intentionally be confined to
later PRs.
2023-10-06 21:38:38 -04:00
Charlie Marsh c8477991a9
Use local versions of PEP 440 and PEP 508 crates (#32)
This PR modifies the PEP 440 and PEP 508 crates to pass CI, primarily by
fixing all lint violations.

We're also now using these crates in the workspace via `path`.
(Previously, we were still fetching them from Cargo.)
2023-10-07 00:16:44 +00:00
Charlie Marsh 4fcdb3c045
Copy over `pep508-rs` crate (#31)
This PR copies over the `pep508-rs` crate at commit
`82aa5d4dcbe676b121dc931b0afa09a82de8e3d7` with no modifications.

It won't pass CI, but modifications will intentionally be confined to
later PRs.
2023-10-06 20:12:19 -04:00
Charlie Marsh f03398bee3
Copy over `pep440-rs` crate (#30)
This PR copies over the `pep440-rs` crate at commit
`a8303b01ffef6fccfdce562a887f6b110d482ef3` with no modifications.

It won't pass CI, but modifications will intentionally be confined to
later PRs.
2023-10-06 20:11:52 -04:00
Charlie Marsh 36d0124e60
Do wheel downloads concurrently (#28) 2023-10-06 20:51:31 +00:00
Charlie Marsh dd26cfa0cc
Migrate to `tokio` (#27)
Closes https://github.com/astral-sh/puffin/issues/26.
2023-10-06 20:31:03 +00:00
Charlie Marsh ca6aa207ff
Move to workspace dependencies (#25) 2023-10-06 19:49:41 +00:00
Charlie Marsh dab70a661a
Change `install` to `sync` (with sync semantics) (#24)
For better separation at this stage (and following `pip-tools`), it's now
`puffin sync`, and it assumes `--no-deps`.
2023-10-06 19:42:58 +00:00
Charlie Marsh ff8e24a621
Move `puffin-installer` to its own crate (#23) 2023-10-06 19:31:21 +00:00
Charlie Marsh f395c9c98c Update README 2023-10-06 01:03:07 -04:00
Charlie Marsh 28721cf5fc Avoid caching wheel fetches 2023-10-06 00:50:30 -04:00
Charlie Marsh a43328d914
Support wheel installation (#19)
Closes https://github.com/astral-sh/puffin/issues/8.
2023-10-06 00:47:45 -04:00
Charlie Marsh 47bbb7a78e
Separate platform tags (#18) 2023-10-05 23:24:38 -04:00
Charlie Marsh 9ea6eaeb10
Add separate compile and install commands (#17)
Closes #9.
2023-10-05 21:44:31 -04:00
Charlie Marsh 4c30cb146a Add crate README 2023-10-05 21:09:58 -04:00
Charlie Marsh 8b151a64d5
Rename `puffin-requirements` to `puffin-package` (#16)
Closes https://github.com/astral-sh/puffin/issues/7.
2023-10-05 21:03:20 -04:00
Charlie Marsh 94895de46d
Add support for wheel tag parsing (#15)
Closes https://github.com/astral-sh/puffin/issues/12.
2023-10-05 20:59:58 -04:00
Charlie Marsh 2d6266b167
Add an HTTP cache (and `--no-cache` argument) (#14)
Closes https://github.com/astral-sh/puffin/issues/3.
2023-10-05 19:14:05 -04:00
Charlie Marsh 1063d8c150
Add Python interpreter detection (#11)
Closes https://github.com/astral-sh/puffin/issues/2.
2023-10-05 15:09:22 -04:00
Charlie Marsh b059c590c4
Add basic CI via GitHub Actions (#10)
Closes https://github.com/astral-sh/puffin/issues/1.
2023-10-05 13:42:58 -04:00
Charlie Marsh b4828fb3f2 Remove progress bar 2023-10-05 12:45:38 -04:00
Charlie Marsh 7f497fa43f Add progress bar 2023-10-05 12:45:38 -04:00
Charlie Marsh 8032d4606e Misc. changes 2023-10-05 12:45:38 -04:00
Charlie Marsh f51432382a Do basic resolution 2023-10-05 12:45:38 -04:00
Charlie Marsh 0f10595ac3 Add version selection 2023-10-05 12:45:38 -04:00
Charlie Marsh 44b444494e Fetch package metadata in parallel 2023-10-05 12:45:38 -04:00
Charlie Marsh b08e8c78b5 Remove normalized representation of SimpleJson 2023-10-05 12:45:38 -04:00
Charlie Marsh 610fd9994f Add client networking stack 2023-10-05 12:45:38 -04:00
Charlie Marsh 1a2f35801b Add client networking stack 2023-10-05 12:45:38 -04:00
Charlie Marsh 53607df7c6 Add a requirements.txt parser 2023-10-05 12:45:38 -04:00
Charlie Marsh 8b9ac30507 Add license, Cargo.toml, etc. 2023-10-05 12:45:38 -04:00