This commit attempts an optimization: it switches a version's `release`
field over to a `smallvec`. The idea is that most versions are very
small and can be stored inline, avoiding a heap allocation.
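To make the idea concrete, here is a minimal, stdlib-only sketch of inline storage with a heap fallback. It is not the code from this commit: the `Release` enum, the `INLINE_CAP` of 4, and the `u64` segment type are all hypothetical choices for illustration. The real `smallvec` crate provides this behaviour through `SmallVec<[T; N]>`.

```rust
/// Hypothetical inline capacity; the value used by the commit is not stated.
const INLINE_CAP: usize = 4;

/// Illustrative stand-in for a version's `release` segments.
/// Short releases live inside the value itself; longer ones spill to the heap.
#[derive(Debug, Clone)]
enum Release {
    Inline { len: u8, buf: [u64; INLINE_CAP] },
    Heap(Vec<u64>),
}

impl Release {
    fn from_slice(segments: &[u64]) -> Self {
        if segments.len() <= INLINE_CAP {
            // Copy into the fixed-size buffer: no allocation needed.
            let mut buf = [0u64; INLINE_CAP];
            buf[..segments.len()].copy_from_slice(segments);
            Release::Inline { len: segments.len() as u8, buf }
        } else {
            // Too long for the inline buffer: allocate.
            Release::Heap(segments.to_vec())
        }
    }

    fn as_slice(&self) -> &[u64] {
        match self {
            Release::Inline { len, buf } => &buf[..*len as usize],
            Release::Heap(v) => v.as_slice(),
        }
    }

    fn is_inline(&self) -> bool {
        matches!(self, Release::Inline { .. })
    }
}

fn main() {
    let small = Release::from_slice(&[2, 8, 1]);
    let large = Release::from_slice(&[1, 2, 3, 4, 5]);
    assert!(small.is_inline());
    assert!(!large.is_inline());
    assert_eq!(small.as_slice(), &[2, 8, 1]);
    assert_eq!(large.as_slice(), &[1, 2, 3, 4, 5]);
}
```

The trade-off in a design like this is that every value carries the full inline buffer, so the type gets larger even as allocations go away.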
Interestingly, I was unable to observe any obvious benefit:
$ hyperfine \
"./target/profiling/puffin-dev-u32 resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null" \
"./target/profiling/puffin-dev-smallvec-release resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null"
Benchmark 1: ./target/profiling/puffin-dev-u32 resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null
Time (mean ± σ): 872.2 ms ± 26.5 ms [User: 14646.0 ms, System: 2516.0 ms]
Range (min … max): 833.0 ms … 912.0 ms 10 runs
Benchmark 2: ./target/profiling/puffin-dev-smallvec-release resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null
Time (mean ± σ): 882.3 ms ± 17.4 ms [User: 14764.4 ms, System: 2520.9 ms]
Range (min … max): 859.7 ms … 912.7 ms 10 runs
Summary
'./target/profiling/puffin-dev-u32 resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null' ran
1.01 ± 0.04 times faster than './target/profiling/puffin-dev-smallvec-release resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null'
My hypothesis is that an earlier commit, which switched the global
allocator to jemalloc, had precipitously decreased the cost of
allocation, to the point that the reduction in allocs from the smallvec
becomes a wash. To test my hypothesis, I dropped the jemalloc commit and
measured the perf of the smallvec optimization against main:
$ hyperfine \
"./target/profiling/puffin-dev-main resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null" \
"./target/profiling/puffin-dev-smallvec-release-no-jemalloc resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null"
Benchmark 1: ./target/profiling/puffin-dev-main resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null
Time (mean ± σ): 968.0 ms ± 20.0 ms [User: 17637.4 ms, System: 2151.9 ms]
Range (min … max): 940.2 ms … 1005.3 ms 10 runs
Benchmark 2: ./target/profiling/puffin-dev-smallvec-release-no-jemalloc resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null
Time (mean ± σ): 958.4 ms ± 15.7 ms [User: 17119.7 ms, System: 2246.1 ms]
Range (min … max): 944.7 ms … 993.3 ms 10 runs
Summary
'./target/profiling/puffin-dev-smallvec-release-no-jemalloc resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null' ran
1.01 ± 0.03 times faster than './target/profiling/puffin-dev-main resolve-many --cache-dir cache-docker-no-build --no-build pypi_top_8k_flat.txt --limit 1000 2> /dev/null'
Fiddlesticks. Even when allocation is (presumably) more expensive, the
smallvec optimization didn't help. This suggests something is off about
my mental model of the code. So there are more avenues to explore here!
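As a sanity check on "no obvious benefit", the two summary lines above can be recomputed from the reported means and standard deviations. The sketch below assumes standard first-order error propagation for a ratio of two means; the function name `ratio_with_error` is made up for this example.

```rust
/// Ratio of two means with first-order propagated uncertainty:
/// ratio * sqrt((sd_a / mean_a)^2 + (sd_b / mean_b)^2).
fn ratio_with_error(slow_mean: f64, slow_sd: f64, fast_mean: f64, fast_sd: f64) -> (f64, f64) {
    let ratio = slow_mean / fast_mean;
    let rel = ((slow_sd / slow_mean).powi(2) + (fast_sd / fast_mean).powi(2)).sqrt();
    (ratio, ratio * rel)
}

fn main() {
    // With jemalloc: smallvec (882.3 ± 17.4 ms) vs. u32 baseline (872.2 ± 26.5 ms).
    let (r1, e1) = ratio_with_error(882.3, 17.4, 872.2, 26.5);
    // Without jemalloc: main (968.0 ± 20.0 ms) vs. smallvec (958.4 ± 15.7 ms).
    let (r2, e2) = ratio_with_error(968.0, 20.0, 958.4, 15.7);

    println!("with jemalloc:    {:.2} ± {:.2}", r1, e1);
    println!("without jemalloc: {:.2} ± {:.2}", r2, e2);

    // In both runs the distance of the ratio from 1.0 is smaller than its
    // uncertainty, so neither measurement separates the two binaries.
    assert!((r1 - 1.0).abs() < e1);
    assert!((r2 - 1.0).abs() < e2);
}
```

In both cases the roughly 1% difference sits well inside the propagated uncertainty, so these runs cannot distinguish a small win from a small loss.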
Dependency specifiers (PEP 508) in Rust
A library for Python dependency specifiers, better known as PEP 508.
Usage
In Rust
use std::str::FromStr;
use pep508_rs::Requirement;
let marker = r#"requests [security,tests] >= 2.8.1, == 2.8.* ; python_version > "3.8""#;
let dependency_specification = Requirement::from_str(marker).unwrap();
assert_eq!(dependency_specification.name, "requests");
assert_eq!(dependency_specification.extras, Some(vec!["security".to_string(), "tests".to_string()]));
In Python
from pep508_rs import Requirement
requests = Requirement(
'requests [security,tests] >= 2.8.1, == 2.8.* ; python_version > "3.8"'
)
assert requests.name == "requests"
assert requests.extras == ["security", "tests"]
assert [str(i) for i in requests.version_or_url] == [">= 2.8.1", "== 2.8.*"]
Python bindings are built with maturin, but you can also use the normal `pip install .`
`Version` and `VersionSpecifier` from `pep440_rs` are re-exported to avoid type mismatches.
Markers
Markers allow you to install dependencies only in specific environments (Python version, operating system, architecture, etc.) or when a specific feature is activated. For example, you can say `importlib-metadata ; python_version < "3.8"` or `itsdangerous (>=1.1.0) ; extra == 'security'`. Unfortunately, the marker grammar has some oversights (e.g. https://github.com/pypa/packaging.python.org/pull/1181), and the design of comparisons (PEP 440 comparisons with lexicographic fallback) leads to confusing outcomes. This implementation tries to carefully validate everything and emit warnings whenever bogus comparisons with unintended semantics are made.
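To illustrate why a lexicographic fallback is confusing, here is a small stdlib-only sketch. It does not use this library's API: `parse_release` and `compare` are hypothetical helpers, and the parser handles only plain dotted numbers rather than full PEP 440. It models the fallback semantics described above, not what this implementation does with invalid versions.

```rust
use std::cmp::Ordering;

/// Parse a dotted release such as "3.10" into numeric segments.
/// Simplification: plain `N(.N)*` only, with no zero-padding of shorter releases.
fn parse_release(s: &str) -> Option<Vec<u64>> {
    s.split('.').map(|part| part.parse::<u64>().ok()).collect()
}

/// Compare numerically when both sides parse; otherwise fall back to
/// plain string comparison, as the marker rules allow.
fn compare(lhs: &str, rhs: &str) -> Ordering {
    match (parse_release(lhs), parse_release(rhs)) {
        (Some(a), Some(b)) => a.cmp(&b),
        _ => lhs.cmp(rhs),
    }
}

fn main() {
    // Both sides are valid: 3.10 is newer than 3.9.
    assert_eq!(compare("3.10", "3.9"), Ordering::Greater);

    // A stray trailing dot makes "3.9." unparsable, so the comparison
    // silently becomes a string comparison, where '1' sorts before '9'.
    assert_eq!(compare("3.10", "3.9."), Ordering::Less);
}
```

A single typo flips the result without any error, which is the kind of unintended semantics that a validating implementation can warn about.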
In Python, warnings are sent to the normal Python logging infrastructure by default:
from pep508_rs import Requirement, MarkerEnvironment
env = MarkerEnvironment.current()
assert not Requirement("numpy; extra == 'science'").evaluate_markers(env, [])
assert Requirement("numpy; extra == 'science'").evaluate_markers(env, ["science"])
assert not Requirement(
"numpy; extra == 'science' and extra == 'arrays'"
).evaluate_markers(env, ["science"])
assert Requirement(
"numpy; extra == 'science' or extra == 'arrays'"
).evaluate_markers(env, ["science"])
from pep508_rs import Requirement, MarkerEnvironment
env = MarkerEnvironment.current()
Requirement("numpy; python_version >= '3.9.'").evaluate_markers(env, [])
# This will log:
# "Expected PEP 440 version to compare with python_version, found '3.9.', "
# "evaluating to false: Version `3.9.` doesn't match PEP 440 rules"