## Summary
A bug in `requires_python` (which infers the Python requirement from a
marker) was leading us to break an invariant around the relationship
between the marker environment and the Python requirement. This, in
turn, was leading us to drop parts of the environment space when
solving.
Specifically, in the linked example, we generated a fork for
`python_full_version < '3.10' or platform_python_implementation !=
'CPython'`, which was later split into `python_full_version == '3.8.*'`
and `python_full_version == '3.9.*'`, losing the
`platform_python_implementation != 'CPython'` portion.
Closes https://github.com/astral-sh/uv/issues/10669.
## Summary
We never construct these -- they should be impossible, since we always
translate to `python_full_version`. This PR encodes that impossibility
in the types.
## Summary
I want to move towards a more normalized marker representation within
the marker tree, which means that the things we warn against will
disappear by the time we get to evaluation. I think it makes more sense
to show these warnings when we create the tree, rather than when we
evaluate it.
## Summary
We now respect the user-provided upper-bound in for `requires-python`.
So, if the user has `requires-python = "==3.11.*"`, we won't explore
forks that have `python_version >= '3.12'`, for example.
However, we continue to _only_ compare the lower bounds when assessing
whether a dependency is compatible with a given Python range.
Closes https://github.com/astral-sh/uv/issues/6150.
## Summary
This PR rewrites the `MarkerTree` type to use algebraic decision
diagrams (ADD). This has many benefits:
- The diagram is canonical for a given marker function. It is impossible
to create two functionally equivalent marker trees that don't refer to
the same underlying ADD. This also means that any trivially true or
unsatisfiable markers are represented by the same constants.
- The diagram can handle complex operations (conjunction/disjunction) in
polynomial time, as well as constant-time negation.
- The diagram can be converted to a simplified DNF form for user-facing
output.
The new representation gives us a lot more confidence in our marker
operations and simplification, which is proving to be very important
(see https://github.com/astral-sh/uv/pull/5733 and
https://github.com/astral-sh/uv/pull/5163).
Unfortunately, it is not easy to split this PR into multiple commits
because it is a large rewrite of the `marker` module. I'd suggest
reading through the `marker/algebra.rs`, `marker/simplify.rs`, and
`marker/tree.rs` files for the new implementation, as well as the
updated snapshots to verify how the new simplification rules work in
practice. However, a few other things were changed:
- [We now use release-only comparisons for `python_full_version`, where
we previously only did for
`python_version`](https://github.com/astral-sh/uv/blob/ibraheem/canonical-markers/crates/pep508-rs/src/marker/algebra.rs#L522).
I'm unsure how marker operations should work in the presence of
pre-release versions if we decide that this is incorrect.
- [Meaningless marker expressions are now
ignored](https://github.com/astral-sh/uv/blob/ibraheem/canonical-markers/crates/pep508-rs/src/marker/parse.rs#L502).
This means that a marker such as `'x' == 'x'` will always evaluate to
`true` (as if the expression did not exist), whereas we previously
treated this as always `false`. It's negation however, remains `false`.
- [Unsatisfiable markers are written as `python_version <
'0'`](https://github.com/astral-sh/uv/blob/ibraheem/canonical-markers/crates/pep508-rs/src/marker/tree.rs#L1329).
- The `PubGrubSpecifier` type has been moved to the new `uv-pubgrub`
crate, shared by `pep508-rs` and `uv-resolver`. `pep508-rs` also depends
on the `pubgrub` crate for the `Range` type, we probably want to move
`pubgrub::Range` into a separate crate to break this, but I don't think
that should block this PR (cc @konstin).
There is still some remaining work here that I decided to leave for now
for the sake of unblocking some of the related work on the resolver.
- We still use `Option<MarkerTree>` throughout uv, which is unnecessary
now that `MarkerTree::TRUE` is canonical.
- The `MarkerTree` type is now interned globally and can potentially
implement `Copy`. However, it's unclear if we want to add more
information to marker trees that would make it `!Copy`. For example, we
may wish to attach extra and requires-python environment information to
avoid simplifying after construction.
- We don't currently combine `python_full_version` and `python_version`
markers.
- I also have not spent too much time investigating performance and
there is probably some low-hanging fruit. Many of the test cases I did
run actually saw large performance improvements due to the markers being
simplified internally, reducing the stress on the old `normalize`
routine, especially for the extremely large markers seen in
`transformers` and other projects.
Resolves https://github.com/astral-sh/uv/issues/5660,
https://github.com/astral-sh/uv/issues/5179.
Consider the following packse scenario:
```toml
[root]
requires = [
"a>=1.0.0 ; python_version < '3.10'",
"a>=1.1.0 ; python_version >= '3.10'",
"a>=1.2.0 ; python_version >= '3.11'",
]
[packages.a.versions."1.0.0"]
[packages.a.versions."1.1.0"]
[packages.a.versions."1.2.0"]
```
On current `main`, this produces a dependency on `a` that looks like
this:
```toml
dependencies = [
{ name = "fork-overlapping-markers-basic-a", marker = "python_version < '3.10' or python_version >= '3.11'" },
]
```
But the marker expression is clearly wrong here, since it implies that
`a` isn't installed at all for Python 3.10. With this PR, the above
dependency becomes:
```toml
dependencies = [
{ name = "fork-overlapping-markers-basic-a" },
]
```
That is, it's unconditional. Which is I believe correct here since there
aren't any other constraints on which version to select.
The specific bug here is that when we found overlapping dependency
specifications for the same package *within* a pre-existing fork, we
intersected all of their marker expressions instead of unioning them.
That in turn resulted in incorrect marker expressions.
While this doesn't fix any known bug on the issue tracker (like #4640),
it does appear to fix a couple of our snapshot tests. And fixes a basic
test case I came up with while working on #4732.
For the packse scenario test: https://github.com/astral-sh/packse/pull/206
## Summary
Normalize the order of marker expressions on construction. This removes
the distinction between expressions like `os_name == 'Linux'` vs.
`'Linux' == os_name` throughout the codebase. One caveat here is that
the `in` operator does not have a direct inverse, so we introduce
`MarkerOperator::Contains` to handle that case.
I wanted to land this smaller change before some more intrusive changes
as it simplifies the existing code quite a bit.
In some cases, it's possible for the marker expressions on conflicting
dependency specification to be disjoint but *incomplete*. That is, if
one unions the disjoint markers, the result is not the complete set of
marker environments possible. There may be some "gap" of marker
environments not covered by the markers.
This is a problem in practice because, before this commit, we only
created forks in the resolver for specific marker expressions. So if a
dependency happened to fall in a "gap," our resolver would never see it.
This commit fixes this by adding a new split covering the negation of
the union of all marker expressions in a set of forks for a specific
package.
Originally, I had planned on only creating this split when it was known
that the gap actually existed. That is, when the negation of the marker
expressions did *not* correspond to the empty set. After a lot of
thought, unfortunately, this (I believe) effectively boils down to 3SAT,
which is NP-complete.
Instead, what we do here is *always* create an extra split unless we can
definitively tell that it is empty. We look for a few cases, but
otherwise throw our hands up and potentially do wasted work.
This also updates the lock scenario tests to reflect the actual bug fix
here.
## Summary
In marker normalization, we now remove any markers that are redundant
with the `requires-python` specifier (i.e., always true for the given
Python requirement).
For example, given `iniconfig ; python_version >= '3.7'`, we can remove
the `python_version >= '3.7'` marker when resolving with
`--python-version 3.8`.
Closes#4852.
## Summary
Given `python_version != '3.8' and python_version < '3.10'`, the first
term was expanded to `python_version < '3.8'` and `python_version >
'3.8'`. We then AND'd all three terms together. We don't seem to have a
way to differentiate between the terms to AND and the terms to OR in the
normalization code (it all gets flattened together), so instead this PR
expands the expressions at the leaf level and then flattens them at the
level above when appropriate.
Closes https://github.com/astral-sh/uv/issues/4910.
## Summary
More marker simplification:
- Filters out redundant subtrees based on outer expressions, e.g. `a and (a or
b)` simplifies to `a`.
- Flattens nested trees internally, e.g. `(a and b) and c`
Resolves https://github.com/astral-sh/uv/issues/4536.
## Summary
There are a few ideas at play here:
1. pip always strips versions to the release when evaluating against a
`Requires-Python`, so we now do the same. That means, e.g., using
`3.13.0b0` will be accepted by a project with `Requires-Python: >=
3.13`, which does _not_ adhere to PEP 440 semantics but is somewhat
intuitive.
2. Because we know we'll only be evaluating against release-only
versions, we can use different semantics in PubGrub that let us collapse
ranges. For example, `python_version >= '3.10' or python_version <
'3.10'` can be collapsed to the truthy marker.
Closes https://github.com/astral-sh/uv/issues/4714.
Closes https://github.com/astral-sh/uv/issues/4272.
Closes https://github.com/astral-sh/uv/issues/4719.
## Summary
Given:
```text
numpy >=1.26 ; python_version >= '3.9'
numpy <1.26 ; python_version < '3.9'
```
When resolving for Python 3.8, we need to narrow the `requires-python`
requirement in the top branch of the fork, because `numpy >=1.26` all
require Python 3.9 or later -- but we know (in that branch) that we only
need to _solve_ for Python 3.9 or later.
Closes https://github.com/astral-sh/uv/issues/4669.
I found this while testing the tracking of marker expressions across
resolver forks. Namely, given
sys_platform == 'darwin' and implementation_name == 'pypy'
And:
sys_platform == 'bar' or implementation_name == 'foo'
These should be disjoint, but the disjointness checker was reporting
them as overlapping. I fixed this by giving handling of disjunctions
higher precedence than conjunctions, although I am not 100% confident
that this is correct for all cases.
## Summary
`normalize` now takes an owned value and returns `Option<MarkerTree>`,
such that if any sub-expression evaluates to `true`, we can normalize
out the entire marker.
Closes https://github.com/astral-sh/uv/issues/4267.
## Summary
Simplify and normalize marker expressions in the lockfile. Right now
this does a simple analysis by only looking at related operators at the
same level of precedence. I think anything more complex would be out of
scope.
Resolves https://github.com/astral-sh/uv/issues/4002.