Commit Graph

5 Commits

Author SHA1 Message Date
Douglas Creager 45842cc034
[ty] Fix non-determinism in `ConstraintSet.specialize_constrained` (#21744)
This fixes a non-determinism that we were seeing in the constraint set
tests in https://github.com/astral-sh/ruff/pull/21715.

In this test, we create the following constraint set, and then try to
create a specialization from it:

```
(T@constrained_by_gradual_list = list[Base])
  ∨
(Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]])
```

That is, `T` is either specifically `list[Base]`, or it's any `list`.
Our current heuristics say that, absent other restrictions, we should
specialize `T` to the more specific type (`list[Base]`).

In the correct test output, we end up creating a BDD that looks like
this:

```
(T@constrained_by_gradual_list = list[Base])
┡━₁ always
└─₀ (Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]])
    ┡━₁ always
    └─₀ never
```

In the incorrect output, the BDD looks like this:

```
(Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]])
┡━₁ always
└─₀ never
```

The difference is the ordering of the two individual constraints. Both
constraints appear in the first BDD, but the second BDD only contains `T
is any list`. If we were to force the second BDD to contain both
constraints, it would look like this:

```
(Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]])
┡━₁ always
└─₀ (T@constrained_by_gradual_list = list[Base])
    ┡━₁ always
    └─₀ never
```

This is the standard shape for an OR of two constraints. However! Those
two constraints are not independent of each other! If `T` is
specifically `list[Base]`, then it's definitely also "any `list`". From
that, we can infer the contrapositive: that if `T` is not any list, then
it cannot be `list[Base]` specifically. When we encounter impossible
situations like that, we prune that path in the BDD, and treat it as
`false`. That rewrites the second BDD to the following:

```
(Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]])
┡━₁ always
└─₀ (T@constrained_by_gradual_list = list[Base])
    ┡━₁ never   <-- IMPOSSIBLE, rewritten to never
    └─₀ never
```

We then would see that that BDD node is redundant, since both of its
outgoing edges point at the `never` node. Our BDDs are _reduced_, which
means we have to remove that redundant node, resulting in the BDD we saw
above:

```
(Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]])
┡━₁ always
└─₀ never       <-- redundant node removed
```

The end result is that we were "forgetting" about the `T = list[Base]`
constraint, but only for some BDD variable orderings.

To fix this, I'm leaning in to the fact that our BDDs really do need to
"remember" all of the constraints that they were created with. Some
combinations might not be possible, but we now have the sequent map,
which is quite good at detecting and pruning those.

So now our BDDs are _quasi-reduced_, which just means that redundant
nodes are allowed. (At first I was worried that allowing redundant nodes
would be an unsound "fix the glitch". But it turns out they're real!
[This](https://ieeexplore.ieee.org/abstract/document/130209) is the
paper that introduces them, though it's very difficult to read. Knuth
mentions them in §7.1.4 of
[TAOCP](https://course.khoury.northeastern.edu/csu690/ssl/bdd-knuth.pdf),
and [this paper](https://par.nsf.gov/servlets/purl/10128966) has a nice
short summary of them in §2.)

While we're here, I've added a bunch of `debug` and `trace` level log
messages to the constraint set implementation. I was getting tired of
having to add these by hands over and over. To enable them, just set
`TY_LOG` in your environment, e.g.

```sh
env TY_LOG=ty_python_semantic::types::constraints::SequentMap=trace ty check ...
```

[Note, this has an `internal` label because are still not using
`specialize_constrained` in anything user-facing yet.]
2025-12-03 10:19:39 -05:00
Douglas Creager cf4196466c
[ty] Stop testing the (brittle) constraint set display implementation (#21743)
The `Display` implementation for constraint sets is brittle, and
deserves a rethink. But later! It's perfectly fine for printf debugging;
we just shouldn't be writing mdtests that depend on any particular
rendering details. Most of these tests can be replaced with an
equivalence check that actually validates that the _behavior_ of two
constraint sets are identical.
2025-12-02 09:17:29 +01:00
Douglas Creager 7e277667d1
[ty] Distinguish "unconstrained" from "constrained to any type" (#21539)
Before, we would collapse any constraint of the form `Never ≤ T ≤
object` down to the "always true" constraint set. This is correct in
terms of BDD semantics, but loses information, since "not constraining a
typevar at all" is different than "constraining a typevar to take on any
type". Once we get to specialization inference, we should fall back on
the typevar's default for the former, but not for the latter.

This is much easier to support now that we have a sequent map, since we
need to treat `¬(Never ≤ T ≤ object)` as being impossible, and prune it
when we walk through BDD paths, just like we do for other impossible
combinations.
2025-11-24 15:23:09 -05:00
Douglas Creager 83134fb380
[ty] Handle nested types when creating specializations from constraint sets (#21530)
#21414 added the ability to create a specialization from a constraint
set. It handled mutually constrained typevars just fine, e.g. given `T ≤
int ∧ U = T` we can infer `T = int, U = int`.

But it didn't handle _nested_ constraints correctly, e.g. `T ≤ int ∧ U =
list[T]`. Now we do! This requires doing a fixed-point "apply the
specialization to itself" step to propagate the assignments of any
nested typevars, and then a cycle detection check to make sure we don't
have an infinite expansion in the specialization.

This gets at an interesting nuance in our constraint set structure that
@sharkdp has asked about before. Constraint sets are BDDs, and each
internal node represents an _individual constraint_, of the form `lower
≤ T ≤ upper`. `lower` and `upper` are allowed to be other typevars, but
only if they appear "later" in the arbitary ordering that we establish
over typevars. The main purpose of this is to avoid infinite expansion
for mutually constrained typevars.

However, that restriction doesn't help us here, because only applies
when `lower` and `upper` _are_ typevars, not when they _contain_
typevars. That distinction is important, since it means the restriction
does not affect our expressiveness: we can always rewrite `Never ≤ T ≤
U` (a constraint on `T`) into `T ≤ U ≤ object` (a constraint on `U`).
The same is not true of `Never ≤ T ≤ list[U]` — there is no "inverse" of
`list` that we could apply to both sides to transform this into a
constraint on a bare `U`.
2025-11-19 17:37:16 -05:00
Douglas Creager 97935518e9
[ty] Create a specialization from a constraint set (#21414)
This patch lets us create specializations from a constraint set. The
constraint encodes the restrictions on which types each typevar can
specialize to. Given a generic context and a constraint set, we iterate
through all of the generic context's typevars. For each typevar, we
abstract the constraint set so that it only mentions the typevar in
question (propagating derived facts if needed). We then find the "best
representative type" for the typevar given the abstracted constraint
set.

When considering the BDD structure of the abstracted constraint set,
each path from the BDD root to the `true` terminal represents one way
that the constraint set can be satisfied. (This is also one of the
clauses in the DNF representation of the constraint set's boolean
formula.) Each of those paths is the conjunction of the individual
constraints of each internal node that we traverse as we walk that path,
giving a single lower/upper bound for the path. We use the upper bound
as the "best" (i.e. "closest to `object`") type for that path.

If there are multiple paths in the BDD, they technically represent
independent possible specializations. If there's a single specialization
that satisfies all of them, we will return that as the specialization.
If not, then the constraint set is ambiguous. (This happens most often
with constrained typevars.) We could in the future turn _each_ of the
paths into separate specializations, but it's not clear what we would do
with that, so instead we just report the ambiguity as a specialization
failure.
2025-11-19 14:20:33 -05:00