Python/ruff - ruff - Gitea: Git with a cup of tea

Commit Graph

Author	SHA1	Message	Date
Douglas Creager	45842cc034	[ty] Fix non-determinism in `ConstraintSet.specialize_constrained` (#21744 ) This fixes a non-determinism that we were seeing in the constraint set tests in https://github.com/astral-sh/ruff/pull/21715. In this test, we create the following constraint set, and then try to create a specialization from it: ``` (T@constrained_by_gradual_list = list[Base]) ∨ (Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]]) ``` That is, `T` is either specifically `list[Base]`, or it's any `list`. Our current heuristics say that, absent other restrictions, we should specialize `T` to the more specific type (`list[Base]`). In the correct test output, we end up creating a BDD that looks like this: ``` (T@constrained_by_gradual_list = list[Base]) ┡━₁ always └─₀ (Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]]) ┡━₁ always └─₀ never ``` In the incorrect output, the BDD looks like this: ``` (Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]]) ┡━₁ always └─₀ never ``` The difference is the ordering of the two individual constraints. Both constraints appear in the first BDD, but the second BDD only contains `T is any list`. If we were to force the second BDD to contain both constraints, it would look like this: ``` (Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]]) ┡━₁ always └─₀ (T@constrained_by_gradual_list = list[Base]) ┡━₁ always └─₀ never ``` This is the standard shape for an OR of two constraints. However! Those two constraints are not independent of each other! If `T` is specifically `list[Base]`, then it's definitely also "any `list`". From that, we can infer the contrapositive: that if `T` is not any list, then it cannot be `list[Base]` specifically. When we encounter impossible situations like that, we prune that path in the BDD, and treat it as `false`. That rewrites the second BDD to the following: ``` (Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]]) ┡━₁ always └─₀ (T@constrained_by_gradual_list = list[Base]) ┡━₁ never <-- IMPOSSIBLE, rewritten to never └─₀ never ``` We then would see that that BDD node is redundant, since both of its outgoing edges point at the `never` node. Our BDDs are _reduced_, which means we have to remove that redundant node, resulting in the BDD we saw above: ``` (Bottom[list[Any]] ≤ T@constrained_by_gradual_list ≤ Top[list[Any]]) ┡━₁ always └─₀ never <-- redundant node removed ``` The end result is that we were "forgetting" about the `T = list[Base]` constraint, but only for some BDD variable orderings. To fix this, I'm leaning in to the fact that our BDDs really do need to "remember" all of the constraints that they were created with. Some combinations might not be possible, but we now have the sequent map, which is quite good at detecting and pruning those. So now our BDDs are _quasi-reduced_, which just means that redundant nodes are allowed. (At first I was worried that allowing redundant nodes would be an unsound "fix the glitch". But it turns out they're real! [This](https://ieeexplore.ieee.org/abstract/document/130209) is the paper that introduces them, though it's very difficult to read. Knuth mentions them in §7.1.4 of [TAOCP](https://course.khoury.northeastern.edu/csu690/ssl/bdd-knuth.pdf), and [this paper](https://par.nsf.gov/servlets/purl/10128966) has a nice short summary of them in §2.) While we're here, I've added a bunch of `debug` and `trace` level log messages to the constraint set implementation. I was getting tired of having to add these by hands over and over. To enable them, just set `TY_LOG` in your environment, e.g. ```sh env TY_LOG=ty_python_semantic::types::constraints::SequentMap=trace ty check ... ``` [Note, this has an `internal` label because are still not using `specialize_constrained` in anything user-facing yet.]	2025-12-03 10:19:39 -05:00
Douglas Creager	cf4196466c	[ty] Stop testing the (brittle) constraint set display implementation (#21743 ) The `Display` implementation for constraint sets is brittle, and deserves a rethink. But later! It's perfectly fine for printf debugging; we just shouldn't be writing mdtests that depend on any particular rendering details. Most of these tests can be replaced with an equivalence check that actually validates that the _behavior_ of two constraint sets are identical.	2025-12-02 09:17:29 +01:00
Douglas Creager	7e277667d1	[ty] Distinguish "unconstrained" from "constrained to any type" (#21539 ) Before, we would collapse any constraint of the form `Never ≤ T ≤ object` down to the "always true" constraint set. This is correct in terms of BDD semantics, but loses information, since "not constraining a typevar at all" is different than "constraining a typevar to take on any type". Once we get to specialization inference, we should fall back on the typevar's default for the former, but not for the latter. This is much easier to support now that we have a sequent map, since we need to treat `¬(Never ≤ T ≤ object)` as being impossible, and prune it when we walk through BDD paths, just like we do for other impossible combinations.	2025-11-24 15:23:09 -05:00
Douglas Creager	83134fb380	[ty] Handle nested types when creating specializations from constraint sets (#21530 ) #21414 added the ability to create a specialization from a constraint set. It handled mutually constrained typevars just fine, e.g. given `T ≤ int ∧ U = T` we can infer `T = int, U = int`. But it didn't handle _nested_ constraints correctly, e.g. `T ≤ int ∧ U = list[T]`. Now we do! This requires doing a fixed-point "apply the specialization to itself" step to propagate the assignments of any nested typevars, and then a cycle detection check to make sure we don't have an infinite expansion in the specialization. This gets at an interesting nuance in our constraint set structure that @sharkdp has asked about before. Constraint sets are BDDs, and each internal node represents an _individual constraint_, of the form `lower ≤ T ≤ upper`. `lower` and `upper` are allowed to be other typevars, but only if they appear "later" in the arbitary ordering that we establish over typevars. The main purpose of this is to avoid infinite expansion for mutually constrained typevars. However, that restriction doesn't help us here, because only applies when `lower` and `upper` _are_ typevars, not when they _contain_ typevars. That distinction is important, since it means the restriction does not affect our expressiveness: we can always rewrite `Never ≤ T ≤ U` (a constraint on `T`) into `T ≤ U ≤ object` (a constraint on `U`). The same is not true of `Never ≤ T ≤ list[U]` — there is no "inverse" of `list` that we could apply to both sides to transform this into a constraint on a bare `U`.	2025-11-19 17:37:16 -05:00
Douglas Creager	97935518e9	[ty] Create a specialization from a constraint set (#21414 ) This patch lets us create specializations from a constraint set. The constraint encodes the restrictions on which types each typevar can specialize to. Given a generic context and a constraint set, we iterate through all of the generic context's typevars. For each typevar, we abstract the constraint set so that it only mentions the typevar in question (propagating derived facts if needed). We then find the "best representative type" for the typevar given the abstracted constraint set. When considering the BDD structure of the abstracted constraint set, each path from the BDD root to the `true` terminal represents one way that the constraint set can be satisfied. (This is also one of the clauses in the DNF representation of the constraint set's boolean formula.) Each of those paths is the conjunction of the individual constraints of each internal node that we traverse as we walk that path, giving a single lower/upper bound for the path. We use the upper bound as the "best" (i.e. "closest to `object`") type for that path. If there are multiple paths in the BDD, they technically represent independent possible specializations. If there's a single specialization that satisfies all of them, we will return that as the specialization. If not, then the constraint set is ambiguous. (This happens most often with constrained typevars.) We could in the future turn _each_ of the paths into separate specializations, but it's not clear what we would do with that, so instead we just report the ambiguity as a specialization failure.	2025-11-19 14:20:33 -05:00

5 Commits