## Summary
This PR makes explicit specialization of a type variable itself an
error, and the result of the specialization is `Unknown`.
The change also fixes https://github.com/astral-sh/ty/issues/1794.
## Test Plan
mdtests updated
new corpus test
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
This PR includes the following changes:
* When attempting to specialize a non-generic type (or a type that is
already specialized), the result is `Unknown`. Also, the error message
is improved.
* When an implicit type alias is incorrectly specialized, the result is
`Unknown`. Also, the error message is improved.
* When only some of the type alias bounds and constraints are not
satisfied, not all substitutions are `Unknown`.
* Double specialization is prohibited. e.g. `G[int][int]`
Furthermore, after applying this PR, the fuzzing tests for seeds 1052
and 4419, which panic in main, now pass.
This is because the false recursions on type variables have been
removed.
```python
# name_2[0] => Unknown
class name_1[name_2: name_2[0]]:
def name_4(name_3: name_2, /):
if name_3:
pass
# (name_5 if unique_name_0 else name_1)[0] => Unknown
def name_4[name_5: (name_5 if unique_name_0 else name_1)[0], **name_1](): ...
```
## Test Plan
New corpus test
mdtest files updated
Fixes https://github.com/astral-sh/ty/issues/1716.
## Test plan
I added a corpus snippet that causes us to panic on `main` (I tested by
running `cargo run -p ty_python_semantic --test=corpus` without the fix
applied).
## Summary
Derived from #17371Fixesastral-sh/ty#256
Fixes https://github.com/astral-sh/ty/issues/1415
Fixes https://github.com/astral-sh/ty/issues/1433
Fixes https://github.com/astral-sh/ty/issues/1524
Properly handles any kind of recursive inference and prevents panics.
---
Let me explain techniques for converging fixed-point iterations during
recursive type inference.
There are two types of type inference that naively don't converge
(causing salsa to panic): divergent type inference and oscillating type
inference.
### Divergent type inference
Divergent type inference occurs when eagerly expanding a recursive type.
A typical example is this:
```python
class C:
def f(self, other: "C"):
self.x = (other.x, 1)
reveal_type(C().x) # revealed: Unknown | tuple[Unknown | tuple[Unknown | tuple[..., Literal[1]], Literal[1]], Literal[1]]
```
To solve this problem, we have already introduced `Divergent` types
(https://github.com/astral-sh/ruff/pull/20312). `Divergent` types are
treated as a kind of dynamic type [^1].
```python
Unknown | tuple[Unknown | tuple[Unknown | tuple[..., Literal[1]], Literal[1]], Literal[1]]
=> Unknown | tuple[Divergent, Literal[1]]
```
When a query function that returns a type enters a cycle, it sets
`Divergent` as the cycle initial value (instead of `Never`). Then, in
the cycle recovery function, it reduces the nesting of types containing
`Divergent` to converge.
```python
0th: Divergent
1st: Unknown | tuple[Divergent, Literal[1]]
2nd: Unknown | tuple[Unknown | tuple[Divergent, Literal[1]], Literal[1]]
=> Unknown | tuple[Divergent, Literal[1]]
```
Each cycle recovery function for each query should operate only on the
`Divergent` type originating from that query.
For this reason, while `Divergent` appears the same as `Any` to the
user, it internally carries some information: the location where the
cycle occurred. Previously, we roughly identified this by having the
scope where the cycle occurred, but with the update to salsa, functions
that create cycle initial values can now receive a `salsa::Id`
(https://github.com/salsa-rs/salsa/pull/1012). This is an opaque ID that
uniquely identifies the cycle head (the query that is the starting point
for the fixed-point iteration). `Divergent` now has this `salsa::Id`.
### Oscillating type inference
Now, another thing to consider is oscillating type inference.
Oscillating type inference arises from the fact that monotonicity is
broken. Monotonicity here means that for a query function, if it enters
a cycle, the calculation must start from a "bottom value" and progress
towards the final result with each cycle. Monotonicity breaks down in
type systems that have features like overloading and overriding.
```python
class Base:
def flip(self) -> "Sub":
return Sub()
class Sub(Base):
def flip(self) -> "Base":
return Base()
class C:
def __init__(self, x: Sub):
self.x = x
def replace_with(self, other: "C"):
self.x = other.x.flip()
reveal_type(C(Sub()).x)
```
Naive fixed-point iteration results in `Divergent -> Sub -> Base -> Sub
-> ...`, which oscillates forever without diverging or converging. To
address this, the salsa API has been modified so that the cycle recovery
function receives the value of the previous cycle
(https://github.com/salsa-rs/salsa/pull/1012).
The cycle recovery function returns the union type of the current cycle
and the previous cycle. In the above example, the result type for each
cycle is `Divergent -> Sub -> Base (= Sub | Base) -> Base`, which
converges.
The final result of oscillating type inference does not contain
`Divergent` because `Divergent` that appears in a union type can be
removed, as is clear from the expansion. This simplification is
performed at the same time as nesting reduction.
```
T | Divergent = T | (T | (T | ...)) = T
```
[^1]: In theory, it may be possible to strictly treat types containing
`Divergent` types as recursive types, but we probably shouldn't go that
deep yet. (AFAIK, there are no PEPs that specify how to handle
implicitly recursive types that aren't named by type aliases)
## Performance analysis
A happy side effect of this PR is that we've observed widespread
performance improvements!
This is likely due to the removal of the `ITERATIONS_BEFORE_FALLBACK`
and max-specialization depth trick
(https://github.com/astral-sh/ty/issues/1433,
https://github.com/astral-sh/ty/issues/1415), which means we reach a
fixed point much sooner.
## Ecosystem analysis
The changes look good overall.
You may notice changes in the converged values for recursive types,
this is because the way recursive types are normalized has been changed.
Previously, types containing `Divergent` types were normalized by
replacing them with the `Divergent` type itself, but in this PR, types
with a nesting level of 2 or more that contain `Divergent` types are
normalized by replacing them with a type with a nesting level of 1. This
means that information about the non-divergent parts of recursive types
is no longer lost.
```python
# previous
tuple[tuple[Divergent, int], int] => Divergent
# now
tuple[tuple[Divergent, int], int] => tuple[Divergent, int]
```
The false positive error introduced in this PR occurs in class
definitions with self-referential base classes, such as the one below.
```python
from typing_extensions import Generic, TypeVar
T = TypeVar("T")
U = TypeVar("U")
class Base2(Generic[T, U]): ...
# TODO: no error
# error: [unsupported-base] "Unsupported class base with type `<class 'Base2[Sub2, U@Sub2]'> | <class 'Base2[Sub2[Unknown], U@Sub2]'>`"
class Sub2(Base2["Sub2", U]): ...
```
This is due to the lack of support for unions of MROs, or because cyclic
legacy generic types are not inferred as generic types early in the
query cycle.
## Test Plan
All samples listed in astral-sh/ty#256 are tested and passed without any
panic!
## Acknowledgments
Thanks to @MichaReiser for working on bug fixes and improvements to
salsa for this PR. @carljm also contributed early on to the discussion
of the query convergence mechanism proposed in this PR.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
cf. https://github.com/astral-sh/ruff/pull/20962
In the following code, `foo` in the comprehension was not reported as
unresolved:
```python
# error: [unresolved-reference] "Name `foo` used when not defined"
foo
foo = [
# no error!
# revealed: Divergent
reveal_type(x) for _ in () for x in [foo]
]
baz = [
# error: [unresolved-reference] "Name `baz` used when not defined"
# revealed: Unknown
reveal_type(x) for _ in () for x in [baz]
]
```
In fact, this is a more serious bug than it looks: for `foo`,
[`explicit_global_symbol` is
called](6cc3393ccd/crates/ty_python_semantic/src/types/infer/builder.rs (L8052)),
causing a symbol that should actually be `Undefined` to be reported as
being of type `Divergent`.
This PR fixes this bug. As a result, the code in
`mdtest/regression/pr_20962_comprehension_panics.md` no longer panics.
## Test Plan
`corpus\cyclic_symbol_in_comprehension.py` is added.
New tests are added in `mdtest/comprehensions/basic.md`.
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
Adds type inference for list/dict/set comprehensions, including
bidirectional inference:
```py
reveal_type({k: v for k, v in [("a", 1), ("b", 2)]}) # dict[Unknown | str, Unknown | int]
squares: list[int | None] = [x for x in range(10)]
reveal_type(squares) # list[int | None]
```
## Ecosystem impact
I did spot check the changes and most of them seem like known
limitations or true positives. Without proper bidirectional inference,
we saw a lot of false positives.
## Test Plan
New Markdown tests
## Summary
For PEP 695 generic functions and classes, there is an extra "type
params scope" (a child of the outer scope, and wrapping the body scope)
in which the type parameters are defined; class bases and function
parameter/return annotations are resolved in that type-params scope.
This PR fixes some longstanding bugs in how we resolve name loads from
inside these PEP 695 type parameter scopes, and also defers type
inference of PEP 695 typevar bounds/constraints/default, so we can
handle cycles without panicking.
We were previously treating these type-param scopes as lazy nested
scopes, which is wrong. In fact they are eager nested scopes; the class
`C` here inherits `int`, not `str`, and previously we got that wrong:
```py
Base = int
class C[T](Base): ...
Base = str
```
But certain syntactic positions within type param scopes (typevar
bounds/constraints/defaults) are lazy at runtime, and we should use
deferred name resolution for them. This also means they can have cycles;
in order to handle that without panicking in type inference, we need to
actually defer their type inference until after we have constructed the
`TypeVarInstance`.
PEP 695 does specify that typevar bounds and constraints cannot be
generic, and that typevar defaults can only reference prior typevars,
not later ones. This reduces the scope of (valid from the type-system
perspective) cycles somewhat, although cycles are still possible (e.g.
`class C[T: list[C]]`). And this is a type-system-only restriction; from
the runtime perspective an "invalid" case like `class C[T: T]` actually
works fine.
I debated whether to implement the PEP 695 restrictions as a way to
avoid some cycles up-front, but I ended up deciding against that; I'd
rather model the runtime name-resolution semantics accurately, and
implement the PEP 695 restrictions as a separate diagnostic on top.
(This PR doesn't yet implement those diagnostics, thus some `# TODO:
error` in the added tests.)
Introducing the possibility of cyclic typevars made typevar display
potentially stack overflow. For now I've handled this by simply removing
typevar details (bounds/constraints/default) from typevar display. This
impacts display of two kinds of types. If you `reveal_type(T)` on an
unbound `T` you now get just `typing.TypeVar` instead of
`typing.TypeVar("T", ...)` where `...` is the bound/constraints/default.
This matches pyright and mypy; pyrefly uses `type[TypeVar[T]]` which
seems a bit confusing, but does include the name. (We could easily
include the name without cycle issues, if there's a syntax we like for
that.)
It also means that displaying a generic function type like `def f[T:
int](x: T) -> T: ...` now displays as `f[T](x: T) -> T` instead of `f[T:
int](x: T) -> T`. This matches pyright and pyrefly; mypy does include
bound/constraints/defaults of typevars in function/callable type
display. If we wanted to add this, we would either need to thread a
visitor through all the type display code, or add a `decycle` type
transformation that replaced recursive reoccurrence of a type with a
marker.
## Test Plan
Added mdtests and modified existing tests to improve their correctness.
After this PR, there's only a single remaining py-fuzzer seed in the
0-500 range that panics! (Before this PR, there were 10; the fuzzer
likes to generate cyclic PEP 695 syntax.)
## Ecosystem report
It's all just the changes to `TypeVar` display.