The `Display` implementation for constraint sets is brittle, and
deserves a rethink. But later! It's perfectly fine for printf debugging;
we just shouldn't be writing mdtests that depend on any particular
rendering details. Most of these tests can be replaced with an
equivalence check that actually validates that the _behavior_ of two
constraint sets is identical.
## Summary
This is another small refactor for
https://github.com/astral-sh/ruff/pull/21445 that splits the single
`paramspec.md` into `generics/legacy/paramspec.md` and
`generics/pep695/paramspec.md`.
## Test Plan
Make sure that all mdtests pass.
## Summary
Previously if an explicit specialization failed (e.g. wrong number of
type arguments or violates an upper bound) we just inferred `Unknown`
for the entire type. This actually caused us to panic on a case of a
recursive upper bound with invalid specialization; the upper bound would
oscillate indefinitely in fixpoint iteration between `Unknown` and the
given specialization. This could be fixed with a cycle recovery
function, but in this case there's a simpler fix: if we infer
`C[Unknown]` instead of `Unknown` for an invalid attempt to specialize
`C`, that allows fixpoint iteration to quickly converge, as well as
giving a more precise type inference.
Other type checkers actually just go with the attempted specialization
even if it's invalid. So if `C` has a type parameter with upper bound
`int`, and you say `C[str]`, they'll emit a diagnostic but just go with
`C[str]`. Even weirder, if `C` has a single type parameter and you say
`C[str, bytes]`, they'll just go with `C[str]` as the type. I'm not
convinced by this approach; it seems odd to have specializations
floating around that explicitly violate the declared upper bound, or in
the latter case aren't even the specialization the annotation requested.
I prefer `C[Unknown]` for this case.
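For illustration, here is a hypothetical mdtest-style sketch of the
intended behavior (class `C` is made up, and the error comments
paraphrase rather than quote the actual diagnostics):

```py
class C[T: int]: ...

x: C[str]         # error: `str` does not satisfy the upper bound `int`
y: C[int, bytes]  # error: wrong number of type arguments

reveal_type(x)  # expected: `C[Unknown]` after this PR, rather than plain `Unknown`
reveal_type(y)  # expected: `C[Unknown]` after this PR, rather than plain `Unknown`
```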
Fixing this revealed an issue with `collections.namedtuple`, which
returns `type[tuple[Any, ...]]`. Due to
https://github.com/astral-sh/ty/issues/1649 we consider that to be an
invalid specialization. So previously we returned `Unknown`; after this
PR it would be `type[tuple[Unknown]]`, leading to more false positives
from our lack of functional namedtuple support. To avoid that I added an
explicit Todo type for functional namedtuples for now.
## Test Plan
Added and updated mdtests.
The conformance suite changes have to do with `ParamSpec`, so no
meaningful signal there.
The ecosystem changes appear to be the expected effects of having more
precise type information (including occurrences of known issues such as
https://github.com/astral-sh/ty/issues/1495 ). Most effects are just
changes to types in diagnostics.
## Summary
This PR updates the explicit specialization logic to avoid using the
call machinery.
Previously, the logic used the call machinery: it converted the list of
type variables into a `Binding` with a single `Signature` in which each
type variable became a positional-only parameter, with its bounds or
constraints as the annotated type and its default type as the default
parameter value. The advantage is that no specialization-specific logic
needs to be implemented, but one disadvantage is subpar diagnostic
messages, since they are the ones specific to a function call. A more
important disadvantage is that the kind of type variable is lost in this
translation, which becomes important in #21445, where a `ParamSpec` can
be specialized with a list of types provided as a list literal.
For example,
```py
class Foo[T, **P]: ...
Foo[int, [int, str]]
```
This PR converts the logic to a simple loop using `zip_longest`, since
each type variable maps to its corresponding type argument on a 1:1
basis. Type arguments cannot be specified using keyword arguments
either; e.g., `dict[_VT=str, _KT=int]` is invalid.
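For intuition, here is a rough Python sketch of the pairing logic; the
real implementation is in Rust and operates on ty's internal types, and
the names and error handling below are purely illustrative:

```python
from itertools import zip_longest

def pair_type_arguments(type_vars: list[str], type_args: list[str]) -> dict[str, str]:
    """Pair each type variable with its type argument on a 1:1 basis."""
    specialization: dict[str, str] = {}
    for type_var, type_arg in zip_longest(type_vars, type_args):
        if type_var is None:
            raise TypeError(f"too many type arguments: unexpected {type_arg!r}")
        if type_arg is None:
            # Roughly where a typevar default would be consulted, or a
            # "too few type arguments" diagnostic emitted.
            raise TypeError(f"missing type argument for {type_var!r}")
        specialization[type_var] = type_arg
    return specialization

# e.g. for `dict[str, int]`: {"_KT": "str", "_VT": "int"}
print(pair_type_arguments(["_KT", "_VT"], ["str", "int"]))
```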
This PR also makes an initial attempt to improve the diagnostic message
to specifically target the specialization part by using words like "type
argument" instead of just "argument" and including information like the
type variable, bounds, and constraints. Further improvements can be made
by highlighting the type variable definition or the bounds / constraints
as a sub-diagnostic but I'm going to leave that as a follow-up.
## Test Plan
Update messages in existing test cases.
## Summary
Derived from #17371
Fixes astral-sh/ty#256
Fixes https://github.com/astral-sh/ty/issues/1415
Fixes https://github.com/astral-sh/ty/issues/1433
Fixes https://github.com/astral-sh/ty/issues/1524
Properly handles any kind of recursive inference and prevents panics.
---
Let me explain techniques for converging fixed-point iterations during
recursive type inference.
There are two types of type inference that naively don't converge
(causing salsa to panic): divergent type inference and oscillating type
inference.
### Divergent type inference
Divergent type inference occurs when eagerly expanding a recursive type.
A typical example is this:
```python
class C:
    def f(self, other: "C"):
        self.x = (other.x, 1)

reveal_type(C().x)  # revealed: Unknown | tuple[Unknown | tuple[Unknown | tuple[..., Literal[1]], Literal[1]], Literal[1]]
```
To solve this problem, we have already introduced `Divergent` types
(https://github.com/astral-sh/ruff/pull/20312). `Divergent` types are
treated as a kind of dynamic type [^1].
```python
Unknown | tuple[Unknown | tuple[Unknown | tuple[..., Literal[1]], Literal[1]], Literal[1]]
=> Unknown | tuple[Divergent, Literal[1]]
```
When a query function that returns a type enters a cycle, it sets
`Divergent` as the cycle initial value (instead of `Never`). Then, in
the cycle recovery function, it reduces the nesting of types containing
`Divergent` to converge.
```python
0th: Divergent
1st: Unknown | tuple[Divergent, Literal[1]]
2nd: Unknown | tuple[Unknown | tuple[Divergent, Literal[1]], Literal[1]]
=> Unknown | tuple[Divergent, Literal[1]]
```
Each cycle recovery function for each query should operate only on the
`Divergent` type originating from that query.
For this reason, while `Divergent` appears the same as `Any` to the
user, it internally carries some information: the location where the
cycle occurred. Previously, we roughly identified this by having the
scope where the cycle occurred, but with the update to salsa, functions
that create cycle initial values can now receive a `salsa::Id`
(https://github.com/salsa-rs/salsa/pull/1012). This is an opaque ID that
uniquely identifies the cycle head (the query that is the starting point
for the fixed-point iteration). `Divergent` now has this `salsa::Id`.
### Oscillating type inference
Now, another thing to consider is oscillating type inference.
Oscillating type inference arises from the fact that monotonicity is
broken. Monotonicity here means that for a query function, if it enters
a cycle, the calculation must start from a "bottom value" and progress
towards the final result with each cycle. Monotonicity breaks down in
type systems that have features like overloading and overriding.
```python
class Base:
    def flip(self) -> "Sub":
        return Sub()

class Sub(Base):
    def flip(self) -> "Base":
        return Base()

class C:
    def __init__(self, x: Sub):
        self.x = x

    def replace_with(self, other: "C"):
        self.x = other.x.flip()

reveal_type(C(Sub()).x)
```
Naive fixed-point iteration results in `Divergent -> Sub -> Base -> Sub
-> ...`, which oscillates forever without diverging or converging. To
address this, the salsa API has been modified so that the cycle recovery
function receives the value of the previous cycle
(https://github.com/salsa-rs/salsa/pull/1012).
The cycle recovery function returns the union type of the current cycle
and the previous cycle. In the above example, the result type for each
cycle is `Divergent -> Sub -> Base (= Sub | Base) -> Base`, which
converges.
The final result of oscillating type inference does not contain
`Divergent` because `Divergent` that appears in a union type can be
removed, as is clear from the expansion. This simplification is
performed at the same time as nesting reduction.
```
T | Divergent = T | (T | (T | ...)) = T
```
[^1]: In theory, it may be possible to strictly treat types containing
`Divergent` types as recursive types, but we probably shouldn't go that
deep yet. (AFAIK, there are no PEPs that specify how to handle
implicitly recursive types that aren't named by type aliases)
## Performance analysis
A happy side effect of this PR is that we've observed widespread
performance improvements!
This is likely due to the removal of the `ITERATIONS_BEFORE_FALLBACK`
and max-specialization depth trick
(https://github.com/astral-sh/ty/issues/1433,
https://github.com/astral-sh/ty/issues/1415), which means we reach a
fixed point much sooner.
## Ecosystem analysis
The changes look good overall.
You may notice changes in the converged values for recursive types;
this is because the way recursive types are normalized has changed.
Previously, types containing `Divergent` types were normalized by
replacing them with the `Divergent` type itself, but in this PR, types
with a nesting level of 2 or more that contain `Divergent` types are
normalized by replacing them with a type with a nesting level of 1. This
means that information about the non-divergent parts of recursive types
is no longer lost.
```python
# previous
tuple[tuple[Divergent, int], int] => Divergent
# now
tuple[tuple[Divergent, int], int] => tuple[Divergent, int]
```
The false positive error introduced in this PR occurs in class
definitions with self-referential base classes, such as the one below.
```python
from typing_extensions import Generic, TypeVar
T = TypeVar("T")
U = TypeVar("U")
class Base2(Generic[T, U]): ...
# TODO: no error
# error: [unsupported-base] "Unsupported class base with type `<class 'Base2[Sub2, U@Sub2]'> | <class 'Base2[Sub2[Unknown], U@Sub2]'>`"
class Sub2(Base2["Sub2", U]): ...
```
This is due to the lack of support for unions of MROs, or because cyclic
legacy generic types are not inferred as generic types early in the
query cycle.
## Test Plan
All samples listed in astral-sh/ty#256 are tested and passed without any
panic!
## Acknowledgments
Thanks to @MichaReiser for working on bug fixes and improvements to
salsa for this PR. @carljm also contributed early on to the discussion
of the query convergence mechanism proposed in this PR.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
We now use the type context for a lot of things, so re-inferring without
type context actually makes diagnostics more confusing (in most cases).
Before, we would collapse any constraint of the form `Never ≤ T ≤
object` down to the "always true" constraint set. This is correct in
terms of BDD semantics, but loses information, since "not constraining a
typevar at all" is different than "constraining a typevar to take on any
type". Once we get to specialization inference, we should fall back on
the typevar's default for the former, but not for the latter.
This is much easier to support now that we have a sequent map, since we
need to treat `¬(Never ≤ T ≤ object)` as being impossible, and prune it
when we walk through BDD paths, just like we do for other impossible
combinations.
These were added to try to make it clearer that assignability checks
will eventually return more detailed answers than true or false.
However, the constraint set display rendering is still more brittle than
I'd like it to be, and it's more trouble than it's worth to keep them
updated with semantically identical but textually different edits. The
`static_assert`s are sufficient to check correctness, and we can always
add `reveal_type` when needed for further debugging.
#21414 added the ability to create a specialization from a constraint
set. It handled mutually constrained typevars just fine, e.g. given `T ≤
int ∧ U = T` we can infer `T = int, U = int`.
But it didn't handle _nested_ constraints correctly, e.g. `T ≤ int ∧ U =
list[T]`. Now we do! This requires doing a fixed-point "apply the
specialization to itself" step to propagate the assignments of any
nested typevars, and then a cycle detection check to make sure we don't
have an infinite expansion in the specialization.
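As a rough illustration of the fixed-point step, here is a simplified
Python sketch that uses strings for types and plain text substitution;
the real implementation works on structured types and performs a proper
cycle-detection check rather than counting iterations:

```python
def substitute(ty: str, assignments: dict[str, str]) -> str:
    """Replace each typevar name occurring in `ty` with its assigned type."""
    for var, value in assignments.items():
        if var != value:
            ty = ty.replace(var, value)
    return ty

def propagate(assignments: dict[str, str], max_iterations: int = 32) -> dict[str, str]:
    """Apply the specialization to itself until nothing changes."""
    for _ in range(max_iterations):
        updated = {var: substitute(ty, assignments) for var, ty in assignments.items()}
        if updated == assignments:
            return updated  # reached a fixed point
        assignments = updated
    raise RecursionError("infinite expansion in the specialization")

# From `T ≤ int ∧ U = list[T]` we start with {"T": "int", "U": "list[T]"}
# and converge to {"T": "int", "U": "list[int]"}.
print(propagate({"T": "int", "U": "list[T]"}))
```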
This gets at an interesting nuance in our constraint set structure that
@sharkdp has asked about before. Constraint sets are BDDs, and each
internal node represents an _individual constraint_, of the form `lower
≤ T ≤ upper`. `lower` and `upper` are allowed to be other typevars, but
only if they appear "later" in the arbitrary ordering that we establish
over typevars. The main purpose of this is to avoid infinite expansion
for mutually constrained typevars.
However, that restriction doesn't help us here, because it only applies
when `lower` and `upper` _are_ typevars, not when they _contain_
typevars. That distinction is important, since it means the restriction
does not affect our expressiveness: we can always rewrite `Never ≤ T ≤
U` (a constraint on `T`) into `T ≤ U ≤ object` (a constraint on `U`).
The same is not true of `Never ≤ T ≤ list[U]` — there is no "inverse" of
`list` that we could apply to both sides to transform this into a
constraint on a bare `U`.
This patch lets us create specializations from a constraint set. The
constraint set encodes the restrictions on which types each typevar can
specialize to. Given a generic context and a constraint set, we iterate
through all of the generic context's typevars. For each typevar, we
abstract the constraint set so that it only mentions the typevar in
question (propagating derived facts if needed). We then find the "best
representative type" for the typevar given the abstracted constraint
set.
When considering the BDD structure of the abstracted constraint set,
each path from the BDD root to the `true` terminal represents one way
that the constraint set can be satisfied. (This is also one of the
clauses in the DNF representation of the constraint set's boolean
formula.) Each of those paths is the conjunction of the individual
constraints of each internal node that we traverse as we walk that path,
giving a single lower/upper bound for the path. We use the upper bound
as the "best" (i.e. "closest to `object`") type for that path.
If there are multiple paths in the BDD, they technically represent
independent possible specializations. If there's a single specialization
that satisfies all of them, we will return that as the specialization.
If not, then the constraint set is ambiguous. (This happens most often
with constrained typevars.) We could in the future turn _each_ of the
paths into separate specializations, but it's not clear what we would do
with that, so instead we just report the ambiguity as a specialization
failure.
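As a toy illustration of that last step, here is a sketch of picking a
single specialization for one typevar from the upper bounds of its
satisfying BDD paths; the real implementation walks the BDD itself and
checks whether one specialization satisfies every path, rather than
comparing bounds for equality as done below:

```python
def best_representative(path_upper_bounds: list[str]) -> str | None:
    """Return a single specialization if every path allows the same one, else None."""
    if not path_upper_bounds:
        return None  # unsatisfiable constraint set: no specialization
    candidates = set(path_upper_bounds)
    if len(candidates) == 1:
        # Every satisfying path agrees on the same "closest to object" type.
        return candidates.pop()
    # The paths represent genuinely different possible specializations
    # (common with constrained typevars); report this as ambiguous.
    return None

# `T ≤ int` has a single satisfying path with upper bound `int`:
assert best_representative(["int"]) == "int"
# a constrained typevar like `T: (str, bytes)` yields disagreeing paths:
assert best_representative(["str", "bytes"]) is None
```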
## Summary
Extends literal promotion to apply to any generic method, as opposed to
only generic class constructors. This PR also improves our literal
promotion heuristics to only promote literals in non-covariant position
in the return type, and avoid promotion if the literal is present in
non-covariant position in any argument type.
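For illustration, a hedged mdtest-style sketch of the kinds of calls
affected (the `Factory` methods are made up, and the comments describe
the intended direction rather than asserting exact revealed types):

```py
from typing import Literal

class Factory:
    def make_list[T](self, item: T) -> list[T]: ...
    def make_one[T](self, item: T) -> T: ...
    def extend[T](self, items: list[T], item: T) -> list[T]: ...

def _(f: Factory, keys: list[Literal["a"]]) -> None:
    # `T` appears in invariant (non-covariant) position in the return type
    # `list[T]`, so the literal argument is a candidate for promotion
    # (e.g. `list[str]` rather than `list[Literal["a"]]`).
    f.make_list("a")

    # `T` appears covariantly in the return type, so the literal is kept.
    f.make_one("a")

    # The literal also appears in invariant position in the argument type
    # `list[T]`, so promotion is avoided to stay compatible with `keys`.
    f.extend(keys, "a")
```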
Resolves https://github.com/astral-sh/ty/issues/1357.
## Summary
cf. https://github.com/astral-sh/ruff/pull/20962
In the following code, `foo` in the comprehension was not reported as
unresolved:
```python
# error: [unresolved-reference] "Name `foo` used when not defined"
foo
foo = [
    # no error!
    # revealed: Divergent
    reveal_type(x) for _ in () for x in [foo]
]
baz = [
    # error: [unresolved-reference] "Name `baz` used when not defined"
    # revealed: Unknown
    reveal_type(x) for _ in () for x in [baz]
]
```
In fact, this is a more serious bug than it looks: for `foo`,
[`explicit_global_symbol` is
called](6cc3393ccd/crates/ty_python_semantic/src/types/infer/builder.rs (L8052)),
causing a symbol that should actually be `Undefined` to be reported as
being of type `Divergent`.
This PR fixes this bug. As a result, the code in
`mdtest/regression/pr_20962_comprehension_panics.md` no longer panics.
## Test Plan
`corpus\cyclic_symbol_in_comprehension.py` is added.
New tests are added in `mdtest/comprehensions/basic.md`.
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
Co-authored-by: Carl Meyer <carl@astral.sh>
This manifested as an error when inferring the type of a PEP-695 generic
class via its constructor parameters:
```py
class D[T, U]:
    @overload
    def __init__(self: "D[str, U]", u: U) -> None: ...
    @overload
    def __init__(self, t: T, u: U) -> None: ...
    def __init__(self, *args) -> None: ...

# revealed: D[Unknown, str]
# SHOULD BE: D[str, str]
reveal_type(D("string"))
```
This happened because `D` is inferred to be bivariant in both `T` and
`U`. We weren't seeing this in the equivalent example for legacy
typevars, since those default to invariant. (The issue also showed up
for _covariant_ typevars, so it was not limited to bivariance.)
The underlying cause was a heuristic in our current constraint solver,
which attempts to handle situations like this:
```py
def f[T](t: T | None): ...
f(None)
```
Here, the `None` argument matches the non-typevar union element, so this
argument should not add any constraints on what `T` can specialize to.
Our previous heuristic would check for this by seeing if the argument
type is a subtype of the parameter annotation as a whole — even if it
isn't a union! That would cause us to erroneously ignore the `self`
parameter in our constructor call, since bivariant classes are
equivalent to each other, regardless of their specializations.
The quick fix is to move this heuristic "down a level", so that we only
apply it when the parameter annotation is a union. This heuristic should
go away completely 🤞 with the new constraint solver.
## Summary
This PR adds support for understanding the legacy definition and PEP 695
definition for `ParamSpec`.
This is still very initial and doesn't really implement any of the
semantics.
Part of https://github.com/astral-sh/ty/issues/157
## Test Plan
Add mdtest cases.
## Ecosystem analysis
Most of the diagnostics in `starlette` are due to the fact that ty now
understands `ParamSpec` is not a `Todo` type, so the assignability check
fails. The code looks something like:
```py
class _MiddlewareFactory(Protocol[P]):
    def __call__(self, app: ASGIApp, /, *args: P.args, **kwargs: P.kwargs) -> ASGIApp: ...  # pragma: no cover

class Middleware:
    def __init__(
        self,
        cls: _MiddlewareFactory[P],
        *args: P.args,
        **kwargs: P.kwargs,
    ) -> None:
        self.cls = cls
        self.args = args
        self.kwargs = kwargs

# ty complains that `ServerErrorMiddleware` is not assignable to `_MiddlewareFactory[P]`
Middleware(ServerErrorMiddleware, handler=error_handler, debug=debug)
```
There are multiple diagnostics where there's an attribute access on the
`Wrapped` object of `functools` which Pyright also raises:
```py
from functools import wraps
def my_decorator(f):
    @wraps(f)
    def wrapper(*args, **kwds):
        return f(*args, **kwds)

    # Pyright: Cannot access attribute "__signature__" for class "_Wrapped[..., Unknown, ..., Unknown]"
    #   Attribute "__signature__" is unknown [reportAttributeAccessIssue]
    # ty: Object of type `_Wrapped[Unknown, Unknown, Unknown, Unknown]` has no attribute `__signature__` [unresolved-attribute]
    wrapper.__signature__
    return wrapper
```
There are additional diagnostics that are due to assignability checks
failing because ty now infers the `ParamSpec` instead of using the
`Todo` type, which would always succeed. This results in a few
`no-matching-overload` diagnostics because the assignability checks
fail.
There are a few diagnostics related to
https://github.com/astral-sh/ty/issues/491 where there's a variable
which is either a bound method or a variable that's annotated with
`Callable` that doesn't contain the instance as the first parameter.
Another set of (valid) diagnostics occurs where the code hasn't
provided all the type variables. ty now raises diagnostics for these
because we include the `ParamSpec` type variable in the signature; for
example, `staticmethod[Any]`, which has two type variables.
## Summary
Splitting this one out from https://github.com/astral-sh/ruff/pull/21210. This is also something that should be made obsolete by the new constraint solver, but is easy enough to fix now.
## Summary
The solver is currently order-dependent, and will choose a supertype
over the exact type if it appears earlier in the list of constraints. We
could be smarter and try to choose the most precise subtype, but I
imagine this is something the new constraint solver will fix anyways,
and this fixes the issue showing up on
https://github.com/astral-sh/ruff/pull/21070.
This PR updates the mdtests that test how our generics solver interacts
with our new constraint set implementation. Because the rendering of a
constraint set can get long, this standardizes on putting the `revealed`
assertion on a separate line. We also add a `static_assert` test for
each constraint set to verify that they are all coerced into simple
`bool`s correctly.
This is a pure reformatting (not even a refactoring!) that changes no
behavior. I've pulled it out of #20093 to reduce the amount of effort
that will be required to review that PR.
## Summary
Derived from #20900
Implement `VarianceInferable` for `KnownInstanceType` (especially for
`KnownInstanceType::TypeAliasType`).
The variance of a type alias matches its value type. In normal usage,
type aliases are expanded to value types, so the variance of a type
alias can be obtained without implementing this. However, for example,
if we want to display the variance when hovering over a type alias, we
need to be able to obtain the variance of the type alias itself (cf.
#20900).
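For example, the variance one would expect a few (hypothetical) aliases
to inherit from their value types:

```py
from typing import Callable

type Pair[T] = tuple[T, T]              # covariant in T, like `tuple`
type Consumer[T] = Callable[[T], None]  # contravariant in T, like a callable parameter
type Cell[T] = list[T]                  # invariant in T, like `list`
```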
## Test Plan
I couldn't come up with a way to test this in mdtest, so I'm testing it
in a test submodule at the end of `types.rs`.
I also added a test to `mdtest/generics/pep695/variance.md`, but it
passes without the changes in this PR.
Generic classes are not allowed to bind or reference a typevar from an
enclosing scope:
```py
def f[T](x: T, y: T) -> None:
    class Ok[S]: ...
    # error: [invalid-generic-class]
    class Bad1[T]: ...
    # error: [invalid-generic-class]
    class Bad2(Iterable[T]): ...

class C[T]:
    class Ok1[S]: ...
    # error: [invalid-generic-class]
    class Bad1[T]: ...
    # error: [invalid-generic-class]
    class Bad2(Iterable[T]): ...
```
It does not matter if the class uses PEP 695 or legacy syntax. It does
not matter if the enclosing scope is a generic class or function. The
generic class cannot even _reference_ an enclosing typevar in its base
class list.
This PR adds diagnostics for these cases.
In addition, the PR adds better fallback behavior for generic classes
that violate this rule: any enclosing typevars are not included in the
class's generic context. (That ensures that we don't inadvertently try
to infer specializations for those typevars in places where we
shouldn't.) The `dulwich` ecosystem project has [examples of
this](d912eaaffd/dulwich/config.py (L251))
that were causing new false positives on #20677.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
Implements bidirectional type inference using function return type
annotations.
This PR was originally proposed to solve astral-sh/ty#1167, but this
does not fully resolve it on its own.
Additionally, I believe we need to allow dataclasses to generate their
own `__new__` methods, [use constructor return types for
inference](5844c0103d/crates/ty_python_semantic/src/types.rs (L5326-L5328)),
and add a mechanism to discard type narrowing like `& ~AlwaysFalsy` if
necessary (at a more general level than this PR).
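A small (hypothetical) illustration of the kind of inference this
enables:

```py
def f() -> list[int | None]:
    # The declared return type provides the type context for the returned
    # expression, so the list below can be inferred as `list[int | None]`
    # rather than `list[int]`.
    return [1, 2, 3]
```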
## Test Plan
`mdtest/bidirectional.md` is added.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Ibraheem Ahmed <ibraheem@ibraheem.ca>
## Summary
This allows us to handle self-referential bounds/constraints/defaults
without panicking.
Handles more cases from https://github.com/astral-sh/ty/issues/256
This also changes the way we infer the types of legacy TypeVars. Rather
than understanding a constructor call to `typing[_extension].TypeVar`
inside of any (arbitrarily nested) expression, and having to use a
special `assigned_to` field of the semantic index to try to best-effort
figure out what name the typevar was assigned to, we instead understand
the creation of a legacy `TypeVar` only in the supported syntactic
position (RHS of a simple un-annotated assignment with one target). In
any other position, we just infer it as creating an opaque instance of
`typing.TypeVar`. (This behavior matches all other type checkers.)
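A sketch of the distinction (the exact diagnostics, if any, aren't
shown here):

```py
from typing import TypeVar

T = TypeVar("T")  # recognized: RHS of a simple un-annotated, single-target assignment

# Not a supported syntactic position: inferred as an ordinary (opaque)
# instance of `typing.TypeVar`, matching other type checkers.
pair = (TypeVar("U"), 1)
```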
So we now special-case TypeVar creation in `TypeInferenceBuilder`, as a
special case of an assignment definition, rather than deeper inside call
binding. This does mean we re-implement slightly more of
argument-parsing, but in practice this is minimal and easy to handle
correctly.
This is easier to implement if we also make the RHS of a simple (no
unpacking) one-target assignment statement no longer a standalone
expression. Which is fine to do, because simple one-target assignments
don't need to infer the RHS more than once. This is a bonus performance
win (0-3% across various projects) and a significant memory-usage win,
since most assignment statements are simple one-target assignments,
meaning we now create many fewer standalone-expression salsa
ingredients.
This change does mean that inference of manually-constructed
`TypeAliasType` instances can no longer find its Definition in
`assigned_to`, which regresses go-to-definition for these aliases. In a
future PR, `TypeAliasType` will receive the same treatment that
`TypeVar` did in this PR (moving its special-case inference into
`TypeInferenceBuilder` and supporting it only in the correct syntactic
position, and lazily inferring its value type to support recursion),
which will also fix the go-to-definition regression. (I decided a
temporary edge-case regression is better in this case than doubling the
size of this PR.)
This PR also tightens up and fixes various aspects of the validation of
`TypeVar` creation, as seen in the tests.
We still (for now) treat all typevars as instances of `typing.TypeVar`,
even if they were created using `typing_extensions.TypeVar`. This means
we'll wrongly error on e.g. `T.__default__` on Python 3.11, even if `T`
is a `typing_extensions.TypeVar` instance at runtime. We share this
wrong behavior with both mypy and pyrefly. It will be easier to fix
after we pull in https://github.com/python/typeshed/pull/14840.
There are some issues that showed up here with typevar identity and
`MarkTypeVarsInferable`; the fix here (using the new `original` field
and `is_identical_to` methods on `BoundTypeVarInstance` and
`TypeVarInstance`) is a bit kludgy, but it can go away when we eliminate
`MarkTypeVarsInferable`.
## Test Plan
Added and updated mdtests.
### Conformance suite impact
The impact here is all positive:
* We now correctly error on a legacy TypeVar with exactly one constraint
type given.
* We now correctly error on a legacy TypeVar with both an upper bound
and constraints specified.
### Ecosystem impact
Basically none; in the setuptools case we just issue slightly different
errors on an invalid TypeVar definition, due to the modified validation
code.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
This PR adds a specialization inference special case that lets us handle
the following examples better:
```py
def f[T](t: T | None) -> T: ...
def g[T](t: T | int | None) -> T | int: ...

def _(x: str | None):
    reveal_type(f(x))  # revealed: str (previously str | None)

def _(y: str | int | None):
    reveal_type(g(y))  # revealed: str | int (previously str | int | None)
```
We already have a special case for when the formal is a union where one
element is a typevar, but it maps the entire actual type to the typevar
(as you can see in the "previously" results above).
The new special case kicks in when the actual is also a union. Now, we
filter out any actual union elements that are already subtypes of the
formal, and only bind whatever types remain to the typevar. (The `|
None` pattern appears quite often in the ecosystem results, but it's
more general and works with any number of non-typevar union elements.)
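A toy sketch of the filtering step, using strings for types and simple
equality as a stand-in for the real subtyping check (the actual
implementation operates on ty's internal `Type` values):

```python
def remaining_elements_for_typevar(
    actual_union: list[str], non_typevar_formals: list[str]
) -> list[str]:
    """Drop actual union elements already accepted by a non-typevar formal element."""
    return [
        actual
        for actual in actual_union
        if not any(actual == formal for formal in non_typevar_formals)
    ]

# For `f[T](t: T | None)` called with `str | None`, only `str` binds to `T`:
assert remaining_elements_for_typevar(["str", "None"], ["None"]) == ["str"]
# For `g[T](t: T | int | None)` called with `str | int | None`, only `str` remains:
assert remaining_elements_for_typevar(["str", "int", "None"], ["int", "None"]) == ["str"]
```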
The new constraint solver should handle this case as well, but it's
worth adding this heuristic now with the old solver because it
eliminates some false positives from the ecosystem report, and makes the
ecosystem report less noisy on the other constraint solver PRs.
## Summary
Currently we do not emit an error on this code:
```py
from ty_extensions import Not
def f[T](x: T, y: Not[T]) -> T:
    x = y
    return x
```
But we should do! `~T` should never be assignable to `T`.
This fixes a small regression introduced in
14fe1228e7 (diff-8049ab5af787dba29daa389bbe2b691560c15461ef536f122b1beab112a4b48aR1443-R1446),
where a branch that previously returned `false` was replaced with a
branch that returns `C::always_satisfiable` -- the opposite of what it
used to be! The regression occurred because we didn't have any tests for
this -- so I added some tests in this PR that fail on `main`. I only
spotted the problem because I was going through the code of
`has_relation_to_impl` with a fine-tooth comb for
https://github.com/astral-sh/ruff/pull/20602 😄
## Summary
Add two simple tests that we recently discussed with @dcreager. They
demonstrate that the `TypeMapping::MarkTypeVarsInferable` operation
really does need to keep track of the binding context.
## Test Plan
Made sure that those tests fail if we create
`TypeMapping::MarkTypeVarsInferable(None)`s everywhere.
## Summary
Modify the (external) signature of instance methods such that the first
parameter uses `Self` unless it is explicitly annotated. This allows us
to correctly type-check more code, and allows us to infer correct return
types for many functions that return `Self`. For example:
```py
from pathlib import Path
from datetime import datetime, timedelta
reveal_type(Path(".config") / ".ty")  # now Path, previously Unknown

def _(dt: datetime, delta: timedelta):
    reveal_type(dt - delta)  # now datetime, previously Unknown
```
part of https://github.com/astral-sh/ty/issues/159
## Performance
I ran benchmarks locally on `attrs`, `freqtrade` and `colour`, the
projects with the largest regressions on CodSpeed. I see much smaller
effects locally, but can definitely reproduce the regression on `attrs`.
From looking at the profiling results (on Codspeed), it seems that we
simply do more type inference work, which seems plausible, given that we
now understand much more return types (of many stdlib functions). In
particular, whenever a function uses an implicit `self` and returns
`Self` (without mentioning `Self` anywhere else in its signature), we
will now infer the correct type, whereas we would previously return
`Unknown`. This also means that we need to invoke the generics solver in
more cases. Comparing half a million lines of log output on attrs, I can
see that we do 5% more "work" (number of lines in the log), and have a
lot more `apply_specialization` events (7108 vs 4304). On freqtrade, I
see similar numbers for `apply_specialization` (11360 vs 5138 calls).
Given these results, I'm not sure if it's generally worth doing more
performance work, especially since none of the code modifications
themselves seem to be likely candidates for regressions.
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./ty_main check /home/shark/ecosystem/attrs` | 92.6 ± 3.6 | 85.9 | 102.6 | 1.00 |
| `./ty_self check /home/shark/ecosystem/attrs` | 101.7 ± 3.5 | 96.9 | 113.8 | 1.10 ± 0.06 |

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./ty_main check /home/shark/ecosystem/freqtrade` | 599.0 ± 20.2 | 568.2 | 627.5 | 1.00 |
| `./ty_self check /home/shark/ecosystem/freqtrade` | 607.9 ± 11.5 | 594.9 | 626.4 | 1.01 ± 0.04 |

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./ty_main check /home/shark/ecosystem/colour` | 423.9 ± 17.9 | 394.6 | 447.4 | 1.00 |
| `./ty_self check /home/shark/ecosystem/colour` | 426.9 ± 24.9 | 373.8 | 456.6 | 1.01 ± 0.07 |
## Test Plan
New Markdown tests
## Ecosystem report
* apprise: ~300 new diagnostics related to problematic stubs in apprise
😩
* attrs: a new true positive, since [this
function](4e2c89c823/tests/test_make.py (L2135))
is missing a `@staticmethod`?
* Some legitimate true positives
* sympy: lots of new `invalid-operator` false positives in [matrix
multiplication](cf9f4b6805/sympy/matrices/matrixbase.py (L3267-L3269))
due to our limited understanding of [generic `Callable[[Callable[[T1,
T2], T3]], Callable[[T1, T2], T3]]` "identity"
types](cf9f4b6805/sympy/core/decorators.py (L83-L84))
of decorators. This is not related to type-of-self.
## Typing conformance results
The changes are all correct, except for
```diff
+generics_self_usage.py:50:5: error[invalid-assignment] Object of type `def foo(self) -> int` is not assignable to `(typing.Self, /) -> int`
```
which is related to an assignability problem involving type variables on
both sides:
```py
from typing import Callable, Self

class CallableAttribute:
    def foo(self) -> int:
        return 0

    bar: Callable[[Self], int] = foo  # <- we currently error on this assignment
```
---------
Co-authored-by: Shaygan Hooshyari <sh.hooshyari@gmail.com>
## Summary
Fixes a bug observed by @AlexWaygood where `C[Any] <: C[object]` should
hold for a class that is covariant in its type parameter (and similar
subtyping relations involving dynamic types for other variance
configurations).
## Test Plan
New and updated Markdown tests
### Summary
This PR includes two changes, both of which are necessary to resolve
https://github.com/astral-sh/ty/issues/1196:
* For a generic class `C[T]`, we previously used `C[Unknown]` as the
upper bound of the `Self` type variable. There were two problems with
this. For one, when `Self` appeared in contravariant position, we would
materialize its upper bound to `Bottom[C[Unknown]]` (which might
simplify to `C[Never]` if `C` is covariant in `T`) when accessing
methods on `Top[C[Unknown]]`. This would result in `invalid-argument`
errors on the `self` parameter. Also, using an upper bound of
`C[Unknown]` would mean that inside methods, references to `T` would be
treated as `Unknown`. This could lead to false negatives. To fix this,
we now use `C[T]` (with a "nested" typevar) as the upper bound for
`Self` on `C[T]`.
* In order to make this work, we needed to allow assignability/subtyping
of inferable typevars to other types, since we now check assignability
of e.g. `C[int]` to `C[T]` (when checking assignability to the upper
bound of `Self`) when calling an instance-method on `C[int]` whose
`self` parameter is annotated as `self: Self` (or implicitly `Self`,
following https://github.com/astral-sh/ruff/pull/18007).
closes https://github.com/astral-sh/ty/issues/1196
closes https://github.com/astral-sh/ty/issues/1208
### Test Plan
Regression tests for both issues.
This PR adds a new `ty_extensions.ConstraintSet` class, which is used to
expose constraint sets to our mdtest framework. This lets us write a
large collection of unit tests that exercise the invariants and rewrite
rules of our constraint set implementation.
As part of this, `is_assignable_to` and friends are updated to return a
`ConstraintSet` instead of a `bool`, and we implement
`ConstraintSet.__bool__` to return when a constraint set is always
satisfied. That lets us still use
`static_assert(is_assignable_to(...))`, since the assertion will coerce
the constraint set to a bool, and also lets us
`reveal_type(is_assignable_to(...))` to see more detail about
whether/when the two types are assignable. That lets us get rid of
`reveal_when_assignable_to` and friends, since they are now redundant
with the expanded capabilities of `is_assignable_to`.
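For example, an mdtest-style sketch (the exact rendering of the
revealed constraint set is omitted here, since the display is
deliberately not something tests should depend on):

```py
from ty_extensions import is_assignable_to, static_assert

# Coercion via `__bool__` keeps the familiar assertion style working:
static_assert(is_assignable_to(bool, int))

# Revealing the result shows the underlying constraint set rather than a
# plain `bool`.
reveal_type(is_assignable_to(bool, int))
```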
## Summary
Adds support for generic PEP695 type aliases, e.g.,
```python
type A[T] = T
reveal_type(A[int]) # A[int]
```
Resolves https://github.com/astral-sh/ty/issues/677.
## Summary
Support cases like the following, where we need the generic context to
include both `Self` and `T` (not just `T`):
```py
from typing import Self
class C:
    def method[T](self: Self, arg: T): ...
C().method(1)
```
closes https://github.com/astral-sh/ty/issues/1131
## Test Plan
Added regression test
This PR adds two new `ty_extensions` functions,
`reveal_when_assignable_to` and `reveal_when_subtype_of`. These are
closely related to the existing `is_assignable_to` and `is_subtype_of`,
but instead of returning when the property (always) holds, it produces a
diagnostic that describes _when_ the property holds. (This will let us
construct mdtests that print out constraints that are not always true or
always false — though we don't currently have any instances of those.)
I did not replace _every_ occurrence of the `is_property` variants in
the mdtest suite, instead focusing on the generics-related tests where
it will be important to see the full detail of the constraint sets.
As part of this, I also updated the mdtest harness to accept the shorter
`# revealed:` assertion format for more than just `reveal_type`, and
updated the existing uses of `reveal_protocol_interface` to take
advantage of this.
"Why would you do this? This looks like you just replaced `bool` with an
overly complex trait"
Yes that's correct!
This should be a no-op refactoring. It replaces all of the logic in our
assignability, subtyping, equivalence, and disjointness methods to work
over an arbitrary `Constraints` trait instead of only working on `bool`.
The methods that `Constraints` provides look very much like what we get
from `bool`. But soon we will add a new impl of this trait, and some new
methods, that let us express "fuzzy" constraints that aren't always true
or false. (In particular, a constraint will express the upper and lower
bounds of the allowed specializations of a typevar.)
Even once we have that, most of the operations that we perform on
constraint sets will be the usual boolean operations, just on sets.
(`false` becomes empty/never; `true` becomes universe/always; `or`
becomes union; `and` becomes intersection; `not` becomes negation.) So
it's helpful to have this separate PR to refactor how we invoke those
operations without introducing the new functionality yet.
Note that we also have translations of `Option::is_some_and` and
`is_none_or`, and of `Iterator::any` and `all`, and that the `and`,
`or`, `when_any`, and `when_all` methods are meant to short-circuit,
just like the corresponding boolean operations. For constraint sets,
that depends on being able to implement the `is_always` and `is_never`
trait methods.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
`Type::TypeVar` now distinguishes whether the typevar in question is
inferable or not.
A typevar is _not inferable_ inside the body of the generic class or
function that binds it:
```py
def f[T](t: T) -> T:
    return t
```
The inferred type of `t` in the function body is `TypeVar(T,
NotInferable)`. This represents how e.g. assignability checks need to be
valid for all possible specializations of the typevar. Most of the
existing assignability/etc logic only applies to non-inferable typevars.
Outside of the function body, the typevar is _inferable_:
```py
f(4)
```
Here, the parameter type of `f` is `TypeVar(T, Inferable)`. This
represents how e.g. assignability doesn't need to hold for _all_
specializations; instead, we need to find the constraints under which
this specific assignability check holds.
This is in support of starting to perform specialization inference _as
part of_ performing the assignability check at the call site.
In the [[POPL2015][]] paper, this concept is called _monomorphic_ /
_polymorphic_, but I thought _non-inferable_ / _inferable_ would be
clearer for us.
Depends on #19784
[POPL2015]: https://doi.org/10.1145/2676726.2676991
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
For PEP 695 generic functions and classes, there is an extra "type
params scope" (a child of the outer scope, and wrapping the body scope)
in which the type parameters are defined; class bases and function
parameter/return annotations are resolved in that type-params scope.
This PR fixes some longstanding bugs in how we resolve name loads from
inside these PEP 695 type parameter scopes, and also defers type
inference of PEP 695 typevar bounds/constraints/default, so we can
handle cycles without panicking.
We were previously treating these type-param scopes as lazy nested
scopes, which is wrong. In fact they are eager nested scopes; the class
`C` here inherits `int`, not `str`, and previously we got that wrong:
```py
Base = int
class C[T](Base): ...
Base = str
```
But certain syntactic positions within type param scopes (typevar
bounds/constraints/defaults) are lazy at runtime, and we should use
deferred name resolution for them. This also means they can have cycles;
in order to handle that without panicking in type inference, we need to
actually defer their type inference until after we have constructed the
`TypeVarInstance`.
PEP 695 does specify that typevar bounds and constraints cannot be
generic, and that typevar defaults can only reference prior typevars,
not later ones. This reduces the scope of (valid from the type-system
perspective) cycles somewhat, although cycles are still possible (e.g.
`class C[T: list[C]]`). And this is a type-system-only restriction; from
the runtime perspective an "invalid" case like `class C[T: T]` actually
works fine.
I debated whether to implement the PEP 695 restrictions as a way to
avoid some cycles up-front, but I ended up deciding against that; I'd
rather model the runtime name-resolution semantics accurately, and
implement the PEP 695 restrictions as a separate diagnostic on top.
(This PR doesn't yet implement those diagnostics, thus some `# TODO:
error` in the added tests.)
Introducing the possibility of cyclic typevars made typevar display
potentially stack overflow. For now I've handled this by simply removing
typevar details (bounds/constraints/default) from typevar display. This
impacts display of two kinds of types. If you `reveal_type(T)` on an
unbound `T` you now get just `typing.TypeVar` instead of
`typing.TypeVar("T", ...)` where `...` is the bound/constraints/default.
This matches pyright and mypy; pyrefly uses `type[TypeVar[T]]` which
seems a bit confusing, but does include the name. (We could easily
include the name without cycle issues, if there's a syntax we like for
that.)
It also means that displaying a generic function type like `def f[T:
int](x: T) -> T: ...` now displays as `f[T](x: T) -> T` instead of `f[T:
int](x: T) -> T`. This matches pyright and pyrefly; mypy does include
bound/constraints/defaults of typevars in function/callable type
display. If we wanted to add this, we would either need to thread a
visitor through all the type display code, or add a `decycle` type
transformation that replaced recursive reoccurrence of a type with a
marker.
## Test Plan
Added mdtests and modified existing tests to improve their correctness.
After this PR, there's only a single remaining py-fuzzer seed in the
0-500 range that panics! (Before this PR, there were 10; the fuzzer
likes to generate cyclic PEP 695 syntax.)
## Ecosystem report
It's all just the changes to `TypeVar` display.