Quick follow-on to #17788. If there is no bound `self` parameter, we can
reuse the existing `CallArgument{,Type}s`, and we can use a straight
`Vec` instead of a `VecDeque`.
## Summary
Remove mutability in parameter types for a few functions such as
`with_self` and `try_call`. I tried the `Rc`-approach with cheap cloning
[suggest
here](https://github.com/astral-sh/ruff/pull/17733#discussion_r2068722860)
first, but it turns out we need a whole stack of prepended arguments
(there can be [both `self` *and*
`cls`](3cf44e401a/crates/red_knot_python_semantic/resources/mdtest/call/constructor.md?plain=1#L113)),
and we would need the same construct not just for `CallArguments` but
also for `CallArgumentTypes`. At that point we're cloning `VecDeque`s
anyway, so the overhead of cloning the whole `VecDeque` with all
arguments didn't seem to justify the additional code complexity.
## Benchmarks
Benchmarks on tomllib, black, jinja, isort seem neutral.
## Summary
Add the ability to detect instance attribute assignments in class
methods that are generic.
This does not address the code duplication mentioned in #16928. I can
open a ticket for this after this has been merged.
closes#16928
## Test Plan
Added regression test.
## Summary
Adds preliminary support for `NamedTuple`s, including:
* No false positives when constructing a `NamedTuple` object
* Correct signature for the synthesized `__new__` method, i.e. proper
checking of constructor calls
* A patched MRO (`NamedTuple` => `tuple`), mainly to make type inference
of named attributes possible, but also to better reflect the runtime
MRO.
All of this works:
```py
from typing import NamedTuple
class Person(NamedTuple):
id: int
name: str
age: int | None = None
alice = Person(1, "Alice", 42)
alice = Person(id=1, name="Alice", age=42)
reveal_type(alice.id) # revealed: int
reveal_type(alice.name) # revealed: str
reveal_type(alice.age) # revealed: int | None
# error: [missing-argument]
Person(3)
# error: [too-many-positional-arguments]
Person(3, "Eve", 99, "extra")
# error: [invalid-argument-type]
Person(id="3", name="Eve")
```
Not included:
* type inference for index-based access.
* support for the functional `MyTuple = NamedTuple("MyTuple", […])`
syntax
## Test Plan
New Markdown tests
## Ecosystem analysis
```
Diagnostic Analysis Report
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━┓
┃ Diagnostic ID ┃ Severity ┃ Removed ┃ Added ┃ Net Change ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━┩
│ lint:call-non-callable │ error │ 0 │ 3 │ +3 │
│ lint:call-possibly-unbound-method │ warning │ 0 │ 4 │ +4 │
│ lint:invalid-argument-type │ error │ 0 │ 72 │ +72 │
│ lint:invalid-context-manager │ error │ 0 │ 2 │ +2 │
│ lint:invalid-return-type │ error │ 0 │ 2 │ +2 │
│ lint:missing-argument │ error │ 0 │ 46 │ +46 │
│ lint:no-matching-overload │ error │ 19121 │ 0 │ -19121 │
│ lint:not-iterable │ error │ 0 │ 6 │ +6 │
│ lint:possibly-unbound-attribute │ warning │ 13 │ 32 │ +19 │
│ lint:redundant-cast │ warning │ 0 │ 1 │ +1 │
│ lint:unresolved-attribute │ error │ 0 │ 10 │ +10 │
│ lint:unsupported-operator │ error │ 3 │ 9 │ +6 │
│ lint:unused-ignore-comment │ warning │ 15 │ 4 │ -11 │
├───────────────────────────────────┼──────────┼─────────┼───────┼────────────┤
│ TOTAL │ │ 19152 │ 191 │ -18961 │
└───────────────────────────────────┴──────────┴─────────┴───────┴────────────┘
Analysis complete. Found 13 unique diagnostic IDs.
Total diagnostics removed: 19152
Total diagnostics added: 191
Net change: -18961
```
I uploaded the ecosystem full diff (ignoring the 19k
`no-matching-overload` diagnostics)
[here](https://shark.fish/diff-namedtuple.html).
* There are some new `missing-argument` false positives which come from
the fact that named tuples are often created using unpacking as in
`MyNamedTuple(*fields)`, which we do not understand yet.
* There are some new `unresolved-attribute` false positives, because
methods like `_replace` are not available.
* Lots of the `invalid-argument-type` diagnostics look like true
positives
---------
Co-authored-by: Douglas Creager <dcreager@dcreager.net>
## Summary
This PR updates the existing overload matching methods to return an
iterator of all the matched overloads instead.
This would be useful once the overload call evaluation algorithm is
implemented which should provide an accurate picture of all the matched
overloads. The return type would then be picked from either the only
matched overload or the first overload from the ones that are matched.
In an earlier version of this PR, it tried to check if using an
intersection of return types from the matched overload would help reduce
the false positives but that's not enough. [This
comment](https://github.com/astral-sh/ruff/pull/17618#issuecomment-2842891696)
keep the ecosystem analysis for that change for prosperity.
> [!NOTE]
>
> The best way to review this PR is by hiding the whitespace changes
because there are two instances where a large match expression is
indented to be inside a loop over matching overlods
>
> <img width="1207" alt="Screenshot 2025-04-28 at 15 12 16"
src="https://github.com/user-attachments/assets/e06cbfa4-04fa-435f-84ef-4e5c3c5626d1"
/>
## Test Plan
Make sure existing test cases are unaffected and no ecosystem changes.
## Summary
Model the lookup of `__new__` without going through
`Type::try_call_dunder`. The `__new__` method is only looked up on the
constructed type itself, not on the meta-type.
This now removes ~930 false positives across the ecosystem (vs 255 for
https://github.com/astral-sh/ruff/pull/17662). It introduces 30 new
false positives related to the construction of enums via something like
`Color = enum.Enum("Color", ["RED", "GREEN"])`. This is expected,
because we don't handle custom metaclass `__call__` methods. The fact
that we previously didn't emit diagnostics there was a coincidence (we
incorrectly called `EnumMeta.__new__`, and since we don't fully
understand its signature, that happened to work with `str`, `list`
arguments).
closes#17462
## Test Plan
Regression test
## Summary
Part of #15383.
As per the spec
(https://typing.python.org/en/latest/spec/overload.html#invalid-overload-definitions):
For `@staticmethod` and `@classmethod`:
> If one overload signature is decorated with `@staticmethod` or
`@classmethod`, all overload signatures must be similarly decorated. The
implementation, if present, must also have a consistent decorator. Type
checkers should report an error if these conditions are not met.
For `@final` and `@override`:
> If a `@final` or `@override` decorator is supplied for a function with
overloads, the decorator should be applied only to the overload
implementation if it is present. If an overload implementation isn’t
present (for example, in a stub file), the `@final` or `@override`
decorator should be applied only to the first overload. Type checkers
should enforce these rules and generate an error when they are violated.
If a `@final` or `@override` decorator follows these rules, a type
checker should treat the decorator as if it is present on all overloads.
## Test Plan
Update existing tests; add snapshots.
## Summary
As mentioned in the spec
(https://typing.python.org/en/latest/spec/overload.html#invalid-overload-definitions),
part of #15383:
> The `@overload`-decorated definitions must be followed by an overload
implementation, which does not include an `@overload` decorator. Type
checkers should report an error or warning if an implementation is
missing. Overload definitions within stub files, protocols, and on
abstract methods within abstract base classes are exempt from this
check.
## Test Plan
Remove TODOs from the test; create one diagnostic snapshot.
Re: #17526
## Summary
Adds tests to red knot and `linter.rs` for the semantic syntax.
Specifically add tests for `ReboundComprehensionVariable`,
`DuplicateTypeParameter`, and `MultipleCaseAssignment`.
Refactor the `test_async_comprehension_in_sync_comprehension` →
`test_semantic_error` to be more general for all semantic syntax test
cases.
## Test Plan
This is a test.
## Question
I'm happy to contribute more tests the coming days.
Should that happen here or should we merge this PR such that the
refactor `test_async_comprehension_in_sync_comprehension` →
`test_semantic_error` is available on main and others can chime in, too?
## Summary
Part of #15383, this PR adds the core infrastructure to check for
invalid overloads and adds a diagnostic to raise if there are < 2
overloads for a given definition.
### Design notes
The requirements to check the overloads are:
* Requires `FunctionType` which has the `to_overloaded` method
* The `FunctionType` **should** be for the function that is either the
implementation or the last overload if the implementation doesn't exists
* Avoid checking any `FunctionType` that are part of an overload chain
* Consider visibility constraints
This required a couple of iteration to make sure all of the above
requirements are fulfilled.
#### 1. Use a set to deduplicate
The logic would first collect all the `FunctionType` that are part of
the overload chain except for the implementation or the last overload if
the implementation doesn't exists. Then, when iterating over all the
function declarations within the scope, we'd avoid checking these
functions. But, this approach would fail to consider visibility
constraints as certain overloads _can_ be behind a version check. Those
aren't part of the overload chain but those aren't a separate overload
chain either.
<details><summary>Implementation:</summary>
<p>
```rs
fn check_overloaded_functions(&mut self) {
let function_definitions = || {
self.types
.declarations
.iter()
.filter_map(|(definition, ty)| {
// Filter out function literals that result from anything other than a function
// definition e.g., imports.
if let DefinitionKind::Function(function) = definition.kind(self.db()) {
ty.inner_type()
.into_function_literal()
.map(|ty| (ty, definition.symbol(self.db()), function.node()))
} else {
None
}
})
};
// A set of all the functions that are part of an overloaded function definition except for
// the implementation function and the last overload in case the implementation doesn't
// exists. This allows us to collect all the function definitions that needs to be skipped
// when checking for invalid overload usages.
let mut overloads: HashSet<FunctionType<'db>> = HashSet::default();
for (function, _) in function_definitions() {
let Some(overloaded) = function.to_overloaded(self.db()) else {
continue;
};
if overloaded.implementation.is_some() {
overloads.extend(overloaded.overloads.iter().copied());
} else if let Some((_, previous_overloads)) = overloaded.overloads.split_last() {
overloads.extend(previous_overloads.iter().copied());
}
}
for (function, function_node) in function_definitions() {
let Some(overloaded) = function.to_overloaded(self.db()) else {
continue;
};
if overloads.contains(&function) {
continue;
}
// At this point, the `function` variable is either the implementation function or the
// last overloaded function if the implementation doesn't exists.
if overloaded.overloads.len() < 2 {
if let Some(builder) = self
.context
.report_lint(&INVALID_OVERLOAD, &function_node.name)
{
let mut diagnostic = builder.into_diagnostic(format_args!(
"Function `{}` requires at least two overloads",
&function_node.name
));
if let Some(first_overload) = overloaded.overloads.first() {
diagnostic.annotate(
self.context
.secondary(first_overload.focus_range(self.db()))
.message(format_args!("Only one overload defined here")),
);
}
}
}
}
}
```
</p>
</details>
#### 2. Define a `predecessor` query
The `predecessor` query would return the previous `FunctionType` for the
given `FunctionType` i.e., the current logic would be extracted to be a
query instead. This could then be used to make sure that we're checking
the entire overload chain once. The way this would've been implemented
is to have a `to_overloaded` implementation which would take the root of
the overload chain instead of the leaf. But, this would require updates
to the use-def map to somehow be able to return the _following_
functions for a given definition.
#### 3. Create a successor link
This is what Pyrefly uses, we'd create a forward link between two
functions that are involved in an overload chain. This means that for a
given function, we can get the successor function. This could be used to
find the _leaf_ of the overload chain which can then be used with the
`to_overloaded` method to get the entire overload chain. But, this would
also require updating the use-def map to be able to "see" the
_following_ function.
### Implementation
This leads us to the final implementation that this PR implements which
is to consider the overloaded functions using:
* Collect all the **function symbols** that are defined **and** called
within the same file. This could potentially be an overloaded function
* Use the public bindings to get the leaf of the overload chain and use
that to get the entire overload chain via `to_overloaded` and perform
the check
This has a limitation that in case a function redefines an overload,
then that overload will not be checked. For example:
```py
from typing import overload
@overload
def f() -> None: ...
@overload
def f(x: int) -> int: ...
# The above overload will not be checked as the below function with the same name
# shadows it
def f(*args: int) -> int: ...
```
## Test Plan
Update existing mdtest and add snapshot diagnostics.
## Summary
@sharkdp and I realised in our 1:1 this morning that our control flow
for `assert` statements isn't quite accurate at the moment. Namely, for
something like this:
```py
def _(x: int | None):
assert x is None, reveal_type(x)
```
we currently reveal `None` for `x` here, but this is incorrect. In
actual fact, the `msg` expression of an `assert` statement (the
expression after the comma) will only be evaluated if the test (`x is
None`) evaluates to `False`. As such, we should be adding a constraint
of `~None` to `x` in the `msg` expression, which should simplify the
inferred type of `x` to `int` in that context (`(int | None) & ~None` ->
`int`).
## Test Plan
Mdtests added.
---------
Co-authored-by: David Peter <mail@david-peter.de>
## Summary
We were previously recording wrong reachability constraints for negative
branches. Instead of `[cond] AND (NOT [True])` below, we were recording
`[cond] AND (NOT ([cond] AND [True]))`, i.e. we were negating not just
the last predicate, but the `AND`-ed reachability constraint from last
clause. With this fix, we now record the correct constraints for the
example from #17723:
```py
def _(cond: bool):
if cond:
# reachability: [cond]
if True:
# reachability: [cond] AND [True]
pass
else:
# reachability: [cond] AND (NOT [True])
x
```
closes#17723
## Test Plan
* Regression test.
* Verified the ecosystem changes
## Summary
Part of #15383, this PR adds `is_equivalent_to` support for overloaded
callables.
This is mainly done by delegating it to the subtyping check in that two
types A and B are considered equivalent if A is a subtype of B and B is
a subtype of A.
## Test Plan
Add test cases for overloaded callables in `is_equivalent_to.md`
## Summary
Subtyping was already modeled, but assignability also needs an explicit
branch. Removes 921 ecosystem false positives.
## Test Plan
New Markdown tests.
We are currently representing type variables using a `KnownInstance`
variant, which wraps a `TypeVarInstance` that contains the information
about the typevar (name, bounds, constraints, default type). We were
previously only constructing that type for PEP 695 typevars. This PR
constructs that type for legacy typevars as well.
It also detects functions that are generic because they use legacy
typevars in their parameter list. With the existing logic for inferring
specializations of function calls (#17301), that means that we are
correctly detecting that the definition of `reveal_type` in the typeshed
is generic, and inferring the correct specialization of `_T` for each
call site.
This does not yet handle legacy generic classes; that will come in a
follow-on PR.
This is done in what appears to be the same way as Ruff: we get the CWD,
strip the prefix from the path if possible, and use that. If stripping
the prefix fails, then we print the full path as-is.
Fixes#17233
## Summary
Removes ~850 diagnostics related to assignability of callable types,
where the callable-being-assigned-to has a "Todo signature", which
should probably accept any left hand side callable/signature.
## Summary
Do not emit errors when defining `TypedDict`s:
```py
from typing_extensions import TypedDict
# No error here
class Person(TypedDict):
name: str
age: int | None
# No error for this alternative syntax
Message = TypedDict("Message", {"id": int, "content": str})
```
## Ecosystem analysis
* Removes ~ 450 false positives for `TypedDict` definitions.
* Changes a few diagnostic messages.
* Adds a few (< 10) false positives, for example:
```diff
+ error[lint:unresolved-attribute]
/tmp/mypy_primer/projects/hydra-zen/src/hydra_zen/structured_configs/_utils.py:262:5:
Type `Literal[DataclassOptions]` has no attribute `__required_keys__`
+ error[lint:unresolved-attribute]
/tmp/mypy_primer/projects/hydra-zen/src/hydra_zen/structured_configs/_utils.py:262:42:
Type `Literal[DataclassOptions]` has no attribute `__optional_keys__`
```
* New true positive
4f8263cd7f/corporate/lib/remote_billing_util.py (L155-L157)
```diff
+ error[lint:invalid-assignment]
/tmp/mypy_primer/projects/zulip/corporate/lib/remote_billing_util.py:155:5:
Object of type `RemoteBillingIdentityDict | LegacyServerIdentityDict |
None` is not assignable to `LegacyServerIdentityDict | None`
```
## Test Plan
New Markdown tests
## Summary
I remember we discussed about adding this as a property tests so here I
am.
## Test Plan
```console
❯ QUICKCHECK_TESTS=10000000 cargo test --locked --release --package red_knot_python_semantic -- --ignored types::property_tests::stable::bottom_callable_is_subtype_of_all_fully_static_callable
Finished `release` profile [optimized] target(s) in 0.10s
Running unittests src/lib.rs (target/release/deps/red_knot_python_semantic-e41596ca2dbd0e98)
running 1 test
test types::property_tests::stable::bottom_callable_is_subtype_of_all_fully_static_callable ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 233 filtered out; finished in 30.91s
```
As discussed today, this is needed to handle legacy generic classes
without having to infer the types of the class's explicit bases eagerly
at class construction time. Pulling this out into a separate PR so
there's a smaller diff to review.
This also makes our representation of generic classes and functions more
consistent — before, we had separate Rust types and enum variants for
generic/non-generic classes, but a single type for generic functions.
Now we each a single (respective) type for each.
There were very few places we were differentiation between generic and
non-generic _class literals_, and these are handled now by calling the
(salsa cached) `generic_context` _accessor function_.
Note that _`ClassType`_ is still an enum with distinct variants for
non-generic classes and specialized generic classes.
## Summary
After https://github.com/astral-sh/ruff/pull/17620 (which this PR is
based on), I was looking at other call sites of `Type::into_class_type`,
and I began to feel that _all_ of them were currently buggy due to
silently skipping unspecialized generic class literal types (though in
some cases the bug hadn't shown up yet because we don't understand
legacy generic classes from typeshed), and in every case they would be
better off if an unspecialized generic class literal were implicitly
specialized with the default specialization (which is the usual Python
typing semantics for an unspecialized reference to a generic class),
instead of silently skipped.
So I changed the method to implicitly apply the default specialization,
and added a test that previously failed for detecting metaclasses on an
unspecialized generic base.
I also renamed the method to `to_class_type`, because I feel we have a
strong naming convention where `Type::into_foo` is always a trivial
`const fn` that simply returns `Some()` if the type is of variant `Foo`
and `None` otherwise. Even the existing method (with it handling both
`GenericAlias` and `ClassLiteral`, and distinguishing kinds of
`ClassLiteral`) was stretching this convention, and the new version
definitely breaks that envelope.
## Test Plan
Added a test that failed before this PR.
## Summary
The `ClassLiteralType::inheritance_cycle` method is intended to detect
inheritance cycles that would result in cyclic MROs, emit a diagnostic,
and skip actually trying to create the cyclic MRO, falling back to an
"error" MRO instead with just `Unknown` and `object`.
This method didn't work properly for generic classes. It used
`fully_static_explicit_bases`, which filter-maps `explicit_bases` over
`Type::into_class_type`, which returns `None` for an unspecialized
generic class literal. So in a case like `class C[T](C): ...`, because
the explicit base is an unspecialized generic, we just skipped it, and
failed to detect the class as cyclically defined.
Instead, iterate directly over all `explicit_bases`, and explicitly
handle both the specialized (`GenericAlias`) and unspecialized
(`ClassLiteral`) cases, so that we check all bases and correctly detect
cyclic inheritance.
## Test Plan
Added mdtests.
## Summary
Tracked structs have some issues with fixpoint iteration in Salsa, and
there's not actually any need for this to be tracked, it should be
interned like most of our type structs.
The removed comment was probably never correct (in that we could have
disambiguated sufficiently), and is definitely not relevant now that
`TypeVarInstance` also holds its `Definition`.
## Test Plan
Existing tests.
## Summary
This PR adds special-casing for `@final` and `@override` decorator for a
similar reason as https://github.com/astral-sh/ruff/pull/17591 to
support the invalid overload check.
Both `final` and `override` are identity functions which can be removed
once `TypeVar` support is added.
## Summary
As promised, this just adds a TODO comment to document something we
discussed today that should probably be improved at some point, but
isn't a priority right now (since it's an issue that in practice would
only affect generic classes with both `__init__` and `__new__` methods,
where some typevar is bound to `Unknown` in one and to some other type
in another.)
## Summary
While adding semantic error support to red-knot, I noticed duplicate
diagnostics for code like this:
```py
# error: [invalid-syntax] "cannot use an asynchronous comprehension outside of an asynchronous function on Python 3.9 (syntax was added in 3.11)"
# error: [invalid-syntax] "`asynchronous comprehension` outside of an asynchronous function"
[reveal_type(x) async for x in AsyncIterable()]
```
Beyond the duplication, the first error message doesn't make much sense
because this syntax is _not_ allowed on Python 3.11 either.
To fix this, this PR renames the
`async-comprehension-outside-async-function` semantic syntax error to
`async-comprehension-in-sync-comprehension` and fixes the rule to avoid
applying outside of sync comprehensions at all.
## Test Plan
New linter test demonstrating the false positive. The mdtests from my red-knot
PR also reflect this change.
## Summary
This PR updates the `to_overloaded` method to use an iterative approach
instead of a recursive one.
Refer to
https://github.com/astral-sh/ruff/pull/17585#discussion_r2056804587 for
context.
The main benefit here is that it avoids calling the `to_overloaded`
function in a recursive manner which is a salsa query. So, this is a bit
hand wavy but we should also see less memory used because the cache will
only contain a single entry which should be the entire overload chain.
Previously, the recursive approach would mean that each of the function
involved in an overload chain would have a cache entry. This reduce in
memory shouldn't be too much and I haven't looked at the actual data for
it.
## Test Plan
Existing test cases should pass.
This mostly only improves things for incorrect arguments and for an
incorrect return type. It doesn't do much to improve the case where
`__bool__` isn't callable and leaves the union/other cases untouched
completely.
I picked this one because, at first glance, this _looked_ like a lower
hanging fruit. The conceptual improvement here is pretty
straight-forward: add annotations for relevant data. But it took me a
bit to figure out how to connect all of the pieces.
I wanted to use this method in other places, so I moved it
to what appears to be a God-type. I also made it slightly
more versatile: callers can ask for the entire parameter list
by omitting a specific parameter index.
## Summary
Historically we have avoided narrowing on `==` tests because in many
cases it's unsound, since subclasses of a type could compare equal to
who-knows-what. But there are a lot of types (literals and unions of
them, as well as some known instances like `None` -- single-valued
types) whose `__eq__` behavior we know, and which we can safely narrow
away based on equality comparisons.
This PR implements equality narrowing in the cases where it is sound.
The most elegant way to do this (and the way that is most in-line with
our approach up until now) would be to introduce new Type variants
`NeverEqualTo[...]` and `AlwaysEqualTo[...]`, and then implement all
type relations for those variants, narrow by intersection, and let union
and intersection simplification sort it all out. This is analogous to
our existing handling for `AlwaysFalse` and `AlwaysTrue`.
But I'm reluctant to add new `Type` variants for this, mostly because
they could end up un-simplified in some types and make types even more
complex. So let's try this approach, where we handle more of the
narrowing logic as a special case.
## Test Plan
Updated and added tests.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
Co-authored-by: Carl Meyer <carl@oddbird.net>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
Another follow-up to the unions-of-large-literals optimization. Restore
the behavior that e.g. `Literal[""] | ~Literal[""]` collapses to
`object`.
## Test Plan
Added mdtests.