## Summary
This avoids looking up `__bool__` on class `bool` for every
`Type::Instance(bool).bool()` call. 1% performance win on cold cache, 4%
win on incremental performance.
This updates the `SymbolBindings` and `SymbolDeclarations` types to use
a single smallvec of live bindings/declarations, instead of splitting
that out into separate containers for each field.
I'm seeing an 11-13% `cargo bench` performance improvement with this
locally (for both cold and incremental). I'm interested to see if
Codspeed agrees!
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
A minor cleanup that breaks up a `HashMap` of an enum into separate
`HashMap`s for each variant. (These separate fields were already how
this cache was being described in the big comment at the top of the
file!)
This is a small tweak to avoid adding the callable `Type` on the error
value itself. Namely, it's always available regardless of the error, and
it's easy to pass it down explicitly to the diagnostic generating code.
It's likely that the other `CallBindingError` variants will also want
the callable `Type` to improve diagnostics too. This way, we don't have
to duplicate the `Type` on each variant. It's just available to all of
them.
Ref https://github.com/astral-sh/ruff/pull/16239#discussion_r1962352646
## Summary
Follow up on the discussion
[here](https://github.com/astral-sh/ruff/pull/16121#discussion_r1962973298).
Replace builtin classes with custom placeholder names, which should
hopefully make the tests a bit easier to understand.
I carefully renamed things one after the other, to make sure that there
is no functional change in the tests.
We now resolve references in "eager" scopes correctly — using the
bindings and declarations that are visible at the point where the eager
scope is created, not the "public" type of the symbol (typically the
bindings visible at the end of the scope).
---------
Co-authored-by: Alex Waygood <alex.waygood@gmail.com>
This uses the refactoring and support for secondary diagnostic messages
to improve the diagnostic for "invalid argument type." The main
improvement here is that we show where the function being called is
defined, and annotate the span corresponding to the invalid parameter.
This is a small little hack to make the `Diagnostic` trait
capable of supporting attaching multiple spans.
This design should be considered transient. This was just the
quickest way that I could see to pass multiple spans through from
the type checker to the diagnostic renderer.
It seems nothing is using it, and I'm not sure if it makes semantic
sense. Particularly if we want to support multiple ranges. One could
make an argument that this ought to correspond to the "primary"
range (which we should have), but I think such a concept is better
expressed as an explicit routine if possible.
## Summary
This PR updates the formatter and linter to use the `PythonVersion`
struct from the `ruff_python_ast` crate internally. While this doesn't
remove the need for the `linter::PythonVersion` enum, it does remove the
`formatter::PythonVersion` enum and limits the use in the linter to
deserializing from CLI arguments and config files and moves most of the
remaining methods to the `ast::PythonVersion` struct.
## Test Plan
Existing tests, with some inputs and outputs updated to reflect the new
(de)serialization format. I think these are test-specific and shouldn't
affect any external (de)serialization.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
This PR does the following:
* Moves the following from `types.rs` in `symbol.rs`:
* `symbol`
* `global_symbol`
* `imported_symbol`
* `symbol_from_bindings`
* `symbol_from_declarations`
* `SymbolAndQualifiers`
* `SymbolFromDeclarationsResult`
* Moves the following from `stdlib.rs` in `symbol.rs` and removes
`stdlib.rs`:
* `known_module_symbol`
* `builtins_symbol`
* `typing_symbol` (only for tests)
* `typing_extensions_symbol`
* `builtins_module_scope`
* `core_module_scope`
* Add `symbol_from_bindings_impl` and `symbol_from_declarations_impl` to
keep `RequiresExplicitReExport` an implementation detail
* Make `declaration_type` a `pub(crate)` as it's required in
`symbol_from_declarations` (`binding_type` is already `pub(crate)`
The main motivation is to keep the implementation details private and
only expose an ergonomic API which uses sane defaults for various
scenario to avoid any mistakes from the caller. Refer to
https://github.com/astral-sh/ruff/pull/16133#discussion_r1955262772,
https://github.com/astral-sh/ruff/pull/16133#issue-2850146612 for
details.
## Summary
This PR makes the following changes:
- It adjusts various callsites to use the new
`ast::StringLiteral::contents_range()` method that was introduced in
https://github.com/astral-sh/ruff/pull/16183. This is less verbose and
more type-safe than using the `ast::str::raw_contents()` helper
function.
- It adds a new `ast::ExprStringLiteral::as_unconcatenated_literal()`
helper method, and adjusts various callsites to use it. This addresses
@MichaReiser's review comment at
https://github.com/astral-sh/ruff/pull/16183#discussion_r1957334365.
There is no functional change here, but it helps readability to make it
clearer that we're differentiating between implicitly concatenated
strings and unconcatenated strings at various points.
- It renames the `StringLiteralValue::flags()` method to
`StringLiteralFlags::first_literal_flags()`. If you're dealing with an
implicitly concatenated string `string_node`,
`string_node.value.flags().closer_len()` could give an incorrect result;
this renaming makes it clearer that the `StringLiteralFlags` instance
returned by the method is only guaranteed to give accurate information
for the first `StringLiteral` contained in the `ExprStringLiteral` node.
- It deletes the unused `BytesLiteralValue::flags()` method. This seems
prone to misuse in the same way as `StringLiteralValue::flags()`: if
it's an implicitly concatenated bytestring, the `BytesLiteralFlags`
instance returned by the method would only give accurate information for
the first `BytesLiteral` in the bytestring.
## Test Plan
`cargo test`
## Summary
Running `cargo test -p red_knot_python_semantic` failed because of a
missing serde feature. This PR enables the `ruff_python_ast`'`s `serde`
if the crate's `serde` feature is enabled
## Test Plan
`cargo test -p red_knot_python_semantic` compiles again
When adjusting the existing tests, I aimed to avoid dealing with the
special case in other tests if it's not necessary to do so (that is,
avoid using `float` and `complex` as examples where we just need "some
type"), and keep the tests for the special case mostly collected in the
mdtest dedicated to that purpose.
Fixes https://github.com/astral-sh/ruff/issues/14932
## Summary
This PR moves the `PythonVersion` struct from the
`red_knot_python_semantic` crate to the `ruff_python_ast` crate so that
it can be used more easily in the syntax error detection work. Compared
to that [prototype](https://github.com/astral-sh/ruff/pull/16090/) these
changes reduce us from 2 `PythonVersion` structs to 1.
This does not unify any of the `PythonVersion` *enums*, but I hope to
make some progress on that in a follow-up.
## Test Plan
Existing tests, this should not change any external behavior.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
This PR refactors the symbol lookup APIs to better facilitate the
re-export implementation. Specifically,
* Add `module_type_symbol` which returns the `Symbol` that's a member of
`types.ModuleType`
* Rename `symbol` -> `symbol_impl`; add `symbol` which delegates to
`symbol_impl` with `RequireExplicitReExport::No`
* Update `global_symbol` to do `symbol_impl` -> fall back to
`module_type_symbol` and default to `RequireExplicitReExport::No`
* Add `imported_symbol` to do `symbol_impl` with
`RequireExplicitReExport` as `Yes` if the module is in a stub file else
`No`
* Update `known_module_symbol` to use `imported_symbol` with a fallback
to `module_type_symbol`
* Update `ModuleLiteralType::member` to use `imported_symbol` with a
custom fallback
We could potentially also update `symbol_from_declarations` and
`symbol_from_bindings` to avoid passing in the `RequireExplicitReExport`
as it would be always `No` if called directly. We could add
`symbol_from_declarations_impl` and `symbol_from_bindings_impl`.
Looking at the `_impl` functions, I think we should move all of these
symbol related logic into `symbol.rs` where `Symbol` is defined and the
`_impl` could be private while we expose the public APIs at the crate
level. This would also make the `RequireExplicitReExport` an
implementation detail and the caller doesn't need to worry about it.
This is an alternative implementation to #15848.
## Summary
This PR adds support for re-export conventions for imports for stub
files.
**How does this work?**
* Add a new flag on the `Import` and `ImportFrom` definitions to
indicate whether they're being exported or not
* Add a new enum to indicate whether the symbol lookup is happening
within the same file or is being queried from another file (e.g., an
import statement)
* When a `Symbol` is being queried, we'll skip the definitions that are
(a) coming from a stub file (b) external lookup and (c) check the
re-export flag on the definition
This implementation does not yet support `__all__` and `*` imports as
both are features that needs to be implemented independently.
closes: #14099closes: #15476
## Test Plan
Add test cases, update existing ones if required.
## Summary
After I was asked twice within the same day, I thought it would be a
good idea to write some *user facing* documentation that explains our
reasoning behind inferring `Unknown | T_inferred` for public uses of
undeclared symbols. This is a major deviation from the behavior of other
type checkers and it seems like a good practice to defend our choice
like this.
This essentially makes it impossible to construct a `Diagnostic`
that has a `TextRange` but no `File`.
This is meant to be a precursor to multi-span support.
(Note that I consider this more of a prototyping-change and not
necessarily what this is going to look like longer term.)
Reviewers can probably review this PR as one big diff instead of
commit-by-commit.
## Summary
This is a follow up to
https://github.com/astral-sh/ruff/pull/15763#discussion_r1949681336
It reverts the change to using ptr equality for `AstNodeRef`s, which in
turn removes the `Eq`, `PartialEq`, and `Hash` implementations for
`AstNodeRef`s parametrized with AST nodes.
Cheap comparisons shouldn't be needed because the node field is
generally marked as `[#tracked]` and `#[no_eq]` and removing the
implementations even enforces that those
attributes are set on all `AstNodeRef` fields (which is good).
The only downside this has is that we technically wouldn't have to mark
the `Unpack::target` as `#[tracked]` because
the `target` field is accessed in every query accepting `Unpack` as an
argument.
Overall, enforcing the use of `#[tracked]` seems like a good trade off,
espacially considering that it's very likely that
we'd probably forget to mark the `Unpack::target` field as tracked if we
add a new `Unpack` query that doesn't access the target.
## Test Plan
`cargo test`
## Summary
Transition to using coarse-grained tracked structs (depends on
https://github.com/salsa-rs/salsa/pull/657). For now, this PR doesn't
add any `#[tracked]` fields, meaning that any changes cause the entire
struct to be invalidated. It also changes `AstNodeRef` to be
compared/hashed by pointer address, instead of performing a deep AST
comparison.
## Test Plan
This yields a 10-15% improvement on my machine (though weirdly some runs
were 5-10% without being flagged as inconsistent by criterion, is there
some non-determinism involved?). It's possible that some of this is
unrelated, I'll try applying the patch to the current salsa version to
make sure.
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
- Simplify unions with `object` to `object`.
- Add a new `Type::object(db)` constructor to abbreviate
`KnownClass::Object.to_instance(db)` in some places.
- Add a `Type::is_object` and `Class::is_object` function to make some
tests for a bit easier to read.
closes#16084
## Test Plan
New Markdown tests.
## Summary
This PR adds support for user-level configurations
(`~/.config/knot/knot.toml`) to Red Knot.
Red Knot will watch the user-level configuration file for changes but
only if it exists
when the process start. It doesn't watch for new configurations,
mainly to simplify things for now (it would require watching the entire
`.config` directory because the `knot` subfolder might not exist
either).
The new `ConfigurationFile` struct seems a bit overkill for now but I
plan to use it for
hierarchical configurations as well.
Red Knot uses the same strategy as uv and Ruff by using the etcetera
crate.
## Test Plan
Added CLI and file watching test
## Summary
This PR adds a new `user_configuration_directory` method to `System`. We
need it to resolve where to lookup a user-level `knot.toml`
configuration file.
The method belongs to `System` because not all platforms have a
convention of where to store such configuration files (e.g. wasm).
I refactored `TestSystem` to be a simple wrapper around an `Arc<dyn
System...>` and use the `System.as_any` method instead to cast it down
to an `InMemory` system. I also removed some `System` specific methods
from `InMemoryFileSystem`, they don't belong there.
This PR removes the `os` feature as a default feature from `ruff_db`.
Most crates depending on `ruff_db` don't need it because they only
depend on `System` or only depend on `os` for testing. This was
necessary to fix a compile error with `red_knot_wasm`
## Test Plan
I'll make use of the method in my next PR. So I guess we won't know if
it works before then but I copied the code from Ruff/uv, so I have high
confidence that it is correct.
`cargo test`
## Summary
This PR generalize the idea that we may want to emit diagnostics for
invalid or incompatible configuration values similar to how we already
do it for `rules`.
This PR introduces a new `Settings` struct that is similar to `Options`
but, unlike
`Options`, are fields have their default values filled in and they use a
representation optimized for reads.
The diagnostics created during loading the `Settings` are stored on the
`Project` so that we can emit them when calling `check`.
The motivation for this work is that it simplifies adding new settings.
That's also why I went ahead and added the `terminal.error-on-warning`
setting to demonstrate how new settings are added.
## Test Plan
Existing tests, new CLI test.
## Summary
No functional change here; this is another simplification split out from
my outcome-refactor branch to reduce the diff there. This merges
`TypeInferenceBuilder::infer_name_load` and
`TypeInferenceBuilder::lookup_name`. This removes the need to have
extensive doc-comments about the purpose of
`TypeInferenceBuilder::lookup_name`, since the method only makes sense
when called from the specific context of
`TypeInferenceBuilder::infer_name_load`.
## Test Plan
`cargo test -p red_knot_python_semantic`
## Summary
- Do not return `Option<Type<…>>` from `Unpacker::get`, but just `Type`.
Panic otherwise.
- Rename `Unpacker::get` to `Unpacker::expression_type`
## Summary
* Support assignments to attributes in more cases:
- assignments in `for` loops
- in unpacking assignments
* Add test for multi-target assignments
* Add tests for all other possible assignments to attributes that could
possibly occur (in decreasing order of likeliness):
- augmented attribute assignments
- attribute assignments in `with` statements
- attribute assignments in comprehensions
- Note: assignments to attributes in named expressions are not
syntactically allowed
closes#15962
## Test Plan
New Markdown tests
## Summary
This PR reverts the behavior changes from
https://github.com/astral-sh/ruff/pull/15990
But it isn't just a revert, it also:
* Adds a test covering this specific behavior
* Preserves the improvement to use `saturating_sub` in the package case
to avoid overflows in the case of invalid syntax
* Use `ancestors` instead of a `for` loop
## Test Plan
Added test
## Summary
Adds a JSON schema generation step for Red Knot. This PR doesn't yet add
a publishing step because it's still a bit early for that
## Test plan
I tested the schema in Zed, VS Code and PyCharm:
* PyCharm: You have to manually add a schema mapping (settings JSON
Schema Mappings)
* Zed and VS code support the inline schema specification
```toml
#:schema /Users/micha/astral/ruff/knot.schema.json
[environment]
extra-paths = []
[rules]
call-possibly-unbound-method = "error"
unknown-rule = "error"
# duplicate-base = "error"
```
```json
{
"$schema": "file:///Users/micha/astral/ruff/knot.schema.json",
"environment": {
"python-version": "3.13",
"python-platform": "linux2"
},
"rules": {
"unknown-rule": "error"
}
}
```
https://github.com/user-attachments/assets/a18fcd96-7cbe-4110-985b-9f1935584411
The Schema overall works but all editors have their own quirks:
* PyCharm: Hovering a name always shows the section description instead
of the description of the specific setting. But it's the same for other
settings in `pyproject.toml` files 🤷
* VS Code (JSON): Using the generated schema in a JSON file gives
exactly the experience I want
* VS Code (TOML):
* Properties with multiple possible values are repeated during
auto-completion without giving any hint how they're different. 
* The property description mushes together the description of the
property and the value, which looks sort of ridiculous. 
* Autocompletion and documentation hovering works (except the
limitations mentioned above)
* Zed:
* Very similar to VS Code with the exception that it uses the
description attribute to distinguish settings with multiple possible
values 
I don't think there's much we can do here other than hope (or help)
editors improve their auto completion. The same short comings also apply
to ruff, so this isn't something new. For now, I think this is good
enough
## Summary
I noticed that the diagnostic range in specific unpacking assignments is
wrong. For this example
```py
a, b = 1
```
we previously got (see first commit):
```
error: lint:not-iterable
--> /src/mdtest_snippet.py:1:1
|
1 | a, b = 1
| ^^^^ Object of type `Literal[1]` is not iterable
|
```
and with this change, we get:
```
error: lint:not-iterable
--> /src/mdtest_snippet.py:1:8
|
1 | a, b = 1
| ^ Object of type `Literal[1]` is not iterable
|
```
## Test Plan
New snapshot tests.
## Summary
Fixes https://github.com/astral-sh/ruff/issues/15989
Red Knot failed to resolve relative imports if the importing module is
located at a search path root.
The issue was that the module resolver returned an `Err(TooManyDots)` as
soon as the parent of the current module is `None` (which is the case
for a module at the search path root).
However, this is incorrect if a `tail` (a module name) exists.
## Summary
- Minor wording update
- Code improvement (thanks Alex)
- Removed all unnecessary filenames throughout our Markdown tests (two
new ones were added in the meantime)
- Minor rewording of the statically-known-branches introduction
This example from @sharkdp shows how terminal statements can appear in
statically known branches:
https://github.com/astral-sh/ruff/pull/15676#issuecomment-2618809716
```py
def _(cond: bool):
x = "a"
if cond:
x = "b"
if True:
return
reveal_type(x) # revealed: "a", "b"; should be "a"
```
We now use visibility constraints to track reachability, which allows us
to model this correctly. There are two related changes as a result:
- New bindings are not assumed to be visible; they inherit the current
"scope start" visibility, which effectively means that new bindings are
visible if/when the current flow is reachable
- When simplifying visibility constraints after branching control flow,
we only simplify if none of the intervening branches included a terminal
statement. That is, earlier unaffected bindings are only _actually_
unaffected if all branches make it to the merge point.
## Summary
Allow for literate style in Markdown tests and merge multiple (unnamed)
code blocks into a single embedded file.
closes#15941
## Test Plan
- Interactively made sure that error-lines were reported correctly in
multi-snippet sections.
This causes the diagnostic to highlight the actual unresovable import
instead of the entire `from ... import ...` statement.
While we're here, we expand the test coverage to cover all of the
possible ways that an `import` or a `from ... import` can fail.
Some considerations:
* The first commit in this PR adds a regression test for the current
behavior.
* This creates a new `mdtest/diagnostics` directory. Are folks cool
with this? I guess the idea is to put tests more devoted to diagnostics
than semantics in this directory. (Although I'm guessing there will
be some overlap.)
Fixes#15866
## Summary
This is a first step towards creating a test suite for
[descriptors](https://docs.python.org/3/howto/descriptor.html). It does
not (yet) aim to be exhaustive.
relevant ticket: #15966
## Test Plan
Compared desired behavior with the runtime behavior and the behavior of
existing type checkers.
---------
Co-authored-by: Mike Perlov <mishamsk@gmail.com>
This ties together everything from the previous commits.
Some interesting bits here are how the snapshot is generated
(where we include relevant info to make it easier to review
the snapshots) and also a tweak to how inline assertions are
processed.
This commit also includes some example snapshots just to get
a sense of what they look like. Follow-up work should add
more of these I think.
This makes it possible for callers to set where snapshots
should be stored. In general, I think we expect this to
always be set, since otherwise snapshots will end up in
`red_knot_test`, which is where the tests are actually run.
But that's overall counter-intuitive. This permits us to
store snapshots from mdtests alongside the mdtests themselves.
## Summary
This PR adds `Type::call_bound` method for calls that should follow
descriptor protocol calling convention. The PR is intentionally shallow
in scope and only fixes#15672
Couple of obvious things that weren't done:
* Switch to `call_bound` everywhere it should be used
* Address the fact, that red_knot resolves `__bool__ = bool` as a Union,
which includes `Type::Dynamic` and hence fails to infer that the
truthiness is always false for such a class (I've added a todo comment
in mdtests)
* Doesn't try to invent a new type for descriptors, although I have a
gut feeling it may be more convenient in the end, instead of doing
method lookup each time like I did in `call_bound`
## Test Plan
* extended mdtests with 2 examples from the issue
* cargo neatest run
We now use ternary decision diagrams (TDDs) to represent visibility
constraints. A TDD is just like a BDD ([_binary_ decision
diagram](https://en.wikipedia.org/wiki/Binary_decision_diagram)), but
with "ambiguous" as an additional allowed value. Unlike the previous
representation, TDDs are strongly normalizing, so equivalent ternary
formulas are represented by exactly the same graph node, and can be
compared for equality in constant time.
We currently have a slight 1-3% performance regression with this in
place, according to local testing. However, we also have a _5× increase_
in performance for pathological cases, since we can now remove the
recursion limit when we evaluate visibility constraints.
As follow-on work, we are now closer to being able to remove the
`simplify_visibility_constraint` calls in the semantic index builder. In
the vast majority of cases, we now see (for instance) that the
visibility constraint after an `if` statement, for bindings of symbols
that weren't rebound in any branch, simplifies back to `true`. But there
are still some cases we generate constraints that are cyclic. With
fixed-point cycle support in salsa, or with some careful analysis of the
still-failing cases, we might be able to remove those.
## Summary
I experimented with [not trimming trailing newlines in code
snippets](https://github.com/astral-sh/ruff/pull/15926#discussion_r1940992090),
but since came to the conclusion that the current behavior is better
because otherwise, there is no way to write snippets without a trailing
newline at all. And when you copy the code from a Markdown snippet in
GitHub, you also don't get a trailing newline.
I was surprised to see some test failures when I played with this
though, and decided to make this test independent from this
implementation detail.
## Summary
This is a follow-up to #15726, #15778, and #15794 to preserve the triple
quote and prefix flags in plain strings, bytestrings, and f-strings.
I also added a `StringLiteralFlags::without_triple_quotes` method to
avoid passing along triple quotes in rules like SIM905 where it might
not make sense, as discussed
[here](https://github.com/astral-sh/ruff/pull/15726#discussion_r1930532426).
## Test Plan
Existing tests, plus many new cases in the `generator::tests::quote`
test that should cover all combinations of quotes and prefixes, at least
for simple string bodies.
Closes#7799 when combined with #15694, #15726, #15778, and #15794.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
Resolves#15695, rework of #15704.
This change modifies the Mdtests framework so that:
* Paths must now be specified in a separate preceding line:
`````markdown
`a.py`:
```py
x = 1
```
`````
If the path of a file conflicts with its `lang`, an error will be
thrown.
* Configs are no longer accepted. The pattern still take them into
account, however, to avoid "Unterminated code block" errors.
* Unnamed files are now assigned unique, `lang`-respecting paths
automatically.
Additionally, all legacy usages have been updated.
## Test Plan
Unit tests and Markdown tests.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
This extracts some pure refactoring noise from
https://github.com/astral-sh/ruff/pull/15861. This changes the API for
creating and evaluating visibility constraints, but does not change how
they are respresented internally. There should be no behavioral or
performance changes in this PR.
Changes:
- Hide the internal representation isn't changed, so that we can make
changes to it in #15861.
- Add a separate builder type for visibility constraints. (With TDDs, we
will have some additional builder state that we can throw away once
we're done constructing.)
- Remove a layer of helper methods from `UseDefMapBuilder`, making
`SemanticIndexBuilder` responsible for constructing whatever visibility
constraints it needs.
## Summary
Add support for implicitly-defined instance attributes, i.e. support
type inference for cases like this:
```py
class C:
def __init__(self) -> None:
self.x: int = 1
self.y = None
reveal_type(C().x) # int
reveal_type(C().y) # Unknown | None
```
## Benchmarks
Codspeed reports no change in a cold-cache benchmark, and a -1%
regression in the incremental benchmark. On `black`'s `src` folder, I
don't see a statistically significant difference between the branches:
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./red_knot_main check --project /home/shark/black/src` | 133.7 ± 9.5 | 126.7 | 164.7 | 1.01 ± 0.08 |
| `./red_knot_feature check --project /home/shark/black/src` | 132.2 ± 5.1 | 118.1 | 140.9 | 1.00 |
## Test Plan
Updated and new Markdown tests
## Summary
Related to #15848, this PR adds the imports explicitly as we'll now flag
these symbols as undefined.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
This mimics a simplification we have on the OR side, where we simplify
`A ∨ !A` to true. This requires changes to how we add `while` statements
to the semantic index, since we now need distinct
`VisibilityConstraint`s if we need to model evaluating a `Constraint`
multiple times at different points in the execution of the program.
Something Alex and I threw together during our 1:1 this morning. Allows
us to collect statistics on the prevalence of various types in a file,
most usefully TODO types or other dynamic types.
`FlowSnapshot` now tracks a `reachable` bool, which indicates whether we
have encountered a terminal statement on that control flow path. When
merging flow states together, we skip any that have been marked
unreachable. This ensures that bindings that can only be reached through
unreachable paths are not considered visible.
## Test Plan
The new mdtests failed (with incorrect `reveal_type` results, and
spurious `possibly-unresolved-reference` errors) before adding the new
visibility constraints.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
When we discussed the plan on how to proceed with instance attributes,
we said that we should first extend our research into the behavior of
existing type checkers. The result of this research is summarized in the
newly added / modified tests in this PR. The TODO comments align with
existing behavior of other type checkers. If we deviate from the
behavior, it is described in a comment.
## Summary
Adds a slightly more comprehensive documentation of our behavior
regarding type inference for public uses of symbols. In particular:
- What public type do we infer for `x: int = any()`?
- What public type do we infer for `x: Unknown = 1`?
## Summary
On `main`, red-knot:
- Considers `P | Q` equivalent to `Q | P`
- Considered `tuple[P | Q]` equivalent to `tuple[Q | P]`
- Considers `tuple[P | tuple[P | Q]]` equivalent to `tuple[tuple[Q | P]
| P]`
- ‼️ Does _not_ consider `tuple[tuple[P | Q]]` equivalent to
`tuple[tuple[Q | P]]`
The key difference for the last one of these is that the union appears
inside a tuple that is directly nested inside another tuple.
This PR fixes this so that differently ordered unions are considered
equivalent even when they appear inside arbitrarily nested tuple types.
## Test Plan
- Added mdtests that fails on `main`
- Checked that all property tests continue to pass with this PR
This is a follow-up to #15702 that hopefully claws back the 1%
performance regression. Assuming it works, the trick is to iterate over
the constraints vectors via mut reference (aka a single pointer), so
that we're not copying `BitSet`s into and out of the zip tuples as we
iterate. We use `std::mem::take` as a poor-man's move constructor only
at the very end, when we're ready to emplace it into the result. (C++
idioms intended! 😄)
With local testing via hyperfine, I'm seeing this be 1-3% faster than
`main` most of the time — though a small number of runs (1 in 10,
maybe?) are a wash or have `main` faster. Codspeed reports a 2%
gain.
## Summary
Use `Unknown | T_inferred` as the type for *undeclared* public symbols.
## Test Plan
- Updated existing tests
- New test for external `__slots__` modifications.
- New tests for external modifications of public symbols.
## Summary
Another small PR to focus #15674 solely on the relevant changes. This
makes our Markdown tests less dependent on precise types of public
symbols, without actually changing anything semantically in these tests.
Best reviewed using ignore-whitespace-mode.
## Test Plan
Tested these changes on `main` and on the branch from #15674.
## Summary
Make the remaining `infer.rs` unit tests independent from public symbol
type inference decisions (see upcoming change in #15674).
## Test Plan
- Made sure that the unit tests actually fail if one of the
`assert_type` assertions is changed.
## Summary
Port comprehension tests from Rust to Markdown
I don' think the remaining tests in `infer.rs` should be ported to
Markdown, maybe except for the incremental-checking tests when (if ever)
we have support for that in the MD tests.
closes#13696
## Summary
- Port "deferred annotations" unit tests to Markdown
- Port `implicit_global_in_function` unit test to Markdown
- Removed `resolve_method` and `local_inference` unit tests. These seem
like relics from a time where type inference was in it's early stages.
There is no way that these tests would fail today without lots of other
things going wrong as well.
part of #13696
based on #15683
## Test Plan
New MD tests for existing Rust unit tests.
## Summary
- Add feature to specify a custom typeshed from within Markdown-based
tests
- Port "builtins" unit tests from `infer.rs` to Markdown tests, part of
#13696
## Test Plan
- Tests for the custom typeshed feature
- New Markdown tests for deleted Rust unit tests
## Summary
Somehow, I managed to crash the `mdtest` runner today. I struggled to
reproduce this again to see if it's actually fixed (even with an
artificial `sleep` between the two `cargo test` invocations), but the
original backtrace clearly showed that this is where the problem
originated from. And it seems like a clear TOCTOU problem.
## Summary
Raise "invalid-assignment" diagnostics for incorrect assignments to
attributes, for example:
```py
class C:
var: str = "a"
C.var = 1 # error: "Object of type `Literal[1]` is not assignable to `str`"
```
closes#15456
## Test Plan
- Updated test assertions
- New test for assignments to module-attributes
## Summary
This PR generalizes some of the logic we have in `Type::is_subtype_of`
and `Type::is_disjoint_from` so that we fallback to the instance type of
the metaclass more often in `Type::ClassLiteral` and `Type::SubclassOf`
branches. This simplifies the code (we end up with one less branch in
`is_subtype_of`, and we can remove a helper method that's no longer
used), makes the code more robust (any fixes made to subtyping or
disjointness of instance types will automatically improve our
understanding of subtyping/disjointness for class-literal types and
`type[]` types) and more elegantly expresses the type-system invariants
encoded in these branches.
## Test Plan
No new tests added (it's a pure refactor, adding no new functionality).
All existing tests pass, however, including the property tests.
The AST generator creates a reference enum for each syntax group — an
enum where each variant contains a reference to the relevant syntax
node. Previously you could customize the name of the reference enum for
a group — primarily because there was an existing `ExpressionRef` type
that wouldn't have lined up with the auto-derived name `ExprRef`. This
follow-up PR is a simple search/replace to switch over to the
auto-derived name, so that we can remove this customization point.
## Summary
Test executables usually write failure messages (including panics) to
stdout, but I just managed to make a mdtest crash with
```
thread 'mdtest__unary_not' has overflowed its stack
fatal runtime error: stack overflow
```
which is printed to stderr. This test simply appends stderr to stdout
(`stderr=subprocess.STDOUT` can not be used with `capture_output`)
## Test Plan
Make sure that the error message is now visible in the output of `uv -q
run crates/red_knot_python_semantic/mdtest.py`
## Summary
The `Options` struct is intended to capture the user's configuration
options but
`EnvironmentOptions::venv_path` supports both a `SitePackages::Known`
and `SitePackages::Derived`.
Users should only be able to provide `SitePackages::Derived`—they
specify a path to a venv, and Red Knot derives the path to the
site-packages directory. We'll only use the `Known` variant once we
automatically discover the Python installation.
That's why this PR changes `EnvironmentOptions::venv_path` from
`Option<SitePackages>` to `Option<SystemPathBuf>`.
This requires making some changes to the file watcher test, and I
decided to use `extra_paths` over venv path
because our venv validation is annoyingly correct -- making mocking a
venv rather involved.
## Test Plan
`cargo test`
## Summary
As more and more tests move to Markdown, running the mdtest suite
becomes one of the most common tasks for developers working on Red Knot.
There are a few pain points when doing so, however:
- The `build.rs` script enforces recompilation (~five seconds) whenever
something changes in the `resource/mdtest` folder. This is strictly
necessary, because whenever files are added or removed, the test harness
needs to be updated. But this is very rarely the case! The most common
scenario is that a Markdown file has *changed*, and in this case, no
recompilation is necessary. It is currently not possible to distinguish
these two cases using `cargo::rerun-if-changed`. One can work around
this by running the test executable manually, but it requires finding
the path to the correct `mdtest-<random-hash>` executable.
- All Markdown tests are run by default. This is needed whenever Rust
code changes, but while working on the tests themselves, it is often
much more convenient to only run the tests for a single file. This can
be done by using a `mdtest__path_to_file` filter, but this needs to be
manually spelled out or copied from the test output.
- `cargo`s test output for a failing Markdown test is often
unnecessarily verbose. Unless there is an *actual* panic somewhere in
the code, mdtests usually fail with the explicit *"Some tests failed"*
panic in the mdtest suite. But in those cases, we are not interested in
the pointer to the source of this panic, but only in the mdtest suite
output.
This PR adds a Markdown test runner tool that attempts to make the
developer experience better.
Once it is started using
```bash
uv run -q crates/red_knot_python_semantic/mdtest.py
```
it will first recompile the tests once (if cargo requires it), find the
path to the `mdtest` executable, and then enter into a mode where it
watches for changes in the `red_knot_python_semantic` crate. Whenever …
* … a Markdown file changes, it will rerun the mdtest for this specific
file automatically (no recompilation!).
* … a Markdown file is added, it will recompile the tests and then run
the mdtest for the new file
* … Rust code is changed, it will recompile the tests and run all of
them
The tool also trims down `cargo test` output and only shows the actual
mdtest errors.
The tool will certainly require a few more iterations before it becomes
mature, but I'm curious to hear if there is any interest for something
like this.
## Test Plan
- Tested the new runner under various scenarios.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
Rename two functions with outdated names (they used to return `Type`s):
* `bindings_ty` => `symbol_from_bindings` (returns `Symbol`)
* `declarations_ty` => `symbol_from_declarations` (returns a
`SymbolAndQualifiers` result)
I chose `symbol_from_*` instead of `*_symbol` as I found the previous
name quite confusing. Especially since `binding_ty` and `declaration_ty`
also exist (singular).
## Test Plan
—
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
Add support for `typing.ClassVar`, i.e. emit a diagnostic in this
scenario:
```py
from typing import ClassVar
class C:
x: ClassVar[int] = 1
c = C()
c.x = 3 # error: "Cannot assign to pure class variable `x` from an instance of type `C`"
```
## Test Plan
- New tests for the `typing.ClassVar` qualifier
- Fixed one TODO in `attributes.md`
## Summary
This is a small, tentative step towards the bigger goal of understanding
instance attributes.
- Adds partial support for pure instance variables declared in the class
body, i.e. this case:
```py
class C:
variable1: str = "a"
variable2 = "b"
reveal_type(C().variable1) # str
reveal_type(C().variable2) # Unknown | Literal["b"]
```
- Adds `property` as a known class to query for `@property` decorators
- Splits up various `@Todo(instance attributes)` cases into
sub-categories.
## Test Plan
Modified existing MD tests.
## Summary
This PR adds support for configuring Red Knot in the `tool.knot` section
of the project's
`pyproject.toml` section. Options specified on the CLI precede the
options in the configuration file.
This PR only supports the `environment` and the `src.root` options for
now.
Other options will be added as separate PRs.
There are also a few concerns that I intentionally ignored as part of
this PR:
* Handling of relative paths: We need to anchor paths relative to the
current working directory (CLI), or the project (`pyproject.toml` or
`knot.toml`)
* Tracking the source of a value. Diagnostics would benefit from knowing
from which configuration a value comes so that we can point the user to
the right configuration file (or CLI) if the configuration is invalid.
* Schema generation and there's a lot more; see
https://github.com/astral-sh/ruff/issues/15491
This PR changes the default for first party codes: Our existing default
was to only add the project root. Now, Red Knot adds the project root
and `src` (if such a directory exists).
Theoretically, we'd have to add a file watcher event that changes the
first-party search paths if a user later creates a `src` directory. I
think this is pretty uncommon, which is why I ignored the complexity for
now but I can be persuaded to handle it if it's considered important.
Part of https://github.com/astral-sh/ruff/issues/15491
## Test Plan
Existing tests, new file watching test demonstrating that changing the
python version and platform is correctly reflected.
## Summary
Closes https://github.com/astral-sh/ruff/issues/15508
For any two instance types `T` and `S`, we know they are disjoint if
either `T` is final and `T` is not a subclass of `S` or `S` is final and
`S` is not a subclass of `T`.
Correspondingly, for any two types `type[T]` and `S` where `S` is an
instance type, `type[T]` can be said to be disjoint from `S` if `S` is
disjoint from `U`, where `U` is the type that represents all instances
of `T`'s metaclass.
And a heterogeneous tuple type can be said to be disjoint from an
instance type if the instance type is disjoint from `tuple` (a type
representing all instances of the `tuple` class at runtime).
## Test Plan
- A new mdtest added. Most of our `is_disjoint_from()` tests are not
written as mdtests just yet, but it's pretty hard to test some of these
edge cases from a Rust unit test!
- Ran `QUICKCHECK_TESTS=1000000 cargo test --release -p
red_knot_python_semantic -- --ignored types::property_tests::stable`
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
Ref: https://github.com/astral-sh/ruff/pull/15387#discussion_r1917796907
This PR updates `F722` to show syntax error message instead of the
string content.
I think it's more useful to show the syntax error message than the
string content. In the future, when the diagnostics renderer is more
capable, we could even highlight the exact location of the syntax error
along with the annotation string.
This is also in line with how we show the diagnostic in red knot.
## Test Plan
Update existing test snapshots.
## Summary
Resolves#9467
Parse quoted annotations as if the string content is inside parenthesis.
With this logic `x` and `y` in this example are equal:
```python
y: """
int |
str
"""
z: """(
int |
str
)
"""
```
Also this rule only applies to triple
quotes([link](https://github.com/python/typing-council/issues/9#issuecomment-1890808610)).
This PR is based on the
[comments](https://github.com/astral-sh/ruff/issues/9467#issuecomment-2579180991)
on the issue.
I did one extra change, since we don't want any indentation tokens I am
setting the `State::Other` as the initial state of the Lexer.
Remaining work:
- [x] Add a test case for red-knot.
- [x] Add more tests.
## Test Plan
Added a test which previously failed because quoted annotation contained
indentation.
Added an mdtest for red-knot.
Updated previous test.
Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
If `S <: T`, then `~T <: ~S`. This test currently fails with example
like:
```
S = tuple[()]
T = ~Literal[True] & ~Literal[False]
```
`T` is equivalent to `~(Literal[True] | Literal[False])` and therefore
equivalent to `~bool`, but the minimal example for a failure is what is
stated above. We correctly recognize that `S <: T`, but fail to see that
`~T <: ~S`, i.e. `bool <: ~tuple[()]`.
This is why the tests goes into the "flaky" section as well.
## Test Plan
```
export QUICKCHECK_TESTS=100000
while cargo test --release -p red_knot_python_semantic -- --ignored types::property_tests::flaky::negation_reverses_subtype_order; do :; done
```
## Summary
Adds some initial tests for class and instance attributes, mostly to
document (and discuss) what we want to support eventually. These
tests are not exhaustive yet. The idea is to specify the coarse-grained
behavior first.
Things that we'll eventually want to test:
- Interplay with inheritance
- Support `Final` in addition to `ClassVar`
- Specific tests for `ClassVar`, like making sure that we support things
like `x: Annotated[ClassVar[int], "metadata"]`
- … or making sure that we raise an error here:
```py
class Foo:
def __init__(self):
self.x: ClassVar[str] = "x"
```
- Add tests for `__new__` in addition to the tests for `__init__`
- Add tests that show that we use the union of types if multiple methods
define the symbol with different types
- Make sure that diagnostics are raised if, e.g., the inferred type of
an assignment within a method does not match the declared type in the
class body.
- https://github.com/astral-sh/ruff/pull/15474#discussion_r1916556284
- Method calls are completely left out for now.
- Same for `@property`
- … and the descriptor protocol
## Test Plan
New Markdown tests
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
The next sync of typeshed would have failed without manual changes
anyway, so I'm doing one manual sync + the required changes in our
`sys.platform` tests (which are necessary because of my tiny typeshed PR
here: https://github.com/python/typeshed/pull/13378).
closes#15485 (the next run of the pipeline in two weeks should be fine
as the bug has been fixed upstream)
## Summary
Adds two additional tests for `is_equivalent_to` so that we cover all
properties of an [equivalence relation].
## Test Plan
```
while cargo test --release -p red_knot_python_semantic -- --ignored types::property_tests::stable; do :; done
```
[equivalence relation]:
https://en.wikipedia.org/wiki/Equivalence_relation
## Summary
This changeset adds new tests for public uses of symbols,
considering all possible declaredness and boundness states.
Note that this is a mere documentation of the current behavior. There is
still an [open ticket] questioning some of these choices (or unintential
behaviors).
## Test plan
Made sure that the respective test fails if I add the questionable case
again in `symbol_by_id`:
```rs
Symbol::Type(inferred_ty, Boundness::Bound) => {
Symbol::Type(inferred_ty, Boundness::Bound)
}
```
[open ticket]: https://github.com/astral-sh/ruff/issues/14297
## Summary
Simplification follow-up to #15413.
There's no need to have a dedicated `CallOutcome` variant for every
known function, it's only necessary if the special-cased behavior of the
known function includes emitting extra diagnostics. For `typing.cast`,
there's no such need; we can use the regular `Callable` outcome variant,
and update the return type according to the cast. (This is the same way
we already handle `len`.)
One reason to avoid proliferating unnecessary `CallOutcome` variants is
that currently we have to explicitly add emitting call-binding
diagnostics, for each outcome variant. So we were previously wrongly
silencing any binding diagnostics on calls to `typing.cast`. Fixing this
revealed a separate bug, that we were emitting a bogus error anytime
more than one keyword argument mapped to a `**kwargs` parameter. So this
PR also adds test and fix for that bug.
## Test Plan
Existing `cast` tests pass unchanged, added new test for `**kwargs` bug.
## Summary
In `SymbolState` merging, use `BitSet::union` instead of inserting
declarations one by one. This used to be the case but was changed in
https://github.com/astral-sh/ruff/pull/15019 because we had to iterate
over declarations anyway.
This is an alternative to https://github.com/astral-sh/ruff/pull/15419
by @MichaReiser. It's similar in performance, but a bit more
declarative and less imperative.
## Summary
Follow-up PR from https://github.com/astral-sh/ruff/pull/15415🥲
The exact same property test already exists:
`intersection_assignable_to_both` and
`all_type_pairs_can_be_assigned_from_their_intersection`
## Test Plan
`cargo test -p red_knot_python_semantic -- --ignored
types::property_tests::flaky`
A small PR to reduce some of the code duplication between the various
branches, make it a little more readable and move the API closer to what
we already have for `KnownClass`
## Summary
[**Rendered version of the new test
suite**](https://github.com/astral-sh/ruff/blob/david/intersection-type-tests/crates/red_knot_python_semantic/resources/mdtest/intersection_types.md)
Moves most of our existing intersection-types tests to a dedicated
Markdown test suite, extends the test coverage, unifies the notation for
these tests, groups tests into a proper structure, and adds some
explanations for various simplification strategies.
This changeset also:
- Adds a new simplification where `~Never` is removed from
intersections.
- Adds a new simplification where adding `~object` simplifies the whole
intersection to `Never`
- Avoids unnecessary assignment-checks between inferred and declared
type. This was added to this changeset to avoid many false positive
errors in this test suite.
Resolves the task described in this old comment
[here](e01da82a5a..e7e432bca2 (r1819924085)).
## Test Plan
Running the new Markdown tests
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Prompted by
> One nit: I think we need to consider `Any` and `Unknown` and `Todo` as
all (gradually) equivalent to each other, and thus `type & Any` and
`type & Unknown` and `type & Todo` as also equivalent. The distinction
between `Any` vs `Unknown` vs `Todo` is entirely about
provenance/debugging, there is no type level distinction. (And I've been
wondering if the `Any` vs `Unknown` distinction is really worth it.)
The thought here is that _most_ places want to treat `Any`, `Unknown`,
and `Todo` identically. So this PR simplifies things by having a single
`Type::Any` variant, and moves the provenance part into a new `AnyType`
type. If you need to treat e.g. `Todo` differently, you still can by
pattern-matching into the `AnyType`. But if you don't, you can just use
`Type::Any(_)`.
(This would also allow us to (more easily) distinguish "unknown via an
unannotated value" from "unknown because of a typing error" should we
want to do that in the future)
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
This moves almost all of our existing `UnionBuilder` tests to a
Markdown-based test suite.
I see how this could be a more controversial change, since these tests
where written specifically for `UnionBuilder`, and by creating the union
types using Python type expressions, we add an additional layer on top
(parsing and inference of these expressions) that moves these tests away
from clean unit tests more in the direction of integration tests. Also,
there are probably a few implementation details of `UnionBuilder` hidden
in the test assertions (e.g. order of union elements after
simplifications).
That said, I think we would like to see all those properties that are
being tested here from *any* implementation of union types. And the
Markdown tests come with the usual advantages:
- More consice
- Better readability
- No re-compiliation when working on tests
- Easier to add additional explanations and structure to the test suite
This changeset adds a few additional tests, but keeps the logic of the
existing tests except for a few minor modifications for consistency.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: T-256 <132141463+T-256@users.noreply.github.com>
## Summary
- Add a workflow to run property tests on a daily basis (based on
`daily_fuzz.yaml`)
- Mark `assignable_to_is_reflexive` as flaky (related to #14899)
- Add new (failing) `intersection_assignable_to_both` test (also related
to #14899)
## Test Plan
Ran:
```bash
export QUICKCHECK_TESTS=100000
while cargo test --release -p red_knot_python_semantic -- \
--ignored types::property_tests::stable; do :; done
```
Observed successful property_tests CI run
## Summary
This changeset migrates all existing `is_assignable_to` tests to a
Markdown-based test. It also increases our test coverage in a hopefully
meaningful way (not claiming to be complete in any sense). But at least
I found and fixed one bug while doing so.
## Test Plan
Ran property tests to make sure the new test succeeds after fixing it.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
Adds a type-check-time Python API that allows us to create and
manipulate types and to test various of their properties. For example,
this can be used to write a Markdown test to make sure that `A & B` is a
subtype of `A` and `B`, but not of an unrelated class `C` (something
that requires quite a bit more code to do in Rust):
```py
from knot_extensions import Intersection, is_subtype_of, static_assert
class A: ...
class B: ...
type AB = Intersection[A, B]
static_assert(is_subtype_of(AB, A))
static_assert(is_subtype_of(AB, B))
class C: ...
static_assert(not is_subtype_of(AB, C))
```
I think this functionality is also helpful for interactive debugging
sessions, in order to query various properties of Red Knot's type
system. Which is something that otherwise requires a custom Rust unit
test, some boilerplate code and constant re-compilation.
## Test Plan
- New Markdown tests
- Tested the modified typeshed_sync workflow locally
## Summary
`Type[Any]` should be assignable to `object`. All types should be
assignable to `object`.
We specifically didn't understand the former; this PR adds a test for
it, and a case to ensure that `Type[Any]` is assignable to anything that
`type` is assignable to (which includes `object`).
This PR also adds a property test that all types are assignable to
object. In order to make it pass, I added a special case to check early
if we are assigning to `object` and just return `true`. In principle,
once we get all the more general cases correct, this special case might
be removable. But having the special case for now allows the property
test to pass.
And we add a property test that all types are subtypes of object. This
failed for the case of an intersection with no positive elements (that
is, a negation type). This really does need to be a special case for
`object`, because there is no other type we can know that a negation
type is a subtype of.
## Test Plan
Added unit test and property test.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
We now support class patterns in a match statement, adding a narrowing
constraint that within the body of that match arm, we can assume that
the subject is an instance of that class.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
This implements checking of calls.
I ended up following Micha's original suggestion from back when the
signature representation was first introduced, and flattening it to a
single array of parameters. This turned out to be easier to manage,
because we can represent parameters using indices into that array, and
represent the bound argument types as an array of the same length.
Starred and double-starred arguments are still TODO; these won't be very
useful until we have generics.
The handling of diagnostics is just hacked into `return_ty_result`,
which was already inconsistent about whether it emitted diagnostics or
not; now it's even more inconsistent. This needs to be addressed, but
could be a follow-up.
The new benchmark errors here surface the need for intersection support
in `is_assignable_to`.
Fixes#14161.
## Test Plan
Added mdtests.
## Summary
When debugging, I frequently want to know which symbols are being looked
up. `symbol_by_id` adds tracing information, but it only shows the
`ScopedSymbolId`. Since `symbol_by_id` is only called from `symbol`, it
seems reasonable to move the tracing call one level up from
`symbol_by_id` to `symbol`, where we can also show the name of the
symbol.
**Before**:
```
6 └─┐red_knot_python_semantic::types::infer::infer_expression_types{expression=Id(60de), file=/home/shark/tomllib_modified/_parser.py}
6 └─┐red_knot_python_semantic::types::symbol_by_id{symbol=ScopedSymbolId(33)}
6 ┌─┘
6 └─┐red_knot_python_semantic::types::symbol_by_id{symbol=ScopedSymbolId(123)}
6 ┌─┘
6 └─┐red_knot_python_semantic::types::symbol_by_id{symbol=ScopedSymbolId(54)}
6 ┌─┘
6 └─┐red_knot_python_semantic::types::symbol_by_id{symbol=ScopedSymbolId(122)}
6 ┌─┘
6 └─┐red_knot_python_semantic::types::symbol_by_id{symbol=ScopedSymbolId(165)}
6 ┌─┘
6 ┌─┘
6 └─┐red_knot_python_semantic::types::symbol_by_id{symbol=ScopedSymbolId(32)}
6 ┌─┘
6 └─┐red_knot_python_semantic::types::symbol_by_id{symbol=ScopedSymbolId(232)}
6 ┌─┘
6 ┌─┘
6 ┌─┘
6┌─┘
```
**After**:
```
5 └─┐red_knot_python_semantic::types::infer::infer_expression_types{expression=Id(60de), file=/home/shark/tomllib_modified/_parser.py}
5 └─┐red_knot_python_semantic::types::symbol{name="dict"}
5 ┌─┘
5 └─┐red_knot_python_semantic::types::symbol{name="dict"}
5 ┌─┘
5 └─┐red_knot_python_semantic::types::symbol{name="list"}
5 ┌─┘
5 └─┐red_knot_python_semantic::types::symbol{name="list"}
5 ┌─┘
5 └─┐red_knot_python_semantic::types::symbol{name="isinstance"}
5 ┌─┘
5 └─┐red_knot_python_semantic::types::symbol{name="isinstance"}
5 ┌─┘
5 ┌─┘
5 └─┐red_knot_python_semantic::types::symbol{name="ValueError"}
5 ┌─┘
5 └─┐red_knot_python_semantic::types::symbol{name="ValueError"}
5 ┌─┘
5 ┌─┘
5 ┌─┘
5┌─┘
```
## Test Plan
```
cargo run --bin red_knot -- --current-directory path/to/tomllib -vvv
```
## Summary
While looking at #14899, I looked at seeing if I could get shrinking on
the examples. It turned out to be straightforward, with a couple of
caveats.
I'm calling `clone` a lot during shrinking. Since by the shrink step
we're already looking at a test failure this feels fine? Unless I
misunderstood `quickcheck`'s core loop
When shrinking `Intersection`s, in order to just rely on `quickcheck`'s
`Vec` shrinking without thinking about it too much, the shrinking
strategy is:
- try to shrink the negative side (keeping the positive side the same)
- try to shrink the positive side (keeping the negative side the same)
This means that you can't shrink from `(A & B & ~C & ~D)` directly to
`(A & ~C)`! You would first need an intermediate failure at `(A & B &
~C)` or `(A & ~C & ~D)`. This feels good enough. Shrinking the negative
side first also has the benefit of trying to strip down negative
elements in these intersections.
## Test Plan
`cargo test -p red_knot_python_semantic -- --ignored
types::property_tests::stable` still fails as it current does on `main`,
but now the errors seem more minimal.
Just like in #15045 for unary expressions: In binary expressions, we
were only looking for dunder expressions for `Type::Instance` types. We
had some special cases for coercing the various `Literal` types into
their corresponding `Instance` types before doing the lookup. But we can
side-step all of that by using the existing `Type::to_meta_type` and
`Type::to_instance` methods.
Resolves#14840
## Summary
Usage of ellipsis literal as default argument is allowed in stub files.
## Test Plan
Added mdtest for both python files and stub files.
---------
Co-authored-by: Carl Meyer <carl@oddbird.net>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
The test expression in an `elif` clause is evaluated whether or not we
take the branch. Our control flow model for if/elif chains failed to
reflect this, causing wrong inference in cases where an assignment
expression occurs inside an `elif` test expression. Our "no branch taken
yet" snapshot (which is the starting state for every new elif branch)
can't simply be the pre-if state, it must be updated after visiting each
test expression.
Once we do this, it also means we no longer need to track a vector of
narrowing constraints to reapply for each new branch, since our "branch
not taken" state (which is the initial state for each branch) is
continuously updated to include the negative narrowing constraints of
all previous branches.
Fixes#15033.
## Test Plan
Added mdtests.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
We understand `sys.version_info` branches now! As such, I _believe_ this
branch is no longer required; all tests pass without it. I also ran
`QUICKCHECK_TESTS=100000 cargo test -p red_knot_python_semantic --
--ignored types::property_tests::stable`, and no tests failed except for
the known issue with `Type::is_assignable_to()`
(https://github.com/astral-sh/ruff/issues/14899)
## Test Plan
See above
## Summary
Remove `Type::tuple` in favor of `TupleType::from_elements`, avoid a few
intermediate `Vec`tors. Resolves an old [review
comment](https://github.com/astral-sh/ruff/pull/14744#discussion_r1867493706).
## Test Plan
New regression test for something I ran into while implementing this.
## Summary
Part of #13773
This PR adds diagnostics when there is a length mismatch during
unpacking between the number of target expressions and the number of
types for the unpack value expression.
There are 3 cases of diagnostics here where the first two occurs when
there isn't a starred expression and the last one occurs when there's a
starred expression:
1. Number of target expressions is **less** than the number of types
that needs to be unpacked
2. Number of target expressions is **greater** then the number of types
that needs to be unpacked
3. When there's a starred expression as one of the target expression and
the number of target expressions is greater than the number of types
Examples for all each of the above cases:
```py
# red-knot: Too many values to unpack (expected 2, got 3) [lint:invalid-assignment]
a, b = (1, 2, 3)
# red-knot: Not enough values to unpack (expected 2, got 1) [lint:invalid-assignment]
a, b = (1,)
# red-knot: Not enough values to unpack (expected 3 or more, got 2) [lint:invalid-assignment]
a, *b, c, d = (1, 2)
```
The (3) case is a bit special because it uses a distinct wording
"expected n or more" instead of "expected n" because of the starred
expression.
### Location
The diagnostic location is the target expression that's being unpacked.
For nested targets, the location will be the nested expression. For
example:
```py
(a, (b, c), d) = (1, (2, 3, 4), 5)
# ^^^^^^
# red-knot: Too many values to unpack (expected 2, got 3) [lint:invalid-assignment]
```
For future improvements, it would be useful to show the context for why
this unpacking failed. For example, for why the expected number of
targets is `n`, we can highlight the relevant elements for the value
expression.
In the **ecosystem**, **Pyright** uses the target expressions for
location while **mypy** uses the value expression for the location. For
example:
```py
if 1:
# mypy: Too many values to unpack (2 expected, 3 provided) [misc]
# vvvvvvvvv
a, b = (1, 2, 3)
# ^^^^
# Pyright: Expression with type "tuple[Literal[1], Literal[2], Literal[3]]" cannot be assigned to target tuple
# Type "tuple[Literal[1], Literal[2], Literal[3]]" is incompatible with target tuple
# Tuple size mismatch; expected 2 but received 3 [reportAssignmentType]
# red-knot: Too many values to unpack (expected 2, got 3) [lint:invalid-assignment]
```
## Test Plan
Update existing test cases TODO with the error directives.
## Summary
Ref:
3533d7f5b4 (r150651102)
This PR removes the `Ranged` implementation on `DefinitionKind` and
instead uses a method called `target_range` to avoid any confusion about
what range this is for i.e., it's not the range of the node that
represents the definition.
## Summary
Related to #13773
This PR adds support for unpacking `for` statement targets.
This involves updating the `value` field in the `Unpack` target to use
an enum which specifies the "where did the value expression came from?".
This is because for an iterable expression, we need to unpack the
iterator type while for assignment statement we need to unpack the value
type itself. And, this needs to be done in the unpack query.
### Question
One of the ways unpacking works in `for` statement is by looking at the
union of the types because if the iterable expression is a tuple then
the iterator type will be union of all the types in the tuple. This
means that the test cases that will test the unpacking in `for`
statement will also implicitly test the unpacking union logic. I was
wondering if it makes sense to merge these cases and only add the ones
that are specific to the union unpacking or for statement unpacking
logic.
## Test Plan
Add test cases involving iterating over a tuple type. I've intentionally
left out certain cases for now and I'm curious to know any thoughts on
the above query.
## Summary
This changeset adds support for precise type-inference and
boundness-handling of definitions inside control-flow branches with
statically-known conditions, i.e. test-expressions whose truthiness we
can unambiguously infer as *always false* or *always true*.
This branch also includes:
- `sys.platform` support
- statically-known branches handling for Boolean expressions and while
loops
- new `target-version` requirements in some Markdown tests which were
now required due to the understanding of `sys.version_info` branches.
closes#12700closes#15034
## Performance
### `tomllib`, -7%, needs to resolve one additional module (sys)
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./red_knot_main --project /home/shark/tomllib` | 22.2 ± 1.3 | 19.1 |
25.6 | 1.00 |
| `./red_knot_feature --project /home/shark/tomllib` | 23.8 ± 1.6 | 20.8
| 28.6 | 1.07 ± 0.09 |
### `black`, -6%
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./red_knot_main --project /home/shark/black` | 129.3 ± 5.1 | 119.0 |
137.8 | 1.00 |
| `./red_knot_feature --project /home/shark/black` | 136.5 ± 6.8 | 123.8
| 147.5 | 1.06 ± 0.07 |
## Test Plan
- New Markdown tests for the main feature in
`statically-known-branches.md`
- New Markdown tests for `sys.platform`
- Adapted tests for `EllipsisType`, `Never`, etc
## Summary
Refer:
https://github.com/astral-sh/ruff/issues/13773#issuecomment-2548020368
This PR adds support for unpacking union types.
Unpacking a union type requires us to first distribute the types for all
the targets that are involved in an unpacking. For example, if there are
two targets and a union type that needs to be unpacked, each target will
get a type from each element in the union type.
For example, if the type is `tuple[int, int] | tuple[int, str]` and the
target has two elements `(a, b)`, then
* The type of `a` will be a union of `int` and `int` which are at index
0 in the first and second tuple respectively which resolves to an `int`.
* Similarly, the type of `b` will be a union of `int` and `str` which
are at index 1 in the first and second tuple respectively which will be
`int | str`.
### Refactors
There are couple of refactors that are added in this PR:
* Add a `debug_assertion` to validate that the unpack target is a list
or a tuple
* Add a separate method to handle starred expression
## Test Plan
Update `unpacking.md` with additional test cases that uses union types.
This is done using parameter type hints style.
## Summary
This PR adds initial support for `type: ignore`. It doesn't do anything
fancy yet like:
* Detecting invalid type ignore comments
* Detecting type ignore comments that are part of another suppression
comment: `# fmt: skip # type: ignore`
* Suppressing specific lints `type: ignore [code]`
* Detecting unsused type ignore comments
* ...
The goal is to add this functionality in separate PRs.
## Test Plan
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
We have a handy `to_meta_type` that does the right thing for class
instances, and also works for all of the other types that are “instances
of” something. Unless I'm missing something, this should let us get rid
of the catch-all clause in one fell swoop.
cf #14548
## Summary
I'm currently on the fence about landing the #14760 PR because it's
unclear how we'd support tracking used and unused suppression comments
in a performant way:
* Salsa adds an "untracked" dependency to every query reading
accumulated values. This has the effect that the query re-runs on every
revision. For example, a possible future query
`unused_suppression_comments(db, file)` would re-run on every
incremental change and for every file. I don't expect the operation
itself to be expensive, but it all adds up in a project with 100k+ files
* Salsa collects the accumulated values by traversing the entire query
dependency graph. It can skip over sub-graphs if it is known that they
contain no accumulated values. This makes accumulators a great tool for
when they are rare; diagnostics are a good example. Unfortunately,
suppressions are more common, and they often appear in many different
files, making the "skip over subgraphs" optimization less effective.
Because of that, I want to wait to adopt salsa accumulators for type
check diagnostics (we could start using them for other diagnostics)
until we have very specific reasons that justify regressing incremental
check performance.
This PR does a "small" refactor that brings us closer to what I have in
#14760 but without using accumulators. To emit a diagnostic, a method
needs:
* Access to the db
* Access to the currently checked file
This PR introduces a new `InferContext` that holds on to the db, the
current file, and the reported diagnostics. It replaces the
`TypeCheckDiagnosticsBuilder`. We pass the `InferContext` instead of the
`db` to methods that *might* emit diagnostics. This simplifies some of
the `Outcome` methods, which can now be called with a context instead of
a `db` and the diagnostics builder. Having the `db` and the file on a
single type like this would also be useful when using accumulators.
This PR doesn't solve the issue that the `Outcome` types feel somewhat
complicated nor that it can be annoying when you need to report a
`Diagnostic,` but you don't have access to an `InferContext` (or the
file). However, I also believe that accumulators won't solve these
problems because:
* Even with accumulators, it's necessary to have a reference to the file
that's being checked. The struggle would be to get a reference to that
file rather than getting a reference to `InferContext`.
* Users of the `HasTy` trait (e.g., a linter) don't want to bother
getting the `File` when calling `Type::return_ty` because they aren't
interested in the created diagnostics. They just want to know what
calling the current expression would return (and if it even is a
callable). This is what the different methods of `Outcome` enable today.
I can ask for the return type without needing extra data that's only
relevant for emitting a diagnostic.
A shortcoming of this approach is that it is now a bit confusing when to
pass `db` and when an `InferContext`. An option is that we'd make the
`file` on `InferContext` optional (it won't collect any diagnostics if
`None`) and change all methods on `Type` to take `InferContext` as the
first argument instead of a `db`. I'm interested in your opinion on
this.
Accumulators are definitely harder to use incorrectly because they
remove the need to merge the diagnostics explicitly and there's no risk
that we accidentally merge the diagnostics twice, resulting in
duplicated diagnostics. I still value performance more over making our
life slightly easier.
This tweaks the new semantics from #15026 a bit when a symbol could be
interpreted both as an attribute and a submodule of a package. For
`from...import`, we should actually prioritize the attribute, because of
how the statement itself is implemented [1].
> 1. check if the imported module has an attribute by that name
> 2. if not, attempt to import a submodule with that name and then check
the imported module again for that attribute
[1] https://docs.python.org/3/reference/simple_stmts.html#the-import-statement
## Summary
Fixes#14550.
Add `AlwaysTruthy` and `AlwaysFalsy` types, representing the set of objects whose `__bool__` method can only ever return `True` or `False`, respectively, and narrow `if x` and `if not x` accordingly.
## Test Plan
- New Markdown test for truthiness narrowing `narrow/truthiness.md`
- unit tests in `types.rs` and `builders.rs` (`cargo test --package
red_knot_python_semantic --lib -- types`)
## Summary
Fixes https://github.com/astral-sh/ruff/issues/15027
The `MemoryFileSystem::write_file` API automatically creates
non-existing ancestor directoryes
but we failed to update the status of the now created ancestor
directories in the `Files` data structure.
## Test Plan
Tested that the case in https://github.com/astral-sh/ruff/issues/15027
now passes regardless of whether the *Simple* case is commented out or
not
## Summary
This PR updates the logic when raising conflicting declarations
diagnostic to avoid the undeclared path if present.
The conflicting declaration diagnostics is added when there are two or
more declarations in the control flow path of a definition whose type
isn't equivalent to each other. This can be seen in the following
example:
```py
if flag:
x: int
x = 1 # conflicting-declarations: Unknown, int
```
After this PR, we'd avoid considering "Unknown" as part of the
conflicting declarations. This means we'd still flag it for the
following case:
```py
if flag:
x: int
else:
x: str
x = 1 # conflicting-declarations: int, str
```
A solution that's local to the exception control flow was also explored
which required updating the logic for merging the flow snapshot to avoid
considering declarations using a flag. This is preserved here:
https://github.com/astral-sh/ruff/compare/dhruv/control-flow-no-declarations?expand=1.
The main motivation to avoid that is we don't really understand what the
user experience is w.r.t. the Unknown type and the
conflicting-declaration diagnostics. This makes us unsure on what the
right semantics are as to whether that diagnostics should be raised or
not and when to raise them. For now, we've decided to move forward with
this PR and could decide to adopt another solution or remove the
conflicting-declaration diagnostics in the future.
Closes: #13966
## Test Plan
Update the existing mdtest case. Add an additional case specific to
exception control flow to verify that the diagnostic is not being raised
now.
When importing a nested module, we were correctly creating a binding for
the top-most parent, but we were binding that to the nested module, not
to that parent module. Moreover, we weren't treating those submodules as
members of their containing parents. This PR addresses both issues, so
that nested imports work as expected.
As discussed in ~Slack~ whatever chat app I find myself in these days
😄, this requires keeping track of which modules have been imported
within the current file, so that when we resolve member access on a
module reference, we can see if that member has been imported as a
submodule. If so, we return the submodule reference immediately, instead
of checking whether the parent module's definition defines the symbol.
This is currently done in a flow insensitive manner. The `SemanticIndex`
now tracks all of the modules that are imported (via `import`, not via
`from...import`). The member access logic mentioned above currently only
considers module imports in the file containing the attribute
expression.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
A class is an instance of its metaclass, so `ClassLiteral("ABC")` is not
disjoint from `Instance("ABCMeta")`. However, we erroneously consider
the two types disjoint on the `main` branch. This PR fixes that.
This bug was uncovered by adding some more core types to the property
tests that provide coverage for classes that have custom metaclasses.
The additions to the property tests are included in this PR.
## Test Plan
New unit tests and property tests added. Tested with:
- `cargo test -p red_knot_python_semantic`
- `QUICKCHECK_TESTS=100000 cargo test -p red_knot_python_semantic --
--ignored types::property_tests::stable`
The assignability property test fails on this branch, but that's a known
issue that exists on `main`, due to
https://github.com/astral-sh/ruff/issues/14899.
## Summary
Teach red-knot that `type[...]` is always disjoint from `None` and from
`LiteralString`. Fixes#14925.
This should properly be generalized to "all instances of final types
which are not subclasses of `type`", but until we support finality,
hardcoding `None` (which is known to be final) allows us to fix the
subtype transitivity property test.
## Test Plan
Existing tests pass, added new unit tests for `is_disjoint_from` and
`is_subtype_of`.
`QUICKCHECK_TESTS=100000 cargo test -p red_knot_python_semantic --
--ignored types::property_tests::stable` fails only the "assignability
is reflexive" test, which is known to fail on `main` (#14899).
The same command, with `property_tests.rs` edited to prevent generating
intersection tests (the cause of #14899), passes all quickcheck tests.
## Summary
Resolves#14922.
## Test Plan
Markdown tests.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
This is not strictly required yet, but makes these tests future-proof.
They need a `python-version` requirement as they rely on language
features that are not available in 3.9.
## Summary
Add support for `typing.TYPE_CHECKING` and
`typing_extensions.TYPE_CHECKING`.
relates to: https://github.com/astral-sh/ruff/issues/14170
## Test Plan
New Markdown-based tests
## Summary
This PR extends the mdtest configuration with a `log` setting that can
be any of:
* `true`: Enables tracing
* `false`: Disables tracing (default)
* String: An ENV_FILTER similar to `RED_KNOT_LOG`
```toml
log = true
```
Closes https://github.com/astral-sh/ruff/issues/13865
## Test Plan
I changed a test and tried `log=true`, `log=false`, and `log=INFO`
## Summary
This PR renames the `--custom-typeshed-dir`, `target-version`, and
`--current-directory` cli options to `--typeshed`,
`--python-version`, and `--project` as discussed in the CLI proposal
document.
I added aliases for `--target-version` (for Ruff compat) and
`--custom-typeshed-dir` (for Alex)
## Test Plan
Long help
```
An extremely fast Python type checker.
Usage: red_knot [OPTIONS] [COMMAND]
Commands:
server Start the language server
help Print this message or the help of the given subcommand(s)
Options:
--project <PROJECT>
Run the command within the given project directory.
All `pyproject.toml` files will be discovered by walking up the directory tree from the project root, as will the project's virtual environment (`.venv`).
Other command-line arguments (such as relative paths) will be resolved relative to the current working directory."#,
--venv-path <PATH>
Path to the virtual environment the project uses.
If provided, red-knot will use the `site-packages` directory of this virtual environment to resolve type information for the project's third-party dependencies.
--typeshed-path <PATH>
Custom directory to use for stdlib typeshed stubs
--extra-search-path <PATH>
Additional path to use as a module-resolution source (can be passed multiple times)
--python-version <VERSION>
Python version to assume when resolving types
[possible values: 3.7, 3.8, 3.9, 3.10, 3.11, 3.12, 3.13]
-v, --verbose...
Use verbose output (or `-vv` and `-vvv` for more verbose output)
-W, --watch
Run in watch mode by re-running whenever files change
-h, --help
Print help (see a summary with '-h')
-V, --version
Print version
```
Short help
```
An extremely fast Python type checker.
Usage: red_knot [OPTIONS] [COMMAND]
Commands:
server Start the language server
help Print this message or the help of the given subcommand(s)
Options:
--project <PROJECT> Run the command within the given project directory
--venv-path <PATH> Path to the virtual environment the project uses
--typeshed-path <PATH> Custom directory to use for stdlib typeshed stubs
--extra-search-path <PATH> Additional path to use as a module-resolution source (can be passed multiple times)
--python-version <VERSION> Python version to assume when resolving types [possible values: 3.7, 3.8, 3.9, 3.10, 3.11, 3.12, 3.13]
-v, --verbose... Use verbose output (or `-vv` and `-vvv` for more verbose output)
-W, --watch Run in watch mode by re-running whenever files change
-h, --help Print help (see more with '--help')
-V, --version Print version
```
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
Regression test(s) for something that broken while implementing #14759.
We have similar tests for other control flow elements, but feel free to
let me know if this seems superfluous.
## Test Plan
New mdtests
## Summary
Using `typing.LiteralString` breaks as soon as we understand
`sys.version_info` branches, as it's only available in 3.11 and later.
## Test Plan
Made sure it didn't fail on my #14759 branch anymore.
We support using `typing.Type[]` as a base class (and we have tests for
it), but not yet `builtins.type[]`. At some point we should fix that,
but I don't think it';s worth spending much time on now (and it might be
easier once we've implemented generics?). This PR just adds a failing
test with a TODO.
## Summary
This is the third and last PR in this stack that adds support for
toggling lints at a per-rule level.
This PR introduces a new `LintRegistry`, a central index of known lints.
The registry is required because we want to support lint rules from many
different crates but need a way to look them up by name, e.g., when
resolving a lint from a name in the configuration or analyzing a
suppression comment.
Adding a lint now requires two steps:
1. Declare the lint with `declare_lint`
2. Register the lint in the registry inside the `register_lints`
function.
I considered some more involved macros to avoid changes in two places.
Still, I ultimately decided against it because a) it's just two places
and b) I'd expect that registering a type checker lint will differ from
registering a lint that runs as a rule in the linter. I worry that any
more opinionated design could limit our options when working on the
linter, so I kept it simple.
The second part of this PR is the `RuleSelection`. It stores which lints
are enabled and what severity they should use for created diagnostics.
For now, the `RuleSelection` always gets initialized with all known
lints and it uses their default level.
## Linter crates
Each crate that defines lints should export a `register_lints` function
that accepts a `&mut LintRegistryBuilder` to register all its known
lints in the registry. This should make registering all known lints in a
top-level crate easy: Just call `register_lints` of every crate that
defines lint rules.
I considered defining a `LintCollection` trait and even some fancy
macros to accomplish the same but decided to go for this very simplistic
approach for now. We can add more abstraction once needed.
## Lint rules
This is a bit hand-wavy. I don't have a good sense for how our linter
infrastructure will look like, but I expect we'll need a way to register
the rules that should run as part of the red knot linter. One way is to
keep doing what Ruff does by having one massive `checker` and each lint
rule adds a call to itself in the relevant AST visitor methods. An
alternative is that we have a `LintRule` trait that provides common
hooks and implementations will be called at the "right time". Such a
design would need a way to register all known lint implementations,
possibly with the lint. This is where we'd probably want a dedicated
`register_rule` method. A third option is that lint rules are handled
separately from the `LintRegistry` and are specific to the linter crate.
The current design should be flexible enough to support the three
options.
## Documentation generation
The documentation for all known lints can be generated by creating a
factory, registering all lints by calling the `register_lints` methods,
and then querying the registry for the metadata.
## Deserialization and Schema generation
I haven't fully decided what the best approach is when it comes to
deserializing lint rule names:
* Reject invalid names in the deserializer. This gives us error messages
with line and column numbers (by serde)
* Don't validate lint rule names during deserialization; defer the
validation until the configuration is resolved. This gives us more
control over handling the error, e.g. emit a warning diagnostic instead
of aborting when a rule isn't known.
One technical challenge for both deserialization and schema generation
is that the `Deserialize` and `JSONSchema` traits do not allow passing
the `LintRegistry`, which is required to look up the lints by name. I
suggest that we either rely on the salsa db being set for the current
thread (`salsa::Attach`) or build our own thread-local storage for the
`LintRegistry`. It's the caller's responsibility to make the lint
registry available before calling `Deserialize` or `JSONSchema`.
## CLI support
I prefer deferring adding support for enabling and disabling lints from
the CLI for now because I think it will be easier
to add once I've figured out how to handle configurations.
## Bitset optimization
Ruff tracks the enabled rules using a cheap copyable `Bitset` instead of
a hash map. This helped improve performance by a few percent (see
https://github.com/astral-sh/ruff/pull/3606). However, this approach is
no longer possible because lints have no "cheap" way to compute their
index inside the registry (other than using a hash map).
We could consider doing something similar to Salsa where each
`LintMetadata` stores a `LazyLintIndex`.
```
pub struct LazyLintIndex {
cached: OnceLock<(Nonce, LintIndex)>
}
impl LazyLintIndex {
pub fn get(registry: &LintRegistry, lint: &'static LintMetadata) {
let (nonce, index) = self.cached.get_or_init(|| registry.lint_index(lint));
if registry.nonce() == nonce {
index
} else {
registry.lint_index(lint)
}
}
```
Each registry keeps a map from `LintId` to `LintIndex` where `LintIndex`
is in the range of `0...registry.len()`. The `LazyLintIndex` is based on
the assumption that every program has exactly **one** registry. This
assumption allows to cache the `LintIndex` directly on the
`LintMetadata`. The implementation falls back to the "slow" path if
there is more than one registry at runtime.
I was very close to implementing this optimization because it's kind of
fun to implement. I ultimately decided against it because it adds
complexity and I don't think it's worth doing in Red Knot today:
* Red Knot only queries the rule selection when deciding whether or not
to emit a diagnostic. It is rarely used to detect if a certain code
block should run. This is different from Ruff where the rule selection
is queried many times for every single AST node to determine which rules
*should* run.
* I'm not sure if a 2-3% performance improvement is worth the complexity
I suggest revisiting this decision when working on the linter where a
fast path for deciding if a rule is enabled might be more important (but
that depends on how lint rules are implemented)
## Test Plan
I removed a lint from the default rule registry, and the MD tests
started failing because the diagnostics were no longer emitted.
This adds support for `type[Any]`, which represents an unknown type (not
an instance of an unknown type), and `type`, which we are choosing to
interpret as `type[object]`.
Closes#14546
## Summary
This is already several hundred lines of code, and it will get more
complex with call-signature checking.
## Test Plan
This is a pure code move; the moved code wasn't changed, just imports.
Existing tests pass.
## Summary
Add a `is_fully_static` premise to the equivalence on subtyping property tests.
## Test Plan
```
cargo test -p red_knot_python_semantic -- --ignored types::property_tests::stable
```
## Summary
This is the second PR out of three that adds support for
enabling/disabling lint rules in Red Knot. You may want to take a look
at the [first PR](https://github.com/astral-sh/ruff/pull/14869) in this
stack to familiarize yourself with the used terminology.
This PR adds a new syntax to define a lint:
```rust
declare_lint! {
/// ## What it does
/// Checks for references to names that are not defined.
///
/// ## Why is this bad?
/// Using an undefined variable will raise a `NameError` at runtime.
///
/// ## Example
///
/// ```python
/// print(x) # NameError: name 'x' is not defined
/// ```
pub(crate) static UNRESOLVED_REFERENCE = {
summary: "detects references to names that are not defined",
status: LintStatus::preview("1.0.0"),
default_level: Level::Warn,
}
}
```
A lint has a name and metadata about its status (preview, stable,
removed, deprecated), the default diagnostic level (unless the
configuration changes), and documentation. I use a macro here to derive
the kebab-case name and extract the documentation automatically.
This PR doesn't yet add any mechanism to discover all known lints. This
will be added in the next and last PR in this stack.
## Documentation
I documented some rules but then decided that it's probably not my best
use of time if I document all of them now (it also means that I play
catch-up with all of you forever). That's why I left some rules
undocumented (marked with TODO)
## Where is the best place to define all lints?
I'm not sure. I think what I have in this PR is fine but I also don't
love it because most lints are in a single place but not all of them. If
you have ideas, let me know.
## Why is the message not part of the lint, unlike Ruff's `Violation`
I understand that the main motivation for defining `message` on
`Violation` in Ruff is to remove the need to repeat the same message
over and over again. I'm not sure if this is an actual problem. Most
rules only emit a diagnostic in a single place and they commonly use
different messages if they emit diagnostics in different code paths,
requiring extra fields on the `Violation` struct.
That's why I'm not convinced that there's an actual need for it and
there are alternatives that can reduce the repetition when creating a
diagnostic:
* Create a helper function. We already do this in red knot with the
`add_xy` methods
* Create a custom `Diagnostic` implementation that tailors the entire
diagnostic and pre-codes e.g. the message
Avoiding an extra field on the `Violation` also removes the need to
allocate intermediate strings as it is commonly the place in Ruff.
Instead, Red Knot can use a borrowed string with `format_args`
## Test Plan
`cargo test`
## Summary
This PR introduces a structured `DiagnosticId` instead of using a plain
`&'static str`. It is the first of three in a stack that implements a
basic rules infrastructure for Red Knot.
`DiagnosticId` is an enum over all known diagnostic codes. A closed enum
reduces the risk of accidentally introducing two identical diagnostic
codes. It also opens the possibility of generating reference
documentation from the enum in the future (not part of this PR).
The enum isn't *fully closed* because it uses a `&'static str` for lint
names. This is because we want the flexibility to define lints in
different crates, and all names are only known in `red_knot_linter` or
above. Still, lower-level crates must already reference the lint names
to emit diagnostics. We could define all lint-names in `DiagnosticId`
but I decided against it because:
* We probably want to share the `DiagnosticId` type between Ruff and Red
Knot to avoid extra complexity in the diagnostic crate, and both tools
use different lint names.
* Lints require a lot of extra metadata beyond just the name. That's why
I think defining them close to their implementation is important.
In the long term, we may also want to support plugins, which would make
it impossible to know all lint names at compile time. The next PR in the
stack introduces extra syntax for defining lints.
A closed enum does have a few disadvantages:
* rustc can't help us detect unused diagnostic codes because the enum is
public
* Adding a new diagnostic in the workspace crate now requires changes to
at least two crates: It requires changing the workspace crate to add the
diagnostic and the `ruff_db` crate to define the diagnostic ID. I
consider this an acceptable trade. We may want to move `DiagnosticId` to
its own crate or into a shared `red_knot_diagnostic` crate.
## Preventing duplicate diagnostic identifiers
One goal of this PR is to make it harder to introduce ambiguous
diagnostic IDs, which is achieved by defining a closed enum. However,
the enum isn't fully "closed" because it doesn't explicitly list the IDs
for all lint rules. That leaves the possibility that a lint rule and a
diagnostic ID share the same name.
I made the names unambiguous in this PR by separating them into
different namespaces by using `lint/<rule>` for lint rule codes. I don't
mind the `lint` prefix in a *Ruff next* context, but it is a bit weird
for a standalone type checker. I'd like to not overfocus on this for now
because I see a few different options:
* We remove the `lint` prefix and add a unit test in a top-level crate
that iterates over all known lint rules and diagnostic IDs to ensure the
names are non-overlapping.
* We only render `[lint]` as the error code and add a note to the
diagnostic mentioning the lint rule. This is similar to clippy and has
the advantage that the header line remains short
(`lint/some-long-rule-name` is very long ;))
* Any other form of adjusting the diagnostic rendering to make the
distinction clear
I think we can defer this decision for now because the `DiagnosticId`
contains all the relevant information to change the rendering
accordingly.
## Why `Lint` and not `LintRule`
I see three kinds of diagnostics in Red Knot:
* Non-suppressable: Reveal type, IO errors, configuration errors, etc.
(any `DiagnosticId`)
* Lints: code-related diagnostics that are suppressable.
* Lint rules: The same as lints, but they can be enabled or disabled in
the configuration. The majority of lints in Red Knot and the Ruff
linter.
Our current implementation doesn't distinguish between lints and Lint
rules because we aren't aware of a suppressible code-related lint that
can't be configured in the configuration. The only lint that comes to my
mind is maybe `division-by-zero` if we're 99.99% sure that it is always
right. However, I want to keep the door open to making this distinction
in the future if it proves useful.
Another reason why I chose lint over lint rule (or just rule) is that I
want to leave room for a future lint rule and lint phase concept:
* lint is the *what*: a specific code smell, pattern, or violation
* the lint rule is the *how*: I could see a future `LintRule` trait in
`red_knot_python_linter` that provides the necessary hooks to run as
part of the linter. A lint rule produces diagnostics for exactly one
lint. A lint rule differs from all lints in `red_knot_python_semantic`
because they don't run as "rules" in the Ruff sense. Instead, they're a
side-product of type inference.
* the lint phase is a different form of *how*: A lint phase can produce
many different lints in a single pass. This is a somewhat common pattern
in Ruff where running one analysis collects the necessary information
for finding many different lints
* diagnostic is the *presentation*: Unlike a lint, the diagnostic isn't
the what, but how a specific lint gets presented. I expect that many
lints can use one generic `LintDiagnostic`, but a few lints might need
more flexibility and implement their custom diagnostic rendering (at
least custom `Diagnostic` implementation).
## Test Plan
`cargo test`
## Summary
Per suggestion in
https://github.com/astral-sh/ruff/pull/14802#discussion_r1875455417
This is a bit less error-prone and allows us to handle both expressions
in the current scope or a different scope. Also, there's currently no
need for this method outside of `TypeInferenceBuilder`, so no reason to
expose it in `types.rs`.
## Test Plan
Pure refactor, no functional change; existing tests pass.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
This adds support for `type[a.X]`, where the `type` special form is
applied to a qualified name that resolves to a class literal. This works
for both nested classes and classes imported from another module.
Closes#14545
## Summary
Inferred and declared types for function parameters, in the function
body scope.
Fixes#13693.
## Test Plan
Added mdtests.
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
`typing.Never` and `typing.LiteralString` are only conditionally
exported from `typing` for Python versions 3.11 and later. We run the
Markdown tests with the default Python version of 3.9, so here we change
the import to `typing_extensions` instead, and add a new test to make
sure we'll continue to understand the `typing`-version of these symbols
for newer versions.
This didn't cause problems so far, as we don't understand
`sys.version_info` branches yet.
## Test Plan
New Markdown tests to make sure this will continue to work in the
future.
## Summary
This adds support for specifying the target Python version from a
Markdown test. It is a somewhat limited ad-hoc solution, but designed to
be future-compatible. TOML blocks can be added to arbitrary sections in
the Markdown block. They have the following format:
````markdown
```toml
[tool.knot.environment]
target-version = "3.13"
```
````
So far, there is nothing else that can be configured, but it should be
straightforward to extend this to things like a custom typeshed path.
This is in preparation for the statically-known branches feature where
we are going to have to specify the target version for lots of tests.
## Test Plan
- New Markdown test that fails without the explicitly specified
`target-version`.
- Manually tested various error paths when specifying a wrong
`target-version` field.
- Made sure that running tests is as fast as before.
## Summary
This is related to #13778, more specifically
https://github.com/astral-sh/ruff/issues/13778#issuecomment-2513556004.
This PR adds various test cases where a keyword is being where an
identifier is expected. The tests are to make sure that red knot doesn't
panic, raises the syntax error and the identifier is added to the symbol
table. The final part allows editor related features like renaming the
symbol.
## Summary
`typing_extensions` has a `>=3.13` re-export for the `typing.NoDefault`
singleton, but not for `typing._NoDefaultType`. This causes problems as
soon as we understand `sys.version_info` branches, so we explicity
switch to `typing._NoDefaultType` for Python 3.13 and later.
This is a part of #14759 that I thought might make sense to break out
and merge in isolation.
## Test Plan
New test that will become more meaningful with #12700
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
- Instead of seven (more or less similar) `setup_db` functions, use just
one in a single central place.
- For every test that needs customization beyond that, offer a
`TestDbBuilder` that can control the Python target version, custom
typeshed, and pre-existing files.
The main motivation for this is that we're soon going to need
customization of the Python version, and I didn't feel like adding this
to each of the existing `setup_db` functions.
## Summary
This changeset contains various improvements concerning non-fully-static
types and their relationships:
- Make sure that non-fully-static types do not participate in
equivalence or subtyping.
- Clarify what `Type::is_equivalent_to` actually implements.
- Introduce `Type::is_fully_static`
- New tests making sure that multiple `Any`/`Unknown`s inside unions and
intersections are collapsed.
closes#14524
## Test Plan
- Added new unit tests for union and intersection builder
- Added new unit tests for `Type::is_equivalent_to`
- Added new unit tests for `Type::is_subtype_of`
- Added new property test making sure that non-fully-static types do not
participate in subtyping
We already had a representation for the Any type, which we would use
e.g. for expressions without type annotations. We now recognize
`typing.Any` as a way to refer to this type explicitly. Like other
special forms, this is tracked correctly through aliasing, and isn't
confused with local definitions that happen to have the same name.
Closes#14544
## Summary
Minor change that uses two plain classes `A` and `B` instead of
`typing.Sized` and `typing.Hashable`.
The motivation is twofold: I remember that I was confused when I first
saw this test. Was there anything specific to `Sized` and `Hashable`
that was relevant here? (there is, these classes are not overlapping;
and you can build a proper intersection from them; but that's true for
almost all non-builtin classes).
I now ran into another problem while working on #14758: `Sized` and
`Hashable` are protocols that we don't fully understand yet. This
causing some trouble when trying to infer whether these are fully-static
types or not.
## Summary
This PR adds a new `property_tests` module with quickcheck-based tests
that verify certain properties of types. The following properties are
currently checked:
* `is_equivalent_to`:
* is reflexive: `T` is equivalent to itself
* `is_subtype_of`:
* is reflexive: `T` is a subtype of `T`
* is antisymmetric: if `S <: T` and `T <: S`, then `S` is equivalent to
`T`
* is transitive: `S <: T` & `T <: U` => `S <: U`
* `is_disjoint_from`:
* is irreflexive: `T` is not disjoint from `T`
* is symmetric: `S` disjoint from `T` => `T` disjoint from `S`
* `is_assignable_to`:
* is reflexive
* `negate`:
* is an involution: `T.negate().negate()` is equivalent to `T`
There are also some tests that validate higher-level properties like:
* `S <: T` implies that `S` is not disjoint from `T`
* `S <: T` implies that `S` is assignable to `T`
* A singleton type must also be single-valued
These tests found a few bugs so far:
- #14177
- #14195
- #14196
- #14210
- #14731
Some additional notes:
- Quickcheck-based property tests are non-deterministic and finding
counter-examples might take an arbitrary long time. This makes them bad
candidates for running in CI (for every PR). We can think of running
them in a cron-job way from time to time, similar to fuzzing. But for
now, it's only possible to run them locally (see instructions in source
code).
- Some tests currently find false positive "counterexamples" because our
understanding of equivalence of types is not yet complete. We do not
understand that `int | str` is the same as `str | int`, for example.
These tests are in a separate `property_tests::flaky` module.
- Properties can not be formulated in every way possible, due to the
fact that `is_disjoint_from` and `is_subtype_of` can produce false
negative answers.
- The current shrinking implementation is very naive, which leads to
counterexamples that are very long (`str & Any & ~tuple[Any] &
~tuple[Unknown] & ~Literal[""] & ~Literal["a"] | str & int & ~tuple[Any]
& ~tuple[Unknown]`), requiring the developer to simplify manually. It
has not been a major issue so far, but there is a comment in the code
how this can be improved.
- The tests are currently implemented using a macro. This is a single
commit on top which can easily be reverted, if we prefer the plain code
instead. With the macro:
```rs
// `S <: T` implies that `S` can be assigned to `T`.
type_property_test!(
subtype_of_implies_assignable_to, db,
forall types s, t. s.is_subtype_of(db, t) => s.is_assignable_to(db, t)
);
```
without the macro:
```rs
/// `S <: T` implies that `S` can be assigned to `T`.
#[quickcheck]
fn subtype_of_implies_assignable_to(s: Ty, t: Ty) -> bool {
let db = get_cached_db();
let s = s.into_type(&db);
let t = t.into_type(&db);
!s.is_subtype_of(&*db, t) || s.is_assignable_to(&*db, t)
}
```
## Test Plan
```bash
while cargo test --release -p red_knot_python_semantic --features property_tests types::property_tests; do :; done
```
## Summary
`KnownInstance::instance_fallback` may return instances of supertypes.
For example, it returns an instance of `_SpecialForm` for `Literal`.
This means it can't be used on the right-hand side of `is_subtype_of`
relationships, because it might lead to false positives.
I can lead to false negatives on the left hand side of `is_subtype_of`,
but this is at least a known limitation. False negatives are fine for
most applications, but false positives can lead to wrong results in
intersection-simplification, for example.
closes#14731
## Test Plan
Added regression test
## Summary
Simplify tuples containing `Never` to `Never`:
```py
from typing import Never
def never() -> Never: ...
reveal_type((1, never(), "foo")) # revealed: Never
```
I should note that mypy and pyright do *not* perform this
simplification. I don't know why.
There is [only one
place](5137fcc9c8/crates/red_knot_python_semantic/src/types/infer.rs (L1477-L1484))
where we use `TupleType::new` directly (instead of `Type::tuple`, which
changes behavior here). This appears when creating `TypeVar`
constraints, and it looks to me like it should stay this way, because
we're using `TupleType` to store a list of constraints there, instead of
an actual type. We also store `tuple[constraint1, constraint2, …]` as
the type for the `constraint1, constraint2, …` tuple expression. This
would mean that we infer a type of `tuple[str, Never]` for the following
type variable constraints, without simplifying it to `Never`. This seems
like a weird edge case that's maybe not worth looking further into?!
```py
from typing import Never
# vvvvvvvvvv
def f[T: (str, Never)](x: T):
pass
```
## Test Plan
- Added a new unit test. Did not add additional Markdown tests as that
seems superfluous.
- Tested the example above using red knot, mypy, pyright.
- Verified that this allows us to remove `contains_never` from the
property tests
(https://github.com/astral-sh/ruff/pull/14178#discussion_r1866473192)
Resolves https://github.com/astral-sh/ruff/issues/14547 by delegating
narrowing to `E` for `bool(E)` where `E` is some expression.
This change does not include other builtin class constructors which
should also work in this position, like `int(..)` or `float(..)`, as the
original issue does not mention these. It should be easy enough to add
checks for these as well if we want to.
I don't see a lot of markdown tests for malformed input, maybe there's a
better place for the no args and too many args cases to go?
I did see after the fact that it looks like this task was intended for a
new hire.. my apologies. I got here from
https://github.com/astral-sh/ruff/issues/13694, which is marked
help-wanted.
---------
Co-authored-by: David Peter <mail@david-peter.de>
## Summary
Closes: https://github.com/astral-sh/ruff/issues/14593
The final type of a variable after if-statement without explicit else
branch should be similar to having an explicit else branch.
## Test Plan
Originally failed test cases from the bug are added.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
`bool()` is equal to `False`, and we infer `Literal[False]` for it. Which
means that the test here will fail as soon as we treat the body of
this `if` as unreachable.
## Summary
Fix panics related to expressions without inferred types in invalid
syntax examples like:
```py
x: f"Literal[{1 + 2}]" = 3
```
where the `1 + 2` expression (and its sub-expressions) inside the
annotation did not have an inferred type.
## Test Plan
Added new corpus test.
## Summary
This is about the easiest patch that I can think of. It has a drawback
in that there is no real guarantee this won't happen again. I think this
might be acceptable, given that all of this is a temporary thing.
And we also add a new CI job to prevent regressions like this in the
future.
For the record though, I'm listing alternative approaches I thought of:
- We could get rid of the debug/release distinction and just add `@Todo`
type metadata everywhere. This has possible affects on runtime. The main
reason I didn't follow through with this is that the size of `Type`
increases. We would either have to adapt the `assert_eq_size!` test or
get rid of it. Even if we add messages everywhere and get rid of the
file-and-line-variant in the enum, it's not enough to get back to the
current release-mode size of `Type`.
- We could generally discard `@Todo` meta information when using it in
tests. I think this would be a huge drawback. I like that we can have
the actual messages in the mdtest. And make sure we get the expected
`@Todo` type, not just any `@Todo`. It's also helpful when debugging
tests.
closes#14594
## Test Plan
```rs
cargo nextest run --release
```
Fix#14558
## Summary
- Add `typing.NoReturn` and `typing.Never` to known instances and infer
them as `Type::Never`
- Add `is_assignable_to` cases for `Type::Never`
I skipped emitting diagnostic for when a function is annotated as
`NoReturn` but it actually returns.
## Test Plan
Added tests from
https://github.com/python/typing/blob/main/conformance/tests/specialtypes_never.py
except from generics and checking if the return value of the function
and the annotations match.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
Closes#14588
```py
x: Literal[42, "hello"] = 42 if bool_instance() else "hello"
reveal_type(x) # revealed: Literal[42] | Literal["hello"]
_ = ... if isinstance(x, str) else ...
# The `isinstance` test incorrectly narrows the type of `x`.
# As a result, `x` is revealed as Literal["hello"], but it should remain Literal[42, "hello"].
reveal_type(x) # revealed: Literal["hello"]
```
## Test Plan
mdtest included!
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
This fix addresses panics related to invalid syntax like the following
where a `break` statement is used in a nested definition inside a
loop:
```py
while True:
def b():
x: int
break
```
closes#14342
## Test Plan
* New corpus regression tests.
* New unit test to make sure we handle nested while loops correctly.
This test is passing on `main`, but can easily fail if the
`is_inside_loop` state isn't properly saved/restored.
## Summary
Add support for (non-generic) type aliases. The main motivation behind
this was to get rid of panics involving expressions in (generic) type
aliases. But it turned out the best way to fix it was to implement
(partial) support for type aliases.
```py
type IntOrStr = int | str
reveal_type(IntOrStr) # revealed: typing.TypeAliasType
reveal_type(IntOrStr.__name__) # revealed: Literal["IntOrStr"]
x: IntOrStr = 1
reveal_type(x) # revealed: Literal[1]
def f() -> None:
reveal_type(x) # revealed: int | str
```
## Test Plan
- Updated corpus test allow list to reflect that we don't panic anymore.
- Added Markdown-based test for type aliases (`type_alias.md`)
## Summary
Fixes a panic related to sub-expressions of `typing.Union` where we fail
to store a type for the `int, str` tuple-expression in code like this:
```
x: Union[int, str] = 1
```
relates to [my
comment](https://github.com/astral-sh/ruff/pull/14499#discussion_r1851794467)
on #14499.
## Test Plan
New corpus test
## Summary
Adds meta information to `Type::Todo`, allowing developers to easily
trace back the origin of a particular `@Todo` type they encounter.
Instead of `Type::Todo`, we now write either `type_todo!()` which
creates a `@Todo[path/to/source.rs:123]` type with file and line
information, or using `type_todo!("PEP 604 unions not supported")`,
which creates a variant with a custom message.
`Type::Todo` now contains a `TodoType` field. In release mode, this is
just a zero-sized struct, in order not to create any overhead. In debug
mode, this is an `enum` that contains the meta information.
`Type` implements `Copy`, which means that `TodoType` also needs to be
copyable. This limits the design space. We could intern `TodoType`, but
I discarded this option, as it would require us to have access to the
salsa DB everywhere we want to use `Type::Todo`. And it would have made
the macro invocations less ergonomic (requiring us to pass `db`).
So for now, the meta information is simply a `&'static str` / `u32` for
the file/line variant, or a `&'static str` for the custom message.
Anything involving a chain/backtrace of several `@Todo`s or similar is
therefore currently not implemented. Also because we currently don't see
any direct use cases for this, and because all of this will eventually
go away.
Note that the size of `Type` increases from 16 to 24 bytes, but only in
debug mode.
## Test Plan
- Observed the changes in Markdown tests.
- Added custom messages for all `Type::Todo`s that were revealed in the
tests
- Ran red knot in release and debug mode on the following Python file:
```py
def f(x: int) -> int:
reveal_type(x)
```
Prints `@Todo` in release mode and `@Todo(function parameter type)` in
debug mode.
Fix#14498
## Summary
This PR adds `typing.Union` support
## Test Plan
I created new tests in mdtest.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
closes#14279
### Limitations of the Current Implementation
#### Incorrect Error Propagation
In the current implementation of lexicographic comparisons, if the
result of an Eq operation is Ambiguous, the comparison stops
immediately, returning a bool instance. While this may yield correct
inferences, it fails to capture unsupported-operation errors that might
occur in subsequent comparisons.
```py
class A: ...
(int_instance(), A()) < (int_instance(), A()) # should error
```
#### Weak Inference in Specific Cases
> Example: `(int_instance(), "foo") == (int_instance(), "bar")`
> Current result: `bool`
> Expected result: `Literal[False]`
`Eq` and `NotEq` have unique behavior in lexicographic comparisons
compared to other operators. Specifically:
- For `Eq`, if any non-equal pair exists within the tuples being
compared, we can immediately conclude that the tuples are not equal.
- For `NotEq`, if any equal pair exists, we can conclude that the tuples
are unequal.
```py
a = (str_instance(), int_instance(), "foo")
reveal_type(a == a) # revealed: bool
reveal_type(a != a) # revealed: bool
b = (str_instance(), int_instance(), "bar")
reveal_type(a == b) # revealed: bool # should be Literal[False]
reveal_type(a != b) # revealed: bool # should be Literal[True]
```
#### Incorrect Support for Non-Boolean Rich Comparisons
In CPython, aside from `==` and `!=`, tuple comparisons return a
non-boolean result as-is. Tuples do not convert the value into `bool`.
Note: If all pairwise `==` comparisons between elements in the tuples
return Truthy, the comparison then considers the tuples' lengths.
Regardless of the return type of the dunder methods, the final result
can still be a boolean.
```py
from __future__ import annotations
class A:
def __eq__(self, o: object) -> str:
return "hello"
def __ne__(self, o: object) -> bytes:
return b"world"
def __lt__(self, o: A) -> float:
return 3.14
a = (A(), A())
reveal_type(a == a) # revealed: bool
reveal_type(a != a) # revealed: bool
reveal_type(a < a) # revealed: bool # should be: `float | Literal[False]`
```
### Key Changes
One of the major changes is that comparisons no longer end with a `bool`
result when a pairwise `Eq` result is `Ambiguous`. Instead, the function
attempts to infer all possible cases and unions the results. This
improvement allows for more robust type inference and better error
detection.
Additionally, as the function is now optimized for tuple comparisons,
the name has been changed from the more general
`infer_lexicographic_comparison` to `infer_tuple_rich_comparison`.
## Test Plan
mdtest included
## Summary
Previously, we panicked on expressions like `f"{v:{f'0.2f'}}"` because
we did not infer types for expressions nested inside format spec
elements.
## Test Plan
```
cargo nextest run -p red_knot_workspace -- --ignored linter_af linter_gz
```
## Summary
Add type narrowing for `type(x) is C` conditions (and `else` clauses of
`type(x) is not C` conditionals):
```py
if type(x) is A:
reveal_type(x) # revealed: A
else:
reveal_type(x) # revealed: A | B
```
closes: #14431, part of: #13694
## Test Plan
New Markdown-based tests.
## Summary
This patches up various missing paths where sub-expressions of type
annotations previously had no type attached. Examples include:
```py
tuple[int, str]
# ~~~~~~~~
type[MyClass]
# ~~~~~~~
Literal["foo"]
# ~~~~~
Literal["foo", Literal[1, 2]]
# ~~~~~~~~~~~~~
Literal[1, "a", random.illegal(sub[expr + ession])]
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
## Test Plan
```
cargo nextest run -p red_knot_workspace -- --ignored linter_af linter_gz
```