## Summary
This is related to #13778, more specifically
https://github.com/astral-sh/ruff/issues/13778#issuecomment-2513556004.
This PR adds various test cases where a keyword is being where an
identifier is expected. The tests are to make sure that red knot doesn't
panic, raises the syntax error and the identifier is added to the symbol
table. The final part allows editor related features like renaming the
symbol.
## Summary
`typing_extensions` has a `>=3.13` re-export for the `typing.NoDefault`
singleton, but not for `typing._NoDefaultType`. This causes problems as
soon as we understand `sys.version_info` branches, so we explicity
switch to `typing._NoDefaultType` for Python 3.13 and later.
This is a part of #14759 that I thought might make sense to break out
and merge in isolation.
## Test Plan
New test that will become more meaningful with #12700
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
- Instead of seven (more or less similar) `setup_db` functions, use just
one in a single central place.
- For every test that needs customization beyond that, offer a
`TestDbBuilder` that can control the Python target version, custom
typeshed, and pre-existing files.
The main motivation for this is that we're soon going to need
customization of the Python version, and I didn't feel like adding this
to each of the existing `setup_db` functions.
## Summary
This changeset contains various improvements concerning non-fully-static
types and their relationships:
- Make sure that non-fully-static types do not participate in
equivalence or subtyping.
- Clarify what `Type::is_equivalent_to` actually implements.
- Introduce `Type::is_fully_static`
- New tests making sure that multiple `Any`/`Unknown`s inside unions and
intersections are collapsed.
closes#14524
## Test Plan
- Added new unit tests for union and intersection builder
- Added new unit tests for `Type::is_equivalent_to`
- Added new unit tests for `Type::is_subtype_of`
- Added new property test making sure that non-fully-static types do not
participate in subtyping
We already had a representation for the Any type, which we would use
e.g. for expressions without type annotations. We now recognize
`typing.Any` as a way to refer to this type explicitly. Like other
special forms, this is tracked correctly through aliasing, and isn't
confused with local definitions that happen to have the same name.
Closes#14544
## Summary
Minor change that uses two plain classes `A` and `B` instead of
`typing.Sized` and `typing.Hashable`.
The motivation is twofold: I remember that I was confused when I first
saw this test. Was there anything specific to `Sized` and `Hashable`
that was relevant here? (there is, these classes are not overlapping;
and you can build a proper intersection from them; but that's true for
almost all non-builtin classes).
I now ran into another problem while working on #14758: `Sized` and
`Hashable` are protocols that we don't fully understand yet. This
causing some trouble when trying to infer whether these are fully-static
types or not.
Closes: #14676
I think the consensus generally was to keep the rule as-is, but expand
the docs.
## Summary
Expands the docs for TC006 with an explanation for why the type
expression is always quoted, including mention of another potential
benefit to this style.
When fixing an invalid escape sequence in an f-string, each f-string
element is analyzed for valid escape characters prior to creating the
diagnostic and fix. This allows us to safely prefix with `r` to create a
raw string if no valid escape characters were found anywhere in the
f-string, and otherwise insert backslashes.
This fixes a bug in the original implementation: each "f-string part"
was treated separately, so it was not possible to tell whether a valid
escape character was or would be used elsewhere in the f-string.
Progress towards #11491 but format specifiers are not handled in this
PR.
## Summary
This PR makes changes to the `AIR001` rule as per
https://github.com/astral-sh/ruff/pull/14627#discussion_r1860212307.
Additionally,
* Avoid returning the `Diagnostic` and update the checker in the rule
logic for consistency
* Remove test case for different keyword position (I don't think it's
required here)
## Test Plan
Add test cases for multiple operators from various modules.
## Summary
Just some minor followups to the recently merged RUF052 rule, that was
added in bf0fd04:
- Some small tweaks to the docs
- A minor code-style nit
- Some more tests for my peace of mind, just to check that the new
methods on the semantic model are working correctly
I'm adding the "internal" label as this doesn't deserve a changelog
entry. RUF052 is a new rule that hasn't been released yet.
## Test Plan
`cargo test -p ruff_linter`
## Summary
This PR adds a new `property_tests` module with quickcheck-based tests
that verify certain properties of types. The following properties are
currently checked:
* `is_equivalent_to`:
* is reflexive: `T` is equivalent to itself
* `is_subtype_of`:
* is reflexive: `T` is a subtype of `T`
* is antisymmetric: if `S <: T` and `T <: S`, then `S` is equivalent to
`T`
* is transitive: `S <: T` & `T <: U` => `S <: U`
* `is_disjoint_from`:
* is irreflexive: `T` is not disjoint from `T`
* is symmetric: `S` disjoint from `T` => `T` disjoint from `S`
* `is_assignable_to`:
* is reflexive
* `negate`:
* is an involution: `T.negate().negate()` is equivalent to `T`
There are also some tests that validate higher-level properties like:
* `S <: T` implies that `S` is not disjoint from `T`
* `S <: T` implies that `S` is assignable to `T`
* A singleton type must also be single-valued
These tests found a few bugs so far:
- #14177
- #14195
- #14196
- #14210
- #14731
Some additional notes:
- Quickcheck-based property tests are non-deterministic and finding
counter-examples might take an arbitrary long time. This makes them bad
candidates for running in CI (for every PR). We can think of running
them in a cron-job way from time to time, similar to fuzzing. But for
now, it's only possible to run them locally (see instructions in source
code).
- Some tests currently find false positive "counterexamples" because our
understanding of equivalence of types is not yet complete. We do not
understand that `int | str` is the same as `str | int`, for example.
These tests are in a separate `property_tests::flaky` module.
- Properties can not be formulated in every way possible, due to the
fact that `is_disjoint_from` and `is_subtype_of` can produce false
negative answers.
- The current shrinking implementation is very naive, which leads to
counterexamples that are very long (`str & Any & ~tuple[Any] &
~tuple[Unknown] & ~Literal[""] & ~Literal["a"] | str & int & ~tuple[Any]
& ~tuple[Unknown]`), requiring the developer to simplify manually. It
has not been a major issue so far, but there is a comment in the code
how this can be improved.
- The tests are currently implemented using a macro. This is a single
commit on top which can easily be reverted, if we prefer the plain code
instead. With the macro:
```rs
// `S <: T` implies that `S` can be assigned to `T`.
type_property_test!(
subtype_of_implies_assignable_to, db,
forall types s, t. s.is_subtype_of(db, t) => s.is_assignable_to(db, t)
);
```
without the macro:
```rs
/// `S <: T` implies that `S` can be assigned to `T`.
#[quickcheck]
fn subtype_of_implies_assignable_to(s: Ty, t: Ty) -> bool {
let db = get_cached_db();
let s = s.into_type(&db);
let t = t.into_type(&db);
!s.is_subtype_of(&*db, t) || s.is_assignable_to(&*db, t)
}
```
## Test Plan
```bash
while cargo test --release -p red_knot_python_semantic --features property_tests types::property_tests; do :; done
```
## Summary
`KnownInstance::instance_fallback` may return instances of supertypes.
For example, it returns an instance of `_SpecialForm` for `Literal`.
This means it can't be used on the right-hand side of `is_subtype_of`
relationships, because it might lead to false positives.
I can lead to false negatives on the left hand side of `is_subtype_of`,
but this is at least a known limitation. False negatives are fine for
most applications, but false positives can lead to wrong results in
intersection-simplification, for example.
closes#14731
## Test Plan
Added regression test
## Summary
Simplify tuples containing `Never` to `Never`:
```py
from typing import Never
def never() -> Never: ...
reveal_type((1, never(), "foo")) # revealed: Never
```
I should note that mypy and pyright do *not* perform this
simplification. I don't know why.
There is [only one
place](5137fcc9c8/crates/red_knot_python_semantic/src/types/infer.rs (L1477-L1484))
where we use `TupleType::new` directly (instead of `Type::tuple`, which
changes behavior here). This appears when creating `TypeVar`
constraints, and it looks to me like it should stay this way, because
we're using `TupleType` to store a list of constraints there, instead of
an actual type. We also store `tuple[constraint1, constraint2, …]` as
the type for the `constraint1, constraint2, …` tuple expression. This
would mean that we infer a type of `tuple[str, Never]` for the following
type variable constraints, without simplifying it to `Never`. This seems
like a weird edge case that's maybe not worth looking further into?!
```py
from typing import Never
# vvvvvvvvvv
def f[T: (str, Never)](x: T):
pass
```
## Test Plan
- Added a new unit test. Did not add additional Markdown tests as that
seems superfluous.
- Tested the example above using red knot, mypy, pyright.
- Verified that this allows us to remove `contains_never` from the
property tests
(https://github.com/astral-sh/ruff/pull/14178#discussion_r1866473192)
This PR improves on #14477 by:
- Ensuring user's do not require the module alias "__debug__", which is unassignable
- Validating the linter settings for
`lint.flake8-import-conventions.extend-aliases` (whereas previously we
only did this for `lint.flake8-import-conventions.aliases`).
Closes#14662
Resolves https://github.com/astral-sh/ruff/issues/14547 by delegating
narrowing to `E` for `bool(E)` where `E` is some expression.
This change does not include other builtin class constructors which
should also work in this position, like `int(..)` or `float(..)`, as the
original issue does not mention these. It should be easy enough to add
checks for these as well if we want to.
I don't see a lot of markdown tests for malformed input, maybe there's a
better place for the no args and too many args cases to go?
I did see after the fact that it looks like this task was intended for a
new hire.. my apologies. I got here from
https://github.com/astral-sh/ruff/issues/13694, which is marked
help-wanted.
---------
Co-authored-by: David Peter <mail@david-peter.de>
This PR extends the Decimal parsing used in [verbose-decimal-constructor
(FURB157)](https://docs.astral.sh/ruff/rules/verbose-decimal-constructor/)
to better handle non-finite `Decimal` objects, avoiding some false
negatives.
Closes#14587
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
Seeing the fuzzing results from @dhruvmanila in #13778, I think we can
re-enable these tests. We also had one regression that would have been
caught by these tests, so there is some value in having them enabled.
## Summary
- Check if `hashlib` and `crypt` imports have been seen for `FURB181`
and `S324`
- Mark the fix for `FURB181` as safe: I think it was accidentally marked
as unsafe in the first place. The rule does not support user-defined
classes as the "fix safety" section suggests.
- Removed `hashlib._Hash`, as it's not part of the `hashlib` module.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
Updated the test snapshots
## Summary
Closes: https://github.com/astral-sh/ruff/issues/14593
The final type of a variable after if-statement without explicit else
branch should be similar to having an explicit else branch.
## Test Plan
Originally failed test cases from the bug are added.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
`bool()` is equal to `False`, and we infer `Literal[False]` for it. Which
means that the test here will fail as soon as we treat the body of
this `if` as unreachable.
## Summary
This came up as part of #12927 when implementing
`SemanticModel::simulate_runtime_load`.
Should be fairly self-explanatory, if the scope returns a binding with
`BindingKind::Annotation` the bottom part of the loop gets skipped, so
there's no chance for `seen_function` to have been updated. So unless
there's something subtle going on here, like function scopes never
containing bindings with `BindingKind::Annotation`, this seems like a
bug.
## Test Plan
`cargo nextest run`
## Summary
This PR fixes a bug in the f-string formatting to not consider the
escaped newlines for `is_multiline`. This is done by checking if the
f-string is triple-quoted or not similar to normal string literals.
This is not required to be gated behind preview because the logic change
for `is_multiline` was added in
https://github.com/astral-sh/ruff/pull/14454.
## Test Plan
Add a test case which formats differently on `main`:
https://play.ruff.rs/ea3c55c2-f0fe-474e-b6b8-e3365e0ede5e
## Summary
This PR gets rid of the `requirements.in` and `requirements.txt` files
in the `scripts/fuzz-parser` directory, and replaces them with
`pyproject.toml` and `uv.lock` files. The script is renamed from
`fuzz-parser` to `py-fuzzer` (since it can now also be used to fuzz
red-knot as well as the parser, following
https://github.com/astral-sh/ruff/pull/14566), and moved from the
`scripts/` directory to the `python/` directory, since it's now a
(uv)-pip-installable project in its own right.
I've been resisting this for a while, because conceptually this script
just doesn't feel "complicated" enough to me for it to be a full-blown
package. However, I think it's time to do this. Making it a proper
package has several advantages:
- It means we can run it from the project root using `uv run` without
having to activate a virtual environment and ensure that all required
dependencies are installed into that environment
- Using a `pyproject.toml` file means that we can express that the
project requires Python 3.12+ to run properly; this wasn't possible
before
- I've been running mypy on the project locally when I've been working
on it or reviewing other people's PRs; now I can put the mypy config for
the project in the `pyproject.toml` file
## Test Plan
I manually tested that all the commands detailed in
`python/py-fuzzer/README.md` work for me locally.
---------
Co-authored-by: David Peter <sharkdp@users.noreply.github.com>
## Summary
fixes: #14608
The logic that was only applied for 3.12+ target version needs to be
applied for other versions as well.
## Test Plan
I've moved the existing test cases for 3.12 only to `f_string.py` so
that it's tested against the default target version.
I think we should probably enabled testing for two target version (pre
3.12 and 3.12) but it won't highlight any issue because the parser
doesn't consider this. Maybe we should enable this once we have target
version specific syntax errors in place
(https://github.com/astral-sh/ruff/issues/6591).
## Summary
Fix panics related to expressions without inferred types in invalid
syntax examples like:
```py
x: f"Literal[{1 + 2}]" = 3
```
where the `1 + 2` expression (and its sub-expressions) inside the
annotation did not have an inferred type.
## Test Plan
Added new corpus test.
## Summary
Remove entry that was prevously fixed in
5a30ec0df6.
## Test Plan
```sh
cargo test -p red_knot_workspace -- --ignored linter_af linter_gz
```
## Summary
This is about the easiest patch that I can think of. It has a drawback
in that there is no real guarantee this won't happen again. I think this
might be acceptable, given that all of this is a temporary thing.
And we also add a new CI job to prevent regressions like this in the
future.
For the record though, I'm listing alternative approaches I thought of:
- We could get rid of the debug/release distinction and just add `@Todo`
type metadata everywhere. This has possible affects on runtime. The main
reason I didn't follow through with this is that the size of `Type`
increases. We would either have to adapt the `assert_eq_size!` test or
get rid of it. Even if we add messages everywhere and get rid of the
file-and-line-variant in the enum, it's not enough to get back to the
current release-mode size of `Type`.
- We could generally discard `@Todo` meta information when using it in
tests. I think this would be a huge drawback. I like that we can have
the actual messages in the mdtest. And make sure we get the expected
`@Todo` type, not just any `@Todo`. It's also helpful when debugging
tests.
closes#14594
## Test Plan
```rs
cargo nextest run --release
```
## Summary
fixes: #13813
This PR fixes a bug in the formatting assignment statement when the
value is an f-string.
This is resolved by using custom best fit layouts if the f-string is (a)
not already a flat f-string (thus, cannot be multiline) and (b) is not a
multiline string (thus, cannot be flattened). So, it is used in cases
like the following:
```py
aaaaaaaaaaaaaaaaaa = f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
expression}moreeeeeeeeeeeeeeeee"
```
Which is (a) `FStringLayout::Multiline` and (b) not a multiline.
There are various other examples in the PR diff along with additional
explanation and context as code comments.
## Test Plan
Add multiple test cases for various scenarios.
## Summary
This PR implements new rule discussed
[here](https://github.com/astral-sh/ruff/discussions/14449).
In short, it searches for assert messages which were unintentionally
used as a expression to be matched against.
## Test Plan
`cargo test` and review of `ruff-ecosystem`
Fix#14558
## Summary
- Add `typing.NoReturn` and `typing.Never` to known instances and infer
them as `Type::Never`
- Add `is_assignable_to` cases for `Type::Never`
I skipped emitting diagnostic for when a function is annotated as
`NoReturn` but it actually returns.
## Test Plan
Added tests from
https://github.com/python/typing/blob/main/conformance/tests/specialtypes_never.py
except from generics and checking if the return value of the function
and the annotations match.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
Closes#14588
```py
x: Literal[42, "hello"] = 42 if bool_instance() else "hello"
reveal_type(x) # revealed: Literal[42] | Literal["hello"]
_ = ... if isinstance(x, str) else ...
# The `isinstance` test incorrectly narrows the type of `x`.
# As a result, `x` is revealed as Literal["hello"], but it should remain Literal[42, "hello"].
reveal_type(x) # revealed: Literal["hello"]
```
## Test Plan
mdtest included!
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
This is just a small refactor to remove the `FormatFStringPart` as it's
only used in the case when the f-string is not implicitly concatenated
in which case the only part is going to be `FString`. In implicitly
concatenated f-strings, we use `StringLike` instead.
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Fix#14525
## Test Plan
<!-- How was it tested? -->
New test cases
---------
Signed-off-by: harupy <hkawamura0130@gmail.com>
## Summary
Resolves#14289
The documentation for B028 no_explicit_stacklevel is updated to be more
clear.
---------
Co-authored-by: dylwil3 <dylwil3@gmail.com>
This PR adds a sometimes-available, safe autofix for [unraw-re-pattern
(RUF039)](https://docs.astral.sh/ruff/rules/unraw-re-pattern/#unraw-re-pattern-ruf039),
which prepends an `r` prefix. It is used only when the string in
question has no backslahses (and also does not have a `u` prefix, since
that causes a syntax error.)
Closes#14527
Notes:
- Test fixture unchanged, but snapshot changed to include fix messages.
- This fix is automatically only available in preview since the rule
itself is in preview
## Summary
This fix addresses panics related to invalid syntax like the following
where a `break` statement is used in a nested definition inside a
loop:
```py
while True:
def b():
x: int
break
```
closes#14342
## Test Plan
* New corpus regression tests.
* New unit test to make sure we handle nested while loops correctly.
This test is passing on `main`, but can easily fail if the
`is_inside_loop` state isn't properly saved/restored.
## Summary
Add support for (non-generic) type aliases. The main motivation behind
this was to get rid of panics involving expressions in (generic) type
aliases. But it turned out the best way to fix it was to implement
(partial) support for type aliases.
```py
type IntOrStr = int | str
reveal_type(IntOrStr) # revealed: typing.TypeAliasType
reveal_type(IntOrStr.__name__) # revealed: Literal["IntOrStr"]
x: IntOrStr = 1
reveal_type(x) # revealed: Literal[1]
def f() -> None:
reveal_type(x) # revealed: int | str
```
## Test Plan
- Updated corpus test allow list to reflect that we don't panic anymore.
- Added Markdown-based test for type aliases (`type_alias.md`)
## Summary
Fixes a panic related to sub-expressions of `typing.Union` where we fail
to store a type for the `int, str` tuple-expression in code like this:
```
x: Union[int, str] = 1
```
relates to [my
comment](https://github.com/astral-sh/ruff/pull/14499#discussion_r1851794467)
on #14499.
## Test Plan
New corpus test
## Summary
Adds meta information to `Type::Todo`, allowing developers to easily
trace back the origin of a particular `@Todo` type they encounter.
Instead of `Type::Todo`, we now write either `type_todo!()` which
creates a `@Todo[path/to/source.rs:123]` type with file and line
information, or using `type_todo!("PEP 604 unions not supported")`,
which creates a variant with a custom message.
`Type::Todo` now contains a `TodoType` field. In release mode, this is
just a zero-sized struct, in order not to create any overhead. In debug
mode, this is an `enum` that contains the meta information.
`Type` implements `Copy`, which means that `TodoType` also needs to be
copyable. This limits the design space. We could intern `TodoType`, but
I discarded this option, as it would require us to have access to the
salsa DB everywhere we want to use `Type::Todo`. And it would have made
the macro invocations less ergonomic (requiring us to pass `db`).
So for now, the meta information is simply a `&'static str` / `u32` for
the file/line variant, or a `&'static str` for the custom message.
Anything involving a chain/backtrace of several `@Todo`s or similar is
therefore currently not implemented. Also because we currently don't see
any direct use cases for this, and because all of this will eventually
go away.
Note that the size of `Type` increases from 16 to 24 bytes, but only in
debug mode.
## Test Plan
- Observed the changes in Markdown tests.
- Added custom messages for all `Type::Todo`s that were revealed in the
tests
- Ran red knot in release and debug mode on the following Python file:
```py
def f(x: int) -> int:
reveal_type(x)
```
Prints `@Todo` in release mode and `@Todo(function parameter type)` in
debug mode.
Fix#14498
## Summary
This PR adds `typing.Union` support
## Test Plan
I created new tests in mdtest.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
- Expand some docs where they're unclear about the motivation, or assume
some knowledge that hasn't been introduced yet
- Add more links to external docs
- Rename PYI063 from `PrePep570PositionalArgument` to
`Pep484StylePositionalOnlyParameter`
- Rename the file `parenthesize_logical_operators.rs` to
`parenthesize_chained_operators.rs`, since the rule is called
`ParenthesizeChainedOperators`, not `ParenthesizeLogicalOperators`
## Test Plan
`cargo test`
## Summary
These rules were implemented in January, have been very stable, and have
no open issues about them. They were highly requested by the community
prior to being implemented. Let's stabilise them!
## Test Plan
Ecosystem check on this PR.
## Summary
closes#14279
### Limitations of the Current Implementation
#### Incorrect Error Propagation
In the current implementation of lexicographic comparisons, if the
result of an Eq operation is Ambiguous, the comparison stops
immediately, returning a bool instance. While this may yield correct
inferences, it fails to capture unsupported-operation errors that might
occur in subsequent comparisons.
```py
class A: ...
(int_instance(), A()) < (int_instance(), A()) # should error
```
#### Weak Inference in Specific Cases
> Example: `(int_instance(), "foo") == (int_instance(), "bar")`
> Current result: `bool`
> Expected result: `Literal[False]`
`Eq` and `NotEq` have unique behavior in lexicographic comparisons
compared to other operators. Specifically:
- For `Eq`, if any non-equal pair exists within the tuples being
compared, we can immediately conclude that the tuples are not equal.
- For `NotEq`, if any equal pair exists, we can conclude that the tuples
are unequal.
```py
a = (str_instance(), int_instance(), "foo")
reveal_type(a == a) # revealed: bool
reveal_type(a != a) # revealed: bool
b = (str_instance(), int_instance(), "bar")
reveal_type(a == b) # revealed: bool # should be Literal[False]
reveal_type(a != b) # revealed: bool # should be Literal[True]
```
#### Incorrect Support for Non-Boolean Rich Comparisons
In CPython, aside from `==` and `!=`, tuple comparisons return a
non-boolean result as-is. Tuples do not convert the value into `bool`.
Note: If all pairwise `==` comparisons between elements in the tuples
return Truthy, the comparison then considers the tuples' lengths.
Regardless of the return type of the dunder methods, the final result
can still be a boolean.
```py
from __future__ import annotations
class A:
def __eq__(self, o: object) -> str:
return "hello"
def __ne__(self, o: object) -> bytes:
return b"world"
def __lt__(self, o: A) -> float:
return 3.14
a = (A(), A())
reveal_type(a == a) # revealed: bool
reveal_type(a != a) # revealed: bool
reveal_type(a < a) # revealed: bool # should be: `float | Literal[False]`
```
### Key Changes
One of the major changes is that comparisons no longer end with a `bool`
result when a pairwise `Eq` result is `Ambiguous`. Instead, the function
attempts to infer all possible cases and unions the results. This
improvement allows for more robust type inference and better error
detection.
Additionally, as the function is now optimized for tuple comparisons,
the name has been changed from the more general
`infer_lexicographic_comparison` to `infer_tuple_rich_comparison`.
## Test Plan
mdtest included
## Summary
Previously, we panicked on expressions like `f"{v:{f'0.2f'}}"` because
we did not infer types for expressions nested inside format spec
elements.
## Test Plan
```
cargo nextest run -p red_knot_workspace -- --ignored linter_af linter_gz
```
## Summary
Add type narrowing for `type(x) is C` conditions (and `else` clauses of
`type(x) is not C` conditionals):
```py
if type(x) is A:
reveal_type(x) # revealed: A
else:
reveal_type(x) # revealed: A | B
```
closes: #14431, part of: #13694
## Test Plan
New Markdown-based tests.
## Summary
This patches up various missing paths where sub-expressions of type
annotations previously had no type attached. Examples include:
```py
tuple[int, str]
# ~~~~~~~~
type[MyClass]
# ~~~~~~~
Literal["foo"]
# ~~~~~
Literal["foo", Literal[1, 2]]
# ~~~~~~~~~~~~~
Literal[1, "a", random.illegal(sub[expr + ession])]
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
## Test Plan
```
cargo nextest run -p red_knot_workspace -- --ignored linter_af linter_gz
```
## Summary
Follow-up to #14371, this PR simplifies the visitor logic for list
expressions to remove the state management. We just need to make sure
that we visit the nested expressions using the `QuoteAnnotator` and not
the `Generator`. This is similar to what's being done for binary
expressions.
As per the
[grammar](https://typing.readthedocs.io/en/latest/spec/annotations.html#grammar-token-expression-grammar-annotation_expression),
list expressions can be present which can contain other type expressions
(`Callable`):
```
| <Callable> '[' <Concatenate> '[' (type_expression ',')+
(name | '...') ']' ',' type_expression ']'
(where name must be a valid in-scope ParamSpec)
| <Callable> '[' '[' maybe_unpacked (',' maybe_unpacked)*
']' ',' type_expression ']'
```
## Test Plan
`cargo insta test`
## Summary
Resolves#12616.
## Test Plan
`cargo nextest run` and `cargo insta test`.
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
Resolves#14378.
## Test Plan
`cargo nextest run` and `cargo insta test`.
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
Disable the no-panic tests for the linter corpus, as there are too many
problems right now, requiring linter-contributors to add their test
files to the allow-list.
We can still run the tests using `cargo test -p red_knot_workspace --
--ignored linter_af linter_gz`. This is also why I left the
`crates/ruff_linter/` entries in the allow list for now, even if they
will get out of sync. But let me know if I should rather remove them.
## Summary
Implements `redundant-bool-literal`
## Test Plan
<!-- How was it tested? -->
`cargo test`
The ecosystem results are all correct, but for `Airflow` the rule is not
relevant due to the use of overloading (and is marked as unsafe
correctly).
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
This PR splits the corpus tests into smaller chunks because running all
of them takes 8s on my windows machine and it's by far the longest test
in `red_knot_workspace`.
Splitting the tests has the advantage that they run in parallel. This PR
brings down the wall time from 8s to 4s.
This PR also limits the glob for the linter tests because it's common to
clone cpython into the `ruff_linter/resources/test` folder for
benchmarks (because that's what's written in the contributing guides)
## Test Plan
`cargo test`
## Summary
This PR adds autofix for `redundant-numeric-union` (`PYI041`)
There are some comments below to explain the reasoning behind some
choices that might help review.
<!-- What's the purpose of the change? What does it do, and why? -->
Resolves part of https://github.com/astral-sh/ruff/issues/14185.
## Test Plan
<!-- How was it tested? -->
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
This PR adds corrected handling of list expressions to the `Visitor`
implementation of `QuotedAnnotator` in `flake8_type_checking::helpers`.
Closes#14368
## Summary
- Add 383 files from `crates/ruff_python_parser/resources` to the test
corpus
- Add 1296 files from `crates/ruff_linter/resources` to the test corpus
- Use in-memory file system for tests
- Improve test isolation by cleaning the test environment between checks
- Add a mechanism for "known failures". Mark ~80 files as known
failures.
- The corpus test is now a lot slower (6 seconds).
Note:
While `red_knot` as a command line tool can run over all of these
files without panicking, we still have a lot of test failures caused by
explicitly "pulling" all types.
## Test Plan
Run `cargo test -p red_knot_workspace` while making sure that
- Introducing code that is known to lead to a panic fails the test
- Removing code that is known to lead to a panic from
`KNOWN_FAILURES`-files also fails the test
Fix: #13934
## Summary
Current implementation has a bug when the current annotation contains a
string with single and double quotes.
TL;DR: I think these cases happen less than other use cases of Literal.
So instead of fixing them we skip the fix in those cases.
One of the problematic cases:
```
from typing import Literal
from third_party import Type
def error(self, type1: Type[Literal["'"]]):
pass
```
The outcome is:
```
- def error(self, type1: Type[Literal["'"]]):
+ def error(self, type1: "Type[Literal[''']]"):
```
While it should be:
```
"Type[Literal['\'']"
```
The solution in this case is that we check if there’s any quotes same as
the quote style we want to use for this Literal parameter then escape
that same quote used in the string.
Also this case is not uncommon to have:
<https://grep.app/search?current=2&q=Literal["'>
But this can get more complicated for example in case of:
```
- def error(self, type1: Type[Literal["\'"]]):
+ def error(self, type1: "Type[Literal[''']]"):
```
Here we escaped the inner quote but in the generated annotation it gets
removed. Then we flip the quote style of the Literal paramter and the
formatting is wrong.
In this case the solution is more complicated.
1. When generating the string of the source code preserve the backslash.
2. After we have the annotation check if there isn’t any escaped quote
of the same type we want to use for the Literal parameter. In this case
check if we have any `’` without `\` before them. This can get more
complicated since there can be multiple backslashes so checking for only
`\’` won’t be enough.
Another problem is when the string contains `\n`. In case of
`Type[Literal["\n"]]` we generate `'Type[Literal["\n"]]'` and both
pyright and mypy reject this annotation.
https://pyright-play.net/?code=GYJw9gtgBALgngBwJYDsDmUkQWEMoAySMApiAIYA2AUAMaXkDOjUAKoiQNqsC6AXFAB0w6tQAmJYLBKMYAfQCOAVzCk5tMChjlUjOQCNytANaMGjABYAKRiUrAANLA4BGAQHJ2CLkVIVKnABEADoogTw87gCUfNRQ8VAITIyiElKksooqahpaOih6hiZmTNa29k7w3m5sHJy%2BZFRBoeE8MXEJScxAA
## Test Plan
I added test cases for the original code in the reported issue and two
more cases for backslash and new line.
---------
Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
## Summary
This PR fixes a bug in the Ruff language server where the
editor-specified configuration was resolved relative to the
configuration directory and not the current working directory.
The existing behavior is confusing given that this config file is
specified by the user and is not _discovered_ by Ruff itself. The
behavior of resolving this configuration file should be similar to that
of the `--config` flag on the command-line which uses the current
working directory:
3210f1a23b/crates/ruff/src/resolve.rs (L34-L48)
This creates problems where certain configuration options doesn't work
because the paths resolved in that case are relative to the
configuration directory and not the current working directory in which
the editor is expected to be in. For example, the
`lint.per-file-ignores` doesn't work as mentioned in the linked issue
along with `exclude`, `extend-exclude`, etc.
fixes: #14282
## Test Plan
Using the following directory tree structure:
```
.
├── .config
│ └── ruff.toml
└── src
└── migrations
└── versions
└── a.py
```
where, the `ruff.toml` is:
```toml
# 1. Comment this out to test `per-file-ignores`
extend-exclude = ["**/versions/*.py"]
[lint]
select = ["D"]
# 2. Comment this out to test `extend-exclude`
[lint.per-file-ignores]
"**/versions/*.py" = ["D"]
# 3. Comment both `per-file-ignores` and `extend-exclude` to test selection works
```
And, the content of `a.py`:
```py
"""Test"""
```
And, the VS Code settings:
```jsonc
{
"ruff.nativeServer": "on",
"ruff.path": ["/Users/dhruv/work/astral/ruff/target/debug/ruff"],
// For single-file mode where current working directory is `/`
// "ruff.configuration": "/tmp/ruff-repro/.config/ruff.toml",
// When a workspace is opened containing this path
"ruff.configuration": "./.config/ruff.toml",
"ruff.trace.server": "messages",
"ruff.logLevel": "trace"
}
```
I also tested out just opening the file in single-file mode where the
current working directory is `/` in VS Code. Here, the
`ruff.configuration` needs to be updated to use absolute path as shown
in the above VS Code settings.
## Summary
This PR adds support for parsing and inferring types within string
annotations.
### Implementation (attempt 1)
This is preserved in
6217f48924.
The implementation here would separate the inference of string
annotations in the deferred query. This requires the following:
* Two ways of evaluating the deferred definitions - lazily and eagerly.
* An eager evaluation occurs right outside the definition query which in
this case would be in `binding_ty` and `declaration_ty`.
* A lazy evaluation occurs on demand like using the
`definition_expression_ty` to determine the function return type and
class bases.
* The above point means that when trying to get the binding type for a
variable in an annotated assignment, the definition query won't include
the type. So, it'll require going through the deferred query to get the
type.
This has the following limitations:
* Nested string annotations, although not necessarily a useful feature,
is difficult to implement unless we convert the implementation in an
infinite loop
* Partial string annotations require complex layout because inferring
the types for stringified and non-stringified parts of the annotation
are done in separate queries. This means we need to maintain additional
information
### Implementation (attempt 2)
This is the final diff in this PR.
The implementation here does the complete inference of string annotation
in the same definition query by maintaining certain state while trying
to infer different parts of an expression and take decisions
accordingly. These are:
* Allow names that are part of a string annotation to not exists in the
symbol table. For example, in `x: "Foo"`, if the "Foo" symbol is not
defined then it won't exists in the symbol table even though it's being
used. This is an invariant which is being allowed only for symbols in a
string annotation.
* Similarly, lookup name is updated to do the same and if the symbol
doesn't exists, then it's not bounded.
* Store the final type of a string annotation on the string expression
itself and not for any of the sub-expressions that are created after
parsing. This is because those sub-expressions won't exists in the
semantic index.
Design document:
https://www.notion.so/astral-sh/String-Annotations-12148797e1ca801197a9f146641e5b71?pvs=4Closes: #13796
## Test Plan
* Add various test cases in our markdown framework
* Run `red_knot` on LibCST (contains a lot of string annotations,
specifically
https://github.com/Instagram/LibCST/blob/main/libcst/matchers/_matcher_base.py),
FastAPI (good amount of annotated code including `typing.Literal`) and
compare against the `main` branch output
## Summary
Add a typed representation of function signatures (parameters and return
type) and infer it correctly from a function.
Convert existing usage of function return types to use the signature
representation.
This does not yet add inferred types for parameters within function body
scopes based on the annotations, but it should be easy to add as a next
step.
Part of #14161 and #13693.
## Test Plan
Added tests.
Follow-up to #14287 : when checking that `name` is the same as `as_name`
in `import name as as_name`, we do not need to first do an early return
if `'.'` is found in `name`.
This PR handles a panic that occurs when applying unsafe fixes if a user
inserts a required import (I002) that has a "useless alias" in it, like
`import numpy as numpy`, and also selects PLC0414 (useless-import-alias)
In this case, the fixes alternate between adding the required import
statement, then removing the alias, until the recursion limit is
reached. See linked issue for an example.
Closes#14283
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
This fixes several panics related to invalid assignment targets. All of
these led to some a crash, previously:
```py
(x.y := 1) # only name-expressions are valid targets of named expressions
([x, y] := [1, 2]) # same
(x, y): tuple[int, int] = (2, 3) # tuples are not valid targets for annotated assignments
(x, y) += 2 # tuples are not valid targets for augmented assignments
```
closes#14321closes#14322
## Test Plan
I symlinked four files from `crates/ruff_python_parser/resources` into
the red knot corpus, as they seemed like ideal test files for this exact
scenario. I think eventually, it might be a good idea to simply include *all*
invalid-syntax examples from the parser tests into red knots corpus (I believe
we're actually not too far from that goal). Or expand the scope of the corpus
test to this directory. Then we can get rid of these symlinks again.
## Summary
This avoids a panic inside `TypeInferenceBuilder::infer_type_parameters`
when encountering generic type aliases:
```py
type ListOrSet[T] = list[T] | set[T]
```
To fix this properly, we would have to treat type aliases as being their own
annotation scope [1]. The left hand side is a definition for the type parameter
`T` which is being used in the special annotation scope on the right hand side.
Similar to how it works for generic functions and classes.
[1] https://docs.python.org/3/reference/compound_stmts.html#generic-type-aliasescloses#14307
## Test Plan
Added new example to the corpus.
When we look up the types of class bases or keywords (`metaclass`), we
currently do this little dance: if there are type params, then look up
the type using `SemanticModel` in the type-params scope, if not, look up
the type directly in the definition's own scope, with support for
deferred types.
With inference of function parameter types, I'm now adding another case
of this same dance, so I'm motivated to make it a bit more ergonomic.
Add support to `definition_expression_ty` to handle any sub-expression
of a definition, whether it is in the definition's own scope or in a
type-params sub-scope.
Related to both #13693 and #14161.
## Summary
Use the memory address to uniquely identify AST nodes, instead of
relying on source range and kind. The latter fails for ASTs resulting
from invalid syntax examples. See #14313 for details.
Also results in a 1-2% speedup
(https://codspeed.io/astral-sh/ruff/runs/67349cf55f36b36baa211360)
closes#14313
## Review
Here are the places where we use `NodeKey` directly or indirectly (via
`ExpressionNodeKey` or `DefinitionNodeKey`):
```rs
// semantic_index.rs
pub(crate) struct SemanticIndex<'db> {
// [...]
/// Map expressions to their corresponding scope.
scopes_by_expression: FxHashMap<ExpressionNodeKey, FileScopeId>,
/// Map from a node creating a definition to its definition.
definitions_by_node: FxHashMap<DefinitionNodeKey, Definition<'db>>,
/// Map from a standalone expression to its [`Expression`] ingredient.
expressions_by_node: FxHashMap<ExpressionNodeKey, Expression<'db>>,
// [...]
}
// semantic_index/builder.rs
pub(super) struct SemanticIndexBuilder<'db> {
// [...]
scopes_by_expression: FxHashMap<ExpressionNodeKey, FileScopeId>,
definitions_by_node: FxHashMap<ExpressionNodeKey, Definition<'db>>,
expressions_by_node: FxHashMap<ExpressionNodeKey, Expression<'db>>,
}
// semantic_index/ast_ids.rs
pub(crate) struct AstIds {
/// Maps expressions to their expression id.
expressions_map: FxHashMap<ExpressionNodeKey, ScopedExpressionId>,
/// Maps expressions which "use" a symbol (that is, [`ast::ExprName`]) to a use id.
uses_map: FxHashMap<ExpressionNodeKey, ScopedUseId>,
}
pub(super) struct AstIdsBuilder {
expressions_map: FxHashMap<ExpressionNodeKey, ScopedExpressionId>,
uses_map: FxHashMap<ExpressionNodeKey, ScopedUseId>,
}
```
## Test Plan
Added two failing examples to the corpus.
## Summary
Fixes a failing debug assertion that triggers for the following code:
```py
match some_int:
case x:=2:
pass
```
closes#14305
## Test Plan
Added problematic code example to corpus.
## Summary
Resolves#13217.
## Test Plan
`cargo nextest run` and `cargo insta test`.
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
This PR improves the fix for `PYI055` to be able to handle nested and
mixed type unions.
It also marks the fix as unsafe when comments are present.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
<!-- How was it tested? -->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
`pytest-raises-too-broad (PT011)` should be raised when
`expected_exception` is provided as a keyword argument.
```python
def test_foo():
with pytest.raises(ValueError): # raises PT011
raise ValueError("Can't divide 1 by 0")
# This is minor but a valid pytest.raises call
with pytest.raises(expected_exception=ValueError): # doesn't raise PT011 but should
raise ValueError("Can't divide 1 by 0")
```
`pytest.raises` doc:
https://docs.pytest.org/en/8.3.x/reference/reference.html#pytest.raises
## Test Plan
<!-- How was it tested? -->
Unit tests
Signed-off-by: harupy <hkawamura0130@gmail.com>
## Summary
- Emit diagnostics when looking up (possibly) unbound attributes
- More explicit test assertions for unbound symbols
- Review remaining call sites of `Symbol::ignore_possibly_unbound`. Most
of them are something like `builtins_symbol(self.db,
"Ellipsis").ignore_possibly_unbound().unwrap_or(Type::Unknown)` which
look okay to me, unless we want to emit additional diagnostics. There is
one additional case in enum literal handling, which has a TODO comment
anyway.
part of #14022
## Test Plan
New MD tests for (possibly) unbound attributes.
## Summary
This adds a new diagnostic when possibly unbound symbols are imported.
The `TODO` comment had a question mark, do I'm not sure if this is
really something that we want.
This does not touch the un*declared* case, yet.
relates to: #14022
## Test Plan
Updated already existing tests with new diagnostics
## Summary
Apart from one small functional change, this is mostly a refactoring of
the `Symbol` API:
- Rename `as_type` to the more explicit `ignore_possibly_unbound`, no
functional change
- Remove `unwrap_or_unknown` in favor of the more explicit
`.ignore_possibly_unbound().unwrap_or(Type::Unknown)`, no functional
change
- Consistently call it "possibly unbound" (not "may be unbound")
- Rename `replace_unbound_with` to `or_fall_back_to` and properly handle
boundness of the fall back. This is the only functional change (did not
have any impact on existing tests).
relates to: #14022
## Test Plan
New unit tests for `Symbol::or_fall_back_to`
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Related to #970. Implement [`shallow-copy-environ /
W1507`](https://pylint.readthedocs.io/en/stable/user_guide/messages/warning/shallow-copy-environ.html).
## Test Plan
<!-- How was it tested? -->
Unit test
---------
Co-authored-by: Simon Brugman <sbrugman@users.noreply.github.com>
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
The implicit namespace package rule currently fails to detect cases like
the following:
```text
foo/
├── __init__.py
└── bar/
└── baz/
└── __init__.py
```
The problem is that we detect a root at `foo`, and then an independent
root at `baz`. We _would_ detect that `bar` is an implicit namespace
package, but it doesn't contain any files! So we never check it, and
have no place to raise the diagnostic.
This PR adds detection for these kinds of nested packages, and augments
the `INP` rule to flag the `__init__.py` file above with a specialized
message. As a side effect, I've introduced a dedicated `PackageRoot`
struct which we can pass around in lieu of Yet Another `Path`.
For now, I'm only enabling this in preview (and the approach doesn't
affect any other rules). It's a bug fix, but it may end up expanding the
rule.
Closes https://github.com/astral-sh/ruff/issues/13519.
## Summary
It's only safe to enforce the `x in "1234567890"` case if `x` is exactly
one character, since the set on the right has been reordered as compared
to `string.digits`. We can't know if `x` is exactly one character unless
it's a literal. And if it's a literal, well, it's kind of silly code in
the first place?
Closes https://github.com/astral-sh/ruff/issues/13802.
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Fix `await-outside-async` to allow `await` at the top-level scope of a
notebook.
```python
# foo.ipynb
await asyncio.sleep(1) # should be allowed
```
## Test Plan
<!-- How was it tested? -->
A unit test
## Summary
Resolves#13833.
## Test Plan
`cargo nextest run` and `cargo insta test`.
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
This PR accounts for further subtleties in `Decimal` parsing:
- Strings which are empty modulo underscores and surrounding whitespace
are skipped
- `Decimal("-0")` is skipped
- `Decimal("{integer literal that is longer than 640 digits}")` are
skipped (see linked issue for explanation)
NB: The snapshot did not need to be updated since the new test cases are
"Ok" instances and added below the diff.
Closes#14204
## Summary
Create definitions and infer types for PEP 695 type variables.
This just gives us the type of the type variable itself (the type of `T`
as a runtime object in the body of `def f[T](): ...`), with special
handling for its attributes `__name__`, `__bound__`, `__constraints__`,
and `__default__`. Mostly the support for these attributes exists
because it is easy to implement and allows testing that we are
internally representing the typevar correctly.
This PR doesn't yet have support for interpreting a typevar as a type
annotation, which is of course the primary use of a typevar. But the
information we store in the typevar's type in this PR gives us
everything we need to handle it correctly in a future PR when the
typevar appears in an annotation.
## Test Plan
Added mdtest.
## Summary
`Ty::BuiltinClassLiteral(…)` is a sub~~class~~type of
`Ty::BuiltinInstance("type")`, so it can't be disjoint from it.
## Test Plan
New `is_not_disjoint_from` test case
## Summary
Fix `Type::is_assignable_to` for union types on the left hand side (of
`.is_assignable_to`; or the right hand side of the `… = …` assignment):
`Literal[1, 2]` should be assignable to `int`.
## Test Plan
New unit tests that were previously failing.
## Summary
Minor fix to `Type::is_subtype_of` to make sure that Boolean literals
are subtypes of `int`, to match runtime semantics.
Found this while doing some property-testing experiments [1].
[1] https://github.com/astral-sh/ruff/pull/14178
## Test Plan
New unit test.
## Summary
Fixes#14114. I don't think I can really describe the problems with our
current architecture (and therefore the motivations for this PR) any
better than @carljm did in that issue, so I'll just copy it out here!
---
We currently represent "known instances" (e.g. special forms like
`typing.Literal`, which are an instance of `typing._SpecialForm`, but
need to be handled differently from other instances of
`typing._SpecialForm`) as an `InstanceType` with a `known` field that is
`Some(...)`.
This makes it easy to handle a known instance as if it were a regular
instance type (by ignoring the `known` field), and in some cases (e.g.
`Type::member`) that is correct and convenient. But in other cases (e.g.
`Type::is_equivalent_to`) it is not correct, and we currently have a bug
that we would consider the known-instance type of `typing.Literal` as
equivalent to the general instance type for `typing._SpecialForm`, and
we would fail to consider it a singleton type or a single-valued type
(even though it is both.)
An instance type with `known.is_some()` is semantically quite different
from an instance type with `known.is_none()`. The former is a singleton
type that represents exactly one runtime object; the latter is an open
type that represents many runtime objects, including instances of
unknown subclasses. It is too error-prone to represent these
very-different types as a single `Type` variant. We should instead
introduce a dedicated `Type::KnownInstance` variant and force ourselves
to handle these explicitly in all `Type` variant matches.
## Possible followups
There is still a little bit of awkwardness in our current design in some
places, in that we first infer the symbol `typing.Literal` as a
`_SpecialForm` instance, and then later convert that instance-type into
a known-instance-type. We could also use this `KnownInstanceType` enum
to account for other special runtime symbols such as `builtins.Ellipsis`
or `builtins.NotImplemented`.
I think these might be worth pursuing, but I didn't do them here as they
didn't seem essential right now, and I wanted to keep the diff
relatively minimal.
## Test Plan
`cargo test -p red_knot_python_semantic`. New unit tests added for
`Type::is_subtype_of`.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
This adds type inference for comparison expressions involving
intersection types.
For example:
```py
x = get_random_int()
if x != 42:
reveal_type(x == 42) # revealed: Literal[False]
reveal_type(x == 43) # bool
```
closes#13854
## Test Plan
New Markdown-based tests.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
- Get rid of `Symbol::unwrap_or` (unclear semantics, not needed anymore)
- Introduce `Type::call_dunder`
- Emit new diagnostic for possibly-unbound `__iter__` methods
- Better diagnostics for callables with possibly-unbound /
possibly-non-callable `__call__` methods
part of: #14022closes#14016
## Test Plan
- Updated test for iterables with possibly-unbound `__iter__` methods.
- New tests for callables
## Summary
- Adds basic support for `type[C]` as a red knot `Type`. Some things
might not be supported yet, like `type[Any]`.
- Adds type narrowing for `issubclass` checks.
closes#14117
## Test Plan
New Markdown-based tests
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
Implementation for one of the rules in
https://github.com/astral-sh/ruff/issues/1348
Refurb only deals only with classes with a single base, however the rule
is valid for any base.
(`str, Enum` is common prior to `StrEnum`)
## Test Plan
`cargo test`
---------
Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
Flake8-builtins provides two checks for arguments (really, parameters)
of a function shadowing builtins: A002 checks function definitions, and
A006 checks lambda expressions. This PR ensures that A002 is restricted
to functions rather than lambda expressions.
Closes#14135 .
## Summary
I mirrored some of the idioms that @AlexWaygood used in the MRO work.
Closes https://github.com/astral-sh/ruff/issues/14096.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
Related to
https://github.com/astral-sh/ruff/pull/13979#discussion_r1828305790,
this PR removes the `current_unpack` state field from
`SemanticIndexBuilder` and passes the `Unpack` ingredient via the
`CurrentAssignment` -> `DefinitionNodeRef` conversion to finally store
it on `DefintionNodeKind`.
This involves updating the lifetime of `AnyParameterRef` (parameter to
`declare_parameter`) to use the `'db` lifetime. Currently, all AST nodes
stored on various enums are marked with `'a` lifetime but they're always
utilized using the `'db` lifetime.
This also removes the dedicated `'a` lifetime parameter on
`add_definition` which is currently being used in `DefinitionNodeRef`.
As mentioned, all AST nodes live through the `'db` lifetime so we can
remove the `'a` lifetime parameter from that method and use the `'db`
lifetime instead.
FURB157 suggests replacing expressions like `Decimal("123")` with
`Decimal(123)`. This PR extends the rule to cover cases where the input
string to `Decimal` can be easily transformed into an integer literal.
For example:
```python
Decimal("1__000") # fix: `Decimal(1000)`
```
Note: we do not implement the full decimal parsing logic from CPython on
the grounds that certain acceptable string inputs to the `Decimal`
constructor may be presumed purposeful on the part of the developer. For
example, as in the linked issue, `Decimal("١٢٣")` is valid and equal to
`Decimal(123)`, but we do not suggest a replacement in this case.
Closes#13807
## Summary
- Store the expression type for annotations that are starred expressions
(see [discussion
here](https://github.com/astral-sh/ruff/pull/14091#discussion_r1828332857))
- Use `self.store_expression_type(…)` consistently throughout, as it
makes sure that no double-insertion errors occur.
closes#14115
## Test Plan
Added an invalid-syntax example to the corpus which leads to a panic on
`main`. Also added a Markdown test with a valid-syntax example that
would lead to a panic once we implement function parameter inference.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
Adds more precise type inference for `… is …` and `… is not …` identity
checks in some limited cases where we statically know the answer to be
either `Literal[True]` or `Literal[False]`.
I found this helpful while working on type inference for comparisons
involving intersection types, but I'm not sure if this is at all useful
for real world code (where the answer is most probably *not* statically
known). Note that we already have *type narrowing* for identity tests.
So while we are already able to generate constraints for things like `if
x is None`, we can now — in some limited cases — make an even stronger
conclusion and infer that the test expression itself is `Literal[False]`
(branch never taken) or `Literal[True]` (branch always taken).
## Test Plan
New Markdown tests
Handling `Literal` type in annotations.
Resolves: #13672
## Implementation
Since Literals are not a fully defined type in typeshed. I used a trick
to figure out when a special form is a literal.
When we are inferring assignment types I am checking if the type of that
assignment was resolved to typing.SpecialForm and the name of the target
is `Literal` if that is the case then I am re creating a new instance
type and set the known instance field to `KnownInstance:Literal`.
**Why not defining a new type?**
From this [issue](https://github.com/python/typeshed/issues/6219) I
learned that we want to resolve members to SpecialMethod class. So if we
create a new instance here we can rely on the member resolving in that
already exists.
## Tests
https://typing.readthedocs.io/en/latest/spec/literal.html#equivalence-of-two-literals
Since the type of the value inside Literal is evaluated as a
Literal(LiteralString, LiteralInt, ...) then the equality is only true
when types and value are equal.
https://typing.readthedocs.io/en/latest/spec/literal.html#legal-and-illegal-parameterizations
The illegal parameterizations are mostly implemented I'm currently
checking the slice expression and the slice type to make sure it's
valid.
https://typing.readthedocs.io/en/latest/spec/literal.html#shortening-unions-of-literals
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
This PR enables red-knot to support type narrowing based on `and` and
`or` conditionals, including nested combinations and their negation (for
`elif` / `else` blocks and for `not` operator). Part of #13694.
In order to address this properly (hopefully 😅), I had to run
`NarrowingConstraintsBuilder` functions recursively. In the first commit
I introduced a minor refactor - instead of mutating `self.constraints`,
the new constraints are now returned as function return values. I also
modified the constraints map to be optional, preventing unnecessary
hashmap allocations.
Thanks @carljm for your support on this :)
The second commit contains the logic and tests for handling boolean ops,
with improvements to intersections handling in `is_subtype_of` .
As I'm still new to Rust and the internals of type checkers, I’d be more
than happy to hear any insights or suggestions.
Thank you!
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
Encountered this while running red-knot benchmarks on the `black`
codebase.
Fixes two of the issues in #13478.
## Test Plan
Added a regression test.
## Summary
Removes `Type::None` in favor of `KnownClass::NoneType.to_instance(…)`.
closes#13670
## Performance
There is a -4% performance regression on our red-knot benchmark. This is due to the fact that we now have to import `_typeshed` as a module, and infer types.
## Test Plan
Existing tests pass.
## Summary
This PR adds a new salsa query and an ingredient to resolve all the
variables involved in an unpacking assignment like `(a, b) = (1, 2)` at
once. Previously, we'd recursively try to match the correct type for
each definition individually which will result in creating duplicate
diagnostics.
This PR still doesn't solve the duplicate diagnostics issue because that
requires a different solution like using salsa accumulator or
de-duplicating the diagnostics manually.
Related: #13773
## Test Plan
Make sure that all unpack assignment test cases pass, there are no
panics in the corpus tests.
## Todo
- [x] Look at the performance regression
## Summary
The `commented-out-code` rule (ERA001) from `eradicate` is currently
flagging a very common idiom that marks Python strings as another
language, to help with syntax highlighting:

This PR adds this idiom to the list of allowed exceptions to the rule.
## Test Plan
I've added some additional test cases.
## Summary
Like https://github.com/astral-sh/ruff/pull/14063, but ensures that we
catch cases like `{1, True}` in which the items hash to the same value
despite not being identical.
## Summary
Closes https://github.com/astral-sh/ruff/issues/13944
## Test Plan
Standard snapshot testing
flake8-simplify surprisingly only has a single test case
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
Removing more TODOs from the augmented assignment test suite. Now, if
the _target_ is a union, we correctly infer the union of results:
```python
if flag:
f = Foo()
else:
f = 42.0
f += 12
```
## Summary
One of the follow-ups from augmented assignment inference, now that
`Type::Unbound` has been removed.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
- Remove `Type::Unbound`
- Handle (potential) unboundness as a concept orthogonal to the type
system (see new `Symbol` type)
- Improve existing and add new diagnostics related to (potential)
unboundness
closes#13671
## Test Plan
- Update existing markdown-based tests
- Add new tests for added/modified functionality
## Summary
This PR fixes a panic which can occur in an unpack assignment when:
* (number of target expressions) - (number of tuple types) > 2
* There's a starred expression
The reason being that the `insert` panics because the index is greater
than the length.
This is an error case and so practically it should occur very rarely.
The solution is to resize the types vector to match the number of
expressions and then insert the starred expression type.
## Test Plan
Add a new test case.
## Summary
This PR creates a new `TypeCheckDiagnosticsBuilder` for the
`TypeCheckDiagnostics` struct. The main motivation behind this is to
separate the helpers required to build the diagnostics from the type
inference builder itself. This allows us to use such helpers outside of
the inference builder like for example in the unpacking logic in
https://github.com/astral-sh/ruff/pull/13979.
## Test Plan
`cargo insta test`
## Summary
These cases aren't handled correctly yet -- some of them are waiting on
refactors to `Unbound` before fixing. Part of #12699.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
I noticed that augmented assignments on floats were yielding "not
supported" diagnostics. If the dunder isn't bound at all, we should use
binary operator semantics, rather than treating it as not-callable.
## Summary
Minor follow-up to #13917 — thanks @AlexWaygood for the post-merge
review.
- Add
SliceLiteralType::as_tuple
- Use .expect() instead of SAFETY
comment
- Match on ::try_from
result
- Add TODO comment regarding raising a diagnostic for `"foo"["bar":"baz"]`
## Summary
This PR adds support for heterogenous `tuple` annotations to red-knot.
It does the following:
- Extends `infer_type_expression` so that it understands tuple
annotations
- Changes `infer_type_expression` so that `ExprStarred` nodes in type
annotations are inferred as `Todo` rather than `Unknown` (they're valid
in PEP-646 tuple annotations)
- Extends `Type::is_subtype_of` to understand when one heterogenous
tuple type can be understood to be a subtype of another (without this
change, the PR would have introduced new false-positive errors to some
existing mdtests).
## Summary
- Add a new `Type::SliceLiteral` variant
- Infer `SliceLiteral` types for slice expressions, such as
`<int-literal>:<int-literal>:<int-literal>`.
- Infer "sliced" literal types for subscript expressions using slices,
such as `<string-literal>[<slice-literal>]`.
- Infer types for expressions involving slices of tuples:
`<tuple>[<slice-literal>]`.
closes#13853
## Test Plan
- Unit tests for indexing/slicing utility functions
- Markdown-based tests for
- Subscript expressions `tuple[slice]`
- Subscript expressions `string_literal[slice]`
- Subscript expressions `bytes_literal[slice]`
## Summary
This does two things:
- distribute negated intersections when building up intersections (i.e.
going from `A & ~(B & C)` to `(A & ~B) | (A & ~C)`) (fixing #13931)
## Test Plan
`cargo test`
## Summary
This PR adds type narrowing in `and` and `or` expressions, for example:
```py
class A: ...
x: A | None = A() if bool_instance() else None
isinstance(x, A) or reveal_type(x) # revealed: None
```
## Test Plan
New mdtests 😍
## Summary
After #13918 has landed, narrowing constraint negation became easy, so
adding support for `not` operator.
## Test Plan
Added a new mdtest file for `not` expression.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
As python uses short-circuiting boolean operations in runtime, we should
mimic that logic in redknot as well.
For example, we should detect that in the following code `x` might be
undefined inside the block:
```py
if flag or (x := 1):
print(x)
```
## Test Plan
Added mdtest suit for boolean expressions.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
Add support for type narrowing in elif and else scopes as part of
#13694.
## Test Plan
- mdtest
- builder unit test for union negation.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
`ruff check` has not been the default in a long time. However, the help
message and code comment still designate it as the default. The remark
should have been removed in the deprecation PR #10169.
## Test Plan
Not tested.
## Summary
Add type narrowing for `isinstance(object, classinfo)` [1] checks:
```py
x = 1 if flag else "a"
if isinstance(x, int):
reveal_type(x) # revealed: Literal[1]
```
closes#13893
[1] https://docs.python.org/3/library/functions.html#isinstance
## Test Plan
New Markdown-based tests in `narrow/isinstance.md`.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
This PR updates the fix generation logic for auto-quoting an annotation
to generate an edit even when there's a quote character present.
The logic uses the visitor pattern, maintaining it's state on where it
is and generating the string value one node at a time. This can be
considered as a specialized form of `Generator`. The state required to
maintain is whether we're currently inside a `typing.Literal` or
`typing.Annotated` because the string value in those types should not be
un-quoted i.e., `Generic[Literal["int"]]` should become
`"Generic[Literal['int']]`, the quotes inside the `Literal` should be
preserved.
Fixes: https://github.com/astral-sh/ruff/issues/9137
## Test Plan
Add various test cases to validate this change, validate the snapshots.
There are no ecosystem changes to go through.
---------
Signed-off-by: Shaygan <hey@glyphack.com>
Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
## Summary
A minor quality-of-life improvement: add
[`#[track_caller]`](https://doc.rust-lang.org/reference/attributes/codegen.html#the-track_caller-attribute)
attribute to `Type::expect_xyz()` methods and some `TypeInference` methods such that the panic-location
is reported one level higher up in the stack trace.
before: reports location inside the `Type::expect_class_literal()`
method. Not very useful.
```
thread 'types::infer::tests::deferred_annotation_builtin' panicked at crates/red_knot_python_semantic/src/types.rs:304:14:
Expected a Type::ClassLiteral variant
```
after: reports location at the `Type::expect_class_literal()` call site,
where the error was made.
```
thread 'types::infer::tests::deferred_annotation_builtin' panicked at crates/red_knot_python_semantic/src/types/infer.rs:4302:14:
Expected a Type::ClassLiteral variant
```
## Test Plan
Called `expect_class_literal()` on something that's not a
`Type::ClassLiteral` and saw that the error was reported at the call
site.
## Summary
* Rename `Type::Class` => `Type::ClassLiteral`
* Rename `Type::Function` => `Type::FunctionLiteral`
* Do not rename `Type::Module`
* Remove `*Literal` suffixes in `display::LiteralTypeKind` variants, as
per clippy suggestion
* Get rid of `Type::is_class()` in favor of `is_subtype_of(…, 'type')`;
modifiy `is_subtype_of` to support this.
* Add new `Type::is_xyz()` methods and use them instead of matching on
`Type` variants.
closes#13863
## Test Plan
New `is_subtype_of_class_literals` unit test.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
- Properly treat the empty intersection as being of type `object`.
- Consequently, change the simplification method to explicitly add
`Never` to the positive side of the intersection when collapsing a type
such as `int & str` to `Never`, as opposed to just clearing both the
positive and the negative side.
- Minor code improvement in `bindings_ty`: use `peekable()` to check
whether the iterator over constraints is empty, instead of handling
first and subsequent elements separately.
fixes#13870
## Test Plan
- New unit tests for `IntersectionBuilder` to make sure the empty
intersection represents `object`.
- Markdown-based regression test for the original issue in #13870
Add the following subtype relations:
- `BooleanLiteral <: object`
- `IntLiteral <: object`
- `StringLiteral <: object`
- `LiteralString <: object`
- `BytesLiteral <: object`
Added a test case for `bool <: int`.
## Test Plan
New unit tests.
Add type narrowing for `!=` expression as stated in
#13694.
### Test Plan
Add tests in new md format.
---------
Co-authored-by: David Peter <mail@david-peter.de>
## Summary
A small fix for comparisons of multiple comparators.
Instead of comparing each comparator to the leftmost item, we should
compare it to the closest item on the left.
While implementing this, I noticed that we don’t yet narrow Yoda
comparisons (e.g., `True is x`), so I didn’t change that behavior in
this PR.
## Test Plan
Added some mdtests 🎉
## Summary
Just a drive-by change that occurred to me while I was looking at
`Type::is_subtype_of`: the existing pattern for unions on the *right
hand side*:
```rs
(ty, Type::Union(union)) => union
.elements(db)
.iter()
.any(|&elem_ty| ty.is_subtype_of(db, elem_ty)),
```
is not (generally) correct if the *left hand side* is a union.
## Test Plan
Added new test cases for `is_subtype_of` and `!is_subtype_of`
## Summary
- Consistent naming: `BoolLiteral` => `BooleanLiteral` (it's mainly the
`Ty::BoolLiteral` variant that was renamed)
I tripped over this a few times now, so I thought I'll smooth it out.
- Add a new test case for `Literal[True] <: bool`, as suggested here:
https://github.com/astral-sh/ruff/pull/13781#discussion_r1804922827
Remove unnecessary uses of `.as_ref()`, `.iter()`, `&**` and similar, mostly in situations when iterating over variables. Many of these changes are only possible following #13826, when we bumped our MSRV to 1.80: several useful implementations on `&Box<[T]>` were only stabilised in Rust 1.80. Some of these changes we could have done earlier, however.
Implemented some points from
https://github.com/astral-sh/ruff/issues/12701
- Handle Unknown and Any in Unary operation
- Handle Boolean in binary operations
- Handle instances in unary operation
- Consider division by False to be division by zero
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
- Refactored comparison type inference functions in `infer.rs`: Changed
the return type from `Option` to `Result` to lay the groundwork for
providing more detailed diagnostics.
- Updated diagnostic messages.
This is a small step toward improving diagnostics in the future.
Please refer to #13787
## Test Plan
mdtest included!
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
This fixes an edge case that @carljm and I missed when implementing
https://github.com/astral-sh/ruff/pull/13800. Namely, if the left-hand
operand is the _exact same type_ as the right-hand operand, the
reflected dunder on the right-hand operand is never tried:
```pycon
>>> class Foo:
... def __radd__(self, other):
... return 42
...
>>> Foo() + Foo()
Traceback (most recent call last):
File "<python-input-1>", line 1, in <module>
Foo() + Foo()
~~~~~~^~~~~~~
TypeError: unsupported operand type(s) for +: 'Foo' and 'Foo'
```
This edge case _is_ covered in Brett's blog at
https://snarky.ca/unravelling-binary-arithmetic-operations-in-python/,
but I missed it amongst all the other subtleties of this algorithm. The
motivations and history behind it were discussed in
https://mail.python.org/archives/list/python-dev@python.org/thread/7NZUCODEAPQFMRFXYRMGJXDSIS3WJYIV/
## Test Plan
I added an mdtest for this cornercase.
## Summary
- Add `Type::is_disjoint_from` as a way to test whether two types
overlap
- Add a first set of simplification rules for intersection types
- `S & T = S` for `S <: T`
- `S & ~T = Never` for `S <: T`
- `~S & ~T = ~T` for `S <: T`
- `A & ~B = A` for `A` disjoint from `B`
- `A & B = Never` for `A` disjoint from `B`
- `bool & ~Literal[bool] = Literal[!bool]`
resolves one item in #12694
## Open questions:
- Can we somehow leverage the (anti) symmetry between `positive` and
`negative` contributions? I could imagine that there would be a way if
we had `Type::Not(type)`/`Type::Negative(type)`, but with the
`positive`/`negative` architecture, I'm not sure. Note that there is a
certain duplication in the `add_positive`/`add_negative` functions (e.g.
`S & ~T = Never` is implemented twice), but other rules are actually not
perfectly symmetric: `S & T = S` vs `~S & ~T = ~T`.
- I'm not particularly proud of the way `add_positive`/`add_negative`
turned out. They are long imperative-style functions with some
mutability mixed in (`to_remove`). I'm happy to look into ways to
improve this code *if we decide to go with this approach* of
implementing a set of ad-hoc rules for simplification.
- ~~Is it useful to perform simplifications eagerly in
`add_positive`/`add_negative`? (@carljm)~~ This is what I did for now.
## Test Plan
- Unit tests for `Type::is_disjoint_from`
- Observe changes in Markdown-based tests
- Unit tests for `IntersectionBuilder::build()`
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
Minor cleanup and consistent formatting of the Markdown-based tests.
- Removed lots of unnecessary `a`, `b`, `c`, … variables.
- Moved test assertions (`# revealed:` comments) closer to the tested
object.
- Always separate `# revealed` and `# error` comments from the code by
two spaces, according to the discussion
[here](https://github.com/astral-sh/ruff/pull/13746/files#r1799385758).
This trades readability for consistency in some cases.
- Fixed some headings
## Summary
This pull request resolves some rule thrashing identified in #12427 by
allowing for unused arguments when using `NotImplementedError` with a
variable per [this
comment](https://github.com/astral-sh/ruff/issues/12427#issuecomment-2384727468).
**Note**
This feels a little heavy-handed / edge-case-prone. So, to be clear, I'm
happy to scrap this code and just update the docs to communicate that
`abstractmethod` and friends should be used in this scenario (or
similar). Just let me know what you'd like done!
fixes: #12427
## Test Plan
I added a test-case to the existing `ARG.py` file and ran...
```sh
cargo run -p ruff -- check crates/ruff_linter/resources/test/fixtures/flake8_unused_arguments/ARG.py --no-cache --preview --select ARG002
```
---------
Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
## Summary
This PR updates the language server to avoid indexing the workspace for
single-file mode.
**What's a single-file mode?**
When a user opens the file directly in an editor, and not the folder
that represents the workspace, the editor usually can't determine the
workspace root. This means that during initializing the server, the
`workspaceFolders` field will be empty / nil.
Now, in this case, the server defaults to using the current working
directory which is a reasonable default assuming that the directory
would point to the one where this open file is present. This would allow
the server to index the directory itself for any config file, if
present.
It turns out that in VS Code the current working directory in the above
scenario is the system root directory `/` and so the server will try to
index the entire root directory which would take a lot of time. This is
the issue as described in
https://github.com/astral-sh/ruff-vscode/issues/627. To reproduce, refer
https://github.com/astral-sh/ruff-vscode/issues/627#issuecomment-2401440767.
This PR updates the indexer to avoid traversing the workspace to read
any config file that might be present. The first commit
(8dd2a31eef)
refactors the initialization and introduces two structs `Workspaces` and
`Workspace`. The latter struct includes a field to determine whether
it's the default workspace. The second commit
(61fc39bdb6)
utilizes this field to avoid traversing.
Closes: #11366
## Editor behavior
This is to document the behavior as seen in different editors. The test
scenario used has the following directory tree structure:
```
.
├── nested
│ ├── nested.py
│ └── pyproject.toml
└── test.py
```
where, the contents of the files are:
**test.py**
```py
import os
```
**nested/nested.py**
```py
import os
import math
```
**nested/pyproject.toml**
```toml
[tool.ruff.lint]
select = ["I"]
```
Steps:
1. Open `test.py` directly in the editor
2. Validate that it raises the `F401` violation
3. Open `nested/nested.py` in the same editor instance
4. This file would raise only `I001` if the `nested/pyproject.toml` was
indexed
### VS Code
When (1) is done from above, the current working directory is `/` which
means the server will try to index the entire system to build up the
settings index. This will include the `nested/pyproject.toml` file as
well. This leads to bad user experience because the user would need to
wait for minutes for the server to finish indexing.
This PR avoids that by not traversing the workspace directory in
single-file mode. But, in VS Code, this means that per (4), the file
wouldn't raise `I001` but only raise two `F401` violations because the
`nested/pyproject.toml` was never resolved.
One solution here would be to fix this in the extension itself where we
would detect this scenario and pass in the workspace directory that is
the one containing this open file in (1) above.
### Neovim
**tl;dr** it works as expected because the client considers the presence
of certain files (depending on the server) as the root of the workspace.
For Ruff, they are `pyproject.toml`, `ruff.toml`, and `.ruff.toml`. This
means that the client notifies us as the user moves between single-file
mode and workspace mode.
https://github.com/astral-sh/ruff/pull/13770#issuecomment-2416608055
### Helix
Same as Neovim, additional context in
https://github.com/astral-sh/ruff/pull/13770#issuecomment-2417362097
### Sublime Text
**tl;dr** It works similar to VS Code except that the current working
directory of the current process is different and thus the config file
is never read. So, the behavior remains unchanged with this PR.
https://github.com/astral-sh/ruff/pull/13770#issuecomment-2417362097
### Zed
Zed seems to be starting a separate language server instance for each
file when the editor is running in a single-file mode even though all
files have been opened in a single editor instance.
(Separated the logs into sections separated by a single blank line
indicating 3 different server instances that the editor started for 3
files.)
```
0.000053375s INFO main ruff_server::server: No workspace settings found for file:///Users/dhruv/projects/ruff-temp, using default settings
0.009448792s INFO main ruff_server::session::index: Registering workspace: /Users/dhruv/projects/ruff-temp
0.009906334s DEBUG ruff:main ruff_server::resolve: Included path via `include`: /Users/dhruv/projects/ruff-temp/test.py
0.011775917s INFO ruff:main ruff_server::server: Configuration file watcher successfully registered
0.000060583s INFO main ruff_server::server: No workspace settings found for file:///Users/dhruv/projects/ruff-temp/nested, using default settings
0.010387125s INFO main ruff_server::session::index: Registering workspace: /Users/dhruv/projects/ruff-temp/nested
0.011061875s DEBUG ruff:main ruff_server::resolve: Included path via `include`: /Users/dhruv/projects/ruff-temp/nested/nested.py
0.011545208s INFO ruff:main ruff_server::server: Configuration file watcher successfully registered
0.000059125s INFO main ruff_server::server: No workspace settings found for file:///Users/dhruv/projects/ruff-temp/nested, using default settings
0.010857583s INFO main ruff_server::session::index: Registering workspace: /Users/dhruv/projects/ruff-temp/nested
0.011428958s DEBUG ruff:main ruff_server::resolve: Included path via `include`: /Users/dhruv/projects/ruff-temp/nested/other.py
0.011893792s INFO ruff:main ruff_server::server: Configuration file watcher successfully registered
```
## Test Plan
When using the `ruff` server from this PR, we see that the server starts
quickly as seen in the logs. Next, when I switch to the release binary,
it starts indexing the root directory.
For more details, refer to the "Editor Behavior" section above.
Summary
---------
PEP 695 Generics introduce a scope inside a class statement's arguments
and keywords.
```
class C[T](A[T]): # the T in A[T] is not from the global scope but from a type-param-specfic scope
...
```
When doing inference on the class bases, we currently have been doing
base class expression lookups in the global scope. Not an issue without
generics (since a scope is only created when generics are present).
This change instead makes sure to stop the global scope inference from
going into expressions within this sub-scope. Since there is a separate
scope, `check_file` and friends will trigger inference on these
expressions still.
Another change as a part of this is making sure that `ClassType` looks
up its bases in the right scope.
Test Plan
----------
`cargo test --package red_knot_python_semantic generics` will run the
markdown test that previously would panic due to scope lookup issues
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
Co-authored-by: Carl Meyer <carl@astral.sh>
This reverts https://github.com/astral-sh/ruff/pull/13799, and restores
the previous behavior, which I think was the most pragmatic and useful
version of the divide-by-zero error, if we will emit it at all.
In general, a type checker _does_ emit diagnostics when it can detect
something that will definitely be a problem for some inhabitants of a
type, but not others. For example, `x.foo` if `x` is typed as `object`
is a type error, even though some inhabitants of the type `object` will
have a `foo` attribute! The correct fix is to make your type annotations
more precise, so that `x` is assigned a type which definitely has the
`foo` attribute.
If we will emit it divide-by-zero errors, it should follow the same
logic. Dividing an inhabitant of the type `int` by zero may not emit an
error, if the inhabitant is an instance of a subclass of `builtins.int`
that overrides division. But it may emit an error (more likely it will).
If you don't want the diagnostic, you can clarify your type annotations
to require an instance of your safe subclass.
Because the Python type system doesn't have the ability to explicitly
reflect the fact that divide-by-zero is an error in type annotations
(e.g. for `int.__truediv__`), or conversely to declare a type as safe
from divide-by-zero, or include a "nonzero integer" type which it is
always safe to divide by, the analogy doesn't fully apply. You can't
explicitly mark your subclass of `int` as safe from divide-by-zero, we
just semi-arbitrarily choose to silence the diagnostic for subclasses,
to avoid false positives.
Also, if we fully followed the above logic, we'd have to error on every
`int / int` because the RHS `int` might be zero! But this would likely
cause too many false positives, because of the lack of a "nonzero
integer" type.
So this is just a pragmatic choice to emit the diagnostic when it is
very likely to be an error. It's unclear how useful this diagnostic is
in practice, but this version of it is at least very unlikely to cause
harm.
If the LHS is just `int` or `float` type, that type includes custom
subclasses which can arbitrarily override division behavior, so we
shouldn't emit a divide-by-zero error in those cases.
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
Add type inference for comparisons involving union types. For example:
```py
one_or_two = 1 if flag else 2
reveal_type(one_or_two <= 2) # revealed: Literal[True]
reveal_type(one_or_two <= 1) # revealed: bool
reveal_type(one_or_two <= 0) # revealed: Literal[False]
```
closes#13779
## Test Plan
See `resources/mdtest/comparison/unions.md`
## Summary
Fixes the bug described in #13514 where an unbound public type defaulted
to the type or `Unknown`, whereas it should only be the type if unbound.
## Test Plan
Added a new test case
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
This PR adds a debug assertion that asserts that `TypeInference::extend`
is only called on results that have the same scope.
This is critical because `expressions` uses `ScopedExpressionId` that
are local and merging expressions from different
scopes would lead to incorrect expression types.
We could consider storing `scope` only on `TypeInference` for debug
builds. Doing so has the advantage that the `TypeInference` type is
smaller of which we'll have many. However, a `ScopeId` is a `u32`... so
it shouldn't matter that much and it avoids storing the `scope` both on
`TypeInference` and `TypeInferenceBuilder`
## Test Plan
`cargo test`
## Summary
This PR implements comparisons for (tuple, tuple).
It will close#13688 and complete an item in #13618 once merged.
## Test Plan
Basic tests are included for (tuple, tuple) comparisons.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
Just a small simplification to remove some unnecessary complexity here.
Rather than using separate branches for subscript expressions involving
boolean literals, we can simply convert them to integer literals and
reuse the logic in the `IntLiteral` branches.
## Test Plan
`cargo test -p red_knot_python_semantic`
## Summary
This PR adds support for unpacking tuple expression in an assignment
statement where the target expression can be a tuple or a list (the
allowed sequence targets).
The implementation introduces a new `infer_assignment_target` which can
then be used for other targets like the ones in for loops as well. This
delegates it to the `infer_definition`. The final implementation uses a
recursive function that visits the target expression in source order and
compares the variable node that corresponds to the definition. At the
same time, it keeps track of where it is on the assignment value type.
The logic also accounts for the number of elements on both sides such
that it matches even if there's a gap in between. For example, if
there's a starred expression like `(a, *b, c) = (1, 2, 3)`, then the
type of `a` will be `Literal[1]` and the type of `b` will be
`Literal[2]`.
There are a couple of follow-ups that can be done:
* Use this logic for other target positions like `for` loop
* Add diagnostics for mis-match length between LHS and RHS
## Test Plan
Add various test cases using the new markdown test framework.
Validate that existing test cases pass.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
Porting infer tests to new markdown tests framework.
Link to the corresponding issue: #13696
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
- Fix a bug with `… is not …` type guards.
Previously, in an example like
```py
x = [1]
y = [1]
if x is not y:
reveal_type(x)
```
we would infer a type of `list[int] & ~list[int] == Never` for `x`
inside the conditional (instead of `list[int]`), since we built a
(negative) intersection with the type of the right hand side (`y`).
However, as this example shows, this assumption can only be made for
singleton types (types with a single inhabitant) such as `None`.
- Add support for `… is …` type guards.
closes#13715
## Test Plan
Moved existing `narrow_…` tests to Markdown-based tests and added new
ones (including a regression test for the bug described above). Note
that will create some conflicts with
https://github.com/astral-sh/ruff/pull/13719. I tried to establish the
correct organizational structure as proposed in
https://github.com/astral-sh/ruff/pull/13719#discussion_r1800188105
Address a potential point of confusion that bit a contributor in
https://github.com/astral-sh/ruff/pull/13719
Also remove a no-longer-accurate line about bare `error: ` assertions
(which are no longer allowed) and clarify another point about which
kinds of error assertions to use.
## Summary
Fixes#13708.
Silence `undefined-reveal` diagnostic on any line including a `#
revealed:` assertion.
Add more context to un-silenced `undefined-reveal` diagnostics in mdtest
test failures. This doesn't make the failure output less verbose, but it
hopefully clarifies the right fix for an `undefined-reveal` in mdtest,
while still making it clear what red-knot's normal diagnostic for this
looks like.
## Test Plan
Added and updated tests.
This adds documentation for the new test framework.
I also added documentation for the planned design of features we haven't
built yet (clearly marked as such), so that this doc can become the sole
source of truth for the test framework design (we don't need to refer
back to the original internal design document.)
Also fixes a few issues in the test framework implementation that were
discovered in writing up the docs.
---------
Co-authored-by: T-256 <132141463+T-256@users.noreply.github.com>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
## Summary
Adds a markdown-based test framework for writing tests of type inference
and type checking. Fixes#11664.
Implements the basic required features. A markdown test file is a suite
of tests, each test can contain one or more Python files, with
optionally specified path/name. The test writes all files to an
in-memory file system, runs red-knot, and matches the resulting
diagnostics against `Type: ` and `Error: ` assertions embedded in the
Python source as comments.
We will want to add features like incremental tests, setting custom
configuration for tests, writing non-Python files, testing syntax
errors, capturing full diagnostic output, etc. There's also plenty of
room for improved UX (colored output?).
## Test Plan
Lots of tests!
Sample of the current output when a test fails:
```
Running tests/inference.rs (target/debug/deps/inference-7c96590aa84de2a4)
running 1 test
test inference::path_1_resources_inference_numbers_md ... FAILED
failures:
---- inference::path_1_resources_inference_numbers_md stdout ----
inference/numbers.md - Numbers - Floats
/src/test.py
line 2: unexpected error: [invalid-assignment] "Object of type `Literal["str"]` is not assignable to `int`"
thread 'inference::path_1_resources_inference_numbers_md' panicked at crates/red_knot_test/src/lib.rs:60:5:
Some tests failed.
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
failures:
inference::path_1_resources_inference_numbers_md
test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.19s
error: test failed, to rerun pass `-p red_knot_test --test inference`
```
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Fixed a TODO by adding another TODO. It's the red-knot way!
## Summary
`builtins.type` can be subscripted at runtime on Python 3.9+, even
though it has no `__class_getitem__` method and its metaclass (which
is... itself) has no `__getitem__` method. The special case is
[hardcoded directly into `PyObject_GetItem` in
CPython](744caa8ef4/Objects/abstract.c (L181-L184)).
We just have to replicate the special case in our semantic model.
This will fail at runtime on Python <3.9. However, there's a bunch of
outstanding questions (detailed in the TODO comment I added) regarding
how we deal with subscriptions of other generic types on lower Python
versions. Since we want to avoid too many false positives for now, I
haven't tried to address this; I've just made `type` subscriptable on
all Python versions.
## Test Plan
`cargo test -p red_knot_python_semantic --lib`
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Treat async generators as "await" in ASYNC100.
Fixes#13637
## Test Plan
Updated snapshot
## Summary
Implements string literal comparisons and fallbacks to `str` instance
for `LiteralString`.
Completes an item in #13618
## Test Plan
- Adds a dedicated test with non exhaustive cases
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
Implements the comparison operator for `[Type::IntLiteral]` and
`[Type::BooleanLiteral]` (as an artifact of special handling of `True` and
`False` in python).
Sets the framework to implement more comparison for types known at
static time (e.g. `BooleanLiteral`, `StringLiteral`), allowing us to only
implement cases of the triplet `<left> Type`, `<right> Type`, `CmpOp`.
Contributes to #12701 (without checking off an item yet).
## Test Plan
- Added a test for the comparison of literals that should include most
cases of note.
- Added a test for the comparison of int instances
Please note that the cases do not cover 100% of the branches as there
are many and the current testing strategy with variables make this
fairly confusing once we have too many in one test.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
Resolves https://github.com/astral-sh/ruff/issues/9962 by allowing a
configuration setting `allowed-unused-imports`
TODO:
- [x] Figure out the correct name and place for the setting; currently,
I have added it top level.
- [x] The comparison is pretty naive. I tried using `glob::Pattern` but
couldn't get it to work in the configuration.
- [x] Add tests
- [x] Update documentations
## Test Plan
`cargo test`
Closes https://github.com/astral-sh/ruff/issues/13545
As described in the issue, we move comments before the inner `if`
statement to before the newly constructed `elif` statement (previously
`else`).
## Summary
fix#13602
Currently, `UP043` only applies to typing.Generator, but it should also
support collections.abc.Generator.
This update ensures `UP043` correctly handles both
`collections.abc.Generator` and `collections.abc.AsyncGenerator`
### UP043
> `UP043`
> Python 3.13 introduced the ability for type parameters to specify
default values. As such, the default type arguments for some types in
the standard library (e.g., Generator, AsyncGenerator) are now optional.
> Omitting type parameters that match the default values can make the
code more concise and easier to read.
```py
Generator[int, None, None] -> Generator[int]
```
## Summary
...and remove periods from messages that don't span more than a single
sentence.
This is more consistent with how we present user-facing messages in uv
(which has a defined style guide).
## Summary
You can now call `return_ty_result` to operate on a `Result` directly
thereby using your own diagnostics, as in:
```rust
return dunder_getitem_method
.call(self.db, &[slice_ty])
.return_ty_result(self.db, value.as_ref().into(), self)
.unwrap_or_else(|err| {
self.add_diagnostic(
(&**value).into(),
"call-non-callable",
format_args!(
"Method `__getitem__` is not callable on object of type '{}'.",
value_ty.display(self.db),
),
);
err.return_ty()
});
```
While looking into https://github.com/astral-sh/ruff/issues/13545 I
noticed that we return `None` here if you pass a block of comments. This
is annoying because it causes `adjust_indentation` to fall back to
LibCST which panics when it cannot find a statement.
Adds a diagnostic for division by the integer zero in `//`, `/`, and
`%`.
Doesn't handle `<int> / 0.0` because we don't track the values of float
literals.
This variant shows inference that is not yet implemented..
## Summary
PR #13500 reopened the idea of adding a new type variant to keep track
of not-implemented features in Red Knot.
It was based off of #12986 with a more generic approach of keeping track
of different kind of unknowns. Discussion in #13500 agreed that keeping
track of different `Unknown` is complicated for now, and this feature is
better achieved through a new variant of `Type`.
### Requirements
Requirements for this implementation can be summed up with some extracts
of comment from @carljm on the previous PR
> So at the moment we are leaning towards simplifying this PR to just
use a new top-level variant, which behaves like Any and Unknown but
represents inference that is not yet implemented in red-knot.
> I think the general rule should be that Todo should propagate only
when the presence of the input Todo caused the output to be unknown.
>
> To take a specific example, the inferred result of addition must be
Unknown if either operand is Unknown. That is, Unknown + X will always
be Unknown regardless of what X is. (Same for X + Unknown.) In this
case, I believe that Unknown + Todo (or Todo + Unknown) should result in
Unknown, not result in Todo. If we fix the upstream source of the Todo,
the result would still be Unknown, so it's not useful to propagate the
Todo in this case: it wrongly suggests that the output is unknown
because of a todo item.
## Test Plan
This PR does not introduce new tests, but it did required to edit some
tests with the display of `[Type::Todo]` (currently `@Todo`), which
suggests that those test are placeholders requirements for features we
don't support yet.
While working on https://github.com/astral-sh/ruff/pull/13576 I noticed
that it was really hard to tell which assertion failed in some of these
test cases. This could be expanded to elsewhere, but I've heard this
test suite format won't be around for long?
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
There was a typo in the links of the docs of PTH116, where Path.stat
used to link to Path.group.
Another rule, PTH202, does it correctly:
ec72e675d9/crates/ruff_linter/src/rules/flake8_use_pathlib/rules/os_path_getsize.rs (L33)
This PR only fixes a one word typo.
## Test Plan
<!-- How was it tested? -->
I did not test that the doc generation framework picked up these
changes, I assume it will do it successfully.
## Summary
Following #13449, this PR adds custom handling for the bool constructor,
so when the input type has statically known truthiness value, it will be
used as the return value of the bool function.
For example, in the following snippet x will now be resolved to
`Literal[True]` instead of `bool`.
```python
x = bool(1)
```
## Test Plan
Some cargo tests were added.
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Implement inference for `f-string`, contributes to #12701.
### First Implementation
When looking at the way `mypy` handles things, I noticed the following:
- No variables (e.g. `f"hello"`) ⇒ `LiteralString`
- Any variable (e.g. `f"number {1}"`) ⇒ `str`
My first commit (1ba5d0f13fdf70ed8b2b1a41433b32fc9085add2) implements
exactly this logic, except that we deal with string literals just like
`infer_string_literal_expression` (if below `MAX_STRING_LITERAL_SIZE`,
show `Literal["exact string"]`)
### Second Implementation
My second commit (90326ce9af5549af7b4efae89cd074ddf68ada14) pushes
things a bit further to handle cases where the expression within the
`f-string` are all literal values (string representation known at static
time).
Here's an example of when this could happen in code:
```python
BASE_URL = "https://httpbin.org"
VERSION = "v1"
endpoint = f"{BASE_URL}/{VERSION}/post" # Literal["https://httpbin.org/v1/post"]
```
As this can be sightly more costly (additional allocations), I don't
know if we want this feature.
## Test Plan
- Added a test `fstring_expression` covering all cases I can think of
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
Related to https://github.com/astral-sh/ruff/issues/13524
Doesn't offer a valid fix, opting to instead just not offer a fix at
all. If someone points me to a good way to handle parenthesis here I'm
down to try to fix the fix separately, but it looks quite hard.
## Summary
Building ruff on AIX breaks on `tiki-jemalloc-sys` due to OS header
incompatibility
## Test Plan
`cargo test`
Co-authored-by: Henry Jiang <henry.jiang1@ibm.com>
In https://github.com/astral-sh/ruff/pull/13503, we added supported for
detecting variadic keyword arguments as dictionaries, here we use the
same strategy for detecting variadic positional arguments as tuples.
Closes https://github.com/astral-sh/ruff/issues/13266
Avoids false negatives for shadowed bindings that aren't actually
references to the loop variable. There are some shadowed bindings we
need to support still, e.g., `del` requires the loop variable to exist.
## Summary
I think we should also make the change that @BurntSushi recommended in
the linked issue, but this gets rid of the panic.
See: https://github.com/astral-sh/ruff/issues/13483
See: https://github.com/astral-sh/ruff/issues/13442
## Test Plan
```
warning: `ruff analyze graph` is experimental and may change without warning
{
"/Users/crmarsh/workspace/django/django/__init__.py": [
"/Users/crmarsh/workspace/django/django/apps/__init__.py",
"/Users/crmarsh/workspace/django/django/conf/__init__.py",
"/Users/crmarsh/workspace/django/django/urls/__init__.py",
"/Users/crmarsh/workspace/django/django/utils/log.py",
"/Users/crmarsh/workspace/django/django/utils/version.py"
],
"/Users/crmarsh/workspace/django/django/__main__.py": [
"/Users/crmarsh/workspace/django/django/core/management/__init__.py"
ruff failed
Cause: Broken pipe (os error 32)
```
## Summary
This PR changes removes the typeshed stubs from the vendored file system
shipped with ruff
and instead ships an empty "typeshed".
Making the typeshed files optional required extracting the typshed files
into a new `ruff_vendored` crate. I do like this even if all our builds
always include typeshed because it means `red_knot_python_semantic`
contains less code that needs compiling.
This also allows us to use deflate because the compression algorithm
doesn't matter for an archive containing a single, empty file.
## Test Plan
`cargo test`
I verified with ` cargo tree -f "{p} {f}" -p <package> ` that:
* red_knot_wasm: enables `deflate` compression
* red_knot: enables `zstd` compression
* `ruff`: uses stored
I'm not quiet sure how to build the binary that maturin builds but
comparing the release artifact size with `strip = true` shows a `1.5MB`
size reduction
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
For reasons I haven't investigated, this speeds up the resolver about 2x
(from 6.404s to 3.612s on an extremely large codebase).
## Test Plan
\cc @BurntSushi
```
[andrew@duff rippling]$ time ruff analyze graph --preview > /dev/null
real 3.274
user 16.039
sys 7.609
maxmem 11631 MB
faults 0
[andrew@duff rippling]$ time ruff-patch analyze graph --preview > /dev/null
real 1.841
user 14.625
sys 3.639
maxmem 7173 MB
faults 0
[andrew@duff rippling]$ time ruff-patch2 analyze graph --preview > /dev/null
real 2.087
user 15.333
sys 4.869
maxmem 8642 MB
faults 0
```
Where that's `main`, then (`ruff-patch`) using the version with no
`File`, no `SemanticModel`, then (`ruff-patch2`) using `File`.
Avoid quadratic time in subsumed elements when adding a super-type of
existing union elements.
Reserve space in advance when adding multiple elements (from another
union) to a union.
Make union elements a `Box<[Type]>` instead of an `FxOrderSet`; the set
doesn't buy much since the rules of union uniqueness are defined in
terms of supertype/subtype, not in terms of simple type identity.
Move sealed-boolean handling out of a separate `UnionBuilder::simplify`
method and into `UnionBuilder::add`; now that `add` is iterating
existing elements anyway, this is more efficient.
Remove `UnionType::contains`, since it's now `O(n)` and we shouldn't
really need it, generally we care about subtype/supertype, not type
identity. (Right now it's used for `Type::Unbound`, which shouldn't even
be a type.)
Add support for `is_subtype_of` for the `object` type.
Addresses comments on https://github.com/astral-sh/ruff/pull/13401
This was mentioned in an earlier review, and seemed easy enough to just
do it. No need to repeat all the types twice when it gives no additional
information.
## Summary
This PR adds an experimental Ruff subcommand to generate dependency
graphs based on module resolution.
A few highlights:
- You can generate either dependency or dependent graphs via the
`--direction` command-line argument.
- Like Pants, we also provide an option to identify imports from string
literals (`--detect-string-imports`).
- Users can also provide additional dependency data via the
`include-dependencies` key under `[tool.ruff.import-map]`. This map uses
file paths as keys, and lists of strings as values. Those strings can be
file paths or globs.
The dependency resolution uses the red-knot module resolver which is
intended to be fully spec compliant, so it's also a chance to expose the
module resolver in a real-world setting.
The CLI is, e.g., `ruff graph build ../autobot`, which will output a
JSON map from file to files it depends on for the `autobot` project.
This fixes the last panic on checking pandas.
(Match statement became an `if let` because clippy decided it wanted
that once I added the additional line in the else case?)
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Support using `reveal_type` without importing it, as implied by the type
spec and supported by existing type checkers.
We use `typing_extensions.reveal_type` for the implicit built-in; this
way it exists on all Python versions. (It imports from `typing` on newer
Python versions.)
Emits an "undefined name" diagnostic whenever `reveal_type` is
referenced in this way (in addition to the revealed-type diagnostic when
it is called). This follows the mypy example (with `--enable-error-code
unimported-reveal`) and I think provides a good (and easily
understandable) balance for user experience. If you are using
`reveal_type` for quick temporary debugging, the additional
undefined-name diagnostic doesn't hinder that use case. If we make the
revealed-type diagnostic a non-failing one, the undefined-name
diagnostic can still be a failing diagnostic, helping prevent
accidentally leaving it in place. For any use cases where you want to
leave it in place, you can always import it to avoid the undefined-name
diagnostic.
In the future, we can easily provide configuration options to a) turn
off builtin-reveal_type altogether, and/or b) silence the undefined-name
diagnostic when using it, if we have users on either side (loving or
hating pseudo-builtin `reveal_type`) who are dissatisfied with this
compromise.
After looking at more cases (for example, the case in the added test in
this PR), I realized that our previous rule, "if a symbol has any
declarations, use only declarations for its public type" is not
adequate. Rather than using `Unknown` as fallback if the symbol is not
declared in some paths, we need to use the inferred type as fallback in
that case.
For the paths where the symbol _was_ declared, we know that any bindings
must be assignable to the declared type in that path, so this won't
change the overall declared type in those paths. But for paths where the
symbol wasn't declared, this will give us a better type in place of
`Unknown`.
Before `typing.reveal_type` existed, there was
`typing_extensions.reveal_type`. We should support both.
Also adds a test to verify that we can handle aliasing of `reveal_type`
to a different name.
Adds a bit of code to ensure that if we have a union of different
`reveal_type` functions (e.g. a union containing both
`typing_extensions.reveal_type` and `typing.reveal_type`) we still emit
the reveal-type diagnostic only once. This is probably unlikely in
practice, but it doesn't hurt to handle it smoothly. (It comes up now
because we don't support `version_info` checks yet, so
`typing_extensions.reveal_type` is actually that union.)
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
I noticed that this pattern sometimes occurs in typeshed:
```
if ...:
from foo import bar
else:
def bar(): ...
```
If we have the rule that symbols with declarations only use declarations
for the public type, then this ends up resolving as `Unknown |
Literal[bar]`, because we didn't consider the import to be a
declaration.
I think the most straightforward thing here is to also consider imports
as declarations. The same rationale applies as for function and class
definitions: if you shadow an import, you should have to explicitly
shadow with an annotation, rather than just doing it
implicitly/accidentally.
We may also ultimately need to re-evaluate the rule that public type
considers only declarations, if there are declarations.
Add support for the `typing.reveal_type` function, emitting a diagnostic
revealing the type of its single argument. This is a necessary piece for
the planned testing framework.
This puts the cart slightly in front of the horse, in that we don't yet
have proper support for validating call signatures / argument types. But
it's easy to do just enough to make `reveal_type` work.
This PR includes support for calling union types (this is necessary
because we don't yet support `sys.version_info` checks, so
`typing.reveal_type` itself is a union type), plus some nice
consolidated error messages for calls to unions where some elements are
not callable. This is mostly to demonstrate the flexibility in
diagnostics that we get from the `CallOutcome` enum.
Use declared types in inference and checking. This means several things:
* Imports prefer declarations over inference, when declarations are
available.
* When we encounter a binding, we check that the bound value's inferred
type is assignable to the live declarations of the bound symbol, if any.
* When we encounter a declaration, we check that the declared type is
assignable from the inferred type of the symbol from previous bindings,
if any.
* When we encounter a binding+declaration, we check that the inferred
type of the bound value is assignable to the declared type.
Add support for declared types to the semantic index. This involves a
lot of renaming to clarify the distinction between bindings and
declarations. The Definition (or more specifically, the DefinitionKind)
becomes responsible for determining which definitions are bindings,
which are declarations, and which are both, and the symbol table
building is refactored a bit so that the `IS_BOUND` (renamed from
`IS_DEFINED` for consistent terminology) flag is always set when a
binding is added, rather than being set separately (and requiring us to
ensure it is set properly).
The `SymbolState` is split into two parts, `SymbolBindings` and
`SymbolDeclarations`, because we need to store live bindings for every
declaration and live declarations for every binding; the split lets us
do this without storing more than we need.
The massive doc comment in `use_def.rs` is updated to reflect bindings
vs declarations.
The `UseDefMap` gains some new APIs which are allow-unused for now,
since this PR doesn't yet update type inference to take declarations
into account.
## Summary
Follow-up from #13268, this PR updates the test case to use
`assert_snapshot` now that the output is limited to only include the
rules with diagnostics.
## Test Plan
`cargo insta test`
Add `::is_empty` and `::union` methods to the `BitSet` implementation.
Allowing unused for now, until these methods become used later with the
declared-types implementation.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
These are quite incomplete, but I needed to start stubbing them out in
order to build and test declared-types.
Allowing unused for now, until they are used later in the declared-types
PR.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
This PR adds a new `Type` variant called `TupleType` which is used for
heterogeneous elements.
### Display notes
* For an empty tuple, I'm using `tuple[()]` as described in the docs:
https://docs.python.org/3/library/typing.html#annotating-tuples
* For nested elements, it'll use the literal type instead of builtin
type unlike Pyright which does `tuple[Literal[1], tuple[int, int]]`
instead of `tuple[Literal[1], tuple[Literal[2], Literal[3]]]`. Also,
mypy would give `tuple[builtins.int, builtins.int]` instead of
`tuple[Literal[1], Literal[2]]`
## Test Plan
Update test case to account for the display change and add cases for
multiple elements and nested tuple elements.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
This PR adds support for control flow for match statement.
It also adds the necessary infrastructure required for narrowing
constraints in case blocks and implements the logic for
`PatternMatchSingleton` which is either `None` / `True` / `False`. Even
after this the inferred type doesn't get simplified completely, there's
a TODO for that in the test code.
## Test Plan
Add test cases for control flow for (a) when there's a wildcard pattern
and (b) when there isn't. There's also a test case to verify the
narrowing logic.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
When a type of the form `Literal["..."]` would be constructed with too
large of a string, this PR converts it to `LiteralString` instead.
We also extend inference for binary operations to include the case where
one of the operands is `LiteralString`.
Closes#13224
Pull the tests from `types.rs` into `infer.rs`.
All of these are integration tests with the same basic form: create a
code sample, run type inference or check on it, and make some assertions
about types and/or diagnostics. These are the sort of tests we will want
to move into a test framework with a low-boilerplate custom textual
format. In the meantime, having them together (and more importantly,
their helper utilities together) means that it's easy to keep tests for
related language features together (iterable tests with other iterable
tests, callable tests with other callable tests), without an artificial
split based on tests which test diagnostics vs tests which test
inference. And it allows a single test to more easily test both
diagnostics and inference. (Ultimately in the test framework, they will
likely all test diagnostics, just in some cases the diagnostics will
come from `reveal_type()`.)
My plan for handling declared types is to introduce a `Declaration` in
addition to `Definition`. A `Declaration` is an annotation of a name
with a type; a `Definition` is an actual runtime assignment of a value
to a name. A few things (an annotated function parameter, an
annotated-assignment with an RHS) are both a `Definition` and a
`Declaration`.
This more cleanly separates type inference (only cares about
`Definition`) from declared types (only impacted by a `Declaration`),
and I think it will work out better than trying to squeeze everything
into `Definition`. One of the tests in this PR
(`annotation_only_assignment_transparent_to_local_inference`)
demonstrates one reason why. The statement `x: int` should have no
effect on local inference of the type of `x`; whatever the locally
inferred type of `x` was before `x: int` should still be the inferred
type after `x: int`. This is actually quite hard to do if `x: int` is
considered a `Definition`, because a core assumption of the use-def map
is that a `Definition` replaces the previous value. To achieve this
would require some hackery to effectively treat `x: int` sort of as if
it were `x: int = x`, but it's not really even equivalent to that, so
this approach gets quite ugly.
As a first step in this plan, this PR stops treating AnnAssign with no
RHS as a `Definition`, which fixes behavior in a couple added tests.
This actually makes things temporarily worse for the ellipsis-type test,
since it is defined in typeshed only using annotated assignments with no
RHS. This will be fixed properly by the upcoming addition of
declarations, which should also treat a declared type as sufficient to
import a name, at least from a stub.
Initially I had deferred annotation name lookups reuse the "public
symbol type", since that gives the correct "from end of scope" view of
reaching definitions that we want. But there is a key difference; public
symbol types are based only on definitions in the queried scope (or
"name in the given namespace" in runtime terms), they don't ever look up
a name in nonlocal/global/builtin scopes. Deferred annotation resolution
should do this lookup.
Add a test, and fix deferred name resolution to support
nonlocal/global/builtin names.
Fixes#13176
## Summary
Part of #13085, this PR updates the comprehension definition to handle
multiple targets.
## Test Plan
Update existing semantic index test case for comprehension with multiple
targets. Running corpus tests shouldn't panic.
Add support for non-local name lookups.
There's one TODO around annotated assignments without a RHS; these need
a fair amount of attention, which they'll get in an upcoming PR about
declared vs inferred types.
Fixes#11663
Test coverage for #13131 wasn't as good as I thought it was, because
although we infer a lot of types in stubs in typeshed, we don't check
typeshed, and therefore we don't do scope-level inference and pull all
types for a scope. So we didn't really have good test coverage for
scope-level inference in a stub. And because of this, I got the code for
supporting that wrong, meaning that if we did scope-level inference with
deferred types, we'd end up never populating the deferred types in the
scope's `TypeInference`, which causes panics like #13160.
Here I both add test coverage by running the corpus tests both as `.py`
and as `.pyi` (which reveals the panic), and I fix the code to support
deferred types in scope inference.
This also revealed a problem with deferred types in generic functions,
which effectively span two scopes. That problem will require a bit more
thought, and I don't want to block this PR on it, so for now I just
don't defer annotations on generic functions.
Fixes#13160.
## Summary
Follow-up to #13147, this PR implements the `AstNode` for `Identifier`.
This makes it easier to create the `NodeKey` in red knot because it uses
a generic method to construct the key from `AnyNodeRef` and is important
for definitions that are created only on identifiers instead of
`ExprName`.
## Test Plan
`cargo test` and `cargo clippy`
## Summary
This PR adds definition for match patterns.
## Test Plan
Update the existing test case for match statement symbols to verify that
the definitions are added as well.
This PR contains the following updates:
| Package | Type | Update | Change |
|---|---|---|---|
| [quick-junit](https://redirect.github.com/nextest-rs/quick-junit) |
workspace.dependencies | minor | `0.4.0` -> `0.5.0` |
---
### Release Notes
<details>
<summary>nextest-rs/quick-junit (quick-junit)</summary>
###
[`v0.5.0`](https://redirect.github.com/nextest-rs/quick-junit/blob/HEAD/CHANGELOG.md#050---2024-09-01)
[Compare
Source](https://redirect.github.com/nextest-rs/quick-junit/compare/quick-junit-0.4.0...quick-junit-0.5.0)
##### Changed
- The `Output` type, which strips invalid XML characters from a string,
has been renamed to
`XmlString`.
- All internal storage now uses `XmlString` rather than `String`.
</details>
---
### Configuration
📅 **Schedule**: Branch creation - "before 4am on Monday" (UTC),
Automerge - At any time (no schedule defined).
🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.
♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.
🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.
---
- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box
---
This PR was generated by [Mend Renovate](https://mend.io/renovate/).
View the [repository job
log](https://developer.mend.io/github/astral-sh/ruff).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzOC41Ni4wIiwidXBkYXRlZEluVmVyIjoiMzguNTkuMiIsInRhcmdldEJyYW5jaCI6Im1haW4iLCJsYWJlbHMiOlsiaW50ZXJuYWwiXX0=-->
---------
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
## Summary
The `SequenceIndexVisitor` currently does not recurse into
subexpressions of subscripts when searching for subscript accesses that
would trigger this rule. That means that we don't currently detect
violations of the rule on snippets like this:
```py
data = {"a": 1, "b": 2}
column_names = ["a", "b"]
for index, column_name in enumerate(column_names):
_ = data[column_names[index]]
```
Fixes#13183
## Test Plan
`cargo test -p ruff_linter`
The `UnionBuilder` builds `builtins.bool` when handed `Literal[True]`
and `Literal[False]`.
Caveat: If the builtins module is unfindable somehow, the builder falls
back to the union type of these two literals.
First task from #12694
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
Adds basic support for inferring the type resulting from a call
expression. This only works for the *result* of call expressions; it
performs no inference on parameters. It also intentionally does nothing
with class instantiation, `__call__` implementors, or lambdas.
## Test Plan
Adds a test that it infers the right thing!
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
- Introduce methods for inferring annotation and type expressions.
- Correctly infer explicit return types from functions where they are
simple names that can be resolved in scope.
Contributes to #12701 by way of helping unlock call expressions (this
does not remotely finish that, as it stands, but it gets us moving that
direction).
## Test Plan
Added a test for function return types which use the name form of an
annotation expression, since this is aiming toward call expressions.
When we extend this to working for other annotation and type expression
positions, we should add explicit tests for those as well.
---------
Co-authored-by: Alex Waygood <alex.waygood@gmail.com>
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
Extends deletions for RUF100, deleting trailing text from noqa
directives, while preserving upcoming comments on the same line if any.
In cases where it deletes a comment up to another comment on the same
line, the whitespace between them is now shown to be in the autofix in
the diagnostic as well. Leading whitespace before the removed comment is
not, though.
Fixes#12251
## Test Plan
`cargo test`
Prototype deferred evaluation of type expressions by deferring
evaluation of class bases in a stub file. This allows self-referential
class definitions, as occur with the definition of `str` in typeshed
(which inherits `Sequence[str]`).
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
Just what it says on the tin: adds basic `EllipsisType` inference for
any time `...` appears in the AST.
## Test Plan
Test that `x = ...` produces exactly what we would expect.
---------
Co-authored-by: Carl Meyer <carl@oddbird.net>
## Summary
The resulting type when multiplying a string literal by an integer
literal is one of two types:
- `StringLiteral`, in the case where it is a reasonably small resulting
string (arbitrarily bounded here to 4096 bytes, roughly a page on many
operating systems), including the fully expanded string.
- `LiteralString`, matching Pyright etc., for strings larger than that.
Additionally:
- Switch to using `Box<str>` instead of `String` for the internal value
of `StringLiteral`, saving some non-trivial byte overhead (and keeping
the total number of allocations the same).
- Be clearer and more accurate about which types we ought to defer to in
`StringLiteral` and `LiteralString` member lookup.
## Test Plan
Added a test case covering multiplication times integers: positive,
negative, zero, and in and out of bounds.
---------
Co-authored-by: Alex Waygood <alex.waygood@gmail.com>
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
This fixes the outstanding TODO and make it easier to work with new
cases. (Tidy first, *then* implement, basically!)
## Test Plan
After making this change all the existing tests still pass. A classic
refactor win. 🎉
# Summary
Add support for the first unary operator: negating integer literals. The
resulting type is another integer literal, with the value being the
negated value of the literal. All other types continue to return
`Type::Unknown` for the present, but this is designed to make it easy to
extend easily with other combinations of operator and operand.
Contributes to #12701.
## Test Plan
Add tests with basic negation, including of very large integers and
double negation.
## Summary
Introduce a `StringLiteralType` with corresponding `Display` type and a
relatively basic test that the resulting representation is as expected.
Note: we currently always allocate for `StringLiteral` types. This may
end up being a perf issue later, at which point we may want to look at
other ways of representing `value` here, i.e. with some kind of smarter
string structure which can reuse types. That is most likely to show up
with e.g. concatenation.
Contributes to #12701.
## Test Plan
Added a test for individual strings with both single and double quotes
as well as concatenated strings with both forms.
## Summary
Now that Ruff provides a formatter, there is no need to rely on Black to
check that the docs are formatted correctly in
`check_docs_formatted.py`. This PR swaps out Black for the Ruff
formatter and updates inconsistencies between the two.
This PR will be a precursor to another PR
([branch](https://github.com/calumy/ruff/tree/format-pyi-in-docs)),
updating the `check_docs_formatted.py` script to check for pyi files,
fixing #11568.
## Test Plan
- CI to check that the docs are formatted correctly using the updated
script.
This PR has the `SemanticIndexBuilder` visit function definition
annotations before adding the function symbol/name to the builder.
For example, the following snippet no longer causes a panic:
```python
def bool(x) -> bool:
Return True
```
Note: This fix changes the ordering of the global symbol table.
Closes#13069
## Summary
This PR adds symbols introduced by `for` loops to red-knot:
- `x` in `for x in range(10): pass`
- `x` and `y` in `for x, y in d.items(): pass`
- `a`, `b`, `c` and `d` in `for [((a,), b), (c, d)] in foo: pass`
## Test Plan
Several tests added, and the assertion in the benchmarks has been
updated.
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
This PR simplifies the virtual file support in the red knot core,
specifically:
* Update `File::add_virtual_file` method to `File::virtual_file` which
will always create a new virtual file and override the existing entry in
the lookup table
* Add `VirtualFile` which is a wrapper around `File` and provides
methods to increment the file revision / close the virtual file
* Add a new `File::try_virtual_file` to lookup the `VirtualFile` from
`Files`
* Add `File::sync_virtual_path` which takes in the `SystemVirtualPath`,
looks up the `VirtualFile` for it and calls the `sync` method to
increment the file revision
* Removes the `virtual_path_metadata` method on `System` trait
## Test Plan
- [x] Make sure the existing red knot tests pass
- [x] Updated code works well with the LSP
## Summary
This PR adds support for `textDocument/didChange` notification.
There seems to be a bug (probably in Salsa) where it panics with:
```
2024-08-22 15:33:38.802 [info] panicked at /Users/dhruv/.cargo/git/checkouts/salsa-61760caba2b17ca5/f608ff8/src/tracked_struct.rs:377:9:
two concurrent writers to Id(4800), should not be possible
```
## Test Plan
https://github.com/user-attachments/assets/81055feb-ba8e-4acf-ad2f-94084a3efead
## Summary
This PR adds basic support for files outside of any workspace in the red
knot server.
This also limits the red knot server to only work in a single workspace.
The server will not start if there are multiple workspaces.
## Test Plan
https://github.com/user-attachments/assets/de601387-0ad5-433c-9d2c-7b6ae5137654
## Summary
This PR adds the `bytes` type to red-knot:
- Added the `bytes` type
- Added support for bytes literals
- Support for the `+` operator
Improves on #12701
Big TODO on supporting and normalizing r-prefixed bytestrings
(`rb"hello\n"`)
## Test Plan
Added a test for a bytes literals, concatenation, and corner values
The `SemanticIndexBuilder` was causing a cycle in a salsa query by
attempting to resolve the target before the value in a named expression
(e.g. `x := x+1`). This PR swaps the order, avoiding a panic.
Closes#13012.
## Summary
This PR removes notebook sync support from server capabilities because
it isn't tested, it'll be added back once we actually add full support
for notebook.
## Summary
This PR adds symbols and definitions introduced by `with` statements.
The symbols and definitions are introduced for each with item. The type
inference is updated to call the definition region type inference
instead.
## Test Plan
Add test case to check for symbol table and definitions.
## Summary
This PR adds symbols introduced by `match` statements.
There are three patterns that introduces new symbols:
* `as` pattern
* Sequence pattern
* Mapping pattern
The recursive nature of the visitor makes sure that all symbols are
added.
## Test Plan
Add test case for all types of patterns that introduces a symbol.
## Summary
This PR adds definition for augmented assignment. This is similar to
annotated assignment in terms of implementation.
An augmented assignment should also record a use of the variable but
that's a TODO for now.
## Test Plan
Add test case to validate that a definition is added.
## Summary
As suggested by @MichaReiser in
https://github.com/astral-sh/ruff/pull/12886#pullrequestreview-2237679793,
this adds an exemption to `RUF027` for `fastAPI` paths, which require
template strings rather than eagerly evaluated f-strings.
## Test Plan
I added a fixture that causes Ruff to emit a false-positive error on
`main` but no longer does with this PR.
Extend the `UseDefMap` to also track which constraints (provided by e.g.
`if` tests) apply to each visible definition.
Uses a custom `BitSet` and `BitSetArray` to track which constraints
apply to which definitions, while keeping data inline as much as
possible.
## Summary
This PR is a pure refactor to simplify some of the logic for `RUF027`.
This will make it easier to file some followup PRs to help reduce the
false positives from this rule. I'm separating the refactor out into a
separate PR so it's easier to review, and so I can double-check from the
ecosystem report that this doesn't have any user-facing impact.
## Test Plan
`cargo test -p ruff_linter --lib`
## Summary
This PR adds support for adding symbols and definitions for function and
lambda parameters to the semantic index.
### Notes
* The default expression of a parameter is evaluated in the enclosing
scope (not the type parameter or function scope).
* The annotation expression of a parameter is evaluated in the type
parameter scope if they're present other in the enclosing scope.
* The symbols and definitions are added in the function parameter scope.
### Type Inference
There are two definitions `Parameter` and `ParameterWithDefault` and
their respective `*_definition` methods on the type inference builder.
These methods are preferred and are re-used when checking from a
different region.
## Test Plan
Add test case for validating that the parameters are defined in the
function / lambda scope.
### Benchmark update
Validated the difference in diagnostics for benchmark code between
`main` and this branch. All of them are either directly or indirectly
referencing one of the function parameters. The diff is in the PR description.
This adds the `fast-api-unused-path-parameter` lint rule, as described
in #12632.
I'm still pretty new to rust, so the code can probably be improved, feel
free to tell me if there's any changes i should make.
Also, i needed to add the `add_parameter` edit function, not sure if it
was in the scope of the PR or if i should've made another one.
If a builtin is conditionally shadowed by a global, we didn't correctly
fall back to builtins for the not-defined-in-globals path (see added
test for an example.)
List and set comprehensions using `async for` cannot be replaced with
underlying generators; this PR modifies C419 to skip such
comprehensions.
Closes#12891.
## Summary
Occasionally, we receive bug reports that imports in `src` directories
aren't correctly detected. The root of the problem is that we default to
`src = ["."]`, so users have to set `src = ["src"]` explicitly. This PR
extends the default to cover _both_ of them: `src = [".", "src"]`.
Closes https://github.com/astral-sh/ruff/issues/12454.
## Test Plan
I replicated the structure described in
https://github.com/astral-sh/ruff/issues/12453, and verified that the
imports were considered sorted, but that adding `src = ["."]` showed an
error.
## Summary
This PR adds very basic support for using the line / column information
from the diagnostic message. This makes it easier to validate
diagnostics in an editor as oppose to going through the diff one
diagnostic at a time and confirming it at the location.
## Summary
This PR adds a fallback logic for `is_python_notebook` to check the
`kernelspec.language` field.
Reference implementation in VS Code:
1c31e75898/extensions/ipynb/src/deserializers.ts (L20-L22)
It's also required for the kernel to provide the `language` they're
implementing based on
https://jupyter-client.readthedocs.io/en/stable/kernels.html#kernel-specs
reference although that's for the `kernel.json` file but is also
included in the notebook metadata.
Closes: #12281
## Test Plan
Add a test case for `is_python_notebook` and include the test notebook
for round trip validation.
The test notebook contains two cells, one is JavaScript (denoted via the
`vscode.languageId` metadata) and the other is Python (no metadata). The
notebook metadata only contains `kernelspec` and the `language_info` is
absent.
I also verified that this is a valid notebook by opening it in Jupyter
Lab, VS Code and using `nbformat` validator.
## Summary
This PR adds support for VS Code specific cell metadata to consider when
collecting valid code cells.
For context, Ruff only runs on valid code cells. These are the code
cells that doesn't contain cell magics. Previously, Ruff only used the
notebook's metadata to determine whether it's a Python notebook. But, in
VS Code, a notebook's preferred language might be Python but it could
still contain code cells for other languages. This can be determined
with the `metadata.vscode.languageId` field.
### References:
* https://code.visualstudio.com/docs/languages/identifiers
* e6c009a3d4/extensions/ipynb/src/serializers.ts (L104-L107)
*
e6c009a3d4/extensions/ipynb/src/serializers.ts (L117-L122)
This brings us one step closer to fixing #12281.
## Test Plan
Add test cases for `is_valid_python_code_cell` and an integration test
case which showcase running it end to end. The test notebook contains a
JavaScript code cell and a Python code cell.
## Summary
This PR fixes a bug in the semantic model where it would evaluate the
default parameter value in the type parameter scope. For example,
```py
def foo[T1: int](a = T1):
pass
```
Here, the `T1` in `a = T1` is undefined but Ruff doesn't flag it
(https://play.ruff.rs/ba2f7c2f-4da6-417e-aa2a-104aa63e6d5e).
The fix here is to evaluate the default parameter value in the
_enclosing_ scope instead.
## Test Plan
Add a test case which includes the above code under `F821`
(`undefined-name`) and validate the snapshot.
## Summary
See #12703. This only addresses the first bullet point, adding a space
after the comma in the suggested fix from list/tuple to string.
## Test Plan
Updated the snapshots and compared.
## Summary
This PR adds scope and definition for comprehension nodes. This includes
the following nodes:
* List comprehension
* Dictionary comprehension
* Set comprehension
* Generator expression
### Scope
Each expression here adds it's own scope with one caveat - the `iter`
expression of the first generator is part of the parent scope. For
example, in the following code snippet the `iter1` variable is evaluated
in the outer scope.
```py
[x for x in iter1]
```
> The iterable expression in the leftmost for clause is evaluated
directly in the enclosing scope and then passed as an argument to the
implicitly nested scope.
>
> Reference:
https://docs.python.org/3/reference/expressions.html#displays-for-lists-sets-and-dictionaries
There's another special case for assignment expressions:
> There is one special case: an assignment expression occurring in a
list, set or dict comprehension or in a generator expression (below
collectively referred to as “comprehensions”) binds the target in the
containing scope, honoring a nonlocal or global declaration for the
target in that scope, if one exists.
>
> Reference: https://peps.python.org/pep-0572/#scope-of-the-target
For example, in the following code snippet, the variables `a` and `b`
are available after the comprehension while `x` isn't:
```py
[a := 1 for x in range(2) if (b := 2)]
```
### Definition
Each comprehension node adds a single definition, the "target" variable
(`[_ for target in iter]`). This has been accounted for and a new
variant has been added to `DefinitionKind`.
### Type Inference
Currently, type inference is limited to a single scope. It doesn't
_enter_ in another scope to infer the types of the remaining expressions
of a node. To accommodate this, the type inference for a **scope**
requires new methods which _doesn't_ infer the type of the `iter`
expression of the leftmost outer generator (that's defined in the
enclosing scope).
The type inference for the scope region is split into two parts:
* `infer_generator_expression` (similarly for comprehensions) infers the
type of the `iter` expression of the leftmost outer generator
* `infer_generator_expression_scope` (similarly for comprehension)
infers the type of the remaining expressions except for the one
mentioned in the previous point
The type inference for the **definition** also needs to account for this
special case of leftmost generator. This is done by defining a `first`
boolean parameter which indicates whether this comprehension definition
occurs first in the enclosing expression.
## Test Plan
New test cases were added to validate multiple scenarios. Refer to the
documentation for each test case which explains what is being tested.
Make `cargo doc -p red_knot_python_semantic --document-private-items`
run warning-free. I'd still like to do this for all of ruff and start
enforcing it in CI (https://github.com/astral-sh/ruff/issues/12372) but
haven't gotten to it yet. But in the meantime I'm trying to maintain it
for at least `red_knot_python_semantic`, as it helps to ensure our doc
comments stay up to date.
A few of the comments I just removed or shortened, as their continued
relevance wasn't clear to me; please object in review if you think some
of them are important to keep!
Also remove a no-longer-needed `allow` attribute.
For type narrowing, we'll need intersections (since applying type
narrowing is just a type intersection.)
Add `IntersectionBuilder`, along with some tests for it and
`UnionBuilder` (renamed from `UnionTypeBuilder`).
We use smart builders to ensure that we always keep these types in
disjunctive normal form (DNF). That means that we never have deeply
nested trees of unions and intersections: unions flatten into unions,
intersections flatten into intersections, and intersections distribute
over unions, so the most complex tree we can ever have is a union of
intersections. We also never have a single-element union or a
single-positive-element intersection; these both just simplify to the
contained type.
Maintaining these invariants means that `UnionBuilder` doesn't
necessarily end up building a `Type::Union` (e.g. if you only add a
single type to the union, it'll just return that type instead), and
`IntersectionBuilder` doesn't necessarily build a `Type::Intersection`
(if you add a union to the intersection, we distribute the intersection
over that union, and `IntersectionBuilder` will end up returning a
`Type::Union` of intersections).
We also simplify intersections by ensuring that if a type and its
negation are both in an intersection, they simplify out. (In future this
should also respect subtyping, not just type identity, but we don't have
subtyping yet.) We do implement subtyping of `Never` as a special case
for now.
Most of this PR is unused for now until type narrowing lands; I'm just
breaking it out to reduce the review fatigue of a single massive PR.
## Summary
I'm not sure if this is useful but this is a hacky implementation to add
the filename and row / column numbers to the current Red Knot
diagnostics.
## Summary
Related to https://github.com/astral-sh/ruff-vscode/issues/571, this PR
updates the settings index builder to trace all the errors it
encountered. Without this, there's no way for user to know that
something failed and some of the capability might not work as expected.
For example, in the linked PR, the settings were invalid which means
notebooks weren't included and there were no log messages for it.
## Test Plan
Create an invalid `ruff.toml` file:
```toml
[tool.ruff]
extend-exclude = ["*.ipynb"]
```
Logs:
```
2024-08-12 18:33:09.873 [info] [Trace - 6:33:09 PM] 12.217043000s ERROR ruff:main ruff_server::session::index::ruff_settings: Failed to parse /Users/dhruv/playground/ruff/pyproject.toml
```
Notification Preview:
<img width="483" alt="Screenshot 2024-08-12 at 18 33 20"
src="https://github.com/user-attachments/assets/a4f303e5-f073-454f-bdcd-ba6af511e232">
Another way to trigger is to provide an invalid `cache-dir` value:
```toml
[tool.ruff]
cache-dir = "$UNKNOWN"
```
Same notification preview but different log message:
```
2024-08-12 18:41:37.571 [info] [Trace - 6:41:37 PM] 21.700112208s ERROR ThreadId(30) ruff_server::session::index::ruff_settings: Error while resolving settings from /Users/dhruv/playground/ruff/pyproject.toml: Invalid `cache-dir` value: error looking key 'UNKNOWN' up: environment variable not found
```
With multiple `pyproject.toml` file:
```
2024-08-12 18:41:15.887 [info] [Trace - 6:41:15 PM] 0.016636833s ERROR ThreadId(04) ruff_server::session::index::ruff_settings: Error while resolving settings from /Users/dhruv/playground/ruff/pyproject.toml: Invalid `cache-dir` value: error looking key 'UNKNOWN' up: environment variable not found
2024-08-12 18:41:15.888 [info] [Trace - 6:41:15 PM] 0.017378833s ERROR ThreadId(13) ruff_server::session::index::ruff_settings: Failed to parse /Users/dhruv/playground/ruff/tools/pyproject.toml
```
In most cases we should suggest a ternary operator, but there are three
edge cases where a binary operator is more appropriate.
Given an if-else block of the form
```python
if test:
target_var = body_value
else:
target_var = else_value
```
This PR updates the check for SIM108 to the following:
- If `test == body_value` and preview enabled, suggest to replace with
`target_var = test or else_value`
- If `test == not body_value` and preview enabled, suggest to replace
with `target_var = body_value and else_value`
- If `not test == body_value` and preview enabled, suggest to replace
with `target_var = body_value and else_value`
- Otherwise, suggest to replace with `target_var = body_value if test
else else_value`
Closes#12189.
## Summary
Adding parentheses to a tuple in a subscript with elements that include
slice expressions causes a syntax error. For example, `d[(1,2,:)]` is a
syntax error.
So, when `lint.ruff.parenthesize-tuple-in-subscript = true` and the
tuple includes a slice expression, we skip this check and fix.
Closes#12766.
> ~Builtins are also more efficient than `for` loops.~
Let's not promise performance because this code transformation does not
deliver.
Benchmark written by @dcbaker
> `any()` seems to be about 1/3 as fast (Python 3.11.9, NixOS):
```python
loop = 'abcdef'.split()
found = 'f'
nfound = 'g'
def test1():
for x in loop:
if x == found:
return True
return False
def test2():
return any(x == found for x in loop)
def test3():
for x in loop:
if x == nfound:
return True
return False
def test4():
return any(x == nfound for x in loop)
if __name__ == "__main__":
import timeit
print('for loop (found) :', timeit.timeit(test1))
print('for loop (not found):', timeit.timeit(test3))
print('any() (found) :', timeit.timeit(test2))
print('any() (not found) :', timeit.timeit(test4))
```
```
for loop (found) : 0.051076093994197436
for loop (not found): 0.04388196699437685
any() (found) : 0.15422860698890872
any() (not found) : 0.15568504799739458
```
I have retested with longer lists and on multiple Python versions with
similar results.
Implements the new fixable lint rule `RUF031` which checks for the use or omission of parentheses around tuples in subscripts, depending on the setting `lint.ruff.parenthesize-tuple-in-getitem`. By default, the use of parentheses is considered a violation.
## Summary
Follow-up from https://github.com/astral-sh/ruff/pull/12725, this is
just a small refactor to use a wrapper struct instead of type alias for
workspace settings index. This avoids the need to have the
`register_workspace_settings` as a static method on `Index` and instead
is a method on the new struct itself.
## Summary
This PR updates the server to ignore non-file workspace URL.
This is to avoid crashing the server if the URL scheme is not "file".
We'd still raise an error if the URL to file path conversion fails.
Also, as per the docs of
[`to_file_path`](https://docs.rs/url/2.5.2/url/struct.Url.html#method.to_file_path):
> Note: This does not actually check the URL’s scheme, and may give
nonsensical results for other schemes. It is the user’s responsibility
to check the URL’s scheme before calling this.
resolves: #12660
## Test Plan
I'm not sure how to test this locally but the change is small enough to
validate on its own.
## Summary
This PR updates the `red_knot` CLI to make the subcommand optional.
## Test Plan
Run the following commands:
* `cargo run --bin red_knot --
--current-directory=~/playground/ruff/type_inference` (no subcommand
requirement)
* `cargo run --bin red_knot -- server` (should start the server)
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Resolves#12636
Consider docstrings which begin with the word "Returns" as having
satisfactorily documented they're returns. For example
```python
def f():
"""Returns 1."""
return 1
```
is valid.
## Test Plan
Added example to test fixture.
---------
Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
## Summary
Removes set comprehension as a violation for `sum` when checking `C419`,
because set comprehension may de-duplicate entries in a generator,
thereby modifying the value of the sum.
Closes#12690.
## Summary
Make it a violation of `C409` to call `tuple` with a list or set
comprehension, and
implement the (unsafe) fix of calling the `tuple` with the underlying
generator instead.
Closes#12648.
## Test Plan
Test fixture updated, cargo test, docs checked for updated description.
## Summary
Adds autofix for `RUF007`
## Test Plan
`cargo test`, however I get errors for `test resolver::tests::symlink
... FAILED` which seems to not be my fault
## Summary
Fixes#12630.
DOC501 and DOC502 now understand functions with constructs like this to
be explicitly raising `TypeError` (which should be documented in a
function's docstring):
```py
try:
foo():
except TypeError:
...
raise
```
I made an exception for `Exception` and `BaseException`, however.
Constructs like this are reasonably common, and I don't think anybody
would say that it's worth putting in the docstring that it raises "some
kind of generic exception":
```py
try:
foo()
except BaseException:
do_some_logging()
raise
```
## Test Plan
`cargo test -p ruff_linter --lib`
## Summary
Please see
https://github.com/astral-sh/ruff/pull/12605#discussion_r1699957443 for
a description of the issue.
They way I fixed it is to get the *last* timeout item in the `with`, and
if it's an `async with` and there are items after it, then don't trigger
the lint.
## Test Plan
Updated the fixture with some more cases.
Changes the red-knot benchmark to run on the stdlib "tomllib" library
(which is self-contained, four files, uses type annotations) instead of
on very small bits of handwritten code.
Also remove the `without_parse` benchmark: now that we are running on
real code that uses typeshed, we'd either have to pre-parse all of
typeshed (slow) or find some way to determine which typeshed modules
will be used by the benchmark (not feasible with reasonable complexity.)
## Test Plan
`cargo bench -p ruff_benchmark --bench red_knot`
## Summary
This PR separates the current `red_knot` crate into two crates:
1. `red_knot` - This will be similar to the `ruff` crate, it'll act as
the CLI crate
2. `red_knot_workspace` - This includes everything except for the CLI
functionality from the existing `red_knot` crate
Note that the code related to the file watcher is in
`red_knot_workspace` for now but might be required to extract it out in
the future.
The main motivation for this change is so that we can have a `red_knot
server` command. This makes it easier to test the server out without
making any changes in the VS Code extension. All we need is to specify
the `red_knot` executable path in `ruff.path` extension setting.
## Test Plan
- `cargo build`
- `cargo clippy --workspace --all-targets --all-features`
- `cargo shear --fix`
## Summary
There's still a problem here. Given:
```python
class Class():
pass
# comment
# another comment
a = 1
```
We only add one newline before `a = 1` on the first pass, because
`max_precedling_blank_lines` is 1... We then add the second newline on
the second pass, so it ends up in the right state, but the logic is
clearly wonky.
Closes https://github.com/astral-sh/ruff/issues/11508.
I hit this `todo!` trying to run type inference over some real modules.
Since it's a one-liner to implement it, I just did that rather than
changing to `Type::Unknown`.
## Summary
@zanieb noticed while we were discussing #12595 that this flag is now
unnecessary, so remove it and the flags which reference it.
## Test Plan
Question for maintainers: is there a test to add *or* remove here? (I’ve
opened this as a draft PR with that in view!)
## Summary
This pull request adds support for logging via `$/logTrace` RPC
messages. It also enables that code path for when a client is Zed editor
or VS Code (as there's no way for us to generically tell whether a client prefers
`$/logTrace` over stderr.
Related to: #12523
## Test Plan
I've built Ruff from this branch and tested it manually with Zed.
---------
Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Extend `flake8-builtins` to imports, lambda-arguments, and modules to be
consistent with original checker
[flake8_builtins](https://github.com/gforcada/flake8-builtins/blob/main/flake8_builtins.py).
closes#12540
## Details
- Implement builtin-import-shadowing (A004)
- Stop tracking imports shadowing in builtin-variable-shadowing (A001)
in preview mode.
- Implement builtin-lambda-argument-shadowing (A005)
- Implement builtin-module-shadowing (A006)
- Add new option `linter.flake8_builtins.builtins_allowed_modules`
## Test Plan
cargo test
## Summary
If an import is marked as "required", we should never flag it as unused.
In practice, this is rare, since required imports are typically used for
`__future__` annotations, which are always considered "used".
Closes https://github.com/astral-sh/ruff/issues/12458.
Now that we have builtins available, resolve some simple cases to the
right builtin type.
We should also adjust the display for types to include their module
name; that's not done yet here.
## Summary
This PR adds support for untitled files in the Red Knot project.
Refer to the [design
discussion](https://github.com/astral-sh/ruff/discussions/12336) for
more details.
### Changes
* The `parsed_module` always assumes that the `SystemVirtual` path is of
`PySourceType::Python`.
* For the module resolver, as suggested, I went ahead by adding a new
`SystemOrVendoredPath` enum and renamed `FilePathRef` to
`SystemOrVendoredPathRef` (happy to consider better names here).
* The `file_to_module` query would return if it's a
`FilePath::SystemVirtual` variant because a virtual file doesn't belong
to any module.
* The sync implementation for the system virtual path is basically the
same as that of system path except that it uses the
`virtual_path_metadata`. The reason for this is that the system
(language server) would provide the metadata on whether it still exists
or not and if it exists, the corresponding metadata.
For point (1), VS Code would use `Untitled-1` for Python files and
`Untitled-1.ipynb` for Jupyter Notebooks. We could use this distinction
to determine whether the source type is `Python` or `Ipynb`.
## Test Plan
Added test cases in #12526
Extend red-knot type inference to cover all syntax, so that inferring
types for a scope gives all expressions a type. This means we can run
the red-knot semantic lint on all Python code without panics. It also
means we can infer types for `builtins.pyi` without panics.
To keep things simple, this PR intentionally doesn't add any new type
inference capabilities: the expanded coverage is all achieved with
`Type::Unknown`. But this puts the skeleton in place for adding better
inference of all these language features.
I also had to add basic Salsa cycle recovery (with just `Type::Unknown`
for now), because some `builtins.pyi` definitions are cyclic.
To test this, I added a comprehensive corpus of test snippets sourced
from Cinder under [MIT
license](https://github.com/facebookincubator/cinder/blob/cinder/3.10/cinderx/LICENSE),
which matches Ruff's license. I also added to this corpus some
additional snippets for newer language features: all the
`27_func_generic_*` and `73_class_generic_*` files, as well as
`20_lambda_default_arg.py`, and added a test which runs semantic-lint
over all these files. (The test doesn't assert the test-corpus files are
lint-free; just that they are able to lint without a panic.)
## Summary
Right now, in the isort comment model, there's nowhere for trailing
comments on the _statement_ to go, as in:
```python
from mylib import (
MyClient,
MyMgmtClient,
) # some comment
```
If the comment is on the _alias_, we do preserve it, because we attach
it to the alias, as in:
```python
from mylib import (
MyClient,
MyMgmtClient, # some comment
)
```
Similarly, if the comment is trailing on an import statement
(non-`from`), we again attach it to the alias, because it can't be
parenthesized, as in:
```python
import foo # some comment
```
This PR adds logic to track and preserve those trailing comments.
We also no longer drop several other comments, like:
```python
from mylib import (
# some comment
MyClient
)
```
Closes https://github.com/astral-sh/ruff/issues/12487.
## Summary
When working on improving Ruff integration with Zed I noticed that it
errors out when we try to resolve a code action of a `QUICKFIX` kind;
apparently, per @dhruvmanila we shouldn't need to resolve it, as the
edit is provided in the initial response for the code action. However,
it's possible for the `resolve` call to fill out other fields (such as
`command`).
AFAICT Helix also tries to resolve the code actions unconditionally (as
in, when either `edit` or `command` is absent); so does VSC. They can
still apply the quickfixes though, as they do not error out on a failed
call to resolve code actions - Zed does. Following suit on Zed's side
does not cut it though, as we still get a log request from Ruff for that
failure (which is surfaced in the UI).
There are also other language servers (such as
[rust-analyzer](c1c9e10f72/crates/rust-analyzer/src/handlers/request.rs (L1257)))
that fill out both `command` and `edit` fields as a part of code action
resolution.
This PR makes the resolve calls for quickfix actions return the input
value.
## Test Plan
N/A
Add support for while-loop control flow.
This doesn't yet include general support for terminals and reachability;
that is wider than just while loops and belongs in its own PR.
This also doesn't yet add support for cyclic definitions in loops; that
comes with enough of its own complexity in Salsa that I want to handle
it separately.
Add a lint rule to detect if a name is definitely or possibly undefined
at a given usage.
If I create the file `undef/main.py` with contents:
```python
x = int
def foo():
z
return x
if flag:
y = x
y
```
And then run `cargo run --bin red_knot -- --current-directory
../ruff-examples/undef`, I get the output:
```
Name 'z' used when not defined.
Name 'flag' used when not defined.
Name 'y' used when possibly not defined.
```
If I modify the file to add `y = 0` at the top, red-knot re-checks it
and I get the new output:
```
Name 'z' used when not defined.
Name 'flag' used when not defined.
```
Note that `int` is not flagged, since it's a builtin, and `return x` in
the function scope is not flagged, since it refers to the global `x`.
## Summary
This PR fixes a bug to raise a syntax error when an unparenthesized
generator expression is used as an argument to a call when there are
more than one argument.
For reference, the grammar is:
```
primary:
| ...
| primary genexp
| primary '(' [arguments] ')'
| ...
genexp:
| '(' ( assignment_expression | expression !':=') for_if_clauses ')'
```
The `genexp` requires the parenthesis as mentioned in the grammar. So,
the grammar for a call expression is either a name followed by a
generator expression or a name followed by a list of argument. In the
former case, the parenthesis are excluded because the generator
expression provides them while in the later case, the parenthesis are
explicitly provided for a list of arguments which means that the
generator expression requires it's own parenthesis.
This was discovered in https://github.com/astral-sh/ruff/issues/12420.
## Test Plan
Add test cases for valid and invalid syntax.
Make sure that the parser from CPython also raises this at the parsing
step:
```console
$ python3.13 -m ast parser/_.py
File "parser/_.py", line 1
total(1, 2, x for x in range(5), 6)
^^^^^^^^^^^^^^^^^^^
SyntaxError: Generator expression must be parenthesized
$ python3.13 -m ast parser/_.py
File "parser/_.py", line 1
sum(x for x in range(10), 10)
^^^^^^^^^^^^^^^^^^^^
SyntaxError: Generator expression must be parenthesized
```
## Summary
Fix panic reported in #12428. Where a string would sometimes get split
within a character boundary. This bypasses the need to split the string.
This does not guarantee the correct formatting of the docstring, but
neither did the previous implementation.
Resolves#12428
## Test Plan
Test case added to fixture
## Summary
These are the first rules implemented as part of #458, but I plan to
implement more.
Specifically, this implements `docstring-missing-exception` which checks
for raised exceptions not documented in the docstring, and
`docstring-extraneous-exception` which checks for exceptions in the
docstring not present in the body.
## Test Plan
Test fixtures added for both google and numpy style.
When poring over traces, the ones that just include a definition or
symbol or expression ID aren't very useful, because you don't know which
file it comes from. This adds that information to the trace.
I guess the downside here is that if calling `.file(db)` on a
scope/definition/expression would execute other traced code, it would be
marked as outside the span? I don't think that's a concern, because I
don't think a simple field access on a tracked struct should ever
execute our code. If I'm wrong and this is a problem, it seems like the
tracing crate has this feature where you can record a field as
`tracing::field::Empty` and then fill in its value later with
`span.record(...)`, but when I tried this it wasn't working for me, not
sure why.
I think there's a lot more we can do to make our tracing output more
useful for debugging (e.g. record an event whenever a
definition/symbol/expression/use id is created with the details of that
definition/symbol/expression/use), this is just dipping my toes in the
water.
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
This PR updates D301 rule to allow inclduing escaped docstring, e.g.
`\"""Foo.\"""` or `\"\"\"Bar.\"\"\"`, within a docstring.
Related issue: #12152
## Test Plan
Add more test cases to D301.py and update the snapshot file.
<!-- How was it tested? -->
In preparation for supporting resolving builtins, simplify the benchmark
so it doesn't look up `str`, which is actually a complex builtin to deal
with because it inherits `Sequence[str]`.
Co-authored-by: Alex Waygood <alex.waygood@gmail.com>
Per comments in https://github.com/astral-sh/ruff/pull/12269, "module
global" is kind of long, and arguably redundant.
I tried just using "module" but there were too many cases where I felt
this was ambiguous. I like the way "global" works out better, though it
does require an understanding that in Python "global" generally means
"module global" not "globally global" (though in a sense module globals
are also globally global since modules are singletons).
Support falling back to a global name lookup if a name isn't defined in
the local scope, in the cases where that is correct according to Python
semantics.
In class scopes, a name lookup checks the local namespace first, and if
the name isn't found there, looks it up in globals.
In function scopes (and type parameter scopes, which are function-like),
if a name has any definitions in the local scope, it is a local, and
accessing it when none of those definitions have executed yet just
results in an `UnboundLocalError`, it does not fall back to a global. If
the name does not have any definitions in the local scope, then it is an
implicit global.
Public symbol type lookups never include such a fall back. For example,
if a name is not defined in a class scope, it is not available as a
member on that class, even if a name lookup within the class scope would
have fallen back to a global lookup.
This PR makes the `@override` lint rule work again.
Not yet included/supported in this PR:
* Support for free variables / closures: a free symbol in a nested
function-like scope referring to a symbol in an outer function-like
scope.
* Support for `global` and `nonlocal` statements, which force a symbol
to be treated as global or nonlocal even if it has definitions in the
local scope.
* Module-global lookups should fall back to builtins if the name isn't
found in the module scope.
I would like to expose nicer APIs for the various kinds of symbols
(explicit global, implicit global, free, etc), but this will also wait
for a later PR, when more kinds of symbols are supported.
Adds inference tests sufficient to give full test coverage of the
`UseDefMapBuilder::merge` method.
In the process I realized that we could implement visiting of if
statements in `SemanticBuilder` with fewer `snapshot`, `restore`, and
`merge` operations, so I restructured that visit a bit.
I also found one correctness bug in the `merge` method (it failed to
extend the given snapshot with "unbound" for any missing symbols,
meaning we would just lose the fact that the symbol could be unbound in
the merged-in path), and two efficiency bugs (if one of the ranges to
merge is empty, we can just use the other one, no need for copies, and
if the ranges are overlapping -- which can occur with nested branches --
we can still just merge them with no copies), and fixed all three.
## Summary
This PR allows us to fix both expressions in `foo == "a" or foo == "b"
or ("c" != bar and "d" != bar)`, but limits the rule to consecutive
comparisons, following https://github.com/astral-sh/ruff/issues/7797.
I think this logic was _probably_ added because of
https://github.com/astral-sh/ruff/pull/12368 -- the intent being that
we'd replace the _entire_ expression.
## Summary
This PR adds documentation for the Ruff language server.
It mainly does the following:
1. Combines various READMEs containing instructions for different editor
setup in their respective section on the online docs
2. Provide an enumerated list of server settings. Additionally, it also
provides a section for VS Code specific options.
3. Adds a "Features" section which enumerates all the current
capabilities of the native server
For (2), the settings documentation is done manually but a future
improvement (easier after `ruff-lsp` is deprecated) is to move the docs
in to Rust struct and generate the documentation from the code itself.
And, the VS Code extension specific options can be generated by diffing
against the `package.json` in `ruff-vscode` repository.
### Structure
1. Setup: This section contains the configuration for setting up the
language server for different editors
2. Features: This section contains a list of capabilities provided by
the server along with short GIF to showcase it
3. Settings: This section contains an enumerated list of settings in a
similar format to the one for the linter / formatter
4. Migrating from `ruff-lsp`
> [!NOTE]
>
> The settings page is manually written but could possibly be
auto-generated via a macro similar to `OptionsMetadata` on the
`ClientSettings` struct
resolves: #11217
## Test Plan
Generate and open the documentation locally using:
1. `python scripts/generate_mkdocs.py`
2. `mkdocs serve -f mkdocs.insiders.yml`
## Summary
This PR removes the requirement of `--preview` flag to run the `ruff
server` and instead considers it to be an indicator to turn on preview
mode for the linter and the formatter.
resolves: #12161
## Test Plan
Add test cases to assert the `preview` value is updated accordingly.
In an editor context, I used the local `ruff` executable in Neovim with
the `--preview` flag and verified that the preview-only violations are
being highlighted.
Running with:
```lua
require('lspconfig').ruff.setup({
cmd = {
'/Users/dhruv/work/astral/ruff/target/debug/ruff',
'server',
'--preview',
},
})
```
The screenshot shows that `E502` is highlighted with the below config in
`pyproject.toml`:
<img width="877" alt="Screenshot 2024-07-17 at 16 43 09"
src="https://github.com/user-attachments/assets/c7016ef3-55b1-4a14-bbd3-a07b1bcdd323">
## Summary
This PR updates the settings index building logic in the language server
to consider the fallback settings for applying ignore filters in
`WalkBuilder` and the exclusion via `exclude` / `extend-exclude`.
This flow matches the one in the `ruff` CLI where the root settings is
built by (1) finding the workspace setting in the ancestor directory (2)
finding the user configuration if that's missing and (3) fallback to
using the default configuration.
Previously, the index building logic was being executed before (2) and
(3). This PR reverses the logic so that the exclusion /
`respect_gitignore` is being considered from the default settings if
there's no workspace / user settings. This has the benefit that the
server no longer enters the `.git` directory or any other excluded
directory when a user opens a file in the home directory.
Related to #11366
## Test plan
Opened a test file from the home directory and confirmed with the debug
trace (removed in #12360) that the server excludes the `.git` directory
when indexing.
## Summary
Add new rule and implement for `unnecessary default type arguments`
under the `UP` category (`UP043`).
```py
// < py313
Generator[int, None, None]
// >= py313
Generator[int]
```
I think that as Python 3.13 develops, there might be more default type
arguments added besides `Generator` and `AsyncGenerator`. So, I made
this more flexible to accommodate future changes.
related issue: #12286
## Test Plan
snapshot included..!
## Summary
Pretty sure this should still be an error, but also, I think I added
this because of ecosystem CI? So want to see what pops up.
Closes https://github.com/astral-sh/ruff/issues/12164.
## Summary
This is the _intended_ default that PEP 597 _wants_, but it's not
backwards compatible. The fix is already unsafe, so it's better for us
to recommend the desired and expected behavior.
Closes https://github.com/astral-sh/ruff/issues/12069.
Improve semantic index tests with better assertions than just `.len()`,
and re-add use-definition test that was commented out in the switch to
Salsa initially.
Implements definition-level type inference, with basic control flow
(only if statements and if expressions so far) in Salsa.
There are a couple key ideas here:
1) We can do type inference queries at any of three region
granularities: an entire scope, a single definition, or a single
expression. These are represented by the `InferenceRegion` enum, and the
entry points are the salsa queries `infer_scope_types`,
`infer_definition_types`, and `infer_expression_types`. Generally
per-scope will be used for scopes that we are directly checking and
per-definition will be used anytime we are looking up symbol types from
another module/scope. Per-expression should be uncommon: used only for
the RHS of an unpacking or multi-target assignment (to avoid
re-inferring the RHS once per symbol defined in the assignment) and for
test nodes in type narrowing (e.g. the `test` of an `If` node). All
three queries return a `TypeInference` with a map of types for all
definitions and expressions within their region. If you do e.g.
scope-level inference, when it hits a definition, or an
independently-inferable expression, it should use the relevant query
(which may already be cached) to get all types within the smaller
region. This avoids double-inferring smaller regions, even though larger
regions encompass smaller ones.
2) Instead of building a control-flow graph and lazily traversing it to
find definitions which reach a use of a name (which is O(n^2) in the
worst case), instead semantic indexing builds a use-def map, where every
use of a name knows which definitions can reach that use. We also no
longer track all definitions of a symbol in the symbol itself; instead
the use-def map also records which defs remain visible at the end of the
scope, and considers these the publicly-visible definitions of the
symbol (see below).
Major items left as TODOs in this PR, to be done in follow-up PRs:
1) Free/global references aren't supported yet (only lookup based on
definitions in current scope), which means the override-check example
doesn't currently work. This is the first thing I'll fix as follow-up to
this PR.
2) Control flow outside of if statements and expressions.
3) Type narrowing.
There are also some smaller relevant changes here:
1) Eliminate `Option` in the return type of member lookups; instead
always return `Type::Unbound` for a name we can't find. Also use
`Type::Unbound` for modules we can't resolve (not 100% sure about this
one yet.)
2) Eliminate the use of the terms "public" and "root" to refer to
module-global scope or symbols. Instead consistently use the term
"module-global". It's longer, but it's the clearest, and the most
consistent with typical Python terminology. In particular I don't like
"public" for this use because it has other implications around author
intent (is an underscore-prefixed module-global symbol "public"?). And
"root" is just not commonly used for this in Python.
3) Eliminate the `PublicSymbol` Salsa ingredient. Many non-module-global
symbols can also be seen from other scopes (e.g. by a free var in a
nested scope, or by class attribute access), and thus need to have a
"public type" (that is, the type not as seen from a particular use in
the control flow of the same scope, but the type as seen from some other
scope.) So all symbols need to have a "public type" (here I want to keep
the use of the term "public", unless someone has a better term to
suggest -- since it's "public type of a symbol" and not "public symbol"
the confusion with e.g. initial underscores is less of an issue.) At
least initially, I would like to try not having special handling for
module-global symbols vs other symbols.
4) Switch to using "definitions that reach end of scope" rather than
"all definitions" in determining the public type of a symbol. I'm
convinced that in general this is the right way to go. We may want to
refine this further in future for some free-variable cases, but it can
be changed purely by making changes to the building of the use-def map
(the `public_definitions` index in it), without affecting any other
code. One consequence of combining this with no control-flow support
(just last-definition-wins) is that some inference tests now give more
wrong-looking results; I left TODO comments on these tests to fix them
when control flow is added.
And some potential areas for consideration in the future:
1) Should `symbol_ty` be a Salsa query? This would require making all
symbols a Salsa ingredient, and tracking even more dependencies. But it
would save some repeated reconstruction of unions, for symbols with
multiple public definitions. For now I'm not making it a query, but open
to changing this in future with actual perf evidence that it's better.
## Summary
I believe these should always bind more tightly -- e.g., in:
```python
for _ in bar(baz for foo in [1]):
pass
```
The inner `baz` and `foo` should be considered comprehension variables,
not for loop bindings.
We need to revisit this more holistically. In some of these cases,
`BindingKind` should probably be a flag, not an enum, since the values
aren't mutually exclusive. Separately, we should probably be more
precise in how we set it (e.g., by passing down from the parent rather
than sniffing in `handle_node_store`).
Closes https://github.com/astral-sh/ruff/issues/12339
When there is a function or class definition at the end of a suite
followed by the beginning of an alternative block, we have to insert a
single empty line between them.
In the if-else-statement example below, we insert an empty line after
the `foo` in the if-block, but none after the else-block `foo`, since in
the latter case the enclosing suite already adds empty lines.
```python
if sys.version_info >= (3, 10):
def foo():
return "new"
else:
def foo():
return "old"
class Bar:
pass
```
To do so, we track whether the current suite is the last one in the
current statement with a new option on the suite kind.
Fixes#12199
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
This PR updates the server to build the settings index in parallel using
similar logic as `python_files_in_path`.
This should help with https://github.com/astral-sh/ruff/issues/11366 but
ideally we would want to build it lazily.
## Test Plan
`cargo insta test`
## Summary
I don't know that there's more to do here. We could consider not raising
the violation at all for arguments, but that would have some false
negatives and could also be surprising to users.
Closes https://github.com/astral-sh/ruff/issues/12267.
## Summary
Ensures that, e.g., the following is not considered a
redefinition-without-use:
```python
import contextlib
foo = None
with contextlib.suppress(ImportError):
from some_module import foo
```
Closes https://github.com/astral-sh/ruff/issues/12309.
## Summary
Closes https://github.com/astral-sh/ruff/issues/12291.
## Test Plan
```shell
❯ cargo run check ../uv/foo --select INP
/Users/crmarsh/workspace/uv/foo/bar/baz.py:1:1: INP001 File `/Users/crmarsh/workspace/uv/foo/bar/baz.py` is part of an implicit namespace package. Add an `__init__.py`.
Found 1 error.
```
## Summary
I don't fully understand the purpose of this. In #7905, it was just
copied over from the previous non-preview implementation. But it means
that (e.g.) we don't treat `type(self.foo)` as a type -- which is wrong.
Closes https://github.com/astral-sh/ruff/issues/12290.
## Summary
Update the name of `ASYNC109` to match
[upstream](https://flake8-async.readthedocs.io/en/latest/rules.html).
Also update to the functionality to match upstream by supporting
additional context managers from `asyncio` and `anyio`. This doesn't
change any of the detection functionality, but recommends additional
context managers from `asyncio` and `anyio` depending on context.
Part of https://github.com/astral-sh/ruff/issues/12039.
## Test Plan
Added fixture for asyncio recommendation
## Summary
S113 exists because `requests` doesn't have a default timeout, so
request without timeout may hang indefinitely
> B113: Test for missing requests timeout
This plugin test checks for requests or httpx calls without a timeout
specified.
>
> Nearly all production code should use this parameter in nearly all
requests, **Failure to do so can cause your program to hang
indefinitely.**
But httpx has default timeout 5s, so S113 for httpx request without
`timeout` argument is a false positive, only valid case would be
`timeout=None`.
https://www.python-httpx.org/advanced/timeouts/
> HTTPX is careful to enforce timeouts everywhere by default.
>
> The default behavior is to raise a TimeoutException after 5 seconds of
network inactivity.
## Test Plan
snap updated
Intern types using Salsa interning instead of in the `TypeInference`
result.
This eliminates the need for `TypingContext`, and also paves the way for
finer-grained type inference queries.
## Summary
This PR fixes the bug where the server was not considering the
`cells.structure.didOpen` field to sync up the new content of the newly
added cells.
The parameters corresponding to this request provides two fields to get
the newly added cells:
1. `cells.structure.array.cells`: This is a list of `NotebookCell` which
doesn't contain any cell content. The only useful information from this
array is the cell kind and the cell document URI which we use to
initialize the new cell in the index.
2. `cells.structure.didOpen`: This is a list of `TextDocumentItem` which
corresponds to the newly added cells. This actually contains the text
content and the version.
This wasn't a problem before because we initialize the cell with an
empty string and this isn't a problem when someone just creates an empty
cell. But, when someone copy-pastes a cell, the cell needs to be
initialized with the content.
fixes: #12201
## Test Plan
First, let's see the panic in action:
1. Press <kbd>Esc</kbd> to allow using the keyboard to perform cell
actions (move around, copy, paste, etc.)
2. Copy the second cell with <kbd>c</kbd> key
3. Delete the second cell with <kbd>dd</kbd> key
4. Paste the copied cell with <kbd>p</kbd> key
You can see that the content isn't synced up because the `unused-import`
for `sys` is still being highlighted but it's being used in the second
cell. And, the hover isn't working either. Then, as I start editing the
second cell, it panics.
https://github.com/astral-sh/ruff/assets/67177269/fc58364c-c8fc-4c11-a917-71b6dd90c1ef
Now, here's the preview of the fixed version:
https://github.com/astral-sh/ruff/assets/67177269/207872dd-dca6-49ee-8b6e-80435c7ef22e
This reverts commit b28dc9ac14.
We're not ready to stabilize the server yet. There's some pending work
for the VS Code extension and documentation improvements.
This change is to unblock Ruff release.
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
This is the implementation for the new rule of `pycodestyle (E204)`. It
follows the guidlines described in the contributing site, and as such it
has a new file named `whitespace_after_decorator.rs`, a new test file
called `E204.py`, and as such invokes the `function` in the `AST
statement checker` for functions and functions in classes. Linking #2402
because it has all the pycodestyle rules.
## Test Plan
<!-- How was it tested? -->
The file E204.py, has a `decorator` defined called wrapper, and this
decorator is used for 2 cases. The first one is when a `function` which
has a `decorator` is called in the file, and the second one is when
there is a `class` and 2 `methods` are defined for the `class` with a
`decorator` attached it.
Test file:
``` python
def foo(fun):
def wrapper():
print('before')
fun()
print('after')
return wrapper
# No error
@foo
def bar():
print('bar')
# E204
@ foo
def baz():
print('baz')
class Test:
# No error
@foo
def bar(self):
print('bar')
# E204
@ foo
def baz(self):
print('baz')
```
I am still new to rust and any suggestion is appreciated. Specially with
the way im using native ruff utilities.
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
Fixes a few typos and consistency issues in the "Settings"
documentation:
- use "Ruff" consistently in the few places where "ruff" is used
- use double quotes in the few places where single quotes are used
- add backticks around rule codes where they are currently missing
- update a few example values where they are the same as the defaults,
for consistency
2nd commit might be controversial, as there are many options mentioned
where we don't currently link to the documentation sections, so maybe
it's done on purpose, as this will also appear in the JSON schema where
it's not desirable? If that's the case, I can easily drop it.
## Test Plan
Local testing.
## Summary
This PR fixes various bugs for computing the replacement range between
the original and modified source for the language server.
1. When finding the end offset of the source and modified range, we
should apply `zip` on the reversed iterator. The bug was that it was
reversing the already zipped iterator. The problem here is that the
length of both slices aren't going to be the same unless the source
wasn't modified at all. Refer to the [Rust
playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=44f860d31bd26456f3586b6ab530c22f)
where you can see this in action.
2. Skip the first line when computing the start offset because the first
line start value will always be 0 and the default value of the source /
modified range start is also 0. So, comparing 0 and 0 is not useful
which means we can skip the first value.
3. While iterating in the reverse direction, we should only stop if the
line start is strictly less than the source start i.e., we should use
`<` instead of `<=`.
fixes: #12128
## Test Plan
Add test cases where the text is being inserted, deleted, and replaced
between the original and new source code, validate the replacement
ranges.
## Summary
Bandit now also reports `B113` on `httpx`
(https://github.com/PyCQA/bandit/pull/1060). This PR implements the same
logic, to detect missing or `None` timeouts for `httpx` alongside
`requests`.
## Test Plan
Snapshot tests.
Closes https://github.com/astral-sh/ruff/issues/12158
Hashing `Path` does not take into account path separators so `foo/bar`
is the same as `foobar` which is no good for our case. I'm guessing this
is an upstream bug, perhaps introduced by
45082b077b?
I'm investigating that further.
## Summary
This PR updates the linter, specifically the token-based rules, to work
on the tokens that come after a syntax error.
For context, the token-based rules only diagnose the tokens up to the
first lexical error. This PR builds up an error resilience by
introducing a `TokenIterWithContext` which updates the `nesting` level
and tries to reflect it with what the lexer is seeing. This isn't 100%
accurate because if the parser recovered from an unclosed parenthesis in
the middle of the line, the context won't reduce the nesting level until
it sees the newline token at the end of the line.
resolves: #11915
## Test Plan
* Add test cases for a bunch of rules that are affected by this change.
* Run the fuzzer for a long time, making sure to fix any other bugs.
## Summary
This PR updates Ruff to **not** generate auto-fixes if the source code
contains syntax errors as determined by the parser.
The main motivation behind this is to avoid infinite autofix loop when
the token-based rules are run over any source with syntax errors in
#11950.
Although even after this, it's not certain that there won't be an
infinite autofix loop because the logic might be incorrect. For example,
https://github.com/astral-sh/ruff/issues/12094 and
https://github.com/astral-sh/ruff/pull/12136.
This requires updating the test infrastructure to not validate for fix
availability status when the source contained syntax errors. This is
required because otherwise the fuzzer might fail as it uses the test
function to run the linter and validate the source code.
resolves: #11455
## Test Plan
`cargo insta test`
## Summary
This PR updates various references in the linter to compute the
line-width for summing the width of each `char` in a `str` instead of
computing the width of the `str` itself.
Refer to #12133 for more details.
fixes: #12130
## Test Plan
Add a file with null (`\0`) character which is zero-width. Run this test
case on `main` to make sure it panics and switch over to this branch to
make sure it doesn't panic now.
## Summary
Use the following to reproduce this:
```console
$ cargo run -- check --select=E275,E203 --preview --no-cache ~/playground/ruff/src/play.py --fix
debug error: Failed to converge after 100 iterations in `/Users/dhruv/playground/ruff/src/play.py` with rule codes E275:---
yield,x
---
/Users/dhruv/playground/ruff/src/play.py:1:1: E275 Missing whitespace after keyword
|
1 | yield,x
| ^^^^^ E275
|
= help: Added missing whitespace after keyword
Found 101 errors (100 fixed, 1 remaining).
[*] 1 fixable with the `--fix` option.
```
## Test Plan
Add a test case and run `cargo insta test`.
## Summary
This patch inverts the defaults for
[pytest-fixture-incorrect-parentheses-style
(PT001)](https://docs.astral.sh/ruff/rules/pytest-fixture-incorrect-parentheses-style/)
and [pytest-incorrect-mark-parentheses-style
(PT003)](https://docs.astral.sh/ruff/rules/pytest-incorrect-mark-parentheses-style/)
to prefer dropping superfluous parentheses.
Presently, Ruff defaults to adding superfluous parentheses on pytest
mark and fixture decorators for documented purpose of consistency; for
example,
```diff
import pytest
-@pytest.mark.foo
+@pytest.mark.foo()
def test_bar(): ...
```
This behaviour is counter to the official pytest recommendation and
diverges from the flake8-pytest-style plugin as of version 2.0.0 (see
https://github.com/m-burst/flake8-pytest-style/issues/272). Seeing as
either default satisfies the documented benefit of consistency across a
codebase, it makes sense to change the behaviour to be consistent with
pytest and the flake8 plugin as well.
This change is breaking, so is gated behind preview (at least under my
understanding of Ruff versioning). The implementation of this gating
feature is a bit hacky, but seemed to be the least disruptive solution
without performing invasive surgery on the `#[option()]` macro.
Related to #8796.
### Caveat
Whilst updating the documentation, I sought to reference the pytest
recommendation to drop superfluous parentheses, but couldn't find any
official instruction beyond it being a revealed preference within the
pytest documentation code examples (as well as the linked issues from a
core pytest developer). Thus, the wording of the preference is
deliberately timid; it's to cohere with pytest rather than follow an
explicit guidance.
## Test Plan
`cargo nextest run`
I also ran
```sh
cargo run -p ruff -- check crates/ruff_linter/resources/test/fixtures/flake8_pytest_style/PT001.py --no-cache --diff --select PT001
```
and compared against it with `--preview` to verify that the default does
change under preview (I also repeated this with `echo
'[tool.ruff]\npreview = true' > pyproject.toml` to verify that it works
with a configuration file).
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
Implement mutable-contextvar-default (B039) which was added to
flake8-bugbear in https://github.com/PyCQA/flake8-bugbear/pull/476.
This rule is similar to [mutable-argument-default
(B006)](https://docs.astral.sh/ruff/rules/mutable-argument-default) and
[function-call-in-default-argument
(B008)](https://docs.astral.sh/ruff/rules/function-call-in-default-argument),
except that it checks the `default` keyword argument to
`contextvars.ContextVar`.
```
B039.py:19:26: B039 Do not use mutable data structures for ContextVar defaults
|
18 | # Bad
19 | ContextVar("cv", default=[])
| ^^ B039
20 | ContextVar("cv", default={})
21 | ContextVar("cv", default=list())
|
= help: Replace with `None`; initialize with `.set()` after checking for `None`
```
In the upstream flake8-plugin, this rule is written expressly as a
corollary to B008 and shares much of its logic. Likewise, this
implementation reuses the logic of the Ruff implementation of B008,
namely
f765d19402/crates/ruff_linter/src/rules/flake8_bugbear/rules/function_call_in_argument_default.rs (L104-L106)
and
f765d19402/crates/ruff_linter/src/rules/flake8_bugbear/rules/mutable_argument_default.rs (L106)
Thus, this rule deliberately replicates B006's and B008's heuristics.
For example, this rule assumes that all functions are mutable unless
otherwise qualified. If improvements are to be made to B039 heuristics,
they should probably be made to B006 and B008 as well (whilst trying to
match the upstream implementation).
This rule does not have an autofix as it is unknown where the ContextVar
next used (and it might not be within the same file).
Closes#12054
## Test Plan
`cargo nextest run`
## Summary
This adds a fix for the `duplicate-bases` rule that removes the
duplicate base from the class definition.
## Test Plan
`cargo nextest run duplicate_bases`, `cargo insta review`.
## Summary
`ruff server` has reached a point of stabilization, and `--preview` is
no longer required as a flag.
`--preview` is still supported as a flag, since future features may be
need to gated behind it initially.
## Test Plan
A simple way to test this is to run `ruff server` from the command line.
No error about a missing `--preview` argument should be reported.
## Summary
Follow-up to #11902
This PR simplifies the `LinterResult` struct by avoiding the generic and
not store the `ParseError`.
This is possible because the callers already have access to the
`ParseError` via the `Parsed` output. This also means that we can
simplify the return type of `check_path` and avoid the generic `T` on
`LinterResult`.
## Test Plan
`cargo insta test`
## Summary
Follow-up to #11901
This PR avoids displaying the syntax errors as log message now that the
`E999` diagnostic cannot be disabled.
For context on why this was added, refer to
https://github.com/astral-sh/ruff/pull/2505. Basically, we would allow
ignoring the syntax error diagnostic because certain syntax feature
weren't supported back then like `match` statement. And, if a user
ignored `E999`, Ruff would give no feedback if the source code contained
any syntax error. So, this log message was a way to indicate to the user
even if `E999` was disabled.
The current state of the parser is such that (a) it matches with the
latest grammar and (b) it's easy to add support for any new syntax.
**Note:** This PR doesn't remove the `DisplayParseError` struct because
it's still being used by the formatter.
## Test Plan
Update existing snapshots from the integration tests.
## Summary
This PR updates the way syntax errors are handled throughout the linter.
The main change is that it's now not considered as a rule which involves
the following changes:
* Update `Message` to be an enum with two variants - one for diagnostic
message and the other for syntax error message
* Provide methods on the new message enum to query information required
by downstream usages
This means that the syntax errors cannot be hidden / disabled via any
disablement methods. These are:
1. Configuration via `select`, `ignore`, `per-file-ignores`, and their
`extend-*` variants
```console
$ cargo run -- check ~/playground/ruff/src/lsp.py --extend-select=E999
--no-preview --no-cache
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.10s
Running `target/debug/ruff check /Users/dhruv/playground/ruff/src/lsp.py
--extend-select=E999 --no-preview --no-cache`
warning: Rule `E999` is deprecated and will be removed in a future
release. Syntax errors will always be shown regardless of whether this
rule is selected or not.
/Users/dhruv/playground/ruff/src/lsp.py:1:8: F401 [*] `abc` imported but
unused
|
1 | import abc
| ^^^ F401
2 | from pathlib import Path
3 | import os
|
= help: Remove unused import: `abc`
```
3. Command-line flags via `--select`, `--ignore`, `--per-file-ignores`,
and their `--extend-*` variants
```console
$ cargo run -- check ~/playground/ruff/src/lsp.py --no-cache
--config=~/playground/ruff/pyproject.toml
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.11s
Running `target/debug/ruff check /Users/dhruv/playground/ruff/src/lsp.py
--no-cache --config=/Users/dhruv/playground/ruff/pyproject.toml`
warning: Rule `E999` is deprecated and will be removed in a future
release. Syntax errors will always be shown regardless of whether this
rule is selected or not.
/Users/dhruv/playground/ruff/src/lsp.py:1:8: F401 [*] `abc` imported but
unused
|
1 | import abc
| ^^^ F401
2 | from pathlib import Path
3 | import os
|
= help: Remove unused import: `abc`
```
This also means that the **output format** needs to be updated:
1. The `code`, `noqa_row`, `url` fields in the JSON output is optional
(`null` for syntax errors)
2. Other formats are changed accordingly
For each format, a new test case specific to syntax errors have been
added. Please refer to the snapshot output for the exact format for
syntax error message.
The output of the `--statistics` flag will have a blank entry for syntax
errors:
```
315 F821 [ ] undefined-name
119 [ ] syntax-error
103 F811 [ ] redefined-while-unused
```
The **language server** is updated to consider the syntax errors by
convert them into LSP diagnostic format separately.
### Preview
There are no quick fixes provided to disable syntax errors. This will
automatically work for `ruff-lsp` because the `noqa_row` field will be
`null` in that case.
<img width="772" alt="Screenshot 2024-06-26 at 14 57 08"
src="https://github.com/astral-sh/ruff/assets/67177269/aaac827e-4777-4ac8-8c68-eaf9f2c36774">
Even with `noqa` comment, the syntax error is displayed:
<img width="763" alt="Screenshot 2024-06-26 at 14 59 51"
src="https://github.com/astral-sh/ruff/assets/67177269/ba1afb68-7eaf-4b44-91af-6d93246475e2">
Rule documentation page:
<img width="1371" alt="Screenshot 2024-06-26 at 16 48 07"
src="https://github.com/astral-sh/ruff/assets/67177269/524f01df-d91f-4ac0-86cc-40e76b318b24">
## Test Plan
- [x] Disablement methods via config shows a warning
- [x] `select`, `extend-select`
- [ ] ~`ignore`~ _doesn't show any message_
- [ ] ~`per-file-ignores`, `extend-per-file-ignores`~ _doesn't show any
message_
- [x] Disablement methods via command-line flag shows a warning
- [x] `--select`, `--extend-select`
- [ ] ~`--ignore`~ _doesn't show any message_
- [ ] ~`--per-file-ignores`, `--extend-per-file-ignores`~ _doesn't show
any message_
- [x] File with syntax errors should exit with code 1
- [x] Language server
- [x] Should show diagnostics for syntax errors
- [x] Should not recommend a quick fix edit for adding `noqa` comment
- [x] Same for `ruff-lsp`
resolves: #8447
The motivation for this rule is solid; it's been in preview for a long
time; the implementation and tests seem sound; there are no open issues
regarding it, and as far as I can tell there never have been any.
The only issue I see is that the docs don't really describe the rule
accurately right now; I fix that in this PR.
## Summary
This PR migrates our release workflow to
[`cargo-dist`](https://github.com/axodotdev/cargo-dist). The primary
motivation here is that we want to ship dedicated installers for Ruff
that work across platforms, and `cargo-dist` gives us those installers
out-of-the-box. The secondary motivation is that `cargo-dist` formalizes
some of the patterns that we've built up over time in our own release
process.
At a high level:
- The `release.yml` file is generated by `cargo-dist` with `cargo dist
generate`. It doesn't contain any modifications vis-a-vis the generated
file. (If it's edited out of band from generation, the release fails.)
- Our customizations are inserted as custom steps within the
`cargo-dist` workflow. Specifically, `build-binaries` builds the wheels
and packages them into binaries (as on `main`), while `build-docker.yml`
builds the Docker image. `publish-pypi.yml` publishes the wheels to
PyPI. This is effectively our `release.yaml` (on `main`), broken down
into individual workflows rather than steps within a single workflow.
### Changes from `main`
The workflow is _nearly_ unchanged. We kick off a release manually via
the GitHub Action by providing a tag. If the tag doesn't match the
`Cargo.toml`, the release fails. If the tag matches an already-existing
release, the release fails.
The release proceeds by (in order):
0. Doing some upfront validation via `cargo-dist`.
1. Creating the wheels and archives.
2. Building and pushing the Docker image.
3. Publishing to PyPI (if it's not a "dry run").
4. Creating the GitHub Release (if it's not a "dry run").
5. Notifying `ruff-pre-commit` (if it's not a "dry run").
There are a few changes in the workflow as compared to `main`:
- **We no longer validate the SHA** (just the tag). It's not an input to
the job. The Axo team is considering whether / how to support this.
- **Releases are now published directly** (rather than as draft). Again,
the Axo team is considering whether / how to support this. The downside
of drafts is that the URLs aren't stable, so the installers don't work
_as long as the release is in draft_. This is fine for our workflow. It
seems like the Axo team will add it.
- Releases already contain the latest entry from the changelog (we don't
need to copy it over). This "Just Works", which is nice, though we'll
still want to edit them to add contributors.
There are also a few **breaking changes** for consumers of the binaries:
- **We no longer include the version tag in the file name**. This
enables users to install via `/latest` URLs on GitHub, and is part of
the cargo-dist paradigm.
- **Archives now include an extra level of nesting,** which you can
remove with `--strip-components=1` when untarring.
Here's an example release that I created -- I omitted all the artifacts
since I was just testing a workflow, so none of the installers or links
work, but it gives you a sense for what the release looks like:
https://github.com/charliermarsh/cargodisttest/releases/tag/0.1.13.
### Test Plan
I ran a successful release to completion last night, and installed Ruff
via the installer:


The piece I'm least confident about is the Docker push. We build the
image, but the push fails in my test repo since I haven't wired up the
credentials.
## Summary
This rule removes `PLR1701` and redirects it to `SIM101`.
In addition to that, the `SIM101` autofix has been fixed to add padding
if required.
### `PLR1701` has bugs
It also seems that the implementation of `PLR1701` is incorrect in
multiple scenarios. For example, the following code snippet:
```py
# There are two _different_ variables `a` and `b`
if isinstance(a, int) or isinstance(b, bool) or isinstance(a, float):
pass
# There's another condition `or 1`
if isinstance(self.k, int) or isinstance(self.k, float) or 1:
pass
```
is fixed to:
```py
# Fixed to only considering variable `a`
if isinstance(a, (float, int)):
pass
# The additional condition is not present in the fix
if isinstance(self.k, (float, int)):
pass
```
Playground: https://play.ruff.rs/6cfbdfb7-f183-43b0-b59e-31e728b34190
## Documentation Preview
### `PLR1701`
<img width="1397" alt="Screenshot 2024-06-25 at 11 14 40"
src="https://github.com/astral-sh/ruff/assets/67177269/779ee84d-7c4d-4bb8-a3a4-c2b23a313eba">
## Test Plan
Remove the test cases for `PLR1701`, port the padding test case to
`SIM101` and update the snapshot.
## Summary
This PR splits the re-lexing logic into two parts:
1. `TokenSource`: The token source will be responsible to find the
position the lexer needs to be moved to
2. `Lexer`: The lexer will be responsible to reduce the nesting level
and move itself to the new position if recovered from a parenthesized
context
This split makes it easy to find the new lexer position without needing
to implement the backwards lexing logic again which would need to handle
cases involving:
* Different kinds of newlines
* Line continuation character(s)
* Comments
* Whitespaces
### F-strings
This change did reveal one thing about re-lexing f-strings. Consider the
following example:
```py
f'{'
# ^
f'foo'
```
Here, the quote as highlighted by the caret (`^`) is the start of a
string inside an f-string expression. This is unterminated string which
means the token emitted is actually `Unknown`. The parser tries to
recover from it but there's no newline token in the vector so the new
logic doesn't recover from it. The previous logic does recover because
it's looking at the raw characters instead.
The parser would be at `FStringStart` (the one for the second line) when
it calls into the re-lexing logic to recover from an unterminated
f-string on the first line. So, moving backwards the first character
encountered is a newline character but the first token encountered is an
`Unknown` token.
This is improved with #12067fixes: #12046fixes: #12036
## Test Plan
Update the snapshot and validate the changes.
## Summary
This PR fixes the lexer logic to **not** consume the newline character
for an unterminated string literal.
Currently, the lexer would consume it to be part of the string itself
but that would be bad for recovery because then the lexer wouldn't emit
the newline token ever. This PR fixes that to avoid consuming the
newline character in that case.
This was discovered during https://github.com/astral-sh/ruff/pull/12060.
## Test Plan
Update the snapshots and validate them.
## Summary
This PR fixes a bug introduced in
https://github.com/astral-sh/ruff/pull/12008 which didn't consider the
two character newline after the line continuation character.
For example, consider the following code highlighted with whitespaces:
```py
call(foo # comment \\r\n
\r\n
def bar():\r\n
....pass\r\n
```
The lexer is at `def` when it's running the re-lexing logic and trying
to move back to a newline character. It encounters `\n` and it's being
escaped (incorrect) but `\r` is being escaped, so it moves the lexer to
`\n` character. This creates an overlap in token ranges which causes the
panic.
```
Name 0..4
Lpar 4..5
Name 5..8
Comment 9..20
NonLogicalNewline 20..22 <-- overlap between
Newline 21..22 <-- these two tokens
NonLogicalNewline 22..23
Def 23..26
...
```
fixes: #12028
## Test Plan
Add a test case with line continuation and windows style newline
character.
## Summary
(I'm pretty sure I added this in the parser re-write but must've got
lost in the rebase?)
This PR raises a syntax error if the type parameter list is empty.
As per the grammar, there should be at least one type parameter:
```
type_params:
| invalid_type_params
| '[' type_param_seq ']'
type_param_seq: ','.type_param+ [',']
```
Verified via the builtin `ast` module as well:
```console
$ python3.13 -m ast parser/_.py
Traceback (most recent call last):
[..]
File "parser/_.py", line 1
def foo[]():
^
SyntaxError: Type parameter list cannot be empty
```
## Test Plan
Add inline test cases and update the snapshots.
## Summary
Right now, it's inconsistent... We sometimes match against the name, and
sometimes against the alias (`asname`). I could see a case for always
matching against the name, but matching against both seems fine too,
since the rule is really about the combination of the two?
Closes https://github.com/astral-sh/ruff/issues/12031.
## Summary
This PR fixes a bug where Ruff would raise `E203` for f-string debug
expression. This isn't valid because whitespaces are important for debug
expressions.
fixes: #12023
## Test Plan
Add test case and make sure there are no snapshot changes.
## Summary
This PR updates the parser test infrastructure to validate the token
ranges.
From the code documentation:
```
/// Verifies that:
/// * the ranges are strictly increasing when loop the tokens in insertion order
/// * all ranges are within the length of the source code
```
Follow-up from #12016 and #12017resolves: #11938
## Test Plan
Make sure that there are no failures.
## Summary
This PR updates the unterminated string error range to not include the
final newline character.
This is a follow-up to #12016 and required for #12019
This is not done for when the unterminated string goes till the end of
file (not a newline character). The unterminated f-string range is
correct.
### Why is this required for #12019 ?
Because otherwise the token ranges will overlap. For example:
```py
f"{"
f"{foo!r"
```
Here, the re-lexing logic recovers from an unterminated f-string and
thus emitting a `Newline` token for the one at the end of the first
line. But, currently the `Unknown` and the `Newline` token would overlap
because the `Unknown` token (unterminated string literal) range would
include the newline character.
## Test Plan
Update and validate the snapshot.
## Summary
This PR fixes the range highlighted for the line continuation error.
Previously, it would highlight an incorrect range:
```
1 | call(a, b, \\\
| ^^ Syntax Error: unexpected character after line continuation character
2 |
3 | def bar():
|
```
And now:
```
|
1 | call(a, b, \\\
| ^ Syntax Error: unexpected character after line continuation character
2 |
3 | def bar():
|
```
This is implemented by avoiding to update the token range for the
`Unknown` token which is emitted when there's a lexical error. Instead,
the `push_error` helper method will be responsible to update the range
to the error location.
This actually becomes a requirement which can be seen in follow-up PRs.
## Test Plan
Update and validate the snapshot.
## Summary
This PR fixes a bug where the re-lexing logic didn't consider the line
continuation character being present before the newline character. This
meant that the lexer was being moved back to the newline character which
is actually ignored via `\`.
Considering the following code:
```py
f'middle {'string':\
'format spec'}
```
The old token stream is:
```
...
Colon 18..19
FStringMiddle 19..29 (flags = F_STRING)
Newline 20..21
Indent 21..29
String 29..42
Rbrace 42..43
...
```
Notice how the ranges are overlapping between the `FStringMiddle` token
and the tokens emitted after moving the lexer backwards.
After this fix, the new token stream which is without moving the lexer
backwards in this scenario:
```
FStringStart 0..2 (flags = F_STRING)
FStringMiddle 2..9 (flags = F_STRING)
Lbrace 9..10
String 10..18
Colon 18..19
FStringMiddle 19..29 (flags = F_STRING)
FStringEnd 29..30 (flags = F_STRING)
Name 30..36
Name 37..41
Unknown 41..44
Newline 44..45
```
fixes: #12004
## Test Plan
Add test cases and update the snapshots.
## Summary
This PR updates `F811` rule to include assignment as possible shadowed
binding. This will fix issue: #11828 .
## Test Plan
Add a test file, F811_30.py, which includes a redefinition after an
assignment and a verified snapshot file.
## Summary
Addresses #11974 to add a `RUF` rule to replace `print` expressions in
`assert` statements with the inner message.
An autofix is available, but is considered unsafe as it changes
behaviour of the execution, notably:
- removal of the printout in `stdout`, and
- `AssertionError` instance containing a different message.
While the detection of the condition is a straightforward matter,
deciding how to resolve the print arguments into a string literal can be
a relatively subjective matter. The implementation of this PR chooses to
be as tolerant as possible, and will attempt to reformat any number of
`print` arguments containing single or concatenated strings or variables
into either a string literal, or a f-string if any variables or
placeholders are detected.
## Test Plan
`cargo test`.
## Examples
For ease of discussion, this is the diff for the tests:
```diff
# Standard Case
# Expects:
# - single StringLiteral
-assert True, print("This print is not intentional.")
+assert True, "This print is not intentional."
# Concatenated string literals
# Expects:
# - single StringLiteral
-assert True, print("This print" " is not intentional.")
+assert True, "This print is not intentional."
# Positional arguments, string literals
# Expects:
# - single StringLiteral concatenated with " "
-assert True, print("This print", "is not intentional")
+assert True, "This print is not intentional"
# Concatenated string literals combined with Positional arguments
# Expects:
# - single stringliteral concatenated with " " only between `print` and `is`
-assert True, print("This " "print", "is not intentional.")
+assert True, "This print is not intentional."
# Positional arguments, string literals with a variable
# Expects:
# - single FString concatenated with " "
-assert True, print("This", print.__name__, "is not intentional.")
+assert True, f"This {print.__name__} is not intentional."
# Mixed brackets string literals
# Expects:
# - single StringLiteral concatenated with " "
-assert True, print("This print", 'is not intentional', """and should be removed""")
+assert True, "This print is not intentional and should be removed"
# Mixed brackets with other brackets inside
# Expects:
# - single StringLiteral concatenated with " " and escaped brackets
-assert True, print("This print", 'is not "intentional"', """and "should" be 'removed'""")
+assert True, "This print is not \"intentional\" and \"should\" be 'removed'"
# Positional arguments, string literals with a separator
# Expects:
# - single StringLiteral concatenated with "|"
-assert True, print("This print", "is not intentional", sep="|")
+assert True, "This print|is not intentional"
# Positional arguments, string literals with None as separator
# Expects:
# - single StringLiteral concatenated with " "
-assert True, print("This print", "is not intentional", sep=None)
+assert True, "This print is not intentional"
# Positional arguments, string literals with variable as separator, needs f-string
# Expects:
# - single FString concatenated with "{U00A0}"
-assert True, print("This print", "is not intentional", sep=U00A0)
+assert True, f"This print{U00A0}is not intentional"
# Unnecessary f-string
# Expects:
# - single StringLiteral
-assert True, print(f"This f-string is just a literal.")
+assert True, "This f-string is just a literal."
# Positional arguments, string literals and f-strings
# Expects:
# - single FString concatenated with " "
-assert True, print("This print", f"is not {'intentional':s}")
+assert True, f"This print is not {'intentional':s}"
# Positional arguments, string literals and f-strings with a separator
# Expects:
# - single FString concatenated with "|"
-assert True, print("This print", f"is not {'intentional':s}", sep="|")
+assert True, f"This print|is not {'intentional':s}"
# A single f-string
# Expects:
# - single FString
-assert True, print(f"This print is not {'intentional':s}")
+assert True, f"This print is not {'intentional':s}"
# A single f-string with a redundant separator
# Expects:
# - single FString
-assert True, print(f"This print is not {'intentional':s}", sep="|")
+assert True, f"This print is not {'intentional':s}"
# Complex f-string with variable as separator
# Expects:
# - single FString concatenated with "{U00A0}", all placeholders preserved
condition = "True is True"
maintainer = "John Doe"
-assert True, print("Unreachable due to", condition, f", ask {maintainer} for advice", sep=U00A0)
+assert True, f"Unreachable due to{U00A0}{condition}{U00A0}, ask {maintainer} for advice"
# Empty print
# Expects:
# - `msg` entirely removed from assertion
-assert True, print()
+assert True
# Empty print with separator
# Expects:
# - `msg` entirely removed from assertion
-assert True, print(sep=" ")
+assert True
# Custom print function that actually returns a string
# Expects:
@@ -100,4 +100,4 @@
# Use of `builtins.print`
# Expects:
# - single StringLiteral
-assert True, builtins.print("This print should be removed.")
+assert True, "This print should be removed."
```
## Known Issues
The current implementation resolves all arguments and separators of the
`print` expression into a single string, be it
`StringLiteralValue::single` or a `FStringValue::single`. This:
- potentially joins together strings well beyond the ideal character
limit for each line, and
- does not preserve multi-line strings in their original format, in
favour of a single line `"...\n...\n..."` format.
These are purely formatting issues only occurring in unusual scenarios.
Additionally, the autofix will tolerate `print` calls that were
previously invalid:
```python
assert True, print("this", "should not be allowed", sep=42)
```
This will be transformed into
```python
assert True, f"this{42}should not be allowed"
```
which some could argue is an alteration of behaviour.
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Documentation mentions:
> PEP 563 enabled the use of a number of convenient type annotations,
such as `list[str]` instead of `List[str]`
but it meant [PEP 585](https://peps.python.org/pep-0585/) instead.
[PEP 563](https://peps.python.org/pep-0563/) is the one defining `from
__future__ import annotations`.
## Test Plan
No automated test required, just verify that
https://peps.python.org/pep-0585/ is the correct reference.
## Summary
I look at the token stream a lot, not specifically in the playground but
in the terminal output and it's annoying to scroll a lot to find
specific location. Most of the information is also redundant.
The final format we end up with is: `<kind> <range> (flags = ...)` e.g.,
`String 0..4 (flags = BYTE_STRING)` where the flags part is only
populated if there are any flags set.
## Summary
Fixes#11651.
Fixes#11851.
We were double-closing a notebook document from the index, once in
`textDocument/didClose` and then in the `notebookDocument/didClose`
handler. The second time this happens, taking a snapshot fails.
I've rewritten how we handle snapshots for closing notebooks / notebook
cells so that any failure is simply logged instead of propagating
upwards. This implementation works consistently even if we don't receive
`textDocument/didClose` notifications for each specific cell, since they
get closed (and the diagnostics get cleared) in the notebook document
removal process.
## Test Plan
1. Open an untitled, unsaved notebook with the `Create: New Jupyter
Notebook` command from the VS Code command palette (`Ctrl/Cmd + Shift +
P`)
2. Without saving the document, close it.
3. No error popup should appear.
4. Run the debug command (`Ruff: print debug information`) to confirm
that there are no open documents
## Summary
This PR does some housekeeping into moving certain structs into related
modules. Specifically,
1. Move `LexicalError` from `lexer.rs` to `error.rs` which also contains
the `ParseError`
2. Move `Token`, `TokenFlags` and `TokenValue` from `lexer.rs` to
`token.rs`
## Summary
This PR removes the duplication around `is_trivia` functions.
There are two of them in the codebase:
1. In `pycodestyle`, it's for newline, indent, dedent, non-logical
newline and comment
2. In the parser, it's for non-logical newline and comment
The `TokenKind::is_trivia` method used (1) but that's not correct in
that context. So, this PR introduces a new `is_non_logical_token` helper
method for the `pycodestyle` crate and updates the
`TokenKind::is_trivia` implementation with (2).
This also means we can remove `Token::is_trivia` method and the
standalone `token_source::is_trivia` function and use the one on
`TokenKind`.
## Test Plan
`cargo insta test`
## Summary
Closes#11914.
This PR introduces a snapshot test that replays the LSP requests made
during a document formatting request, and confirms that the notebook
document is updated in the expected way.
## Summary
Fixes#11911.
`shellexpand` is now used on `logFile` to expand the file path, allowing
the usage of `~` and environment variables.
## Test Plan
1. Set `logFile` in either Neovim or Helix to a file path that needs
expansion, like `~/.config/helix/ruff_logs.txt`.
2. Ensure that `RUFF_TRACE` is set to `messages` or `verbose`
3. Open a Python file in Neovim/Helix
4. Confirm that a file at the path specified was created, with the
expected logs.
## Summary
This PR updates the logical line rules entry-point function to only run
the logic if any of the rules within that group is enabled.
Although this shouldn't really give any performance improvements, it's
better not to do additional work if we can. This is also consistent with
how other rules are run.
## Test Plan
`cargo insta test`
## Summary
This PR avoids moving back the lexer for a triple-quoted f-string during
the re-lexing phase.
The reason this is a problem is that for a triple-quoted f-string the
newlines are part of the f-string itself, specifically they'll be part
of the `FStringMiddle` token. So, if we moved the lexer back, there
would be a `Newline` token whose range would be in between an
`FStringMiddle` token. This creates a panic in downstream usage.
fixes: #11937
## Test Plan
Add test cases and validate the snapshots.
## Summary
This PR avoids the `depth` counter when detecting indentation from
non-logical lines because it seems to never be used. It might have been
a leftover when the logic was added originally in #11608.
## Test Plan
`cargo insta test`
## Summary
This PR updates the linter to show all the parse errors as diagnostics
instead of just the first one.
Note that this doesn't affect the parse error displayed as error log
message. This will be removed in a follow-up PR.
### Breaking?
I don't think this is a breaking change even though this might give more
diagnostics. The main reason is that this shouldn't affect any users
because it'll only give additional diagnostics in the case of multiple
syntax errors.
## Test Plan
Add an integration test case which would raise more than one parse
error.
## Summary
This PR updates the re-lexing logic to avoid consuming the trailing
whitespace and move the lexer explicitly to the last newline character
encountered while moving backwards.
Consider the following code snippet as taken from the test case
highlighted with whitespace (`.`) and newline (`\n`) characters:
```py
# There are trailing whitespace before the newline character but those whitespaces are
# part of the comment token
f"""hello {x # comment....\n
# ^
y = 1\n
```
The parser is at `y` when it's trying to recover from an unclosed `{`,
so it calls into the re-lexing logic which tries to move the lexer back
to the end of the previous line. But, as it consumed all whitespaces it
moved the lexer to the location marked by `^` in the above code snippet.
But, those whitespaces are part of the comment token. This means that
the range for the two tokens were overlapping which introduced the
panic.
Note that this is only a bug when there's a comment with a trailing
whitespace otherwise it's fine to move the lexer to the whitespace
character. This is because the lexer would just skip the whitespace
otherwise. Nevertheless, this PR updates the logic to move it explicitly
to the newline character in all cases.
fixes: #11929
## Test Plan
Add test cases and update the snapshot. Make sure that it doesn't panic
on the code snippet in the linked issue.
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
related to https://github.com/astral-sh/ruff/issues/5306
The check right now only checks in the first 1024 bytes, and that's
really not enough when there's a docstring at the beginning of a file.
A more proper fix might be needed, which might be more complex (and I
don't have the `rust` skills to implement that). But this temporary
"fix" might enable more users to use this.
Context: We want to use this rule in
https://github.com/scikit-learn/scikit-learn/ and we got blocked because
of this hardcoded rule (which TBH took us quite a while to figure out
why it was failing since it's not documented).
## Test Plan
This is already kinda tested, modified the test for the new byte number.
<!-- How was it tested? -->
## Summary
This PR removes most of the syntax errors from the test cases. This
would create noise when https://github.com/astral-sh/ruff/pull/11901 is
complete. These syntax errors are also just noise for the test itself.
## Test Plan
Update the snapshots and verify that they're still the same.
## Summary
This PR is a follow-up on #11845 to add the re-lexing logic for normal
list parsing.
A normal list parsing is basically parsing elements without any
separator in between i.e., there can only be trivia tokens in between
the two elements. Currently, this is only being used for parsing
**assignment statement** and **f-string elements**. Assignment
statements cannot be in a parenthesized context, but f-string can have
curly braces so this PR is specifically for them.
I don't think this is an ideal recovery but the problem is that both
lexer and parser could add an error for f-strings. If the lexer adds an
error it'll emit an `Unknown` token instead while the parser adds the
error directly. I think we'd need to move all f-string errors to be
emitted by the parser instead. This way the parser can correctly inform
the lexer that it's out of an f-string and then the lexer can pop the
current f-string context out of the stack.
## Test Plan
Add test cases, update the snapshots, and run the fuzzer.
## Summary
Fixes https://github.com/astral-sh/ruff-vscode/issues/496.
Cells are no longer removed from the notebook index when a notebook gets
updated, but rather when `textDocument/didClose` is called for them.
This solves an issue where their premature removal from the notebook
cell index would cause their URL to be un-queryable in the
`textDocument/didClose` handler.
## Test Plan
Create and then delete a notebook cell in VS Code. No error should
appear.
## Summary
This PR implements the re-lexing logic in the parser.
This logic is only applied when recovering from an error during list
parsing. The logic is as follows:
1. During list parsing, if an unexpected token is encountered and it
detects that an outer context can understand it and thus recover from
it, it invokes the re-lexing logic in the lexer
2. This logic first checks if the lexer is in a parenthesized context
and returns if it's not. Thus, the logic is a no-op if the lexer isn't
in a parenthesized context
3. It then reduces the nesting level by 1. It shouldn't reset it to 0
because otherwise the recovery from nested list parsing will be
incorrect
4. Then, it tries to find last newline character going backwards from
the current position of the lexer. This avoids any whitespaces but if it
encounters any character other than newline or whitespace, it aborts.
5. Now, if there's a newline character, then it needs to be re-lexed in
a logical context which means that the lexer needs to emit it as a
`Newline` token instead of `NonLogicalNewline`.
6. If the re-lexing gives a different token than the current one, the
token source needs to update it's token collection to remove all the
tokens which comes after the new current position.
It turns out that the list parsing isn't that happy with the results so
it requires some re-arranging such that the following two errors are
raised correctly:
1. Expected comma
2. Recovery context error
For (1), the following scenarios needs to be considered:
* Missing comma between two elements
* Half parsed element because the grammar doesn't allow it (for example,
named expressions)
For (2), the following scenarios needs to be considered:
1. If the parser is at a comma which means that there's a missing
element otherwise the comma would've been consumed by the first `eat`
call above. And, the parser doesn't take the re-lexing route on a comma
token.
2. If it's the first element and the current token is not a comma which
means that it's an invalid element.
resolves: #11640
## Test Plan
- [x] Update existing test snapshots and validate them
- [x] Add additional test cases specific to the re-lexing logic and
validate the snapshots
- [x] Run the fuzzer on 3000+ valid inputs
- [x] Run the fuzzer on invalid inputs
- [x] Run the parser on various open source projects
- [x] Make sure the ecosystem changes are none
## Summary
This PR updates the server capabilities to include the commands that
Ruff supports. This is similar to how there's a list of possible code
actions supported by the server.
I noticed this when I was trying to find whether Helix supported
workspace commands or not based on Jane's comment
(https://github.com/astral-sh/ruff/pull/11831#discussion_r1634984921)
and I found the `:lsp-workspace-command` in the editor but it didn't
show up anything in the picker.
So, I looked at the implementation in Helix
(9c479e6d2d/helix-term/src/commands/typed.rs (L1372-L1384))
which made me realize that Ruff doesn't provide this in its
capabilities. Currently, this does require `ruff` to be first in the
list of language servers in the user config but that should be resolved
by https://github.com/helix-editor/helix/pull/10176. So, the following
config should work:
```toml
[[language]]
name = "python"
# Ruff should come first until https://github.com/helix-editor/helix/pull/10176 is released
language-servers = ["ruff", "pyright"]
```
## Test Plan
1. Neovim's server capabilities output should include the supported
commands:
```
executeCommandProvider = {
commands = { "ruff.applyFormat", "ruff.applyAutofix", "ruff.applyOrganizeImports", "ruff.printDebugInformation" },
workDoneProgress = false
},
```
2. Helix should now display the commands to pick from when
`:lsp-workspace-command` is invoked:
<img width="832" alt="Screenshot 2024-06-13 at 08 47 14"
src="https://github.com/astral-sh/ruff/assets/67177269/09048ecd-c974-4e09-ab56-9482ff3d780b">
## Summary
This PR adds a new enum to determine the kind of terminator token i.e.,
is it actually terminates the list or is it used for error recovery.
This is important because the parser should take the error recovery
route in case the terminator token is used for better error recovery.
This will then try to re-lex the token if it's the case.
I haven't updated any reference to use this new enum as otherwise it'll
update the snapshots. I plan to do that in a follow-up PR so that it's
easier to reason about.
## Test plan
`cargo insta test`
## Summary
This PR separates the terminator token for f-string elements depending
on the context. A list of f-string element can occur either in a regular
f-string or a format spec of an f-string. The terminator token is
different depending on that context.
## Test Plan
`cargo insta test` and verify the updated snapshots.
## Summary
This PR re-uses the `ruff_python_trivia::is_python_whitespace` in the
lexer instead of defining its own. This was mainly to avoid circular
dependency which was resolved in #11261.
## Summary
Add Constraint nodes to flow graph, and narrow types based on that (only
`is None` and `is not None` narrowing supported for now, to prototype
the structure.)
Also add simplification of zero- and one-element unions and
intersections, and flattening of intersections.
There's a lot more normalization logic needed for unions and
intersections (as is obvious from the inferred type in the added
`narrow_none` test), but this will be non-trivial and I'd rather do it
in a separate PR.
Here's a flowchart diagram for the code in the added `narrow_none` test:

The top branch is for the `if` expression in the initial assignment to
`x`; that `Constraint` node would only affect the type of `flag`, which
we don't care about in this test.
The second branch is for the `if` statement, with `Constraint` node
affecting the type of `x`.
## Test Plan
Added tests.
## Summary
Fixes#11744.
We now show a distinct popup message when we fail to get a document
snapshot during command execution. This message more clearly
communicates the issue to the user, instead of a generic "ruff
encountered an error" message.
## Test Plan
Try running `Fix all auto-fixable problems` on an incompatible file (for
example: `settings.json`). You should see the following popup message:
<img width="456" alt="Screenshot 2024-06-11 at 11 47 16 AM"
src="https://github.com/astral-sh/ruff/assets/19577865/3a28e3d7-3896-4dd0-b117-f87300dd3b68">
## Summary
Closes#11715.
Introduces a new command, `ruff.printDebugInformation`. This will print
useful information about the status of the server to `stderr`.
Right now, the information shown by this command includes:
* The path to the server executable
* The version of the executable
* The text encoding being used
* The number of open documents and workspaces
* A list of registered configuration files
* The capabilities of the client
## Test Plan
First, checkout and use [the corresponding `ruff-vscode`
PR](https://github.com/astral-sh/ruff-vscode/pull/495).
Running the `Print debug information` command in VS Code should show
something like the following in the Output channel:
<img width="991" alt="Screenshot 2024-06-11 at 11 41 46 AM"
src="https://github.com/astral-sh/ruff/assets/19577865/ab93c009-bb7b-4291-b057-d44fdc6f9f86">
## Summary
Fixes#10968.
Fixes#11545.
The server's tracing system has been rewritten from the ground up. The
server now has trace level and log level settings which restrict the
tracing events and spans that get logged.
* A `logLevel` setting has been added, which lets a user set the log
level. By default, it is set to `"info"`.
* A `logFile` setting has also been added, which lets the user supply an
optional file to send tracing output (it does not have to exist as a
file yet). By default, if this is unset, tracing output will be sent to
`stderr`.
* A `$/setTrace` handler has also been added, and we also set the trace
level from the initialization options. For editors without direct
support for tracing, the environment variable `RUFF_TRACE` can override
the trace level.
* Small changes have been made to how we display tracing output. We no
longer use `tracing-tree`, and instead use
`tracing_subscriber::fmt::Layer` to format output. Thread names are now
included in traces, and I've made some adjustment to thread worker names
to be more useful.
## Test Plan
In VS Code, with `ruff.trace.server` set to its default value, no logs
from Ruff should appear.
After changing `ruff.trace.server` to either `messages` or `verbose`,
you should see log messages at `info` level or higher appear in Ruff's
output:
<img width="1005" alt="Screenshot 2024-06-10 at 10 35 04 AM"
src="https://github.com/astral-sh/ruff/assets/19577865/6050d107-9815-4bd2-96d0-e86f096a57f5">
In Helix, by default, no logs from Ruff should appear.
To set the trace level in Helix, you'll need to modify your language
configuration as follows:
```toml
[language-server.ruff]
command = "/Users/jane/astral/ruff/target/debug/ruff"
args = ["server", "--preview"]
environment = { "RUFF_TRACE" = "messages" }
```
After doing this, logs of `info` level or higher should be visible in
Helix:
<img width="1216" alt="Screenshot 2024-06-10 at 10 39 26 AM"
src="https://github.com/astral-sh/ruff/assets/19577865/8ff88692-d3f7-4fd1-941e-86fb338fcdcc">
You can use `:log-open` to quickly open the Helix log file.
In Neovim, by default, no logs from Ruff should appear.
To set the trace level in Neovim, you'll need to modify your
configuration as follows:
```lua
require('lspconfig').ruff.setup {
cmd = {"/path/to/debug/executable", "server", "--preview"},
cmd_env = { RUFF_TRACE = "messages" }
}
```
You should see logs appear in `:LspLog` that look like the following:
<img width="1490" alt="Screenshot 2024-06-11 at 11 24 01 AM"
src="https://github.com/astral-sh/ruff/assets/19577865/576cd5fa-03cf-477a-b879-b29a9a1200ff">
You can adjust `logLevel` and `logFile` in `settings`:
```lua
require('lspconfig').ruff.setup {
cmd = {"/path/to/debug/executable", "server", "--preview"},
cmd_env = { RUFF_TRACE = "messages" },
settings = {
logLevel = "debug",
logFile = "your/log/file/path/log.txt"
}
}
```
The `logLevel` and `logFile` can also be set in Helix like so:
```toml
[language-server.ruff.config.settings]
logLevel = "debug"
logFile = "your/log/file/path/log.txt"
```
Even if this log file does not exist, it should now be created and
written to after running the server:
<img width="1148" alt="Screenshot 2024-06-10 at 10 43 44 AM"
src="https://github.com/astral-sh/ruff/assets/19577865/ab533cf7-d5ac-4178-97f1-e56da17450dd">
## Summary
This PR updates the parser to remove building the `CommentRanges` and
instead it'll be built by the linter and the formatter when it's
required.
For the linter, it'll be built and owned by the `Indexer` while for the
formatter it'll be built from the `Tokens` struct and passed as an
argument.
## Test Plan
`cargo insta test`
## Summary
The fix for E203 now produces the same result as ruff format in cases
where a slice ends on a colon and the closing square bracket is on the
following line.
Refers to https://github.com/astral-sh/ruff/issues/10973
## Test Plan
The minimal reproduction case in the ticket was added as test case
producing no error. Additional cases with multiple spaces or a tab
before the colon where added to make sure that the rule still finds
these.
## Summary
As-is, we're using the URL path for all files, leading us to use paths
like:
```
/c%3A/Users/crmar/workspace/fastapi/tests/main.py
```
This doesn't match against per-file ignores and other patterns in Ruff
configuration.
This PR modifies the LSP to use the real file path if available, and the
virtual file path if not.
Closes https://github.com/astral-sh/ruff/issues/11751.
## Test Plan
Ran the LSP on Windows. In the FastAPI repo, added:
```toml
[tool.ruff.lint.per-file-ignores]
"tests/**/*.py" = ["F401"]
```
And verified that an unused import was ignored in `tests` after this
change, but not before.
## Summary
This PR removes the `result-like` dependency and instead implement the
required functionality. The motivation being that `noqa.is_enabled()` is
easier to read than `noqa.into()`.
For context, I was just trying to understand the syntax error workflow
and I saw these flags which were being converted via `into`. I always
find `into` confusing because you never know what's it being converted
into unless you know the type. Later realized that it's just a boolean
flag. After removing the usages from these two flags, it turns out that
the dependency is only being used in one rule so I thought to remove
that as well.
## Test Plan
`cargo insta test`
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
This PR implements the [consider dict
items](https://pylint.pycqa.org/en/latest/user_guide/messages/convention/consider-using-dict-items.html)
rule from Pylint. Enabling this rule flags:
```python
ORCHESTRA = {
"violin": "strings",
"oboe": "woodwind",
"tuba": "brass",
"gong": "percussion",
}
for instrument in ORCHESTRA:
print(f"{instrument}: {ORCHESTRA[instrument]}")
for instrument in ORCHESTRA.keys():
print(f"{instrument}: {ORCHESTRA[instrument]}")
for instrument in (inline_dict := {"foo": "bar"}):
print(f"{instrument}: {inline_dict[instrument]}")
```
For not using `items()` to extract the value out of the dict. We ignore
the case of an assignment, as you can't modify the underlying
representation with the value in the list of tuples returned.
## Test Plan
<!-- How was it tested? -->
`cargo test`.
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
Definitions are used in symbol table and in flow graph, and aren't
inherently owned by one or the other; move them into their own
submodule.
## Test Plan
Existing tests.
## Summary
Add support for inferring int literal types from basic arithmetic on int
literals. Just to begin showing examples of resolving more complex
expression types, and because this will be useful in testing walrus
expressions.
## Test Plan
Added test.
## Summary
After looking at this a bit, I think it does make sense to have
`Unbound` as part of the `Definition` enum; if we are modeling `Unbound`
as a type (which currently we are), then every symbol implicitly starts
each scope with a "definition" as unbound, and the cleanest way to model
that is as a real `Definition`. We should be able to handle a definition
of "unbound" anywhere we handle definitions.
But the name `None` wasn't clear enough; changing the name to `Unbound`
and adding a doc comment.
Also change `[first].into_iter()` to `std::iter::once(first)`, from
post-land code review on a prior PR.
## Test Plan
Existing tests.
## Summary
This PR is a follow-up to #11740 to restrict access to the `Parsed`
output by replacing the `parsed` API function with a more specific one.
Currently, that is `comment_ranges` but the linked PR exposes a `tokens`
method.
The main motivation is so that there's no way to get an incorrect
information from the checker. And, it also encapsulates the source of
the comment ranges and the tokens itself. This way it would become
easier to just update the checker if the source for these information
changes in the future.
## Test Plan
`cargo insta test`
## Summary
This PR fixes a bug where the checker would require the tokens for an
invalid offset w.r.t. the source code.
Taking the source code from the linked issue as an example:
```py
relese_version :"0.0is 64"
```
Now, this isn't really a valid type annotation but that's what this PR
is fixing. Regardless of whether it's valid or not, Ruff shouldn't
panic.
The checker would visit the parsed type annotation (`0.0is 64`) and try
to detect any violations. Certain rule logic requests the tokens for the
same but it would fail because the lexer would only have the `String`
token considering original source code. This worked before because the
lexer was invoked again for each rule logic.
The solution is to store the parsed type annotation on the checker if
it's in a typing context and use the tokens from that instead if it's
available. This is enforced by creating a new API on the checker to get
the tokens.
But, this means that there are two ways to get the tokens via the
checker API. I want to restrict this in a follow-up PR (#11741) to only
expose `tokens` and `comment_ranges` as methods and restrict access to
the parsed source code.
fixes: #11736
## Test Plan
- [x] Add a test case for `F632` rule and update the snapshot
- [x] Check all affected rules
- [x] No ecosystem changes
## Summary
This PR updates the return type of `parse_type_annotation` from `Expr`
to `Parsed<ModExpression>`. This is to allow accessing the tokens for
the parsed sub-expression in the follow-up PR.
## Test Plan
`cargo insta test`
## Summary
Fixes https://github.com/astral-sh/ruff-vscode/issues/482.
I've made adjustments to `format` and `format_range` that handle parsing
errors before they become server errors. We'll still log this as a
problem, but there will no longer be a visible popup.
## Test Plan
Instead of seeing a visible error when formatting a document with syntax
issues, you should see this warning in the LSP logs:
<img width="991" alt="Screenshot 2024-06-04 at 3 38 23 PM"
src="https://github.com/astral-sh/ruff/assets/19577865/9d68947d-6462-4ca6-ab5a-65e573c91db6">
Similarly, if you try to format a range with syntax issues, you should
see this warning in the LSP logs instead of a visible error popup:
<img width="1010" alt="Screenshot 2024-06-04 at 3 39 10 PM"
src="https://github.com/astral-sh/ruff/assets/19577865/99fff098-798d-406a-976e-81ead0da0352">
---------
Co-authored-by: Zanie Blue <contact@zanie.dev>
## Summary
This PR fixes a bug where the lexer didn't consider the BOM into the
start offset.
fixes: #11731
## Test Plan
Add multiple test cases which involves BOM character in the source for
the lexer and verify the snapshot.
## Summary
This PR updates the lexer checkpoint to store the cursor offset instead
of cloning the cursor itself. This reduces the size of `LexerCheckpoint`
from 136 to 112 bytes and also removes the need for lifetime.
## Test Plan
`cargo insta test`
## Summary
Ensures that we respect per-file ignores and exemptions for these rules.
Specifically, we allow:
```python
# ruff: noqa: PGH004
```
...to ignore `PGH004`.
## Summary
Should resolve https://github.com/astral-sh/ruff/issues/11454.
This is my first PR to `ruff`, so I may have missed something.
If I understood the suggestion in the issue correctly, rule `PGH004`
should be set to `Preview` again.
## Test Plan
Created two fixtures derived from the issue.
## Summary
Switch name resolution in `infer_expression_type` from resolving the
public type of a symbol, to resolving the reachable definitions of that
symbol from the reference point, using the flow graph.
This surfaced a bug in the flow graph implementation and a bug in symbol
table building, both of which are also fixed here.
The bug in flow graph implementation was that when we pushed and popped
scopes, we didn't maintain a stack of "current flow nodes" in all
stacked scopes, to be restored when we returned to that scope. Now we
do.
The bug in symbol table building that we didn't visit the parts of
functions and class definitions in the correct scopes. E.g. decorators
should be visited in the outer scope, arguments should be visited inside
the type-params scope (if any) but not inside the function body scope,
and only the body itself should actually be visited inside the body
scope. Fixing this requires that we no longer use `walk_stmt` here,
instead we have to visit each individual component.
## Test Plan
Added test.
## Summary
Rename `infer_symbol_type` to `infer_symbol_public_type`, and allow it
to work on symbols with more than one definition. For now, use the most
cautious/sound inference, which is the union of all definitions. We can
prune this union more in future by eliminating definitions if we can
show that they can't be visible (this requires both that the symbol is
definitely later reassigned, and that there is no intervening
call/import that might be able to see the over-written definition).
## Test Plan
Added a test showing inference of union from multiple definitions.
## Summary
This PR fixes a bug where the `Generator` wouldn't add a newline before
a type alias statement. This is because it wasn't using the `statement`
macro which takes care of the newline.
Without this fix, a code like:
```py
type X = int
type Y = str
```
The generator would produce:
```py
type X = inttype Y = str
```
## Test Plan
Add a test case.
## Summary
This PR removes the following dependencies from the `ruff_python_parser`
crate:
* `anyhow` (moved to dev dependencies)
* `is-macro`
* `itertools`
The main motivation is that they aren't used much.
Additionally, it updates the return type of `parse_type_annotation` to
use a more specific `ParseError` instead of the generic `anyhow::Error`.
## Test Plan
`cargo insta test`
## Summary
This PR updates the logic for parsing type annotation to accept a
`ExprStringLiteral` node instead of the string value and the range.
The main motivation of this change is to simplify the implementation of
`parse_type_annotation` function with:
* Use the `opener_len` and `closer_len` from the string flags to get the
raw contents range instead of extracting it via
* `str::leading_quote(expression).unwrap().text_len()`
* `str::trailing_quote(expression).unwrap().text_len()`
* Avoid comparing the string content if we already know that it's
implicitly concatenated
## Test Plan
`cargo insta test`
## Summary
This PR re-orders the lexer methods in the following order:
1. `next_token`
2. `lex_token`
3. `eat_indentation`
4. `handle_indentation`
5. `skip_whitespace`
6. `consume_ascii_character`
7. `try_single_char_prefix`
8. `try_double_char_prefix`
9. `lex_identifier`
10. `lex_fstring_start`
11. `lex_fstring_middle_or_end`
12. `lex_string`
13. `lex_number`
14. `lex_number_radix`
15. `lex_decimal_number`
16. `radix_run`
17. `lex_comment`
18. `lex_ipython_escape_command`
19. `consume_end`
Following was considered for the ordering:
* 1 is the main entry point which delegates to 2
* 3, 4, 5 are all related to whitespace which is done first
* 6 is the entrypoint for an ascii character which delegates to 9, 12,
13, 17, 18, 19
* Others are grouped around similar kind of methods
## Summary
This PR updates the entire parser stack in multiple ways:
### Make the lexer lazy
* https://github.com/astral-sh/ruff/pull/11244
* https://github.com/astral-sh/ruff/pull/11473
Previously, Ruff's lexer would act as an iterator. The parser would
collect all the tokens in a vector first and then process the tokens to
create the syntax tree.
The first task in this project is to update the entire parsing flow to
make the lexer lazy. This includes the `Lexer`, `TokenSource`, and
`Parser`. For context, the `TokenSource` is a wrapper around the `Lexer`
to filter out the trivia tokens[^1]. Now, the parser will ask the token
source to get the next token and only then the lexer will continue and
emit the token. This means that the lexer needs to be aware of the
"current" token. When the `next_token` is called, the current token will
be updated with the newly lexed token.
The main motivation to make the lexer lazy is to allow re-lexing a token
in a different context. This is going to be really useful to make the
parser error resilience. For example, currently the emitted tokens
remains the same even if the parser can recover from an unclosed
parenthesis. This is important because the lexer emits a
`NonLogicalNewline` in parenthesized context while a normal `Newline` in
non-parenthesized context. This different kinds of newline is also used
to emit the indentation tokens which is important for the parser as it's
used to determine the start and end of a block.
Additionally, this allows us to implement the following functionalities:
1. Checkpoint - rewind infrastructure: The idea here is to create a
checkpoint and continue lexing. At a later point, this checkpoint can be
used to rewind the lexer back to the provided checkpoint.
2. Remove the `SoftKeywordTransformer` and instead use lookahead or
speculative parsing to determine whether a soft keyword is a keyword or
an identifier
3. Remove the `Tok` enum. The `Tok` enum represents the tokens emitted
by the lexer but it contains owned data which makes it expensive to
clone. The new `TokenKind` enum just represents the type of token which
is very cheap.
This brings up a question as to how will the parser get the owned value
which was stored on `Tok`. This will be solved by introducing a new
`TokenValue` enum which only contains a subset of token kinds which has
the owned value. This is stored on the lexer and is requested by the
parser when it wants to process the data. For example:
8196720f80/crates/ruff_python_parser/src/parser/expression.rs (L1260-L1262)
[^1]: Trivia tokens are `NonLogicalNewline` and `Comment`
### Remove `SoftKeywordTransformer`
* https://github.com/astral-sh/ruff/pull/11441
* https://github.com/astral-sh/ruff/pull/11459
* https://github.com/astral-sh/ruff/pull/11442
* https://github.com/astral-sh/ruff/pull/11443
* https://github.com/astral-sh/ruff/pull/11474
For context,
https://github.com/RustPython/RustPython/pull/4519/files#diff-5de40045e78e794aa5ab0b8aacf531aa477daf826d31ca129467703855408220
added support for soft keywords in the parser which uses infinite
lookahead to classify a soft keyword as a keyword or an identifier. This
is a brilliant idea as it basically wraps the existing Lexer and works
on top of it which means that the logic for lexing and re-lexing a soft
keyword remains separate. The change here is to remove
`SoftKeywordTransformer` and let the parser determine this based on
context, lookahead and speculative parsing.
* **Context:** The transformer needs to know the position of the lexer
between it being at a statement position or a simple statement position.
This is because a `match` token starts a compound statement while a
`type` token starts a simple statement. **The parser already knows
this.**
* **Lookahead:** Now that the parser knows the context it can perform
lookahead of up to two tokens to classify the soft keyword. The logic
for this is mentioned in the PR implementing it for `type` and `match
soft keyword.
* **Speculative parsing:** This is where the checkpoint - rewind
infrastructure helps. For `match` soft keyword, there are certain cases
for which we can't classify based on lookahead. The idea here is to
create a checkpoint and keep parsing. Based on whether the parsing was
successful and what tokens are ahead we can classify the remaining
cases. Refer to #11443 for more details.
If the soft keyword is being parsed in an identifier context, it'll be
converted to an identifier and the emitted token will be updated as
well. Refer
8196720f80/crates/ruff_python_parser/src/parser/expression.rs (L487-L491).
The `case` soft keyword doesn't require any special handling because
it'll be a keyword only in the context of a match statement.
### Update the parser API
* https://github.com/astral-sh/ruff/pull/11494
* https://github.com/astral-sh/ruff/pull/11505
Now that the lexer is in sync with the parser, and the parser helps to
determine whether a soft keyword is a keyword or an identifier, the
lexer cannot be used on its own. The reason being that it's not
sensitive to the context (which is correct). This means that the parser
API needs to be updated to not allow any access to the lexer.
Previously, there were multiple ways to parse the source code:
1. Passing the source code itself
2. Or, passing the tokens
Now that the lexer and parser are working together, the API
corresponding to (2) cannot exists. The final API is mentioned in this
PR description: https://github.com/astral-sh/ruff/pull/11494.
### Refactor the downstream tools (linter and formatter)
* https://github.com/astral-sh/ruff/pull/11511
* https://github.com/astral-sh/ruff/pull/11515
* https://github.com/astral-sh/ruff/pull/11529
* https://github.com/astral-sh/ruff/pull/11562
* https://github.com/astral-sh/ruff/pull/11592
And, the final set of changes involves updating all references of the
lexer and `Tok` enum. This was done in two-parts:
1. Update all the references in a way that doesn't require any changes
from this PR i.e., it can be done independently
* https://github.com/astral-sh/ruff/pull/11402
* https://github.com/astral-sh/ruff/pull/11406
* https://github.com/astral-sh/ruff/pull/11418
* https://github.com/astral-sh/ruff/pull/11419
* https://github.com/astral-sh/ruff/pull/11420
* https://github.com/astral-sh/ruff/pull/11424
2. Update all the remaining references to use the changes made in this
PR
For (2), there were various strategies used:
1. Introduce a new `Tokens` struct which wraps the token vector and add
methods to query a certain subset of tokens. These includes:
1. `up_to_first_unknown` which replaces the `tokenize` function
2. `in_range` and `after` which replaces the `lex_starts_at` function
where the former returns the tokens within the given range while the
latter returns all the tokens after the given offset
2. Introduce a new `TokenFlags` which is a set of flags to query certain
information from a token. Currently, this information is only limited to
any string type token but can be expanded to include other information
in the future as needed. https://github.com/astral-sh/ruff/pull/11578
3. Move the `CommentRanges` to the parsed output because this
information is common to both the linter and the formatter. This removes
the need for `tokens_and_ranges` function.
## Test Plan
- [x] Update and verify the test snapshots
- [x] Make sure the entire test suite is passing
- [x] Make sure there are no changes in the ecosystem checks
- [x] Run the fuzzer on the parser
- [x] Run this change on dozens of open-source projects
### Running this change on dozens of open-source projects
Refer to the PR description to get the list of open source projects used
for testing.
Now, the following tests were done between `main` and this branch:
1. Compare the output of `--select=E999` (syntax errors)
2. Compare the output of default rule selection
3. Compare the output of `--select=ALL`
**Conclusion: all output were same**
## What's next?
The next step is to introduce re-lexing logic and update the parser to
feed the recovery information to the lexer so that it can emit the
correct token. This moves us one step closer to having error resilience
in the parser and provides Ruff the possibility to lint even if the
source code contains syntax errors.
## Summary
Implement support for RDJson output for `ruff check`, as requested in
#8655.
## Test Plan
Tested using a snapshot test. Same approach as for e.g. the JSON output
formatter.
## Additional info
I tried to keep the implementation close to the JSON implementation.
I had to deviate a bit to make the `suggestions` key work: If there are
no suggestions, then setting `suggestions` to `null` is invalid
according to the JSONSchema. Therefore, I opted for a slightly more
complex implementation, that skips the `suggestions` key entirely if
there are no fixes available for the given diagnostic. Maybe it would
have been easier to set `"suggestions": []`, but I ended up doing it
this way.
I didn't consider notebooks, as I _think_ that RDJson doesn't work with
notebooks. This should be confirmed, and if so, there should be some
form of warning or error emitted when trying to output diagnostics for a
notebook.
I also didn't consider `ruff format`, as this comment:
https://github.com/astral-sh/ruff/issues/8655#issuecomment-1811446160
suggests that that wouldn't be compatible.
I'm new to Rust, any feedback is appreciated. 🙂 I
implemented this in order to have a productive rainy saturday afternoon,
I'm not knowledgeable about RDJson beyond the sources linked in the
issue.
## Summary
This PR implements the rule B901, which is part of the opinionated rules
of `flake8-bugbear`.
This rule seems to be desired in `ruff` as per
https://github.com/astral-sh/ruff/issues/3758 and
https://github.com/astral-sh/ruff/issues/2954#issuecomment-1441162976.
## Test Plan
As this PR was made closely following the
[CONTRIBUTING.md](8a25531a71/CONTRIBUTING.md),
it tests using the snapshot approach, that is described there.
## Sources
The implementation is inspired by [the original implementation in the
`flake8-bugbear`
repository](d1aec4cbef/bugbear.py (L1092)).
The error message and [test
file](d1aec4cbef/tests/b901.py)
where also copied from there.
The documentation I came up with on my own and needs improvement. Maybe
the example given in
https://github.com/astral-sh/ruff/issues/2954#issuecomment-1441162976
could be used, but maybe they are too complex, I'm not sure.
## Open Questions
- [ ] Documentation. (See above.)
- [x] Can I access the parent in a visitor?
The [original
implementation](d1aec4cbef/bugbear.py (L1100))
references the `yield` statement's parent to check if it is an
expression statement. I didn't find a way to do this in `ruff` and used
the `is_expresssion_statement` field on the visitor instead. What are
your thoughts on this? Is it possible and / or desired to access the
parent node here?
- [x] Is `Option::is_some(...)` -> `...unwrap()` the right thing to do?
Referring to [this piece of
code](9d5a280f71/crates/ruff_linter/src/rules/flake8_bugbear/rules/return_x_in_generator.rs?plain=1#L91-L96).
From my understanding, the `.unwrap()` is safe, because it is checked
that `return_` is not `None`. However, I feel like I missed a more
elegant solution that does both in one.
## Other
I don't know a lot about this rule, I just implemented it because I
found it in a
https://github.com/astral-sh/ruff/labels/good%20first%20issue.
I'm new to Rust, so any constructive critisism is appreciated.
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Introduces the skeleton of the flow graph. So far it doesn't actually
handle any non-linear control flow :) But it does show how we can go
from an expression that references a symbol, backward through the flow
graph, to find reachable definitions of that symbol.
Adding non-linear control flow will mean adding flow nodes with multiple
predecessors, which will introduce more complexity into
`ReachableDefinitionsIterator.next()`. But one step at a time.
## Test Plan
Added a (very basic) test.
## Summary
Give red-knot the ability to infer int literal types. This is quick and
easy, mostly because these types are a convenient way to observe
control-flow handling with simple assignments.
## Test Plan
Added test.
## Summary
In the [roadmap for `ruff
server`](https://github.com/astral-sh/ruff/discussions/10581) support
for vim and kate is listed. Therefore I added setup guides for them
based on the neovim guide. As I don't use pyright I wasn't able to
translate the corresponding part from the neovim guide.
## Test Plan
Doesn't apply.
* Potentially resolves#11619 (nondeterministic hashmap order across
different architectures) in F401 by replacing a hashmap with
nondeterministic traversal order with an ordered mapping.
I'm not sure how to test this with our CI/CD. I don't have an s390x
machine at home. Should I try it in Qemu?
## Summary
In an `__init__.py` file, it's not uncommon to lack a logical indent
(since it may just contain imports). In such cases, we were always
falling back to four-space indent. This PR adds detection for indents
within import groups.
Closes https://github.com/astral-sh/ruff/issues/11606.
## Summary
This PR aims to close#10095 by adding an option
`init-allow-undef-export` to the `pyflakes` settings. This option is
currently set to `true` such that behavior is kept identical.
But setting this option to `false` will lead to `F822` warnings to be
shown in all files, **including** `__init__.py` files.
As I've mentioned on #10095, I think `init-allow-undef-export=false`
would be the more user-friendly default option, as it creates fewer
surprises. @charliermarsh what do you think about making that the
default?
With this option in place, it's a single line fix for people that rely
on the old behavior.
And thinking longer term, for future major releases, one could probably
consider deprecating the option and eventually having people just `noqa`
these warnings if they are not wanted.
## Test Plan
I've added a `test_init_f822_enabled` test which repeats the test that
is done in the `init` test but this time with
`init-allow-undef-export=false` and the snap file correctly shows that
ruff will then trigger the otherwise suppressed F822 warning.
closes#10095
## Summary
Removed stray space in sample code snippet that is against ruff's own
default formatting rules.
This documentation appears on
https://docs.astral.sh/ruff/rules/unused-import/
## Test Plan
This is a trivially obvious change, verifiable with `ruff format
--check`
## Summary
Closes https://github.com/astral-sh/ruff/issues/11587.
## Test Plan
- Added a lint error to `test_server.py` in `vscode-ruff`.
- Validated that, prior to this change, diagnostics appeared in the
file.
- Validated that, with this change, no diagnostics were shown.
- Validated that, with this change, no diagnostics were fixed on-save.
## Summary
- Implements `Y066` from `flake8-pyi` as `PYI066`
- Fixes `PYI006` not being raised for `elif` clauses. This would have
conflicted with PYI006's implementation, so decided to do it in the same
PR.
## Test Plan
`cargo test` / `cargo insta review`
* Add a module type, `ModuleTypeId`
* Add an attribute lookup method `get_member` for `Type`
* Only implemented for `ModuleTypeId` and `ClassTypeId`
* [x] Should this be a trait?
*Answer: no*
* [x] Uses `unwrap`, but we should remove that. Maybe add a new variant
to `QueryError`?
*Answer: Return `Option<Type>` as is done elsewhere*
* Add `infer_definition_type` case for `Import`
* Add `infer_expr_type` case for `Attribute`
* Add a test to exercise these
* [x] remove all NOTE/FIXME/TODO after discussing with reviewers
## Summary
This PR ensures that if a variable is bound via `global`, and then the
`global` is read, the originating variable is also marked as read. It's
not perfect, in that it won't detect _rebindings_, like:
```python
from app import redis_connection
def func():
global redis_connection
redis_connection = 1
redis_connection()
```
So, above, `redis_connection` is still marked as unused.
But it does avoid flagging `redis_connection` as unused in:
```python
from app import redis_connection
def func():
global redis_connection
redis_connection()
```
Closes https://github.com/astral-sh/ruff/issues/11518.
## Summary
Follow up to https://github.com/astral-sh/ruff/pull/11521
Removes the extra added complexity for catch all match cases. This
matches the implementation of plain `else` statements.
## Test Plan
Added new test cases.
---------
Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
## Summary
This PR fixes the bug to avoid flattening the global-only settings for
the new server.
This was added in https://github.com/astral-sh/ruff/pull/11497, possibly
to correctly de-serialize an empty value (`{}`). But, this lead to a bug
where the configuration under the `settings` key was not being read for
global-only variant.
By using #[serde(default)], we ensure that the settings field in the
`GlobalOnly` variant is optional and that an empty JSON object `{}` is
correctly deserialized into `GlobalOnly` with a default `ClientSettings`
instance.
fixes: #11507
## Test Plan
Update the snapshot and existing test case. Also, verify the following
settings in Neovim:
1. Nothing
```lua
ruff = {
cmd = {
'/Users/dhruv/work/astral/ruff/target/debug/ruff',
'server',
'--preview',
},
}
```
2. Empty dictionary
```lua
ruff = {
cmd = {
'/Users/dhruv/work/astral/ruff/target/debug/ruff',
'server',
'--preview',
},
init_options = vim.empty_dict(),
}
```
3. Empty `settings`
```lua
ruff = {
cmd = {
'/Users/dhruv/work/astral/ruff/target/debug/ruff',
'server',
'--preview',
},
init_options = {
settings = vim.empty_dict(),
},
}
```
4. With some configuration:
```lua
ruff = {
cmd = {
'/Users/dhruv/work/astral/ruff/target/debug/ruff',
'server',
'--preview',
},
init_options = {
settings = {
configuration = '/tmp/ruff-repro/pyproject.toml',
},
},
}
```
## Summary
This PR brings back the functionality to remove empty strings when
converting to an f-string in `UP032`.
For context, https://github.com/astral-sh/ruff/pull/8712 added this
functionality to remove _trailing_ empty strings but it got removed in
https://github.com/astral-sh/ruff/pull/8697 possibly unexpectedly so.
There's one difference which is that this PR will remove _any_ empty
strings and not just trailing ones. For example,
```diff
--- /Users/dhruv/playground/ruff/src/UP032.py
+++ /Users/dhruv/playground/ruff/src/UP032.py
@@ -1,7 +1,5 @@
(
- "{a}"
- ""
- "{b}"
- ""
-).format(a=1, b=1)
+ f"{1}"
+ f"{1}"
+)
```
## Test Plan
Run `cargo insta test` and update the snapshots.
## Summary
This PR updates the sequence sorting (`RUF022` and `RUF023`) to avoid
using the owned data from the string token. Instead, we will directly
use the reference to the data on the AST. This does introduce a lot of
lifetimes but that's required.
The main motivation for this is to allow removing the `lex_starts_at`
usage easily.
### Alternatives
1. Extract the raw string content (stripping the prefix and quotes)
using the `Locator` and use that for comparison
2. Build up an
[`IndexVec`](3e30962077/crates/ruff_index/src/vec.rs)
and use the newtype index in place of the string value itself. This also
does require lifetimes so we might as well just use the method in this
PR.
## Test Plan
`cargo insta test` and no ecosystem changes