## Summary
The use of `|` as a union operator is not always safe if a type
annotation is evaluated in a runtime context. For example, this code
errors at runtime:
```python
import httpretty
import requests_mock
item: type[requests_mock.Mocker | httpretty] = requests_mock.Mocker
```
However, it's fine in a `.pyi` file, with `from __future__ import annotations`, or
if the annotation is in a non-evaluated context, like:
```python
def func():
    item: type[requests_mock.Mocker | httpretty] = requests_mock.Mocker
```
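The same goes for postponed evaluation; a minimal sketch (using stand-in classes rather than the real `requests_mock`/`httpretty` imports):
```python
from __future__ import annotations


class Mocker: ...


class httpretty: ...  # stand-in for the module object in the example above


# With postponed evaluation, the annotation below is stored as a string and
# never evaluated, so `|` never runs against these objects at runtime.
item: type[Mocker | httpretty] = Mocker
```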
This PR modifies the rule to avoid enforcing it in those runtime-evaluated
contexts, where the fix would be invalid.
Closes https://github.com/astral-sh/ruff/issues/6455.
## Summary
Use the same Python version by default for all tests (our
latest-supported version).
## Test Plan
`cargo test`
---------
Co-authored-by: Zanie <contact@zanie.dev>
## Summary
I think it makes sense for `PythonVersion::default()` to return our
minimum-supported non-EOL version.
## Test Plan
`cargo test`
---------
Co-authored-by: Zanie <contact@zanie.dev>
## Summary
This PR renames the `MagicCommand` token to `IpyEscapeCommand` token and
`MagicKind` to `IpyEscapeKind` type to better reflect the purpose of the
token and type. Similarly, it renames the AST nodes from `LineMagic` to
`IpyEscapeCommand` prefixed with `Stmt`/`Expr` wherever necessary.
It also renames `jupyter_magic` to
`ipython_escape_commands` in various function names.
The mode value is still `Mode::Jupyter` because the escape commands are
part of the IPython syntax but the lexing/parsing is done for a Jupyter
notebook.
### Motivation behind the rename:
* IPython codebase defines it as "EscapeCommand" / "Escape Sequences":
* Escape Sequences:
292e3a2345/IPython/core/inputtransformer2.py (L329-L333)
* Escape command:
292e3a2345/IPython/core/inputtransformer2.py (L410-L411)
* The word "magic" is used mainly for the actual magic commands i.e.,
the ones starting with `%`/`%%`
(https://ipython.readthedocs.io/en/stable/interactive/reference.html#magic-command-system).
So, this avoids any confusion between the Magic token (`%`, `%%`) and
the escape command itself.
## Test Plan
* `cargo test` to make sure all renames are done correctly.
* `grep` for `jupyter_escape`/`magic` to make sure all renames are done
correctly.
## Summary
This PR removes the group around function definition parameters, instead
grouping the parameters with the type parameters and return type
annotation.
This increases Zulip's similarity score from 0.99385 to 0.99699, so it's
a meaningful improvement. However, there's at least one stability error
that I'm working on, and I'm really just looking for high-level feedback
at this point, because I'm not happy with the solution.
Closes https://github.com/astral-sh/ruff/issues/6352.
## Test Plan
Before:
- `zulip`: 0.99396
- `django`: 0.99784
- `warehouse`: 0.99578
- `build`: 0.75436
- `transformers`: 0.99407
- `cpython`: 0.75987
- `typeshed`: 0.74432
After:
- `zulip`: 0.99702
- `django`: 0.99784
- `warehouse`: 0.99585
- `build`: 0.75623
- `transformers`: 0.99470
- `cpython`: 0.75988
- `typeshed`: 0.74853
## Summary
In https://github.com/astral-sh/ruff/pull/6397, the documentation was
updated to state that the default target-version is now "py38", but the
actual default value wasn't updated and remained "py310". This commit
updates the default value to match what the documentation says.
## Summary
This PR adds support for a stricter version of help end escape
commands[^1] in the parser. By stricter, I mean that the escape tokens
are only at the end of the command and there are no tokens at the start.
This makes it difficult to implement it in the lexer without having to
do a lot of look aheads or keeping track of previous tokens.
Now, as we're adding this in the parser, the lexer needs to recognize
and emit a new token for `?`. So, a `Question` token is added, which will
be recognized only in `Jupyter` mode.
The conditions applied are the same as the ones in the original
implementation in IPython codebase (which is a regex):
* There can only be either 1 or 2 question mark(s) at the end
* The node before the question mark can be a `Name`, `Attribute`,
`Subscript` (only with integer constants in slice position), or any
combination of the 3 nodes.
## Test Plan
Added test cases for various combinations of the possible nodes in the
command value position and updated the snapshots.
fixes: #6359
fixes: #5030 (This is the final piece)
[^1]: https://github.com/astral-sh/ruff/pull/6272#issue-1833094281
## Summary
Error if `tab-size` is set to zero (it is used as a divisor). Closes
#6423.
Also fixes a typo.
## Test Plan
Running ruff with a config
```toml
[tool.ruff]
tab-size = 0
```
returns an error message to the user saying that `tab-size` must be
greater than zero.
## Summary
Given:
```python
def double(a: int) -> ( # Hello
    int
):
    return 2*a
```
We currently treat `# Hello` as a trailing comment on the parameters
(`(a: int)`). This PR adds a placement method to instead treat it as a
dangling comment on the function definition itself, so that it gets
formatted at the end of the definition, like:
```python
def double(a: int) -> int: # Hello
    return 2*a
```
The formatting in this case is unchanged, but it's incorrect IMO for
that to be a trailing comment on the parameters, and that placement
leads to an instability after changing the grouping in #6410.
Fixing this led to a _different_ instability related to tuple return
type annotations, like:
```python
def zrevrangebylex(self, name: _Key, max: _Value, min: _Value, start: int | None = None, num: int | None = None) -> ( # type: ignore[override]
):
    ...
```
(This is a real example.)
To fix, I had to special-case tuples in that spot, though I'm not
certain that's correct.
## Summary
This PR moves `empty_parenthesized` such that it's peer to
`parenthesized`, and changes the API to better match that of
`parenthesized` (takes `&str` rather than `StaticText`, has a
`with_dangling_comments` method, etc.).
It may be intentionally _not_ part of `parentheses.rs`, but to me
they're so similar that it makes more sense for them to be in the same
module, with the same API, etc.
## Summary
This PR adds support for `StmtMatch`, with stubs for `MatchCase`.
## Test Plan
Add a few additional test cases around the `match` statement, comments, and
line breaks.
resolves: #6298
## Summary
Fix the name of the rule in the example for `extend-per-file-ignores` in the
`options.rs` file.
It was `E401`, but `E402` was listed in the configuration example. Just a
tiny mismatch.
## Test Plan
Just by my eyes :).
## Bug
Given
```python
x = () - (#
)
```
the comment is a dangling comment of the empty tuple. This is an
end-of-line comment so it may move after the expression. It still
expands the parent, so the operator breaks:
```python
x = (
    ()
    - () #
)
```
In the next formatting pass, the comment is no longer a trailing tuple comment
but a trailing bin op comment, so the bin op doesn't break anymore. The
comment again expands the parent, so we still add the superfluous
parentheses
```python
x = (
    () - () #
)
```
## Fix
The new formatting is to keep the comment on the empty tuple. This is a
lot uglier and again has additional outer parentheses, but it's stable:
```python
x = (
    ()
    - ( #
    )
)
```
## Alternatives
Black formats all the examples above as
```python
x = () - () #
```
which I find better.
I would be happy about any suggestions for better solutions than the
current one. I'd mainly need a workaround for expand parent having an
effect on the bin op instead of first moving the comment to the end and
then applying expand parent to the assign statement.
## Summary
Manually add the parentheses around tuple expressions for the autofix in
`B014`.
This is also done in various other autofixes as well such as for
[`RUF005`](6df5ab4098/crates/ruff/src/rules/ruff/rules/collection_literal_concatenation.rs (L183-L184)),
[`UP024`](6df5ab4098/crates/ruff/src/rules/pyupgrade/rules/os_error_alias.rs (L137-L137)).
### Alternate Solution
An alternate solution would be to fix this in the `Generator` itself by
checking if the tuple expression needs to be generated at the top level or
not. If so, then always add the parentheses.
```rust
} else if level == 0 {
    // Top-level tuples are always parenthesized.
    self.p("(");
    let mut first = true;
    for elt in elts {
        self.p_delim(&mut first, ", ");
        self.unparse_expr(elt, precedence::COMMA);
    }
    self.p_if(elts.len() == 1, ",");
    self.p(")");
```
## Test Plan
Add a regression test for this case in `B014`.
fixes: #6412
Extends #6289 to support moving type variable usage in type aliases to
use PEP-695.
Does not remove the possibly unused type variable declaration.
Presumably this is handled by other rules, but is not working for me.
Does not handle type variables with bounds or variance declarations yet.
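For example (an illustrative before/after, not copied from the PR's fixtures):
```python
from typing import TypeAlias, TypeVar

T = TypeVar("T")

# Before: a type alias that refers to a module-level type variable.
ListOrSet: TypeAlias = list[T] | set[T]

# After (PEP 695, Python 3.12+): the type variable becomes a parameter of the
# alias itself; the `T = TypeVar("T")` declaration above is left in place,
# per the note about unused declarations.
type ListOrSet[T] = list[T] | set[T]
```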
Part of #4617
## Summary
We have some logic in the expression analyzer method to avoid
re-checking the inner `Union` in `Union[Union[...]]`, since the methods
that analyze `Union` expressions already recurse. Elsewhere, we have
logic to avoid re-checking the inner `|` in `int | (int | str)`, for the
same reason.
This PR unifies that logic into a single method _and_ ensures that, just
as we recurse over both `Union` and `|`, we also detect that we're in
_either_ kind of nested union.
Closes https://github.com/astral-sh/ruff/issues/6285.
## Test Plan
Added some new snapshots.
## Summary
We can anticipate earlier that this will error, so we should avoid
flagging the error at all. Specifically, we're talking about cases like
`"{1} {0}".format(*args)"`, in which we'd need to reorder the arguments
in order to remove the `1` and `0`, but we _can't_ reorder the arguments
since they're not statically analyzable.
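A small illustration of the problem (hypothetical `args`):
```python
args = ("world", "hello")

print("{1} {0}".format(*args))  # "hello world"

# Dropping the explicit indices without reordering `args` changes the result,
# and `args` can't be reordered because it isn't statically analyzable:
print("{} {}".format(*args))    # "world hello"
```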
Closes https://github.com/astral-sh/ruff/issues/6388.
## Summary
I noticed some deviations in how we treat dangling comments that hug the
opening parenthesis for function definitions.
For example, given:
```python
def f( # first
    # second
): # third
    ...
```
We currently format as:
```python
def f(
    # first
    # second
): # third
    ...
```
This PR adds the proper opening-parenthesis dangling comment handling
for function parameters. Specifically, as with all other parenthesized
nodes, we now detect that dangling comment in `placement.rs` and handle
it in `parameters.rs`. We have to take some care in that file, since we
have multiple "kinds" of dangling comments, but I added a bunch of test
cases that we now format identically to Black.
## Test Plan
`cargo test`
Before:
- `zulip`: 0.99388
- `django`: 0.99784
- `warehouse`: 0.99504
- `transformers`: 0.99404
- `cpython`: 0.75913
- `typeshed`: 0.74364
After:
- `zulip`: 0.99386
- `django`: 0.99784
- `warehouse`: 0.99504
- `transformers`: 0.99404
- `cpython`: 0.75913
- `typeshed`: 0.74409
Meaningful improvement on `typeshed`, minor decrease on `zulip`.
Closes https://github.com/astral-sh/ruff/issues/6068
These commits are kind of a mess as I did some stumbling around here.
Unrolls the formatting of chained boolean operations to prevent nested
grouping, which gives us Black-compatible formatting where each boolean
operation is on its own line.
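Roughly (an illustrative sketch, not output copied from the formatter), a chain that breaks now looks like:
```python
a, b, c, d = 1, 2, 3, 4

# Each boolean operation lands on its own line instead of nesting group by group.
if (
    a < b
    and b < c
    and c < d
    and a + b + c + d < 100
):
    print("monotonic and small")
```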
## Summary
This PR modifies our `can_omit_optional_parentheses` rules to ensure
that if we see a call followed by an attribute, we treat that as an
attribute access rather than a splittable call expression.
This in turn ensures that we wrap like:
```python
ct_match = aaaaaaaaaaact_id == self.get_content_type(
    obj=rel_obj, using=instance._state.db
)
```
For calls, but:
```python
ct_match = (
    aaaaaaaaaaact_id == self.get_content_type(obj=rel_obj, using=instance._state.db).id
)
```
For calls with trailing attribute accesses.
Closes https://github.com/astral-sh/ruff/issues/6065.
## Test Plan
Similarity index before:
- `zulip`: 0.99436
- `django`: 0.99779
- `warehouse`: 0.99504
- `transformers`: 0.99403
- `cpython`: 0.75912
- `typeshed`: 0.72293
And after:
- `zulip`: 0.99436
- `django`: 0.99780
- `warehouse`: 0.99504
- `transformers`: 0.99404
- `cpython`: 0.75913
- `typeshed`: 0.72293
## Summary
This PR leverages the unified function definition node to add precise
AST node types to `MemberKind`, which is used to power our docstring
definition tracking (e.g., classes and functions, whether they're
methods or functions or nested functions and so on, whether they have a
docstring, etc.). It was painful to do this in the past because the
function variants needed to support a union anyway, but storing precise
nodes removes like a dozen panics.
No behavior changes -- purely a refactor.
## Test Plan
`cargo test`
## Summary
Per the suggestion in
https://github.com/astral-sh/ruff/discussions/6183, this PR removes
`AsyncWith`, `AsyncFor`, and `AsyncFunctionDef`, replacing them with an
`is_async` field on the non-async variants of those structs. Unlike an
interpreter, we _generally_ have identical handling for these nodes, so
separating them into distinct variants adds complexity from which we
don't really benefit. This can be seen below, where we get to remove a
_ton_ of code related to adding generic `Any*` wrappers, and a ton of
duplicate branches for these cases.
## Test Plan
`cargo test` is unchanged, apart from parser snapshots.
## Summary
See discussion in
https://github.com/astral-sh/ruff/pull/6351#discussion_r1284996979. We
can remove `RefEquality` entirely and instead use a text offset for
statement keys, since no two statements can start at the same text
offset.
## Test Plan
`cargo test`
## Summary
This PR adds support for help end escape command in the lexer.
### What are "help end escape commands"?
First, the escape commands are special IPython syntax which enhances the
functionality of the IPython REPL. There are 9 escape kinds, which are
recognized by the tokens present at the start of the
command (`?`, `??`, `!`, `!!`, etc.).
Here, the help command is using either the `?` or `??` token at the
start (`?str.replace` for example). Those 2 tokens are also supported
when they're at the end of the command (`str.replace?`), but the other
tokens aren't supported in that position.
There are mainly two types of help end escape commands:
1. Ending with either `?` or `??`, but it also starts with one of the
escape tokens (`%matplotlib?`)
2. On the other hand, there's a stricter version of (1) which doesn't
start with any escape tokens (`str.replace?`)
This PR adds support for (1) while (2) will be supported in the parser.
### Priority
Now, if the command starts and ends with an escape token, how do we
decide the kind of this command? This is where priority comes into the
picture. It's simple, as there's only one rule: `?`/`??` at
the end takes priority over any other escape token, and all of the other
tokens are at the same priority. Remember that only `?`/`??` at the end
is considered valid.
This is mainly useful in the case where someone would want to invoke the
help command on the magic command itself. For example, in `%matplotlib?`
the help command takes priority which means that we want help for the
`matplotlib` magic function instead of calling the magic function
itself.
### Specification
Here's where things get a bit tricky. What if there are question mark
tokens at both ends? How do we decide whether it's the `Help` (`?`) kind or
the `Help2` (`??`) kind?
| | Magic | Value | Kind |
| --- | --- | --- | --- |
| 1 | `?foo?` | `foo` | `Help` |
| 2 | `??foo?` | `foo` | `Help` |
| 3 | `?foo??` | `foo` | `Help2` |
| 4 | `??foo??` | `foo` | `Help2` |
| 5 | `???foo??` | `foo` | `Help2` |
| 6 | `??foo???` | `foo???` | `Help2` |
| 7 | `???foo???` | `?foo???` | `Help2` |
Looking at the above table (a small sketch of these rules follows this list):
- The question mark tokens on the right take priority over the ones on
the left, but only if the number of question marks on the right is 1 or 2.
- If there are more than 2 question mark tokens on the right side, then
the left side is used to determine the kind.
- If the right side is used to determine the kind, then all of the
question marks and whitespace on the left side are ignored in the
`value`, but if it's the other way around, then all of the extra
question marks are part of the `value`.
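Here's that logic as a small Python sketch (illustrative only; not the actual lexer implementation):
```python
def classify(command: str):
    # Count trailing question marks; only 1 or 2 of them let the right side win.
    stripped = command.rstrip("?")
    trailing = len(command) - len(stripped)
    if trailing in (1, 2):
        # Leading question marks and whitespace are dropped from the value.
        kind = "Help" if trailing == 1 else "Help2"
        return kind, stripped.lstrip("?").strip()
    # Otherwise the left side decides, and the extra trailing question marks
    # remain part of the value.
    if command.startswith("??"):
        return "Help2", command[2:]
    if command.startswith("?"):
        return "Help", command[1:]
    return None


assert classify("?foo?") == ("Help", "foo")
assert classify("??foo???") == ("Help2", "foo???")
```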
### References
- IPython implementation using the regex:
292e3a2345/IPython/core/inputtransformer2.py (L454-L462)
- Priorities:
292e3a2345/IPython/core/inputtransformer2.py (L466-L469)
## Test Plan
Add a bunch of test cases for the lexer and verify that it matches the
behavior of the IPython transformer.
resolves: #6357
## Summary
This PR fixes the performance degradation introduced in
https://github.com/astral-sh/ruff/pull/6345. Instead of using the
generic `Nodes` structs, we now use separate `Statement` and
`Expression` structs. Importantly, we can avoid tracking a bunch of
state for expressions that we need for parents: we don't need to track
reference-to-ID pointers (we just have no use-case for this -- I'd
actually like to remove this from statements too, but we need it for
branch detection right now), we don't need to track depth, etc.
In my testing, this entirely removes the regression on all-rules, and
gets us down to 2ms slower on the default rules (as a crude hyperfine
benchmark, so this is within margin of error IMO).
No behavioral changes.
## Summary
This PR attempts to draw a clearer divide between "methods that take
(e.g.) an expression or statement as input" and "methods that rely on
the _current_ expression or statement" in the semantic model, by
renaming methods like `stmt()` to `current_statement()`.
This had led to confusion in the past. For example, prior to this PR, we
had `scope()` (which returns the current scope), and `parent_scope`,
which returns the parent _of a scope that's passed in_. Now, the API is
clearer: `current_scope` returns the current scope, and `parent_scope`
takes a scope as argument and returns its parent.
Per above, I also changed `stmt` to `statement` and `expr` to
`expression`.
## Summary
Given:
```python
[ # comment
    first,
    second,
    third
] # another comment
```
We were adding a hard line break as part of the formatting of `#
comment`, which led to the following formatting:
```python
[first, second, third] # comment
# another comment
```
Closes https://github.com/astral-sh/ruff/issues/6367.
## Summary
Fixes an instability whereby this:
```python
def get_recent_deployments(threshold_days: int) -> Set[str]:
    # Returns a list of deployments not older than threshold days
    # including `/root/zulip` directory if it exists.
    recent = set()
    threshold_date = datetime.datetime.now() - datetime.timedelta( # noqa: DTZ005
        days=threshold_days
    )
```
Was being formatted as:
```python
def get_recent_deployments(threshold_days: int) -> Set[str]:
    # Returns a list of deployments not older than threshold days
    # including `/root/zulip` directory if it exists.
    recent = set()
    threshold_date = (
        datetime.datetime.now()
        - datetime.timedelta(days=threshold_days) # noqa: DTZ005
    )
```
Which was in turn being formatted as:
```python
def get_recent_deployments(threshold_days: int) -> Set[str]:
    # Returns a list of deployments not older than threshold days
    # including `/root/zulip` directory if it exists.
    recent = set()
    threshold_date = (
        datetime.datetime.now() - datetime.timedelta(days=threshold_days) # noqa: DTZ005
    )
```
The second-to-third formatting still differs from Black because we
aren't taking the line suffix into account when splitting
(https://github.com/astral-sh/ruff/issues/6377), but the first
formatting is correct and should be unchanged (i.e., the first-to-second
formatting is incorrect, and fixed here).
## Test Plan
`cargo run --bin ruff_dev -- format-dev --stability-check ../zulip`
## Summary
When we iterate over the AST for analysis, we often process nodes in a
"deferred" manner. For example, if we're analyzing a function, we push
the function body onto a deferred stack, along with a snapshot of the
current semantic model state. Later, when we analyze the body, we
restore the semantic model state from the snapshot. This ensures that we
know the correct scope, hierarchy of statement parents, etc., when we go
to analyze the function body.
Historically, we _haven't_ included the _expression_ hierarchy in the
model snapshot -- so we track the current expression parents in the
visitor, but we never save and restore them when processing deferred
nodes. This can lead to subtle bugs, in that methods like
`expr_parent()` aren't guaranteed to be correct, if you're in a deferred
visitor.
This PR migrates expression tracking to mirror statement tracking
exactly. So we push all expressions onto an `IndexVec`, and include the
current expression on the snapshot. This ensures that `expr_parent()`
and related methods are "always correct" rather than "sometimes
correct".
There's a performance cost here, both at runtime and in terms of memory
consumption (we now store an additional pointer for every expression).
In my hyperfine testing, it's about a 1% performance decrease for
all-rules on CPython (up to 533.8ms, from 528.3ms) and a 4% performance
decrease for default-rules on CPython (up to 212ms, from 204ms).
However... I think this is worth it given the incorrectness of our
current approach. In the future, we may want to reconsider how we do
these upward traversals (e.g., with something like a red-green tree).
(**Note**: in https://github.com/astral-sh/ruff/pull/6351, the slowdown
seems to be entirely removed.)
Changes:
- Fixes typo and repeated phrase in `DTZ002`
- Adds docs for `DTZ003`
- Adds docs for `DTZ004`
- Adds an example for Python <= 3.10 in `DTZ001`
Related to: https://github.com/astral-sh/ruff/issues/2646
## Summary
Historically, we've stored "qualified names" on our
`BindingKind::Import`, `BindingKind::SubmoduleImport`, and
`BindingKind::ImportFrom` structs. In Ruff, a "qualified name" is a
dot-separated path to a symbol. For example, given `import foo.bar`, the
"qualified name" would be `"foo.bar"`; and given `from foo.bar import
baz`, the "qualified name" would be `foo.bar.baz`.
This PR modifies the `BindingKind` structs to instead store _call paths_
rather than qualified names. So in the examples above, we'd store
`["foo", "bar"]` and `["foo", "bar", "baz"]`. It turns out that this
more efficient given our data access patterns. Namely, we frequently
need to convert the qualified name to a call path (whenever we call
`resolve_call_path`), and it turns out that we do this operation enough
that those conversations show up on benchmarks.
There are a few other advantages to using call paths, rather than
qualified names:
1. The size of `BindingKind` is reduced from 32 to 24 bytes, since we no
longer need to store a `String` (only a boxed slice).
2. All three import types are more consistent, since they now all store
a boxed slice, rather than some storing an `&str` and some storing a
`String` (for `BindingKind::ImportFrom`, we needed to allocate a
`String` to create the qualified name, but the call path is a slice of
static elements that don't require that allocation).
3. A lot of code gets simpler, in part because we now do call path
resolution "earlier". Most notably, for relative imports (`from .foo
import bar`), we store the _resolved_ call path rather than the relative
call path, so the semantic model doesn't have to deal with that
resolution. (See that `resolve_call_path` is simpler, fewer branches,
etc.)
In my testing, this change improves the all-rules benchmark by another
4-5% on top of the improvements mentioned in #6047.
## Summary
Update `F841` autofix to not remove line magic expr
## Test Plan
Added test cases for assignment statements with and without type
annotations.
fixes: #6116
## Summary
Enable using the new `Mode::Jupyter` for the tokenizer/parser to parse
Jupyter line magic tokens.
The individual call to the lexer i.e., `lex_starts_at` done by various
rules should consider the context of the source code (is this content
from a Jupyter Notebook?). Thus, a new field `source_type` (of type
`PySourceType`) is added to `Checker` which is being passed around as an
argument to the relevant functions. This is then used to determine the
`Mode` for the lexer.
## Test Plan
Add new test cases to make sure that the magic statement is considered
while generating the diagnostic and autofix:
* For `I001`, if there's a magic statement in between two import blocks,
they should be sorted independently
fixes: #6090
## Summary
As of version
[23.1.0](2a86db8271/CHANGELOG.md?plain=1#L158-L160),
`flake8-pyi` removed the rule `Y027`.
The errors that resulted in `PYI027` are now being emitted by `PYI022`
(`UP035`).
ref: #848
## Test Plan
Add new test cases.
## Summary
Previously, this panicked on methods like:
```python
@classmethod
def bad_posonly_class_method(cls: type[_S], /) -> _S: ... # PYI019
```
This was because we check if there are any positional-only or non-positional
arguments, but then do an unsafe access on `parameters.args`.
Closes https://github.com/astral-sh/ruff/issues/6349.
## Test Plan
`cargo test` (verified that `main` panics on the new fixtures)
These rules assume that the docstring is on its own line. pydocstyle
treats them inconsistently, so I'm just going to disable them in this
case.
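For example (an illustrative case), a docstring that isn't on its own line:
```python
def f(): """A docstring that shares a line with the function header."""
```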
Closes https://github.com/astral-sh/ruff/issues/6329.
## Summary
Fixes some comprehension formatting by avoiding creating the group for
the comprehension itself (so that if it breaks, all parts break on their
own lines, e.g. the `for` and the `if` clauses).
Closes https://github.com/astral-sh/ruff/issues/6063.
## Test Plan
Bunch of new fixtures.
Implement fluent style/call chains. See the `call_chains.py` formatting
for examples.
This isn't fully like Black because, in `raise A from B`, Black allows the
breaking of `A` to influence the formatting of `B` even if `B` is already
multiline.
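As a rough, hedged illustration of the fluent style (not output copied from the formatter; `call_chains.py` has the authoritative cases):
```python
text = "  Alpha, Beta ,Gamma  "

# When a chain is too long for one line, each call lands on its own line.
parts = (
    text.strip()
    .replace(" ", "")
    .lower()
    .split(",")
)
print(parts)  # ['alpha', 'beta', 'gamma']
```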
Similarity index:
| project | main | PR |
|--------------|-------|-------|
| build | ??? | 0.753 |
| django | 0.991 | 0.998 |
| transformers | 0.993 | 0.994 |
| typeshed | 0.723 | 0.723 |
| warehouse | 0.978 | 0.994 |
| zulip | 0.992 | 0.994 |
Call chain formatting is affected by
https://github.com/astral-sh/ruff/issues/627, but I'm cutting scope
here.
Closes #5343
**Test Plan**:
* Added a dedicated call chains test file
* The ecosystem checks found some bugs
* I manually checked django and zulip formatting
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
Extends `comparison-with-itself` to cover simple function calls on
known-pure functions, like `id`. For example, we now flag `id(x) ==
id(x)`.
Closes https://github.com/astral-sh/ruff/issues/6276.
## Test Plan
`cargo test`
**Summary** This adds information about whether we're in a `.py` Python
source file or in a `.pyi` stub file, to enable people working on #5822 and
related issues.
I'm not completely happy with `Default` for something that depends on
the input.
**Test Plan** None, this is currently unused; I'm leaving this to the first
implementation of stub-file-specific formatting.
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
Follow-up to https://github.com/astral-sh/ruff/pull/6325, to avoid false
positives in cases like:
```python
if x == int:
...
```
Which is valid, since we don't know that we're comparing the type _of_
something -- we're comparing the type objects directly.
## Summary
Extends `type-comparison` to flag:
```python
if type(obj) is int:
pass
```
In addition to the existing cases, like:
```python
if type(obj) is type(1):
pass
```
Closes https://github.com/astral-sh/ruff/issues/6260.
## Summary
We already support preserving the end-of-line comment in calls and type
parameters, as in:
```python
foo( # comment
    bar,
)
```
This PR adds the same behavior for lists, sets, comprehensions, etc.,
such that we preserve:
```python
[ # comment
    1,
    2,
    3,
]
```
And related cases.
## Summary
This PR adds an API for chaining comment placement methods based on the
[`then_with`](https://doc.rust-lang.org/std/cmp/enum.Ordering.html#method.then_with)
from `Ordering` in the standard library.
For example, you can now do:
```rust
try_some_case(comment).then_with(|comment| try_some_other_case_if_still_default(comment))
```
This lets us avoid this kind of pattern, which I've seen in
`placement.rs` and used myself before:
```rust
let comment = match handle_own_line_comment_between_branches(comment, preceding, locator) {
    CommentPlacement::Default(comment) => comment,
    placement => return placement,
};
```
## Summary
This ensures that we treat `# comment` as parenthesized in contexts
like:
```python
while (
    True
    # comment
):
    pass
```
The same logic applies equally to `for`, `async for`, `if`, `with`, and
`async with`. The general pattern is that you have an expression which
precedes a colon-separated suite.
**Summary** Prompted by
https://github.com/astral-sh/ruff/pull/6257#issuecomment-1661308410, this
tries to make the ecosystem script's output on failure easier to
understand. All log messages are now written to a file, which is
printed on error. When running locally, progress is still shown.
Looking through the log output, I saw that we currently log syntax errors
in the input, which is confusing because they aren't actual errors, but we
don't check that these files don't change due to parser regressions or
improvements. I added `--files-with-errors` to catch that.
**Test Plan** CI
Adds a rule to convert type aliases defined with annotations, i.e. `x:
TypeAlias = int`, to the new PEP-695 syntax, e.g. `type x = int`.
Does not support using the new generic syntax for type variables; that will
be addressed in a follow-up.
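For example (illustrative input and output, not copied from the PR's fixtures):
```python
from typing import TypeAlias

# Flagged: a type alias defined via an annotated assignment...
IntList: TypeAlias = list[int]

# ...rewritten to the PEP 695 `type` statement (Python 3.12+):
type IntList = list[int]
```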
Added as part of pyupgrade — ~the code 100 was chosen to avoid collision
with real pyupgrade codes~.
Part of #4617
Builds on #5062
Of the rules that flake8-pyi enforces for `.pyi` type stubs, many of
them equally make sense to check in normal runtime code with type
annotations. Broaden these rules to check all files:
- PYI013 ellipsis-in-non-empty-class-body
- PYI016 duplicate-union-member
- PYI018 unused-private-type-var
- PYI019 custom-type-var-return-type
- PYI024 collections-named-tuple
- PYI025 unaliased-collections-abc-set-import
- PYI030 unnecessary-literal-union
- PYI032 any-eq-ne-annotation
- PYI034 non-self-return-type
- PYI036 bad-exit-annotation
- PYI041 redundant-numeric-union
- PYI042 snake-case-type-alias
- PYI043 t-suffixed-type-alias
- PYI045 iter-method-return-iterable
- PYI046 unused-private-protocol
- PYI047 unused-private-type-alias
- PYI049 unused-private-typed-dict
- PYI050 no-return-argument-annotation-in-stub (Python ≥ 3.11)
- PYI051 redundant-literal-union
- PYI056 unsupported-method-call-on-all
The other rules are stub-specific and remain enabled only in `.pyi`
files.
- PYI001 unprefixed-type-param
- PYI002 complex-if-statement-in-stub
- PYI003 unrecognized-version-info-check
- PYI004 patch-version-comparison
- PYI005 wrong-tuple-length-version-comparison (could make sense to
  broaden, see
  https://github.com/astral-sh/ruff/pull/6297#issuecomment-1663314807)
- PYI006 bad-version-info-comparison (same)
- PYI007 unrecognized-platform-check
- PYI008 unrecognized-platform-name
- PYI009 pass-statement-stub-body
- PYI010 non-empty-stub-body
- PYI011 typed-argument-default-in-stub
- PYI012 pass-in-class-body
- PYI014 argument-default-in-stub
- PYI015 assignment-default-in-stub
- PYI017 complex-assignment-in-stub
- PYI020 quoted-annotation-in-stub
- PYI021 docstring-in-stub
- PYI026 type-alias-without-annotation (could make sense to broaden, but
  gives many false positives on runtime code as currently implemented)
- PYI029 str-or-repr-defined-in-stub
- PYI033 type-comment-in-stub
- PYI035 unassigned-special-variable-in-stub
- PYI044 future-annotations-in-stub
- PYI048 stub-body-multiple-statements
- PYI052 unannotated-assignment-in-stub
- PYI053 string-or-bytes-too-long
- PYI054 numeric-literal-too-long
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
## Summary
In Python >= 3.7, `await` can be included in f-strings.
https://bugs.python.org/issue28942
## Test Plan
Existing tests
## Summary
#2646
## Test Plan
## Summary
Implements `Y019` from
[flake8-pyi](https://github.com/PyCQA/flake8-pyi).
The rule checks whether
- instance methods that return `self`,
- class methods that return an instance of `cls`, and
- `__new__` methods

return a custom `TypeVar` instead of `typing.Self`, and raises a
violation if this is the case. The rule also covers
[PEP-695](https://peps.python.org/pep-0695/) syntax, as introduced
upstream in https://github.com/PyCQA/flake8-pyi/pull/402.
## Test Plan
Added fixtures with test cases from upstream implementation (plus
additional one for an excluded edge case, mentioned in upstream
implementation)
## Summary
Relates to #970.
Add new `bad-format-character` Pylint rule.
I had to make a change in `crates/ruff_python_literal/src/format.rs` to
get a more detailed error in case the format character is not correct. I
chose to do this since most of the format spec parsing functions are
private. It would have required me reimplementing most of the parsing
logic just to know if the format char was correct.
This PR also doesn't reflect current Pylint functionality in two ways.
It supports new format strings correctly; Pylint, as of now, doesn't. See
pylint-dev/pylint#6085.
In case there are multiple adjacent string literals delimited by
whitespace, the index of the wrong format character will be relative to the
single string. Pylint will instead report it relative to the
concatenated string.
Given this:
```
"%s" "%z" % ("hello", "world")
```
Ruff will report this:
```Unsupported format character 'z' (0x7a) at index 1```
Pylint instead:
```Unsupported format character 'z' (0x7a) at index 3```
I believe it's more sensible to report the index relative to the
individual string.
## Test Plan
Added new snapshot and a small test in
`crates/ruff_python_literal/src/format.rs`.
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
Part of #5062
Closes https://github.com/astral-sh/ruff/issues/5931
Implements formatting of a sequence of type parameters in a dedicated
struct for reuse by classes, functions, and type aliases (preparing for
#5929). Adds formatting of type parameters in class and function
definitions — previously, they were just elided.
## Summary
Builds on #6170 to break `global` and `nonlocal` statements, such that
we get:
```python
def f():
    global \
        analyze_featuremap_layer, \
        analyze_featuremapcompression_layer, \
        analyze_latencies_post, \
        analyze_motions_layer, \
        analyze_size_model
```
Instead of:
```python
def f():
    global analyze_featuremap_layer, analyze_featuremapcompression_layer, analyze_latencies_post, analyze_motions_layer, analyze_size_model
```
Notably, we avoid applying this formatting if the statement ends in a
comment. Otherwise, the comment would _need_ to be placed after the last
item, like:
```python
def f():
    global \
        analyze_featuremap_layer, \
        analyze_featuremapcompression_layer, \
        analyze_latencies_post, \
        analyze_motions_layer, \
        analyze_size_model # noqa
```
To me, this seems wrong (and would break the `# noqa` comment). Ideally,
the items would be parenthesized, and the comment would be on the inner
parenthesis, like:
```python
def f():
    global ( # noqa
        analyze_featuremap_layer,
        analyze_featuremapcompression_layer,
        analyze_latencies_post,
        analyze_motions_layer,
        analyze_size_model
    )
```
But that's not valid syntax.
## Summary
Checks for the presence of redundant `Literal` types and builtin
supertypes in a union. See [original
source](2a86db8271/pyi.py (L1261)).
This implementation has a couple of differences from the original. The
first one is, we support the `complex` and `float` builtin types. The
second is, when reporting a diagnostic for a `Literal` with multiple
members of the same type, we print the entire `Literal` while `flake8`
only prints the `Literal` with its first member.
For example:
```python
from typing import Literal
x: Literal[1, 2] | int
```
Ruff will show `Literal[1, 2]` while flake8 only shows `Literal[1]`.
```shell
$ ruff crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:4:18: PYI051 `Literal["foo"]` is redundant in an union with `str`
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:5:37: PYI051 `Literal[b"bar", b"foo"]` is redundant in an union with `bytes`
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:6:37: PYI051 `Literal[5]` is redundant in an union with `int`
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:6:67: PYI051 `Literal["foo"]` is redundant in an union with `str`
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:7:37: PYI051 `Literal[b"str_bytes"]` is redundant in an union with `bytes`
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:7:51: PYI051 `Literal[42]` is redundant in an union with `int`
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:9:31: PYI051 `Literal[1J]` is redundant in an union with `complex`
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:9:53: PYI051 `Literal[3.14]` is redundant in an union with `float`
Found 8 errors.
```
```shell
$ flake8 crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:4:18: Y051 "Literal['foo']" is redundant in a union with "str"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:5:37: Y051 "Literal[b'bar']" is redundant in a union with "bytes"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:6:37: Y051 "Literal[5]" is redundant in a union with "int"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:6:67: Y051 "Literal['foo']" is redundant in a union with "str"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:7:37: Y051 "Literal[b'str_bytes']" is redundant in a union with "bytes"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:7:51: Y051 "Literal[42]" is redundant in a union with "int"
```
While implementing this rule, I found a bug in the `is_unchecked_union`
check. This is the new check.
1ab86bad35/crates/ruff/src/checkers/ast/analyze/expression.rs (L85-L102)
The purpose of the check was to prevent rules from navigating through
nested `Union`s, as they already handle nested `Union`s. The way it was
implemented, this was not happening: the rules were getting executed
more than once and sometimes were receiving expressions that were
not `Union`. For example, with the following code:
```python
typing.Union[Literal[5], int, typing.Union[Literal["foo"], str]]
```
The rules were receiving the expressions in the following order:
- `typing.Union[Literal[5], int, typing.Union[Literal["foo"], str]]`
- `Literal[5]`
- `typing.Union[Literal["foo"], str]]`
This was causing `PYI030` to report redundant information, for example:
```python
typing.Union[Literal[5], int, typing.Union[Literal["foo"], Literal["bar"]]]
```
This is the `PYI030` output for this code:
```shell
PYI030 Multiple literal members in a union. Use a single literal, e.g. `Literal[5, "foo", "bar"]`
PYI030 Multiple literal members in a union. Use a single literal, e.g. `Literal[5, "foo"]`
```
If I haven't misinterpreted the rule, that looks incorrect. I didn't
have the time to check the `PYI016` rule.
The last thing is, I couldn't find a reason for the "Why is this bad?"
section for `PYI051`.
Ref: #848
## Test Plan
Snapshots and manual runs of flake8.
## Summary
Previously, the ruff formatter was removing the star argument of
`lambda` expressions when formatting.
Given the following code snippet
```python
lambda *a: ()
lambda **b: ()
```
it would be formatted to
```python
lambda: ()
lambda: ()
```
We fix this by checking for the presence of `args`, `vararg` or `kwarg`
in the `lambda` expression, before we were only checking for the
presence of `args`.
Fixes #5894
## Test Plan
Add new test cases.
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
Similar to #6279, moving some helpers onto the struct in the name of
reducing the number of random undiscoverable utilities we have in
`helpers.rs`.
Most of the churn is migrating rules to take `ast::ExprCall` instead of
the spread call arguments.
## Test Plan
`cargo test`
## Summary
This PR removes a now-unnecessary abstraction from `helper.rs`
(`CallArguments`), in favor of adding methods to `Arguments` directly,
which helps with discoverability.
## Summary
This PR boxes the `TypeParams` and `Arguments` fields on the class
definition node. These fields are optional and often omitted, and given
that class definition is our largest enum variant, we pay the cost of
including them for every statement in the AST. Boxing these types
reduces the statement size by 40 bytes, which seems like a good tradeoff
given how infrequently these are accessed.
## Test Plan
Need to benchmark, but no behavior changes.
## Summary
This PR leverages the `Arguments` AST node introduced in #6259 in the
formatter, which ensures that we correctly handle trailing comments in
calls, like:
```python
f(
    1,
    # comment
)
pass
```
(Previously, this was treated as a leading comment on `pass`.)
This also allows us to unify the argument handling across calls and
class definitions.
## Test Plan
A bunch of new fixture tests, plus improved Black compatibility.
## Summary
Similar to #6259, this PR adds a `TypeParams` node to the AST, to
capture the list of type parameters with their surrounding brackets.
If a statement lacks type parameters, the `type_params` field will be
`None`.
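For reference (illustrative Python 3.12 syntax), the bracketed lists below are what the new node spans:
```python
# PEP 695 type parameters; each `[T]` is captured by a `TypeParams` node.
class Box[T]: ...


def first[T](items: list[T]) -> T:
    return items[0]


type Pair[T] = tuple[T, T]
```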
## Summary
This PR adds a new `Arguments` AST node, which we can use for function
calls and class definitions.
The `Arguments` node spans from the left (open) to right (close)
parentheses inclusive.
In the case of classes, the `Arguments` is optional, to differentiate
between:
```python
# None
class C: ...
# Some, with empty vectors
class C(): ...
```
In this PR, we don't really leverage this change (except that a few
rules get much simpler, since we don't need to lex to find the start and
end ranges of the parentheses, e.g.,
`crates/ruff/src/rules/pyupgrade/rules/lru_cache_without_parameters.rs`,
`crates/ruff/src/rules/pyupgrade/rules/unnecessary_class_parentheses.rs`).
In future PRs, this will be especially helpful for the formatter, since
we can track comments enclosed on the node itself.
## Test Plan
`cargo test`
## Summary
Before:
<img width="1031" alt="Screen Shot 2023-08-02 at 15 57 10"
src="https://github.com/astral-sh/ruff/assets/17039389/171a21d5-01a5-4aa5-8079-4e7f8a59ade8">
After:
<img width="1031" alt="Screen Shot 2023-08-02 at 15 58 03"
src="https://github.com/astral-sh/ruff/assets/17039389/afd1b9b7-89e0-4e38-a4a6-e3255b62f021">
## Test Plan
Manual inspection
## Summary
This PR renames...
- `Parameter#arg` to `Parameter#name`
- `ParameterWithDefault#def` to `ParameterWithDefault#parameter` (such
that `ParameterWithDefault` has a `default` and a `parameter`)
## Test Plan
`cargo test`
## Summary
This PR renames a few AST nodes for clarity:
- `Arguments` is now `Parameters`
- `Arg` is now `Parameter`
- `ArgWithDefault` is now `ParameterWithDefault`
For now, the attribute names that reference `Parameters` directly are
changed (e.g., on `StmtFunctionDef`), but the attributes on `Parameters`
itself are not (e.g., `vararg`). We may revisit that decision in the
future.
For context, the AST node formerly known as `Arguments` is used in
function definitions. Formally (outside of the Python context),
"arguments" typically refers to "the values passed to a function", while
"parameters" typically refers to "the variables used in a function
definition". E.g., if you Google "arguments vs parameters", you'll get
some explanation like:
> A parameter is a variable in a function definition. It is a
placeholder and hence does not have a concrete value. An argument is a
value passed during function invocation.
We're thus deviating from Python's nomenclature in favor of a scheme
that we find to be more precise.
## Summary
Black allows up to one blank line _before_ a class docstring, and
enforces one blank line _after_ a class docstring. This PR implements
that handling. The cases in
`crates/ruff_python_formatter/resources/test/fixtures/ruff/statement/class_definition.py`
match Black identically.
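For illustration (a hedged sketch; the fixture file has the authoritative cases):
```python
class Config:

    """Up to one blank line before the class docstring is allowed."""

    # Exactly one blank line is enforced after the docstring.
    option = 1
```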
## Summary
This PR ensures that if a function or class is the first statement in a
nested suite that _isn't_ a function or class body, we insert a leading
newline.
For example, given:
```python
def f():
    if True:
        def register_type():
            pass
```
We _want_ to preserve the newline, whereas today, we remove it.
Note that this only applies when the function or class doesn't have any
leading comments.
Closes https://github.com/astral-sh/ruff/issues/6066.
## Summary
This PR removes the `Interactive` and `FunctionType` parser modes that are unused by ruff
## Test Plan
`cargo test`
## Summary
This PR removes the `type_comment` field which our parser doesn't support.
## Test Plan
`cargo test`
## Summary
This PR removes the type ignore node from the AST because our parser doesn't support it, and just having it around is confusing.
## Test Plan
`cargo build`
**Summary** Allow passing any node to `CommentPlacement::{leading,
trailing, dangling}` without manually converting. Conversely, restrict
the comment to the only type we actually pass.
**Test Plan** No changes.
## Summary
Previously, given:
```python
a = \
5;
```
When detecting continuations starting at the offset of the `;`, we'd
flag the previous line as a continuation. We should only flag a
continuation if there isn't leading content prior to the offset.
Closes https://github.com/astral-sh/ruff/issues/6214
## Summary
This PR moves the "insert empty lines" behavior out of
`JoinNodesBuilder` and into the `Suite` formatter. I find it a little
confusing that the logic is split between those two formatters right
now, and since this is _only_ used in that one place, IMO it is a bit
simpler to just inline it and use a single approach to tracking state
(right now, both are stateful).
The only other place this was used was for decorators. As a side effect,
we now remove blank lines in both of these cases, which is a known but
intentional deviation from Black (which preserves the empty line before
the comment in the first case):
```python
@foo
# Hello
@bar
def baz():
    pass


@foo
@bar
def baz():
    pass
```
## Summary
Very subtle bug related to the AST traversal. Given:
```python
from __future__ import annotations
from logging import getLogger
__all__ = ("getLogger",)
def foo() -> None:
    pass
```
We end up visiting the `-> None` annotation, then reusing the state
snapshot when we go to visit the `__all__` exports, so when we visit
`"getLogger"`, we think we're inside of a deferred type annotation.
This PR changes all the deferred visitors to snapshot and restore the
state, which is a lot safer -- that way, the visitors avoid modifying
the current visitor state. (Previously, they implicitly left the visitor
state set to the state of the _last_ thing they visited.)
Closes https://github.com/astral-sh/ruff/issues/6207.
**Summary** This includes two changes:
* Allow setting `-v` in `ruff_dev`, using the `ruff_cli` implementation
* `debug!` which ruff configuration strategy was used
This is a byproduct of debugging #6187.
**Test Plan** n/a
**Summary** This prevents us from turning `r'''\""'''` into
`r"""\"""""`, which is invalid syntax.
This PR fixes CI, which is currently broken on main (in a way that still
passes on linter PRs and allows merging formatter PRs, but it's bad to
have a job be red). Once merged, I'll make the formatted ecosystem
checks a required check.
**Test Plan** Added a regression test.
## Summary
This PR adds a new precedence level for the comprehension element. This fixes
the generator so that it doesn't add parentheses around the comprehension
element every time.
The new precedence level is `COMPREHENSION_ELEMENT` and it should occur after
the `NAMED_EXPR` precedence level because named expressions are always parenthesized.
This matches the behavior of Python's `ast.unparse` and was tested with the
following snippet:
```python
import ast
code = ""
ast.unparse(ast.parse(code))
```
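For instance, `ast.unparse` only parenthesizes the element where it's required, such as around named expressions:
```python
import ast

print(ast.unparse(ast.parse("[x + 1 for x in y]")))     # [x + 1 for x in y]
print(ast.unparse(ast.parse("[(x := 1) for _ in y]")))  # [(x := 1) for _ in y]
```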
## Test Plan
Add a bunch of test cases for all the valid nodes at that position.
fixes: #5777
## Summary
Format bytes string
Closes #6064
## Test Plan
Added a fixture based on the string one.
## Summary
We have some code to ensure that if an aliased import is used, any
submodules should be marked as used too. This comment says it best:
```rust
// If the name of a submodule import is the same as an alias of another import, and the
// alias is used, then the submodule import should be marked as used too.
//
// For example, mark `pyarrow.csv` as used in:
//
// ```python
// import pyarrow as pa
// import pyarrow.csv
// print(pa.csv.read_csv("test.csv"))
// ```
```
However, it looks like when we go to look up `pyarrow` (of `import
pyarrow as pa`), we aren't checking to ensure the resolved binding is
_actually_ an import. This was causing us to attribute `print(rm.ANY)`
to `def requests_mock` here:
```python
import requests_mock as rm
def requests_mock(requests_mock: rm.Mocker):
    print(rm.ANY)
```
Closes https://github.com/astral-sh/ruff/issues/6180.
## Summary
Adds `global` and `nonlocal` formatting, without the "deviation from
black" outlined in the linked issue, which I'll do separately.
See: https://github.com/astral-sh/ruff/issues/4798.
## Test Plan
Added a fixture in the Ruff-specific directory since the Black fixtures
don't seem to cover this.
## Summary
Closes https://github.com/astral-sh/ruff/issues/5781
## Test Plan
Added cases to
`crates/ruff_python_formatter/resources/test/fixtures/ruff/expression/named_expr.py`
one-by-one and adjusted the condition as needed.
## Summary
Right now, if we have two fixes that have an overlapping edit, but not
an _identical_ set of edits, they'll conflict, causing us to do another
linter traversal. Here, I've enabled the fixer to support partially
overlapping edits, which (as an example) lets us greatly reduce the
number of iterations required in the test suite.
The most common case here is that in which a bunch of edits need to
import some symbol, and then use that symbol, but in different ways. In
that case, all edits will have a common fix (to import the symbol), but
deviate in some way. With this change, we can do all of those edits in
one pass.
Note that the simplest way to enable this was to store sorted edits on
`Fix`. We don't allow modifying the edits on `Fix` once it's
constructed, so this is an easy change, and allows us to avoid a bunch
of clones and traversals later on.
Closes #5800.
## Summary
If a file has a BOM, the import sorter _always_ reports the imports as
unsorted. The acute issue is that we detect that the line has leading
content (before the imports), which we always consider a violation.
Rather than fixing that one site, this PR instead makes `.line_start`
BOM-aware.
Fixes https://github.com/astral-sh/ruff/issues/6155.
## Summary
This PR adds a new config option for `pep8-naming` plugin called
`extend-ignore-names` which is used to extend the default values in
`ignore-names` option.
resolves: #6050
## Summary
This PR protects against code like:
```python
from typing import Optional
import bar # ruff: noqa
import baz
class Foo:
    x: Optional[str] = None
```
In which the user wrote `# ruff: noqa` to ignore a specific error, not
realizing that it was a file-level exemption that thus turned off all
lint rules.
Specifically, if a `# ruff: noqa` directive is not at the start of a
line, we now ignore it and warn, since this is almost certainly a
mistake.
Requires https://github.com/astral-sh/RustPython-Parser/pull/42
Related https://github.com/PyCQA/pyflakes/pull/778
[PEP-695](https://peps.python.org/pep-0695)
Part of #5062
## Summary
Adds a scope for type parameters, a type parameter binding kind, and
checker visitation of type parameters in type alias statements, function
definitions, and class definitions.
A few changes were necessary to ensure correctness following the
insertion of a new scope between function and class scopes and their
parent.
## Test Plan
Undefined name snapshots.
Unused type parameter rule will be added as follow-up.
## Summary
Right now, `RUF015` will try to rewrite `x[:1]` as `[next(x)]`. This
isn't equivalent if `x`, for example, is empty, where slicing like
`x[:1]` is forgiving, but `next` raises `StopIteration`. For me this is
a little too much of a deviation to be comfortable with, and most of the
value in this rule is the `x[0]` to `next(x)` conversion anyway.
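A quick illustration of the asymmetry (using `next(iter(x))` for concreteness):
```python
x: list[int] = []

print(x[:1])  # [] -- slicing an empty sequence is forgiving

try:
    print([next(iter(x))])
except StopIteration:
    print("the rewritten form raises instead")
```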
Closes https://github.com/astral-sh/ruff/issues/6148.
## Summary
In #6134 and #6136, we see some false positives for "shadowed" class
definitions. For example, here, the first definition is flagged as
unused, since from the perspective of the semantic model (which doesn't
understand branching), it appears to be immediately shadowed in the
`else`, and thus never used:
```python
if sys.version_info >= (3, 11):
    class _RootLoggerConfiguration(TypedDict, total=False):
        level: _Level
        filters: Sequence[str | _FilterType]
        handlers: Sequence[str]
else:
    class _RootLoggerConfiguration(TypedDict, total=False):
        level: _Level
        filters: Sequence[str]
        handlers: Sequence[str]
```
Instead of looking at _all_ bindings, we should instead look at the
"live" bindings, which is similar to how other rules (like unused
variables detection) is structured. We thus move the rule from
`bindings.rs` (which iterates over _all_ bindings, regardless of whether
they're shadowed) to `deferred_scopes.rs`, which iterates over all
"live" bindings once a scope has been fully analyzed.
## Test Plan
`cargo test`
## Summary
This PR implements pycodestyle's E241 (tab after comma) and E242
(multiple whitespace after comma) lints.
These are marked as nursery rules like many other pycodestyle rules.
Refs #2402
## Test Plan
E24.py copied from pycodestyle.
## Summary
Completes the documentation for the ruleset, apart from four rules that
have contradictions and thus need more thought on how they should be
documented. Related to #2646.
## Test Plan
`python scripts/test_docs_formatted.py`
**Summary**
Updated doc comments for `missing_whitespace_around_operator.rs`. Online
docs also benefit from this update.
**Test Plan**
Checked docs via
[mkdocs](389fe13c93/CONTRIBUTING.md?plain=1#L267-L296)
## Summary
Completes the documentation for the one and only (current) rule in the
`flynt` ruleset. Related to #2646.
## Test Plan
`python scripts/test_docs_formatted.py`
## Summary
This PR stores the mapping from `ExprName` node to resolved `BindingId`,
which lets us skip scope lookups in `resolve_call_path`. It's enabled by
#6045, since that PR ensures that when we analyze a node (and thus call
`resolve_call_path`), we'll have already visited its `ExprName`
elements.
In more detail: imagine that we're traversing over `foo.bar()`. When we
read `foo`, it will be an `ExprName`, which we'll then resolve to a
binding via `handle_node_load`. With this change, we then store that
binding in a map. Later, if we call `collect_call_path` on `foo.bar`,
we'll identify `foo` (the "head" of the attribute) and grab the resolved
binding in that map. _Almost_ all names are now resolved in advance,
though it's not a strict requirement, and some rules break that pattern
(e.g., if we're analyzing arguments, and they need to inspect their
annotations, which are visited in a deferred manner).
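As a small Python-level illustration of what that cached lookup covers (a hypothetical snippet):
```python
import os     # binds the name `os`

os.getcwd()   # `os` is the "head" ExprName of the attribute access; its
              # resolved binding is cached, so resolving the call path
              # `os.getcwd` no longer needs a scope lookup
```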
This improves performance by 4-6% on the all-rules benchmark. It looks
like it hurts performance (1-2% drop) in the default-rules benchmark,
presumably because those rules don't call `resolve_call_path` nearly as
much, and so we're paying for these extra writes.
Here's the benchmark data:
```
linter/default-rules/numpy/globals.py
time: [67.270 µs 67.380 µs 67.489 µs]
thrpt: [43.720 MiB/s 43.792 MiB/s 43.863 MiB/s]
change:
time: [+0.4747% +0.7752% +1.0626%] (p = 0.00 < 0.05)
thrpt: [-1.0514% -0.7693% -0.4724%]
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
linter/default-rules/pydantic/types.py
time: [1.4067 ms 1.4105 ms 1.4146 ms]
thrpt: [18.028 MiB/s 18.081 MiB/s 18.129 MiB/s]
change:
time: [+1.3152% +1.6953% +2.0414%] (p = 0.00 < 0.05)
thrpt: [-2.0006% -1.6671% -1.2981%]
Performance has regressed.
linter/default-rules/numpy/ctypeslib.py
time: [637.67 µs 638.96 µs 640.28 µs]
thrpt: [26.006 MiB/s 26.060 MiB/s 26.113 MiB/s]
change:
time: [+1.5859% +1.8109% +2.0353%] (p = 0.00 < 0.05)
thrpt: [-1.9947% -1.7787% -1.5611%]
Performance has regressed.
linter/default-rules/large/dataset.py
time: [3.2289 ms 3.2336 ms 3.2383 ms]
thrpt: [12.563 MiB/s 12.581 MiB/s 12.599 MiB/s]
change:
time: [+0.8029% +0.9898% +1.1740%] (p = 0.00 < 0.05)
thrpt: [-1.1604% -0.9801% -0.7965%]
Change within noise threshold.
linter/all-rules/numpy/globals.py
time: [134.05 µs 134.15 µs 134.26 µs]
thrpt: [21.977 MiB/s 21.995 MiB/s 22.012 MiB/s]
change:
time: [-4.4571% -4.1175% -3.8268%] (p = 0.00 < 0.05)
thrpt: [+3.9791% +4.2943% +4.6651%]
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
2 (2.00%) low mild
3 (3.00%) high mild
3 (3.00%) high severe
linter/all-rules/pydantic/types.py
time: [2.5627 ms 2.5669 ms 2.5720 ms]
thrpt: [9.9158 MiB/s 9.9354 MiB/s 9.9516 MiB/s]
change:
time: [-5.8304% -5.6374% -5.4452%] (p = 0.00 < 0.05)
thrpt: [+5.7587% +5.9742% +6.1914%]
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
6 (6.00%) high mild
1 (1.00%) high severe
linter/all-rules/numpy/ctypeslib.py
time: [1.3949 ms 1.3956 ms 1.3964 ms]
thrpt: [11.925 MiB/s 11.931 MiB/s 11.937 MiB/s]
change:
time: [-6.2496% -6.0856% -5.9293%] (p = 0.00 < 0.05)
thrpt: [+6.3030% +6.4799% +6.6662%]
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
3 (3.00%) high mild
4 (4.00%) high severe
linter/all-rules/large/dataset.py
time: [5.5951 ms 5.6019 ms 5.6093 ms]
thrpt: [7.2527 MiB/s 7.2623 MiB/s 7.2711 MiB/s]
change:
time: [-5.1781% -4.9783% -4.8070%] (p = 0.00 < 0.05)
thrpt: [+5.0497% +5.2391% +5.4608%]
Performance has improved.
```
Still playing with this (the concepts need better names, documentation,
etc.), but opening up for feedback.
## Summary
Noticed in #5954: we walk _past_ the root rather than stopping _at_ the
root when attempting to traverse along the parent path. It's effectively
an off-by-one bug.
## Summary
Updated doc comment for `call_datetime_without_tzinfo.rs`. Online docs
also benefit from this update.
## Test Plan
Checked docs via
[mkdocs](389fe13c93/CONTRIBUTING.md?plain=1#L267-L296)
## Summary
Check for unused private `TypeVar`. See [original
implementation](2a86db8271/pyi.py (L1958)).
```
$ flake8 --select Y018 crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi
crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:4:1: Y018 TypeVar "_T" is not used
crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:5:1: Y018 TypeVar "_P" is not used
```
```
$ ./target/debug/ruff --select PYI018 crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi --no-cache
crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:4:1: PYI018 TypeVar `_T` is never used
crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:5:1: PYI018 TypeVar `_P` is never used
Found 2 errors.
```
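The outputs above correspond to a stub along these lines (a sketch, not the exact fixture contents):
```python
# PYI018.pyi
from typing import TypeVar

_T = TypeVar("_T")  # flagged: private TypeVar is never used
_P = TypeVar("_P")  # flagged: private TypeVar is never used
```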
In the file `unused_private_type_declaration.rs`, I'm planning to add
other rules that are similar to `PYI018` like the `PYI046`, `PYI047` and
`PYI049`.
ref #848
## Test Plan
Snapshots and manual runs of flake8.
## Summary
This is a rewrite of the main comment placement logic. `place_comment`
now has three parts:
- place own line comments
- between branches
- after a branch
- place end-of-line comments
- after colon
- after a branch
- place comments for specific nodes (that include module level comments)
The rewrite fixed three bugs: `class A: # trailing comment` comments now
stay end-of-line, `try: # comment` remains end-of-line and deeply
indented try-else-finally comments remain with the right nested
statement.
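For illustration, the comment positions covered by those fixes (a sketch):
```python
class A:  # this trailing comment stays end-of-line
    ...


try:  # this comment also remains end-of-line
    ...
except ValueError:
    ...
else:
    ...
finally:
    # a deeply indented own-line comment stays with the enclosing branch
    ...
```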
It will be much easier to support more nodes with alternative branches
since this is abstracted away by `is_node_with_body` and the
first/last-child helpers. Adding new node types can now be done by adding an entry to the
`place_comment` match. The code went from 1526 lines before #6033 to
1213 lines now.
I think it's easier to just read the new `placement.rs` rather than
reviewing the diff.
## Test Plan
The existing fixtures staying the same or improving plus new ones for
the bug fixes.
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
The `SIM102` auto-fix fails if `elif` is indented like this:
## Example
```python
def f():
# SIM102
if a:
pass
elif b:
if c:
d
```
```
> cargo run -p ruff_cli -- check --select SIM102 --fix a.py
...
error: Failed to fix nested if: Failed to extract statement from source
a.py:5:5: SIM102 Use a single `if` statement instead of nested `if` statements
Found 1 error.
```
## Test Plan
<!-- How was it tested? -->
New test
## Summary
This PR adds the implementation for the new Jupyter AST nodes, i.e.,
`ExprLineMagic` and `StmtLineMagic`.
## Test Plan
Add test cases for `unparse` containing magic commands
resolves: #6087
## Summary
Updated doc comment for `tab_indentation.rs`. Online docs also benefit
from this update.
## Test Plan
Checked docs via
[mkdocs](389fe13c93/CONTRIBUTING.md?plain=1#L267-L296)
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Part of #5062
Requires https://github.com/astral-sh/RustPython-Parser/pull/32
Adds visitation of type alias statements and type parameters in class
and function definitions.
Duplicates tests for `PreorderVisitor` into `Visitor` with new
snapshots. Testing required node implementations for the `TypeParam`
enum, which is a chunk of the diff and the reason we need `Ranged`
implementations in
https://github.com/astral-sh/RustPython-Parser/pull/32.
## Test Plan
<!-- How was it tested? -->
Adds unit tests with snapshots.
Reimplements https://github.com/astral-sh/ruff/pull/3104
Closes https://github.com/astral-sh/ruff/issues/5726
Note that we will generate the hash for a cache key twice in normal
operation: once to check for the cached item and again to update the
cache. We could optimize this by generating the hash once in
`diagnostics::lint_file` and passing the `u64` into `get` and `update`.
We'd probably want to wrap it in a `CacheKeyHash` enum for type safety.
## Test plan
Unit tests for Windows and Unix.
Manual test with case from issue
```
❯ touch fake.py
❯ chmod +x fake.py
❯ ./target/debug/ruff --select EXE fake.py
fake.py:1:1: EXE002 The file is executable but no shebang is present
Found 1 error.
❯ chmod -x fake.py
❯ ./target/debug/ruff --select EXE fake.py
```
## Summary
If a method is annotated with `@typing_extensions.override`, we should
avoid flagging A003 on it. This isn't part of the standard library yet,
but it's used to explicitly mark methods as overrides.
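For example (a sketch; assumes `typing_extensions` is installed):
```python
from typing_extensions import override


class Message(str):
    @override
    def format(self, *args, **kwargs):
        # `format` shadows a builtin, but the explicit override marker
        # means A003 no longer flags it
        return super().format(*args, **kwargs)
```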
## Summary
If a user subclasses `threading.Event`, e.g. with:
```python
from threading import Event
class CustomEvent(Event):
def set(self) -> None:
...
```
They have no control over the method name (`set`). This PR allows
`threading.Event#set` and `logging.Filter#filter` overrides, and avoids
flagging A003 in such cases. Ideally, we'd avoid flagging all overridden
methods, but... that's a lot more difficult, and this is at least
_better_ than what we do now.
Closes https://github.com/astral-sh/ruff/issues/6057.
Closes https://github.com/astral-sh/ruff/issues/5956.
## Summary
This PR modifies the order of operations in our AST checker. Previously,
we ran our analysis rules first, then bound names and traversed over the
subtrees. Now, after a series of refactors, we can invert the order: do
the subtree traversal and model-building _first_, then run rules.
The nice thing about this change is that when we go to analyze, e.g., a
function call node, we'll already have traversed any of the constituent
`Expr::Name` nodes... So if we store the resolution of all names when we do
the traversal, we can avoid having to do any expensive work in
`resolve_call_path`.
## Test Plan
Clean run of the snapshot tests, and hopefully the ecosystem checks too!
## Summary
This PR is a refactoring of `placement.rs`. The code is more consistent
now, some comments were updated, and some dead code was removed or
replaced with debug assertions. It also fixes a bug in the placement of
end-of-branch comments with nested bodies inside try statements, which
surfaced when refactoring the nested-body loop.
## Test Plan
The existing test cases don't change. I added a couple of cases that I
think should be tested but weren't, and a regression test for the bugfix.
## Summary
This PR ensures that we can retain the current behavior even after we
reorder the visitor a bit, by looking for annotated lambdas rather than
"is the name bound to anything?", since if we visit the name before we
run this rule, it'll _always_ be bound. (This check is already a bit
flawed -- in truth, we should probably run this rule deferred so that we
can reliably detect shadowing.)
## Summary
This PR attempts to draw some basic separation between the `Checker`'s
traversal responsibilities (traversing the AST, building the semantic
model) and its calling-out-to-lint-rule responsibilities. It doesn't try
to introduce any sophisticated API. Instead, it just moves all of the
lint rule calls out of `checkers/ast/mod.rs` and into methods in a new
`analyze` module. (There are four remaining lint rules in `Checker`, but
I'll remove those in future PRs.)
I'm not trying to "solve" our lint rule API here. Instead, I'm trying to
make two improvements:
1. `checkers/ast/mod.rs` has just gotten way too large, and people work
in it all the time. Prior to this PR, it was 5.5k lines, which led to
significant lags in my editor and made it really hard to reason about
the parts that are _actually_ important. (I like big files, but this one
crossed the line for me.) Now, it's < 2,000 lines, and the code is much
more focused.
2. I want to avoid accidentally adding lint rules in the "wrong" parts
of the traversal. By confining lint rule invocations to these "analyze"
calls, we'll avoid (e.g.) putting them in the binding phase.
## Summary
These only need the token stream, and we always prefer token-based to
physical line-based rules.
There are a few other changes snuck in here:
- Renaming the rule files to match the diagnostic names (likely an
error).
- The "leading whitespace before shebang" rule now works regardless of
where the comment occurs (i.e., if the shebang is on the second line,
and the first line is blank, we flag and remove that leading
whitespace).
**Summary** Fix an instability in the with-statement formatter when
there is an own-line comment after the `as`:
```python
with (
a as
# bad comment
b):
```
**Test Plan** Added the comment to the test cases.
## Summary
Closes https://github.com/astral-sh/ruff/issues/6025 (which contains a
more thorough description of the issue). Previously, the `# noqa` here
was being marked as unused, but removing it raised `SIM114`:
```python
def foo():
a = True
b = False
if a > b: # noqa: SIM114
return 3
elif a == b:
return 3
```
**Summary** Add a formatter progress testing script to CI. This script
will 1) print the Black compatibility on each run, 2) catch regressions
(formatter instability, emitted invalid syntax, and other kinds of bugs,
e.g., #5917) before they land on main, and 3) add an additional layer of
real-world tests when implementing new nodes or other new formatter
code.
This is currently a Bash script; I'm not sure whether we want to keep it
that way or switch to, e.g., the regular ecosystem scripts. The output
separation of `format_dev` could also use some polishing. We should also
consider pinning commits so we don't get spurious regressions when the
checked-out projects change their code.
**Test Plan** The script extends CI.
## Summary
My intuition is that it's faster to do these checks as needed rather
than allocating new hash maps and vectors for the arguments. (We
typically only query once anyway.)
## Summary
This PR adds a `logger-objects` setting that allows users to mark
specific symbols as `logging.Logger` objects. Currently, if a `logger`
is imported, we only treat it as a `logging.Logger` if it comes exactly
from the `logging` module or is `flask.current_app.logger`.
This PR allows users to mark specific loggers, like
`logging_setup.logger`, to ensure that they're covered by the
`flake8-logging-format` rules and others.
For example, if you have a module `logging_setup.py` with the following
contents:
```python
import logging
logger = logging.getLogger(__name__)
```
Adding `"logging_setup.logger"` to `logger-objects` will ensure that
`logging_setup.logger` is treated as a `logging.Logger` object when
imported from other modules (e.g., `from logging_setup import logger`).
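With that setting in place, usage like the following (a hypothetical consumer module) is covered by the `flake8-logging-format` rules:
```python
# app.py -- hypothetical consumer of the shared logger above
from logging_setup import logger

logger.info("Processed %s records", 42)
```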
Closes https://github.com/astral-sh/ruff/issues/5694.
I don't know whether we want to make this change but here's some data...
Binary size:
- `main`: 30,384
- `charlie/match-phf`: 30,416
llvm-lines:
- `main`: 1,784,148
- `charlie/match-phf`: 1,789,877
llvm-lines and binary size are both essentially unchanged (differing by
< 5) when moving from `u8` to `u32` return types, and even when moving
to `char` keys and values. I didn't expect this, but I'm not very
knowledgeable on this topic.
Performance:
```
Confusables/match/src time: [4.9102 µs 4.9352 µs 4.9777 µs]
change: [+1.7469% +2.2421% +2.8710%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
2 (2.00%) low mild
4 (4.00%) high mild
6 (6.00%) high severe
Confusables/match-with-skip/src
time: [2.0676 µs 2.0945 µs 2.1317 µs]
change: [+0.9384% +1.6000% +2.3920%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
3 (3.00%) high mild
5 (5.00%) high severe
Confusables/phf/src time: [31.087 µs 31.188 µs 31.305 µs]
change: [+1.9262% +2.2188% +2.5496%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
3 (3.00%) low mild
6 (6.00%) high mild
6 (6.00%) high severe
Confusables/phf-with-skip/src
time: [2.0470 µs 2.0486 µs 2.0502 µs]
change: [-0.3093% -0.1446% +0.0106%] (p = 0.08 > 0.05)
No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) high mild
2 (2.00%) high severe
```
The `-with-skip` variants add our optimization which first checks
whether the character is ASCII. So `match` is way, way faster than PHF,
but it tends not to matter since almost all source code is ASCII anyway.
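In Python pseudocode, the `-with-skip` fast path looks roughly like this (a sketch of the idea, not the Rust implementation):
```python
def confusable_replacement(ch: str, table: dict[str, str]) -> str | None:
    # Fast path: ASCII characters are never confusables, and almost all
    # source code is ASCII, so skip the table lookup entirely.
    if ord(ch) < 128:
        return None
    return table.get(ch)
```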
## Summary
This pull request adds support for the `int`, `float`, and `bool` types
in the `UP018` rule, converting an empty call to the type's default
value, or removing the call if a value of the same type is provided as
an argument.
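A sketch of the rewrites this enables (assuming the same fix behavior as the existing `str` handling):
```python
x = int()    # rewritten to: x = 0
y = float()  # rewritten to: y = 0.0
z = bool()   # rewritten to: z = False
n = int(7)   # rewritten to: n = 7
```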
## Test Plan
Added tests for `int`, `float` and `bool` types.
Partially resolves #5988.
**Summary** Add a `EmptyWithDanglingComments` format helper that formats
comments inside empty parentheses, brackets or curly braces. Previously,
this was implemented separately, and partially incorrectly, for each use
case.
Empty `()`, `[]`, and `{}` are special because there can be dangling
comments, and those comments can appear in two positions:
```python
x = [ # end-of-line
# own line
]
```
These comments are dangling because they can't be assigned to any
element inside, as they would be in all other cases.
**Test Plan** Added a regression test.
145 instances of unstable formatting remain (down from 149).
```
$ cargo run --bin ruff_dev --release -- format-dev --stability-check --error-file formatter-ecosystem-errors.txt --multi-project target/checkouts > formatter-ecosystem-progress.txt
$ rg "Unstable formatting" target/formatter-ecosystem-errors.txt | wc -l
145
```