## Summary
Our `is_builtin` check did a naive walk over the parent scopes; instead,
it needs to (e.g.) skip symbols in a class scope if being called outside
of the class scope itself.
Closes https://github.com/astral-sh/ruff/issues/6466.
## Test Plan
`cargo test`
## Summary
Reopening of https://github.com/astral-sh/ruff/pull/4880
One open TODO as described in:
https://github.com/astral-sh/ruff/pull/4880#discussion_r1265110215
FYI @charliermarsh seeing as you commented you wanted to do final review
and merge. @konstin @dhruvmanila @MichaReiser as previous reviewers.
# Old Description
## Summary
Adds an autofix for B006 turning mutable argument defaults into None and
setting their original value back in the function body if still `None`
at runtime like so:
```python
def before(x=[]):
pass
def after(x=None):
if x is None:
x = []
pass
```
## Test Plan
Added an extra test case to existing fixture with more indentation.
Checked results for all old examples.
NOTE: Also adapted the jupyter notebook test as this checked for B006 as
well.
## Issue link
Closes: https://github.com/charliermarsh/ruff/issues/4693
---------
Co-authored-by: konstin <konstin@mailbox.org>
**Summary** I collected all examples of end-of-line comments after
opening parentheses that i could think of so we get a comprehensive view
at the state of their formatting (#6390).
This PR intentionally only adds tests cases without any changes in
formatting. We need to decide which exact formatting we want, ideally in
terms of these test files, and implement this in follow-up PRs.
~~One stability check is still deactivated pending
https://github.com/astral-sh/ruff/pull/6386.~~
We currently don't format all comments as match statements are not yet implemented. We can work around this for the top level match statement by setting them manually formatted but the mocked-out top level match doesn't call into its children so they would still have unformatted comments
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR fixes the issue where the FString formatting dropped dangling comments between the string parts.
```python
result_f = (
f' File "{__file__}", line {lineno_f+1}, in f\n'
' f()\n'
# XXX: The following line changes depending on whether the tests
# are run through the interactive interpreter or with -m
# It also varies depending on the platform (stack size)
# Fortunately, we don't care about exactness here, so we use regex
r' \[Previous line repeated (\d+) more times\]' '\n'
'RecursionError: maximum recursion depth exceeded\n'
)
```
The solution here isn't ideal because it re-introduces the `enclosing_parent` on `DecoratedComment` but it is the easiest fix that I could come up.
I didn't spend more time finding another solution becaues I think we have to re-write most of the fstring formatting with the upcoming Python 3.12 support (because lexing the individual parts as we do now will no longer work).
closes#6440
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
`cargo test`
The child PR testing that all comments are formatted should now pass
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR adds the `AnyNodeRef.visit_preorder` method. I'll need this method to mark all comments of a suppressed node's children as formatted (in debug builds).
I'm not super happy with this because it now requires a double-dispatch where the `walk_*` methods call into `node.visit_preorder` and the `visit_preorder` then calls back into the visitor. Meaning,
the new implementation now probably results in way more function calls. The other downside is that `AnyNodeRef` now contains code that is difficult to auto-generate. This could be mitigated by extracting the `visit_preorder` method into its own `VisitPreorder` trait.
Anyway, this approach solves the need and avoids duplicating the visiting code once more.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
`cargo test`
<!-- How was it tested? -->
## Summary
The use of `|` as a union operator is not always safe, if a type
annotation is evaluated in a runtime context. For example, this code
errors at runtime:
```python
import httpretty
import requests_mock
item: type[requests_mock.Mocker | httpretty] = requests_mock.Mocker
```
However, it's fine in a `.pyi` file, with `__future__` annotations`, or
if the annotation is in a non-evaluated context, like:
```python
def func():
item: type[requests_mock.Mocker | httpretty] = requests_mock.Mocker
```
This PR modifies the rule to avoid enforcing in those invalid,
runtime-evaluated contexts.
Closes https://github.com/astral-sh/ruff/issues/6455.
## Summary
Use the same Python version by default for all tests (our
latest-supported version).
## Test Plan
`cargo test`
---------
Co-authored-by: Zanie <contact@zanie.dev>
## Summary
I think it makes sense for `PythonVersion::default()` to return our
minimum-supported non-EOL version.
## Test Plan
`cargo test`
---------
Co-authored-by: Zanie <contact@zanie.dev>
## Summary
This PR renames the `MagicCommand` token to `IpyEscapeCommand` token and
`MagicKind` to `IpyEscapeKind` type to better reflect the purpose of the
token and type. Similarly, it renames the AST nodes from `LineMagic` to
`IpyEscapeCommand` prefixed with `Stmt`/`Expr` wherever necessary.
It also makes renames from using `jupyter_magic` to
`ipython_escape_commands` in various function names.
The mode value is still `Mode::Jupyter` because the escape commands are
part of the IPython syntax but the lexing/parsing is done for a Jupyter
notebook.
### Motivation behind the rename:
* IPython codebase defines it as "EscapeCommand" / "Escape Sequences":
* Escape Sequences:
292e3a2345/IPython/core/inputtransformer2.py (L329-L333)
* Escape command:
292e3a2345/IPython/core/inputtransformer2.py (L410-L411)
* The word "magic" is used mainly for the actual magic commands i.e.,
the ones starting with `%`/`%%`
(https://ipython.readthedocs.io/en/stable/interactive/reference.html#magic-command-system).
So, this avoids any confusion between the Magic token (`%`, `%%`) and
the escape command itself.
## Test Plan
* `cargo test` to make sure all renames are done correctly.
* `grep` for `jupyter_escape`/`magic` to make sure all renames are done
correctly.
## Summary
This PR removes the group around function definition parameters, instead
grouping the parameters with the type parameters and return type
annotation.
This increases Zulip's similarity score from 0.99385 to 0.99699, so it's
a meaningful improvement. However, there's at least one stability error
that I'm working on, and I'm really just looking for high-level feedback
at this point, because I'm not happy with the solution.
Closes https://github.com/astral-sh/ruff/issues/6352.
## Test Plan
Before:
- `zulip`: 0.99396
- `django`: 0.99784
- `warehouse`: 0.99578
- `build`: 0.75436
- `transformers`: 0.99407
- `cpython`: 0.75987
- `typeshed`: 0.74432
After:
- `zulip`: 0.99702
- `django`: 0.99784
- `warehouse`: 0.99585
- `build`: 0.75623
- `transformers`: 0.99470
- `cpython`: 0.75988
- `typeshed`: 0.74853
## Summary
In https://github.com/astral-sh/ruff/pull/6397, the documentation was
updated stating that the default target-version is now "py38", but the
actual default value wasn't updated and remained py310. This commit
updates the default value to match what the documentation says.
## Summary
This PR adds support for a stricter version of help end escape
commands[^1] in the parser. By stricter, I mean that the escape tokens
are only at the end of the command and there are no tokens at the start.
This makes it difficult to implement it in the lexer without having to
do a lot of look aheads or keeping track of previous tokens.
Now, as we're adding this in the parser, the lexer needs to recognize
and emit a new token for `?`. So, `Question` token is added which will
be recognized only in `Jupyter` mode.
The conditions applied are the same as the ones in the original
implementation in IPython codebase (which is a regex):
* There can only be either 1 or 2 question mark(s) at the end
* The node before the question mark can be a `Name`, `Attribute`,
`Subscript` (only with integer constants in slice position), or any
combination of the 3 nodes.
## Test Plan
Added test cases for various combination of the possible nodes in the
command value position and update the snapshots.
fixes: #6359fixes: #5030 (This is the final piece)
[^1]: https://github.com/astral-sh/ruff/pull/6272#issue-1833094281
## Summary
Error if `tab-size` is set to zero (it is used as a divisor). Closes
#6423.
Also fixes a typo.
## Test Plan
Running ruff with a config
```toml
[tool.ruff]
tab-size = 0
```
returns an error message to the user saying that `tab-size` must be
greater than zero.
## Summary
Given:
```python
def double(a: int) -> ( # Hello
int
):
return 2*a
```
We currently treat `# Hello` as a trailing comment on the parameters
(`(a: int)`). This PR adds a placement method to instead treat it as a
dangling comment on the function definition itself, so that it gets
formatted at the end of the definition, like:
```python
def double(a: int) -> int: # Hello
return 2*a
```
The formatting in this case is unchanged, but it's incorrect IMO for
that to be a trailing comment on the parameters, and that placement
leads to an instability after changing the grouping in #6410.
Fixing this led to a _different_ instability related to tuple return
type annotations, like:
```python
def zrevrangebylex(self, name: _Key, max: _Value, min: _Value, start: int | None = None, num: int | None = None) -> ( # type: ignore[override]
):
...
```
(This is a real example.)
To fix, I had to special-case tuples in that spot, though I'm not
certain that's correct.
## Summary
This PR moves `empty_parenthesized` such that it's peer to
`parenthesized`, and changes the API to better match that of
`parenthesized` (takes `&str` rather than `StaticText`, has a
`with_dangling_comments` method, etc.).
It may be intentionally _not_ part of `parentheses.rs`, but to me
they're so similar that it makes more sense for them to be in the same
module, with the same API, etc.
## Summary
This PR adds support for `StmtMatch` with subs for `MatchCase`.
## Test Plan
Add a few additional test cases around `match` statement, comments, line
breaks.
resolves: #6298
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Fix name of rule in example of `extend-per-file-ignores` in `options.rs`
file.
It was `E401` but in configuration example `E402` was listed. Just a
tiny mismatch.
## Test Plan
<!-- How was it tested? -->
Just by my eyes :).
## Bug
Given
```python
x = () - (#
)
```
the comment is a dangling comment of the empty tuple. This is an
end-of-line comment so it may move after the expression. It still
expands the parent, so the operator breaks:
```python
x = (
()
- () #
)
```
In the next formatting pass, the comment is not a trailing tuple but a
trailing bin op comment, so the bin op doesn't break anymore. The
comment again expands the parent, so we still add the superfluous
parentheses
```python
x = (
() - () #
)
```
## Fix
The new formatting is to keep the comment on the empty tuple. This is a
log uglier and again has additional outer parentheses, but it's stable:
```python
x = (
()
- ( #
)
)
```
## Alternatives
Black formats all the examples above as
```python
x = () - () #
```
which i find better.
I would be happy about any suggestions for better solutions than the
current one. I'd mainly need a workaround for expand parent having an
effect on the bin op instead of first moving the comment to the end and
then applying expand parent to the assign statement.
## Summary
Manually add the parentheses around tuple expressions for the autofix in
`B014`.
This is also done in various other autofixes as well such as for
[`RUF005`](6df5ab4098/crates/ruff/src/rules/ruff/rules/collection_literal_concatenation.rs (L183-L184)),
[`UP024`](6df5ab4098/crates/ruff/src/rules/pyupgrade/rules/os_error_alias.rs (L137-L137)).
### Alternate Solution
An alternate solution would be to fix this in the `Generator` itself by
checking
if the tuple expression needs to be generated at the top-level or not.
If so,
then always add the parentheses.
```rust
} else if level == 0 {
// Top-level tuples are always parenthesized.
self.p("(");
let mut first = true;
for elt in elts {
self.p_delim(&mut first, ", ");
self.unparse_expr(elt, precedence::COMMA);
}
self.p_if(elts.len() == 1, ",");
self.p(")");
```
## Test Plan
Add a regression test for this case in `B014`.
fixes: #6412
Extends #6289 to support moving type variable usage in type aliases to
use PEP-695.
Does not remove the possibly unused type variable declaration.
Presumably this is handled by other rules, but is not working for me.
Does not handle type variables with bounds or variance declarations yet.
Part of #4617
## Summary
We have some logic in the expression analyzer method to avoid
re-checking the inner `Union` in `Union[Union[...]]`, since the methods
that analyze `Union` expressions already recurse. Elsewhere, we have
logic to avoid re-checking the inner `|` in `int | (int | str)`, for the
same reason.
This PR unifies that logic into a single method _and_ ensures that, just
as we recurse over both `Union` and `|`, we also detect that we're in
_either_ kind of nested union.
Closes https://github.com/astral-sh/ruff/issues/6285.
## Test Plan
Added some new snapshots.
## Summary
We can anticipate earlier that this will error, so we should avoid
flagging the error at all. Specifically, we're talking about cases like
`"{1} {0}".format(*args)"`, in which we'd need to reorder the arguments
in order to remove the `1` and `0`, but we _can't_ reorder the arguments
since they're not statically analyzable.
Closes https://github.com/astral-sh/ruff/issues/6388.
## Summary
I noticed some deviations in how we treat dangling comments that hug the
opening parenthesis for function definitions.
For example, given:
```python
def f( # first
# second
): # third
...
```
We currently format as:
```python
def f(
# first
# second
): # third
...
```
This PR adds the proper opening-parenthesis dangling comment handling
for function parameters. Specifically, as with all other parenthesized
nodes, we now detect that dangling comment in `placement.rs` and handle
it in `parameters.rs`. We have to take some care in that file, since we
have multiple "kinds" of dangling comments, but I added a bunch of test
cases that we now format identically to Black.
## Test Plan
`cargo test`
Before:
- `zulip`: 0.99388
- `django`: 0.99784
- `warehouse`: 0.99504
- `transformers`: 0.99404
- `cpython`: 0.75913
- `typeshed`: 0.74364
After:
- `zulip`: 0.99386
- `django`: 0.99784
- `warehouse`: 0.99504
- `transformers`: 0.99404
- `cpython`: 0.75913
- `typeshed`: 0.74409
Meaningful improvement on `typeshed`, minor decrease on `zulip`.
Closes https://github.com/astral-sh/ruff/issues/6068
These commits are kind of a mess as I did some stumbling around here.
Unrolls formatting of chained boolean operations to prevent nested
grouping which gives us Black-compatible formatting where each boolean
operation is on a new line.
## Summary
This PR modifies our `can_omit_optional_parentheses` rules to ensure
that if we see a call followed by an attribute, we treat that as an
attribute access rather than a splittable call expression.
This in turn ensures that we wrap like:
```python
ct_match = aaaaaaaaaaact_id == self.get_content_type(
obj=rel_obj, using=instance._state.db
)
```
For calls, but:
```python
ct_match = (
aaaaaaaaaaact_id == self.get_content_type(obj=rel_obj, using=instance._state.db).id
)
```
For calls with trailing attribute accesses.
Closes https://github.com/astral-sh/ruff/issues/6065.
## Test Plan
Similarity index before:
- `zulip`: 0.99436
- `django`: 0.99779
- `warehouse`: 0.99504
- `transformers`: 0.99403
- `cpython`: 0.75912
- `typeshed`: 0.72293
And after:
- `zulip`: 0.99436
- `django`: 0.99780
- `warehouse`: 0.99504
- `transformers`: 0.99404
- `cpython`: 0.75913
- `typeshed`: 0.72293
## Summary
This PR leverages the unified function definition node to add precise
AST node types to `MemberKind`, which is used to power our docstring
definition tracking (e.g., classes and functions, whether they're
methods or functions or nested functions and so on, whether they have a
docstring, etc.). It was painful to do this in the past because the
function variants needed to support a union anyway, but storing precise
nodes removes like a dozen panics.
No behavior changes -- purely a refactor.
## Test Plan
`cargo test`
## Summary
Per the suggestion in
https://github.com/astral-sh/ruff/discussions/6183, this PR removes
`AsyncWith`, `AsyncFor`, and `AsyncFunctionDef`, replacing them with an
`is_async` field on the non-async variants of those structs. Unlike an
interpreter, we _generally_ have identical handling for these nodes, so
separating them into distinct variants adds complexity from which we
don't really benefit. This can be seen below, where we get to remove a
_ton_ of code related to adding generic `Any*` wrappers, and a ton of
duplicate branches for these cases.
## Test Plan
`cargo test` is unchanged, apart from parser snapshots.
## Summary
See discussion in
https://github.com/astral-sh/ruff/pull/6351#discussion_r1284996979. We
can remove `RefEquality` entirely and instead use a text offset for
statement keys, since no two statements can start at the same text
offset.
## Test Plan
`cargo test`
## Summary
This PR adds support for help end escape command in the lexer.
### What are "help end escape commands"?
First, the escape commands are special IPython syntax which enhances the
functionality for the IPython REPL. There are 9 types of escape kinds
which are recognized by the tokens which are present at the start of the
command (`?`, `??`, `!`, `!!`, etc.).
Here, the help command is using either the `?` or `??` token at the
start (`?str.replace` for example). Those 2 tokens are also supported
when they're at the end of the command (`str.replace?`), but the other
tokens aren't supported in that position.
There are mainly two types of help end escape commands:
1. Ending with either `?` or `??`, but it also starts with one of the
escape tokens (`%matplotlib?`)
2. On the other hand, there's a stricter version for (1) which doesn't
start with any escape tokens (`str.replace?`)
This PR adds support for (1) while (2) will be supported in the parser.
### Priority
Now, if the command starts and ends with an escape token, how do we
decide the kind of this command? This is where priority comes into
picture. This is simple as there's only one priority where `?`/`??` at
the end takes priority over any other escape token and all of the other
tokens are at the same priority. Remember that only `?`/`??` at the end
is considered valid.
This is mainly useful in the case where someone would want to invoke the
help command on the magic command itself. For example, in `%matplotlib?`
the help command takes priority which means that we want help for the
`matplotlib` magic function instead of calling the magic function
itself.
### Specification
Here's where things get a bit tricky. What if there are question mark
tokens at both ends. How do we decide if it's `Help` (`?`) kind or
`Help2` (`??`) kind?
| | Magic | Value | Kind |
| --- | --- | --- | --- |
| 1 | `?foo?` | `foo` | `Help` |
| 2 | `??foo?` | `foo` | `Help` |
| 3 | `?foo??` | `foo` | `Help2` |
| 4 | `??foo??` | `foo` | `Help2` |
| 5 | `???foo??` | `foo` | `Help2` |
| 6 | `??foo???` | `foo???` | `Help2` |
| 7 | `???foo???` | `?foo???` | `Help2` |
Looking at the above table:
- The question mark tokens on the right takes priority over the ones on
the left but only if the number of question mark on the right is 1 or 2.
- If there are more than 2 question mark tokens on the right side, then
the left side is used to determine the same.
- If the right side is used to determine the kind, then all of the
question marks and whitespaces on the left side are ignored in the
`value`, but if it’s the other way around, then all of the extra
question marks are part of the `value`.
### References
- IPython implementation using the regex:
292e3a2345/IPython/core/inputtransformer2.py (L454-L462)
- Priorities:
292e3a2345/IPython/core/inputtransformer2.py (L466-L469)
## Test Plan
Add a bunch of test cases for the lexer and verify that it matches the
behavior of
IPython transformer.
resolves: #6357
## Summary
This PR fixes the performance degradation introduced in
https://github.com/astral-sh/ruff/pull/6345. Instead of using the
generic `Nodes` structs, we now use separate `Statement` and
`Expression` structs. Importantly, we can avoid tracking a bunch of
state for expressions that we need for parents: we don't need to track
reference-to-ID pointers (we just have no use-case for this -- I'd
actually like to remove this from statements too, but we need it for
branch detection right now), we don't need to track depth, etc.
In my testing, this entirely removes the regression on all-rules, and
gets us down to 2ms slower on the default rules (as a crude hyperfine
benchmark, so this is within margin of error IMO).
No behavioral changes.
## Summary
This PR attempts to draw a clearer divide between "methods that take
(e.g.) an expression or statement as input" and "methods that rely on
the _current_ expression or statement" in the semantic model, by
renaming methods like `stmt()` to `current_statement()`.
This had led to confusion in the past. For example, prior to this PR, we
had `scope()` (which returns the current scope), and `parent_scope`,
which returns the parent _of a scope that's passed in_. Now, the API is
clearer: `current_scope` returns the current scope, and `parent_scope`
takes a scope as argument and returns its parent.
Per above, I also changed `stmt` to `statement` and `expr` to
`expression`.
## Summary
Given:
```python
[ # comment
first,
second,
third
] # another comment
```
We were adding a hard line break as part of the formatting of `#
comment`, which led to the following formatting:
```python
[first, second, third] # comment
# another comment
```
Closes https://github.com/astral-sh/ruff/issues/6367.
## Summary
Fixes an instability whereby this:
```python
def get_recent_deployments(threshold_days: int) -> Set[str]:
# Returns a list of deployments not older than threshold days
# including `/root/zulip` directory if it exists.
recent = set()
threshold_date = datetime.datetime.now() - datetime.timedelta( # noqa: DTZ005
days=threshold_days
)
```
Was being formatted as:
```python
def get_recent_deployments(threshold_days: int) -> Set[str]:
# Returns a list of deployments not older than threshold days
# including `/root/zulip` directory if it exists.
recent = set()
threshold_date = (
datetime.datetime.now()
- datetime.timedelta(days=threshold_days) # noqa: DTZ005
)
```
Which was in turn being formatted as:
```python
def get_recent_deployments(threshold_days: int) -> Set[str]:
# Returns a list of deployments not older than threshold days
# including `/root/zulip` directory if it exists.
recent = set()
threshold_date = (
datetime.datetime.now() - datetime.timedelta(days=threshold_days) # noqa: DTZ005
)
```
The second-to-third formattings still differs from Black because we
aren't taking the line suffix into account when splitting
(https://github.com/astral-sh/ruff/issues/6377), but the first
formatting is correct and should be unchanged (i.e., the first-to-second
formattings is incorrect, and fixed here).
## Test Plan
`cargo run --bin ruff_dev -- format-dev --stability-check ../zulip`
## Summary
When we iterate over the AST for analysis, we often process nodes in a
"deferred" manner. For example, if we're analyzing a function, we push
the function body onto a deferred stack, along with a snapshot of the
current semantic model state. Later, when we analyze the body, we
restore the semantic model state from the snapshot. This ensures that we
know the correct scope, hierarchy of statement parents, etc., when we go
to analyze the function body.
Historically, we _haven't_ included the _expression_ hierarchy in the
model snapshot -- so we track the current expression parents in the
visitor, but we never save and restore them when processing deferred
nodes. This can lead to subtle bugs, in that methods like
`expr_parent()` aren't guaranteed to be correct, if you're in a deferred
visitor.
This PR migrates expression tracking to mirror statement tracking
exactly. So we push all expressions onto an `IndexVec`, and include the
current expression on the snapshot. This ensures that `expr_parent()`
and related methods are "always correct" rather than "sometimes
correct".
There's a performance cost here, both at runtime and in terms of memory
consumption (we now store an additional pointer for every expression).
In my hyperfine testing, it's about a 1% performance decrease for
all-rules on CPython (up to 533.8ms, from 528.3ms) and a 4% performance
decrease for default-rules on CPython (up to 212ms, from 204ms).
However... I think this is worth it given the incorrectness of our
current approach. In the future, we may want to reconsider how we do
these upward traversals (e.g., with something like a red-green tree).
(**Note**: in https://github.com/astral-sh/ruff/pull/6351, the slowdown
seems to be entirely removed.)
Changes:
- Fixes typo and repeated phrase in `DTZ002`
- Adds docs for `DTZ003`
- Adds docs for `DTZ004`
- Adds example for <=Python3.10 in `DTZ001`
Related to: https://github.com/astral-sh/ruff/issues/2646
## Summary
Historically, we've stored "qualified names" on our
`BindingKind::Import`, `BindingKind::SubmoduleImport`, and
`BindingKind::ImportFrom` structs. In Ruff, a "qualified name" is a
dot-separated path to a symbol. For example, given `import foo.bar`, the
"qualified name" would be `"foo.bar"`; and given `from foo.bar import
baz`, the "qualified name" would be `foo.bar.baz`.
This PR modifies the `BindingKind` structs to instead store _call paths_
rather than qualified names. So in the examples above, we'd store
`["foo", "bar"]` and `["foo", "bar", "baz"]`. It turns out that this
more efficient given our data access patterns. Namely, we frequently
need to convert the qualified name to a call path (whenever we call
`resolve_call_path`), and it turns out that we do this operation enough
that those conversations show up on benchmarks.
There are a few other advantages to using call paths, rather than
qualified names:
1. The size of `BindingKind` is reduced from 32 to 24 bytes, since we no
longer need to store a `String` (only a boxed slice).
2. All three import types are more consistent, since they now all store
a boxed slice, rather than some storing an `&str` and some storing a
`String` (for `BindingKind::ImportFrom`, we needed to allocate a
`String` to create the qualified name, but the call path is a slice of
static elements that don't require that allocation).
3. A lot of code gets simpler, in part because we now do call path
resolution "earlier". Most notably, for relative imports (`from .foo
import bar`), we store the _resolved_ call path rather than the relative
call path, so the semantic model doesn't have to deal with that
resolution. (See that `resolve_call_path` is simpler, fewer branches,
etc.)
In my testing, this change improves the all-rules benchmark by another
4-5% on top of the improvements mentioned in #6047.
## Summary
Update `F841` autofix to not remove line magic expr
## Test Plan
Added test case for assignment statement with and without type
annotation
fixes: #6116
## Summary
Enable using the new `Mode::Jupyter` for the tokenizer/parser to parse
Jupyter line magic tokens.
The individual call to the lexer i.e., `lex_starts_at` done by various
rules should consider the context of the source code (is this content
from a Jupyter Notebook?). Thus, a new field `source_type` (of type
`PySourceType`) is added to `Checker` which is being passed around as an
argument to the relevant functions. This is then used to determine the
`Mode` for the lexer.
## Test Plan
Add new test cases to make sure that the magic statement is considered
while generating the diagnostic and autofix:
* For `I001`, if there's a magic statement in between two import blocks,
they should be sorted independently
fixes: #6090
## Summary
As of version
[23.1.0](2a86db8271/CHANGELOG.md?plain=1#L158-L160),
`flake8-pyi` remove the rule `Y027`.
The errors that resulted in `PYI027` are now being emitted by `PYI022`
(`UP035`).
ref: #848
## Test Plan
Add new tests cases.
## Summary
Previously, failed on methods like:
```python
@classmethod
def bad_posonly_class_method(cls: type[_S], /) -> _S: ... # PYI019
```
Since we check if there are any positional-only or non-positional
arguments, but then do an unsafe access on `parameters.args`.
Closes https://github.com/astral-sh/ruff/issues/6349.
## Test Plan
`cargo test` (verified that `main` panics on the new fixtures)
These rules assume that the docstring is on its own line. pydocstyle
treats them inconsistently, so I'm just going to disable them in this
case.
Closes https://github.com/astral-sh/ruff/issues/6329.
## Summary
Fixes some comprehension formatting by avoiding creating the group for
the comprehension itself (so that if it breaks, all parts break on their
own lines, e.g. the `for` and the `if` clauses).
Closes https://github.com/astral-sh/ruff/issues/6063.
## Test Plan
Bunch of new fixtures.
Implement fluent style/call chains. See the `call_chains.py` formatting
for examples.
This isn't fully like black because in `raise A from B` they allow `A`
breaking can influence the formatting of `B` even if it is already
multiline.
Similarity index:
| project | main | PR |
|--------------|-------|-------|
| build | ??? | 0.753 |
| django | 0.991 | 0.998 |
| transformers | 0.993 | 0.994 |
| typeshed | 0.723 | 0.723 |
| warehouse | 0.978 | 0.994 |
| zulip | 0.992 | 0.994 |
Call chain formatting is affected by
https://github.com/astral-sh/ruff/issues/627, but i'm cutting scope
here.
Closes#5343
**Test Plan**:
* Added a dedicated call chains test file
* The ecosystem checks found some bugs
* I manually check django and zulip formatting
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
Extends `comparison-with-itself` to cover simple function calls on
known-pure functions, like `id`. For example, we now flag `id(x) ==
id(x)`.
Closes https://github.com/astral-sh/ruff/issues/6276.
## Test Plan
`cargo test`
**Summary** This adds the information whether we're in a .py python
source file or in a .pyi stub file to enable people working on #5822 and
related issues.
I'm not completely happy with `Default` for something that depends on
the input.
**Test Plan** None, this is currently unused, i'm leaving this to first
implementation of stub file specific formatting.
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
Follow-up to https://github.com/astral-sh/ruff/pull/6325, to avoid false
positives in cases like:
```python
if x == int:
...
```
Which is valid, since we don't know that we're comparing the type _of_
something -- we're comparing the type objects directly.
## Summary
Extends `type-comparison` to flag:
```python
if type(obj) is int:
pass
```
In addition to the existing cases, like:
```python
if type(obj) is type(1):
pass
```
Closes https://github.com/astral-sh/ruff/issues/6260.
## Summary
We already support preserving the end-of-line comment in calls and type
parameters, as in:
```python
foo( # comment
bar,
)
```
This PR adds the same behavior for lists, sets, comprehensions, etc.,
such that we preserve:
```python
[ # comment
1,
2,
3,
]
```
And related cases.
## Summary
This PR adds an API for chaining comment placement methods based on the
[`then_with`](https://doc.rust-lang.org/std/cmp/enum.Ordering.html#method.then_with)
from `Ordering` in the standard library.
For example, you can now do:
```rust
try_some_case(comment).then_with(|comment| try_some_other_case_if_still_default(comment))
```
This lets us avoid this kind of pattern, which I've seen in
`placement.rs` and used myself before:
```rust
let comment = match handle_own_line_comment_between_branches(comment, preceding, locator) {
CommentPlacement::Default(comment) => comment,
placement => return placement,
};
```
## Summary
This ensures that we treat `# comment` as parenthesized in contexts
like:
```python
while (
True
# comment
):
pass
```
The same logic applies equally to `for`, `async for`, `if`, `with`, and
`async with`. The general pattern is that you have an expression which
precedes a colon-separated suite.
**Summary** Prompted by
https://github.com/astral-sh/ruff/pull/6257#issuecomment-1661308410, it
tried to make the ecosystem script output on failure better
understandable. All log messages are now written to a file, which is
printed on error. Running locally progress is still shown.
Looking through the log output i saw that we currently log syntax errors
in input, which is confusing because they aren't actual errors, but we
don't check that these files don't change due to parser regressions or
improvements. I added `--files-with-errors` to catch that.
**Test Plan** CI
Adds rule to convert type aliases defined with annotations i.e. `x:
TypeAlias = int` to the new PEP-695 syntax e.g. `type x = int`.
Does not support using new generic syntax for type variables, will be
addressed in a follow-up.
Added as part of pyupgrade — ~the code 100 as chosen to avoid collision
with real pyupgrade codes~.
Part of #4617
Builds on #5062
Of the rules that flake8-pyi enforces for `.pyi` type stubs, many of
them equally make sense to check in normal runtime code with type
annotations. Broaden these rules to check all files:
PYI013 ellipsis-in-non-empty-class-body
PYI016 duplicate-union-member
PYI018 unused-private-type-var
PYI019 custom-type-var-return-type
PYI024 collections-named-tuple
PYI025 unaliased-collections-abc-set-import
PYI030 unnecessary-literal-union
PYI032 any-eq-ne-annotation
PYI034 non-self-return-type
PYI036 bad-exit-annotation
PYI041 redundant-numeric-union
PYI042 snake-case-type-alias
PYI043 t-suffixed-type-alias
PYI045 iter-method-return-iterable
PYI046 unused-private-protocol
PYI047 unused-private-type-alias
PYI049 unused-private-typed-dict
PYI050 no-return-argument-annotation-in-stub (Python ≥ 3.11)
PYI051 redundant-literal-union
PYI056 unsupported-method-call-on-all
The other rules are stub-specific and remain enabled only in `.pyi`
files.
PYI001 unprefixed-type-param
PYI002 complex-if-statement-in-stub
PYI003 unrecognized-version-info-check
PYI004 patch-version-comparison
PYI005 wrong-tuple-length-version-comparison (could make sense to
broaden, see
https://github.com/astral-sh/ruff/pull/6297#issuecomment-1663314807)
PYI006 bad-version-info-comparison (same)
PYI007 unrecognized-platform-check
PYI008 unrecognized-platform-name
PYI009 pass-statement-stub-body
PYI010 non-empty-stub-body
PYI011 typed-argument-default-in-stub
PYI012 pass-in-class-body
PYI014 argument-default-in-stub
PYI015 assignment-default-in-stub
PYI017 complex-assignment-in-stub
PYI020 quoted-annotation-in-stub
PYI021 docstring-in-stub
PYI026 type-alias-without-annotation (could make sense to broaden, but
gives many false positives on runtime code as currently implemented)
PYI029 str-or-repr-defined-in-stub
PYI033 type-comment-in-stub
PYI035 unassigned-special-variable-in-stub
PYI044 future-annotations-in-stub
PYI048 stub-body-multiple-statements
PYI052 unannotated-assignment-in-stub
PYI053 string-or-bytes-too-long
PYI054 numeric-literal-too-long
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
In Python >= 3.7, `await` can be included in f-strings.
https://bugs.python.org/issue28942
## Test Plan
<!-- How was it tested? -->
Existing tests
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
#2646
## Test Plan
<!-- How was it tested? -->
## Summary
Implements `Y019` from
[flake8-pyi](https://github.com/PyCQA/flake8-pyi).
The rule checks if
- instance methods that return `self`
- class methods that return an instance of `cls`
- `__new__` methods
Return a custom `TypeVar` instead of `typing.Self` and raises a
violation if this is the case. The rule also covers
[PEP-695](https://peps.python.org/pep-0695/) syntax as introduced in
upstream in https://github.com/PyCQA/flake8-pyi/pull/402
## Test Plan
Added fixtures with test cases from upstream implementation (plus
additional one for an excluded edge case, mentioned in upstream
implementation)
## Summary
Relates to #970.
Add new `bad-format-character` Pylint rule.
I had to make a change in `crates/ruff_python_literal/src/format.rs` to
get a more detailed error in case the format character is not correct. I
chose to do this since most of the format spec parsing functions are
private. It would have required me reimplementing most of the parsing
logic just to know if the format char was correct.
This PR also doesn't reflect current Pylint functionality in two ways.
It supports new format strings correctly, Pylint as of now doesn't. See
pylint-dev/pylint#6085.
In case there are multiple adjacent string literals delimited by
whitespace the index of the wrong format char will relative to the
single string. Pylint will instead reported it relative to the
concatenated string.
Given this:
```
"%s" "%z" % ("hello", "world")
```
Ruff will report this:
```Unsupported format character 'z' (0x7a) at index 1```
Pylint instead:
```Unsupported format character 'z' (0x7a) at index 3```
I believe it's more sensible to report the index relative to the
individual string.
## Test Plan
Added new snapshot and a small test in
`crates/ruff_python_literal/src/format.rs`.
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
Part of #5062
Closes https://github.com/astral-sh/ruff/issues/5931
Implements formatting of a sequence of type parameters in a dedicated
struct for reuse by classes, functions, and type aliases (preparing for
#5929). Adds formatting of type parameters in class and function
definitions — previously, they were just elided.
## Summary
Builds on #6170 to break `global` and `nonlocal` statements, such that
we get:
```python
def f():
global \
analyze_featuremap_layer, \
analyze_featuremapcompression_layer, \
analyze_latencies_post, \
analyze_motions_layer, \
analyze_size_model
```
Instead of:
```python
def f():
global analyze_featuremap_layer, analyze_featuremapcompression_layer, analyze_latencies_post, analyze_motions_layer, analyze_size_model
```
Notably, we avoid applying this formatting if the statement ends in a
comment. Otherwise, the comment would _need_ to be placed after the last
item, like:
```python
def f():
global \
analyze_featuremap_layer, \
analyze_featuremapcompression_layer, \
analyze_latencies_post, \
analyze_motions_layer, \
analyze_size_model # noqa
```
To me, this seems wrong (and would break the `# noqa` comment). Ideally,
the items would be parenthesized, and the comment would be on the inner
parenthesis, like:
```python
def f():
global ( # noqa
analyze_featuremap_layer,
analyze_featuremapcompression_layer,
analyze_latencies_post,
analyze_motions_layer,
analyze_size_model
)
```
But that's not valid syntax.
## Summary
Checks for the presence of redundant `Literal` types and builtin super
types in an union. See [original
source](2a86db8271/pyi.py (L1261)).
This implementation has a couple of differences from the original. The
first one is, we support the `complex` and `float` builtin types. The
second is, when reporting diagnostic for a `Literal` with multiple
members of the same type, we print the entire `Literal` while `flak8`
only prints the `Literal` with its first member.
For example:
```python
from typing import Literal
x: Literal[1, 2] | int
```
Ruff will show `Literal[1, 2]` while flake8 only shows `Literal[1]`.
```shell
$ ruff crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:4:18: PYI051 `Literal["foo"]` is redundant in an union with `str`
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:5:37: PYI051 `Literal[b"bar", b"foo"]` is redundant in an union with `bytes`
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:6:37: PYI051 `Literal[5]` is redundant in an union with `int`
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:6:67: PYI051 `Literal["foo"]` is redundant in an union with `str`
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:7:37: PYI051 `Literal[b"str_bytes"]` is redundant in an union with `bytes`
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:7:51: PYI051 `Literal[42]` is redundant in an union with `int`
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:9:31: PYI051 `Literal[1J]` is redundant in an union with `complex`
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:9:53: PYI051 `Literal[3.14]` is redundant in an union with `float`
Found 8 errors.
```
```shell
$ flake8 crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:4:18: Y051 "Literal['foo']" is redundant in a union with "str"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:5:37: Y051 "Literal[b'bar']" is redundant in a union with "bytes"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:6:37: Y051 "Literal[5]" is redundant in a unionwith "int"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:6:67: Y051 "Literal['foo']" is redundant in a union with "str"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:7:37: Y051 "Literal[b'str_bytes']" is redundantin a union with "bytes"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI051.pyi:7:51: Y051 "Literal[42]" is redundant in a union with "int"
```
While implementing this rule, I found a bug in the `is_unchecked_union`
check. This is the new check.
1ab86bad35/crates/ruff/src/checkers/ast/analyze/expression.rs (L85-L102)
The purpose of the check was to prevent rules from navigating through
nested `Union`s, as they already handle nested `Union`s. The way it was
implemented, this was not happening, the rules were getting executed
more than one time and sometimes were receiving expressions that were
not `Union`. For example, with the following code:
```python
typing.Union[Literal[5], int, typing.Union[Literal["foo"], str]]
```
The rules were receiving the expressions in the following order:
- `typing.Union[Literal[5], int, typing.Union[Literal["foo"], str]]`
- `Literal[5]`
- `typing.Union[Literal["foo"], str]]`
This was causing `PYI030` to report redundant information, for example:
```python
typing.Union[Literal[5], int, typing.Union[Literal["foo"],
Literal["bar"]]]
```
This is the `PYI030` output for this code:
```shell
PYI030 Multiple literal members in a union. Use a single literal, e.g. `Literal[5, "foo", "bar"]`
YI030 Multiple literal members in a union. Use a single literal, e.g.`Literal[5, "foo"]`
```
If I haven't misinterpreted the rule, that looks incorrect. I didn't
have the time to check the `PYI016` rule.
The last thing is, I couldn't find a reason for the "Why is this bad?"
section for `PYI051`.
Ref: #848
## Test Plan
Snapshots and manual runs of flake8.
\
## Summary
Previously, the ruff formatter was removing the star argument of
`lambda` expressions when formatting.
Given the following code snippet
```python
lambda *a: ()
lambda **b: ()
```
it would be formatted to
```python
lambda: ()
lambda: ()
```
We fix this by checking for the presence of `args`, `vararg` or `kwarg`
in the `lambda` expression, before we were only checking for the
presence of `args`.
Fixes#5894
## Test Plan
Add new tests cases.
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
Similar to #6279, moving some helpers onto the struct in the name of
reducing the number of random undiscoverable utilities we have in
`helpers.rs`.
Most of the churn is migrating rules to take `ast::ExprCall` instead of
the spread call arguments.
## Test Plan
`cargo test`
## Summary
This PR removes a now-unnecessary abstraction from `helper.rs`
(`CallArguments`), in favor of adding methods to `Arguments` directly,
which helps with discoverability.
## Summary
This PR boxes the `TypeParams` and `Arguments` fields on the class
definition node. These fields are optional and often emitted, and given
that class definition is our largest enum variant, we pay the cost of
including them for every statement in the AST. Boxing these types
reduces the statement size by 40 bytes, which seems like a good tradeoff
given how infrequently these are accessed.
## Test Plan
Need to benchmark, but no behavior changes.
## Summary
This PR leverages the `Arguments` AST node introduced in #6259 in the
formatter, which ensures that we correctly handle trailing comments in
calls, like:
```python
f(
1,
# comment
)
pass
```
(Previously, this was treated as a leading comment on `pass`.)
This also allows us to unify the argument handling across calls and
class definitions.
## Test Plan
A bunch of new fixture tests, plus improved Black compatibility.
## Summary
Similar to #6259, this PR adds a `TypeParams` node to the AST, to
capture the list of type parameters with their surrounding brackets.
If a statement lacks type parameters, the `type_params` field will be
`None`.
## Summary
This PR adds a new `Arguments` AST node, which we can use for function
calls and class definitions.
The `Arguments` node spans from the left (open) to right (close)
parentheses inclusive.
In the case of classes, the `Arguments` is an option, to differentiate
between:
```python
# None
class C: ...
# Some, with empty vectors
class C(): ...
```
In this PR, we don't really leverage this change (except that a few
rules get much simpler, since we don't need to lex to find the start and
end ranges of the parentheses, e.g.,
`crates/ruff/src/rules/pyupgrade/rules/lru_cache_without_parameters.rs`,
`crates/ruff/src/rules/pyupgrade/rules/unnecessary_class_parentheses.rs`).
In future PRs, this will be especially helpful for the formatter, since
we can track comments enclosed on the node itself.
## Test Plan
`cargo test`
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Before:
<img width="1031" alt="Screen Shot 2023-08-02 at 15 57 10"
src="https://github.com/astral-sh/ruff/assets/17039389/171a21d5-01a5-4aa5-8079-4e7f8a59ade8">
After:
<img width="1031" alt="Screen Shot 2023-08-02 at 15 58 03"
src="https://github.com/astral-sh/ruff/assets/17039389/afd1b9b7-89e0-4e38-a4a6-e3255b62f021">
## Test Plan
<!-- How was it tested? -->
Manual inspection
## Summary
This PR renames...
- `Parameter#arg` to `Parameter#name`
- `ParameterWithDefault#def` to `ParameterWithDefault#parameter` (such
that `ParameterWithDefault` has a `default` and a `parameter`)
## Test Plan
`cargo test`
## Summary
This PR renames a few AST nodes for clarity:
- `Arguments` is now `Parameters`
- `Arg` is now `Parameter`
- `ArgWithDefault` is now `ParameterWithDefault`
For now, the attribute names that reference `Parameters` directly are
changed (e.g., on `StmtFunctionDef`), but the attributes on `Parameters`
itself are not (e.g., `vararg`). We may revisit that decision in the
future.
For context, the AST node formerly known as `Arguments` is used in
function definitions. Formally (outside of the Python context),
"arguments" typically refers to "the values passed to a function", while
"parameters" typically refers to "the variables used in a function
definition". E.g., if you Google "arguments vs parameters", you'll get
some explanation like:
> A parameter is a variable in a function definition. It is a
placeholder and hence does not have a concrete value. An argument is a
value passed during function invocation.
We're thus deviating from Python's nomenclature in favor of a scheme
that we find to be more precise.
## Summary
Black allows up to one blank line _before_ a class docstring, and
enforces one blank line _after_ a class docstring. This PR implements
that handling. The cases in
`crates/ruff_python_formatter/resources/test/fixtures/ruff/statement/class_definition.py`
match Black identically.
## Summary
This PR ensures that if a function or class is the first statement in a
nested suite that _isn't_ a function or class body, we insert a leading
newline.
For example, given:
```python
def f():
if True:
def register_type():
pass
```
We _want_ to preserve the newline, whereas today, we remove it.
Note that this only applies when the function or class doesn't have any
leading comments.
Closes https://github.com/astral-sh/ruff/issues/6066.
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR removes the `Interactive` and `FunctionType` parser modes that are unused by ruff
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
`cargo test`
<!-- How was it tested? -->
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR removes the `type_comment` field which our parser doesn't support.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
`cargo test`
<!-- How was it tested? -->
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR removes the type ignore node from the AST because our parser doesn't support it, and just having it around is confusing.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
`cargo build`
<!-- How was it tested? -->
**Summary** Allow passing any node to `CommentPlacement::{leading,
trailing, dangling}` without manually converting. Conversely, Restrict
the comment to the only type we actually pass.
**Test Plan** No changes.
## Summary
Previously, given:
```python
a = \
5;
```
When detecting continuations starting at the offset of the `;`, we'd
flag the previous line as a continuation. We should only flag a
continuation if there isn't leading content prior to the offset.
Closes https://github.com/astral-sh/ruff/issues/6214
## Summary
This PR moves the "insert empty lines" behavior out of
`JoinNodesBuilder` and into the `Suite` formatter. I find it a little
confusing that the logic is split between those two formatters right
now, and since this is _only_ used in that one place, IMO it is a bit
simpler to just inline it and use a single approach to tracking state
(right now, both are stateful).
The only other place this was used was for decorators. As a side effect,
we now remove blank lines in both of these cases, which is a known but
intentional deviation from Black (which preserves the empty line before
the comment in the first case):
```python
@foo
# Hello
@bar
def baz():
pass
@foo
@bar
def baz():
pass
```
## Summary
Very subtle bug related to the AST traversal. Given:
```python
from __future__ import annotations
from logging import getLogger
__all__ = ("getLogger",)
def foo() -> None:
pass
```
We end up visiting the `-> None` annotation, then reusing the state
snapshot when we go to visit the `__all__` exports, so when we visit
`"getLogger"`, we think we're inside of a deferred type annotation.
This PR changes all the deferred visitors to snapshot and restore the
state, which is a lot safer -- that way, the visitors avoid modifying
the current visitor state. (Previously, they implicitly left the visitor
state set to the state of the _last_ thing they visited.)
Closes https://github.com/astral-sh/ruff/issues/6207.
**Summary** This includes two changes:
* Allow setting `-v` in `ruff_dev`, using the `ruff_cli` implementation
* `debug!` which ruff configuration strategy was used
This is a byproduct of debugging #6187.
**Test Plan** n/a
**Summary** This prevents us from turning `r'''\""'''` into
`r"""\"""""`, which is invalid syntax.
This PR fixes CI, which is currently broken on main (in a way that still
passes on linter PRs and allows merging formatter PRs, but it's bad to
have a job be red). Once merged, i'll make the formatted ecosystem
checks a required check.
**Test Plan** Added a regression test.
## Summary
This PR adds a new precedence level for the comprehension element. This fixes
the generator to not add parentheses around the comprehension element every
time.
The new precedence level is `COMPREHENSION_ELEMENT` and it should occur after
the `NAMED_EXPR` precedence level because named expressions are always parenthesized.
This matches the behavior of Python `ast.unparse` and tested with the
following snippet:
```python
import ast
code = ""
ast.unparse(ast.parse(code))
```
## Test Plan
Add a bunch of test cases for all the valid nodes at that position.
fixes: #5777
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Format bytes string
Closes#6064
## Test Plan
Added a fixture based on string's one
## Summary
We have some code to ensure that if an aliased import is used, any
submodules should be marked as used too. This comment says it best:
```rust
// If the name of a submodule import is the same as an alias of another import, and the
// alias is used, then the submodule import should be marked as used too.
//
// For example, mark `pyarrow.csv` as used in:
//
// ```python
// import pyarrow as pa
// import pyarrow.csv
// print(pa.csv.read_csv("test.csv"))
// ```
```
However, it looks like when we go to look up `pyarrow` (of `import
pyarrow as pa`), we aren't checking to ensure the resolved binding is
_actually_ an import. This was causing us to attribute `print(rm.ANY)`
to `def requests_mock` here:
```python
import requests_mock as rm
def requests_mock(requests_mock: rm.Mocker):
print(rm.ANY)
```
Closes https://github.com/astral-sh/ruff/issues/6180.
## Summary
Adds `global` and `nonlocal` formatting, without the "deviation from
black" outlined in the linked issue, which I'll do separately.
See: https://github.com/astral-sh/ruff/issues/4798.
## Test Plan
Added a fixture in the Ruff-specific directory since the Black fixtures
don't seem to cover this.
## Summary
Closes https://github.com/astral-sh/ruff/issues/5781
## Test Plan
Added cases to
`crates/ruff_python_formatter/resources/test/fixtures/ruff/expression/named_expr.py`
one-by-one and adjusted the condition as needed.
## Summary
Right now, if we have two fixes that have an overlapping edit, but not
an _identical_ set of edits, they'll conflict, causing us to do another
linter traversal. Here, I've enabled the fixer to support partially
overlapping edits, which (as an example) let's us greatly reduce the
number of iterations required in the test suite.
The most common case here is that in which a bunch of edits need to
import some symbol, and then use that symbol, but in different ways. In
that case, all edits will have a common fix (to import the symbol), but
deviate in some way. With this change, we can do all of those edits in
one pass.
Note that the simplest way to enable this was to store sorted edits on
`Fix`. We don't allow modifying the edits on `Fix` once it's
constructed, so this is an easy change, and allows us to avoid a bunch
of clones and traversals later on.
Closes#5800.
## Summary
If a file has a BOM, the import sorter _always_ reports the imports as
unsorted. The acute issue is that we detect that the line has leading
content (before the imports), which we always consider a violation.
Rather than fixing that one site, this PR instead makes `.line_start`
BOM-aware.
Fixes https://github.com/astral-sh/ruff/issues/6155.
## Summary
This PR adds a new config option for `pep8-naming` plugin called
`extend-ignore-names` which is used to extend the default values in
`ignore-names` option.
resolves: #6050
## Summary
This PR protects against code like:
```python
from typing import Optional
import bar # ruff: noqa
import baz
class Foo:
x: Optional[str] = None
```
In which the user wrote `# ruff: noqa` to ignore a specific error, not
realizing that it was a file-level exemption that thus turned off all
lint rules.
Specifically, if a `# ruff: noqa` directive is not at the start of a
line, we now ignore it and warn, since this is almost certainly a
mistake.
Requires https://github.com/astral-sh/RustPython-Parser/pull/42
Related https://github.com/PyCQA/pyflakes/pull/778
[PEP-695](https://peps.python.org/pep-0695)
Part of #5062
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Adds a scope for type parameters, a type parameter binding kind, and
checker visitation of type parameters in type alias statements, function
definitions, and class definitions.
A few changes were necessary to ensure correctness following the
insertion of a new scope between function and class scopes and their
parent.
## Test Plan
<!-- How was it tested? -->
Undefined name snapshots.
Unused type parameter rule will be added as follow-up.
## Summary
Right now, `RUF015` will try to rewrite `x[:1]` as `[next(x)]`. This
isn't equivalent if `x`, for example, is empty, where slicing like
`x[:1]` is forgiving, but `next` raises `StopIteration`. For me this is
a little too much of a deviation to be comfortable with, and most of the
value in this rule is the `x[0]` to `next(x)` conversion anyway.
Closes https://github.com/astral-sh/ruff/issues/6148.
## Summary
In #6134 and #6136, we see some false positives for "shadowed" class
definitions. For example, here, the first definition is flagged as
unused, since from the perspective of the semantic model (which doesn't
understand branching), it appears to be immediately shadowed in the
`else`, and thus never used:
```python
if sys.version_info >= (3, 11):
class _RootLoggerConfiguration(TypedDict, total=False):
level: _Level
filters: Sequence[str | _FilterType]
handlers: Sequence[str]
else:
class _RootLoggerConfiguration(TypedDict, total=False):
level: _Level
filters: Sequence[str]
handlers: Sequence[str]
```
Instead of looking at _all_ bindings, we should instead look at the
"live" bindings, which is similar to how other rules (like unused
variables detection) is structured. We thus move the rule from
`bindings.rs` (which iterates over _all_ bindings, regardless of whether
they're shadowed) to `deferred_scopes.rs`, which iterates over all
"live" bindings once a scope has been fully analyzed.
## Test Plan
`cargo test`
## Summary
This PR implements pycodestyle's E241 (tab after comma) and E242
(multiple whitespace after comma) lints.
These are marked as nursery rules like many other pycodestyle rules.
Refs #2402
## Test Plan
E24.py copied from pycodestyle.
## Summary
Completes the documentation for the ruleset, apart from four rules which
have contradictions, so need to be thought about more regarding how to
document that. Related to #2646.
## Test Plan
`python scripts/test_docs_formatted.py`
**Summary**
Updated doc comments for `missing_whitespace_around_operator.rs`. Online
docs also benefit from this update.
**Test Plan**
Checked docs via
[mkdocs](389fe13c93/CONTRIBUTING.md?plain=1#L267-L296)
## Summary
Completes the documentation for the one and only (current) rule in the
`flynt` ruleset. Related to #2646.
## Test Plan
`python scripts/test_docs_formatted.py`
## Summary
This PR stores the mapping from `ExprName` node to resolved `BindingId`,
which lets us skip scope lookups in `resolve_call_path`. It's enabled by
#6045, since that PR ensures that when we analyze a node (and thus call
`resolve_call_path`), we'll have already visited its `ExprName`
elements.
In more detail: imagine that we're traversing over `foo.bar()`. When we
read `foo`, it will be an `ExprName`, which we'll then resolve to a
binding via `handle_node_load`. With this change, we then store that
binding in a map. Later, if we call `collect_call_path` on `foo.bar`,
we'll identify `foo` (the "head" of the attribute) and grab the resolved
binding in that map. _Almost_ all names are now resolved in advance,
though it's not a strict requirement, and some rules break that pattern
(e.g., if we're analyzing arguments, and they need to inspect their
annotations, which are visited in a deferred manner).
This improves performance by 4-6% on the all-rules benchmark. It looks
like it hurts performance (1-2% drop) in the default-rules benchmark,
presumedly because those rules don't call `resolve_call_path` nearly as
much, and so we're paying for these extra writes.
Here's the benchmark data:
```
linter/default-rules/numpy/globals.py
time: [67.270 µs 67.380 µs 67.489 µs]
thrpt: [43.720 MiB/s 43.792 MiB/s 43.863 MiB/s]
change:
time: [+0.4747% +0.7752% +1.0626%] (p = 0.00 < 0.05)
thrpt: [-1.0514% -0.7693% -0.4724%]
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
linter/default-rules/pydantic/types.py
time: [1.4067 ms 1.4105 ms 1.4146 ms]
thrpt: [18.028 MiB/s 18.081 MiB/s 18.129 MiB/s]
change:
time: [+1.3152% +1.6953% +2.0414%] (p = 0.00 < 0.05)
thrpt: [-2.0006% -1.6671% -1.2981%]
Performance has regressed.
linter/default-rules/numpy/ctypeslib.py
time: [637.67 µs 638.96 µs 640.28 µs]
thrpt: [26.006 MiB/s 26.060 MiB/s 26.113 MiB/s]
change:
time: [+1.5859% +1.8109% +2.0353%] (p = 0.00 < 0.05)
thrpt: [-1.9947% -1.7787% -1.5611%]
Performance has regressed.
linter/default-rules/large/dataset.py
time: [3.2289 ms 3.2336 ms 3.2383 ms]
thrpt: [12.563 MiB/s 12.581 MiB/s 12.599 MiB/s]
change:
time: [+0.8029% +0.9898% +1.1740%] (p = 0.00 < 0.05)
thrpt: [-1.1604% -0.9801% -0.7965%]
Change within noise threshold.
linter/all-rules/numpy/globals.py
time: [134.05 µs 134.15 µs 134.26 µs]
thrpt: [21.977 MiB/s 21.995 MiB/s 22.012 MiB/s]
change:
time: [-4.4571% -4.1175% -3.8268%] (p = 0.00 < 0.05)
thrpt: [+3.9791% +4.2943% +4.6651%]
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
2 (2.00%) low mild
3 (3.00%) high mild
3 (3.00%) high severe
linter/all-rules/pydantic/types.py
time: [2.5627 ms 2.5669 ms 2.5720 ms]
thrpt: [9.9158 MiB/s 9.9354 MiB/s 9.9516 MiB/s]
change:
time: [-5.8304% -5.6374% -5.4452%] (p = 0.00 < 0.05)
thrpt: [+5.7587% +5.9742% +6.1914%]
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
6 (6.00%) high mild
1 (1.00%) high severe
linter/all-rules/numpy/ctypeslib.py
time: [1.3949 ms 1.3956 ms 1.3964 ms]
thrpt: [11.925 MiB/s 11.931 MiB/s 11.937 MiB/s]
change:
time: [-6.2496% -6.0856% -5.9293%] (p = 0.00 < 0.05)
thrpt: [+6.3030% +6.4799% +6.6662%]
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
3 (3.00%) high mild
4 (4.00%) high severe
linter/all-rules/large/dataset.py
time: [5.5951 ms 5.6019 ms 5.6093 ms]
thrpt: [7.2527 MiB/s 7.2623 MiB/s 7.2711 MiB/s]
change:
time: [-5.1781% -4.9783% -4.8070%] (p = 0.00 < 0.05)
thrpt: [+5.0497% +5.2391% +5.4608%]
Performance has improved.
```
Still playing with this (the concepts need better names, documentation,
etc.), but opening up for feedback.
## Summary
Noticed in #5954: we walk _past_ the root rather than stopping _at_ the
root when attempting to traverse along the parent path. It's effectively
an off-by-one bug.
## Summary
Updated doc comment for `call_datetime_without_tzinfo.rs`. Online docs
also benefit from this update.
## Test Plan
Checked docs via
[mkdocs](389fe13c93/CONTRIBUTING.md?plain=1#L267-L296)
## Summary
Check for unused private `TypeVar`. See [original
implementation](2a86db8271/pyi.py (L1958)).
```
$ flake8 --select Y018 crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi
crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:4:1: Y018 TypeVar "_T" is not used
crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:5:1: Y018 TypeVar "_P" is not used
```
```
$ ./target/debug/ruff --select PYI018 crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi --no-cache
crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:4:1: PYI018 TypeVar `_T` is never used
crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:5:1: PYI018 TypeVar `_P` is never used
Found 2 errors.
```
In the file `unused_private_type_declaration.rs`, I'm planning to add
other rules that are similar to `PYI018` like the `PYI046`, `PYI047` and
`PYI049`.
ref #848
## Test Plan
Snapshots and manual runs of flake8.
## Summary
This is a rewrite of the main comment placement logic. `place_comment`
now has three parts:
- place own line comments
- between branches
- after a branch
- place end-of-line comments
- after colon
- after a branch
- place comments for specific nodes (that include module level comments)
The rewrite fixed three bugs: `class A: # trailing comment` comments now
stay end-of-line, `try: # comment` remains end-of-line and deeply
indented try-else-finally comments remain with the right nested
statement.
It will be much easier to give more alternative branches nodes since
this is abstracted away by `is_node_with_body` and the first/last child
helpers. Adding new node types can now be done by adding an entry to the
`place_comment` match. The code went from 1526 lines before #6033 to
1213 lines now.
It thinks it easier to just read the new `placement.rs` rather than
reviewing the diff.
## Test Plan
The existing fixtures staying the same or improving plus new ones for
the bug fixes.
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
The `SIM102` auto-fix fails if `elif` is indented like this:
## Example
```python
def f():
# SIM102
if a:
pass
elif b:
if c:
d
```
```
> cargo run -p ruff_cli -- check --select SIM102 --fix a.py
...
error: Failed to fix nested if: Failed to extract statement from source
a.py:5:5: SIM102 Use a single `if` statement instead of nested `if` statements
Found 1 error.
```
## Test Plan
<!-- How was it tested? -->
New test
## Summary
This PR adds the implementation for the new Jupyter AST nodes i.e.,
`ExprLineMagic` and `StmtLineMagic`.
## Test Plan
Add test cases for `unparse` containing magic commands
resolves: #6087
## Summary
Updated doc comment for `tab_indentation.rs`. Online docs also benefit
from this update.
## Test Plan
Checked docs via
[mkdocs](389fe13c93/CONTRIBUTING.md?plain=1#L267-L296)
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Part of #5062
Requires https://github.com/astral-sh/RustPython-Parser/pull/32
Adds visitation of type alias statements and type parameters in class
and function definitions.
Duplicates tests for `PreorderVisitor` into `Visitor` with new
snapshots. Testing required node implementations for the `TypeParam`
enum, which is a chunk of the diff and the reason we need `Ranged`
implementations in
https://github.com/astral-sh/RustPython-Parser/pull/32.
## Test Plan
<!-- How was it tested? -->
Adds unit tests with snapshots.
Reimplements https://github.com/astral-sh/ruff/pull/3104
Closes https://github.com/astral-sh/ruff/issues/5726
Note that we will generate the hash for a cache key twice in normal
operation. Once to check for the cached item and again to update the
cache. We could optimize this by generating the hash once in
`diagnostics::lint_file` and passing the `u64` into `get` and `update`.
We'd probably want to wrap it in a `CacheKeyHash` enum for type safety.
## Test plan
Unit tests for Windows and Unix.
Manual test with case from issue
```
❯ touch fake.py
❯ chmod +x fake.py
❯ ./target/debug/ruff --select EXE fake.py
fake.py:1:1: EXE002 The file is executable but no shebang is present
Found 1 error.
❯ chmod -x fake.py
❯ ./target/debug/ruff --select EXE fake.py
```
## Summary
If a method is annotated with `@typing_extensions.override`, we should
avoid flagging A003 on it. This isn't part of the standard library yet,
but it's used to explicitly mark methods as overrides.
## Summary
If a user subclasses `threading.Event`, e.g. with:
```python
from threading import Event
class CustomEvent(Event):
def set(self) -> None:
...
```
They no control over the method name (`set`). This PR allows
`threading.Event#set` and `logging.Filter#filter` overrides, and avoids
flagging A003 in such cases. Ideally, we'd avoid flagging all overridden
methods, but... that's a lot more difficult, and this is at least
_better_ than what we do now.
Closes https://github.com/astral-sh/ruff/issues/6057.
Closes https://github.com/astral-sh/ruff/issues/5956.
## Summary
This PR modifies the order of operations in our AST checker. Previously,
we ran our analysis rules first, then bound names and traversed over the
subtrees. Now, after a series of refactors, we can invert the order: do
the subtree traversal and model-building _first_, then run rules.
The nice thing about this change is that when we go to analyze, e.g., a
function call node, we'll already have traversed any of the constituent
`Expr::Name` nodes... So if we store the resolution of all names when do
the traversal, we can avoid having to do any expensive work in
`resolve_call_path`.
## Test Plan
Clean run of the snapshot tests, and hopefully the ecosystem checks too!
## Summary
This PR is a refactoring of placement.rs. The code got more consistent,
some comments were updated and some dead code was removed or replaced
with debug assertions. It also contains a bugfix for the placement of
end-of-branch comments with nested bodies inside try statements that
occurred when refactoring the nested body loop.
## Test Plan
The existing test cases don't change. I added a couple of cases that i
think should be tested but weren't, and a regression test for the bugfix
## Summary
This PR ensures that we can retain the current behavior even after we
reorder the visitor a bit, by looking for annotated lambdas rather than
"is the name bound to anything?", since if we visit the name before we
run this rule, it'll _always_ be bound. (This check is already a bit
flawed -- in truth, we should probably run this rule deferred so that we
can reliably detect shadowing.)
## Summary
This PR attempts to draw some basic separation between the `Checker`'s
traversal responsibilities (traversing the AST, building the semantic
model) and its calling-out-to-lint-rule responsibilities. It doesn't try
to introduce any sophisticated API. Instead, it just moves all of the
lint rule calls out of `checkers/ast/mod.rs` and into methods in a new
`analyze` module. (There are four remaining lint rules in `Checker`, but
I'll remove those in future PRs.)
I'm not trying to "solve" our lint rule API here. Instead, I'm trying to
make two improvements:
1. `checkers/ast/mod.rs` has just gotten way too large, and people work
in it all the time. Prior to this PR, it was 5.5k lines, which led to
significant lags in my editor and made it really hard to reason about
the parts that are _actually_ important. (I like big files, but this one
crossed the line for me.) Now, it's < 2,000 lines, and the code is much
more focused.
2. I want to avoid accidentally adding lint rules in the "wrong" parts
of the traversal. By confining lint rule invocations to these "analyze"
calls, we'll avoid (e.g.) putting them in the binding phase.
## Summary
These only need the token stream, and we always prefer token-based to
physical line-based rules.
There are a few other changes snuck in here:
- Renaming the rule files to match the diagnostic names (likely an
error).
- The "leading whitespace before shebang" rule now works regardless of
where the comment occurs (i.e., if the shebang is on the second line,
and the first line is blank, we flag and remove that leading
whitespace).
**Summary** Fix an instability in with statement formatter when there is
an own line comment as the `as`
```python
with (
a as
# bad comment
b):
```
**Test Plan** Added the comment to the test cases.
## Summary
Closes https://github.com/astral-sh/ruff/issues/6025 (which contains a
more thorough description of the issue). Previously, the `# noqa` here
was being marked as unused, but removing it raised `SIM114`:
```python
def foo():
a = True
b = False
if a > b: # noqa: SIM114
return 3
elif a == b:
return 3
```
**Summary** Add a formatter progress testing script to CI. This script
will 1) print the black compability on each run 2) catch regressions wrt
to formatter stability, emitting invalid syntax and other kinds of bugs
(e.g. #5917) before they land on main 3) have an additional layer of
real world tests when implementing new nodes or other new formatter
code.
This is currently a bash script, i'm not sure if we want to keep it that
way, or switch to e.g. the regular ecosystem scripts. The output
separation of `format_dev` could also use some polishing. We should also
consider pinning commits so we don't get spurious regression when they
change their code.
**Test Plan** The script extends CI.
## Summary
My intuition is that it's faster to do these checks as-needed rather
than allocation new hash maps and vectors for the arguments. (We
typically only query once anyway.)
## Summary
This PR adds a `logger-objects` setting that allows users to mark
specific symbols a `logging.Logger` objects. Currently, if a `logger` is
imported, we only flagged it as a `logging.Logger` if it comes exactly
from the `logging` module or is `flask.current_app.logger`.
This PR allows users to mark specific loggers, like
`logging_setup.logger`, to ensure that they're covered by the
`flake8-logging-format` rules and others.
For example, if you have a module `logging_setup.py` with the following
contents:
```python
import logging
logger = logging.getLogger(__name__)
```
Adding `"logging_setup.logger"` to `logger-objects` will ensure that
`logging_setup.logger` is treated as a `logging.Logger` object when
imported from other modules (e.g., `from logging_setup import logger`).
Closes https://github.com/astral-sh/ruff/issues/5694.
I don't know whether we want to make this change but here's some data...
Binary size:
- `main`: 30,384
- `charlie/match-phf`: 30,416
llvm-lines:
- `main`: 1,784,148
- `charlie/match-phf`: 1,789,877
llvm-lines and binary size are both unchanged (or, by < 5) when moving
from `u8` to `u32` return types, and even when moving to `char` keys and
values. I didn't expect this, but I'm not very knowledgable on this
topic.
Performance:
```
Confusables/match/src time: [4.9102 µs 4.9352 µs 4.9777 µs]
change: [+1.7469% +2.2421% +2.8710%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
2 (2.00%) low mild
4 (4.00%) high mild
6 (6.00%) high severe
Confusables/match-with-skip/src
time: [2.0676 µs 2.0945 µs 2.1317 µs]
change: [+0.9384% +1.6000% +2.3920%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
3 (3.00%) high mild
5 (5.00%) high severe
Confusables/phf/src time: [31.087 µs 31.188 µs 31.305 µs]
change: [+1.9262% +2.2188% +2.5496%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
3 (3.00%) low mild
6 (6.00%) high mild
6 (6.00%) high severe
Confusables/phf-with-skip/src
time: [2.0470 µs 2.0486 µs 2.0502 µs]
change: [-0.3093% -0.1446% +0.0106%] (p = 0.08 > 0.05)
No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) high mild
2 (2.00%) high severe
```
The `-with-skip` variants add our optimization which first checks
whether the character is ASCII. So `match` is way, way faster than PHF,
but it tends not to matter since almost all source code is ASCII anyway.
## Summary
This pull request add supports for `int`, `float` and `bool` types in
`UP018`
rule to convert empty call to the default value of the type or remove
the call
if a value of the same type is provided as an argument.
## Test Plan
Added tests for `int`, `float` and `bool` types.
Partially resolves#5988
**Summary** Add a `EmptyWithDanglingComments` format helper that formats
comments inside empty parentheses, brackets or curly braces. Previously,
this was implemented separately, and partially incorrectly, for each use
case.
Empty `()`, `[]` and `{}` are special because there can be dangling
comments, and they can be in
two positions:
```python
x = [ # end-of-line
# own line
]
```
These comments are dangling because they can't be assigned to any
element inside as they would
in all other cases.
**Test Plan** Added a regression test.
145 (from previously 149) instances of unstable formatting remaining.
```
$ cargo run --bin ruff_dev --release -- format-dev --stability-check --error-file formatter-ecosystem-errors.txt --multi-project target/checkouts > formatter-ecosystem-progress.txt
$ rg "Unstable formatting" target/formatter-ecosystem-errors.txt | wc -l
145
```
## Summary
This is equivalent for a single flag, but I think it's more likely to be
correct when the bitflags are modified -- the primary reason being that
we sometimes define flags as the union of other flags, e.g.:
```rust
const ANNOTATION = Self::TYPING_ONLY_ANNOTATION.bits() | Self::RUNTIME_ANNOTATION.bits();
```
In this case, `flags.contains(Flag::ANNOTATION)` requires that _both_
flags in the union are set, whereas `flags.intersects(Flag::ANNOTATION)`
requires that _at least one_ flag is set.
## Summary
Use the `find_keyword` helper function instead of reimplementing it.
Follows on from #5983 by doing a different search.
## Test Plan
`cargo test`
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
F507 should not be raised when the right-hand side value is a non-tuple
object.
```python
'%s' % (1, 2, 3) # throws
'%s' % [1, 2, 3] # doesn't throw
'%s' % {1, 2, 3} # doesn't throw
```
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Fix a regression introduced by
https://github.com/astral-sh/ruff/pull/5638. A multiline expression
can't be safely inserted into a format field.
### Example
```
> cat a.py
"{}".format(
[
1,
2,
3,
]
)
> cargo run -p ruff_cli -- check a.py --no-cache --select UP032 --fix
Finished dev [unoptimized + debuginfo] target(s) in 0.07s
Running `target/debug/ruff check a.py --no-cache --select UP032 --fix`
error: Autofix introduced a syntax error in `a.py` with rule codes UP032: EOL while scanning string literal at byte offset 5
---
f"{[
1,
2,
3,
]}"
---
a.py:1:1: UP032 Use f-string instead of `format` call
Found 1 error.
```
## Test Plan
New test cases
## Summary
Checks that `append`, `extend` and `remove` methods are not called on
`__all__`. See [original
implementation](2a86db8271/pyi.py (L1133-L1138)).
```
$ flake8 --select Y026 crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi
crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi:3:1: Y056 Calling ".append()" on "__all__" may not be supported by all type checkers (use += instead)
crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi:4:1: Y056 Calling ".extend()" on "__all__" may not be supported by all type checkers (use += instead)
crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi:5:1: Y056 Calling ".remove()" on "__all__" may not be supported by all type checkers (use += instead)
```
```
$ ./target/debug/ruff --select PYI026 crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi --no-cache
crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi:3:1: PYI056 Calling ".append()" on "__all__" may not be supported by all type checkers (use += instead)
crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi:4:1: PYI056 Calling ".extend()" on "__all__" may not be supported by all type checkers (use += instead)
crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi:5:1: PYI056 Calling ".remove()" on "__all__" may not be supported by all type checkers (use += instead)
Found 3 errors.
```
ref #848
## Test Plan
Snapshots and manual runs of flake8.
## Summary
These are skipped as an optimization, but it feels kind of unnecessary
and makes the code a bit more confusing than is worthwhile.
(non-`strict` is also by far the more popular setting, and the default.)
## Summary
I ran into this in the wild. It looks like Ruff will collapse the `else`
and `elif` branches here (i.e., it doesn't recognize that they're too
independent import blocks):
```python
if "sdist" in cmds:
_sdist = cmds["sdist"]
elif "setuptools" in sys.modules:
from setuptools.command.sdist import sdist as _sdist
else:
from setuptools.command.sdist import sdist as _sdist
from distutils.command.sdist import sdist as _sdist
```
Likely fallout from the `elif_else_branches` refactor.
**Summary** Fix implemented in
https://github.com/astral-sh/RustPython-Parser/pull/35: Previously,
empty lambda arguments (e.g. `lambda: 1`) would get the range of the
entire expression, which leads to incorrect comment placement. Now empty
lambda arguments get an empty range between the `lambda` and the `:`
tokens.
**Test Plan** Added a regression test.
149 instances of unstable formatting remaining.
```
$ cargo run --bin ruff_dev --release -- format-dev --stability-check --error-file formatter-ecosystem-errors.txt --multi-project target/checkouts > formatter-ecosystem-progress.txt
$ rg "Unstable formatting" target/formatter-ecosystem-errors.txt | wc -l
149
```
**Summary** Add script to shrink all formatter errors: This started as a
fun idea and turned out really useful: This script gives us a single
Python file with all formatter stability errors. I want to keep it
around to occasionally update #5828 so I added it to the git.
**Test Plan** None, this is a helper script
## Summary
**Don't minimize files that don't match in the first place** This adds a
sanity check to the minimizer script that the
input matches the condition (e.g. unstable formatting). Otherwise we run
through all checks with the whole file, which is extremely slow. It's
more reasonable for downstream usage to write an empty string to the
output file instead.
## Summary
Allow `respect_gitignore` even when not in a git repo
## Test Plan
Within the Ruff repository:
1. Renamed `.git` to `.hello-world`
2. Added `test.py` in root folder
3. Added `test.py` to `.gitignore`
4. Ran `cargo run --bin ruff -- check --no-cache --isolated --show-files
.` with
and without `--respect-gitignore` flag
fixes: #5930
## Summary
We now allow RUF015 to fix cases like:
```python
list(range(10))[0]
list(x.y)[0]
list(x["y"])[0]
```
Further, we fix generators like:
```python
[i + 1 for i in x][0]
```
By rewriting to `next(iter(i + 1 for i in x))`.
I've retained the special-case that rewrites `[i for i in x][0]` to
`next(iter(x))`.
Closes https://github.com/astral-sh/ruff/issues/5764.
## Summary
Similar to #5852 and a bunch of related PRs -- trying to move rules that
rely on point-in-time semantic analysis to _after_ the semantic model
building.
## Summary
Implements `PYI017` or `Y017` from `flake8-pyi` plug-in. Mirrors
[upstream
implementation](ceab86d16b/pyi.py (L1039-L1048)).
It checks for any assignment with more than 1 target or an assignment to
anything other than a name, and raises a violation for these in stub
files.
Couldn't find a clear and concise explanation for why this is to be
avoided and what is preferred for attribute cases like:
```python
a.b = int
```
So welcome some input there, to learn and to finish up the docs.
## Test Plan
Added test cases from upstream plug-in in a fixture (both `.py` and
`.pyi`). Added a few more.
## Issue link
Refers: https://github.com/astral-sh/ruff/issues/848
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
- Remove space when start of slice is empty
- Treat unary op except `not` as simple expression
## Test Plan
Add some simple tests for unary op expressions in slice
Closes#5673
This shrinks a good bit more than previously, which was helpful for all
the formatter bugs. fwiw i treat this as a very ad-hoc script since it's
mainly my ecosystem bug processing companion.
## Summary
It can happen that we can't read a file (a python file, a jupyter
notebook or pyproject.toml), which needs to be handled and handled
consistently for all file types. Instead of using `Err` or `error!`, we
emit E602 with the io error as message and continue. This PR makes sure
we handle all three cases consistently, emit E602.
I'm not convinced that it should be possible to disable io errors, but
we now handle the regular case consistently and at least print warning
consistently.
I went with `warn!` but i can change them all to `error!`, too.
It also checks the error case when a pyproject.toml is not readable. The
error message is not very helpful, but it's now a bit clearer that
actually ruff itself failed instead vs this being a diagnostic.
## Examples
This is how an Err of `run` looks now:

With an unreadable file and `IOError` disabled:

(we lint zero files but count files before linting not during so we exit
0)
I'm not sure if it should (or if we should take a different path with
manual ExitStatus), but this currently also triggers when `files` is
empty:

## Test Plan
Unix only: Create a temporary directory with files with permissions
`000` (not readable by the owner) and run on that directory. Since this
breaks the assumptions of most of the test code (single file, `ruff`
instead of `ruff_cli`), the test code is rather cumbersome and looks a
bit misplaced; i'm happy about suggestions to fit it in closer with the
other tests or streamline it in other ways. I added another test for
when the entire directory is not readable.
## Summary
Completes documentation for the `flake8-fixme` (`FIX`) ruleset. Related
to #2646.
Tweaks the violation message. For example,
```
FIX001 Line contains FIXME
```
becomes
```
FIX001 Line contains FIXME, consider resolving the issue
```
This is because the previous message was unclear if it was warning
against the use of FIXME tags per se, or the code the FIXME tag was
annotating.
## Test Plan
`cargo test && python scripts/check_docs_formatted.py`
## Summary
Checks for `typehint.TypeAlias` annotation in type aliases. See
[original
source](https://github.com/PyCQA/flake8-pyi/blob/main/pyi.py#L1085).
```
$ flake8 --select Y026 crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:4:1: Y026 Use typing_extensions.TypeAlias for type aliases, e.g. "NewAny: TypeAlias = Any"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:5:1: Y026 Use typing_extensions.TypeAlias for type aliases, e.g. "OptinalStr: TypeAlias = typing.Optional[str]"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:6:1: Y026 Use typing_extensions.TypeAlias for type aliases, e.g. "Foo: TypeAlias = Literal['foo']"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:7:1: Y026 Use typing_extensions.TypeAlias for type aliases, e.g. "IntOrStr: TypeAlias = int | str"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:8:1: Y026 Use typing_extensions.TypeAlias for type aliases, e.g. "AliasNone: TypeAlias = None"
```
```
$ ./target/debug/ruff --select PYI026 crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi --no-cache
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:4:1: PYI026 Use `typing.TypeAlias` for type aliases in `NewAny`, e.g. "NewAny: typing.TypeAlias = Any"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:5:1: PYI026 Use `typing.TypeAlias` for type aliases in `OptinalStr`, e.g. "OptinalStr: typing.TypeAlias = typing.Optional[str]"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:6:1: PYI026 Use `typing.TypeAlias` for type aliases in `Foo`, e.g. "Foo: typing.TypeAlias = Literal["foo"]"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:7:1: PYI026 Use `typing.TypeAlias` for type aliases in `IntOrStr`, e.g. "IntOrStr: typing.TypeAlias = int | str"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:8:1: PYI026 Use `typing.TypeAlias` for type aliases in `AliasNone`, e.g. "AliasNone: typing.TypeAlias = None"
Found 5 errors.
```
ref: #848
## Test Plan
Snapshots, manual runs of flake8.
## Summary
As part of my continued quest to separate semantic model-building from
diagnostic emission, this PR moves our unresolved-reference rules to a
deferred pass. So, rather than emitting diagnostics as we encounter
unresolved references, we now track those unresolved references on the
semantic model (just like resolved references), and after traversal,
emit the relevant rules for any unresolved references.
## Summary
Add known problems to `compare-to-empty-string` documentation. Related
to #5873.
Tweaked the example in the documentation to be a tad more concise and
correct (that the rule is most applicable when comparing to a `str`
variable).
## Test Plan
`python scripts/check_docs_formatted.py`
## Summary
This PR moves two rules (`invalid-all-format` and `invalid-all-object`)
out of the name-binding phase, and into the dedicated pass over all
bindings that occurs at the end of the `Checker`. This is part of my
continued quest to separate the semantic model-building logic from the
actual rule enforcement.
**Summary** Previously, `RUF014` would be part of ruff.schema.json
depending on whether or not the `unreachable-code` feature was active.
This caused problems for contributors who got unrelated RUF014 changes
when updating the schema without the feature active.
An alternative would be to always add `RUF014`.
**Test plan** `cargo dev generate-all` and `cargo run --bin ruff_dev
--features unreachable-code -- generate-all` now have the same effect.
## Summary
This crate now contains utilities for dealing with trivia more broadly:
whitespace, newlines, "simple" trivia lexing, etc. So renaming it to
reflect its increased responsibilities.
To avoid conflicts, I've also renamed `Token` and `TokenKind` to
`SimpleToken` and `SimpleTokenKind`.
## Summary
The vector of names here is immutable -- we never push to it after
initialization. Boxing reduces the size of the variant from 32 bytes to
24 bytes. (See:
https://nnethercote.github.io/perf-book/type-sizes.html#boxed-slices.)
It doesn't make a difference here, since it's not the largest variant,
but it still seems like a prudent change (and I was considering adding
another field to this variant, though I may no longer do so).
**Summary** This replaces the `todo!()` with a type alias stub in the
formatter. I added the tests from
704eb40108/parser/src/parser.rs (L901-L936)
as ruff python formatter tests.
**Test Plan** None, testing is part of the actual implementation
**Summary** Fix the formatter crash with `x[(1) :: ]` and related code.
**Problem** For assigning comments in slices in subscripts, we need to
find the positions of the colons to assign comments before and after the
colon to the respective lower/upper/step node (or dangling in that
section). Formatting `x[(1) :: ]` was broken because we were looking for
a `:` after the `1` but didn't consider that there could be a `)`
outside the range of the lower node, which contains just the `1` and no
optional parentheses.
**Solution** Use the simple tokenizer directly and skip all closing
parentheses.
**Test Plan** I added regression tests.
Closes#5733
**Summary** Add a static string error message to the formatter syntax
error so we can disambiguate where the syntax error came from
**Test Plan** No fixed tests, we don't expect this to occur, but it
helped with transformers syntax error debugging:
```
Error: Failed to format node
Caused by:
syntax error: slice first colon token was not a colon
```
## Summary
No behavior change, but this is in theory more efficient, since we can
just iterate over the flat `Binding` vector rather than having to
iterate over binding chains via the `Scope`.
## Summary
This PR moves the "unused exception" rule out of the visitor and into a
deferred check. When we can base rules solely on the semantic model, we
probably should, as it greatly simplifies the `Checker` itself.
## Summary
The `SemanticModel` currently stores the "body" of a given `Suite`,
along with the current statement index. This is used to support "next
sibling" queries, but we only use this in exactly one place -- the rule
that simplifies constructs like this to `any` or `all`:
```python
for x in y:
if x == 0:
return True
return False
```
Instead of tracking the state, we can just do a (slightly more
expensive) traversal, by finding the node within its parent and
returning the next node in the body.
Note that we'll only have to do this extremely rarely -- namely, for
functions that contain something like:
```python
for x in y:
if x == 0:
return True
```
## Summary
The motivation here is that it will make this rule easier to rewrite as
a deferred check. Right now, we can't run this rule in the deferred
phase, because it depends on the `except_handler` to power its autofix.
Instead of lexing the `except_handler`, we can use the `SimpleTokenizer`
from the formatter, and just lex forwards and backwards.
For context, this rule detects the unused `e` in:
```python
try:
pass
except ValueError as e:
pass
```
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Resolve#5854
## Test Plan
<!-- How was it tested? -->
New test cases
---------
Co-authored-by: konsti <konstin@mailbox.org>
## Summary
This PR just naively unrolls `collect_call_path` to handle attribute
resolutions of up to eight segments. In profiling via Instruments, it
seems to be about 4x faster for a very hot code path (4% of total
execution time on `main`, 1% here).
Profiling by running `RAYON_NUM_THREADS=1 cargo instruments -t time
--profile release-debug --time-limit 10000 -p ruff_cli -o
FromSlice.trace -- check crates/ruff/resources/test/cpython --silent -e
--no-cache --select ALL`, and modifying the linter to loop infinitely up
to the specified time (10 seconds) to increase sample size.
Before:
<img width="1792" alt="Screen Shot 2023-07-15 at 5 13 34 PM"
src="https://github.com/astral-sh/ruff/assets/1309177/4a8b0b45-8b67-43e9-af5e-65b326928a8e">
After:
<img width="1792" alt="Screen Shot 2023-07-15 at 8 38 51 PM"
src="https://github.com/astral-sh/ruff/assets/1309177/d8829159-2c79-4a49-ab3c-9e4e86f5b2b1">
## Summary
Before:
```
» ruff litestar tests --fix
warning: Invalid `# noqa` directive on line 19: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on line 65: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on line 74: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on line 22: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on line 66: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on line 75: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
```
After:
```
» cargo run --bin ruff ../litestar/litestar ../litestar/tests
Finished dev [unoptimized + debuginfo] target(s) in 0.15s
Running `target/debug/ruff ../litestar/litestar ../litestar/tests`
warning: Detected debug build without --no-cache.
warning: Invalid `# noqa` directive on /Users/sobolev/Desktop/litestar/tests/unit/test_contrib/test_sqlalchemy/models_bigint.py:19: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on /Users/sobolev/Desktop/litestar/tests/unit/test_contrib/test_sqlalchemy/models_bigint.py:65: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on /Users/sobolev/Desktop/litestar/tests/unit/test_contrib/test_sqlalchemy/models_bigint.py:74: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on /Users/sobolev/Desktop/litestar/tests/unit/test_contrib/test_sqlalchemy/models_uuid.py:22: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on /Users/sobolev/Desktop/litestar/tests/unit/test_contrib/test_sqlalchemy/models_uuid.py:66: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on /Users/sobolev/Desktop/litestar/tests/unit/test_contrib/test_sqlalchemy/models_uuid.py:75: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
```
## Test Plan
I didn't find any existing tests with this warning.
Closes https://github.com/astral-sh/ruff/issues/5855
## Summary
Previously, `StmtIf` was defined recursively as
```rust
pub struct StmtIf {
pub range: TextRange,
pub test: Box<Expr>,
pub body: Vec<Stmt>,
pub orelse: Vec<Stmt>,
}
```
Every `elif` was represented as an `orelse` with a single `StmtIf`. This
means that this representation couldn't differentiate between
```python
if cond1:
x = 1
else:
if cond2:
x = 2
```
and
```python
if cond1:
x = 1
elif cond2:
x = 2
```
It also makes many checks harder than they need to be because we have to
recurse just to iterate over an entire if-elif-else and because we're
lacking nodes and ranges on the `elif` and `else` branches.
We change the representation to a flat
```rust
pub struct StmtIf {
pub range: TextRange,
pub test: Box<Expr>,
pub body: Vec<Stmt>,
pub elif_else_clauses: Vec<ElifElseClause>,
}
pub struct ElifElseClause {
pub range: TextRange,
pub test: Option<Expr>,
pub body: Vec<Stmt>,
}
```
where `test: Some(_)` represents an `elif` and `test: None` an else.
This representation is different tradeoff, e.g. we need to allocate the
`Vec<ElifElseClause>`, the `elif`s are now different than the `if`s
(which matters in rules where want to check both `if`s and `elif`s) and
the type system doesn't guarantee that the `test: None` else is actually
last. We're also now a bit more inconsistent since all other `else`,
those from `for`, `while` and `try`, still don't have nodes. With the
new representation some things became easier, e.g. finding the `elif`
token (we can use the start of the `ElifElseClause`) and formatting
comments for if-elif-else (no more dangling comments splitting, we only
have to insert the dangling comment after the colon manually and set
`leading_alternate_branch_comments`, everything else is taken of by
having nodes for each branch and the usual placement.rs fixups).
## Merge Plan
This PR requires coordination between the parser repo and the main ruff
repo. I've split the ruff part, into two stacked PRs which have to be
merged together (only the second one fixes all tests), the first for the
formatter to be reviewed by @michareiser and the second for the linter
to be reviewed by @charliermarsh.
* MH: Review and merge
https://github.com/astral-sh/RustPython-Parser/pull/20
* MH: Review and merge or move later in stack
https://github.com/astral-sh/RustPython-Parser/pull/21
* MH: Review and approve
https://github.com/astral-sh/RustPython-Parser/pull/22
* MH: Review and approve formatter PR
https://github.com/astral-sh/ruff/pull/5459
* CM: Review and approve linter PR
https://github.com/astral-sh/ruff/pull/5460
* Merge linter PR in formatter PR, fix ecosystem checks (ecosystem
checks can't run on the formatter PR and won't run on the linter PR, so
we need to merge them first)
* Merge https://github.com/astral-sh/RustPython-Parser/pull/22
* Create tag in the parser, update linter+formatter PR
* Merge linter+formatter PR https://github.com/astral-sh/ruff/pull/5459
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
Originally `join_with` was used in the formatters README.md. Now it uses
```rs
f.join_comma_separated(item.end())
.nodes(elts.iter())
.finish()
```
## Test Plan
None
## Summary
For formatter instabilities, the message we get look something like
this:
```text
Unstable formatting /home/konsti/ruff/target/checkouts/deepmodeling:dpdispatcher/dpdispatcher/slurm.py
@@ -47,9 +47,9 @@
- script_header_dict["slurm_partition_line"] = (
- NOT_YET_IMPLEMENTED_ExprJoinedStr
- )
+ script_header_dict[
+ "slurm_partition_line"
+ ] = NOT_YET_IMPLEMENTED_ExprJoinedStr
Unstable formatting /home/konsti/ruff/target/checkouts/deepmodeling:dpdispatcher/dpdispatcher/pbs.py
@@ -26,9 +26,9 @@
- pbs_script_header_dict["select_node_line"] += (
- NOT_YET_IMPLEMENTED_ExprJoinedStr
- )
+ pbs_script_header_dict[
+ "select_node_line"
+ ] += NOT_YET_IMPLEMENTED_ExprJoinedStr
```
For ruff crashes. you don't even get that but just the file that crashed
it. To extract the actual bug, you'd need to manually remove parts of
the file, rerun to see if the bug still occurs (and revert if it
doesn't) until you have a minimal example.
With this script, you run
```shell
cargo run --bin ruff_shrinking -- target/checkouts/deepmodeling:dpdispatcher/dpdispatcher/slurm.py target/minirepo/code.py "Unstable formatting" "target/debug/ruff_dev format-dev --stability-check target/minirepo"
```
and get
```python
class Slurm():
def gen_script_header(self, job):
if resources.queue_name != "":
script_header_dict["slurm_partition_line"] = f"#SBATCH --partition {resources.queue_name}"
```
which is an nice minimal example.
I've been using this script and it would be easier for me if this were
part of main. The main disadvantage to merging is that it adds
additional dependencies.
## Test Plan
I've been using this for a number of minimization. This is an internal
helper script you only run manually. I could add a test that minimizes a
rule violation if required.
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Fixes#5739
## Test Plan
<!-- How was it tested? -->
Manually tested:
```sh
$ tree dir
dir
├── dir.py
│ └── file.py
└── file.py
1 directory, 2 files
$ cargo run -p ruff_cli -- check dir --no-cache
Finished dev [unoptimized + debuginfo] target(s) in 0.08s
Running `target/debug/ruff check dir --no-cache`
dir/dir.py/file.py:1:7: F821 Undefined name `a`
dir/file.py:1:7: F821 Undefined name `a`
Found 2 errors.
```
Is a unit test needed?
## Summary
This PR does some non-behavior-changing refactoring of the AST checker.
Specifically, it breaks the `Stmt`, `Expr`, and `ExceptHandler` visitors
into four distinct, consistent phases:
1. **Phase 1: Analysis**: Run any lint rules on the node.
2. **Phase 2: Binding**: Bind any symbols declared by the node.
3. **Phase 3: Recursion**: Visit all child nodes.
4. **Phase 4: Clean-up**: Pop scopes, etc.
There are some fuzzy boundaries in the last three phases, but the most
important divide is between the Phase 1 and all the others -- the goal
here is (as much as possible) to disentangle all of the vanilla
lint-rule calls from any other semantic analysis or model building.
Part of the motivation here is that I'm considering re-ordering some of
these phases, and it was just impossible to reason about that change as
long as we had miscellaneous binding-creation and scope-modification
code intermingled with lint rules. However, this could also enable us to
(e.g.) move the entire analysis phase elsewhere, and even with a more
limited API that has read-only access to `Checker` (but can push to a
diagnostics vector).
## Summary
Comparing repos with black requires that we use the settings as black,
notably line length and magic trailing comma behaviour. Excludes and
preserving quotes (vs. a preference for either quote style) is not yet
implemented because they weren't needed for the test projects.
In the other two commits i fixed the output when the progress bar is
hidden (this way is recommonded in the indicatif docs), added a
`scratch.pyi` file to gitignore because black formats stub files
differently and also updated the ecosystem readme with the projects json
without forks.
## Test Plan
I added a `line-length` vs `line_length` test. Otherwise only my
personal usage atm, a PR to integrate the script into the CI to check
some projects will follow.
## Summary
Closes#5628 by only emitting if `sep=","`. Includes documentation
(completes the `pandas-vet` ruleset).
Related to #2646.
## Test Plan
`cargo test`
Moves the computation of the `start_offset` for overlong lines to just
before the result is returned. There is a slight overhead for overlong
lines (double the work for the first `limit` characters).
In practice this results in a speedup on the CPython codebase. Most
lines are not overlong, or are not enforced because the line ends with a
URL, or does not contain whitespace. Nonetheless, the 0.3% of overlong
lines are a lot compared to other violations.
### Before

_Selected W505 and E501_

_All rules_
### After

_Selected W505 and E501_

_All rules_
CPython line statistics:
- Number of Python lines: 867.696
- Number of overlong lines: 2.963 (0.3%)
<details>
Benchmark selected:
```shell
cargo build --release && hyperfine --warmup 10 --min-runs 50 \
"./target/release/ruff ./crates/ruff/resources/test/cpython/ --no-cache -e --select W505,E501"
```
Benchmark all:
```shell
cargo build --release && hyperfine --warmup 10 --min-runs 50 \
"./target/release/ruff ./crates/ruff/resources/test/cpython/ --no-cache -e --select ALL"
```
Overlong lines in CPython
```shell
cargo run -p ruff_cli -- check crates/ruff/resources/test/cpython/Lib --no-cache --select=E501,W505 --statistics
```
Total Python lines:
```shell
find crates/ruff/resources/test/cpython/ -name '*.py' | xargs wc -l
```
</details>
(Performance tested on Mac M1)
## Summary
The motivating change here is to remove `let range =
except_handler.try_identifier().unwrap();` and instead just do
`name.range()`, since exception names now have ranges attached to them
by the parse. This also required some refactors (which are improvements)
to the built-in attribute shadowing rules, since at least one invocation
relied on passing in the exception handler and calling
`.try_identifier()`. Now that we have easy access to identifiers, we can
remove the whole `AnyShadowing` abstraction.
## Summary
This is more similar to how these flags work in other contexts (e.g.,
`visit_annotation`), and also ensures that we unset it prior to visit
the `orelse` and `finalbody` (a subtle bug).
## Summary
The intent of this rule is to always flag the `global` declaration, not
the usage. The current implementation does the wrong thing if a global
is assigned multiple times. Using `semantic().global()` is also more
efficient.
## Summary
Adds autofix for `hasattr` case of B004. I don't think it's safe (or
simple) to implement it for the `getattr` case because, inter alia,
calling `getattr` may have side effects.
Fixes#3545
## Test Plan
Existing tests were sufficient. Updated snapshots
## Summary
I'm doing some unrelated profiling, and I noticed that this method is
actually measurable on the CPython benchmark -- it's > 1% of execution
time. We don't need to lex here, we already know the ranges of all
comments, so we can just do a simple binary search for overlap, which
brings the method down to 0%.
## Test Plan
`cargo test`
## Summary
When required-imports is set with the syntax from ... import ... as ...,
autofix I002 is failing
## Test Plan
Reuse the same python files as
`crates/ruff/src/rules/isort/mod.rs:required_import` test.
## Summary
Previously, the `quoted-annotation` rule only removed quotes when `from
__future__ import annotations` was present. However, there are some
other cases in which this is also safe -- for example:
```python
def foo():
x: "MyClass"
```
We already model these in the semantic model, so this PR just expands
the scope of the rule to handle those.
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR uses the `join_comma_separated` builder for formatting set
expressions
to ensure the formatting preserves magic commas, if the setting is
enabled.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
See the fixed black tests
<!-- How was it tested? -->
## Summary
Format `DictComp` like `ListComp` from #5600. It's not 100%, but I
figured maybe it's worth starting to explore.
## Test Plan
Added ruff fixture based on `ListComp`'s.
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR improves the parentheses handling for with items to get closer
to black's formatting.
### Case 1:
```python
# Black / Input
with (
[
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
"bbbbbbbbbb",
"cccccccccccccccccccccccccccccccccccccccccc",
dddddddddddddddddddddddddddddddd,
] as example1,
aaaaaaaaaaaaaaaaaaaaaaaaaa
+ bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+ cccccccccccccccccccccccccccc
+ ddddddddddddddddd as example2,
CtxManager2() as example2,
CtxManager2() as example2,
CtxManager2() as example2,
):
...
# Before
with (
[
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
"bbbbbbbbbb",
"cccccccccccccccccccccccccccccccccccccccccc",
dddddddddddddddddddddddddddddddd,
] as example1,
(
aaaaaaaaaaaaaaaaaaaaaaaaaa
+ bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+ cccccccccccccccccccccccccccc
+ ddddddddddddddddd
) as example2,
CtxManager2() as example2,
CtxManager2() as example2,
CtxManager2() as example2,
):
...
```
Notice how Ruff wraps the binary expression in an extra set of
parentheses
### Case 2:
Black does not expand the with-items if the with has no parentheses:
```python
# Black / Input
with aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb as c:
...
# Before
with (
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb as c
):
...
```
Or
```python
# Black / Input
with [
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
"bbbbbbbbbb",
"cccccccccccccccccccccccccccccccccccccccccc",
dddddddddddddddddddddddddddddddd,
] as example1, aaaaaaaaaaaaaaaaaaaaaaaaaa * bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb * cccccccccccccccccccccccccccc + ddddddddddddddddd as example2, CtxManager222222222222222() as example2:
...
# Before (Same as Case 1)
with (
[
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
"bbbbbbbbbb",
"cccccccccccccccccccccccccccccccccccccccccc",
dddddddddddddddddddddddddddddddd,
] as example1,
(
aaaaaaaaaaaaaaaaaaaaaaaaaa
* bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
* cccccccccccccccccccccccccccc
+ ddddddddddddddddd
) as example2,
CtxManager222222222222222() as example2,
):
...
```
## Test Plan
I added new snapshot tests
Improves the django similarity index from 0.973 to 0.977
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Format `SetComp` like `ListComp`.
## Test Plan
Derived from `ListComp`'s fixture.
## Summary
`B006` should allow using `bytes(...)` as an argument defaule value.
## Test Plan
A new test case
---------
Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
## Summary
Non-behavioral change, but this is the same in each branch. Visiting the
`func` first also means we've visited the `func` by the time we try to
resolve it (via `resolve_call_path`), which should be helpful in a
future refactor.
## Summary
The AST pass is broken up into three phases: pre-visit (which includes
analysis), recurse (visit all members), and post-visit (clean-up). We're
not supposed to edit semantic model flags in the pre-visit phase, but it
looks like we were for literal detection. This didn't matter in
practice, but I'm looking into some AST refactors for which this _does_
cause issues.
No behavior changes expected.
## Test Plan
Good test coverage on these.
## Summary
`PERF102` looks for unused keys or values in `dict.items()` calls, and
suggests instead using `dict.keys()` or `dict.values()`. Previously,
this check determined usage by looking for underscore-prefixed
variables. However, we can use the semantic model to actually detect
whether a variable is used. This has two nice effects:
1. We avoid odd false-positives whereby underscore-prefixed variables
are actually used.
2. We can catch more cases (fewer false-negatives) by detecting unused
loop variables that _aren't_ underscore-prefixed.
Closes#5692.
## Summary
Nested calls to `sorted` can only be collapsed if the calls are
identical (i.e., they have the exact same keyword arguments).
Update C414 to only flag such cases.
Fixes#5712
## Test Plan
Updated snapshots.
Tested against flake8-comprehensions. It incorrectly flags these cases.
## Summary
This is really bad PR hygiene, but a mix of: using `Locator`-based fixes
in a few places (in lieu of `Generator`-based fixes), using match syntax
to avoid `.len() == 1` checks, using common helpers in more places, etc.
## Test Plan
`cargo test`
## Summary
Consider single element subscript expr for implicit optional.
On `main`, the cases where there is only a single element in the
subscript
list was giving false positives such as for the following:
```python
typing.Union[None]
typing.Literal[None]
```
## Test Plan
`cargo test`
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
Check for `Any` in other types for `ANN401`. This reuses the logic from
`implicit-optional` rule to resolve the type to `Any`.
Following types are supported:
* `Union[Any, ...]`
* `Any | ...`
* `Optional[Any]`
* `Annotated[<any of the above variant>, ...]`
* Forward references i.e., `"Any | ..."`
## Test Plan
Added test cases for various combinations.
fixes: #5458
## Summary
Do not raise `EXE001` and `EXE002` if WSL is detected. Uses the
[`wsl`](https://crates.io/crates/wsl) crate.
Closes#5445.
## Test Plan
`cargo test`
I don't use Windows, so was unable to test on a WSL environment. It
would be good if someone who runs Windows could check the functionality.
## Summary
Python doesn't allow `"Foo" | None` if the annotation will be evaluated
at runtime (see the comments in the PR, or the semantic model
documentation for more on what this means and when it is true), but it
_does_ allow it if the annotation is typing-only.
This, for example, is invalid, as Python will evaluate `"Foo" | None` at
runtime in order to
populate the function's `__annotations__`:
```python
def f(x: "Foo" | None): ...
```
This, however, is valid:
```python
def f():
x: "Foo" | None
```
As is this:
```python
from __future__ import annotations
def f(x: "Foo" | None): ...
```
Closes#5706.
## Summary
`StmtAnnAssign` would not insert parentheses when breaking the same way
`StmtAssign` does, causing unstable formatting and likely some syntax
errors.
## Test Plan
I added a regression test.
## Summary
The previous dummy was causing instabilities since it turned a string
into a variable.
E.g.
```python
script_header_dict[
"slurm_partition_line"
] = f"#SBATCH --partition {resources.queue_name}"
```
has an instability as
```python
- script_header_dict["slurm_partition_line"] = (
- NOT_YET_IMPLEMENTED_ExprJoinedStr
- )
+ script_header_dict[
+ "slurm_partition_line"
+ ] = NOT_YET_IMPLEMENTED_ExprJoinedStr
```
## Test Plan
The instability is gone, otherwise it's still a dummy
## Summary
Implement Pylint rule [`consider-using-in`
(`R1714`)](https://pylint.pycqa.org/en/latest/user_guide/messages/refactor/consider-using-in.html)
as `repeated-equality-comparison-target` (`PLR1714`). This rule checks
for expressions that can be re-written as a membership test for better
readability and performance.
For example,
```python
foo == "bar" or foo == "baz" or foo == "qux"
```
should be rewritten as
```python
foo in {"bar", "baz", "qux"}
```
Related to #970. Includes documentation.
### Implementation quirks
The implementation does not work with Yoda conditions (e.g., `"a" ==
foo` instead of `foo == "a"`). The Pylint version does. I couldn't find
a way of supporting Yoda-style conditions without it being inefficient,
so didn't (I don't think people write Yoda conditions any way).
## Test Plan
Added fixture.
`cargo test`
## Summary
We have two `Cursor` implementations. This PR moves the implementation
from the formatter into `ruff_python_whitespace` (kind of a poorly-named
crate now) and uses it for both use-cases.
## Summary
Document all `ruff_dev` subcommands and document the `format_dev` flags
in the formatter readme.
CC @zanieb please flag everything that isn't clear or missing
## Test Plan
n/a
Detects invalid types for tuple, list, bytes, string indices.
For example, the following will raise a `TypeError` at runtime and when
imported Python will display a `SyntaxWarning`
```python
var = [1, 2, 3]["x"]
```
```
example.py:1: SyntaxWarning: list indices must be integers or slices, not str; perhaps you missed a comma?
var = [1, 2, 3]["x"]
Traceback (most recent call last):
File "example.py", line 1, in <module>
var = [1, 2, 3]["x"]
~~~~~~~~~^^^^^
TypeError: list indices must be integers or slices, not str
```
Previously, Ruff would not report the invalid syntax but now a violation
will be reported. This does not apply to cases where a variable, call,
or complex expression is used in the index — detection is roughly
limited to static definitions, which matches Python's warnings.
```
❯ ./target/debug/ruff example.py --select RUF015 --show-source --no-cache
example.py:1:17: RUF015 Indexed access to type `list` uses type `str` instead of an integer or slice.
|
1 | var = [1, 2, 3]["x"]
| ^^^ RUF015
|
```
Closes https://github.com/astral-sh/ruff/issues/5082
xref
ffff1440d1
## Summary
Replaces `DictionaryKey` enum with the more general `ComparableExpr`
when checking for duplicate keys
## Test Plan
Added test fixture from issue. Can potentially be expanded further
depending on what exactly we want to flag (e.g. do we also want to check
for unhashable types?) and which `ComparableExpr::XYZ` types we consider
literals.
## Issue link
Closes: https://github.com/astral-sh/ruff/issues/5691
## Summary
Similar to #5567, we can remove the use of regex, plus simplify the
representation (use `Option`), add snapshot tests, etc.
This is about 100x faster than using a regex for cases that match (2.5ns
vs. 250ns). It's obviously not a hot path, but I prefer the consistency
with other similar comment-parsing. I may DRY these up into some common
functionality later on.
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR matches Black' behavior where it only omits the optional parentheses if the expression starts or ends with a parenthesized expression:
```python
a + [aaa, bbb, cccc] * c # Don't omit
[aaa, bbb, cccc] + a * c # Split
a + c * [aaa, bbb, ccc] # Split
```
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
This improves the Jaccard index from 0.945 to 0.946
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR improves the Black compatibility when it comes to breaking comprehensions.
We want to avoid line breaks before the target and `in` whenever possible. Furthermore, `if X is not None` should be grouped together, similar to other binary like expressions
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
`cargo test`
<!-- How was it tested? -->
## Summary
We don't use `ModExpression` anywhere but it's part of the AST, removes
one `not_implemented_yet` and is a trivial 2-liner, so i implemented
formatting for `ModExpression`.
## Test Plan
None, this kind of node does not occur in file input. Otherwise all the
tests for expressions
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
I started working on this because I assumed that I would need access to options inside of `NeedsParantheses` but it then turned out that I won't.
Anyway, it kind of felt nice to pass fewer arguments. So I'm gonna put this out here to get your feedback if you prefer this over passing individual fiels.
Oh, I sneeked in another change. I renamed `context.contents` to `source`. `contents` is too generic and doesn't tell you anything.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
It compiles
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR removes the `mode` field from `BestFitting` because it is no longer used (we now use `conditional_group` and `fits_expanded).
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
`cargo test`
<!-- How was it tested? -->
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR implements Black's behavior where it first splits off parenthesized expressions before splitting before operands to avoid unnecessary parentheses:
```python
# We want
if a + [
b,
c
]:
pass
# Rather than
if (
a
+ [b, c]
):
pass
```
This is implemented by using the new IR elements introduced in #5596.
* We give the group wrapping the optional parentheses an ID (`parentheses_id`)
* We use `conditional_group` for the lower priority groups (all non-parenthesized expressions) with the condition that the `parentheses_id` group breaks (we want to split before operands only if the parentheses are necessary)
* We use `fits_expanded` to wrap all other parenthesized expressions (lists, dicts, sets), to prevent that expanding e.g. a list expands the `parentheses_id` group. We gate the `fits_expand` to only apply if the `parentheses_id` group fits (because we prefer `a\n+[b, c]` over expanding `[b, c]` if the whole expression gets parenthesized).
We limit using `fits_expanded` and `conditional_group` only to expressions that themselves are not in parentheses (checking the conditions isn't free)
## Test Plan
It increases the Jaccard index for Django from 0.915 to 0.917
## Incompatibilites
There are two incompatibilities left that I'm aware of (there may be more, I didn't go through all snapshot differences).
### Long string literals
I commented on the regression. The issue is that a very long string (or any content without a split point) may not fit when only breaking the right side. The formatter than inserts the optional parentheses. But this is kind of useless because the overlong string will still not fit, because there are no new split points.
I think we should ignore this incompatibility for now
### Expressions on statement level
I don't fully understand the logic behind this yet, but black doesn't break before the operators for the following example even though the expression exceeds the configured line width
```python
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa < bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb > ccccccccccccccccccccccccccccc == ddddddddddddddddddddd
```
But it would if the expression is used inside of a condition.
What I understand so far is that Black doesn't insert optional parentheses on the expression statement level (and a few other places) and, therefore, only breaks after opening parentheses. I propose to keep this deviation for now to avoid overlong-lines and use the compatibility report to make a decision if we should implement the same behavior.
## Summary
The similarity index, the fraction of unchanged lines, is easier to
understand than the jaccard index, the fraction between intersection and
union.
## Test Plan
I ran this on django and git a 0.945 index, meaning 5.5% of lines are
currently reformatted when compared to black
## Summary
Format statements such as `tree_depth += 1`. This is a statement that
does not allow any line breaks, the only thing to be mindful of is to
parenthesize the assigned expression
Jaccard index on django: 0.915 -> 0.918
## Test Plan
black tests, and two new tests, a basic one and one that ensures that
the child gets parentheses. I ran the django stability check.
## Summary
This is the result of running `cargo +nightly clippy --workspace
--all-targets --all-features -- -D warnings` and fixing all violations.
Just wanted to see if there were any interesting new checks on nightly
👀
## Summary
This PR implements the formatting of `raise` statements. I haven't
looked at the black implementation, this is inspired from from the
`return` statements formatting.
## Test Plan
The black differences with insta.
I also compared manually some edge cases with very long string and call
chaining and it seems to do the same formatting as black.
There is one issue:
```python
# input
raise OsError(
"aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa"
) from a.aaaaa(aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa).a(aaaa)
# black
raise OsError(
"aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa"
) from a.aaaaa(
aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa
).a(
aaaa
)
# ruff
raise OsError(
"aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa"
) from a.aaaaa(
aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa
).a(aaaa)
```
But I'm not sure this diff is the raise formatting implementation.
---------
Co-authored-by: Louis Dispa <ldispa@deezer.com>
## Summary
#5658 didn't actually ignore bivariate types in some all cases (sorry
about that). This PR fixes that and adds bivariate types to the test
fixture.
## Test Plan
`cargo test`
## Summary
Change the `type-name-incorrect-variance` diagnostic message to include
the detected variance and a name change recommendation. For example,
```
`TypeVar` name "T_co" does not reflect its contravariance; consider renaming it to "T_contra"
```
Related to #5651.
## Test Plan
`cargo test`
## Summary
Non-behavior-changing refactors to delay some `.is_builtin` calls in a
few older rules. Cheaper pre-conditions should always be checked first.
## Summary
Fixes#5503. Ready for final review as the `mkdocs` issue involving SSH
keys is fixed.
Note that this will only throw on a `Name` - it will be refactorable
once we have a type-checker. This means that this is the only sort of
input that will throw.
```python
x = range(10)
list(x)[0]
```
I thought it'd be confusing if we supported direct function results.
Consider this example, assuming we support direct results:
```python
# throws
list(range(10))[0]
def createRange(bound):
return range(bound)
# "why doesn't this throw, but a direct `range(10)` call does?"
list(createRange(10))[0]
```
If it's necessary, I can go through the list of built-ins and find those
which produce iterables, then add them to the throwing list.
## Test Plan
Added a new fixture, then ran `cargo t`
## Summary
Implement Pylint `typevar-name-incorrect-variance` (`C0105`) as
`type-name-incorrect-variance` (`PLC0105`). Includes documentation.
Related to #970.
The Pylint implementation checks only `TypeVar`, but this PR checks
`ParamSpec` as well.
## Test Plan
Added test fixture.
`cargo test`
## Summary
Completes all the documentation for the `pandas-vet` rules, except for
`pandas-use-of-dot-read-table` as I am unclear of the rule's motivation
(see #5628).
Related to #2646.
## Test Plan
`python scripts/check_docs_formatted.py && mkdocs serve`
## Summary
Fixes#5246. We generate a hash set of all exception IDs caught by the
`try` statement, then check that the inner `raise` actually raises a
caught exception.
## Test Plan
Added a new test, `cargo t`.
## Summary
Fix an oversight in `find_only_token_in_range` where the following code
would panic due do the closing and opening parentheses being in the
range we scan:
```python
d1 = [
("a") if # 1
("b") else # 2
("c")
]
```
Closing and opening parentheses respectively are now correctly skipped.
## Test Plan
I added a regression test
## Summary
This PR reworks the `upstream_categories` mechanism that is only used
for documentation purposes to make it easier to generate docs using
`all_rules()`. The new implementation also relies on "tribal knowledge"
about rule codes, so it's not the best implementation, but gets us
forward.
Another option would be to change the rule-defining proc macros to allow
configuring an optional `RuleCategory`, but that seems more heavy-handed
and possibly unnecessary in the long run...
Draft since this builds on #5439.
cc @charliermarsh :)
## Summary
Format named expressions (walrus operator) such a `value := f()`.
Unlike tuples, named expression parentheses are not part of the range
even when mandatory, so mapping optional parentheses to always gives us
decent formatting without implementing all [PEP
572](https://peps.python.org/pep-0572/) rules on when we need
parentheses where other expressions wouldn't. We might want to revisit
this decision later and implement special cases, but for now this gives
us what we need.
## Test Plan
black fixtures, i added some fixtures and checked django and cpython for
stability.
Closes#5613
## Summary
This changes the docs to show a nursery icon (🌅) for rules in the
nursery.
It currently doesn't do that for the rules that are in sub-categories
(Pylint, Pycodestyle) because there is no `all_rules()` for the
`RuleCodePrefix` that's returned by `UpstreamCategory` iteration (and as
mentioned on Discord, I think `UpstreamCategory` maybe shouldn't be a
thing). (That would be enabled by #5591.)
## Test Plan
Generated docs to see new icons (with the caveat above).
## Summary
It turns out that just doing this match directly without `AhoCorasick`
is much faster, like 2x (and removes one dependency, though we likely
already rely on this transitively).
## Summary
Adds `import tkinter as tk` to the list of default import conventions.
Closes#5620.
## Test Plan
Added `tkinter` to test fixture.
`cargo test`
## Summary
We're doing some unsafe accesses to advance these iterators. It's easier
to model these as actual iterators to ensure safety everywhere. Also
added some additional test cases.
Closes#5621.
## Summary
We now treat `# flake8: noqa: F401` as turning off F401 for the entire
file. (Flake8 treats this as turning off _all rules_ for the entire
file).
This deviates from Flake8, but I think it's a much more user-friendly
deviation than what I introduced in #5571. See
https://github.com/astral-sh/ruff/issues/5617 for an explanation.
Closes https://github.com/astral-sh/ruff/issues/5617.
## Summary
This PR adds a `ParseError` type to the `noqa` parsing system to enable
us to render useful warnings instead of silently failing when parsing
`noqa` codes.
For example, given `foo.py`:
```python
# ruff: noqa: x
# ruff: noqa foo
# flake8: noqa: F401
import os # noqa: foo-bar
```
We would now output:
```console
warning: Invalid `# noqa` directive on line 2: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on line 4: expected `:` followed by a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on line 6: Flake8's blanket exemption does not support exempting specific codes. To exempt specific codes, use, e.g., `# ruff: noqa: F401, F841` instead.
warning: Invalid `# noqa` directive on line 7: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
```
There's one important behavior change here too. Right now, with Flake8,
if you do `# flake8: noqa: F401`, Flake8 treats that as equivalent to `#
flake8: noqa` -- it turns off _all_ diagnostics in the file, not just
`F401`. Historically, we respected this... but, I think it's confusing.
So we now raise a warning, and don't respect it at all. This will lead
to errors in some projects, but I'd argue that right now, those
directives are almost certainly behaving in an unintended way for users
anyway.
Closes https://github.com/astral-sh/ruff/issues/3339.
## Summary
I was testing some changes on Airflow, and I realized that we _always_
run the `pyproject.toml` validation rules, even if they're not enabled.
This PR gates them behind the appropriate enablement flags.
## Test Plan
- Ran: `cargo run -p ruff_cli -- check ../airflow -n`. Verified that no
RUF200 violations were raised.
- Run: `cargo run -p ruff_cli -- check ../airflow -n --select RUF200`.
Verified that two RUF200 violations were raised.
Since the (implicit) update to cargo-insta 1.30, CI would pass even when
the tests failed. This downgrades to cargo insta 1.29.0 and CI fails
again when it should (which i can't show here, because CI needs to pass
to merge this PR). I've improved the unreferenced snapshot handling in
the process
See https://github.com/mitsuhiko/insta/issues/392
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Fix typos found by
[codespell](https://github.com/codespell-project/codespell).
I have left out `memoize` for now (see #5606).
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
CI tests.
<!-- How was it tested? -->
## Summary
Format `ExprIfExp`, also known as the ternary operator or inline `if`.
It can look like
```python
a1 = 1 if True else 2
```
but also
```python
b1 = (
# We return "a" ...
"a" # that's our True value
# ... if this condition matches ...
if True # that's our test
# ... otherwise we return "b§
else "b" # that's our False value
)
```
This also fixes a visitor order bug.
The jaccard index on django goes from 0.911 to 0.915.
## Test Plan
I added fixtures without and with comments in strange places.
Implements PYI030 as part of
https://github.com/astral-sh/ruff/issues/848
> Union expressions should never have more than one Literal member, as
Literal[1] | Literal[2] is semantically identical to Literal[1, 2].
Note we differ slightly from the flake8-pyi implementation:
- We detect cases where there are parentheses or nested unions
- We detect cases with mixed `Union` and `|` syntax
- We use the same error message for all violations; flake8-pyi has two
different messages
- We retain the user's quoting style when displaying string literals;
flake8-pyi uses single quotes
- We warn on duplicates of the same literal `Literal[1] | Literal[1]`
## Summary
In addition to `# noqa` codes, we also support file-level exemptions,
which look like:
- `# flake8: noqa` (ignore all rules in the file, for compatibility)
- `# ruff: noqa` (all rules in the file)
- `# ruff: noqa: F401` (ignore `F401` in the file, Flake8 doesn't
support this)
This PR moves that logic to something that looks a lot more like our `#
noqa` parser. Performance is actually quite a bit _worse_ than the
previous approach (lexing `# flake8: noqa` goes from 2ns to 11ns; lexing
`# ruff: noqa: F401, F841` is about the same`; lexing `# type: ignore #
noqa: E501` fgoes from 4ns to 6ns), but the numbers are very small so
it's... maybe worth it?
The primary benefit here is that we now properly support flexible
whitespace, like: `#flake8:noqa`. Previously, we required exact string
matching, and we also didn't support all case-insensitive variants of
`noqa`.
## Summary
There are two pypi links in the documentation that link to specific
version numbers of other packages. Removing these versioned links allows
users to immediately view the latest version of the package and
maintains consistency with the other links.
## Test Plan
N/A
## Summary
This extends the `ruff_dev` formatter script util. Instead of only doing
stability checks, you can now choose different compatible options on the
CLI and get statistics.
* It adds an option the formats all files that ruff would check to allow
looking at an entire black-formatted repository with `git diff`
* It computes the [Jaccard
index](https://en.wikipedia.org/wiki/Jaccard_index) as a measure of
deviation between input and output, which is useful as single number
metric for assessing our current deviations from black.
* It adds progress bars to both the single projects as well as the
multi-project mode.
* It adds an option to write the multi-project output to a file
Sample usage:
```
$ cargo run --bin ruff_dev -- format-dev --stability-check crates/ruff/resources/test/cpython
$ cargo run --bin ruff_dev -- format-dev --stability-check /home/konsti/projects/django
Syntax error in /home/konsti/projects/django/tests/test_runner_apps/tagged/tests_syntax_error.py: source contains syntax errors (parser error): BaseError { error: UnrecognizedToken(Name { name: "syntax_error" }, None), offset: 131, source_path: "<filename>" }
Found 0 stability errors in 2755 files (jaccard index 0.911) in 9.75s
$ cargo run --bin ruff_dev -- format-dev --write /home/konsti/projects/django
```
Options:
```
Several utils related to the formatter which can be run on one or more repositories. The selected set of files in a repository is the same as for `ruff check`.
* Check formatter stability: Format a repository twice and ensure that it looks that the first and second formatting look the same. * Format: Format the files in a repository to be able to check them with `git diff` * Statistics: The subcommand the Jaccard index between the (assumed to be black formatted) input and the ruff formatted output
Usage: ruff_dev format-dev [OPTIONS] [FILES]...
Arguments:
[FILES]...
Like `ruff check`'s files. See `--multi-project` if you want to format an ecosystem checkout
Options:
--stability-check
Check stability
We want to ensure that once formatted content stays the same when formatted again, which is known as formatter stability or formatter idempotency, and that the formatter prints syntactically valid code. As our test cases cover only a limited amount of code, this allows checking entire repositories.
--write
Format the files. Without this flag, the python files are not modified
--format <FORMAT>
Control the verbosity of the output
[default: default]
Possible values:
- minimal: Filenames only
- default: Filenames and reduced diff
- full: Full diff and invalid code
-x, --exit-first-error
Print only the first error and exit, `-x` is same as pytest
--multi-project
Checks each project inside a directory, useful e.g. if you want to check all of the ecosystem checkouts
--error-file <ERROR_FILE>
Write all errors to this file in addition to stdout. Only used in multi-project mode
```
## Test Plan
I ran this on django (2755 files, jaccard index 0.911) and discovered a
magic trailing comma problem and that we really needed to implement
import formatting. I ran the script on cpython to identify
https://github.com/astral-sh/ruff/pull/5558.
## Summary
In Python, the annotations on `x` and `y` here have very different
treatment:
```python
def foo(x: int):
y: int
```
The `int` in `x: int` is a runtime-required annotation, because `x` gets
added to the function's `__annotations__`. You'll notice, for example,
that this fails:
```python
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from foo import Bar
def f(x: Bar):
...
```
Because `Bar` is required to be available at runtime, not just at typing
time. Meanwhile, this succeeds:
```python
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from foo import Bar
def f():
x: Bar = 1
f()
```
(Both cases are fine if you use `from __future__ import annotations`.)
Historically, we've tracked those annotations that are _not_
runtime-required via the semantic model's `ANNOTATION` flag. But
annotations that _are_ runtime-required have been treated as "type
definitions" that aren't annotations.
This causes problems for the flake8-future-annotations rules, which try
to detect whether adding `from __future__ import annotations` would
_allow_ you to rewrite a type annotation. We need to know whether we're
in _any_ type annotation, runtime-required or not, since adding `from
__future__ import annotations` will convert any runtime-required
annotation to a typing-only annotation.
This PR adds separate state to track these runtime-required annotations.
The changes in the test fixtures are correct -- these were false
negatives before.
Closes https://github.com/astral-sh/ruff/issues/5574.
## Summary
I'll write up a more detailed description tomorrow, but in short, this
PR removes our regex-based implementation in favor of "manual" parsing.
I tried a couple different implementations. In the benchmarks below:
- `Directive/Regex` is our implementation on `main`.
- `Directive/Find` just uses `text.find("noqa")`, which is insufficient,
since it doesn't cover case-insensitive variants like `NOQA`, and
doesn't handle multiple `noqa` matches in a single like, like ` # Here's
a noqa comment # noqa: F401`. But it's kind of a baseline.
- `Directive/Memchr` uses three `memchr` iterative finders (one for
`noqa`, `NOQA`, and `NoQA`).
- `Directive/AhoCorasick` is roughly the variant checked-in here.
The raw results:
```
Directive/Regex/# noqa: F401
time: [273.69 ns 274.71 ns 276.03 ns]
change: [+1.4467% +1.8979% +2.4243%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
3 (3.00%) low mild
8 (8.00%) high mild
4 (4.00%) high severe
Directive/Find/# noqa: F401
time: [66.972 ns 67.048 ns 67.132 ns]
change: [+2.8292% +2.9377% +3.0540%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
1 (1.00%) low severe
3 (3.00%) low mild
8 (8.00%) high mild
3 (3.00%) high severe
Directive/AhoCorasick/# noqa: F401
time: [76.922 ns 77.189 ns 77.536 ns]
change: [+0.4265% +0.6862% +0.9871%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low mild
3 (3.00%) high mild
4 (4.00%) high severe
Directive/Memchr/# noqa: F401
time: [62.627 ns 62.654 ns 62.679 ns]
change: [-0.1780% -0.0887% -0.0120%] (p = 0.03 < 0.05)
Change within noise threshold.
Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) low severe
5 (5.00%) low mild
3 (3.00%) high mild
2 (2.00%) high severe
Directive/Regex/# noqa: F401, F841
time: [321.83 ns 322.39 ns 322.93 ns]
change: [+8602.4% +8623.5% +8644.5%] (p = 0.00 < 0.05)
Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
1 (1.00%) low severe
2 (2.00%) low mild
1 (1.00%) high mild
1 (1.00%) high severe
Directive/Find/# noqa: F401, F841
time: [78.618 ns 78.758 ns 78.896 ns]
change: [+1.6909% +1.8771% +2.0628%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild
Directive/AhoCorasick/# noqa: F401, F841
time: [87.739 ns 88.057 ns 88.468 ns]
change: [+0.1843% +0.4685% +0.7854%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 11 outliers among 100 measurements (11.00%)
5 (5.00%) low mild
3 (3.00%) high mild
3 (3.00%) high severe
Directive/Memchr/# noqa: F401, F841
time: [80.674 ns 80.774 ns 80.860 ns]
change: [-0.7343% -0.5633% -0.4031%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 14 outliers among 100 measurements (14.00%)
4 (4.00%) low severe
9 (9.00%) low mild
1 (1.00%) high mild
Directive/Regex/# noqa time: [194.86 ns 195.93 ns 196.97 ns]
change: [+11973% +12039% +12103%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
5 (5.00%) low mild
1 (1.00%) high mild
Directive/Find/# noqa time: [25.327 ns 25.354 ns 25.383 ns]
change: [+3.8524% +4.0267% +4.1845%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
6 (6.00%) high mild
3 (3.00%) high severe
Directive/AhoCorasick/# noqa
time: [34.267 ns 34.368 ns 34.481 ns]
change: [+0.5646% +0.8505% +1.1281%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
5 (5.00%) high mild
Directive/Memchr/# noqa time: [21.770 ns 21.818 ns 21.874 ns]
change: [-0.0990% +0.1464% +0.4046%] (p = 0.26 > 0.05)
No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
4 (4.00%) low mild
4 (4.00%) high mild
2 (2.00%) high severe
Directive/Regex/# type: ignore # noqa: E501
time: [278.76 ns 279.69 ns 280.72 ns]
change: [+7449.4% +7469.8% +7490.5%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) low mild
1 (1.00%) high mild
1 (1.00%) high severe
Directive/Find/# type: ignore # noqa: E501
time: [67.791 ns 67.976 ns 68.184 ns]
change: [+2.8321% +3.1735% +3.5418%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
5 (5.00%) high mild
1 (1.00%) high severe
Directive/AhoCorasick/# type: ignore # noqa: E501
time: [75.908 ns 76.055 ns 76.210 ns]
change: [+0.9269% +1.1427% +1.3955%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
Directive/Memchr/# type: ignore # noqa: E501
time: [72.549 ns 72.723 ns 72.957 ns]
change: [+1.5881% +1.9660% +2.3974%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
10 (10.00%) high mild
5 (5.00%) high severe
Directive/Regex/# type: ignore # nosec
time: [66.967 ns 67.075 ns 67.207 ns]
change: [+1713.0% +1715.8% +1718.9%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
1 (1.00%) low severe
3 (3.00%) low mild
2 (2.00%) high mild
4 (4.00%) high severe
Directive/Find/# type: ignore # nosec
time: [18.505 ns 18.548 ns 18.597 ns]
change: [+1.3520% +1.6976% +2.0333%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild
Directive/AhoCorasick/# type: ignore # nosec
time: [16.162 ns 16.206 ns 16.252 ns]
change: [+1.2919% +1.5587% +1.8430%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
Directive/Memchr/# type: ignore # nosec
time: [39.192 ns 39.233 ns 39.276 ns]
change: [+0.5164% +0.7456% +0.9790%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 13 outliers among 100 measurements (13.00%)
2 (2.00%) low severe
4 (4.00%) low mild
3 (3.00%) high mild
4 (4.00%) high severe
Directive/Regex/# some very long comment that # is interspersed with characters but # no directive
time: [81.460 ns 81.578 ns 81.703 ns]
change: [+2093.3% +2098.8% +2104.2%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) low mild
2 (2.00%) high mild
Directive/Find/# some very long comment that # is interspersed with characters but # no directive
time: [26.284 ns 26.331 ns 26.387 ns]
change: [+0.7554% +1.1027% +1.3832%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
5 (5.00%) high mild
1 (1.00%) high severe
Directive/AhoCorasick/# some very long comment that # is interspersed with characters but # no direc...
time: [28.643 ns 28.714 ns 28.787 ns]
change: [+1.3774% +1.6780% +2.0028%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
Directive/Memchr/# some very long comment that # is interspersed with characters but # no directive
time: [55.766 ns 55.831 ns 55.897 ns]
change: [+1.5802% +1.7476% +1.9021%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) low mild
```
While memchr is faster than aho-corasick in some of the common cases
(like `# noqa: F401`), the latter is way, way faster when there _isn't_
a match (like 2x faster -- see the last two cases). Since most comments
_aren't_ `noqa` comments, this felt like the right tradeoff. Note that
all implementations are significantly faster than the regex version.
(I know I originally reported a 10x speedup, but I ended up improving
the regex version a bit in some prior PRs, so it got unintentionally
faster via some refactors.)
There's also one behavior change in here, which is that we now allow
variable spaces, e.g., `#noqa` or `# noqa`. Previously, we required
exactly one space. This thus closes#5177.
## Summary
The following code was previously leading to unstable formatting:
```python
try:
try:
pass
finally:
print(1) # issue7208
except A:
pass
```
The comment would be formatted as a trailing comment of `try` which is
unstable as an end-of-line comment gets two extra whitespaces.
This was originally found in
99b00efd5e/Lib/getpass.py (L68-L91)
## Test Plan
I added a regression test
## Summary
It's a bit simpler to let the API just take the text itself, plus an
offset (to make the returned `TextRange` absolute, rather than
relative).
## Summary
Adds a `--case-sensitive` setting/flag to isort (default: `false`)
which, when set to `true` sorts imports case sensitively instead of case
insensitively.
Tests and Docs can be improved, can do that if the general idea of the
implementation is in order.
First `isort` edit so any and all feedback is welcomed even more than
usual.
## Test Plan
Added a fixture with an assortment of imports in various cases.
## Issue links
Closes: https://github.com/astral-sh/ruff/issues/5514
## Summary
This PR enables us to resolve attribute accesses within files, at least
for static and class methods. For example, we can now detect that this
is a function access (and avoid a false-positive):
```python
class Class:
@staticmethod
def error():
return ValueError("Something")
# OK
raise Class.error()
```
Closes#5487.
Closes#5416.
## Summary
Implement Pylint `typevar-double-variance` (`C0131`) as
`type-bivariance` (`PLC0131`). Includes documentation. Related to #970.
Renamed the rule to be more clear (it's not immediately obvious what
'double' means, IMO).
The Pylint implementation checks only `TypeVar`, but this PR checks
`ParamSpec` as well.
## Test Plan
Added tests.
`cargo test`
## Summary
This adds a `ruff rule --all` switch that prints out a human-readable
Markdown or a machine-readable JSON document of the lint rules known to
Ruff.
I needed a machine-readable document of the rules [for a
project](https://github.com/astral-sh/ruff/discussions/5078), and
figured it could be useful for other people – or tooling! – to be able
to interrogate Ruff about its arcane knowledge.
The JSON output is an array of the same objects printed by `ruff rule
--format=json`.
## Test Plan
I ran `ruff rule --all --format=json`. I think more might be needed, but
maybe a snapshot test is overkill?
## Summary
Implement Pylint `typevar-name-mismatch` (`C0132`) as
`type-param-name-mismatch` (`PLC0132`). Includes documentation. Related
to #970.
The Pylint implementation checks only `TypeVar`, but this PR checks
`TypeVarTuple`, `ParamSpec`, and `NewType` as well. This seems to better
represent the Pylint rule's [intended
behaviour](https://github.com/pylint-dev/pylint/issues/5224).
Full disclosure: I am not a fan of the translated name and think it
should probably be different.
## Test Plan
`cargo test`
## Summary
This makes the output of `check-formatter-stability` more concise by
removing extraneous newlines. It also adds a `--error-file` option to
that script that allows creating a file with just the errors (without
the status messages) to share with others.
## Test Plan
I ran it over CPython and looked at the output. I then added the
`--error-file` option and looked at the contents of the file
## Summary
Format import statements in all their variants. Specifically, this
implemented formatting `StmtImport`, `StmtImportFrom` and `Alias`.
## Test Plan
I added some custom snapshots, even though this has been covered well by
black's tests.
## Summary
As discussed on ~IRC~ Discord, this will make it easier for e.g. the
docs generation stuff to get all rules for a linter (using
`all_rules()`) instead of just non-nursery ones, and it also makes it
more Explicit Is Better Than Implicit to iterate over linter rules.
Grepping for `Item = Rule` reveals some remaining implicit
`IntoIterator`s that I didn't feel were necessarily in scope for this
(and honestly, iterating over a `RuleSet` makes sense).
## Summary
If a comma separated list has only one entry, black will respect the
magic trailing comma, but it will not add a new one.
The following code will remain as is:
```python
b1 = [
aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa
]
b2 = [
aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa,
]
b3 = [
aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa,
aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa
]
```
## Test Plan
This was first discovered in
7eeadc82c2/django/contrib/admin/checks.py (L674-L681),
which i've minimized into a call test.
I've added tests for the three cases (one entry + no comma, one entry +
comma, more than one entry) to the list tests.
The diffs from the black tests get smaller.
## Summary
Removing some false positives based on running over `zulip`.
`PERF401` now also detects cases like:
```py
original = list(range(10000))
filtered = []
for i in original:
filtered.append(i * i)
```
Previously, these were caught by the list-copy rule, but these too need
comprehensions.
## Summary
This PR applies the fix in #5478 to a variety of other call-sites, and
fixes some other range hygienic stuff in the rules that were modified.
## Summary
Change generator formatting dummy to include `NOT_YET_IMPLEMENTED`. This
makes it easier to correctly identify them as dummies
## Test Plan
This is a dummy change
## Summary
Adds `PERF401` and `PERF402` mirroring `W8401` and `W8402` from
https://github.com/tonybaloney/perflint
Implementation is not super smart but should be at parity with upstream
implementation judging by:
c07391c176/perflint/comprehension_checker.py (L42-L73)
It essentially checks:
- If the body of a for-loop is just one statement
- If that statement is an `if` and the if-statement contains a call to
`append()` we flag `PERF401` and suggest a list comprehension
- If that statement is a plain call to `append()` or `insert()` we flag
`PERF402` and suggest `list()` or `list.copy()`
I've set the violation to only flag the first append call in a long
`if-else` statement for `PERF401`. Happy to change this to some other
location or make it multiple violations if that makes more sense.
## Test Plan
Fixtures were added with the relevant scenarios for both rules
## Issue Links
Refers: https://github.com/astral-sh/ruff/issues/4789
## Summary
Implements flake8-pyi checks 002, 003, 004, 005. The logic is a bit
complex, as you can see in the [original
code](57921813c1/pyi.py (L1403C18-L1403C18)).
ref: #848
## Test Plan
Updated snapshot tests. Ran flake8 to double check lints, and ran ruff
with all PYI lints enabled to check for incorrect overlapping lint
errors.
Support for `let…else` formatting was just merged to nightly
(rust-lang/rust#113225). Rerun `cargo fmt` with Rust nightly 2023-07-02
to pick this up. Followup to #939.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
## Summary
This PR reduces the noise from `DJ012` by emitting a single violation
when you have multiple consecutive violations of the same "type".
For example, given:
```py
class MultipleConsecutiveFields(models.Model):
"""Model that contains multiple out-of-order field definitions in a row."""
class Meta:
verbose_name = "test"
first_name = models.CharField(max_length=32)
last_name = models.CharField(max_length=32)
```
It's convenient to only error on `first_name`, and not `last_name`,
since we're really flagging that the _section_ is out-of-order.
Closes#5465.
## Summary
Given a docstring like:
```py
def f(a: int, b: int) -> int:
"""Showcase function.
Parameters
----------
a : int
_description_
b : int
_description_
Returns
-------
int
_description
"""
```
We were failing to identify `Returns` as a section, because the previous
line was neither empty nor ended with punctuation. This was causing a
false negative, where by we weren't flagging a missing line before
`Returns`. So, the very reason for the rule (no blank line) was causing
us to fail to catch it.
Note that, we did have a test case for this, which was working properly:
```py
def f() -> int:
"""Showcase function.
Parameters
----------
Returns
-------
"""
```
...because the line before `Returns` "ends in a punctuation mark" (`-`).
Closes#5442.
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR normalizes line endings inside of strings to `\n` as required by the printer.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
I added a new test using `\r\n` and ran the ecosystem check. There are no remaining end of line panics.
https://gist.github.com/MichaReiser/8f36b1391ca7b48475b3a4f592d74ff4
<!-- How was it tested? -->
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR uses rayon to parallelize the stability check by scheduling each project as its own task.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
I ran the ecosystem check. It now makes use of all cores (except at the end, there are some large projects).
## Performance
The check now completes in minutes where it took about 30 minutes before.
<!-- How was it tested? -->
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR fixes an issue where the binary expression formatting removed parentheses around the left hand side of an expression.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
I added a new regression test and re-ran the ecosystem check. It brings down the `check-formatter-stability` output from a 3.4MB file down to 900KB.
<!-- How was it tested? -->
## Summary
This PR makes E731 a "manual" fix in one other context: when the lambda
is shadowing another variable in the scope. Function declarations (with
shadowing) cause issues for type checkers, and so rewriting an
annotation, e.g., in branches of an `if` statement can lead to failures.
Closes https://github.com/astral-sh/ruff/issues/5421.
## Summary
Completes the documentation for the `flake8-logging-format` rules.
Related to #2646.
I included both the `flake8-logging-format` recommendation to use the
`extra` keyword and the Pylint recommendation to pass format values as
parameters so that formatting is done lazily, as #970 suggests the
Pylint logging rules are covered by this ruleset. Using lazy formatting
via parameters is probably more common than avoiding formatting entirely
in favour of the `extra` argument, regardless.
## Test Plan
`python scripts/check_docs_formatted.py`
## Summary
This PR extracts a bunch of complex logic from `add_binding`, instead
running the the shadowing rules in the deferred handler, thereby
decoupling the binding phase (during which we build up the semantic
model) from the analysis phase, and generally making `add_binding` much
more focused.
This was made possible by improving the semantic model to better handle
deletions -- previously, we'd "lose track" of bindings if they were
deleted, which made this kind of refactor impossible.
## Test Plan
We have good automated coverage for this, but I want to benchmark it
separately.
## Summary
This PR fixes a silent failure that manifested itself in
https://github.com/astral-sh/ruff-vscode/issues/238. In short, if the
user provided invalid arguments to Ruff in the VS Code extension (like
`"ruff.args": ["a"]`), then we generated something like the following
command:
```console
/path/to/ruff --force-exclude --no-cache --no-fix --format json - --fix a --stdin-filename /path/to/file.py
```
Since this contains both `-` and `a` as the "input files", Ruff would
treat this as if we're linting the files names `-` and `a`, rather than
linting standard input.
This PR modifies out standard input detection to force standard input
when `--stdin-filename` is present, or at least one file is `-`. (We
then warn and ignore the others.)
## Summary
This was just an oversight -- the last remaining `todo!()` that I never
filled in. We clearly don't have any test coverage for it yet, but this
mimics the Pyright implementation.
## Summary
This PR adds some snapshot tests for the resolver based on executing
resolutions within a "mock" of the Airflow repo (that is: a folder that
contains a subset of the repo's files, but all empty, and with an
only-partially-complete virtual environment). It's intended to act as a
lightweight integration test, to enable us to test resolutions on a
"real" project without adding a dependency on Airflow itself.
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Currently the URL at the bottom of the `ruff rule SLOT00x` output points
to Python 3.7 docs.
Given that Python 3.7 is now end-of-life (as of yesterday), let's
instead point users to the current Python docs.
## Test Plan
<!-- How was it tested? -->
## Summary
Consider Jupyter index for code frames (`--show-source`).
This solves two problems as mentioned in the linked issue:
> Omit any contents from adjoining cells
If the Jupyter index is present, we'll use that to check if the
surrounding
lines belong to the same cell as the content line. If not, we'll skip
that line
until we either reach the one which does or we reach the content line.
> code frame line number
If the Jupyter index is present, we'll use that to get the actual start
line in
corresponding to the computed start index.
## Test Plan
`cargo run --bin ruff -- check --no-cache --isolated --select=ALL --show-source /path/to/notebook.ipynb`
fixes: #5395
## Summary
The `Y053` rule of `flake8-pyi` ignores docstrings, it only triggers on
other string literals.
The separate `Y021/PYI021` rule exists to disallow docstrings.
## Test Plan
Added some `# OK` test cases to `PYI053.py(i)` files.
## Summary
Implement Pylint rule `single-string-used-for-slots` (`C0205`) as
`single-string-slots` (`PLC0205`). This rule checks for single strings
being assigned to `__slots__`. For example
```python
class Foo:
__slots__: str = "bar"
def __init__(self, bar: str) -> None:
self.bar = bar
```
should be
```python
class Foo:
__slots__: tuple[str, ...] = ("bar",)
def __init__(self, bar: str) -> None:
self.bar = bar
```
Related to #970. Includes documentation.
## Test Plan
`cargo test`
## Summary
Replace same length equal line with dash line in D407
Do we want to update the message and autofix title to reflect this
change?
## Test Plan
Added test cases for:
- Equal line length == dash line length
- Equal line length != dash line length
fixes: #5378
## Summary
This PR contains the first step towards enabling robust first-party,
third-party, and standard library import resolution in Ruff (including
support for `typeshed`, stub files, native modules, etc.) by porting
Pyright's import resolver to Rust.
The strategy taken here was to start with a more-or-less direct port of
the Pyright's TypeScript resolver. The code is intentionally similar,
and the test suite is effectively a superset of Pyright's test suite for
its own resolver. Due to the nature of the port, the code is very, very
non-idiomatic for Rust. The code is also entirely unused outside of the
test suite, and no effort has been made to integrate it with the rest of
the codebase.
Future work will include:
- Refactoring the code (now that it works) to match Rust and Ruff
idioms.
- Further testing, in practice, to ensure that the resolver can resolve
imports in a complex project, when provided with a virtual environment
path.
- Caching, to minimize filesystem lookups and redundant resolutions.
- Integration into Ruff itself (use Ruff's existing settings, find rules
that can make use of robust resolution, etc.)
ruff_dev repeat recently broke (i think with the cargo update?):
> thread 'main' panicked at 'Command repeat: Short option names must be
unique for each argument, but '-n' is in use by both 'no_cache' and
'repeat''
This fixes this by removing the short argument.
## Summary
This formats call expressions with magic trailing comma and parentheses
behaviour but without call chaining
## Test Plan
Lots of new test fixtures, including some that don't work yet
## Summary
Implements PERF203 from #4789, which throws if a `try/except` block is
inside of a loop. Not sure if we want to extend the diagnostic to the
`except` as well, but I thought that that may get a little messy. We may
also want to just throw on the word `try` - open to suggestions though.
## Test Plan
`cargo test`
## Summary
Experimental release for Jupyter Notebook integration.
Currently, this requires a user to explicitly opt-in using the
[include](https://beta.ruff.rs/docs/settings/#include) configuration:
```toml
[tool.ruff]
include = ["*.py", "*.pyi", "**/pyproject.toml", "*.ipynb"]
```
Or, a user can pass in the file directly:
```sh
ruff check path/to/notebook.ipynb
```
For known limitations, please refer #5188
## Test Plan
Following command should work without the `--all-features` flag:
```sh
cargo dev round-trip /path/to/notebook.ipynb
```
Following command should work with the above config file along with
`select = ["ALL"]`:
```sh
cargo run --bin ruff -- check --no-cache --config=../test-repos/openai-cookbook/pyproject.toml --fix ../test-repos/openai-cookbook/
```
Passing the Jupyter notebook directly:
```sh
cargo run --bin ruff -- check --no-cache --isolated --select=ALL --fix ../test-repos/openai-cookbook/examples/Classification_using_embeddings.ipynb
```
## Summary
Add documentation to the `D1XX` rules that flag missing docstrings.
The examples are quite long and docstrings practices vary a lot between
projects, so I thought it would be best that the documentation for these
rules be their own PR separate to the other `pydocstyle` rules.
Related to #2646.
## Test Plan
`python scripts/check_docs_formatted.py`
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR extends the string formatting to respect the configured quote style.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
Extended the string test with new cases and set it up to run twice: Once with the `quote_style: Doube`, and once with `quote_style: Single` single and double quotes.
<!-- How was it tested? -->
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR adds tests that verify that the magic trailing comma is not respected if disabled in the formatter options.
Our test setup now allows to create a `<fixture-name>.options.json` file that contains an array of configurations that should be tested.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
It's all about tests :)
<!-- How was it tested? -->
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR adds a new `PyFormatOptions` struct that stores the python formatter options.
The new options aren't used yet, with the exception of magical trailing commas and the options passed to the printer.
I'll follow up with more PRs that use the new options (e.g. `QuoteStyle`).
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
`cargo test` I'll follow up with a new PR that adds support for overriding the options in our fixture tests.
## Motation
Previously,
```python
x = (
a1
.a2
# a
. # b
# c
a3
)
```
got formatted as
```python
x = a1.a2
# a
. # b
# c
a3
```
which is invalid syntax. This fixes that.
## Summary
This implements a basic form of attribute chaining
(<https://black.readthedocs.io/en/stable/the_black_code_style/current_style.html#call-chains>)
by checking if any inner attribute access contains an own line comment,
and if this is the case, adds parentheses around the outermost attribute
access while disabling parentheses for all inner attribute expressions.
We want to replace this with an implementation that uses recursion or a
stack while formatting instead of in `needs_parentheses` and also
includes calls rather sooner than later, but i'm fixing this now because
i'm uncomfortable with having known invalid syntax generation in the
formatter.
## Test Plan
I added new fixtures.
## Summary
Add documentation to the `D3XX` rules that check for issues with
docstring quotes. Related to #2646.
## Test Plan
`python scripts/check_docs_formatted.py`
## Summary
Ignore type aliases for RUF013 to avoid flagging false positives:
```python
from typing import Optional
MaybeInt = Optional[int]
def f(arg: MaybeInt = None):
pass
```
But, at the expense of having false negatives:
```python
Text = str | bytes
def f(arg: Text = None):
pass
```
## Test Plan
`cargo test`
fixes: #5295
## Summary
This is small refactoring to reuse the code that detects the magic
trailing comma across functions. I make this change now to avoid copying
code in a later PR. @MichaReiser is planning on making a larger
refactoring later that integrates with the join nodes builder
## Test Plan
No functional changes. The magic trailing comma behaviour is checked by
the fixtures.
## Summary
When visiting AugAssign in evaluation order, the AugAssign `target`
should be visited after it's `value`. Based on my testing, the pseudo
code for `a += b` is effectively:
```python
tmp = a
a = tmp.__iadd__(b)
```
That is, an ideal traversal order would look something like this:
1. load a
2. b
3. op
4. store a
But, there is only a single AST node which captures `a` in the statement
`a += b`, so it cannot be traversed both before and after the traversal
of `b` and the `op`.
Nonetheless, I think traversing `a` after `b` and the `op` makes the
most sense for a number of reasons:
1. All the other assignment expressions traverse their `value`s before
their `target`s. Having `AugAssign` traverse in the same order would be
more consistent.
2. Within the AST, the `ctx` of the `target` for an `AugAssign` is
`Store` (though technically this is a `Load` and `Store` operation, the
AST only indicates it as a `Store`). Since the the store portion of the
`AugAssign` occurs last, I think it makes sense to traverse the `target`
last as well.
The effect of this is marginal, but it may have an impact on the
behavior of #5271.
## Summary
And remove cached files that we haven't seen for a certain period of
time, currently 30 days.
For the last seen timestamp we actually use an `u64`, it's smaller on
disk than `SystemTime` (which size is OS dependent) and fits in an
`AtomicU64` which we can use to update it without locks.
## Test Plan
Added a new unit test, run by `cargo test`.
In the following code, the comment used to get wrongly associated with
the `if False` since it looked like an elif. This fixes it by checking
the indentation and adding a regression test
```python
if True:
pass
else: # Comment
if False:
pass
pass
```
Originally found in
1570b94a02/gradio/external.py (L478)
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR implements formatting for non-f-string Strings that do not use implicit concatenation.
Docstring formatting is out of the scope of this PR.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
I added a few tests for simple string literals.
## Performance
Ouch. This is hitting performance somewhat hard. This is probably because we now iterate each string a couple of times:
1. To detect if it is an implicit string continuation
2. To detect if the string contains any new lines
3. To detect the preferred quote
4. To normalize the string
Edit: I integrated the detection of newlines into the preferred quote detection so that we only iterate the string three time.
We can probably do better by merging the implicit string continuation with the quote detection and new line detection by iterating till the end of the string part and returning the offset. We then use our simple tokenizer to skip over any comments or whitespace until we find the first non trivia token. From there we keep continue doing this in a loop until we reach the end o the string. I'll leave this improvement for later.
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR adds basic formatting for compare operations.
The implementation currently breaks diffeently when nesting binary like expressions. I haven't yet figured out what Black's logic is in that case but I think that this by itself is already an improvement worth merging.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
I added a few new tests
<!-- How was it tested? -->
## Summary
The `Visitor` and `preorder::Visitor` traits provide some convenience
functions, `visit_annotation` and `visit_format_spec`, for handling
annotation and format spec expressions respectively. Both of these
functions accept an `&Expr` and have a default implementation which
delegates to `walk_expr`. The problem with this approach is that any
custom handling done in `visit_expr` will be skipped for annotations and
format specs. Instead, to capture any custom logic implemented in
`visit_expr`, both of these function's default implementations should
delegate to `visit_expr` instead of `walk_expr`.
## Example
Consider the below `Visitor` implementation:
```rust
impl<'a> Visitor<'a> for Example<'a> {
fn visit_expr(&mut self, expr: &'a Expr) {
match expr {
Expr::Name(ExprName { id, .. }) => println!("Visiting {:?}", id),
_ => walk_expr(self, expr),
}
}
}
```
Run on the following Python snippet:
```python
a: b
```
I would expect such a visitor to print the following:
```
Visiting b
Visiting a
```
But it instead prints the following:
```
Visiting a
```
Our custom `visit_expr` handler is not invoked for the annotation.
## Test Plan
Tests added in #5271 caught this behavior.
## Summary
Move `collection-literal-concatenation` markdown documentation to the
correct place.
Fixes error in #5262.
## Test Plan
`python scripts/check_docs_formatted.py`
## Summary
Adds PERF101 which checks for unnecessary casts to `list` in for loops.
NOTE: Is not fully equal to its upstream implementation as this
implementation does not flag based on type annotations
(i.e.):
```python
def foo(x: List[str]):
for y in list(x):
...
```
With the current set-up it's quite hard to get the annotation from a
function arg from its binding. Problem is best considered broader than
this implementation.
## Test Plan
Added fixture.
## Issue links
Refers: https://github.com/astral-sh/ruff/issues/4789
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
This contains three changes:
* repos in `check_ecosystem.py` are stored as `org:name` instead of
`org/name` to create a flat directory layout
* `check_ecosystem.py` performs a maximum of 50 parallel jobs at the
same time to avoid consuming to much RAM
* `check-formatter-stability` gets a new option `--multi-project` so
it's possible to do `cargo run --bin ruff_dev --
check-formatter-stability --multi-project target/checkouts`
With these three changes it becomes easy to check the formatter
stability over a larger number of repositories. This is part of the
integration of integrating formatter regressions checks into the
ecosystem checks.
## Test Plan
```shell
python scripts/check_ecosystem.py --checkouts target/checkouts --projects github_search.jsonl -v $(which true) $(which true)
cargo run --bin ruff_dev -- check-formatter-stability --multi-project target/checkouts
```
## Summary
Remove recommendations to replace
`typing_extensions.dataclass_transform` and
`typing_extensions.SupportsIndex` with their `typing` library
counterparts.
Closes#5112.
## Test Plan
Added extra checks to the test fixture.
`cargo test`
## Summary
This snippet used to panic because it expected to see a comma or
something similar after the `2` but met the closing parentheses that is
not part of the range and panicked
```python
a = {
1: (2),
# comment
3: True,
}
```
Originally found in
636a717ef0/testing/marionette/client/marionette_driver/geckoinstance.py (L109)
This snippet is also the test plan.
This solves an instability when formatting cpython. It also introduces
another one, but i think it's still a worthwhile change for now.
There's no proper testing since this is just a dummy.
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
format StmtFor
still trying to learn how to help out with the formatter. trying
something slightly more advanced than [break](#5158)
mostly copied form StmtWhile
## Test Plan
snapshots
## Summary
Ensures that `--select PL` and `--select PLC` don't include `PLC1901`.
Previously, `--select PL` _did_, because it's a "linter-level selector"
(`--select PLC` is viewed as selecting the `C` prefix from `PL`), and we
were missing this filtering path.
## Summary
This is a follow up to #5221. Turns out it was easy to restructure the
visitor to get the right order, I'm just dumb 🤷♂️ I've
removed `visit_arg_with_default` entirely from the `Visitor`, although
it still exists as part of `preorder::Visitor`.
## Motivation
While black keeps parentheses nearly everywhere, the notable exception
is in the body of for loops:
```python
for (a, b) in x:
pass
```
becomes
```python
for a, b in x:
pass
```
This currently blocks #5163, which this PR should unblock.
## Solution
This changes the `ExprTuple` formatting option to include one additional
option that removes the parentheses when not using magic trailing comma
and not breaking. It is supposed to be used through
```rust
#[derive(Debug)]
struct ExprTupleWithoutParentheses<'a>(&'a Expr);
impl Format<PyFormatContext<'_>> for ExprTupleWithoutParentheses<'_> {
fn fmt(&self, f: &mut Formatter<PyFormatContext<'_>>) -> FormatResult<()> {
match self.0 {
Expr::Tuple(expr_tuple) => expr_tuple
.format()
.with_options(TupleParentheses::StripInsideForLoop)
.fmt(f),
other => other.format().with_options(Parenthesize::IfBreaks).fmt(f),
}
}
}
```
## Testing
The for loop formatting isn't merged due to missing this (and i didn't
want to create more git weirdness across two people), but I've confirmed
that when applying this to while loops instead of for loops, then
```rust
write!(
f,
[
text("while"),
space(),
ExprTupleWithoutParentheses(test.as_ref()),
text(":"),
trailing_comments(trailing_condition_comments),
block_indent(&body.format())
]
)?;
```
makes
```python
while (a, b):
pass
while (
ajssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssa,
b,
):
pass
while (a,b,):
pass
```
formatted as
```python
while a, b:
pass
while (
ajssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssa,
b,
):
pass
while (
a,
b,
):
pass
```
## Summary
According to the AST visitor documentation, the AST visitor "visits all
nodes in the AST recursively in evaluation-order". However, the current
traversal fails to meet this specification in a few places.
### Function traversal
```python
order = []
@(order.append("decorator") or (lambda x: x))
def f(
posonly: order.append("posonly annotation") = order.append("posonly default"),
/,
arg: order.append("arg annotation") = order.append("arg default"),
*args: order.append("vararg annotation"),
kwarg: order.append("kwarg annotation") = order.append("kwarg default"),
**kwargs: order.append("kwarg annotation")
) -> order.append("return annotation"):
pass
print(order)
```
Executing the above snippet using CPython 3.10.6 prints the following
result (formatted for readability):
```python
[
'decorator',
'posonly default',
'arg default',
'kwarg default',
'arg annotation',
'posonly annotation',
'vararg annotation',
'kwarg annotation',
'kwarg annotation',
'return annotation',
]
```
Here we can see that decorators are evaluated first, followed by
argument defaults, and annotations are last. The current traversal of a
function's AST does not align with this order.
### Annotated assignment traversal
```python
order = []
x: order.append("annotation") = order.append("expression")
print(order)
```
Executing the above snippet using CPython 3.10.6 prints the following
result:
```python
['expression', 'annotation']
```
Here we can see that an annotated assignments annotation gets evaluated
after the assignment's expression. The current traversal of an annotated
assignment's AST does not align with this order.
## Why?
I'm slowly working on #3946 and porting over some of the logic and tests
from ssort. ssort is very sensitive to AST traversal order, so ensuring
the utmost correctness here is important.
## Test Plan
There doesn't seem to be existing tests for the AST visitor, so I didn't
bother adding tests for these very subtle changes. However, this
behavior will be captured in the tests for the PR which addresses #3946.
## Summary
This is a complete rewrite of the handling of `/` and `*` comment
handling in function signatures. The key problem is that slash and star
don't have a note. We now parse out the positions of slash and star and
their respective preceding and following note. I've left code comments
for each possible case of function signature structure and comment
placement
## Test Plan
I extended the function statement fixtures with cases that i found. If
you have more weird edge cases your input would be appreciated.
## Summary
This fixes two problems discovered when trying to format the cpython
repo with `cargo run --bin ruff_dev -- check-formatter-stability
projects/cpython`:
The first is to ignore try/except trailing comments for now since they
lead to unstable formatting on the dummy.
The second is to avoid dropping trailing if comments through placement:
This changes the placement to keep a comment trailing an if-elif or
if-elif-else to keep the comment a trailing comment on the entire if.
Previously the last comment would have been lost.
```python
if "first if":
pass
elif "first elif":
pass
```
The last remaining problem in cpython so far is function signature
argument separator comment placement which is its own PR on top of this.
## Test Plan
I added test fixtures of minimized examples with links back to the
original cpython location
## Summary
While fixing https://github.com/astral-sh/ruff/pull/5233, I noticed that
in FastAPI, 343 out of 823 files weren't hitting the cache. It turns out
these are standalone files in the documentation that lack a "package
root". Later, when looking up the cache entries, we fallback to the
package directory.
This PR ensures that we initialize the cache for both kinds of files:
those that are in a package, and those that aren't.
The total size of the FastAPI cache for me is now 388K. I also suspect
that this approach is much faster than as initially written, since
before, we were probably initializing one cache per _directory_.
## Test Plan
Ran `cargo run -p ruff_cli -- check ../fastapi --verbose`; verified
that, on second execution, there were no "Checking" entries in the logs.
## Summary
In the latest release, we made some improvements to the semantic model,
but our modifications to exception-unbinding are causing some
false-positives. For example:
```py
try:
v = 3
except ImportError as v:
print(v)
else:
print(v)
```
In the latest release, we started unbinding `v` after the `except`
handler. (We used to restore the existing binding, the `v = 3`, but this
was quite complicated.) Because we don't have full branch analysis, we
can't then know that `v` is still bound in the `else` branch.
The solution here modifies `resolve_read` to skip-lookup when hitting
unbound exceptions. So when store the "unbind" for `except ImportError
as v`, we save the binding that it shadowed `v = 3`, and skip to that.
Closes#5249.
Closes#5250.
This formats slice expressions and subscript expressions.
Spaces around the colons follows the same rules as black
(https://black.readthedocs.io/en/stable/the_black_code_style/current_style.html#slices):
```python
e00 = "e"[:]
e01 = "e"[:1]
e02 = "e"[: a()]
e10 = "e"[1:]
e11 = "e"[1:1]
e12 = "e"[1 : a()]
e20 = "e"[a() :]
e21 = "e"[a() : 1]
e22 = "e"[a() : a()]
e200 = "e"[a() : :]
e201 = "e"[a() :: 1]
e202 = "e"[a() :: a()]
e210 = "e"[a() : 1 :]
```
Comment placement is different due to our very different infrastructure.
If we have explicit bounds (e.g. `x[1:2]`) all comments get assigned as
leading or trailing to the bound expression. If a bound is missing
`[:]`, comments get marked as dangling and placed in the same section as
they were originally in:
```python
x = "x"[ # a
# b
: # c
# d
]
```
to
```python
x = "x"[
# a
# b
:
# c
# d
]
```
Except for the potential trailing end-of-line comments, all comments get
formatted on their own line. This can be improved by keeping end-of-line
comments after the opening bracket or after a colon as such but the
changes were already complex enough.
I added tests for comment placement and spaces.
## Summary
I found it hard to figure out which function decides placement for a
specific comment. An explicit loop makes this easier to debug
## Test Plan
There should be no functional changes, no changes to the formatting of
the fixtures.
This moves all docs about benchmarking and profiling into
CONTRIBUTING.md by moving the readme of `ruff_benchmark` and adding more
information on profiling.
We need to somehow consolidate that documentation, but i'm not convinced
that this is the best way (i tried subpages in mkdocs, but that didn't
seem good either), so i'm happy to take suggestions.
## Summary
Previously, `DecoratedComment` used `text_position()` and
`SourceComment` used `position()`. This PR unifies this to
`line_position` everywhere.
## Test Plan
This is a rename refactoring.
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR adds basic formatting for unary expressions.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
I added a new `unary.py` with custom test cases
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Black supports for layouts when it comes to breaking binary expressions:
```rust
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
enum BinaryLayout {
/// Put each operand on their own line if either side expands
Default,
/// Try to expand the left to make it fit. Add parentheses if the left or right don't fit.
///
///```python
/// [
/// a,
/// b
/// ] & c
///```
ExpandLeft,
/// Try to expand the right to make it fix. Add parentheses if the left or right don't fit.
///
/// ```python
/// a & [
/// b,
/// c
/// ]
/// ```
ExpandRight,
/// Both the left and right side can be expanded. Try in the following order:
/// * expand the right side
/// * expand the left side
/// * expand both sides
///
/// to make the expression fit
///
/// ```python
/// [
/// a,
/// b
/// ] & [
/// c,
/// d
/// ]
/// ```
ExpandRightThenLeft,
}
```
Our current implementation only handles `ExpandRight` and `Default` correctly. This PR adds support for `ExpandRightThenLeft` and `ExpandLeft`.
## Test Plan
I added tests that play through all 4 binary expression layouts.
## Summary
I initially wanted this category to be more general and decoupled from
the plugin, but I got some feedback that the titling felt inconsistent
with others.
## Summary
This is a proper fix for the issue patched-over in
https://github.com/astral-sh/ruff/pull/5229, thanks to an extremely
helpful repro from @tlambert03 in that thread. It looks like we were
using the keys of `package_roots` rather than the values to initialize
the cache -- but it's a map from package to package root.
## Test Plan
Reverted #5229, then ran through the plan that @tlambert03 included in
https://github.com/astral-sh/ruff/pull/5229#issuecomment-1599723226.
Verified the panic before but not after this change.
## Summary
This PR reverts #4971 (aba073a791). It
turns out that `f"{str(x)}"` and `f"{x}"` are often but not exactly
equivalent, and performing that conversion automatically can lead to
subtle bugs, See the discussion in
https://github.com/astral-sh/ruff/issues/4958.
## Summary
I haven't been able to determine why / when this is happening, but in
some cases, users are reporting that this `unwrap()` is causing a panic.
It's fine to just return `None` here and fallback to "No cache",
certainly better than panicking (while we figure out the edge case).
Closes#5225.
Closes#5228.
## Summary
I'm looking into the Black stability tests, and here's one failing case.
We split `assert a and (b and c)` into:
```python
assert a
assert (b and c)
```
We fail to split `assert (b and c)` due to the parentheses. But Black
then removes then, and when running Ruff again, we get:
```python
assert a
assert b
assert c
```
This PR just enables us to fix to this in one pass.
## Summary
A new CLI option (`-o`/`--output-file`) to write output to a file
instead of stdout.
Major change is to remove the lock acquired on stdout. The argument is
that the output is buffered and thus the lock is acquired only when
writing a block (8kb). As per the benchmark below there is a slight
performance penalty.
Reference:
https://rustmagazine.org/issue-3/javascript-compiler/#printing-is-slow
## Benchmarks
_Output is truncated to only contain useful information:_
Command: `check --isolated --no-cache --select=ALL --show-source
./test-repos/cpython"`
Latest HEAD (361d45f2b2) with and without
the manual lock on stdout:
```console
Benchmark 1: With lock
Time (mean ± σ): 5.687 s ± 0.075 s [User: 17.110 s, System: 0.486 s]
Range (min … max): 5.615 s … 5.860 s 10 runs
Benchmark 2: Without lock
Time (mean ± σ): 5.719 s ± 0.064 s [User: 17.095 s, System: 0.491 s]
Range (min … max): 5.640 s … 5.865 s 10 runs
Summary
(1) ran 1.01 ± 0.02 times faster than (2)
```
This PR:
```console
Benchmark 1: This PR
Time (mean ± σ): 5.855 s ± 0.058 s [User: 17.197 s, System: 0.491 s]
Range (min … max): 5.786 s … 5.987 s 10 runs
Benchmark 2: Latest HEAD with lock
Time (mean ± σ): 5.645 s ± 0.033 s [User: 16.922 s, System: 0.495 s]
Range (min … max): 5.600 s … 5.712 s 10 runs
Summary
(2) ran 1.04 ± 0.01 times faster than (1)
```
## Test Plan
Run all of the commands which gives output with and without the
`--output-file=ruff.out` option:
* `--show-settings`
* `--show-files`
* `--show-fixes`
* `--diff`
* `--select=ALL`
* `--select=All --show-source`
* `--watch` (only stdout allowed)
resolves: #4754
## Summary
Black supports for layouts when it comes to breaking binary expressions:
```rust
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
enum BinaryLayout {
/// Put each operand on their own line if either side expands
Default,
/// Try to expand the left to make it fit. Add parentheses if the left or right don't fit.
///
///```python
/// [
/// a,
/// b
/// ] & c
///```
ExpandLeft,
/// Try to expand the right to make it fix. Add parentheses if the left or right don't fit.
///
/// ```python
/// a & [
/// b,
/// c
/// ]
/// ```
ExpandRight,
/// Both the left and right side can be expanded. Try in the following order:
/// * expand the right side
/// * expand the left side
/// * expand both sides
///
/// to make the expression fit
///
/// ```python
/// [
/// a,
/// b
/// ] & [
/// c,
/// d
/// ]
/// ```
ExpandRightThenLeft,
}
```
Our current implementation only handles `ExpandRight` and `Default` correctly. `ExpandLeft` turns out to be surprisingly hard. This PR adds a new `BestFittingMode` parameter to `BestFitting` to support `ExpandLeft`.
There are 3 variants that `ExpandLeft` must support:
**Variant 1**: Everything fits on the line (easy)
```python
[a, b] + c
```
**Variant 2**: Left breaks, but right fits on the line. Doesn't need parentheses
```python
[
a,
b
] + c
```
**Variant 3**: The left breaks, but there's still not enough space for the right hand side. Parenthesize the whole expression:
```python
(
[
a,
b
]
+ ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
)
```
Solving Variant 1 and 2 on their own is straightforward The printer gives us this behavior by nesting right inside of the group of left:
```
group(&format_args![
if_group_breaks(&text("(")),
soft_block_indent(&group(&format_args![
left,
soft_line_break_or_space(),
op,
space(),
group(&right)
])),
if_group_breaks(&text(")"))
])
```
The fundamental problem is that the outer group, which adds the parentheses, always breaks if the left side breaks. That means, we end up with
```python
(
[
a,
b
] + c
)
```
which is not what we want (we only want parentheses if the right side doesn't fit).
Okay, so nesting groups don't work because of the outer parentheses. Sequencing groups doesn't work because it results in a right-to-left breaking which is the opposite of what we want.
Could we use best fitting? Almost!
```
best_fitting![
// All flat
format_args![left, space(), op, space(), right],
// Break left
format_args!(group(&left).should_expand(true), space(), op, space(), right],
// Break all
format_args![
text("("),
block_indent!(&format_args![
left,
hard_line_break(),
op,
space()
right
])
]
]
```
I hope I managed to write this up correctly. The problem is that the printer never reaches the 3rd variant because the second variant always fits:
* The `group(&left).should_expand(true)` changes the group so that all `soft_line_breaks` are turned into hard line breaks. This is necessary because we want to test if the content fits if we break after the `[`.
* Now, the whole idea of `best_fitting` is that you can pretend that some content fits on the line when it actually does not. The way this works is that the printer **only** tests if all the content of the variant **up to** the first line break fits on the line (we insert that line break by using `should_expand(true))`. The printer doesn't care whether the rest `a\n, b\n ] + c` all fits on (multiple?) lines.
Why does breaking right work but not breaking the left? The difference is that we can make the decision whether to parenthesis the expression based on the left expression. We can't do this for breaking left because the decision whether to insert parentheses or not would depend on a lookahead: will the right side break. We simply don't know this yet when printing the parentheses (it would work for the right parentheses but not for the left and indent).
What we kind of want here is to tell the printer: Look, what comes here may or may not fit on a single line but we don't care. Simply test that what comes **after** fits on a line.
This PR adds a new `BestFittingMode` that has a new `AllLines` option that gives us the desired behavior of testing all content and not just up to the first line break.
## Test Plan
I added a new example to `BestFitting::with_mode`
## Summary
Completes the documentation for the `flake8-bugbear` ruleset. Related to
#2646.
## Test Plan
`python scripts/check_docs_formatted.py`
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
Now that all identifiers include ranges (#5194), we can remove a ton of
this "custom lexing" code that we have to sketchily extract identifier
ranges from source.
## Test Plan
`cargo test`
## Summary
In https://github.com/astral-sh/RustPython-Parser/pull/8, we modified
RustPython to include ranges for any identifiers that aren't
`Expr::Name` (which already has an identifier).
For example, the `e` in `except ValueError as e` was previously
un-ranged. To extract its range, we had to do some lexing of our own.
This change should improve performance and let us remove a bunch of
code.
## Test Plan
`cargo test`
## Summary
Open cache files in parallel (again), brings the performance back to be roughly equal to the old implementation.
## Test Plan
Existing tests should keep working.
## Summary
Handle trailing newline in Jupyter Notebook JSON string similar to how
`black`
does it.
## Test Plan
Add test cases when the JSON string for notebook ends with and without a
newline.
resolves: #5190
## Summary
Fixes#4404.
Consider this file:
```python
if True:
x = 1; \
<space><space><space>
```
The current implementation of W293 removes the 3 spaces on line 2. This
fix changes the file to:
```python
if True:
x = 1; \
```
A file can't end in a `\`, according to Python's [lexical
analysis](https://docs.python.org/3/reference/lexical_analysis.html), so
subsequent iterations of the autofixer fail (the AST-based ones
specifically, since they depend on a valid syntax tree and get
re-parsed).
This patch examines the line before the line checked in `W293`. If its
first non-whitespace character is a `\`, the patch will extend the
diagnostic's fix range to all whitespace up until the previous line's
*second* non-whitespace character; that is, it deletes all spaces and
potential `\`s up until the next non-whitespace character on the
previous line.
## Test Plan
Ran `cargo run -p ruff_cli -- ~/Downloads/aa.py --fix --select W293,D100
--no-cache` against the above file. This resulted in:
```
/Users/evan/Downloads/aa.py:1:1: D100 Missing docstring in public module
Found 2 errors (1 fixed, 1 remaining).
```
The file's contents, after the fix:
```python
if True:
x = 1;<space>
```
The `\` was removed, leaving the terminal space. The space should be
handled by `Rule::TrailingWhitespace`, not `BlankLineWithWhitespace`.
## Summary
This PR modifies our statement deletion logic to delete any preceding
continuation lines.
For example, given:
```py
x = 1; \
import os
```
We'll now rewrite to:
```py
x = 1;
```
In addition, the logic can now handle multiple preceding continuations
(which is unlikely, but valid).
## Summary
This PR upgrade RustPython to pull in the changes to `Arguments` (zip
defaults with their identifiers) and all the renames to `CmpOp` and
friends.
## Summary
I want to include URLs to the rule documentation in the LSP (the LSP has
a native `code_description` field for this, which, if specified, causes
the source to be rendered as a link to the docs). This PR exposes the
URL to the documentation in the Ruff JSON output.
## Summary
Maintain consistency while deserializing Jupyter notebook to JSON. The
following changes were made:
1. Use string array to store the source value as that's the default
(5781720423/nbformat/v4/nbjson.py (L56-L57))
2. Remove unused structs and enums
3. Reorder the keys in alphabetical order as that's the default.
(5781720423/nbformat/v4/nbjson.py (L51))
### Side effect
Removing the `preserve_order` feature means that the order of keys in
JSON output (`--format json`) will be in alphabetical order. This is
because the value is represented using `serde_json::Value` which
internally is a `BTreeMap`, thus sorting it as per the string key. For
posterity if this turns out to be not ideal, then we could define a
struct representing the JSON object and the order of struct fields will
determine the order in the JSON string.
## Test Plan
Add a test case to assert the raw JSON string.
## Summary
This changes the caching design from one cache file per source file, to
one cache file per package. This greatly reduces the amount of cache
files that are opened and written, while maintaining roughly the same
(combined) size as bincode is very compact.
Below are some very much not scientific performance tests. It uses
projects/sources to check:
* small.py: single, 31 bytes Python file with 2 errors.
* test.py: single, 43k Python file with 8 errors.
* fastapi: FastAPI repo, 1134 files checked, 0 errors.
Source | Before # files | After # files | Before size | After size
-------|-------|-------|-------|-------
small.py | 1 | 1 | 20 K | 20 K
test.py | 1 | 1 | 60 K | 60 K
fastapi | 1134 | 518 | 4.5 M | 2.3 M
One question that might come up is why fastapi still has 518 cache files
and not 1? That is because this is using the existing package
resolution, which sees examples, docs, etc. as separate from the "main"
source code (in the fastapi directory in the repo). In this future it
might be worth consider switching to a one cache file per repo strategy.
This new design is not perfect and does have a number of known issues.
First, like the old design it doesn't remove the cache for a source file
that has been (re)moved until `ruff clean` is called.
Second, this currently uses a large mutex around the mutation of the
package cache (e.g. inserting result). This could be (or become) a
bottleneck. It's future work to test and improve this (if needed).
Third, currently the packages and opened and stored in a sequential
loop, this could be done parallel. This is also future work.
## Test Plan
Run `ruff check` (with caching enabled) twice on any Python source code
and it should produce the same results.
## Summary
We want to ensure that once formatted content stays the same when
formatted again, which is known as formatter stability or formatter
idempotency, and that the formatter prints syntactically valid code. As
our test cases cover only a limited amount of code, this allows checking
entire repositories.
This adds a new subcommand to `ruff_dev` which can be invoked as `cargo
run --bin ruff_dev -- check-formatter-stability <repo>`. While initially
only intended to check stability, it has also found cases where the
formatter printed invalid syntax or panicked.
## Test Plan
Running this on cpython is already identifying bugs
(https://github.com/astral-sh/ruff/pull/5089)
This documentation change improves the section on dangling comments in
the formatter.
---------
Co-authored-by: David Szotten <davidszotten@gmail.com>
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
This adds a new subcommand that can be used as
```shell
cargo build --bin ruff_dev --profile=release-debug
perf record -g -F 999 target/release-debug/ruff_dev repeat --repeat 30 --exit-zero --no-cache path/to/cpython > /dev/null
flamegraph --perfdata perf.data
```
## Test Plan
This is a ruff internal script. I successfully used it to profile
cpython with the instructions above
## Summary
We weren't resetting the `allow_ellipsis` flag properly, which
ultimately caused us to treat the semicolon as "unnecessary" rather than
"creating a multi-statement line".
Closes#5154.
## Summary
Given:
```python
\
import os
```
Deleting `import os` leaves a syntax error: a file can't end in a
continuation. We have code to handle this case, but it failed to pick up
continuations at the _very start_ of a file.
Closes#5156.
## Summary
This PR moves the "unconventional import alias" rule (which enforces,
e.g., that `pandas` is imported as `pd`) to the "dead scopes" phase,
after the main linter pass. This (1) avoids an allocation since we no
longer need to create the qualified name in the linter pass; and (2)
will allow us to autofix it, since we'll have access to all references.
## Test Plan
`cargo test` -- all changes are to ranges (which are improvements IMO).
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Format `continue` statement.
## Test Plan
`continue` is used already in some tests, but if a new test is needed I
could add it.
---------
Co-authored-by: konstin <konstin@mailbox.org>
This tackles three problems:
* pre-commit was slow because it ran cargo commands
* Improve the clarity on what you need to run to get your PR pass on CI
(and make those fast)
* You had to compile and run `cargo dev generate-all` separately, which
was slow
The first change is to remove all cargo commands except running ruff
itself from pre-commit. With `cargo run --bin ruff` already compiled it
takes about 7s on my machine. It would make sense to also use the ruff
pre-commit action here even if we're then lagging a release behind for
checking ruff on ruff.
The contributing guide is now clear about what you need to run:
```shell
cargo clippy --workspace --all-targets --all-features -- -D warnings # Linting...
RUFF_UPDATE_SCHEMA=1 cargo test # Testing and updating ruff.schema.json
pre-commit run --all-files # rust and python formatting, markdown and python linting, etc.
```
Example timings from my machine:
`cargo clippy --workspace --all-targets --all-features -- -D warnings`:
23s
`RUFF_UPDATE_SCHEMA=1 cargo test`: 2min (recompiling), 1min (no code
changes, this is mainly doc tests)
`pre-commit run --all-files`: 7s
The exact numbers don't matter so much as the approximate experience (6s
is easier to just wait than 1min, esp if you need to fix and rerun). The
biggest remaining block seems to be doc tests, i'm surprised i didn't
find any solution to speeding them up (nextest simply doesn't run them
at all). Also note that the formatter has it's own tests which are much
faster since they avoid linking ruff (`cargo test
ruff_python_formatter`).
The third change is to enable `cargo test` to update the schema. Similar
to `INSTA_UPDATE=always`, i've added `RUFF_UPDATE_SCHEMA=1` (name open
to bikeshedding), so `RUFF_UPDATE_SCHEMA=1 cargo test` updates the
schema, while `cargo test` still fails as expected if the repo isn't
up-to-date.
---------
Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
## Summary
Completes the documentation for the `flake8-blind-except` and
`flake8-raise` rules.
Related to #2646.
## Test Plan
`python scripts/check_docs_formatted.py`
## Summary
After #5140, I audited the codebase for similar patterns (defining a
list of `CallPath` entities in a static vector, then looping over them
to pattern-match). This PR migrates all other such cases to use `match`
and `matches!` where possible.
There are a few benefits to this:
1. It more clearly denotes the intended semantics (branches are
exclusive).
2. The compiler can help deduplicate the patterns and detect unreachable
branches.
3. Performance: in the benchmark below, the all-rules performance is
increased by nearly 10%...
## Benchmarks
I decided to benchmark against a large file in the Airflow repository
with a lot of type annotations
([`views.py`](https://raw.githubusercontent.com/apache/airflow/f03f73100e8a7d6019249889de567cb00e71e457/airflow/www/views.py)):
```
linter/default-rules/airflow/views.py
time: [10.871 ms 10.882 ms 10.894 ms]
thrpt: [19.739 MiB/s 19.761 MiB/s 19.781 MiB/s]
change:
time: [-2.7182% -2.5687% -2.4204%] (p = 0.00 < 0.05)
thrpt: [+2.4805% +2.6364% +2.7942%]
Performance has improved.
linter/all-rules/airflow/views.py
time: [24.021 ms 24.038 ms 24.062 ms]
thrpt: [8.9373 MiB/s 8.9461 MiB/s 8.9527 MiB/s]
change:
time: [-8.9537% -8.8516% -8.7527%] (p = 0.00 < 0.05)
thrpt: [+9.5923% +9.7112% +9.8342%]
Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
5 (5.00%) high mild
7 (7.00%) high severe
```
The impact is dramatic -- nearly a 10% improvement for `all-rules`.
## Summary
This PR fixes a small quirk in the semantic model. Typically, when we
see an import, like `import foo`, we create a `BindingKind::Importation`
for it. However, if `foo` has been declared as a `global`, then we
propagate the kind forward. So given:
```python
global foo
import foo
```
We'd create two bindings for `foo`, both with type `global`.
This was originally borrowed from Pyflakes, and it exists to help avoid
false-positives like:
```python
def f():
global foo
# Don't mark `foo` as "assigned but unused"! It's a global!
foo = 1
```
This PR removes that behavior, and instead tracks "Does this binding
refer to a global?" as a flag. This is much cleaner, since it means we
don't "lose" the identity of various bindings.
As a very strange example of why this matters, consider:
```python
def foo():
global Member
from module import Member
x: Member = 1
```
`Member` is only used in a typing context, so we should flag it and say
"move it to a `TYPE_CHECKING` block". However, when we go to analyze
`from module import Member`, it has `BindingKind::Global`. So we don't
even know that it's an import!
## Summary
In #5074, we introduced an abstraction to support local symbol renames
("local" here refers to "within a module"). However, that abstraction
didn't support `global` and `nonlocal` symbols. This PR extends it to
those cases.
Broadly, there are considerations.
First, if we're renaming a symbol in a scope in which it is declared
`global` or `nonlocal`. For example, given:
```python
x = 1
def foo():
global x
```
Then when renaming `x` in `foo`, we need to detect that it's `global`
and instead perform the rename starting from the module scope.
Second, when renaming a symbol, we need to determine the scopes in which
it is declared `global` or `nonlocal`. This is effectively the inverse
of the above: when renaming `x` in the module scope, we need to detect
that we should _also_ rename `x` in `foo`.
To support these cases, the renaming algorithm was adjusted as follows:
- When we start a rename in a scope, determine whether the symbol is
declared `global` or `nonlocal` by looking for a `global` or `nonlocal`
binding. If it is, start the rename in the defining scope. (This
requires storing the defining scope on the `nonlocal` binding, which is
new.)
- We then perform the rename in the defining scope.
- We then check whether the symbol was declared as `global` or
`nonlocal` in any scopes, and perform the rename in those scopes too.
(Thankfully, this doesn't need to be done recursively.)
Closes#5092.
## Test Plan
Added some additional snapshot tests.
## Summary
This PR enables autofix behavior for the `flake8-pyi` rule that asks you
to alias `Set` to `AbstractSet` when importing `collections.abc.Set`.
It's not the most important rule, but it's a good isolated test-case for
local symbol renaming.
The renaming algorithm is outlined in-detail in the `renamer.rs` module.
But to demonstrate the behavior, here's the diff when running this fix
over a complex file that exercises a few edge cases:
```diff
--- a/foo.pyi
+++ b/foo.pyi
@@ -1,16 +1,16 @@
if True:
- from collections.abc import Set
+ from collections.abc import Set as AbstractSet
else:
- Set = 1
+ AbstractSet = 1
-x: Set = set()
+x: AbstractSet = set()
-x: Set
+x: AbstractSet
-del Set
+del AbstractSet
def f():
- print(Set)
+ print(AbstractSet)
def Set():
pass
```
Making this work required resolving a bunch of edge cases in the
semantic model that were causing us to "lose track" of references. For
example, the above wasn't possible with our previous approach to
handling deletions (#5071). Similarly, the `x: Set` "delayed annotation"
tracking was enabled via #5070. And many of these edits would've failed
if we hadn't changed `BindingKind` to always match the identifier range
(#5090). So it's really the culmination of a bunch of changes over the
course of the week.
The main outstanding TODO is that this doesn't support `global` or
`nonlocal` usages. I'm going to take a look at that tonight, but I'm
comfortable merging this as-is.
Closes#1106.
Closes#5091.
## Summary
I noticed that we have a few hot comparisons that involve called
`s.to_lowercase()`. We can avoid an allocation by comparing characters
directly.
## Summary
@konstin mentioned that in profiling, this function accounted for a
non-trivial amount of time (0.33% of total execution, the most of any
rule). This PR attempts to rewrite it as a match statement for better
performance over a looping comparison.
## Test Plan
`cargo test`
## Summary
If you `import __future__`, it's not subject to the same rules as `from
__future__ import feature` -- i.e., this is fine:
```python
x = 1
import __future__
```
It doesn't really make sense to treat these as `__future__` imports
(though I can't imagine anyone ever does this anyway).
## Summary
At present, when we store a binding, we include a `TextRange` alongside
it. The `TextRange` _sometimes_ matches the exact range of the
identifier to which the `Binding` is linked, but... not always.
For example, given:
```python
x = 1
```
The binding we create _will_ use the range of `x`, because the left-hand
side is an `Expr::Name`, which has a valid range on it.
However, given:
```python
try:
pass
except ValueError as e:
pass
```
When we create a binding for `e`, we don't have a `TextRange`... The AST
doesn't give us one. So we end up extracting it via lexing.
This PR extends that pattern to the rest of the binding kinds, to ensure
that whenever we create a binding, we always use the range of the bound
name. This leads to better diagnostics in cases like pattern matching,
whereby the diagnostic for "unused variable `x`" here used to include
`*x`, instead of just `x`:
```python
def f(provided: int) -> int:
match provided:
case [_, *x]:
pass
```
This is _also_ required for symbol renames, since we track writes as
bindings -- so we need to know the ranges of the bound symbols.
By storing these bindings precisely, we can also remove the
`binding.trimmed_range` abstraction -- since bindings already use the
"trimmed range".
To implement this behavior, I took some of our existing utilities (like
the code we had for `except ValueError as e` above), migrated them from
a full lexer to a zero-allocation lexer that _only_ identifies
"identifiers", and moved the behavior into a trait, so we can now do
`stmt.identifier(locator)` to get the range for the identifier.
Honestly, we might end up discarding much of this if we decide to put
ranges on all identifiers
(https://github.com/astral-sh/RustPython-Parser/pull/8). But even if we
do, this will _still_ be a good change, because the lexer introduced
here is useful beyond names (e.g., we use it find the `except` keyword
in an exception handler, to find the `else` after a `for` loop, and so
on). So, I'm fine committing this even if we end up changing our minds
about the right approach.
Closes#5090.
## Benchmarks
No significant change, with one statistically significant improvement
(-2.1654% on `linter/all-rules/large/dataset.py`):
```
linter/default-rules/numpy/globals.py
time: [73.922 µs 73.955 µs 73.986 µs]
thrpt: [39.882 MiB/s 39.898 MiB/s 39.916 MiB/s]
change:
time: [-0.5579% -0.4732% -0.3980%] (p = 0.00 < 0.05)
thrpt: [+0.3996% +0.4755% +0.5611%]
Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
4 (4.00%) low severe
1 (1.00%) low mild
1 (1.00%) high mild
linter/default-rules/pydantic/types.py
time: [1.4909 ms 1.4917 ms 1.4926 ms]
thrpt: [17.087 MiB/s 17.096 MiB/s 17.106 MiB/s]
change:
time: [+0.2140% +0.2741% +0.3392%] (p = 0.00 < 0.05)
thrpt: [-0.3380% -0.2734% -0.2136%]
Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
linter/default-rules/numpy/ctypeslib.py
time: [688.97 µs 691.34 µs 694.15 µs]
thrpt: [23.988 MiB/s 24.085 MiB/s 24.168 MiB/s]
change:
time: [-1.3282% -0.7298% -0.1466%] (p = 0.02 < 0.05)
thrpt: [+0.1468% +0.7351% +1.3461%]
Change within noise threshold.
Found 15 outliers among 100 measurements (15.00%)
1 (1.00%) low mild
2 (2.00%) high mild
12 (12.00%) high severe
linter/default-rules/large/dataset.py
time: [3.3872 ms 3.4032 ms 3.4191 ms]
thrpt: [11.899 MiB/s 11.954 MiB/s 12.011 MiB/s]
change:
time: [-0.6427% -0.2635% +0.0906%] (p = 0.17 > 0.05)
thrpt: [-0.0905% +0.2642% +0.6469%]
No change in performance detected.
Found 20 outliers among 100 measurements (20.00%)
1 (1.00%) low severe
2 (2.00%) low mild
4 (4.00%) high mild
13 (13.00%) high severe
linter/all-rules/numpy/globals.py
time: [148.99 µs 149.21 µs 149.42 µs]
thrpt: [19.748 MiB/s 19.776 MiB/s 19.805 MiB/s]
change:
time: [-0.7340% -0.5068% -0.2778%] (p = 0.00 < 0.05)
thrpt: [+0.2785% +0.5094% +0.7395%]
Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high severe
linter/all-rules/pydantic/types.py
time: [3.0362 ms 3.0396 ms 3.0441 ms]
thrpt: [8.3779 MiB/s 8.3903 MiB/s 8.3997 MiB/s]
change:
time: [-0.0957% +0.0618% +0.2125%] (p = 0.45 > 0.05)
thrpt: [-0.2121% -0.0618% +0.0958%]
No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) low severe
3 (3.00%) low mild
5 (5.00%) high mild
2 (2.00%) high severe
linter/all-rules/numpy/ctypeslib.py
time: [1.6879 ms 1.6894 ms 1.6909 ms]
thrpt: [9.8478 MiB/s 9.8562 MiB/s 9.8652 MiB/s]
change:
time: [-0.2279% -0.0888% +0.0436%] (p = 0.18 > 0.05)
thrpt: [-0.0435% +0.0889% +0.2284%]
No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
4 (4.00%) low mild
1 (1.00%) high severe
linter/all-rules/large/dataset.py
time: [7.1520 ms 7.1586 ms 7.1654 ms]
thrpt: [5.6777 MiB/s 5.6831 MiB/s 5.6883 MiB/s]
change:
time: [-2.5626% -2.1654% -1.7780%] (p = 0.00 < 0.05)
thrpt: [+1.8102% +2.2133% +2.6300%]
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high mild
```
## Summary
This fixes a number of problems in the formatter that showed up with
various files in the [cpython](https://github.com/python/cpython)
repository. These problems surfaced as unstable formatting and invalid
code. This is not the entirety of problems discovered through cpython,
but a big enough chunk to separate it. Individual fixes are generally
individual commits. They were discovered with #5055, which i update as i
work through the output
## Test Plan
I added regression tests with links to cpython for each entry, except
for the two stubs that also got comment stubs since they'll be
implemented properly later.
## Summary
This PR runs `rustfmt` with a few nightly options as a one-time fix to
catch some malformatted comments. I ended up just running with:
```toml
condense_wildcard_suffixes = true
edition = "2021"
max_width = 100
normalize_comments = true
normalize_doc_attributes = true
reorder_impl_items = true
unstable_features = true
use_field_init_shorthand = true
```
Since these all seem like reasonable things to fix, so may as well while
I'm here.