## Summary
Completes the documentation for the ruleset, apart from four rules which
have contradictions, so need to be thought about more regarding how to
document that. Related to #2646.
## Test Plan
`python scripts/test_docs_formatted.py`
**Summary**
Updated doc comments for `missing_whitespace_around_operator.rs`. Online
docs also benefit from this update.
**Test Plan**
Checked docs via
[mkdocs](389fe13c93/CONTRIBUTING.md?plain=1#L267-L296)
## Summary
Completes the documentation for the one and only (current) rule in the
`flynt` ruleset. Related to #2646.
## Test Plan
`python scripts/test_docs_formatted.py`
## Summary
This PR stores the mapping from `ExprName` node to resolved `BindingId`,
which lets us skip scope lookups in `resolve_call_path`. It's enabled by
#6045, since that PR ensures that when we analyze a node (and thus call
`resolve_call_path`), we'll have already visited its `ExprName`
elements.
In more detail: imagine that we're traversing over `foo.bar()`. When we
read `foo`, it will be an `ExprName`, which we'll then resolve to a
binding via `handle_node_load`. With this change, we then store that
binding in a map. Later, if we call `collect_call_path` on `foo.bar`,
we'll identify `foo` (the "head" of the attribute) and grab the resolved
binding in that map. _Almost_ all names are now resolved in advance,
though it's not a strict requirement, and some rules break that pattern
(e.g., if we're analyzing arguments, and they need to inspect their
annotations, which are visited in a deferred manner).
This improves performance by 4-6% on the all-rules benchmark. It looks
like it hurts performance (1-2% drop) in the default-rules benchmark,
presumedly because those rules don't call `resolve_call_path` nearly as
much, and so we're paying for these extra writes.
Here's the benchmark data:
```
linter/default-rules/numpy/globals.py
time: [67.270 µs 67.380 µs 67.489 µs]
thrpt: [43.720 MiB/s 43.792 MiB/s 43.863 MiB/s]
change:
time: [+0.4747% +0.7752% +1.0626%] (p = 0.00 < 0.05)
thrpt: [-1.0514% -0.7693% -0.4724%]
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
linter/default-rules/pydantic/types.py
time: [1.4067 ms 1.4105 ms 1.4146 ms]
thrpt: [18.028 MiB/s 18.081 MiB/s 18.129 MiB/s]
change:
time: [+1.3152% +1.6953% +2.0414%] (p = 0.00 < 0.05)
thrpt: [-2.0006% -1.6671% -1.2981%]
Performance has regressed.
linter/default-rules/numpy/ctypeslib.py
time: [637.67 µs 638.96 µs 640.28 µs]
thrpt: [26.006 MiB/s 26.060 MiB/s 26.113 MiB/s]
change:
time: [+1.5859% +1.8109% +2.0353%] (p = 0.00 < 0.05)
thrpt: [-1.9947% -1.7787% -1.5611%]
Performance has regressed.
linter/default-rules/large/dataset.py
time: [3.2289 ms 3.2336 ms 3.2383 ms]
thrpt: [12.563 MiB/s 12.581 MiB/s 12.599 MiB/s]
change:
time: [+0.8029% +0.9898% +1.1740%] (p = 0.00 < 0.05)
thrpt: [-1.1604% -0.9801% -0.7965%]
Change within noise threshold.
linter/all-rules/numpy/globals.py
time: [134.05 µs 134.15 µs 134.26 µs]
thrpt: [21.977 MiB/s 21.995 MiB/s 22.012 MiB/s]
change:
time: [-4.4571% -4.1175% -3.8268%] (p = 0.00 < 0.05)
thrpt: [+3.9791% +4.2943% +4.6651%]
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
2 (2.00%) low mild
3 (3.00%) high mild
3 (3.00%) high severe
linter/all-rules/pydantic/types.py
time: [2.5627 ms 2.5669 ms 2.5720 ms]
thrpt: [9.9158 MiB/s 9.9354 MiB/s 9.9516 MiB/s]
change:
time: [-5.8304% -5.6374% -5.4452%] (p = 0.00 < 0.05)
thrpt: [+5.7587% +5.9742% +6.1914%]
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
6 (6.00%) high mild
1 (1.00%) high severe
linter/all-rules/numpy/ctypeslib.py
time: [1.3949 ms 1.3956 ms 1.3964 ms]
thrpt: [11.925 MiB/s 11.931 MiB/s 11.937 MiB/s]
change:
time: [-6.2496% -6.0856% -5.9293%] (p = 0.00 < 0.05)
thrpt: [+6.3030% +6.4799% +6.6662%]
Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
3 (3.00%) high mild
4 (4.00%) high severe
linter/all-rules/large/dataset.py
time: [5.5951 ms 5.6019 ms 5.6093 ms]
thrpt: [7.2527 MiB/s 7.2623 MiB/s 7.2711 MiB/s]
change:
time: [-5.1781% -4.9783% -4.8070%] (p = 0.00 < 0.05)
thrpt: [+5.0497% +5.2391% +5.4608%]
Performance has improved.
```
Still playing with this (the concepts need better names, documentation,
etc.), but opening up for feedback.
## Summary
Noticed in #5954: we walk _past_ the root rather than stopping _at_ the
root when attempting to traverse along the parent path. It's effectively
an off-by-one bug.
## Summary
Updated doc comment for `call_datetime_without_tzinfo.rs`. Online docs
also benefit from this update.
## Test Plan
Checked docs via
[mkdocs](389fe13c93/CONTRIBUTING.md?plain=1#L267-L296)
## Summary
Check for unused private `TypeVar`. See [original
implementation](2a86db8271/pyi.py (L1958)).
```
$ flake8 --select Y018 crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi
crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:4:1: Y018 TypeVar "_T" is not used
crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:5:1: Y018 TypeVar "_P" is not used
```
```
$ ./target/debug/ruff --select PYI018 crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi --no-cache
crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:4:1: PYI018 TypeVar `_T` is never used
crates/ruff/resources/test/fixtures/flake8_pyi/PYI018.pyi:5:1: PYI018 TypeVar `_P` is never used
Found 2 errors.
```
In the file `unused_private_type_declaration.rs`, I'm planning to add
other rules that are similar to `PYI018` like the `PYI046`, `PYI047` and
`PYI049`.
ref #848
## Test Plan
Snapshots and manual runs of flake8.
## Summary
This is a rewrite of the main comment placement logic. `place_comment`
now has three parts:
- place own line comments
- between branches
- after a branch
- place end-of-line comments
- after colon
- after a branch
- place comments for specific nodes (that include module level comments)
The rewrite fixed three bugs: `class A: # trailing comment` comments now
stay end-of-line, `try: # comment` remains end-of-line and deeply
indented try-else-finally comments remain with the right nested
statement.
It will be much easier to give more alternative branches nodes since
this is abstracted away by `is_node_with_body` and the first/last child
helpers. Adding new node types can now be done by adding an entry to the
`place_comment` match. The code went from 1526 lines before #6033 to
1213 lines now.
It thinks it easier to just read the new `placement.rs` rather than
reviewing the diff.
## Test Plan
The existing fixtures staying the same or improving plus new ones for
the bug fixes.
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
The `SIM102` auto-fix fails if `elif` is indented like this:
## Example
```python
def f():
# SIM102
if a:
pass
elif b:
if c:
d
```
```
> cargo run -p ruff_cli -- check --select SIM102 --fix a.py
...
error: Failed to fix nested if: Failed to extract statement from source
a.py:5:5: SIM102 Use a single `if` statement instead of nested `if` statements
Found 1 error.
```
## Test Plan
<!-- How was it tested? -->
New test
## Summary
This PR adds the implementation for the new Jupyter AST nodes i.e.,
`ExprLineMagic` and `StmtLineMagic`.
## Test Plan
Add test cases for `unparse` containing magic commands
resolves: #6087
## Summary
Updated doc comment for `tab_indentation.rs`. Online docs also benefit
from this update.
## Test Plan
Checked docs via
[mkdocs](389fe13c93/CONTRIBUTING.md?plain=1#L267-L296)
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Part of #5062
Requires https://github.com/astral-sh/RustPython-Parser/pull/32
Adds visitation of type alias statements and type parameters in class
and function definitions.
Duplicates tests for `PreorderVisitor` into `Visitor` with new
snapshots. Testing required node implementations for the `TypeParam`
enum, which is a chunk of the diff and the reason we need `Ranged`
implementations in
https://github.com/astral-sh/RustPython-Parser/pull/32.
## Test Plan
<!-- How was it tested? -->
Adds unit tests with snapshots.
Reimplements https://github.com/astral-sh/ruff/pull/3104
Closes https://github.com/astral-sh/ruff/issues/5726
Note that we will generate the hash for a cache key twice in normal
operation. Once to check for the cached item and again to update the
cache. We could optimize this by generating the hash once in
`diagnostics::lint_file` and passing the `u64` into `get` and `update`.
We'd probably want to wrap it in a `CacheKeyHash` enum for type safety.
## Test plan
Unit tests for Windows and Unix.
Manual test with case from issue
```
❯ touch fake.py
❯ chmod +x fake.py
❯ ./target/debug/ruff --select EXE fake.py
fake.py:1:1: EXE002 The file is executable but no shebang is present
Found 1 error.
❯ chmod -x fake.py
❯ ./target/debug/ruff --select EXE fake.py
```
## Summary
If a method is annotated with `@typing_extensions.override`, we should
avoid flagging A003 on it. This isn't part of the standard library yet,
but it's used to explicitly mark methods as overrides.
## Summary
If a user subclasses `threading.Event`, e.g. with:
```python
from threading import Event
class CustomEvent(Event):
def set(self) -> None:
...
```
They no control over the method name (`set`). This PR allows
`threading.Event#set` and `logging.Filter#filter` overrides, and avoids
flagging A003 in such cases. Ideally, we'd avoid flagging all overridden
methods, but... that's a lot more difficult, and this is at least
_better_ than what we do now.
Closes https://github.com/astral-sh/ruff/issues/6057.
Closes https://github.com/astral-sh/ruff/issues/5956.
## Summary
This PR modifies the order of operations in our AST checker. Previously,
we ran our analysis rules first, then bound names and traversed over the
subtrees. Now, after a series of refactors, we can invert the order: do
the subtree traversal and model-building _first_, then run rules.
The nice thing about this change is that when we go to analyze, e.g., a
function call node, we'll already have traversed any of the constituent
`Expr::Name` nodes... So if we store the resolution of all names when do
the traversal, we can avoid having to do any expensive work in
`resolve_call_path`.
## Test Plan
Clean run of the snapshot tests, and hopefully the ecosystem checks too!
## Summary
This PR is a refactoring of placement.rs. The code got more consistent,
some comments were updated and some dead code was removed or replaced
with debug assertions. It also contains a bugfix for the placement of
end-of-branch comments with nested bodies inside try statements that
occurred when refactoring the nested body loop.
## Test Plan
The existing test cases don't change. I added a couple of cases that i
think should be tested but weren't, and a regression test for the bugfix
## Summary
This PR ensures that we can retain the current behavior even after we
reorder the visitor a bit, by looking for annotated lambdas rather than
"is the name bound to anything?", since if we visit the name before we
run this rule, it'll _always_ be bound. (This check is already a bit
flawed -- in truth, we should probably run this rule deferred so that we
can reliably detect shadowing.)
## Summary
This PR attempts to draw some basic separation between the `Checker`'s
traversal responsibilities (traversing the AST, building the semantic
model) and its calling-out-to-lint-rule responsibilities. It doesn't try
to introduce any sophisticated API. Instead, it just moves all of the
lint rule calls out of `checkers/ast/mod.rs` and into methods in a new
`analyze` module. (There are four remaining lint rules in `Checker`, but
I'll remove those in future PRs.)
I'm not trying to "solve" our lint rule API here. Instead, I'm trying to
make two improvements:
1. `checkers/ast/mod.rs` has just gotten way too large, and people work
in it all the time. Prior to this PR, it was 5.5k lines, which led to
significant lags in my editor and made it really hard to reason about
the parts that are _actually_ important. (I like big files, but this one
crossed the line for me.) Now, it's < 2,000 lines, and the code is much
more focused.
2. I want to avoid accidentally adding lint rules in the "wrong" parts
of the traversal. By confining lint rule invocations to these "analyze"
calls, we'll avoid (e.g.) putting them in the binding phase.
## Summary
These only need the token stream, and we always prefer token-based to
physical line-based rules.
There are a few other changes snuck in here:
- Renaming the rule files to match the diagnostic names (likely an
error).
- The "leading whitespace before shebang" rule now works regardless of
where the comment occurs (i.e., if the shebang is on the second line,
and the first line is blank, we flag and remove that leading
whitespace).
**Summary** Fix an instability in with statement formatter when there is
an own line comment as the `as`
```python
with (
a as
# bad comment
b):
```
**Test Plan** Added the comment to the test cases.
## Summary
Closes https://github.com/astral-sh/ruff/issues/6025 (which contains a
more thorough description of the issue). Previously, the `# noqa` here
was being marked as unused, but removing it raised `SIM114`:
```python
def foo():
a = True
b = False
if a > b: # noqa: SIM114
return 3
elif a == b:
return 3
```
**Summary** Add a formatter progress testing script to CI. This script
will 1) print the black compability on each run 2) catch regressions wrt
to formatter stability, emitting invalid syntax and other kinds of bugs
(e.g. #5917) before they land on main 3) have an additional layer of
real world tests when implementing new nodes or other new formatter
code.
This is currently a bash script, i'm not sure if we want to keep it that
way, or switch to e.g. the regular ecosystem scripts. The output
separation of `format_dev` could also use some polishing. We should also
consider pinning commits so we don't get spurious regression when they
change their code.
**Test Plan** The script extends CI.
## Summary
My intuition is that it's faster to do these checks as-needed rather
than allocation new hash maps and vectors for the arguments. (We
typically only query once anyway.)
## Summary
This PR adds a `logger-objects` setting that allows users to mark
specific symbols a `logging.Logger` objects. Currently, if a `logger` is
imported, we only flagged it as a `logging.Logger` if it comes exactly
from the `logging` module or is `flask.current_app.logger`.
This PR allows users to mark specific loggers, like
`logging_setup.logger`, to ensure that they're covered by the
`flake8-logging-format` rules and others.
For example, if you have a module `logging_setup.py` with the following
contents:
```python
import logging
logger = logging.getLogger(__name__)
```
Adding `"logging_setup.logger"` to `logger-objects` will ensure that
`logging_setup.logger` is treated as a `logging.Logger` object when
imported from other modules (e.g., `from logging_setup import logger`).
Closes https://github.com/astral-sh/ruff/issues/5694.
I don't know whether we want to make this change but here's some data...
Binary size:
- `main`: 30,384
- `charlie/match-phf`: 30,416
llvm-lines:
- `main`: 1,784,148
- `charlie/match-phf`: 1,789,877
llvm-lines and binary size are both unchanged (or, by < 5) when moving
from `u8` to `u32` return types, and even when moving to `char` keys and
values. I didn't expect this, but I'm not very knowledgable on this
topic.
Performance:
```
Confusables/match/src time: [4.9102 µs 4.9352 µs 4.9777 µs]
change: [+1.7469% +2.2421% +2.8710%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
2 (2.00%) low mild
4 (4.00%) high mild
6 (6.00%) high severe
Confusables/match-with-skip/src
time: [2.0676 µs 2.0945 µs 2.1317 µs]
change: [+0.9384% +1.6000% +2.3920%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
3 (3.00%) high mild
5 (5.00%) high severe
Confusables/phf/src time: [31.087 µs 31.188 µs 31.305 µs]
change: [+1.9262% +2.2188% +2.5496%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
3 (3.00%) low mild
6 (6.00%) high mild
6 (6.00%) high severe
Confusables/phf-with-skip/src
time: [2.0470 µs 2.0486 µs 2.0502 µs]
change: [-0.3093% -0.1446% +0.0106%] (p = 0.08 > 0.05)
No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) high mild
2 (2.00%) high severe
```
The `-with-skip` variants add our optimization which first checks
whether the character is ASCII. So `match` is way, way faster than PHF,
but it tends not to matter since almost all source code is ASCII anyway.
## Summary
This pull request add supports for `int`, `float` and `bool` types in
`UP018`
rule to convert empty call to the default value of the type or remove
the call
if a value of the same type is provided as an argument.
## Test Plan
Added tests for `int`, `float` and `bool` types.
Partially resolves#5988
**Summary** Add a `EmptyWithDanglingComments` format helper that formats
comments inside empty parentheses, brackets or curly braces. Previously,
this was implemented separately, and partially incorrectly, for each use
case.
Empty `()`, `[]` and `{}` are special because there can be dangling
comments, and they can be in
two positions:
```python
x = [ # end-of-line
# own line
]
```
These comments are dangling because they can't be assigned to any
element inside as they would
in all other cases.
**Test Plan** Added a regression test.
145 (from previously 149) instances of unstable formatting remaining.
```
$ cargo run --bin ruff_dev --release -- format-dev --stability-check --error-file formatter-ecosystem-errors.txt --multi-project target/checkouts > formatter-ecosystem-progress.txt
$ rg "Unstable formatting" target/formatter-ecosystem-errors.txt | wc -l
145
```
## Summary
This is equivalent for a single flag, but I think it's more likely to be
correct when the bitflags are modified -- the primary reason being that
we sometimes define flags as the union of other flags, e.g.:
```rust
const ANNOTATION = Self::TYPING_ONLY_ANNOTATION.bits() | Self::RUNTIME_ANNOTATION.bits();
```
In this case, `flags.contains(Flag::ANNOTATION)` requires that _both_
flags in the union are set, whereas `flags.intersects(Flag::ANNOTATION)`
requires that _at least one_ flag is set.
## Summary
Use the `find_keyword` helper function instead of reimplementing it.
Follows on from #5983 by doing a different search.
## Test Plan
`cargo test`
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
F507 should not be raised when the right-hand side value is a non-tuple
object.
```python
'%s' % (1, 2, 3) # throws
'%s' % [1, 2, 3] # doesn't throw
'%s' % {1, 2, 3} # doesn't throw
```
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Fix a regression introduced by
https://github.com/astral-sh/ruff/pull/5638. A multiline expression
can't be safely inserted into a format field.
### Example
```
> cat a.py
"{}".format(
[
1,
2,
3,
]
)
> cargo run -p ruff_cli -- check a.py --no-cache --select UP032 --fix
Finished dev [unoptimized + debuginfo] target(s) in 0.07s
Running `target/debug/ruff check a.py --no-cache --select UP032 --fix`
error: Autofix introduced a syntax error in `a.py` with rule codes UP032: EOL while scanning string literal at byte offset 5
---
f"{[
1,
2,
3,
]}"
---
a.py:1:1: UP032 Use f-string instead of `format` call
Found 1 error.
```
## Test Plan
New test cases
## Summary
Checks that `append`, `extend` and `remove` methods are not called on
`__all__`. See [original
implementation](2a86db8271/pyi.py (L1133-L1138)).
```
$ flake8 --select Y026 crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi
crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi:3:1: Y056 Calling ".append()" on "__all__" may not be supported by all type checkers (use += instead)
crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi:4:1: Y056 Calling ".extend()" on "__all__" may not be supported by all type checkers (use += instead)
crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi:5:1: Y056 Calling ".remove()" on "__all__" may not be supported by all type checkers (use += instead)
```
```
$ ./target/debug/ruff --select PYI026 crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi --no-cache
crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi:3:1: PYI056 Calling ".append()" on "__all__" may not be supported by all type checkers (use += instead)
crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi:4:1: PYI056 Calling ".extend()" on "__all__" may not be supported by all type checkers (use += instead)
crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi:5:1: PYI056 Calling ".remove()" on "__all__" may not be supported by all type checkers (use += instead)
Found 3 errors.
```
ref #848
## Test Plan
Snapshots and manual runs of flake8.
## Summary
These are skipped as an optimization, but it feels kind of unnecessary
and makes the code a bit more confusing than is worthwhile.
(non-`strict` is also by far the more popular setting, and the default.)
## Summary
I ran into this in the wild. It looks like Ruff will collapse the `else`
and `elif` branches here (i.e., it doesn't recognize that they're too
independent import blocks):
```python
if "sdist" in cmds:
_sdist = cmds["sdist"]
elif "setuptools" in sys.modules:
from setuptools.command.sdist import sdist as _sdist
else:
from setuptools.command.sdist import sdist as _sdist
from distutils.command.sdist import sdist as _sdist
```
Likely fallout from the `elif_else_branches` refactor.
**Summary** Fix implemented in
https://github.com/astral-sh/RustPython-Parser/pull/35: Previously,
empty lambda arguments (e.g. `lambda: 1`) would get the range of the
entire expression, which leads to incorrect comment placement. Now empty
lambda arguments get an empty range between the `lambda` and the `:`
tokens.
**Test Plan** Added a regression test.
149 instances of unstable formatting remaining.
```
$ cargo run --bin ruff_dev --release -- format-dev --stability-check --error-file formatter-ecosystem-errors.txt --multi-project target/checkouts > formatter-ecosystem-progress.txt
$ rg "Unstable formatting" target/formatter-ecosystem-errors.txt | wc -l
149
```
**Summary** Add script to shrink all formatter errors: This started as a
fun idea and turned out really useful: This script gives us a single
Python file with all formatter stability errors. I want to keep it
around to occasionally update #5828 so I added it to the git.
**Test Plan** None, this is a helper script
## Summary
**Don't minimize files that don't match in the first place** This adds a
sanity check to the minimizer script that the
input matches the condition (e.g. unstable formatting). Otherwise we run
through all checks with the whole file, which is extremely slow. It's
more reasonable for downstream usage to write an empty string to the
output file instead.
## Summary
Allow `respect_gitignore` even when not in a git repo
## Test Plan
Within the Ruff repository:
1. Renamed `.git` to `.hello-world`
2. Added `test.py` in root folder
3. Added `test.py` to `.gitignore`
4. Ran `cargo run --bin ruff -- check --no-cache --isolated --show-files
.` with
and without `--respect-gitignore` flag
fixes: #5930
## Summary
We now allow RUF015 to fix cases like:
```python
list(range(10))[0]
list(x.y)[0]
list(x["y"])[0]
```
Further, we fix generators like:
```python
[i + 1 for i in x][0]
```
By rewriting to `next(iter(i + 1 for i in x))`.
I've retained the special-case that rewrites `[i for i in x][0]` to
`next(iter(x))`.
Closes https://github.com/astral-sh/ruff/issues/5764.
## Summary
Similar to #5852 and a bunch of related PRs -- trying to move rules that
rely on point-in-time semantic analysis to _after_ the semantic model
building.
## Summary
Implements `PYI017` or `Y017` from `flake8-pyi` plug-in. Mirrors
[upstream
implementation](ceab86d16b/pyi.py (L1039-L1048)).
It checks for any assignment with more than 1 target or an assignment to
anything other than a name, and raises a violation for these in stub
files.
Couldn't find a clear and concise explanation for why this is to be
avoided and what is preferred for attribute cases like:
```python
a.b = int
```
So welcome some input there, to learn and to finish up the docs.
## Test Plan
Added test cases from upstream plug-in in a fixture (both `.py` and
`.pyi`). Added a few more.
## Issue link
Refers: https://github.com/astral-sh/ruff/issues/848
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
- Remove space when start of slice is empty
- Treat unary op except `not` as simple expression
## Test Plan
Add some simple tests for unary op expressions in slice
Closes#5673
This shrinks a good bit more than previously, which was helpful for all
the formatter bugs. fwiw i treat this as a very ad-hoc script since it's
mainly my ecosystem bug processing companion.
## Summary
It can happen that we can't read a file (a python file, a jupyter
notebook or pyproject.toml), which needs to be handled and handled
consistently for all file types. Instead of using `Err` or `error!`, we
emit E602 with the io error as message and continue. This PR makes sure
we handle all three cases consistently, emit E602.
I'm not convinced that it should be possible to disable io errors, but
we now handle the regular case consistently and at least print warning
consistently.
I went with `warn!` but i can change them all to `error!`, too.
It also checks the error case when a pyproject.toml is not readable. The
error message is not very helpful, but it's now a bit clearer that
actually ruff itself failed instead vs this being a diagnostic.
## Examples
This is how an Err of `run` looks now:

With an unreadable file and `IOError` disabled:

(we lint zero files but count files before linting not during so we exit
0)
I'm not sure if it should (or if we should take a different path with
manual ExitStatus), but this currently also triggers when `files` is
empty:

## Test Plan
Unix only: Create a temporary directory with files with permissions
`000` (not readable by the owner) and run on that directory. Since this
breaks the assumptions of most of the test code (single file, `ruff`
instead of `ruff_cli`), the test code is rather cumbersome and looks a
bit misplaced; i'm happy about suggestions to fit it in closer with the
other tests or streamline it in other ways. I added another test for
when the entire directory is not readable.
## Summary
Completes documentation for the `flake8-fixme` (`FIX`) ruleset. Related
to #2646.
Tweaks the violation message. For example,
```
FIX001 Line contains FIXME
```
becomes
```
FIX001 Line contains FIXME, consider resolving the issue
```
This is because the previous message was unclear if it was warning
against the use of FIXME tags per se, or the code the FIXME tag was
annotating.
## Test Plan
`cargo test && python scripts/check_docs_formatted.py`
## Summary
Checks for `typehint.TypeAlias` annotation in type aliases. See
[original
source](https://github.com/PyCQA/flake8-pyi/blob/main/pyi.py#L1085).
```
$ flake8 --select Y026 crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:4:1: Y026 Use typing_extensions.TypeAlias for type aliases, e.g. "NewAny: TypeAlias = Any"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:5:1: Y026 Use typing_extensions.TypeAlias for type aliases, e.g. "OptinalStr: TypeAlias = typing.Optional[str]"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:6:1: Y026 Use typing_extensions.TypeAlias for type aliases, e.g. "Foo: TypeAlias = Literal['foo']"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:7:1: Y026 Use typing_extensions.TypeAlias for type aliases, e.g. "IntOrStr: TypeAlias = int | str"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:8:1: Y026 Use typing_extensions.TypeAlias for type aliases, e.g. "AliasNone: TypeAlias = None"
```
```
$ ./target/debug/ruff --select PYI026 crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi --no-cache
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:4:1: PYI026 Use `typing.TypeAlias` for type aliases in `NewAny`, e.g. "NewAny: typing.TypeAlias = Any"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:5:1: PYI026 Use `typing.TypeAlias` for type aliases in `OptinalStr`, e.g. "OptinalStr: typing.TypeAlias = typing.Optional[str]"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:6:1: PYI026 Use `typing.TypeAlias` for type aliases in `Foo`, e.g. "Foo: typing.TypeAlias = Literal["foo"]"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:7:1: PYI026 Use `typing.TypeAlias` for type aliases in `IntOrStr`, e.g. "IntOrStr: typing.TypeAlias = int | str"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:8:1: PYI026 Use `typing.TypeAlias` for type aliases in `AliasNone`, e.g. "AliasNone: typing.TypeAlias = None"
Found 5 errors.
```
ref: #848
## Test Plan
Snapshots, manual runs of flake8.
## Summary
As part of my continued quest to separate semantic model-building from
diagnostic emission, this PR moves our unresolved-reference rules to a
deferred pass. So, rather than emitting diagnostics as we encounter
unresolved references, we now track those unresolved references on the
semantic model (just like resolved references), and after traversal,
emit the relevant rules for any unresolved references.
## Summary
Add known problems to `compare-to-empty-string` documentation. Related
to #5873.
Tweaked the example in the documentation to be a tad more concise and
correct (that the rule is most applicable when comparing to a `str`
variable).
## Test Plan
`python scripts/check_docs_formatted.py`
## Summary
This PR moves two rules (`invalid-all-format` and `invalid-all-object`)
out of the name-binding phase, and into the dedicated pass over all
bindings that occurs at the end of the `Checker`. This is part of my
continued quest to separate the semantic model-building logic from the
actual rule enforcement.
**Summary** Previously, `RUF014` would be part of ruff.schema.json
depending on whether or not the `unreachable-code` feature was active.
This caused problems for contributors who got unrelated RUF014 changes
when updating the schema without the feature active.
An alternative would be to always add `RUF014`.
**Test plan** `cargo dev generate-all` and `cargo run --bin ruff_dev
--features unreachable-code -- generate-all` now have the same effect.
## Summary
This crate now contains utilities for dealing with trivia more broadly:
whitespace, newlines, "simple" trivia lexing, etc. So renaming it to
reflect its increased responsibilities.
To avoid conflicts, I've also renamed `Token` and `TokenKind` to
`SimpleToken` and `SimpleTokenKind`.
## Summary
The vector of names here is immutable -- we never push to it after
initialization. Boxing reduces the size of the variant from 32 bytes to
24 bytes. (See:
https://nnethercote.github.io/perf-book/type-sizes.html#boxed-slices.)
It doesn't make a difference here, since it's not the largest variant,
but it still seems like a prudent change (and I was considering adding
another field to this variant, though I may no longer do so).
**Summary** This replaces the `todo!()` with a type alias stub in the
formatter. I added the tests from
704eb40108/parser/src/parser.rs (L901-L936)
as ruff python formatter tests.
**Test Plan** None, testing is part of the actual implementation
**Summary** Fix the formatter crash with `x[(1) :: ]` and related code.
**Problem** For assigning comments in slices in subscripts, we need to
find the positions of the colons to assign comments before and after the
colon to the respective lower/upper/step node (or dangling in that
section). Formatting `x[(1) :: ]` was broken because we were looking for
a `:` after the `1` but didn't consider that there could be a `)`
outside the range of the lower node, which contains just the `1` and no
optional parentheses.
**Solution** Use the simple tokenizer directly and skip all closing
parentheses.
**Test Plan** I added regression tests.
Closes#5733
**Summary** Add a static string error message to the formatter syntax
error so we can disambiguate where the syntax error came from
**Test Plan** No fixed tests, we don't expect this to occur, but it
helped with transformers syntax error debugging:
```
Error: Failed to format node
Caused by:
syntax error: slice first colon token was not a colon
```
## Summary
No behavior change, but this is in theory more efficient, since we can
just iterate over the flat `Binding` vector rather than having to
iterate over binding chains via the `Scope`.
## Summary
This PR moves the "unused exception" rule out of the visitor and into a
deferred check. When we can base rules solely on the semantic model, we
probably should, as it greatly simplifies the `Checker` itself.
## Summary
The `SemanticModel` currently stores the "body" of a given `Suite`,
along with the current statement index. This is used to support "next
sibling" queries, but we only use this in exactly one place -- the rule
that simplifies constructs like this to `any` or `all`:
```python
for x in y:
if x == 0:
return True
return False
```
Instead of tracking the state, we can just do a (slightly more
expensive) traversal, by finding the node within its parent and
returning the next node in the body.
Note that we'll only have to do this extremely rarely -- namely, for
functions that contain something like:
```python
for x in y:
if x == 0:
return True
```
## Summary
The motivation here is that it will make this rule easier to rewrite as
a deferred check. Right now, we can't run this rule in the deferred
phase, because it depends on the `except_handler` to power its autofix.
Instead of lexing the `except_handler`, we can use the `SimpleTokenizer`
from the formatter, and just lex forwards and backwards.
For context, this rule detects the unused `e` in:
```python
try:
pass
except ValueError as e:
pass
```
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Resolve#5854
## Test Plan
<!-- How was it tested? -->
New test cases
---------
Co-authored-by: konsti <konstin@mailbox.org>
## Summary
This PR just naively unrolls `collect_call_path` to handle attribute
resolutions of up to eight segments. In profiling via Instruments, it
seems to be about 4x faster for a very hot code path (4% of total
execution time on `main`, 1% here).
Profiling by running `RAYON_NUM_THREADS=1 cargo instruments -t time
--profile release-debug --time-limit 10000 -p ruff_cli -o
FromSlice.trace -- check crates/ruff/resources/test/cpython --silent -e
--no-cache --select ALL`, and modifying the linter to loop infinitely up
to the specified time (10 seconds) to increase sample size.
Before:
<img width="1792" alt="Screen Shot 2023-07-15 at 5 13 34 PM"
src="https://github.com/astral-sh/ruff/assets/1309177/4a8b0b45-8b67-43e9-af5e-65b326928a8e">
After:
<img width="1792" alt="Screen Shot 2023-07-15 at 8 38 51 PM"
src="https://github.com/astral-sh/ruff/assets/1309177/d8829159-2c79-4a49-ab3c-9e4e86f5b2b1">
## Summary
Before:
```
» ruff litestar tests --fix
warning: Invalid `# noqa` directive on line 19: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on line 65: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on line 74: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on line 22: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on line 66: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on line 75: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
```
After:
```
» cargo run --bin ruff ../litestar/litestar ../litestar/tests
Finished dev [unoptimized + debuginfo] target(s) in 0.15s
Running `target/debug/ruff ../litestar/litestar ../litestar/tests`
warning: Detected debug build without --no-cache.
warning: Invalid `# noqa` directive on /Users/sobolev/Desktop/litestar/tests/unit/test_contrib/test_sqlalchemy/models_bigint.py:19: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on /Users/sobolev/Desktop/litestar/tests/unit/test_contrib/test_sqlalchemy/models_bigint.py:65: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on /Users/sobolev/Desktop/litestar/tests/unit/test_contrib/test_sqlalchemy/models_bigint.py:74: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on /Users/sobolev/Desktop/litestar/tests/unit/test_contrib/test_sqlalchemy/models_uuid.py:22: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on /Users/sobolev/Desktop/litestar/tests/unit/test_contrib/test_sqlalchemy/models_uuid.py:66: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on /Users/sobolev/Desktop/litestar/tests/unit/test_contrib/test_sqlalchemy/models_uuid.py:75: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
```
## Test Plan
I didn't find any existing tests with this warning.
Closes https://github.com/astral-sh/ruff/issues/5855
## Summary
Previously, `StmtIf` was defined recursively as
```rust
pub struct StmtIf {
pub range: TextRange,
pub test: Box<Expr>,
pub body: Vec<Stmt>,
pub orelse: Vec<Stmt>,
}
```
Every `elif` was represented as an `orelse` with a single `StmtIf`. This
means that this representation couldn't differentiate between
```python
if cond1:
x = 1
else:
if cond2:
x = 2
```
and
```python
if cond1:
x = 1
elif cond2:
x = 2
```
It also makes many checks harder than they need to be because we have to
recurse just to iterate over an entire if-elif-else and because we're
lacking nodes and ranges on the `elif` and `else` branches.
We change the representation to a flat
```rust
pub struct StmtIf {
pub range: TextRange,
pub test: Box<Expr>,
pub body: Vec<Stmt>,
pub elif_else_clauses: Vec<ElifElseClause>,
}
pub struct ElifElseClause {
pub range: TextRange,
pub test: Option<Expr>,
pub body: Vec<Stmt>,
}
```
where `test: Some(_)` represents an `elif` and `test: None` an else.
This representation is different tradeoff, e.g. we need to allocate the
`Vec<ElifElseClause>`, the `elif`s are now different than the `if`s
(which matters in rules where want to check both `if`s and `elif`s) and
the type system doesn't guarantee that the `test: None` else is actually
last. We're also now a bit more inconsistent since all other `else`,
those from `for`, `while` and `try`, still don't have nodes. With the
new representation some things became easier, e.g. finding the `elif`
token (we can use the start of the `ElifElseClause`) and formatting
comments for if-elif-else (no more dangling comments splitting, we only
have to insert the dangling comment after the colon manually and set
`leading_alternate_branch_comments`, everything else is taken of by
having nodes for each branch and the usual placement.rs fixups).
## Merge Plan
This PR requires coordination between the parser repo and the main ruff
repo. I've split the ruff part, into two stacked PRs which have to be
merged together (only the second one fixes all tests), the first for the
formatter to be reviewed by @michareiser and the second for the linter
to be reviewed by @charliermarsh.
* MH: Review and merge
https://github.com/astral-sh/RustPython-Parser/pull/20
* MH: Review and merge or move later in stack
https://github.com/astral-sh/RustPython-Parser/pull/21
* MH: Review and approve
https://github.com/astral-sh/RustPython-Parser/pull/22
* MH: Review and approve formatter PR
https://github.com/astral-sh/ruff/pull/5459
* CM: Review and approve linter PR
https://github.com/astral-sh/ruff/pull/5460
* Merge linter PR in formatter PR, fix ecosystem checks (ecosystem
checks can't run on the formatter PR and won't run on the linter PR, so
we need to merge them first)
* Merge https://github.com/astral-sh/RustPython-Parser/pull/22
* Create tag in the parser, update linter+formatter PR
* Merge linter+formatter PR https://github.com/astral-sh/ruff/pull/5459
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
Originally `join_with` was used in the formatters README.md. Now it uses
```rs
f.join_comma_separated(item.end())
.nodes(elts.iter())
.finish()
```
## Test Plan
None
## Summary
For formatter instabilities, the message we get look something like
this:
```text
Unstable formatting /home/konsti/ruff/target/checkouts/deepmodeling:dpdispatcher/dpdispatcher/slurm.py
@@ -47,9 +47,9 @@
- script_header_dict["slurm_partition_line"] = (
- NOT_YET_IMPLEMENTED_ExprJoinedStr
- )
+ script_header_dict[
+ "slurm_partition_line"
+ ] = NOT_YET_IMPLEMENTED_ExprJoinedStr
Unstable formatting /home/konsti/ruff/target/checkouts/deepmodeling:dpdispatcher/dpdispatcher/pbs.py
@@ -26,9 +26,9 @@
- pbs_script_header_dict["select_node_line"] += (
- NOT_YET_IMPLEMENTED_ExprJoinedStr
- )
+ pbs_script_header_dict[
+ "select_node_line"
+ ] += NOT_YET_IMPLEMENTED_ExprJoinedStr
```
For ruff crashes. you don't even get that but just the file that crashed
it. To extract the actual bug, you'd need to manually remove parts of
the file, rerun to see if the bug still occurs (and revert if it
doesn't) until you have a minimal example.
With this script, you run
```shell
cargo run --bin ruff_shrinking -- target/checkouts/deepmodeling:dpdispatcher/dpdispatcher/slurm.py target/minirepo/code.py "Unstable formatting" "target/debug/ruff_dev format-dev --stability-check target/minirepo"
```
and get
```python
class Slurm():
def gen_script_header(self, job):
if resources.queue_name != "":
script_header_dict["slurm_partition_line"] = f"#SBATCH --partition {resources.queue_name}"
```
which is an nice minimal example.
I've been using this script and it would be easier for me if this were
part of main. The main disadvantage to merging is that it adds
additional dependencies.
## Test Plan
I've been using this for a number of minimization. This is an internal
helper script you only run manually. I could add a test that minimizes a
rule violation if required.
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Fixes#5739
## Test Plan
<!-- How was it tested? -->
Manually tested:
```sh
$ tree dir
dir
├── dir.py
│ └── file.py
└── file.py
1 directory, 2 files
$ cargo run -p ruff_cli -- check dir --no-cache
Finished dev [unoptimized + debuginfo] target(s) in 0.08s
Running `target/debug/ruff check dir --no-cache`
dir/dir.py/file.py:1:7: F821 Undefined name `a`
dir/file.py:1:7: F821 Undefined name `a`
Found 2 errors.
```
Is a unit test needed?
## Summary
This PR does some non-behavior-changing refactoring of the AST checker.
Specifically, it breaks the `Stmt`, `Expr`, and `ExceptHandler` visitors
into four distinct, consistent phases:
1. **Phase 1: Analysis**: Run any lint rules on the node.
2. **Phase 2: Binding**: Bind any symbols declared by the node.
3. **Phase 3: Recursion**: Visit all child nodes.
4. **Phase 4: Clean-up**: Pop scopes, etc.
There are some fuzzy boundaries in the last three phases, but the most
important divide is between the Phase 1 and all the others -- the goal
here is (as much as possible) to disentangle all of the vanilla
lint-rule calls from any other semantic analysis or model building.
Part of the motivation here is that I'm considering re-ordering some of
these phases, and it was just impossible to reason about that change as
long as we had miscellaneous binding-creation and scope-modification
code intermingled with lint rules. However, this could also enable us to
(e.g.) move the entire analysis phase elsewhere, and even with a more
limited API that has read-only access to `Checker` (but can push to a
diagnostics vector).
## Summary
Comparing repos with black requires that we use the settings as black,
notably line length and magic trailing comma behaviour. Excludes and
preserving quotes (vs. a preference for either quote style) is not yet
implemented because they weren't needed for the test projects.
In the other two commits i fixed the output when the progress bar is
hidden (this way is recommonded in the indicatif docs), added a
`scratch.pyi` file to gitignore because black formats stub files
differently and also updated the ecosystem readme with the projects json
without forks.
## Test Plan
I added a `line-length` vs `line_length` test. Otherwise only my
personal usage atm, a PR to integrate the script into the CI to check
some projects will follow.
## Summary
Closes#5628 by only emitting if `sep=","`. Includes documentation
(completes the `pandas-vet` ruleset).
Related to #2646.
## Test Plan
`cargo test`
Moves the computation of the `start_offset` for overlong lines to just
before the result is returned. There is a slight overhead for overlong
lines (double the work for the first `limit` characters).
In practice this results in a speedup on the CPython codebase. Most
lines are not overlong, or are not enforced because the line ends with a
URL, or does not contain whitespace. Nonetheless, the 0.3% of overlong
lines are a lot compared to other violations.
### Before

_Selected W505 and E501_

_All rules_
### After

_Selected W505 and E501_

_All rules_
CPython line statistics:
- Number of Python lines: 867.696
- Number of overlong lines: 2.963 (0.3%)
<details>
Benchmark selected:
```shell
cargo build --release && hyperfine --warmup 10 --min-runs 50 \
"./target/release/ruff ./crates/ruff/resources/test/cpython/ --no-cache -e --select W505,E501"
```
Benchmark all:
```shell
cargo build --release && hyperfine --warmup 10 --min-runs 50 \
"./target/release/ruff ./crates/ruff/resources/test/cpython/ --no-cache -e --select ALL"
```
Overlong lines in CPython
```shell
cargo run -p ruff_cli -- check crates/ruff/resources/test/cpython/Lib --no-cache --select=E501,W505 --statistics
```
Total Python lines:
```shell
find crates/ruff/resources/test/cpython/ -name '*.py' | xargs wc -l
```
</details>
(Performance tested on Mac M1)
## Summary
The motivating change here is to remove `let range =
except_handler.try_identifier().unwrap();` and instead just do
`name.range()`, since exception names now have ranges attached to them
by the parse. This also required some refactors (which are improvements)
to the built-in attribute shadowing rules, since at least one invocation
relied on passing in the exception handler and calling
`.try_identifier()`. Now that we have easy access to identifiers, we can
remove the whole `AnyShadowing` abstraction.
## Summary
This is more similar to how these flags work in other contexts (e.g.,
`visit_annotation`), and also ensures that we unset it prior to visit
the `orelse` and `finalbody` (a subtle bug).
## Summary
The intent of this rule is to always flag the `global` declaration, not
the usage. The current implementation does the wrong thing if a global
is assigned multiple times. Using `semantic().global()` is also more
efficient.
## Summary
Adds autofix for `hasattr` case of B004. I don't think it's safe (or
simple) to implement it for the `getattr` case because, inter alia,
calling `getattr` may have side effects.
Fixes#3545
## Test Plan
Existing tests were sufficient. Updated snapshots
## Summary
I'm doing some unrelated profiling, and I noticed that this method is
actually measurable on the CPython benchmark -- it's > 1% of execution
time. We don't need to lex here, we already know the ranges of all
comments, so we can just do a simple binary search for overlap, which
brings the method down to 0%.
## Test Plan
`cargo test`