## Summary
This pull request add supports for `int`, `float` and `bool` types in
`UP018`
rule to convert empty call to the default value of the type or remove
the call
if a value of the same type is provided as an argument.
## Test Plan
Added tests for `int`, `float` and `bool` types.
Partially resolves#5988
**Summary** Add a `EmptyWithDanglingComments` format helper that formats
comments inside empty parentheses, brackets or curly braces. Previously,
this was implemented separately, and partially incorrectly, for each use
case.
Empty `()`, `[]` and `{}` are special because there can be dangling
comments, and they can be in
two positions:
```python
x = [ # end-of-line
# own line
]
```
These comments are dangling because they can't be assigned to any
element inside as they would
in all other cases.
**Test Plan** Added a regression test.
145 (from previously 149) instances of unstable formatting remaining.
```
$ cargo run --bin ruff_dev --release -- format-dev --stability-check --error-file formatter-ecosystem-errors.txt --multi-project target/checkouts > formatter-ecosystem-progress.txt
$ rg "Unstable formatting" target/formatter-ecosystem-errors.txt | wc -l
145
```
## Summary
This is equivalent for a single flag, but I think it's more likely to be
correct when the bitflags are modified -- the primary reason being that
we sometimes define flags as the union of other flags, e.g.:
```rust
const ANNOTATION = Self::TYPING_ONLY_ANNOTATION.bits() | Self::RUNTIME_ANNOTATION.bits();
```
In this case, `flags.contains(Flag::ANNOTATION)` requires that _both_
flags in the union are set, whereas `flags.intersects(Flag::ANNOTATION)`
requires that _at least one_ flag is set.
## Summary
Use the `find_keyword` helper function instead of reimplementing it.
Follows on from #5983 by doing a different search.
## Test Plan
`cargo test`
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
F507 should not be raised when the right-hand side value is a non-tuple
object.
```python
'%s' % (1, 2, 3) # throws
'%s' % [1, 2, 3] # doesn't throw
'%s' % {1, 2, 3} # doesn't throw
```
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Fix a regression introduced by
https://github.com/astral-sh/ruff/pull/5638. A multiline expression
can't be safely inserted into a format field.
### Example
```
> cat a.py
"{}".format(
[
1,
2,
3,
]
)
> cargo run -p ruff_cli -- check a.py --no-cache --select UP032 --fix
Finished dev [unoptimized + debuginfo] target(s) in 0.07s
Running `target/debug/ruff check a.py --no-cache --select UP032 --fix`
error: Autofix introduced a syntax error in `a.py` with rule codes UP032: EOL while scanning string literal at byte offset 5
---
f"{[
1,
2,
3,
]}"
---
a.py:1:1: UP032 Use f-string instead of `format` call
Found 1 error.
```
## Test Plan
New test cases
## Summary
Checks that `append`, `extend` and `remove` methods are not called on
`__all__`. See [original
implementation](2a86db8271/pyi.py (L1133-L1138)).
```
$ flake8 --select Y026 crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi
crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi:3:1: Y056 Calling ".append()" on "__all__" may not be supported by all type checkers (use += instead)
crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi:4:1: Y056 Calling ".extend()" on "__all__" may not be supported by all type checkers (use += instead)
crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi:5:1: Y056 Calling ".remove()" on "__all__" may not be supported by all type checkers (use += instead)
```
```
$ ./target/debug/ruff --select PYI026 crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi --no-cache
crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi:3:1: PYI056 Calling ".append()" on "__all__" may not be supported by all type checkers (use += instead)
crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi:4:1: PYI056 Calling ".extend()" on "__all__" may not be supported by all type checkers (use += instead)
crates/ruff/resources/test/fixtures/flake8_pyi/PYI056.pyi:5:1: PYI056 Calling ".remove()" on "__all__" may not be supported by all type checkers (use += instead)
Found 3 errors.
```
ref #848
## Test Plan
Snapshots and manual runs of flake8.
## Summary
These are skipped as an optimization, but it feels kind of unnecessary
and makes the code a bit more confusing than is worthwhile.
(non-`strict` is also by far the more popular setting, and the default.)
## Summary
I ran into this in the wild. It looks like Ruff will collapse the `else`
and `elif` branches here (i.e., it doesn't recognize that they're too
independent import blocks):
```python
if "sdist" in cmds:
_sdist = cmds["sdist"]
elif "setuptools" in sys.modules:
from setuptools.command.sdist import sdist as _sdist
else:
from setuptools.command.sdist import sdist as _sdist
from distutils.command.sdist import sdist as _sdist
```
Likely fallout from the `elif_else_branches` refactor.
**Summary** Fix implemented in
https://github.com/astral-sh/RustPython-Parser/pull/35: Previously,
empty lambda arguments (e.g. `lambda: 1`) would get the range of the
entire expression, which leads to incorrect comment placement. Now empty
lambda arguments get an empty range between the `lambda` and the `:`
tokens.
**Test Plan** Added a regression test.
149 instances of unstable formatting remaining.
```
$ cargo run --bin ruff_dev --release -- format-dev --stability-check --error-file formatter-ecosystem-errors.txt --multi-project target/checkouts > formatter-ecosystem-progress.txt
$ rg "Unstable formatting" target/formatter-ecosystem-errors.txt | wc -l
149
```
**Summary** Add script to shrink all formatter errors: This started as a
fun idea and turned out really useful: This script gives us a single
Python file with all formatter stability errors. I want to keep it
around to occasionally update #5828 so I added it to the git.
**Test Plan** None, this is a helper script
## Summary
**Don't minimize files that don't match in the first place** This adds a
sanity check to the minimizer script that the
input matches the condition (e.g. unstable formatting). Otherwise we run
through all checks with the whole file, which is extremely slow. It's
more reasonable for downstream usage to write an empty string to the
output file instead.
## Summary
Allow `respect_gitignore` even when not in a git repo
## Test Plan
Within the Ruff repository:
1. Renamed `.git` to `.hello-world`
2. Added `test.py` in root folder
3. Added `test.py` to `.gitignore`
4. Ran `cargo run --bin ruff -- check --no-cache --isolated --show-files
.` with
and without `--respect-gitignore` flag
fixes: #5930
## Summary
We now allow RUF015 to fix cases like:
```python
list(range(10))[0]
list(x.y)[0]
list(x["y"])[0]
```
Further, we fix generators like:
```python
[i + 1 for i in x][0]
```
By rewriting to `next(iter(i + 1 for i in x))`.
I've retained the special-case that rewrites `[i for i in x][0]` to
`next(iter(x))`.
Closes https://github.com/astral-sh/ruff/issues/5764.
## Summary
Similar to #5852 and a bunch of related PRs -- trying to move rules that
rely on point-in-time semantic analysis to _after_ the semantic model
building.
## Summary
Implements `PYI017` or `Y017` from `flake8-pyi` plug-in. Mirrors
[upstream
implementation](ceab86d16b/pyi.py (L1039-L1048)).
It checks for any assignment with more than 1 target or an assignment to
anything other than a name, and raises a violation for these in stub
files.
Couldn't find a clear and concise explanation for why this is to be
avoided and what is preferred for attribute cases like:
```python
a.b = int
```
So welcome some input there, to learn and to finish up the docs.
## Test Plan
Added test cases from upstream plug-in in a fixture (both `.py` and
`.pyi`). Added a few more.
## Issue link
Refers: https://github.com/astral-sh/ruff/issues/848
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
- Remove space when start of slice is empty
- Treat unary op except `not` as simple expression
## Test Plan
Add some simple tests for unary op expressions in slice
Closes#5673
This shrinks a good bit more than previously, which was helpful for all
the formatter bugs. fwiw i treat this as a very ad-hoc script since it's
mainly my ecosystem bug processing companion.
## Summary
It can happen that we can't read a file (a python file, a jupyter
notebook or pyproject.toml), which needs to be handled and handled
consistently for all file types. Instead of using `Err` or `error!`, we
emit E602 with the io error as message and continue. This PR makes sure
we handle all three cases consistently, emit E602.
I'm not convinced that it should be possible to disable io errors, but
we now handle the regular case consistently and at least print warning
consistently.
I went with `warn!` but i can change them all to `error!`, too.
It also checks the error case when a pyproject.toml is not readable. The
error message is not very helpful, but it's now a bit clearer that
actually ruff itself failed instead vs this being a diagnostic.
## Examples
This is how an Err of `run` looks now:

With an unreadable file and `IOError` disabled:

(we lint zero files but count files before linting not during so we exit
0)
I'm not sure if it should (or if we should take a different path with
manual ExitStatus), but this currently also triggers when `files` is
empty:

## Test Plan
Unix only: Create a temporary directory with files with permissions
`000` (not readable by the owner) and run on that directory. Since this
breaks the assumptions of most of the test code (single file, `ruff`
instead of `ruff_cli`), the test code is rather cumbersome and looks a
bit misplaced; i'm happy about suggestions to fit it in closer with the
other tests or streamline it in other ways. I added another test for
when the entire directory is not readable.
## Summary
Completes documentation for the `flake8-fixme` (`FIX`) ruleset. Related
to #2646.
Tweaks the violation message. For example,
```
FIX001 Line contains FIXME
```
becomes
```
FIX001 Line contains FIXME, consider resolving the issue
```
This is because the previous message was unclear if it was warning
against the use of FIXME tags per se, or the code the FIXME tag was
annotating.
## Test Plan
`cargo test && python scripts/check_docs_formatted.py`
## Summary
Checks for `typehint.TypeAlias` annotation in type aliases. See
[original
source](https://github.com/PyCQA/flake8-pyi/blob/main/pyi.py#L1085).
```
$ flake8 --select Y026 crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:4:1: Y026 Use typing_extensions.TypeAlias for type aliases, e.g. "NewAny: TypeAlias = Any"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:5:1: Y026 Use typing_extensions.TypeAlias for type aliases, e.g. "OptinalStr: TypeAlias = typing.Optional[str]"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:6:1: Y026 Use typing_extensions.TypeAlias for type aliases, e.g. "Foo: TypeAlias = Literal['foo']"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:7:1: Y026 Use typing_extensions.TypeAlias for type aliases, e.g. "IntOrStr: TypeAlias = int | str"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:8:1: Y026 Use typing_extensions.TypeAlias for type aliases, e.g. "AliasNone: TypeAlias = None"
```
```
$ ./target/debug/ruff --select PYI026 crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi --no-cache
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:4:1: PYI026 Use `typing.TypeAlias` for type aliases in `NewAny`, e.g. "NewAny: typing.TypeAlias = Any"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:5:1: PYI026 Use `typing.TypeAlias` for type aliases in `OptinalStr`, e.g. "OptinalStr: typing.TypeAlias = typing.Optional[str]"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:6:1: PYI026 Use `typing.TypeAlias` for type aliases in `Foo`, e.g. "Foo: typing.TypeAlias = Literal["foo"]"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:7:1: PYI026 Use `typing.TypeAlias` for type aliases in `IntOrStr`, e.g. "IntOrStr: typing.TypeAlias = int | str"
crates/ruff/resources/test/fixtures/flake8_pyi/PYI026.pyi:8:1: PYI026 Use `typing.TypeAlias` for type aliases in `AliasNone`, e.g. "AliasNone: typing.TypeAlias = None"
Found 5 errors.
```
ref: #848
## Test Plan
Snapshots, manual runs of flake8.
## Summary
As part of my continued quest to separate semantic model-building from
diagnostic emission, this PR moves our unresolved-reference rules to a
deferred pass. So, rather than emitting diagnostics as we encounter
unresolved references, we now track those unresolved references on the
semantic model (just like resolved references), and after traversal,
emit the relevant rules for any unresolved references.
## Summary
Add known problems to `compare-to-empty-string` documentation. Related
to #5873.
Tweaked the example in the documentation to be a tad more concise and
correct (that the rule is most applicable when comparing to a `str`
variable).
## Test Plan
`python scripts/check_docs_formatted.py`
## Summary
This PR moves two rules (`invalid-all-format` and `invalid-all-object`)
out of the name-binding phase, and into the dedicated pass over all
bindings that occurs at the end of the `Checker`. This is part of my
continued quest to separate the semantic model-building logic from the
actual rule enforcement.
**Summary** Previously, `RUF014` would be part of ruff.schema.json
depending on whether or not the `unreachable-code` feature was active.
This caused problems for contributors who got unrelated RUF014 changes
when updating the schema without the feature active.
An alternative would be to always add `RUF014`.
**Test plan** `cargo dev generate-all` and `cargo run --bin ruff_dev
--features unreachable-code -- generate-all` now have the same effect.
## Summary
This crate now contains utilities for dealing with trivia more broadly:
whitespace, newlines, "simple" trivia lexing, etc. So renaming it to
reflect its increased responsibilities.
To avoid conflicts, I've also renamed `Token` and `TokenKind` to
`SimpleToken` and `SimpleTokenKind`.
## Summary
The vector of names here is immutable -- we never push to it after
initialization. Boxing reduces the size of the variant from 32 bytes to
24 bytes. (See:
https://nnethercote.github.io/perf-book/type-sizes.html#boxed-slices.)
It doesn't make a difference here, since it's not the largest variant,
but it still seems like a prudent change (and I was considering adding
another field to this variant, though I may no longer do so).
**Summary** This replaces the `todo!()` with a type alias stub in the
formatter. I added the tests from
704eb40108/parser/src/parser.rs (L901-L936)
as ruff python formatter tests.
**Test Plan** None, testing is part of the actual implementation
**Summary** Fix the formatter crash with `x[(1) :: ]` and related code.
**Problem** For assigning comments in slices in subscripts, we need to
find the positions of the colons to assign comments before and after the
colon to the respective lower/upper/step node (or dangling in that
section). Formatting `x[(1) :: ]` was broken because we were looking for
a `:` after the `1` but didn't consider that there could be a `)`
outside the range of the lower node, which contains just the `1` and no
optional parentheses.
**Solution** Use the simple tokenizer directly and skip all closing
parentheses.
**Test Plan** I added regression tests.
Closes#5733
**Summary** Add a static string error message to the formatter syntax
error so we can disambiguate where the syntax error came from
**Test Plan** No fixed tests, we don't expect this to occur, but it
helped with transformers syntax error debugging:
```
Error: Failed to format node
Caused by:
syntax error: slice first colon token was not a colon
```
## Summary
No behavior change, but this is in theory more efficient, since we can
just iterate over the flat `Binding` vector rather than having to
iterate over binding chains via the `Scope`.
## Summary
This PR moves the "unused exception" rule out of the visitor and into a
deferred check. When we can base rules solely on the semantic model, we
probably should, as it greatly simplifies the `Checker` itself.
## Summary
The `SemanticModel` currently stores the "body" of a given `Suite`,
along with the current statement index. This is used to support "next
sibling" queries, but we only use this in exactly one place -- the rule
that simplifies constructs like this to `any` or `all`:
```python
for x in y:
if x == 0:
return True
return False
```
Instead of tracking the state, we can just do a (slightly more
expensive) traversal, by finding the node within its parent and
returning the next node in the body.
Note that we'll only have to do this extremely rarely -- namely, for
functions that contain something like:
```python
for x in y:
if x == 0:
return True
```
## Summary
The motivation here is that it will make this rule easier to rewrite as
a deferred check. Right now, we can't run this rule in the deferred
phase, because it depends on the `except_handler` to power its autofix.
Instead of lexing the `except_handler`, we can use the `SimpleTokenizer`
from the formatter, and just lex forwards and backwards.
For context, this rule detects the unused `e` in:
```python
try:
pass
except ValueError as e:
pass
```
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Resolve#5854
## Test Plan
<!-- How was it tested? -->
New test cases
---------
Co-authored-by: konsti <konstin@mailbox.org>
## Summary
This PR just naively unrolls `collect_call_path` to handle attribute
resolutions of up to eight segments. In profiling via Instruments, it
seems to be about 4x faster for a very hot code path (4% of total
execution time on `main`, 1% here).
Profiling by running `RAYON_NUM_THREADS=1 cargo instruments -t time
--profile release-debug --time-limit 10000 -p ruff_cli -o
FromSlice.trace -- check crates/ruff/resources/test/cpython --silent -e
--no-cache --select ALL`, and modifying the linter to loop infinitely up
to the specified time (10 seconds) to increase sample size.
Before:
<img width="1792" alt="Screen Shot 2023-07-15 at 5 13 34 PM"
src="https://github.com/astral-sh/ruff/assets/1309177/4a8b0b45-8b67-43e9-af5e-65b326928a8e">
After:
<img width="1792" alt="Screen Shot 2023-07-15 at 8 38 51 PM"
src="https://github.com/astral-sh/ruff/assets/1309177/d8829159-2c79-4a49-ab3c-9e4e86f5b2b1">
## Summary
Before:
```
» ruff litestar tests --fix
warning: Invalid `# noqa` directive on line 19: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on line 65: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on line 74: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on line 22: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on line 66: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on line 75: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
```
After:
```
» cargo run --bin ruff ../litestar/litestar ../litestar/tests
Finished dev [unoptimized + debuginfo] target(s) in 0.15s
Running `target/debug/ruff ../litestar/litestar ../litestar/tests`
warning: Detected debug build without --no-cache.
warning: Invalid `# noqa` directive on /Users/sobolev/Desktop/litestar/tests/unit/test_contrib/test_sqlalchemy/models_bigint.py:19: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on /Users/sobolev/Desktop/litestar/tests/unit/test_contrib/test_sqlalchemy/models_bigint.py:65: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on /Users/sobolev/Desktop/litestar/tests/unit/test_contrib/test_sqlalchemy/models_bigint.py:74: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on /Users/sobolev/Desktop/litestar/tests/unit/test_contrib/test_sqlalchemy/models_uuid.py:22: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on /Users/sobolev/Desktop/litestar/tests/unit/test_contrib/test_sqlalchemy/models_uuid.py:66: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on /Users/sobolev/Desktop/litestar/tests/unit/test_contrib/test_sqlalchemy/models_uuid.py:75: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
```
## Test Plan
I didn't find any existing tests with this warning.
Closes https://github.com/astral-sh/ruff/issues/5855
## Summary
Previously, `StmtIf` was defined recursively as
```rust
pub struct StmtIf {
pub range: TextRange,
pub test: Box<Expr>,
pub body: Vec<Stmt>,
pub orelse: Vec<Stmt>,
}
```
Every `elif` was represented as an `orelse` with a single `StmtIf`. This
means that this representation couldn't differentiate between
```python
if cond1:
x = 1
else:
if cond2:
x = 2
```
and
```python
if cond1:
x = 1
elif cond2:
x = 2
```
It also makes many checks harder than they need to be because we have to
recurse just to iterate over an entire if-elif-else and because we're
lacking nodes and ranges on the `elif` and `else` branches.
We change the representation to a flat
```rust
pub struct StmtIf {
pub range: TextRange,
pub test: Box<Expr>,
pub body: Vec<Stmt>,
pub elif_else_clauses: Vec<ElifElseClause>,
}
pub struct ElifElseClause {
pub range: TextRange,
pub test: Option<Expr>,
pub body: Vec<Stmt>,
}
```
where `test: Some(_)` represents an `elif` and `test: None` an else.
This representation is different tradeoff, e.g. we need to allocate the
`Vec<ElifElseClause>`, the `elif`s are now different than the `if`s
(which matters in rules where want to check both `if`s and `elif`s) and
the type system doesn't guarantee that the `test: None` else is actually
last. We're also now a bit more inconsistent since all other `else`,
those from `for`, `while` and `try`, still don't have nodes. With the
new representation some things became easier, e.g. finding the `elif`
token (we can use the start of the `ElifElseClause`) and formatting
comments for if-elif-else (no more dangling comments splitting, we only
have to insert the dangling comment after the colon manually and set
`leading_alternate_branch_comments`, everything else is taken of by
having nodes for each branch and the usual placement.rs fixups).
## Merge Plan
This PR requires coordination between the parser repo and the main ruff
repo. I've split the ruff part, into two stacked PRs which have to be
merged together (only the second one fixes all tests), the first for the
formatter to be reviewed by @michareiser and the second for the linter
to be reviewed by @charliermarsh.
* MH: Review and merge
https://github.com/astral-sh/RustPython-Parser/pull/20
* MH: Review and merge or move later in stack
https://github.com/astral-sh/RustPython-Parser/pull/21
* MH: Review and approve
https://github.com/astral-sh/RustPython-Parser/pull/22
* MH: Review and approve formatter PR
https://github.com/astral-sh/ruff/pull/5459
* CM: Review and approve linter PR
https://github.com/astral-sh/ruff/pull/5460
* Merge linter PR in formatter PR, fix ecosystem checks (ecosystem
checks can't run on the formatter PR and won't run on the linter PR, so
we need to merge them first)
* Merge https://github.com/astral-sh/RustPython-Parser/pull/22
* Create tag in the parser, update linter+formatter PR
* Merge linter+formatter PR https://github.com/astral-sh/ruff/pull/5459
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
Originally `join_with` was used in the formatters README.md. Now it uses
```rs
f.join_comma_separated(item.end())
.nodes(elts.iter())
.finish()
```
## Test Plan
None
## Summary
For formatter instabilities, the message we get look something like
this:
```text
Unstable formatting /home/konsti/ruff/target/checkouts/deepmodeling:dpdispatcher/dpdispatcher/slurm.py
@@ -47,9 +47,9 @@
- script_header_dict["slurm_partition_line"] = (
- NOT_YET_IMPLEMENTED_ExprJoinedStr
- )
+ script_header_dict[
+ "slurm_partition_line"
+ ] = NOT_YET_IMPLEMENTED_ExprJoinedStr
Unstable formatting /home/konsti/ruff/target/checkouts/deepmodeling:dpdispatcher/dpdispatcher/pbs.py
@@ -26,9 +26,9 @@
- pbs_script_header_dict["select_node_line"] += (
- NOT_YET_IMPLEMENTED_ExprJoinedStr
- )
+ pbs_script_header_dict[
+ "select_node_line"
+ ] += NOT_YET_IMPLEMENTED_ExprJoinedStr
```
For ruff crashes. you don't even get that but just the file that crashed
it. To extract the actual bug, you'd need to manually remove parts of
the file, rerun to see if the bug still occurs (and revert if it
doesn't) until you have a minimal example.
With this script, you run
```shell
cargo run --bin ruff_shrinking -- target/checkouts/deepmodeling:dpdispatcher/dpdispatcher/slurm.py target/minirepo/code.py "Unstable formatting" "target/debug/ruff_dev format-dev --stability-check target/minirepo"
```
and get
```python
class Slurm():
def gen_script_header(self, job):
if resources.queue_name != "":
script_header_dict["slurm_partition_line"] = f"#SBATCH --partition {resources.queue_name}"
```
which is an nice minimal example.
I've been using this script and it would be easier for me if this were
part of main. The main disadvantage to merging is that it adds
additional dependencies.
## Test Plan
I've been using this for a number of minimization. This is an internal
helper script you only run manually. I could add a test that minimizes a
rule violation if required.
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Fixes#5739
## Test Plan
<!-- How was it tested? -->
Manually tested:
```sh
$ tree dir
dir
├── dir.py
│ └── file.py
└── file.py
1 directory, 2 files
$ cargo run -p ruff_cli -- check dir --no-cache
Finished dev [unoptimized + debuginfo] target(s) in 0.08s
Running `target/debug/ruff check dir --no-cache`
dir/dir.py/file.py:1:7: F821 Undefined name `a`
dir/file.py:1:7: F821 Undefined name `a`
Found 2 errors.
```
Is a unit test needed?
## Summary
This PR does some non-behavior-changing refactoring of the AST checker.
Specifically, it breaks the `Stmt`, `Expr`, and `ExceptHandler` visitors
into four distinct, consistent phases:
1. **Phase 1: Analysis**: Run any lint rules on the node.
2. **Phase 2: Binding**: Bind any symbols declared by the node.
3. **Phase 3: Recursion**: Visit all child nodes.
4. **Phase 4: Clean-up**: Pop scopes, etc.
There are some fuzzy boundaries in the last three phases, but the most
important divide is between the Phase 1 and all the others -- the goal
here is (as much as possible) to disentangle all of the vanilla
lint-rule calls from any other semantic analysis or model building.
Part of the motivation here is that I'm considering re-ordering some of
these phases, and it was just impossible to reason about that change as
long as we had miscellaneous binding-creation and scope-modification
code intermingled with lint rules. However, this could also enable us to
(e.g.) move the entire analysis phase elsewhere, and even with a more
limited API that has read-only access to `Checker` (but can push to a
diagnostics vector).
## Summary
Comparing repos with black requires that we use the settings as black,
notably line length and magic trailing comma behaviour. Excludes and
preserving quotes (vs. a preference for either quote style) is not yet
implemented because they weren't needed for the test projects.
In the other two commits i fixed the output when the progress bar is
hidden (this way is recommonded in the indicatif docs), added a
`scratch.pyi` file to gitignore because black formats stub files
differently and also updated the ecosystem readme with the projects json
without forks.
## Test Plan
I added a `line-length` vs `line_length` test. Otherwise only my
personal usage atm, a PR to integrate the script into the CI to check
some projects will follow.
## Summary
Closes#5628 by only emitting if `sep=","`. Includes documentation
(completes the `pandas-vet` ruleset).
Related to #2646.
## Test Plan
`cargo test`
Moves the computation of the `start_offset` for overlong lines to just
before the result is returned. There is a slight overhead for overlong
lines (double the work for the first `limit` characters).
In practice this results in a speedup on the CPython codebase. Most
lines are not overlong, or are not enforced because the line ends with a
URL, or does not contain whitespace. Nonetheless, the 0.3% of overlong
lines are a lot compared to other violations.
### Before

_Selected W505 and E501_

_All rules_
### After

_Selected W505 and E501_

_All rules_
CPython line statistics:
- Number of Python lines: 867.696
- Number of overlong lines: 2.963 (0.3%)
<details>
Benchmark selected:
```shell
cargo build --release && hyperfine --warmup 10 --min-runs 50 \
"./target/release/ruff ./crates/ruff/resources/test/cpython/ --no-cache -e --select W505,E501"
```
Benchmark all:
```shell
cargo build --release && hyperfine --warmup 10 --min-runs 50 \
"./target/release/ruff ./crates/ruff/resources/test/cpython/ --no-cache -e --select ALL"
```
Overlong lines in CPython
```shell
cargo run -p ruff_cli -- check crates/ruff/resources/test/cpython/Lib --no-cache --select=E501,W505 --statistics
```
Total Python lines:
```shell
find crates/ruff/resources/test/cpython/ -name '*.py' | xargs wc -l
```
</details>
(Performance tested on Mac M1)
## Summary
The motivating change here is to remove `let range =
except_handler.try_identifier().unwrap();` and instead just do
`name.range()`, since exception names now have ranges attached to them
by the parse. This also required some refactors (which are improvements)
to the built-in attribute shadowing rules, since at least one invocation
relied on passing in the exception handler and calling
`.try_identifier()`. Now that we have easy access to identifiers, we can
remove the whole `AnyShadowing` abstraction.
## Summary
This is more similar to how these flags work in other contexts (e.g.,
`visit_annotation`), and also ensures that we unset it prior to visit
the `orelse` and `finalbody` (a subtle bug).
## Summary
The intent of this rule is to always flag the `global` declaration, not
the usage. The current implementation does the wrong thing if a global
is assigned multiple times. Using `semantic().global()` is also more
efficient.
## Summary
Adds autofix for `hasattr` case of B004. I don't think it's safe (or
simple) to implement it for the `getattr` case because, inter alia,
calling `getattr` may have side effects.
Fixes#3545
## Test Plan
Existing tests were sufficient. Updated snapshots
## Summary
I'm doing some unrelated profiling, and I noticed that this method is
actually measurable on the CPython benchmark -- it's > 1% of execution
time. We don't need to lex here, we already know the ranges of all
comments, so we can just do a simple binary search for overlap, which
brings the method down to 0%.
## Test Plan
`cargo test`
## Summary
When required-imports is set with the syntax from ... import ... as ...,
autofix I002 is failing
## Test Plan
Reuse the same python files as
`crates/ruff/src/rules/isort/mod.rs:required_import` test.
## Summary
Previously, the `quoted-annotation` rule only removed quotes when `from
__future__ import annotations` was present. However, there are some
other cases in which this is also safe -- for example:
```python
def foo():
x: "MyClass"
```
We already model these in the semantic model, so this PR just expands
the scope of the rule to handle those.
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR uses the `join_comma_separated` builder for formatting set
expressions
to ensure the formatting preserves magic commas, if the setting is
enabled.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
See the fixed black tests
<!-- How was it tested? -->
## Summary
Format `DictComp` like `ListComp` from #5600. It's not 100%, but I
figured maybe it's worth starting to explore.
## Test Plan
Added ruff fixture based on `ListComp`'s.
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR improves the parentheses handling for with items to get closer
to black's formatting.
### Case 1:
```python
# Black / Input
with (
[
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
"bbbbbbbbbb",
"cccccccccccccccccccccccccccccccccccccccccc",
dddddddddddddddddddddddddddddddd,
] as example1,
aaaaaaaaaaaaaaaaaaaaaaaaaa
+ bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+ cccccccccccccccccccccccccccc
+ ddddddddddddddddd as example2,
CtxManager2() as example2,
CtxManager2() as example2,
CtxManager2() as example2,
):
...
# Before
with (
[
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
"bbbbbbbbbb",
"cccccccccccccccccccccccccccccccccccccccccc",
dddddddddddddddddddddddddddddddd,
] as example1,
(
aaaaaaaaaaaaaaaaaaaaaaaaaa
+ bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+ cccccccccccccccccccccccccccc
+ ddddddddddddddddd
) as example2,
CtxManager2() as example2,
CtxManager2() as example2,
CtxManager2() as example2,
):
...
```
Notice how Ruff wraps the binary expression in an extra set of
parentheses
### Case 2:
Black does not expand the with-items if the with has no parentheses:
```python
# Black / Input
with aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb as c:
...
# Before
with (
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb as c
):
...
```
Or
```python
# Black / Input
with [
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
"bbbbbbbbbb",
"cccccccccccccccccccccccccccccccccccccccccc",
dddddddddddddddddddddddddddddddd,
] as example1, aaaaaaaaaaaaaaaaaaaaaaaaaa * bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb * cccccccccccccccccccccccccccc + ddddddddddddddddd as example2, CtxManager222222222222222() as example2:
...
# Before (Same as Case 1)
with (
[
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
"bbbbbbbbbb",
"cccccccccccccccccccccccccccccccccccccccccc",
dddddddddddddddddddddddddddddddd,
] as example1,
(
aaaaaaaaaaaaaaaaaaaaaaaaaa
* bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
* cccccccccccccccccccccccccccc
+ ddddddddddddddddd
) as example2,
CtxManager222222222222222() as example2,
):
...
```
## Test Plan
I added new snapshot tests
Improves the django similarity index from 0.973 to 0.977
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Format `SetComp` like `ListComp`.
## Test Plan
Derived from `ListComp`'s fixture.
## Summary
`B006` should allow using `bytes(...)` as an argument defaule value.
## Test Plan
A new test case
---------
Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
## Summary
Non-behavioral change, but this is the same in each branch. Visiting the
`func` first also means we've visited the `func` by the time we try to
resolve it (via `resolve_call_path`), which should be helpful in a
future refactor.
## Summary
The AST pass is broken up into three phases: pre-visit (which includes
analysis), recurse (visit all members), and post-visit (clean-up). We're
not supposed to edit semantic model flags in the pre-visit phase, but it
looks like we were for literal detection. This didn't matter in
practice, but I'm looking into some AST refactors for which this _does_
cause issues.
No behavior changes expected.
## Test Plan
Good test coverage on these.
## Summary
`PERF102` looks for unused keys or values in `dict.items()` calls, and
suggests instead using `dict.keys()` or `dict.values()`. Previously,
this check determined usage by looking for underscore-prefixed
variables. However, we can use the semantic model to actually detect
whether a variable is used. This has two nice effects:
1. We avoid odd false-positives whereby underscore-prefixed variables
are actually used.
2. We can catch more cases (fewer false-negatives) by detecting unused
loop variables that _aren't_ underscore-prefixed.
Closes#5692.
## Summary
Nested calls to `sorted` can only be collapsed if the calls are
identical (i.e., they have the exact same keyword arguments).
Update C414 to only flag such cases.
Fixes#5712
## Test Plan
Updated snapshots.
Tested against flake8-comprehensions. It incorrectly flags these cases.
## Summary
This is really bad PR hygiene, but a mix of: using `Locator`-based fixes
in a few places (in lieu of `Generator`-based fixes), using match syntax
to avoid `.len() == 1` checks, using common helpers in more places, etc.
## Test Plan
`cargo test`
## Summary
Consider single element subscript expr for implicit optional.
On `main`, the cases where there is only a single element in the
subscript
list was giving false positives such as for the following:
```python
typing.Union[None]
typing.Literal[None]
```
## Test Plan
`cargo test`
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
Check for `Any` in other types for `ANN401`. This reuses the logic from
`implicit-optional` rule to resolve the type to `Any`.
Following types are supported:
* `Union[Any, ...]`
* `Any | ...`
* `Optional[Any]`
* `Annotated[<any of the above variant>, ...]`
* Forward references i.e., `"Any | ..."`
## Test Plan
Added test cases for various combinations.
fixes: #5458
## Summary
Do not raise `EXE001` and `EXE002` if WSL is detected. Uses the
[`wsl`](https://crates.io/crates/wsl) crate.
Closes#5445.
## Test Plan
`cargo test`
I don't use Windows, so was unable to test on a WSL environment. It
would be good if someone who runs Windows could check the functionality.
## Summary
Python doesn't allow `"Foo" | None` if the annotation will be evaluated
at runtime (see the comments in the PR, or the semantic model
documentation for more on what this means and when it is true), but it
_does_ allow it if the annotation is typing-only.
This, for example, is invalid, as Python will evaluate `"Foo" | None` at
runtime in order to
populate the function's `__annotations__`:
```python
def f(x: "Foo" | None): ...
```
This, however, is valid:
```python
def f():
x: "Foo" | None
```
As is this:
```python
from __future__ import annotations
def f(x: "Foo" | None): ...
```
Closes#5706.
## Summary
`StmtAnnAssign` would not insert parentheses when breaking the same way
`StmtAssign` does, causing unstable formatting and likely some syntax
errors.
## Test Plan
I added a regression test.
## Summary
The previous dummy was causing instabilities since it turned a string
into a variable.
E.g.
```python
script_header_dict[
"slurm_partition_line"
] = f"#SBATCH --partition {resources.queue_name}"
```
has an instability as
```python
- script_header_dict["slurm_partition_line"] = (
- NOT_YET_IMPLEMENTED_ExprJoinedStr
- )
+ script_header_dict[
+ "slurm_partition_line"
+ ] = NOT_YET_IMPLEMENTED_ExprJoinedStr
```
## Test Plan
The instability is gone, otherwise it's still a dummy
## Summary
Implement Pylint rule [`consider-using-in`
(`R1714`)](https://pylint.pycqa.org/en/latest/user_guide/messages/refactor/consider-using-in.html)
as `repeated-equality-comparison-target` (`PLR1714`). This rule checks
for expressions that can be re-written as a membership test for better
readability and performance.
For example,
```python
foo == "bar" or foo == "baz" or foo == "qux"
```
should be rewritten as
```python
foo in {"bar", "baz", "qux"}
```
Related to #970. Includes documentation.
### Implementation quirks
The implementation does not work with Yoda conditions (e.g., `"a" ==
foo` instead of `foo == "a"`). The Pylint version does. I couldn't find
a way of supporting Yoda-style conditions without it being inefficient,
so didn't (I don't think people write Yoda conditions any way).
## Test Plan
Added fixture.
`cargo test`
## Summary
We have two `Cursor` implementations. This PR moves the implementation
from the formatter into `ruff_python_whitespace` (kind of a poorly-named
crate now) and uses it for both use-cases.
## Summary
Document all `ruff_dev` subcommands and document the `format_dev` flags
in the formatter readme.
CC @zanieb please flag everything that isn't clear or missing
## Test Plan
n/a
Detects invalid types for tuple, list, bytes, string indices.
For example, the following will raise a `TypeError` at runtime and when
imported Python will display a `SyntaxWarning`
```python
var = [1, 2, 3]["x"]
```
```
example.py:1: SyntaxWarning: list indices must be integers or slices, not str; perhaps you missed a comma?
var = [1, 2, 3]["x"]
Traceback (most recent call last):
File "example.py", line 1, in <module>
var = [1, 2, 3]["x"]
~~~~~~~~~^^^^^
TypeError: list indices must be integers or slices, not str
```
Previously, Ruff would not report the invalid syntax but now a violation
will be reported. This does not apply to cases where a variable, call,
or complex expression is used in the index — detection is roughly
limited to static definitions, which matches Python's warnings.
```
❯ ./target/debug/ruff example.py --select RUF015 --show-source --no-cache
example.py:1:17: RUF015 Indexed access to type `list` uses type `str` instead of an integer or slice.
|
1 | var = [1, 2, 3]["x"]
| ^^^ RUF015
|
```
Closes https://github.com/astral-sh/ruff/issues/5082
xref
ffff1440d1
## Summary
Replaces `DictionaryKey` enum with the more general `ComparableExpr`
when checking for duplicate keys
## Test Plan
Added test fixture from issue. Can potentially be expanded further
depending on what exactly we want to flag (e.g. do we also want to check
for unhashable types?) and which `ComparableExpr::XYZ` types we consider
literals.
## Issue link
Closes: https://github.com/astral-sh/ruff/issues/5691
## Summary
Similar to #5567, we can remove the use of regex, plus simplify the
representation (use `Option`), add snapshot tests, etc.
This is about 100x faster than using a regex for cases that match (2.5ns
vs. 250ns). It's obviously not a hot path, but I prefer the consistency
with other similar comment-parsing. I may DRY these up into some common
functionality later on.
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR matches Black' behavior where it only omits the optional parentheses if the expression starts or ends with a parenthesized expression:
```python
a + [aaa, bbb, cccc] * c # Don't omit
[aaa, bbb, cccc] + a * c # Split
a + c * [aaa, bbb, ccc] # Split
```
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
This improves the Jaccard index from 0.945 to 0.946
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR improves the Black compatibility when it comes to breaking comprehensions.
We want to avoid line breaks before the target and `in` whenever possible. Furthermore, `if X is not None` should be grouped together, similar to other binary like expressions
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
`cargo test`
<!-- How was it tested? -->
## Summary
We don't use `ModExpression` anywhere but it's part of the AST, removes
one `not_implemented_yet` and is a trivial 2-liner, so i implemented
formatting for `ModExpression`.
## Test Plan
None, this kind of node does not occur in file input. Otherwise all the
tests for expressions
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
I started working on this because I assumed that I would need access to options inside of `NeedsParantheses` but it then turned out that I won't.
Anyway, it kind of felt nice to pass fewer arguments. So I'm gonna put this out here to get your feedback if you prefer this over passing individual fiels.
Oh, I sneeked in another change. I renamed `context.contents` to `source`. `contents` is too generic and doesn't tell you anything.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
It compiles
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR removes the `mode` field from `BestFitting` because it is no longer used (we now use `conditional_group` and `fits_expanded).
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
`cargo test`
<!-- How was it tested? -->
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR implements Black's behavior where it first splits off parenthesized expressions before splitting before operands to avoid unnecessary parentheses:
```python
# We want
if a + [
b,
c
]:
pass
# Rather than
if (
a
+ [b, c]
):
pass
```
This is implemented by using the new IR elements introduced in #5596.
* We give the group wrapping the optional parentheses an ID (`parentheses_id`)
* We use `conditional_group` for the lower priority groups (all non-parenthesized expressions) with the condition that the `parentheses_id` group breaks (we want to split before operands only if the parentheses are necessary)
* We use `fits_expanded` to wrap all other parenthesized expressions (lists, dicts, sets), to prevent that expanding e.g. a list expands the `parentheses_id` group. We gate the `fits_expand` to only apply if the `parentheses_id` group fits (because we prefer `a\n+[b, c]` over expanding `[b, c]` if the whole expression gets parenthesized).
We limit using `fits_expanded` and `conditional_group` only to expressions that themselves are not in parentheses (checking the conditions isn't free)
## Test Plan
It increases the Jaccard index for Django from 0.915 to 0.917
## Incompatibilites
There are two incompatibilities left that I'm aware of (there may be more, I didn't go through all snapshot differences).
### Long string literals
I commented on the regression. The issue is that a very long string (or any content without a split point) may not fit when only breaking the right side. The formatter than inserts the optional parentheses. But this is kind of useless because the overlong string will still not fit, because there are no new split points.
I think we should ignore this incompatibility for now
### Expressions on statement level
I don't fully understand the logic behind this yet, but black doesn't break before the operators for the following example even though the expression exceeds the configured line width
```python
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa < bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb > ccccccccccccccccccccccccccccc == ddddddddddddddddddddd
```
But it would if the expression is used inside of a condition.
What I understand so far is that Black doesn't insert optional parentheses on the expression statement level (and a few other places) and, therefore, only breaks after opening parentheses. I propose to keep this deviation for now to avoid overlong-lines and use the compatibility report to make a decision if we should implement the same behavior.
## Summary
The similarity index, the fraction of unchanged lines, is easier to
understand than the jaccard index, the fraction between intersection and
union.
## Test Plan
I ran this on django and git a 0.945 index, meaning 5.5% of lines are
currently reformatted when compared to black
## Summary
Format statements such as `tree_depth += 1`. This is a statement that
does not allow any line breaks, the only thing to be mindful of is to
parenthesize the assigned expression
Jaccard index on django: 0.915 -> 0.918
## Test Plan
black tests, and two new tests, a basic one and one that ensures that
the child gets parentheses. I ran the django stability check.
## Summary
This is the result of running `cargo +nightly clippy --workspace
--all-targets --all-features -- -D warnings` and fixing all violations.
Just wanted to see if there were any interesting new checks on nightly
👀
## Summary
This PR implements the formatting of `raise` statements. I haven't
looked at the black implementation, this is inspired from from the
`return` statements formatting.
## Test Plan
The black differences with insta.
I also compared manually some edge cases with very long string and call
chaining and it seems to do the same formatting as black.
There is one issue:
```python
# input
raise OsError(
"aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa"
) from a.aaaaa(aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa).a(aaaa)
# black
raise OsError(
"aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa"
) from a.aaaaa(
aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa
).a(
aaaa
)
# ruff
raise OsError(
"aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa"
) from a.aaaaa(
aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa
).a(aaaa)
```
But I'm not sure this diff is the raise formatting implementation.
---------
Co-authored-by: Louis Dispa <ldispa@deezer.com>
## Summary
#5658 didn't actually ignore bivariate types in some all cases (sorry
about that). This PR fixes that and adds bivariate types to the test
fixture.
## Test Plan
`cargo test`
## Summary
Change the `type-name-incorrect-variance` diagnostic message to include
the detected variance and a name change recommendation. For example,
```
`TypeVar` name "T_co" does not reflect its contravariance; consider renaming it to "T_contra"
```
Related to #5651.
## Test Plan
`cargo test`
## Summary
Non-behavior-changing refactors to delay some `.is_builtin` calls in a
few older rules. Cheaper pre-conditions should always be checked first.
## Summary
Fixes#5503. Ready for final review as the `mkdocs` issue involving SSH
keys is fixed.
Note that this will only throw on a `Name` - it will be refactorable
once we have a type-checker. This means that this is the only sort of
input that will throw.
```python
x = range(10)
list(x)[0]
```
I thought it'd be confusing if we supported direct function results.
Consider this example, assuming we support direct results:
```python
# throws
list(range(10))[0]
def createRange(bound):
return range(bound)
# "why doesn't this throw, but a direct `range(10)` call does?"
list(createRange(10))[0]
```
If it's necessary, I can go through the list of built-ins and find those
which produce iterables, then add them to the throwing list.
## Test Plan
Added a new fixture, then ran `cargo t`
## Summary
Implement Pylint `typevar-name-incorrect-variance` (`C0105`) as
`type-name-incorrect-variance` (`PLC0105`). Includes documentation.
Related to #970.
The Pylint implementation checks only `TypeVar`, but this PR checks
`ParamSpec` as well.
## Test Plan
Added test fixture.
`cargo test`
## Summary
Completes all the documentation for the `pandas-vet` rules, except for
`pandas-use-of-dot-read-table` as I am unclear of the rule's motivation
(see #5628).
Related to #2646.
## Test Plan
`python scripts/check_docs_formatted.py && mkdocs serve`
## Summary
Fixes#5246. We generate a hash set of all exception IDs caught by the
`try` statement, then check that the inner `raise` actually raises a
caught exception.
## Test Plan
Added a new test, `cargo t`.
## Summary
Fix an oversight in `find_only_token_in_range` where the following code
would panic due do the closing and opening parentheses being in the
range we scan:
```python
d1 = [
("a") if # 1
("b") else # 2
("c")
]
```
Closing and opening parentheses respectively are now correctly skipped.
## Test Plan
I added a regression test
## Summary
This PR reworks the `upstream_categories` mechanism that is only used
for documentation purposes to make it easier to generate docs using
`all_rules()`. The new implementation also relies on "tribal knowledge"
about rule codes, so it's not the best implementation, but gets us
forward.
Another option would be to change the rule-defining proc macros to allow
configuring an optional `RuleCategory`, but that seems more heavy-handed
and possibly unnecessary in the long run...
Draft since this builds on #5439.
cc @charliermarsh :)
## Summary
Format named expressions (walrus operator) such a `value := f()`.
Unlike tuples, named expression parentheses are not part of the range
even when mandatory, so mapping optional parentheses to always gives us
decent formatting without implementing all [PEP
572](https://peps.python.org/pep-0572/) rules on when we need
parentheses where other expressions wouldn't. We might want to revisit
this decision later and implement special cases, but for now this gives
us what we need.
## Test Plan
black fixtures, i added some fixtures and checked django and cpython for
stability.
Closes#5613
## Summary
This changes the docs to show a nursery icon (🌅) for rules in the
nursery.
It currently doesn't do that for the rules that are in sub-categories
(Pylint, Pycodestyle) because there is no `all_rules()` for the
`RuleCodePrefix` that's returned by `UpstreamCategory` iteration (and as
mentioned on Discord, I think `UpstreamCategory` maybe shouldn't be a
thing). (That would be enabled by #5591.)
## Test Plan
Generated docs to see new icons (with the caveat above).
## Summary
It turns out that just doing this match directly without `AhoCorasick`
is much faster, like 2x (and removes one dependency, though we likely
already rely on this transitively).
## Summary
Adds `import tkinter as tk` to the list of default import conventions.
Closes#5620.
## Test Plan
Added `tkinter` to test fixture.
`cargo test`
## Summary
We're doing some unsafe accesses to advance these iterators. It's easier
to model these as actual iterators to ensure safety everywhere. Also
added some additional test cases.
Closes#5621.
## Summary
We now treat `# flake8: noqa: F401` as turning off F401 for the entire
file. (Flake8 treats this as turning off _all rules_ for the entire
file).
This deviates from Flake8, but I think it's a much more user-friendly
deviation than what I introduced in #5571. See
https://github.com/astral-sh/ruff/issues/5617 for an explanation.
Closes https://github.com/astral-sh/ruff/issues/5617.
## Summary
This PR adds a `ParseError` type to the `noqa` parsing system to enable
us to render useful warnings instead of silently failing when parsing
`noqa` codes.
For example, given `foo.py`:
```python
# ruff: noqa: x
# ruff: noqa foo
# flake8: noqa: F401
import os # noqa: foo-bar
```
We would now output:
```console
warning: Invalid `# noqa` directive on line 2: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on line 4: expected `:` followed by a comma-separated list of codes (e.g., `# noqa: F401, F841`).
warning: Invalid `# noqa` directive on line 6: Flake8's blanket exemption does not support exempting specific codes. To exempt specific codes, use, e.g., `# ruff: noqa: F401, F841` instead.
warning: Invalid `# noqa` directive on line 7: expected a comma-separated list of codes (e.g., `# noqa: F401, F841`).
```
There's one important behavior change here too. Right now, with Flake8,
if you do `# flake8: noqa: F401`, Flake8 treats that as equivalent to `#
flake8: noqa` -- it turns off _all_ diagnostics in the file, not just
`F401`. Historically, we respected this... but, I think it's confusing.
So we now raise a warning, and don't respect it at all. This will lead
to errors in some projects, but I'd argue that right now, those
directives are almost certainly behaving in an unintended way for users
anyway.
Closes https://github.com/astral-sh/ruff/issues/3339.
## Summary
I was testing some changes on Airflow, and I realized that we _always_
run the `pyproject.toml` validation rules, even if they're not enabled.
This PR gates them behind the appropriate enablement flags.
## Test Plan
- Ran: `cargo run -p ruff_cli -- check ../airflow -n`. Verified that no
RUF200 violations were raised.
- Run: `cargo run -p ruff_cli -- check ../airflow -n --select RUF200`.
Verified that two RUF200 violations were raised.
Since the (implicit) update to cargo-insta 1.30, CI would pass even when
the tests failed. This downgrades to cargo insta 1.29.0 and CI fails
again when it should (which i can't show here, because CI needs to pass
to merge this PR). I've improved the unreferenced snapshot handling in
the process
See https://github.com/mitsuhiko/insta/issues/392
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Fix typos found by
[codespell](https://github.com/codespell-project/codespell).
I have left out `memoize` for now (see #5606).
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
CI tests.
<!-- How was it tested? -->
## Summary
Format `ExprIfExp`, also known as the ternary operator or inline `if`.
It can look like
```python
a1 = 1 if True else 2
```
but also
```python
b1 = (
# We return "a" ...
"a" # that's our True value
# ... if this condition matches ...
if True # that's our test
# ... otherwise we return "b§
else "b" # that's our False value
)
```
This also fixes a visitor order bug.
The jaccard index on django goes from 0.911 to 0.915.
## Test Plan
I added fixtures without and with comments in strange places.
Implements PYI030 as part of
https://github.com/astral-sh/ruff/issues/848
> Union expressions should never have more than one Literal member, as
Literal[1] | Literal[2] is semantically identical to Literal[1, 2].
Note we differ slightly from the flake8-pyi implementation:
- We detect cases where there are parentheses or nested unions
- We detect cases with mixed `Union` and `|` syntax
- We use the same error message for all violations; flake8-pyi has two
different messages
- We retain the user's quoting style when displaying string literals;
flake8-pyi uses single quotes
- We warn on duplicates of the same literal `Literal[1] | Literal[1]`
## Summary
In addition to `# noqa` codes, we also support file-level exemptions,
which look like:
- `# flake8: noqa` (ignore all rules in the file, for compatibility)
- `# ruff: noqa` (all rules in the file)
- `# ruff: noqa: F401` (ignore `F401` in the file, Flake8 doesn't
support this)
This PR moves that logic to something that looks a lot more like our `#
noqa` parser. Performance is actually quite a bit _worse_ than the
previous approach (lexing `# flake8: noqa` goes from 2ns to 11ns; lexing
`# ruff: noqa: F401, F841` is about the same`; lexing `# type: ignore #
noqa: E501` fgoes from 4ns to 6ns), but the numbers are very small so
it's... maybe worth it?
The primary benefit here is that we now properly support flexible
whitespace, like: `#flake8:noqa`. Previously, we required exact string
matching, and we also didn't support all case-insensitive variants of
`noqa`.
## Summary
There are two pypi links in the documentation that link to specific
version numbers of other packages. Removing these versioned links allows
users to immediately view the latest version of the package and
maintains consistency with the other links.
## Test Plan
N/A
## Summary
This extends the `ruff_dev` formatter script util. Instead of only doing
stability checks, you can now choose different compatible options on the
CLI and get statistics.
* It adds an option the formats all files that ruff would check to allow
looking at an entire black-formatted repository with `git diff`
* It computes the [Jaccard
index](https://en.wikipedia.org/wiki/Jaccard_index) as a measure of
deviation between input and output, which is useful as single number
metric for assessing our current deviations from black.
* It adds progress bars to both the single projects as well as the
multi-project mode.
* It adds an option to write the multi-project output to a file
Sample usage:
```
$ cargo run --bin ruff_dev -- format-dev --stability-check crates/ruff/resources/test/cpython
$ cargo run --bin ruff_dev -- format-dev --stability-check /home/konsti/projects/django
Syntax error in /home/konsti/projects/django/tests/test_runner_apps/tagged/tests_syntax_error.py: source contains syntax errors (parser error): BaseError { error: UnrecognizedToken(Name { name: "syntax_error" }, None), offset: 131, source_path: "<filename>" }
Found 0 stability errors in 2755 files (jaccard index 0.911) in 9.75s
$ cargo run --bin ruff_dev -- format-dev --write /home/konsti/projects/django
```
Options:
```
Several utils related to the formatter which can be run on one or more repositories. The selected set of files in a repository is the same as for `ruff check`.
* Check formatter stability: Format a repository twice and ensure that it looks that the first and second formatting look the same. * Format: Format the files in a repository to be able to check them with `git diff` * Statistics: The subcommand the Jaccard index between the (assumed to be black formatted) input and the ruff formatted output
Usage: ruff_dev format-dev [OPTIONS] [FILES]...
Arguments:
[FILES]...
Like `ruff check`'s files. See `--multi-project` if you want to format an ecosystem checkout
Options:
--stability-check
Check stability
We want to ensure that once formatted content stays the same when formatted again, which is known as formatter stability or formatter idempotency, and that the formatter prints syntactically valid code. As our test cases cover only a limited amount of code, this allows checking entire repositories.
--write
Format the files. Without this flag, the python files are not modified
--format <FORMAT>
Control the verbosity of the output
[default: default]
Possible values:
- minimal: Filenames only
- default: Filenames and reduced diff
- full: Full diff and invalid code
-x, --exit-first-error
Print only the first error and exit, `-x` is same as pytest
--multi-project
Checks each project inside a directory, useful e.g. if you want to check all of the ecosystem checkouts
--error-file <ERROR_FILE>
Write all errors to this file in addition to stdout. Only used in multi-project mode
```
## Test Plan
I ran this on django (2755 files, jaccard index 0.911) and discovered a
magic trailing comma problem and that we really needed to implement
import formatting. I ran the script on cpython to identify
https://github.com/astral-sh/ruff/pull/5558.
## Summary
In Python, the annotations on `x` and `y` here have very different
treatment:
```python
def foo(x: int):
y: int
```
The `int` in `x: int` is a runtime-required annotation, because `x` gets
added to the function's `__annotations__`. You'll notice, for example,
that this fails:
```python
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from foo import Bar
def f(x: Bar):
...
```
Because `Bar` is required to be available at runtime, not just at typing
time. Meanwhile, this succeeds:
```python
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from foo import Bar
def f():
x: Bar = 1
f()
```
(Both cases are fine if you use `from __future__ import annotations`.)
Historically, we've tracked those annotations that are _not_
runtime-required via the semantic model's `ANNOTATION` flag. But
annotations that _are_ runtime-required have been treated as "type
definitions" that aren't annotations.
This causes problems for the flake8-future-annotations rules, which try
to detect whether adding `from __future__ import annotations` would
_allow_ you to rewrite a type annotation. We need to know whether we're
in _any_ type annotation, runtime-required or not, since adding `from
__future__ import annotations` will convert any runtime-required
annotation to a typing-only annotation.
This PR adds separate state to track these runtime-required annotations.
The changes in the test fixtures are correct -- these were false
negatives before.
Closes https://github.com/astral-sh/ruff/issues/5574.
## Summary
I'll write up a more detailed description tomorrow, but in short, this
PR removes our regex-based implementation in favor of "manual" parsing.
I tried a couple different implementations. In the benchmarks below:
- `Directive/Regex` is our implementation on `main`.
- `Directive/Find` just uses `text.find("noqa")`, which is insufficient,
since it doesn't cover case-insensitive variants like `NOQA`, and
doesn't handle multiple `noqa` matches in a single like, like ` # Here's
a noqa comment # noqa: F401`. But it's kind of a baseline.
- `Directive/Memchr` uses three `memchr` iterative finders (one for
`noqa`, `NOQA`, and `NoQA`).
- `Directive/AhoCorasick` is roughly the variant checked-in here.
The raw results:
```
Directive/Regex/# noqa: F401
time: [273.69 ns 274.71 ns 276.03 ns]
change: [+1.4467% +1.8979% +2.4243%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
3 (3.00%) low mild
8 (8.00%) high mild
4 (4.00%) high severe
Directive/Find/# noqa: F401
time: [66.972 ns 67.048 ns 67.132 ns]
change: [+2.8292% +2.9377% +3.0540%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
1 (1.00%) low severe
3 (3.00%) low mild
8 (8.00%) high mild
3 (3.00%) high severe
Directive/AhoCorasick/# noqa: F401
time: [76.922 ns 77.189 ns 77.536 ns]
change: [+0.4265% +0.6862% +0.9871%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low mild
3 (3.00%) high mild
4 (4.00%) high severe
Directive/Memchr/# noqa: F401
time: [62.627 ns 62.654 ns 62.679 ns]
change: [-0.1780% -0.0887% -0.0120%] (p = 0.03 < 0.05)
Change within noise threshold.
Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) low severe
5 (5.00%) low mild
3 (3.00%) high mild
2 (2.00%) high severe
Directive/Regex/# noqa: F401, F841
time: [321.83 ns 322.39 ns 322.93 ns]
change: [+8602.4% +8623.5% +8644.5%] (p = 0.00 < 0.05)
Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
1 (1.00%) low severe
2 (2.00%) low mild
1 (1.00%) high mild
1 (1.00%) high severe
Directive/Find/# noqa: F401, F841
time: [78.618 ns 78.758 ns 78.896 ns]
change: [+1.6909% +1.8771% +2.0628%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild
Directive/AhoCorasick/# noqa: F401, F841
time: [87.739 ns 88.057 ns 88.468 ns]
change: [+0.1843% +0.4685% +0.7854%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 11 outliers among 100 measurements (11.00%)
5 (5.00%) low mild
3 (3.00%) high mild
3 (3.00%) high severe
Directive/Memchr/# noqa: F401, F841
time: [80.674 ns 80.774 ns 80.860 ns]
change: [-0.7343% -0.5633% -0.4031%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 14 outliers among 100 measurements (14.00%)
4 (4.00%) low severe
9 (9.00%) low mild
1 (1.00%) high mild
Directive/Regex/# noqa time: [194.86 ns 195.93 ns 196.97 ns]
change: [+11973% +12039% +12103%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
5 (5.00%) low mild
1 (1.00%) high mild
Directive/Find/# noqa time: [25.327 ns 25.354 ns 25.383 ns]
change: [+3.8524% +4.0267% +4.1845%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
6 (6.00%) high mild
3 (3.00%) high severe
Directive/AhoCorasick/# noqa
time: [34.267 ns 34.368 ns 34.481 ns]
change: [+0.5646% +0.8505% +1.1281%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
5 (5.00%) high mild
Directive/Memchr/# noqa time: [21.770 ns 21.818 ns 21.874 ns]
change: [-0.0990% +0.1464% +0.4046%] (p = 0.26 > 0.05)
No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
4 (4.00%) low mild
4 (4.00%) high mild
2 (2.00%) high severe
Directive/Regex/# type: ignore # noqa: E501
time: [278.76 ns 279.69 ns 280.72 ns]
change: [+7449.4% +7469.8% +7490.5%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) low mild
1 (1.00%) high mild
1 (1.00%) high severe
Directive/Find/# type: ignore # noqa: E501
time: [67.791 ns 67.976 ns 68.184 ns]
change: [+2.8321% +3.1735% +3.5418%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
5 (5.00%) high mild
1 (1.00%) high severe
Directive/AhoCorasick/# type: ignore # noqa: E501
time: [75.908 ns 76.055 ns 76.210 ns]
change: [+0.9269% +1.1427% +1.3955%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
Directive/Memchr/# type: ignore # noqa: E501
time: [72.549 ns 72.723 ns 72.957 ns]
change: [+1.5881% +1.9660% +2.3974%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
10 (10.00%) high mild
5 (5.00%) high severe
Directive/Regex/# type: ignore # nosec
time: [66.967 ns 67.075 ns 67.207 ns]
change: [+1713.0% +1715.8% +1718.9%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
1 (1.00%) low severe
3 (3.00%) low mild
2 (2.00%) high mild
4 (4.00%) high severe
Directive/Find/# type: ignore # nosec
time: [18.505 ns 18.548 ns 18.597 ns]
change: [+1.3520% +1.6976% +2.0333%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild
Directive/AhoCorasick/# type: ignore # nosec
time: [16.162 ns 16.206 ns 16.252 ns]
change: [+1.2919% +1.5587% +1.8430%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
Directive/Memchr/# type: ignore # nosec
time: [39.192 ns 39.233 ns 39.276 ns]
change: [+0.5164% +0.7456% +0.9790%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 13 outliers among 100 measurements (13.00%)
2 (2.00%) low severe
4 (4.00%) low mild
3 (3.00%) high mild
4 (4.00%) high severe
Directive/Regex/# some very long comment that # is interspersed with characters but # no directive
time: [81.460 ns 81.578 ns 81.703 ns]
change: [+2093.3% +2098.8% +2104.2%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) low mild
2 (2.00%) high mild
Directive/Find/# some very long comment that # is interspersed with characters but # no directive
time: [26.284 ns 26.331 ns 26.387 ns]
change: [+0.7554% +1.1027% +1.3832%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
5 (5.00%) high mild
1 (1.00%) high severe
Directive/AhoCorasick/# some very long comment that # is interspersed with characters but # no direc...
time: [28.643 ns 28.714 ns 28.787 ns]
change: [+1.3774% +1.6780% +2.0028%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
Directive/Memchr/# some very long comment that # is interspersed with characters but # no directive
time: [55.766 ns 55.831 ns 55.897 ns]
change: [+1.5802% +1.7476% +1.9021%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) low mild
```
While memchr is faster than aho-corasick in some of the common cases
(like `# noqa: F401`), the latter is way, way faster when there _isn't_
a match (like 2x faster -- see the last two cases). Since most comments
_aren't_ `noqa` comments, this felt like the right tradeoff. Note that
all implementations are significantly faster than the regex version.
(I know I originally reported a 10x speedup, but I ended up improving
the regex version a bit in some prior PRs, so it got unintentionally
faster via some refactors.)
There's also one behavior change in here, which is that we now allow
variable spaces, e.g., `#noqa` or `# noqa`. Previously, we required
exactly one space. This thus closes#5177.
## Summary
The following code was previously leading to unstable formatting:
```python
try:
try:
pass
finally:
print(1) # issue7208
except A:
pass
```
The comment would be formatted as a trailing comment of `try` which is
unstable as an end-of-line comment gets two extra whitespaces.
This was originally found in
99b00efd5e/Lib/getpass.py (L68-L91)
## Test Plan
I added a regression test
## Summary
It's a bit simpler to let the API just take the text itself, plus an
offset (to make the returned `TextRange` absolute, rather than
relative).
## Summary
Adds a `--case-sensitive` setting/flag to isort (default: `false`)
which, when set to `true` sorts imports case sensitively instead of case
insensitively.
Tests and Docs can be improved, can do that if the general idea of the
implementation is in order.
First `isort` edit so any and all feedback is welcomed even more than
usual.
## Test Plan
Added a fixture with an assortment of imports in various cases.
## Issue links
Closes: https://github.com/astral-sh/ruff/issues/5514
## Summary
This PR enables us to resolve attribute accesses within files, at least
for static and class methods. For example, we can now detect that this
is a function access (and avoid a false-positive):
```python
class Class:
@staticmethod
def error():
return ValueError("Something")
# OK
raise Class.error()
```
Closes#5487.
Closes#5416.
## Summary
Implement Pylint `typevar-double-variance` (`C0131`) as
`type-bivariance` (`PLC0131`). Includes documentation. Related to #970.
Renamed the rule to be more clear (it's not immediately obvious what
'double' means, IMO).
The Pylint implementation checks only `TypeVar`, but this PR checks
`ParamSpec` as well.
## Test Plan
Added tests.
`cargo test`
## Summary
This adds a `ruff rule --all` switch that prints out a human-readable
Markdown or a machine-readable JSON document of the lint rules known to
Ruff.
I needed a machine-readable document of the rules [for a
project](https://github.com/astral-sh/ruff/discussions/5078), and
figured it could be useful for other people – or tooling! – to be able
to interrogate Ruff about its arcane knowledge.
The JSON output is an array of the same objects printed by `ruff rule
--format=json`.
## Test Plan
I ran `ruff rule --all --format=json`. I think more might be needed, but
maybe a snapshot test is overkill?
## Summary
Implement Pylint `typevar-name-mismatch` (`C0132`) as
`type-param-name-mismatch` (`PLC0132`). Includes documentation. Related
to #970.
The Pylint implementation checks only `TypeVar`, but this PR checks
`TypeVarTuple`, `ParamSpec`, and `NewType` as well. This seems to better
represent the Pylint rule's [intended
behaviour](https://github.com/pylint-dev/pylint/issues/5224).
Full disclosure: I am not a fan of the translated name and think it
should probably be different.
## Test Plan
`cargo test`
## Summary
This makes the output of `check-formatter-stability` more concise by
removing extraneous newlines. It also adds a `--error-file` option to
that script that allows creating a file with just the errors (without
the status messages) to share with others.
## Test Plan
I ran it over CPython and looked at the output. I then added the
`--error-file` option and looked at the contents of the file
## Summary
Format import statements in all their variants. Specifically, this
implemented formatting `StmtImport`, `StmtImportFrom` and `Alias`.
## Test Plan
I added some custom snapshots, even though this has been covered well by
black's tests.
## Summary
As discussed on ~IRC~ Discord, this will make it easier for e.g. the
docs generation stuff to get all rules for a linter (using
`all_rules()`) instead of just non-nursery ones, and it also makes it
more Explicit Is Better Than Implicit to iterate over linter rules.
Grepping for `Item = Rule` reveals some remaining implicit
`IntoIterator`s that I didn't feel were necessarily in scope for this
(and honestly, iterating over a `RuleSet` makes sense).
## Summary
If a comma separated list has only one entry, black will respect the
magic trailing comma, but it will not add a new one.
The following code will remain as is:
```python
b1 = [
aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa
]
b2 = [
aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa,
]
b3 = [
aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa,
aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa
]
```
## Test Plan
This was first discovered in
7eeadc82c2/django/contrib/admin/checks.py (L674-L681),
which i've minimized into a call test.
I've added tests for the three cases (one entry + no comma, one entry +
comma, more than one entry) to the list tests.
The diffs from the black tests get smaller.
## Summary
Removing some false positives based on running over `zulip`.
`PERF401` now also detects cases like:
```py
original = list(range(10000))
filtered = []
for i in original:
filtered.append(i * i)
```
Previously, these were caught by the list-copy rule, but these too need
comprehensions.
## Summary
This PR applies the fix in #5478 to a variety of other call-sites, and
fixes some other range hygienic stuff in the rules that were modified.
## Summary
Change generator formatting dummy to include `NOT_YET_IMPLEMENTED`. This
makes it easier to correctly identify them as dummies
## Test Plan
This is a dummy change
## Summary
Adds `PERF401` and `PERF402` mirroring `W8401` and `W8402` from
https://github.com/tonybaloney/perflint
Implementation is not super smart but should be at parity with upstream
implementation judging by:
c07391c176/perflint/comprehension_checker.py (L42-L73)
It essentially checks:
- If the body of a for-loop is just one statement
- If that statement is an `if` and the if-statement contains a call to
`append()` we flag `PERF401` and suggest a list comprehension
- If that statement is a plain call to `append()` or `insert()` we flag
`PERF402` and suggest `list()` or `list.copy()`
I've set the violation to only flag the first append call in a long
`if-else` statement for `PERF401`. Happy to change this to some other
location or make it multiple violations if that makes more sense.
## Test Plan
Fixtures were added with the relevant scenarios for both rules
## Issue Links
Refers: https://github.com/astral-sh/ruff/issues/4789
## Summary
Implements flake8-pyi checks 002, 003, 004, 005. The logic is a bit
complex, as you can see in the [original
code](57921813c1/pyi.py (L1403C18-L1403C18)).
ref: #848
## Test Plan
Updated snapshot tests. Ran flake8 to double check lints, and ran ruff
with all PYI lints enabled to check for incorrect overlapping lint
errors.
Support for `let…else` formatting was just merged to nightly
(rust-lang/rust#113225). Rerun `cargo fmt` with Rust nightly 2023-07-02
to pick this up. Followup to #939.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
## Summary
This PR reduces the noise from `DJ012` by emitting a single violation
when you have multiple consecutive violations of the same "type".
For example, given:
```py
class MultipleConsecutiveFields(models.Model):
"""Model that contains multiple out-of-order field definitions in a row."""
class Meta:
verbose_name = "test"
first_name = models.CharField(max_length=32)
last_name = models.CharField(max_length=32)
```
It's convenient to only error on `first_name`, and not `last_name`,
since we're really flagging that the _section_ is out-of-order.
Closes#5465.
## Summary
Given a docstring like:
```py
def f(a: int, b: int) -> int:
"""Showcase function.
Parameters
----------
a : int
_description_
b : int
_description_
Returns
-------
int
_description
"""
```
We were failing to identify `Returns` as a section, because the previous
line was neither empty nor ended with punctuation. This was causing a
false negative, where by we weren't flagging a missing line before
`Returns`. So, the very reason for the rule (no blank line) was causing
us to fail to catch it.
Note that, we did have a test case for this, which was working properly:
```py
def f() -> int:
"""Showcase function.
Parameters
----------
Returns
-------
"""
```
...because the line before `Returns` "ends in a punctuation mark" (`-`).
Closes#5442.
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR normalizes line endings inside of strings to `\n` as required by the printer.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
I added a new test using `\r\n` and ran the ecosystem check. There are no remaining end of line panics.
https://gist.github.com/MichaReiser/8f36b1391ca7b48475b3a4f592d74ff4
<!-- How was it tested? -->
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR uses rayon to parallelize the stability check by scheduling each project as its own task.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
I ran the ecosystem check. It now makes use of all cores (except at the end, there are some large projects).
## Performance
The check now completes in minutes where it took about 30 minutes before.
<!-- How was it tested? -->
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR fixes an issue where the binary expression formatting removed parentheses around the left hand side of an expression.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
I added a new regression test and re-ran the ecosystem check. It brings down the `check-formatter-stability` output from a 3.4MB file down to 900KB.
<!-- How was it tested? -->
## Summary
This PR makes E731 a "manual" fix in one other context: when the lambda
is shadowing another variable in the scope. Function declarations (with
shadowing) cause issues for type checkers, and so rewriting an
annotation, e.g., in branches of an `if` statement can lead to failures.
Closes https://github.com/astral-sh/ruff/issues/5421.
## Summary
Completes the documentation for the `flake8-logging-format` rules.
Related to #2646.
I included both the `flake8-logging-format` recommendation to use the
`extra` keyword and the Pylint recommendation to pass format values as
parameters so that formatting is done lazily, as #970 suggests the
Pylint logging rules are covered by this ruleset. Using lazy formatting
via parameters is probably more common than avoiding formatting entirely
in favour of the `extra` argument, regardless.
## Test Plan
`python scripts/check_docs_formatted.py`
## Summary
This PR extracts a bunch of complex logic from `add_binding`, instead
running the the shadowing rules in the deferred handler, thereby
decoupling the binding phase (during which we build up the semantic
model) from the analysis phase, and generally making `add_binding` much
more focused.
This was made possible by improving the semantic model to better handle
deletions -- previously, we'd "lose track" of bindings if they were
deleted, which made this kind of refactor impossible.
## Test Plan
We have good automated coverage for this, but I want to benchmark it
separately.
## Summary
This PR fixes a silent failure that manifested itself in
https://github.com/astral-sh/ruff-vscode/issues/238. In short, if the
user provided invalid arguments to Ruff in the VS Code extension (like
`"ruff.args": ["a"]`), then we generated something like the following
command:
```console
/path/to/ruff --force-exclude --no-cache --no-fix --format json - --fix a --stdin-filename /path/to/file.py
```
Since this contains both `-` and `a` as the "input files", Ruff would
treat this as if we're linting the files names `-` and `a`, rather than
linting standard input.
This PR modifies out standard input detection to force standard input
when `--stdin-filename` is present, or at least one file is `-`. (We
then warn and ignore the others.)
## Summary
This was just an oversight -- the last remaining `todo!()` that I never
filled in. We clearly don't have any test coverage for it yet, but this
mimics the Pyright implementation.
## Summary
This PR adds some snapshot tests for the resolver based on executing
resolutions within a "mock" of the Airflow repo (that is: a folder that
contains a subset of the repo's files, but all empty, and with an
only-partially-complete virtual environment). It's intended to act as a
lightweight integration test, to enable us to test resolutions on a
"real" project without adding a dependency on Airflow itself.
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Currently the URL at the bottom of the `ruff rule SLOT00x` output points
to Python 3.7 docs.
Given that Python 3.7 is now end-of-life (as of yesterday), let's
instead point users to the current Python docs.
## Test Plan
<!-- How was it tested? -->
## Summary
Consider Jupyter index for code frames (`--show-source`).
This solves two problems as mentioned in the linked issue:
> Omit any contents from adjoining cells
If the Jupyter index is present, we'll use that to check if the
surrounding
lines belong to the same cell as the content line. If not, we'll skip
that line
until we either reach the one which does or we reach the content line.
> code frame line number
If the Jupyter index is present, we'll use that to get the actual start
line in
corresponding to the computed start index.
## Test Plan
`cargo run --bin ruff -- check --no-cache --isolated --select=ALL --show-source /path/to/notebook.ipynb`
fixes: #5395
## Summary
The `Y053` rule of `flake8-pyi` ignores docstrings, it only triggers on
other string literals.
The separate `Y021/PYI021` rule exists to disallow docstrings.
## Test Plan
Added some `# OK` test cases to `PYI053.py(i)` files.
## Summary
Implement Pylint rule `single-string-used-for-slots` (`C0205`) as
`single-string-slots` (`PLC0205`). This rule checks for single strings
being assigned to `__slots__`. For example
```python
class Foo:
__slots__: str = "bar"
def __init__(self, bar: str) -> None:
self.bar = bar
```
should be
```python
class Foo:
__slots__: tuple[str, ...] = ("bar",)
def __init__(self, bar: str) -> None:
self.bar = bar
```
Related to #970. Includes documentation.
## Test Plan
`cargo test`
## Summary
Replace same length equal line with dash line in D407
Do we want to update the message and autofix title to reflect this
change?
## Test Plan
Added test cases for:
- Equal line length == dash line length
- Equal line length != dash line length
fixes: #5378
## Summary
This PR contains the first step towards enabling robust first-party,
third-party, and standard library import resolution in Ruff (including
support for `typeshed`, stub files, native modules, etc.) by porting
Pyright's import resolver to Rust.
The strategy taken here was to start with a more-or-less direct port of
the Pyright's TypeScript resolver. The code is intentionally similar,
and the test suite is effectively a superset of Pyright's test suite for
its own resolver. Due to the nature of the port, the code is very, very
non-idiomatic for Rust. The code is also entirely unused outside of the
test suite, and no effort has been made to integrate it with the rest of
the codebase.
Future work will include:
- Refactoring the code (now that it works) to match Rust and Ruff
idioms.
- Further testing, in practice, to ensure that the resolver can resolve
imports in a complex project, when provided with a virtual environment
path.
- Caching, to minimize filesystem lookups and redundant resolutions.
- Integration into Ruff itself (use Ruff's existing settings, find rules
that can make use of robust resolution, etc.)
ruff_dev repeat recently broke (i think with the cargo update?):
> thread 'main' panicked at 'Command repeat: Short option names must be
unique for each argument, but '-n' is in use by both 'no_cache' and
'repeat''
This fixes this by removing the short argument.
## Summary
This formats call expressions with magic trailing comma and parentheses
behaviour but without call chaining
## Test Plan
Lots of new test fixtures, including some that don't work yet
## Summary
Implements PERF203 from #4789, which throws if a `try/except` block is
inside of a loop. Not sure if we want to extend the diagnostic to the
`except` as well, but I thought that that may get a little messy. We may
also want to just throw on the word `try` - open to suggestions though.
## Test Plan
`cargo test`
## Summary
Experimental release for Jupyter Notebook integration.
Currently, this requires a user to explicitly opt-in using the
[include](https://beta.ruff.rs/docs/settings/#include) configuration:
```toml
[tool.ruff]
include = ["*.py", "*.pyi", "**/pyproject.toml", "*.ipynb"]
```
Or, a user can pass in the file directly:
```sh
ruff check path/to/notebook.ipynb
```
For known limitations, please refer #5188
## Test Plan
Following command should work without the `--all-features` flag:
```sh
cargo dev round-trip /path/to/notebook.ipynb
```
Following command should work with the above config file along with
`select = ["ALL"]`:
```sh
cargo run --bin ruff -- check --no-cache --config=../test-repos/openai-cookbook/pyproject.toml --fix ../test-repos/openai-cookbook/
```
Passing the Jupyter notebook directly:
```sh
cargo run --bin ruff -- check --no-cache --isolated --select=ALL --fix ../test-repos/openai-cookbook/examples/Classification_using_embeddings.ipynb
```
## Summary
Add documentation to the `D1XX` rules that flag missing docstrings.
The examples are quite long and docstrings practices vary a lot between
projects, so I thought it would be best that the documentation for these
rules be their own PR separate to the other `pydocstyle` rules.
Related to #2646.
## Test Plan
`python scripts/check_docs_formatted.py`
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR extends the string formatting to respect the configured quote style.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
Extended the string test with new cases and set it up to run twice: Once with the `quote_style: Doube`, and once with `quote_style: Single` single and double quotes.
<!-- How was it tested? -->
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR adds tests that verify that the magic trailing comma is not respected if disabled in the formatter options.
Our test setup now allows to create a `<fixture-name>.options.json` file that contains an array of configurations that should be tested.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
It's all about tests :)
<!-- How was it tested? -->
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR adds a new `PyFormatOptions` struct that stores the python formatter options.
The new options aren't used yet, with the exception of magical trailing commas and the options passed to the printer.
I'll follow up with more PRs that use the new options (e.g. `QuoteStyle`).
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
`cargo test` I'll follow up with a new PR that adds support for overriding the options in our fixture tests.
## Motation
Previously,
```python
x = (
a1
.a2
# a
. # b
# c
a3
)
```
got formatted as
```python
x = a1.a2
# a
. # b
# c
a3
```
which is invalid syntax. This fixes that.
## Summary
This implements a basic form of attribute chaining
(<https://black.readthedocs.io/en/stable/the_black_code_style/current_style.html#call-chains>)
by checking if any inner attribute access contains an own line comment,
and if this is the case, adds parentheses around the outermost attribute
access while disabling parentheses for all inner attribute expressions.
We want to replace this with an implementation that uses recursion or a
stack while formatting instead of in `needs_parentheses` and also
includes calls rather sooner than later, but i'm fixing this now because
i'm uncomfortable with having known invalid syntax generation in the
formatter.
## Test Plan
I added new fixtures.
## Summary
Add documentation to the `D3XX` rules that check for issues with
docstring quotes. Related to #2646.
## Test Plan
`python scripts/check_docs_formatted.py`
## Summary
Ignore type aliases for RUF013 to avoid flagging false positives:
```python
from typing import Optional
MaybeInt = Optional[int]
def f(arg: MaybeInt = None):
pass
```
But, at the expense of having false negatives:
```python
Text = str | bytes
def f(arg: Text = None):
pass
```
## Test Plan
`cargo test`
fixes: #5295
## Summary
This is small refactoring to reuse the code that detects the magic
trailing comma across functions. I make this change now to avoid copying
code in a later PR. @MichaReiser is planning on making a larger
refactoring later that integrates with the join nodes builder
## Test Plan
No functional changes. The magic trailing comma behaviour is checked by
the fixtures.
## Summary
When visiting AugAssign in evaluation order, the AugAssign `target`
should be visited after it's `value`. Based on my testing, the pseudo
code for `a += b` is effectively:
```python
tmp = a
a = tmp.__iadd__(b)
```
That is, an ideal traversal order would look something like this:
1. load a
2. b
3. op
4. store a
But, there is only a single AST node which captures `a` in the statement
`a += b`, so it cannot be traversed both before and after the traversal
of `b` and the `op`.
Nonetheless, I think traversing `a` after `b` and the `op` makes the
most sense for a number of reasons:
1. All the other assignment expressions traverse their `value`s before
their `target`s. Having `AugAssign` traverse in the same order would be
more consistent.
2. Within the AST, the `ctx` of the `target` for an `AugAssign` is
`Store` (though technically this is a `Load` and `Store` operation, the
AST only indicates it as a `Store`). Since the the store portion of the
`AugAssign` occurs last, I think it makes sense to traverse the `target`
last as well.
The effect of this is marginal, but it may have an impact on the
behavior of #5271.
## Summary
And remove cached files that we haven't seen for a certain period of
time, currently 30 days.
For the last seen timestamp we actually use an `u64`, it's smaller on
disk than `SystemTime` (which size is OS dependent) and fits in an
`AtomicU64` which we can use to update it without locks.
## Test Plan
Added a new unit test, run by `cargo test`.
In the following code, the comment used to get wrongly associated with
the `if False` since it looked like an elif. This fixes it by checking
the indentation and adding a regression test
```python
if True:
pass
else: # Comment
if False:
pass
pass
```
Originally found in
1570b94a02/gradio/external.py (L478)
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR implements formatting for non-f-string Strings that do not use implicit concatenation.
Docstring formatting is out of the scope of this PR.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
I added a few tests for simple string literals.
## Performance
Ouch. This is hitting performance somewhat hard. This is probably because we now iterate each string a couple of times:
1. To detect if it is an implicit string continuation
2. To detect if the string contains any new lines
3. To detect the preferred quote
4. To normalize the string
Edit: I integrated the detection of newlines into the preferred quote detection so that we only iterate the string three time.
We can probably do better by merging the implicit string continuation with the quote detection and new line detection by iterating till the end of the string part and returning the offset. We then use our simple tokenizer to skip over any comments or whitespace until we find the first non trivia token. From there we keep continue doing this in a loop until we reach the end o the string. I'll leave this improvement for later.
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR adds basic formatting for compare operations.
The implementation currently breaks diffeently when nesting binary like expressions. I haven't yet figured out what Black's logic is in that case but I think that this by itself is already an improvement worth merging.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
I added a few new tests
<!-- How was it tested? -->
## Summary
The `Visitor` and `preorder::Visitor` traits provide some convenience
functions, `visit_annotation` and `visit_format_spec`, for handling
annotation and format spec expressions respectively. Both of these
functions accept an `&Expr` and have a default implementation which
delegates to `walk_expr`. The problem with this approach is that any
custom handling done in `visit_expr` will be skipped for annotations and
format specs. Instead, to capture any custom logic implemented in
`visit_expr`, both of these function's default implementations should
delegate to `visit_expr` instead of `walk_expr`.
## Example
Consider the below `Visitor` implementation:
```rust
impl<'a> Visitor<'a> for Example<'a> {
fn visit_expr(&mut self, expr: &'a Expr) {
match expr {
Expr::Name(ExprName { id, .. }) => println!("Visiting {:?}", id),
_ => walk_expr(self, expr),
}
}
}
```
Run on the following Python snippet:
```python
a: b
```
I would expect such a visitor to print the following:
```
Visiting b
Visiting a
```
But it instead prints the following:
```
Visiting a
```
Our custom `visit_expr` handler is not invoked for the annotation.
## Test Plan
Tests added in #5271 caught this behavior.
## Summary
Move `collection-literal-concatenation` markdown documentation to the
correct place.
Fixes error in #5262.
## Test Plan
`python scripts/check_docs_formatted.py`
## Summary
Adds PERF101 which checks for unnecessary casts to `list` in for loops.
NOTE: Is not fully equal to its upstream implementation as this
implementation does not flag based on type annotations
(i.e.):
```python
def foo(x: List[str]):
for y in list(x):
...
```
With the current set-up it's quite hard to get the annotation from a
function arg from its binding. Problem is best considered broader than
this implementation.
## Test Plan
Added fixture.
## Issue links
Refers: https://github.com/astral-sh/ruff/issues/4789
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
This contains three changes:
* repos in `check_ecosystem.py` are stored as `org:name` instead of
`org/name` to create a flat directory layout
* `check_ecosystem.py` performs a maximum of 50 parallel jobs at the
same time to avoid consuming to much RAM
* `check-formatter-stability` gets a new option `--multi-project` so
it's possible to do `cargo run --bin ruff_dev --
check-formatter-stability --multi-project target/checkouts`
With these three changes it becomes easy to check the formatter
stability over a larger number of repositories. This is part of the
integration of integrating formatter regressions checks into the
ecosystem checks.
## Test Plan
```shell
python scripts/check_ecosystem.py --checkouts target/checkouts --projects github_search.jsonl -v $(which true) $(which true)
cargo run --bin ruff_dev -- check-formatter-stability --multi-project target/checkouts
```
## Summary
Remove recommendations to replace
`typing_extensions.dataclass_transform` and
`typing_extensions.SupportsIndex` with their `typing` library
counterparts.
Closes#5112.
## Test Plan
Added extra checks to the test fixture.
`cargo test`
## Summary
This snippet used to panic because it expected to see a comma or
something similar after the `2` but met the closing parentheses that is
not part of the range and panicked
```python
a = {
1: (2),
# comment
3: True,
}
```
Originally found in
636a717ef0/testing/marionette/client/marionette_driver/geckoinstance.py (L109)
This snippet is also the test plan.
This solves an instability when formatting cpython. It also introduces
another one, but i think it's still a worthwhile change for now.
There's no proper testing since this is just a dummy.
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
format StmtFor
still trying to learn how to help out with the formatter. trying
something slightly more advanced than [break](#5158)
mostly copied form StmtWhile
## Test Plan
snapshots
## Summary
Ensures that `--select PL` and `--select PLC` don't include `PLC1901`.
Previously, `--select PL` _did_, because it's a "linter-level selector"
(`--select PLC` is viewed as selecting the `C` prefix from `PL`), and we
were missing this filtering path.
## Summary
This is a follow up to #5221. Turns out it was easy to restructure the
visitor to get the right order, I'm just dumb 🤷♂️ I've
removed `visit_arg_with_default` entirely from the `Visitor`, although
it still exists as part of `preorder::Visitor`.
## Motivation
While black keeps parentheses nearly everywhere, the notable exception
is in the body of for loops:
```python
for (a, b) in x:
pass
```
becomes
```python
for a, b in x:
pass
```
This currently blocks #5163, which this PR should unblock.
## Solution
This changes the `ExprTuple` formatting option to include one additional
option that removes the parentheses when not using magic trailing comma
and not breaking. It is supposed to be used through
```rust
#[derive(Debug)]
struct ExprTupleWithoutParentheses<'a>(&'a Expr);
impl Format<PyFormatContext<'_>> for ExprTupleWithoutParentheses<'_> {
fn fmt(&self, f: &mut Formatter<PyFormatContext<'_>>) -> FormatResult<()> {
match self.0 {
Expr::Tuple(expr_tuple) => expr_tuple
.format()
.with_options(TupleParentheses::StripInsideForLoop)
.fmt(f),
other => other.format().with_options(Parenthesize::IfBreaks).fmt(f),
}
}
}
```
## Testing
The for loop formatting isn't merged due to missing this (and i didn't
want to create more git weirdness across two people), but I've confirmed
that when applying this to while loops instead of for loops, then
```rust
write!(
f,
[
text("while"),
space(),
ExprTupleWithoutParentheses(test.as_ref()),
text(":"),
trailing_comments(trailing_condition_comments),
block_indent(&body.format())
]
)?;
```
makes
```python
while (a, b):
pass
while (
ajssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssa,
b,
):
pass
while (a,b,):
pass
```
formatted as
```python
while a, b:
pass
while (
ajssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssa,
b,
):
pass
while (
a,
b,
):
pass
```
## Summary
According to the AST visitor documentation, the AST visitor "visits all
nodes in the AST recursively in evaluation-order". However, the current
traversal fails to meet this specification in a few places.
### Function traversal
```python
order = []
@(order.append("decorator") or (lambda x: x))
def f(
posonly: order.append("posonly annotation") = order.append("posonly default"),
/,
arg: order.append("arg annotation") = order.append("arg default"),
*args: order.append("vararg annotation"),
kwarg: order.append("kwarg annotation") = order.append("kwarg default"),
**kwargs: order.append("kwarg annotation")
) -> order.append("return annotation"):
pass
print(order)
```
Executing the above snippet using CPython 3.10.6 prints the following
result (formatted for readability):
```python
[
'decorator',
'posonly default',
'arg default',
'kwarg default',
'arg annotation',
'posonly annotation',
'vararg annotation',
'kwarg annotation',
'kwarg annotation',
'return annotation',
]
```
Here we can see that decorators are evaluated first, followed by
argument defaults, and annotations are last. The current traversal of a
function's AST does not align with this order.
### Annotated assignment traversal
```python
order = []
x: order.append("annotation") = order.append("expression")
print(order)
```
Executing the above snippet using CPython 3.10.6 prints the following
result:
```python
['expression', 'annotation']
```
Here we can see that an annotated assignments annotation gets evaluated
after the assignment's expression. The current traversal of an annotated
assignment's AST does not align with this order.
## Why?
I'm slowly working on #3946 and porting over some of the logic and tests
from ssort. ssort is very sensitive to AST traversal order, so ensuring
the utmost correctness here is important.
## Test Plan
There doesn't seem to be existing tests for the AST visitor, so I didn't
bother adding tests for these very subtle changes. However, this
behavior will be captured in the tests for the PR which addresses #3946.
## Summary
This is a complete rewrite of the handling of `/` and `*` comment
handling in function signatures. The key problem is that slash and star
don't have a note. We now parse out the positions of slash and star and
their respective preceding and following note. I've left code comments
for each possible case of function signature structure and comment
placement
## Test Plan
I extended the function statement fixtures with cases that i found. If
you have more weird edge cases your input would be appreciated.
## Summary
This fixes two problems discovered when trying to format the cpython
repo with `cargo run --bin ruff_dev -- check-formatter-stability
projects/cpython`:
The first is to ignore try/except trailing comments for now since they
lead to unstable formatting on the dummy.
The second is to avoid dropping trailing if comments through placement:
This changes the placement to keep a comment trailing an if-elif or
if-elif-else to keep the comment a trailing comment on the entire if.
Previously the last comment would have been lost.
```python
if "first if":
pass
elif "first elif":
pass
```
The last remaining problem in cpython so far is function signature
argument separator comment placement which is its own PR on top of this.
## Test Plan
I added test fixtures of minimized examples with links back to the
original cpython location
## Summary
While fixing https://github.com/astral-sh/ruff/pull/5233, I noticed that
in FastAPI, 343 out of 823 files weren't hitting the cache. It turns out
these are standalone files in the documentation that lack a "package
root". Later, when looking up the cache entries, we fallback to the
package directory.
This PR ensures that we initialize the cache for both kinds of files:
those that are in a package, and those that aren't.
The total size of the FastAPI cache for me is now 388K. I also suspect
that this approach is much faster than as initially written, since
before, we were probably initializing one cache per _directory_.
## Test Plan
Ran `cargo run -p ruff_cli -- check ../fastapi --verbose`; verified
that, on second execution, there were no "Checking" entries in the logs.
## Summary
In the latest release, we made some improvements to the semantic model,
but our modifications to exception-unbinding are causing some
false-positives. For example:
```py
try:
v = 3
except ImportError as v:
print(v)
else:
print(v)
```
In the latest release, we started unbinding `v` after the `except`
handler. (We used to restore the existing binding, the `v = 3`, but this
was quite complicated.) Because we don't have full branch analysis, we
can't then know that `v` is still bound in the `else` branch.
The solution here modifies `resolve_read` to skip-lookup when hitting
unbound exceptions. So when store the "unbind" for `except ImportError
as v`, we save the binding that it shadowed `v = 3`, and skip to that.
Closes#5249.
Closes#5250.
This formats slice expressions and subscript expressions.
Spaces around the colons follows the same rules as black
(https://black.readthedocs.io/en/stable/the_black_code_style/current_style.html#slices):
```python
e00 = "e"[:]
e01 = "e"[:1]
e02 = "e"[: a()]
e10 = "e"[1:]
e11 = "e"[1:1]
e12 = "e"[1 : a()]
e20 = "e"[a() :]
e21 = "e"[a() : 1]
e22 = "e"[a() : a()]
e200 = "e"[a() : :]
e201 = "e"[a() :: 1]
e202 = "e"[a() :: a()]
e210 = "e"[a() : 1 :]
```
Comment placement is different due to our very different infrastructure.
If we have explicit bounds (e.g. `x[1:2]`) all comments get assigned as
leading or trailing to the bound expression. If a bound is missing
`[:]`, comments get marked as dangling and placed in the same section as
they were originally in:
```python
x = "x"[ # a
# b
: # c
# d
]
```
to
```python
x = "x"[
# a
# b
:
# c
# d
]
```
Except for the potential trailing end-of-line comments, all comments get
formatted on their own line. This can be improved by keeping end-of-line
comments after the opening bracket or after a colon as such but the
changes were already complex enough.
I added tests for comment placement and spaces.
## Summary
I found it hard to figure out which function decides placement for a
specific comment. An explicit loop makes this easier to debug
## Test Plan
There should be no functional changes, no changes to the formatting of
the fixtures.
This moves all docs about benchmarking and profiling into
CONTRIBUTING.md by moving the readme of `ruff_benchmark` and adding more
information on profiling.
We need to somehow consolidate that documentation, but i'm not convinced
that this is the best way (i tried subpages in mkdocs, but that didn't
seem good either), so i'm happy to take suggestions.
## Summary
Previously, `DecoratedComment` used `text_position()` and
`SourceComment` used `position()`. This PR unifies this to
`line_position` everywhere.
## Test Plan
This is a rename refactoring.
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR adds basic formatting for unary expressions.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
I added a new `unary.py` with custom test cases
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Black supports for layouts when it comes to breaking binary expressions:
```rust
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
enum BinaryLayout {
/// Put each operand on their own line if either side expands
Default,
/// Try to expand the left to make it fit. Add parentheses if the left or right don't fit.
///
///```python
/// [
/// a,
/// b
/// ] & c
///```
ExpandLeft,
/// Try to expand the right to make it fix. Add parentheses if the left or right don't fit.
///
/// ```python
/// a & [
/// b,
/// c
/// ]
/// ```
ExpandRight,
/// Both the left and right side can be expanded. Try in the following order:
/// * expand the right side
/// * expand the left side
/// * expand both sides
///
/// to make the expression fit
///
/// ```python
/// [
/// a,
/// b
/// ] & [
/// c,
/// d
/// ]
/// ```
ExpandRightThenLeft,
}
```
Our current implementation only handles `ExpandRight` and `Default` correctly. This PR adds support for `ExpandRightThenLeft` and `ExpandLeft`.
## Test Plan
I added tests that play through all 4 binary expression layouts.
## Summary
I initially wanted this category to be more general and decoupled from
the plugin, but I got some feedback that the titling felt inconsistent
with others.
## Summary
This is a proper fix for the issue patched-over in
https://github.com/astral-sh/ruff/pull/5229, thanks to an extremely
helpful repro from @tlambert03 in that thread. It looks like we were
using the keys of `package_roots` rather than the values to initialize
the cache -- but it's a map from package to package root.
## Test Plan
Reverted #5229, then ran through the plan that @tlambert03 included in
https://github.com/astral-sh/ruff/pull/5229#issuecomment-1599723226.
Verified the panic before but not after this change.
## Summary
This PR reverts #4971 (aba073a791). It
turns out that `f"{str(x)}"` and `f"{x}"` are often but not exactly
equivalent, and performing that conversion automatically can lead to
subtle bugs, See the discussion in
https://github.com/astral-sh/ruff/issues/4958.
## Summary
I haven't been able to determine why / when this is happening, but in
some cases, users are reporting that this `unwrap()` is causing a panic.
It's fine to just return `None` here and fallback to "No cache",
certainly better than panicking (while we figure out the edge case).
Closes#5225.
Closes#5228.
## Summary
I'm looking into the Black stability tests, and here's one failing case.
We split `assert a and (b and c)` into:
```python
assert a
assert (b and c)
```
We fail to split `assert (b and c)` due to the parentheses. But Black
then removes then, and when running Ruff again, we get:
```python
assert a
assert b
assert c
```
This PR just enables us to fix to this in one pass.
## Summary
A new CLI option (`-o`/`--output-file`) to write output to a file
instead of stdout.
Major change is to remove the lock acquired on stdout. The argument is
that the output is buffered and thus the lock is acquired only when
writing a block (8kb). As per the benchmark below there is a slight
performance penalty.
Reference:
https://rustmagazine.org/issue-3/javascript-compiler/#printing-is-slow
## Benchmarks
_Output is truncated to only contain useful information:_
Command: `check --isolated --no-cache --select=ALL --show-source
./test-repos/cpython"`
Latest HEAD (361d45f2b2) with and without
the manual lock on stdout:
```console
Benchmark 1: With lock
Time (mean ± σ): 5.687 s ± 0.075 s [User: 17.110 s, System: 0.486 s]
Range (min … max): 5.615 s … 5.860 s 10 runs
Benchmark 2: Without lock
Time (mean ± σ): 5.719 s ± 0.064 s [User: 17.095 s, System: 0.491 s]
Range (min … max): 5.640 s … 5.865 s 10 runs
Summary
(1) ran 1.01 ± 0.02 times faster than (2)
```
This PR:
```console
Benchmark 1: This PR
Time (mean ± σ): 5.855 s ± 0.058 s [User: 17.197 s, System: 0.491 s]
Range (min … max): 5.786 s … 5.987 s 10 runs
Benchmark 2: Latest HEAD with lock
Time (mean ± σ): 5.645 s ± 0.033 s [User: 16.922 s, System: 0.495 s]
Range (min … max): 5.600 s … 5.712 s 10 runs
Summary
(2) ran 1.04 ± 0.01 times faster than (1)
```
## Test Plan
Run all of the commands which gives output with and without the
`--output-file=ruff.out` option:
* `--show-settings`
* `--show-files`
* `--show-fixes`
* `--diff`
* `--select=ALL`
* `--select=All --show-source`
* `--watch` (only stdout allowed)
resolves: #4754
## Summary
Black supports for layouts when it comes to breaking binary expressions:
```rust
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
enum BinaryLayout {
/// Put each operand on their own line if either side expands
Default,
/// Try to expand the left to make it fit. Add parentheses if the left or right don't fit.
///
///```python
/// [
/// a,
/// b
/// ] & c
///```
ExpandLeft,
/// Try to expand the right to make it fix. Add parentheses if the left or right don't fit.
///
/// ```python
/// a & [
/// b,
/// c
/// ]
/// ```
ExpandRight,
/// Both the left and right side can be expanded. Try in the following order:
/// * expand the right side
/// * expand the left side
/// * expand both sides
///
/// to make the expression fit
///
/// ```python
/// [
/// a,
/// b
/// ] & [
/// c,
/// d
/// ]
/// ```
ExpandRightThenLeft,
}
```
Our current implementation only handles `ExpandRight` and `Default` correctly. `ExpandLeft` turns out to be surprisingly hard. This PR adds a new `BestFittingMode` parameter to `BestFitting` to support `ExpandLeft`.
There are 3 variants that `ExpandLeft` must support:
**Variant 1**: Everything fits on the line (easy)
```python
[a, b] + c
```
**Variant 2**: Left breaks, but right fits on the line. Doesn't need parentheses
```python
[
a,
b
] + c
```
**Variant 3**: The left breaks, but there's still not enough space for the right hand side. Parenthesize the whole expression:
```python
(
[
a,
b
]
+ ccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
)
```
Solving Variant 1 and 2 on their own is straightforward The printer gives us this behavior by nesting right inside of the group of left:
```
group(&format_args![
if_group_breaks(&text("(")),
soft_block_indent(&group(&format_args![
left,
soft_line_break_or_space(),
op,
space(),
group(&right)
])),
if_group_breaks(&text(")"))
])
```
The fundamental problem is that the outer group, which adds the parentheses, always breaks if the left side breaks. That means, we end up with
```python
(
[
a,
b
] + c
)
```
which is not what we want (we only want parentheses if the right side doesn't fit).
Okay, so nesting groups don't work because of the outer parentheses. Sequencing groups doesn't work because it results in a right-to-left breaking which is the opposite of what we want.
Could we use best fitting? Almost!
```
best_fitting![
// All flat
format_args![left, space(), op, space(), right],
// Break left
format_args!(group(&left).should_expand(true), space(), op, space(), right],
// Break all
format_args![
text("("),
block_indent!(&format_args![
left,
hard_line_break(),
op,
space()
right
])
]
]
```
I hope I managed to write this up correctly. The problem is that the printer never reaches the 3rd variant because the second variant always fits:
* The `group(&left).should_expand(true)` changes the group so that all `soft_line_breaks` are turned into hard line breaks. This is necessary because we want to test if the content fits if we break after the `[`.
* Now, the whole idea of `best_fitting` is that you can pretend that some content fits on the line when it actually does not. The way this works is that the printer **only** tests if all the content of the variant **up to** the first line break fits on the line (we insert that line break by using `should_expand(true))`. The printer doesn't care whether the rest `a\n, b\n ] + c` all fits on (multiple?) lines.
Why does breaking right work but not breaking the left? The difference is that we can make the decision whether to parenthesis the expression based on the left expression. We can't do this for breaking left because the decision whether to insert parentheses or not would depend on a lookahead: will the right side break. We simply don't know this yet when printing the parentheses (it would work for the right parentheses but not for the left and indent).
What we kind of want here is to tell the printer: Look, what comes here may or may not fit on a single line but we don't care. Simply test that what comes **after** fits on a line.
This PR adds a new `BestFittingMode` that has a new `AllLines` option that gives us the desired behavior of testing all content and not just up to the first line break.
## Test Plan
I added a new example to `BestFitting::with_mode`
## Summary
Completes the documentation for the `flake8-bugbear` ruleset. Related to
#2646.
## Test Plan
`python scripts/check_docs_formatted.py`
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
Now that all identifiers include ranges (#5194), we can remove a ton of
this "custom lexing" code that we have to sketchily extract identifier
ranges from source.
## Test Plan
`cargo test`
## Summary
In https://github.com/astral-sh/RustPython-Parser/pull/8, we modified
RustPython to include ranges for any identifiers that aren't
`Expr::Name` (which already has an identifier).
For example, the `e` in `except ValueError as e` was previously
un-ranged. To extract its range, we had to do some lexing of our own.
This change should improve performance and let us remove a bunch of
code.
## Test Plan
`cargo test`
## Summary
Open cache files in parallel (again), brings the performance back to be roughly equal to the old implementation.
## Test Plan
Existing tests should keep working.
## Summary
Handle trailing newline in Jupyter Notebook JSON string similar to how
`black`
does it.
## Test Plan
Add test cases when the JSON string for notebook ends with and without a
newline.
resolves: #5190
## Summary
Fixes#4404.
Consider this file:
```python
if True:
x = 1; \
<space><space><space>
```
The current implementation of W293 removes the 3 spaces on line 2. This
fix changes the file to:
```python
if True:
x = 1; \
```
A file can't end in a `\`, according to Python's [lexical
analysis](https://docs.python.org/3/reference/lexical_analysis.html), so
subsequent iterations of the autofixer fail (the AST-based ones
specifically, since they depend on a valid syntax tree and get
re-parsed).
This patch examines the line before the line checked in `W293`. If its
first non-whitespace character is a `\`, the patch will extend the
diagnostic's fix range to all whitespace up until the previous line's
*second* non-whitespace character; that is, it deletes all spaces and
potential `\`s up until the next non-whitespace character on the
previous line.
## Test Plan
Ran `cargo run -p ruff_cli -- ~/Downloads/aa.py --fix --select W293,D100
--no-cache` against the above file. This resulted in:
```
/Users/evan/Downloads/aa.py:1:1: D100 Missing docstring in public module
Found 2 errors (1 fixed, 1 remaining).
```
The file's contents, after the fix:
```python
if True:
x = 1;<space>
```
The `\` was removed, leaving the terminal space. The space should be
handled by `Rule::TrailingWhitespace`, not `BlankLineWithWhitespace`.
## Summary
This PR modifies our statement deletion logic to delete any preceding
continuation lines.
For example, given:
```py
x = 1; \
import os
```
We'll now rewrite to:
```py
x = 1;
```
In addition, the logic can now handle multiple preceding continuations
(which is unlikely, but valid).
## Summary
This PR upgrade RustPython to pull in the changes to `Arguments` (zip
defaults with their identifiers) and all the renames to `CmpOp` and
friends.
## Summary
I want to include URLs to the rule documentation in the LSP (the LSP has
a native `code_description` field for this, which, if specified, causes
the source to be rendered as a link to the docs). This PR exposes the
URL to the documentation in the Ruff JSON output.
## Summary
Maintain consistency while deserializing Jupyter notebook to JSON. The
following changes were made:
1. Use string array to store the source value as that's the default
(5781720423/nbformat/v4/nbjson.py (L56-L57))
2. Remove unused structs and enums
3. Reorder the keys in alphabetical order as that's the default.
(5781720423/nbformat/v4/nbjson.py (L51))
### Side effect
Removing the `preserve_order` feature means that the order of keys in
JSON output (`--format json`) will be in alphabetical order. This is
because the value is represented using `serde_json::Value` which
internally is a `BTreeMap`, thus sorting it as per the string key. For
posterity if this turns out to be not ideal, then we could define a
struct representing the JSON object and the order of struct fields will
determine the order in the JSON string.
## Test Plan
Add a test case to assert the raw JSON string.
## Summary
This changes the caching design from one cache file per source file, to
one cache file per package. This greatly reduces the amount of cache
files that are opened and written, while maintaining roughly the same
(combined) size as bincode is very compact.
Below are some very much not scientific performance tests. It uses
projects/sources to check:
* small.py: single, 31 bytes Python file with 2 errors.
* test.py: single, 43k Python file with 8 errors.
* fastapi: FastAPI repo, 1134 files checked, 0 errors.
Source | Before # files | After # files | Before size | After size
-------|-------|-------|-------|-------
small.py | 1 | 1 | 20 K | 20 K
test.py | 1 | 1 | 60 K | 60 K
fastapi | 1134 | 518 | 4.5 M | 2.3 M
One question that might come up is why fastapi still has 518 cache files
and not 1? That is because this is using the existing package
resolution, which sees examples, docs, etc. as separate from the "main"
source code (in the fastapi directory in the repo). In this future it
might be worth consider switching to a one cache file per repo strategy.
This new design is not perfect and does have a number of known issues.
First, like the old design it doesn't remove the cache for a source file
that has been (re)moved until `ruff clean` is called.
Second, this currently uses a large mutex around the mutation of the
package cache (e.g. inserting result). This could be (or become) a
bottleneck. It's future work to test and improve this (if needed).
Third, currently the packages and opened and stored in a sequential
loop, this could be done parallel. This is also future work.
## Test Plan
Run `ruff check` (with caching enabled) twice on any Python source code
and it should produce the same results.
## Summary
We want to ensure that once formatted content stays the same when
formatted again, which is known as formatter stability or formatter
idempotency, and that the formatter prints syntactically valid code. As
our test cases cover only a limited amount of code, this allows checking
entire repositories.
This adds a new subcommand to `ruff_dev` which can be invoked as `cargo
run --bin ruff_dev -- check-formatter-stability <repo>`. While initially
only intended to check stability, it has also found cases where the
formatter printed invalid syntax or panicked.
## Test Plan
Running this on cpython is already identifying bugs
(https://github.com/astral-sh/ruff/pull/5089)
This documentation change improves the section on dangling comments in
the formatter.
---------
Co-authored-by: David Szotten <davidszotten@gmail.com>
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
This adds a new subcommand that can be used as
```shell
cargo build --bin ruff_dev --profile=release-debug
perf record -g -F 999 target/release-debug/ruff_dev repeat --repeat 30 --exit-zero --no-cache path/to/cpython > /dev/null
flamegraph --perfdata perf.data
```
## Test Plan
This is a ruff internal script. I successfully used it to profile
cpython with the instructions above
## Summary
We weren't resetting the `allow_ellipsis` flag properly, which
ultimately caused us to treat the semicolon as "unnecessary" rather than
"creating a multi-statement line".
Closes#5154.
## Summary
Given:
```python
\
import os
```
Deleting `import os` leaves a syntax error: a file can't end in a
continuation. We have code to handle this case, but it failed to pick up
continuations at the _very start_ of a file.
Closes#5156.
## Summary
This PR moves the "unconventional import alias" rule (which enforces,
e.g., that `pandas` is imported as `pd`) to the "dead scopes" phase,
after the main linter pass. This (1) avoids an allocation since we no
longer need to create the qualified name in the linter pass; and (2)
will allow us to autofix it, since we'll have access to all references.
## Test Plan
`cargo test` -- all changes are to ranges (which are improvements IMO).
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Format `continue` statement.
## Test Plan
`continue` is used already in some tests, but if a new test is needed I
could add it.
---------
Co-authored-by: konstin <konstin@mailbox.org>
This tackles three problems:
* pre-commit was slow because it ran cargo commands
* Improve the clarity on what you need to run to get your PR pass on CI
(and make those fast)
* You had to compile and run `cargo dev generate-all` separately, which
was slow
The first change is to remove all cargo commands except running ruff
itself from pre-commit. With `cargo run --bin ruff` already compiled it
takes about 7s on my machine. It would make sense to also use the ruff
pre-commit action here even if we're then lagging a release behind for
checking ruff on ruff.
The contributing guide is now clear about what you need to run:
```shell
cargo clippy --workspace --all-targets --all-features -- -D warnings # Linting...
RUFF_UPDATE_SCHEMA=1 cargo test # Testing and updating ruff.schema.json
pre-commit run --all-files # rust and python formatting, markdown and python linting, etc.
```
Example timings from my machine:
`cargo clippy --workspace --all-targets --all-features -- -D warnings`:
23s
`RUFF_UPDATE_SCHEMA=1 cargo test`: 2min (recompiling), 1min (no code
changes, this is mainly doc tests)
`pre-commit run --all-files`: 7s
The exact numbers don't matter so much as the approximate experience (6s
is easier to just wait than 1min, esp if you need to fix and rerun). The
biggest remaining block seems to be doc tests, i'm surprised i didn't
find any solution to speeding them up (nextest simply doesn't run them
at all). Also note that the formatter has it's own tests which are much
faster since they avoid linking ruff (`cargo test
ruff_python_formatter`).
The third change is to enable `cargo test` to update the schema. Similar
to `INSTA_UPDATE=always`, i've added `RUFF_UPDATE_SCHEMA=1` (name open
to bikeshedding), so `RUFF_UPDATE_SCHEMA=1 cargo test` updates the
schema, while `cargo test` still fails as expected if the repo isn't
up-to-date.
---------
Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
## Summary
Completes the documentation for the `flake8-blind-except` and
`flake8-raise` rules.
Related to #2646.
## Test Plan
`python scripts/check_docs_formatted.py`
## Summary
After #5140, I audited the codebase for similar patterns (defining a
list of `CallPath` entities in a static vector, then looping over them
to pattern-match). This PR migrates all other such cases to use `match`
and `matches!` where possible.
There are a few benefits to this:
1. It more clearly denotes the intended semantics (branches are
exclusive).
2. The compiler can help deduplicate the patterns and detect unreachable
branches.
3. Performance: in the benchmark below, the all-rules performance is
increased by nearly 10%...
## Benchmarks
I decided to benchmark against a large file in the Airflow repository
with a lot of type annotations
([`views.py`](https://raw.githubusercontent.com/apache/airflow/f03f73100e8a7d6019249889de567cb00e71e457/airflow/www/views.py)):
```
linter/default-rules/airflow/views.py
time: [10.871 ms 10.882 ms 10.894 ms]
thrpt: [19.739 MiB/s 19.761 MiB/s 19.781 MiB/s]
change:
time: [-2.7182% -2.5687% -2.4204%] (p = 0.00 < 0.05)
thrpt: [+2.4805% +2.6364% +2.7942%]
Performance has improved.
linter/all-rules/airflow/views.py
time: [24.021 ms 24.038 ms 24.062 ms]
thrpt: [8.9373 MiB/s 8.9461 MiB/s 8.9527 MiB/s]
change:
time: [-8.9537% -8.8516% -8.7527%] (p = 0.00 < 0.05)
thrpt: [+9.5923% +9.7112% +9.8342%]
Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
5 (5.00%) high mild
7 (7.00%) high severe
```
The impact is dramatic -- nearly a 10% improvement for `all-rules`.
## Summary
This PR fixes a small quirk in the semantic model. Typically, when we
see an import, like `import foo`, we create a `BindingKind::Importation`
for it. However, if `foo` has been declared as a `global`, then we
propagate the kind forward. So given:
```python
global foo
import foo
```
We'd create two bindings for `foo`, both with type `global`.
This was originally borrowed from Pyflakes, and it exists to help avoid
false-positives like:
```python
def f():
global foo
# Don't mark `foo` as "assigned but unused"! It's a global!
foo = 1
```
This PR removes that behavior, and instead tracks "Does this binding
refer to a global?" as a flag. This is much cleaner, since it means we
don't "lose" the identity of various bindings.
As a very strange example of why this matters, consider:
```python
def foo():
global Member
from module import Member
x: Member = 1
```
`Member` is only used in a typing context, so we should flag it and say
"move it to a `TYPE_CHECKING` block". However, when we go to analyze
`from module import Member`, it has `BindingKind::Global`. So we don't
even know that it's an import!
## Summary
In #5074, we introduced an abstraction to support local symbol renames
("local" here refers to "within a module"). However, that abstraction
didn't support `global` and `nonlocal` symbols. This PR extends it to
those cases.
Broadly, there are considerations.
First, if we're renaming a symbol in a scope in which it is declared
`global` or `nonlocal`. For example, given:
```python
x = 1
def foo():
global x
```
Then when renaming `x` in `foo`, we need to detect that it's `global`
and instead perform the rename starting from the module scope.
Second, when renaming a symbol, we need to determine the scopes in which
it is declared `global` or `nonlocal`. This is effectively the inverse
of the above: when renaming `x` in the module scope, we need to detect
that we should _also_ rename `x` in `foo`.
To support these cases, the renaming algorithm was adjusted as follows:
- When we start a rename in a scope, determine whether the symbol is
declared `global` or `nonlocal` by looking for a `global` or `nonlocal`
binding. If it is, start the rename in the defining scope. (This
requires storing the defining scope on the `nonlocal` binding, which is
new.)
- We then perform the rename in the defining scope.
- We then check whether the symbol was declared as `global` or
`nonlocal` in any scopes, and perform the rename in those scopes too.
(Thankfully, this doesn't need to be done recursively.)
Closes#5092.
## Test Plan
Added some additional snapshot tests.
## Summary
This PR enables autofix behavior for the `flake8-pyi` rule that asks you
to alias `Set` to `AbstractSet` when importing `collections.abc.Set`.
It's not the most important rule, but it's a good isolated test-case for
local symbol renaming.
The renaming algorithm is outlined in-detail in the `renamer.rs` module.
But to demonstrate the behavior, here's the diff when running this fix
over a complex file that exercises a few edge cases:
```diff
--- a/foo.pyi
+++ b/foo.pyi
@@ -1,16 +1,16 @@
if True:
- from collections.abc import Set
+ from collections.abc import Set as AbstractSet
else:
- Set = 1
+ AbstractSet = 1
-x: Set = set()
+x: AbstractSet = set()
-x: Set
+x: AbstractSet
-del Set
+del AbstractSet
def f():
- print(Set)
+ print(AbstractSet)
def Set():
pass
```
Making this work required resolving a bunch of edge cases in the
semantic model that were causing us to "lose track" of references. For
example, the above wasn't possible with our previous approach to
handling deletions (#5071). Similarly, the `x: Set` "delayed annotation"
tracking was enabled via #5070. And many of these edits would've failed
if we hadn't changed `BindingKind` to always match the identifier range
(#5090). So it's really the culmination of a bunch of changes over the
course of the week.
The main outstanding TODO is that this doesn't support `global` or
`nonlocal` usages. I'm going to take a look at that tonight, but I'm
comfortable merging this as-is.
Closes#1106.
Closes#5091.
## Summary
I noticed that we have a few hot comparisons that involve called
`s.to_lowercase()`. We can avoid an allocation by comparing characters
directly.
## Summary
@konstin mentioned that in profiling, this function accounted for a
non-trivial amount of time (0.33% of total execution, the most of any
rule). This PR attempts to rewrite it as a match statement for better
performance over a looping comparison.
## Test Plan
`cargo test`
## Summary
If you `import __future__`, it's not subject to the same rules as `from
__future__ import feature` -- i.e., this is fine:
```python
x = 1
import __future__
```
It doesn't really make sense to treat these as `__future__` imports
(though I can't imagine anyone ever does this anyway).
## Summary
At present, when we store a binding, we include a `TextRange` alongside
it. The `TextRange` _sometimes_ matches the exact range of the
identifier to which the `Binding` is linked, but... not always.
For example, given:
```python
x = 1
```
The binding we create _will_ use the range of `x`, because the left-hand
side is an `Expr::Name`, which has a valid range on it.
However, given:
```python
try:
pass
except ValueError as e:
pass
```
When we create a binding for `e`, we don't have a `TextRange`... The AST
doesn't give us one. So we end up extracting it via lexing.
This PR extends that pattern to the rest of the binding kinds, to ensure
that whenever we create a binding, we always use the range of the bound
name. This leads to better diagnostics in cases like pattern matching,
whereby the diagnostic for "unused variable `x`" here used to include
`*x`, instead of just `x`:
```python
def f(provided: int) -> int:
match provided:
case [_, *x]:
pass
```
This is _also_ required for symbol renames, since we track writes as
bindings -- so we need to know the ranges of the bound symbols.
By storing these bindings precisely, we can also remove the
`binding.trimmed_range` abstraction -- since bindings already use the
"trimmed range".
To implement this behavior, I took some of our existing utilities (like
the code we had for `except ValueError as e` above), migrated them from
a full lexer to a zero-allocation lexer that _only_ identifies
"identifiers", and moved the behavior into a trait, so we can now do
`stmt.identifier(locator)` to get the range for the identifier.
Honestly, we might end up discarding much of this if we decide to put
ranges on all identifiers
(https://github.com/astral-sh/RustPython-Parser/pull/8). But even if we
do, this will _still_ be a good change, because the lexer introduced
here is useful beyond names (e.g., we use it find the `except` keyword
in an exception handler, to find the `else` after a `for` loop, and so
on). So, I'm fine committing this even if we end up changing our minds
about the right approach.
Closes#5090.
## Benchmarks
No significant change, with one statistically significant improvement
(-2.1654% on `linter/all-rules/large/dataset.py`):
```
linter/default-rules/numpy/globals.py
time: [73.922 µs 73.955 µs 73.986 µs]
thrpt: [39.882 MiB/s 39.898 MiB/s 39.916 MiB/s]
change:
time: [-0.5579% -0.4732% -0.3980%] (p = 0.00 < 0.05)
thrpt: [+0.3996% +0.4755% +0.5611%]
Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
4 (4.00%) low severe
1 (1.00%) low mild
1 (1.00%) high mild
linter/default-rules/pydantic/types.py
time: [1.4909 ms 1.4917 ms 1.4926 ms]
thrpt: [17.087 MiB/s 17.096 MiB/s 17.106 MiB/s]
change:
time: [+0.2140% +0.2741% +0.3392%] (p = 0.00 < 0.05)
thrpt: [-0.3380% -0.2734% -0.2136%]
Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
linter/default-rules/numpy/ctypeslib.py
time: [688.97 µs 691.34 µs 694.15 µs]
thrpt: [23.988 MiB/s 24.085 MiB/s 24.168 MiB/s]
change:
time: [-1.3282% -0.7298% -0.1466%] (p = 0.02 < 0.05)
thrpt: [+0.1468% +0.7351% +1.3461%]
Change within noise threshold.
Found 15 outliers among 100 measurements (15.00%)
1 (1.00%) low mild
2 (2.00%) high mild
12 (12.00%) high severe
linter/default-rules/large/dataset.py
time: [3.3872 ms 3.4032 ms 3.4191 ms]
thrpt: [11.899 MiB/s 11.954 MiB/s 12.011 MiB/s]
change:
time: [-0.6427% -0.2635% +0.0906%] (p = 0.17 > 0.05)
thrpt: [-0.0905% +0.2642% +0.6469%]
No change in performance detected.
Found 20 outliers among 100 measurements (20.00%)
1 (1.00%) low severe
2 (2.00%) low mild
4 (4.00%) high mild
13 (13.00%) high severe
linter/all-rules/numpy/globals.py
time: [148.99 µs 149.21 µs 149.42 µs]
thrpt: [19.748 MiB/s 19.776 MiB/s 19.805 MiB/s]
change:
time: [-0.7340% -0.5068% -0.2778%] (p = 0.00 < 0.05)
thrpt: [+0.2785% +0.5094% +0.7395%]
Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high severe
linter/all-rules/pydantic/types.py
time: [3.0362 ms 3.0396 ms 3.0441 ms]
thrpt: [8.3779 MiB/s 8.3903 MiB/s 8.3997 MiB/s]
change:
time: [-0.0957% +0.0618% +0.2125%] (p = 0.45 > 0.05)
thrpt: [-0.2121% -0.0618% +0.0958%]
No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) low severe
3 (3.00%) low mild
5 (5.00%) high mild
2 (2.00%) high severe
linter/all-rules/numpy/ctypeslib.py
time: [1.6879 ms 1.6894 ms 1.6909 ms]
thrpt: [9.8478 MiB/s 9.8562 MiB/s 9.8652 MiB/s]
change:
time: [-0.2279% -0.0888% +0.0436%] (p = 0.18 > 0.05)
thrpt: [-0.0435% +0.0889% +0.2284%]
No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
4 (4.00%) low mild
1 (1.00%) high severe
linter/all-rules/large/dataset.py
time: [7.1520 ms 7.1586 ms 7.1654 ms]
thrpt: [5.6777 MiB/s 5.6831 MiB/s 5.6883 MiB/s]
change:
time: [-2.5626% -2.1654% -1.7780%] (p = 0.00 < 0.05)
thrpt: [+1.8102% +2.2133% +2.6300%]
Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high mild
```
## Summary
This fixes a number of problems in the formatter that showed up with
various files in the [cpython](https://github.com/python/cpython)
repository. These problems surfaced as unstable formatting and invalid
code. This is not the entirety of problems discovered through cpython,
but a big enough chunk to separate it. Individual fixes are generally
individual commits. They were discovered with #5055, which i update as i
work through the output
## Test Plan
I added regression tests with links to cpython for each entry, except
for the two stubs that also got comment stubs since they'll be
implemented properly later.
## Summary
This PR runs `rustfmt` with a few nightly options as a one-time fix to
catch some malformatted comments. I ended up just running with:
```toml
condense_wildcard_suffixes = true
edition = "2021"
max_width = 100
normalize_comments = true
normalize_doc_attributes = true
reorder_impl_items = true
unstable_features = true
use_field_init_shorthand = true
```
Since these all seem like reasonable things to fix, so may as well while
I'm here.
## Summary
Small update to leverage `get_or_import_symbol` to fix `UP017` in more
cases (e.g., when we need to import `UTC`, or access it from an alias or
something).
## Test Plan
Check out the updated snapshot.
## Summary
This PR consistently uses `matches! for static `CallPath` comparisons.
In some cases, we can significantly reduce the number of cases or
checks.
## Test Plan
`cargo test `
## Summary
As discussed in Discord, and similar to oxc, we're going to refer to
this as `.semantic()` everywhere.
While I was auditing usages of `model: &SemanticModel`, I also changed
as many function signatures as I could find to consistently take the
model as the _last_ argument, rather than the first.
## Summary
This PR tackles a corner case that we'll need to support local symbol
renaming. It relates to a nuance in how we want handle annotations
(i.e., `AnnAssign` statements with no value, like `x: int` in a function
body).
When we see a statement like:
```python
x: int
```
We create a `BindingKind::Annotation` for `x`. This is a special
`BindingKind` that the resolver isn't allowed to return. For example,
given:
```python
x: int
print(x)
```
The second line will yield an `undefined-name` error.
So why does this `BindingKind` exist at all? In Pyflakes, to support the
`unused-annotation` lint:
```python
def f():
x: int # unused-annotation
```
If we don't track `BindingKind::Annotation`, we can't lint for unused
variables that are only "defined" via annotations.
There are a few other wrinkles to `BindingKind::Annotation`. One is
that, if a binding already exists in the scope, we actually just discard
the `BindingKind`. So in this case:
```python
x = 1
x: int
```
When we go to create the `BindingKind::Annotation` for the second
statement, we notice that (1) we're creating an annotation but (2) the
scope already has binding for the name -- so we just drop the binding on
the floor. This has the nice property that annotations aren't considered
to "shadow" another binding, which is important in a bunch of places
(e.g., if we have `import os; os: int`, we still consider `os` to be an
import, as we should). But it also means that these "delayed"
annotations are one of the few remaining references that we don't track
anywhere in the semantic model.
This PR adds explicit support for these via a new `delayed_annotations`
attribute on the semantic model. These should be extremely rare, but we
do need to track them if we want to support local symbol renaming.
### This isn't the right way to model this
This isn't the right way to model this.
Here's an alternative:
- Remove `BindingKind::Annotation`, and treat annotations as their own,
separate concept.
- Instead of storing a map from name to `BindingId` on each `Scope`,
store a map from name to... `SymbolId`.
- Introduce a `Symbol` abstraction, where a symbol can point to a
current binding, and a list of annotations, like:
```rust
pub struct Symbol {
binding: Option<BindingId>,
annotations: Vec<AnnotationId>
}
```
If we did this, we could appropriately model the semantics described
above. When we go to resolve a binding, we ignore annotations (always).
When we try to find unused variables, we look through the list of
symbols, and have sufficient information to discriminate between
annotations and bound variables. Etc.
The main downside of this `Symbol`-based approach is that it's going to
take a lot more work to implement, and it'll be less performant (we'll
be storing more data per symbol, and our binding lookups will have an
added layer of indirection).
## Summary
We now _always_ generate fixes, so `FixMode::None` and
`FixMode::Generate` are redundant. We can also remove the TODO around
`--fix-dry-run`, since that's our default behavior.
Closes#5081.
## Summary
Small follow-up to #4888 to add a dedicated `ResolvedRead` case for
unbound locals, mostly for clarity and documentation purposes (no
behavior changes).
## Test Plan
`cargo test`
## Summary
Our current mechanism for handling deletions (e.g., `del x`) is to
remove the symbol from the scope's `bindings` table. This "does the
right thing", in that if we then reference a deleted symbol, we're able
to determine that it's unbound -- but it causes a variety of problems,
mostly in that it makes certain bindings and references unreachable
after-the-fact.
Consider:
```python
x = 1
print(x)
del x
```
If we analyze this code _after_ running the semantic model over the AST,
we'll have no way of knowing that `x` was ever introduced in the scope,
much less that it was bound to a value, read, and then deleted --
because we effectively erased `x` from the model entirely when we hit
the deletion.
In practice, this will make it impossible for us to support local symbol
renames. It also means that certain rules that we want to move out of
the model-building phase and into the "check dead scopes" phase wouldn't
work today, since we'll have lost important information about the source
code.
This PR introduces two new `BindingKind` variants to model deletions:
- `BindingKind::Deletion`, which represents `x = 1; del x`.
- `BindingKind::UnboundException`, which represents:
```python
try:
1 / 0
except Exception as e:
pass
```
In the latter case, `e` gets unbound after the exception handler
(assuming it's triggered), so we want to handle it similarly to a
deletion.
The main challenge here is auditing all of our existing `Binding` and
`Scope` usages to understand whether they need to accommodate deletions
or otherwise behave differently. If you look one commit back on this
branch, you'll see that the code is littered with `NOTE(charlie)`
comments that describe the reasoning behind changing (or not) each of
those call sites. I've also augmented our test suite in preparation for
this change over a few prior PRs.
### Alternatives
As an alternative, I considered introducing a flag to `BindingFlags`,
like `BindingFlags::UNBOUND`, and setting that at the appropriate time.
This turned out to be a much more difficult change, because we tend to
match on `BindingKind` all over the place (e.g., we have a bunch of code
blocks that only run when a `BindingKind` is
`BindingKind::Importation`). As a result, introducing these new
`BindingKind` variants requires only a few changes at the client sites.
Adding a flag would've required a much wider-reaching change.
## Summary
This behavior dates back to a Pyflakes commit (5fc37cbd), which was used
to allow this test to pass:
```py
from __future__ import annotations
T: object
def f(t: T): pass
def g(t: 'T'): pass
```
But, I think this is an error. Mypy and Pyright don't accept it -- you
can only use variables as type annotations if they're type aliases
(i.e., annotated with `TypeAlias`), in which case, there has to be an
assignment on the right-hand side (see: [PEP
613](https://peps.python.org/pep-0613/)).
## Summary
This PR corrects a misunderstanding I had related to Python's handling
of bound exceptions.
Previously, I thought this code ran without error:
```py
def f():
x = 1
try:
1 / 0
except Exception as x:
pass
print(x)
```
My understanding was that `except Exception as x` bound `x` within the
`except` block, but then restored the `x = 1` binding after exiting the
block.
In practice, however, this throws a `UnboundLocalError` error, because
`x` becomes "unbound" after exiting the exception handler. It's similar
to a `del` statement in this way.
This PR removes our behavior to "restore" the previous binding. This
could lead to faulty analysis in conditional blocks due to our lack of
control flow analysis, but those same problems already exist for `del`
statements.
## Summary
In a dataclass:
```py
from dataclasses import dataclass
@dataclass
class X:
class_var = {}
x: int
```
`class_var` isn't actually a dataclass attribute, since it's
unannotated. This PR removes such attributes from RUF008
(`mutable-dataclass-default`), but it does enforce them in RUF012
(`mutable-class-default`), since those should be annotated with
`ClassVar` like any other mutable class attribute.
Closes#5043.
## Summary
This adds `json-lines` (https://jsonlines.org/ or http://ndjson.org/) as
an output format.
I'm sure you already know, but
* JSONL is more greppable (each record is a single line) than the pretty
JSON
* JSONL is faster to ingest piecewise (and/or in parallel) than JSON
## Test Plan
Snapshot test in the new module :)
I've written done my condensed learnings from working on the formatter
so that others can have an easier start working on it.
This is a pure docs change
## Summary
Per the [API
reference](https://docs.pytest.org/en/7.1.x/reference/reference.html#pytest.fail),
`reason` was added in version 7, and is equivalent to `msg` (but
preferred going forward).
I also grepped for `msg` usages in `flake8_pytest_style`, but found no
others (apart from those that reference `unittest` APIs.)
Closes#3387.
## Summary
This PR adds autofixer for rule ISC001 in cases where both string
literals are of the same kind and with same quotes (double / single).
Fixes#4829
## Test Plan
I added testcases with different combinations of string literals.
## Summary
This PR (1) avoids flagging `TypedDict` and `NamedTuple` conversions
when attributes are dunder methods, like `__dict__`, and (2) avoids
flagging the `A003` shadowed-attribute rule for `TypedDict` classes at
all, where it doesn't really apply (since those attributes are only
accessed via subscripting anyway).
Closes#5027.
## Summary
A few of our rules look at the parentheses that follow a class
definition (e.g., `class Foo(object):`) and attempt to modify those
parentheses. Neither of those rules were behaving properly in the
presence of decorators, which were recently added to the statement
range.
## Test Plan
`cargo test` with a variety of new fixture tests.
## Summary
Add rule to disallow implicit optional with autofix.
Currently, I've added it under `RUF` category.
### Limitation
Type aliases could result in false positive:
```python
from typing import Optional
StrOptional = Optional[str]
def foo(arg: StrOptional = None):
pass
```
## Test Plan
`cargo test`
resolves: #1983
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
Improves the `ruff_parse_simple` fuzz harness by adding checks for
parsed locations to ensure they all lie on UTF-8 character boundaries.
This will allow for faster identification of issues like #5004.
This also adds additional details for Apple M1 users and clarifies the
importance of using `init-fuzzer.sh` (thanks for the feedback,
@jasikpark 🙂).
## Summary
Previously the rule for SIM117 explicitly ignored `async with`
statements as it would incorrectly suggestion to merge `async with` and
regular `with` statements as reported in issue #1902.
This partially reverts the fix for that (commit
396be5edea) by enabling the rules for
`async with` statements again, but with a check ensuring that the
statements are both of the same kind, i.e. both `async with` or both
(just) `with` statements.
Closes#3025
## Test Plan
Updated and existing test and added a new test case from #3025.
## Summary
Add support for applying auto-fixes in Jupyter Notebook.
### Solution
Cell offsets are the boundaries for each cell in the concatenated source
code. They are represented using `TextSize`. It includes the start and
end offset as well, thus creating a range for each cell. These offsets
are updated using the `SourceMap` markers.
### SourceMap
`SourceMap` contains markers constructed from each edits which tracks
the original source code position to the transformed positions. The
following drawing might make it clear:

The center column where the dotted lines are present are the markers
included in the `SourceMap`. The `Notebook` looks at these markers and
updates the cell offsets after each linter loop. If you notice closely,
the destination takes into account all of the markers before it.
The index is constructed only when required as it's only used to render
the diagnostics. So, a `OnceCell` is used for this purpose. The cell
offsets, cell content and the index will be updated after each iteration
of linting in the mentioned order. The order is important here as the
content is updated as per the new offsets and index is updated as per
the new content.
## Limitations
### 1
Styling rules such as the ones in `pycodestyle` will not be applicable
everywhere in Jupyter notebook, especially at the cell boundaries. Let's
take an example where a rule suggests to have 2 blank lines before a
function and the cells contains the following code:
```python
import something
# ---
def first():
pass
def second():
pass
```
(Again, the comment is only to visualize cell boundaries.)
In the concatenated source code, the 2 blank lines will be added but it
shouldn't actually be added when we look in terms of Jupyter notebook.
It's as if the function `first` is at the start of a file.
`nbqa` solves this by recording newlines before and after running
`autopep8`, then running the tool and restoring the newlines at the end
(refer https://github.com/nbQA-dev/nbQA/pull/807).
## Test Plan
Three commands were run in order with common flags (`--select=ALL
--no-cache --isolated`) to isolate which stage the problem is occurring:
1. Only diagnostics
2. Fix with diff (`--fix --diff`)
3. Fix (`--fix`)
### https://github.com/facebookresearch/segment-anything
```
-------------------------------------------------------------------------------
Jupyter Notebooks 3 0 0 0 0
|- Markdown 3 98 0 94 4
|- Python 3 513 468 4 41
(Total) 611 468 98 45
-------------------------------------------------------------------------------
```
```console
$ cargo run --all-features --bin ruff -- check --no-cache --isolated --select=ALL /path/to/segment-anything/**/*.ipynb --fix
...
Found 180 errors (89 fixed, 91 remaining).
```
### https://github.com/openai/openai-cookbook
```
-------------------------------------------------------------------------------
Jupyter Notebooks 65 0 0 0 0
|- Markdown 64 3475 12 2507 956
|- Python 65 9700 7362 1101 1237
(Total) 13175 7374 3608 2193
===============================================================================
```
```console
$ cargo run --all-features --bin ruff -- check --no-cache --isolated --select=ALL /path/to/openai-cookbook/**/*.ipynb --fix
error: Failed to parse /path/to/openai-cookbook/examples/vector_databases/Using_vector_databases_for_embeddings_search.ipynb:cell 4:29:18: unexpected token '-'
...
Found 4227 errors (2165 fixed, 2062 remaining).
```
### https://github.com/tensorflow/docs
```
-------------------------------------------------------------------------------
Jupyter Notebooks 150 0 0 0 0
|- Markdown 1 55 0 46 9
|- Python 1 402 289 60 53
(Total) 457 289 106 62
-------------------------------------------------------------------------------
```
```console
$ cargo run --all-features --bin ruff -- check --no-cache --isolated --select=ALL /path/to/tensorflow-docs/**/*.ipynb --fix
error: Failed to parse /path/to/tensorflow-docs/site/en/guide/extension_type.ipynb:cell 80:1:1: unexpected token Indent
error: Failed to parse /path/to/tensorflow-docs/site/en/r1/tutorials/eager/custom_layers.ipynb:cell 20:1:1: unexpected token Indent
error: Failed to parse /path/to/tensorflow-docs/site/en/guide/data.ipynb:cell 175:5:14: unindent does not match any outer indentation level
error: Failed to parse /path/to/tensorflow-docs/site/en/r1/tutorials/representation/unicode.ipynb:cell 30:1:1: unexpected token Indent
...
Found 12726 errors (5140 fixed, 7586 remaining).
```
### https://github.com/tensorflow/models
```
-------------------------------------------------------------------------------
Jupyter Notebooks 46 0 0 0 0
|- Markdown 1 11 0 6 5
|- Python 1 328 249 19 60
(Total) 339 249 25 65
-------------------------------------------------------------------------------
```
```console
$ cargo run --all-features --bin ruff -- check --no-cache --isolated --select=ALL /path/to/tensorflow-models/**/*.ipynb --fix
...
Found 4856 errors (2690 fixed, 2166 remaining).
```
resolves: #1218fixes: #4556
This implements formatting ExprTuple, including magic trailing comma. I
intentionally didn't change the settings mechanism but just added a
dummy global const flag.
Besides the snapshots, I added custom breaking/joining tests and a
deeply nested test case. The diffs look better than previously, proper
black compatibility depends on parentheses handling.
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
This implements PYI044. This rule checks if `from __future__ import
annotations` is used in stub files as it has no effect in stub files, since type
checkers automatically treat stubs as having those semantics.
Updates https://github.com/astral-sh/ruff/issues/848
## Test Plan
Added a test case and snapshots.
## Summary
Ignore pyproject.toml file for adding noqa directives using `--add-noqa`
## Test Plan
`cargo run --bin ruff -- check --add-noqa .`
fixes: #5012
## Summary
Add copyright notice detection to enforce the presence of copyright
headers in Python files.
Configurable settings include: the relevant regular expression, the
author name, and the minimum file size, similar to
[flake8-copyright](https://github.com/savoirfairelinux/flake8-copyright).
Closes https://github.com/charliermarsh/ruff/issues/3579
---------
Signed-off-by: ryan <ryang@waabi.ai>
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
The `RET504` rule, which looks for unnecessary assignments before return
statements, is a frequent source of issues (#4173, #4236, #4242, #1606,
#2950). Over time, we've tried to refine the logic to handle more cases.
For example, we now avoid analyzing any functions that contain any
function calls or attribute assignments, since those operations can
contain side effects (and so we mark them as a "read" on all variables
in the function -- we could do a better job with code graph analysis to
handle this limitation, but that'd be a more involved change.) We also
avoid flagging any variables that are the target of multiple
assignments. Ultimately, though, I'm not happy with the implementation
-- we just can't do sufficiently reliable analysis of arbitrary code
flow given the limited logic herein, and the existing logic is very hard
to reason about and maintain.
This PR refocuses the rule to only catch cases of the form:
```py
def f():
x = 1
return x
```
That is, we now only flag returns that are immediately preceded by an
assignment to the returned variable. While this is more limiting, in
some ways, it lets us flag more cases vis-a-vis the previous
implementation, since we no longer "fully eject" when functions contain
function calls and other effect-ful operations.
Closes#4173.
Closes#4236.
Closes#4242.
## Summary
We use `.trim()` and friends in a bunch of places, to strip whitespace
from source code. However, not all Unicode whitespace characters are
considered "whitespace" in Python, which only supports the standard
space, tab, and form-feed characters.
This PR audits our usages of `.trim()`, `.trim_start()`, `.trim_end()`,
and `char::is_whitespace`, and replaces them as appropriate with a new
`.trim_whitespace()` analogues, powered by a `PythonWhitespace` trait.
In general, the only place that should continue to use `.trim()` is
content within docstrings, which don't need to adhere to Python's
semantic definitions of whitespace.
Closes#4991.
## Summary
`ruff_newlines` becomes `ruff_python_whitespace`, and includes the
existing "universal newline" handlers alongside the Python
whitespace-specific utilities.
* Implement StmtPass
This implements StmtPass as `pass`.
The snapshot diff is small because pass mainly occurs in bodies and function (#4951) and if/for bodies.
* Implement StmtReturn
This implements StmtReturn as `return` or `return {value}`.
The snapshot diff is small because return occurs in functions (#4951)
* A basic StmtAssign formatter and better dummies for expressions
The goal of this PR was formatting StmtAssign since many nodes in the black tests (and in python in general) are after an assignment. This caused unstable formatting: The spacing of power op spacing depends on the type of the two involved expressions, but each expression was formatted as dummy string and re-parsed as a ExprName, so in the second round the different rules of ExprName were applied, causing unstable formatting.
This PR does not necessarily bring us closer to black's style, but it unlocks a good porting of black's test suite and is a basis for implementing the Expr nodes.
* fmt
* Review
* Use phf for confusables to reduce llvm lines
## Summary
This replaces FxHashMap for the confusables with a perfect hash map from the [phf crate](https://github.com/rust-phf/rust-phf) to reduce the generated llvm instructions.
A perfect hash function is one that doesn't have any collisions. We can build one because we know all keys at compile time. This improves hashmap efficiency, even though this is likely not noticeable in our case (except someone has a large non-english crate to test on).
The original hashmap contained a lot of duplicates, which i had to remove when phf_map complained, i did so by sorting the keys.
The important part that it reduces the llvm instructions generated (#3808, `RUSTFLAGS="-Csymbol-mangling-version=v0" cargo llvm-lines -p ruff --lib | head -20`):
```
Lines Copies Function name
----- ------ -------------
1740502 38973 (TOTAL)
27423 (1.6%, 1.6%) 1 (0.0%, 0.0%) ruff[cef4c65d96248843]::rules::ruff::rules::confusables::CONFUSABLES::{closure#0}
10193 (0.6%, 2.2%) 1 (0.0%, 0.0%) <ruff[cef4c65d96248843]::codes::RuleCodePrefix>::iter
8107 (0.5%, 2.6%) 1 (0.0%, 0.0%) <ruff[cef4c65d96248843]::codes::Rule>::noqa_code
7345 (0.4%, 3.0%) 1 (0.0%, 0.0%) <ruff[cef4c65d96248843]::checkers::ast::Checker as ruff_python_ast[3778b140caf21545]::visitor::Visitor>::visit_stmt
6412 (0.4%, 3.4%) 1 (0.0%, 0.0%) <<ruff[cef4c65d96248843]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:spanned::SpannedDeserializer<toml_edit[7e3a6c5e67260672]:🇩🇪:value::ValueDeserializer>>
6412 (0.4%, 3.8%) 1 (0.0%, 0.0%) <<ruff[cef4c65d96248843]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:table::TableMapAccess>
6409 (0.4%, 4.2%) 1 (0.0%, 0.0%) <<ruff[cef4c65d96248843]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:datetime::DatetimeDeserializer>
5696 (0.3%, 4.5%) 1 (0.0%, 0.0%) <ruff[cef4c65d96248843]::checkers::ast::Checker as ruff_python_ast[3778b140caf21545]::visitor::Visitor>::visit_expr
4448 (0.3%, 4.7%) 1 (0.0%, 0.0%) ruff[cef4c65d96248843]::flake8_to_ruff::converter::convert
3702 (0.2%, 4.9%) 1 (0.0%, 0.0%) <&ruff[cef4c65d96248843]::registry::Linter as core[da82827a87f140f9]::iter::traits::collect::IntoIterator>::into_iter
3349 (0.2%, 5.1%) 1 (0.0%, 0.0%) <ruff[cef4c65d96248843]::registry::Linter>::code_for_rule
3132 (0.2%, 5.3%) 1 (0.0%, 0.0%) <ruff[cef4c65d96248843]::codes::Rule as core[da82827a87f140f9]::fmt::Debug>::fmt
3130 (0.2%, 5.5%) 1 (0.0%, 0.0%) <&str as core[da82827a87f140f9]::convert::From<&ruff[cef4c65d96248843]::codes::Rule>>::from
3130 (0.2%, 5.7%) 1 (0.0%, 0.0%) <&str as core[da82827a87f140f9]::convert::From<ruff[cef4c65d96248843]::codes::Rule>>::from
3130 (0.2%, 5.9%) 1 (0.0%, 0.0%) <ruff[cef4c65d96248843]::codes::Rule as core[da82827a87f140f9]::convert::AsRef<str>>::as_ref
3128 (0.2%, 6.0%) 1 (0.0%, 0.0%) <ruff[cef4c65d96248843]::codes::RuleIter>::get
2669 (0.2%, 6.2%) 1 (0.0%, 0.0%) <<ruff[cef4c65d96248843]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_seq::<toml_edit[7e3a6c5e67260672]:🇩🇪:array::ArraySeqAccess>
```
After:
```
Lines Copies Function name
----- ------ -------------
1710487 38900 (TOTAL)
10193 (0.6%, 0.6%) 1 (0.0%, 0.0%) <ruff[52408f46d2058296]::codes::RuleCodePrefix>::iter
8107 (0.5%, 1.1%) 1 (0.0%, 0.0%) <ruff[52408f46d2058296]::codes::Rule>::noqa_code
7345 (0.4%, 1.5%) 1 (0.0%, 0.0%) <ruff[52408f46d2058296]::checkers::ast::Checker as ruff_python_ast[5588cd60041c8605]::visitor::Visitor>::visit_stmt
6412 (0.4%, 1.9%) 1 (0.0%, 0.0%) <<ruff[52408f46d2058296]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:spanned::SpannedDeserializer<toml_edit[7e3a6c5e67260672]:🇩🇪:value::ValueDeserializer>>
6412 (0.4%, 2.2%) 1 (0.0%, 0.0%) <<ruff[52408f46d2058296]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:table::TableMapAccess>
6409 (0.4%, 2.6%) 1 (0.0%, 0.0%) <<ruff[52408f46d2058296]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:datetime::DatetimeDeserializer>
5696 (0.3%, 3.0%) 1 (0.0%, 0.0%) <ruff[52408f46d2058296]::checkers::ast::Checker as ruff_python_ast[5588cd60041c8605]::visitor::Visitor>::visit_expr
4448 (0.3%, 3.2%) 1 (0.0%, 0.0%) ruff[52408f46d2058296]::flake8_to_ruff::converter::convert
3702 (0.2%, 3.4%) 1 (0.0%, 0.0%) <&ruff[52408f46d2058296]::registry::Linter as core[da82827a87f140f9]::iter::traits::collect::IntoIterator>::into_iter
3349 (0.2%, 3.6%) 1 (0.0%, 0.0%) <ruff[52408f46d2058296]::registry::Linter>::code_for_rule
3132 (0.2%, 3.8%) 1 (0.0%, 0.0%) <ruff[52408f46d2058296]::codes::Rule as core[da82827a87f140f9]::fmt::Debug>::fmt
3130 (0.2%, 4.0%) 1 (0.0%, 0.0%) <&str as core[da82827a87f140f9]::convert::From<&ruff[52408f46d2058296]::codes::Rule>>::from
3130 (0.2%, 4.2%) 1 (0.0%, 0.0%) <&str as core[da82827a87f140f9]::convert::From<ruff[52408f46d2058296]::codes::Rule>>::from
3130 (0.2%, 4.4%) 1 (0.0%, 0.0%) <ruff[52408f46d2058296]::codes::Rule as core[da82827a87f140f9]::convert::AsRef<str>>::as_ref
3128 (0.2%, 4.5%) 1 (0.0%, 0.0%) <ruff[52408f46d2058296]::codes::RuleIter>::get
2669 (0.2%, 4.7%) 1 (0.0%, 0.0%) <<ruff[52408f46d2058296]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_seq::<toml_edit[7e3a6c5e67260672]:🇩🇪:array::ArraySeqAccess>
2659 (0.2%, 4.9%) 1 (0.0%, 0.0%) <&ruff[52408f46d2058296]::codes::Pylint as core[da82827a87f140f9]::iter::traits::collect::IntoIterator>::into_iter
```
I'd assume this has a positive effect both on compile time and on runtime, but i don't know the actual effect on compile times and can't really measure.
## Test plan
Check CI for any performance regressions.
This should fix#3808 if we merge it.
* clippy
* Update update_ambiguous_characters.py
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR replaces the `verbatim_text` builder with a `not_yet_implemented` builder that emits `NOT_YET_IMPLEMENTED_<NodeKind>` for not yet implemented nodes.
The motivation for this change is that partially formatting compound statements can result in incorrectly indented code, which is a syntax error:
```python
def func_no_args():
a; b; c
if True: raise RuntimeError
if False: ...
for i in range(10):
print(i)
continue
```
Get's reformatted to
```python
def func_no_args():
a; b; c
if True: raise RuntimeError
if False: ...
for i in range(10):
print(i)
continue
```
because our formatter does not yet support `for` statements and just inserts the text from the source.
## Downsides
Using an identifier will not work in all situations. For example, an identifier is invalid in an `Arguments ` position. That's why I kept `verbatim_text` around and e.g. use it in the `Arguments` formatting logic where incorrect indentations are impossible (to my knowledge). Meaning, `verbatim_text` we can opt in to `verbatim_text` when we want to iterate quickly on nodes that we don't want to provide a full implementation yet and using an identifier would be invalid.
## Upsides
Running this on main discovered stability issues with the newline handling that were previously "hidden" because of the verbatim formatting. I guess that's an upside :)
## Test Plan
None?
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This issue fixes the removal of empty lines between a leading comment and the previous statement:
```python
a = 20
# leading comment
b = 10
```
Ruff removed the empty line between `a` and `b` because:
* The leading comments formatting does not preserve leading newlines (to avoid adding new lines at the top of a body)
* The `JoinNodesBuilder` counted the lines before `b`, which is 1 -> Doesn't insert a new line
This is fixed by changing the `JoinNodesBuilder` to count the lines instead *after* the last node. This correctly gives 1, and the `# leading comment` will insert the empty lines between any other leading comment or the node.
## Test Plan
I added a new test for empty lines.
According to https://docs.python.org/3/library/ast.html#ast-helpers, we expect type_ignores to be always be empty, so this adds a debug assert.
Test plan: I confirmed that the assertion holdes for the file below and for all the black tests which include a number of `type: ignore` comments.
```python
# type: ignore
if 1:
print("1") # type: ignore
# elsebranch
# type: ignore
else: # type: ignore
print("2") # type: ignore
while 1:
print()
# type: ignore
```
* Implement module formatting using JoinNodesBuilder
This uses JoinNodesBuilder to implement module formatting for #4800
See the snapshots for the changed behaviour. See one PR up for a CLI that i used to verify the trailing new line behaviour
* Add a formatter CLI for debugging
This adds a ruff_python_formatter cli modelled aber `rustfmt` that i use for debugging
* clippy
* Add print IR and print comments options
Tested with `cargo run --bin ruff_python_formatter -- --print-ir --print-comments scratch.py`
* Abstract codegen with stylist into a CodegenStylist trait
Replace all duplicate invocations of
```rust
let mut state = CodegenState {
default_newline: &stylist.line_ending(),
default_indent: stylist.indentation(),
..CodegenState::default()
}
tree.codegen(&mut state);
state.to_string()
```
with
```rust
tree.codegen_stylist(&stylist);
```
No functional changes.
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
And more custom logic around comments in bodies... uff.
Let's say we have the following code
```python
if x == y:
pass # trailing comment of pass
else: # trailing comment of `else`
print("I have no comments")
```
Right now, the formatter attaches the `# trailing comment of `else` as a trailing comment of `pass` because it doesn't "see" that there's an `else` keyword in between (because the else body is just a Vec and not a node).
This PR adds custom logic that attaches the trailing comments after the `else` as dangling comments to the `if` statement. The if statement must then split the dangling comments by `comments.text_position()`:
* All comments up to the first end-of-line comment are leading comments of the `else` keyword.
* All end-of-line comments coming after are `trailing` comments for the `else` keyword.
## Test Plan
I added new unit tests.
### Summary
This PR adds custom logic to handle end-of-line comments of the last statement in a body.
For example:
```python
while True:
if something.changed:
do.stuff() # trailing comment
b
```
The `# trailing comment` is a trailing comment of the `do.stuff()` expression statement. We incorrectly attached the comment as a trailing comment of the enclosing `while` statement because the comment is between the end of the while statement (the `while` statement ends right after `do.stuff()`) and before the `b` statement.
This PR fixes the placement to correctly attach these comments to the last statement in a body (recursively).
## Test Plan
I reviewed the snapshots and they now look correct. This may appear odd because a lot comments have now disappeared. This is the expected result because we use `verbatim` formatting for the block statements (like `while`) and that means that it only formats the inner content of the block, but not any trailing comments. The comments were visible before, because they were associated with the block statement (e.g. `while`).
* Document codes.rs
* Refactor codes.rs before merging
Helper script:
```python
# %%
from pathlib import Path
codes = Path("crates/ruff/src/codes.rs").read_text().splitlines()
rules = Path("a.txt").read_text().strip().splitlines()
rule_map = {i.split("::")[-1]: i for i in rules}
# %%
codes_new = []
for line in codes:
if ", Rule::" in line:
left, right = line.split(", Rule::")
right = right[:-2]
line = left + ", " + rule_map[right] + "),"
codes_new.append(line)
# %%
Path("crates/ruff/src/codes.rs").write_text("\n".join(codes_new))
```
Co-authored-by: Jonathan Plasse <13716151+JonathanPlasse@users.noreply.github.com>
* Add Format for Stmt
* Implement basic module formatting
This implements formatting each statement in a module with a hard line break in between, so that we can start formatting statements.
Basic testing is done by the snapshots
* Use dummy verbatim formatter for all nodes
* Use new formatter infrastructure in CLI and test
* Expose the new formatter in the CLI
* Merge import blocks
This adds a new rule `InvalidPyprojectToml` that lints pyproject.toml by checking if https://github.com/PyO3/pyproject-toml-rs can parse it. This means the linting is currently very basic, e.g. we don't check whether the name is actually a valid python project name or appropriately normalized. It does catch errors e.g. with invalid dependency requirements or problems withs the license specifications. It is open to be extended in the future (validate name, SPDX expressions, classifiers, ...), either in ruff or in pyproject-toml-rs.
Test plan:
```
scripts/ecosystem_all_check.sh check --select RUF200
```
This lead to a bunch of
```
RUF200 Failed to parse pyproject.toml: missing field `name`
```
(e.g. https://github.com/amitsk/fastapi-todos/blob/main/pyproject.toml) which is indeed invalid (https://packaging.python.org/en/latest/specifications/declaring-project-metadata/#specification).
Filtering those out, the following other problems were found by `cd target/ecosystem_all_results/ && rg RUF200`:
```
UCL-ARC:rred-reports.stdout.txt
1:pyproject.toml:27:16: RUF200 Failed to parse pyproject.toml: Version specifier `>='3.9'` doesn't match PEP 440 rules
EndlessTrax:python-start-project.stdout.txt
1:pyproject.toml:14:16: RUF200 Failed to parse pyproject.toml: Expected package name starting with an alphanumeric character, found '#'
redjax:gardening-api.stdout.txt
1:pyproject.toml:7:11: RUF200 Failed to parse pyproject.toml: Version `` doesn't match PEP 440 rules
ajslater:codex.stdout.txt
2: 3:17 RUF200 Failed to parse pyproject.toml: invalid type: sequence, expected a string
LDmitriy7:404_AvatarsBot.stdout.txt
1:pyproject.toml:3:11: RUF200 Failed to parse pyproject.toml: Version `` doesn't match PEP 440 rules
ajslater:comicbox.stdout.txt
1:pyproject.toml:3:17: RUF200 Failed to parse pyproject.toml: invalid type: sequence, expected a string
manueldevillena:forecast-earnings.stdout.txt
1:pyproject.toml:24:12: RUF200 Failed to parse pyproject.toml: Expected one of `@`, `(`, `<`, `=`, `>`, `~`, `!`, `;`, found `^`
redjax:ohio_utility_scraper.stdout.txt
1:pyproject.toml:11:11: RUF200 Failed to parse pyproject.toml: Version `` doesn't match PEP 440 rules
agronholm:typeguard.stdout.txt
1:pyproject.toml:40:8: RUF200 Failed to parse pyproject.toml: Expected a valid marker name, found 'python_implementation'
cyuss:decathlon-turnover.stdout.txt
1:pyproject.toml:7:12: RUF200 Failed to parse pyproject.toml: invalid type: string "Youcef", expected a table with 'name' and 'email' keys
ajslater:boilerplate.stdout.txt
1:pyproject.toml:3:17: RUF200 Failed to parse pyproject.toml: invalid type: sequence, expected a string
kaparoo:lightning-project-template.stdout.txt
1:pyproject.toml:56:16: RUF200 Failed to parse pyproject.toml: You can't mix a >= operator with a local version (`+cu117`)
dijital20:pytexas2023-decorators.stdout.txt
1:pyproject.toml:5:11: RUF200 Failed to parse pyproject.toml: Version `` doesn't match PEP 440 rules
pfouque:django-anymail-history.stdout.txt
1:pyproject.toml:137:12: RUF200 Failed to parse pyproject.toml: Version specifier `> = 1.2.0` doesn't match PEP 440 rules
pfouque:django-fakemessages.stdout.txt
1:pyproject.toml:130:12: RUF200 Failed to parse pyproject.toml: Version specifier `> = 1.2.0` doesn't match PEP 440 rules
pypa:build.stdout.txt
1:tests/packages/test-invalid-requirements/pyproject.toml:2:12: RUF200 Failed to parse pyproject.toml: Expected one of `@`, `(`, `<`, `=`, `>`, `~`, `!`, `;`, found `i`
4:tests/packages/test-no-requires/pyproject.toml:1:1: RUF200 Failed to parse pyproject.toml: missing field `requires`
UnoYakshi:FRAAND.stdout.txt
2: 3:11 RUF200 Failed to parse pyproject.toml: Version `` doesn't match PEP 440 rules
DHolmanCoding:python-template.stdout.txt
1:pyproject.toml:22:1: RUF200 Failed to parse pyproject.toml: missing field `requires`
```
Overall, this emitted errors in 43 out of 3408 projects (`rg -c RUF200 target/ecosystem_all_results/ | wc -l`)
Co-authored-by: Micha Reiser <micha@reiser.io>
Add Checker::any_enabled shortcut
## Summary
Akin to #4625, This is a refactoring that shortens a bunch of code by replacing `checker.settings.rules.any_enabled` with `checker.any_enabled`.
## Test Plan
`cargo clippy`
* Refactor and fix task trigger for dependent jobs in other repos
I have confirmed (https://github.com/konstin/ruff-pre-commit/actions/runs/5056928280/jobs/9075029868) that this does dispatch the workflow when running with act, `owner: 'konstin'`, `needs` commented out and personal access token. I can't properly test the actual release workflow, and i'm unsure how to best handle the next release after this was merged (should we do a beta release or will this break everything that assumes we only do stable releases?)
The command for act is
```
act -j update-dependents -s RUFF_PRE_COMMIT_PAT=<...>
```
* delete old file
* Update maturin to 1.0
A 1.0 release for maturin 🎉