## Summary
This PR adds support for parenthesized comments. A parenthesized comment
is a comment that appears within a parenthesis, but not within the range
of the expression enclosed by the parenthesis. For example, the comment
here is a parenthesized comment:
```python
if (
# comment
True
):
...
```
The parentheses enclose the `True`, but the range of `True` doesn’t
include the `# comment`.
There are at least two problems associated with parenthesized comments:
(1) associating the comment with the correct (i.e., enclosed) node; and
(2) formatting the comment correctly, once it has been associated with
the enclosed node.
The solution proposed here for (1) is to search for parentheses between
preceding and following node, and use open and close parentheses to
break ties, rather than always assigning to the preceding node.
For (2), we handle these special parenthesized comments in `FormatExpr`.
The biggest risk with this approach is that we forget some codepath that
force-disables parenthesization (by passing in `Parentheses::Never`).
I've audited all usages of that enum and added additional handling +
test coverage for such cases.
Closes https://github.com/astral-sh/ruff/issues/6390.
## Test Plan
`cargo test` with new cases.
Before:
| project | similarity index |
|--------------|------------------|
| build | 0.75623 |
| cpython | 0.75472 |
| django | 0.99804 |
| transformers | 0.99618 |
| typeshed | 0.74233 |
| warehouse | 0.99601 |
| zulip | 0.99727 |
After:
| project | similarity index |
|--------------|------------------|
| build | 0.75623 |
| cpython | 0.75472 |
| django | 0.99804 |
| transformers | 0.99618 |
| typeshed | 0.74237 |
| warehouse | 0.99601 |
| zulip | 0.99727 |
## Summary
In https://github.com/astral-sh/ruff/pull/6512, we added a flag to the
AST to mark implicitly-concatenated string expressions. This PR makes
use of that flag to remove the `is_implicit_concatenation` method.
## Test Plan
`cargo test`
**Summary** Implement docstring formatting
**Test Plan** Matches black's `docstring.py` fixture exactly, added some
new cases for what is hard to debug with black and with what black
doesn't cover.
similarity index:
main:
zulip: 0.99702
django: 0.99784
warehouse: 0.99585
build: 0.75623
transformers: 0.99469
cpython: 0.75989
typeshed: 0.74853
this branch:
zulip: 0.99702
django: 0.99784
warehouse: 0.99585
build: 0.75623
transformers: 0.99464
cpython: 0.75517
typeshed: 0.74853
The regression in transformers is actually an improvement in a file they
don't format with black (they run `black examples tests src utils
setup.py conftest.py`, the difference is in hubconf.py). cpython doesn't
use black.
Closes#6196
## Summary
This PR modifies our logic for wrapping return type annotations.
Previously, we _always_ wrapped the annotation in parentheses if it
expanded; however, Black only exhibits this behavior when the function
parameters is empty (i.e., it doesn't and can't break). In other cases,
it uses the normal parenthesization rules, allowing nodes to bring their
own parentheses.
For example, given:
```python
def xxxxxxxxxxxxxxxxxxxxxxxxxxxx() -> Set[
"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
]:
...
def xxxxxxxxxxxxxxxxxxxxxxxxxxxx(x) -> Set[
"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
]:
...
```
Black will format as:
```python
def xxxxxxxxxxxxxxxxxxxxxxxxxxxx() -> (
Set[
"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
]
):
...
def xxxxxxxxxxxxxxxxxxxxxxxxxxxx(
x,
) -> Set[
"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
]:
...
```
Whereas, prior to this PR, Ruff would format as:
```python
def xxxxxxxxxxxxxxxxxxxxxxxxxxxx() -> (
Set[
"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
]
):
...
def xxxxxxxxxxxxxxxxxxxxxxxxxxxx(
x,
) -> (
Set[
"xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
]
):
...
```
Closes https://github.com/astral-sh/ruff/issues/6431.
## Test Plan
Before:
- `zulip`: 0.99702
- `django`: 0.99784
- `warehouse`: 0.99585
- `build`: 0.75623
- `transformers`: 0.99470
- `cpython`: 0.75988
- `typeshed`: 0.74853
After:
- `zulip`: 0.99724
- `django`: 0.99791
- `warehouse`: 0.99586
- `build`: 0.75623
- `transformers`: 0.99474
- `cpython`: 0.75956
- `typeshed`: 0.74857
## Summary
This PR fixes some misformattings around optional parentheses for
expressions.
I first noticed that we were misformatting this:
```python
return (
unicodedata.normalize("NFKC", s1).casefold()
== unicodedata.normalize("NFKC", s2).casefold()
)
```
The above is stable Black formatting, but we were doing:
```python
return unicodedata.normalize("NFKC", s1).casefold() == unicodedata.normalize(
"NFKC", s2
).casefold()
```
Above, the "last" expression is a function call, so our
`can_omit_optional_parentheses` was returning `true`...
However, it turns out that Black treats function calls differently
depending on whether or not they have arguments -- presumedly because
they'll never split empty parentheses, and so they're functionally
non-useful. On further investigation, I believe this applies to all
parenthesized expressions. If Black can't split on the parentheses, it
doesn't leverage them when removing optional parentheses.
## Test Plan
Nice increase in similarity scores.
Before:
- `zulip`: 0.99702
- `django`: 0.99784
- `warehouse`: 0.99585
- `build`: 0.75623
- `transformers`: 0.99470
- `cpython`: 0.75989
- `typeshed`: 0.74853
After:
- `zulip`: 0.99705
- `django`: 0.99795
- `warehouse`: 0.99600
- `build`: 0.75623
- `transformers`: 0.99471
- `cpython`: 0.75989
- `typeshed`: 0.74853
We currently don't format all comments as match statements are not yet implemented. We can work around this for the top level match statement by setting them manually formatted but the mocked-out top level match doesn't call into its children so they would still have unformatted comments
## Summary
This PR renames the `MagicCommand` token to `IpyEscapeCommand` token and
`MagicKind` to `IpyEscapeKind` type to better reflect the purpose of the
token and type. Similarly, it renames the AST nodes from `LineMagic` to
`IpyEscapeCommand` prefixed with `Stmt`/`Expr` wherever necessary.
It also makes renames from using `jupyter_magic` to
`ipython_escape_commands` in various function names.
The mode value is still `Mode::Jupyter` because the escape commands are
part of the IPython syntax but the lexing/parsing is done for a Jupyter
notebook.
### Motivation behind the rename:
* IPython codebase defines it as "EscapeCommand" / "Escape Sequences":
* Escape Sequences:
292e3a2345/IPython/core/inputtransformer2.py (L329-L333)
* Escape command:
292e3a2345/IPython/core/inputtransformer2.py (L410-L411)
* The word "magic" is used mainly for the actual magic commands i.e.,
the ones starting with `%`/`%%`
(https://ipython.readthedocs.io/en/stable/interactive/reference.html#magic-command-system).
So, this avoids any confusion between the Magic token (`%`, `%%`) and
the escape command itself.
## Test Plan
* `cargo test` to make sure all renames are done correctly.
* `grep` for `jupyter_escape`/`magic` to make sure all renames are done
correctly.
## Summary
This PR moves `empty_parenthesized` such that it's peer to
`parenthesized`, and changes the API to better match that of
`parenthesized` (takes `&str` rather than `StaticText`, has a
`with_dangling_comments` method, etc.).
It may be intentionally _not_ part of `parentheses.rs`, but to me
they're so similar that it makes more sense for them to be in the same
module, with the same API, etc.
Closes https://github.com/astral-sh/ruff/issues/6068
These commits are kind of a mess as I did some stumbling around here.
Unrolls formatting of chained boolean operations to prevent nested
grouping which gives us Black-compatible formatting where each boolean
operation is on a new line.
## Summary
This PR modifies our `can_omit_optional_parentheses` rules to ensure
that if we see a call followed by an attribute, we treat that as an
attribute access rather than a splittable call expression.
This in turn ensures that we wrap like:
```python
ct_match = aaaaaaaaaaact_id == self.get_content_type(
obj=rel_obj, using=instance._state.db
)
```
For calls, but:
```python
ct_match = (
aaaaaaaaaaact_id == self.get_content_type(obj=rel_obj, using=instance._state.db).id
)
```
For calls with trailing attribute accesses.
Closes https://github.com/astral-sh/ruff/issues/6065.
## Test Plan
Similarity index before:
- `zulip`: 0.99436
- `django`: 0.99779
- `warehouse`: 0.99504
- `transformers`: 0.99403
- `cpython`: 0.75912
- `typeshed`: 0.72293
And after:
- `zulip`: 0.99436
- `django`: 0.99780
- `warehouse`: 0.99504
- `transformers`: 0.99404
- `cpython`: 0.75913
- `typeshed`: 0.72293
## Summary
Per the suggestion in
https://github.com/astral-sh/ruff/discussions/6183, this PR removes
`AsyncWith`, `AsyncFor`, and `AsyncFunctionDef`, replacing them with an
`is_async` field on the non-async variants of those structs. Unlike an
interpreter, we _generally_ have identical handling for these nodes, so
separating them into distinct variants adds complexity from which we
don't really benefit. This can be seen below, where we get to remove a
_ton_ of code related to adding generic `Any*` wrappers, and a ton of
duplicate branches for these cases.
## Test Plan
`cargo test` is unchanged, apart from parser snapshots.
## Summary
Given:
```python
[ # comment
first,
second,
third
] # another comment
```
We were adding a hard line break as part of the formatting of `#
comment`, which led to the following formatting:
```python
[first, second, third] # comment
# another comment
```
Closes https://github.com/astral-sh/ruff/issues/6367.
## Summary
Fixes some comprehension formatting by avoiding creating the group for
the comprehension itself (so that if it breaks, all parts break on their
own lines, e.g. the `for` and the `if` clauses).
Closes https://github.com/astral-sh/ruff/issues/6063.
## Test Plan
Bunch of new fixtures.
Implement fluent style/call chains. See the `call_chains.py` formatting
for examples.
This isn't fully like black because in `raise A from B` they allow `A`
breaking can influence the formatting of `B` even if it is already
multiline.
Similarity index:
| project | main | PR |
|--------------|-------|-------|
| build | ??? | 0.753 |
| django | 0.991 | 0.998 |
| transformers | 0.993 | 0.994 |
| typeshed | 0.723 | 0.723 |
| warehouse | 0.978 | 0.994 |
| zulip | 0.992 | 0.994 |
Call chain formatting is affected by
https://github.com/astral-sh/ruff/issues/627, but i'm cutting scope
here.
Closes#5343
**Test Plan**:
* Added a dedicated call chains test file
* The ecosystem checks found some bugs
* I manually check django and zulip formatting
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
We already support preserving the end-of-line comment in calls and type
parameters, as in:
```python
foo( # comment
bar,
)
```
This PR adds the same behavior for lists, sets, comprehensions, etc.,
such that we preserve:
```python
[ # comment
1,
2,
3,
]
```
And related cases.
## Summary
Previously, the ruff formatter was removing the star argument of
`lambda` expressions when formatting.
Given the following code snippet
```python
lambda *a: ()
lambda **b: ()
```
it would be formatted to
```python
lambda: ()
lambda: ()
```
We fix this by checking for the presence of `args`, `vararg` or `kwarg`
in the `lambda` expression, before we were only checking for the
presence of `args`.
Fixes#5894
## Test Plan
Add new tests cases.
---------
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
## Summary
This PR leverages the `Arguments` AST node introduced in #6259 in the
formatter, which ensures that we correctly handle trailing comments in
calls, like:
```python
f(
1,
# comment
)
pass
```
(Previously, this was treated as a leading comment on `pass`.)
This also allows us to unify the argument handling across calls and
class definitions.
## Test Plan
A bunch of new fixture tests, plus improved Black compatibility.
## Summary
This PR adds a new `Arguments` AST node, which we can use for function
calls and class definitions.
The `Arguments` node spans from the left (open) to right (close)
parentheses inclusive.
In the case of classes, the `Arguments` is an option, to differentiate
between:
```python
# None
class C: ...
# Some, with empty vectors
class C(): ...
```
In this PR, we don't really leverage this change (except that a few
rules get much simpler, since we don't need to lex to find the start and
end ranges of the parentheses, e.g.,
`crates/ruff/src/rules/pyupgrade/rules/lru_cache_without_parameters.rs`,
`crates/ruff/src/rules/pyupgrade/rules/unnecessary_class_parentheses.rs`).
In future PRs, this will be especially helpful for the formatter, since
we can track comments enclosed on the node itself.
## Test Plan
`cargo test`
## Summary
This PR renames a few AST nodes for clarity:
- `Arguments` is now `Parameters`
- `Arg` is now `Parameter`
- `ArgWithDefault` is now `ParameterWithDefault`
For now, the attribute names that reference `Parameters` directly are
changed (e.g., on `StmtFunctionDef`), but the attributes on `Parameters`
itself are not (e.g., `vararg`). We may revisit that decision in the
future.
For context, the AST node formerly known as `Arguments` is used in
function definitions. Formally (outside of the Python context),
"arguments" typically refers to "the values passed to a function", while
"parameters" typically refers to "the variables used in a function
definition". E.g., if you Google "arguments vs parameters", you'll get
some explanation like:
> A parameter is a variable in a function definition. It is a
placeholder and hence does not have a concrete value. An argument is a
value passed during function invocation.
We're thus deviating from Python's nomenclature in favor of a scheme
that we find to be more precise.
**Summary** This prevents us from turning `r'''\""'''` into
`r"""\"""""`, which is invalid syntax.
This PR fixes CI, which is currently broken on main (in a way that still
passes on linter PRs and allows merging formatter PRs, but it's bad to
have a job be red). Once merged, i'll make the formatted ecosystem
checks a required check.
**Test Plan** Added a regression test.
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Format bytes string
Closes#6064
## Test Plan
Added a fixture based on string's one
## Summary
Closes https://github.com/astral-sh/ruff/issues/5781
## Test Plan
Added cases to
`crates/ruff_python_formatter/resources/test/fixtures/ruff/expression/named_expr.py`
one-by-one and adjusted the condition as needed.
## Summary
This is a rewrite of the main comment placement logic. `place_comment`
now has three parts:
- place own line comments
- between branches
- after a branch
- place end-of-line comments
- after colon
- after a branch
- place comments for specific nodes (that include module level comments)
The rewrite fixed three bugs: `class A: # trailing comment` comments now
stay end-of-line, `try: # comment` remains end-of-line and deeply
indented try-else-finally comments remain with the right nested
statement.
It will be much easier to give more alternative branches nodes since
this is abstracted away by `is_node_with_body` and the first/last child
helpers. Adding new node types can now be done by adding an entry to the
`place_comment` match. The code went from 1526 lines before #6033 to
1213 lines now.
It thinks it easier to just read the new `placement.rs` rather than
reviewing the diff.
## Test Plan
The existing fixtures staying the same or improving plus new ones for
the bug fixes.
## Summary
This PR adds the implementation for the new Jupyter AST nodes i.e.,
`ExprLineMagic` and `StmtLineMagic`.
## Test Plan
Add test cases for `unparse` containing magic commands
resolves: #6087
**Summary** Add a `EmptyWithDanglingComments` format helper that formats
comments inside empty parentheses, brackets or curly braces. Previously,
this was implemented separately, and partially incorrectly, for each use
case.
Empty `()`, `[]` and `{}` are special because there can be dangling
comments, and they can be in
two positions:
```python
x = [ # end-of-line
# own line
]
```
These comments are dangling because they can't be assigned to any
element inside as they would
in all other cases.
**Test Plan** Added a regression test.
145 (from previously 149) instances of unstable formatting remaining.
```
$ cargo run --bin ruff_dev --release -- format-dev --stability-check --error-file formatter-ecosystem-errors.txt --multi-project target/checkouts > formatter-ecosystem-progress.txt
$ rg "Unstable formatting" target/formatter-ecosystem-errors.txt | wc -l
145
```
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
- Remove space when start of slice is empty
- Treat unary op except `not` as simple expression
## Test Plan
Add some simple tests for unary op expressions in slice
Closes#5673
## Summary
This crate now contains utilities for dealing with trivia more broadly:
whitespace, newlines, "simple" trivia lexing, etc. So renaming it to
reflect its increased responsibilities.
To avoid conflicts, I've also renamed `Token` and `TokenKind` to
`SimpleToken` and `SimpleTokenKind`.
**Summary** Fix the formatter crash with `x[(1) :: ]` and related code.
**Problem** For assigning comments in slices in subscripts, we need to
find the positions of the colons to assign comments before and after the
colon to the respective lower/upper/step node (or dangling in that
section). Formatting `x[(1) :: ]` was broken because we were looking for
a `:` after the `1` but didn't consider that there could be a `)`
outside the range of the lower node, which contains just the `1` and no
optional parentheses.
**Solution** Use the simple tokenizer directly and skip all closing
parentheses.
**Test Plan** I added regression tests.
Closes#5733
**Summary** Add a static string error message to the formatter syntax
error so we can disambiguate where the syntax error came from
**Test Plan** No fixed tests, we don't expect this to occur, but it
helped with transformers syntax error debugging:
```
Error: Failed to format node
Caused by:
syntax error: slice first colon token was not a colon
```
## Summary
The motivation here is that it will make this rule easier to rewrite as
a deferred check. Right now, we can't run this rule in the deferred
phase, because it depends on the `except_handler` to power its autofix.
Instead of lexing the `except_handler`, we can use the `SimpleTokenizer`
from the formatter, and just lex forwards and backwards.
For context, this rule detects the unused `e` in:
```python
try:
pass
except ValueError as e:
pass
```
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR uses the `join_comma_separated` builder for formatting set
expressions
to ensure the formatting preserves magic commas, if the setting is
enabled.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
See the fixed black tests
<!-- How was it tested? -->
## Summary
Format `DictComp` like `ListComp` from #5600. It's not 100%, but I
figured maybe it's worth starting to explore.
## Test Plan
Added ruff fixture based on `ListComp`'s.
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR improves the parentheses handling for with items to get closer
to black's formatting.
### Case 1:
```python
# Black / Input
with (
[
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
"bbbbbbbbbb",
"cccccccccccccccccccccccccccccccccccccccccc",
dddddddddddddddddddddddddddddddd,
] as example1,
aaaaaaaaaaaaaaaaaaaaaaaaaa
+ bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+ cccccccccccccccccccccccccccc
+ ddddddddddddddddd as example2,
CtxManager2() as example2,
CtxManager2() as example2,
CtxManager2() as example2,
):
...
# Before
with (
[
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
"bbbbbbbbbb",
"cccccccccccccccccccccccccccccccccccccccccc",
dddddddddddddddddddddddddddddddd,
] as example1,
(
aaaaaaaaaaaaaaaaaaaaaaaaaa
+ bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
+ cccccccccccccccccccccccccccc
+ ddddddddddddddddd
) as example2,
CtxManager2() as example2,
CtxManager2() as example2,
CtxManager2() as example2,
):
...
```
Notice how Ruff wraps the binary expression in an extra set of
parentheses
### Case 2:
Black does not expand the with-items if the with has no parentheses:
```python
# Black / Input
with aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb as c:
...
# Before
with (
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb as c
):
...
```
Or
```python
# Black / Input
with [
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
"bbbbbbbbbb",
"cccccccccccccccccccccccccccccccccccccccccc",
dddddddddddddddddddddddddddddddd,
] as example1, aaaaaaaaaaaaaaaaaaaaaaaaaa * bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb * cccccccccccccccccccccccccccc + ddddddddddddddddd as example2, CtxManager222222222222222() as example2:
...
# Before (Same as Case 1)
with (
[
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
"bbbbbbbbbb",
"cccccccccccccccccccccccccccccccccccccccccc",
dddddddddddddddddddddddddddddddd,
] as example1,
(
aaaaaaaaaaaaaaaaaaaaaaaaaa
* bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
* cccccccccccccccccccccccccccc
+ ddddddddddddddddd
) as example2,
CtxManager222222222222222() as example2,
):
...
```
## Test Plan
I added new snapshot tests
Improves the django similarity index from 0.973 to 0.977
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Format `SetComp` like `ListComp`.
## Test Plan
Derived from `ListComp`'s fixture.
## Summary
The previous dummy was causing instabilities since it turned a string
into a variable.
E.g.
```python
script_header_dict[
"slurm_partition_line"
] = f"#SBATCH --partition {resources.queue_name}"
```
has an instability as
```python
- script_header_dict["slurm_partition_line"] = (
- NOT_YET_IMPLEMENTED_ExprJoinedStr
- )
+ script_header_dict[
+ "slurm_partition_line"
+ ] = NOT_YET_IMPLEMENTED_ExprJoinedStr
```
## Test Plan
The instability is gone, otherwise it's still a dummy
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR matches Black' behavior where it only omits the optional parentheses if the expression starts or ends with a parenthesized expression:
```python
a + [aaa, bbb, cccc] * c # Don't omit
[aaa, bbb, cccc] + a * c # Split
a + c * [aaa, bbb, ccc] # Split
```
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
This improves the Jaccard index from 0.945 to 0.946
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR improves the Black compatibility when it comes to breaking comprehensions.
We want to avoid line breaks before the target and `in` whenever possible. Furthermore, `if X is not None` should be grouped together, similar to other binary like expressions
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
`cargo test`
<!-- How was it tested? -->
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
I started working on this because I assumed that I would need access to options inside of `NeedsParantheses` but it then turned out that I won't.
Anyway, it kind of felt nice to pass fewer arguments. So I'm gonna put this out here to get your feedback if you prefer this over passing individual fiels.
Oh, I sneeked in another change. I renamed `context.contents` to `source`. `contents` is too generic and doesn't tell you anything.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
It compiles
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR implements Black's behavior where it first splits off parenthesized expressions before splitting before operands to avoid unnecessary parentheses:
```python
# We want
if a + [
b,
c
]:
pass
# Rather than
if (
a
+ [b, c]
):
pass
```
This is implemented by using the new IR elements introduced in #5596.
* We give the group wrapping the optional parentheses an ID (`parentheses_id`)
* We use `conditional_group` for the lower priority groups (all non-parenthesized expressions) with the condition that the `parentheses_id` group breaks (we want to split before operands only if the parentheses are necessary)
* We use `fits_expanded` to wrap all other parenthesized expressions (lists, dicts, sets), to prevent that expanding e.g. a list expands the `parentheses_id` group. We gate the `fits_expand` to only apply if the `parentheses_id` group fits (because we prefer `a\n+[b, c]` over expanding `[b, c]` if the whole expression gets parenthesized).
We limit using `fits_expanded` and `conditional_group` only to expressions that themselves are not in parentheses (checking the conditions isn't free)
## Test Plan
It increases the Jaccard index for Django from 0.915 to 0.917
## Incompatibilites
There are two incompatibilities left that I'm aware of (there may be more, I didn't go through all snapshot differences).
### Long string literals
I commented on the regression. The issue is that a very long string (or any content without a split point) may not fit when only breaking the right side. The formatter than inserts the optional parentheses. But this is kind of useless because the overlong string will still not fit, because there are no new split points.
I think we should ignore this incompatibility for now
### Expressions on statement level
I don't fully understand the logic behind this yet, but black doesn't break before the operators for the following example even though the expression exceeds the configured line width
```python
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa < bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb > ccccccccccccccccccccccccccccc == ddddddddddddddddddddd
```
But it would if the expression is used inside of a condition.
What I understand so far is that Black doesn't insert optional parentheses on the expression statement level (and a few other places) and, therefore, only breaks after opening parentheses. I propose to keep this deviation for now to avoid overlong-lines and use the compatibility report to make a decision if we should implement the same behavior.
## Summary
This PR implements the formatting of `raise` statements. I haven't
looked at the black implementation, this is inspired from from the
`return` statements formatting.
## Test Plan
The black differences with insta.
I also compared manually some edge cases with very long string and call
chaining and it seems to do the same formatting as black.
There is one issue:
```python
# input
raise OsError(
"aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa"
) from a.aaaaa(aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa).a(aaaa)
# black
raise OsError(
"aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa"
) from a.aaaaa(
aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa
).a(
aaaa
)
# ruff
raise OsError(
"aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa"
) from a.aaaaa(
aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa
).a(aaaa)
```
But I'm not sure this diff is the raise formatting implementation.
---------
Co-authored-by: Louis Dispa <ldispa@deezer.com>
## Summary
Format named expressions (walrus operator) such a `value := f()`.
Unlike tuples, named expression parentheses are not part of the range
even when mandatory, so mapping optional parentheses to always gives us
decent formatting without implementing all [PEP
572](https://peps.python.org/pep-0572/) rules on when we need
parentheses where other expressions wouldn't. We might want to revisit
this decision later and implement special cases, but for now this gives
us what we need.
## Test Plan
black fixtures, i added some fixtures and checked django and cpython for
stability.
Closes#5613
## Summary
Format `ExprIfExp`, also known as the ternary operator or inline `if`.
It can look like
```python
a1 = 1 if True else 2
```
but also
```python
b1 = (
# We return "a" ...
"a" # that's our True value
# ... if this condition matches ...
if True # that's our test
# ... otherwise we return "b§
else "b" # that's our False value
)
```
This also fixes a visitor order bug.
The jaccard index on django goes from 0.911 to 0.915.
## Test Plan
I added fixtures without and with comments in strange places.
## Summary
Change generator formatting dummy to include `NOT_YET_IMPLEMENTED`. This
makes it easier to correctly identify them as dummies
## Test Plan
This is a dummy change
Support for `let…else` formatting was just merged to nightly
(rust-lang/rust#113225). Rerun `cargo fmt` with Rust nightly 2023-07-02
to pick this up. Followup to #939.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR normalizes line endings inside of strings to `\n` as required by the printer.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
I added a new test using `\r\n` and ran the ecosystem check. There are no remaining end of line panics.
https://gist.github.com/MichaReiser/8f36b1391ca7b48475b3a4f592d74ff4
<!-- How was it tested? -->
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR fixes an issue where the binary expression formatting removed parentheses around the left hand side of an expression.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
I added a new regression test and re-ran the ecosystem check. It brings down the `check-formatter-stability` output from a 3.4MB file down to 900KB.
<!-- How was it tested? -->
## Summary
This formats call expressions with magic trailing comma and parentheses
behaviour but without call chaining
## Test Plan
Lots of new test fixtures, including some that don't work yet
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR extends the string formatting to respect the configured quote style.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
Extended the string test with new cases and set it up to run twice: Once with the `quote_style: Doube`, and once with `quote_style: Single` single and double quotes.
<!-- How was it tested? -->
## Motation
Previously,
```python
x = (
a1
.a2
# a
. # b
# c
a3
)
```
got formatted as
```python
x = a1.a2
# a
. # b
# c
a3
```
which is invalid syntax. This fixes that.
## Summary
This implements a basic form of attribute chaining
(<https://black.readthedocs.io/en/stable/the_black_code_style/current_style.html#call-chains>)
by checking if any inner attribute access contains an own line comment,
and if this is the case, adds parentheses around the outermost attribute
access while disabling parentheses for all inner attribute expressions.
We want to replace this with an implementation that uses recursion or a
stack while formatting instead of in `needs_parentheses` and also
includes calls rather sooner than later, but i'm fixing this now because
i'm uncomfortable with having known invalid syntax generation in the
formatter.
## Test Plan
I added new fixtures.
## Summary
This is small refactoring to reuse the code that detects the magic
trailing comma across functions. I make this change now to avoid copying
code in a later PR. @MichaReiser is planning on making a larger
refactoring later that integrates with the join nodes builder
## Test Plan
No functional changes. The magic trailing comma behaviour is checked by
the fixtures.
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR implements formatting for non-f-string Strings that do not use implicit concatenation.
Docstring formatting is out of the scope of this PR.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
I added a few tests for simple string literals.
## Performance
Ouch. This is hitting performance somewhat hard. This is probably because we now iterate each string a couple of times:
1. To detect if it is an implicit string continuation
2. To detect if the string contains any new lines
3. To detect the preferred quote
4. To normalize the string
Edit: I integrated the detection of newlines into the preferred quote detection so that we only iterate the string three time.
We can probably do better by merging the implicit string continuation with the quote detection and new line detection by iterating till the end of the string part and returning the offset. We then use our simple tokenizer to skip over any comments or whitespace until we find the first non trivia token. From there we keep continue doing this in a loop until we reach the end o the string. I'll leave this improvement for later.
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR adds basic formatting for compare operations.
The implementation currently breaks diffeently when nesting binary like expressions. I haven't yet figured out what Black's logic is in that case but I think that this by itself is already an improvement worth merging.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
I added a few new tests
<!-- How was it tested? -->
This solves an instability when formatting cpython. It also introduces
another one, but i think it's still a worthwhile change for now.
There's no proper testing since this is just a dummy.
## Motivation
While black keeps parentheses nearly everywhere, the notable exception
is in the body of for loops:
```python
for (a, b) in x:
pass
```
becomes
```python
for a, b in x:
pass
```
This currently blocks #5163, which this PR should unblock.
## Solution
This changes the `ExprTuple` formatting option to include one additional
option that removes the parentheses when not using magic trailing comma
and not breaking. It is supposed to be used through
```rust
#[derive(Debug)]
struct ExprTupleWithoutParentheses<'a>(&'a Expr);
impl Format<PyFormatContext<'_>> for ExprTupleWithoutParentheses<'_> {
fn fmt(&self, f: &mut Formatter<PyFormatContext<'_>>) -> FormatResult<()> {
match self.0 {
Expr::Tuple(expr_tuple) => expr_tuple
.format()
.with_options(TupleParentheses::StripInsideForLoop)
.fmt(f),
other => other.format().with_options(Parenthesize::IfBreaks).fmt(f),
}
}
}
```
## Testing
The for loop formatting isn't merged due to missing this (and i didn't
want to create more git weirdness across two people), but I've confirmed
that when applying this to while loops instead of for loops, then
```rust
write!(
f,
[
text("while"),
space(),
ExprTupleWithoutParentheses(test.as_ref()),
text(":"),
trailing_comments(trailing_condition_comments),
block_indent(&body.format())
]
)?;
```
makes
```python
while (a, b):
pass
while (
ajssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssa,
b,
):
pass
while (a,b,):
pass
```
formatted as
```python
while a, b:
pass
while (
ajssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssa,
b,
):
pass
while (
a,
b,
):
pass
```
This formats slice expressions and subscript expressions.
Spaces around the colons follows the same rules as black
(https://black.readthedocs.io/en/stable/the_black_code_style/current_style.html#slices):
```python
e00 = "e"[:]
e01 = "e"[:1]
e02 = "e"[: a()]
e10 = "e"[1:]
e11 = "e"[1:1]
e12 = "e"[1 : a()]
e20 = "e"[a() :]
e21 = "e"[a() : 1]
e22 = "e"[a() : a()]
e200 = "e"[a() : :]
e201 = "e"[a() :: 1]
e202 = "e"[a() :: a()]
e210 = "e"[a() : 1 :]
```
Comment placement is different due to our very different infrastructure.
If we have explicit bounds (e.g. `x[1:2]`) all comments get assigned as
leading or trailing to the bound expression. If a bound is missing
`[:]`, comments get marked as dangling and placed in the same section as
they were originally in:
```python
x = "x"[ # a
# b
: # c
# d
]
```
to
```python
x = "x"[
# a
# b
:
# c
# d
]
```
Except for the potential trailing end-of-line comments, all comments get
formatted on their own line. This can be improved by keeping end-of-line
comments after the opening bracket or after a colon as such but the
changes were already complex enough.
I added tests for comment placement and spaces.
## Summary
Previously, `DecoratedComment` used `text_position()` and
`SourceComment` used `position()`. This PR unifies this to
`line_position` everywhere.
## Test Plan
This is a rename refactoring.
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR adds basic formatting for unary expressions.
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
I added a new `unary.py` with custom test cases
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Black supports for layouts when it comes to breaking binary expressions:
```rust
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
enum BinaryLayout {
/// Put each operand on their own line if either side expands
Default,
/// Try to expand the left to make it fit. Add parentheses if the left or right don't fit.
///
///```python
/// [
/// a,
/// b
/// ] & c
///```
ExpandLeft,
/// Try to expand the right to make it fix. Add parentheses if the left or right don't fit.
///
/// ```python
/// a & [
/// b,
/// c
/// ]
/// ```
ExpandRight,
/// Both the left and right side can be expanded. Try in the following order:
/// * expand the right side
/// * expand the left side
/// * expand both sides
///
/// to make the expression fit
///
/// ```python
/// [
/// a,
/// b
/// ] & [
/// c,
/// d
/// ]
/// ```
ExpandRightThenLeft,
}
```
Our current implementation only handles `ExpandRight` and `Default` correctly. This PR adds support for `ExpandRightThenLeft` and `ExpandLeft`.
## Test Plan
I added tests that play through all 4 binary expression layouts.
## Summary
This PR upgrade RustPython to pull in the changes to `Arguments` (zip
defaults with their identifiers) and all the renames to `CmpOp` and
friends.
## Summary
This fixes a number of problems in the formatter that showed up with
various files in the [cpython](https://github.com/python/cpython)
repository. These problems surfaced as unstable formatting and invalid
code. This is not the entirety of problems discovered through cpython,
but a big enough chunk to separate it. Individual fixes are generally
individual commits. They were discovered with #5055, which i update as i
work through the output
## Test Plan
I added regression tests with links to cpython for each entry, except
for the two stubs that also got comment stubs since they'll be
implemented properly later.
## Summary
This PR runs `rustfmt` with a few nightly options as a one-time fix to
catch some malformatted comments. I ended up just running with:
```toml
condense_wildcard_suffixes = true
edition = "2021"
max_width = 100
normalize_comments = true
normalize_doc_attributes = true
reorder_impl_items = true
unstable_features = true
use_field_init_shorthand = true
```
Since these all seem like reasonable things to fix, so may as well while
I'm here.
This implements formatting ExprTuple, including magic trailing comma. I
intentionally didn't change the settings mechanism but just added a
dummy global const flag.
Besides the snapshots, I added custom breaking/joining tests and a
deeply nested test case. The diffs look better than previously, proper
black compatibility depends on parentheses handling.
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
* A basic StmtAssign formatter and better dummies for expressions
The goal of this PR was formatting StmtAssign since many nodes in the black tests (and in python in general) are after an assignment. This caused unstable formatting: The spacing of power op spacing depends on the type of the two involved expressions, but each expression was formatted as dummy string and re-parsed as a ExprName, so in the second round the different rules of ExprName were applied, causing unstable formatting.
This PR does not necessarily bring us closer to black's style, but it unlocks a good porting of black's test suite and is a basis for implementing the Expr nodes.
* fmt
* Review
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
This PR replaces the `verbatim_text` builder with a `not_yet_implemented` builder that emits `NOT_YET_IMPLEMENTED_<NodeKind>` for not yet implemented nodes.
The motivation for this change is that partially formatting compound statements can result in incorrectly indented code, which is a syntax error:
```python
def func_no_args():
a; b; c
if True: raise RuntimeError
if False: ...
for i in range(10):
print(i)
continue
```
Get's reformatted to
```python
def func_no_args():
a; b; c
if True: raise RuntimeError
if False: ...
for i in range(10):
print(i)
continue
```
because our formatter does not yet support `for` statements and just inserts the text from the source.
## Downsides
Using an identifier will not work in all situations. For example, an identifier is invalid in an `Arguments ` position. That's why I kept `verbatim_text` around and e.g. use it in the `Arguments` formatting logic where incorrect indentations are impossible (to my knowledge). Meaning, `verbatim_text` we can opt in to `verbatim_text` when we want to iterate quickly on nodes that we don't want to provide a full implementation yet and using an identifier would be invalid.
## Upsides
Running this on main discovered stability issues with the newline handling that were previously "hidden" because of the verbatim formatting. I guess that's an upside :)
## Test Plan
None?