I don't think we can move the `fits_expanded` call into the assignment
formatting because that would wrap the whole lambda in a `fits_expanded`, when we
just want to wrap the lambda body in it instead. if I understand correctly, we'd
need to duplicate basically this whole function to inject `fits_expanded` in the
right place for the lambda formatting in assignments
I tried to get Claude to come up with tests, but most of them weren't very
interesting. I think these two additional types of assignments might be worth
having, though.
parenthesizing these seems redundant. I would prefer our old formatting more
like this:
```py
def ddb():
sql = (
lambda var, table, n=N: f"""
CREATE TABLE {table} AS
SELECT ROW_NUMBER() OVER () AS id, {var}
FROM (
SELECT {var}
FROM RANGE({n}) _ ({var})
ORDER BY RANDOM()
)
"""
)
```
where the `f"""` serves as the parentheses, instead of the current:
```py
def ddb():
sql = lambda var, table, n=N: (
f"""
CREATE TABLE {table} AS
SELECT ROW_NUMBER() OVER () AS id, {var}
FROM (
SELECT {var}
FROM RANGE({n}) _ ({var})
ORDER BY RANDOM()
)
"""
)
```
this case ends up too long at 108 columns:
```py
class C:
def foo():
if True:
transaction_count = self._query_txs_for_range(
get_count_fn=lambda from_ts, to_ts, _chain_id=chain_id: db_evmtx.count_transactions_in_range(
chain_id=_chain_id,
from_ts=from_ts,
to_ts=to_ts,
),
)
```
instead, it should be formatted like this, fitting within 88 columns:
```py
class C:
def foo():
if True:
transaction_count = self._query_txs_for_range(
get_count_fn=lambda from_ts, to_ts, _chain_id=chain_id: (
db_evmtx.count_transactions_in_range(
chain_id=_chain_id,
from_ts=from_ts,
to_ts=to_ts,
)
),
)
```
we can fix this by removing the `has_own_parentheses` check in the new lambda
formatting, but this breaks other cases. we might want to preserve this? in this
specific ecosystem case, the project has a `noqa: E501` comment, so this seems
to be what they want anyway, although we don't know that when formatting
I would expect this to format as:
```py
class C:
_is_recognized_dtype: Callable[[DtypeObj], bool] = lambda x: (
lib.is_np_dtype(x, "M") or isinstance(x, DatetimeTZDtype)
)
```
instead of the current:
```py
class C:
_is_recognized_dtype: Callable[[DtypeObj], bool] = (
lambda x: lib.is_np_dtype(x, "M") or isinstance(x, DatetimeTZDtype)
)
```
```diff
-name = re.sub(r"[^\x21\x23-\x5b\x5d-\x7e]...............", lambda m: (
- f"\\{m.group(0)}"
- ), p["name"])
+name = re.sub(
+ r"[^\x21\x23-\x5b\x5d-\x7e]...............",
+ lambda m: (f"\\{m.group(0)}"),
+ p["name"],
+)
```
the second format is actually fine, but the first one obviously looks horrible,
even if it were stable
this formats as:
```py
class C:
function_dict: Dict[Text, Callable[[CRFToken], Any]] = {
CRFEntityExtractorOptions.POS2: lambda crf_token: crf_token.pos_tag[
:2
] if crf_token.pos_tag is not None else None,
}
```
when I think it should look like:
```py
class C:
function_dict: Dict[Text, Callable[[CRFToken], Any]] = {
CRFEntityExtractorOptions.POS2: lambda crf_token: (
crf_token.pos_tag[:2] if crf_token.pos_tag is not None else None,
)
}
```
Closes#11216
Essentially the approach is to implement `Format` for a new struct
`FormatClause` which is just a clause header _and_ its body. We then
have the information we need to see whether there is a skip suppression
comment on the last child in the body and it all fits on one line.
## Summary
This is another attempt at https://github.com/astral-sh/ruff/pull/21410
that fixes https://github.com/astral-sh/ruff/issues/19226.
@MichaReiser helped me get something working in a very helpful pairing
session. I pushed one additional commit moving the comments back from
leading comments to trailing comments, which I think retains more of the
input formatting.
I was inspired by Dylan's PR (#21185) to make one of these tables:
<table>
<thead>
<tr>
<th scope="col">Input</th>
<th scope="col">Main</th>
<th scope="col">PR</th>
</tr>
</thead>
<tbody>
<tr>
<td><pre lang="python">
if (
not
# comment
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
):
pass
</pre></td>
<td><pre lang="python">
if (
# comment
not aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
):
pass
</pre></td>
<td><pre lang="python">
if (
not
# comment
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
):
pass
</pre></td>
</tr>
<tr>
<td><pre lang="python">
if (
# unary comment
not
# operand comment
(
# comment
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
)
):
pass
</pre></td>
<td><pre lang="python">
if (
# unary comment
# operand comment
not (
# comment
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
)
):
pass
</pre></td>
<td><pre lang="python">
if (
# unary comment
not
# operand comment
(
# comment
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
)
):
pass
</pre></td>
</tr>
<tr>
<td><pre lang="python">
if (
not # comment
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
):
pass
</pre></td>
<td><pre lang="python">
if ( # comment
not aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+ bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
):
pass
</pre></td>
<td><pre lang="python">
if (
not aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa # comment
+ bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
):
pass
</pre></td>
</tr>
</tbody>
</table>
hopefully it helps even though the snippets are much wider here.
The two main differences are (1) that we now retain own-line comments
between the unary operator and its operand instead of moving these to
leading comments on the operator itself, and (2) that we move
end-of-line comments between the operator and operand to dangling
end-of-line comments on the operand (the last example in the table).
## Test Plan
Existing tests, plus new ones based on the issue. As I noted below, I
also ran the output from main on the unary.py file back through this
branch to check that we don't reformat code from main. This made me feel
a bit better about not preview-gating the changes in this PR.
```shell
> git show main:crates/ruff_python_formatter/resources/test/fixtures/ruff/expression/unary.py | ruff format - | ./target/debug/ruff format --diff -
> echo $?
0
```
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
Co-authored-by: Takayuki Maeda <takoyaki0316@gmail.com>
Closes#19350
This fixes a syntax error caused by formatting. However, the new tests reveal that there are some cases where formatting attributes with certain comments behaves strangely, both before and after this PR, so some more polish may be in order.
For example, without parentheses around the value, and both before and after this PR, we have:
```python
# unformatted
variable = (
something # a comment
.first_method("some string")
)
# formatted
variable = something.first_method("some string") # a comment
```
which is probably not where the comment ought to go.
This PR attempts to improve the placement of own-line comments between
branches in the setting where the comment is more indented than the
preceding node.
There are two main changes.
### First change: Preceding node has leading content
If the preceding node has leading content, we now regard the comment as
automatically _less_ indented than the preceding node, and format
accordingly.
For example,
```python
if True: preceding_node
# leading on `else`, not trailing on `preceding_node`
else: ...
```
This is more compatible with `black`, although there is a (presumably
very uncommon) edge case:
```python
if True:
this;that
# leading on `else`, but trailing in `black`
else: ...
```
I'm sort of okay with this - presumably if one wanted a comment for
those semi-colon separated statements, one should have put it _above_
them, and one wanted a comment only for `that` then it ought to have
been on the same line?
### Second change: searching for last child in body
While searching for the (recursively) last child in the body of the
preceding _branch_, we implicitly assumed that the preceding node had to
have a body to begin the recursion. But actually, in the base case, the
preceding node _is_ the last child in the body of the preceding branch.
So, for example:
```python
if True:
something
last_child_but_no_body
# leading on else for `main` but trailing in this PR
else: ...
```
### More examples
The table below is an attempt to summarize the changes in behavior. The
rows alternate between an example snippet with `while` and the same
example with `if` - in the former case we do _not_ have an `else` node
and in the latter we do.
Notice that:
1. On `main` our handling of `if` vs. `while` is not consistent, whereas
it is consistent in the present PR
2. We disagree with `black` in all cases except that last example on
`main`, but agree in all cases for the present PR (though see above for
a wonky edge case where we disagree).
<table>
<tr>
<th>Original
</th>
<th><code>main</code> </th>
<th>This
PR </th>
<th><code>black</code> </th>
</tr>
<tr>
<td>
<pre lang="python">
while True:
pass
# comment
else:
pass
</pre>
</td>
<td>
<pre lang="python">
while True:
pass
else:
# comment
pass
</pre>
</td>
<td>
<pre lang="python">
while True:
pass
# comment
else:
pass
</pre>
</td>
<td>
<pre lang="python">
while True:
pass
# comment
else:
pass
</pre>
</td>
</tr>
<tr>
<td>
<pre lang="python">
if True:
pass
# comment
else:
pass
</pre>
</td>
<td>
<pre lang="python">
if True:
pass
# comment
else:
pass
</pre>
</td>
<td>
<pre lang="python">
if True:
pass
# comment
else:
pass
</pre>
</td>
<td>
<pre lang="python">
if True:
pass
# comment
else:
pass
</pre>
</td>
</tr>
<tr>
<td>
<pre lang="python">
while True: pass
# comment
else:
pass
</pre>
</td>
<td>
<pre lang="python">
while True:
pass
# comment
else:
pass
</pre>
</td>
<td>
<pre lang="python">
while True:
pass
# comment
else:
pass
</pre>
</td>
<td>
<pre lang="python">
while True:
pass
# comment
else:
pass
</pre>
</td>
</tr>
<tr>
<td>
<pre lang="python">
if True: pass
# comment
else:
pass
</pre>
</td>
<td>
<pre lang="python">
if True:
pass
# comment
else:
pass
</pre>
</td>
<td>
<pre lang="python">
if True:
pass
# comment
else:
pass
</pre>
</td>
<td>
<pre lang="python">
if True:
pass
# comment
else:
pass
</pre>
</td>
</tr>
<tr>
<td>
<pre lang="python">
while True: pass
# comment
else:
pass
</pre>
</td>
<td>
<pre lang="python">
while True:
pass
else:
# comment
pass
</pre>
</td>
<td>
<pre lang="python">
while True:
pass
# comment
else:
pass
</pre>
</td>
<td>
<pre lang="python">
while True:
pass
# comment
else:
pass
</pre>
</td>
</tr>
<tr>
<td>
<pre lang="python">
if True: pass
# comment
else:
pass
</pre>
</td>
<td>
<pre lang="python">
if True:
pass
# comment
else:
pass
</pre>
</td>
<td>
<pre lang="python">
if True:
pass
# comment
else:
pass
</pre>
</td>
<td>
<pre lang="python">
if True:
pass
# comment
else:
pass
</pre>
</td>
</tr>
</table>
Summary
--
This PR fixes#17796 by taking the approach mentioned in
https://github.com/astral-sh/ruff/issues/17796#issuecomment-2847943862
of simply recursing into the `MatchAs` patterns when checking if we need
parentheses. This allows us to reuse the parentheses in the inner
pattern before also breaking the `MatchAs` pattern itself:
```diff
match class_pattern:
case Class(xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx) as capture:
pass
- case (
- Class(xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx) as capture
- ):
+ case Class(
+ xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+ ) as capture:
pass
- case (
- Class(
- xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
- ) as capture
- ):
+ case Class(
+ xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+ ) as capture:
pass
case (
Class(
@@ -685,13 +683,11 @@
match sequence_pattern_brackets:
case [xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx] as capture:
pass
- case (
- [xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx] as capture
- ):
+ case [
+ xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+ ] as capture:
pass
- case (
- [
- xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
- ] as capture
- ):
+ case [
+ xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+ ] as capture:
pass
```
I haven't really resolved the question of whether or not it's okay
always to recurse, but I'm hoping the ecosystem check on this PR might
shed some light on that.
Test Plan
--
New tests based on the issue and then reviewing the ecosystem check here