Summary
--
This PR makes two changes to comment placement in lambda parameters.
First, we now insert a line break if the first parameter has a leading comment:
```py
# input
(
lambda
* # comment 2
x:
x
)
# main
(
lambda # comment 2
*x: x
)
# this PR
(
lambda
# comment 2
*x: x
)
```
Note the missing space in the output from main. This case is currently unstable
on main. Also note that the new formatting is more consistent with our stable
formatting in cases where the lambda has its own dangling comment:
```py
# input
(
lambda # comment 1
* # comment 2
x:
x
)
# output
(
lambda # comment 1
# comment 2
*x: x
)
```
and when a parameter without a comment precedes the split `*x`:
```py
# input
(
lambda y,
* # comment 2
x:
x
)
# output
(
lambda y,
# comment 2
*x: x
)
```
This does change the stable formatting, but I think that's acceptable: such
cases should be rare (I'm expecting zero hits in the ecosystem report), the
change fixes an existing instability, and it should not change any code we've
previously formatted.
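As a rough sanity check (not part of this PR's tests), the three layouts from the first example can be compared with Python's standard `ast` and `tokenize` modules to confirm that they differ only in whitespace and comment position. The helper names `structure` and `comments` are made up for this sketch, and the snippets are copied without indentation, as shown above:

```py
import ast
import io
import tokenize

# The three layouts from the first example.
INPUT = "(\nlambda\n* # comment 2\nx:\nx\n)\n"
MAIN = "(\nlambda # comment 2\n*x: x\n)\n"
THIS_PR = "(\nlambda\n# comment 2\n*x: x\n)\n"


def structure(source: str) -> str:
    """Dump the AST without positions, so only the structure is compared."""
    return ast.dump(ast.parse(source))


def comments(source: str) -> list:
    """Collect the text of every comment token, in source order."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type == tokenize.COMMENT]


# All three layouts describe the same lambda and carry the same comment.
assert structure(INPUT) == structure(MAIN) == structure(THIS_PR)
assert comments(INPUT) == comments(MAIN) == comments(THIS_PR) == ["# comment 2"]
```

This only shows that the code and comment text are preserved; it says nothing about which node the formatter attaches the comment to.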
Second, this PR modifies the comment placement such that `# comment 2` in these
outputs is still a leading comment on the parameter. This is also not the case
on main, where it becomes a [dangling lambda
comment](https://play.ruff.rs/3b29bb7e-70e4-4365-88e0-e60fe1857a35?secondary=Comments).
This doesn't cause any instability that I'm aware of on main, but it does cause
problems when trying to adjust the placement of dangling lambda comments in
#21385. Changing the placement in this way should not affect any formatting
here.
Test Plan
--
New lambda tests, plus existing tests covering the cases above with multiple
comments around the parameters (see lambda.py 122-143, and 122-205 or so more
broadly).
I also checked manually that the comments are now leading on the parameter:
```shell
❯ cargo run --bin ruff_python_formatter -- --emit stdout --target-version 3.10 --print-comments <<EOF
(
lambda
# comment 2
*x: x
)
EOF
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.15s
Running `target/debug/ruff_python_formatter --emit stdout --target-version 3.10 --print-comments`
# Comment decoration: Range, Preceding, Following, Enclosing, Comment
21..32, None, Some((Parameters, 37..39)), (ExprLambda, 6..42), "# comment 2"
{
Node {
kind: Parameter,
range: 37..39,
source: `*x`,
}: {
"leading": [
SourceComment {
text: "# comment 2",
position: OwnLine,
formatted: true,
},
],
"dangling": [],
"trailing": [],
},
}
(
lambda
# comment 2
*x: x
)
```
But I didn't see a great place to put a test like this. Is there
somewhere I can assert this comment placement since it doesn't affect
any formatting yet? Or is it okay to wait until we use this in #21385?
This is unstable because it moves the comment into a new set of parentheses,
which then means the lambda itself can be unparenthesized:
```diff
-a = (
- lambda: ( # Dangling
- 1
- )
+a = lambda: ( # Dangling
+ 1
)
```
I don't think we can move the `fits_expanded` call into the assignment
formatting because that would wrap the whole lambda in a `fits_expanded`, when we
just want to wrap the lambda body in it instead. If I understand correctly, we'd
need to duplicate basically this whole function to inject `fits_expanded` in the
right place for the lambda formatting in assignments.
## Summary
Garbage collect ASTs once we are done checking a given file. Queries
with a cross-file dependency on the AST will reparse the file on demand.
This reduces ty's peak memory usage by ~20-30%.
The primary change of this PR is adding a `node_index` field to every
AST node, assigned whenever the file is parsed. `ParsedModule` can use this to
create a flat index of AST nodes any time the file is parsed (or
reparsed). This allows `AstNodeRef` to simply index into the current
instance of the `ParsedModule`, instead of storing a pointer directly.
The indices are somewhat hackily (using an atomic integer) assigned by
the `parsed_module` query instead of by the parser directly. Assigning
the indices in source-order in the (recursive) parser turns out to be
difficult, and collecting the nodes during semantic indexing is
impossible as `SemanticIndex` does not hold onto a specific
`ParsedModuleRef`, which the pointers in the flat AST are tied to. This
means that we have to do an extra AST traversal to assign and collect
the nodes into a flat index, but the small performance impact (~3% on
cold runs) seems worth it for the memory savings.
Part of https://github.com/astral-sh/ty/issues/214.
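To illustrate the flat-index idea described above, here is a minimal Python sketch built on the standard `ast` module. It is a conceptual model only and does not reflect the actual Rust implementation: the classes reuse the names `ParsedModule` and `AstNodeRef` from the description, while the `nodes`, `clear`, and `parse_count` members are hypothetical. It also numbers nodes in `ast.walk` order rather than source order, since the sketch only needs the order to be deterministic across reparses.

```py
import ast
from itertools import count


class ParsedModule:
    """Hypothetical stand-in: owns the AST and a flat index of its nodes."""

    def __init__(self, source):
        self.source = source
        self.parse_count = 0
        self._flat = None  # dropped by clear(), rebuilt on demand

    def nodes(self):
        """Return the flat node list, reparsing if the AST was collected."""
        if self._flat is None:
            self.parse_count += 1
            tree = ast.parse(self.source)
            counter = count()
            flat = []
            # Extra traversal after parsing: assign indices, collect nodes.
            for node in ast.walk(tree):
                node.node_index = next(counter)
                flat.append(node)
            self._flat = flat
        return self._flat

    def clear(self):
        """Drop the AST once checking of the file is done."""
        self._flat = None


class AstNodeRef:
    """Stores an index instead of a direct pointer to the node."""

    def __init__(self, module, index):
        self.module = module
        self.index = index

    def node(self):
        # Index into whichever parse of the module is current.
        return self.module.nodes()[self.index]


module = ParsedModule("f = lambda *x: x\n")
lambda_node = next(n for n in module.nodes() if isinstance(n, ast.Lambda))
ref = AstNodeRef(module, lambda_node.node_index)

module.clear()            # garbage collect the AST
reparsed = ref.node()     # reparse on demand, resolve by index
assert isinstance(reparsed, ast.Lambda)
assert reparsed is not lambda_node
assert module.parse_count == 2
```

Because the reference holds only an index, it stays valid across reparses as long as the same source always yields the same numbering, which is why the indices have to be assigned deterministically.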