Commit Graph

45 Commits

Author SHA1 Message Date
Alex Waygood aa0db338d9
Implement `iter()`, `len()` and `is_empty()` for all display-literal AST nodes (#12807) 2024-08-12 10:39:28 +00:00
Micha Reiser 6a1e555537
Upgrade to Rust 1.78 (#11260) 2024-05-03 12:46:21 +00:00
Dhruv Manilawala 13ffb5bc19
Replace LALRPOP parser with hand-written parser (#10036)
(Supersedes #9152, authored by @LaBatata101)

## Summary

This PR replaces the current parser generated from LALRPOP to a
hand-written recursive descent parser.

It also updates the grammar for [PEP
646](https://peps.python.org/pep-0646/) so that the parser outputs the
correct AST. For example, in `data[*x]`, the index expression is now a
tuple with a single starred expression instead of just a starred
expression.

Beyond the performance improvements, the parser is also error resilient
and can provide better error messages. The behavior as seen by any
downstream tools isn't changed. That is, the linter and formatter can
still assume that the parser will _stop_ at the first syntax error. This
will be updated in the following months.

For more details about the change here, refer to the PR corresponding to
the individual commits and the release blog post.

## Test Plan

Write _lots_ and _lots_ of tests for both valid and invalid syntax and
verify the output.

## Acknowledgements

- @MichaReiser for reviewing 100+ parser PRs and continuously providing
guidance throughout the project
- @LaBatata101 for initiating the transition to a hand-written parser in
#9152
- @addisoncrump for implementing the fuzzer which helped
[catch](https://github.com/astral-sh/ruff/pull/10903)
[a](https://github.com/astral-sh/ruff/pull/10910)
[lot](https://github.com/astral-sh/ruff/pull/10966)
[of](https://github.com/astral-sh/ruff/pull/10896)
[bugs](https://github.com/astral-sh/ruff/pull/10877)

---------

Co-authored-by: Victor Hugo Gomes <labatata101@linuxmail.org>
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-04-18 17:57:39 +05:30
Micha Reiser 77c5561646
Add `parenthesized` flag to `ExprTuple` and `ExprGenerator` (#9614) 2024-02-26 15:35:20 +00:00
Alex Waygood a1e65a92bd
Move `is_tuple_parenthesized` from the formatter to `ruff_python_ast` (#9533)
This allows it to be used in the linter as well as the formatter. It
will be useful in #9474
2024-01-15 16:10:40 +00:00
Micha Reiser c7aa816f17
Split tuples in return positions by comma first (#8280) 2023-10-30 00:25:44 +00:00
Charlie Marsh 95702e408f
Respect parenthesized generators in `has_own_parentheses` (#8100)
## Summary

When analyzing:

```python
if "root" not in (
    long_tree_name_tree.split("/")[0]
    for long_tree_name_tree in really_really_long_variable_name
):
    msg = "Could not find root. Please try a different forest."
    raise ValueError(msg)
```

We missed that the generator expression is parenthesized, because the
parentheses are _part_ of the generator -- so
`is_expression_parenthesized` returns `False`. We needed to take into
account that generators and tuples may or may not be parenthesized when
determining whether we can omit parentheses while splitting an
expression.

Closes https://github.com/astral-sh/ruff/issues/8090.

## Test Plan

No changes in similarity.

Before:

| project | similarity index | total files | changed files |

|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75803 | 1799 | 1647 |
| django | 0.99983 | 2772 | 34 |
| home-assistant | 0.99953 | 10596 | 186 |
| poetry | 0.99891 | 317 | 17 |
| transformers | 0.99966 | 2657 | 330 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99978 | 3669 | 20 |
| warehouse | 0.99977 | 654 | 13 |
| zulip | 0.99970 | 1459 | 22 |

After:

| project | similarity index | total files | changed files |

|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75803 | 1799 | 1647 |
| django | 0.99983 | 2772 | 34 |
| home-assistant | 0.99953 | 10596 | 186 |
| poetry | 0.99891 | 317 | 17 |
| transformers | 0.99966 | 2657 | 330 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99978 | 3669 | 20 |
| warehouse | 0.99977 | 654 | 13 |
| zulip | 0.99970 | 1459 | 22 |
2023-10-22 19:58:25 -04:00
Charlie Marsh d685107638
Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030)
This is a do-over of https://github.com/astral-sh/ruff/pull/8011, which
I accidentally merged into a non-`main` branch. Sorry!
2023-10-18 00:01:18 +00:00
Charlie Marsh 865c89800e
Avoid searching for bracketed comments in unparenthesized generators (#7627)
Similar to tuples, a generator _can_ be parenthesized or
unparenthesized. Only search for bracketed comments if it contains its
own parentheses.

Closes https://github.com/astral-sh/ruff/issues/7623.
2023-09-24 02:08:44 +00:00
Micha Reiser c05e4628b1
Introduce Token element (#7048) 2023-09-02 10:05:47 +02:00
Micha Reiser adb48692d6
Use optional parentheses for tuples in return statements (#6875) 2023-08-29 08:30:05 +02:00
Charlie Marsh fc89976c24
Move `Ranged` into `ruff_text_size` (#6919)
## Summary

The motivation here is that this enables us to implement `Ranged` in
crates that don't depend on `ruff_python_ast`.

Largely a mechanical refactor with a lot of regex, Clippy help, and
manual fixups.

## Test Plan

`cargo test`
2023-08-27 14:12:51 -04:00
Micha Reiser b52cc84df6
Omit tuple parentheses in for statements except when absolutely necessary (#6765) 2023-08-22 12:18:59 +02:00
Micha Reiser 0cea4975fc
Rename Comments methods (#6649) 2023-08-18 06:37:01 +00:00
Charlie Marsh 12f3c4c931
Fix comment formatting for yielded tuples (#6603)
## Summary
Closes https://github.com/astral-sh/ruff/issues/6384, although I think
the issue was fixed already on main, for the most part.

The linked issue is around formatting expressions like:

```python
def test():
    (
        yield 
        #comment 1
        * # comment 2
        # comment 3
        test # comment 4
    )

```

On main, prior to this PR, we now format like:

```python
def test():
    (
        yield (
            # comment 1
            # comment 2
            # comment 3
            *test
        )  # comment 4
    )
```

Which strikes me as reasonable. (We can't test this, since it's a syntax
error after for our parser, despite being a syntax error in both cases
from CPython's perspective.)

Meanwhile, Black does:

```python
def test():
    (
        yield
        # comment 1
        *  # comment 2
        # comment 3
        test  # comment 4
    )
```

So our formatting differs in that we move comments between the star and
the expression above the star.

As of this PR, we also support formatting this input, which is valid:

```python
def test():
    (
        yield 
        #comment 1
        * # comment 2
        # comment 3
        test, # comment 4
        1
    )
```

Like:

```python
def test():
    (
        yield (
            # comment 1
            (
                # comment 2
                # comment 3
                *test,  # comment 4
                1,
            )
        )
    )
```

There were two fixes here: (1) marking starred comments as dangling and
formatting them properly; and (2) supporting parenthesized comments for
tuples that don't contain their own parentheses, as is often the case
for yielded tuples (previously, we hit a debug assert).

Note that this diff

## Test Plan
cargo test
2023-08-16 13:41:07 +00:00
Charlie Marsh 95f78821ad
Fix parenthesized detection for tuples (#6599)
## Summary

This PR fixes our code for detecting whether a tuple has its own
parentheses, which is necessary when attempting to preserve parentheses.
As-is, we were getting some cases wrong, like `(a := 1), (b := 3))` --
the detection code inferred that this _was_ parenthesized, and so
wrapped the entire thing in an unnecessary set of parentheses.

## Test Plan

`cargo test`

Before:

| project      | similarity index |
|--------------|------------------|
| cpython      | 0.75472          |
| django       | 0.99804          |
| transformers | 0.99618          |
| twine        | 0.99876          |
| typeshed     | 0.74288          |
| warehouse    | 0.99601          |
| zulip        | 0.99727          |

After:
| project      | similarity index |
|--------------|------------------|
| cpython      | 0.75473          |
| django       | 0.99804 |
| transformers | 0.99618          |
| twine        | 0.99876          |
| typeshed     | 0.74288          |
| warehouse    | 0.99601          |
| zulip        | 0.99727          |
2023-08-16 13:20:48 +00:00
Micha Reiser 29c0b9f91c
Use single lookup for leading, dangling, and trailing comments (#6589) 2023-08-15 17:39:45 +02:00
Charlie Marsh c7703e205d
Move `empty_parenthesized` into the `parentheses.rs` (#6403)
## Summary

This PR moves `empty_parenthesized` such that it's peer to
`parenthesized`, and changes the API to better match that of
`parenthesized` (takes `&str` rather than `StaticText`, has a
`with_dangling_comments` method, etc.).

It may be intentionally _not_ part of `parentheses.rs`, but to me
they're so similar that it makes more sense for them to be in the same
module, with the same API, etc.
2023-08-08 19:17:17 +00:00
Charlie Marsh 8919b6ad9a
Add a `with_dangling_comments` to the parenthesized formatter (#6402)
See: https://github.com/astral-sh/ruff/pull/6376#discussion_r1285514328.
2023-08-07 19:12:12 +00:00
konsti a48d16e025
Replace `Formatter<PyFormatContext<'_>>` with `PyFormatter` (#6330)
This is a refactoring to use the type alias in more places. In the
process, I had to fix and run generate.py. There are no functional
changes.
2023-08-04 10:48:58 +02:00
Charlie Marsh 1d8759d5df
Generalize comment-after-bracket handling to lists, sets, etc. (#6320)
## Summary

We already support preserving the end-of-line comment in calls and type
parameters, as in:

```python
foo(  # comment
    bar,
)
```

This PR adds the same behavior for lists, sets, comprehensions, etc.,
such that we preserve:

```python
[  # comment
    1,
    2,
    3,
]
```

And related cases.
2023-08-04 01:28:05 +00:00
Micha Reiser 40f54375cb
Pull in RustPython parser (#6099) 2023-07-27 09:29:11 +00:00
Micha Reiser 2cf00fee96
Remove parser dependency from ruff-python-ast (#6096) 2023-07-26 17:47:22 +02:00
Chris Pryer 8eadacda33
Update `TupleParentheses` usage (#5810) 2023-07-24 14:44:36 +00:00
konsti 46f8961292
Formatter: Add EmptyWithDanglingComments helper (#5951)
**Summary** Add a `EmptyWithDanglingComments` format helper that formats
comments inside empty parentheses, brackets or curly braces. Previously,
this was implemented separately, and partially incorrectly, for each use
case.

Empty `()`, `[]` and `{}` are special because there can be dangling
comments, and they can be in
two positions:
```python
x = [  # end-of-line
    # own line
]
```
These comments are dangling because they can't be assigned to any
element inside as they would
in all other cases.

**Test Plan** Added a regression test.

145 (from previously 149) instances of unstable formatting remaining.

```
$ cargo run --bin ruff_dev --release -- format-dev --stability-check --error-file formatter-ecosystem-errors.txt --multi-project target/checkouts > formatter-ecosystem-progress.txt
$ rg "Unstable formatting" target/formatter-ecosystem-errors.txt | wc -l
145
```
2023-07-23 14:32:16 +02:00
Chris Pryer 9fb8d6e999
Omit tuple parentheses inside comprehensions (#5790) 2023-07-19 12:05:38 +00:00
Micha Reiser 067b2a6ce6
Pass parent to `NeedsParentheses` (#5708) 2023-07-13 08:57:29 +02:00
Micha Reiser 653429bef9
Handle right parens in join comma builder (#5711) 2023-07-12 18:21:28 +02:00
Micha Reiser f1d367655b
Format `target: annotation = value?` expressions (#5661) 2023-07-11 16:40:28 +02:00
Micha Reiser 8665a1a19d
Pass `FormatContext` to `NeedsParentheses`
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

I started working on this because I assumed that I would need access to options inside of `NeedsParantheses` but it then turned out that I won't. 
Anyway, it kind of felt nice to pass fewer arguments. So I'm gonna put this out here to get your feedback if you prefer this over passing individual fiels. 

Oh, I sneeked in another change. I renamed `context.contents` to `source`. `contents` is too generic and doesn't tell you anything. 

<!-- What's the purpose of the change? What does it do, and why? -->

## Test Plan

It compiles
2023-07-11 14:28:50 +02:00
Louis Dispa e7e2f44440
Format `raise` statement (#5595)
## Summary

This PR implements the formatting of `raise` statements. I haven't
looked at the black implementation, this is inspired from from the
`return` statements formatting.

## Test Plan

The black differences with insta.

I also compared manually some edge cases with very long string and call
chaining and it seems to do the same formatting as black.

There is one issue:
```python
# input

raise OsError(
    "aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa"
) from a.aaaaa(aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa).a(aaaa)


# black

raise OsError(
    "aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa"
) from a.aaaaa(
    aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa
).a(
    aaaa
)


# ruff

raise OsError(
    "aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa"
) from a.aaaaa(
    aksjdhflsakhdflkjsadlfajkslhfdkjsaldajlahflashdfljahlfksajlhfajfjfsaahflakjslhdfkjalhdskjfa
).a(aaaa)
```

But I'm not sure this diff is the raise formatting implementation.

---------

Co-authored-by: Louis Dispa <ldispa@deezer.com>
2023-07-10 21:23:49 +02:00
Micha Reiser 40ddc1604c
Introduce `parenthesized` helper (#5565) 2023-07-07 11:28:25 +02:00
Charlie Marsh fa1b85b3da
Remove prelude from `ruff_python_ast` (#5369)
## Summary

Per @MichaReiser, this is causing more confusion than it is helpful.
2023-06-26 11:43:49 -04:00
Micha Reiser 49cabca3e7
Format implicit string continuation (#5328) 2023-06-26 12:41:47 +00:00
Micha Reiser d3d69a031e
Add `JoinCommaSeparatedBuilder` (#5342) 2023-06-23 22:03:05 +01:00
konstin 4b65446de6
Refactor magic trailing comma (#5339)
## Summary

This is small refactoring to reuse the code that detects the magic
trailing comma across functions. I make this change now to avoid copying
code in a later PR. @MichaReiser is planning on making a larger
refactoring later that integrates with the join nodes builder

## Test Plan

No functional changes. The magic trailing comma behaviour is checked by
the fixtures.
2023-06-23 18:53:55 +02:00
konstin 9419d3f9c8
Special `ExprTuple` formatting option for `for`-loops (#5175)
## Motivation

While black keeps parentheses nearly everywhere, the notable exception
is in the body of for loops:
```python
for (a, b) in x:
    pass
```
becomes
```python
for a, b in x:
    pass
```

This currently blocks #5163, which this PR should unblock.

## Solution

This changes the `ExprTuple` formatting option to include one additional
option that removes the parentheses when not using magic trailing comma
and not breaking. It is supposed to be used through
```rust
#[derive(Debug)]
struct ExprTupleWithoutParentheses<'a>(&'a Expr);

impl Format<PyFormatContext<'_>> for ExprTupleWithoutParentheses<'_> {
    fn fmt(&self, f: &mut Formatter<PyFormatContext<'_>>) -> FormatResult<()> {
        match self.0 {
            Expr::Tuple(expr_tuple) => expr_tuple
                .format()
                .with_options(TupleParentheses::StripInsideForLoop)
                .fmt(f),
            other => other.format().with_options(Parenthesize::IfBreaks).fmt(f),
        }
    }
}
```


## Testing

The for loop formatting isn't merged due to missing this (and i didn't
want to create more git weirdness across two people), but I've confirmed
that when applying this to while loops instead of for loops, then
```rust
        write!(
            f,
            [
                text("while"),
                space(),
                ExprTupleWithoutParentheses(test.as_ref()),
                text(":"),
                trailing_comments(trailing_condition_comments),
                block_indent(&body.format())
            ]
        )?;
```
makes
```python
while (a, b):
    pass

while (
    ajssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssa,
    b,
):
    pass

while (a,b,):
    pass
```
formatted as
```python
while a, b:
    pass

while (
    ajssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssa,
    b,
):
    pass

while (
    a,
    b,
):
    pass
```
2023-06-21 21:17:47 +02:00
konstin e586c27590
Format ExprTuple (#4963)
This implements formatting ExprTuple, including magic trailing comma. I
intentionally didn't change the settings mechanism but just added a
dummy global const flag.

Besides the snapshots, I added custom breaking/joining tests and a
deeply nested test case. The diffs look better than previously, proper
black compatibility depends on parentheses handling.

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
2023-06-12 12:55:47 +00:00
Micha Reiser 68969240c5
Format Function definitions (#4951) 2023-06-08 16:07:33 +00:00
Micha Reiser c1cc6f3be1
Add basic Constant formatting (#4954) 2023-06-08 11:42:44 +00:00
konstin 23abad0bd5
A basic StmtAssign formatter and better dummies for expressions (#4938)
* A basic StmtAssign formatter and better dummies for expressions

The goal of this PR was formatting StmtAssign since many nodes in the black tests (and in python in general) are after an assignment. This caused unstable formatting: The spacing of power op spacing depends on the type of the two involved expressions, but each expression was formatted as dummy string and re-parsed as a ExprName, so in the second round the different rules of ExprName were applied, causing unstable formatting.

This PR does not necessarily bring us closer to black's style, but it unlocks a good porting of black's test suite and is a basis for implementing the Expr nodes.

* fmt

* Review
2023-06-08 12:20:25 +02:00
Micha Reiser bcf745c5ba
Replace verbatim text with `NOT_YET_IMPLEMENTED` (#4904)
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

This PR replaces the `verbatim_text` builder with a `not_yet_implemented` builder that emits `NOT_YET_IMPLEMENTED_<NodeKind>` for not yet implemented nodes. 

The motivation for this change is that partially formatting compound statements can result in incorrectly indented code, which is a syntax error:

```python
def func_no_args():
  a; b; c
  if True: raise RuntimeError
  if False: ...
  for i in range(10):
    print(i)
    continue
```

Get's reformatted to

```python
def func_no_args():
    a; b; c
    if True: raise RuntimeError
    if False: ...
    for i in range(10):
    print(i)
    continue
```

because our formatter does not yet support `for` statements and just inserts the text from the source. 

## Downsides

Using an identifier will not work in all situations. For example, an identifier is invalid in an `Arguments ` position. That's why I kept `verbatim_text` around and e.g. use it in the `Arguments` formatting logic where incorrect indentations are impossible (to my knowledge). Meaning, `verbatim_text` we can opt in to `verbatim_text` when we want to iterate quickly on nodes that we don't want to provide a full implementation yet and using an identifier would be invalid. 

## Upsides

Running this on main discovered stability issues with the newline handling that were previously "hidden" because of the verbatim formatting. I guess that's an upside :)

## Test Plan

None?
2023-06-07 14:57:25 +02:00
Micha Reiser 3f032cf09d
Format binary expressions (#4862)
* Format Binary Expressions

* Extract NeedsParentheses trait
2023-06-06 08:34:53 +00:00
konstin 9bf168c0a4
Use dummy verbatim formatter for all nodes (#4755) 2023-06-01 08:25:26 +00:00
konstin 0945803427
Generate FormatRule definitions (#4724)
* Generate FormatRule definitions

* Generate verbatim output

* pub(crate) everything

* clippy fix

* Update crates/ruff_python_formatter/src/lib.rs

Co-authored-by: Micha Reiser <micha@reiser.io>

* Update crates/ruff_python_formatter/src/lib.rs

Co-authored-by: Micha Reiser <micha@reiser.io>

* stub out with Ok(()) again

* Update crates/ruff_python_formatter/src/lib.rs

Co-authored-by: Micha Reiser <micha@reiser.io>

* PyFormatContext::{contents, locator} with `#[allow(unused)]`

* Can't leak private type

* remove commented code

* Fix ruff errors

* pub struct Format{node} due to rust rules

---------

Co-authored-by: Julian LaNeve <lanevejulian@gmail.com>
Co-authored-by: Micha Reiser <micha@reiser.io>
2023-06-01 08:38:53 +02:00