Commit Graph

61 Commits

Author SHA1 Message Date
Micha Reiser
2cf00fee96 Remove parser dependency from ruff-python-ast (#6096) 2023-07-26 17:47:22 +02:00
Charlie Marsh
4204fc002d Remove exception-handler lexing from unused-bound-exception fix (#5851)
## Summary

The motivation here is that it will make this rule easier to rewrite as
a deferred check. Right now, we can't run this rule in the deferred
phase, because it depends on the `except_handler` to power its autofix.
Instead of lexing the `except_handler`, we can use the `SimpleTokenizer`
from the formatter, and just lex forwards and backwards.

For context, this rule detects the unused `e` in:

```python
try:
  pass
except ValueError as e:
  pass
```
2023-07-18 18:27:46 +00:00
Micha Reiser
3cda89ecaf Parenthesize with statements (#5758)
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

This PR improves the parentheses handling for with items to get closer
to black's formatting.

### Case 1:

```python
# Black / Input
with (
    [
        "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
        "bbbbbbbbbb",
        "cccccccccccccccccccccccccccccccccccccccccc",
        dddddddddddddddddddddddddddddddd,
    ] as example1,
    aaaaaaaaaaaaaaaaaaaaaaaaaa
    + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
    + cccccccccccccccccccccccccccc
    + ddddddddddddddddd as example2,
    CtxManager2() as example2,
    CtxManager2() as example2,
    CtxManager2() as example2,
):
    ...

# Before
with (
    [
        "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
        "bbbbbbbbbb",
        "cccccccccccccccccccccccccccccccccccccccccc",
        dddddddddddddddddddddddddddddddd,
    ] as example1,
    (
        aaaaaaaaaaaaaaaaaaaaaaaaaa
        + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
        + cccccccccccccccccccccccccccc
        + ddddddddddddddddd
    ) as example2,
    CtxManager2() as example2,
    CtxManager2() as example2,
    CtxManager2() as example2,
):
    ...
```

Notice how Ruff wraps the binary expression in an extra set of
parentheses


### Case 2:
Black does not expand the with-items if the with has no parentheses:

```python
# Black / Input
with aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb as c:
    ...

# Before
with (
    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa + bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb as c
):
    ...
```

Or 

```python
# Black / Input
with [
    "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
    "bbbbbbbbbb",
    "cccccccccccccccccccccccccccccccccccccccccc",
    dddddddddddddddddddddddddddddddd,
] as example1, aaaaaaaaaaaaaaaaaaaaaaaaaa * bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb * cccccccccccccccccccccccccccc + ddddddddddddddddd as example2, CtxManager222222222222222() as example2:
    ...

# Before (Same as Case 1)
with (
    [
        "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
        "bbbbbbbbbb",
        "cccccccccccccccccccccccccccccccccccccccccc",
        dddddddddddddddddddddddddddddddd,
    ] as example1,
    (
        aaaaaaaaaaaaaaaaaaaaaaaaaa
        * bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
        * cccccccccccccccccccccccccccc
        + ddddddddddddddddd
    ) as example2,
    CtxManager222222222222222() as example2,
):
    ...

```
## Test Plan

I added new snapshot tests

Improves the django similarity index from 0.973 to 0.977
2023-07-15 16:03:09 +01:00
Micha Reiser
30bec3fcfa Only omit optinal parens if the expression ends or starts with a parenthesized expression
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

This PR matches Black' behavior where it only omits the optional parentheses if the expression starts or ends with a parenthesized expression:

```python
a + [aaa, bbb, cccc] * c # Don't omit
[aaa, bbb, cccc] + a * c # Split
a + c * [aaa, bbb, ccc] # Split 
```

<!-- What's the purpose of the change? What does it do, and why? -->

## Test Plan

This improves the Jaccard index from 0.945 to 0.946
2023-07-11 17:05:25 +02:00
Micha Reiser
f1d367655b Format target: annotation = value? expressions (#5661) 2023-07-11 16:40:28 +02:00
Dimitri Papadopoulos Orfanos
efe7c393d1 Fix typos found by codespell (#5607)
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

Fix typos found by
[codespell](https://github.com/codespell-project/codespell).

I have left out `memoize` for now (see #5606).
<!-- What's the purpose of the change? What does it do, and why? -->

## Test Plan

CI tests.
<!-- How was it tested? -->
2023-07-08 12:33:18 +02:00
konsti
b22e6c3d38 Extend ruff_dev formatter script to compute statistics and format a project (#5492)
## Summary

This extends the `ruff_dev` formatter script util. Instead of only doing
stability checks, you can now choose different compatible options on the
CLI and get statistics.

* It adds an option the formats all files that ruff would check to allow
looking at an entire black-formatted repository with `git diff`
* It computes the [Jaccard
index](https://en.wikipedia.org/wiki/Jaccard_index) as a measure of
deviation between input and output, which is useful as single number
metric for assessing our current deviations from black.
* It adds progress bars to both the single projects as well as the
multi-project mode.
* It adds an option to write the multi-project output to a file

Sample usage:

```
$ cargo run --bin ruff_dev -- format-dev --stability-check crates/ruff/resources/test/cpython
$ cargo run --bin ruff_dev -- format-dev --stability-check /home/konsti/projects/django
Syntax error in /home/konsti/projects/django/tests/test_runner_apps/tagged/tests_syntax_error.py: source contains syntax errors (parser error): BaseError { error: UnrecognizedToken(Name { name: "syntax_error" }, None), offset: 131, source_path: "<filename>" }
Found 0 stability errors in 2755 files (jaccard index 0.911) in 9.75s
$ cargo run --bin ruff_dev -- format-dev --write /home/konsti/projects/django
```

Options:

```
Several utils related to the formatter which can be run on one or more repositories. The selected set of files in a repository is the same as for `ruff check`.

* Check formatter stability: Format a repository twice and ensure that it looks that the first and second formatting look the same. * Format: Format the files in a repository to be able to check them with `git diff` * Statistics: The subcommand the Jaccard index between the (assumed to be black formatted) input and the ruff formatted output

Usage: ruff_dev format-dev [OPTIONS] [FILES]...

Arguments:
  [FILES]...
          Like `ruff check`'s files. See `--multi-project` if you want to format an ecosystem checkout

Options:
      --stability-check
          Check stability
          
          We want to ensure that once formatted content stays the same when formatted again, which is known as formatter stability or formatter idempotency, and that the formatter prints syntactically valid code. As our test cases cover only a limited amount of code, this allows checking entire repositories.

      --write
          Format the files. Without this flag, the python files are not modified

      --format <FORMAT>
          Control the verbosity of the output
          
          [default: default]

          Possible values:
          - minimal: Filenames only
          - default: Filenames and reduced diff
          - full:    Full diff and invalid code

  -x, --exit-first-error
          Print only the first error and exit, `-x` is same as pytest

      --multi-project
          Checks each project inside a directory, useful e.g. if you want to check all of the ecosystem checkouts

      --error-file <ERROR_FILE>
          Write all errors to this file in addition to stdout. Only used in multi-project mode
```

## Test Plan

I ran this on django (2755 files, jaccard index 0.911) and discovered a
magic trailing comma problem and that we really needed to implement
import formatting. I ran the script on cpython to identify
https://github.com/astral-sh/ruff/pull/5558.
2023-07-07 11:30:12 +00:00
Micha Reiser
dd0d1afb66 Create PyFormatOptions
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

This PR adds a new `PyFormatOptions` struct that stores the python formatter options. 
The new options aren't used yet, with the exception of magical trailing commas and the options passed to the printer. 
I'll follow up with more PRs that use the new options (e.g. `QuoteStyle`).

<!-- What's the purpose of the change? What does it do, and why? -->

## Test Plan

`cargo test` I'll follow up with a new PR that adds support for overriding the options in our fixture tests.
2023-06-26 14:02:17 +02:00
Micha Reiser
8879927b9a Use insta::glob instead of fixture macro (#5364) 2023-06-26 08:46:18 +00:00
Micha Reiser
c52aa8f065 Basic string formatting
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

This PR implements formatting for non-f-string Strings that do not use implicit concatenation. 

Docstring formatting is out of the scope of this PR.

<!-- What's the purpose of the change? What does it do, and why? -->

## Test Plan

I added a few tests for simple string literals. 

## Performance

Ouch. This is hitting performance somewhat hard. This is probably because we now iterate each string a couple of times:

1. To detect if it is an implicit string continuation
2. To detect if the string contains any new lines
3. To detect the preferred quote
4. To normalize the string

Edit: I integrated the detection of newlines into the preferred quote detection so that we only iterate the string three time.
We can probably do better by merging the implicit string continuation with the quote detection and new line detection by iterating till the end of the string part and returning the offset. We then use our simple tokenizer to skip over any comments or whitespace until we find the first non trivia token. From there we keep continue doing this in a loop until we reach the end o the string. I'll leave this improvement for later.
2023-06-23 09:46:05 +02:00
konstin
7d4f8e59da Improve FormatExprCall dummy (#5290)
This solves an instability when formatting cpython. It also introduces
another one, but i think it's still a worthwhile change for now.

There's no proper testing since this is just a dummy.
2023-06-22 10:59:30 +02:00
konstin
6155fd647d Format Slice Expressions (#5047)
This formats slice expressions and subscript expressions.

Spaces around the colons follows the same rules as black
(https://black.readthedocs.io/en/stable/the_black_code_style/current_style.html#slices):
```python
e00 = "e"[:]
e01 = "e"[:1]
e02 = "e"[: a()]
e10 = "e"[1:]
e11 = "e"[1:1]
e12 = "e"[1 : a()]
e20 = "e"[a() :]
e21 = "e"[a() : 1]
e22 = "e"[a() : a()]
e200 = "e"[a() : :]
e201 = "e"[a() :: 1]
e202 = "e"[a() :: a()]
e210 = "e"[a() : 1 :]
```

Comment placement is different due to our very different infrastructure.
If we have explicit bounds (e.g. `x[1:2]`) all comments get assigned as
leading or trailing to the bound expression. If a bound is missing
`[:]`, comments get marked as dangling and placed in the same section as
they were originally in:
```python
x = "x"[ # a
      # b
    :  # c
      # d
]
```
to
```python
x = "x"[
    # a
    # b
    :
    # c
    # d
]
```
Except for the potential trailing end-of-line comments, all comments get
formatted on their own line. This can be improved by keeping end-of-line
comments after the opening bracket or after a colon as such but the
changes were already complex enough.

I added tests for comment placement and spaces.
2023-06-21 15:09:39 +00:00
Micha Reiser
1336ca601b Format UnaryExpr
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

This PR adds basic formatting for unary expressions.

<!-- What's the purpose of the change? What does it do, and why? -->

## Test Plan

I added a new `unary.py` with custom test cases
2023-06-21 10:09:47 +02:00
Micha Reiser
e520a3a721 Fix ArgWithDefault comments handling (#5204) 2023-06-20 20:48:07 +00:00
Micha Reiser
b369288833 Accept any Into<AnyNodeRef> as Comments arguments (#5205) 2023-06-20 16:49:21 +00:00
konstin
e586c27590 Format ExprTuple (#4963)
This implements formatting ExprTuple, including magic trailing comma. I
intentionally didn't change the settings mechanism but just added a
dummy global const flag.

Besides the snapshots, I added custom breaking/joining tests and a
deeply nested test case. The diffs look better than previously, proper
black compatibility depends on parentheses handling.

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
2023-06-12 12:55:47 +00:00
Micha Reiser
1accbeffd6 Format if statements (#4961) 2023-06-09 10:55:14 +02:00
Micha Reiser
68969240c5 Format Function definitions (#4951) 2023-06-08 16:07:33 +00:00
Micha Reiser
c1cc6f3be1 Add basic Constant formatting (#4954) 2023-06-08 11:42:44 +00:00
konstin
23abad0bd5 A basic StmtAssign formatter and better dummies for expressions (#4938)
* A basic StmtAssign formatter and better dummies for expressions

The goal of this PR was formatting StmtAssign since many nodes in the black tests (and in python in general) are after an assignment. This caused unstable formatting: The spacing of power op spacing depends on the type of the two involved expressions, but each expression was formatted as dummy string and re-parsed as a ExprName, so in the second round the different rules of ExprName were applied, causing unstable formatting.

This PR does not necessarily bring us closer to black's style, but it unlocks a good porting of black's test suite and is a basis for implementing the Expr nodes.

* fmt

* Review
2023-06-08 12:20:25 +02:00
Micha Reiser
bcf745c5ba Replace verbatim text with NOT_YET_IMPLEMENTED (#4904)
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

This PR replaces the `verbatim_text` builder with a `not_yet_implemented` builder that emits `NOT_YET_IMPLEMENTED_<NodeKind>` for not yet implemented nodes. 

The motivation for this change is that partially formatting compound statements can result in incorrectly indented code, which is a syntax error:

```python
def func_no_args():
  a; b; c
  if True: raise RuntimeError
  if False: ...
  for i in range(10):
    print(i)
    continue
```

Get's reformatted to

```python
def func_no_args():
    a; b; c
    if True: raise RuntimeError
    if False: ...
    for i in range(10):
    print(i)
    continue
```

because our formatter does not yet support `for` statements and just inserts the text from the source. 

## Downsides

Using an identifier will not work in all situations. For example, an identifier is invalid in an `Arguments ` position. That's why I kept `verbatim_text` around and e.g. use it in the `Arguments` formatting logic where incorrect indentations are impossible (to my knowledge). Meaning, `verbatim_text` we can opt in to `verbatim_text` when we want to iterate quickly on nodes that we don't want to provide a full implementation yet and using an identifier would be invalid. 

## Upsides

Running this on main discovered stability issues with the newline handling that were previously "hidden" because of the verbatim formatting. I guess that's an upside :)

## Test Plan

None?
2023-06-07 14:57:25 +02:00
Micha Reiser
6ab3fc60f4 Correctly handle newlines after/before comments (#4895)
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

This issue fixes the removal of empty lines between a leading comment and the previous statement:

```python
a  = 20

# leading comment
b = 10
```

Ruff removed the empty line between `a` and `b` because:
* The leading comments formatting does not preserve leading newlines (to avoid adding new lines at the top of a body)
* The `JoinNodesBuilder` counted the lines before `b`, which is 1 -> Doesn't insert a new line

This is fixed by changing the `JoinNodesBuilder` to count the lines instead *after* the last node. This correctly gives 1, and the `# leading comment` will insert the empty lines between any other leading comment or the node.



## Test Plan

I added a new test for empty lines.
2023-06-07 14:49:43 +02:00
Micha Reiser
3f032cf09d Format binary expressions (#4862)
* Format Binary Expressions

* Extract NeedsParentheses trait
2023-06-06 08:34:53 +00:00
Micha Reiser
913b9d1fcf Normalize newlines in verbatim_text (#4850) 2023-06-05 19:30:28 +00:00
Micha Reiser
c65f47d7c4 Format while Statement (#4810) 2023-06-05 08:24:00 +00:00
Micha Reiser
ebdc4afc33 Suite formatting and JoinNodesBuilder (#4805) 2023-06-02 14:14:38 +00:00
konstin
c4fdbf8903 Switch PyFormatter lifetimes (#4804)
Stylistic change to have the input lifetime first and the output lifetime second. I'll rebase my other PR on top of this.

Test plan: `cargo clippy`
2023-06-02 12:26:39 +02:00
Micha Reiser
5d939222db Leading, Dangling, and Trailing comments formatting (#4785) 2023-06-02 09:26:36 +02:00
konstin
63d892f1e4 Implement basic module formatting (#4784)
* Add Format for Stmt

* Implement basic module formatting

This implements formatting each statement in a module with a hard line break in between, so that we can start formatting statements.

Basic testing is done by the snapshots
2023-06-01 15:25:50 +02:00
konstin
d4027d8b65 Use new formatter infrastructure in CLI and test (#4767)
* Use dummy verbatim formatter for all nodes

* Use new formatter infrastructure in CLI and test

* Expose the new formatter in the CLI

* Merge import blocks
2023-06-01 11:55:04 +02:00
konstin
9bf168c0a4 Use dummy verbatim formatter for all nodes (#4755) 2023-06-01 08:25:26 +00:00
Micha Reiser
59148344be Place comments of left and right binary expression operands (#4751) 2023-06-01 07:01:32 +00:00
konstin
0945803427 Generate FormatRule definitions (#4724)
* Generate FormatRule definitions

* Generate verbatim output

* pub(crate) everything

* clippy fix

* Update crates/ruff_python_formatter/src/lib.rs

Co-authored-by: Micha Reiser <micha@reiser.io>

* Update crates/ruff_python_formatter/src/lib.rs

Co-authored-by: Micha Reiser <micha@reiser.io>

* stub out with Ok(()) again

* Update crates/ruff_python_formatter/src/lib.rs

Co-authored-by: Micha Reiser <micha@reiser.io>

* PyFormatContext::{contents, locator} with `#[allow(unused)]`

* Can't leak private type

* remove commented code

* Fix ruff errors

* pub struct Format{node} due to rust rules

---------

Co-authored-by: Julian LaNeve <lanevejulian@gmail.com>
Co-authored-by: Micha Reiser <micha@reiser.io>
2023-06-01 08:38:53 +02:00
Micha Reiser
e209b5fc5f Add reformat check (#4753) 2023-05-31 17:36:15 +02:00
Micha Reiser
06bcb85f81 formatter: Remove CST and old formatting (#4730) 2023-05-31 08:27:23 +02:00
Micha Reiser
0cd453bdf0 Generic "comment to node" association logic (#4642) 2023-05-30 09:28:01 +00:00
Micha Reiser
6146b75dd0 Add MultiMap implementation for storing comments (#4639) 2023-05-30 09:51:25 +02:00
Micha Reiser
edc6c4058f Move shared_traits to ruff_formatter (#4632) 2023-05-24 17:38:11 +02:00
Micha Reiser
6943beee66 Remove source position from FormatElement::DynamicText (#4619) 2023-05-24 16:36:14 +02:00
Micha Reiser
ddf7de7e86 Prototype Black's string joining/splitting (#4449) 2023-05-16 18:42:40 +01:00
Charlie Marsh
f0465bf106 Emit non-logical newlines for "empty" lines (#4444) 2023-05-16 14:58:56 +00:00
Micha Reiser
1ccef5150d Remove lifetime from FormatContext (#4376) 2023-05-11 15:43:42 +00:00
Micha Reiser
cab65b25da Replace row/column based Location with byte-offsets. (#3931) 2023-04-26 18:11:02 +00:00
Charlie Marsh
da1f83fe32 Remove core module from ruff_python_formatter (#3373) 2023-03-08 19:11:39 +00:00
Charlie Marsh
0a9d259f9c Remove copied core modules from ruff_python_formatter (#3371) 2023-03-08 19:03:40 +00:00
Charlie Marsh
16be691712 Enable more non-panicking formatter tests (#3262) 2023-02-27 18:21:53 -05:00
Charlie Marsh
2261e194a0 Create dedicated Body nodes in the formatter CST (#3223) 2023-02-27 22:55:05 +00:00
Charlie Marsh
51bca19c1d Add builders for common comment rendering (#3232) 2023-02-26 04:16:24 +00:00
Charlie Marsh
eb15371453 Make Locator available in AST-to-CST conversion pass (#3194) 2023-02-23 19:43:03 -05:00
Charlie Marsh
bda2a0007a Parenthesize numbers during attribute accesses (#3189) 2023-02-23 14:57:23 -05:00