Commit Graph

14 Commits

Author SHA1 Message Date
Brent Westbrook 2b1d3c60fa
Display diffs for `ruff format --check` and add support for different output formats (#20443)
## Summary

This PR uses the new `Diagnostic` type for rendering formatter
diagnostics. This allows the formatter to inherit all of the output
formats already implemented in the linter and ty. For example, here's
the new `full` output format, with the formatting diff displayed using
the same infrastructure as the linter:

<img width="592" height="364" alt="image"
src="https://github.com/user-attachments/assets/6d09817d-3f27-4960-aa8b-41ba47fb4dc0"
/>


<details><summary>Resolved TODOs</summary>
<p>

~~There are several limitiations/todos here still, especially around the
`OutputFormat` type~~:
- [x] A few literal `todo!`s for the remaining `OutputFormat`s without
matching `DiagnosticFormat`s
- [x] The default output format is `full` instead of something more
concise like the current output
- [x] Some of the output formats (namely JSON) have information that
doesn't make much sense for these diagnostics

The first of these is definitely resolved, and I think the other two are
as well, based on discussion on the design document. In brief, we're
okay inheriting the default `OutputFormat` and can separate the global
option into `lint.output-format` and `format.output-format` in the
future, if needed; and we're okay including redundant information in the
non-human-readable output formats.

My last major concern is with the performance of the new code, as
discussed in the `Benchmarks` section below.

A smaller question is whether we should use `Diagnostic`s for formatting
errors too. I think the answer to this is yes, in line with changes
we're making in the linter too. I still need to implement that here.

</p>
</details> 

<details><summary>Benchmarks</summary>
<p>


The values in the table are from a large benchmark on the CPython 3.10
code
base, which involves checking 2011 files, 1872 of which need to be
reformatted.
`stable` corresponds to the same code used on `main`, while
`preview-full` and
`preview-concise` use the new `Diagnostic` code gated behind `--preview`
for the
`full` and `concise` output formats, respectively. `stable-diff` uses
the
`--diff` to compare the two diff rendering approaches. See the full
hyperfine
command below for more details. For a sense of scale, the `stable`
output format
produces 1873 lines on stdout, compared to 855,278 for `preview-full`
and
857,798 for `stable-diff`.

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |

|:------------------|--------------:|---------:|---------:|-------------:|
| `stable` | 201.2 ± 6.8 | 192.9 | 220.6 | 1.00 |
| `preview-full` | 9113.2 ± 31.2 | 9076.1 | 9152.0 | 45.29 ± 1.54 |
| `preview-concise` | 214.2 ± 1.4 | 212.0 | 217.6 | 1.06 ± 0.04 |
| `stable-diff` | 3308.6 ± 20.2 | 3278.6 | 3341.8 | 16.44 ± 0.56 |

In summary, the `preview-concise` diagnostics are ~6% slower than the
stable
output format, increasing the average runtime from 201.2 ms to 214.2 ms.
The
`full` preview diagnostics are much more expensive, taking over 9113.2
ms to
complete, which is ~3x more expensive even than the stable diffs
produced by the
`--diff` flag.

My main takeaways here are:
1. Rendering `Edit`s is much more expensive than rendering the diffs
from `--diff`
2. Constructing `Edit`s actually isn't too bad

### Constructing `Edit`s

I also took a closer look at `Edit` construction by modifying the code
and
repeating the `preview-concise` benchmark and found that the main issue
is
constructing a `SourceFile` for use in the `Edit` rendering. Commenting
out the
`Edit` construction itself has basically no effect:

| Command   |   Mean [ms] | Min [ms] | Max [ms] |    Relative |
|:----------|------------:|---------:|---------:|------------:|
| `stable`  | 197.5 ± 1.6 |    195.0 |    200.3 |        1.00 |
| `no-edit` | 208.9 ± 2.2 |    204.8 |    212.2 | 1.06 ± 0.01 |

However, also omitting the source text from the `SourceFile`
construction
resolves the slowdown compared to `stable`. So it seems that copying the
full
source text into a `SourceFile` is the main cause of the slowdown for
non-`full`
diagnostics.

| Command          |   Mean [ms] | Min [ms] | Max [ms] |    Relative |
|:-----------------|------------:|---------:|---------:|------------:|
| `stable`         | 202.4 ± 2.9 |    197.6 |    207.9 |        1.00 |
| `no-source-text` | 202.7 ± 3.3 |    196.3 |    209.1 | 1.00 ± 0.02 |

### Rendering diffs

The main difference between `stable-diff` and `preview-full` seems to be
the diffing strategy we use from `similar`. Both versions use the same
algorithm, but in the existing
[`CodeDiff`](https://github.com/astral-sh/ruff/blob/main/crates/ruff_linter/src/source_kind.rs#L259)
rendering for the `--diff` flag, we only do line-level diffing, whereas
for `Diagnostic`s we use `TextDiff::iter_inline_changes` to highlight
word-level changes too. Skipping the word diff for `Diagnostic`s closes
most of the gap:

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `stable-diff` | 3.323 ± 0.015 | 3.297 | 3.341 | 1.00 |
| `preview-full` | 3.654 ± 0.019 | 3.618 | 3.682 | 1.10 ± 0.01 |

(In some repeated runs, I've seen as small as a ~5% difference, down
from 10% in the table)

This doesn't actually change any of our snapshots, but it would
obviously change the rendered result in a terminal since we wouldn't
highlight the specific words that changed within a line.

Another much smaller change that we can try is removing the deadline
from the `iter_inline_changes` call. It looks like there's a fair amount
of overhead from the default 500 ms deadline for computing these, and
using `iter_inline_changes(op, None)` (`None` for the optional deadline
argument) improves the runtime quite a bit:

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `stable-diff` | 3.322 ± 0.013 | 3.298 | 3.341 | 1.00 |
| `preview-full` | 5.296 ± 0.030 | 5.251 | 5.366 | 1.59 ± 0.01 |

<hr>

<details><summary>hyperfine command</summary>

```shell
cargo build --release --bin ruff && hyperfine --ignore-failure --warmup 10 --export-markdown /tmp/table.md \
  -n stable -n preview-full -n preview-concise -n stable-diff \
  "./target/release/ruff format --check ./crates/ruff_linter/resources/test/cpython/ --no-cache" \
  "./target/release/ruff format --check ./crates/ruff_linter/resources/test/cpython/ --no-cache --preview --output-format=full" \
  "./target/release/ruff format --check ./crates/ruff_linter/resources/test/cpython/ --no-cache --preview --output-format=concise" \
  "./target/release/ruff format --check ./crates/ruff_linter/resources/test/cpython/ --no-cache --diff"
```

</details>

</p>
</details> 

## Test Plan

Some new CLI tests and manual testing
2025-09-30 12:00:51 -04:00
Micha Reiser 9ae698fe30
Switch to Rust 2024 edition (#18129) 2025-05-16 13:25:28 +02:00
Carl Meyer dd6f6233bd
bump MSRV to 1.83 (#16294)
According to our new MSRV policy (see
https://github.com/astral-sh/ruff/issues/16370 ), bump our MSRV to 1.83
(N - 2), and autofix some new clippy lints.
2025-02-26 06:12:43 -08:00
Dhruv Manilawala 5caabe54b6
Allow `ipytest` cell magic (#13745)
## Summary

fixes: #13718 

## Test Plan

Using the notebook as mentioned in
https://github.com/astral-sh/ruff/issues/13718#issuecomment-2410631674,
this PR does not give the "F821 Undefined name `test_sorted`"
diagnostic.
2024-10-14 15:48:33 +05:30
Dhruv Manilawala ff53db3d99
Consider VS Code cell metadata to determine valid code cells (#12864)
## Summary

This PR adds support for VS Code specific cell metadata to consider when
collecting valid code cells.

For context, Ruff only runs on valid code cells. These are the code
cells that doesn't contain cell magics. Previously, Ruff only used the
notebook's metadata to determine whether it's a Python notebook. But, in
VS Code, a notebook's preferred language might be Python but it could
still contain code cells for other languages. This can be determined
with the `metadata.vscode.languageId` field.

### References:
* https://code.visualstudio.com/docs/languages/identifiers
* e6c009a3d4/extensions/ipynb/src/serializers.ts (L104-L107)
*
e6c009a3d4/extensions/ipynb/src/serializers.ts (L117-L122)

This brings us one step closer to fixing #12281.

## Test Plan

Add test cases for `is_valid_python_code_cell` and an integration test
case which showcase running it end to end. The test notebook contains a
JavaScript code cell and a Python code cell.
2024-08-13 22:09:56 +05:30
Jane Lewis 3ab7a8da73
Add Jupyter Notebook document change snapshot test (#11944)
## Summary

Closes #11914.

This PR introduces a snapshot test that replays the LSP requests made
during a document formatting request, and confirms that the notebook
document is updated in the expected way.
2024-06-21 05:29:27 +00:00
Jane Lewis b0731ef9cb
`ruff server`: Support Jupyter Notebook (`*.ipynb`) files (#11206)
## Summary

Closes https://github.com/astral-sh/ruff/issues/10858.

`ruff server` now supports `*.ipynb` (aka Jupyter Notebook) files.
Extensive internal changes have been made to facilitate this, which I've
done some work to contextualize with documentation and an pre-review
that highlights notable sections of the code.

`*.ipynb` cells should behave similarly to `*.py` documents, with one
major exception. The format command `ruff.applyFormat` will only apply
to the currently selected notebook cell - if you want to format an
entire notebook document, use `Format Notebook` from the VS Code context
menu.

## Test Plan

The VS Code extension does not yet have Jupyter Notebook support
enabled, so you'll first need to enable it manually. To do this,
checkout the `pre-release` branch and modify `src/common/server.ts` as
follows:

Before:
![Screenshot 2024-05-13 at 10 59
06 PM](https://github.com/astral-sh/ruff/assets/19577865/c6a3c604-c405-4968-b8a2-5d670de89172)

After:
![Screenshot 2024-05-13 at 10 58
24 PM](https://github.com/astral-sh/ruff/assets/19577865/94ab2e3d-0609-448d-9c8c-cd07c69a513b)

I recommend testing this PR with large, complicated notebook files. I
used notebook files from [this popular
repository](https://github.com/jakevdp/PythonDataScienceHandbook/tree/master/notebooks)
in my preliminary testing.

The main thing to test is ensuring that notebook cells behave the same
as Python documents, besides the aforementioned issue with
`ruff.applyFormat`. You should also test adding and deleting cells (in
particular, deleting all the code cells and ensure that doesn't break
anything), changing the kind of a cell (i.e. from markup -> code or vice
versa), and creating a new notebook file from scratch. Finally, you
should also test that source actions work as expected (and across the
entire notebook).

Note: `ruff.applyAutofix` and `ruff.applyOrganizeImports` are currently
broken for notebook files, and I suspect it has something to do with
https://github.com/astral-sh/ruff/issues/11248. Once this is fixed, I
will update the test plan accordingly.

---------

Co-authored-by: nolan <nolan.king90@gmail.com>
2024-05-21 22:29:30 +00:00
Charlie Marsh bea8f2ee3a
Detect automagic-like assignments in notebooks (#9653)
## Summary

Given a statement like `colors = 6`, we currently treat the cell as an
automagic (since `colors` is an automagic) -- i.e., we assume it's
equivalent to `%colors = 6`. This PR adds some additional detection
whereby if the statement is an _assignment_, we avoid treating it as
such. I audited the list of automagics, and I believe this is safe for
all of them.

Closes https://github.com/astral-sh/ruff/issues/8526.

Closes https://github.com/astral-sh/ruff/issues/9648.

## Test Plan

`cargo test`
2024-01-29 12:55:44 +00:00
konsti cd3c2f773f
Prevent invalid utf8 indexing in cell magic detection (#9146)
The example below used to panic because we tried to split at 2 bytes in
the 4-bytes character `转`.
```python
def sample_func(xx):
    """
    转置 (transpose)
    """
    return xx.T
```

Fixes #9145
Fixes https://github.com/astral-sh/ruff-vscode/issues/362

The second commit is a small test refactoring.
2023-12-15 08:15:46 -06:00
Dhruv Manilawala 6bbabceead
Allow transparent cell magics (#8911)
## Summary

This PR updates the logic for `is_magic_cell` to include certain cell
magics. These cell magics would contain Python code following the line
defining the command. The code could define a variable which can then be
referenced in other cells. Currently, we would ignore the cell
completely leading to undefined-name violation.

As discussed in
https://github.com/astral-sh/ruff/issues/8354#issuecomment-1832221009

## Test Plan

Add new test case to validate this scenario.
2023-12-07 14:15:43 -06:00
Dhruv Manilawala b28556d739
Update `E402` to work at cell level for notebooks (#8872)
## Summary

This PR updates the `E402` rule to work at cell level for Jupyter
notebooks. This is enabled only in preview to gather feedback.

The implementation basically resets the import boundary flag on the
semantic model when we encounter the first statement in a cell.

Another potential solution is to introduce `E403` rule that is
specifically for notebooks that works at cell level while `E402` will be
disabled for notebooks.

## Test Plan

Add a notebook with imports in multiple cells and verify that the rule
works as expected.

resolves: #8669
2023-11-29 00:32:35 +00:00
Dhruv Manilawala 5b726f70f4
Avoid `B015`,`B018` for last expression in a cell (#8815)
## Summary

This PR updates `B015` and `B018` to ignore last top-level expressions
in each cell of a Jupyter Notebook.

Part of #8669

## Test Plan

Add test cases for both rules and update the snapshots.
2023-11-22 15:33:23 +00:00
Dhruv Manilawala 727e389cac
Add `CellOffsets` abstraction (#8814)
Refactor `Notebook::cell_offsets` to use an abstract struct for storing
the cell offsets. This will allow us to add useful methods on it.
2023-11-22 15:27:00 +00:00
Dhruv Manilawala 63a87dda63
Move notebook cell logic in separate file (#8813)
Small refactor to move cell related logic to it's own file.
2023-11-22 09:21:28 -06:00