This PR implements a modification (in preview) to fluent formatting for
method chains: We break _at_ the first call instead of _after_.
For example, we have the following diff between `main` and this PR (with
`line-length=8` so I don't have to stretch out the text):
```diff
x = (
- df.merge()
+ df
+ .merge()
.groupby()
.agg()
.filter()
)
```
## Explanation of current implementation
Recall that we traverse the AST to apply formatting. A method chain,
while read left-to-right, is stored in the AST "in reverse". So if we
start with something like
```python
a.b.c.d().e.f()
```
then the first syntax node we meet is essentially `.f()`. So we have to
peek ahead. And we actually _already_ do this in our current fluent
formatting logic: we peek ahead to count how many calls we have in the
chain to see whether we should be using fluent formatting or not.
In this implementation, we actually _record_ this number inside the enum
for `CallChainLayout`. That is, we make the variant `Fluent` hold an
`AttributeState`. This state can either be:
- The number of call-like attributes preceding the current attribute
- The state `FirstCallOrSubscript` which means we are at the first
call-like attribute in the chain (reading from left to right)
- The state `BeforeFirstCallOrSubscript` which means we are in the
"first group" of attributes, preceding that first call.
In our example, here's what it looks like at each attribute:
```
a.b.c.d().e.f @ Fluent(CallsOrSubscriptsPreceding(1))
a.b.c.d().e @ Fluent(CallsOrSubscriptsPreceding(1))
a.b.c.d @ Fluent(FirstCallOrSubscript)
a.b.c @ Fluent(BeforeFirstCallOrSubscript)
a.b @ Fluent(BeforeFirstCallOrSubscript)
```
Now, as we descend from the parent expression, we pass along this
little piece of state and update it as we go to track where we are. This
state has no effect except when we are in `FirstCallOrSubscript`,
in which case we add a soft line break.
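To make the state tracking concrete, here is a minimal Python sketch that reproduces the labels from the example above using Python's own `ast` module. It is an illustration only, not the formatter's Rust implementation: the function name `attribute_states` and the `remaining` counter are hypothetical, and only the three state names come from this PR.

```python
import ast


def attribute_states(source: str) -> list[tuple[str, str]]:
    """Label each attribute in a chain, walking from the outermost node inward."""
    node = ast.parse(source, mode="eval").body
    attrs = []  # (attribute node, is it directly called or subscripted?)
    call_like = False
    while True:
        if isinstance(node, ast.Call):
            call_like, node = True, node.func
        elif isinstance(node, ast.Subscript):
            call_like, node = True, node.value
        elif isinstance(node, ast.Attribute):
            attrs.append((node, call_like))
            call_like, node = False, node.value
        else:
            break

    # "Peek ahead": count the call-like attributes in the whole chain.
    remaining = sum(is_call for _, is_call in attrs)
    labelled = []
    for attr, is_call in attrs:
        if is_call:
            remaining -= 1  # now counts only the calls to the left of `attr`
        if remaining > 0:
            state = f"CallsOrSubscriptsPreceding({remaining})"
        elif is_call:
            state = "FirstCallOrSubscript"  # the soft line break would go here
        else:
            state = "BeforeFirstCallOrSubscript"
        labelled.append((ast.unparse(attr), state))
    return labelled


for text, state in attribute_states("a.b.c.d().e.f()"):
    print(f"{text} @ Fluent({state})")
```

The sketch visits `.f` first, just as the formatter does, which is why the total has to be known before the descent starts.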
Closes #8598
---------
Co-authored-by: Brent Westbrook <36778786+ntBre@users.noreply.github.com>
Summary
--
Following #8179, we now format long lambda expressions a bit more like
Black, preferring to keep long parameter lists on a single line, but we
go one step further to break the body itself across multiple lines and
parenthesize it if it's still too long. This PR documents both the
stable deviation that breaks parameters across multiple lines, and the
new preview deviation that breaks the body instead.
I also fixed a couple of typos in the section immediately above my
addition.
Test Plan
--
I tested all of the snippets here against `main` for the preview
behavior, our playground for the stable behavior, and Black's playground
for their behavior.
## Summary
The reference to the pre-commit hook inside the tutorial was to the
legacy alias `ruff` instead of the current `ruff-check`.
Ref: https://github.com/astral-sh/ruff-pre-commit/pull/124
## Test Plan
Not applicable.
Running `eglot-format` in buffers not managed by Eglot causes a
`jsonrpc-error` in Emacs 30. It may also display a
`documentFormattingProvider` warning when the server does not support
formatting. Add checks for both.
## Summary
This PR uses the new `Diagnostic` type for rendering formatter
diagnostics. This allows the formatter to inherit all of the output
formats already implemented in the linter and ty. For example, here's
the new `full` output format, with the formatting diff displayed using
the same infrastructure as the linter:
<img width="592" height="364" alt="image"
src="https://github.com/user-attachments/assets/6d09817d-3f27-4960-aa8b-41ba47fb4dc0"
/>
<details><summary>Resolved TODOs</summary>
<p>
~~There are several limitations/todos here still, especially around the
`OutputFormat` type~~:
- [x] A few literal `todo!`s for the remaining `OutputFormat`s without
matching `DiagnosticFormat`s
- [x] The default output format is `full` instead of something more
concise like the current output
- [x] Some of the output formats (namely JSON) have information that
doesn't make much sense for these diagnostics
The first of these is definitely resolved, and I think the other two are
as well, based on discussion on the design document. In brief, we're
okay inheriting the default `OutputFormat` and can separate the global
option into `lint.output-format` and `format.output-format` in the
future, if needed; and we're okay including redundant information in the
non-human-readable output formats.
My last major concern is with the performance of the new code, as
discussed in the `Benchmarks` section below.
A smaller question is whether we should use `Diagnostic`s for formatting
errors too. I think the answer to this is yes, in line with changes
we're making in the linter too. I still need to implement that here.
</p>
</details>
<details><summary>Benchmarks</summary>
<p>
The values in the table are from a large benchmark on the CPython 3.10
code base, which involves checking 2011 files, 1872 of which need to be
reformatted. `stable` corresponds to the same code used on `main`, while
`preview-full` and `preview-concise` use the new `Diagnostic` code gated
behind `--preview` for the `full` and `concise` output formats,
respectively. `stable-diff` uses the `--diff` flag to compare the two
diff rendering approaches. See the full hyperfine command below for more
details. For a sense of scale, the `stable` output format produces 1873
lines on stdout, compared to 855,278 for `preview-full` and 857,798 for
`stable-diff`.
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:------------------|--------------:|---------:|---------:|-------------:|
| `stable` | 201.2 ± 6.8 | 192.9 | 220.6 | 1.00 |
| `preview-full` | 9113.2 ± 31.2 | 9076.1 | 9152.0 | 45.29 ± 1.54 |
| `preview-concise` | 214.2 ± 1.4 | 212.0 | 217.6 | 1.06 ± 0.04 |
| `stable-diff` | 3308.6 ± 20.2 | 3278.6 | 3341.8 | 16.44 ± 0.56 |
In summary, the `preview-concise` diagnostics are ~6% slower than the
stable output format, increasing the average runtime from 201.2 ms to
214.2 ms. The `full` preview diagnostics are much more expensive, taking
9113.2 ms on average to complete, which is nearly 3x more expensive even
than the stable diffs produced by the `--diff` flag.
My main takeaways here are:
1. Rendering `Edit`s is much more expensive than rendering the diffs
from `--diff`
2. Constructing `Edit`s actually isn't too bad
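
The relative figures quoted in the summary can be rechecked from the table means. This is only a small arithmetic sketch in Python; the dictionary and the `ratio` helper are illustrative names, and the numbers are copied from the first benchmark table.

```python
# Mean runtimes in milliseconds, copied from the first benchmark table.
means_ms = {
    "stable": 201.2,
    "preview-full": 9113.2,
    "preview-concise": 214.2,
    "stable-diff": 3308.6,
}


def ratio(command: str, baseline: str) -> float:
    """How many times slower `command` is than `baseline`."""
    return means_ms[command] / means_ms[baseline]


print(f"preview-concise vs stable:  {ratio('preview-concise', 'stable'):.2f}x")
print(f"preview-full vs stable:     {ratio('preview-full', 'stable'):.2f}x")
print(f"preview-full vs stable-diff: {ratio('preview-full', 'stable-diff'):.2f}x")
```
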
### Constructing `Edit`s
I also took a closer look at `Edit` construction by modifying the code
and repeating the `preview-concise` benchmark and found that the main
issue is constructing a `SourceFile` for use in the `Edit` rendering.
Commenting out the `Edit` construction itself has basically no effect:
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:----------|------------:|---------:|---------:|------------:|
| `stable` | 197.5 ± 1.6 | 195.0 | 200.3 | 1.00 |
| `no-edit` | 208.9 ± 2.2 | 204.8 | 212.2 | 1.06 ± 0.01 |
However, also omitting the source text from the `SourceFile`
construction resolves the slowdown compared to `stable`. So it seems
that copying the full source text into a `SourceFile` is the main cause
of the slowdown for non-`full` diagnostics.
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:-----------------|------------:|---------:|---------:|------------:|
| `stable` | 202.4 ± 2.9 | 197.6 | 207.9 | 1.00 |
| `no-source-text` | 202.7 ± 3.3 | 196.3 | 209.1 | 1.00 ± 0.02 |
### Rendering diffs
The main difference between `stable-diff` and `preview-full` seems to be
the diffing strategy we use from `similar`. Both versions use the same
algorithm, but in the existing
[`CodeDiff`](https://github.com/astral-sh/ruff/blob/main/crates/ruff_linter/src/source_kind.rs#L259)
rendering for the `--diff` flag, we only do line-level diffing, whereas
for `Diagnostic`s we use `TextDiff::iter_inline_changes` to highlight
word-level changes too. Skipping the word diff for `Diagnostic`s closes
most of the gap:
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `stable-diff` | 3.323 ± 0.015 | 3.297 | 3.341 | 1.00 |
| `preview-full` | 3.654 ± 0.019 | 3.618 | 3.682 | 1.10 ± 0.01 |
(In some repeated runs, I've seen a difference as small as ~5%, down
from the 10% shown in the table.)
This doesn't actually change any of our snapshots, but it would
obviously change the rendered result in a terminal since we wouldn't
highlight the specific words that changed within a line.
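
As a rough analogy for the two strategies, the following Python sketch uses the standard library's `difflib` rather than `similar`, so it says nothing about `similar`'s actual API or cost. The helper names `changed_lines` and `changed_words` are hypothetical. It shows that the word-level pass is extra work done for every replaced line on top of the line-level diff.

```python
import difflib


def changed_lines(old: str, new: str) -> list[tuple[str, str]]:
    """Line-level diff: return (old_line, new_line) pairs for replaced lines."""
    a, b = old.splitlines(), new.splitlines()
    pairs = []
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(None, a, b).get_opcodes():
        if tag == "replace":
            pairs.extend(zip(a[i1:i2], b[j1:j2]))
    return pairs


def changed_words(old_line: str, new_line: str) -> list[tuple[str, str]]:
    """Word-level diff inside one replaced line: a second diff per line."""
    a, b = old_line.split(), new_line.split()
    changes = []
    for tag, i1, i2, j1, j2 in difflib.SequenceMatcher(None, a, b).get_opcodes():
        if tag != "equal":
            changes.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes


old = "x = foo(a, b)\ny = 1"
new = "x = foo(a, c)\ny = 1"
for old_line, new_line in changed_lines(old, new):
    print("line:", old_line, "->", new_line)
    print("words:", changed_words(old_line, new_line))
```
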
Another much smaller change that we can try is removing the deadline
from the `iter_inline_changes` call. It looks like there's a fair amount
of overhead from the default 500 ms deadline for computing these, and
using `iter_inline_changes(op, None)` (`None` for the optional deadline
argument) improves the runtime quite a bit:
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `stable-diff` | 3.322 ± 0.013 | 3.298 | 3.341 | 1.00 |
| `preview-full` | 5.296 ± 0.030 | 5.251 | 5.366 | 1.59 ± 0.01 |
<hr>
<details><summary>hyperfine command</summary>
```shell
cargo build --release --bin ruff && hyperfine --ignore-failure --warmup 10 --export-markdown /tmp/table.md \
-n stable -n preview-full -n preview-concise -n stable-diff \
"./target/release/ruff format --check ./crates/ruff_linter/resources/test/cpython/ --no-cache" \
"./target/release/ruff format --check ./crates/ruff_linter/resources/test/cpython/ --no-cache --preview --output-format=full" \
"./target/release/ruff format --check ./crates/ruff_linter/resources/test/cpython/ --no-cache --preview --output-format=concise" \
"./target/release/ruff format --check ./crates/ruff_linter/resources/test/cpython/ --no-cache --diff"
```
</details>
</p>
</details>
## Test Plan
Some new CLI tests and manual testing
## Summary
Closes #18349
After this change:
- All deprecated rules are deselected by default
- They are only selected if the user specifically selects them by code,
e.g. `--select UP038`
- Thus, `--select ALL --select UP --select UP0` won't select the
deprecated rule UP038
- Documented the change in version policy. From now on, deprecating a
rule should increase the minor version
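
The selection behavior described above can be sketched as follows. This is a simplified Python model, not Ruff's selector resolution: the `RULES` table is a small hypothetical subset, and the `resolve` helper is an illustrative name. The only behavior taken from this PR is that a deprecated rule is selected solely by its exact code.

```python
# Hypothetical subset of the rule table: code -> is the rule deprecated?
RULES = {
    "E501": False,
    "UP006": False,
    "UP007": False,
    "UP038": True,
}


def resolve(selectors: list[str]) -> set[str]:
    """Return the rule codes selected by prefix selectors or `ALL`."""
    selected = set()
    for selector in selectors:
        for code, deprecated in RULES.items():
            matches = selector == "ALL" or code.startswith(selector)
            if not matches:
                continue
            # Deprecated rules are only picked up by their exact code.
            if deprecated and selector != code:
                continue
            selected.add(code)
    return selected


print(sorted(resolve(["ALL", "UP", "UP0"])))  # no UP038
print(sorted(resolve(["UP038"])))             # only UP038
```
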
## Test Plan
Integration tests in "integration_tests.rs"
Also tested with a temporary test package:
```
~> ../../ruff/target/debug/ruff.exe check --select UP038
warning: Rule `UP038` is deprecated and will be removed in a future release.
warning: Detected debug build without --no-cache.
UP038 Use `X | Y` in `isinstance` call instead of `(X, Y)`
--> main.py:2:11
|
1 | def main():
2 | print(isinstance(25, (str, int)))
| ^^^^^^^^^^^^^^^^^^^^^^^^^^
|
help: Convert to `X | Y`
Found 1 error.
No fixes available (1 hidden fix can be enabled with the `--unsafe-fixes` option).
~> ../../ruff/target/debug/ruff.exe check --select UP03
warning: Detected debug build without --no-cache.
All checks passed!
~> ../../ruff/target/debug/ruff.exe check --select UP0
warning: Detected debug build without --no-cache.
All checks passed!
~> ../../ruff/target/debug/ruff.exe check --select UP
warning: Detected debug build without --no-cache.
All checks passed!
~> ../../ruff/target/debug/ruff.exe check --select ALL
# warnings and errors, but because of other errors, UP038 was deselected
```
This stabilizes the behavior introduced in #16565 which (roughly) tries
to match an import like `import a.b.c` to an actual directory path
`a/b/c` in order to label it as first-party, rather than simply looking
for a directory `a`.
Mainly this affects the sorting of imports in the presence of namespace
packages, but a few other rules are affected as well.
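
A rough Python sketch of the difference between the two lookups, assuming a single source root and directories only. The helper names are hypothetical, and the real classification handles more cases than this.

```python
import tempfile
from pathlib import Path


def first_party_by_top_level(module: str, src: Path) -> bool:
    """Earlier behavior (simplified): only look for the top-level directory."""
    return (src / module.split(".")[0]).is_dir()


def first_party_by_full_path(module: str, src: Path) -> bool:
    """Stabilized behavior (simplified): match the whole dotted path."""
    return src.joinpath(*module.split(".")).is_dir()


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp)
    (src / "a" / "b" / "c").mkdir(parents=True)  # part of a namespace package
    for module in ("a.b.c", "a.x.y"):
        print(
            module,
            first_party_by_top_level(module, src),
            first_party_by_full_path(module, src),
        )
```

With a namespace package, `a.x.y` may live in a different distribution even though a local directory `a` exists, which is the case the full-path check is meant to distinguish.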
This adds a new `backend: internal | uv` option to the LSP
`FormatOptions` allowing users to perform document and range formatting
operations through uv. The idea here is to prototype a solution for users
to transition to a `uv format` command without encountering version
mismatches (and consequently, formatting differences) between the LSP's
version of `ruff` and uv's version of `ruff`.
The primary alternative to this would be to use uv to discover the
`ruff` version used to start the LSP in the first place. However, this
would increase the scope of a minimal `uv format` command beyond "run a
formatter", and raise larger questions about how uv should be used to
coordinate toolchain discovery. I think those are good things to
explore, but I'm hesitant to let them block a `uv format`
implementation. Another downside of using uv to discover `ruff` is that
it needs to be implemented _outside_ the LSP; e.g., we'd need to change
the instructions on how to run the LSP and implement it in each editor
integration, like the VS Code plugin.
---------
Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
## Summary
D413 in this section was incorrectly linking to D410.
I haven't checked if this issue happens anywhere else in the docs.
## Test Plan
Look at docs