Python/ruff - ruff

Commit Graph

Author	SHA1	Message	Date
Brent Westbrook	2b1d3c60fa	Display diffs for `ruff format --check` and add support for different output formats (#20443) ## Summary This PR uses the new `Diagnostic` type for rendering formatter diagnostics. This allows the formatter to inherit all of the output formats already implemented in the linter and ty. For example, here's the new `full` output format, with the formatting diff displayed using the same infrastructure as the linter: <img width="592" height="364" alt="image" src="https://github.com/user-attachments/assets/6d09817d-3f27-4960-aa8b-41ba47fb4dc0" /> <details><summary>Resolved TODOs</summary> <p> ~~There are several limitations/todos here still, especially around the `OutputFormat` type~~: - [x] A few literal `todo!`s for the remaining `OutputFormat`s without matching `DiagnosticFormat`s - [x] The default output format is `full` instead of something more concise like the current output - [x] Some of the output formats (namely JSON) have information that doesn't make much sense for these diagnostics The first of these is definitely resolved, and I think the other two are as well, based on discussion on the design document. In brief, we're okay inheriting the default `OutputFormat` and can separate the global option into `lint.output-format` and `format.output-format` in the future, if needed; and we're okay including redundant information in the non-human-readable output formats. My last major concern is with the performance of the new code, as discussed in the `Benchmarks` section below. A smaller question is whether we should use `Diagnostic`s for formatting errors too. I think the answer to this is yes, in line with changes we're making in the linter too. I still need to implement that here. </p> </details> <details><summary>Benchmarks</summary> <p> The values in the table are from a large benchmark on the CPython 3.10 code base, which involves checking 2011 files, 1872 of which need to be reformatted. 
`stable` corresponds to the same code used on `main`, while `preview-full` and `preview-concise` use the new `Diagnostic` code gated behind `--preview` for the `full` and `concise` output formats, respectively. `stable-diff` uses the `--diff` flag to compare the two diff rendering approaches. See the full hyperfine command below for more details. For a sense of scale, the `stable` output format produces 1873 lines on stdout, compared to 855,278 for `preview-full` and 857,798 for `stable-diff`. \| Command \| Mean [ms] \| Min [ms] \| Max [ms] \| Relative \| \|:------------------\|--------------:\|---------:\|---------:\|-------------:\| \| `stable` \| 201.2 ± 6.8 \| 192.9 \| 220.6 \| 1.00 \| \| `preview-full` \| 9113.2 ± 31.2 \| 9076.1 \| 9152.0 \| 45.29 ± 1.54 \| \| `preview-concise` \| 214.2 ± 1.4 \| 212.0 \| 217.6 \| 1.06 ± 0.04 \| \| `stable-diff` \| 3308.6 ± 20.2 \| 3278.6 \| 3341.8 \| 16.44 ± 0.56 \| In summary, the `preview-concise` diagnostics are ~6% slower than the stable output format, increasing the average runtime from 201.2 ms to 214.2 ms. The `full` preview diagnostics are much more expensive, taking 9113.2 ms on average to complete, which is ~3x more expensive even than the stable diffs produced by the `--diff` flag. My main takeaways here are: 1. Rendering `Edit`s is much more expensive than rendering the diffs from `--diff` 2. Constructing `Edit`s actually isn't too bad ### Constructing `Edit`s I also took a closer look at `Edit` construction by modifying the code and repeating the `preview-concise` benchmark and found that the main issue is constructing a `SourceFile` for use in the `Edit` rendering. 
Commenting out the `Edit` construction itself has basically no effect: \| Command \| Mean [ms] \| Min [ms] \| Max [ms] \| Relative \| \|:----------\|------------:\|---------:\|---------:\|------------:\| \| `stable` \| 197.5 ± 1.6 \| 195.0 \| 200.3 \| 1.00 \| \| `no-edit` \| 208.9 ± 2.2 \| 204.8 \| 212.2 \| 1.06 ± 0.01 \| However, also omitting the source text from the `SourceFile` construction resolves the slowdown compared to `stable`. So it seems that copying the full source text into a `SourceFile` is the main cause of the slowdown for non-`full` diagnostics. \| Command \| Mean [ms] \| Min [ms] \| Max [ms] \| Relative \| \|:-----------------\|------------:\|---------:\|---------:\|------------:\| \| `stable` \| 202.4 ± 2.9 \| 197.6 \| 207.9 \| 1.00 \| \| `no-source-text` \| 202.7 ± 3.3 \| 196.3 \| 209.1 \| 1.00 ± 0.02 \| ### Rendering diffs The main difference between `stable-diff` and `preview-full` seems to be the diffing strategy we use from `similar`. Both versions use the same algorithm, but in the existing [`CodeDiff`](https://github.com/astral-sh/ruff/blob/main/crates/ruff_linter/src/source_kind.rs#L259) rendering for the `--diff` flag, we only do line-level diffing, whereas for `Diagnostic`s we use `TextDiff::iter_inline_changes` to highlight word-level changes too. Skipping the word diff for `Diagnostic`s closes most of the gap: \| Command \| Mean [s] \| Min [s] \| Max [s] \| Relative \| \|:---\|---:\|---:\|---:\|---:\| \| `stable-diff` \| 3.323 ± 0.015 \| 3.297 \| 3.341 \| 1.00 \| \| `preview-full` \| 3.654 ± 0.019 \| 3.618 \| 3.682 \| 1.10 ± 0.01 \| (In some repeated runs, I've seen as small as a ~5% difference, down from 10% in the table) This doesn't actually change any of our snapshots, but it would obviously change the rendered result in a terminal since we wouldn't highlight the specific words that changed within a line. Another much smaller change that we can try is removing the deadline from the `iter_inline_changes` call. 
It looks like there's a fair amount of overhead from the default 500 ms deadline for computing these, and using `iter_inline_changes(op, None)` (`None` for the optional deadline argument) improves the runtime quite a bit: \| Command \| Mean [s] \| Min [s] \| Max [s] \| Relative \| \|:---\|---:\|---:\|---:\|---:\| \| `stable-diff` \| 3.322 ± 0.013 \| 3.298 \| 3.341 \| 1.00 \| \| `preview-full` \| 5.296 ± 0.030 \| 5.251 \| 5.366 \| 1.59 ± 0.01 \| <hr> <details><summary>hyperfine command</summary> ```shell cargo build --release --bin ruff && hyperfine --ignore-failure --warmup 10 --export-markdown /tmp/table.md \ -n stable -n preview-full -n preview-concise -n stable-diff \ "./target/release/ruff format --check ./crates/ruff_linter/resources/test/cpython/ --no-cache" \ "./target/release/ruff format --check ./crates/ruff_linter/resources/test/cpython/ --no-cache --preview --output-format=full" \ "./target/release/ruff format --check ./crates/ruff_linter/resources/test/cpython/ --no-cache --preview --output-format=concise" \ "./target/release/ruff format --check ./crates/ruff_linter/resources/test/cpython/ --no-cache --diff" ``` </details> </p> </details> ## Test Plan Some new CLI tests and manual testing	2025-09-30 12:00:51 -04:00
Micha Reiser	d8216fa328	[ty] Gracefully handle salsa cancellations and panics in background request handlers (#18254)	2025-05-26 13:37:49 +01:00
Micha Reiser	d94be0e780	[red-knot] Include salsa backtrace in check and mdtest panic messages (#17732) Co-authored-by: David Peter <sharkdp@users.noreply.github.com>	2025-04-30 10:26:40 +02:00
Micha Reiser	1d788981cd	[red-knot] Capture backtrace in "check-failed" diagnostic (#17641) Co-authored-by: David Peter <sharkdp@users.noreply.github.com>	2025-04-29 16:58:58 +00:00
Douglas Creager	5f5eb7c0dd	[red-knot] Print non-string panic payloads and (sometimes) backtraces (#15363) More refinements to the panic messages for failing mdtests to mimic the output of the default panic hook more closely: - We now print out `Box<dyn Any>` if the panic payload is not a string (which is typically the case for salsa panics). - We now include the panic's backtrace if you set the `RUST_BACKTRACE` environment variable.	2025-01-08 18:12:16 -05:00
Douglas Creager	2ca31e4b43	Fall back on previous panic hook when not in `catch_unwind` wrapper (#15319) This fixes #15317. Our `catch_unwind` wrapper installs a panic hook that captures (the rendered contents of) the panic info when a panic occurs. Since the intent is that the caller will render the panic info in some custom way, the hook silences the default stderr panic output. However, the panic hook is a global resource, so if any one thread was in the middle of a `catch_unwind` call, we would silence the default panic output for _all_ threads. The solution is to also keep a thread local that indicates whether the current thread is in the middle of our `catch_unwind`, and to fall back on the default panic hook if not. ## Test Plan Artificially added an mdtest parse error, ran tests via `cargo test -p red_knot_python_semantic` to run a large number of tests in parallel. Before this patch, the panic message was swallowed as reported in #15317. After, the panic message was shown.	2025-01-08 11:34:51 -05:00
Douglas Creager	75015b0ed9	Attribute panics to the mdtests that cause them (#15241) This updates the mdtest harness to catch any panics that occur during type checking, and to display the panic message as an mdtest failure. (We don't know which specific line causes the failure, so we attribute panics to the first line of the test case.)	2025-01-03 13:45:56 -05:00
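The approach described in commit 2ca31e4b43 (a global panic hook that defers to the previous hook unless the current thread is inside the `catch_unwind` wrapper) can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the code from the ruff repository: the names `CAPTURING`, `LAST_PANIC`, `install_hook`, and `catch_unwind_captured` are hypothetical, and the real wrapper may capture more than the rendered message (for example, backtraces).

```rust
use std::cell::{Cell, RefCell};
use std::panic::{self, UnwindSafe};
use std::sync::Once;

thread_local! {
    // Hypothetical flag: true only while this thread is inside the wrapper.
    static CAPTURING: Cell<bool> = Cell::new(false);
    // Hypothetical slot holding the rendered panic info for this thread.
    static LAST_PANIC: RefCell<Option<String>> = RefCell::new(None);
}

static INSTALL_HOOK: Once = Once::new();

// Install the hook once per process. The panic hook is a global resource,
// so the hook itself has to decide per thread whether to capture the panic
// or to defer to whatever hook was installed before.
fn install_hook() {
    INSTALL_HOOK.call_once(|| {
        let previous = panic::take_hook();
        panic::set_hook(Box::new(move |info| {
            if CAPTURING.with(|flag| flag.get()) {
                // Inside the wrapper: record the message and stay silent.
                LAST_PANIC.with(|slot| *slot.borrow_mut() = Some(info.to_string()));
            } else {
                // Any other thread: keep the previous (default) stderr output.
                previous(info);
            }
        }));
    });
}

// Run `f`, returning the rendered panic message as `Err` if it panics.
pub fn catch_unwind_captured<F, R>(f: F) -> Result<R, String>
where
    F: FnOnce() -> R + UnwindSafe,
{
    install_hook();
    // Remember the old flag value so that nested calls restore it correctly.
    let was_capturing = CAPTURING.with(|flag| flag.replace(true));
    let result = panic::catch_unwind(f);
    CAPTURING.with(|flag| flag.set(was_capturing));
    result.map_err(|_| {
        LAST_PANIC
            .with(|slot| slot.borrow_mut().take())
            .unwrap_or_else(|| "unknown panic".to_string())
    })
}
```

With this sketch, a panic raised inside `catch_unwind_captured` comes back as an `Err` containing the rendered panic info, while a panic on a thread that is not inside the wrapper still goes through the previously installed hook and prints to stderr as usual.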

7 Commits