Commit Graph

13 Commits

Author SHA1 Message Date
Brent Westbrook f14ee9edd5
Use structs for JSON serialization (#19270)
## Summary

See https://github.com/astral-sh/ruff/pull/19133#discussion_r2198413586
for recent discussion. This PR moves to using structs for the types in
our JSON output format instead of the `json!` macro.

I didn't rename any of the `message` references because that should be
handled when rebasing #19133 onto this.

My plan for handling the `preview` behavior with the new diagnostics is
to use a wrapper enum. Something like:

```rust
#[derive(Serialize)]
#[serde(untagged)]
pub(crate) enum JsonDiagnostic<'a> {
    Old(OldJsonDiagnostic<'a>),
}

#[derive(Serialize)]
pub(crate) struct OldJsonDiagnostic<'a> {
    // ...
}
```

Initially I thought I could use a `&dyn Serialize` for the affected
fields, but I see that `Serialize` isn't dyn-compatible in testing this
now.

## Test Plan

Existing tests. One quirk of the new types is that their fields are in
alphabetical order. I guess `json!` sorts the fields alphabetically? The
tests were failing before I sorted the struct fields.

## Other formats

It looks like the `rdjson`, `sarif`, and `gitlab` formats also use
`json!`, so if we decide to merge this, I can do something similar for
those before moving them to the new diagnostic format.
2025-07-11 09:37:44 -04:00
Brent Westbrook 77a5c5ac80
Combine `OldDiagnostic` and `Diagnostic` (#19053)
## Summary

This PR is a collaboration with @AlexWaygood from our pairing session
last Friday.

The main goal here is removing `ruff_linter::message::OldDiagnostic` in
favor of
using `ruff_db::diagnostic::Diagnostic` directly. This involved a few
major steps:

- Transferring the fields
- Transferring the methods and trait implementations, where possible
- Converting some constructor methods to free functions
- Moving the `SecondaryCode` struct
- Updating the method names

I'm hoping that some of the methods, especially those in the
`expect_ruff_*`
family, won't be necessary long-term, but I avoided trying to replace
them
entirely for now to keep the already-large diff a bit smaller.

### Related refactors

Alex and I noticed a few refactoring opportunities while looking at the
code,
specifically the very similar implementations for
`create_parse_diagnostic`,
`create_unsupported_syntax_diagnostic`, and
`create_semantic_syntax_diagnostic`.
We combined these into a single generic function, which I then copied
into
`ruff_linter::message` with some small changes and a TODO to combine
them in the
future.

I also deleted the `DisplayParseErrorType` and `TruncateAtNewline` types
for
reporting parse errors. These were added in #4124, I believe to work
around the
error messages from LALRPOP. Removing these didn't affect any tests, so
I think
they were unnecessary now that we fully control the error messages from
the
parser.

On a more minor note, I factored out some calls to the
`OldDiagnostic::filename`
(now `Diagnostic::expect_ruff_filename`) function to avoid repeatedly
allocating
`String`s in some places.

### Snapshot changes

The `show_statistics_syntax_errors` integration test changed because the
`OldDiagnostic::name` method used `syntax-error` instead of
`invalid-syntax`
like in ty. I think this (`--statistics`) is one of the only places we
actually
use this name for syntax errors, so I hope this is okay. An alternative
is to
use `syntax-error` in ty too.

The other snapshot changes are from removing this code, as discussed on

[Discord](https://discord.com/channels/1039017663004942429/1228460843033821285/1388252408848847069):


34052a1185/crates/ruff_linter/src/message/mod.rs (L128-L135)

I think both of these are technically breaking changes, but they only
affect
syntax errors and are very narrow in scope, while also pretty
substantially
simplifying the refactor, so I hope they're okay to include in a patch
release.

## Test plan

Existing tests, with the adjustments mentioned above

---------

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2025-07-03 13:01:09 -04:00
Brent Westbrook 96f3c8d1ab
Convert `OldDiagnostic::noqa_code` to an `Option<String>` (#18946)
## Summary

I think this should be the last step before combining `OldDiagnostic`
and `ruff_db::Diagnostic`. We can't store a `NoqaCode` on
`ruff_db::Diagnostic`, so I converted the `noqa_code` field to an
`Option<String>` and then propagated this change to all of the callers.

I tried to use `&str` everywhere it was possible, so I think the
remaining `to_string` calls are necessary. I spent some time trying to
convert _everything_ to `&str` but ran into lifetime issues, especially
in the `FixTable`. Maybe we can take another look at that if it causes a
performance regression, but hopefully these paths aren't too hot. We
also avoid some `to_string` calls, so it might even out a bit too.

## Test Plan

Existing tests

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
2025-06-27 11:36:55 -04:00
Brent Westbrook 10a1d9f01e
Unify `OldDiagnostic` and `Message` (#18391)
Summary
--

This PR unifies the remaining differences between `OldDiagnostic` and
`Message` (`OldDiagnostic` was only missing an optional `noqa_offset`
field) and
replaces `Message` with `OldDiagnostic`.

The biggest functional difference is that the combined `OldDiagnostic`
kind no
longer implements `AsRule` for an infallible conversion to `Rule`. This
was
pretty easy to work around with `is_some_and` and `is_none_or` in the
few places
it was needed. In `LintContext::report_diagnostic_if_enabled` we can
just use
the new `Violation::rule` method, which takes care of most cases.

Most of the interesting changes are in [this
range](8156992540)
before I started renaming.

Test Plan
--

Existing tests

Future Work
--

I think it's time to start shifting some of these fields to the new
`Diagnostic`
kind. I believe we want `Fix` for sure, but I'm less sure about the
others. We
may want to keep a thin wrapper type here anyway to implement a `rule`
method,
so we could leave some of these fields on that too.
2025-06-19 09:37:58 -04:00
Brent Westbrook ce216c79cc
Remove `Message::to_rule` (#18447)
## Summary

As the title says, this PR removes the `Message::to_rule` method by
replacing related uses of `Rule` with `NoqaCode` (or the rule's name in
the case of the cache). Where it seemed a `Rule` was really needed, we
convert back to the `Rule` by parsing either the rule name (with
`str::parse`) or the `NoqaCode` (with `Rule::from_code`).

I thought this was kind of like cheating and that it might not resolve
this part of Micha's
[comment](https://github.com/astral-sh/ruff/pull/18391#issuecomment-2933764275):

> because we can't add Rule to Diagnostic or **have it anywhere in our
shared rendering logic**

but after looking again, the only remaining `Rule` conversion in
rendering code is for the SARIF output format. The other two non-test
`Rule` conversions are for caching and writing a fix summary, which I
don't think fall into the shared rendering logic. That leaves the SARIF
format as the only real problem, but maybe we can delay that for now.

The motivation here is that we won't be able to store a `Rule` on the
new `Diagnostic` type, but we should be able to store a `NoqaCode`,
likely as a string.

## Test Plan

Existing tests

##
[Benchmarks](https://codspeed.io/astral-sh/ruff/branches/brent%2Fremove-to-rule)

Almost no perf regression, only -1% on
`linter/default-rules[large/dataset.py]`.

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
2025-06-05 12:48:29 -04:00
Brent Westbrook 9925910a29
Add a `ViolationMetadata::rule` method (#18234)
Summary
--

This PR adds a macro-generated method to retrieve the `Rule` associated
with a given `Violation` struct, which makes it substantially cheaper
than parsing from the rule name. The rule is then converted to a
`NoqaCode` for storage on the `Message` (and eventually on the new
diagnostic type). The `ViolationMetadata::rule_name` method was now
unused, so the `rule` method replaces it.

Several types had to be moved from the `ruff_diagnostics` crate to the
`ruff_linter` crate to make this work, namely the `Violation` traits and
the old `Diagnostic` type, which had a constructor generic over a
`Violation`.

It's actually a fairly small PR, minus the hundreds of import changes.
The main changes are in these files:

-
[crates/ruff_linter/src/message/mod.rs](https://github.com/astral-sh/ruff/pull/18234/files#diff-139754ea310d75f28307008d21c771a190038bd106efe3b9267cc2d6c0fa0921)
-
[crates/ruff_diagnostics/src/lib.rs](https://github.com/astral-sh/ruff/pull/18234/files#diff-8e8ea5c586935bf21ea439f24253fcfd5955d2cb130f5377c2fa7bfee3ea3a81)
-
[crates/ruff_linter/src/diagnostic.rs](https://github.com/astral-sh/ruff/pull/18234/files#diff-1d0c9aad90d8f9446079c5be5f284150d97797158715bd9729e6f1f70246297a)
-
[crates/ruff_linter/src/lib.rs](https://github.com/astral-sh/ruff/pull/18234/files#diff-eb93ef7e78a612f5fa9145412c75cf6b1a5cefba1c2233e4a11a880a1ce1fbcc)

Test Plan
--

Existing tests
2025-05-28 09:27:09 -04:00
Brent Westbrook d6009eb942
Unify `Message` variants (#18051)
## Summary

This PR unifies the ruff `Message` enum variants for syntax errors and
rule violations into a single `Message` struct consisting of a shared
`db::Diagnostic` and some additional, optional fields used for some rule
violations.

This version of `Message` is nearly a drop-in replacement for
`ruff_diagnostics::Diagnostic`, which is the next step I have in mind
for the refactor.

I think this is also a useful checkpoint because we could possibly add
some of these optional fields to the new `Diagnostic` type. I think
we've previously discussed wanting support for `Fix`es, but the other
fields seem less relevant, so we may just need to preserve the `Message`
wrapper for a bit longer.

## Test plan

Existing tests

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
2025-05-19 13:34:04 -04:00
Micha Reiser 9ae698fe30
Switch to Rust 2024 edition (#18129) 2025-05-16 13:25:28 +02:00
Brent Westbrook 981bd70d39
Convert `Message::SyntaxError` to use `Diagnostic` internally (#17784)
## Summary

This PR is a first step toward integration of the new `Diagnostic` type
into ruff. There are two main changes:
- A new `UnifiedFile` enum wrapping `File` for red-knot and a
`SourceFile` for ruff
- ruff's `Message::SyntaxError` variant is now a `Diagnostic` instead of
a `SyntaxErrorMessage`

The second of these changes was mostly just a proof of concept for the
first, and it went pretty smoothly. Converting `DiagnosticMessage`s will
be most of the work in replacing `Message` entirely.

## Test Plan

Existing tests, which show no changes.

---------

Co-authored-by: Carl Meyer <carl@astral.sh>
Co-authored-by: Micha Reiser <micha@reiser.io>
2025-05-08 12:45:51 -04:00
Micha Reiser 1c65e0ad25
Split `SourceLocation` into `LineColumn` and `SourceLocation` (#17587) 2025-04-27 11:27:33 +01:00
Dhruv Manilawala e7b49694a7 Remove `E999` as a rule, disallow any disablement methods for syntax error (#11901)
## Summary

This PR updates the way syntax errors are handled throughout the linter.

The main change is that it's now not considered as a rule which involves
the following changes:
* Update `Message` to be an enum with two variants - one for diagnostic
message and the other for syntax error message
* Provide methods on the new message enum to query information required
by downstream usages

This means that the syntax errors cannot be hidden / disabled via any
disablement methods. These are:
1. Configuration via `select`, `ignore`, `per-file-ignores`, and their
`extend-*` variants
	```console
$ cargo run -- check ~/playground/ruff/src/lsp.py --extend-select=E999
--no-preview --no-cache
	    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.10s
Running `target/debug/ruff check /Users/dhruv/playground/ruff/src/lsp.py
--extend-select=E999 --no-preview --no-cache`
warning: Rule `E999` is deprecated and will be removed in a future
release. Syntax errors will always be shown regardless of whether this
rule is selected or not.
/Users/dhruv/playground/ruff/src/lsp.py:1:8: F401 [*] `abc` imported but
unused
	  |
	1 | import abc
	  |        ^^^ F401
	2 | from pathlib import Path
	3 | import os
	  |
	  = help: Remove unused import: `abc`
	```
3. Command-line flags via `--select`, `--ignore`, `--per-file-ignores`,
and their `--extend-*` variants
	```console
$ cargo run -- check ~/playground/ruff/src/lsp.py --no-cache
--config=~/playground/ruff/pyproject.toml
	    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.11s
Running `target/debug/ruff check /Users/dhruv/playground/ruff/src/lsp.py
--no-cache --config=/Users/dhruv/playground/ruff/pyproject.toml`
warning: Rule `E999` is deprecated and will be removed in a future
release. Syntax errors will always be shown regardless of whether this
rule is selected or not.
/Users/dhruv/playground/ruff/src/lsp.py:1:8: F401 [*] `abc` imported but
unused
	  |
	1 | import abc
	  |        ^^^ F401
	2 | from pathlib import Path
	3 | import os
	  |
	  = help: Remove unused import: `abc`
	```

This also means that the **output format** needs to be updated:
1. The `code`, `noqa_row`, `url` fields in the JSON output is optional
(`null` for syntax errors)
2. Other formats are changed accordingly
For each format, a new test case specific to syntax errors have been
added. Please refer to the snapshot output for the exact format for
syntax error message.

The output of the `--statistics` flag will have a blank entry for syntax
errors:
```
315     F821    [ ] undefined-name
119             [ ] syntax-error
103     F811    [ ] redefined-while-unused
```

The **language server** is updated to consider the syntax errors by
convert them into LSP diagnostic format separately.

### Preview

There are no quick fixes provided to disable syntax errors. This will
automatically work for `ruff-lsp` because the `noqa_row` field will be
`null` in that case.
<img width="772" alt="Screenshot 2024-06-26 at 14 57 08"
src="https://github.com/astral-sh/ruff/assets/67177269/aaac827e-4777-4ac8-8c68-eaf9f2c36774">

Even with `noqa` comment, the syntax error is displayed:
<img width="763" alt="Screenshot 2024-06-26 at 14 59 51"
src="https://github.com/astral-sh/ruff/assets/67177269/ba1afb68-7eaf-4b44-91af-6d93246475e2">

Rule documentation page:
<img width="1371" alt="Screenshot 2024-06-26 at 16 48 07"
src="https://github.com/astral-sh/ruff/assets/67177269/524f01df-d91f-4ac0-86cc-40e76b318b24">


## Test Plan

- [x] Disablement methods via config shows a warning
	- [x] `select`, `extend-select`
	- [ ] ~`ignore`~ _doesn't show any message_
- [ ] ~`per-file-ignores`, `extend-per-file-ignores`~ _doesn't show any
message_
- [x] Disablement methods via command-line flag shows a warning
	- [x] `--select`, `--extend-select`
	- [ ] ~`--ignore`~ _doesn't show any message_
- [ ] ~`--per-file-ignores`, `--extend-per-file-ignores`~ _doesn't show
any message_
- [x] File with syntax errors should exit with code 1
- [x] Language server
	- [x] Should show diagnostics for syntax errors
	- [x] Should not recommend a quick fix edit for adding `noqa` comment
	- [x] Same for `ruff-lsp`

resolves: #8447
2024-06-27 13:44:11 +02:00
Dhruv Manilawala 66179af4f1
Add `cell` field to JSON output format (#7664)
## Summary

This PR adds a new `cell` field to the JSON output format which
indicates the Notebook cell this diagnostic (and fix) belongs to. It
also updates the location for the diagnostic and fixes as per the
`NotebookIndex`. It will be used in the VSCode extension to display the
diagnostic in the correct cell.

The diagnostic and edit start and end source locations are translated
for the notebook as per the `NotebookIndex`. The end source location for
an edit needs some special handling.

### Edit end location

To understand this, the following context is required:

1. Visible lines in Jupyter Notebook vs JSON array strings: The newline
is part of the string in the JSON format. This means that if there are 3
visible lines in a cell where the last line is empty then the JSON would
contain 2 strings in the source array, both ending with a newline:

**JSON format:**
```json
[
	"# first line\n",
	"# second line\n",
]
```

**Notebook view:**
```python
1 # first line
2 # second line
3
```

2. If an edit needs to remove an entire line including the newline, then
the end location would be the start of the next row.

To remove a statement in the following code:
```python
import os
```

The edit would be:
```
start: row 1, col 1
end: row 2, col 1
```

Now, here's where the problem lies. The notebook index doesn't have any
information for row 2 because it doesn't exists in the actual notebook.
The newline was added by Ruff to concatenate the source code and it's
removed before writing back. But, the edit is computed looking at that
newline.

This means that while translating the end location for an edit belong to
a Notebook, we need to check if both the start and end location belongs
to the same cell. If not, then the end location should be the first
character of the next row and if so, translate that back to the last
character of the previous row. Taking the above example, the translated
location for Notebook would be:
```
start: row 1, col 1
end: row 1, col 10
```

## Test Plan

Add test cases for notebook output in the JSON format and update
existing snapshots.
2023-10-13 01:06:02 +00:00
Charlie Marsh 5849a75223
Rename `ruff` crate to `ruff_linter` (#7529) 2023-09-20 08:38:27 +02:00