To enable ruff_dev to automatically generate the rule Markdown tables in
the README the ruff library contained the following function:
Linter::codes() -> Vec<RuleSelector>
which was slightly changed to `fn prefixes(&self) -> Prefixes` in
9dc66b5a65 to enable ruff_dev to split
up the Markdown tables for linters that have multiple prefixes
(pycodestyle has E & W, Pylint has PLC, PLE, PLR & PLW).
The definition of this method was however largely redundant with the
#[prefix] macro attributes in the Linter enum, which are used to
derive the Linter::parse_code function, used by the --explain command.
This commit removes the redundant Linter::prefixes by introducing a
same-named method with a different signature to the RuleNamespace trait:
fn prefixes(&self) -> &'static [&'static str];
As well as implementing IntoIterator<Rule> for &Linter. We extend the
extisting RuleNamespace proc macro to automatically derive both
implementations from the Linter enum definition.
To support the previously mentioned Markdown table splitting we
introduce a very simple hand-written method to the Linter impl:
fn categories(&self) -> Option<&'static [LinterCategory]>;
ParseCode was a fitting name since the trait only contained a single
parse_code method ... since we now however want to introduce an
additional `prefixes` method RuleNamespace is more fitting.
Using Ident as the key type is inconvenient since creating an Ident
requires the specification of a Span, which isn't actually used by
the Hash implementation of Ident.
543865c96b introduced
RuleCode::origin() -> RuleOrigin generation via a macro, while that
signature now has been renamed to Rule::origin() -> Linter we actually
want to get rid of it since rules and linters shouldn't be this tightly
coupled (since one rule can exist in multiple linters).
Another disadvantage of the previous approach was that the prefixes
had to be defined in ruff_macros/src/prefixes.rs, which was easy to
miss when defining new linters in src/*, case in point
INP001 => violations::ImplicitNamespacePackage has in the meantime been
added without ruff_macros/src/prefixes.rs being updated accordingly
which resulted in `ruff --explain INP001` mistakenly reporting that the
rule belongs to isort (since INP001 starts with the isort prefix "I").
The derive proc macro introduced in this commit requires every variant
to have at least one #[prefix = "..."], eliminating such mistakes.
More accurate since the enum also encompasses:
* ALL (which isn't a prefix at all)
* fully-qualified rule codes (which aren't prefixes unless you say
they're a prefix to the empty string but that's not intuitive)
"origin" was accurate since ruff rules are currently always modeled
after one origin (except the Ruff-specific rules).
Since we however want to introduce a many-to-many mapping between codes
and rules, the term "origin" no longer makes much sense. Rules usually
don't have multiple origins but one linter implements a rule first and
then others implement it later (often inspired from another linter).
But we don't actually care much about where a rule originates from when
mapping multiple rule codes to one rule implementation, so renaming
RuleOrigin to Linter is less confusing with the many-to-many system.
As we surface rule names more to users we want
them to be easier to type than PascalCase.
Prior art:
Pylint and ESLint also use kebab-case for their rule names.
Clippy uses snake_case but only for syntactical reasons
(so that the argument to e.g. #![allow(clippy::some_lint)]
can be parsed as a path[1]).
[1]: https://doc.rust-lang.org/reference/paths.html
The idea is nice and simple we replace:
fn placeholder() -> Self;
with
fn message_formats() -> &'static [&'static str];
So e.g. if a Violation implementation defines:
fn message(&self) -> String {
format!("Local variable `{name}` is assigned to but never used")
}
it would also have to define:
fn message_formats() -> &'static [&'static str] {
&["Local variable `{name}` is assigned to but never used"]
}
Since we however obviously do not want to duplicate all of our format
strings we simply introduce a new procedural macro attribute
#[derive_message_formats] that can be added to the message method
declaration in order to automatically derive the message_formats
implementation.
This commit implements the macro. The following and final commit
updates violations.rs to use the macro. (The changes have been separated
because the next commit is autogenerated via a Python script.)
ruff_dev::generate_rules_table previously documented which rules are
autofixable via DiagnosticKind::fixable ... since the DiagnosticKind was
obtained via Rule::kind (and Violation::placeholder) which we both want
to get rid of we have to obtain the autofixability via another way.
This commit implements such another way by adding an AUTOFIX
associated constant to the Violation trait. The constant is of the type
Option<AutoFixkind>, AutofixKind is a new struct containing an
Availability enum { Sometimes, Always}, letting us additionally document
that some autofixes are only available sometimes (which previously
wasn't documented). We intentionally introduce this information in a
struct so that we can easily introduce further autofix metadata in the
future such as autofix applicability[1].
[1]: https://doc.rust-lang.org/stable/nightly-rustc/rustc_errors/enum.Applicability.html
This commit series removes the following associated
function from the Violation trait:
fn placeholder() -> Self;
ruff previously used this placeholder approach for the messages it
listed in the README and displayed when invoked with --explain <code>.
This approach is suboptimal for three reasons:
1. The placeholder implementations are completely boring code since they
just initialize the struct with some dummy values.
2. Displaying concrete error messages with arbitrary interpolated values
can be confusing for the user since they might not recognize that the
values are interpolated.
3. Some violations have varying format strings depending on the
violation which could not be documented with the previous approach
(while we could have changed the signature to return Vec<Self> this
would still very much suffer from the previous two points).
We therefore drop Violation::placeholder in favor of a new macro-based
approach, explained in commit 4/5.
Violation::placeholder is only invoked via Rule::kind, so we firstly
have to get rid of all Rule::kind invocations ... this commit starts
removing the trivial cases.
Since the UI still relies on the rule codes this improves the developer
experience by letting developers view the code of a Rule enum variant by
hovering over it.
This commit series refactors ruff to decouple "rules" from "rule codes",
in order to:
1. Make our code more readable by changing e.g.
RuleCode::UP004 to Rule::UselessObjectInheritance.
2. Let us cleanly map multiple codes to one rule, for example:
[UP004] in pyupgrade, [R0205] in pylint and [PIE792] in flake8-pie
all refer to the rule UselessObjectInheritance but ruff currently
only associates that rule with the UP004 code (since the
implementation was initially modeled after pyupgrade).
3. Let us cleanly map one code to multiple rules, for example:
[C0103] from pylint encompasses N801, N802 and N803 from pep8-naming.
The latter two steps are not yet implemented by this commit series
but this refactoring enables us to introduce such a mapping. Such a
mapping would also let us expand flake8_to_ruff to support e.g. pylint.
After the next commit which just does some renaming the following four
commits remove all trait derivations from the Rule (previously RuleCode)
enum that depend on the variant names to guarantee that they are not
used anywhere anymore so that we can rename all of these variants in the
eigth and final commit without breaking anything.
While the plan very much is to also surface these human-friendly names
more in the user interface this is not yet done in this commit series,
which does not change anything about the UI: it's purely a refactor.
[UP004]: pyupgrade doesn't actually assign codes to its messages.
[R0205]: https://pylint.pycqa.org/en/latest/user_guide/messages/refactor/useless-object-inheritance.html
[PIE792]: https://github.com/sbdchd/flake8-pie#pie792-no-inherit-object
[C0103]: https://pylint.pycqa.org/en/latest/user_guide/messages/convention/invalid-name.html
Implements [flake8-commas](https://github.com/PyCQA/flake8-commas). Fixes#1058.
The plugin is mostly redundant with Black (and also deprecated upstream), but very useful for projects which can't/won't use an auto-formatter.
This linter works on tokens. Before porting to Rust, I cleaned up the Python code ([link](https://gist.github.com/bluetech/7c5dcbdec4a73dd5a74d4bc09c72b8b9)) and made sure the tests pass. In the Rust version I tried to add explanatory comments, to the best of my understanding of the original logic.
Some changes I did make:
- Got rid of rule C814 - "missing trailing comma in Python 2". Ruff doesn't support Python 2.
- Merged rules C815 - "missing trailing comma in Python 3.5+" and C816 - "missing trailing comma in Python 3.6+" into C812 - "missing trailing comma". These Python versions are outdated, didn't think it was worth the complication.
- Added autofixes for C812 and C819.
Autofix is missing for C818 - "trailing comma on bare tuple prohibited". It needs to turn e.g. `x = 1,` into `x = (1, )`, it's a bit difficult to do with tokens only, so I skipped it for now.
I ran the rules on cpython/Lib and on a big internal code base and it works as intended (though I only sampled the diffs).
define_rule_mapping! was previously implemented as a declarative macro,
which was however partially relying on an origin_by_code! proc macro
because declarative macros cannot match on substrings of identifiers.
Currently all define_rule_mapping! lines look like the following:
TID251 => violations::BannedApi,
TID252 => violations::BannedRelativeImport,
We want to break up violations.rs, moving the violation definitions to
the respective rule modules. To do this we want to change the previous
lines to:
TID251 => rules::flake8_tidy_imports::banned_api::BannedApi,
TID252 => rules::flake8_tidy_imports::relative_imports::RelativeImport,
This however doesn't work because the define_rule_mapping! macro is
currently defined as:
($($code:ident => $mod:ident::$name:ident,)+) => { ... }
That is it only supported $module::$name but not longer paths with
multiple modules. While we could define `=> $path:path`[1] then we
could no longer access the last path segment, which we need because
we use it for the DiagnosticKind variant names. And
`$path:path::$last:ident` doesn't work either because it would be
ambiguous (Rust wouldn't know where the path ends ... so path fragments
have to be followed by some punctuation/keyword that may not be part of
paths). And we also cannot just introduce a procedural macro like
path_basename!(...) because the following is not valid Rust code:
enum Foo { foo!(...), }
(macros cannot be called in the place where you define variants.)
So we have to convert define_rule_mapping! into a proc macro in order to
support paths of arbitrary length and this commit implements that.
[1]: https://doc.rust-lang.org/reference/macros-by-example.html#metavariables
We don't have any doctests, but `cargo test --all` spends more than half
the time on doctests? A little confusing, but this brings the test time
from > 4s to < 2s on my machine.