## Summary
Implement `write-whole-file` (`FURB103`), part of #1348. This is largely
a copy and paste of `read-whole-file` #7682.
## Test Plan
Text fixture added.
---------
Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
## Summary
The purpose of this change is mainly to address one of the issues
outlined in #10427. Namely, some lists in the docs were not rendering
properly when preceded by a text block without a newline character. This
PR adds `mdformat` as a final step to the rule documentation script, so
that any missing newlines will be added.
NB: The default behavior of `mdformat` is to escape markdown special
characters found in text such as `<`. This resulted in some misformatted
docs. To address this I implemented an ad-hoc mdformat plugin to
override the behavior. This may be considered a bit 'hacky', but I think
it's a good solution. Nevertheless, if someone has a better idea, let me
know.
## Test Plan
This change is hard to test systematically, however, I tried my best to
look at the before and after diffs to ensure no unwanted changes were
made to the docs.
## Summary
Open files with utf8 encoding when writing files in generate_mkdocs.py.
The following can happen otherwise.
```
../ruff> python scripts/generate_mkdocs.py
Finished dev [unoptimized + debuginfo] target(s) in 0.25s
Running `target\debug\ruff_dev.exe generate-docs`
Traceback (most recent call last):
File "C:\..\ruff\scripts\generate_mkdocs.py", line 185, in <module>
main()
File "C:\..\scripts\generate_mkdocs.py", line 141, in main
f.write(clean_file_content(file_content, title))
File "C:\..\AppData\Local\Programs\Python\Python310\lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 1396-1397: character maps to <undefined>
```
I could not determine which character was causing the issue, but opening
with utf8 encoding fixed it.
## Test Plan
Condering the change is small, I simply ran the file and confirmed it
worked, but opened to suggestion on more robust testing.
## Summary
Closes#9783. Feels hacky because of the different key so there *might*
be a nicer way to do this 😄
## Test Plan
Tested locally with `mkdocs serve`.
## Summary
SchemaStore now depends on some Prettier plugins, so we need to install
Prettier and its plugins as part of the project (rather than relying on
a global install).
## Test Plan
Ran the script!
## Summary
Long ago, we had a single `ruff` crate. We started to break that up, and
at some point, we wanted to separate the CLI from the core library. So
we created `ruff_cli`, which created a `ruff` binary. Later, the `ruff`
crate was renamed to `ruff_linter` and further broken up into additional
crates.
(This is all from memory -- I didn't bother to look through the history
to ensure that this is 100% correct :))
Now that `ruff` no longer exists, this PR renames `ruff_cli` to `ruff`.
The primary benefit is that the binary target and the crate name are now
the same, which helps with downstream tooling like `cargo-dist`, and
also removes some complexity from the crate and `Cargo.toml` itself.
## Test Plan
- Ran `rm -rf target/release`.
- Ran `cargo build --release`.
- Verified that `./target/release/ruff` was created.
We've had bugs related to non-latin scripts, most recently #9145, where
just starting a docstring with multi-byte characters would panic. I've
added https://github.com/binary-husky/gpt_academic to catch those in the
ecosystem checks, it's a popular repo with mixed english and chinese
comments and symbols.
When using the autofixer for `Q000` it does not remove the backslashes
from quotes that no longer need escaping.
This new rule checks for such backslashes (regardless whether they come
from the autofixer or not) and can remove them.
fixes#8617
## Summary
This adds redirects from, e.g., `https://docs.astral.sh/ruff/rules/F401`
to `https://docs.astral.sh/ruff/rules/unused-import`, which are
generated automatically when creating the documentation. Though we want
to move towards human-readable names eventually, I think this is a nice
and user-friendly change (and doesn't require any fancy infrastructure,
since the redirects are handled via a plugin and added client-side).
Closes#4710.
Changes the title and adds some notes re the old formatter ecosystem
checks in light of #8223
Does not remove it as I'm not sure where else we test for instabilities.
Closes#7239
- Refactors `scripts/check_ecosystem.py` into a new Python project at
`python/ruff-ecosystem`
- Includes
[documentation](https://github.com/astral-sh/ruff/blob/zanie/ecosystem-format/python/ruff-ecosystem/README.md)
now
- Provides a `ruff-ecosystem` CLI
- Fixes bug where `ruff check` report included "fixable" summary line
- Adds truncation to `ruff check` reports
- Otherwise we often won't see the `ruff format` reports
- The truncation uses some very simple heuristics and could be improved
in the future
- Identifies diagnostic changes that occur just because a violation's
fix available changes
- We still show the diff for the line because it's could matter _where_
this changes, but we could improve this
- Similarly, we could improve detection of diagnostic changes where just
the message changes
- Adds support for JSON ecosystem check output
- I added this primarily for development purposes
- If there are no changes, only errors while processing projects, we
display a different summary message
- When caching repositories, we now checkout the requested ref
- Adds `ruff format` reports, which format with the baseline then the
use `format --diff` to generate a report
- Runs all CI jobs when the CI workflow is changed
## Known problems
- Since we must format the project to get a baseline, the permalink line
numbers do not exactly correspond to the correct range
- This looks... hard. I tried using `git diff` and some wonky hunk
matching to recover the original line numbers but it doesn't seem worth
it. I think we should probably commit the formatted changes to a fork or
something if we want great results here. Consequently, I've just used
the start line instead of a range for now.
- I don't love the comment structure — it'd be nice, perhaps, to have
separate headings for the linter and formatter.
- However, the `pr-comment` workflow is an absolute pain to change
because it runs _separately_ from this pull request so I if I want to
make edits to it I can only test it via manual workflow dispatch.
- Lines are not printed "as we go" which means they're all held in
memory, presumably this would be a problem for large-scale ecosystem
checks
- We are encountering a hard limit with the maximum comment length
supported by GitHub. We will need to move the bulk of the report
elsewhere.
## Future work
- Update `ruff-ecosystem` to support non-default projects and
`check_ecosystem_all.py` behavior
- Remove existing ecosystem check scripts
- Add preview mode toggle (#8076)
- Add a toggle for truncation
- Add hints for quick reproduction of runs locally
- Consider parsing JSON output of Ruff instead of using regex to parse
the text output
- Links to project repositories should use the commit hash we checked
against
- When caching repositories, we should pull the latest changes for the
ref
- Sort check diffs by path and rule code only (changes in messages
should not change order)
- Update check diffs to distinguish between new violations and changes
in messages
- Add "fix" diffs
- Remove existing formatter similarity reports
- On release pull request, compare to the previous tag instead
---------
Co-authored-by: konsti <konstin@mailbox.org>
**Summary** Prepare for the black preview style becoming the black
stable style at the end of the year.
This adds a new test file to compare stable and preview on some relevant
preview options in black, and makes `format_dev` understand the black
preview flag. I've added poetry as a project that uses preview.
I've implemented one specific deviation (collapsing of stub
implementation in non-stub files) which showed up in poetry for testing.
This also improves poetry compatibility from 0.99891 to 0.99919.
Fixes#7440
New compatibility stats:
| project | similarity index | total files | changed files |
|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75803 | 1799 | 1647 |
| django | 0.99983 | 2772 | 35 |
| home-assistant | 0.99953 | 10596 | 189 |
| poetry | 0.99919 | 317 | 12 |
| transformers | 0.99963 | 2657 | 332 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99978 | 3669 | 20 |
| warehouse | 0.99969 | 654 | 15 |
| zulip | 0.99970 | 1459 | 22 |
## Summary
This PR updates our documentation for the upcoming formatter release.
Broadly, the documentation is now structured as follows:
- Overview
- Tutorial
- Installing Ruff
- The Ruff Linter
- Overview
- `ruff check`
- Rule selection
- Error suppression
- Exit codes
- The Ruff Formatter
- Overview
- `ruff format`
- Philosophy
- Configuration
- Format suppression
- Exit codes
- Black compatibility
- Known deviations
- Configuring Ruff
- pyproject.toml
- File discovery
- Configuration discovery
- CLI
- Shell autocompletion
- Preview
- Rules
- Settings
- Integrations
- `pre-commit`
- VS Code
- LSP
- PyCharm
- GitHub Actions
- FAQ
- Contributing
The major changes include:
- Removing the "Usage" section from the docs, and instead folding that
information into "Integrations" and the new Linter and Formatter
sections.
- Breaking up "Configuration" into "Configuring Ruff" (for generic
configuration), and new Linter- and Formatter-specific sections.
- Updating all example configurations to use `[tool.ruff.lint]` and
`[tool.ruff.format]`.
My suggestion is to pull and build the docs locally, and review by
reading them in the browser rather than trying to parse all the code
changes.
Closes https://github.com/astral-sh/ruff/issues/7235.
Closes https://github.com/astral-sh/ruff/issues/7647.
- Add changelog entry for 0.1.1
- Bump version to 0.1.1
- Require preview for fix added in #7967
- Allow duplicate headings in changelog (markdownlint setting)
## Summary
We don't enable E501 by default, but `line-length` is a useful example
for configuration, so we now set `--extend-select` in the tutorial with
a note to that effect.
I've also updated all the outputs to match the latest CLI behavior, and
changed the example from `List` to `Sequence` because `List` now spits
out two diagnostics (one for the import, one for the usage), which IMO
is confusing for beginners.
## Summary
The markdown documentation was present, but in the wrong place, so was
not displaying on the website. I moved it and added some references.
Related to #2646.
## Test Plan
`python scripts/check_docs_formatted.py`
## Summary
Fix CI (broken in #7496).
The code snippet was formatted as Black would format a stub file, but
the CI script doesn't know that (it assumes all code snippets are
non-stub files). Easier to ignore.
Sorry for breaking CI!
## Test Plan
`python scripts/check_docs_formatted.py`
## Summary
We're planning to move the documentation from
[https://beta.ruff.rs/docs](https://beta.ruff.rs/docs) to
[https://docs.astral.sh/ruff](https://docs.astral.sh/ruff), for a few
reasons:
1. We want to remove the `beta` from the domain, as Ruff is no longer
considered beta software.
2. We want to migrate to a structure that could accommodate multiple
future tools living under one domain.
The docs are actually already live at
[https://docs.astral.sh/ruff](https://docs.astral.sh/ruff), but later
today, I'll add a permanent redirect from the previous to the new
domain. **All existing links will continue to work, now and in
perpetuity.**
This PR contains the code changes necessary for the updated
documentation. As part of this effort, I moved the playground and
documentation from my personal Cloudflare account to our team Cloudflare
account (hence the new `--project-name` references). After merging, I'll
also update the secrets on this repo.
## Summary
This PR adds a benchmarking script for the formatter, which benchmarks
the Ruff formatter against Black, yapf, and autopep8.
Three benchmarks are included:
1. Format everything.
2. Format everything, but use a single thread.
3. Format everything, but `--check` (don't write to disk).
There's some nuance in figuring out the right combination of arguments
to each command, but the _main_ nuance is to ensure that we always run
the given formatter (and modify the target repo in-place) prior to
benchmarking it, so that the formatters aren't disadvantaged by the
existing formatting of the target repo. (E.g.: prior to benchmarking
Black's preview style, we need to make sure we format the target repo
with Black's preview style; otherwise, preview style appears much
slower.)
Part of https://github.com/astral-sh/ruff/issues/7309.
With https://github.com/django/django/pull/17181 merged, this removes an
odd edge case (tuple expression statements aka bogus trailing commas
after statements that turn them into a tuple without you noticing) that
we don't want to care about because the input code is ~wrong from the
similarity index. I've took this opportunity to update the revisions of
all projects we test.
main
| project | similarity index |
|--------------|------------------|
| cpython | 0.75477 |
| django | 0.99814 |
| transformers | 0.99621 |
| twine | 0.99876 |
| typeshed | 0.99953 |
| warehouse | 0.99601 |
| zulip | 0.99727 |
this PR
| project | similarity index |
|--------------|------------------|
| cpython | 0.75996 |
| django | 0.99819 |
| transformers | 0.99622 |
| twine | 0.99876 |
| typeshed | 0.99953 |
| warehouse | 0.99607 |
| zulip | 0.99729 |
## Summary
In #6387, we accidentally added `git -C "$dir/django" checkout
95e4d6b81312fdd9f8ebf3385be1c1331168b5cf` as the transformers checkout
(duplicated line from the Django case). This PR fixes the SHA, and
spaces out the cases to make it more visible. I _think_ the net effect
here is that we've been formatting `main` on transformers, rather than
the SHA?
Adding five new projects. Some of these have seen issues filed, the
others, I just tabbed through our dependency pain and looked for some
reasonably-large projects that enabled rules beyond the default rule
set.
From the formatter progress CI logs:
```
2023-08-07T03:49:02.5178602Z + mkdir -p /home/runner/work/ruff/ruff/target/progress_projects
2023-08-07T03:49:02.5193474Z + '[' '!' -d /home/runner/work/ruff/ruff/target/progress_projects/build ']'
2023-08-07T03:49:02.5194228Z + '[' '!' -d /home/runner/work/ruff/ruff/target/progress_projects/django ']'
2023-08-07T03:49:02.5194966Z + git clone --filter=tree:0 https://github.com/django/django /home/runner/work/ruff/ruff/target/progress_projects/django
2023-08-07T03:49:02.5209260Z Cloning into '/home/runner/work/ruff/ruff/target/progress_projects/django'...
```
```
2023-08-07T03:51:17.4726088Z [2m2023-08-07T03:51:17.472404Z[0m [31mERROR[0m Failed /home/runner/work/ruff/ruff/target/progress_projects/build: no python files in ["/home/runner/work/ruff/ruff/target/progress_projects/build"]
```
Seems that build exists but is an empty cached folder. These changes
should fix this by a) checking for `.git` instead of just the folder
existing b) running the commit checkout unconditionally. The latter is
also important if we ever want to update the SHAs.
**Summary** Prompted by
https://github.com/astral-sh/ruff/pull/6257#issuecomment-1661308410, it
tried to make the ecosystem script output on failure better
understandable. All log messages are now written to a file, which is
printed on error. Running locally progress is still shown.
Looking through the log output i saw that we currently log syntax errors
in input, which is confusing because they aren't actual errors, but we
don't check that these files don't change due to parser regressions or
improvements. I added `--files-with-errors` to catch that.
**Test Plan** CI
Adds rule to convert type aliases defined with annotations i.e. `x:
TypeAlias = int` to the new PEP-695 syntax e.g. `type x = int`.
Does not support using new generic syntax for type variables, will be
addressed in a follow-up.
Added as part of pyupgrade — ~the code 100 as chosen to avoid collision
with real pyupgrade codes~.
Part of #4617
Builds on #5062
## Summary
This PR implements pycodestyle's E241 (tab after comma) and E242
(multiple whitespace after comma) lints.
These are marked as nursery rules like many other pycodestyle rules.
Refs #2402
## Test Plan
E24.py copied from pycodestyle.
**Summary**
Updated doc comments for `missing_whitespace_around_operator.rs`. Online
docs also benefit from this update.
**Test Plan**
Checked docs via
[mkdocs](389fe13c93/CONTRIBUTING.md?plain=1#L267-L296)
## Summary
This is an error message only change to lead an implementor of a new
rule that has an unformatted or invalid bad example to the
right code.
## Test Plan
n/a
## Summary
Updated doc comment for `tab_indentation.rs`. Online docs also benefit
from this update.
## Test Plan
Checked docs via
[mkdocs](389fe13c93/CONTRIBUTING.md?plain=1#L267-L296)
**Summary** Add a formatter progress testing script to CI. This script
will 1) print the black compability on each run 2) catch regressions wrt
to formatter stability, emitting invalid syntax and other kinds of bugs
(e.g. #5917) before they land on main 3) have an additional layer of
real world tests when implementing new nodes or other new formatter
code.
This is currently a bash script, i'm not sure if we want to keep it that
way, or switch to e.g. the regular ecosystem scripts. The output
separation of `format_dev` could also use some polishing. We should also
consider pinning commits so we don't get spurious regression when they
change their code.
**Test Plan** The script extends CI.
I don't know whether we want to make this change but here's some data...
Binary size:
- `main`: 30,384
- `charlie/match-phf`: 30,416
llvm-lines:
- `main`: 1,784,148
- `charlie/match-phf`: 1,789,877
llvm-lines and binary size are both unchanged (or, by < 5) when moving
from `u8` to `u32` return types, and even when moving to `char` keys and
values. I didn't expect this, but I'm not very knowledgable on this
topic.
Performance:
```
Confusables/match/src time: [4.9102 µs 4.9352 µs 4.9777 µs]
change: [+1.7469% +2.2421% +2.8710%] (p = 0.00 < 0.05)
Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
2 (2.00%) low mild
4 (4.00%) high mild
6 (6.00%) high severe
Confusables/match-with-skip/src
time: [2.0676 µs 2.0945 µs 2.1317 µs]
change: [+0.9384% +1.6000% +2.3920%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
3 (3.00%) high mild
5 (5.00%) high severe
Confusables/phf/src time: [31.087 µs 31.188 µs 31.305 µs]
change: [+1.9262% +2.2188% +2.5496%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
3 (3.00%) low mild
6 (6.00%) high mild
6 (6.00%) high severe
Confusables/phf-with-skip/src
time: [2.0470 µs 2.0486 µs 2.0502 µs]
change: [-0.3093% -0.1446% +0.0106%] (p = 0.08 > 0.05)
No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) high mild
2 (2.00%) high severe
```
The `-with-skip` variants add our optimization which first checks
whether the character is ASCII. So `match` is way, way faster than PHF,
but it tends not to matter since almost all source code is ASCII anyway.
## Summary
The `add_rule.py` script would create a test case that pointed to a file
that didn't exist when the linter is set to `"pylint"`. This PR fixes
that.
## Test Plan
`python scripts/add_rule.py --name DoTheThing --prefix PL --code C0999
--linter pylint`
## Summary
These changes make `scripts/ecosystem_all_check.sh --select ALL` work
again, i forgot to update this script to the new directory structure
from #5299 because it's only run manually
## Test Plan
n/a
## Summary
Check for `Any` in other types for `ANN401`. This reuses the logic from
`implicit-optional` rule to resolve the type to `Any`.
Following types are supported:
* `Union[Any, ...]`
* `Any | ...`
* `Optional[Any]`
* `Annotated[<any of the above variant>, ...]`
* Forward references i.e., `"Any | ..."`
## Test Plan
Added test cases for various combinations.
fixes: #5458
## Summary
Add links for ecosystem check result. This is useful for developers to
quickly check the added/removed violations with a single click.
There are a few downsides of this approach:
* Syntax highlighting is not available for the output
* Content length is increased because of the additional anchor tags
## Test Plan
`python scripts/check_ecosystem.py ./target/debug/ruff ../ruff-test/target/debug/ruff`
<details><summary>Example Output:</summary>
ℹ️ ecosystem check **detected changes**. (+6, -0, 0 error(s))
<details><summary>airflow (+1, -0)</summary>
<p>
<pre>
+ <a
href='https://github.com/apache/airflow/blob/main/dev/breeze/src/airflow_breeze/commands/release_management_commands.py#L654'>dev/breeze/src/airflow_breeze/commands/release_management_commands.py:654:25:</a>
PERF401 Use a list comprehension to create a transformed list
</pre>
</p>
</details>
<details><summary>bokeh (+3, -0)</summary>
<p>
<pre>
+ <a
href='https://github.com/bokeh/bokeh/blob/branch-3.2/src/bokeh/model/model.py#L315'>src/bokeh/model/model.py:315:17:</a>
PERF401 Use a list comprehension to create a transformed list
+ <a
href='https://github.com/bokeh/bokeh/blob/branch-3.2/src/bokeh/resources.py#L470'>src/bokeh/resources.py:470:25:</a>
PERF401 Use a list comprehension to create a transformed list
+ <a
href='https://github.com/bokeh/bokeh/blob/branch-3.2/src/bokeh/sphinxext/bokeh_sampledata_xref.py#L134'>src/bokeh/sphinxext/bokeh_sampledata_xref.py:134:17:</a>
PERF401 Use a list comprehension to create a transformed list
</pre>
</p>
</details>
<details><summary>zulip (+2, -0)</summary>
<p>
<pre>
+ <a
href='https://github.com/zulip/zulip/blob/main/zerver/actions/create_user.py#L197'>zerver/actions/create_user.py:197:17:</a>
PERF401 Use a list comprehension to create a transformed list
+ <a
href='https://github.com/zulip/zulip/blob/main/zerver/lib/markdown/__init__.py#L2412'>zerver/lib/markdown/__init__.py:2412:13:</a>
PERF401 Use a list comprehension to create a transformed list
</pre>
</p>
</details>
</details>
---------
Co-authored-by: konsti <konstin@mailbox.org>
## Summary
I'll write up a more detailed description tomorrow, but in short, this
PR removes our regex-based implementation in favor of "manual" parsing.
I tried a couple different implementations. In the benchmarks below:
- `Directive/Regex` is our implementation on `main`.
- `Directive/Find` just uses `text.find("noqa")`, which is insufficient,
since it doesn't cover case-insensitive variants like `NOQA`, and
doesn't handle multiple `noqa` matches in a single like, like ` # Here's
a noqa comment # noqa: F401`. But it's kind of a baseline.
- `Directive/Memchr` uses three `memchr` iterative finders (one for
`noqa`, `NOQA`, and `NoQA`).
- `Directive/AhoCorasick` is roughly the variant checked-in here.
The raw results:
```
Directive/Regex/# noqa: F401
time: [273.69 ns 274.71 ns 276.03 ns]
change: [+1.4467% +1.8979% +2.4243%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
3 (3.00%) low mild
8 (8.00%) high mild
4 (4.00%) high severe
Directive/Find/# noqa: F401
time: [66.972 ns 67.048 ns 67.132 ns]
change: [+2.8292% +2.9377% +3.0540%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
1 (1.00%) low severe
3 (3.00%) low mild
8 (8.00%) high mild
3 (3.00%) high severe
Directive/AhoCorasick/# noqa: F401
time: [76.922 ns 77.189 ns 77.536 ns]
change: [+0.4265% +0.6862% +0.9871%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low mild
3 (3.00%) high mild
4 (4.00%) high severe
Directive/Memchr/# noqa: F401
time: [62.627 ns 62.654 ns 62.679 ns]
change: [-0.1780% -0.0887% -0.0120%] (p = 0.03 < 0.05)
Change within noise threshold.
Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) low severe
5 (5.00%) low mild
3 (3.00%) high mild
2 (2.00%) high severe
Directive/Regex/# noqa: F401, F841
time: [321.83 ns 322.39 ns 322.93 ns]
change: [+8602.4% +8623.5% +8644.5%] (p = 0.00 < 0.05)
Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
1 (1.00%) low severe
2 (2.00%) low mild
1 (1.00%) high mild
1 (1.00%) high severe
Directive/Find/# noqa: F401, F841
time: [78.618 ns 78.758 ns 78.896 ns]
change: [+1.6909% +1.8771% +2.0628%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild
Directive/AhoCorasick/# noqa: F401, F841
time: [87.739 ns 88.057 ns 88.468 ns]
change: [+0.1843% +0.4685% +0.7854%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 11 outliers among 100 measurements (11.00%)
5 (5.00%) low mild
3 (3.00%) high mild
3 (3.00%) high severe
Directive/Memchr/# noqa: F401, F841
time: [80.674 ns 80.774 ns 80.860 ns]
change: [-0.7343% -0.5633% -0.4031%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 14 outliers among 100 measurements (14.00%)
4 (4.00%) low severe
9 (9.00%) low mild
1 (1.00%) high mild
Directive/Regex/# noqa time: [194.86 ns 195.93 ns 196.97 ns]
change: [+11973% +12039% +12103%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
5 (5.00%) low mild
1 (1.00%) high mild
Directive/Find/# noqa time: [25.327 ns 25.354 ns 25.383 ns]
change: [+3.8524% +4.0267% +4.1845%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
6 (6.00%) high mild
3 (3.00%) high severe
Directive/AhoCorasick/# noqa
time: [34.267 ns 34.368 ns 34.481 ns]
change: [+0.5646% +0.8505% +1.1281%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
5 (5.00%) high mild
Directive/Memchr/# noqa time: [21.770 ns 21.818 ns 21.874 ns]
change: [-0.0990% +0.1464% +0.4046%] (p = 0.26 > 0.05)
No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
4 (4.00%) low mild
4 (4.00%) high mild
2 (2.00%) high severe
Directive/Regex/# type: ignore # noqa: E501
time: [278.76 ns 279.69 ns 280.72 ns]
change: [+7449.4% +7469.8% +7490.5%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) low mild
1 (1.00%) high mild
1 (1.00%) high severe
Directive/Find/# type: ignore # noqa: E501
time: [67.791 ns 67.976 ns 68.184 ns]
change: [+2.8321% +3.1735% +3.5418%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
5 (5.00%) high mild
1 (1.00%) high severe
Directive/AhoCorasick/# type: ignore # noqa: E501
time: [75.908 ns 76.055 ns 76.210 ns]
change: [+0.9269% +1.1427% +1.3955%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
Directive/Memchr/# type: ignore # noqa: E501
time: [72.549 ns 72.723 ns 72.957 ns]
change: [+1.5881% +1.9660% +2.3974%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
10 (10.00%) high mild
5 (5.00%) high severe
Directive/Regex/# type: ignore # nosec
time: [66.967 ns 67.075 ns 67.207 ns]
change: [+1713.0% +1715.8% +1718.9%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
1 (1.00%) low severe
3 (3.00%) low mild
2 (2.00%) high mild
4 (4.00%) high severe
Directive/Find/# type: ignore # nosec
time: [18.505 ns 18.548 ns 18.597 ns]
change: [+1.3520% +1.6976% +2.0333%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild
Directive/AhoCorasick/# type: ignore # nosec
time: [16.162 ns 16.206 ns 16.252 ns]
change: [+1.2919% +1.5587% +1.8430%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
Directive/Memchr/# type: ignore # nosec
time: [39.192 ns 39.233 ns 39.276 ns]
change: [+0.5164% +0.7456% +0.9790%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 13 outliers among 100 measurements (13.00%)
2 (2.00%) low severe
4 (4.00%) low mild
3 (3.00%) high mild
4 (4.00%) high severe
Directive/Regex/# some very long comment that # is interspersed with characters but # no directive
time: [81.460 ns 81.578 ns 81.703 ns]
change: [+2093.3% +2098.8% +2104.2%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) low mild
2 (2.00%) high mild
Directive/Find/# some very long comment that # is interspersed with characters but # no directive
time: [26.284 ns 26.331 ns 26.387 ns]
change: [+0.7554% +1.1027% +1.3832%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
5 (5.00%) high mild
1 (1.00%) high severe
Directive/AhoCorasick/# some very long comment that # is interspersed with characters but # no direc...
time: [28.643 ns 28.714 ns 28.787 ns]
change: [+1.3774% +1.6780% +2.0028%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
Directive/Memchr/# some very long comment that # is interspersed with characters but # no directive
time: [55.766 ns 55.831 ns 55.897 ns]
change: [+1.5802% +1.7476% +1.9021%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) low mild
```
While memchr is faster than aho-corasick in some of the common cases
(like `# noqa: F401`), the latter is way, way faster when there _isn't_
a match (like 2x faster -- see the last two cases). Since most comments
_aren't_ `noqa` comments, this felt like the right tradeoff. Note that
all implementations are significantly faster than the regex version.
(I know I originally reported a 10x speedup, but I ended up improving
the regex version a bit in some prior PRs, so it got unintentionally
faster via some refactors.)
There's also one behavior change in here, which is that we now allow
variable spaces, e.g., `#noqa` or `# noqa`. Previously, we required
exactly one space. This thus closes#5177.
## Summary
Adds `PERF401` and `PERF402` mirroring `W8401` and `W8402` from
https://github.com/tonybaloney/perflint
Implementation is not super smart but should be at parity with upstream
implementation judging by:
c07391c176/perflint/comprehension_checker.py (L42-L73)
It essentially checks:
- If the body of a for-loop is just one statement
- If that statement is an `if` and the if-statement contains a call to
`append()` we flag `PERF401` and suggest a list comprehension
- If that statement is a plain call to `append()` or `insert()` we flag
`PERF402` and suggest `list()` or `list.copy()`
I've set the violation to only flag the first append call in a long
`if-else` statement for `PERF401`. Happy to change this to some other
location or make it multiple violations if that makes more sense.
## Test Plan
Fixtures were added with the relevant scenarios for both rules
## Issue Links
Refers: https://github.com/astral-sh/ruff/issues/4789
## Summary
Implements PERF203 from #4789, which throws if a `try/except` block is
inside of a loop. Not sure if we want to extend the diagnostic to the
`except` as well, but I thought that that may get a little messy. We may
also want to just throw on the word `try` - open to suggestions though.
## Test Plan
`cargo test`
## Summary
Add documentation to the `D3XX` rules that check for issues with
docstring quotes. Related to #2646.
## Test Plan
`python scripts/check_docs_formatted.py`
## Summary
Fix a variable name in the `add_plugin.py` script.
## Test Plan
I don't think there are any tests for the scripts, other than manual
confirmation
## Summary
This contains three changes:
* repos in `check_ecosystem.py` are stored as `org:name` instead of
`org/name` to create a flat directory layout
* `check_ecosystem.py` performs a maximum of 50 parallel jobs at the
same time to avoid consuming to much RAM
* `check-formatter-stability` gets a new option `--multi-project` so
it's possible to do `cargo run --bin ruff_dev --
check-formatter-stability --multi-project target/checkouts`
With these three changes it becomes easy to check the formatter
stability over a larger number of repositories. This is part of the
integration of integrating formatter regressions checks into the
ecosystem checks.
## Test Plan
```shell
python scripts/check_ecosystem.py --checkouts target/checkouts --projects github_search.jsonl -v $(which true) $(which true)
cargo run --bin ruff_dev -- check-formatter-stability --multi-project target/checkouts
```
## Summary
Add copyright notice detection to enforce the presence of copyright
headers in Python files.
Configurable settings include: the relevant regular expression, the
author name, and the minimum file size, similar to
[flake8-copyright](https://github.com/savoirfairelinux/flake8-copyright).
Closes https://github.com/charliermarsh/ruff/issues/3579
---------
Signed-off-by: ryan <ryang@waabi.ai>
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
* Use phf for confusables to reduce llvm lines
## Summary
This replaces FxHashMap for the confusables with a perfect hash map from the [phf crate](https://github.com/rust-phf/rust-phf) to reduce the generated llvm instructions.
A perfect hash function is one that doesn't have any collisions. We can build one because we know all keys at compile time. This improves hashmap efficiency, even though this is likely not noticeable in our case (except someone has a large non-english crate to test on).
The original hashmap contained a lot of duplicates, which i had to remove when phf_map complained, i did so by sorting the keys.
The important part that it reduces the llvm instructions generated (#3808, `RUSTFLAGS="-Csymbol-mangling-version=v0" cargo llvm-lines -p ruff --lib | head -20`):
```
Lines Copies Function name
----- ------ -------------
1740502 38973 (TOTAL)
27423 (1.6%, 1.6%) 1 (0.0%, 0.0%) ruff[cef4c65d96248843]::rules::ruff::rules::confusables::CONFUSABLES::{closure#0}
10193 (0.6%, 2.2%) 1 (0.0%, 0.0%) <ruff[cef4c65d96248843]::codes::RuleCodePrefix>::iter
8107 (0.5%, 2.6%) 1 (0.0%, 0.0%) <ruff[cef4c65d96248843]::codes::Rule>::noqa_code
7345 (0.4%, 3.0%) 1 (0.0%, 0.0%) <ruff[cef4c65d96248843]::checkers::ast::Checker as ruff_python_ast[3778b140caf21545]::visitor::Visitor>::visit_stmt
6412 (0.4%, 3.4%) 1 (0.0%, 0.0%) <<ruff[cef4c65d96248843]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:spanned::SpannedDeserializer<toml_edit[7e3a6c5e67260672]:🇩🇪:value::ValueDeserializer>>
6412 (0.4%, 3.8%) 1 (0.0%, 0.0%) <<ruff[cef4c65d96248843]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:table::TableMapAccess>
6409 (0.4%, 4.2%) 1 (0.0%, 0.0%) <<ruff[cef4c65d96248843]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:datetime::DatetimeDeserializer>
5696 (0.3%, 4.5%) 1 (0.0%, 0.0%) <ruff[cef4c65d96248843]::checkers::ast::Checker as ruff_python_ast[3778b140caf21545]::visitor::Visitor>::visit_expr
4448 (0.3%, 4.7%) 1 (0.0%, 0.0%) ruff[cef4c65d96248843]::flake8_to_ruff::converter::convert
3702 (0.2%, 4.9%) 1 (0.0%, 0.0%) <&ruff[cef4c65d96248843]::registry::Linter as core[da82827a87f140f9]::iter::traits::collect::IntoIterator>::into_iter
3349 (0.2%, 5.1%) 1 (0.0%, 0.0%) <ruff[cef4c65d96248843]::registry::Linter>::code_for_rule
3132 (0.2%, 5.3%) 1 (0.0%, 0.0%) <ruff[cef4c65d96248843]::codes::Rule as core[da82827a87f140f9]::fmt::Debug>::fmt
3130 (0.2%, 5.5%) 1 (0.0%, 0.0%) <&str as core[da82827a87f140f9]::convert::From<&ruff[cef4c65d96248843]::codes::Rule>>::from
3130 (0.2%, 5.7%) 1 (0.0%, 0.0%) <&str as core[da82827a87f140f9]::convert::From<ruff[cef4c65d96248843]::codes::Rule>>::from
3130 (0.2%, 5.9%) 1 (0.0%, 0.0%) <ruff[cef4c65d96248843]::codes::Rule as core[da82827a87f140f9]::convert::AsRef<str>>::as_ref
3128 (0.2%, 6.0%) 1 (0.0%, 0.0%) <ruff[cef4c65d96248843]::codes::RuleIter>::get
2669 (0.2%, 6.2%) 1 (0.0%, 0.0%) <<ruff[cef4c65d96248843]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_seq::<toml_edit[7e3a6c5e67260672]:🇩🇪:array::ArraySeqAccess>
```
After:
```
Lines Copies Function name
----- ------ -------------
1710487 38900 (TOTAL)
10193 (0.6%, 0.6%) 1 (0.0%, 0.0%) <ruff[52408f46d2058296]::codes::RuleCodePrefix>::iter
8107 (0.5%, 1.1%) 1 (0.0%, 0.0%) <ruff[52408f46d2058296]::codes::Rule>::noqa_code
7345 (0.4%, 1.5%) 1 (0.0%, 0.0%) <ruff[52408f46d2058296]::checkers::ast::Checker as ruff_python_ast[5588cd60041c8605]::visitor::Visitor>::visit_stmt
6412 (0.4%, 1.9%) 1 (0.0%, 0.0%) <<ruff[52408f46d2058296]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:spanned::SpannedDeserializer<toml_edit[7e3a6c5e67260672]:🇩🇪:value::ValueDeserializer>>
6412 (0.4%, 2.2%) 1 (0.0%, 0.0%) <<ruff[52408f46d2058296]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:table::TableMapAccess>
6409 (0.4%, 2.6%) 1 (0.0%, 0.0%) <<ruff[52408f46d2058296]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:datetime::DatetimeDeserializer>
5696 (0.3%, 3.0%) 1 (0.0%, 0.0%) <ruff[52408f46d2058296]::checkers::ast::Checker as ruff_python_ast[5588cd60041c8605]::visitor::Visitor>::visit_expr
4448 (0.3%, 3.2%) 1 (0.0%, 0.0%) ruff[52408f46d2058296]::flake8_to_ruff::converter::convert
3702 (0.2%, 3.4%) 1 (0.0%, 0.0%) <&ruff[52408f46d2058296]::registry::Linter as core[da82827a87f140f9]::iter::traits::collect::IntoIterator>::into_iter
3349 (0.2%, 3.6%) 1 (0.0%, 0.0%) <ruff[52408f46d2058296]::registry::Linter>::code_for_rule
3132 (0.2%, 3.8%) 1 (0.0%, 0.0%) <ruff[52408f46d2058296]::codes::Rule as core[da82827a87f140f9]::fmt::Debug>::fmt
3130 (0.2%, 4.0%) 1 (0.0%, 0.0%) <&str as core[da82827a87f140f9]::convert::From<&ruff[52408f46d2058296]::codes::Rule>>::from
3130 (0.2%, 4.2%) 1 (0.0%, 0.0%) <&str as core[da82827a87f140f9]::convert::From<ruff[52408f46d2058296]::codes::Rule>>::from
3130 (0.2%, 4.4%) 1 (0.0%, 0.0%) <ruff[52408f46d2058296]::codes::Rule as core[da82827a87f140f9]::convert::AsRef<str>>::as_ref
3128 (0.2%, 4.5%) 1 (0.0%, 0.0%) <ruff[52408f46d2058296]::codes::RuleIter>::get
2669 (0.2%, 4.7%) 1 (0.0%, 0.0%) <<ruff[52408f46d2058296]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_seq::<toml_edit[7e3a6c5e67260672]:🇩🇪:array::ArraySeqAccess>
2659 (0.2%, 4.9%) 1 (0.0%, 0.0%) <&ruff[52408f46d2058296]::codes::Pylint as core[da82827a87f140f9]::iter::traits::collect::IntoIterator>::into_iter
```
I'd assume this has a positive effect both on compile time and on runtime, but i don't know the actual effect on compile times and can't really measure.
## Test plan
Check CI for any performance regressions.
This should fix#3808 if we merge it.
* clippy
* Update update_ambiguous_characters.py
* Document codes.rs
* Refactor codes.rs before merging
Helper script:
```python
# %%
from pathlib import Path
codes = Path("crates/ruff/src/codes.rs").read_text().splitlines()
rules = Path("a.txt").read_text().strip().splitlines()
rule_map = {i.split("::")[-1]: i for i in rules}
# %%
codes_new = []
for line in codes:
if ", Rule::" in line:
left, right = line.split(", Rule::")
right = right[:-2]
line = left + ", " + rule_map[right] + "),"
codes_new.append(line)
# %%
Path("crates/ruff/src/codes.rs").write_text("\n".join(codes_new))
```
Co-authored-by: Jonathan Plasse <13716151+JonathanPlasse@users.noreply.github.com>
* Don't assume unique repo names in ecosystem checks
This fixes a bug where previously repositories with the same name would have been overwritten.
I tested with `scripts/check_ecosystem.py -v --checkouts target/checkouts_main .venv/bin/ruff target/release/ruff` and ruff 0.0.267 that changes are shown. I confirmed with `scripts/ecosystem_all_check.sh check --select RUF008` (next PR) that the checkouts are now complete.
* Make ecosystem all check more generic
This allows passing arguments to the ecosystem all check script, e.g. you can now do `scripts/ecosystem_all_check.sh check --select RUF008`.
Tested with
```
$ cat target/ecosystem_all_results/*.stdout.txt | head
src/fi_parliament_tools/parsing/data_structures.py:33:17: RUF008 Do not use mutable default values for dataclass attributes
src/fi_parliament_tools/parsing/data_structures.py:76:17: RUF008 Do not use mutable default values for dataclass attributes
src/fi_parliament_tools/parsing/data_structures.py:178:17: RUF008 Do not use mutable default values for dataclass attributes
Found 3 errors.
braid_triggers/tasks.py:46:17: RUF008 Do not use mutable default values for dataclass attributes
Found 1 error.
src/boards/RaspberryPi3.py:15:22: RUF008 Do not use mutable default values for dataclass attributes
src/boards/board.py:21:26: RUF008 Do not use mutable default values for dataclass attributes
src/boards/board.py:22:32: RUF008 Do not use mutable default values for dataclass attributes
src/boards/board.py:23:37: RUF008 Do not use mutable default values for dataclass attributes
$ cat target/ecosystem_all_results/*.stdout.txt | wc -l
115
```
This fixes a bug where previously repositories with the same name would have been overwritten.
I tested with `scripts/check_ecosystem.py -v --checkouts target/checkouts_main .venv/bin/ruff target/release/ruff` and ruff 0.0.267 that changes are shown. I confirmed with `scripts/ecosystem_all_check.sh check --select RUF008` (next PR) that the checkouts are now complete.
* Add a script to update the schemastore
Hacked this together, it clones astral-sh/schemastore, updated the schema and pushes the changes
to a new branch tagged with the ruff git hash. You can see the URL to create the PR
to schemastore in the CLI. The script is separated into three blocks so you can rerun
the schema generation in the middle before committing.
* Use tempdir for schemastore
* Add comments
* Add script for ecosystem wide checks of all rules and fixes
This adds my personal script for checking an entire checkout of ~2.1k packages for
panics, autofix errors and similar problems. It's not really meant to be used by anybody else but i thought it's better if it lives in the repo than if it doesn't.
For reference, this is the current output of failing autofixes: https://gist.github.com/konstin/c3fada0135af6cacec74f166adf87a00. Trimmed down to the useful information: https://gist.github.com/konstin/c864f4c300c7903a24fdda49635c5da9
* Keep github template intact
* Remove the need for ripgrep
* sort output
* Ecosystem CI: Allow storing checkouts locally
This adds a --checkouts options to (re)use a local directory instead of checkouts into a tempdir
* Fix missing path conversion
* Count changes for each rule
* Handle case where rule matches were found in a line
* List and sort by changes
* Remove detail from rule changes
* Add comment about leading :
* Only print rule changes if rule changes are present
* Use re.search and match group
* Remove dict().items()
* Use match group to extract rule code
This PR sets up an "ecosystem" check as an optional part of the CI step for pull requests. The primary piece of this is a new script in `scripts/check_ecosystem.py` which takes two ruff binaries as input and compares their outputs against a corpus of open-source code in parallel. I used ruff's `text` reporting format and stdlib's `difflib` (rather than JSON output and jsondiffs) to avoid adding another dependency. There is a new ecosystem-comment workflow to add a comment to the PR (see [this link](https://securitylab.github.com/research/github-actions-preventing-pwn-requests/) which explains why it needs to be done as a new workflow for security reasons).
## Summary
This PR moves `Diagnostic`, `DiagnosticKind`, and `Fix` into their own crate, which will enable us to further split up Ruff, since sub-linter crates (which need to implement functions that return `Diagnostic`) can now depend on `ruff_diagnostics` rather than Ruff.
Implement PYI006 "bad version info comparison"
## What it does
Ensures that you only `<` and `>=` for version info comparisons with
`sys.version_info` in `.pyi` files. All other comparisons such as
`<`, `<=` and `==` are banned.
## Why is this bad?
```python
>>> import sys
>>> print(sys.version_info)
sys.version_info(major=3, minor=8, micro=10, releaselevel='final', serial=0)
>>> print(sys.version_info > (3, 8))
True
>>> print(sys.version_info == (3, 8))
False
>>> print(sys.version_info <= (3, 8))
False
>>> print(sys.version_info in (3, 8))
False
```
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
I worked on #2993 and ran into issues that the formatter tests are failing on Windows because `writeln!` emits `\n` as line terminator on all platforms, but `git` on Windows converted the line endings in the snapshots to `\r\n`.
I then tried to replicate the issue on my Windows machine and was surprised that all linter snapshot tests are failing on my machine. I figured out after some time that it is due to my global git config keeping the input line endings rather than converting to `\r\n`.
Luckily, I've been made aware of #2033 which introduced an "override" for the `assert_yaml_snapshot` macro that normalizes new lines, by splitting the formatted string using the platform-specific newline character. This is a clever approach and gives nice diffs for multiline fixes but makes assumptions about the setup contributors use and requires special care whenever we use line endings inside of tests.
I recommend that we remove the special new line handling and use `.gitattributes` to enforce the use of `LF` on all platforms [guide](https://docs.github.com/en/get-started/getting-started-with-git/configuring-git-to-handle-line-endings). This gives us platform agnostic tests without having to worry about line endings in our tests or different git configurations.
## Note
It may be necessary for Windows contributors to run the following command to update the line endings of their files
```bash
git rm --cached -r .
git reset --hard
```
Currently the define_rule_mapping! macro generates both the Rule enum as
well as the RuleCodePrefix enum and the mapping between the two. After
this commit series the macro will only generate the Rule enum and the
RuleCodePrefix enum and the mapping will be generated by a new map_codes
proc macro, so we rename the macro now to fit its new purpose.
```console
❯ cargo run rule B017
Finished dev [unoptimized + debuginfo] target(s) in 0.13s
Running `target/debug/ruff rule B017`
no-assert-raises-exception
Code: B017 (flake8-bugbear)
### What it does
Checks for `self.assertRaises(Exception)`.
## Why is this bad?
`assertRaises(Exception)` can lead to your test passing even if the
code being tested is never executed due to a typo.
Either assert for a more specific exception (builtin or custom), use
`assertRaisesRegex` or the context manager form of `assertRaises`.
```
To enable ruff_dev to automatically generate the rule Markdown tables in
the README the ruff library contained the following function:
Linter::codes() -> Vec<RuleSelector>
which was slightly changed to `fn prefixes(&self) -> Prefixes` in
9dc66b5a65 to enable ruff_dev to split
up the Markdown tables for linters that have multiple prefixes
(pycodestyle has E & W, Pylint has PLC, PLE, PLR & PLW).
The definition of this method was however largely redundant with the
#[prefix] macro attributes in the Linter enum, which are used to
derive the Linter::parse_code function, used by the --explain command.
This commit removes the redundant Linter::prefixes by introducing a
same-named method with a different signature to the RuleNamespace trait:
fn prefixes(&self) -> &'static [&'static str];
As well as implementing IntoIterator<Rule> for &Linter. We extend the
extisting RuleNamespace proc macro to automatically derive both
implementations from the Linter enum definition.
To support the previously mentioned Markdown table splitting we
introduce a very simple hand-written method to the Linter impl:
fn categories(&self) -> Option<&'static [LinterCategory]>;
- optional `prefix` argument for `add_plugin.py`
- rules directory instead of `rules.rs`
- pathlib syntax
- fix test case where code was added instead of name
Example:
```
python scripts/add_plugin.py --url https://pypi.org/project/example/1.0.0/ example --prefix EXA
python scripts/add_rule.py --name SecondRule --code EXA002 --linter example
python scripts/add_rule.py --name FirstRule --code EXA001 --linter example
python scripts/add_rule.py --name ThirdRule --code EXA003 --linter example
```
Note that it breaks compatibility with 'old style' plugins (generation works fine, but namespaces need to be changed):
```
python scripts/add_rule.py --name DoTheThing --code PLC999 --linter pylint
```
543865c96b introduced
RuleCode::origin() -> RuleOrigin generation via a macro, while that
signature now has been renamed to Rule::origin() -> Linter we actually
want to get rid of it since rules and linters shouldn't be this tightly
coupled (since one rule can exist in multiple linters).
Another disadvantage of the previous approach was that the prefixes
had to be defined in ruff_macros/src/prefixes.rs, which was easy to
miss when defining new linters in src/*, case in point
INP001 => violations::ImplicitNamespacePackage has in the meantime been
added without ruff_macros/src/prefixes.rs being updated accordingly
which resulted in `ruff --explain INP001` mistakenly reporting that the
rule belongs to isort (since INP001 starts with the isort prefix "I").
The derive proc macro introduced in this commit requires every variant
to have at least one #[prefix = "..."], eliminating such mistakes.
More accurate since the enum also encompasses:
* ALL (which isn't a prefix at all)
* fully-qualified rule codes (which aren't prefixes unless you say
they're a prefix to the empty string but that's not intuitive)
"origin" was accurate since ruff rules are currently always modeled
after one origin (except the Ruff-specific rules).
Since we however want to introduce a many-to-many mapping between codes
and rules, the term "origin" no longer makes much sense. Rules usually
don't have multiple origins but one linter implements a rule first and
then others implement it later (often inspired from another linter).
But we don't actually care much about where a rule originates from when
mapping multiple rule codes to one rule implementation, so renaming
RuleOrigin to Linter is less confusing with the many-to-many system.
# This commit was automatically generated by running the following
# script (followed by `cargo +nightly fmt`):
import glob
import re
from typing import NamedTuple
class Rule(NamedTuple):
code: str
name: str
path: str
def rules() -> list[Rule]:
"""Returns all the rules defined in `src/registry.rs`."""
file = open('src/registry.rs')
rules = []
while next(file) != 'ruff_macros::define_rule_mapping!(\n':
continue
while (line := next(file)) != ');\n':
line = line.strip().rstrip(',')
if line.startswith('//'):
continue
code, path = line.split(' => ')
name = path.rsplit('::')[-1]
rules.append(Rule(code, name, path))
return rules
code2name = {r.code: r.name for r in rules()}
for pattern in ('src/**/*.rs', 'ruff_cli/**/*.rs', 'ruff_dev/**/*.rs', 'scripts/add_*.py'):
for name in glob.glob(pattern, recursive=True):
with open(name) as f:
text = f.read()
text = re.sub('Rule(?:Code)?::([A-Z]\w+)', lambda m: 'Rule::' + code2name[m.group(1)], text)
text = re.sub(r'(?<!"<FilePattern>:<)RuleCode\b', 'Rule', text)
text = re.sub('(use crate::registry::{.*, Rule), Rule(.*)', r'\1\2', text) # fix duplicate import
with open(name, 'w') as f:
f.write(text)