Commit Graph

438 Commits

Author SHA1 Message Date
Zanie Blue
9ae498595c Upgrade Rust to 1.71 (#6323)
Addresses
[CVE-2023-38497](https://blog.rust-lang.org/2023/08/03/cve-2023-38497.html)

See also the [version release
post](https://blog.rust-lang.org/2023/08/03/Rust-1.71.1.html)
2023-08-03 21:08:39 -04:00
konsti
0fddb31235 Use tracing for format_dev (#6177)
## Summary

[tracing](https://github.com/tokio-rs/tracing) is library for logging,
tracing and related features that has a large ecosystem. Using
[tracing-subscriber](https://docs.rs/tracing-subscriber) and
[tracing-indicatif](https://github.com/emersonford/tracing-indicatif),
we get a nice logging output that you can configure with `RUST_LOG`
(e.g. `RUST_LOG=debug`) and a live look into the formatter progress.

Default:
![Screenshot from 2023-07-30
13-59-53](https://github.com/astral-sh/ruff/assets/6826232/6432f835-9ff1-4771-955b-398e54c406dc)

`RUST_LOG=debug`:
![Screenshot from 2023-07-30
14-01-32](https://github.com/astral-sh/ruff/assets/6826232/5f2c87da-0867-4159-82e7-b5757eebb8eb)

It's easy to see in this output which files take a disproportionate
amount of time.

[Peek 2023-07-30
14-35.webm](https://github.com/astral-sh/ruff/assets/6826232/2c92db5c-1354-465b-a6bc-ddfb281d6f9d)

It opens up further integration with the tracing ecosystem,
[tracing-timing](https://docs.rs/tracing-timing/latest/tracing_timing/)
and [tokio-console](https://github.com/tokio-rs/console) can e.g. show
histograms and the json output allows us building better pipelines than
grepping a log file.

One caveat is using `parent: None` for the logging statements because
tracing subscriber does not allow deactivating the span without
reimplementing all the other log message formatting, too, and we don't
need span information, esp. since it would currently show the progress
bar span.

## Test Plan

n/a
2023-07-31 19:14:01 +00:00
Micha Reiser
40f54375cb Pull in RustPython parser (#6099) 2023-07-27 09:29:11 +00:00
Micha Reiser
16e1737d1b Use cursor based lexer (#6012) 2023-07-26 11:32:26 +02:00
Dhruv Manilawala
025fa4eba8 Integrate the new Jupyter AST nodes in Ruff (#6086)
## Summary

This PR adds the implementation for the new Jupyter AST nodes i.e.,
`ExprLineMagic` and `StmtLineMagic`.

## Test Plan

Add test cases for `unparse` containing magic commands

resolves: #6087
2023-07-26 08:20:30 +00:00
Charlie Marsh
ed72c027a3 Replace NoHashHasher usages with FxHashMap (#6049)
## Summary

I had always assumed that `NoHashHasher` would be faster when using
integer keys, but benchmarking shows otherwise:

```
linter/default-rules/numpy/globals.py
                        time:   [66.544 µs 66.606 µs 66.678 µs]
                        thrpt:  [44.253 MiB/s 44.300 MiB/s 44.342 MiB/s]
                 change:
                        time:   [-0.1843% +0.1087% +0.3718%] (p = 0.46 > 0.05)
                        thrpt:  [-0.3704% -0.1086% +0.1847%]
                        No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
linter/default-rules/pydantic/types.py
                        time:   [1.3787 ms 1.3811 ms 1.3837 ms]
                        thrpt:  [18.431 MiB/s 18.466 MiB/s 18.498 MiB/s]
                 change:
                        time:   [-0.4827% -0.1074% +0.1927%] (p = 0.56 > 0.05)
                        thrpt:  [-0.1924% +0.1075% +0.4850%]
                        No change in performance detected.
linter/default-rules/numpy/ctypeslib.py
                        time:   [624.82 µs 625.96 µs 627.17 µs]
                        thrpt:  [26.550 MiB/s 26.601 MiB/s 26.650 MiB/s]
                 change:
                        time:   [-0.7071% -0.4908% -0.2736%] (p = 0.00 < 0.05)
                        thrpt:  [+0.2744% +0.4932% +0.7122%]
                        Change within noise threshold.
linter/default-rules/large/dataset.py
                        time:   [3.1585 ms 3.1634 ms 3.1685 ms]
                        thrpt:  [12.840 MiB/s 12.861 MiB/s 12.880 MiB/s]
                 change:
                        time:   [-1.5338% -1.3463% -1.1476%] (p = 0.00 < 0.05)
                        thrpt:  [+1.1610% +1.3647% +1.5577%]
                        Performance has improved.

linter/all-rules/numpy/globals.py
                        time:   [140.17 µs 140.37 µs 140.58 µs]
                        thrpt:  [20.989 MiB/s 21.020 MiB/s 21.051 MiB/s]
                 change:
                        time:   [-0.1066% +0.3140% +0.7479%] (p = 0.14 > 0.05)
                        thrpt:  [-0.7423% -0.3130% +0.1067%]
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe
linter/all-rules/pydantic/types.py
                        time:   [2.7030 ms 2.7069 ms 2.7112 ms]
                        thrpt:  [9.4064 MiB/s 9.4216 MiB/s 9.4351 MiB/s]
                 change:
                        time:   [-0.6721% -0.4874% -0.2974%] (p = 0.00 < 0.05)
                        thrpt:  [+0.2982% +0.4898% +0.6766%]
                        Change within noise threshold.
Found 14 outliers among 100 measurements (14.00%)
  12 (12.00%) high mild
  2 (2.00%) high severe
linter/all-rules/numpy/ctypeslib.py
                        time:   [1.4709 ms 1.4727 ms 1.4749 ms]
                        thrpt:  [11.290 MiB/s 11.306 MiB/s 11.320 MiB/s]
                 change:
                        time:   [-1.1617% -0.9766% -0.8094%] (p = 0.00 < 0.05)
                        thrpt:  [+0.8160% +0.9862% +1.1754%]
                        Change within noise threshold.
Found 12 outliers among 100 measurements (12.00%)
  9 (9.00%) high mild
  3 (3.00%) high severe
linter/all-rules/large/dataset.py
                        time:   [5.8086 ms 5.8163 ms 5.8240 ms]
                        thrpt:  [6.9854 MiB/s 6.9946 MiB/s 7.0038 MiB/s]
                 change:
                        time:   [-1.5651% -1.3536% -1.1584%] (p = 0.00 < 0.05)
                        thrpt:  [+1.1720% +1.3721% +1.5900%]
                        Performance has improved.
```

My guess is that `NoHashHasher` underperforms because the keys are not
randomly distributed...

Anyway, it's a ~1% (significant) performance gain on some of the above,
plus we get to remove a dependency.
2023-07-24 23:41:57 +00:00
konsti
196cc9b655 Fix RustPython rev to main branch (#5950)
**Summary** I accidentally merged earlier while the RustPython parser
rev was still pointing to the feature branch instead of to the merged
main. This make the rev point to the RustPython parser repo main again
2023-07-21 15:53:14 +00:00
konsti
972f9a9c15 Fix formatting lambda with empty arguments (#5944)
**Summary** Fix implemented in
https://github.com/astral-sh/RustPython-Parser/pull/35: Previously,
empty lambda arguments (e.g. `lambda: 1`) would get the range of the
entire expression, which leads to incorrect comment placement. Now empty
lambda arguments get an empty range between the `lambda` and the `:`
tokens.

**Test Plan** Added a regression test.

149 instances of unstable formatting remaining.

```
$ cargo run --bin ruff_dev --release -- format-dev --stability-check --error-file formatter-ecosystem-errors.txt --multi-project target/checkouts > formatter-ecosystem-progress.txt
$ rg "Unstable formatting" target/formatter-ecosystem-errors.txt | wc -l
149
```
2023-07-21 15:48:45 +02:00
konsti
92f471a666 Handle io errors gracefully (#5611)
## Summary

It can happen that we can't read a file (a python file, a jupyter
notebook or pyproject.toml), which needs to be handled and handled
consistently for all file types. Instead of using `Err` or `error!`, we
emit E602 with the io error as message and continue. This PR makes sure
we handle all three cases consistently, emit E602.

I'm not convinced that it should be possible to disable io errors, but
we now handle the regular case consistently and at least print warning
consistently.

I went with `warn!` but i can change them all to `error!`, too.

It also checks the error case when a pyproject.toml is not readable. The
error message is not very helpful, but it's now a bit clearer that
actually ruff itself failed instead vs this being a diagnostic.

## Examples

This is how an Err of `run` looks now:


![image](https://github.com/astral-sh/ruff/assets/6826232/890f7ab2-2309-4b6f-a4b3-67161947cc83)

With an unreadable file and `IOError` disabled:


![image](https://github.com/astral-sh/ruff/assets/6826232/fd3d6959-fa23-4ddf-b2e5-8d6022df54b1)

(we lint zero files but count files before linting not during so we exit
0)

I'm not sure if it should (or if we should take a different path with
manual ExitStatus), but this currently also triggers when `files` is
empty:


![image](https://github.com/astral-sh/ruff/assets/6826232/f7ede301-41b5-4743-97fd-49149f750337)

## Test Plan

Unix only: Create a temporary directory with files with permissions
`000` (not readable by the owner) and run on that directory. Since this
breaks the assumptions of most of the test code (single file, `ruff`
instead of `ruff_cli`), the test code is rather cumbersome and looks a
bit misplaced; i'm happy about suggestions to fit it in closer with the
other tests or streamline it in other ways. I added another test for
when the entire directory is not readable.
2023-07-20 11:30:14 +02:00
Micha Reiser
cda90d071c Upgrade cargo insta (#5872) 2023-07-19 12:56:32 +02:00
Zanie Blue
f47443014e Remove tag information from RustPython-Parser dependency (#5861)
Following https://github.com/astral-sh/RustPython-Parser/pull/27 we now
cherry-pick commits onto our fork instead of rebasing our fork on top of
the upstream which means we do not overwrite history and a tag is not
necessary to preserve the pinned commit.

In the future, we may rewrite the history in our fork. If we do, we
should return to tagging the commits.
2023-07-18 10:48:51 -05:00
konsti
730e6b2b4c Refactor StmtIf: Formatter and Linter (#5459)
## Summary

Previously, `StmtIf` was defined recursively as
```rust
pub struct StmtIf {
    pub range: TextRange,
    pub test: Box<Expr>,
    pub body: Vec<Stmt>,
    pub orelse: Vec<Stmt>,
}
```
Every `elif` was represented as an `orelse` with a single `StmtIf`. This
means that this representation couldn't differentiate between
```python
if cond1:
    x = 1
else:
    if cond2:
        x = 2
```
and 
```python
if cond1:
    x = 1
elif cond2:
    x = 2
```
It also makes many checks harder than they need to be because we have to
recurse just to iterate over an entire if-elif-else and because we're
lacking nodes and ranges on the `elif` and `else` branches.

We change the representation to a flat

```rust
pub struct StmtIf {
    pub range: TextRange,
    pub test: Box<Expr>,
    pub body: Vec<Stmt>,
    pub elif_else_clauses: Vec<ElifElseClause>,
}

pub struct ElifElseClause {
    pub range: TextRange,
    pub test: Option<Expr>,
    pub body: Vec<Stmt>,
}
```
where `test: Some(_)` represents an `elif` and `test: None` an else.

This representation is different tradeoff, e.g. we need to allocate the
`Vec<ElifElseClause>`, the `elif`s are now different than the `if`s
(which matters in rules where want to check both `if`s and `elif`s) and
the type system doesn't guarantee that the `test: None` else is actually
last. We're also now a bit more inconsistent since all other `else`,
those from `for`, `while` and `try`, still don't have nodes. With the
new representation some things became easier, e.g. finding the `elif`
token (we can use the start of the `ElifElseClause`) and formatting
comments for if-elif-else (no more dangling comments splitting, we only
have to insert the dangling comment after the colon manually and set
`leading_alternate_branch_comments`, everything else is taken of by
having nodes for each branch and the usual placement.rs fixups).

## Merge Plan

This PR requires coordination between the parser repo and the main ruff
repo. I've split the ruff part, into two stacked PRs which have to be
merged together (only the second one fixes all tests), the first for the
formatter to be reviewed by @michareiser and the second for the linter
to be reviewed by @charliermarsh.

* MH: Review and merge
https://github.com/astral-sh/RustPython-Parser/pull/20
* MH: Review and merge or move later in stack
https://github.com/astral-sh/RustPython-Parser/pull/21
* MH: Review and approve
https://github.com/astral-sh/RustPython-Parser/pull/22
* MH: Review and approve formatter PR
https://github.com/astral-sh/ruff/pull/5459
* CM: Review and approve linter PR
https://github.com/astral-sh/ruff/pull/5460
* Merge linter PR in formatter PR, fix ecosystem checks (ecosystem
checks can't run on the formatter PR and won't run on the linter PR, so
we need to merge them first)
 * Merge https://github.com/astral-sh/RustPython-Parser/pull/22
 * Create tag in the parser, update linter+formatter PR
 * Merge linter+formatter PR https://github.com/astral-sh/ruff/pull/5459

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
2023-07-18 13:40:15 +02:00
David Szotten
52aa2fc875 upgrade rustpython to remove tuple-constants (#5840)
c.f. https://github.com/astral-sh/RustPython-Parser/pull/28

Tests: No snapshots changed

---------

Co-authored-by: Zanie <contact@zanie.dev>
2023-07-17 22:50:31 +00:00
Tom Kuson
8420008e79 Avoid checking EXE001 and EXE002 on WSL (#5735)
## Summary

Do not raise `EXE001` and `EXE002` if WSL is detected. Uses the
[`wsl`](https://crates.io/crates/wsl) crate.

Closes #5445.

## Test Plan

`cargo test`

I don't use Windows, so was unable to test on a WSL environment. It
would be good if someone who runs Windows could check the functionality.
2023-07-13 07:36:07 -04:00
Dimitri Papadopoulos Orfanos
efe7c393d1 Fix typos found by codespell (#5607)
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

Fix typos found by
[codespell](https://github.com/codespell-project/codespell).

I have left out `memoize` for now (see #5606).
<!-- What's the purpose of the change? What does it do, and why? -->

## Test Plan

CI tests.
<!-- How was it tested? -->
2023-07-08 12:33:18 +02:00
Charlie Marsh
072358e26b Use Instagram's LibCST rather than our fork (#5593)
## Summary

Historically, we only used a fork to enable building without pyo3. But
pyo3 is an optional feature. I may've just not understood how to
accomplish this way back when.
2023-07-07 10:00:44 -04:00
konsti
b22e6c3d38 Extend ruff_dev formatter script to compute statistics and format a project (#5492)
## Summary

This extends the `ruff_dev` formatter script util. Instead of only doing
stability checks, you can now choose different compatible options on the
CLI and get statistics.

* It adds an option the formats all files that ruff would check to allow
looking at an entire black-formatted repository with `git diff`
* It computes the [Jaccard
index](https://en.wikipedia.org/wiki/Jaccard_index) as a measure of
deviation between input and output, which is useful as single number
metric for assessing our current deviations from black.
* It adds progress bars to both the single projects as well as the
multi-project mode.
* It adds an option to write the multi-project output to a file

Sample usage:

```
$ cargo run --bin ruff_dev -- format-dev --stability-check crates/ruff/resources/test/cpython
$ cargo run --bin ruff_dev -- format-dev --stability-check /home/konsti/projects/django
Syntax error in /home/konsti/projects/django/tests/test_runner_apps/tagged/tests_syntax_error.py: source contains syntax errors (parser error): BaseError { error: UnrecognizedToken(Name { name: "syntax_error" }, None), offset: 131, source_path: "<filename>" }
Found 0 stability errors in 2755 files (jaccard index 0.911) in 9.75s
$ cargo run --bin ruff_dev -- format-dev --write /home/konsti/projects/django
```

Options:

```
Several utils related to the formatter which can be run on one or more repositories. The selected set of files in a repository is the same as for `ruff check`.

* Check formatter stability: Format a repository twice and ensure that it looks that the first and second formatting look the same. * Format: Format the files in a repository to be able to check them with `git diff` * Statistics: The subcommand the Jaccard index between the (assumed to be black formatted) input and the ruff formatted output

Usage: ruff_dev format-dev [OPTIONS] [FILES]...

Arguments:
  [FILES]...
          Like `ruff check`'s files. See `--multi-project` if you want to format an ecosystem checkout

Options:
      --stability-check
          Check stability
          
          We want to ensure that once formatted content stays the same when formatted again, which is known as formatter stability or formatter idempotency, and that the formatter prints syntactically valid code. As our test cases cover only a limited amount of code, this allows checking entire repositories.

      --write
          Format the files. Without this flag, the python files are not modified

      --format <FORMAT>
          Control the verbosity of the output
          
          [default: default]

          Possible values:
          - minimal: Filenames only
          - default: Filenames and reduced diff
          - full:    Full diff and invalid code

  -x, --exit-first-error
          Print only the first error and exit, `-x` is same as pytest

      --multi-project
          Checks each project inside a directory, useful e.g. if you want to check all of the ecosystem checkouts

      --error-file <ERROR_FILE>
          Write all errors to this file in addition to stdout. Only used in multi-project mode
```

## Test Plan

I ran this on django (2755 files, jaccard index 0.911) and discovered a
magic trailing comma problem and that we really needed to implement
import formatting. I ran the script on cpython to identify
https://github.com/astral-sh/ruff/pull/5558.
2023-07-07 11:30:12 +00:00
konstin
7f6cb9dfb5 Format call expressions (without call chaining) (#5341)
## Summary

This formats call expressions with magic trailing comma and parentheses
behaviour but without call chaining

## Test Plan

Lots of new test fixtures, including some that don't work yet
2023-06-27 09:29:40 +00:00
Micha Reiser
8879927b9a Use insta::glob instead of fixture macro (#5364) 2023-06-26 08:46:18 +00:00
Micha Reiser
6ba9d5d5a4 Upgrade RustPython (#5334) 2023-06-23 20:39:47 +00:00
Micha Reiser
f7e1cf4b51 Format class definitions (#5289) 2023-06-22 09:09:43 +00:00
Micha Reiser
e520a3a721 Fix ArgWithDefault comments handling (#5204) 2023-06-20 20:48:07 +00:00
Charlie Marsh
6331598511 Upgrade RustPython to access ranged names (#5194)
## Summary

In https://github.com/astral-sh/RustPython-Parser/pull/8, we modified
RustPython to include ranges for any identifiers that aren't
`Expr::Name` (which already has an identifier).

For example, the `e` in `except ValueError as e` was previously
un-ranged. To extract its range, we had to do some lexing of our own.
This change should improve performance and let us remove a bunch of
code.

## Test Plan

`cargo test`
2023-06-20 15:43:38 +00:00
Logan Hunt
dfb04e679e Small binary size optimization (#5203) 2023-06-20 08:47:01 +02:00
Charlie Marsh
36e01ad6eb Upgrade RustPython (#5192)
## Summary

This PR upgrade RustPython to pull in the changes to `Arguments` (zip
defaults with their identifiers) and all the renames to `CmpOp` and
friends.
2023-06-19 21:09:53 +00:00
Charlie Marsh
ddfdc3bb01 Add rule documentation URL to JSON output (#5187)
## Summary

I want to include URLs to the rule documentation in the LSP (the LSP has
a native `code_description` field for this, which, if specified, causes
the source to be rendered as a link to the docs). This PR exposes the
URL to the documentation in the Ruff JSON output.
2023-06-19 21:09:15 +00:00
Dhruv Manilawala
48f4f2d63d Maintain consistency when deserializing to JSON (#5114)
## Summary

Maintain consistency while deserializing Jupyter notebook to JSON. The
following changes were made:

1. Use string array to store the source value as that's the default
(5781720423/nbformat/v4/nbjson.py (L56-L57))
2. Remove unused structs and enums
3. Reorder the keys in alphabetical order as that's the default.
(5781720423/nbformat/v4/nbjson.py (L51))

### Side effect

Removing the `preserve_order` feature means that the order of keys in
JSON output (`--format json`) will be in alphabetical order. This is
because the value is represented using `serde_json::Value` which
internally is a `BTreeMap`, thus sorting it as per the string key. For
posterity if this turns out to be not ideal, then we could define a
struct representing the JSON object and the order of struct fields will
determine the order in the JSON string.

## Test Plan

Add a test case to assert the raw JSON string.
2023-06-19 23:47:56 +05:30
konstin
b8d378b0a3 Add a script that tests formatter stability on repositories (#5055)
## Summary

We want to ensure that once formatted content stays the same when
formatted again, which is known as formatter stability or formatter
idempotency, and that the formatter prints syntactically valid code. As
our test cases cover only a limited amount of code, this allows checking
entire repositories.

This adds a new subcommand to `ruff_dev` which can be invoked as `cargo
run --bin ruff_dev -- check-formatter-stability <repo>`. While initially
only intended to check stability, it has also found cases where the
formatter printed invalid syntax or panicked.

 ## Test Plan

Running this on cpython is already identifying bugs
(https://github.com/astral-sh/ruff/pull/5089)
2023-06-19 14:13:38 +00:00
Charlie Marsh
68b6d30c46 Use consistent Cargo.toml metadata in all crates (#5015) 2023-06-12 00:02:40 +00:00
Dhruv Manilawala
07cc4bcb0f Update links to point to Astral org (#4949) 2023-06-08 11:43:40 -04:00
Micha Reiser
39a1f3980f Upgrade RustPython (#4900) 2023-06-08 05:53:14 +00:00
Micha Reiser
694bf7f5b8 Upgrade to Rust 1.70 (#4848) 2023-06-04 17:51:47 +00:00
Charlie Marsh
399eb84d5e Add a ruff_textwrap crate (#4731) 2023-05-31 16:35:23 +00:00
Micha Reiser
6c1ff6a85f Upgrade RustPython (#4747) 2023-05-31 08:26:35 +00:00
Charlie Marsh
ea31229be0 Track TYPE_CHECKING blocks in Importer (#4593) 2023-05-30 16:18:10 +00:00
Micha Reiser
0cd453bdf0 Generic "comment to node" association logic (#4642) 2023-05-30 09:28:01 +00:00
Micha Reiser
154439728a Add AnyNode and AnyNodeRef unions (#4578) 2023-05-23 08:53:22 +02:00
Micha Reiser
daadd24bde Include decorators in Function and Class definition ranges (#4467) 2023-05-22 17:50:42 +02:00
Micha Reiser
063431cb0f Upgrade RustPython (#4576) 2023-05-22 14:50:49 +02:00
figsoda
bab818e801 Update RustPython dependencies (#4503) 2023-05-18 15:28:13 -04:00
Jeong, YunWon
4b05ca1198 Specialize ConversionFlag (#4450) 2023-05-16 18:00:13 +02:00
Charlie Marsh
f0465bf106 Emit non-logical newlines for "empty" lines (#4444) 2023-05-16 14:58:56 +00:00
Jeong, YunWon
6049aabe27 Update RustPyhon and enable full-lexer feature (#4442) 2023-05-16 07:19:57 +00:00
Micha Reiser
fa26860296 Refactor range from Attributed to Nodes (#4422) 2023-05-16 06:36:32 +00:00
Micha Reiser
f5afa8198c Use new rustpython_format crate over rustpython-common (#4388) 2023-05-13 12:35:02 +00:00
Jeong, YunWon
bbadbb5de5 Refactor code to use the new RustPython is method (#4369) 2023-05-11 16:16:36 +02:00
Jeong, YunWon
be6e00ef6e Re-integrate RustPython parser repository (#4359)
Co-authored-by: Micha Reiser <micha@reiser.io>
2023-05-11 07:47:17 +00:00
Micha Reiser
cab65b25da Replace row/column based Location with byte-offsets. (#3931) 2023-04-26 18:11:02 +00:00
Micha Reiser
e33887718d Use Rust 1.69 (#4065) 2023-04-22 23:04:17 +01:00
Micha Reiser
ba4f4f4672 Upgrade dependencies (#4064) 2023-04-22 18:04:01 +01:00