Commit Graph

482 Commits

Author SHA1 Message Date
James Berry f194572be8
Remove visit_arg_with_default (#5265)
## Summary

This is a follow up to #5221. Turns out it was easy to restructure the
visitor to get the right order, I'm just dumb 🤷‍♂️ I've
removed `visit_arg_with_default` entirely from the `Visitor`, although
it still exists as part of `preorder::Visitor`.
2023-06-21 16:00:24 -04:00
James Berry 9b5fb8f38f
Fix AST visitor traversal order (#5221)
## Summary

According to the AST visitor documentation, the AST visitor "visits all
nodes in the AST recursively in evaluation-order". However, the current
traversal fails to meet this specification in a few places.

### Function traversal

```python
order = []
@(order.append("decorator") or (lambda x: x))
def f(
    posonly: order.append("posonly annotation") = order.append("posonly default"),
    /,
    arg: order.append("arg annotation") = order.append("arg default"),
    *args: order.append("vararg annotation"),
    kwarg: order.append("kwarg annotation") = order.append("kwarg default"),
    **kwargs: order.append("kwarg annotation")
) -> order.append("return annotation"):
    pass
print(order)
```

Executing the above snippet using CPython 3.10.6 prints the following
result (formatted for readability):
```python
[
    'decorator',
    'posonly default',
    'arg default',
    'kwarg default',
    'arg annotation',
    'posonly annotation',
    'vararg annotation',
    'kwarg annotation',
    'kwarg annotation',
    'return annotation',
]
```

Here we can see that decorators are evaluated first, followed by
argument defaults, and annotations are last. The current traversal of a
function's AST does not align with this order.

### Annotated assignment traversal
```python
order = []
x: order.append("annotation") = order.append("expression")
print(order)
```

Executing the above snippet using CPython 3.10.6 prints the following
result:
```python
['expression', 'annotation']
```

Here we can see that an annotated assignments annotation gets evaluated
after the assignment's expression. The current traversal of an annotated
assignment's AST does not align with this order.

## Why?

I'm slowly working on #3946 and porting over some of the logic and tests
from ssort. ssort is very sensitive to AST traversal order, so ensuring
the utmost correctness here is important.

## Test Plan

There doesn't seem to be existing tests for the AST visitor, so I didn't
bother adding tests for these very subtle changes. However, this
behavior will be captured in the tests for the PR which addresses #3946.
2023-06-21 14:40:58 -04:00
Micha Reiser e520a3a721
Fix ArgWithDefault comments handling (#5204) 2023-06-20 20:48:07 +00:00
Charlie Marsh 7bc33a8d5f
Remove identifier lexing in favor of parser ranges (#5195)
## Summary

Now that all identifiers include ranges (#5194), we can remove a ton of
this "custom lexing" code that we have to sketchily extract identifier
ranges from source.

## Test Plan

`cargo test`
2023-06-20 12:07:29 -04:00
Charlie Marsh 6331598511
Upgrade `RustPython` to access ranged names (#5194)
## Summary

In https://github.com/astral-sh/RustPython-Parser/pull/8, we modified
RustPython to include ranges for any identifiers that aren't
`Expr::Name` (which already has an identifier).

For example, the `e` in `except ValueError as e` was previously
un-ranged. To extract its range, we had to do some lexing of our own.
This change should improve performance and let us remove a bunch of
code.

## Test Plan

`cargo test`
2023-06-20 15:43:38 +00:00
Charlie Marsh 8e06140d1d
Remove continuations when deleting statements (#5198)
## Summary

This PR modifies our statement deletion logic to delete any preceding
continuation lines.

For example, given:

```py
x = 1; \
  import os
```

We'll now rewrite to:

```py
x = 1;
```

In addition, the logic can now handle multiple preceding continuations
(which is unlikely, but valid).
2023-06-19 22:04:28 -04:00
Charlie Marsh 36e01ad6eb
Upgrade RustPython (#5192)
## Summary

This PR upgrade RustPython to pull in the changes to `Arguments` (zip
defaults with their identifiers) and all the renames to `CmpOp` and
friends.
2023-06-19 21:09:53 +00:00
Thomas de Zeeuw e3c12764f8
Only use a single cache file per Python package (#5117)
## Summary

This changes the caching design from one cache file per source file, to
one cache file per package. This greatly reduces the amount of cache
files that are opened and written, while maintaining roughly the same
(combined) size as bincode is very compact.

Below are some very much not scientific performance tests. It uses
projects/sources to check:

* small.py: single, 31 bytes Python file with 2 errors.
* test.py: single, 43k Python file with 8 errors.
* fastapi: FastAPI repo, 1134 files checked, 0 errors.

Source   | Before # files | After # files | Before size | After size
-------|-------|-------|-------|-------
small.py | 1              | 1             | 20 K        | 20 K
test.py  | 1              | 1             | 60 K        | 60 K
fastapi  | 1134           | 518           | 4.5 M       | 2.3 M

One question that might come up is why fastapi still has 518 cache files
and not 1? That is because this is using the existing package
resolution, which sees examples, docs, etc. as separate from the "main"
source code (in the fastapi directory in the repo). In this future it
might be worth consider switching to a one cache file per repo strategy.

This new design is not perfect and does have a number of known issues.
First, like the old design it doesn't remove the cache for a source file
that has been (re)moved until `ruff clean` is called.

Second, this currently uses a large mutex around the mutation of the
package cache (e.g. inserting result). This could be (or become) a
bottleneck. It's future work to test and improve this (if needed).

Third, currently the packages and opened and stored in a sequential
loop, this could be done parallel. This is also future work.


## Test Plan

Run `ruff check` (with caching enabled) twice on any Python source code
and it should produce the same results.
2023-06-19 17:46:13 +02:00
Charlie Marsh 2b82caa163
Detect continuations at start-of-file (#5173)
## Summary

Given:

```python
\
import os
```

Deleting `import os` leaves a syntax error: a file can't end in a
continuation. We have code to handle this case, but it failed to pick up
continuations at the _very start_ of a file.

Closes #5156.
2023-06-19 00:09:02 -04:00
Charlie Marsh fab2a4adf7
Use `matches!` for insecure hash rule (#5141) 2023-06-16 04:18:32 +00:00
Charlie Marsh 5ea3e42513
Always use identifier ranges to store bindings (#5110)
## Summary

At present, when we store a binding, we include a `TextRange` alongside
it. The `TextRange` _sometimes_ matches the exact range of the
identifier to which the `Binding` is linked, but... not always.

For example, given:

```python
x = 1
```

The binding we create _will_ use the range of `x`, because the left-hand
side is an `Expr::Name`, which has a valid range on it.

However, given:

```python
try:
  pass
except ValueError as e:
  pass
```

When we create a binding for `e`, we don't have a `TextRange`... The AST
doesn't give us one. So we end up extracting it via lexing.

This PR extends that pattern to the rest of the binding kinds, to ensure
that whenever we create a binding, we always use the range of the bound
name. This leads to better diagnostics in cases like pattern matching,
whereby the diagnostic for "unused variable `x`" here used to include
`*x`, instead of just `x`:

```python
def f(provided: int) -> int:
    match provided:
        case [_, *x]:
            pass
```

This is _also_ required for symbol renames, since we track writes as
bindings -- so we need to know the ranges of the bound symbols.

By storing these bindings precisely, we can also remove the
`binding.trimmed_range` abstraction -- since bindings already use the
"trimmed range".

To implement this behavior, I took some of our existing utilities (like
the code we had for `except ValueError as e` above), migrated them from
a full lexer to a zero-allocation lexer that _only_ identifies
"identifiers", and moved the behavior into a trait, so we can now do
`stmt.identifier(locator)` to get the range for the identifier.

Honestly, we might end up discarding much of this if we decide to put
ranges on all identifiers
(https://github.com/astral-sh/RustPython-Parser/pull/8). But even if we
do, this will _still_ be a good change, because the lexer introduced
here is useful beyond names (e.g., we use it find the `except` keyword
in an exception handler, to find the `else` after a `for` loop, and so
on). So, I'm fine committing this even if we end up changing our minds
about the right approach.

Closes #5090.

## Benchmarks

No significant change, with one statistically significant improvement
(-2.1654% on `linter/all-rules/large/dataset.py`):

```
linter/default-rules/numpy/globals.py
                        time:   [73.922 µs 73.955 µs 73.986 µs]
                        thrpt:  [39.882 MiB/s 39.898 MiB/s 39.916 MiB/s]
                 change:
                        time:   [-0.5579% -0.4732% -0.3980%] (p = 0.00 < 0.05)
                        thrpt:  [+0.3996% +0.4755% +0.5611%]
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) low severe
  1 (1.00%) low mild
  1 (1.00%) high mild
linter/default-rules/pydantic/types.py
                        time:   [1.4909 ms 1.4917 ms 1.4926 ms]
                        thrpt:  [17.087 MiB/s 17.096 MiB/s 17.106 MiB/s]
                 change:
                        time:   [+0.2140% +0.2741% +0.3392%] (p = 0.00 < 0.05)
                        thrpt:  [-0.3380% -0.2734% -0.2136%]
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe
linter/default-rules/numpy/ctypeslib.py
                        time:   [688.97 µs 691.34 µs 694.15 µs]
                        thrpt:  [23.988 MiB/s 24.085 MiB/s 24.168 MiB/s]
                 change:
                        time:   [-1.3282% -0.7298% -0.1466%] (p = 0.02 < 0.05)
                        thrpt:  [+0.1468% +0.7351% +1.3461%]
                        Change within noise threshold.
Found 15 outliers among 100 measurements (15.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  12 (12.00%) high severe
linter/default-rules/large/dataset.py
                        time:   [3.3872 ms 3.4032 ms 3.4191 ms]
                        thrpt:  [11.899 MiB/s 11.954 MiB/s 12.011 MiB/s]
                 change:
                        time:   [-0.6427% -0.2635% +0.0906%] (p = 0.17 > 0.05)
                        thrpt:  [-0.0905% +0.2642% +0.6469%]
                        No change in performance detected.
Found 20 outliers among 100 measurements (20.00%)
  1 (1.00%) low severe
  2 (2.00%) low mild
  4 (4.00%) high mild
  13 (13.00%) high severe

linter/all-rules/numpy/globals.py
                        time:   [148.99 µs 149.21 µs 149.42 µs]
                        thrpt:  [19.748 MiB/s 19.776 MiB/s 19.805 MiB/s]
                 change:
                        time:   [-0.7340% -0.5068% -0.2778%] (p = 0.00 < 0.05)
                        thrpt:  [+0.2785% +0.5094% +0.7395%]
                        Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low mild
  1 (1.00%) high severe
linter/all-rules/pydantic/types.py
                        time:   [3.0362 ms 3.0396 ms 3.0441 ms]
                        thrpt:  [8.3779 MiB/s 8.3903 MiB/s 8.3997 MiB/s]
                 change:
                        time:   [-0.0957% +0.0618% +0.2125%] (p = 0.45 > 0.05)
                        thrpt:  [-0.2121% -0.0618% +0.0958%]
                        No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
  1 (1.00%) low severe
  3 (3.00%) low mild
  5 (5.00%) high mild
  2 (2.00%) high severe
linter/all-rules/numpy/ctypeslib.py
                        time:   [1.6879 ms 1.6894 ms 1.6909 ms]
                        thrpt:  [9.8478 MiB/s 9.8562 MiB/s 9.8652 MiB/s]
                 change:
                        time:   [-0.2279% -0.0888% +0.0436%] (p = 0.18 > 0.05)
                        thrpt:  [-0.0435% +0.0889% +0.2284%]
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) low mild
  1 (1.00%) high severe
linter/all-rules/large/dataset.py
                        time:   [7.1520 ms 7.1586 ms 7.1654 ms]
                        thrpt:  [5.6777 MiB/s 5.6831 MiB/s 5.6883 MiB/s]
                 change:
                        time:   [-2.5626% -2.1654% -1.7780%] (p = 0.00 < 0.05)
                        thrpt:  [+1.8102% +2.2133% +2.6300%]
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
```
2023-06-15 18:43:19 +00:00
konstin 66089e1a2e
Fix a number of formatter errors from the cpython repository (#5089)
## Summary

This fixes a number of problems in the formatter that showed up with
various files in the [cpython](https://github.com/python/cpython)
repository. These problems surfaced as unstable formatting and invalid
code. This is not the entirety of problems discovered through cpython,
but a big enough chunk to separate it. Individual fixes are generally
individual commits. They were discovered with #5055, which i update as i
work through the output

## Test Plan

I added regression tests with links to cpython for each entry, except
for the two stubs that also got comment stubs since they'll be
implemented properly later.
2023-06-15 11:24:14 +00:00
Charlie Marsh 716cab2f19
Run `rustfmt` on nightly to clean up erroneous comments (#5106)
## Summary

This PR runs `rustfmt` with a few nightly options as a one-time fix to
catch some malformatted comments. I ended up just running with:

```toml
condense_wildcard_suffixes = true
edition = "2021"
max_width = 100
normalize_comments = true
normalize_doc_attributes = true
reorder_impl_items = true
unstable_features = true
use_field_init_shorthand = true
```

Since these all seem like reasonable things to fix, so may as well while
I'm here.
2023-06-15 00:19:05 +00:00
Charlie Marsh aa41ffcfde
Add `BindingKind` variants to represent deleted bindings (#5071)
## Summary

Our current mechanism for handling deletions (e.g., `del x`) is to
remove the symbol from the scope's `bindings` table. This "does the
right thing", in that if we then reference a deleted symbol, we're able
to determine that it's unbound -- but it causes a variety of problems,
mostly in that it makes certain bindings and references unreachable
after-the-fact.

Consider:

```python
x = 1
print(x)
del x
```

If we analyze this code _after_ running the semantic model over the AST,
we'll have no way of knowing that `x` was ever introduced in the scope,
much less that it was bound to a value, read, and then deleted --
because we effectively erased `x` from the model entirely when we hit
the deletion.

In practice, this will make it impossible for us to support local symbol
renames. It also means that certain rules that we want to move out of
the model-building phase and into the "check dead scopes" phase wouldn't
work today, since we'll have lost important information about the source
code.

This PR introduces two new `BindingKind` variants to model deletions:

- `BindingKind::Deletion`, which represents `x = 1; del x`.
- `BindingKind::UnboundException`, which represents:

```python
try:
  1 / 0
except Exception as e:
  pass
```

In the latter case, `e` gets unbound after the exception handler
(assuming it's triggered), so we want to handle it similarly to a
deletion.

The main challenge here is auditing all of our existing `Binding` and
`Scope` usages to understand whether they need to accommodate deletions
or otherwise behave differently. If you look one commit back on this
branch, you'll see that the code is littered with `NOTE(charlie)`
comments that describe the reasoning behind changing (or not) each of
those call sites. I've also augmented our test suite in preparation for
this change over a few prior PRs.

### Alternatives

As an alternative, I considered introducing a flag to `BindingFlags`,
like `BindingFlags::UNBOUND`, and setting that at the appropriate time.

This turned out to be a much more difficult change, because we tend to
match on `BindingKind` all over the place (e.g., we have a bunch of code
blocks that only run when a `BindingKind` is
`BindingKind::Importation`). As a result, introducing these new
`BindingKind` variants requires only a few changes at the client sites.
Adding a flag would've required a much wider-reaching change.
2023-06-14 09:27:24 -04:00
Charlie Marsh fc6580592d
Use Expr::is_* methods at more call sites (#5075) 2023-06-14 04:02:39 +00:00
Charlie Marsh 3f6584b74f
Fix erroneous kwarg reference (#5068) 2023-06-14 00:01:52 +00:00
Charlie Marsh c2fa568b46
Use dedicated structs for excepthandler variants (#5065)
## Summary

Oversight from #5042.
2023-06-13 22:37:06 +00:00
Charlie Marsh cc44349401
Use dedicated structs in `comparable.rs` (#5042)
## Summary

Updating to match the updated AST structure, for consistency.
2023-06-13 03:57:34 +00:00
Charlie Marsh 780336db0a
Include f-string prefixes in quote-stripping utilities (#5039)
Mentioned here:
https://github.com/astral-sh/ruff/pull/4853#discussion_r1217560348.

Generated with this hacky script:
https://gist.github.com/charliermarsh/8ecc4e55bc87d51dc27340402f33b348.
2023-06-12 18:25:47 -04:00
Charlie Marsh 7e37d8916c
Remove lexer dependency from identifier_range (#5036)
## Summary

We run this quite a bit -- the new version is zero-allocation, though
it's not quite as nice as the lexer we have in the formatter.
2023-06-12 22:06:03 +00:00
Charlie Marsh ab11dd08df
Improve `TypedDict` conversion logic for shadowed builtins and dunder methods (#5038)
## Summary

This PR (1) avoids flagging `TypedDict` and `NamedTuple` conversions
when attributes are dunder methods, like `__dict__`, and (2) avoids
flagging the `A003` shadowed-attribute rule for `TypedDict` classes at
all, where it doesn't really apply (since those attributes are only
accessed via subscripting anyway).

Closes #5027.
2023-06-12 21:23:39 +00:00
Addison Crump 70e6c212d9
Improve ruff_parse_simple to find UTF-8 violations (#5008)
Improves the `ruff_parse_simple` fuzz harness by adding checks for
parsed locations to ensure they all lie on UTF-8 character boundaries.
This will allow for faster identification of issues like #5004.

This also adds additional details for Apple M1 users and clarifies the
importance of using `init-fuzzer.sh` (thanks for the feedback,
@jasikpark 🙂).
2023-06-12 12:10:23 -04:00
Charlie Marsh 68b6d30c46
Use consistent `Cargo.toml` metadata in all crates (#5015) 2023-06-12 00:02:40 +00:00
Charlie Marsh 445e1723ab
Use `Stmt::parse` in lieu of `Suite` unwraps (#5002) 2023-06-10 04:55:31 +00:00
Charlie Marsh 2d597bc1fb
Parenthesize expressions prior to lexing in F632 (#5001) 2023-06-10 04:23:43 +00:00
Charlie Marsh 02b8ce82af
Refactor `RET504` to only enforce assignment-then-return pattern (#4997)
## Summary

The `RET504` rule, which looks for unnecessary assignments before return
statements, is a frequent source of issues (#4173, #4236, #4242, #1606,
#2950). Over time, we've tried to refine the logic to handle more cases.
For example, we now avoid analyzing any functions that contain any
function calls or attribute assignments, since those operations can
contain side effects (and so we mark them as a "read" on all variables
in the function -- we could do a better job with code graph analysis to
handle this limitation, but that'd be a more involved change.) We also
avoid flagging any variables that are the target of multiple
assignments. Ultimately, though, I'm not happy with the implementation
-- we just can't do sufficiently reliable analysis of arbitrary code
flow given the limited logic herein, and the existing logic is very hard
to reason about and maintain.

This PR refocuses the rule to only catch cases of the form:

```py
def f():
    x = 1
    return x
```

That is, we now only flag returns that are immediately preceded by an
assignment to the returned variable. While this is more limiting, in
some ways, it lets us flag more cases vis-a-vis the previous
implementation, since we no longer "fully eject" when functions contain
function calls and other effect-ful operations.

Closes #4173.

Closes #4236.

Closes #4242.
2023-06-10 00:05:01 -04:00
Charlie Marsh f401050878
Introduce `PythonWhitespace` to confine trim operations to Python whitespace (#4994)
## Summary

We use `.trim()` and friends in a bunch of places, to strip whitespace
from source code. However, not all Unicode whitespace characters are
considered "whitespace" in Python, which only supports the standard
space, tab, and form-feed characters.

This PR audits our usages of `.trim()`, `.trim_start()`, `.trim_end()`,
and `char::is_whitespace`, and replaces them as appropriate with a new
`.trim_whitespace()` analogues, powered by a `PythonWhitespace` trait.

In general, the only place that should continue to use `.trim()` is
content within docstrings, which don't need to adhere to Python's
semantic definitions of whitespace.

Closes #4991.
2023-06-09 21:44:50 -04:00
Charlie Marsh 1d756dc3a7
Move Python whitespace utilities into new `ruff_python_whitespace` crate (#4993)
## Summary

`ruff_newlines` becomes `ruff_python_whitespace`, and includes the
existing "universal newline" handlers alongside the Python
whitespace-specific utilities.
2023-06-10 00:59:57 +00:00
Charlie Marsh 16d1e63a5e
Respect 'is not' operators split across newlines (#4977) 2023-06-09 05:07:45 +00:00
Charlie Marsh 58d08219e8
Allow re-assignments to `__all__` (#4967) 2023-06-08 17:19:56 +00:00
Micha Reiser 68969240c5
Format Function definitions (#4951) 2023-06-08 16:07:33 +00:00
Micha Reiser 39a1f3980f
Upgrade RustPython (#4900) 2023-06-08 05:53:14 +00:00
Micha Reiser 19abee086b
Introduce `AnyFunctionDefinition` Node (#4898) 2023-06-06 20:37:46 +02:00
Charlie Marsh d1b8fe6af2
Fix round-tripping of nested functions (#4875) 2023-06-05 16:13:08 -04:00
Charlie Marsh 8938b2d555
Use `qualified_name` terminology in more structs for consistency (#4873) 2023-06-05 19:06:48 +00:00
Ryan Yang 72245960a1
implement E307 for pylint invalid str return type (#4854) 2023-06-05 17:54:15 +00:00
Micha Reiser c89d2f835e
Add to `AnyNode` and `AnyNodeRef` conversion methods to `AstNode` (#4783) 2023-06-02 08:10:41 +02:00
qdegraaf fcbf5c3fae
Add PYI034 for `flake8-pyi` plugin (#4764) 2023-06-02 02:15:57 +00:00
Charlie Marsh ab26f2dc9d
Use saturating_sub in more token-walking methods (#4773) 2023-06-01 17:16:32 -04:00
Micha Reiser be31d71849
Correctly associate own-line comments in bodies (#4671) 2023-06-01 08:12:53 +02:00
Charlie Marsh 9d0ffd33ca
Move universal newline handling into its own crate (#4729) 2023-05-31 12:00:47 -04:00
Micha Reiser 6c1ff6a85f
Upgrade RustPython (#4747) 2023-05-31 08:26:35 +00:00
Charlie Marsh f47a517e79
Enable callers to specify import-style preferences in `Importer` (#4717) 2023-05-30 16:46:19 +00:00
Charlie Marsh ea31229be0
Track `TYPE_CHECKING` blocks in `Importer` (#4593) 2023-05-30 16:18:10 +00:00
Micha Reiser 0cd453bdf0
Generic "comment to node" association logic (#4642) 2023-05-30 09:28:01 +00:00
Charlie Marsh 9741f788c7
Remove globals table from `Scope` (#4686) 2023-05-27 22:35:20 -04:00
Micha Reiser 33a7ed058f
Create `PreorderVisitor` trait (#4658) 2023-05-26 06:14:08 +00:00
Charlie Marsh 0f610f2cf7
Remove dedicated ScopeKind structs in favor of AST nodes (#4648) 2023-05-25 19:31:02 +00:00
Micha Reiser 85f094f592
Improve `Message` sorting performance (#4624) 2023-05-24 16:34:48 +02:00
Micha Reiser 2681c0e633
Add missing nodes to `AnyNodeRef` and `AnyNode` (#4608) 2023-05-23 18:30:27 +02:00
Micha Reiser 154439728a
Add `AnyNode` and `AnyNodeRef` unions (#4578) 2023-05-23 08:53:22 +02:00
Micha Reiser daadd24bde
Include decorators in `Function` and `Class` definition ranges (#4467) 2023-05-22 17:50:42 +02:00
Charlie Marsh 19c4b7bee6
Rename ruff_python_semantic's `Context` struct to `SemanticModel` (#4565) 2023-05-22 02:35:03 +00:00
Micha Reiser 2f35099f81
Remove `regex` dependency from `ruff_python_ast` (#4518) 2023-05-19 06:44:18 +00:00
Ville Skyttä 2e2ba2cb16
Avoid some false positives in dunder variable assigments (#4508) 2023-05-19 02:11:20 +00:00
Charlie Marsh e9c6f16c56
Move unparse utility methods onto Generator (#4497) 2023-05-18 15:00:46 +00:00
Charlie Marsh d3b18345c5
Move triple-quoted string detection into `Indexer` method (#4495) 2023-05-18 14:42:05 +00:00
Charlie Marsh 73efbeb581
Invert quote-style when generating code within f-strings (#4487) 2023-05-18 14:33:33 +00:00
Charlie Marsh e8e66f3824
Remove unnecessary path prefixes (#4492) 2023-05-18 10:19:09 -04:00
Charlie Marsh 14c6419bc1
Bring pycodestyle rules into full compatibility (on SciPy) (#4472) 2023-05-17 16:51:55 +00:00
Charlie Marsh d9c3f8e249
Avoid flagging missing whitespace for decorators (#4454) 2023-05-16 13:15:01 -04:00
Charlie Marsh 7e0d018b35
Avoid emitting empty logical lines (#4452) 2023-05-16 16:33:33 +00:00
Jeong, YunWon 4b05ca1198
Specialize ConversionFlag (#4450) 2023-05-16 18:00:13 +02:00
Charlie Marsh f0465bf106
Emit non-logical newlines for "empty" lines (#4444) 2023-05-16 14:58:56 +00:00
Jeong, YunWon badade3ccc
Impl `Default` for `SourceLocation` (#4328)
Co-authored-by: Micha Reiser <micha@reiser.io>
2023-05-16 07:03:43 +00:00
Micha Reiser fa26860296
Refactor range from `Attributed` to `Node`s (#4422) 2023-05-16 06:36:32 +00:00
Jonathan Plasse c10a4535b9
Disallow `unreachable_pub` (#4314) 2023-05-11 18:00:00 -04:00
Jeong, YunWon be6e00ef6e
Re-integrate RustPython parser repository (#4359)
Co-authored-by: Micha Reiser <micha@reiser.io>
2023-05-11 07:47:17 +00:00
Charlie Marsh fd34797d0f
Add a specialized `StatementVisitor` (#4349) 2023-05-10 12:42:20 -04:00
Micha Reiser 99a755f936
Add `schemars` feature (#4305) 2023-05-09 16:15:18 +02:00
Charlie Marsh c1f0661225
Replace `parents` statement stack with a `Nodes` abstraction (#4233) 2023-05-06 16:12:41 +00:00
Micha Reiser e93f378635
Refactor whitespace around operator (#4223) 2023-05-05 09:37:56 +02:00
Jonathan Plasse 8c97e7922b
Fix F811 false positive with match (#4161) 2023-04-30 14:39:45 -04:00
Micha Reiser e04ef42334
Use `memchr` to speedup newline search on x86 (#3985) 2023-04-26 20:15:47 +01:00
Micha Reiser f3e6ddda62
perf(logical-lines): Various small perf improvements (#4022) 2023-04-26 20:10:35 +01:00
Micha Reiser cab65b25da
Replace row/column based `Location` with byte-offsets. (#3931) 2023-04-26 18:11:02 +00:00
Jonathan Plasse df77595426
Move `Truthiness` into `ruff_python_ast` (#4071) 2023-04-24 04:54:31 +00:00
Micha Reiser ba4f4f4672
Upgrade dependencies (#4064) 2023-04-22 18:04:01 +01:00
Charlie Marsh 7fa1da20fb
Support relative imports in `banned-api` enforcement (#4025) 2023-04-19 14:30:13 -04:00
Charlie Marsh eb0dd74040
Avoid adding required imports to stub files (#3940) 2023-04-11 22:31:20 -04:00
Micha Reiser e8aebee3f6
Pretty print `Diagnostic`s in snapshot tests (#3906) 2023-04-11 09:03:00 +00:00
Micha Reiser c33c9dc585
Introduce SourceFile to avoid cloning the message filename (#3904) 2023-04-11 08:28:55 +00:00
Micha Reiser 056c212975
Render code frame with context (#3901) 2023-04-11 10:22:11 +02:00
Micha Reiser 381203c084
Store source code on message (#3897) 2023-04-11 07:57:36 +00:00
Micha Reiser 76c47a9a43
Cheap cloneable LineIndex (#3896) 2023-04-11 07:33:40 +00:00
Evan Rittenhouse abaf0a198d
Ensure that tab characters aren't in multi-line strings before throwing a violation (#3837) 2023-04-06 22:25:40 -04:00
Charlie Marsh d919adc13c
Introduce a `ruff_python_semantic` crate (#3865) 2023-04-04 16:50:47 +00:00
Chris Chan 10504eb9ed
Generate `ImportMap` from module path to imported dependencies (#3243) 2023-04-04 03:31:37 +00:00
Charlie Marsh f4173b2a93
Move shadow tracking into `Scope` directly (#3854) 2023-04-03 15:33:44 -04:00
Charlie Marsh 5625410936
Remove `uses_magic_variable_access` dependence on `Checker` (#3864) 2023-04-03 12:22:06 -04:00
Charlie Marsh 3744e9ab3f
Remove `contains_effect`'s dependency on `Context` (#3855) 2023-04-03 12:08:13 -04:00
Nazia Povey 849091d846
When checking module visibility, don't check entire ancestry (#3835) 2023-04-03 11:38:41 -04:00
Charlie Marsh 25771cd4b9
Use references for `Export` binding type (#3853) 2023-04-03 15:26:42 +00:00
Charlie Marsh 924bebbb4a
Change "indexes" to "indices" in various contexts (#3856) 2023-04-02 23:08:03 +00:00
Charlie Marsh 08e5b3fa61
Make `collect_call_path` return an `Option` (#3849) 2023-04-01 22:29:32 -04:00
Charlie Marsh d822e08111
Move `CallPath` into its own module (#3847) 2023-04-01 11:25:04 -04:00
Charlie Marsh 2f90157ce2
Move logging resolver into `logging.rs` (#3843) 2023-04-01 03:50:44 +00:00
Charlie Marsh 88308ef9cc
Move `Binding` structs out of `scope.rs` (#3842) 2023-03-31 23:49:48 -04:00
Charlie Marsh 6d80c79bac
Combine `operations.rs` and `helpers.rs` (#3841) 2023-04-01 03:40:34 +00:00
Charlie Marsh 2fbc620ad3
Move `__all__` utilities to `all.rs` (#3840) 2023-04-01 03:31:15 +00:00
Charlie Marsh 27e40e9b31
Remove `helpers.rs` dependency on `Binding` (#3839) 2023-04-01 03:19:45 +00:00
Charlie Marsh b6276e2d95
Move f-string identification into rule module (#3838) 2023-03-31 23:10:11 -04:00
Charlie Marsh dfc872c9a0
Track star imports on `Scope` directly (#3822) 2023-03-31 15:01:12 +00:00
Charlie Marsh cf7e1ddd08
Remove some `usize` references (#3819) 2023-03-30 17:35:42 -04:00
Charlie Marsh 01357f62e5
Add import insertion support to autofix capabilities (#3787) 2023-03-30 15:33:46 +00:00
Charlie Marsh f79506f5a4
Move some generic structs out of `isort` (#3788) 2023-03-30 08:58:01 -04:00
Charlie Marsh 8601dcc09b
Add import name resolution to `Context` (#3777) 2023-03-29 21:47:50 +00:00
Micha Reiser 595cd065f3
Reduce explcit clones (#3793) 2023-03-29 15:15:14 +02:00
Charlie Marsh 22d5b0071d
Rename `end_of_statement` to `end_of_last_statement` (#3775) 2023-03-28 12:31:06 -04:00
Micha Reiser f68c26a506
perf(pycodestyle): Initialize Stylist from tokens (#3757) 2023-03-28 11:53:35 +02:00
Micha Reiser 000394f428
perf(pycodestyle): Introduce TokenKind (#3745) 2023-03-28 11:22:39 +02:00
Dhruv Manilawala 63adf9f5e8
Allow aliased `logging` module as a logger candidate (#3718) 2023-03-24 17:19:09 -04:00
Charlie Marsh 0f95056f13
Avoid panics for implicitly concatenated forward references (#3700) 2023-03-23 19:13:50 -04:00
Charlie Marsh 028329854b
Avoid parsing f-strings in type annotations (#3699) 2023-03-23 18:51:44 -04:00
Charlie Marsh ba43d6bd0b
Avoid parsing `ForwardRef` contents as type references (#3698) 2023-03-23 18:44:02 -04:00
Charlie Marsh e8d17d23cb
Expand the scope of useless-expression (B018) (#3455) 2023-03-23 18:33:58 -04:00
Charlie Marsh 189c9d4683
Add dedicated structs for `BindingKind` variants (#3672) 2023-03-22 19:08:48 -04:00
Charlie Marsh 1b3e54231c
Flag, but don't fix, unused imports in `ModuleNotFoundError` blocks (#3658) 2023-03-22 13:03:30 -04:00
Charlie Marsh 3a8e98341b
Enable autofix for annotations within 'simple' string literals (#3657) 2023-03-22 12:45:51 -04:00
Charlie Marsh 7c0f17279c
Flag PEP 585 and PEP 604 violations in quoted annotations (#3593) 2023-03-20 11:15:44 -04:00
Charlie Marsh 4892167217
Avoid panics for implicitly-concatenated docstrings (#3584)
## Summary

In the rare event that a docstring contains an implicit string concatenation, we currently have the potential to panic, because we assume that if a string starts with triple quotes, it _ends_ with triple quotes. But with implicit concatenation, that's not the case: a single `Expr` could start and end with different quote styles, because it can contain multiple string tokens.

Supporting these "properly" is pretty hard. In some cases it's hard to even know what the "right" behavior is. So for now, I'm just detecting and warning, which is better than a panic.

Closes #3543.

Closes #3585.
2023-03-19 14:16:50 -04:00
Micha Reiser dedf4cbdeb
refactor: Move scope and binding types to `scope.rs` (#3573) 2023-03-17 17:31:33 +01:00
Micha Reiser 92179e6369
Scope and Binding IDs (#3572) 2023-03-17 17:12:27 +01:00
Charlie Marsh 373a77e8c2
Avoid C1901 violations within subscripts (#3517) 2023-03-17 02:52:05 +00:00
Micha Reiser 685c242761
refactor(ruff_python_ast): Split `get_argument` (#3478) 2023-03-13 18:18:25 +01:00
Charlie Marsh 3a5fbd6d74
Upgrade RustPython to fix Serde dependency (#3481) 2023-03-13 12:29:31 -04:00
Charlie Marsh c2750a59ab
Implement an iterator for universal newlines (#3454)
# Summary

We need to support CR line endings (as opposed to LF and CRLF line endings, which are already supported). They're rare, but they do appear in Python code, and we tend to panic on any file that uses them.

Our `Locator` abstraction now supports CR line endings. However, Rust's `str#lines` implementation does _not_.

This PR adds a `UniversalNewlineIterator` implementation that respects all of CR, LF, and CRLF line endings, and plugs it into most of the `.lines()` call sites.

As an alternative design, it could be nice if we could leverage `Locator` for this. We've already computed all of the line endings, so we could probably iterate much more efficiently?

# Test Plan

Largely relying on automated testing, however, also ran over some known failure cases, like #3404.
2023-03-13 00:01:29 -04:00
Micha Reiser d2988043af
perf: Optimize UTF8/ASCII byte offset index (#3439) 2023-03-11 13:12:10 +01:00
Charlie Marsh 0a9d259f9c
Remove copied `core` modules from `ruff_python_formatter` (#3371) 2023-03-08 19:03:40 +00:00
Charlie Marsh 130e733023
Implement `From<Located>` for `Range` (#3377) 2023-03-08 18:50:20 +00:00
Charlie Marsh ff2c0dd491
Use shared `leading_quote` implementation in ruff_python_formatter (#3396) 2023-03-08 18:21:59 +00:00
Charlie Marsh bad6bdda1f
Create a `rust_python_ast` crate (#3370)
This PR productionizes @MichaReiser's suggestion in https://github.com/charliermarsh/ruff/issues/1820#issuecomment-1440204423, by creating a separate crate for the `ast` module (`rust_python_ast`). This will enable us to further split up the `ruff` crate, as we'll be able to create (e.g.) separate sub-linter crates that have access to these common AST utilities.

This was mostly a straightforward copy (with adjustments to module imports), as the few dependencies that _did_ require modifications were handled in #3366, #3367, and #3368.
2023-03-07 15:18:40 +00:00