Commit Graph

1977 Commits

Author SHA1 Message Date
Charlie Marsh 965243d98e Remove Statements#depth 2023-08-07 12:11:50 -04:00
Charlie Marsh 9328606843
Remove `Statements#parent` (#6392)
Discussed in
https://github.com/astral-sh/ruff/pull/6351#discussion_r1284997065.
2023-08-07 15:41:02 +00:00
Dhruv Manilawala e4a4660925
Support help end escape command with priority (#6272)
## Summary

This PR adds support for help end escape command in the lexer.

### What are "help end escape commands"?

First, the escape commands are special IPython syntax which enhances the
functionality for the IPython REPL. There are 9 types of escape kinds
which are recognized by the tokens which are present at the start of the
command (`?`, `??`, `!`, `!!`, etc.).

Here, the help command is using either the `?` or `??` token at the
start (`?str.replace` for example). Those 2 tokens are also supported
when they're at the end of the command (`str.replace?`), but the other
tokens aren't supported in that position.

There are mainly two types of help end escape commands:
1. Ending with either `?` or `??`, but it also starts with one of the
escape tokens (`%matplotlib?`)
2. On the other hand, there's a stricter version for (1) which doesn't
start with any escape tokens (`str.replace?`)

This PR adds support for (1) while (2) will be supported in the parser.

### Priority

Now, if the command starts and ends with an escape token, how do we
decide the kind of this command? This is where priority comes into
picture. This is simple as there's only one priority where `?`/`??` at
the end takes priority over any other escape token and all of the other
tokens are at the same priority. Remember that only `?`/`??` at the end
is considered valid.

This is mainly useful in the case where someone would want to invoke the
help command on the magic command itself. For example, in `%matplotlib?`
the help command takes priority which means that we want help for the
`matplotlib` magic function instead of calling the magic function
itself.

### Specification

Here's where things get a bit tricky. What if there are question mark
tokens at both ends. How do we decide if it's `Help` (`?`) kind or
`Help2` (`??`) kind?

|     | Magic       | Value     | Kind    |
| --- | ---         | ---       | ---     |
| 1   | `?foo?`     | `foo`     | `Help`  |
| 2   | `??foo?`    | `foo`     | `Help`  |
| 3   | `?foo??`    | `foo`     | `Help2` |
| 4   | `??foo??`   | `foo`     | `Help2` |
| 5   | `???foo??`  | `foo`     | `Help2` |
| 6   | `??foo???`  | `foo???`  | `Help2` |
| 7   | `???foo???` | `?foo???` | `Help2` |

Looking at the above table:

- The question mark tokens on the right takes priority over the ones on
the left but only if the number of question mark on the right is 1 or 2.
- If there are more than 2 question mark tokens on the right side, then
the left side is used to determine the same.
- If the right side is used to determine the kind, then all of the
question marks and whitespaces on the left side are ignored in the
`value`, but if it’s the other way around, then all of the extra
question marks are part of the `value`.

### References

- IPython implementation using the regex:
292e3a2345/IPython/core/inputtransformer2.py (L454-L462)
- Priorities:
292e3a2345/IPython/core/inputtransformer2.py (L466-L469)

## Test Plan

Add a bunch of test cases for the lexer and verify that it matches the
behavior of
IPython transformer.

resolves: #6357
2023-08-07 21:01:02 +05:30
Charlie Marsh b21abe0a57
Use separate structs for expression and statement tracking (#6351)
## Summary

This PR fixes the performance degradation introduced in
https://github.com/astral-sh/ruff/pull/6345. Instead of using the
generic `Nodes` structs, we now use separate `Statement` and
`Expression` structs. Importantly, we can avoid tracking a bunch of
state for expressions that we need for parents: we don't need to track
reference-to-ID pointers (we just have no use-case for this -- I'd
actually like to remove this from statements too, but we need it for
branch detection right now), we don't need to track depth, etc.

In my testing, this entirely removes the regression on all-rules, and
gets us down to 2ms slower on the default rules (as a crude hyperfine
benchmark, so this is within margin of error IMO).

No behavioral changes.
2023-08-07 15:27:42 +00:00
Charlie Marsh 61d3977f95
Make the `statement` vector private on `SemanticModel` (#6348)
## Summary

Instead, expose these as methods, now that we can use a reasonable
nomenclature on the API.
2023-08-07 15:02:14 +00:00
Charlie Marsh bae87fa016
Rename semantic model methods to use `current_*` prefix (#6347)
## Summary

This PR attempts to draw a clearer divide between "methods that take
(e.g.) an expression or statement as input" and "methods that rely on
the _current_ expression or statement" in the semantic model, by
renaming methods like `stmt()` to `current_statement()`.

This had led to confusion in the past. For example, prior to this PR, we
had `scope()` (which returns the current scope), and `parent_scope`,
which returns the parent _of a scope that's passed in_. Now, the API is
clearer: `current_scope` returns the current scope, and `parent_scope`
takes a scope as argument and returns its parent.

Per above, I also changed `stmt` to `statement` and `expr` to
`expression`.
2023-08-07 14:44:49 +00:00
Charlie Marsh b763973357
Avoid hard line break after dangling open-parenthesis comments (#6380)
## Summary

Given:

```python
[  # comment
    first,
    second,
    third
]  # another comment
```

We were adding a hard line break as part of the formatting of `#
comment`, which led to the following formatting:

```python
[first, second, third]  # comment
  # another comment
```

Closes https://github.com/astral-sh/ruff/issues/6367.
2023-08-07 14:15:32 +00:00
Charlie Marsh 63692b3798
Use `parenthesized_with_dangling_comments` in arguments formatter (#6376)
## Summary

Fixes an instability whereby this:

```python
def get_recent_deployments(threshold_days: int) -> Set[str]:
    # Returns a list of deployments not older than threshold days
    # including `/root/zulip` directory if it exists.
    recent = set()
    threshold_date = datetime.datetime.now() - datetime.timedelta(  # noqa: DTZ005
        days=threshold_days
    )
```

Was being formatted as:

```python
def get_recent_deployments(threshold_days: int) -> Set[str]:
    # Returns a list of deployments not older than threshold days
    # including `/root/zulip` directory if it exists.
    recent = set()
    threshold_date = (
        datetime.datetime.now()
        - datetime.timedelta(days=threshold_days)  # noqa: DTZ005
    )
```

Which was in turn being formatted as:

```python
def get_recent_deployments(threshold_days: int) -> Set[str]:
    # Returns a list of deployments not older than threshold days
    # including `/root/zulip` directory if it exists.
    recent = set()
    threshold_date = (
        datetime.datetime.now() - datetime.timedelta(days=threshold_days)  # noqa: DTZ005
    )
```

The second-to-third formattings still differs from Black because we
aren't taking the line suffix into account when splitting
(https://github.com/astral-sh/ruff/issues/6377), but the first
formatting is correct and should be unchanged (i.e., the first-to-second
formattings is incorrect, and fixed here).

## Test Plan

`cargo run --bin ruff_dev -- format-dev --stability-check ../zulip`
2023-08-07 09:43:57 -04:00
Charlie Marsh 89e4e038b0
Store expression hierarchy in semantic model snapshots (#6345)
## Summary

When we iterate over the AST for analysis, we often process nodes in a
"deferred" manner. For example, if we're analyzing a function, we push
the function body onto a deferred stack, along with a snapshot of the
current semantic model state. Later, when we analyze the body, we
restore the semantic model state from the snapshot. This ensures that we
know the correct scope, hierarchy of statement parents, etc., when we go
to analyze the function body.

Historically, we _haven't_ included the _expression_ hierarchy in the
model snapshot -- so we track the current expression parents in the
visitor, but we never save and restore them when processing deferred
nodes. This can lead to subtle bugs, in that methods like
`expr_parent()` aren't guaranteed to be correct, if you're in a deferred
visitor.

This PR migrates expression tracking to mirror statement tracking
exactly. So we push all expressions onto an `IndexVec`, and include the
current expression on the snapshot. This ensures that `expr_parent()`
and related methods are "always correct" rather than "sometimes
correct".

There's a performance cost here, both at runtime and in terms of memory
consumption (we now store an additional pointer for every expression).
In my hyperfine testing, it's about a 1% performance decrease for
all-rules on CPython (up to 533.8ms, from 528.3ms) and a 4% performance
decrease for default-rules on CPython (up to 212ms, from 204ms).
However... I think this is worth it given the incorrectness of our
current approach. In the future, we may want to reconsider how we do
these upward traversals (e.g., with something like a red-green tree).
(**Note**: in https://github.com/astral-sh/ruff/pull/6351, the slowdown
seems to be entirely removed.)
2023-08-07 09:42:04 -04:00
Tom Kuson 5d2a4ebc99
Add documentation to `subprocess-with[out]-shell-equals-true` rules (#6373) 2023-08-07 03:48:36 +00:00
Harutaka Kawamura 9c3fbcdf4a
Add `PT011` and `PT012` docs (#6362) 2023-08-06 21:28:24 -04:00
Konrad Listwan-Ciesielski 61532e8aad
Add `DTZ003` and `DTZ004` docs (#6223)
Changes:
- Fixes typo and repeated phrase in `DTZ002`
- Adds docs for `DTZ003`
- Adds docs for `DTZ004`
- Adds example for <=Python3.10 in `DTZ001`

Related to: https://github.com/astral-sh/ruff/issues/2646
2023-08-07 01:21:14 +00:00
Charlie Marsh 9171e97d15
Avoid allocation in no-signature (#6375) 2023-08-06 15:27:56 +00:00
Charlie Marsh a5a29bb8d6
Revert change to `require_git(false)` in `WalkBuilder` (#6368)
## Summary

This was changed to fix https://github.com/astral-sh/ruff/issues/5930
(respect `.gitignore` for unzipped source repositories), but led to
undesirable behavior whereby `.gitignore` files in parent directories
are respected regardless of whether you're working in a child git
repository (see: https://github.com/astral-sh/ruff/issues/6335). The
latter is a bigger problem than the former is an important use-case to
support, so pragmatically erring on the side of a revert.

Closes https://github.com/astral-sh/ruff/issues/6335.
2023-08-05 19:45:50 +00:00
Zixuan Li be657f5e7e
Respect typing_extensions imports of Annotated for B006. (#6361)
`typing_extensions.Annotated` should be treated the same way as
`typing.Annotated`.
2023-08-05 17:39:52 +00:00
Charlie Marsh 76148ddb76
Store call paths rather than stringified names (#6102)
## Summary

Historically, we've stored "qualified names" on our
`BindingKind::Import`, `BindingKind::SubmoduleImport`, and
`BindingKind::ImportFrom` structs. In Ruff, a "qualified name" is a
dot-separated path to a symbol. For example, given `import foo.bar`, the
"qualified name" would be `"foo.bar"`; and given `from foo.bar import
baz`, the "qualified name" would be `foo.bar.baz`.

This PR modifies the `BindingKind` structs to instead store _call paths_
rather than qualified names. So in the examples above, we'd store
`["foo", "bar"]` and `["foo", "bar", "baz"]`. It turns out that this
more efficient given our data access patterns. Namely, we frequently
need to convert the qualified name to a call path (whenever we call
`resolve_call_path`), and it turns out that we do this operation enough
that those conversations show up on benchmarks.

There are a few other advantages to using call paths, rather than
qualified names:

1. The size of `BindingKind` is reduced from 32 to 24 bytes, since we no
longer need to store a `String` (only a boxed slice).
2. All three import types are more consistent, since they now all store
a boxed slice, rather than some storing an `&str` and some storing a
`String` (for `BindingKind::ImportFrom`, we needed to allocate a
`String` to create the qualified name, but the call path is a slice of
static elements that don't require that allocation).
3. A lot of code gets simpler, in part because we now do call path
resolution "earlier". Most notably, for relative imports (`from .foo
import bar`), we store the _resolved_ call path rather than the relative
call path, so the semantic model doesn't have to deal with that
resolution. (See that `resolve_call_path` is simpler, fewer branches,
etc.)

In my testing, this change improves the all-rules benchmark by another
4-5% on top of the improvements mentioned in #6047.
2023-08-05 15:21:50 +00:00
Harutaka Kawamura 501f537cb8
Avoid auto-fixing UP031 if there are comments within the right-hand side (#6364) 2023-08-05 11:14:29 -04:00
Dhruv Manilawala 1ac2699b5e
Update `F841` autofix to not remove line magic expr (#6141)
## Summary

Update `F841` autofix to not remove line magic expr

## Test Plan

Added test case for assignment statement with and without type
annotation

fixes: #6116
2023-08-05 00:45:01 +00:00
Dhruv Manilawala 32fa05765a
Use `Jupyter` mode while parsing Notebook files (#5552)
## Summary

Enable using the new `Mode::Jupyter` for the tokenizer/parser to parse
Jupyter line magic tokens.

The individual call to the lexer i.e., `lex_starts_at` done by various
rules should consider the context of the source code (is this content
from a Jupyter Notebook?). Thus, a new field `source_type` (of type
`PySourceType`) is added to `Checker` which is being passed around as an
argument to the relevant functions. This is then used to determine the
`Mode` for the lexer.

## Test Plan

Add new test cases to make sure that the magic statement is considered
while generating the diagnostic and autofix:
* For `I001`, if there's a magic statement in between two import blocks,
they should be sorted independently

fixes: #6090
2023-08-05 00:32:07 +00:00
Charlie Marsh d788957ec4
Allow capitalized names for logger candidate heuristic match (#6356)
Closes https://github.com/astral-sh/ruff/issues/6353.
2023-08-04 23:25:34 +00:00
Victor Hugo Gomes 78a370303b
[`flake8-pyi`] Add tests cases for bad imports from PYI027 to PYI022 (UP035) (#6354)
## Summary
As of version
[23.1.0](2a86db8271/CHANGELOG.md?plain=1#L158-L160),
`flake8-pyi` remove the rule `Y027`.

The errors that resulted in `PYI027` are now being emitted by `PYI022`
(`UP035`).

ref: #848 

## Test Plan

Add new tests cases.
2023-08-04 19:00:33 -04:00
Charlie Marsh 5e73345a1c
Avoid panic with positional-only arguments in `PYI019` (#6350)
## Summary

Previously, failed on methods like:

```python
@classmethod
def bad_posonly_class_method(cls: type[_S], /) -> _S: ...  # PYI019
```

Since we check if there are any positional-only or non-positional
arguments, but then do an unsafe access on `parameters.args`.

Closes https://github.com/astral-sh/ruff/issues/6349.

## Test Plan

`cargo test` (verified that `main` panics on the new fixtures)
2023-08-04 18:37:07 +00:00
Charlie Marsh b8fd69311c
Remove `ruff_python_ast` prefix in fixes.rs (#6346) 2023-08-04 16:48:20 +00:00
Charlie Marsh fa5c9cced9
Ignore same-line docstrings for lines-before and lines-after rules (#6344)
These rules assume that the docstring is on its own line. pydocstyle
treats them inconsistently, so I'm just going to disable them in this
case.

Closes https://github.com/astral-sh/ruff/issues/6329.
2023-08-04 16:08:36 +00:00
Harutaka Kawamura 08dd87e04d
Avoid auto-fixing UP032 if comments are present around format call arguments (#6342) 2023-08-04 15:37:23 +00:00
konsti 9bb21283ca
More similarity index digits (#6343)
**Summary** We were at similarity index 0.998 for django, we need more
decimal places, now we're at 0.99779.

**Test Plan** n/a
2023-08-04 17:12:33 +02:00
Charlie Marsh 4d47dfd6c0
Tweak breaking groups for comprehensions (#6321)
## Summary

Fixes some comprehension formatting by avoiding creating the group for
the comprehension itself (so that if it breaks, all parts break on their
own lines, e.g. the `for` and the `if` clauses).

Closes https://github.com/astral-sh/ruff/issues/6063.

## Test Plan

Bunch of new fixtures.
2023-08-04 14:00:54 +00:00
konsti 99baad12d8
Call chain formatting in fluent style (#6151)
Implement fluent style/call chains. See the `call_chains.py` formatting
for examples.

This isn't fully like black because in `raise A from B` they allow `A`
breaking can influence the formatting of `B` even if it is already
multiline.

Similarity index:

| project      | main  | PR    |
|--------------|-------|-------|
| build        | ???   | 0.753 |
| django       | 0.991 | 0.998 |
| transformers | 0.993 | 0.994 |
| typeshed     | 0.723 | 0.723 |
| warehouse    | 0.978 | 0.994 |
| zulip        | 0.992 | 0.994 |

Call chain formatting is affected by
https://github.com/astral-sh/ruff/issues/627, but i'm cutting scope
here.

Closes #5343

**Test Plan**:
 * Added a dedicated call chains test file
 * The ecosystem checks found some bugs
 * I manually check django and zulip formatting

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
2023-08-04 13:58:01 +00:00
Charlie Marsh 35bdbe43a8
Flag `comparison-with-itself` on builtin calls (#6324)
## Summary

Extends `comparison-with-itself` to cover simple function calls on
known-pure functions, like `id`. For example, we now flag `id(x) ==
id(x)`.

Closes https://github.com/astral-sh/ruff/issues/6276.

## Test Plan

`cargo test`
2023-08-04 09:51:41 -04:00
Charlie Marsh 3a985dd71e
Rename `CommentPlacement#then_with` to `or_else` (#6341)
Per nits in the PR.
2023-08-04 13:50:57 +00:00
Charlie Marsh 1e3fe67ca5
Refactor and rename `skip_trailing_trivia` (#6312)
Based on feedback here:
https://github.com/astral-sh/ruff/pull/6274#discussion_r1282747964.
2023-08-04 13:30:53 +00:00
Charlie Marsh 38a96c88c1
Add missing enable check for bad-string-format-character (#6340) 2023-08-04 13:27:53 +00:00
Micha Reiser f4831d5a26
Formatter comment handling nits (#6339) 2023-08-04 13:22:16 +00:00
konsti 1031bb6550
Formatter: Add SourceType to context to enable special formatting for stub files (#6331)
**Summary** This adds the information whether we're in a .py python
source file or in a .pyi stub file to enable people working on #5822 and
related issues.

I'm not completely happy with `Default` for something that depends on
the input.

**Test Plan** None, this is currently unused, i'm leaving this to first
implementation of stub file specific formatting.

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
2023-08-04 11:52:26 +00:00
David Szotten fe97a2a302
Fix panic with empty attribute inner comment (#6332)
Fixes https://github.com/astral-sh/ruff/issues/6181
2023-08-04 11:59:55 +02:00
konsti a48d16e025
Replace `Formatter<PyFormatContext<'_>>` with `PyFormatter` (#6330)
This is a refactoring to use the type alias in more places. In the
process, I had to fix and run generate.py. There are no functional
changes.
2023-08-04 10:48:58 +02:00
Charlie Marsh 8a5bc93fdd
Make the `Nodes` vector generic on node type (#6328) 2023-08-04 03:57:15 +00:00
Charlie Marsh 6da527170f
Match left-hand side `types()` call in `types-comparison` (#6326)
Follow-up to https://github.com/astral-sh/ruff/pull/6325, to avoid false
positives in cases like:

```python
if x == int:
    ...
```

Which is valid, since we don't know that we're comparing the type _of_
something -- we're comparing the type objects directly.
2023-08-03 23:01:23 -04:00
Charlie Marsh 8cddb6c08d
Include comparisons to builtin types in `type-comparison` rule (#6325)
## Summary

Extends `type-comparison` to flag:

```python
if type(obj) is int:
    pass
```

In addition to the existing cases, like:

```python
if type(obj) is type(1):
    pass
```

Closes https://github.com/astral-sh/ruff/issues/6260.
2023-08-04 02:25:19 +00:00
Victor Hugo Gomes b8ca220eeb
[`flake8-pyi`] Implement PYI055 (#6316) 2023-08-04 01:36:00 +00:00
Charlie Marsh 1d8759d5df
Generalize comment-after-bracket handling to lists, sets, etc. (#6320)
## Summary

We already support preserving the end-of-line comment in calls and type
parameters, as in:

```python
foo(  # comment
    bar,
)
```

This PR adds the same behavior for lists, sets, comprehensions, etc.,
such that we preserve:

```python
[  # comment
    1,
    2,
    3,
]
```

And related cases.
2023-08-04 01:28:05 +00:00
Charlie Marsh d3aa8b4ee0
Add API to chain comment placement operations (#6319)
## Summary

This PR adds an API for chaining comment placement methods based on the
[`then_with`](https://doc.rust-lang.org/std/cmp/enum.Ordering.html#method.then_with)
from `Ordering` in the standard library.

For example, you can now do:

```rust
try_some_case(comment).then_with(|comment| try_some_other_case_if_still_default(comment))
```

This lets us avoid this kind of pattern, which I've seen in
`placement.rs` and used myself before:

```rust
let comment = match handle_own_line_comment_between_branches(comment, preceding, locator) {
    CommentPlacement::Default(comment) => comment,
    placement => return placement,
};
```
2023-08-03 21:08:50 -04:00
Charlie Marsh 5f225b18ab
Generalize bracketed end-of-line comment handling (#6315)
Micha suggested this in
https://github.com/astral-sh/ruff/pull/6274#discussion_r1282774151, and
it allows us to unify the implementations for arguments and type params.
2023-08-03 20:51:03 +00:00
Charlie Marsh 1705fcef36
Mark trailing comments in parenthesized tests (#6287)
## Summary

This ensures that we treat `# comment` as parenthesized in contexts
like:

```python
while (
    True
    # comment
):
    pass
```

The same logic applies equally to `for`, `async for`, `if`, `with`, and
`async with`. The general pattern is that you have an expression which
precedes a colon-separated suite.
2023-08-03 20:45:03 +00:00
konsti 51ff98f9e9
Make formatter ecosystem check failure output better understandable (#6300)
**Summary** Prompted by
https://github.com/astral-sh/ruff/pull/6257#issuecomment-1661308410, it
tried to make the ecosystem script output on failure better
understandable. All log messages are now written to a file, which is
printed on error. Running locally progress is still shown.

Looking through the log output i saw that we currently log syntax errors
in input, which is confusing because they aren't actual errors, but we
don't check that these files don't change due to parser regressions or
improvements. I added `--files-with-errors` to catch that.

**Test Plan** CI
2023-08-03 20:23:25 +02:00
Charlie Marsh b3f3529499
Improve comments around `Arguments` handling in classes (#6310)
## Summary

Based on the confusion here:
https://github.com/astral-sh/ruff/pull/6274#discussion_r1282754515.

I looked into moving this logic into `placement.rs`, but I think it's
trickier than it may appear.
2023-08-03 12:34:03 -04:00
Charlie Marsh 2fa508793f
Return a slice in `StmtClassDef#bases` (#6311)
Slices are strictly more flexible, since you can always convert to an
iterator, etc., but not the other way around. Suggested in
https://github.com/astral-sh/ruff/pull/6259#discussion_r1282730994.
2023-08-03 16:21:55 +00:00
Zanie Blue 718e3945e3
Add rule to upgrade type alias annotations to keyword (UP040) (#6289)
Adds rule to convert type aliases defined with annotations i.e. `x:
TypeAlias = int` to the new PEP-695 syntax e.g. `type x = int`.

Does not support using new generic syntax for type variables, will be
addressed in a follow-up.
Added as part of pyupgrade — ~the code 100 as chosen to avoid collision
with real pyupgrade codes~.

Part of #4617 
Builds on #5062
2023-08-03 16:13:06 +00:00
Charlie Marsh c75e8a8dab
Move `ExprCall`'s `NeedsParentheses` impl into `expr_call.rs` (#6309)
Accidental move.
2023-08-03 16:01:01 +00:00
Harutaka Kawamura 74e734e962
More precise invalid expression check for `UP032` (#6308) 2023-08-03 15:49:02 +00:00