Commit Graph

39 Commits

Author SHA1 Message Date
Charlie Marsh 2d505e2b04
Remove suite body tracking from `SemanticModel` (#5848)
## Summary

The `SemanticModel` currently stores the "body" of a given `Suite`,
along with the current statement index. This is used to support "next
sibling" queries, but we only use this in exactly one place -- the rule
that simplifies constructs like this to `any` or `all`:

```python
for x in y:
    if x == 0:
        return True
return False
```

Instead of tracking the state, we can just do a (slightly more
expensive) traversal, by finding the node within its parent and
returning the next node in the body.

Note that we'll only have to do this extremely rarely -- namely, for
functions that contain something like:

```python
for x in y:
    if x == 0:
        return True
```
2023-07-18 18:58:31 -04:00
Charlie Marsh 06b5c6c06f
Use `SmallVec#extend_from_slice` in lieu of `SmallVec#extend` (#5793)
## Summary

There's a note in the docs that suggests this can be faster, and in the
benchmarks it... seems like it is? Might just be noise but held up over
a few runs.

Before:

<img width="1792" alt="Screen Shot 2023-07-15 at 9 10 06 PM"
src="https://github.com/astral-sh/ruff/assets/1309177/973cd955-d4e6-4ae3-898e-90b7eb52ecf2">

After:

<img width="1792" alt="Screen Shot 2023-07-15 at 9 10 09 PM"
src="https://github.com/astral-sh/ruff/assets/1309177/1491b391-d219-48e9-aa47-110bc7dc7f90">
2023-07-15 21:25:12 -04:00
Charlie Marsh bf4b96c5de
Differentiate between runtime and typing-time annotations (#5575)
## Summary

In Python, the annotations on `x` and `y` here have very different
treatment:

```python
def foo(x: int):
  y: int
```

The `int` in `x: int` is a runtime-required annotation, because `x` gets
added to the function's `__annotations__`. You'll notice, for example,
that this fails:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
  from foo import Bar

def f(x: Bar):
  ...
```

Because `Bar` is required to be available at runtime, not just at typing
time. Meanwhile, this succeeds:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
  from foo import Bar

def f():
  x: Bar = 1

f()
```

(Both cases are fine if you use `from __future__ import annotations`.)

Historically, we've tracked those annotations that are _not_
runtime-required via the semantic model's `ANNOTATION` flag. But
annotations that _are_ runtime-required have been treated as "type
definitions" that aren't annotations.

This causes problems for the flake8-future-annotations rules, which try
to detect whether adding `from __future__ import annotations` would
_allow_ you to rewrite a type annotation. We need to know whether we're
in _any_ type annotation, runtime-required or not, since adding `from
__future__ import annotations` will convert any runtime-required
annotation to a typing-only annotation.

This PR adds separate state to track these runtime-required annotations.
The changes in the test fixtures are correct -- these were false
negatives before.

Closes https://github.com/astral-sh/ruff/issues/5574.
2023-07-07 00:21:44 -04:00
Charlie Marsh 9e1039f823
Enable attribute lookups via semantic model (#5536)
## Summary

This PR enables us to resolve attribute accesses within files, at least
for static and class methods. For example, we can now detect that this
is a function access (and avoid a false-positive):

```python
class Class:
    @staticmethod
    def error():
        return ValueError("Something")


# OK
raise Class.error()
```

Closes #5487.

Closes #5416.
2023-07-05 15:19:14 -04:00
Charlie Marsh 0e89c94947
Run shadowed-variable analyses in deferred handlers (#5181)
## Summary

This PR extracts a bunch of complex logic from `add_binding`, instead
running the the shadowing rules in the deferred handler, thereby
decoupling the binding phase (during which we build up the semantic
model) from the analysis phase, and generally making `add_binding` much
more focused.

This was made possible by improving the semantic model to better handle
deletions -- previously, we'd "lose track" of bindings if they were
deleted, which made this kind of refactor impossible.

## Test Plan

We have good automated coverage for this, but I want to benchmark it
separately.
2023-06-29 00:08:18 +00:00
Charlie Marsh ecf61d49fa
Restore existing bindings when unbinding caught exceptions (#5256)
## Summary

In the latest release, we made some improvements to the semantic model,
but our modifications to exception-unbinding are causing some
false-positives. For example:

```py
try:
    v = 3
except ImportError as v:
    print(v)
else:
    print(v)
```

In the latest release, we started unbinding `v` after the `except`
handler. (We used to restore the existing binding, the `v = 3`, but this
was quite complicated.) Because we don't have full branch analysis, we
can't then know that `v` is still bound in the `else` branch.

The solution here modifies `resolve_read` to skip-lookup when hitting
unbound exceptions. So when store the "unbind" for `except ImportError
as v`, we save the binding that it shadowed `v = 3`, and skip to that.

Closes #5249.

Closes #5250.
2023-06-21 12:53:58 -04:00
Charlie Marsh 94abf7f088
Rename `*Importation` structs to `*Import` (#5185)
## Summary

I find "Importation" a bit awkward, it may not even be grammatically
correct here.
2023-06-19 12:09:10 -04:00
Charlie Marsh a6cf31cc89
Move `dead_scopes` to `deferred.scopes` (#5171)
## Summary

This is more consistent with the rest of the `deferred` patterns.
2023-06-18 15:57:38 +00:00
Charlie Marsh d0ad1ed0af
Replace static `CallPath` vectors with `matches!` macros (#5148)
## Summary

After #5140, I audited the codebase for similar patterns (defining a
list of `CallPath` entities in a static vector, then looping over them
to pattern-match). This PR migrates all other such cases to use `match`
and `matches!` where possible.

There are a few benefits to this:

1. It more clearly denotes the intended semantics (branches are
exclusive).
2. The compiler can help deduplicate the patterns and detect unreachable
branches.
3. Performance: in the benchmark below, the all-rules performance is
increased by nearly 10%...

## Benchmarks

I decided to benchmark against a large file in the Airflow repository
with a lot of type annotations
([`views.py`](https://raw.githubusercontent.com/apache/airflow/f03f73100e8a7d6019249889de567cb00e71e457/airflow/www/views.py)):

```
linter/default-rules/airflow/views.py
                        time:   [10.871 ms 10.882 ms 10.894 ms]
                        thrpt:  [19.739 MiB/s 19.761 MiB/s 19.781 MiB/s]
                 change:
                        time:   [-2.7182% -2.5687% -2.4204%] (p = 0.00 < 0.05)
                        thrpt:  [+2.4805% +2.6364% +2.7942%]
                        Performance has improved.

linter/all-rules/airflow/views.py
                        time:   [24.021 ms 24.038 ms 24.062 ms]
                        thrpt:  [8.9373 MiB/s 8.9461 MiB/s 8.9527 MiB/s]
                 change:
                        time:   [-8.9537% -8.8516% -8.7527%] (p = 0.00 < 0.05)
                        thrpt:  [+9.5923% +9.7112% +9.8342%]
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  5 (5.00%) high mild
  7 (7.00%) high severe
```

The impact is dramatic -- nearly a 10% improvement for `all-rules`.
2023-06-16 17:34:42 +00:00
Charlie Marsh fd1dfc3bfa
Add support for global and nonlocal symbol renames (#5134)
## Summary

In #5074, we introduced an abstraction to support local symbol renames
("local" here refers to "within a module"). However, that abstraction
didn't support `global` and `nonlocal` symbols. This PR extends it to
those cases.

Broadly, there are considerations.

First, if we're renaming a symbol in a scope in which it is declared
`global` or `nonlocal`. For example, given:

```python
x = 1

def foo():
    global x
```

Then when renaming `x` in `foo`, we need to detect that it's `global`
and instead perform the rename starting from the module scope.

Second, when renaming a symbol, we need to determine the scopes in which
it is declared `global` or `nonlocal`. This is effectively the inverse
of the above: when renaming `x` in the module scope, we need to detect
that we should _also_ rename `x` in `foo`.

To support these cases, the renaming algorithm was adjusted as follows:

- When we start a rename in a scope, determine whether the symbol is
declared `global` or `nonlocal` by looking for a `global` or `nonlocal`
binding. If it is, start the rename in the defining scope. (This
requires storing the defining scope on the `nonlocal` binding, which is
new.)
- We then perform the rename in the defining scope.
- We then check whether the symbol was declared as `global` or
`nonlocal` in any scopes, and perform the rename in those scopes too.
(Thankfully, this doesn't need to be done recursively.)

Closes #5092.

## Test Plan

Added some additional snapshot tests.
2023-06-16 14:35:10 +00:00
Charlie Marsh a33bbe6335
Track "delayed" annotations in the semantic model (#5070)
## Summary

This PR tackles a corner case that we'll need to support local symbol
renaming. It relates to a nuance in how we want handle annotations
(i.e., `AnnAssign` statements with no value, like `x: int` in a function
body).

When we see a statement like:

```python
x: int
```

We create a `BindingKind::Annotation` for `x`. This is a special
`BindingKind` that the resolver isn't allowed to return. For example,
given:

```python
x: int
print(x)
```

The second line will yield an `undefined-name` error.

So why does this `BindingKind` exist at all? In Pyflakes, to support the
`unused-annotation` lint:

```python
def f():
    x: int  # unused-annotation
```

If we don't track `BindingKind::Annotation`, we can't lint for unused
variables that are only "defined" via annotations.

There are a few other wrinkles to `BindingKind::Annotation`. One is
that, if a binding already exists in the scope, we actually just discard
the `BindingKind`. So in this case:

```python
x = 1
x: int
```

When we go to create the `BindingKind::Annotation` for the second
statement, we notice that (1) we're creating an annotation but (2) the
scope already has binding for the name -- so we just drop the binding on
the floor. This has the nice property that annotations aren't considered
to "shadow" another binding, which is important in a bunch of places
(e.g., if we have `import os; os: int`, we still consider `os` to be an
import, as we should). But it also means that these "delayed"
annotations are one of the few remaining references that we don't track
anywhere in the semantic model.

This PR adds explicit support for these via a new `delayed_annotations`
attribute on the semantic model. These should be extremely rare, but we
do need to track them if we want to support local symbol renaming.

### This isn't the right way to model this

This isn't the right way to model this.

Here's an alternative:

- Remove `BindingKind::Annotation`, and treat annotations as their own,
separate concept.
- Instead of storing a map from name to `BindingId` on each `Scope`,
store a map from name to... `SymbolId`.
- Introduce a `Symbol` abstraction, where a symbol can point to a
current binding, and a list of annotations, like:

```rust
pub struct Symbol {
  binding: Option<BindingId>,
  annotations: Vec<AnnotationId>
}
```

If we did this, we could appropriately model the semantics described
above. When we go to resolve a binding, we ignore annotations (always).
When we try to find unused variables, we look through the list of
symbols, and have sufficient information to discriminate between
annotations and bound variables. Etc.

The main downside of this `Symbol`-based approach is that it's going to
take a lot more work to implement, and it'll be less performant (we'll
be storing more data per symbol, and our binding lookups will have an
added layer of indirection).
2023-06-14 17:54:35 +00:00
Charlie Marsh c74ef77e85
Move binding accesses into `SemanticModel` method (#5084) 2023-06-14 14:07:46 +00:00
Charlie Marsh 1e497162d1
Add a dedicated read result for unbound locals (#5083)
## Summary

Small follow-up to #4888 to add a dedicated `ResolvedRead` case for
unbound locals, mostly for clarity and documentation purposes (no
behavior changes).

## Test Plan

`cargo test`
2023-06-14 09:58:48 -04:00
Charlie Marsh aa41ffcfde
Add `BindingKind` variants to represent deleted bindings (#5071)
## Summary

Our current mechanism for handling deletions (e.g., `del x`) is to
remove the symbol from the scope's `bindings` table. This "does the
right thing", in that if we then reference a deleted symbol, we're able
to determine that it's unbound -- but it causes a variety of problems,
mostly in that it makes certain bindings and references unreachable
after-the-fact.

Consider:

```python
x = 1
print(x)
del x
```

If we analyze this code _after_ running the semantic model over the AST,
we'll have no way of knowing that `x` was ever introduced in the scope,
much less that it was bound to a value, read, and then deleted --
because we effectively erased `x` from the model entirely when we hit
the deletion.

In practice, this will make it impossible for us to support local symbol
renames. It also means that certain rules that we want to move out of
the model-building phase and into the "check dead scopes" phase wouldn't
work today, since we'll have lost important information about the source
code.

This PR introduces two new `BindingKind` variants to model deletions:

- `BindingKind::Deletion`, which represents `x = 1; del x`.
- `BindingKind::UnboundException`, which represents:

```python
try:
  1 / 0
except Exception as e:
  pass
```

In the latter case, `e` gets unbound after the exception handler
(assuming it's triggered), so we want to handle it similarly to a
deletion.

The main challenge here is auditing all of our existing `Binding` and
`Scope` usages to understand whether they need to accommodate deletions
or otherwise behave differently. If you look one commit back on this
branch, you'll see that the code is littered with `NOTE(charlie)`
comments that describe the reasoning behind changing (or not) each of
those call sites. I've also augmented our test suite in preparation for
this change over a few prior PRs.

### Alternatives

As an alternative, I considered introducing a flag to `BindingFlags`,
like `BindingFlags::UNBOUND`, and setting that at the appropriate time.

This turned out to be a much more difficult change, because we tend to
match on `BindingKind` all over the place (e.g., we have a bunch of code
blocks that only run when a `BindingKind` is
`BindingKind::Importation`). As a result, introducing these new
`BindingKind` variants requires only a few changes at the client sites.
Adding a flag would've required a much wider-reaching change.
2023-06-14 09:27:24 -04:00
Charlie Marsh 1895011ac2
Document some attributes on the semantic model (#5064) 2023-06-13 20:45:24 +00:00
Charlie Marsh 364bd82aee
Don't treat annotations as resolved in forward references (#5060)
## Summary

This behavior dates back to a Pyflakes commit (5fc37cbd), which was used
to allow this test to pass:

```py
from __future__ import annotations
T: object
def f(t: T): pass
def g(t: 'T'): pass
```

But, I think this is an error. Mypy and Pyright don't accept it -- you
can only use variables as type annotations if they're type aliases
(i.e., annotated with `TypeAlias`), in which case, there has to be an
assignment on the right-hand side (see: [PEP
613](https://peps.python.org/pep-0613/)).
2023-06-13 14:47:29 -04:00
Charlie Marsh 19f972a305
Use `Scope#has` in lieu of `Scope#get` (#5051)
## Summary

These usages don't actually need the `BindingId`.
2023-06-13 15:59:53 +00:00
Charlie Marsh e86f12a1ec
Rename some methods on `SemanticModel` (#4990) 2023-06-09 19:36:59 +00:00
Davide Canton 63fdcea29e
Handled dict and set inside f-string (#4249) (#4563) 2023-06-09 04:53:13 +00:00
Charlie Marsh ae75b303f0
Avoid attributing runtime references to module-level imports (#4942) 2023-06-07 21:56:03 +00:00
Charlie Marsh 8938b2d555
Use `qualified_name` terminology in more structs for consistency (#4873) 2023-06-05 19:06:48 +00:00
Charlie Marsh 466719247b
Invert parent-shadowed bindings map (#4847) 2023-06-04 00:18:46 -04:00
Charlie Marsh 3fa4440d87
Modify semantic model API to push bindings upon creation (#4846) 2023-06-04 02:28:25 +00:00
Charlie Marsh c14896b42c
Move `Binding` initialization into `SemanticModel` (#4819) 2023-06-03 15:26:55 -04:00
Charlie Marsh 935094c2ff
Move import-name matching into methods on `BindingKind` (#4818) 2023-06-03 15:01:27 -04:00
Charlie Marsh 26b1dd0ca2
Remove `name` field from import binding kinds (#4817) 2023-06-02 23:02:47 -04:00
Charlie Marsh ea3cbcc362
Avoid enforcing native-literals rule within nested f-strings (#4488) 2023-06-02 04:00:31 +00:00
qdegraaf fcbf5c3fae
Add PYI034 for `flake8-pyi` plugin (#4764) 2023-06-02 02:15:57 +00:00
Charlie Marsh 80fa3f2bfa
Add a convenience method to check if a name is bound (#4718) 2023-05-30 01:52:41 +00:00
Charlie Marsh 9741f788c7
Remove globals table from `Scope` (#4686) 2023-05-27 22:35:20 -04:00
Charlie Marsh af433ac14d
Avoid using typing-imported symbols for runtime edits (#4649) 2023-05-26 20:36:37 -04:00
Charlie Marsh 0f610f2cf7
Remove dedicated ScopeKind structs in favor of AST nodes (#4648) 2023-05-25 19:31:02 +00:00
Charlie Marsh f0e173d9fd
Use `BindingId` copies in lieu of `&BindingId` in semantic model methods (#4633) 2023-05-24 15:55:45 +00:00
Charlie Marsh fcdc7bdd33
Remove separate `ReferenceContext` enum (#4631) 2023-05-24 15:12:38 +00:00
Charlie Marsh 8961d8eb6f
Track all read references in semantic model (#4610) 2023-05-24 14:14:27 +00:00
Charlie Marsh ba4c0a21fa
Rename `ContextFlags` to `SemanticModelFlags` (#4611) 2023-05-23 17:47:07 -04:00
Charlie Marsh 74effb40b9
Rename `index` to `binding_id` in a few iterators (#4594) 2023-05-23 03:56:00 +00:00
Micha Reiser cbe344f4d5
Rename `Checker::model` to `semantic_model` (#4573) 2023-05-22 15:14:30 +02:00
Charlie Marsh 19c4b7bee6
Rename ruff_python_semantic's `Context` struct to `SemanticModel` (#4565) 2023-05-22 02:35:03 +00:00