Commit Graph

476 Commits

Author SHA1 Message Date
Brent Westbrook 5697d21fca
[syntax-errors] Irrefutable case pattern before final case (#16905)
Summary
--

Detects irrefutable `match` cases before the final case using a modified
version
of the existing `Pattern::is_irrefutable` method from the AST crate. The
modified method helps to retrieve a more precise diagnostic range to
match what
Python 3.13 shows in the REPL.

Test Plan
--

New inline tests, as well as some updates to existing tests that had
irrefutable
patterns before the last block.
2025-03-26 12:27:16 -04:00
Brent Westbrook e4f5fe8cf7
[syntax-errors] Duplicate type parameter names (#16858)
Summary
--

Detects duplicate type parameter names in function definitions, class
definitions, and type alias statements.

I also boxed the `type_params` field on `StmtTypeAlias` to make it
easier to
`match` with functions and classes. (That's the reason for the red-knot
code
owner review requests, sorry!)

Test Plan
--

New `ruff_python_syntax_errors` unit tests.

Fixes #11119.
2025-03-21 15:06:22 -04:00
Junhson Jean-Baptiste 2a4d835132
Use the common `OperatorPrecedence` for the parser (#16747)
## Summary

This change continues to resolve #16071 (and continues the work started
in #16162). Specifically, this PR changes the code in the parser so that
it uses the `OperatorPrecedence` struct from `ruff_python_ast` instead
of its own version. This is part of an effort to get rid of the
redundant definitions of `OperatorPrecedence` throughout the codebase.

Note that this PR only makes this change for `ruff_python_parser` -- we
still want to make a similar change for the formatter (namely the
`OperatorPrecedence` defined in the expression part of the formatter,
the pattern one is different). I separated the work to keep the PRs
small and easily reviewable.

## Test Plan

Because this is an internal change, I didn't add any additional tests.
Existing tests do pass.
2025-03-21 09:40:37 +05:30
Junhson Jean-Baptiste 47c4ccff5d
Separate `BitXorOr` into `BitXor` and `BitOr` precedence (#16844)
## Summary

This change follows up on the bug-fix requested in #16747 --
`ruff_python_ast::OperatorPrecedence` had an enum variant, `BitXorOr`,
which which gave the same precedence to the `|` and `^` operators. This
goes against [Python's documentation for operator
precedence](https://docs.python.org/3/reference/expressions.html#operator-precedence),
so this PR changes the code so that it's correct.

This is part of the overall effort to unify redundant definitions of
`OperatorPrecedence` throughout the codebase (#16071)

## Test Plan

Because this is an internal change, I only ran existing tests to ensure
nothing was broken.
2025-03-20 16:13:47 +05:30
Shaygan Hooshyari 360ba095ff
[red-knot] Auto generate statement nodes (#16645)
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

<!-- What's the purpose of the change? What does it do, and why? -->

Part of #15655 

Replaced statement nodes with autogenerated ones. Reused the stuff we
introduced in #16285. Nothing except for copying the nodes to new
format.

## Test Plan

Tests run without any changes. Also moved the test that checks size of
AST nodes to `generated.rs` since all of the structs that it tests are
now there.
<!-- How was it tested? -->
2025-03-13 15:43:48 +01:00
Dhruv Manilawala 0361021863
[red-knot] Understand `typing.Callable` (#16493)
## Summary

Part of https://github.com/astral-sh/ruff/issues/15382

This PR implements a general callable type that wraps around a
`Signature` and it uses that new type to represent `typing.Callable`.

It also implements `Display` support for `Callable`. The format is as:
```
([<arg name>][: <arg type>][ = <default type>], ...) -> <return type>
```

The `/` and `*` separators are added at the correct boundary for
positional-only and keyword-only parameters. Now, as `typing.Callable`
only has positional-only parameters, the rendered signature would be:

```py
Callable[[int, str], None]
# (int, str, /) -> None
```

The `/` separator represents that all the arguments are positional-only.

The relationship methods that check assignability, subtype relationship,
etc. are not yet implemented and will be done so as a follow-up.

## Test Plan

Add test cases for display support for `Signature` and various mdtest
for `typing.Callable`.
2025-03-08 03:58:52 +00:00
Shaygan Hooshyari 23fd4927ae
Auto generate ast expression nodes (#16285)
## Summary

Part of https://github.com/astral-sh/ruff/issues/15655

- Auto generate AST nodes using definitions in `ast.toml`. I added
attributes similar to
[`Field`](https://github.com/python/cpython/blob/main/Parser/asdl.py#L67)
in ASDL to hold field information

## Test Plan

Nothing outside the `ruff_python_ast` package should change.

---------

Co-authored-by: Douglas Creager <dcreager@dcreager.net>
2025-03-05 08:25:55 -05:00
Carl Meyer dd6f6233bd
bump MSRV to 1.83 (#16294)
According to our new MSRV policy (see
https://github.com/astral-sh/ruff/issues/16370 ), bump our MSRV to 1.83
(N - 2), and autofix some new clippy lints.
2025-02-26 06:12:43 -08:00
InSync c814745643
[`flake8-self`] Ignore attribute accesses on instance-like variables (`SLF001`) (#16149) 2025-02-23 10:00:49 +00:00
Alex Waygood 16d0625dfb
Improve internal docs for various string-node APIs (#16256) 2025-02-19 16:13:45 +00:00
Alex Waygood 25920fe489
Rename `ExprStringLiteral::as_unconcatenated_string()` to `ExprStringLiteral::as_single_part_string()` (#16253) 2025-02-19 16:06:57 +00:00
Alex Waygood f50849aeef
Add `text_len()` methods to more `*Prefix` enums in `ruff_python_ast` (#16254) 2025-02-19 14:47:07 +00:00
Brent Westbrook a9efdea113
Use `ast::PythonVersion` internally in the formatter and linter (#16170)
## Summary

This PR updates the formatter and linter to use the `PythonVersion`
struct from the `ruff_python_ast` crate internally. While this doesn't
remove the need for the `linter::PythonVersion` enum, it does remove the
`formatter::PythonVersion` enum and limits the use in the linter to
deserializing from CLI arguments and config files and moves most of the
remaining methods to the `ast::PythonVersion` struct.

## Test Plan

Existing tests, with some inputs and outputs updated to reflect the new
(de)serialization format. I think these are test-specific and shouldn't
affect any external (de)serialization.

---------

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2025-02-18 12:03:13 -05:00
Alex Waygood b6b1947010
Improve API exposed on `ExprStringLiteral` nodes (#16192)
## Summary

This PR makes the following changes:
- It adjusts various callsites to use the new
`ast::StringLiteral::contents_range()` method that was introduced in
https://github.com/astral-sh/ruff/pull/16183. This is less verbose and
more type-safe than using the `ast::str::raw_contents()` helper
function.
- It adds a new `ast::ExprStringLiteral::as_unconcatenated_literal()`
helper method, and adjusts various callsites to use it. This addresses
@MichaReiser's review comment at
https://github.com/astral-sh/ruff/pull/16183#discussion_r1957334365.
There is no functional change here, but it helps readability to make it
clearer that we're differentiating between implicitly concatenated
strings and unconcatenated strings at various points.
- It renames the `StringLiteralValue::flags()` method to
`StringLiteralFlags::first_literal_flags()`. If you're dealing with an
implicitly concatenated string `string_node`,
`string_node.value.flags().closer_len()` could give an incorrect result;
this renaming makes it clearer that the `StringLiteralFlags` instance
returned by the method is only guaranteed to give accurate information
for the first `StringLiteral` contained in the `ExprStringLiteral` node.
- It deletes the unused `BytesLiteralValue::flags()` method. This seems
prone to misuse in the same way as `StringLiteralValue::flags()`: if
it's an implicitly concatenated bytestring, the `BytesLiteralFlags`
instance returned by the method would only give accurate information for
the first `BytesLiteral` in the bytestring.

## Test Plan

`cargo test`
2025-02-17 07:58:54 +00:00
Alex Waygood 61fef0a64a
Reduce memory usage of `Docstring` struct (#16183) 2025-02-16 15:23:52 +00:00
Brent Westbrook f58a54f043
Move `red_knot_python_semantic::PythonVersion` to the `ruff_python_ast` crate (#16147)
## Summary

This PR moves the `PythonVersion` struct from the
`red_knot_python_semantic` crate to the `ruff_python_ast` crate so that
it can be used more easily in the syntax error detection work. Compared
to that [prototype](https://github.com/astral-sh/ruff/pull/16090/) these
changes reduce us from 2 `PythonVersion` structs to 1.

This does not unify any of the `PythonVersion` *enums*, but I hope to
make some progress on that in a follow-up.

## Test Plan

Existing tests, this should not change any external behavior.

---------

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2025-02-14 12:48:08 -05:00
Junhson Jean-Baptiste fa28dc5ccf
[internal] Move Linter `OperatorPrecedence` into `ruff_python_ast` crate (#16162)
## Summary

This change begins to resolve #16071 by moving the `OperatorPrecedence`
structs from the `ruff_python_linter` crate into `ruff_python_ast`. This
PR also implements `precedence()` methods on the `Expr` and `ExprRef`
enums.

## Test Plan

Since this change mainly shifts existing logic, I didn't add any
additional tests. Existing tests do pass.
2025-02-14 15:55:07 +01:00
Shaygan Hooshyari 0a75a1d56b
Replace is-macro with implementation in enums (#16144) 2025-02-13 22:49:00 +00:00
Ibraheem Ahmed 69d86d1d69
Transition to salsa coarse-grained tracked structs (#15763)
## Summary

Transition to using coarse-grained tracked structs (depends on
https://github.com/salsa-rs/salsa/pull/657). For now, this PR doesn't
add any `#[tracked]` fields, meaning that any changes cause the entire
struct to be invalidated. It also changes `AstNodeRef` to be
compared/hashed by pointer address, instead of performing a deep AST
comparison.

## Test Plan

This yields a 10-15% improvement on my machine (though weirdly some runs
were 5-10% without being flagged as inconsistent by criterion, is there
some non-determinism involved?). It's possible that some of this is
unrelated, I'll try applying the patch to the current salsa version to
make sure.

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
2025-02-11 11:38:50 +01:00
Dylan 3a806ecaa1
[`flake8-annotations`] Correct syntax for `typing.Union` in suggested return type fixes for `ANN20x` rules (#16025)
When suggesting a return type as a union in Python <=3.9, we now avoid a
`TypeError` by correctly suggesting syntax like `Union[int,str,None]`
instead of `Union[int | str | None]`.
2025-02-07 17:17:20 -06:00
David Peter d296f602e7
[red-knot] Merge Markdown code blocks inside a single section (#15950)
## Summary

Allow for literate style in Markdown tests and merge multiple (unnamed)
code blocks into a single embedded file.

closes #15941

## Test Plan

- Interactively made sure that error-lines were reported correctly in
  multi-snippet sections.
2025-02-05 22:26:15 +01:00
Douglas Creager 444b055cec
[red-knot] Use ternary decision diagrams (TDDs) for visibility constraints (#15861)
We now use ternary decision diagrams (TDDs) to represent visibility
constraints. A TDD is just like a BDD ([_binary_ decision
diagram](https://en.wikipedia.org/wiki/Binary_decision_diagram)), but
with "ambiguous" as an additional allowed value. Unlike the previous
representation, TDDs are strongly normalizing, so equivalent ternary
formulas are represented by exactly the same graph node, and can be
compared for equality in constant time.

We currently have a slight 1-3% performance regression with this in
place, according to local testing. However, we also have a _5× increase_
in performance for pathological cases, since we can now remove the
recursion limit when we evaluate visibility constraints.

As follow-on work, we are now closer to being able to remove the
`simplify_visibility_constraint` calls in the semantic index builder. In
the vast majority of cases, we now see (for instance) that the
visibility constraint after an `if` statement, for bindings of symbols
that weren't rebound in any branch, simplifies back to `true`. But there
are still some cases we generate constraints that are cyclic. With
fixed-point cycle support in salsa, or with some careful analysis of the
still-failing cases, we might be able to remove those.
2025-02-04 14:32:11 -05:00
Alex Waygood cb71393332
Simplify the `StringFlags` trait (#15944) 2025-02-04 18:14:28 +00:00
Brent Westbrook b5e5271adf
Preserve triple quotes and prefixes for strings (#15818)
## Summary

This is a follow-up to #15726, #15778, and #15794 to preserve the triple
quote and prefix flags in plain strings, bytestrings, and f-strings.

I also added a `StringLiteralFlags::without_triple_quotes` method to
avoid passing along triple quotes in rules like SIM905 where it might
not make sense, as discussed
[here](https://github.com/astral-sh/ruff/pull/15726#discussion_r1930532426).

## Test Plan

Existing tests, plus many new cases in the `generator::tests::quote`
test that should cover all combinations of quotes and prefixes, at least
for simple string bodies.

Closes #7799 when combined with #15694, #15726, #15778, and #15794.

---------

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2025-02-04 08:41:06 -05:00
Alex Waygood d9a1034db0
Add convenience helper methods for AST nodes representing function parameters (#15871) 2025-02-01 17:16:32 +00:00
Brent Westbrook 23c98849fc
Preserve quotes in generated f-strings (#15794)
## Summary

This is another follow-up to #15726 and #15778, extending the
quote-preserving behavior to f-strings and deleting the now-unused
`Generator::quote` field.

## Details
I also made one unrelated change to `rules/flynt/helpers.rs` to remove a
`to_string` call for making a `Box<str>` and tweaked some arguments to
some of the `Generator::unparse_f_string` methods to make the code
easier to follow, in my opinion. Happy to revert especially the latter
of these if needed.

Unfortunately this still does not fix the issue in #9660, which appears
to be more of an escaping issue than a quote-preservation issue. After
#15726, the result is now `a = f'# {"".join([])}' if 1 else ""` instead
of `a = f"# {''.join([])}" if 1 else ""` (single quotes on the outside
now), but we still don't have the desired behavior of double quotes
everywhere on Python 3.12+. I added a test for this but split it off
into another branch since it ended up being unaddressed here, but my
`dbg!` statements showed the correct preferred quotes going into
[`UnicodeEscape::with_preferred_quote`](https://github.com/astral-sh/ruff/blob/main/crates/ruff_python_literal/src/escape.rs#L54).

## Test Plan

Existing rule and `Generator` tests.

---------

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2025-01-29 13:28:22 -05:00
Brent Westbrook 98d20a8219
Preserve quotes in generated byte strings (#15778)
## Summary

This is a very closely related follow-up to #15726, adding the same
quote-preserving behavior to bytestrings. Only one rule (UP018) was
affected this time, and it was easy to mirror the plain string changes.

## Test Plan

Existing tests
2025-01-28 08:19:40 -05:00
Brent Westbrook 9bf138c45a
Preserve quote style in generated code (#15726)
## Summary

This is a first step toward fixing #7799 by using the quoting style
stored in the `flags` field on `ast::StringLiteral`s to select a quoting
style. This PR does not include support for f-strings or byte strings.

Several rules also needed small updates to pass along existing quoting
styles instead of using `StringLiteralFlags::default()`. The remaining
snapshot changes are intentional and should preserve the quotes from the
input strings.

## Test Plan

Existing tests with some accepted updates, plus a few new RUF055 tests
for raw strings.

---------

Co-authored-by: Alex Waygood <alex.waygood@gmail.com>
2025-01-27 13:41:03 -05:00
Brent Westbrook cffd1866ce
Preserve raw string prefix and escapes (#15694)
## Summary

Fixes #9663 and also improves the fixes for
[RUF055](https://docs.astral.sh/ruff/rules/unnecessary-regular-expression/)
since regular expressions are often written as raw strings.

This doesn't include raw f-strings.

## Test Plan

Existing snapshots for RUF055 and PT009, plus a new `Generator` test and
a regression test for the reported `PIE810` issue.
2025-01-23 12:12:10 -05:00
Douglas Creager ef85c682bd
Remove customizable reference enum names (#15647)
The AST generator creates a reference enum for each syntax group — an
enum where each variant contains a reference to the relevant syntax
node. Previously you could customize the name of the reference enum for
a group — primarily because there was an existing `ExpressionRef` type
that wouldn't have lined up with the auto-derived name `ExprRef`. This
follow-up PR is a simple search/replace to switch over to the
auto-derived name, so that we can remove this customization point.
2025-01-21 13:46:31 -05:00
Douglas Creager fa546b20a6
Separate grouped and ungrouped nodes more clearly in AST generator (#15646)
This is a minor cleanup to the AST generation script to make a clearer
separation between nodes that do appear in a group enum, and those that
don't. There are some types and methods that we create for every syntax
node, and others that refer to the group that the syntax node belongs
to, and which therefore don't make sense for ungrouped nodes. This new
separation makes it clearer which category each definition is in, since
you're either inside of a `for group in ast.groups` loop, or a `for node
in ast.all_nodes` loop.
2025-01-21 13:37:18 -05:00
Calum Young 023c52d82b
Standardise ruff config (#15558) 2025-01-21 12:09:11 +01:00
Douglas Creager 98ef564170
Remove `AstNode` and `AnyNode` (#15479)
While looking into potential AST optimizations, I noticed the `AstNode`
trait and `AnyNode` type aren't used anywhere in Ruff or Red Knot. It
looks like they might be historical artifacts of previous ways of
consuming AST nodes?

- `AstNode::cast`, `AstNode::cast_ref`, and `AstNode::can_cast` are not
used anywhere.
- Since `cast_ref` isn't needed anymore, the `Ref` associated type isn't
either.

This is a pure refactoring, with no intended behavior changes.
2025-01-17 17:11:00 -05:00
Douglas Creager 8e3633f55a
Auto-generate AST boilerplate (#15544)
This PR replaces most of the hard-coded AST definitions with a
generation script, similar to what happens in `rust_python_formatter`.
I've replaced every "rote" definition that I could find, where the
content is entirely boilerplate and only depends on what syntax nodes
there are and which groups they belong to.

This is a pretty massive diff, but it's entirely a refactoring. It
should make absolutely no changes to the API or implementation. In
particular, this required adding some configuration knobs that let us
override default auto-generated names where they don't line up with
types that we created previously by hand.

## Test plan

There should be no changes outside of the `rust_python_ast` crate, which
verifies that there were no API changes as a result of the
auto-generation. Aggressive `cargo clippy` and `uvx pre-commit` runs
after each commit in the branch.

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2025-01-17 14:23:02 -05:00
InSync aed0bf1c11
[`ruff`] `itertools.starmap(..., zip(...))` (`RUF058`) (#15483)
Co-authored-by: Micha Reiser <micha@reiser.io>
2025-01-16 15:18:12 +01:00
Micha Reiser c39ca8fe6d
Upgrade Rust toolchain to 1.84.0 (#15408) 2025-01-11 09:51:58 +01:00
Micha Reiser 2b28d566a4
Associate a trailing end-of-line comment in a parenthesized implicit concatenated string with the last literal (#15378) 2025-01-10 19:21:34 +01:00
w0nder1ng 0837cdd931
[`RUF`] Add rule to detect empty literal in deque call (`RUF025`) (#15104) 2025-01-03 11:57:13 +01:00
David Salvisberg 1ef0f615f1
[`flake8-type-checking`] Improve flexibility of `runtime-evaluated-decorators` (#15204)
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-12-31 16:28:10 +00:00
Arnav Gupta 3c9021ffcb
[ruff] Implement falsy-dict-get-fallback (RUF056) (#15160)
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-12-31 11:40:51 +01:00
Rebecca Chen 112e9d2d82
[ruff_python_ast] Add name and default functions to TypeParam. (#14964)
## Summary

This change adds `name` and `default` functions to `TypeParam` to access
the corresponding attributes more conveniently. I currently have these
as helper functions in code built on top of ruff_python_ast, and they
seemed like they might be generally useful.

## Test Plan

Ran the checks listed in CONTRIBUTING.md#development.

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-12-15 12:04:51 +00:00
David Peter d2712c7669
ruff_python_ast: Make `Singleton` `Copy` (#14943)
## Summary

Minor changed pulled out from #14759, as it seems to make sense in
isolation.

## Test Plan

—
2024-12-12 20:49:54 +01:00
InSync fda8b1f884
[`ruff`] Unnecessary cast to `int` (`RUF046`) (#14697)
## Summary

Resolves #11412.

## Test Plan

`cargo nextest run` and `cargo insta test`.

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
2024-12-05 10:30:06 +01:00
Micha Reiser b63c2e126b
Upgrade Rust toolchain to 1.83 (#14677) 2024-11-29 12:05:05 +00:00
Dhruv Manilawala f3dac27e9a
Fix f-string formatting in assignment statement (#14454)
## Summary

fixes: #13813

This PR fixes a bug in the formatting assignment statement when the
value is an f-string.

This is resolved by using custom best fit layouts if the f-string is (a)
not already a flat f-string (thus, cannot be multiline) and (b) is not a
multiline string (thus, cannot be flattened). So, it is used in cases
like the following:
```py
aaaaaaaaaaaaaaaaaa = f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
    expression}moreeeeeeeeeeeeeeeee"
```
Which is (a) `FStringLayout::Multiline` and (b) not a multiline.

There are various other examples in the PR diff along with additional
explanation and context as code comments.

## Test Plan

Add multiple test cases for various scenarios.
2024-11-26 15:07:18 +05:30
Charlie Marsh de62e39eba
Use truthiness check in `auto_attribs` detection (#14562) 2024-11-23 22:06:10 -05:00
Micha Reiser 2b58705cc1
Remove the optional salsa dependency from the AST crate (#14363) 2024-11-15 16:46:04 +00:00
InSync be69f61b3e
[`flake8-simplify`] Infer "unknown" truthiness for literal iterables whose items are all unpacks (`SIM222`) (#14263)
## Summary

Resolves #14237.

## Test Plan

`cargo nextest run` and `cargo insta test`.

---------

Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
2024-11-11 15:23:34 -05:00
Micha Reiser bc0586d922
Avoid cloning `Name` when looking up function and class types (#14092) 2024-11-04 15:52:59 +01:00
Charlie Marsh e00594e8d2
Respect hash-equivalent literals in `iteration-over-set` (#14063)
## Summary

Closes https://github.com/astral-sh/ruff/issues/14049.
2024-11-03 18:44:52 +00:00
Charlie Marsh 4a3eeeff86
Remove `HashableExpr` abstraction (#14057)
## Summary

It looks like `ComparableExpr` now implements `Hash` so we can just
remove this.
2024-11-02 20:28:35 +00:00
Charlie Marsh 71536a43db
Add remaining augmented assignment dunders (#13985)
## Summary

See: https://github.com/astral-sh/ruff/issues/12699
2024-10-30 13:02:29 +00:00
Charlie Marsh b6847b371e
Skip namespace package enforcement for PEP 723 scripts (#13974)
## Summary

Vendors the PEP 723 parser from
[uv](debe67ffdb/crates/uv-scripts/src/lib.rs (L283)).

Closes https://github.com/astral-sh/ruff/issues/13912.
2024-10-29 02:11:31 +00:00
Micha Reiser 9f3a38d408
Extract `LineIndex` independent methods from `Locator` (#13938) 2024-10-28 07:53:41 +00:00
TomerBin 35f007f17f
[red-knot] Type narrow in else clause (#13918)
## Summary

Add support for type narrowing in elif and else scopes as part of
#13694.

## Test Plan

- mdtest
- builder unit test for union negation.

---------

Co-authored-by: Carl Meyer <carl@astral.sh>
2024-10-26 16:22:57 +00:00
Micha Reiser 73ee72b665
Join implicit concatenated strings when they fit on a line (#13663) 2024-10-24 11:52:22 +02:00
Micha Reiser e402e27a09
Use referencial equality in `traversal` helper methods (#13895) 2024-10-24 11:30:22 +02:00
Micha Reiser 2f88f84972
Alternate quotes for strings inside f-strings in preview (#13860) 2024-10-23 07:57:53 +02:00
Alex Waygood 72adb09bf3
Simplify iteration idioms (#13834)
Remove unnecessary uses of `.as_ref()`, `.iter()`, `&**` and similar, mostly in situations when iterating over variables. Many of these changes are only possible following #13826, when we bumped our MSRV to 1.80: several useful implementations on `&Box<[T]>` were only stabilised in Rust 1.80. Some of these changes we could have done earlier, however.
2024-10-20 22:25:27 +01:00
Micha Reiser 27c50bebec
Bump MSRV to Rust 1.80 (#13826) 2024-10-20 10:55:36 +02:00
Neil Mitchell 2d2baeca23
[python_ast] Make the iter_mut functions public (#13542) 2024-10-19 20:04:00 +01:00
Carl Meyer f4b5e70fae
[red-knot] binary arithmetic on instances (#13800)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-10-19 15:22:54 +00:00
Micha Reiser 8f5b2aac9a
Refactor: Remove `StringPart` and `AnyStringPart` in favor of `StringLikePart` (#13772) 2024-10-16 12:52:06 +02:00
Alex Waygood 5b4afd30ca
Harmonise methods for distinguishing different Python source types (#13682) 2024-10-09 13:18:52 +00:00
Dylan 14ee5dbfde
[refurb] Count codepoints not bytes for `slice-to-remove-prefix-or-suffix (FURB188)` (#13631) 2024-10-07 16:13:28 +02:00
Sid 31ca1c3064
[`flake8-async`] allow async generators (`ASYNC100`) (#13639)
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

Treat async generators as "await" in ASYNC100.

Fixes #13637

## Test Plan

Updated snapshot
2024-10-07 07:25:54 -05:00
Dylan f27a8b8c7a
[internal] `ComparableExpr` (f)strings and bytes made invariant under concatenation (#13301) 2024-09-25 16:58:57 +02:00
Dhruv Manilawala 62c7d8f6ba
[red-knot] Add control flow support for match statement (#13241)
## Summary

This PR adds support for control flow for match statement.

It also adds the necessary infrastructure required for narrowing
constraints in case blocks and implements the logic for
`PatternMatchSingleton` which is either `None` / `True` / `False`. Even
after this the inferred type doesn't get simplified completely, there's
a TODO for that in the test code.

## Test Plan

Add test cases for control flow for (a) when there's a wildcard pattern
and (b) when there isn't. There's also a test case to verify the
narrowing logic.

---------

Co-authored-by: Carl Meyer <carl@astral.sh>
2024-09-10 02:14:19 +05:30
Dhruv Manilawala 47f0b45be3
Implement `AstNode` for `Identifier` (#13207)
## Summary

Follow-up to #13147, this PR implements the `AstNode` for `Identifier`.
This makes it easier to create the `NodeKey` in red knot because it uses
a generic method to construct the key from `AnyNodeRef` and is important
for definitions that are created only on identifiers instead of
`ExprName`.

## Test Plan

`cargo test` and `cargo clippy`
2024-09-02 16:27:12 +05:30
Teodoro Freund b9c8113a8a
Added bytes type and some inference (#13061)
## Summary

This PR adds the `bytes` type to red-knot:
- Added the `bytes` type
- Added support for bytes literals
- Support for the `+` operator

Improves on #12701 

Big TODO on supporting and normalizing r-prefixed bytestrings
(`rb"hello\n"`)

## Test Plan

Added a test for a bytes literals, concatenation, and corner values
2024-08-22 13:27:15 -07:00
Alex Waygood aa0db338d9
Implement `iter()`, `len()` and `is_empty()` for all display-literal AST nodes (#12807) 2024-08-12 10:39:28 +00:00
Micha Reiser 138e70bd5c
Upgrade to Rust 1.80 (#12586) 2024-07-30 19:18:08 +00:00
Charlie Marsh d930052de8
Move required import parsing out of lint rule (#12536)
## Summary

Instead, make it part of the serialization and deserialization itself.
This makes it _much_ easier to reuse when solving
https://github.com/astral-sh/ruff/issues/12458.
2024-07-26 13:35:45 -04:00
Micha Reiser 0c72577b5d
[red-knot] Add notebook support (#12338) 2024-07-17 08:26:33 +00:00
renovate[bot] 8ad10b9307
Update Rust crate compact_str to 0.8.0 (#12333)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-07-15 06:03:23 +00:00
Charlie Marsh 18c364d5df
[`flake8-bandit`] Support explicit string concatenations in S310 HTTP detection (#12315)
Closes https://github.com/astral-sh/ruff/issues/12314.
2024-07-14 10:44:08 -04:00
Gaétan Lepage d0298dc26d
Explicitly add schemars to ruff_python_ast Cargo.toml (#12275)
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-07-11 06:46:34 +00:00
Micha Reiser 5109b50bb3
Use `CompactString` for `Identifier` (#12101) 2024-07-01 10:06:02 +02:00
Micha Reiser d4dd96d1f4
red-knot: `source_text`, `line_index`, and `parsed_module` queries (#11822) 2024-06-13 07:37:02 +00:00
Micha Reiser 32ca704956
Rename `PreorderVisitor` to `SourceOrderVisitor` (#11798)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-06-07 17:01:58 +00:00
Dhruv Manilawala bf5b62edac
Maintain synchronicity between the lexer and the parser (#11457)
## Summary

This PR updates the entire parser stack in multiple ways:

### Make the lexer lazy

* https://github.com/astral-sh/ruff/pull/11244
* https://github.com/astral-sh/ruff/pull/11473

Previously, Ruff's lexer would act as an iterator. The parser would
collect all the tokens in a vector first and then process the tokens to
create the syntax tree.

The first task in this project is to update the entire parsing flow to
make the lexer lazy. This includes the `Lexer`, `TokenSource`, and
`Parser`. For context, the `TokenSource` is a wrapper around the `Lexer`
to filter out the trivia tokens[^1]. Now, the parser will ask the token
source to get the next token and only then the lexer will continue and
emit the token. This means that the lexer needs to be aware of the
"current" token. When the `next_token` is called, the current token will
be updated with the newly lexed token.

The main motivation to make the lexer lazy is to allow re-lexing a token
in a different context. This is going to be really useful to make the
parser error resilience. For example, currently the emitted tokens
remains the same even if the parser can recover from an unclosed
parenthesis. This is important because the lexer emits a
`NonLogicalNewline` in parenthesized context while a normal `Newline` in
non-parenthesized context. This different kinds of newline is also used
to emit the indentation tokens which is important for the parser as it's
used to determine the start and end of a block.

Additionally, this allows us to implement the following functionalities:
1. Checkpoint - rewind infrastructure: The idea here is to create a
checkpoint and continue lexing. At a later point, this checkpoint can be
used to rewind the lexer back to the provided checkpoint.
2. Remove the `SoftKeywordTransformer` and instead use lookahead or
speculative parsing to determine whether a soft keyword is a keyword or
an identifier
3. Remove the `Tok` enum. The `Tok` enum represents the tokens emitted
by the lexer but it contains owned data which makes it expensive to
clone. The new `TokenKind` enum just represents the type of token which
is very cheap.

This brings up a question as to how will the parser get the owned value
which was stored on `Tok`. This will be solved by introducing a new
`TokenValue` enum which only contains a subset of token kinds which has
the owned value. This is stored on the lexer and is requested by the
parser when it wants to process the data. For example:
8196720f80/crates/ruff_python_parser/src/parser/expression.rs (L1260-L1262)

[^1]: Trivia tokens are `NonLogicalNewline` and `Comment`

### Remove `SoftKeywordTransformer`

* https://github.com/astral-sh/ruff/pull/11441
* https://github.com/astral-sh/ruff/pull/11459
* https://github.com/astral-sh/ruff/pull/11442
* https://github.com/astral-sh/ruff/pull/11443
* https://github.com/astral-sh/ruff/pull/11474

For context,
https://github.com/RustPython/RustPython/pull/4519/files#diff-5de40045e78e794aa5ab0b8aacf531aa477daf826d31ca129467703855408220
added support for soft keywords in the parser which uses infinite
lookahead to classify a soft keyword as a keyword or an identifier. This
is a brilliant idea as it basically wraps the existing Lexer and works
on top of it which means that the logic for lexing and re-lexing a soft
keyword remains separate. The change here is to remove
`SoftKeywordTransformer` and let the parser determine this based on
context, lookahead and speculative parsing.

* **Context:** The transformer needs to know the position of the lexer
between it being at a statement position or a simple statement position.
This is because a `match` token starts a compound statement while a
`type` token starts a simple statement. **The parser already knows
this.**
* **Lookahead:** Now that the parser knows the context it can perform
lookahead of up to two tokens to classify the soft keyword. The logic
for this is mentioned in the PR implementing it for `type` and `match
soft keyword.
* **Speculative parsing:** This is where the checkpoint - rewind
infrastructure helps. For `match` soft keyword, there are certain cases
for which we can't classify based on lookahead. The idea here is to
create a checkpoint and keep parsing. Based on whether the parsing was
successful and what tokens are ahead we can classify the remaining
cases. Refer to #11443 for more details.

If the soft keyword is being parsed in an identifier context, it'll be
converted to an identifier and the emitted token will be updated as
well. Refer
8196720f80/crates/ruff_python_parser/src/parser/expression.rs (L487-L491).

The `case` soft keyword doesn't require any special handling because
it'll be a keyword only in the context of a match statement.

### Update the parser API

* https://github.com/astral-sh/ruff/pull/11494
* https://github.com/astral-sh/ruff/pull/11505

Now that the lexer is in sync with the parser, and the parser helps to
determine whether a soft keyword is a keyword or an identifier, the
lexer cannot be used on its own. The reason being that it's not
sensitive to the context (which is correct). This means that the parser
API needs to be updated to not allow any access to the lexer.

Previously, there were multiple ways to parse the source code:
1. Passing the source code itself
2. Or, passing the tokens

Now that the lexer and parser are working together, the API
corresponding to (2) cannot exists. The final API is mentioned in this
PR description: https://github.com/astral-sh/ruff/pull/11494.

### Refactor the downstream tools (linter and formatter)

* https://github.com/astral-sh/ruff/pull/11511
* https://github.com/astral-sh/ruff/pull/11515
* https://github.com/astral-sh/ruff/pull/11529
* https://github.com/astral-sh/ruff/pull/11562
* https://github.com/astral-sh/ruff/pull/11592

And, the final set of changes involves updating all references of the
lexer and `Tok` enum. This was done in two-parts:
1. Update all the references in a way that doesn't require any changes
from this PR i.e., it can be done independently
	* https://github.com/astral-sh/ruff/pull/11402
	* https://github.com/astral-sh/ruff/pull/11406
	* https://github.com/astral-sh/ruff/pull/11418
	* https://github.com/astral-sh/ruff/pull/11419
	* https://github.com/astral-sh/ruff/pull/11420
	* https://github.com/astral-sh/ruff/pull/11424
2. Update all the remaining references to use the changes made in this
PR

For (2), there were various strategies used:
1. Introduce a new `Tokens` struct which wraps the token vector and add
methods to query a certain subset of tokens. These includes:
	1. `up_to_first_unknown` which replaces the `tokenize` function
2. `in_range` and `after` which replaces the `lex_starts_at` function
where the former returns the tokens within the given range while the
latter returns all the tokens after the given offset
2. Introduce a new `TokenFlags` which is a set of flags to query certain
information from a token. Currently, this information is only limited to
any string type token but can be expanded to include other information
in the future as needed. https://github.com/astral-sh/ruff/pull/11578
3. Move the `CommentRanges` to the parsed output because this
information is common to both the linter and the formatter. This removes
the need for `tokens_and_ranges` function.

## Test Plan

- [x] Update and verify the test snapshots
- [x] Make sure the entire test suite is passing
- [x] Make sure there are no changes in the ecosystem checks
- [x] Run the fuzzer on the parser
- [x] Run this change on dozens of open-source projects

### Running this change on dozens of open-source projects

Refer to the PR description to get the list of open source projects used
for testing.

Now, the following tests were done between `main` and this branch:
1. Compare the output of `--select=E999` (syntax errors)
2. Compare the output of default rule selection
3. Compare the output of `--select=ALL`

**Conclusion: all output were same**

## What's next?

The next step is to introduce re-lexing logic and update the parser to
feed the recovery information to the lexer so that it can emit the
correct token. This moves us one step closer to having error resilience
in the parser and provides Ruff the possibility to lint even if the
source code contains syntax errors.
2024-06-03 18:23:50 +05:30
Tushar Sadhwani e0169d8dea
[`flake8-pyi`] Implement `PYI064` (#11325)
## Summary

Implements `Y064` from `flake8-pyi` and its autofix.

## Test Plan

`cargo test` / `cargo insta review`
2024-05-28 23:57:13 +00:00
Charlie Marsh 16acd4913f
Remove some unused `pub` functions (#11576)
## Summary

I left anything in `red-knot`, any `with_` methods, etc.
2024-05-28 09:56:51 -04:00
Ahmed Ilyas b36c713279
Consider irrefutable pattern similar to `if .. else` for `C901` (#11565)
## Summary

Follow up to https://github.com/astral-sh/ruff/pull/11521

Removes the extra added complexity for catch all match cases. This
matches the implementation of plain `else` statements.

## Test Plan
Added new test cases.

---------

Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
2024-05-27 17:33:36 +00:00
Dhruv Manilawala e28e737296
Update `FStringElements` to deref to a slice (#11570)
Ref: https://github.com/astral-sh/ruff/pull/11400#discussion_r1615600354
2024-05-27 15:52:13 +00:00
Alex Waygood 246a3388ee
Implement a common trait for the string flags (#11564) 2024-05-27 16:02:01 +01:00
Alex Waygood 6963f75a14
Move string-prefix enumerations to a separate submodule (#11425)
## Summary

This moves the string-prefix enumerations in `ruff_python_ast` to a
separate submodule. I think this helps clarify that these prefixes are
purely abstract: they only depend on each other, and do not depend on
any of the other code in `nodes.rs` in any way. Moreover, while various
AST nodes _use_ them, they're not really nodes themselves, so they feel
slightly out of place in `nodes.rs`.

I considered moving all of them to `str.rs`, but it felt like enough
code that it could be a separate submodule.

## Test Plan

`cargo test`
2024-05-15 07:40:27 -04:00
Dhruv Manilawala c3c87e86ef
Implement `IntoIterator` for `FStringElements` (#11410)
A change which I lost somewhere when I force pushed in
https://github.com/astral-sh/ruff/pull/11400
2024-05-13 16:24:49 +00:00
Dhruv Manilawala ca99e9e2f0
Move `W605` to the AST checker (#11402)
## Summary

This PR moves the `W605` rule to the AST checker.

This is part of #11401

## Test Plan

`cargo test`
2024-05-13 16:13:06 +00:00
Dhruv Manilawala 4b41e4de7f
Create a newtype wrapper around `Vec<FStringElement>` (#11400)
## Summary

This PR adds a newtype wrapper around `Vec<FStringElement>` that derefs
to a `&Vec<FStringElement>`.

Both f-string and format specifier are made up of `Vec<FStringElement>`.
By creating a newtype wrapper around it, we can share the methods for
both parent types.
2024-05-13 16:04:04 +00:00
Dhruv Manilawala 0dc130e841
Add `Iterator` impl for `StringLike` parts (#11399)
## Summary

This PR adds support to iterate over each part of a string-like
expression.

This similar to the one in the formatter:


128414cd95/crates/ruff_python_formatter/src/string/any.rs (L121-L125)

Although I don't think it's a 1-1 replacement in the formatter because
the one implemented in the formatter has another information for certain
variants (as can be seen for `FString`).

The main motivation for this is to avoid duplication for rules which
work only on the parts of the string and doesn't require any information
from the parent node. Here, the parent node being the expression node
which could be an implicitly concatenated string.

This PR also updates certain rule implementation to make use of this and
avoids logic duplication.
2024-05-13 15:52:03 +00:00
Charlie Marsh af60d539ab
Move sub-crates to workspace dependencies (#11407)
## Summary

This matches the setup we use in `uv` and allows for consistency in the
`Cargo.toml` files.
2024-05-13 14:37:50 +00:00
Dhruv Manilawala 6ecb4776de
Rename `AnyStringKind` -> `AnyStringFlags` (#11405)
## Summary

This PR renames `AnyStringKind` to `AnyStringFlags` and `AnyStringFlags`
to `AnyStringFlagsInner`.

The main motivation is to have consistent usage of "kind" and "flags".
For each string kind, it's "flags" like `StringLiteralFlags`,
`BytesLiteralFlags`, and `FStringFlags` but it was `AnyStringKind` for
the "any" variant.
2024-05-13 13:18:07 +00:00
Charlie Marsh 6cec82fff8
Get `cargo shear` passing (#11392)
## Summary

Remove some unused dependencies, add a few ignores.
2024-05-13 01:56:24 +00:00
Charlie Marsh 35ba3c91ce
Use `u64` instead of `i64` in Int type (#11356)
## Summary

I believe the value here is always unsigned, since we represent `-42` as
a unary operator on `42`.
2024-05-10 13:35:15 +00:00
Micha Reiser 22639c5a2a
Move all module from the AST to the semantic crate (#11330) 2024-05-08 08:56:50 +00:00
Alex Waygood 6774f27f4b
Refactor the `ExprDict` node (#11267)
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-05-07 11:46:10 +00:00
Dhruv Manilawala 28cc71fb6b
Remove cyclic dev dependency with the parser crate (#11261)
## Summary

This PR removes the cyclic dev dependency some of the crates had with
the parser crate.

The cyclic dependencies are:
* `ruff_python_ast` has a **dev dependency** on `ruff_python_parser` and
`ruff_python_parser` directly depends on `ruff_python_ast`
* `ruff_python_trivia` has a **dev dependency** on `ruff_python_parser`
and `ruff_python_parser` has an indirect dependency on
`ruff_python_trivia` (`ruff_python_parser` - `ruff_python_ast` -
`ruff_python_trivia`)

Specifically, this PR does the following:
* Introduce two new crates
* `ruff_python_ast_integration_tests` and move the tests from the
`ruff_python_ast` crate which uses the parser in this crate
* `ruff_python_trivia_integration_tests` and move the tests from the
`ruff_python_trivia` crate which uses the parser in this crate

### Motivation

The main motivation for this PR is to help development. Before this PR,
`rust-analyzer` wouldn't provide any intellisense in the
`ruff_python_parser` crate regarding the symbols in `ruff_python_ast`
crate.

```
[ERROR][2024-05-03 13:47:06] .../vim/lsp/rpc.lua:770	"rpc"	"/Users/dhruv/.cargo/bin/rust-analyzer"	"stderr"	"[ERROR project_model::workspace] cyclic deps: ruff_python_parser(Idx::<CrateData>(50)) -> ruff_python_ast(Idx::<CrateData>(37)), alternative path: ruff_python_ast(Idx::<CrateData>(37)) -> ruff_python_parser(Idx::<CrateData>(50))\n"
```

## Test Plan

Check the logs of `rust-analyzer` to not see any signs of cyclic
dependency.
2024-05-07 09:24:57 +00:00
Micha Reiser 6a1e555537
Upgrade to Rust 1.78 (#11260) 2024-05-03 12:46:21 +00:00
Charlie Marsh 9a1f6f6762
Avoid allocations for isort module names (#11251)
## Summary

Random refactor I noticed when investigating the F401 changes. We don't
need to allocate in most cases here.
2024-05-02 19:17:56 +00:00