Commit Graph

476 Commits

Author SHA1 Message Date
Charlie Marsh 4a3eeeff86
Remove `HashableExpr` abstraction (#14057)
## Summary

It looks like `ComparableExpr` now implements `Hash` so we can just
remove this.
2024-11-02 20:28:35 +00:00
Charlie Marsh 71536a43db
Add remaining augmented assignment dunders (#13985)
## Summary

See: https://github.com/astral-sh/ruff/issues/12699
2024-10-30 13:02:29 +00:00
Charlie Marsh b6847b371e
Skip namespace package enforcement for PEP 723 scripts (#13974)
## Summary

Vendors the PEP 723 parser from
[uv](debe67ffdb/crates/uv-scripts/src/lib.rs (L283)).

Closes https://github.com/astral-sh/ruff/issues/13912.
2024-10-29 02:11:31 +00:00
Micha Reiser 9f3a38d408
Extract `LineIndex` independent methods from `Locator` (#13938) 2024-10-28 07:53:41 +00:00
TomerBin 35f007f17f
[red-knot] Type narrow in else clause (#13918)
## Summary

Add support for type narrowing in elif and else scopes as part of
#13694.

## Test Plan

- mdtest
- builder unit test for union negation.

---------

Co-authored-by: Carl Meyer <carl@astral.sh>
2024-10-26 16:22:57 +00:00
Micha Reiser 73ee72b665
Join implicit concatenated strings when they fit on a line (#13663) 2024-10-24 11:52:22 +02:00
Micha Reiser e402e27a09
Use referencial equality in `traversal` helper methods (#13895) 2024-10-24 11:30:22 +02:00
Micha Reiser 2f88f84972
Alternate quotes for strings inside f-strings in preview (#13860) 2024-10-23 07:57:53 +02:00
Alex Waygood 72adb09bf3
Simplify iteration idioms (#13834)
Remove unnecessary uses of `.as_ref()`, `.iter()`, `&**` and similar, mostly in situations when iterating over variables. Many of these changes are only possible following #13826, when we bumped our MSRV to 1.80: several useful implementations on `&Box<[T]>` were only stabilised in Rust 1.80. Some of these changes we could have done earlier, however.
2024-10-20 22:25:27 +01:00
Micha Reiser 27c50bebec
Bump MSRV to Rust 1.80 (#13826) 2024-10-20 10:55:36 +02:00
Neil Mitchell 2d2baeca23
[python_ast] Make the iter_mut functions public (#13542) 2024-10-19 20:04:00 +01:00
Carl Meyer f4b5e70fae
[red-knot] binary arithmetic on instances (#13800)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-10-19 15:22:54 +00:00
Micha Reiser 8f5b2aac9a
Refactor: Remove `StringPart` and `AnyStringPart` in favor of `StringLikePart` (#13772) 2024-10-16 12:52:06 +02:00
Alex Waygood 5b4afd30ca
Harmonise methods for distinguishing different Python source types (#13682) 2024-10-09 13:18:52 +00:00
Dylan 14ee5dbfde
[refurb] Count codepoints not bytes for `slice-to-remove-prefix-or-suffix (FURB188)` (#13631) 2024-10-07 16:13:28 +02:00
Sid 31ca1c3064
[`flake8-async`] allow async generators (`ASYNC100`) (#13639)
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

Treat async generators as "await" in ASYNC100.

Fixes #13637

## Test Plan

Updated snapshot
2024-10-07 07:25:54 -05:00
Dylan f27a8b8c7a
[internal] `ComparableExpr` (f)strings and bytes made invariant under concatenation (#13301) 2024-09-25 16:58:57 +02:00
Dhruv Manilawala 62c7d8f6ba
[red-knot] Add control flow support for match statement (#13241)
## Summary

This PR adds support for control flow for match statement.

It also adds the necessary infrastructure required for narrowing
constraints in case blocks and implements the logic for
`PatternMatchSingleton` which is either `None` / `True` / `False`. Even
after this the inferred type doesn't get simplified completely, there's
a TODO for that in the test code.

## Test Plan

Add test cases for control flow for (a) when there's a wildcard pattern
and (b) when there isn't. There's also a test case to verify the
narrowing logic.

---------

Co-authored-by: Carl Meyer <carl@astral.sh>
2024-09-10 02:14:19 +05:30
Dhruv Manilawala 47f0b45be3
Implement `AstNode` for `Identifier` (#13207)
## Summary

Follow-up to #13147, this PR implements the `AstNode` for `Identifier`.
This makes it easier to create the `NodeKey` in red knot because it uses
a generic method to construct the key from `AnyNodeRef` and is important
for definitions that are created only on identifiers instead of
`ExprName`.

## Test Plan

`cargo test` and `cargo clippy`
2024-09-02 16:27:12 +05:30
Teodoro Freund b9c8113a8a
Added bytes type and some inference (#13061)
## Summary

This PR adds the `bytes` type to red-knot:
- Added the `bytes` type
- Added support for bytes literals
- Support for the `+` operator

Improves on #12701 

Big TODO on supporting and normalizing r-prefixed bytestrings
(`rb"hello\n"`)

## Test Plan

Added a test for a bytes literals, concatenation, and corner values
2024-08-22 13:27:15 -07:00
Alex Waygood aa0db338d9
Implement `iter()`, `len()` and `is_empty()` for all display-literal AST nodes (#12807) 2024-08-12 10:39:28 +00:00
Micha Reiser 138e70bd5c
Upgrade to Rust 1.80 (#12586) 2024-07-30 19:18:08 +00:00
Charlie Marsh d930052de8
Move required import parsing out of lint rule (#12536)
## Summary

Instead, make it part of the serialization and deserialization itself.
This makes it _much_ easier to reuse when solving
https://github.com/astral-sh/ruff/issues/12458.
2024-07-26 13:35:45 -04:00
Micha Reiser 0c72577b5d
[red-knot] Add notebook support (#12338) 2024-07-17 08:26:33 +00:00
renovate[bot] 8ad10b9307
Update Rust crate compact_str to 0.8.0 (#12333)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-07-15 06:03:23 +00:00
Charlie Marsh 18c364d5df
[`flake8-bandit`] Support explicit string concatenations in S310 HTTP detection (#12315)
Closes https://github.com/astral-sh/ruff/issues/12314.
2024-07-14 10:44:08 -04:00
Gaétan Lepage d0298dc26d
Explicitly add schemars to ruff_python_ast Cargo.toml (#12275)
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-07-11 06:46:34 +00:00
Micha Reiser 5109b50bb3
Use `CompactString` for `Identifier` (#12101) 2024-07-01 10:06:02 +02:00
Micha Reiser d4dd96d1f4
red-knot: `source_text`, `line_index`, and `parsed_module` queries (#11822) 2024-06-13 07:37:02 +00:00
Micha Reiser 32ca704956
Rename `PreorderVisitor` to `SourceOrderVisitor` (#11798)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-06-07 17:01:58 +00:00
Dhruv Manilawala bf5b62edac
Maintain synchronicity between the lexer and the parser (#11457)
## Summary

This PR updates the entire parser stack in multiple ways:

### Make the lexer lazy

* https://github.com/astral-sh/ruff/pull/11244
* https://github.com/astral-sh/ruff/pull/11473

Previously, Ruff's lexer would act as an iterator. The parser would
collect all the tokens in a vector first and then process the tokens to
create the syntax tree.

The first task in this project is to update the entire parsing flow to
make the lexer lazy. This includes the `Lexer`, `TokenSource`, and
`Parser`. For context, the `TokenSource` is a wrapper around the `Lexer`
to filter out the trivia tokens[^1]. Now, the parser will ask the token
source to get the next token and only then the lexer will continue and
emit the token. This means that the lexer needs to be aware of the
"current" token. When the `next_token` is called, the current token will
be updated with the newly lexed token.

The main motivation to make the lexer lazy is to allow re-lexing a token
in a different context. This is going to be really useful to make the
parser error resilience. For example, currently the emitted tokens
remains the same even if the parser can recover from an unclosed
parenthesis. This is important because the lexer emits a
`NonLogicalNewline` in parenthesized context while a normal `Newline` in
non-parenthesized context. This different kinds of newline is also used
to emit the indentation tokens which is important for the parser as it's
used to determine the start and end of a block.

Additionally, this allows us to implement the following functionalities:
1. Checkpoint - rewind infrastructure: The idea here is to create a
checkpoint and continue lexing. At a later point, this checkpoint can be
used to rewind the lexer back to the provided checkpoint.
2. Remove the `SoftKeywordTransformer` and instead use lookahead or
speculative parsing to determine whether a soft keyword is a keyword or
an identifier
3. Remove the `Tok` enum. The `Tok` enum represents the tokens emitted
by the lexer but it contains owned data which makes it expensive to
clone. The new `TokenKind` enum just represents the type of token which
is very cheap.

This brings up a question as to how will the parser get the owned value
which was stored on `Tok`. This will be solved by introducing a new
`TokenValue` enum which only contains a subset of token kinds which has
the owned value. This is stored on the lexer and is requested by the
parser when it wants to process the data. For example:
8196720f80/crates/ruff_python_parser/src/parser/expression.rs (L1260-L1262)

[^1]: Trivia tokens are `NonLogicalNewline` and `Comment`

### Remove `SoftKeywordTransformer`

* https://github.com/astral-sh/ruff/pull/11441
* https://github.com/astral-sh/ruff/pull/11459
* https://github.com/astral-sh/ruff/pull/11442
* https://github.com/astral-sh/ruff/pull/11443
* https://github.com/astral-sh/ruff/pull/11474

For context,
https://github.com/RustPython/RustPython/pull/4519/files#diff-5de40045e78e794aa5ab0b8aacf531aa477daf826d31ca129467703855408220
added support for soft keywords in the parser which uses infinite
lookahead to classify a soft keyword as a keyword or an identifier. This
is a brilliant idea as it basically wraps the existing Lexer and works
on top of it which means that the logic for lexing and re-lexing a soft
keyword remains separate. The change here is to remove
`SoftKeywordTransformer` and let the parser determine this based on
context, lookahead and speculative parsing.

* **Context:** The transformer needs to know the position of the lexer
between it being at a statement position or a simple statement position.
This is because a `match` token starts a compound statement while a
`type` token starts a simple statement. **The parser already knows
this.**
* **Lookahead:** Now that the parser knows the context it can perform
lookahead of up to two tokens to classify the soft keyword. The logic
for this is mentioned in the PR implementing it for `type` and `match
soft keyword.
* **Speculative parsing:** This is where the checkpoint - rewind
infrastructure helps. For `match` soft keyword, there are certain cases
for which we can't classify based on lookahead. The idea here is to
create a checkpoint and keep parsing. Based on whether the parsing was
successful and what tokens are ahead we can classify the remaining
cases. Refer to #11443 for more details.

If the soft keyword is being parsed in an identifier context, it'll be
converted to an identifier and the emitted token will be updated as
well. Refer
8196720f80/crates/ruff_python_parser/src/parser/expression.rs (L487-L491).

The `case` soft keyword doesn't require any special handling because
it'll be a keyword only in the context of a match statement.

### Update the parser API

* https://github.com/astral-sh/ruff/pull/11494
* https://github.com/astral-sh/ruff/pull/11505

Now that the lexer is in sync with the parser, and the parser helps to
determine whether a soft keyword is a keyword or an identifier, the
lexer cannot be used on its own. The reason being that it's not
sensitive to the context (which is correct). This means that the parser
API needs to be updated to not allow any access to the lexer.

Previously, there were multiple ways to parse the source code:
1. Passing the source code itself
2. Or, passing the tokens

Now that the lexer and parser are working together, the API
corresponding to (2) cannot exists. The final API is mentioned in this
PR description: https://github.com/astral-sh/ruff/pull/11494.

### Refactor the downstream tools (linter and formatter)

* https://github.com/astral-sh/ruff/pull/11511
* https://github.com/astral-sh/ruff/pull/11515
* https://github.com/astral-sh/ruff/pull/11529
* https://github.com/astral-sh/ruff/pull/11562
* https://github.com/astral-sh/ruff/pull/11592

And, the final set of changes involves updating all references of the
lexer and `Tok` enum. This was done in two-parts:
1. Update all the references in a way that doesn't require any changes
from this PR i.e., it can be done independently
	* https://github.com/astral-sh/ruff/pull/11402
	* https://github.com/astral-sh/ruff/pull/11406
	* https://github.com/astral-sh/ruff/pull/11418
	* https://github.com/astral-sh/ruff/pull/11419
	* https://github.com/astral-sh/ruff/pull/11420
	* https://github.com/astral-sh/ruff/pull/11424
2. Update all the remaining references to use the changes made in this
PR

For (2), there were various strategies used:
1. Introduce a new `Tokens` struct which wraps the token vector and add
methods to query a certain subset of tokens. These includes:
	1. `up_to_first_unknown` which replaces the `tokenize` function
2. `in_range` and `after` which replaces the `lex_starts_at` function
where the former returns the tokens within the given range while the
latter returns all the tokens after the given offset
2. Introduce a new `TokenFlags` which is a set of flags to query certain
information from a token. Currently, this information is only limited to
any string type token but can be expanded to include other information
in the future as needed. https://github.com/astral-sh/ruff/pull/11578
3. Move the `CommentRanges` to the parsed output because this
information is common to both the linter and the formatter. This removes
the need for `tokens_and_ranges` function.

## Test Plan

- [x] Update and verify the test snapshots
- [x] Make sure the entire test suite is passing
- [x] Make sure there are no changes in the ecosystem checks
- [x] Run the fuzzer on the parser
- [x] Run this change on dozens of open-source projects

### Running this change on dozens of open-source projects

Refer to the PR description to get the list of open source projects used
for testing.

Now, the following tests were done between `main` and this branch:
1. Compare the output of `--select=E999` (syntax errors)
2. Compare the output of default rule selection
3. Compare the output of `--select=ALL`

**Conclusion: all output were same**

## What's next?

The next step is to introduce re-lexing logic and update the parser to
feed the recovery information to the lexer so that it can emit the
correct token. This moves us one step closer to having error resilience
in the parser and provides Ruff the possibility to lint even if the
source code contains syntax errors.
2024-06-03 18:23:50 +05:30
Tushar Sadhwani e0169d8dea
[`flake8-pyi`] Implement `PYI064` (#11325)
## Summary

Implements `Y064` from `flake8-pyi` and its autofix.

## Test Plan

`cargo test` / `cargo insta review`
2024-05-28 23:57:13 +00:00
Charlie Marsh 16acd4913f
Remove some unused `pub` functions (#11576)
## Summary

I left anything in `red-knot`, any `with_` methods, etc.
2024-05-28 09:56:51 -04:00
Ahmed Ilyas b36c713279
Consider irrefutable pattern similar to `if .. else` for `C901` (#11565)
## Summary

Follow up to https://github.com/astral-sh/ruff/pull/11521

Removes the extra added complexity for catch all match cases. This
matches the implementation of plain `else` statements.

## Test Plan
Added new test cases.

---------

Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
2024-05-27 17:33:36 +00:00
Dhruv Manilawala e28e737296
Update `FStringElements` to deref to a slice (#11570)
Ref: https://github.com/astral-sh/ruff/pull/11400#discussion_r1615600354
2024-05-27 15:52:13 +00:00
Alex Waygood 246a3388ee
Implement a common trait for the string flags (#11564) 2024-05-27 16:02:01 +01:00
Alex Waygood 6963f75a14
Move string-prefix enumerations to a separate submodule (#11425)
## Summary

This moves the string-prefix enumerations in `ruff_python_ast` to a
separate submodule. I think this helps clarify that these prefixes are
purely abstract: they only depend on each other, and do not depend on
any of the other code in `nodes.rs` in any way. Moreover, while various
AST nodes _use_ them, they're not really nodes themselves, so they feel
slightly out of place in `nodes.rs`.

I considered moving all of them to `str.rs`, but it felt like enough
code that it could be a separate submodule.

## Test Plan

`cargo test`
2024-05-15 07:40:27 -04:00
Dhruv Manilawala c3c87e86ef
Implement `IntoIterator` for `FStringElements` (#11410)
A change which I lost somewhere when I force pushed in
https://github.com/astral-sh/ruff/pull/11400
2024-05-13 16:24:49 +00:00
Dhruv Manilawala ca99e9e2f0
Move `W605` to the AST checker (#11402)
## Summary

This PR moves the `W605` rule to the AST checker.

This is part of #11401

## Test Plan

`cargo test`
2024-05-13 16:13:06 +00:00
Dhruv Manilawala 4b41e4de7f
Create a newtype wrapper around `Vec<FStringElement>` (#11400)
## Summary

This PR adds a newtype wrapper around `Vec<FStringElement>` that derefs
to a `&Vec<FStringElement>`.

Both f-string and format specifier are made up of `Vec<FStringElement>`.
By creating a newtype wrapper around it, we can share the methods for
both parent types.
2024-05-13 16:04:04 +00:00
Dhruv Manilawala 0dc130e841
Add `Iterator` impl for `StringLike` parts (#11399)
## Summary

This PR adds support to iterate over each part of a string-like
expression.

This similar to the one in the formatter:


128414cd95/crates/ruff_python_formatter/src/string/any.rs (L121-L125)

Although I don't think it's a 1-1 replacement in the formatter because
the one implemented in the formatter has another information for certain
variants (as can be seen for `FString`).

The main motivation for this is to avoid duplication for rules which
work only on the parts of the string and doesn't require any information
from the parent node. Here, the parent node being the expression node
which could be an implicitly concatenated string.

This PR also updates certain rule implementation to make use of this and
avoids logic duplication.
2024-05-13 15:52:03 +00:00
Charlie Marsh af60d539ab
Move sub-crates to workspace dependencies (#11407)
## Summary

This matches the setup we use in `uv` and allows for consistency in the
`Cargo.toml` files.
2024-05-13 14:37:50 +00:00
Dhruv Manilawala 6ecb4776de
Rename `AnyStringKind` -> `AnyStringFlags` (#11405)
## Summary

This PR renames `AnyStringKind` to `AnyStringFlags` and `AnyStringFlags`
to `AnyStringFlagsInner`.

The main motivation is to have consistent usage of "kind" and "flags".
For each string kind, it's "flags" like `StringLiteralFlags`,
`BytesLiteralFlags`, and `FStringFlags` but it was `AnyStringKind` for
the "any" variant.
2024-05-13 13:18:07 +00:00
Charlie Marsh 6cec82fff8
Get `cargo shear` passing (#11392)
## Summary

Remove some unused dependencies, add a few ignores.
2024-05-13 01:56:24 +00:00
Charlie Marsh 35ba3c91ce
Use `u64` instead of `i64` in Int type (#11356)
## Summary

I believe the value here is always unsigned, since we represent `-42` as
a unary operator on `42`.
2024-05-10 13:35:15 +00:00
Micha Reiser 22639c5a2a
Move all module from the AST to the semantic crate (#11330) 2024-05-08 08:56:50 +00:00
Alex Waygood 6774f27f4b
Refactor the `ExprDict` node (#11267)
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-05-07 11:46:10 +00:00
Dhruv Manilawala 28cc71fb6b
Remove cyclic dev dependency with the parser crate (#11261)
## Summary

This PR removes the cyclic dev dependency some of the crates had with
the parser crate.

The cyclic dependencies are:
* `ruff_python_ast` has a **dev dependency** on `ruff_python_parser` and
`ruff_python_parser` directly depends on `ruff_python_ast`
* `ruff_python_trivia` has a **dev dependency** on `ruff_python_parser`
and `ruff_python_parser` has an indirect dependency on
`ruff_python_trivia` (`ruff_python_parser` - `ruff_python_ast` -
`ruff_python_trivia`)

Specifically, this PR does the following:
* Introduce two new crates
* `ruff_python_ast_integration_tests` and move the tests from the
`ruff_python_ast` crate which uses the parser in this crate
* `ruff_python_trivia_integration_tests` and move the tests from the
`ruff_python_trivia` crate which uses the parser in this crate

### Motivation

The main motivation for this PR is to help development. Before this PR,
`rust-analyzer` wouldn't provide any intellisense in the
`ruff_python_parser` crate regarding the symbols in `ruff_python_ast`
crate.

```
[ERROR][2024-05-03 13:47:06] .../vim/lsp/rpc.lua:770	"rpc"	"/Users/dhruv/.cargo/bin/rust-analyzer"	"stderr"	"[ERROR project_model::workspace] cyclic deps: ruff_python_parser(Idx::<CrateData>(50)) -> ruff_python_ast(Idx::<CrateData>(37)), alternative path: ruff_python_ast(Idx::<CrateData>(37)) -> ruff_python_parser(Idx::<CrateData>(50))\n"
```

## Test Plan

Check the logs of `rust-analyzer` to not see any signs of cyclic
dependency.
2024-05-07 09:24:57 +00:00
Micha Reiser 6a1e555537
Upgrade to Rust 1.78 (#11260) 2024-05-03 12:46:21 +00:00
Charlie Marsh 9a1f6f6762
Avoid allocations for isort module names (#11251)
## Summary

Random refactor I noticed when investigating the F401 changes. We don't
need to allocate in most cases here.
2024-05-02 19:17:56 +00:00
Micha Reiser 64700d296f
Remove ImportMap (#11234)
## Summary

This PR removes the `ImportMap` implementation and all its routing
through ruff.

The import map was added in https://github.com/astral-sh/ruff/pull/3243
but we then never ended up using it to do cross file analysis.

We are now working on adding multifile analysis to ruff, and revisit
import resolution as part of it.


```
hyperfine --warmup 10 --runs 20 --setup "./target/release/ruff clean" \
              "./target/release/ruff check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I" \
              "./target/release/ruff-import check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I" 
Benchmark 1: ./target/release/ruff check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I
  Time (mean ± σ):      37.6 ms ±   0.9 ms    [User: 52.2 ms, System: 63.7 ms]
  Range (min … max):    35.8 ms …  39.8 ms    20 runs
 
Benchmark 2: ./target/release/ruff-import check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I
  Time (mean ± σ):      36.0 ms ±   0.7 ms    [User: 50.3 ms, System: 58.4 ms]
  Range (min … max):    34.5 ms …  37.6 ms    20 runs
 
Summary
  ./target/release/ruff-import check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I ran
    1.04 ± 0.03 times faster than ./target/release/ruff check crates/ruff_linter/resources/test/cpython -e -s --extend-select=I
```

I suspect that the performance improvement should even be more
significant for users that otherwise don't have any diagnostics.


```
hyperfine --warmup 10 --runs 20 --setup "cd ../ecosystem/airflow && ../../ruff/target/release/ruff clean" \
              "./target/release/ruff check ../ecosystem/airflow -e -s --extend-select=I" \
              "./target/release/ruff-import check ../ecosystem/airflow -e -s --extend-select=I" 
Benchmark 1: ./target/release/ruff check ../ecosystem/airflow -e -s --extend-select=I
  Time (mean ± σ):      53.7 ms ±   1.8 ms    [User: 68.4 ms, System: 63.0 ms]
  Range (min … max):    51.1 ms …  58.7 ms    20 runs
 
Benchmark 2: ./target/release/ruff-import check ../ecosystem/airflow -e -s --extend-select=I
  Time (mean ± σ):      50.8 ms ±   1.4 ms    [User: 50.7 ms, System: 60.9 ms]
  Range (min … max):    48.5 ms …  55.3 ms    20 runs
 
Summary
  ./target/release/ruff-import check ../ecosystem/airflow -e -s --extend-select=I ran
    1.06 ± 0.05 times faster than ./target/release/ruff check ../ecosystem/airflow -e -s --extend-select=I

```

## Test Plan

`cargo test`
2024-05-02 11:26:02 -07:00
Dhruv Manilawala 04a922866a
Add basic docs for the parser crate (#11199)
## Summary

This PR adds a basic README for the `ruff_python_parser` crate and
updates the CONTRIBUTING docs with the fuzzer and benchmark section.

Additionally, it also updates some inline documentation within the
parser crate and splits the `parse_program` function into
`parse_single_expression` and `parse_module` which will be called by
matching against the `Mode`.

This PR doesn't go into too much internal detail around the parser logic
due to the following reasons:
1. Where should the docs go? Should it be as a module docs in `lib.rs`
or in README?
2. The parser is still evolving and could include a lot of refactors
with the future work (feedback loop and improved error recovery and
resilience)

---------

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-04-29 17:08:07 +00:00
Alex Waygood 87929ad5f1
Add convenience methods for iterating over all parameter nodes in a function (#11174) 2024-04-29 10:36:15 +00:00
Micha Reiser 7cd065e4a2
Kick off Red-knot (#10849)
Co-authored-by: Carl Meyer <carl@oddbird.net>
Co-authored-by: Carl Meyer <carl@astral.sh>
2024-04-27 08:34:00 +00:00
Carl Meyer 845ba7cf5f
Make ImportFrom level just a u32 (#11170) 2024-04-26 20:38:35 -06:00
Jelle Zijlstra cd3e319538
Add support for PEP 696 syntax (#11120) 2024-04-26 09:47:29 +02:00
Alex Waygood 269014a539
Delete unused methods from `Parameters` (#11150) 2024-04-25 22:11:24 +01:00
Carl Meyer c80b9a4a90
Reduce size of Stmt from 144 to 120 bytes (#11051)
## Summary

I happened to notice that we box `TypeParams` on `StmtClassDef` but not
on `StmtFunctionDef` and wondered why, since `StmtFunctionDef` is bigger
and sets the size of `Stmt`.

@charliermarsh found that at the time we started boxing type params on
classes, classes were the largest statement type (see #6275), but that's
no longer true.

So boxing type-params also on functions reduces the overall size of
`Stmt`.

## Test Plan

The `<=` size tests are a bit irritating (since their failure doesn't
tell you the actual size), but I manually confirmed that the size is
actually 120 now.
2024-04-19 17:02:17 -06:00
Dhruv Manilawala 13ffb5bc19
Replace LALRPOP parser with hand-written parser (#10036)
(Supersedes #9152, authored by @LaBatata101)

## Summary

This PR replaces the current parser generated from LALRPOP to a
hand-written recursive descent parser.

It also updates the grammar for [PEP
646](https://peps.python.org/pep-0646/) so that the parser outputs the
correct AST. For example, in `data[*x]`, the index expression is now a
tuple with a single starred expression instead of just a starred
expression.

Beyond the performance improvements, the parser is also error resilient
and can provide better error messages. The behavior as seen by any
downstream tools isn't changed. That is, the linter and formatter can
still assume that the parser will _stop_ at the first syntax error. This
will be updated in the following months.

For more details about the change here, refer to the PR corresponding to
the individual commits and the release blog post.

## Test Plan

Write _lots_ and _lots_ of tests for both valid and invalid syntax and
verify the output.

## Acknowledgements

- @MichaReiser for reviewing 100+ parser PRs and continuously providing
guidance throughout the project
- @LaBatata101 for initiating the transition to a hand-written parser in
#9152
- @addisoncrump for implementing the fuzzer which helped
[catch](https://github.com/astral-sh/ruff/pull/10903)
[a](https://github.com/astral-sh/ruff/pull/10910)
[lot](https://github.com/astral-sh/ruff/pull/10966)
[of](https://github.com/astral-sh/ruff/pull/10896)
[bugs](https://github.com/astral-sh/ruff/pull/10877)

---------

Co-authored-by: Victor Hugo Gomes <labatata101@linuxmail.org>
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-04-18 17:57:39 +05:30
Alex Waygood f779babc5f
Improve handling of builtin symbols in linter rules (#10919)
Add a new method to the semantic model to simplify and improve the correctness of a common pattern
2024-04-16 11:37:31 +01:00
renovate[bot] 388658efdb
Update pre-commit dependencies (#10698)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Zanie Blue <contact@zanie.dev>
Co-authored-by: Alex Waygood <alex.waygood@gmail.com>
2024-04-06 23:00:41 +00:00
Charlie Marsh 7fb5f47efe
Respect `# noqa` directives on `__all__` openers (#10798)
## Summary

Historically, given:

```python
__all__ = [  # noqa: F822
    "Bernoulli",
    "Beta",
    "Binomial",
]
```

The F822 violations would be attached to the `__all__`, so this `# noqa`
would be enforced for _all_ definitions in the list. This changed in
https://github.com/astral-sh/ruff/pull/10525 for the better, in that we
now use the range of each string. But these `# noqa` directives stopped
working.

This PR sets the `__all__` as a parent range in the diagnostic, so that
these directives are respected once again.

Closes https://github.com/astral-sh/ruff/issues/10795.

## Test Plan

`cargo test`
2024-04-06 14:51:17 +00:00
Dhruv Manilawala 99dd3a8ab0
Implement `as_str` & `Display` for all operator enums (#10691)
## Summary

This PR adds the `as_str` implementation for all the operator methods.
It already exists for `CmpOp` which is being [used in the
linter](ffcd77860c/crates/ruff_linter/src/rules/flake8_simplify/rules/key_in_dict.rs (L117))
and it makes sense to implement it for the rest as well. This will also
be utilized in error messages for the new parser.
2024-04-02 10:34:36 +00:00
Dhruv Manilawala eee2d5b915
Remove unused operator methods and impl (#10690)
## Summary

This PR removes unused operator methods and impl traits. There is
already the `is_macro::Is` implementation for all the operators and this
seems unnecessary.
2024-04-02 15:53:20 +05:30
Alex Waygood a06ffeb54e
Track ranges of names inside `__all__` definitions (#10525) 2024-03-22 18:38:40 +00:00
Charlie Marsh 60fd98eb2f
Update Rust to v1.77 (#10510) 2024-03-21 12:10:33 -04:00
Alex Waygood 7caf0d064a
Simplify formatting of strings by using flags from the AST nodes (#10489) 2024-03-20 16:16:54 +00:00
Alex Waygood 162d2eb723
Track casing of r-string prefixes in the tokenizer and AST (#10314)
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-03-18 17:18:04 +00:00
Dhruv Manilawala 5f40371ffc
Use `ExprFString` for `StringLike::FString` variant (#10311)
## Summary

This PR updates the `StringLike::FString` variant to use `ExprFString`
instead of `FStringLiteralElement`.

For context, the reason it used `FStringLiteralElement` is that the node
is actually the string part of an f-string ("foo" in `f"foo{x}"`). But,
this is inconsistent with other variants where the captured value is the
_entire_ string.

This is also problematic w.r.t. implicitly concatenated strings. Any
rules which work with `StringLike::FString` doesn't account for the
string part in an implicitly concatenated f-strings. For example, we
don't flag confusable character in the first part of `"𝐁ad" f"𝐁ad
string"`, but only the second part
(https://play.ruff.rs/16071c4c-a1dd-4920-b56f-e2ce2f69c843).

### Update `PYI053`

_This is included in this PR because otherwise it requires a temporary
workaround to be compatible with the old logic._

This PR also updates the `PYI053` (`string-or-bytes-too-long`) rule for
f-string to consider _all_ the visible characters in a f-string,
including the ones which are implicitly concatenated. This is consistent
with implicitly concatenated strings and bytes.

For example,

```python
def foo(
	# We count all the characters here
    arg1: str = '51 character ' 'stringgggggggggggggggggggggggggggggggg',
	# But not here because of the `{x}` replacement field which _breaks_ them up into two chunks
    arg2: str = f'51 character {x} stringgggggggggggggggggggggggggggggggggggggggggggg',
) -> None: ...
```

This PR fixes it to consider all _visible_ characters inside an f-string
which includes expressions as well.

fixes: #10310 
fixes: #10307 

## Test Plan

Add new test cases and update the snapshots.

## Review

To facilitate the review process, the change have been split into two
commits: one which has the code change while the other has the test
cases and updated snapshots.
2024-03-14 13:30:22 +05:30
Alex Waygood c2e15f38ee
Unify enums used for internal representation of quoting style (#10383) 2024-03-13 17:19:17 +00:00
Dhruv Manilawala 32d6f84e3d
Add methods to iter over f-string elements (#10309)
## Summary

This PR adds methods on `FString` to iterate over the two different kind
of elements it can have - literals and expressions. This is similar to
the methods we have on `ExprFString`.

---------

Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-03-13 08:46:55 +00:00
Charlie Marsh dbf82233b8
Gate f-string struct size test for Rustc < 1.76 (#10371)
Closes https://github.com/astral-sh/ruff/issues/10319.
2024-03-12 15:46:36 -04:00
Alex Waygood 1d97f27335
Start tracking quoting style in the AST (#10298)
This PR modifies our AST so that nodes for string literals, bytes literals and f-strings all retain the following information:
- The quoting style used (double or single quotes)
- Whether the string is triple-quoted or not
- Whether the string is raw or not

This PR is a followup to #10256. Like with that PR, this PR does not, in itself, fix any bugs. However, it means that we will have the necessary information to preserve quoting style and rawness of strings in the `ExprGenerator` in a followup PR, which will allow us to provide a fix for https://github.com/astral-sh/ruff/issues/7799.

The information is recorded on the AST nodes using a bitflag field on each node, similarly to how we recorded the information on `Tok::String`, `Tok::FStringStart` and `Tok::FStringMiddle` tokens in #10298. Rather than reusing the bitflag I used for the tokens, however, I decided to create a custom bitflag for each AST node.

Using different bitflags for each node allows us to make invalid states unrepresentable: it is valid to set a `u` prefix on a string literal, but not on a bytes literal or an f-string. It also allows us to have better debug representations for each AST node modified in this PR.
2024-03-08 19:11:47 +00:00
Micha Reiser 8ea5b08700
refactor: Use `QualifiedName` for `Imported::call_path` (#10214)
## Summary

When you try to remove an internal representation leaking into another
type and end up rewriting a simple version of `smallvec`.

The goal of this PR is to replace the `Box<[&'a str]>` with
`Box<QualifiedName>` to avoid that the internal `QualifiedName`
representation leaks (and it gives us a nicer API too). However, doing
this when `QualifiedName` uses `SmallVec` internally gives us all sort
of funny lifetime errors. I was lost but @BurntSushi came to rescue me.
He figured out that `smallvec` has a variance problem which is already
tracked in https://github.com/servo/rust-smallvec/issues/146

To fix the variants problem, I could use the smallvec-2-alpha-4 or
implement our own smallvec. I went with implementing our own small vec
for this specific problem. It obviously isn't as sophisticated as
smallvec (only uses safe code), e.g. it doesn't perform any size
optimizations, but it does its job.

Other changes:

* Removed `Imported::qualified_name` (the version that returns a
`String`). This can be replaced by calling `ToString` on the qualified
name.
* Renamed `Imported::call_path` to `qualified_name` and changed its
return type to `&QualifiedName`.
* Renamed `QualifiedName::imported` to `user_defined` which is the more
common term when talking about builtins vs the rest/user defined
functions.


## Test plan

`cargo test`
2024-03-06 09:55:59 +01:00
Micha Reiser 184241f99a
Remove `Expr` postfix from `ExprNamed`, `ExprIf`, and `ExprGenerator` (#10229)
The expression types in our AST are called `ExprYield`, `ExprAwait`,
`ExprStringLiteral` etc, except `ExprNamedExpr`, `ExprIfExpr` and
`ExprGenratorExpr`. This seems to align with [Python AST's
naming](https://docs.python.org/3/library/ast.html) but feels
inconsistent and excessive.

This PR removes the `Expr` postfix from `ExprNamedExpr`, `ExprIfExpr`,
and `ExprGeneratorExpr`.
2024-03-04 12:55:01 +01:00
Micha Reiser a6d892b1f4
Split `CallPath` into `QualifiedName` and `UnqualifiedName` (#10210)
## Summary

Charlie can probably explain this better than I but it turns out,
`CallPath` is used for two different things:

* To represent unqualified names like `version` where `version` can be a
local variable or imported (e.g. `from sys import version` where the
full qualified name is `sys.version`)
* To represent resolved, full qualified names

This PR splits `CallPath` into two types to make this destinction clear.

> Note: I haven't renamed all `call_path` variables to `qualified_name`
or `unqualified_name`. I can do that if that's welcomed but I first want
to get feedback on the approach and naming overall.

## Test Plan

`cargo test`
2024-03-04 09:06:51 +00:00
Micha Reiser db25a563f7
Remove unneeded lifetime bounds (#10213)
## Summary

This PR removes the unneeded lifetime `'b` from many of our `Visitor`
implementations.

The lifetime is unneeded because it is only constraint by `'a`, so we
can use `'a` directly.

## Test Plan

`cargo build`
2024-03-03 18:12:11 +00:00
Micha Reiser e725b6fdaf
CallPath newtype wrapper (#10201)
## Summary

This PR changes the `CallPath` type alias to a newtype wrapper. 

A newtype wrapper allows us to limit the API and to experiment with
alternative ways to implement matching on `CallPath`s.



## Test Plan

`cargo test`
2024-03-03 16:54:24 +01:00
Jane Lewis 0293908b71
Implement RUF028 to detect useless formatter suppression comments (#9899)
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
Fixes #6611

## Summary

This lint rule spots comments that are _intended_ to suppress or enable
the formatter, but will be ignored by the Ruff formatter.

We borrow some functions the formatter uses for determining comment
placement / putting them in context within an AST.

The analysis function uses an AST visitor to visit each comment and
attach it to the AST. It then uses that context to check:
1. Is this comment in an expression?
2. Does this comment have bad placement? (e.g. a `# fmt: skip` above a
function instead of at the end of a line)
3. Is this comment redundant?
4. Does this comment actually suppress any code?
5. Does this comment have ambiguous placement? (e.g. a `# fmt: off`
above an `else:` block)

If any of these are true, a violation is thrown. The reported reason
depends on the order of the above check-list: in other words, a `# fmt:
skip` comment on its own line within a list expression will be reported
as being in an expression, since that reason takes priority.

The lint suggests removing the comment as an unsafe fix, regardless of
the reason.

## Test Plan

A snapshot test has been created.
2024-02-28 19:21:06 +00:00
Micha Reiser 77c5561646
Add `parenthesized` flag to `ExprTuple` and `ExprGenerator` (#9614) 2024-02-26 15:35:20 +00:00
Charlie Marsh 0304623878
[`perflint`] Catch a wider range of mutations in `PERF101` (#9955)
## Summary

This PR ensures that if a list `x` is modified within a `for` loop, we
avoid flagging `list(x)` as unnecessary. Previously, we only detected
calls to exactly `.append`, and they couldn't be nested within other
statements.

Closes https://github.com/astral-sh/ruff/issues/9925.
2024-02-12 12:17:55 -05:00
Charlie Marsh 6f0e4ad332
Remove unnecessary string cloning from the parser (#9884)
Closes https://github.com/astral-sh/ruff/issues/9869.
2024-02-09 16:03:27 -05:00
Charlie Marsh 49fe1b85f2
Reduce size of `Expr` from 80 to 64 bytes (#9900)
## Summary

This PR reduces the size of `Expr` from 80 to 64 bytes, by reducing the
sizes of...

- `ExprCall` from 72 to 56 bytes, by using boxed slices for `Arguments`.
- `ExprCompare` from 64 to 48 bytes, by using boxed slices for its
various vectors.

In testing, the parser gets a bit faster, and the linter benchmarks
improve quite a bit.
2024-02-09 02:53:13 +00:00
Micha Reiser fe7d965334
Reduce `Result<Tok, LexicalError>` size by using `Box<str>` instead of `String` (#9885) 2024-02-08 20:36:22 +00:00
Micha Reiser 688177ff6a
Use Rust 1.76 (#9897) 2024-02-08 18:20:08 +00:00
Charlie Marsh daae28efc7
Respect `async with` in `timeout-without-await` (#9859)
Closes https://github.com/astral-sh/ruff/issues/9855.
2024-02-06 12:04:24 -05:00
Dhruv Manilawala 36b752876e
Implement `AnyNode`/`AnyNodeRef` for `FStringFormatSpec` (#9836)
## Summary

This PR adds the `AnyNode` and `AnyNodeRef` implementation for
`FStringFormatSpec` node which will be required in the f-string
formatting.

The main usage for this is so that we can pass in the node directly to
`suppressed_node` in case debug expression is used to format is as
verbatim text.
2024-02-05 19:23:43 +00:00
Charlie Marsh ea1c089652
Use `AhoCorasick` to speed up quote match (#9773)
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

When I was looking at the v0.2.0 release, this method showed up in a
CodSpeed regression (we were calling it more), so I decided to quickly
look at speeding it up. @BurntSushi suggested using Aho-Corasick, and it
looks like it's about 7 or 8x faster:

```text
Parser/AhoCorasick      time:   [8.5646 ns 8.5914 ns 8.6191 ns]
Parser/Iterator         time:   [64.992 ns 65.124 ns 65.271 ns]
```

## Test Plan

`cargo test`
2024-02-02 09:57:39 -05:00
Micha Reiser ce14f4dea5
Range formatting API (#9635) 2024-01-31 11:13:37 +01:00
Micha Reiser 5fe0fdd0a8
Delete `is_node_with_body` method (#9643) 2024-01-25 14:41:13 +00:00
Alex Waygood a1e65a92bd
Move `is_tuple_parenthesized` from the formatter to `ruff_python_ast` (#9533)
This allows it to be used in the linter as well as the formatter. It
will be useful in #9474
2024-01-15 16:10:40 +00:00
Chammika Mannakkara 0003c730e0
[`flake8-simplify`] Implement `enumerate-for-loop` (`SIM113`) (#7777)
Implements SIM113 from #998

Added tests
Limitations 
   - No fix yet
   - Only flag cases where index variable immediately precede `for` loop

@charliermarsh please review and let me know any improvements

---------

Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2024-01-14 11:00:59 -05:00
Charlie Marsh 009430e034
[`ruff`] Avoid treating named expressions as static keys (`RUF011`) (#9494)
Closes https://github.com/astral-sh/ruff/issues/9487.
2024-01-12 14:33:45 -05:00
Charlie Marsh a31a314b2b
Account for possibly-empty f-string values in truthiness logic (#9484)
Closes https://github.com/astral-sh/ruff/issues/9479.
2024-01-11 21:16:19 -05:00
Micha Reiser f192c72596
Remove type parameter from `parse_*` methods (#9466) 2024-01-11 19:41:19 +01:00
Micha Reiser 94968fedd5
Use Rust 1.75 toolchain (#9437) 2024-01-08 18:03:16 +01:00
Charlie Marsh 701697c37e
Support variable keys in static dictionary key rule (#9411)
Closes https://github.com/astral-sh/ruff/issues/9410.
2024-01-06 20:44:40 +00:00
Charlie Marsh 0e202718fd
Misc. small tweaks from perusing modules (#9383) 2024-01-03 12:30:25 -05:00
Charlie Marsh 9073220887
Make all dependencies workspace dependencies (#9333)
## Summary

This PR modifies our `Cargo.toml` files to use workspace dependencies
for _all_ dependencies, rather than the status quo of sporadically
trying to use workspace dependencies for those dependencies that are
used across multiple crates. I find the current situation more confusing
and harder to manage, since we have a mix of workspace and crate-local
dependencies, whereas this setup consistently uses the same approach for
all dependencies.
2024-01-02 13:41:59 +00:00
Charlie Marsh 1c9268d2ff
Remove some unused dependencies (#9330) 2023-12-31 07:38:16 -05:00
Charlie Marsh e80260a3c5
Remove source path from parser errors (#9322)
## Summary

I always found it odd that we had to pass this in, since it's really
higher-level context for the error. The awkwardness is further evidenced
by the fact that we pass in fake values everywhere (even outside of
tests). The source path isn't actually used to display the error; it's
only accessed elsewhere to _re-display_ the error in certain cases. This
PR modifies to instead pass the path directly in those cases.
2023-12-30 20:33:05 +00:00
Charlie Marsh eb9a1bc5f1
Use consistent re-export from `ruff_source_file` (#9320)
Right now, we both re-export (via `pub use`) and mark the modules
themselves a `pub`, so they can be imported through two different paths.
2023-12-30 14:48:45 -05:00
Charlie Marsh 2895e7d126
Respect mixed `return` and `raise` cases in return-type analysis (#9310)
## Summary

Given:

```python
from somewhere import get_cfg

def lookup_cfg(cfg_description):
    cfg = get_cfg(cfg_description)
    if cfg is not None:
        return cfg
    raise AttributeError(f"No cfg found matching {cfg_description}")
```

We were analyzing the method from last-to-first statement. So we saw the
`raise`, then assumed the method _always_ raised. In reality, though, it
_might_ return. This PR improves the branch analysis to respect these
mixed cases.

Closes https://github.com/astral-sh/ruff/issues/9269.
Closes https://github.com/astral-sh/ruff/issues/9304.
2023-12-29 16:46:37 +00:00
Charlie Marsh a9ceef5b5d
[`ruff`] Add `never-union` rule to detect redundant `typing.NoReturn` and `typing.Never` (#9217)
## Summary

Adds a rule to detect unions that include `typing.NoReturn` or
`typing.Never`. In such cases, the use of the bottom type is redundant.

Closes https://github.com/astral-sh/ruff/issues/9113.

## Test Plan

`cargo test`
2023-12-21 20:53:31 +00:00
Charlie Marsh 5ccc21aea2
Add support for `NoReturn` in auto-return-typing (#9206)
## Summary

Given a function like:

```python
def func(x: int):
    if not x:
        raise ValueError
    else:
        raise TypeError
```

We now correctly use `NoReturn` as the return type, rather than `None`.

Closes https://github.com/astral-sh/ruff/issues/9201.
2023-12-20 00:06:31 -05:00
Dhruv Manilawala 18452cf477
Add `as_slice` method for all string nodes (#9111)
This PR adds a `as_slice` method to all the string nodes which returns
all the parts of the nodes as a slice. This will be useful in the next
PR to split the string formatting to use this method to extract the
_single node_ or _implicitly concanated nodes_.
2023-12-13 06:31:20 +00:00
Dhruv Manilawala 96ae9fe685
Introduce `StringLike` enum (#9016)
## Summary

This PR introduces a new `StringLike` enum which is a narrow type to
indicate string-like nodes. These includes the string literals, bytes
literals, and the literal parts of f-strings.

The main motivation behind this is to avoid repetition of rule calling
in the AST checker. We add a new `analyze::string_like` function which
takes in the enum and calls all the respective rule functions which
expects atleast 2 of the variants of this enum.

I'm open to discarding this if others think it's not that useful at this
stage as currently only 3 rules require these nodes.

As suggested
[here](https://github.com/astral-sh/ruff/pull/8835#discussion_r1414746934)
and
[here](https://github.com/astral-sh/ruff/pull/8835#discussion_r1414750204).

## Test Plan

`cargo test`
2023-12-07 16:39:13 +00:00
Dhruv Manilawala cdac90ef68
New AST nodes for f-string elements (#8835)
Rebase of #6365 authored by @davidszotten.

## Summary

This PR updates the AST structure for an f-string elements.

The main **motivation** behind this change is to have a dedicated node
for the string part of an f-string. Previously, the existing
`ExprStringLiteral` node was used for this purpose which isn't exactly
correct. The `ExprStringLiteral` node should include the quotes as well
in the range but the f-string literal element doesn't include the quote
as it's a specific part within an f-string. For example,

```python
f"foo {x}"
# ^^^^
# This is the literal part of an f-string
```

The introduction of `FStringElement` enum is helpful which represent
either the literal part or the expression part of an f-string.

### Rule Updates

This means that there'll be two nodes representing a string depending on
the context. One for a normal string literal while the other is a string
literal within an f-string. The AST checker is updated to accommodate
this change. The rules which work on string literal are updated to check
on the literal part of f-string as well.

#### Notes

1. The `Expr::is_literal_expr` method would check for
`ExprStringLiteral` and return true if so. But now that we don't
represent the literal part of an f-string using that node, this improves
the method's behavior and confines to the actual expression. We do have
the `FStringElement::is_literal` method.
2. We avoid checking if we're in a f-string context before adding to
`string_type_definitions` because the f-string literal is now a
dedicated node and not part of `Expr`.
3. Annotations cannot use f-string so we avoid changing any rules which
work on annotation and checks for `ExprStringLiteral`.

## Test Plan

- All references of `Expr::StringLiteral` were checked to see if any of
the rules require updating to account for the f-string literal element
node.
- New test cases are added for rules which check against the literal
part of an f-string.
- Check the ecosystem results and ensure it remains unchanged.

## Performance

There's a performance penalty in the parser. The reason for this remains
unknown as it seems that the generated assembly code is now different
for the `__reduce154` function. The reduce function body is just popping
the `ParenthesizedExpr` on top of the stack and pushing it with the new
location.

- The size of `FStringElement` enum is the same as `Expr` which is what
it replaces in `FString::format_spec`
- The size of `FStringExpressionElement` is the same as
`ExprFormattedValue` which is what it replaces

I tried reducing the `Expr` enum from 80 bytes to 72 bytes but it hardly
resulted in any performance gain. The difference can be seen here:
- Original profile: https://share.firefox.dev/3Taa7ES
- Profile after boxing some node fields:
https://share.firefox.dev/3GsNXpD

### Backtracking

I tried backtracking the changes to see if any of the isolated change
produced this regression. The problem here is that the overall change is
so small that there's only a single checkpoint where I can backtrack and
that checkpoint results in the same regression. This checkpoint is to
revert using `Expr` to the `FString::format_spec` field. After this
point, the change would revert back to the original implementation.

## Review process

The review process is similar to #7927. The first set of commits update
the node structure, parser, and related AST files. Then, further commits
update the linter and formatter part to account for the AST change.

---------

Co-authored-by: David Szotten <davidszotten@gmail.com>
2023-12-07 10:28:05 -06:00
Dhruv Manilawala ef7778d794
Fix preorder visitor tests (#9025)
Follow-up PR to #9009 to fix the `PreorderVisitor` test cases as
suggested here: https://github.com/astral-sh/ruff/pull/9009#discussion_r1416459688
2023-12-06 16:58:51 +00:00
Dhruv Manilawala bd443ebe91
Add visitor tests for strings, bytes, f-strings (#9009)
This PR adds tests for visitor implementation for string literals, bytes
literals and f-strings.
2023-12-06 10:52:19 -06:00
Micha Reiser 7e390d3772
Move `ParenthesizedExpr` to `ruff_python_parser` (#8987) 2023-12-04 05:36:28 +00:00
Charlie Marsh e5db72459e
Detect implicit returns in auto-return-types (#8952)
## Summary

Adds detection for branches without a `return` or `raise`, so that we
can properly `Optional` the return types. I'd like to remove this and
replace it with our code graph analysis from the `unreachable.rs` rule,
but it at least fixes the worst offenders.

Closes #8942.
2023-12-01 12:35:01 -05:00
Charlie Marsh 6435e4e4aa
Enable auto-return-type involving `Optional` and `Union` annotations (#8885)
## Summary

Previously, this was only supported for Python 3.10 and later, since we
always use the PEP 604-style unions.
2023-11-28 18:35:55 -08:00
Dhruv Manilawala ec7456bac0
Rename `as_str` to `to_str` (#8886)
This PR renames the method on `StringLiteralValue` from `as_str` to
`to_str`. The main motivation is to follow the naming convention as
described in the [Rust API
Guidelines](https://rust-lang.github.io/api-guidelines/naming.html#ad-hoc-conversions-follow-as_-to_-into_-conventions-c-conv).
This method can perform a string allocation in case the string is
implicitly concatenated.
2023-11-28 18:50:42 -06:00
Dhruv Manilawala b28556d739
Update `E402` to work at cell level for notebooks (#8872)
## Summary

This PR updates the `E402` rule to work at cell level for Jupyter
notebooks. This is enabled only in preview to gather feedback.

The implementation basically resets the import boundary flag on the
semantic model when we encounter the first statement in a cell.

Another potential solution is to introduce `E403` rule that is
specifically for notebooks that works at cell level while `E402` will be
disabled for notebooks.

## Test Plan

Add a notebook with imports in multiple cells and verify that the rule
works as expected.

resolves: #8669
2023-11-29 00:32:35 +00:00
Dhruv Manilawala 501cca8b72
Remove `#[allow(unused_variables)]` from visitor methods (#8828)
Small follow-up to remove `#[allow(unused_variables)]` from visitor
methods and use underscore prefix for unused variables instead.
2023-11-25 00:09:46 +00:00
Dhruv Manilawala 626b0577cd
Explicit `as_str` (no deref), add no allocation methods (#8826)
## Summary

This PR is a follow-up to the AST refactor which does the following:
- Remove `Deref` implementation on `StringLiteralValue` and use explicit
`as_str` calls instead. The `Deref` implementation would implicitly
perform allocations in case of implicitly concatenated strings. This is
to make sure the allocation is explicit.
- Now, certain methods can be implemented to do zero allocations which
have been implemented in this PR. They are:
    - `is_empty`
    - `len`
    - `chars`
    - Custom `PartialEq` implementation to compare each character

## Test Plan

Run the linter test suite and make sure all tests pass.
2023-11-25 00:03:59 +00:00
Dhruv Manilawala 017e829115
Update string nodes for implicit concatenation (#7927)
## Summary

This PR updates the string nodes (`ExprStringLiteral`,
`ExprBytesLiteral`, and `ExprFString`) to account for implicit string
concatenation.

### Motivation

In Python, implicit string concatenation are joined while parsing
because the interpreter doesn't require the information for each part.
While that's feasible for an interpreter, it falls short for a static
analysis tool where having such information is more useful. Currently,
various parts of the code uses the lexer to get the individual string
parts.

One of the main challenge this solves is that of string formatting.
Currently, the formatter relies on the lexer to get the individual
string parts, and formats them including the comments accordingly. But,
with PEP 701, f-string can also contain comments. Without this change,
it becomes very difficult to add support for f-string formatting.

### Implementation

The initial proposal was made in this discussion:
https://github.com/astral-sh/ruff/discussions/6183#discussioncomment-6591993.
There were various AST designs which were explored for this task which
are available in the linked internal document[^1].

The selected variant was the one where the nodes were kept as it is
except that the `implicit_concatenated` field was removed and instead a
new struct was added to the `Expr*` struct. This would be a private
struct would contain the actual implementation of how the AST is
designed for both single and implicitly concatenated strings.

This implementation is achieved through an enum with two variants:
`Single` and `Concatenated` to avoid allocating a vector even for single
strings. There are various public methods available on the value struct
to query certain information regarding the node.

The nodes are structured in the following way:

```
ExprStringLiteral - "foo" "bar"
|- StringLiteral - "foo"
|- StringLiteral - "bar"

ExprBytesLiteral - b"foo" b"bar"
|- BytesLiteral - b"foo"
|- BytesLiteral - b"bar"

ExprFString - "foo" f"bar {x}"
|- FStringPart::Literal - "foo"
|- FStringPart::FString - f"bar {x}"
  |- StringLiteral - "bar "
  |- FormattedValue - "x"
```

[^1]: Internal document:
https://www.notion.so/astral-sh/Implicit-String-Concatenation-e036345dc48943f89e416c087bf6f6d9?pvs=4

#### Visitor

The way the nodes are structured is that the entire string, including
all the parts that are implicitly concatenation, is a single node
containing individual nodes for the parts. The previous section has a
representation of that tree for all the string nodes. This means that
new visitor methods are added to visit the individual parts of string,
bytes, and f-strings for `Visitor`, `PreorderVisitor`, and
`Transformer`.

## Test Plan

- `cargo insta test --workspace --all-features --unreferenced reject`
- Verify that the ecosystem results are unchanged
2023-11-24 17:55:41 -06:00
Samuel Cormier-Iijima 852a8f4a4f
[PIE796] don't report when using ellipses for enum values in stub files (#8825)
## Summary

Just ignores ellipses as enum values inside stub files.

Fixes #8818.
2023-11-24 15:24:57 +00:00
konsti 14e65afdc6
Update to Rust 1.74 and use new clippy lints table (#8722)
Update to [Rust
1.74](https://blog.rust-lang.org/2023/11/16/Rust-1.74.0.html) and use
the new clippy lints table.

The update itself introduced a new clippy lint about superfluous hashes
in raw strings, which got removed.

I moved our lint config from `rustflags` to the newly stabilized
[workspace.lints](https://doc.rust-lang.org/stable/cargo/reference/workspaces.html#the-lints-table).
One consequence is that we have to `unsafe_code = "warn"` instead of
"forbid" because the latter now actually bans unsafe code:

```
error[E0453]: allow(unsafe_code) incompatible with previous forbid
  --> crates/ruff_source_file/src/newlines.rs:62:17
   |
62 |         #[allow(unsafe_code)]
   |                 ^^^^^^^^^^^ overruled by previous forbid
   |
   = note: `forbid` lint level was set on command line
```

---------

Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2023-11-16 18:12:46 -05:00
Charlie Marsh bf2cc3f520
Add autotyping-like return type inference for annotation rules (#8643)
## Summary

This PR adds (unsafe) fixes to the flake8-annotations rules that enforce
missing return types, offering to automatically insert type annotations
for functions with literal return values. The logic is smart enough to
generate simplified unions (e.g., `float` instead of `int | float`) and
deal with implicit returns (`return` without a value).

Closes https://github.com/astral-sh/ruff/issues/1640 (though we could
open a separate issue for referring parameter types).

Closes https://github.com/astral-sh/ruff/issues/8213.

## Test Plan

`cargo test`
2023-11-13 23:34:15 -05:00
Charlie Marsh df9ade7fd9
Use AST transformer for `relocate` (#8660) 2023-11-13 13:24:27 -05:00
Charlie Marsh 345e1401cf
Treat `class C: ...` and `class C(): ...` equivalently (#8659)
## Summary

These should be seen as identical from the `ComparableAst` perspective.
2023-11-13 18:03:04 +00:00
Charlie Marsh d574fcd1ac
Compare formatted and unformatted ASTs during formatter tests (#8624)
## Summary

This PR implements validation in the formatter tests to ensure that we
don't modify the AST during formatting. Black has similar logic.

In implementing this, I learned that Black actually _does_ modify the
AST, and their test infrastructure normalizes the AST to wipe away those
differences. Specifically, Black changes the indentation of docstrings,
which _does_ modify the AST; and it also inserts parentheses in `del`
statements, which changes the AST too.

Ruff also does both these things, so we _also_ implement the same
normalization using a new visitor that allows for modifying the AST.

Closes https://github.com/astral-sh/ruff/issues/8184.

## Test Plan

`cargo test`
2023-11-13 17:43:27 +00:00
Jesse Serrao 39728a1198
Add check for is comparison with mutable initialisers to rule F632 (#8607)
## Summary

Adds an extra check to F632 to check for any `is` comparisons to a
mutable initialisers.
Implements #8589 .

Example:
```Python
named_var = {}
if named_var is {}:  # F632 (fix)
    pass
```
The if condition will always evaluate to False because it checks on
identity and it's impossible to take the same identity as a hard coded
list/set/dict initializer.

## Test Plan

Multiple test cases were added to ensure the rule works + doesn't flag
false positives + the fix works correctly.
2023-11-11 00:29:23 +00:00
Kar Petrosyan e2c7b1ece6
[TRIO] Add TRIO109 rule (#8534)
## Summary

Adds TRIO109 from the [flake8-trio
plugin](https://github.com/Zac-HD/flake8-trio).
Relates to: https://github.com/astral-sh/ruff/issues/8451
2023-11-07 17:13:01 -05:00
qdegraaf 4170ef0508
[`TRIO`] Add `TRIO105`: `SyncTrioCall` (#8490)
## Summary

Adds `TRIO105` from the [flake8-trio
plugin](https://github.com/Zac-HD/flake8-trio). The `MethodName` logic
mirrors that of `TRIO100` to stay consistent within the plugin.

It is at 95% parity with the exception of upstream also checking for a
slightly more complex scenario where a call to `start()` on a
`trio.Nursery` context should also be immediately awaited. Upstream
plugin appears to just check for anything named `nursery` judging from
[the relevant issue](https://github.com/Zac-HD/flake8-trio/issues/56).

Unsure if we want to do so something similar or, alternatively, if there
is some capability in ruff to check for calls made on this context some
other way

## Test Plan

Added a new fixture, based on [the one from upstream
plugin](https://github.com/Zac-HD/flake8-trio/blob/main/tests/eval_files/trio105.py)

## Issue link

Refers: https://github.com/astral-sh/ruff/issues/8451
2023-11-05 19:56:10 +00:00
Micha Reiser f16505d885
Formatter: Remove unnecessary `group` (#8455) 2023-11-03 04:14:29 +00:00
Dhruv Manilawala d350ede992
Remove unicode flag from comparable (#8440)
## Summary

This PR removes the `unicode` flag from the string literal in
`ComparableExpr`. This flag isn't required as all strings are unicode in
Python 3 so `"foo" == u"foo"`.
2023-11-02 13:21:45 +05:30
Dhruv Manilawala 97ae617fac
Introduce `LiteralExpressionRef` for all literals (#8339)
## Summary

This PR adds a new `LiteralExpressionRef` which wraps all of the literal
expression nodes in a single enum. This allows for a narrow type when
working exclusively with a literal node. Additionally, it also
implements a `Expr::as_literal_expr` method to return the new enum if
the expression is indeed a literal one.

A few rules have been updated to account for the new enum:
1. `redundant_literal_union`
2. `if_else_block_instead_of_dict_lookup`
3. `magic_value_comparison`

To account for the change in (2), a new `ComparableLiteral` has been
added which can be constructed from the new enum
(`ComparableLiteral::from(<LiteralExpressionRef>)`).

### Open Questions

1. The new `ComparableLiteral` can be exclusively used via the
`LiteralExpressionRef` enum. Should we remove all of the literal
variants from `ComparableExpr` and instead have a single
`ComparableExpr::Literal(ComparableLiteral)` variant instead?

## Test Plan

`cargo test`
2023-10-31 12:56:11 +00:00
Dhruv Manilawala 8977b6ae11
Inline AST helpers for new literal nodes (#8374)
A small refactor to inline the `is_const_none` now that there's a
dedicated `ExprNoneLiteral` node.
2023-10-31 11:06:54 +00:00
Charlie Marsh 161c093c06
Avoid including literal `shell=True` for truthy, non-`True` diagnostics (#8359)
## Summary

If the value of `shell` wasn't literally `True`, we now show a message
describing it as truthy, rather than the (misleading) `shell=True`
literal in the diagnostic.

Closes https://github.com/astral-sh/ruff/issues/8310.
2023-10-30 15:44:38 +00:00
Dhruv Manilawala b0dc5a86a1
Impl `Default` for `(String|Bytes|Boolean|None|Ellipsis)Literal` (#8341)
## Summary

This PR adds `Default` for the following literal nodes:
* `StringLiteral`
* `BytesLiteral`
* `BooleanLiteral`
* `NoneLiteral`
* `EllipsisLiteral`

The implementation creates the zero value of the respective literal
nodes in terms of the Python language.

## Test Plan

`cargo test`
2023-10-30 08:47:44 +00:00
Dhruv Manilawala 230c9ce236
Split `Constant` to individual literal nodes (#8064)
## Summary

This PR splits the `Constant` enum as individual literal nodes. It
introduces the following new nodes for each variant:
* `ExprStringLiteral`
* `ExprBytesLiteral`
* `ExprNumberLiteral`
* `ExprBooleanLiteral`
* `ExprNoneLiteral`
* `ExprEllipsisLiteral`

The main motivation behind this refactor is to introduce the new AST
node for implicit string concatenation in the coming PR. The elements of
that node will be either a string literal, bytes literal or a f-string
which can be implemented using an enum. This means that a string or
bytes literal cannot be represented by `Constant::Str` /
`Constant::Bytes` which creates an inconsistency.

This PR avoids that inconsistency by splitting the constant nodes into
it's own literal nodes, literal being the more appropriate naming
convention from a static analysis tool perspective.

This also makes working with literals in the linter and formatter much
more ergonomic like, for example, if one would want to check if this is
a string literal, it can be done easily using
`Expr::is_string_literal_expr` or matching against `Expr::StringLiteral`
as oppose to matching against the `ExprConstant` and enum `Constant`. A
few AST helper methods can be simplified as well which will be done in a
follow-up PR.

This introduces a new `Expr::is_literal_expr` method which is the same
as `Expr::is_constant_expr`. There are also intermediary changes related
to implicit string concatenation which are quiet less. This is done so
as to avoid having a huge PR which this already is.

## Test Plan

1. Verify and update all of the existing snapshots (parser, visitor)
2. Verify that the ecosystem check output remains **unchanged** for both
the linter and formatter

### Formatter ecosystem check

#### `main`

| project | similarity index | total files | changed files |

|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75803 | 1799 | 1647 |
| django | 0.99983 | 2772 | 34 |
| home-assistant | 0.99953 | 10596 | 186 |
| poetry | 0.99891 | 317 | 17 |
| transformers | 0.99966 | 2657 | 330 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99978 | 3669 | 20 |
| warehouse | 0.99977 | 654 | 13 |
| zulip | 0.99970 | 1459 | 22 |

#### `dhruv/constant-to-literal`

| project | similarity index | total files | changed files |

|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75803 | 1799 | 1647 |
| django | 0.99983 | 2772 | 34 |
| home-assistant | 0.99953 | 10596 | 186 |
| poetry | 0.99891 | 317 | 17 |
| transformers | 0.99966 | 2657 | 330 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99978 | 3669 | 20 |
| warehouse | 0.99977 | 654 | 13 |
| zulip | 0.99970 | 1459 | 22 |
2023-10-30 12:13:23 +05:30
Dhruv Manilawala 78bbf6d403
New `Singleton` enum for `PatternMatchSingleton` node (#8063)
## Summary

This PR adds a new `Singleton` enum for the `PatternMatchSingleton`
node.

Earlier the node was using the `Constant` enum but the value for this
pattern can only be either `None`, `True` or `False`. With the coming PR
to remove the `Constant`, this node required a new type to fill in.

This also has the benefit of narrowing the type down to only the
possible values for the node as evident by the removal of `unreachable`.

## Test Plan

Update the AST snapshots and run `cargo test`.
2023-10-30 05:48:53 +00:00
Dhruv Manilawala ec1be60dcb
Remove leftover constant tuple reference (#8062)
This PR removes the leftover reference to the tuple variant in
`Constant`.
2023-10-19 17:50:45 +00:00
konsti 8f9753f58e
Comments outside expression parentheses (#7873)
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

Fixes https://github.com/astral-sh/ruff/issues/7448
Fixes https://github.com/astral-sh/ruff/issues/7892

I've removed automatic dangling comment formatting, we're doing manual
dangling comment formatting everywhere anyway (the
assert-all-comments-formatted ensures this) and dangling comments would
break the formatting there.

## Test Plan

New test file.

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
2023-10-19 09:24:11 +00:00
konsti 0c3123e07e
Insert newline after nested function or class statements (#7946)
**Summary** Insert a newline after nested function and class
definitions, unless there is a trailing own line comment.

We need to e.g. format
```python
if platform.system() == "Linux":
    if sys.version > (3, 10):
        def f():
            print("old")
    else:
        def f():
            print("new")
    f()
```
as
```python
if platform.system() == "Linux":
    if sys.version > (3, 10):

        def f():
            print("old")

    else:

        def f():
            print("new")

    f()
```
even though `f()` is directly preceded by an if statement, not a
function or class definition. See the comments and fixtures for trailing
own line comment handling.

**Test Plan** I checked that the new content of `newlines.py` matches
black's formatting.

---------

Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2023-10-18 09:45:58 +00:00
Charlie Marsh d685107638
Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030)
This is a do-over of https://github.com/astral-sh/ruff/pull/8011, which
I accidentally merged into a non-`main` branch. Sorry!
2023-10-18 00:01:18 +00:00
Tom Kuson 62f1ee08e7
[`refurb`] Implement `single-item-membership-test` (`FURB171`) (#7815)
## Summary

Implement
[`no-single-item-in`](https://github.com/dosisod/refurb/blob/master/refurb/checks/iterable/no_single_item_in.py)
as `single-item-membership-test` (`FURB171`).

Uses the helper function `generate_comparison` from the `pycodestyle`
implementations; this function should probably be moved, but I am not
sure where at the moment.

Update: moved it to `ruff_python_ast::helpers`.

Related to #1348.

## Test Plan

`cargo test`
2023-10-08 14:08:47 +00:00
Charlie Marsh f71c80af68
Show changed files when running under `--check` (#7788)
## Summary

We now list each changed file when running with `--check`.

Closes https://github.com/astral-sh/ruff/issues/7782.

## Test Plan

```
❯ cargo run -p ruff_cli -- format foo.py --check
   Compiling ruff_cli v0.0.292 (/Users/crmarsh/workspace/ruff/crates/ruff_cli)
rgo +    Finished dev [unoptimized + debuginfo] target(s) in 1.41s
     Running `target/debug/ruff format foo.py --check`
warning: `ruff format` is a work-in-progress, subject to change at any time, and intended only for experimentation.
Would reformat: foo.py
1 file would be reformatted
```
2023-10-03 18:50:06 +00:00
Tom Kuson e129f77bcf
Extend `reimplemented-starmap` (`FURB140`) to catch calls with a single and starred argument (#7768) 2023-10-02 21:38:05 -04:00
Dhruv Manilawala e62e245c61
Add support for PEP 701 (#7376)
## Summary

This PR adds support for PEP 701 in Ruff. This is a rollup PR of all the
other individual PRs. The separate PRs were created for logic separation
and code reviews. Refer to each pull request for a detail description on
the change.

Refer to the PR description for the list of pull requests within this PR.

## Test Plan

### Formatter ecosystem checks

Explanation for the change in ecosystem check:
https://github.com/astral-sh/ruff/pull/7597#issue-1908878183

#### `main`

```
| project      | similarity index  | total files       | changed files     |
|--------------|------------------:|------------------:|------------------:|
| cpython      |           0.76083 |              1789 |              1631 |
| django       |           0.99983 |              2760 |                36 |
| transformers |           0.99963 |              2587 |               319 |
| twine        |           1.00000 |                33 |                 0 |
| typeshed     |           0.99983 |              3496 |                18 |
| warehouse    |           0.99967 |               648 |                15 |
| zulip        |           0.99972 |              1437 |                21 |
```

#### `dhruv/pep-701`

```
| project      | similarity index  | total files       | changed files     |
|--------------|------------------:|------------------:|------------------:|
| cpython      |           0.76051 |              1789 |              1632 |
| django       |           0.99983 |              2760 |                36 |
| transformers |           0.99963 |              2587 |               319 |
| twine        |           1.00000 |                33 |                 0 |
| typeshed     |           0.99983 |              3496 |                18 |
| warehouse    |           0.99967 |               648 |                15 |
| zulip        |           0.99972 |              1437 |                21 |
```
2023-09-29 02:55:39 +00:00
Charlie Marsh f45281345d
Include radix base prefix in large number representation (#7700)
## Summary

When lexing a number like `0x995DC9BBDF1939FA` that exceeds our small
number representation, we were only storing the portion after the base
(in this case, `995DC9BBDF1939FA`). When using that representation in
code generation, this could lead to invalid syntax, since
`995DC9BBDF1939FA)` on its own is not a valid integer.

This PR modifies the code to store the full span, including the radix
prefix.

See:
https://github.com/astral-sh/ruff/issues/7455#issuecomment-1739802958.

## Test Plan

`cargo test`
2023-09-28 20:38:06 +00:00
Charlie Marsh 0a8cad2550
Allow named expressions in `__all__` assignments (#7673)
## Summary

This PR adds support for named expressions when analyzing `__all__`
assignments, as per https://github.com/astral-sh/ruff/issues/7672. It
also loosens the enforcement around assignments like: `__all__ =
list(some_other_expression)`. We shouldn't flag these as invalid, even
though we can't analyze the members, since we _know_ they evaluate to a
`list`.

Closes https://github.com/astral-sh/ruff/issues/7672.

## Test Plan

`cargo test`
2023-09-27 00:36:55 -04:00
Charlie Marsh 93b5d8a0fb
Implement our own small-integer optimization (#7584)
## Summary

This is a follow-up to #7469 that attempts to achieve similar gains, but
without introducing malachite. Instead, this PR removes the `BigInt`
type altogether, instead opting for a simple enum that allows us to
store small integers directly and only allocate for values greater than
`i64`:

```rust
/// A Python integer literal. Represents both small (fits in an `i64`) and large integers.
#[derive(Clone, PartialEq, Eq, Hash)]
pub struct Int(Number);

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Number {
    /// A "small" number that can be represented as an `i64`.
    Small(i64),
    /// A "large" number that cannot be represented as an `i64`.
    Big(Box<str>),
}

impl std::fmt::Display for Number {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Number::Small(value) => write!(f, "{value}"),
            Number::Big(value) => write!(f, "{value}"),
        }
    }
}
```

We typically don't care about numbers greater than `isize` -- our only
uses are comparisons against small constants (like `1`, `2`, `3`, etc.),
so there's no real loss of information, except in one or two rules where
we're now a little more conservative (with the worst-case being that we
don't flag, e.g., an `itertools.pairwise` that uses an extremely large
value for the slice start constant). For simplicity, a few diagnostics
now show a dedicated message when they see integers that are out of the
supported range (e.g., `outdated-version-block`).

An additional benefit here is that we get to remove a few dependencies,
especially `num-bigint`.

## Test Plan

`cargo test`
2023-09-25 15:13:21 +00:00
Charlie Marsh 4d6f5ff0a7
Remove `Int` wrapper type from parser (#7577)
## Summary

This is only used for the `level` field in relative imports (e.g., `from
..foo import bar`). It seems unnecessary to use a wrapper here, so this
PR changes to a `u32` directly.
2023-09-21 17:01:44 +00:00
Charlie Marsh 5df0326bc8
Treat parameters-with-newline as empty in function formatting (#7550)
## Summary

If a function has no parameters (and no comments within the parameters'
`()`), we're supposed to wrap the return annotation _whenever_ it
breaks. However, our `empty_parameters` test didn't properly account for
the case in which the parameters include a newline (but no other
content), like:

```python
def get_dashboards_hierarchy(
) -> Dict[Type['BaseDashboard'], List[Type['BaseDashboard']]]:
    """Get hierarchy of dashboards classes.

    Returns:
        Dict of dashboards classes.
    """
    dashboards_hierarchy = {}
```

This PR fixes that detection. Instead of lexing, it now checks if the
parameters itself is empty (or if it contains comments).

Closes https://github.com/astral-sh/ruff/issues/7457.
2023-09-20 16:20:22 -04:00
konsti 2cbe1733c8
Use CommentRanges in backwards lexing (#7360)
## Summary

The tokenizer was split into a forward and a backwards tokenizer. The
backwards tokenizer uses the same names as the forwards ones (e.g.
`next_token`). The backwards tokenizer gets the comment ranges that we
already built to skip comments.

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
2023-09-16 03:21:45 +00:00
Charlie Marsh ec2f229a45
Remove `ExprContext` from `ComparableExpr` (#7362)
`ComparableExpr` includes the `ExprContext` field on an expression, so,
e.g., the two tuples in `(a, b) = (a, b)` won't be considered equal.
Similarly, the tuples in `[(a, b) for (a, b) in c]` _also_ wouldn't be
considered equal. I find this behavior surprising, since
`ComparableExpr` is intended to allow you to compare two ASTs, but
`ExprContext` is really encoding information about the broader context
for the expression.
2023-09-14 15:40:02 +00:00