Python/ruff - ruff - Gitea: Git with a cup of tea

Commit Graph

Author	SHA1	Message	Date
Dhruv Manilawala	13ffb5bc19	Replace LALRPOP parser with hand-written parser (#10036 ) (Supersedes #9152, authored by @LaBatata101) ## Summary This PR replaces the current parser generated from LALRPOP with a hand-written recursive descent parser. It also updates the grammar for [PEP 646](https://peps.python.org/pep-0646/) so that the parser outputs the correct AST. For example, in `data[*x]`, the index expression is now a tuple with a single starred expression instead of just a starred expression. Beyond the performance improvements, the parser is also error-resilient and can provide better error messages. The behavior as seen by any downstream tools isn't changed. That is, the linter and formatter can still assume that the parser will _stop_ at the first syntax error. This will be updated in the following months. For more details about the change here, refer to the PR corresponding to the individual commits and the release blog post. ## Test Plan Write _lots_ and _lots_ of tests for both valid and invalid syntax and verify the output. ## Acknowledgements - @MichaReiser for reviewing 100+ parser PRs and continuously providing guidance throughout the project - @LaBatata101 for initiating the transition to a hand-written parser in #9152 - @addisoncrump for implementing the fuzzer which helped [catch](https://github.com/astral-sh/ruff/pull/10903) [a](https://github.com/astral-sh/ruff/pull/10910) [lot](https://github.com/astral-sh/ruff/pull/10966) [of](https://github.com/astral-sh/ruff/pull/10896) [bugs](https://github.com/astral-sh/ruff/pull/10877) --------- Co-authored-by: Victor Hugo Gomes <labatata101@linuxmail.org> Co-authored-by: Micha Reiser <micha@reiser.io>	2024-04-18 17:57:39 +05:30
Alex Waygood	f779babc5f	Improve handling of builtin symbols in linter rules (#10919 ) Add a new method to the semantic model to simplify and improve the correctness of a common pattern	2024-04-16 11:37:31 +01:00
renovate[bot]	388658efdb	Update pre-commit dependencies (#10698 ) Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> Co-authored-by: Zanie Blue <contact@zanie.dev> Co-authored-by: Alex Waygood <alex.waygood@gmail.com>	2024-04-06 23:00:41 +00:00
Charlie Marsh	7fb5f47efe	Respect `# noqa` directives on `__all__` openers (#10798 ) ## Summary Historically, given: ```python __all__ = [ # noqa: F822 "Bernoulli", "Beta", "Binomial", ] ``` The F822 violations would be attached to the `__all__`, so this `# noqa` would be enforced for _all_ definitions in the list. This changed in https://github.com/astral-sh/ruff/pull/10525 for the better, in that we now use the range of each string. But these `# noqa` directives stopped working. This PR sets the `__all__` as a parent range in the diagnostic, so that these directives are respected once again. Closes https://github.com/astral-sh/ruff/issues/10795. ## Test Plan `cargo test`	2024-04-06 14:51:17 +00:00
Dhruv Manilawala	99dd3a8ab0	Implement `as_str` & `Display` for all operator enums (#10691 ) ## Summary This PR adds the `as_str` implementation for all the operator methods. It already exists for `CmpOp` which is being [used in the linter](`ffcd77860c/crates/ruff_linter/src/rules/flake8_simplify/rules/key_in_dict.rs (L117)`) and it makes sense to implement it for the rest as well. This will also be utilized in error messages for the new parser.	2024-04-02 10:34:36 +00:00
Dhruv Manilawala	eee2d5b915	Remove unused operator methods and impl (#10690 ) ## Summary This PR removes unused operator methods and impl traits. There is already the `is_macro::Is` implementation for all the operators and this seems unnecessary.	2024-04-02 15:53:20 +05:30
Alex Waygood	a06ffeb54e	Track ranges of names inside `__all__` definitions (#10525 )	2024-03-22 18:38:40 +00:00
Charlie Marsh	60fd98eb2f	Update Rust to v1.77 (#10510 )	2024-03-21 12:10:33 -04:00
Alex Waygood	7caf0d064a	Simplify formatting of strings by using flags from the AST nodes (#10489 )	2024-03-20 16:16:54 +00:00
Alex Waygood	162d2eb723	Track casing of r-string prefixes in the tokenizer and AST (#10314 ) Co-authored-by: Micha Reiser <micha@reiser.io>	2024-03-18 17:18:04 +00:00
Dhruv Manilawala	5f40371ffc	Use `ExprFString` for `StringLike::FString` variant (#10311 ) ## Summary This PR updates the `StringLike::FString` variant to use `ExprFString` instead of `FStringLiteralElement`. For context, the reason it used `FStringLiteralElement` is that the node is actually the string part of an f-string ("foo" in `f"foo{x}"`). But, this is inconsistent with other variants where the captured value is the _entire_ string. This is also problematic w.r.t. implicitly concatenated strings. Any rule which works with `StringLike::FString` doesn't account for the string part in an implicitly concatenated f-string. For example, we don't flag confusable characters in the first part of `"𝐁ad" f"𝐁ad string"`, but only the second part (https://play.ruff.rs/16071c4c-a1dd-4920-b56f-e2ce2f69c843). ### Update `PYI053` _This is included in this PR because otherwise it requires a temporary workaround to be compatible with the old logic._ This PR also updates the `PYI053` (`string-or-bytes-too-long`) rule for f-strings to consider _all_ the visible characters in an f-string, including the ones which are implicitly concatenated. This is consistent with implicitly concatenated strings and bytes. For example, ```python def foo( # We count all the characters here arg1: str = '51 character ' 'stringgggggggggggggggggggggggggggggggg', # But not here because of the `{x}` replacement field which _breaks_ them up into two chunks arg2: str = f'51 character {x} stringgggggggggggggggggggggggggggggggggggggggggggg', ) -> None: ... ``` This PR fixes it to consider all _visible_ characters inside an f-string which includes expressions as well. fixes: #10310 fixes: #10307 ## Test Plan Add new test cases and update the snapshots. ## Review To facilitate the review process, the changes have been split into two commits: one which has the code change while the other has the test cases and updated snapshots.	2024-03-14 13:30:22 +05:30
Alex Waygood	c2e15f38ee	Unify enums used for internal representation of quoting style (#10383 )	2024-03-13 17:19:17 +00:00
Dhruv Manilawala	32d6f84e3d	Add methods to iter over f-string elements (#10309 ) ## Summary This PR adds methods on `FString` to iterate over the two different kind of elements it can have - literals and expressions. This is similar to the methods we have on `ExprFString`. --------- Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>	2024-03-13 08:46:55 +00:00
Charlie Marsh	dbf82233b8	Gate f-string struct size test for Rustc < 1.76 (#10371 ) Closes https://github.com/astral-sh/ruff/issues/10319.	2024-03-12 15:46:36 -04:00
Alex Waygood	1d97f27335	Start tracking quoting style in the AST (#10298 ) This PR modifies our AST so that nodes for string literals, bytes literals and f-strings all retain the following information: - The quoting style used (double or single quotes) - Whether the string is triple-quoted or not - Whether the string is raw or not This PR is a followup to #10256. Like with that PR, this PR does not, in itself, fix any bugs. However, it means that we will have the necessary information to preserve quoting style and rawness of strings in the `ExprGenerator` in a followup PR, which will allow us to provide a fix for https://github.com/astral-sh/ruff/issues/7799. The information is recorded on the AST nodes using a bitflag field on each node, similarly to how we recorded the information on `Tok::String`, `Tok::FStringStart` and `Tok::FStringMiddle` tokens in #10298. Rather than reusing the bitflag I used for the tokens, however, I decided to create a custom bitflag for each AST node. Using different bitflags for each node allows us to make invalid states unrepresentable: it is valid to set a `u` prefix on a string literal, but not on a bytes literal or an f-string. It also allows us to have better debug representations for each AST node modified in this PR.	2024-03-08 19:11:47 +00:00
Micha Reiser	8ea5b08700	refactor: Use `QualifiedName` for `Imported::call_path` (#10214 ) ## Summary When you try to remove an internal representation leaking into another type and end up rewriting a simple version of `smallvec`. The goal of this PR is to replace the `Box<[&'a str]>` with `Box<QualifiedName>` to avoid that the internal `QualifiedName` representation leaks (and it gives us a nicer API too). However, doing this when `QualifiedName` uses `SmallVec` internally gives us all sorts of funny lifetime errors. I was lost but @BurntSushi came to rescue me. He figured out that `smallvec` has a variance problem which is already tracked in https://github.com/servo/rust-smallvec/issues/146 To fix the variance problem, I could use the smallvec-2-alpha-4 or implement our own smallvec. I went with implementing our own small vec for this specific problem. It obviously isn't as sophisticated as smallvec (only uses safe code), e.g. it doesn't perform any size optimizations, but it does its job. Other changes: * Removed `Imported::qualified_name` (the version that returns a `String`). This can be replaced by calling `ToString` on the qualified name. * Renamed `Imported::call_path` to `qualified_name` and changed its return type to `&QualifiedName`. * Renamed `QualifiedName::imported` to `user_defined` which is the more common term when talking about builtins vs the rest/user defined functions. ## Test plan `cargo test`	2024-03-06 09:55:59 +01:00
Micha Reiser	184241f99a	Remove `Expr` postfix from `ExprNamed`, `ExprIf`, and `ExprGenerator` (#10229 ) The expression types in our AST are called `ExprYield`, `ExprAwait`, `ExprStringLiteral` etc, except `ExprNamedExpr`, `ExprIfExpr` and `ExprGeneratorExpr`. This seems to align with [Python AST's naming](https://docs.python.org/3/library/ast.html) but feels inconsistent and excessive. This PR removes the `Expr` postfix from `ExprNamedExpr`, `ExprIfExpr`, and `ExprGeneratorExpr`.	2024-03-04 12:55:01 +01:00
Micha Reiser	a6d892b1f4	Split `CallPath` into `QualifiedName` and `UnqualifiedName` (#10210 ) ## Summary Charlie can probably explain this better than I but it turns out, `CallPath` is used for two different things: * To represent unqualified names like `version` where `version` can be a local variable or imported (e.g. `from sys import version` where the fully qualified name is `sys.version`) * To represent resolved, fully qualified names This PR splits `CallPath` into two types to make this distinction clear. > Note: I haven't renamed all `call_path` variables to `qualified_name` or `unqualified_name`. I can do that if that's welcomed but I first want to get feedback on the approach and naming overall. ## Test Plan `cargo test`	2024-03-04 09:06:51 +00:00
Micha Reiser	db25a563f7	Remove unneeded lifetime bounds (#10213 ) ## Summary This PR removes the unneeded lifetime `'b` from many of our `Visitor` implementations. The lifetime is unneeded because it is only constrained by `'a`, so we can use `'a` directly. ## Test Plan `cargo build`	2024-03-03 18:12:11 +00:00
Micha Reiser	e725b6fdaf	CallPath newtype wrapper (#10201 ) ## Summary This PR changes the `CallPath` type alias to a newtype wrapper. A newtype wrapper allows us to limit the API and to experiment with alternative ways to implement matching on `CallPath`s. ## Test Plan `cargo test`	2024-03-03 16:54:24 +01:00
Jane Lewis	0293908b71	Implement RUF028 to detect useless formatter suppression comments (#9899 ) Fixes #6611 ## Summary This lint rule spots comments that are _intended_ to suppress or enable the formatter, but will be ignored by the Ruff formatter. We borrow some functions the formatter uses for determining comment placement / putting them in context within an AST. The analysis function uses an AST visitor to visit each comment and attach it to the AST. It then uses that context to check: 1. Is this comment in an expression? 2. Does this comment have bad placement? (e.g. a `# fmt: skip` above a function instead of at the end of a line) 3. Is this comment redundant? 4. Does this comment actually suppress any code? 5. Does this comment have ambiguous placement? (e.g. a `# fmt: off` above an `else:` block) If any of these are true, a violation is thrown. The reported reason depends on the order of the above check-list: in other words, a `# fmt: skip` comment on its own line within a list expression will be reported as being in an expression, since that reason takes priority. The lint suggests removing the comment as an unsafe fix, regardless of the reason. ## Test Plan A snapshot test has been created.	2024-02-28 19:21:06 +00:00
Micha Reiser	77c5561646	Add `parenthesized` flag to `ExprTuple` and `ExprGenerator` (#9614 )	2024-02-26 15:35:20 +00:00
Charlie Marsh	0304623878	[`perflint`] Catch a wider range of mutations in `PERF101` (#9955 ) ## Summary This PR ensures that if a list `x` is modified within a `for` loop, we avoid flagging `list(x)` as unnecessary. Previously, we only detected calls to exactly `.append`, and they couldn't be nested within other statements. Closes https://github.com/astral-sh/ruff/issues/9925.	2024-02-12 12:17:55 -05:00
Charlie Marsh	6f0e4ad332	Remove unnecessary string cloning from the parser (#9884 ) Closes https://github.com/astral-sh/ruff/issues/9869.	2024-02-09 16:03:27 -05:00
Charlie Marsh	49fe1b85f2	Reduce size of `Expr` from 80 to 64 bytes (#9900 ) ## Summary This PR reduces the size of `Expr` from 80 to 64 bytes, by reducing the sizes of... - `ExprCall` from 72 to 56 bytes, by using boxed slices for `Arguments`. - `ExprCompare` from 64 to 48 bytes, by using boxed slices for its various vectors. In testing, the parser gets a bit faster, and the linter benchmarks improve quite a bit.	2024-02-09 02:53:13 +00:00
Micha Reiser	fe7d965334	Reduce `Result<Tok, LexicalError>` size by using `Box<str>` instead of `String` (#9885 )	2024-02-08 20:36:22 +00:00
Micha Reiser	688177ff6a	Use Rust 1.76 (#9897 )	2024-02-08 18:20:08 +00:00
Charlie Marsh	daae28efc7	Respect `async with` in `timeout-without-await` (#9859 ) Closes https://github.com/astral-sh/ruff/issues/9855.	2024-02-06 12:04:24 -05:00
Dhruv Manilawala	36b752876e	Implement `AnyNode`/`AnyNodeRef` for `FStringFormatSpec` (#9836 ) ## Summary This PR adds the `AnyNode` and `AnyNodeRef` implementation for the `FStringFormatSpec` node which will be required in the f-string formatting. The main usage for this is so that we can pass in the node directly to `suppressed_node` in case a debug expression is used, to format it as verbatim text.	2024-02-05 19:23:43 +00:00
Charlie Marsh	ea1c089652	Use `AhoCorasick` to speed up quote match (#9773 ) ## Summary When I was looking at the v0.2.0 release, this method showed up in a CodSpeed regression (we were calling it more), so I decided to quickly look at speeding it up. @BurntSushi suggested using Aho-Corasick, and it looks like it's about 7 or 8x faster: ```text Parser/AhoCorasick time: [8.5646 ns 8.5914 ns 8.6191 ns] Parser/Iterator time: [64.992 ns 65.124 ns 65.271 ns] ``` ## Test Plan `cargo test`	2024-02-02 09:57:39 -05:00
Micha Reiser	ce14f4dea5	Range formatting API (#9635 )	2024-01-31 11:13:37 +01:00
Micha Reiser	5fe0fdd0a8	Delete `is_node_with_body` method (#9643 )	2024-01-25 14:41:13 +00:00
Alex Waygood	a1e65a92bd	Move `is_tuple_parenthesized` from the formatter to `ruff_python_ast` (#9533 ) This allows it to be used in the linter as well as the formatter. It will be useful in #9474	2024-01-15 16:10:40 +00:00
Chammika Mannakkara	0003c730e0	[`flake8-simplify`] Implement `enumerate-for-loop` (`SIM113`) (#7777 ) Implements SIM113 from #998 Added tests Limitations - No fix yet - Only flag cases where the index variable immediately precedes the `for` loop @charliermarsh please review and let me know any improvements --------- Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>	2024-01-14 11:00:59 -05:00
Charlie Marsh	009430e034	[`ruff`] Avoid treating named expressions as static keys (`RUF011`) (#9494 ) Closes https://github.com/astral-sh/ruff/issues/9487.	2024-01-12 14:33:45 -05:00
Charlie Marsh	a31a314b2b	Account for possibly-empty f-string values in truthiness logic (#9484 ) Closes https://github.com/astral-sh/ruff/issues/9479.	2024-01-11 21:16:19 -05:00
Micha Reiser	f192c72596	Remove type parameter from `parse_*` methods (#9466 )	2024-01-11 19:41:19 +01:00
Micha Reiser	94968fedd5	Use Rust 1.75 toolchain (#9437 )	2024-01-08 18:03:16 +01:00
Charlie Marsh	701697c37e	Support variable keys in static dictionary key rule (#9411 ) Closes https://github.com/astral-sh/ruff/issues/9410.	2024-01-06 20:44:40 +00:00
Charlie Marsh	0e202718fd	Misc. small tweaks from perusing modules (#9383 )	2024-01-03 12:30:25 -05:00
Charlie Marsh	9073220887	Make all dependencies workspace dependencies (#9333 ) ## Summary This PR modifies our `Cargo.toml` files to use workspace dependencies for _all_ dependencies, rather than the status quo of sporadically trying to use workspace dependencies for those dependencies that are used across multiple crates. I find the current situation more confusing and harder to manage, since we have a mix of workspace and crate-local dependencies, whereas this setup consistently uses the same approach for all dependencies.	2024-01-02 13:41:59 +00:00
Charlie Marsh	1c9268d2ff	Remove some unused dependencies (#9330 )	2023-12-31 07:38:16 -05:00
Charlie Marsh	e80260a3c5	Remove source path from parser errors (#9322 ) ## Summary I always found it odd that we had to pass this in, since it's really higher-level context for the error. The awkwardness is further evidenced by the fact that we pass in fake values everywhere (even outside of tests). The source path isn't actually used to display the error; it's only accessed elsewhere to _re-display_ the error in certain cases. This PR instead passes the path directly in those cases.	2023-12-30 20:33:05 +00:00
Charlie Marsh	eb9a1bc5f1	Use consistent re-export from `ruff_source_file` (#9320 ) Right now, we both re-export (via `pub use`) and mark the modules themselves as `pub`, so they can be imported through two different paths.	2023-12-30 14:48:45 -05:00
Charlie Marsh	2895e7d126	Respect mixed `return` and `raise` cases in return-type analysis (#9310 ) ## Summary Given: ```python from somewhere import get_cfg def lookup_cfg(cfg_description): cfg = get_cfg(cfg_description) if cfg is not None: return cfg raise AttributeError(f"No cfg found matching {cfg_description}") ``` We were analyzing the method from last-to-first statement. So we saw the `raise`, then assumed the method _always_ raised. In reality, though, it _might_ return. This PR improves the branch analysis to respect these mixed cases. Closes https://github.com/astral-sh/ruff/issues/9269. Closes https://github.com/astral-sh/ruff/issues/9304.	2023-12-29 16:46:37 +00:00
Charlie Marsh	a9ceef5b5d	[`ruff`] Add `never-union` rule to detect redundant `typing.NoReturn` and `typing.Never` (#9217 ) ## Summary Adds a rule to detect unions that include `typing.NoReturn` or `typing.Never`. In such cases, the use of the bottom type is redundant. Closes https://github.com/astral-sh/ruff/issues/9113. ## Test Plan `cargo test`	2023-12-21 20:53:31 +00:00
Charlie Marsh	5ccc21aea2	Add support for `NoReturn` in auto-return-typing (#9206 ) ## Summary Given a function like: ```python def func(x: int): if not x: raise ValueError else: raise TypeError ``` We now correctly use `NoReturn` as the return type, rather than `None`. Closes https://github.com/astral-sh/ruff/issues/9201.	2023-12-20 00:06:31 -05:00
Dhruv Manilawala	18452cf477	Add `as_slice` method for all string nodes (#9111 ) This PR adds an `as_slice` method to all the string nodes which returns all the parts of the nodes as a slice. This will be useful in the next PR to split the string formatting to use this method to extract the _single node_ or _implicitly concatenated nodes_.	2023-12-13 06:31:20 +00:00
Dhruv Manilawala	96ae9fe685	Introduce `StringLike` enum (#9016 ) ## Summary This PR introduces a new `StringLike` enum which is a narrow type to indicate string-like nodes. These include the string literals, bytes literals, and the literal parts of f-strings. The main motivation behind this is to avoid repetition of rule calling in the AST checker. We add a new `analyze::string_like` function which takes in the enum and calls all the respective rule functions which expect at least 2 of the variants of this enum. I'm open to discarding this if others think it's not that useful at this stage as currently only 3 rules require these nodes. As suggested [here](https://github.com/astral-sh/ruff/pull/8835#discussion_r1414746934) and [here](https://github.com/astral-sh/ruff/pull/8835#discussion_r1414750204). ## Test Plan `cargo test`	2023-12-07 16:39:13 +00:00
Dhruv Manilawala	cdac90ef68	New AST nodes for f-string elements (#8835 ) Rebase of #6365 authored by @davidszotten. ## Summary This PR updates the AST structure for an f-string elements. The main motivation behind this change is to have a dedicated node for the string part of an f-string. Previously, the existing `ExprStringLiteral` node was used for this purpose which isn't exactly correct. The `ExprStringLiteral` node should include the quotes as well in the range but the f-string literal element doesn't include the quote as it's a specific part within an f-string. For example, ```python f"foo {x}" # ^^^^ # This is the literal part of an f-string ``` The introduction of `FStringElement` enum is helpful which represent either the literal part or the expression part of an f-string. ### Rule Updates This means that there'll be two nodes representing a string depending on the context. One for a normal string literal while the other is a string literal within an f-string. The AST checker is updated to accommodate this change. The rules which work on string literal are updated to check on the literal part of f-string as well. #### Notes 1. The `Expr::is_literal_expr` method would check for `ExprStringLiteral` and return true if so. But now that we don't represent the literal part of an f-string using that node, this improves the method's behavior and confines to the actual expression. We do have the `FStringElement::is_literal` method. 2. We avoid checking if we're in a f-string context before adding to `string_type_definitions` because the f-string literal is now a dedicated node and not part of `Expr`. 3. Annotations cannot use f-string so we avoid changing any rules which work on annotation and checks for `ExprStringLiteral`. ## Test Plan - All references of `Expr::StringLiteral` were checked to see if any of the rules require updating to account for the f-string literal element node. 
- New test cases are added for rules which check against the literal part of an f-string. - Check the ecosystem results and ensure it remains unchanged. ## Performance There's a performance penalty in the parser. The reason for this remains unknown as it seems that the generated assembly code is now different for the `__reduce154` function. The reduce function body is just popping the `ParenthesizedExpr` on top of the stack and pushing it with the new location. - The size of `FStringElement` enum is the same as `Expr` which is what it replaces in `FString::format_spec` - The size of `FStringExpressionElement` is the same as `ExprFormattedValue` which is what it replaces I tried reducing the `Expr` enum from 80 bytes to 72 bytes but it hardly resulted in any performance gain. The difference can be seen here: - Original profile: https://share.firefox.dev/3Taa7ES - Profile after boxing some node fields: https://share.firefox.dev/3GsNXpD ### Backtracking I tried backtracking the changes to see if any of the isolated change produced this regression. The problem here is that the overall change is so small that there's only a single checkpoint where I can backtrack and that checkpoint results in the same regression. This checkpoint is to revert using `Expr` to the `FString::format_spec` field. After this point, the change would revert back to the original implementation. ## Review process The review process is similar to #7927. The first set of commits update the node structure, parser, and related AST files. Then, further commits update the linter and formatter part to account for the AST change. --------- Co-authored-by: David Szotten <davidszotten@gmail.com>	2023-12-07 10:28:05 -06:00
Dhruv Manilawala	ef7778d794	Fix preorder visitor tests (#9025 ) Follow-up PR to #9009 to fix the `PreorderVisitor` test cases as suggested here: https://github.com/astral-sh/ruff/pull/9009#discussion_r1416459688	2023-12-06 16:58:51 +00:00
Dhruv Manilawala	bd443ebe91	Add visitor tests for strings, bytes, f-strings (#9009 ) This PR adds tests for visitor implementation for string literals, bytes literals and f-strings.	2023-12-06 10:52:19 -06:00
Micha Reiser	7e390d3772	Move `ParenthesizedExpr` to `ruff_python_parser` (#8987 )	2023-12-04 05:36:28 +00:00
Charlie Marsh	e5db72459e	Detect implicit returns in auto-return-types (#8952 ) ## Summary Adds detection for branches without a `return` or `raise`, so that we can properly `Optional` the return types. I'd like to remove this and replace it with our code graph analysis from the `unreachable.rs` rule, but it at least fixes the worst offenders. Closes #8942.	2023-12-01 12:35:01 -05:00
Charlie Marsh	6435e4e4aa	Enable auto-return-type involving `Optional` and `Union` annotations (#8885 ) ## Summary Previously, this was only supported for Python 3.10 and later, since we always use the PEP 604-style unions.	2023-11-28 18:35:55 -08:00
Dhruv Manilawala	ec7456bac0	Rename `as_str` to `to_str` (#8886 ) This PR renames the method on `StringLiteralValue` from `as_str` to `to_str`. The main motivation is to follow the naming convention as described in the [Rust API Guidelines](https://rust-lang.github.io/api-guidelines/naming.html#ad-hoc-conversions-follow-as_-to_-into_-conventions-c-conv). This method can perform a string allocation in case the string is implicitly concatenated.	2023-11-28 18:50:42 -06:00
Dhruv Manilawala	b28556d739	Update `E402` to work at cell level for notebooks (#8872 ) ## Summary This PR updates the `E402` rule to work at cell level for Jupyter notebooks. This is enabled only in preview to gather feedback. The implementation basically resets the import boundary flag on the semantic model when we encounter the first statement in a cell. Another potential solution is to introduce `E403` rule that is specifically for notebooks that works at cell level while `E402` will be disabled for notebooks. ## Test Plan Add a notebook with imports in multiple cells and verify that the rule works as expected. resolves: #8669	2023-11-29 00:32:35 +00:00
Dhruv Manilawala	501cca8b72	Remove `#[allow(unused_variables)]` from visitor methods (#8828 ) Small follow-up to remove `#[allow(unused_variables)]` from visitor methods and use underscore prefix for unused variables instead.	2023-11-25 00:09:46 +00:00
Dhruv Manilawala	626b0577cd	Explicit `as_str` (no deref), add no allocation methods (#8826 ) ## Summary This PR is a follow-up to the AST refactor which does the following: - Remove `Deref` implementation on `StringLiteralValue` and use explicit `as_str` calls instead. The `Deref` implementation would implicitly perform allocations in case of implicitly concatenated strings. This is to make sure the allocation is explicit. - Now, certain methods can be implemented to do zero allocations which have been implemented in this PR. They are: - `is_empty` - `len` - `chars` - Custom `PartialEq` implementation to compare each character ## Test Plan Run the linter test suite and make sure all tests pass.	2023-11-25 00:03:59 +00:00
Dhruv Manilawala	017e829115	Update string nodes for implicit concatenation (#7927 ) ## Summary This PR updates the string nodes (`ExprStringLiteral`, `ExprBytesLiteral`, and `ExprFString`) to account for implicit string concatenation. ### Motivation In Python, implicit string concatenation are joined while parsing because the interpreter doesn't require the information for each part. While that's feasible for an interpreter, it falls short for a static analysis tool where having such information is more useful. Currently, various parts of the code uses the lexer to get the individual string parts. One of the main challenge this solves is that of string formatting. Currently, the formatter relies on the lexer to get the individual string parts, and formats them including the comments accordingly. But, with PEP 701, f-string can also contain comments. Without this change, it becomes very difficult to add support for f-string formatting. ### Implementation The initial proposal was made in this discussion: https://github.com/astral-sh/ruff/discussions/6183#discussioncomment-6591993. There were various AST designs which were explored for this task which are available in the linked internal document[^1]. The selected variant was the one where the nodes were kept as it is except that the `implicit_concatenated` field was removed and instead a new struct was added to the `Expr*` struct. This would be a private struct would contain the actual implementation of how the AST is designed for both single and implicitly concatenated strings. This implementation is achieved through an enum with two variants: `Single` and `Concatenated` to avoid allocating a vector even for single strings. There are various public methods available on the value struct to query certain information regarding the node. 
The nodes are structured in the following way: ``` ExprStringLiteral - "foo" "bar" \|- StringLiteral - "foo" \|- StringLiteral - "bar" ExprBytesLiteral - b"foo" b"bar" \|- BytesLiteral - b"foo" \|- BytesLiteral - b"bar" ExprFString - "foo" f"bar {x}" \|- FStringPart::Literal - "foo" \|- FStringPart::FString - f"bar {x}" \|- StringLiteral - "bar " \|- FormattedValue - "x" ``` [^1]: Internal document: https://www.notion.so/astral-sh/Implicit-String-Concatenation-e036345dc48943f89e416c087bf6f6d9?pvs=4 #### Visitor The way the nodes are structured is that the entire string, including all the parts that are implicitly concatenation, is a single node containing individual nodes for the parts. The previous section has a representation of that tree for all the string nodes. This means that new visitor methods are added to visit the individual parts of string, bytes, and f-strings for `Visitor`, `PreorderVisitor`, and `Transformer`. ## Test Plan - `cargo insta test --workspace --all-features --unreferenced reject` - Verify that the ecosystem results are unchanged	2023-11-24 17:55:41 -06:00
Samuel Cormier-Iijima	852a8f4a4f	[PIE796] don't report when using ellipses for enum values in stub files (#8825 ) ## Summary Just ignores ellipses as enum values inside stub files. Fixes #8818.	2023-11-24 15:24:57 +00:00
konsti	14e65afdc6	Update to Rust 1.74 and use new clippy lints table (#8722 ) Update to [Rust 1.74](https://blog.rust-lang.org/2023/11/16/Rust-1.74.0.html) and use the new clippy lints table. The update itself introduced a new clippy lint about superfluous hashes in raw strings, which got removed. I moved our lint config from `rustflags` to the newly stabilized [workspace.lints](https://doc.rust-lang.org/stable/cargo/reference/workspaces.html#the-lints-table). One consequence is that we have to set `unsafe_code = "warn"` instead of "forbid" because the latter now actually bans unsafe code: ``` error[E0453]: allow(unsafe_code) incompatible with previous forbid --> crates/ruff_source_file/src/newlines.rs:62:17 \| 62 \| #[allow(unsafe_code)] \| ^^^^^^^^^^^ overruled by previous forbid \| = note: `forbid` lint level was set on command line ``` --------- Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>	2023-11-16 18:12:46 -05:00
Charlie Marsh	bf2cc3f520	Add autotyping-like return type inference for annotation rules (#8643 ) ## Summary This PR adds (unsafe) fixes to the flake8-annotations rules that enforce missing return types, offering to automatically insert type annotations for functions with literal return values. The logic is smart enough to generate simplified unions (e.g., `float` instead of `int \| float`) and deal with implicit returns (`return` without a value). Closes https://github.com/astral-sh/ruff/issues/1640 (though we could open a separate issue for referring parameter types). Closes https://github.com/astral-sh/ruff/issues/8213. ## Test Plan `cargo test`	2023-11-13 23:34:15 -05:00
Charlie Marsh	df9ade7fd9	Use AST transformer for `relocate` (#8660 )	2023-11-13 13:24:27 -05:00
Charlie Marsh	345e1401cf	Treat `class C: ...` and `class C(): ...` equivalently (#8659 ) ## Summary These should be seen as identical from the `ComparableAst` perspective.	2023-11-13 18:03:04 +00:00
Charlie Marsh	d574fcd1ac	Compare formatted and unformatted ASTs during formatter tests (#8624 ) ## Summary This PR implements validation in the formatter tests to ensure that we don't modify the AST during formatting. Black has similar logic. In implementing this, I learned that Black actually _does_ modify the AST, and their test infrastructure normalizes the AST to wipe away those differences. Specifically, Black changes the indentation of docstrings, which _does_ modify the AST; and it also inserts parentheses in `del` statements, which changes the AST too. Ruff also does both these things, so we _also_ implement the same normalization using a new visitor that allows for modifying the AST. Closes https://github.com/astral-sh/ruff/issues/8184. ## Test Plan `cargo test`	2023-11-13 17:43:27 +00:00
Jesse Serrao	39728a1198	Add check for is comparison with mutable initialisers to rule F632 (#8607 ) ## Summary Adds an extra check to F632 for `is` comparisons against mutable initialisers. Implements #8589 . Example: ```Python named_var = {} if named_var is {}: # F632 (fix) pass ``` The `if` condition always evaluates to `False` because `is` checks identity, and a freshly created list/set/dict literal can never share its identity with an existing object. ## Test Plan Multiple test cases were added to ensure the rule works + doesn't flag false positives + the fix works correctly.	2023-11-11 00:29:23 +00:00
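The identity argument in the F632 commit above can be checked directly in CPython. This is a minimal sketch, not Ruff's implementation; the function name `compare_to_fresh_literals` is made up for illustration.

```python
def compare_to_fresh_literals():
    """Each literal on the right-hand side builds a brand-new object."""
    named_var = {}
    same_dict = named_var is {}    # identity check against a fresh dict
    same_list = [] is []           # two distinct, freshly built lists
    equal_dict = named_var == {}   # equality is what was most likely intended
    return same_dict, same_list, equal_dict


print(compare_to_fresh_literals())
```

The two `is` comparisons can never succeed, which is why replacing `is` with `==` is the natural fix for this pattern.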
Kar Petrosyan	e2c7b1ece6	[TRIO] Add TRIO109 rule (#8534 ) ## Summary Adds TRIO109 from the [flake8-trio plugin](https://github.com/Zac-HD/flake8-trio). Relates to: https://github.com/astral-sh/ruff/issues/8451	2023-11-07 17:13:01 -05:00
qdegraaf	4170ef0508	[`TRIO`] Add `TRIO105`: `SyncTrioCall` (#8490 ) ## Summary Adds `TRIO105` from the [flake8-trio plugin](https://github.com/Zac-HD/flake8-trio). The `MethodName` logic mirrors that of `TRIO100` to stay consistent within the plugin. It is at 95% parity with upstream; the exception is that upstream also checks for a slightly more complex scenario where a call to `start()` on a `trio.Nursery` context should also be immediately awaited. Upstream plugin appears to just check for anything named `nursery` judging from [the relevant issue](https://github.com/Zac-HD/flake8-trio/issues/56). Unsure if we want to do something similar or, alternatively, if there is some capability in ruff to check for calls made on this context some other way. ## Test Plan Added a new fixture, based on [the one from upstream plugin](https://github.com/Zac-HD/flake8-trio/blob/main/tests/eval_files/trio105.py) ## Issue link Refers: https://github.com/astral-sh/ruff/issues/8451	2023-11-05 19:56:10 +00:00
Micha Reiser	f16505d885	Formatter: Remove unnecessary `group` (#8455 )	2023-11-03 04:14:29 +00:00
Dhruv Manilawala	d350ede992	Remove unicode flag from comparable (#8440 ) ## Summary This PR removes the `unicode` flag from the string literal in `ComparableExpr`. This flag isn't required as all strings are unicode in Python 3 so `"foo" == u"foo"`.	2023-11-02 13:21:45 +05:30
Dhruv Manilawala	97ae617fac	Introduce `LiteralExpressionRef` for all literals (#8339 ) ## Summary This PR adds a new `LiteralExpressionRef` which wraps all of the literal expression nodes in a single enum. This allows for a narrow type when working exclusively with a literal node. Additionally, it implements an `Expr::as_literal_expr` method to return the new enum if the expression is indeed a literal one. A few rules have been updated to account for the new enum: 1. `redundant_literal_union` 2. `if_else_block_instead_of_dict_lookup` 3. `magic_value_comparison` To account for the change in (2), a new `ComparableLiteral` has been added which can be constructed from the new enum (`ComparableLiteral::from(<LiteralExpressionRef>)`). ### Open Questions 1. The new `ComparableLiteral` can be used exclusively via the `LiteralExpressionRef` enum. Should we remove all of the literal variants from `ComparableExpr` and have a single `ComparableExpr::Literal(ComparableLiteral)` variant instead? ## Test Plan `cargo test`	2023-10-31 12:56:11 +00:00
Dhruv Manilawala	8977b6ae11	Inline AST helpers for new literal nodes (#8374 ) A small refactor to inline the `is_const_none` now that there's a dedicated `ExprNoneLiteral` node.	2023-10-31 11:06:54 +00:00
Charlie Marsh	161c093c06	Avoid including literal `shell=True` for truthy, non-`True` diagnostics (#8359 ) ## Summary If the value of `shell` wasn't literally `True`, we now show a message describing it as truthy, rather than the (misleading) `shell=True` literal in the diagnostic. Closes https://github.com/astral-sh/ruff/issues/8310.	2023-10-30 15:44:38 +00:00
Dhruv Manilawala	b0dc5a86a1	Impl `Default` for `(String\|Bytes\|Boolean\|None\|Ellipsis)Literal` (#8341 ) ## Summary This PR adds `Default` for the following literal nodes: * `StringLiteral` * `BytesLiteral` * `BooleanLiteral` * `NoneLiteral` * `EllipsisLiteral` The implementation creates the zero value of the respective literal nodes in terms of the Python language. ## Test Plan `cargo test`	2023-10-30 08:47:44 +00:00
Dhruv Manilawala	230c9ce236	Split `Constant` to individual literal nodes (#8064 ) ## Summary This PR splits the `Constant` enum into individual literal nodes. It introduces the following new nodes for each variant: * `ExprStringLiteral` * `ExprBytesLiteral` * `ExprNumberLiteral` * `ExprBooleanLiteral` * `ExprNoneLiteral` * `ExprEllipsisLiteral` The main motivation behind this refactor is to introduce the new AST node for implicit string concatenation in the coming PR. The elements of that node will be either a string literal, bytes literal or an f-string which can be implemented using an enum. This means that a string or bytes literal cannot be represented by `Constant::Str` / `Constant::Bytes` which creates an inconsistency. This PR avoids that inconsistency by splitting the constant nodes into their own literal nodes, literal being the more appropriate naming convention from a static analysis tool perspective. This also makes working with literals in the linter and formatter much more ergonomic: for example, checking whether an expression is a string literal can be done easily using `Expr::is_string_literal_expr` or by matching against `Expr::StringLiteral`, as opposed to matching against the `ExprConstant` and enum `Constant`. A few AST helper methods can be simplified as well which will be done in a follow-up PR. This introduces a new `Expr::is_literal_expr` method which is the same as `Expr::is_constant_expr`. There are also a few intermediary changes related to implicit string concatenation. They are kept small to avoid making this already large PR even bigger. ## Test Plan 1. Verify and update all of the existing snapshots (parser, visitor) 2. 
Verify that the ecosystem check output remains unchanged for both the linter and formatter ### Formatter ecosystem check #### `main` \| project \| similarity index \| total files \| changed files \| \|----------------\|------------------:\|------------------:\|------------------:\| \| cpython \| 0.75803 \| 1799 \| 1647 \| \| django \| 0.99983 \| 2772 \| 34 \| \| home-assistant \| 0.99953 \| 10596 \| 186 \| \| poetry \| 0.99891 \| 317 \| 17 \| \| transformers \| 0.99966 \| 2657 \| 330 \| \| twine \| 1.00000 \| 33 \| 0 \| \| typeshed \| 0.99978 \| 3669 \| 20 \| \| warehouse \| 0.99977 \| 654 \| 13 \| \| zulip \| 0.99970 \| 1459 \| 22 \| #### `dhruv/constant-to-literal` \| project \| similarity index \| total files \| changed files \| \|----------------\|------------------:\|------------------:\|------------------:\| \| cpython \| 0.75803 \| 1799 \| 1647 \| \| django \| 0.99983 \| 2772 \| 34 \| \| home-assistant \| 0.99953 \| 10596 \| 186 \| \| poetry \| 0.99891 \| 317 \| 17 \| \| transformers \| 0.99966 \| 2657 \| 330 \| \| twine \| 1.00000 \| 33 \| 0 \| \| typeshed \| 0.99978 \| 3669 \| 20 \| \| warehouse \| 0.99977 \| 654 \| 13 \| \| zulip \| 0.99970 \| 1459 \| 22 \|	2023-10-30 12:13:23 +05:30
Dhruv Manilawala	78bbf6d403	New `Singleton` enum for `PatternMatchSingleton` node (#8063 ) ## Summary This PR adds a new `Singleton` enum for the `PatternMatchSingleton` node. Earlier the node was using the `Constant` enum but the value for this pattern can only be either `None`, `True` or `False`. With the coming PR to remove the `Constant`, this node required a new type to fill in. This also has the benefit of narrowing the type down to only the possible values for the node as evident by the removal of `unreachable`. ## Test Plan Update the AST snapshots and run `cargo test`.	2023-10-30 05:48:53 +00:00
Dhruv Manilawala	ec1be60dcb	Remove leftover constant tuple reference (#8062 ) This PR removes the leftover reference to the tuple variant in `Constant`.	2023-10-19 17:50:45 +00:00
konsti	8f9753f58e	Comments outside expression parentheses (#7873 ) ## Summary Fixes https://github.com/astral-sh/ruff/issues/7448 Fixes https://github.com/astral-sh/ruff/issues/7892 I've removed automatic dangling comment formatting, we're doing manual dangling comment formatting everywhere anyway (the assert-all-comments-formatted ensures this) and dangling comments would break the formatting there. ## Test Plan New test file. --------- Co-authored-by: Micha Reiser <micha@reiser.io>	2023-10-19 09:24:11 +00:00
konsti	0c3123e07e	Insert newline after nested function or class statements (#7946 ) Summary Insert a newline after nested function and class definitions, unless there is a trailing own line comment. We need to e.g. format ```python if platform.system() == "Linux": if sys.version > (3, 10): def f(): print("old") else: def f(): print("new") f() ``` as ```python if platform.system() == "Linux": if sys.version > (3, 10): def f(): print("old") else: def f(): print("new") f() ``` even though `f()` is directly preceded by an if statement, not a function or class definition. See the comments and fixtures for trailing own line comment handling. Test Plan I checked that the new content of `newlines.py` matches black's formatting. --------- Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>	2023-10-18 09:45:58 +00:00
Charlie Marsh	d685107638	Move {AnyNodeRef, AstNode} to ruff_python_ast crate root (#8030 ) This is a do-over of https://github.com/astral-sh/ruff/pull/8011, which I accidentally merged into a non-`main` branch. Sorry!	2023-10-18 00:01:18 +00:00
Tom Kuson	62f1ee08e7	[`refurb`] Implement `single-item-membership-test` (`FURB171`) (#7815 ) ## Summary Implement [`no-single-item-in`](https://github.com/dosisod/refurb/blob/master/refurb/checks/iterable/no_single_item_in.py) as `single-item-membership-test` (`FURB171`). Uses the helper function `generate_comparison` from the `pycodestyle` implementations; this function should probably be moved, but I am not sure where at the moment. Update: moved it to `ruff_python_ast::helpers`. Related to #1348. ## Test Plan `cargo test`	2023-10-08 14:08:47 +00:00
Charlie Marsh	f71c80af68	Show changed files when running under `--check` (#7788 ) ## Summary We now list each changed file when running with `--check`. Closes https://github.com/astral-sh/ruff/issues/7782. ## Test Plan ``` ❯ cargo run -p ruff_cli -- format foo.py --check Compiling ruff_cli v0.0.292 (/Users/crmarsh/workspace/ruff/crates/ruff_cli) Finished dev [unoptimized + debuginfo] target(s) in 1.41s Running `target/debug/ruff format foo.py --check` warning: `ruff format` is a work-in-progress, subject to change at any time, and intended only for experimentation. Would reformat: foo.py 1 file would be reformatted ```	2023-10-03 18:50:06 +00:00
Tom Kuson	e129f77bcf	Extend `reimplemented-starmap` (`FURB140`) to catch calls with a single and starred argument (#7768 )	2023-10-02 21:38:05 -04:00
Dhruv Manilawala	e62e245c61	Add support for PEP 701 (#7376 ) ## Summary This PR adds support for PEP 701 in Ruff. This is a rollup PR of all the other individual PRs. The separate PRs were created for logic separation and code reviews. Refer to each pull request for a detailed description of the change. Refer to the PR description for the list of pull requests within this PR. ## Test Plan ### Formatter ecosystem checks Explanation for the change in ecosystem check: https://github.com/astral-sh/ruff/pull/7597#issue-1908878183 #### `main` ``` \| project \| similarity index \| total files \| changed files \| \|--------------\|------------------:\|------------------:\|------------------:\| \| cpython \| 0.76083 \| 1789 \| 1631 \| \| django \| 0.99983 \| 2760 \| 36 \| \| transformers \| 0.99963 \| 2587 \| 319 \| \| twine \| 1.00000 \| 33 \| 0 \| \| typeshed \| 0.99983 \| 3496 \| 18 \| \| warehouse \| 0.99967 \| 648 \| 15 \| \| zulip \| 0.99972 \| 1437 \| 21 \| ``` #### `dhruv/pep-701` ``` \| project \| similarity index \| total files \| changed files \| \|--------------\|------------------:\|------------------:\|------------------:\| \| cpython \| 0.76051 \| 1789 \| 1632 \| \| django \| 0.99983 \| 2760 \| 36 \| \| transformers \| 0.99963 \| 2587 \| 319 \| \| twine \| 1.00000 \| 33 \| 0 \| \| typeshed \| 0.99983 \| 3496 \| 18 \| \| warehouse \| 0.99967 \| 648 \| 15 \| \| zulip \| 0.99972 \| 1437 \| 21 \| ```	2023-09-29 02:55:39 +00:00
Charlie Marsh	f45281345d	Include radix base prefix in large number representation (#7700 ) ## Summary When lexing a number like `0x995DC9BBDF1939FA` that exceeds our small number representation, we were only storing the portion after the base (in this case, `995DC9BBDF1939FA`). When using that representation in code generation, this could lead to invalid syntax, since `995DC9BBDF1939FA` on its own is not a valid integer. This PR modifies the code to store the full span, including the radix prefix. See: https://github.com/astral-sh/ruff/issues/7455#issuecomment-1739802958. ## Test Plan `cargo test`	2023-09-28 20:38:06 +00:00
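To illustrate why dropping the radix prefix matters, the following sketch uses CPython's own `ast` module (not Ruff's lexer) to show that the literal from the commit message exceeds the `i64` range and that its digits alone are not a valid Python expression. The helper `parses_as_expression` is a hypothetical name used only for this example.

```python
import ast

FULL = "0x995DC9BBDF1939FA"   # literal taken from the commit message
DIGITS_ONLY = FULL[2:]        # the portion that was stored before the fix


def parses_as_expression(source: str) -> bool:
    """Return True if `source` is a syntactically valid Python expression."""
    try:
        ast.parse(source, mode="eval")
    except SyntaxError:
        return False
    return True


value = ast.literal_eval(FULL)
print("exceeds i64:", value > 2**63 - 1)
print("full literal parses:", parses_as_expression(FULL))
print("digits alone parse:", parses_as_expression(DIGITS_ONLY))
```

Emitting only the digits during code generation would therefore produce source that no longer parses, which matches the failure described above.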
Charlie Marsh	0a8cad2550	Allow named expressions in `__all__` assignments (#7673 ) ## Summary This PR adds support for named expressions when analyzing `__all__` assignments, as per https://github.com/astral-sh/ruff/issues/7672. It also loosens the enforcement around assignments like: `__all__ = list(some_other_expression)`. We shouldn't flag these as invalid, even though we can't analyze the members, since we _know_ they evaluate to a `list`. Closes https://github.com/astral-sh/ruff/issues/7672. ## Test Plan `cargo test`	2023-09-27 00:36:55 -04:00
Charlie Marsh	93b5d8a0fb	Implement our own small-integer optimization (#7584 ) ## Summary This is a follow-up to #7469 that attempts to achieve similar gains, but without introducing malachite. Instead, this PR removes the `BigInt` type altogether, instead opting for a simple enum that allows us to store small integers directly and only allocate for values greater than `i64`: ```rust /// A Python integer literal. Represents both small (fits in an `i64`) and large integers. #[derive(Clone, PartialEq, Eq, Hash)] pub struct Int(Number); #[derive(Debug, Clone, PartialEq, Eq, Hash)] pub enum Number { /// A "small" number that can be represented as an `i64`. Small(i64), /// A "large" number that cannot be represented as an `i64`. Big(Box<str>), } impl std::fmt::Display for Number { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { match self { Number::Small(value) => write!(f, "{value}"), Number::Big(value) => write!(f, "{value}"), } } } ``` We typically don't care about numbers greater than `isize` -- our only uses are comparisons against small constants (like `1`, `2`, `3`, etc.), so there's no real loss of information, except in one or two rules where we're now a little more conservative (with the worst-case being that we don't flag, e.g., an `itertools.pairwise` that uses an extremely large value for the slice start constant). For simplicity, a few diagnostics now show a dedicated message when they see integers that are out of the supported range (e.g., `outdated-version-block`). An additional benefit here is that we get to remove a few dependencies, especially `num-bigint`. ## Test Plan `cargo test`	2023-09-25 15:13:21 +00:00
Charlie Marsh	4d6f5ff0a7	Remove `Int` wrapper type from parser (#7577 ) ## Summary This is only used for the `level` field in relative imports (e.g., `from ..foo import bar`). It seems unnecessary to use a wrapper here, so this PR changes to a `u32` directly.	2023-09-21 17:01:44 +00:00
Charlie Marsh	5df0326bc8	Treat parameters-with-newline as empty in function formatting (#7550 ) ## Summary If a function has no parameters (and no comments within the parameters' `()`), we're supposed to wrap the return annotation _whenever_ it breaks. However, our `empty_parameters` test didn't properly account for the case in which the parameters include a newline (but no other content), like: ```python def get_dashboards_hierarchy( ) -> Dict[Type['BaseDashboard'], List[Type['BaseDashboard']]]: """Get hierarchy of dashboards classes. Returns: Dict of dashboards classes. """ dashboards_hierarchy = {} ``` This PR fixes that detection. Instead of lexing, it now checks whether the parameters node itself is empty (or whether it contains comments). Closes https://github.com/astral-sh/ruff/issues/7457.	2023-09-20 16:20:22 -04:00
konsti	2cbe1733c8	Use CommentRanges in backwards lexing (#7360 ) ## Summary The tokenizer was split into a forward and a backwards tokenizer. The backwards tokenizer uses the same names as the forwards ones (e.g. `next_token`). The backwards tokenizer gets the comment ranges that we already built to skip comments. --------- Co-authored-by: Micha Reiser <micha@reiser.io>	2023-09-16 03:21:45 +00:00
Charlie Marsh	ec2f229a45	Remove `ExprContext` from `ComparableExpr` (#7362 ) `ComparableExpr` includes the `ExprContext` field on an expression, so, e.g., the two tuples in `(a, b) = (a, b)` won't be considered equal. Similarly, the tuples in `[(a, b) for (a, b) in c]` _also_ wouldn't be considered equal. I find this behavior surprising, since `ComparableExpr` is intended to allow you to compare two ASTs, but `ExprContext` is really encoding information about the broader context for the expression.	2023-09-14 15:40:02 +00:00
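The same effect can be observed with CPython's `ast` module, which also stores a `ctx` on expressions. The sketch below is an analogy, not Ruff's `ComparableExpr`; `dump_ignoring_context` is a hypothetical helper that normalizes every context to `Load` before comparing.

```python
import ast
import copy


def dump_ignoring_context(node: ast.AST) -> str:
    """Dump `node` after resetting every expression context to `Load`."""
    clone = copy.deepcopy(node)
    for child in ast.walk(clone):
        if hasattr(child, "ctx"):
            child.ctx = ast.Load()
    return ast.dump(clone)


assign = ast.parse("(a, b) = (a, b)").body[0]
target, value = assign.targets[0], assign.value

# The target tuple carries `Store` contexts, the value tuple carries `Load`.
print(ast.dump(target) == ast.dump(value))
print(dump_ignoring_context(target) == dump_ignoring_context(value))
```

With the context left in place the two tuples compare as different, even though they are written identically in the source.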
konsti	f4c7bff36b	Don't reorder parameters in function calls (#7268 ) ## Summary In `f(args, a=b, args2, *kwargs)` the args (`args`, `args2`) and keywords (`a=b`, `kwargs`) are interleaved, which we previously didn't handle. Fixes #6498 main* \| project \| similarity index \| total files \| changed files \| \|--------------\|------------------:\|------------------:\|------------------:\| \| cpython \| 0.76083 \| 1789 \| 1632 \| \| django \| 0.99966 \| 2760 \| 58 \| \| transformers \| 0.99930 \| 2587 \| 447 \| \| twine \| 1.00000 \| 33 \| 0 \| \| typeshed \| 0.99983 \| 3496 \| 18 \| \| warehouse \| 0.99825 \| 648 \| 22 \| \| zulip \| 0.99950 \| 1437 \| 27 \| PR \| project \| similarity index \| total files \| changed files \| \|--------------\|------------------:\|------------------:\|------------------:\| \| cpython \| 0.76083 \| 1789 \| 1632 \| \| django \| 0.99967 \| 2760 \| 53 \| \| transformers \| 0.99930 \| 2587 \| 447 \| \| twine \| 1.00000 \| 33 \| 0 \| \| typeshed \| 0.99983 \| 3496 \| 18 \| \| warehouse \| 0.99825 \| 648 \| 22 \| \| zulip \| 0.99950 \| 1437 \| 27 \| ## Test Plan New fixtures	2023-09-13 09:01:49 +00:00
konsti	56440ad835	Introduce `ArgOrKeyword` to keep call parameter order (#7302 ) ## Motivation The `ast::Arguments` for call arguments are split into positional arguments (args) and keyword arguments (keywords). We currently assume that a call consists of args first and then keywords, which is generally the case, but not always: ```python f(args, a=2, args2, *kwargs) class A(args, a=2, args2, *kwargs): pass ``` The consequence is accidentally reordering arguments (https://github.com/astral-sh/ruff/pull/7268). ## Summary `Arguments::args_and_keywords` returns an iterator of an `ArgOrKeyword` enum that yields args and keywords in the correct order. I've fixed the obvious `args` and `keywords` usages, but there might be some cases with wrong assumptions remaining. ## Test Plan The generator got new test cases; the rest is covered by the stacked PR (https://github.com/astral-sh/ruff/pull/7268), which uncovered this issue.	2023-09-13 08:45:46 +00:00
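CPython's `ast.Call` has the same split between `args` and `keywords`, so the idea behind `Arguments::args_and_keywords` can be sketched in Python. This is an illustration under assumptions, not Ruff's code: the function below reuses the name `args_and_keywords` for readability, restores source order by sorting on node positions, and needs Python 3.9+ for `ast.unparse` and keyword positions. The call `f(x, a=2, *rest)` is a made-up example in which a starred argument follows a keyword.

```python
import ast


def args_and_keywords(call: ast.Call) -> list:
    """Merge positional and keyword arguments back into source order."""
    merged = [*call.args, *call.keywords]
    return sorted(merged, key=lambda node: (node.lineno, node.col_offset))


call = ast.parse("f(x, a=2, *rest)", mode="eval").body

# Reading `args` and then `keywords` loses the interleaving.
naive = [ast.unparse(node) for node in [*call.args, *call.keywords]]
ordered = [ast.unparse(node) for node in args_and_keywords(call)]
print(naive)
print(ordered)
```

Any code generator that emits `args` followed by `keywords` would reorder this call, which is the failure mode the commit addresses.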
Dhruv Manilawala	04f2842e4f	Move `ExprConstant::kind` to `StringConstant::unicode` (#7180 )	2023-09-06 07:39:25 +00:00
Dhruv Manilawala	fa6bff0078	Add inline documentation for `Ipy*` AST nodes (#7178 )	2023-09-06 12:07:34 +05:30
Charlie Marsh	b0d171ac19	Supported starred exceptions in length-one tuple detection (#7080 )	2023-09-03 13:31:13 +00:00
Charlie Marsh	68f605e80a	Fix `WithItem` ranges for parenthesized, non-`as` items (#6782 ) ## Summary This PR attempts to address a problem in the parser related to the ranges of `WithItem` nodes in certain contexts -- specifically, `WithItem` nodes in parentheses that do not have an `as` token after them. For example, [here](https://play.ruff.rs/71be2d0b-2a04-4c7e-9082-e72bff152679): ```python with (a, b): pass ``` The range of the `WithItem` `a` is set to the range of `(a, b)`, as is the range of the `WithItem` `b`. In other words, when we have this kind of sequence, we use the range of the entire parenthesized context, rather than the ranges of the items themselves. Note that this also applies to cases [like](https://play.ruff.rs/c551e8e9-c3db-4b74-8cc6-7c4e3bf3713a): ```python with (a, b, c as d): pass ``` You can see the issue in the parser here: ```rust #[inline] WithItemsNoAs: Vec<ast::WithItem> = { <location:@L> <all:OneOrMore<Test<"all">>> <end_location:@R> => { all.into_iter().map(\|context_expr\| ast::WithItem { context_expr, optional_vars: None, range: (location..end_location).into() }).collect() }, } ``` Fixing this issue is... very tricky. The naive approach is to use the range of the `context_expr` as the range for the `WithItem`, but that range will be incorrect when the `context_expr` is itself parenthesized. For example, _that_ solution would fail here, since the range of the first `WithItem` would be that of `a`, rather than `(a)`: ```python with ((a), b): pass ``` The `with` parsing in general is highly precarious due to ambiguities in the grammar. Changing it in _any_ way seems to lead to an ambiguous grammar that LALRPOP fails to translate. Consensus seems to be that we don't really understand _why_ the current grammar works (i.e., _how_ it avoids these ambiguities as-is). The solution implemented here is to avoid changing the grammar itself, and instead change the shape of the nodes returned by various rules in the grammar. 
Specifically, everywhere that we return `Expr`, we instead return `ParenthesizedExpr`, which includes a parenthesized range and the underlying `Expr` itself. (If an `Expr` isn't parenthesized, the ranges will be equivalent.) In `WithItemsNoAs`, we can then use the parenthesized range as the range for the `WithItem`.	2023-08-31 16:21:29 +01:00
Valeriy Savchenko	26d53c56a2	[refurb] Implement `repeated-append` rule (`FURB113`) (#6702 ) ## Summary As an initial effort with replicating `refurb` rules (#1348 ), this PR adds support for [FURB113](https://github.com/dosisod/refurb/blob/master/refurb/checks/builtin/list_extend.py) and adds a new category of checks. ## Test Plan I included a new test + checked that all other tests pass.	2023-08-28 22:51:59 +00:00
Charlie Marsh	58f5f27dc3	Add TOML files to `SourceType` (#6929 ) ## Summary This PR adds a higher-level enum (`SourceType`) around `PySourceType` to allow us to use the same detection path to handle TOML files. Right now, we have ad hoc `is_pyproject_toml` checks littered around, and some codepaths are omitting that logic altogether (like `add_noqa`). Instead, we should always be required to check the source type and handle TOML files as appropriate. This PR will also help with our pre-commit capabilities. If we add `toml` to pre-commit (to support `pyproject.toml`), pre-commit will start to pass _other_ files to Ruff (along with `poetry.lock` and `Pipfile` -- see [identify](`b59996304f/identify/extensions.py (L355)`)). By detecting those files and handling those cases, we avoid attempting to parse them as Python files, which would lead to pre-commit errors. (We tried to add `toml` to pre-commit here (https://github.com/astral-sh/ruff-pre-commit/pull/44), but had to revert here (https://github.com/astral-sh/ruff-pre-commit/pull/45) as it led to the pre-commit hook attempting to parse `poetry.lock` files as Python files.)	2023-08-28 15:01:48 +00:00
Charlie Marsh	fc89976c24	Move `Ranged` into `ruff_text_size` (#6919 ) ## Summary The motivation here is that this enables us to implement `Ranged` in crates that don't depend on `ruff_python_ast`. Largely a mechanical refactor with a lot of regex, Clippy help, and manual fixups. ## Test Plan `cargo test`	2023-08-27 14:12:51 -04:00
Micha Reiser	7c480236e0	Use dyn dispatch for `any_over_*` (#6912 )	2023-08-27 15:54:01 +02:00
Charlie Marsh	15b73bdb8a	Introduce AST nodes for `PatternMatchClass` arguments (#6881 ) ## Summary This PR introduces two new AST nodes to improve the representation of `PatternMatchClass`. As a reminder, `PatternMatchClass` looks like this: ```python case Point2D(0, 0, x=1, y=2): ... ``` Historically, this was represented as a vector of patterns (for the `0, 0` portion) and parallel vectors of keyword names (for `x` and `y`) and values (for `1` and `2`). This introduces a bunch of challenges for the formatter, but importantly, it's also really different from how we represent similar nodes, like arguments (`func(0, 0, x=1, y=2)`) or parameters (`def func(x, y)`). So, firstly, we now use a single node (`PatternArguments`) for the entire parenthesized region, making it much more consistent with our other nodes. So, above, `PatternArguments` would be `(0, 0, x=1, y=2)`. Secondly, we now have a `PatternKeyword` node for `x=1` and `y=2`. This is much more similar to how `Keyword` is represented within `Arguments` for call expressions. Closes https://github.com/astral-sh/ruff/issues/6866. Closes https://github.com/astral-sh/ruff/issues/6880.	2023-08-26 14:45:44 +00:00
Dhruv Manilawala	d1f07008f7	Rename Notebook related symbols (#6862 ) This PR renames the following symbols: * `PySourceType::Jupyter` -> `PySourceType::Ipynb` * `SourceKind::Jupyter` -> `SourceKind::IpyNotebook` * `JupyterIndex` -> `NotebookIndex`	2023-08-25 11:40:54 +05:30
Charlie Marsh	847432cacf	Avoid attempting to fix PT018 in multi-statement lines (#6829 ) ## Summary These fixes will _always_ fail, so we should avoid trying to construct them in the first place. Closes https://github.com/astral-sh/ruff/issues/6812.	2023-08-23 19:09:34 -04:00
Charlie Marsh	26e63ab137	Remove lexing from flake8-pytest-style (#6795 ) ## Summary Another drive-by change to remove unnecessary custom lexing. We just need to know the parenthesized range, so we can use... `parenthesized_range`. I've also updated `parenthesized_range` to support nested parentheses. ## Test Plan `cargo test`	2023-08-23 15:54:11 +00:00
Charlie Marsh	6a5acde226	Make `Parameters` an optional field on `ExprLambda` (#6669 ) ## Summary If a lambda doesn't contain any parameters, or any parameter _tokens_ (like `*`), we can use `None` for the parameters. This feels like a better representation to me, since, e.g., what should the `TextRange` be for a non-existent set of parameters? It also allows us to remove several sites where we check if the `Parameters` is empty by seeing if it contains any arguments, so semantically, we're already trying to detect and model around this elsewhere. Changing this also fixes a number of issues with dangling comments in parameter-less lambdas, since those comments are now automatically marked as dangling on the lambda. (As-is, we were also doing something not-great whereby the lambda was responsible for formatting dangling comments on the parameters, which has been removed.) Closes https://github.com/astral-sh/ruff/issues/6646. Closes https://github.com/astral-sh/ruff/issues/6647. ## Test Plan `cargo test`	2023-08-18 15:34:54 +00:00
Charlie Marsh	1050142a58	Expand expressions to include parentheses in E712 (#6575 ) ## Summary This PR exposes our `is_expression_parenthesized` logic such that we can use it to expand expressions when autofixing to include their parenthesized ranges. This solution has a few drawbacks: (1) we need to compute parenthesized ranges in more places, which also relies on backwards lexing; and (2) we need to make use of this in any relevant fixes. However, I still think it's worth pursuing. On (1), the implementation is very contained, so IMO we can easily swap this out for a more performant solution in the future if needed. On (2), this improves correctness and fixes some bad syntax errors detected by fuzzing, which means it has value even if it's not as robust as an _actual_ `ParenthesizedExpression` node in the AST itself. Closes https://github.com/astral-sh/ruff/issues/4925. ## Test Plan `cargo test` with new cases that previously failed the fuzzer.	2023-08-17 15:51:09 +00:00
Charlie Marsh	db1c556508	Implement `Ranged` on more structs (#6639 ) ## Summary I noticed some inconsistencies around uses of `.range.start()`, structs that have a `TextRange` field but don't implement `Ranged`, etc. ## Test Plan `cargo test`	2023-08-17 11:22:39 -04:00
Charlie Marsh	1334232168	Introduce `ExpressionRef` (#6637 ) ## Summary This PR revives the `ExpressionRef` concept introduced in https://github.com/astral-sh/ruff/pull/5644, motivated by the change we want to make in https://github.com/astral-sh/ruff/pull/6575 to narrow the type of the expression that can be passed to `parenthesized_range`. ## Test Plan `cargo test`	2023-08-17 10:07:16 -04:00
Micha Reiser	455db84a59	Replace `inline(always)` with `inline` (#6590 )	2023-08-15 08:58:11 +02:00
Charlie Marsh	7f7df852e8	Remove some extraneous newlines in Cargo.toml (#6577 )	2023-08-14 23:39:41 +00:00
Charlie Marsh	96d310fbab	Remove `Stmt::TryStar` (#6566 ) ## Summary Instead, we set an `is_star` flag on `Stmt::Try`. This is similar to the pattern we've migrated towards for `Stmt::For` (removing `Stmt::AsyncFor`) and friends. While these are significant differences for an interpreter, we tend to handle these cases identically or nearly identically. ## Test Plan `cargo test`	2023-08-14 13:39:44 -04:00
Charlie Marsh	a7cf8f0b77	Replace dynamic implicit concatenation detection with parser flag (#6513 ) ## Summary In https://github.com/astral-sh/ruff/pull/6512, we added a flag to the AST to mark implicitly-concatenated string expressions. This PR makes use of that flag to remove the `is_implicit_concatenation` method. ## Test Plan `cargo test`	2023-08-14 10:27:17 -04:00
Charlie Marsh	f16e780e0a	Add an implicit concatenation flag to string and bytes constants (#6512 ) ## Summary Per the discussion in https://github.com/astral-sh/ruff/discussions/6183, this PR adds an `implicit_concatenated` flag to the string and bytes constant variants. It's not actually _used_ anywhere as of this PR, but it is covered by the tests. Specifically, we now use a struct for the string and bytes cases, along with the `Expr::FString` node. That struct holds the value, plus the flag: ```rust #[derive(Clone, Debug, PartialEq, is_macro::Is)] pub enum Constant { Str(StringConstant), Bytes(BytesConstant), ... } #[derive(Clone, Debug, PartialEq, Eq)] pub struct StringConstant { /// The string value as resolved by the parser (i.e., without quotes, or escape sequences, or /// implicit concatenations). pub value: String, /// Whether the string contains multiple string tokens that were implicitly concatenated. pub implicit_concatenated: bool, } impl Deref for StringConstant { type Target = str; fn deref(&self) -> &Self::Target { self.value.as_str() } } #[derive(Clone, Debug, PartialEq, Eq)] pub struct BytesConstant { /// The bytes value as resolved by the parser (i.e., without quotes, or escape sequences, or /// implicit concatenations). pub value: Vec<u8>, /// Whether the string contains multiple string tokens that were implicitly concatenated. pub implicit_concatenated: bool, } impl Deref for BytesConstant { type Target = [u8]; fn deref(&self) -> &Self::Target { self.value.as_slice() } } ``` ## Test Plan `cargo test`	2023-08-14 13:46:54 +00:00
Micha Reiser	9584f613b9	Remove `allow(pedantic)` from formatter (#6549 )	2023-08-14 14:02:06 +02:00
Micha Reiser	ac5c8bb3b6	Add `AnyNodeRef.visit_preorder` ## Summary This PR adds the `AnyNodeRef.visit_preorder` method. I'll need this method to mark all comments of a suppressed node's children as formatted (in debug builds). I'm not super happy with this because it now requires a double-dispatch where the `walk_*` methods call into `node.visit_preorder` and the `visit_preorder` then calls back into the visitor. Meaning, the new implementation now probably results in way more function calls. The other downside is that `AnyNodeRef` now contains code that is difficult to auto-generate. This could be mitigated by extracting the `visit_preorder` method into its own `VisitPreorder` trait. Anyway, this approach solves the need and avoids duplicating the visiting code once more. ## Test Plan `cargo test`	2023-08-10 08:35:09 +02:00
Charlie Marsh	395bb31247	Improve counting of message arguments when msg is provided as a keyword (#6456 ) Closes https://github.com/astral-sh/ruff/issues/6454.	2023-08-09 20:39:10 +00:00
Dhruv Manilawala	6a64f2289b	Rename `Magic` to `IpyEscape` (#6395 ) ## Summary This PR renames the `MagicCommand` token to `IpyEscapeCommand` token and `MagicKind` to `IpyEscapeKind` type to better reflect the purpose of the token and type. Similarly, it renames the AST nodes from `LineMagic` to `IpyEscapeCommand` prefixed with `Stmt`/`Expr` wherever necessary. It also makes renames from using `jupyter_magic` to `ipython_escape_commands` in various function names. The mode value is still `Mode::Jupyter` because the escape commands are part of the IPython syntax but the lexing/parsing is done for a Jupyter notebook. ### Motivation behind the rename: * IPython codebase defines it as "EscapeCommand" / "Escape Sequences": * Escape Sequences: `292e3a2345/IPython/core/inputtransformer2.py (L329-L333)` * Escape command: `292e3a2345/IPython/core/inputtransformer2.py (L410-L411)` * The word "magic" is used mainly for the actual magic commands i.e., the ones starting with `%`/`%%` (https://ipython.readthedocs.io/en/stable/interactive/reference.html#magic-command-system). So, this avoids any confusion between the Magic token (`%`, `%%`) and the escape command itself. ## Test Plan * `cargo test` to make sure all renames are done correctly. * `grep` for `jupyter_escape`/`magic` to make sure all renames are done correctly.	2023-08-09 13:28:18 +00:00
Micha Reiser	a39dd76d95	Add `enter` and `leave_node` methods to Preorder visitor (#6422 )	2023-08-09 09:09:00 +00:00
Charlie Marsh	3f0eea6d87	Rename `JoinedStr` to `FString` in the AST (#6379 ) ## Summary Per the proposal in https://github.com/astral-sh/ruff/discussions/6183, this PR renames the `JoinedStr` node to `FString`.	2023-08-07 17:33:17 +00:00
Charlie Marsh	c439435615	Use dedicated AST nodes on `MemberKind` (#6374 ) ## Summary This PR leverages the unified function definition node to add precise AST node types to `MemberKind`, which is used to power our docstring definition tracking (e.g., classes and functions, whether they're methods or functions or nested functions and so on, whether they have a docstring, etc.). It was painful to do this in the past because the function variants needed to support a union anyway, but storing precise nodes removes like a dozen panics. No behavior changes -- purely a refactor. ## Test Plan `cargo test`	2023-08-07 17:17:58 +00:00
Charlie Marsh	daefa74e9a	Remove async AST node variants for `with`, `for`, and `def` (#6369 ) ## Summary Per the suggestion in https://github.com/astral-sh/ruff/discussions/6183, this PR removes `AsyncWith`, `AsyncFor`, and `AsyncFunctionDef`, replacing them with an `is_async` field on the non-async variants of those structs. Unlike an interpreter, we _generally_ have identical handling for these nodes, so separating them into distinct variants adds complexity from which we don't really benefit. This can be seen below, where we get to remove a _ton_ of code related to adding generic `Any*` wrappers, and a ton of duplicate branches for these cases. ## Test Plan `cargo test` is unchanged, apart from parser snapshots.	2023-08-07 16:36:02 +00:00
Charlie Marsh	c895252aae	Remove `RefEquality` (#6393 ) ## Summary See discussion in https://github.com/astral-sh/ruff/pull/6351#discussion_r1284996979. We can remove `RefEquality` entirely and instead use a text offset for statement keys, since no two statements can start at the same text offset. ## Test Plan `cargo test`	2023-08-07 16:04:50 +00:00
Dhruv Manilawala	e4a4660925	Support help end escape command with priority (#6272 ) ## Summary This PR adds support for help end escape command in the lexer. ### What are "help end escape commands"? First, the escape commands are special IPython syntax which enhances the functionality for the IPython REPL. There are 9 types of escape kinds which are recognized by the tokens which are present at the start of the command (`?`, `??`, `!`, `!!`, etc.). Here, the help command is using either the `?` or `??` token at the start (`?str.replace` for example). Those 2 tokens are also supported when they're at the end of the command (`str.replace?`), but the other tokens aren't supported in that position. There are mainly two types of help end escape commands: 1. Ending with either `?` or `??`, but it also starts with one of the escape tokens (`%matplotlib?`) 2. On the other hand, there's a stricter version for (1) which doesn't start with any escape tokens (`str.replace?`) This PR adds support for (1) while (2) will be supported in the parser. ### Priority Now, if the command starts and ends with an escape token, how do we decide the kind of this command? This is where priority comes into picture. This is simple as there's only one priority where `?`/`??` at the end takes priority over any other escape token and all of the other tokens are at the same priority. Remember that only `?`/`??` at the end is considered valid. This is mainly useful in the case where someone would want to invoke the help command on the magic command itself. For example, in `%matplotlib?` the help command takes priority which means that we want help for the `matplotlib` magic function instead of calling the magic function itself. ### Specification Here's where things get a bit tricky. What if there are question mark tokens at both ends. How do we decide if it's `Help` (`?`) kind or `Help2` (`??`) kind? 

| | Magic | Value | Kind |
| --- | --- | --- | --- |
| 1 | `?foo?` | `foo` | `Help` |
| 2 | `??foo?` | `foo` | `Help` |
| 3 | `?foo??` | `foo` | `Help2` |
| 4 | `??foo??` | `foo` | `Help2` |
| 5 | `???foo??` | `foo` | `Help2` |
| 6 | `??foo???` | `foo???` | `Help2` |
| 7 | `???foo???` | `?foo???` | `Help2` |

Looking at the above table: - The question mark tokens on the right takes priority over the ones on the left but only if the number of question mark on the right is 1 or 2. - If there are more than 2 question mark tokens on the right side, then the left side is used to determine the same. - If the right side is used to determine the kind, then all of the question marks and whitespaces on the left side are ignored in the `value`, but if it’s the other way around, then all of the extra question marks are part of the `value`. ### References - IPython implementation using the regex: `292e3a2345/IPython/core/inputtransformer2.py (L454-L462)` - Priorities: `292e3a2345/IPython/core/inputtransformer2.py (L466-L469)` ## Test Plan Add a bunch of test cases for the lexer and verify that it matches the behavior of IPython transformer. resolves: #6357	2023-08-07 21:01:02 +05:30
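The priority rules in the table above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the lexer code from this PR and not IPython's regex; the function name `classify_help_command` is hypothetical, and the sketch assumes the command starts with `?` or `??`.

```python
def classify_help_command(command: str) -> tuple[str, str]:
    """Return (value, kind) for a command that starts with `?` or `??`.

    Hypothetical helper that mirrors the table above; not Ruff's lexer.
    """
    if not command.startswith("?"):
        raise ValueError("sketch only handles commands starting with '?'")

    # Count the question marks at the end of the command.
    stripped = command.rstrip("?")
    trailing = len(command) - len(stripped)

    if trailing in (1, 2):
        # The right side decides the kind; question marks and whitespace
        # on the left are dropped from the value.
        kind = "Help" if trailing == 1 else "Help2"
        return stripped.lstrip("? \t"), kind

    # Zero or more than two trailing question marks: the left side decides,
    # and everything after the leading token stays in the value.
    if command.startswith("??"):
        return command[2:], "Help2"
    return command[1:], "Help"
```

Rows 6 and 7 of the table are the cases where the left side wins, so the extra question marks end up in the value.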
Charlie Marsh	76148ddb76	Store call paths rather than stringified names (#6102 ) ## Summary Historically, we've stored "qualified names" on our `BindingKind::Import`, `BindingKind::SubmoduleImport`, and `BindingKind::ImportFrom` structs. In Ruff, a "qualified name" is a dot-separated path to a symbol. For example, given `import foo.bar`, the "qualified name" would be `"foo.bar"`; and given `from foo.bar import baz`, the "qualified name" would be `foo.bar.baz`. This PR modifies the `BindingKind` structs to instead store _call paths_ rather than qualified names. So in the examples above, we'd store `["foo", "bar"]` and `["foo", "bar", "baz"]`. It turns out that this is more efficient given our data access patterns. Namely, we frequently need to convert the qualified name to a call path (whenever we call `resolve_call_path`), and it turns out that we do this operation enough that those conversions show up on benchmarks. There are a few other advantages to using call paths, rather than qualified names: 1. The size of `BindingKind` is reduced from 32 to 24 bytes, since we no longer need to store a `String` (only a boxed slice). 2. All three import types are more consistent, since they now all store a boxed slice, rather than some storing an `&str` and some storing a `String` (for `BindingKind::ImportFrom`, we needed to allocate a `String` to create the qualified name, but the call path is a slice of static elements that don't require that allocation). 3. A lot of code gets simpler, in part because we now do call path resolution "earlier". Most notably, for relative imports (`from .foo import bar`), we store the _resolved_ call path rather than the relative call path, so the semantic model doesn't have to deal with that resolution. (See that `resolve_call_path` is simpler, fewer branches, etc.) In my testing, this change improves the all-rules benchmark by another 4-5% on top of the improvements mentioned in #6047.	2023-08-05 15:21:50 +00:00
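As a rough illustration of the difference between a qualified name and a call path, here is a minimal Python sketch. The function names and the `module_path` parameter are hypothetical, and the relative-import handling (dropping one segment of the current module's path per leading dot) is an assumption about how such a resolution could work, not a transcription of Ruff's code.

```python
from typing import Optional, Sequence


def import_call_path(name: str) -> tuple[str, ...]:
    """`import foo.bar` -> ("foo", "bar")."""
    return tuple(name.split("."))


def import_from_call_path(
    module: Optional[str],
    member: str,
    level: int = 0,
    module_path: Sequence[str] = (),
) -> tuple[str, ...]:
    """`from foo.bar import baz` -> ("foo", "bar", "baz").

    For relative imports, `level` is the number of leading dots and
    `module_path` is the (assumed) dotted path of the importing module.
    """
    segments: list[str] = []
    if level:
        if level > len(module_path):
            raise ValueError("relative import goes beyond the top-level package")
        # Assumption: drop one segment of the current module path per dot.
        segments.extend(module_path[: len(module_path) - level])
    if module:
        segments.extend(module.split("."))
    segments.append(member)
    return tuple(segments)
```

Storing the tuple once at binding time means later lookups can compare segments directly instead of splitting the dotted string on every query.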
Dhruv Manilawala	1ac2699b5e	Update `F841` autofix to not remove line magic expr (#6141 ) ## Summary Update `F841` autofix to not remove line magic expr ## Test Plan Added test case for assignment statement with and without type annotation fixes: #6116	2023-08-05 00:45:01 +00:00
konsti	1031bb6550	Formatter: Add SourceType to context to enable special formatting for stub files (#6331 ) Summary This adds the information whether we're in a .py Python source file or in a .pyi stub file to enable people working on #5822 and related issues. I'm not completely happy with `Default` for something that depends on the input. Test Plan None; this is currently unused. I'm leaving this to the first implementation of stub-file-specific formatting. --------- Co-authored-by: Micha Reiser <micha@reiser.io>	2023-08-04 11:52:26 +00:00
Charlie Marsh	2fa508793f	Return a slice in `StmtClassDef#bases` (#6311 ) Slices are strictly more flexible, since you can always convert to an iterator, etc., but not the other way around. Suggested in https://github.com/astral-sh/ruff/pull/6259#discussion_r1282730994.	2023-08-03 16:21:55 +00:00
Charlie Marsh	9f3567dea6	Use `range: _` in lieu of `range: _range` (#6296 ) ## Summary `range: _range` is slightly inconvenient because you can't use it multiple times within a single match, unlike `_`.	2023-08-02 22:11:13 -04:00
Zanie Blue	1a60d1e3c6	Add formatting of type parameters in class and function definitions (#6161 ) Part of #5062 Closes https://github.com/astral-sh/ruff/issues/5931 Implements formatting of a sequence of type parameters in a dedicated struct for reuse by classes, functions, and type aliases (preparing for #5929). Adds formatting of type parameters in class and function definitions — previously, they were just elided.	2023-08-02 20:29:28 +00:00
Charlie Marsh	23b8fc4366	Move `includes_arg_name` onto `Parameters` (#6282 ) ## Summary Like #6279, no reason for this to be a standalone method.	2023-08-02 18:05:26 +00:00
Charlie Marsh	fd40864924	Move `find_keyword` helpers onto `Arguments` struct (#6280 ) ## Summary Similar to #6279, moving some helpers onto the struct in the name of reducing the number of random undiscoverable utilities we have in `helpers.rs`. Most of the churn is migrating rules to take `ast::ExprCall` instead of the spread call arguments. ## Test Plan `cargo test`	2023-08-02 13:54:48 -04:00
Charlie Marsh	041946fb64	Remove `CallArguments` abstraction (#6279 ) ## Summary This PR removes a now-unnecessary abstraction from `helper.rs` (`CallArguments`), in favor of adding methods to `Arguments` directly, which helps with discoverability.	2023-08-02 13:25:43 -04:00
Charlie Marsh	8a0f844642	Box type params and arguments fields on the class definition node (#6275 ) ## Summary This PR boxes the `TypeParams` and `Arguments` fields on the class definition node. These fields are optional and often omitted, and given that class definition is our largest enum variant, we pay the cost of including them for every statement in the AST. Boxing these types reduces the statement size by 40 bytes, which seems like a good tradeoff given how infrequently these are accessed. ## Test Plan Need to benchmark, but no behavior changes.	2023-08-02 16:47:06 +00:00
Charlie Marsh	4c53bfe896	Add formatter support for call and class definition `Arguments` (#6274 ) ## Summary This PR leverages the `Arguments` AST node introduced in #6259 in the formatter, which ensures that we correctly handle trailing comments in calls, like: ```python f( 1, # comment ) pass ``` (Previously, this was treated as a leading comment on `pass`.) This also allows us to unify the argument handling across calls and class definitions. ## Test Plan A bunch of new fixture tests, plus improved Black compatibility.	2023-08-02 11:54:22 -04:00
Charlie Marsh	b095b7204b	Add a `TypeParams` node to the AST (#6261 ) ## Summary Similar to #6259, this PR adds a `TypeParams` node to the AST, to capture the list of type parameters with their surrounding brackets. If a statement lacks type parameters, the `type_params` field will be `None`.	2023-08-02 14:12:45 +00:00
Charlie Marsh	981e64f82b	Introduce an `Arguments` AST node for function calls and class definitions (#6259 ) ## Summary This PR adds a new `Arguments` AST node, which we can use for function calls and class definitions. The `Arguments` node spans from the left (open) to right (close) parentheses inclusive. In the case of classes, the `Arguments` is an option, to differentiate between: ```python # None class C: ... # Some, with empty vectors class C(): ... ``` In this PR, we don't really leverage this change (except that a few rules get much simpler, since we don't need to lex to find the start and end ranges of the parentheses, e.g., `crates/ruff/src/rules/pyupgrade/rules/lru_cache_without_parameters.rs`, `crates/ruff/src/rules/pyupgrade/rules/unnecessary_class_parentheses.rs`). In future PRs, this will be especially helpful for the formatter, since we can track comments enclosed on the node itself. ## Test Plan `cargo test`	2023-08-02 10:01:13 -04:00
Charlie Marsh	9c708d8fc1	Rename `Parameter#arg` and `ParameterWithDefault#def` fields (#6255 ) ## Summary This PR renames... - `Parameter#arg` to `Parameter#name` - `ParameterWithDefault#def` to `ParameterWithDefault#parameter` (such that `ParameterWithDefault` has a `default` and a `parameter`) ## Test Plan `cargo test`	2023-08-01 14:28:34 -04:00
Charlie Marsh	adc8bb7821	Rename `Arguments` to `Parameters` in the AST (#6253 ) ## Summary This PR renames a few AST nodes for clarity: - `Arguments` is now `Parameters` - `Arg` is now `Parameter` - `ArgWithDefault` is now `ParameterWithDefault` For now, the attribute names that reference `Parameters` directly are changed (e.g., on `StmtFunctionDef`), but the attributes on `Parameters` itself are not (e.g., `vararg`). We may revisit that decision in the future. For context, the AST node formerly known as `Arguments` is used in function definitions. Formally (outside of the Python context), "arguments" typically refers to "the values passed to a function", while "parameters" typically refers to "the variables used in a function definition". E.g., if you Google "arguments vs parameters", you'll get some explanation like: > A parameter is a variable in a function definition. It is a placeholder and hence does not have a concrete value. An argument is a value passed during function invocation. We're thus deviating from Python's nomenclature in favor of a scheme that we find to be more precise.	2023-08-01 13:53:28 -04:00
konsti	1df7e9831b	Replace `.map_or(false, $closure)` with `.is_some_and(closure)` (#6244 ) Summary [Option::is_some_and](https://doc.rust-lang.org/stable/std/option/enum.Option.html#method.is_some_and) and [Result::is_ok_and](https://doc.rust-lang.org/std/result/enum.Result.html#method.is_ok_and) are new methods in Rust 1.70. I find them way more readable than `.map_or(false, ...)`. The changes are `s/.map_or(false,/.is_some_and(/g`, then manually switching to `is_ok_and` where the value is a Result rather than an Option. Test Plan n/a	2023-08-01 19:29:42 +02:00
Micha Reiser	debfca3a11	Remove `Parse` trait (#6235 )	2023-08-01 18:35:03 +02:00
Charlie Marsh	83fe103d6e	Allow generic tuple and list calls in __all__ (#6247 ) ## Summary Allows, e.g., `__all__ = list[str]()`. Closes https://github.com/astral-sh/ruff/issues/6226.	2023-08-01 12:01:48 -04:00
Micha Reiser	f45e8645d7	Remove unused parser modes ## Summary This PR removes the `Interactive` and `FunctionType` parser modes that are unused by ruff ## Test Plan `cargo test`	2023-08-01 13:10:07 +02:00
Micha Reiser	7c7231db2e	Remove unsupported `type_comment` field ## Summary This PR removes the `type_comment` field which our parser doesn't support. ## Test Plan `cargo test`	2023-08-01 12:53:13 +02:00
Micha Reiser	4ad5903ef6	Delete type-ignore node ## Summary This PR removes the type ignore node from the AST because our parser doesn't support it, and just having it around is confusing. ## Test Plan `cargo build`	2023-08-01 12:34:50 +02:00
Micha Reiser	ecfdd8d58b	Add static assertions to nodes (#6228 )	2023-08-01 11:54:49 +02:00
David Szotten	ba990b676f	add `DebugText` for self-documenting f-strings (#6167 )	2023-08-01 07:55:03 +02:00
Charlie Marsh	646ff6497c	Ignore end-of-line file exemption comments (#6160 ) ## Summary This PR protects against code like: ```python from typing import Optional import bar # ruff: noqa import baz class Foo: x: Optional[str] = None ``` In which the user wrote `# ruff: noqa` to ignore a specific error, not realizing that it was a file-level exemption that thus turned off all lint rules. Specifically, if a `# ruff: noqa` directive is not at the start of a line, we now ignore it and warn, since this is almost certainly a mistake.	2023-07-29 00:40:32 +00:00
Micha Reiser	40f54375cb	Pull in RustPython parser (#6099 )	2023-07-27 09:29:11 +00:00
konsti	13f9a16e33	Rewrite placement logic (#6040 ) ## Summary This is a rewrite of the main comment placement logic. `place_comment` now has three parts: - place own line comments - between branches - after a branch - place end-of-line comments - after colon - after a branch - place comments for specific nodes (that include module level comments) The rewrite fixed three bugs: `class A: # trailing comment` comments now stay end-of-line, `try: # comment` remains end-of-line and deeply indented try-else-finally comments remain with the right nested statement. It will be much easier to give more alternative branches nodes since this is abstracted away by `is_node_with_body` and the first/last child helpers. Adding new node types can now be done by adding an entry to the `place_comment` match. The code went from 1526 lines before #6033 to 1213 lines now. I think it's easier to just read the new `placement.rs` rather than reviewing the diff. ## Test Plan The existing fixtures staying the same or improving plus new ones for the bug fixes.	2023-07-26 16:21:23 +00:00
Micha Reiser	2cf00fee96	Remove parser dependency from ruff-python-ast (#6096 )	2023-07-26 17:47:22 +02:00
Micha Reiser	16e1737d1b	Use cursor based lexer (#6012 )	2023-07-26 11:32:26 +02:00
Dhruv Manilawala	025fa4eba8	Integrate the new Jupyter AST nodes in Ruff (#6086 ) ## Summary This PR adds the implementation for the new Jupyter AST nodes i.e., `ExprLineMagic` and `StmtLineMagic`. ## Test Plan Add test cases for `unparse` containing magic commands resolves: #6087	2023-07-26 08:20:30 +00:00
Harutaka Kawamura	62f821daaa	Avoid raising PT012 for simple `with` statements (#6081 )	2023-07-26 01:43:31 +00:00
Zanie Blue	389fe13c93	Implement visitation of type aliases and parameters (#5927 ) ## Summary Part of #5062 Requires https://github.com/astral-sh/RustPython-Parser/pull/32 Adds visitation of type alias statements and type parameters in class and function definitions. Duplicates tests for `PreorderVisitor` into `Visitor` with new snapshots. Testing required node implementations for the `TypeParam` enum, which is a chunk of the diff and the reason we need `Ranged` implementations in https://github.com/astral-sh/RustPython-Parser/pull/32. ## Test Plan Adds unit tests with snapshots.	2023-07-25 17:11:26 +00:00
konsti	e7f228f781	Placement refactor (#6034 ) ## Summary This PR is a refactoring of placement.rs. The code got more consistent, some comments were updated and some dead code was removed or replaced with debug assertions. It also contains a bugfix for the placement of end-of-branch comments with nested bodies inside try statements that occurred when refactoring the nested body loop. ## Test Plan The existing test cases don't change. I added a couple of cases that I think should be tested but weren't, and a regression test for the bugfix	2023-07-25 11:49:05 +02:00
Charlie Marsh	0d94337b96	Avoid allocations in `SimpleCallArgs` (#6021 ) ## Summary My intuition is that it's faster to do these checks as-needed rather than allocation new hash maps and vectors for the arguments. (We typically only query once anyway.)	2023-07-24 04:55:37 +00:00
Charlie Marsh	9834c69c98	Remove `__all__` enforcement rules out of binding phase (#5897 ) ## Summary This PR moves two rules (`invalid-all-format` and `invalid-all-object`) out of the name-binding phase, and into the dedicated pass over all bindings that occurs at the end of the `Checker`. This is part of my continued quest to separate the semantic model-building logic from the actual rule enforcement.	2023-07-19 21:18:47 +00:00
Zanie Blue	b27f0fa433	Implement `any_over_expr` for type alias and type params (#5866 ) Part of https://github.com/astral-sh/ruff/issues/5062	2023-07-19 16:17:06 -05:00
Charlie Marsh	5f3da9955a	Rename `ruff_python_whitespace` to `ruff_python_trivia` (#5886 ) ## Summary This crate now contains utilities for dealing with trivia more broadly: whitespace, newlines, "simple" trivia lexing, etc. So renaming it to reflect its increased responsibilities. To avoid conflicts, I've also renamed `Token` and `TokenKind` to `SimpleToken` and `SimpleTokenKind`.	2023-07-19 11:48:27 -04:00
Charlie Marsh	626d8dc2cc	Use `.as_ref()` in lieu of `&**` (#5874 ) I find this less opaque (and often more succinct).	2023-07-19 00:49:13 +00:00
Charlie Marsh	2d505e2b04	Remove suite body tracking from `SemanticModel` (#5848 ) ## Summary The `SemanticModel` currently stores the "body" of a given `Suite`, along with the current statement index. This is used to support "next sibling" queries, but we only use this in exactly one place -- the rule that simplifies constructs like this to `any` or `all`: ```python for x in y: if x == 0: return True return False ``` Instead of tracking the state, we can just do a (slightly more expensive) traversal, by finding the node within its parent and returning the next node in the body. Note that we'll only have to do this extremely rarely -- namely, for functions that contain something like: ```python for x in y: if x == 0: return True ```	2023-07-18 18:58:31 -04:00
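The "find the node within its parent and return the next node in the body" idea can be sketched with Python's standard `ast` module. This is an illustration only: `next_sibling` is a hypothetical helper, it uses CPython's AST rather than Ruff's, and it only looks at the `body`, `orelse`, and `finalbody` fields.

```python
import ast
from typing import Optional


def next_sibling(parent: ast.AST, stmt: ast.stmt) -> Optional[ast.stmt]:
    """Return the statement that follows `stmt` in one of `parent`'s bodies."""
    for field in ("body", "orelse", "finalbody"):
        body = getattr(parent, field, None)
        if not isinstance(body, list):
            continue
        for index, child in enumerate(body):
            # Identity comparison: we want this exact node, not an equal one.
            if child is stmt:
                return body[index + 1] if index + 1 < len(body) else None
    return None


SOURCE = """\
def f(y):
    for x in y:
        if x == 0:
            return True
    return False
"""

function = ast.parse(SOURCE).body[0]
loop = function.body[0]
following = next_sibling(function, loop)
```

The linear scan is slower than keeping an index around, but it only runs for the rare statements that need a "next sibling" query.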
Zanie Blue	a93254f026	Implement `unparse` for type aliases and parameters (#5869 ) Part of https://github.com/astral-sh/ruff/issues/5062	2023-07-18 16:25:49 -05:00
Zanie Blue	41da52a61b	Implement `TokenKind` for type aliases (#5870 ) Part of https://github.com/astral-sh/ruff/issues/5062	2023-07-18 18:21:51 +00:00
Zanie Blue	d5c43a45b3	Implement `Comparable` for type aliases and parameters (#5865 ) Part of https://github.com/astral-sh/ruff/issues/5062	2023-07-18 17:18:14 +00:00
Zanie Blue	0eab4b3c22	Implement `AnyNode` and `AnyNodeRef` for `StmtTypeAlias` (#5863 ) Part of https://github.com/astral-sh/ruff/issues/5062	2023-07-18 10:44:55 -05:00
Charlie Marsh	c868def374	Unroll `collect_call_path` to speed up common cases (#5792 ) ## Summary This PR just naively unrolls `collect_call_path` to handle attribute resolutions of up to eight segments. In profiling via Instruments, it seems to be about 4x faster for a very hot code path (4% of total execution time on `main`, 1% here). Profiling by running `RAYON_NUM_THREADS=1 cargo instruments -t time --profile release-debug --time-limit 10000 -p ruff_cli -o FromSlice.trace -- check crates/ruff/resources/test/cpython --silent -e --no-cache --select ALL`, and modifying the linter to loop infinitely up to the specified time (10 seconds) to increase sample size. Before: <img width="1792" alt="Screen Shot 2023-07-15 at 5 13 34 PM" src="https://github.com/astral-sh/ruff/assets/1309177/4a8b0b45-8b67-43e9-af5e-65b326928a8e"> After: <img width="1792" alt="Screen Shot 2023-07-15 at 8 38 51 PM" src="https://github.com/astral-sh/ruff/assets/1309177/d8829159-2c79-4a49-ab3c-9e4e86f5b2b1">	2023-07-18 11:29:59 -04:00
konsti	730e6b2b4c	Refactor `StmtIf`: Formatter and Linter (#5459 ) ## Summary Previously, `StmtIf` was defined recursively as ```rust pub struct StmtIf { pub range: TextRange, pub test: Box<Expr>, pub body: Vec<Stmt>, pub orelse: Vec<Stmt>, } ``` Every `elif` was represented as an `orelse` with a single `StmtIf`. This means that this representation couldn't differentiate between ```python if cond1: x = 1 else: if cond2: x = 2 ``` and ```python if cond1: x = 1 elif cond2: x = 2 ``` It also makes many checks harder than they need to be because we have to recurse just to iterate over an entire if-elif-else and because we're lacking nodes and ranges on the `elif` and `else` branches. We change the representation to a flat ```rust pub struct StmtIf { pub range: TextRange, pub test: Box<Expr>, pub body: Vec<Stmt>, pub elif_else_clauses: Vec<ElifElseClause>, } pub struct ElifElseClause { pub range: TextRange, pub test: Option<Expr>, pub body: Vec<Stmt>, } ``` where `test: Some(_)` represents an `elif` and `test: None` an else. This representation is a different tradeoff, e.g. we need to allocate the `Vec<ElifElseClause>`, the `elif`s are now different than the `if`s (which matters in rules where we want to check both `if`s and `elif`s) and the type system doesn't guarantee that the `test: None` else is actually last. We're also now a bit more inconsistent since all other `else`, those from `for`, `while` and `try`, still don't have nodes. With the new representation some things became easier, e.g. finding the `elif` token (we can use the start of the `ElifElseClause`) and formatting comments for if-elif-else (no more dangling comments splitting, we only have to insert the dangling comment after the colon manually and set `leading_alternate_branch_comments`, everything else is taken care of by having nodes for each branch and the usual placement.rs fixups). ## Merge Plan This PR requires coordination between the parser repo and the main ruff repo. 
I've split the ruff part, into two stacked PRs which have to be merged together (only the second one fixes all tests), the first for the formatter to be reviewed by @michareiser and the second for the linter to be reviewed by @charliermarsh. * MH: Review and merge https://github.com/astral-sh/RustPython-Parser/pull/20 * MH: Review and merge or move later in stack https://github.com/astral-sh/RustPython-Parser/pull/21 * MH: Review and approve https://github.com/astral-sh/RustPython-Parser/pull/22 * MH: Review and approve formatter PR https://github.com/astral-sh/ruff/pull/5459 * CM: Review and approve linter PR https://github.com/astral-sh/ruff/pull/5460 * Merge linter PR in formatter PR, fix ecosystem checks (ecosystem checks can't run on the formatter PR and won't run on the linter PR, so we need to merge them first) * Merge https://github.com/astral-sh/RustPython-Parser/pull/22 * Create tag in the parser, update linter+formatter PR * Merge linter+formatter PR https://github.com/astral-sh/ruff/pull/5459 --------- Co-authored-by: Micha Reiser <micha@reiser.io>	2023-07-18 13:40:15 +02:00
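The ambiguity of the recursive representation described in the `StmtIf` commit above can be observed with CPython's own `ast` module, which also nests `elif` inside `orelse`. The sketch below is only an analogy using the standard library, not Ruff's AST; `branch_tests` is a hypothetical helper showing the kind of walk needed to iterate over an if-elif chain in that shape.

```python
import ast

ELIF_SRC = "if cond1:\n    x = 1\nelif cond2:\n    x = 2\n"
NESTED_SRC = "if cond1:\n    x = 1\nelse:\n    if cond2:\n        x = 2\n"


def dump(source: str) -> str:
    # ast.dump omits line/column attributes by default, so only the
    # tree shape is compared.
    return ast.dump(ast.parse(source))


def branch_tests(node: ast.If) -> list[str]:
    """Collect the test of every branch by walking the nested `orelse`."""
    tests = [ast.unparse(node.test)]
    while len(node.orelse) == 1 and isinstance(node.orelse[0], ast.If):
        node = node.orelse[0]
        tests.append(ast.unparse(node.test))
    return tests


same_shape = dump(ELIF_SRC) == dump(NESTED_SRC)
tests = branch_tests(ast.parse(ELIF_SRC).body[0])
```

Because both sources produce the same tree shape, a consumer of the recursive form has to fall back on source positions to tell `elif` from `else: if`, which is the problem the flat `elif_else_clauses` list avoids.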
David Szotten	52aa2fc875	upgrade rustpython to remove tuple-constants (#5840 ) c.f. https://github.com/astral-sh/RustPython-Parser/pull/28 Tests: No snapshots changed --------- Co-authored-by: Zanie <contact@zanie.dev>	2023-07-17 22:50:31 +00:00
Charlie Marsh	2cd117ba81	Remove `TryIdentifier` trait (#5816 ) ## Summary Last remaining usage here is for patterns, but we now have ranges on identifiers so it's unnecessary.	2023-07-16 21:24:16 -04:00
Charlie Marsh	01b05fe247	Remove `Identifier` usages for isolating exception names (#5797 ) ## Summary The motivating change here is to remove `let range = except_handler.try_identifier().unwrap();` and instead just do `name.range()`, since exception names now have ranges attached to them by the parser. This also required some refactors (which are improvements) to the built-in attribute shadowing rules, since at least one invocation relied on passing in the exception handler and calling `.try_identifier()`. Now that we have easy access to identifiers, we can remove the whole `AnyShadowing` abstraction.	2023-07-16 04:49:48 +00:00
Charlie Marsh	4782675bf9	Remove lexer-based comment range detection (#5785 ) ## Summary I'm doing some unrelated profiling, and I noticed that this method is actually measurable on the CPython benchmark -- it's > 1% of execution time. We don't need to lex here, we already know the ranges of all comments, so we can just do a simple binary search for overlap, which brings the method down to 0%. ## Test Plan `cargo test`	2023-07-16 01:03:27 +00:00
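A binary search over known comment ranges, as described in the commit above, might look like the following Python sketch. The class name `CommentRanges` and the `intersects` method are used here only for illustration and are not claimed to match Ruff's API; ranges are assumed to be half-open `(start, end)` offsets that do not overlap each other.

```python
from bisect import bisect_left


class CommentRanges:
    """Sorted, non-overlapping, half-open (start, end) comment offsets."""

    def __init__(self, ranges: list[tuple[int, int]]) -> None:
        self._ranges = sorted(ranges)
        self._starts = [start for start, _ in self._ranges]

    def intersects(self, start: int, end: int) -> bool:
        """Return True if [start, end) overlaps any comment range."""
        # Number of comments that start before `end`.
        index = bisect_left(self._starts, end)
        if index == 0:
            return False
        # Only the last such comment can reach past `start`, because the
        # ranges are sorted and do not overlap.
        _, comment_end = self._ranges[index - 1]
        return comment_end > start


comments = CommentRanges([(5, 10), (20, 30)])
```

Each query costs O(log n) in the number of comments, with no re-lexing of the source.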
guillaumeLepape	6824b67f44	Include alias when formatting import-from structs (#5786 ) ## Summary When required-imports is set with the syntax from ... import ... as ..., autofix I002 is failing ## Test Plan Reuse the same python files as `crates/ruff/src/rules/isort/mod.rs:required_import` test.	2023-07-15 15:53:21 -04:00
Charlie Marsh	5a4516b812	Misc. stylistic changes from flipping through rules late at night (#5757 ) ## Summary This is really bad PR hygiene, but a mix of: using `Locator`-based fixes in a few places (in lieu of `Generator`-based fixes), using match syntax to avoid `.len() == 1` checks, using common helpers in more places, etc. ## Test Plan `cargo test`	2023-07-14 05:23:47 +00:00
Charlie Marsh	6dbc6d2e59	Use shared `Cursor` across crates (#5715 ) ## Summary We have two `Cursor` implementations. This PR moves the implementation from the formatter into `ruff_python_whitespace` (kind of a poorly-named crate now) and uses it for both use-cases.	2023-07-12 21:09:27 +00:00
Charlie Marsh	4dee49d6fa	Run nightly Clippy over the Ruff repo (#5670 ) ## Summary This is the result of running `cargo +nightly clippy --workspace --all-targets --all-features -- -D warnings` and fixing all violations. Just wanted to see if there were any interesting new checks on nightly 👀	2023-07-10 23:44:38 -04:00
konsti	0b9af031fb	Format ExprIfExp (ternary operator) (#5597 ) ## Summary Format `ExprIfExp`, also known as the ternary operator or inline `if`. It can look like ```python a1 = 1 if True else 2 ``` but also ```python b1 = ( # We return "a" ... "a" # that's our True value # ... if this condition matches ... if True # that's our test # ... otherwise we return "b" else "b" # that's our False value ) ``` This also fixes a visitor order bug. The jaccard index on django goes from 0.911 to 0.915. ## Test Plan I added fixtures without and with comments in strange places.	2023-07-07 19:11:52 +00:00
konsti	8184235f93	Try statements have a body: Fix formatter instability (#5558 ) ## Summary The following code was previously leading to unstable formatting: ```python try: try: pass finally: print(1) # issue7208 except A: pass ``` The comment would be formatted as a trailing comment of `try` which is unstable as an end-of-line comment gets two extra whitespaces. This was originally found in `99b00efd5e/Lib/getpass.py (L68-L91)` ## Test Plan I added a regression test	2023-07-06 16:07:47 +02:00
Charlie Marsh	dadad0e9ed	Remove some allocations in argument detection (#5481 ) ## Summary Drive-by PR to remove some allocations around argument name matching.	2023-07-03 12:21:26 -04:00
Anders Kaseorg	df13e69c3c	Format let-else with rustfmt nightly (#5461 ) Support for `let…else` formatting was just merged to nightly (rust-lang/rust#113225). Rerun `cargo fmt` with Rust nightly 2023-07-02 to pick this up. Followup to #939. Signed-off-by: Anders Kaseorg <andersk@mit.edu>	2023-07-03 02:13:35 +00:00
Charlie Marsh	fa1b85b3da	Remove prelude from `ruff_python_ast` (#5369 ) ## Summary Per @MichaReiser, this is causing more confusion than it is helpful.	2023-06-26 11:43:49 -04:00
Micha Reiser	6ba9d5d5a4	Upgrade RustPython (#5334 )	2023-06-23 20:39:47 +00:00
James Berry	f85eb709e2	Visit AugAssign target after value (#5325 ) ## Summary When visiting AugAssign in evaluation order, the AugAssign `target` should be visited after its `value`. Based on my testing, the pseudo code for `a += b` is effectively: ```python tmp = a a = tmp.__iadd__(b) ``` That is, an ideal traversal order would look something like this: 1. load a 2. b 3. op 4. store a But, there is only a single AST node which captures `a` in the statement `a += b`, so it cannot be traversed both before and after the traversal of `b` and the `op`. Nonetheless, I think traversing `a` after `b` and the `op` makes the most sense for a number of reasons: 1. All the other assignment expressions traverse their `value`s before their `target`s. Having `AugAssign` traverse in the same order would be more consistent. 2. Within the AST, the `ctx` of the `target` for an `AugAssign` is `Store` (though technically this is a `Load` and `Store` operation, the AST only indicates it as a `Store`). Since the store portion of the `AugAssign` occurs last, I think it makes sense to traverse the `target` last as well. The effect of this is marginal, but it may have an impact on the behavior of #5271.	2023-06-23 09:54:54 -04:00
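The load/store order described in #5325 can be observed directly in CPython. The sketch below is only an illustration: `TracingNamespace` and `trace_augassign` are hypothetical helpers invented for this example, and it relies on CPython routing module-level name lookups through a non-exact `dict` subclass passed as the `locals` argument of `exec`.

```python
import ast


class TracingNamespace(dict):
    """A dict that records name loads and stores performed through it."""

    def __init__(self, log, **initial):
        super().__init__(**initial)  # initial values are not logged
        self.log = log

    def __getitem__(self, name):
        # A KeyError here lets the lookup fall back to globals/builtins.
        value = super().__getitem__(name)
        self.log.append(("load", name))
        return value

    def __setitem__(self, name, value):
        self.log.append(("store", name))
        super().__setitem__(name, value)


def trace_augassign(source="a += b"):
    """Execute `source` with a=1, b=2 and return (events, final values)."""
    log = []
    namespace = TracingNamespace(log, a=1, b=2)
    exec(compile(source, "<example>", "exec"), {}, namespace)
    return list(log), dict(namespace)


events, final_values = trace_augassign()
target = ast.parse("a += b").body[0].target
print(events)
print(type(target.ctx).__name__)
```

The recorded events show `a` being loaded before `b` and stored last, while the AST marks the single `target` node only as `Store`. That mismatch is why the traversal has to pick one position for the target.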
Micha Reiser	c52aa8f065	Basic string formatting ## Summary This PR implements formatting for non-f-string strings that do not use implicit concatenation. Docstring formatting is out of the scope of this PR. ## Test Plan I added a few tests for simple string literals. ## Performance Ouch. This is hitting performance somewhat hard. This is probably because we now iterate each string a couple of times: 1. To detect if it is an implicit string continuation 2. To detect if the string contains any new lines 3. To detect the preferred quote 4. To normalize the string Edit: I integrated the detection of newlines into the preferred quote detection so that we only iterate the string three times. We can probably do better by merging the implicit string continuation with the quote detection and new line detection by iterating till the end of the string part and returning the offset. We then use our simple tokenizer to skip over any comments or whitespace until we find the first non-trivia token. From there we keep doing this in a loop until we reach the end of the string. I'll leave this improvement for later.	2023-06-23 09:46:05 +02:00
James Berry	2142bf6141	Fix annotation and format spec visitors (#5324 ) ## Summary The `Visitor` and `preorder::Visitor` traits provide some convenience functions, `visit_annotation` and `visit_format_spec`, for handling annotation and format spec expressions respectively. Both of these functions accept an `&Expr` and have a default implementation which delegates to `walk_expr`. The problem with this approach is that any custom handling done in `visit_expr` will be skipped for annotations and format specs. Instead, to capture any custom logic implemented in `visit_expr`, both of these functions' default implementations should delegate to `visit_expr` instead of `walk_expr`. ## Example Consider the below `Visitor` implementation: ```rust impl<'a> Visitor<'a> for Example<'a> { fn visit_expr(&mut self, expr: &'a Expr) { match expr { Expr::Name(ExprName { id, .. }) => println!("Visiting {:?}", id), _ => walk_expr(self, expr), } } } ``` Run on the following Python snippet: ```python a: b ``` I would expect such a visitor to print the following: ``` Visiting b Visiting a ``` But it instead prints the following: ``` Visiting a ``` Our custom `visit_expr` handler is not invoked for the annotation. ## Test Plan Tests added in #5271 caught this behavior.	2023-06-23 03:55:42 +00:00
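The delegation problem in #5324 can be modeled outside of Rust. The following is a rough Python analogue built on the standard `ast` module, not Ruff's `Visitor` trait: the classes `Visitor`, `NameCollector`, and `LegacyNameCollector` and the helper `collect` are hypothetical names chosen for this sketch.

```python
import ast


class Visitor:
    """Tiny visitor with overridable hooks, loosely mirroring the trait."""

    def visit_stmt(self, stmt):
        if isinstance(stmt, ast.AnnAssign):
            if stmt.value is not None:
                self.visit_expr(stmt.value)
            self.visit_annotation(stmt.annotation)
            self.visit_expr(stmt.target)

    def visit_expr(self, expr):
        self.walk_expr(expr)

    def visit_annotation(self, expr):
        # Fixed default: go through visit_expr so overrides are honoured.
        self.visit_expr(expr)

    def walk_expr(self, expr):
        # Recurse into child expressions only.
        for child in ast.iter_child_nodes(expr):
            if isinstance(child, ast.expr):
                self.visit_expr(child)


class NameCollector(Visitor):
    """Custom visit_expr that records every name it is handed."""

    def __init__(self):
        self.seen = []

    def visit_expr(self, expr):
        if isinstance(expr, ast.Name):
            self.seen.append(expr.id)
        else:
            self.walk_expr(expr)


class LegacyNameCollector(NameCollector):
    def visit_annotation(self, expr):
        # Old default: walk_expr skips the custom visit_expr for `expr` itself.
        self.walk_expr(expr)


def collect(visitor, source):
    for stmt in ast.parse(source).body:
        visitor.visit_stmt(stmt)
    return visitor.seen


print(collect(NameCollector(), "a: b"))
print(collect(LegacyNameCollector(), "a: b"))
```

With the fixed default the collector sees both `b` and `a`; with the legacy default the annotation `b` is walked but never passed to the overridden `visit_expr`, so only `a` is recorded, which matches the behavior reported in the commit message.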
James Berry	f194572be8	Remove visit_arg_with_default (#5265 ) ## Summary This is a follow up to #5221. Turns out it was easy to restructure the visitor to get the right order, I'm just dumb 🤷‍♂️ I've removed `visit_arg_with_default` entirely from the `Visitor`, although it still exists as part of `preorder::Visitor`.	2023-06-21 16:00:24 -04:00
James Berry	9b5fb8f38f	Fix AST visitor traversal order (#5221 ) ## Summary According to the AST visitor documentation, the AST visitor "visits all nodes in the AST recursively in evaluation-order". However, the current traversal fails to meet this specification in a few places. ### Function traversal ```python order = [] @(order.append("decorator") or (lambda x: x)) def f( posonly: order.append("posonly annotation") = order.append("posonly default"), /, arg: order.append("arg annotation") = order.append("arg default"), *args: order.append("vararg annotation"), kwarg: order.append("kwarg annotation") = order.append("kwarg default"), **kwargs: order.append("kwarg annotation") ) -> order.append("return annotation"): pass print(order) ``` Executing the above snippet using CPython 3.10.6 prints the following result (formatted for readability): ```python [ 'decorator', 'posonly default', 'arg default', 'kwarg default', 'arg annotation', 'posonly annotation', 'vararg annotation', 'kwarg annotation', 'kwarg annotation', 'return annotation', ] ``` Here we can see that decorators are evaluated first, followed by argument defaults, and annotations are last. The current traversal of a function's AST does not align with this order. ### Annotated assignment traversal ```python order = [] x: order.append("annotation") = order.append("expression") print(order) ``` Executing the above snippet using CPython 3.10.6 prints the following result: ```python ['expression', 'annotation'] ``` Here we can see that an annotated assignment's annotation gets evaluated after the assignment's expression. The current traversal of an annotated assignment's AST does not align with this order. ## Why? I'm slowly working on #3946 and porting over some of the logic and tests from ssort. ssort is very sensitive to AST traversal order, so ensuring the utmost correctness here is important.
## Test Plan There don't seem to be existing tests for the AST visitor, so I didn't bother adding tests for these very subtle changes. However, this behavior will be captured in the tests for the PR which addresses #3946.	2023-06-21 14:40:58 -04:00
Micha Reiser	e520a3a721	Fix ArgWithDefault comments handling (#5204 )	2023-06-20 20:48:07 +00:00
Charlie Marsh	7bc33a8d5f	Remove identifier lexing in favor of parser ranges (#5195 ) ## Summary Now that all identifiers include ranges (#5194), we can remove a ton of this "custom lexing" code that we have to sketchily extract identifier ranges from source. ## Test Plan `cargo test`	2023-06-20 12:07:29 -04:00
Charlie Marsh	6331598511	Upgrade `RustPython` to access ranged names (#5194 ) ## Summary In https://github.com/astral-sh/RustPython-Parser/pull/8, we modified RustPython to include ranges for any identifiers that aren't `Expr::Name` (which already has an identifier). For example, the `e` in `except ValueError as e` was previously un-ranged. To extract its range, we had to do some lexing of our own. This change should improve performance and let us remove a bunch of code. ## Test Plan `cargo test`	2023-06-20 15:43:38 +00:00
Charlie Marsh	8e06140d1d	Remove continuations when deleting statements (#5198 ) ## Summary This PR modifies our statement deletion logic to delete any preceding continuation lines. For example, given: ```py x = 1; \ import os ``` We'll now rewrite to: ```py x = 1; ``` In addition, the logic can now handle multiple preceding continuations (which is unlikely, but valid).	2023-06-19 22:04:28 -04:00
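The rewrite described in #5198 can be approximated on a list of source lines. This is a simplified sketch under stated assumptions, not Ruff's deletion logic: `delete_statement_line` is a hypothetical helper, lines are assumed to carry no newline terminators, the deleted statement is assumed to occupy a single line, and any trailing backslash is treated as a continuation (a real implementation would have to ignore backslashes inside comments or strings).

```python
def delete_statement_line(lines, index):
    """Return a copy of `lines` without lines[index].

    Continuation backslashes that led into the deleted line are removed as
    well, so the remaining source does not end a statement with a dangling
    continuation.
    """
    kept = lines[:index]
    rest = lines[index + 1:]
    while kept and kept[-1].rstrip().endswith("\\"):
        without_backslash = kept[-1].rstrip()[:-1].rstrip()
        if without_backslash:
            # Real code precedes the backslash: keep it, drop the backslash.
            kept[-1] = without_backslash
            break
        # The line held only a backslash: drop it and keep looking upward.
        kept.pop()
    return kept + rest


print(delete_statement_line(["x = 1; \\", "import os"], 1))
print(delete_statement_line(["x = 1; \\", "\\", "import os", "y = 2"], 2))
```

The loop is what handles multiple preceding continuations: bare backslash lines are discarded one by one until a line with real content is reached.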
Charlie Marsh	36e01ad6eb	Upgrade RustPython (#5192 ) ## Summary This PR upgrades RustPython to pull in the changes to `Arguments` (zip defaults with their identifiers) and all the renames to `CmpOp` and friends.	2023-06-19 21:09:53 +00:00
Thomas de Zeeuw	e3c12764f8	Only use a single cache file per Python package (#5117 ) ## Summary This changes the caching design from one cache file per source file, to one cache file per package. This greatly reduces the number of cache files that are opened and written, while maintaining roughly the same (combined) size as bincode is very compact. Below are some very much not scientific performance tests. It uses projects/sources to check: * small.py: single, 31 bytes Python file with 2 errors. * test.py: single, 43k Python file with 8 errors. * fastapi: FastAPI repo, 1134 files checked, 0 errors. Source \| Before # files \| After # files \| Before size \| After size -------\|-------\|-------\|-------\|------- small.py \| 1 \| 1 \| 20 K \| 20 K test.py \| 1 \| 1 \| 60 K \| 60 K fastapi \| 1134 \| 518 \| 4.5 M \| 2.3 M One question that might come up is why fastapi still has 518 cache files and not 1? That is because this is using the existing package resolution, which sees examples, docs, etc. as separate from the "main" source code (in the fastapi directory in the repo). In the future it might be worth considering switching to a one cache file per repo strategy. This new design is not perfect and does have a number of known issues. First, like the old design it doesn't remove the cache for a source file that has been (re)moved until `ruff clean` is called. Second, this currently uses a large mutex around the mutation of the package cache (e.g. inserting result). This could be (or become) a bottleneck. It's future work to test and improve this (if needed). Third, currently the packages are opened and stored in a sequential loop; this could be done in parallel. This is also future work. ## Test Plan Run `ruff check` (with caching enabled) twice on any Python source code and it should produce the same results.	2023-06-19 17:46:13 +02:00
Charlie Marsh	2b82caa163	Detect continuations at start-of-file (#5173 ) ## Summary Given: ```python \ import os ``` Deleting `import os` leaves a syntax error: a file can't end in a continuation. We have code to handle this case, but it failed to pick up continuations at the _very start_ of a file. Closes #5156.	2023-06-19 00:09:02 -04:00
Charlie Marsh	fab2a4adf7	Use `matches!` for insecure hash rule (#5141 )	2023-06-16 04:18:32 +00:00
Charlie Marsh	5ea3e42513	Always use identifier ranges to store bindings (#5110 ) ## Summary At present, when we store a binding, we include a `TextRange` alongside it. The `TextRange` _sometimes_ matches the exact range of the identifier to which the `Binding` is linked, but... not always. For example, given: ```python x = 1 ``` The binding we create _will_ use the range of `x`, because the left-hand side is an `Expr::Name`, which has a valid range on it. However, given: ```python try: pass except ValueError as e: pass ``` When we create a binding for `e`, we don't have a `TextRange`... The AST doesn't give us one. So we end up extracting it via lexing. This PR extends that pattern to the rest of the binding kinds, to ensure that whenever we create a binding, we always use the range of the bound name. This leads to better diagnostics in cases like pattern matching, whereby the diagnostic for "unused variable `x`" here used to include `*x`, instead of just `x`: ```python def f(provided: int) -> int: match provided: case [_, *x]: pass ``` This is _also_ required for symbol renames, since we track writes as bindings -- so we need to know the ranges of the bound symbols. By storing these bindings precisely, we can also remove the `binding.trimmed_range` abstraction -- since bindings already use the "trimmed range". To implement this behavior, I took some of our existing utilities (like the code we had for `except ValueError as e` above), migrated them from a full lexer to a zero-allocation lexer that _only_ identifies "identifiers", and moved the behavior into a trait, so we can now do `stmt.identifier(locator)` to get the range for the identifier. Honestly, we might end up discarding much of this if we decide to put ranges on all identifiers (https://github.com/astral-sh/RustPython-Parser/pull/8).
But even if we do, this will _still_ be a good change, because the lexer introduced here is useful beyond names (e.g., we use it find the `except` keyword in an exception handler, to find the `else` after a `for` loop, and so on). So, I'm fine committing this even if we end up changing our minds about the right approach. Closes #5090. ## Benchmarks No significant change, with one statistically significant improvement (-2.1654% on `linter/all-rules/large/dataset.py`): ``` linter/default-rules/numpy/globals.py time: [73.922 µs 73.955 µs 73.986 µs] thrpt: [39.882 MiB/s 39.898 MiB/s 39.916 MiB/s] change: time: [-0.5579% -0.4732% -0.3980%] (p = 0.00 < 0.05) thrpt: [+0.3996% +0.4755% +0.5611%] Change within noise threshold. Found 6 outliers among 100 measurements (6.00%) 4 (4.00%) low severe 1 (1.00%) low mild 1 (1.00%) high mild linter/default-rules/pydantic/types.py time: [1.4909 ms 1.4917 ms 1.4926 ms] thrpt: [17.087 MiB/s 17.096 MiB/s 17.106 MiB/s] change: time: [+0.2140% +0.2741% +0.3392%] (p = 0.00 < 0.05) thrpt: [-0.3380% -0.2734% -0.2136%] Change within noise threshold. Found 4 outliers among 100 measurements (4.00%) 3 (3.00%) high mild 1 (1.00%) high severe linter/default-rules/numpy/ctypeslib.py time: [688.97 µs 691.34 µs 694.15 µs] thrpt: [23.988 MiB/s 24.085 MiB/s 24.168 MiB/s] change: time: [-1.3282% -0.7298% -0.1466%] (p = 0.02 < 0.05) thrpt: [+0.1468% +0.7351% +1.3461%] Change within noise threshold. Found 15 outliers among 100 measurements (15.00%) 1 (1.00%) low mild 2 (2.00%) high mild 12 (12.00%) high severe linter/default-rules/large/dataset.py time: [3.3872 ms 3.4032 ms 3.4191 ms] thrpt: [11.899 MiB/s 11.954 MiB/s 12.011 MiB/s] change: time: [-0.6427% -0.2635% +0.0906%] (p = 0.17 > 0.05) thrpt: [-0.0905% +0.2642% +0.6469%] No change in performance detected. 
Found 20 outliers among 100 measurements (20.00%) 1 (1.00%) low severe 2 (2.00%) low mild 4 (4.00%) high mild 13 (13.00%) high severe linter/all-rules/numpy/globals.py time: [148.99 µs 149.21 µs 149.42 µs] thrpt: [19.748 MiB/s 19.776 MiB/s 19.805 MiB/s] change: time: [-0.7340% -0.5068% -0.2778%] (p = 0.00 < 0.05) thrpt: [+0.2785% +0.5094% +0.7395%] Change within noise threshold. Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) low mild 1 (1.00%) high severe linter/all-rules/pydantic/types.py time: [3.0362 ms 3.0396 ms 3.0441 ms] thrpt: [8.3779 MiB/s 8.3903 MiB/s 8.3997 MiB/s] change: time: [-0.0957% +0.0618% +0.2125%] (p = 0.45 > 0.05) thrpt: [-0.2121% -0.0618% +0.0958%] No change in performance detected. Found 11 outliers among 100 measurements (11.00%) 1 (1.00%) low severe 3 (3.00%) low mild 5 (5.00%) high mild 2 (2.00%) high severe linter/all-rules/numpy/ctypeslib.py time: [1.6879 ms 1.6894 ms 1.6909 ms] thrpt: [9.8478 MiB/s 9.8562 MiB/s 9.8652 MiB/s] change: time: [-0.2279% -0.0888% +0.0436%] (p = 0.18 > 0.05) thrpt: [-0.0435% +0.0889% +0.2284%] No change in performance detected. Found 5 outliers among 100 measurements (5.00%) 4 (4.00%) low mild 1 (1.00%) high severe linter/all-rules/large/dataset.py time: [7.1520 ms 7.1586 ms 7.1654 ms] thrpt: [5.6777 MiB/s 5.6831 MiB/s 5.6883 MiB/s] change: time: [-2.5626% -2.1654% -1.7780%] (p = 0.00 < 0.05) thrpt: [+1.8102% +2.2133% +2.6300%] Performance has improved. Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) low mild 1 (1.00%) high mild ```	2023-06-15 18:43:19 +00:00
konstin	66089e1a2e	Fix a number of formatter errors from the cpython repository (#5089 ) ## Summary This fixes a number of problems in the formatter that showed up with various files in the [cpython](https://github.com/python/cpython) repository. These problems surfaced as unstable formatting and invalid code. This is not the entirety of problems discovered through cpython, but a big enough chunk to separate it. Individual fixes are generally individual commits. They were discovered with #5055, which I update as I work through the output. ## Test Plan I added regression tests with links to cpython for each entry, except for the two stubs that also got comment stubs since they'll be implemented properly later.	2023-06-15 11:24:14 +00:00
Charlie Marsh	716cab2f19	Run `rustfmt` on nightly to clean up erroneous comments (#5106 ) ## Summary This PR runs `rustfmt` with a few nightly options as a one-time fix to catch some malformatted comments. I ended up just running with: ```toml condense_wildcard_suffixes = true edition = "2021" max_width = 100 normalize_comments = true normalize_doc_attributes = true reorder_impl_items = true unstable_features = true use_field_init_shorthand = true ``` These all seem like reasonable things to fix, so I may as well while I'm here.	2023-06-15 00:19:05 +00:00
Charlie Marsh	aa41ffcfde	Add `BindingKind` variants to represent deleted bindings (#5071 ) ## Summary Our current mechanism for handling deletions (e.g., `del x`) is to remove the symbol from the scope's `bindings` table. This "does the right thing", in that if we then reference a deleted symbol, we're able to determine that it's unbound -- but it causes a variety of problems, mostly in that it makes certain bindings and references unreachable after-the-fact. Consider: ```python x = 1 print(x) del x ``` If we analyze this code _after_ running the semantic model over the AST, we'll have no way of knowing that `x` was ever introduced in the scope, much less that it was bound to a value, read, and then deleted -- because we effectively erased `x` from the model entirely when we hit the deletion. In practice, this will make it impossible for us to support local symbol renames. It also means that certain rules that we want to move out of the model-building phase and into the "check dead scopes" phase wouldn't work today, since we'll have lost important information about the source code. This PR introduces two new `BindingKind` variants to model deletions: - `BindingKind::Deletion`, which represents `x = 1; del x`. - `BindingKind::UnboundException`, which represents: ```python try: 1 / 0 except Exception as e: pass ``` In the latter case, `e` gets unbound after the exception handler (assuming it's triggered), so we want to handle it similarly to a deletion. The main challenge here is auditing all of our existing `Binding` and `Scope` usages to understand whether they need to accommodate deletions or otherwise behave differently. If you look one commit back on this branch, you'll see that the code is littered with `NOTE(charlie)` comments that describe the reasoning behind changing (or not) each of those call sites. I've also augmented our test suite in preparation for this change over a few prior PRs. 
### Alternatives As an alternative, I considered introducing a flag to `BindingFlags`, like `BindingFlags::UNBOUND`, and setting that at the appropriate time. This turned out to be a much more difficult change, because we tend to match on `BindingKind` all over the place (e.g., we have a bunch of code blocks that only run when a `BindingKind` is `BindingKind::Importation`). As a result, introducing these new `BindingKind` variants requires only a few changes at the client sites. Adding a flag would've required a much wider-reaching change.	2023-06-14 09:27:24 -04:00


468 Commits