Python/ruff - ruff - Gitea: Git with a cup of tea

Commit Graph

Author	SHA1	Message	Date
Brent Westbrook	144484d46c	Refactor semantic syntax error scope handling (#17314 ) ## Summary Based on the discussion in https://github.com/astral-sh/ruff/pull/17298#discussion_r2033975460, we decided to move the scope handling out of the `SemanticSyntaxChecker` and into the `SemanticSyntaxContext` trait. This PR implements that refactor by: - Reverting all of the `Checkpoint` and `in_async_context` code in the `SemanticSyntaxChecker` - Adding four new methods to the `SemanticSyntaxContext` trait - `in_async_context`: matches `SemanticModel::in_async_context` and only detects the nearest enclosing function - `in_sync_comprehension`: uses the new `is_async` tracking on `Generator` scopes to detect any enclosing sync comprehension - `in_module_scope`: reports whether we're at the top-level scope - `in_notebook`: reports whether we're in a Jupyter notebook - In-lining the `TestContext` directly into the `SemanticSyntaxCheckerVisitor` - This allows modifying the context as the visitor traverses the AST, which wasn't possible before One potential question here is "why not add a single method returning a `Scope` or `Scopes` to the context?" The main reason is that the `Scope` type is defined in the `ruff_python_semantic` crate, which is not currently a dependency of the parser. It also doesn't appear to be used in red-knot. So it seemed best to use these more granular methods instead of trying to access `Scope` in `ruff_python_parser` (and red-knot). ## Test Plan Existing parser and linter tests.	2025-04-09 14:23:29 -04:00
Brent Westbrook	2fbc4d577e	[syntax-errors] Document behavior of `global` declarations in `try` nodes before 3.13 (#17285 ) Summary -- This PR extends the documentation of the `LoadBeforeGlobalDeclaration` check to specify the behavior on versions of Python before 3.13. Namely, on Python 3.12, the `else` clause of a `try` statement is visited before the `except` handlers: ```pycon Python 3.12.9 (main, Feb 12 2025, 14:50:50) [Clang 19.1.6 ] on linux Type "help", "copyright", "credits" or "license" for more information. >>> a = 10 >>> def g(): ... try: ... 1 / 0 ... except: ... a = 1 ... else: ... global a ... >>> def f(): ... try: ... pass ... except: ... global a ... else: ... print(a) ... File "<stdin>", line 5 SyntaxError: name 'a' is used prior to global declaration ``` The order is swapped on 3.13 (see [CPython#111123](https://github.com/python/cpython/issues/111123)): ```pycon Python 3.13.2 (main, Feb 5 2025, 08:05:21) [GCC 14.2.1 20250128] on linux Type "help", "copyright", "credits" or "license" for more information. >>> a = 10 ... def g(): ... try: ... 1 / 0 ... except: ... a = 1 ... else: ... global a ... File "<python-input-0>", line 8 global a ^^^^^^^^ SyntaxError: name 'a' is assigned to before global declaration >>> def f(): ... try: ... pass ... except: ... global a ... else: ... print(a) ... >>> ``` The current implementation of PLE0118 is correct for 3.13 but not 3.12: [playground](https://play.ruff.rs/d7467ea6-f546-4a76-828f-8e6b800694c9) (it flags the first case regardless of Python version). We decided to maintain this incorrect diagnostic for Python versions before 3.13 because the pre-3.13 behavior is very unintuitive and confirmed to be a bug, although the bug fix was not backported to earlier versions. This can lead to false positives and false negatives for pre-3.13 code, but we also expect that to be very rare, as demonstrated by the ecosystem check (before the version-dependent check was reverted here). 
Test Plan -- N/a	2025-04-09 12:54:21 -04:00
Brent Westbrook	058439d5d3	[syntax-errors] Async comprehension in sync comprehension (#17177 ) Summary -- Detect async comprehensions nested in sync comprehensions in async functions before Python 3.11, when this was [changed]. The actual logic of this rule is very straightforward, but properly tracking the async scopes took a bit of work. An alternative to the current approach is to offload the `in_async_context` check into the `SemanticSyntaxContext` trait, but that actually required much more extensive changes to the `TestContext` and also to ruff's semantic model, as you can see in the changes up to 31554b473507034735bd410760fde6341d54a050. This version has the benefit of mostly centralizing the state tracking in `SemanticSyntaxChecker`, although there was some subtlety around deferred function body traversal that made the changes to `Checker` more intrusive too (hence the new linter test). The `Checkpoint` struct/system is obviously overkill for now since it's only tracking a single `bool`, but I thought it might be more useful later. [changed]: https://github.com/python/cpython/issues/77527 Test Plan -- New inline tests and a new linter integration test.	2025-04-08 12:50:52 -04:00
Brent Westbrook	0891689d2f	[syntax-errors] Check annotations in annotated assignments (#17283 ) Summary -- This PR extends the checks in #17101 and #17282 to annotated assignments after Python 3.13. Currently stacked on #17282 to include `await`. Test Plan -- New inline tests. These are simpler than the other cases because there's no place to put generics.	2025-04-08 08:56:25 -04:00
Brent Westbrook	127a45622f	[syntax-errors] Extend annotation checks to `await` (#17282 ) Summary -- This PR extends the changes in #17101 to include `await` in the same positions. I also renamed the `valid_annotation_function` test to include `_py313` and explicitly passed a Python version to contrast it with the `_py314` version. Test Plan -- New test cases added to existing files.	2025-04-08 08:55:43 -04:00
Micha Reiser	3150812ac4	[red-knot] Add 'Format document' to playground (#17217 ) ## Summary This is more "because we can" than something we need. But since we're already building an "almost IDE" ## Test Plan https://github.com/user-attachments/assets/3a4bdad1-ba32-455a-9909-cfeb8caa1b28	2025-04-07 09:26:03 +02:00
Brent Westbrook	acc5662e8b	[syntax-errors] Allow `yield` in base classes and annotations (#17206 ) Summary -- This PR fixes the issue pointed out by @JelleZijlstra in https://github.com/astral-sh/ruff/pull/17101#issuecomment-2777480204. Namely, I conflated two very different errors from CPython: ```pycon >>> def m[T](x: (yield from 1)): ... File "<python-input-310>", line 1 def m[T](x: (yield from 1)): ... ^^^^^^^^^^^^ SyntaxError: yield expression cannot be used within the definition of a generic >>> def m(x: (yield from 1)): ... File "<python-input-311>", line 1 def m(x: (yield from 1)): ... ^^^^^^^^^^^^ SyntaxError: 'yield from' outside function >>> def outer(): ... def m(x: (yield from 1)): ... ... >>> ``` I thought the second error was the same as the first, but `yield` (and `yield from`) is actually valid in this position when inside a function scope. The same is true for base classes, as pointed out in the original comment. We don't currently raise an error for `yield` outside of a function, but that should be handled separately. On the upside, this had the benefit of removing the `InvalidExpressionPosition::BaseClass` variant and the `allow_named_expr` field from the visitor because they were both no longer used. Test Plan -- Updated inline tests.	2025-04-04 13:48:28 -04:00
Micha Reiser	a4ba10ff0a	[red-knot] Add basic on-hover to playground and LSP (#17057 ) ## Summary Implement a very basic hover in the playground and LSP. It's basic, because it only shows the type on-hover. Most other LSPs also show: * The signature of the symbol beneath the cursor. E.g. `class Test(a:int, b:int)` (we want something like `54f7da25f9/packages/pyright-internal/src/analyzer/typeEvaluator.ts (L21929-L22129)`) * The symbols' documentation * Do more fancy markdown rendering I decided to defer these features for now because it requires new semantic APIs (similar to goto definition), and investing in fancy rendering only makes sense once we have the relevant data. Closes [#16826](https://github.com/astral-sh/ruff/issues/16826) ## Test Plan https://github.com/user-attachments/assets/044aeee4-58ad-4d4e-9e26-ac2a712026be https://github.com/user-attachments/assets/4a1f4004-2982-4cf2-9dfd-cb8b84ff2ecb	2025-04-04 08:13:43 +02:00
Brent Westbrook	4f924bb975	[minor] Fix extra semicolon for clippy (#17188 )	2025-04-03 18:17:00 -04:00
Brent Westbrook	c2b2e42ad3	[syntax-errors] Invalid syntax in annotations (#17101 ) Summary -- This PR detects the use of invalid syntax in annotation scopes, including `yield` and `yield from` expressions and named expressions. I combined a few different types of CPython errors here, but I think the resulting error messages still make sense and are even preferable to what CPython gives. For example, we report `yield expression cannot be used in a type annotation` for both of these: ```pycon >>> def f[T](x: (yield 1)): ... File "<python-input-26>", line 1 def f[T](x: (yield 1)): ... ^^^^^^^ SyntaxError: yield expression cannot be used within the definition of a generic >>> def foo() -> (yield x): ... File "<python-input-28>", line 1 def foo() -> (yield x): ... ^^^^^^^ SyntaxError: 'yield' outside function ``` Fixes https://github.com/astral-sh/ruff/issues/11118. Test Plan -- New inline tests, along with some updates to existing tests.	2025-04-03 17:56:55 -04:00
Brent Westbrook	24b1b1d52c	[syntax-errors] Duplicate attributes in match class pattern (#17186 ) Summary -- Detects duplicate attributes in a `match` class pattern: ```python match x: case Class(x=1, x=2): ... ``` which are more analogous to the similar check for mapping patterns than to the multiple assignments rule. I also realized that both this and the mapping check would only work on top-level patterns, despite the possibility that they can be nested inside other patterns: ```python match x: case [{"x": 1, "x": 2}]: ... # false negative in the old version ``` and moved these checks into the recursive pattern visitor instead. I also tidied up some of the names like the `multiple_case_assignment` function and the `MultipleCaseAssignmentVisitor`, which are now doing more than checking for multiple assignments. Test Plan -- New inline tests for both classes and mappings.	2025-04-03 17:55:37 -04:00
Brent Westbrook	6a07dd227d	[syntax-errors] Fix multiple assignment for class keyword argument (#17184 ) Summary -- Fixes #17181. The cases being tested with multiple keys being equal are actually a slightly different error, more like the error for `MatchMapping` than like the other multiple assignment errors: ```pycon >>> match x: ... case Class(x=x, x=x): ... ... File "<python-input-249>", line 2 case Class(x=x, x=x): ... ^ SyntaxError: attribute name repeated in class pattern: x >>> match x: ... case {"x": 1, "x": 2}: ... ... File "<python-input-251>", line 2 case {"x": 1, "x": 2}: ... ^^^^^^^^^^^^^^^^ SyntaxError: mapping pattern checks duplicate key ('x') >>> match x: ... case [x, x]: ... ... File "<python-input-252>", line 2 case [x, x]: ... ^ SyntaxError: multiple assignments to name 'x' in pattern ``` This PR just stops the false positive reported in the issue, but I will quickly follow it up with a new rule (or possibly combined with the mapping rule) catching the repeated attributes separately. Test Plan -- New inline `test_ok` and updating the `test_err` cases to have duplicate values instead of keys.	2025-04-03 17:32:39 -04:00
Micha Reiser	8a4158c5f8	Upgrade to Rust 1.86 and bump MSRV to 1.84 (#17171 ) ## Summary I decided to disable the new [`needless_continue`](https://rust-lang.github.io/rust-clippy/master/index.html#needless_continue) rule because I often found the explicit `continue` more readable than an empty block or having to invert the condition of another branch. ## Test Plan `cargo test` --------- Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>	2025-04-03 15:59:44 +00:00
Brent Westbrook	6e2b8f9696	[syntax-errors] Detect duplicate keys in `match` mapping patterns (#17129 ) Summary -- Detects duplicate literals in `match` mapping keys. This PR also adds a `source` method to `SemanticSyntaxContext` to display the duplicated key in the error message by slicing out its range. Test Plan -- New inline tests.	2025-04-03 10:22:37 -04:00
Brent Westbrook	d382065f8a	[syntax-errors] Reimplement PLE0118 (#17135 ) Summary -- This PR reimplements [load-before-global-declaration (PLE0118)](https://docs.astral.sh/ruff/rules/load-before-global-declaration/) as a semantic syntax error. I added a `global` method to the `SemanticSyntaxContext` trait to make this very easy, at least in ruff. Does red-knot have something similar? If this approach will also work in red-knot, I think some of the other PLE rules are also compile-time errors in CPython, PLE0117 in particular. 0115 and 0116 also mention `SyntaxError`s in their docs, but I haven't confirmed them in the REPL yet. Test Plan -- Existing linter tests for PLE0118. I think this actually can't be tested very easily in an inline test because the `TestContext` doesn't have a real way to track globals. --------- Co-authored-by: Micha Reiser <micha@reiser.io>	2025-04-02 13:03:44 +00:00
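As a rough illustration of what a load-before-global check looks for, the following Python sketch flags names that appear in a function before a later `global` declaration of the same name. It is not ruff's implementation: `loads_before_global` is a hypothetical helper, it orders events by plain source position, and it does not separate nested scopes.

```python
import ast


def loads_before_global(source: str) -> list[tuple[str, int]]:
    """Return (name, line) pairs for names used before a later `global` declaration."""
    errors = []
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        # Collect name uses and `global` statements found in the function body.
        events = []
        for stmt in func.body:
            for node in ast.walk(stmt):
                if isinstance(node, ast.Name):
                    events.append((node.lineno, node.col_offset, "use", [node.id]))
                elif isinstance(node, ast.Global):
                    events.append((node.lineno, node.col_offset, "global", node.names))
        first_use = {}
        # Replay the events in source order.
        for line, _col, kind, names in sorted(events, key=lambda e: (e[0], e[1])):
            for name in names:
                if kind == "use":
                    first_use.setdefault(name, line)
                elif name in first_use:
                    errors.append((name, first_use[name]))
    return errors


print(loads_before_global("a = 10\ndef f():\n    print(a)\n    global a\n"))
```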
Brent Westbrook	d45593288f	[syntax-errors] Starred expressions in return, yield, and for (#17134 ) Summary -- Fixes https://github.com/astral-sh/ruff/issues/16520 by flagging single, starred expressions in `return`, `yield`, and `for` statements. I thought `yield from` would also be included here, but that error is emitted by the CPython parser: ```pycon >>> ast.parse("def f(): yield from *x") Traceback (most recent call last): File "<python-input-214>", line 1, in <module> ast.parse("def f(): yield from *x") ~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/lib/python3.13/ast.py", line 54, in parse return compile(source, filename, mode, flags, _feature_version=feature_version, optimize=optimize) File "<unknown>", line 1 def f(): yield from *x ^ SyntaxError: invalid syntax ``` And we also already catch it in our parser. Test Plan -- New inline tests and updates to existing tests.	2025-04-02 08:38:25 -04:00
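The three positions covered by #17134 can be sketched with Python's `ast` module, which parses a lone starred expression in these places and leaves the error to the compiler. The helper name `single_starred` is hypothetical and this is only an approximation of the check, not ruff's code.

```python
import ast


def single_starred(source: str) -> list[int]:
    """Return the lines of lone starred expressions in return, yield, and for."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Return, ast.Yield)):
            value = node.value
        elif isinstance(node, (ast.For, ast.AsyncFor)):
            value = node.iter
        else:
            continue
        # `return *x, y` produces a Tuple, so only a bare `*x` is a Starred here.
        if isinstance(value, ast.Starred):
            lines.append(value.lineno)
    return sorted(lines)


example = "def f(x):\n    for _ in *x:\n        yield *x\n    return *x\n"
print(single_starred(example))
```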
Micha Reiser	2ae39edccf	[red-knot] Goto type definition (#16901 ) ## Summary Implement basic Goto type definition support for Red Knot's LSP. This PR also builds the foundation for other LSP operations. E.g., Goto definition, hover, etc., should be able to reuse some, if not most, logic introduced in this PR. The basic steps of resolving the type definitions are: 1. Find the closest token for the cursor offset. This is a bit more subtle than I first anticipated because the cursor could be positioned right between the callee and the `(` in `call(test)`, in which case we want to resolve the type for `call`. 2. Find the node with the minimal range that fully encloses the token found in 1. I somewhat suspect that 1 and 2 could be done at the same time but it complicated things because we also need to compute the spine (ancestor chain) for the node and there's no guarantee that the found nodes have the same ancestors 3. Reduce the node found in 2. to a node that is a valid goto target. This may require traversing upwards to e.g. find the closest expression. 4. Resolve the type for the goto target 5. Resolve the location for the type, return it to the LSP ## Design decisions The current implementation navigates to the inferred type. I think this is what we want because it means that it correctly accounts for narrowing (in which case we want to go to the narrowed type because that's the value's type at the given position). However, it does have the downside that Goto type definition doesn't work whenever we infer `T & Unknown` because intersection types aren't supported. I'm not sure what to do about this specific case, other than maybe ignoring `Unknown` in Goto type definition if the type is an intersection? ## Known limitations * Types defined in the vendored typeshed aren't supported because the client can't open files from the red knot binary (we can either implement our own file protocol and handler OR extract the typeshed files and point there). 
See https://github.com/astral-sh/ruff/issues/17041 * Red Knot only exposes an API to get types for expressions and definitions. However, there are many other nodes with identifiers that can have a type (e.g. go to type of a globals statement, match patterns, ...). We can add support for those in separate PRs (after we figure out how to query the types from the semantic model). See https://github.com/astral-sh/ruff/issues/17113 * We should have a higher-level API for the LSP that doesn't directly call semantic queries. I intentionally decided not to design that API just yet. ## Test plan https://github.com/user-attachments/assets/fa077297-a42d-4ec8-b71f-90c0802b4edb Goto type definition on a union <img width="1215" alt="Screenshot 2025-04-01 at 13 02 55" src="https://github.com/user-attachments/assets/689cabcc-4a86-4a18-b14a-c56f56868085" /> Note: I recorded this using a custom typeshed path so that navigating to builtins works.	2025-04-02 12:12:48 +00:00
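Steps 1 and 2 of #16901 (locating the smallest node around the cursor) can be approximated in Python as a single search over the AST. This is a sketch under assumptions, not Red Knot's token-based implementation: `covering_node` is a hypothetical helper, positions are 1-based lines with 0-based columns as in the `ast` module, and the end of a node's range is treated as inclusive so that a cursor sitting right after `call` in `call(test)` still resolves to `call`.

```python
import ast


def covering_node(source: str, line: int, col: int):
    """Return the smallest AST node whose range contains (line, col), or None."""
    best = None
    best_span = None
    for node in ast.walk(ast.parse(source)):
        if getattr(node, "end_lineno", None) is None:
            continue  # nodes such as Module or Load carry no position
        start = (node.lineno, node.col_offset)
        end = (node.end_lineno, node.end_col_offset)
        if start <= (line, col) <= end:
            span = (end[0] - start[0], end[1] - start[1])
            if best_span is None or span < best_span:
                best, best_span = node, span
    return best


node = covering_node("call(test)", 1, 4)  # cursor between `call` and `(`
print(type(node).__name__, node.id)
```

A real implementation would also record the ancestor chain of the node found, since step 3 may need to walk upwards to a valid goto target.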
renovate[bot]	a192d96880	Update pre-commit dependencies (#17073 ) This PR contains the following updates: \| Package \| Type \| Update \| Change \| \|---\|---\|---\|---\| \| [abravalheri/validate-pyproject](https://redirect.github.com/abravalheri/validate-pyproject) \| repository \| patch \| `v0.24` -> `v0.24.1` \| \| [astral-sh/ruff-pre-commit](https://redirect.github.com/astral-sh/ruff-pre-commit) \| repository \| patch \| `v0.11.0` -> `v0.11.2` \| \| [crate-ci/typos](https://redirect.github.com/crate-ci/typos) \| repository \| minor \| `v1.30.2` -> `v1.31.0` \| \| [python-jsonschema/check-jsonschema](https://redirect.github.com/python-jsonschema/check-jsonschema) \| repository \| minor \| `0.31.3` -> `0.32.1` \| \| [woodruffw/zizmor-pre-commit](https://redirect.github.com/woodruffw/zizmor-pre-commit) \| repository \| patch \| `v1.5.1` -> `v1.5.2` \| --- > [!WARNING] > Some dependencies could not be looked up. Check the Dependency Dashboard for more information. Note: The `pre-commit` manager in Renovate is not supported by the `pre-commit` maintainers or community. Please do not report any problems there, instead [create a Discussion in the Renovate repository](https://redirect.github.com/renovatebot/renovate/discussions/new) if you have any questions. 
--- ### Release Notes <details> <summary>abravalheri/validate-pyproject (abravalheri/validate-pyproject)</summary> ### [`v0.24.1`](https://redirect.github.com/abravalheri/validate-pyproject/releases/tag/v0.24.1) [Compare Source](https://redirect.github.com/abravalheri/validate-pyproject/compare/v0.24...v0.24.1) #### What's Changed - Fixed multi plugin id was read from the wrong place by [@henryiii](https://redirect.github.com/henryiii) in [https://github.com/abravalheri/validate-pyproject/pull/240](https://redirect.github.com/abravalheri/validate-pyproject/pull/240) - Implemented alternative plugin sorting, [https://github.com/abravalheri/validate-pyproject/pull/243](https://redirect.github.com/abravalheri/validate-pyproject/pull/243) Full Changelog: https://github.com/abravalheri/validate-pyproject/compare/v0.24...v0.24.1 </details> <details> <summary>astral-sh/ruff-pre-commit (astral-sh/ruff-pre-commit)</summary> ### [`v0.11.2`](https://redirect.github.com/astral-sh/ruff-pre-commit/releases/tag/v0.11.2) [Compare Source](https://redirect.github.com/astral-sh/ruff-pre-commit/compare/v0.11.1...v0.11.2) See: https://github.com/astral-sh/ruff/releases/tag/0.11.2 ### [`v0.11.1`](https://redirect.github.com/astral-sh/ruff-pre-commit/releases/tag/v0.11.1) [Compare Source](https://redirect.github.com/astral-sh/ruff-pre-commit/compare/v0.11.0...v0.11.1) See: https://github.com/astral-sh/ruff/releases/tag/0.11.1 </details> <details> <summary>crate-ci/typos (crate-ci/typos)</summary> ### [`v1.31.0`](https://redirect.github.com/crate-ci/typos/releases/tag/v1.31.0) [Compare Source](https://redirect.github.com/crate-ci/typos/compare/v1.30.3...v1.31.0) #### \[1.31.0] - 2025-03-28 ##### Features - Updated the dictionary with the [March 2025](https://redirect.github.com/crate-ci/typos/issues/1266) changes ### [`v1.30.3`](https://redirect.github.com/crate-ci/typos/releases/tag/v1.30.3) [Compare Source](https://redirect.github.com/crate-ci/typos/compare/v1.30.2...v1.30.3) #### 
\[1.30.3] - 2025-03-24 ##### Features - Support detecting `go.work` and `go.work.sum` files </details> <details> <summary>python-jsonschema/check-jsonschema (python-jsonschema/check-jsonschema)</summary> ### [`v0.32.1`](https://redirect.github.com/python-jsonschema/check-jsonschema/blob/HEAD/CHANGELOG.rst#0321) [Compare Source](https://redirect.github.com/python-jsonschema/check-jsonschema/compare/0.32.0...0.32.1) - Fix the `check-meltano` hook to use `types_or`. Thanks :user:`edgarrmondragon`! (:pr:`543`) ### [`v0.32.0`](https://redirect.github.com/python-jsonschema/check-jsonschema/blob/HEAD/CHANGELOG.rst#0320) [Compare Source](https://redirect.github.com/python-jsonschema/check-jsonschema/compare/0.31.3...0.32.0) - Update vendored schemas: circle-ci, compose-spec, dependabot, github-workflows, gitlab-ci, mergify, renovate, taskfile (2025-03-25) - Add Meltano schema and pre-commit hook. Thanks :user:`edgarrmondragon`! (:issue:`540`) - Add Snapcraft schema and pre-commit hook. Thanks :user:`fabolhak`! (:issue:`535`) </details> <details> <summary>woodruffw/zizmor-pre-commit (woodruffw/zizmor-pre-commit)</summary> ### [`v1.5.2`](https://redirect.github.com/woodruffw/zizmor-pre-commit/releases/tag/v1.5.2) [Compare Source](https://redirect.github.com/woodruffw/zizmor-pre-commit/compare/v1.5.1...v1.5.2) See: https://github.com/woodruffw/zizmor/releases/tag/v1.5.2 </details> --- ### Configuration 📅 Schedule: Branch creation - "before 4am on Monday" (UTC), Automerge - At any time (no schedule defined). 🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied. ♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 👻 Immortal: This PR will be recreated if closed unmerged. Get [config help](https://redirect.github.com/renovatebot/renovate/discussions) if that's undesired. 
--- This PR was generated by [Mend Renovate](https://mend.io/renovate/). View the [repository job log](https://developer.mend.io/github/astral-sh/ruff). --------- Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com> Co-authored-by: Micha Reiser <micha@reiser.io>	2025-03-31 07:42:15 +00:00
Brent Westbrook	ab1011ce70	[syntax-errors] Single starred assignment target (#17024 ) Summary -- Detects starred assignment targets outside of tuples and lists like `*a = (1,)`. This PR only considers assignment statements. I also checked annotated assignment statements, but these give a separate error that we already catch, so I think they're okay not to consider: ```pycon >>> *a: list[int] = [] File "<python-input-72>", line 1 *a: list[int] = [] ^ SyntaxError: invalid syntax ``` Fixes #13759 Test Plan -- New inline tests, plus a new `SemanticSyntaxError` for an existing parser test. I also removed a now-invalid case from an otherwise-valid test fixture. The new semantic error leads to two errors for the case below: ```python *foo() = 42 ``` but this matches [pyright] too. [pyright]: https://pyright-play.net/?code=FQMw9mAUCUAEC8sAsAmAUEA	2025-03-29 12:35:47 -04:00
Brent Westbrook	a0819f0c51	[syntax-errors] Store to or delete `__debug__` (#16984 ) Summary -- Detect setting or deleting `__debug__`. Assigning to `__debug__` was a `SyntaxError` on the earliest version I tested (3.8). Deleting `__debug__` was made a `SyntaxError` in [BPO 45000], which said it was resolved in Python 3.10. However, `del __debug__` was also a runtime error (`NameError`) when I tested in Python 3.9.6, so I thought it was worth including 3.9 in this check. I don't think it was ever a good idea to try `del __debug__`, so I think there's also an argument for not making this version-dependent at all. That would only simplify the implementation very slightly, though. [BPO 45000]: https://github.com/python/cpython/issues/89163 Test Plan -- New inline tests. This also required adding a `PythonVersion` field to the `TestContext` that could be taken from the inline `ParseOptions` and making the version field on the options accessible.	2025-03-29 12:07:20 -04:00
Brent Westbrook	d70a3e6753	[syntax-errors] Multiple assignments in `case` pattern (#16957 ) Summary -- This PR detects multiple assignments to the same name in `case` patterns by recursively visiting each pattern. Test Plan -- New inline tests.	2025-03-26 13:02:42 -04:00
Brent Westbrook	5697d21fca	[syntax-errors] Irrefutable case pattern before final case (#16905 ) Summary -- Detects irrefutable `match` cases before the final case using a modified version of the existing `Pattern::is_irrefutable` method from the AST crate. The modified method helps to retrieve a more precise diagnostic range to match what Python 3.13 shows in the REPL. Test Plan -- New inline tests, as well as some updates to existing tests that had irrefutable patterns before the last block.	2025-03-26 12:27:16 -04:00
Brent Westbrook	2711e08eb8	[syntax-errors] Fix false positive for parenthesized tuple index (#16948 ) Summary -- Fixes #16943 by checking if the tuple is not parenthesized before emitting an error. Test Plan -- New inline test based on the initial report	2025-03-24 10:34:38 -04:00
Brent Westbrook	e4f5fe8cf7	[syntax-errors] Duplicate type parameter names (#16858 ) Summary -- Detects duplicate type parameter names in function definitions, class definitions, and type alias statements. I also boxed the `type_params` field on `StmtTypeAlias` to make it easier to `match` with functions and classes. (That's the reason for the red-knot code owner review requests, sorry!) Test Plan -- New `ruff_python_syntax_errors` unit tests. Fixes #11119.	2025-03-21 15:06:22 -04:00
Brent Westbrook	2baaedda6c	[syntax-errors] Start detecting compile-time syntax errors (#16106 ) ## Summary This PR implements the "greeter" approach for checking the AST for syntax errors emitted by the CPython compiler. It introduces two main infrastructural changes to support all of the compile-time errors: 1. Adds a new `semantic_errors` module to the parser crate with public `SemanticSyntaxChecker` and `SemanticSyntaxError` types 2. Embeds a `SemanticSyntaxChecker` in the `ruff_linter::Checker` for checking these errors in ruff As a proof of concept, it also implements detection of two syntax errors: 1. A reimplementation of [`late-future-import`](https://docs.astral.sh/ruff/rules/late-future-import/) (`F404`) 2. Detection of rebound comprehension iteration variables (https://github.com/astral-sh/ruff/issues/14395) ## Test plan Existing F404 tests, new inline tests in the `ruff_python_parser` crate, and a linter CLI test showing an example of the `Message` output. I also tested in VS Code, where `preview = false` and turning off syntax errors both disable the new errors: ![image](https://github.com/user-attachments/assets/cf453d95-04f7-484b-8440-cb812f29d45e) And on the playground, where `preview = false` also disables the errors: ![image](https://github.com/user-attachments/assets/a97570c4-1efa-439f-9d99-a54487dd6064) Fixes #14395 --------- Co-authored-by: Micha Reiser <micha@reiser.io>	2025-03-21 14:45:25 -04:00
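To make the first proof-of-concept rule in #16106 concrete, the sketch below detects a late `from __future__ import` using Python's `ast` module. It is an illustration only, not the `SemanticSyntaxChecker` code: `late_future_imports` is a hypothetical helper that allows a leading docstring and then flags any `__future__` import that follows another statement.

```python
import ast


def late_future_imports(source: str) -> list[int]:
    """Return the lines of `from __future__ import ...` statements that come too late."""
    body = ast.parse(source).body
    # A leading docstring may precede `__future__` imports.
    if (
        body
        and isinstance(body[0], ast.Expr)
        and isinstance(body[0].value, ast.Constant)
        and isinstance(body[0].value.value, str)
    ):
        body = body[1:]
    late = []
    seen_other = False
    for stmt in body:
        is_future = (
            isinstance(stmt, ast.ImportFrom)
            and stmt.module == "__future__"
            and stmt.level == 0
        )
        if is_future and seen_other:
            late.append(stmt.lineno)
        elif not is_future:
            seen_other = True
    return late


print(late_future_imports("import os\nfrom __future__ import annotations\n"))
```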
Junhson Jean-Baptiste	2a4d835132	Use the common `OperatorPrecedence` for the parser (#16747 ) ## Summary This change continues to resolve #16071 (and continues the work started in #16162). Specifically, this PR changes the code in the parser so that it uses the `OperatorPrecedence` struct from `ruff_python_ast` instead of its own version. This is part of an effort to get rid of the redundant definitions of `OperatorPrecedence` throughout the codebase. Note that this PR only makes this change for `ruff_python_parser` -- we still want to make a similar change for the formatter (namely the `OperatorPrecedence` defined in the expression part of the formatter, the pattern one is different). I separated the work to keep the PRs small and easily reviewable. ## Test Plan Because this is an internal change, I didn't add any additional tests. Existing tests do pass.	2025-03-21 09:40:37 +05:30
Brent Westbrook	42cbce538b	[syntax-errors] Fix star annotation before Python 3.11 (#16878 ) Summary -- Fixes #16874. I previously emitted a syntax error when starred annotations were _allowed_ rather than when they were actually used. This caused false positives for any starred parameter name because these are allowed to have starred annotations but not required to. The fix is to check if the annotation is actually starred after parsing it. Test Plan -- New inline parser tests derived from the initial report and more examples from the comments, although I think the first case should cover them all.	2025-03-20 17:44:52 -04:00
Brent Westbrook	dcf31c9348	[syntax-errors] PEP 701 f-strings before Python 3.12 (#16543 ) ## Summary This PR detects the use of PEP 701 f-strings before 3.12. This one sounded difficult and ended up being pretty easy, so I think there's a good chance I've over-simplified things. However, from experimenting in the Python REPL and checking with [pyright], I think this is correct. pyright actually doesn't even flag the comment case, but Python does. I also checked pyright's implementation for [quotes](`98dc4469cc/packages/pyright-internal/src/analyzer/checker.ts (L1379-L1398)`) and [escapes](`98dc4469cc/packages/pyright-internal/src/analyzer/checker.ts (L1365-L1377)`) and think I've approximated how they do it. Python's error messages also point to the simple approach of these characters simply not being allowed: ```pycon Python 3.11.11 (main, Feb 12 2025, 14:51:05) [Clang 19.1.6 ] on linux Type "help", "copyright", "credits" or "license" for more information. >>> f'''multiline { ... expression # comment ... }''' File "<stdin>", line 3 }''' ^ SyntaxError: f-string expression part cannot include '#' >>> f'''{not a line \ ... continuation}''' File "<stdin>", line 2 continuation}''' ^ SyntaxError: f-string expression part cannot include a backslash >>> f'hello {'world'}' File "<stdin>", line 1 f'hello {'world'}' ^^^^^ SyntaxError: f-string: expecting '}' ``` And since escapes aren't allowed, I don't think there are any tricky cases where nested quotes or comments can sneak in. It's also slightly annoying that the error is repeated for every nested quote character, but that also mirrors pyright, although they highlight the whole nested string, which is a little nicer. However, their check is in the analysis phase, so I don't think we have such easy access to the quoted range, at least without adding another mini visitor. 
## Test Plan New inline tests [pyright]: https://pyright-play.net/?pythonVersion=3.11&strict=true&code=EYQw5gBAvBAmCWBjALgCgO4gHaygRgEoAoEaCAIgBpyiiBiCLAUwGdknYIBHAVwHt2LIgDMA5AFlwSCJhwAuCAG8IoMAG1Rs2KIC6EAL6iIxosbPmLlq5foRWiEAAcmERAAsQAJxAomnltY2wuSKogA6WKIAdABWfPBYqCAE%2BuSBVqbpWVm2iHwAtvlMWMgB2ekiolUAgq4FjgA2TAAeEMieSADWCsoV5qoaqrrGDJ5MiDz%2B8ABuLqosAIREhlXlaybrmyYMXsDw7V4AnoysyAmQ5SIhwYo3d9cheADUeKlv5O%2BpQA	2025-03-18 11:12:15 -04:00
Brent Westbrook	75a562d313	[syntax-errors] Parenthesized context managers before Python 3.9 (#16523 ) Summary -- I thought this was very complicated based on the comment here: https://github.com/astral-sh/ruff/pull/16106#issuecomment-2653505671 and on some of the discussion in the CPython issue here: https://github.com/python/cpython/issues/56991. However, after a little bit of experimentation, I think it boils down to this example: ```python with (x as y): ... ``` The issue is parentheses around a `with` item with an `optional_var`, as we (and [Python](https://docs.python.org/3/library/ast.html#ast.withitem)) call the trailing variable name (`y` in this case). It's not actually about line breaks after all, except that line breaks are allowed in parenthesized expressions, which explains the validity of cases like ```pycon >>> with ( ... x, ... y ... ) as foo: ... pass ... ``` even on Python 3.8. I followed [pyright]'s example again here on the diagnostic range (just the opening paren) and the wording of the error. Test Plan -- Inline tests [pyright]: https://pyright-play.net/?pythonVersion=3.7&strict=true&code=FAdwlgLgFgBAFAewA4FMB2cBEAzBCB0EAHhJgJQwCGAzjLgmQFwz6tA	2025-03-17 08:54:55 -04:00
Alex Waygood	38bfda94ce	[syntax-errors] Improve error message and range for pre-PEP-614 decorator syntax errors (#16581 ) ## Summary A small followup to https://github.com/astral-sh/ruff/pull/16386. We now tell the user exactly what it was about their decorator that constituted invalid syntax on Python <3.9, and the range now highlights the specific sub-expression that is invalid rather than highlighting the whole decorator ## Test Plan Inline snapshots are updated, and new ones are added.	2025-03-17 11:17:27 +00:00
Brent Westbrook	3a32e56445	[syntax-errors] Unparenthesized assignment expressions in sets and indexes (#16404 ) ## Summary This PR detects unparenthesized assignment expressions used in set literals and comprehensions and in sequence indexes. The link to the release notes in https://github.com/astral-sh/ruff/issues/6591 just has this entry: > * Assignment expressions can now be used unparenthesized within set literals and set comprehensions, as well as in sequence indexes (but not slices). with no other information, so hopefully the test cases I came up with cover all of the changes. I also tested these out in the Python REPL and they actually worked in Python 3.9 too. I'm guessing this may be another case that was "formally made part of the language spec in Python 3.10, but usable -- and commonly used -- in Python >=3.9" as @AlexWaygood added to the body of #6591 for context managers. So we may want to change the version cutoff, but I've gone along with the release notes for now. ## Test Plan New inline parser tests and linter CLI tests.	2025-03-14 15:06:42 -04:00
Brent Westbrook	6311412373	[syntax-errors] Star annotations before Python 3.11 (#16545 ) Summary -- This is closely related to (and stacked on) https://github.com/astral-sh/ruff/pull/16544 and detects star annotations in function definitions. I initially called the variant `StarExpressionInAnnotation` to mirror `StarExpressionInIndex`, but I realized it's not really a "star expression" in this position and renamed it. `StarAnnotation` seems in line with the PEP. Test Plan -- Two new inline tests. It looked like there was pretty good existing coverage of this syntax, so I just added simple examples to test the version cutoff.	2025-03-14 15:20:44 +00:00
Brent Westbrook	4f2851982d	[syntax-errors] Star expression in index before Python 3.11 (#16544 ) Summary -- This PR detects tuple unpacking expressions in index/subscript expressions before Python 3.11. Test Plan -- New inline tests	2025-03-14 14:51:34 +00:00
Brent Westbrook	2382fe1f25	[syntax-errors] Tuple unpacking in `for` statement iterator clause before Python 3.9 (#16558 ) Summary -- This PR reuses a slightly modified version of the `check_tuple_unpacking` method added for detecting unpacking in `return` and `yield` statements to detect the same issue in the iterator clause of `for` loops. I ran into the same issue with a bare `for x in *rest: ...` example (invalid even on Python 3.13) and added it as a comment on https://github.com/astral-sh/ruff/issues/16520. I considered just making this an additional `StarTupleKind` variant as well, but this change was in a different version of Python, so I kept it separate. Test Plan -- New inline tests.	2025-03-13 15:55:17 -04:00
Micha Reiser	9cd0cdefd3	Assert that formatted code doesn't introduce any new unsupported syntax errors (#16549 ) ## Summary This should give us better coverage for the unsupported syntax error features and increase our confidence that the formatter doesn't accidentally introduce new unsupported syntax errors. A feature like this would have been very useful when working on f-string formatting where it took a lot of iteration to find all Python 3.11 or older incompatibilities. ## Test Plan I applied my changes on top of https://github.com/astral-sh/ruff/pull/16523 and removed the target version check in the with-statement formatting code. As expected, the integration tests now failed.	2025-03-07 09:12:00 +01:00
Brent Westbrook	b3c884f4f3	[syntax-errors] Parenthesized keyword argument names after Python 3.8 (#16482 ) Summary -- Unlike the other syntax errors detected so far, parenthesized keyword arguments are only allowed before 3.8. It sounds like they were only accidentally allowed before that [^1]. As an aside, you get a pretty confusing error from Python for this, so it's nice that we can catch it: ```pycon >>> def f(**kwargs): ... ... f((a)=1) ... File "<python-input-0>", line 2 f((a)=1) ^^^ SyntaxError: expression cannot contain assignment, perhaps you meant "=="? >>> ``` Test Plan -- Inline tests. [^1]: https://github.com/python/cpython/issues/78822	2025-03-06 12:18:13 -05:00
Brent Westbrook	6c14225c66	[syntax-errors] Tuple unpacking in `return` and `yield` before Python 3.8 (#16485 ) Summary -- Checks for tuple unpacking in `return` and `yield` statements before Python 3.8, as described [here]. Test Plan -- Inline tests. [here]: https://github.com/python/cpython/issues/76298	2025-03-06 11:57:20 -05:00
Brent Westbrook	318f503714	[syntax-errors] Named expressions in decorators before Python 3.9 (#16386 ) Summary -- This PR detects the relaxed grammar for decorators proposed in [PEP 614](https://peps.python.org/pep-0614/) on Python 3.8 and lower. The 3.8 grammar for decorators is [here](https://docs.python.org/3.8/reference/compound_stmts.html#grammar-token-decorators): ``` decorators ::= decorator+ decorator ::= "@" dotted_name ["(" [argument_list [","]] ")"] NEWLINE dotted_name ::= identifier ("." identifier)* ``` in contrast to the current grammar [here](https://docs.python.org/3/reference/compound_stmts.html#grammar-token-python-grammar-decorators) ``` decorators ::= decorator+ decorator ::= "@" assignment_expression NEWLINE assignment_expression ::= [identifier ":="] expression ``` Test Plan -- New inline parser tests.	2025-03-05 17:08:18 +00:00
Brent Westbrook	d0623888b3	[syntax-errors] Positional-only parameters before Python 3.8 (#16481 ) Summary -- Detect positional-only parameters before Python 3.8, as marked by the `/` separator in a parameter list. Test Plan -- Inline tests.	2025-03-05 13:46:43 +00:00
Brent Westbrook	81bcdcebd3	[syntax-errors] Type parameter lists before Python 3.12 (#16479 ) Summary -- Another simple one, just detect type parameter lists in functions and classes. Like pyright, we don't emit a second diagnostic for `type` alias statements, which were also introduced in 3.12. Test Plan -- Inline tests.	2025-03-05 13:19:09 +00:00
Brent Westbrook	32c66ec4b7	[syntax-errors] `type` alias statements before Python 3.12 (#16478 ) Summary -- Another simple one, just detect standalone `type` statements. I limited the diagnostic to `type` itself like [pyright]. That probably makes the most sense for more complicated examples. Test Plan -- Inline tests. [pyright]: https://pyright-play.net/?pythonVersion=3.8&strict=true&code=C4TwDgpgBAHlC8UCWA7YQ	2025-03-04 17:20:10 +00:00
Brent Westbrook	e7b93f93ef	[syntax-errors] Type parameter defaults before Python 3.13 (#16447 ) Summary -- Detects the presence of a [PEP 696] type parameter default before Python 3.13. Test Plan -- New inline parser tests for type aliases, generic functions and generic classes. [PEP 696]: https://peps.python.org/pep-0696/#grammar-changes	2025-03-04 16:53:38 +00:00
Brent Westbrook	c8a06a9be8	[syntax-errors] Limit `except*` range to `*` (#16473 ) Summary -- This is a follow-up to #16446 to fix the diagnostic range to point to the `*` like `pyright` does (https://github.com/astral-sh/ruff/pull/16446#discussion_r1976900643). Storing the range in the `ExceptClauseKind::Star` variant feels slightly awkward, but we don't store the star itself anywhere on the `ExceptHandler`. And we can't just take `ExceptHandler.start() + "except".text_len()` because this code appears to be valid: ```python try: ... except * Error: ... ``` Test Plan -- Existing tests.	2025-03-04 16:50:09 +00:00
Brent Westbrook	37fbe58b13	Document `LinterResult::has_syntax_error` and add `Parsed::has_no_syntax_errors` (#16443 ) Summary -- This is a follow up addressing the comments on #16425. As @dhruvmanila pointed out, the naming is a bit tricky. I went with `has_no_errors` to try to differentiate it from `is_valid`. It actually ends up negated in most uses, so it would be more convenient to have `has_any_errors` or `has_errors`, but I thought it would sound too much like the opposite of `is_valid` in that case. I'm definitely open to suggestions here. Test Plan -- Existing tests.	2025-03-04 08:35:38 -05:00
Brent Westbrook	e924ecbdac	[syntax-errors] `except*` before Python 3.11 (#16446 ) Summary -- One of the simpler ones, just detect the use of `except*` before 3.11. Test Plan -- New inline tests.	2025-03-02 18:20:18 +00:00
Brent Westbrook	4431978262	[syntax-errors] Assignment expressions before Python 3.8 (#16383 ) ## Summary This PR is the first in a series derived from https://github.com/astral-sh/ruff/pull/16308, each of which adds support for detecting one version-related syntax error from https://github.com/astral-sh/ruff/issues/6591. This one should be the largest because it also includes the addition of the `Parser::add_unsupported_syntax_error` method. Otherwise I think the general structure will be the same for each syntax error: * Detecting the error in the parser * Inline parser tests for the new error * New ruff CLI tests for the new error ## Test Plan As noted above, there are new inline parser tests, as well as new ruff CLI tests. Once https://github.com/astral-sh/ruff/pull/16379 is resolved, there should also be new mdtests for red-knot, but this PR does not currently include those.	2025-02-28 17:13:46 -05:00
Brent Westbrook	764aa0e6a1	Allow passing `ParseOptions` to inline tests (#16357 ) ## Summary This PR adds support for a pragma-style header for inline parser tests containing JSON-serialized `ParseOptions`. For example, ```python # parse_options: { "target-version": "3.9" } match 2: case 1: pass ``` The line must start with `# parse_options: ` and then the rest of the (trimmed) line is deserialized into the `ParseOptions` used for parsing the test. ## Test Plan Existing inline tests, plus two new inline tests for `match-before-py310`. --------- Co-authored-by: Alex Waygood <alex.waygood@gmail.com>	2025-02-27 10:23:15 -05:00
Carl Meyer	dd6f6233bd	bump MSRV to 1.83 (#16294 ) According to our new MSRV policy (see https://github.com/astral-sh/ruff/issues/16370 ), bump our MSRV to 1.83 (N - 2), and autofix some new clippy lints.	2025-02-26 06:12:43 -08:00
Brent Westbrook	78806361fd	Start detecting version-related syntax errors in the parser (#16090 ) ## Summary This PR builds on the changes in #16220 to pass a target Python version to the parser. It also adds the `Parser::unsupported_syntax_errors` field, which collects version-related syntax errors while parsing. These syntax errors are then turned into `Message`s in ruff (in preview mode). This PR only detects one syntax error (`match` statement before Python 3.10), but it has been pretty quick to extend to several other simple errors (see #16308 for example). ## Test Plan The current tests are CLI tests in the linter crate, but these could be supplemented with inline parser tests after #16357. I also tested the display of these syntax errors in VS Code: ![image](https://github.com/user-attachments/assets/062b4441-740e-46c3-887c-a954049ef26e) ![image](https://github.com/user-attachments/assets/101f55b8-146c-4d59-b6b0-922f19bcd0fa) --------- Co-authored-by: Alex Waygood <alex.waygood@gmail.com>	2025-02-25 23:03:48 -05:00
Alex Waygood	25920fe489	Rename `ExprStringLiteral::as_unconcatenated_string()` to `ExprStringLiteral::as_single_part_string()` (#16253 )	2025-02-19 16:06:57 +00:00
Brent Westbrook	97d0659ce3	Pass `ParserOptions` to the parser (#16220 ) ## Summary This is part of the preparation for detecting syntax errors in the parser from https://github.com/astral-sh/ruff/pull/16090/. As suggested in [this comment](https://github.com/astral-sh/ruff/pull/16090/#discussion_r1953084509), I started working on a `ParseOptions` struct that could be stored in the parser. For this initial refactor, I only made it hold the existing `Mode` option, but for syntax errors, we will also need it to have a `PythonVersion`. For that use case, I'm picturing something like a `ParseOptions::with_python_version` method, so you can extend the current calls to something like ```rust ParseOptions::from(mode).with_python_version(settings.target_version) ``` But I thought it was worth adding `ParseOptions` alone without changing any other behavior first. Most of the diff is just updating call sites taking `Mode` to take `ParseOptions::from(Mode)` or those taking `PySourceType`s to take `ParseOptions::from(PySourceType)`. The interesting changes are in the new `parser/options.rs` file and smaller parts of `parser/mod.rs` and `ruff_python_parser/src/lib.rs`. ## Test Plan Existing tests, this should not change any behavior.	2025-02-19 10:50:50 -05:00
Alex Waygood	b6b1947010	Improve API exposed on `ExprStringLiteral` nodes (#16192 ) ## Summary This PR makes the following changes: - It adjusts various callsites to use the new `ast::StringLiteral::contents_range()` method that was introduced in https://github.com/astral-sh/ruff/pull/16183. This is less verbose and more type-safe than using the `ast::str::raw_contents()` helper function. - It adds a new `ast::ExprStringLiteral::as_unconcatenated_literal()` helper method, and adjusts various callsites to use it. This addresses @MichaReiser's review comment at https://github.com/astral-sh/ruff/pull/16183#discussion_r1957334365. There is no functional change here, but it helps readability to make it clearer that we're differentiating between implicitly concatenated strings and unconcatenated strings at various points. - It renames the `StringLiteralValue::flags()` method to `StringLiteralFlags::first_literal_flags()`. If you're dealing with an implicitly concatenated string `string_node`, `string_node.value.flags().closer_len()` could give an incorrect result; this renaming makes it clearer that the `StringLiteralFlags` instance returned by the method is only guaranteed to give accurate information for the first `StringLiteral` contained in the `ExprStringLiteral` node. - It deletes the unused `BytesLiteralValue::flags()` method. This seems prone to misuse in the same way as `StringLiteralValue::flags()`: if it's an implicitly concatenated bytestring, the `BytesLiteralFlags` instance returned by the method would only give accurate information for the first `BytesLiteral` in the bytestring. ## Test Plan `cargo test`	2025-02-17 07:58:54 +00:00
InSync	7d2e40be2d	[`pylint`] Do not offer fix for raw strings (`PLE251`) (#16132 ) ## Summary Resolves #13294, follow-up to #13882. At #13882, it was concluded that a fix should not be offered for raw strings. This change implements that. The five rules in question are now no longer always fixable. ## Test Plan `cargo nextest run` and `cargo insta test`. --------- Co-authored-by: Micha Reiser <micha@reiser.io>	2025-02-13 08:36:11 +00:00
Alex Waygood	cb71393332	Simplify the `StringFlags` trait (#15944 )	2025-02-04 18:14:28 +00:00
Brent Westbrook	b5e5271adf	Preserve triple quotes and prefixes for strings (#15818 ) ## Summary This is a follow-up to #15726, #15778, and #15794 to preserve the triple quote and prefix flags in plain strings, bytestrings, and f-strings. I also added a `StringLiteralFlags::without_triple_quotes` method to avoid passing along triple quotes in rules like SIM905 where it might not make sense, as discussed [here](https://github.com/astral-sh/ruff/pull/15726#discussion_r1930532426). ## Test Plan Existing tests, plus many new cases in the `generator::tests::quote` test that should cover all combinations of quotes and prefixes, at least for simple string bodies. Closes #7799 when combined with #15694, #15726, #15778, and #15794. --------- Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>	2025-02-04 08:41:06 -05:00
Brent Westbrook	9bf138c45a	Preserve quote style in generated code (#15726 ) ## Summary This is a first step toward fixing #7799 by using the quoting style stored in the `flags` field on `ast::StringLiteral`s to select a quoting style. This PR does not include support for f-strings or byte strings. Several rules also needed small updates to pass along existing quoting styles instead of using `StringLiteralFlags::default()`. The remaining snapshot changes are intentional and should preserve the quotes from the input strings. ## Test Plan Existing tests with some accepted updates, plus a few new RUF055 tests for raw strings. --------- Co-authored-by: Alex Waygood <alex.waygood@gmail.com>	2025-01-27 13:41:03 -05:00
Shaygan Hooshyari	cf4ab7cba1	Parse triple quoted string annotations as if parenthesized (#15387 ) ## Summary Resolves #9467 Parse quoted annotations as if the string content is inside parentheses. With this logic, `y` and `z` in this example are equal: ```python y: """ int | str """ z: """( int | str ) """ ``` Also, this rule only applies to triple quotes ([link](https://github.com/python/typing-council/issues/9#issuecomment-1890808610)). This PR is based on the [comments](https://github.com/astral-sh/ruff/issues/9467#issuecomment-2579180991) on the issue. I made one extra change: since we don't want any indentation tokens, I am setting `State::Other` as the initial state of the Lexer. Remaining work: - [x] Add a test case for red-knot. - [x] Add more tests. ## Test Plan Added a test which previously failed because the quoted annotation contained indentation. Added an mdtest for red-knot. Updated previous test. Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com> Co-authored-by: Micha Reiser <micha@reiser.io>	2025-01-16 11:38:15 +05:30
Andrew Gallant	17f01a4355	test: add more missing carets This update includes some missing `^` in the diagnostic annotations. This update also includes some shifting of "syntax error" annotations to the end of the preceding line. I believe this is technically a regression, but fixing them has proven quite difficult. I think the best way to do that might be to tweak the spans generated by the Python parser errors, but I didn't want to dig into that. (Another approach would be to change the `annotate-snippets` rendering, but when I tried that and managed to fix these regressions, I ended up causing a bunch of other regressions.) Ref `77d454525e (r1915458616)`	2025-01-15 13:37:52 -05:00
Andrew Gallant	84ba4ecaf5	ruff_annotate_snippets: support overriding the "cut indicator" We do this because `...` is valid Python, which makes it pretty likely that some line trimming will lead to ambiguous output. So we add support for overriding the cut indicator. This also requires changing some of the alignment math, which was previously tightly coupled to `...`. For Ruff, we go with `…` (`U+2026 HORIZONTAL ELLIPSIS`) for our cut indicator. For more details, see the patch sent to upstream: https://github.com/rust-lang/annotate-snippets-rs/pull/172	2025-01-15 13:37:52 -05:00
Andrew Gallant	5caef89af3	test: update snapshots with improper end-of-line placement This looks like a bug fix that occurs when the annotation is a zero-width span immediately following a line terminator. Previously, the caret seems to be rendered on the next line, but it should be rendered at the end of the line the span corresponds to. I admit that this one is kinda weird. I would somewhat expect that our spans here are actually incorrect, and that to obtain this sort of rendering, we should identify a span just immediately _before_ the line terminator and not after it. But I don't want to dive into that rabbit hole for now (and given how `annotate-snippets` now renders these spans, perhaps there is more to it than I see), and this does seem like a clear improvement given the spans we feed to `annotate-snippets`.	2025-01-15 13:37:52 -05:00
Andrew Gallant	f49cfb6c28	test: update snapshots with missing `^` The previous rendering just seems wrong in that a `^` is omitted. The new version of `annotate-snippets` seems to get this right. I checked a pseudo random sample of these, and it seems to only happen when the position pointed at a line terminator.	2025-01-15 13:37:52 -05:00
Andrew Gallant	3fa4479c85	test: update snapshots with missing annotations These updates center around the addition of annotations in the diagnostic rendering. Previously, the annotation was just not rendered at all. With the `annotate-snippets` upgrade, it is now rendered. I examined a pseudo random sample of these, and they all look correct. As will be true in future batches, some of these snapshots also have changes to whitespace in them as well.	2025-01-15 13:37:52 -05:00
Andrew Gallant	0de8216a25	test: update snapshots with just whitespace changes These snapshot changes should all only be a result of changes to trailing whitespace in the output. I checked a pseudo random sample of these, and the whitespace found in the previous snapshots seems to be an artifact of the rendering and _not_ of the source data. So this seems like a strict bug fix to me. There are other snapshots with whitespace changes, but they also have other changes that we split out into separate commits. Basically, we're going to do approximately one commit per category of change. This represents, by far, the biggest chunk of changes to snapshots as a result of the `annotate-snippets` upgrade.	2025-01-15 13:37:52 -05:00
Andrew Gallant	84179aaa96	ruff_linter,ruff_python_parser: migrate to updated `annotate-snippets` This is pretty much just moving to the new API and taking care to use byte offsets. This is almost enough. The next commit will fix a bug involving the handling of unprintable characters as a result of switching to byte offsets.	2025-01-15 13:37:52 -05:00
Dylan	c1eaf6ff72	Modify parsing of raise with cause when exception is absent (#15049 ) When confronted with `raise from exc` the parser will now create a `StmtRaise` that has `None` for the exception and `exc` for the cause. Before, the parser created a `StmtRaise` with `from` for the exception, no cause, and a spurious expression `exc` afterwards.	2024-12-19 13:36:32 +00:00
Dylan	a3bb0cd5ec	Raise syntax error for mixing `except` and `except*` (#14895 ) This PR adds a syntax error if the parser encounters a `TryStmt` that has except clauses both with and without a star. The displayed error points to each except clause that contradicts the original except clause kind. So, for example, ```python try: .... except: #<-- we assume this is the desired except kind .... except*: #<--- error will point here .... except*: #<--- and here .... ``` Closes #14860	2024-12-10 17:50:55 -06:00
Dimitri Papadopoulos Orfanos	59145098d6	Fix typos found by codespell (#14863 ) ## Summary Just fix typos. ## Test Plan CI tests. --------- Co-authored-by: Micha Reiser <micha@reiser.io>	2024-12-09 09:32:12 +00:00
Micha Reiser	b63c2e126b	Upgrade Rust toolchain to 1.83 (#14677 )	2024-11-29 12:05:05 +00:00
Alex Waygood	f1b2e85339	py-fuzzer: recommend using `uvx` rather than `uv run` to run the fuzzer (#14645 )	2024-11-27 22:19:52 +00:00
Alex Waygood	e0f3eaf1dd	Turn the `fuzz-parser` script into a properly packaged Python project (#14606 ) ## Summary This PR gets rid of the `requirements.in` and `requirements.txt` files in the `scripts/fuzz-parser` directory, and replaces them with `pyproject.toml` and `uv.lock` files. The script is renamed from `fuzz-parser` to `py-fuzzer` (since it can now also be used to fuzz red-knot as well as the parser, following https://github.com/astral-sh/ruff/pull/14566), and moved from the `scripts/` directory to the `python/` directory, since it's now a (uv)-pip-installable project in its own right. I've been resisting this for a while, because conceptually this script just doesn't feel "complicated" enough to me for it to be a full-blown package. However, I think it's time to do this. Making it a proper package has several advantages: - It means we can run it from the project root using `uv run` without having to activate a virtual environment and ensure that all required dependencies are installed into that environment - Using a `pyproject.toml` file means that we can express that the project requires Python 3.12+ to run properly; this wasn't possible before - I've been running mypy on the project locally when I've been working on it or reviewing other people's PRs; now I can put the mypy config for the project in the `pyproject.toml` file ## Test Plan I manually tested that all the commands detailed in `python/py-fuzzer/README.md` work for me locally. --------- Co-authored-by: David Peter <sharkdp@users.noreply.github.com>	2024-11-27 08:09:04 +00:00
Micha Reiser	c847cad389	Update insta snapshots (#14366 )	2024-11-15 19:31:15 +01:00
Micha Reiser	bd33b4972d	Short circuit `lex_identifier` if the name is longer or shorter than any known keyword (#13815 )	2024-10-19 11:07:15 +00:00
Junzhuo ZHOU	a354d9ead6	Expose internal types as public access (#13509 )	2024-09-26 17:34:30 +02:00
Micha Reiser	c3bcd5c842	Upgrade to Rust 1.81 (#13265 )	2024-09-06 15:09:09 +02:00
Alex Waygood	b7c7b4b387	Add a method to `Checker` for cached parsing of stringified type annotations (#13158 )	2024-09-02 12:44:20 +00:00
Micha Reiser	138e70bd5c	Upgrade to Rust 1.80 (#12586 )	2024-07-30 19:18:08 +00:00
Dhruv Manilawala	978909fcf4	Raise syntax error for unparenthesized generator expr in multi-argument call (#12445 ) ## Summary This PR fixes a bug to raise a syntax error when an unparenthesized generator expression is used as an argument to a call when there is more than one argument. For reference, the grammar is: ``` primary: | ... | primary genexp | primary '(' [arguments] ')' | ... genexp: | '(' ( assignment_expression | expression !':=') for_if_clauses ')' ``` The `genexp` requires the parentheses as mentioned in the grammar. So, the grammar for a call expression is either a name followed by a generator expression or a name followed by a list of arguments. In the former case, the parentheses are excluded because the generator expression provides them, while in the latter case, the parentheses are explicitly provided for a list of arguments, which means that the generator expression requires its own parentheses. This was discovered in https://github.com/astral-sh/ruff/issues/12420. ## Test Plan Add test cases for valid and invalid syntax. Make sure that the parser from CPython also raises this at the parsing step: ```console $ python3.13 -m ast parser/_.py File "parser/_.py", line 1 total(1, 2, x for x in range(5), 6) ^^^^^^^^^^^^^^^^^^^ SyntaxError: Generator expression must be parenthesized $ python3.13 -m ast parser/_.py File "parser/_.py", line 1 sum(x for x in range(10), 10) ^^^^^^^^^^^^^^^^^^^^ SyntaxError: Generator expression must be parenthesized ```	2024-07-22 14:44:20 +05:30
Dhruv Manilawala	8f40928534	Enable token-based rules on source with syntax errors (#11950 ) ## Summary This PR updates the linter, specifically the token-based rules, to work on the tokens that come after a syntax error. For context, the token-based rules only diagnose the tokens up to the first lexical error. This PR builds up an error resilience by introducing a `TokenIterWithContext` which updates the `nesting` level and tries to reflect it with what the lexer is seeing. This isn't 100% accurate because if the parser recovered from an unclosed parenthesis in the middle of the line, the context won't reduce the nesting level until it sees the newline token at the end of the line. resolves: #11915 ## Test Plan * Add test cases for a bunch of rules that are affected by this change. * Run the fuzzer for a long time, making sure to fix any other bugs.	2024-07-02 08:57:46 +00:00
Micha Reiser	5109b50bb3	Use `CompactString` for `Identifier` (#12101 )	2024-07-01 10:06:02 +02:00
Micha Reiser	f765d19402	Mention that `Cursor` is based on rustc's implementation. (#12109 )	2024-06-30 16:53:25 +01:00
Micha Reiser	da78de0439	Remove allocation in `parse_identifier` (#12103 )	2024-06-29 15:00:24 +02:00
Dhruv Manilawala	434ce307a7	Revert "Use correct range to highlight line continuation error" (#12089 ) This PR reverts https://github.com/astral-sh/ruff/pull/12016 with a small change where the error location points to the continuation character only. Earlier, it would also highlight the whitespace that came before it. The motivation for this change is to avoid panic in https://github.com/astral-sh/ruff/pull/11950. For example: ```py \) ``` Playground: https://play.ruff.rs/87711071-1b54-45a3-b45a-81a336a1ea61 The range of `Unknown` token and `Rpar` is the same. Once #11950 is enabled, the indexer would panic. It won't panic in the stable version because we stop at the first `Unknown` token.	2024-06-28 18:10:00 +05:30
Dhruv Manilawala	a4688aebe9	Use `TokenSource` to find new location for re-lexing (#12060 ) ## Summary This PR splits the re-lexing logic into two parts: 1. `TokenSource`: The token source will be responsible for finding the position the lexer needs to be moved to. 2. `Lexer`: The lexer will be responsible for reducing the nesting level and moving itself to the new position if recovered from a parenthesized context. This split makes it easy to find the new lexer position without needing to implement the backwards lexing logic again, which would need to handle cases involving: * Different kinds of newlines * Line continuation character(s) * Comments * Whitespaces ### F-strings This change did reveal one thing about re-lexing f-strings. Consider the following example: ```py f'{' # ^ f'foo' ``` Here, the quote as highlighted by the caret (`^`) is the start of a string inside an f-string expression. This is an unterminated string, which means the token emitted is actually `Unknown`. The parser tries to recover from it but there's no newline token in the vector, so the new logic doesn't recover from it. The previous logic does recover because it's looking at the raw characters instead. The parser would be at `FStringStart` (the one for the second line) when it calls into the re-lexing logic to recover from an unterminated f-string on the first line. So, moving backwards, the first character encountered is a newline character but the first token encountered is an `Unknown` token. This is improved with #12067. fixes: #12046 fixes: #12036 ## Test Plan Update the snapshot and validate the changes.	2024-06-27 17:12:39 +05:30
Dhruv Manilawala	e137c824c3	Avoid consuming newline for unterminated string (#12067 ) ## Summary This PR fixes the lexer logic to not consume the newline character for an unterminated string literal. Currently, the lexer would consume it to be part of the string itself but that would be bad for recovery because then the lexer wouldn't emit the newline token ever. This PR fixes that to avoid consuming the newline character in that case. This was discovered during https://github.com/astral-sh/ruff/pull/12060. ## Test Plan Update the snapshots and validate them.	2024-06-27 17:02:48 +05:30
Dhruv Manilawala	47c9ed07f2	Consider 2-character EOL before line continuation (#12035 ) ## Summary This PR fixes a bug introduced in https://github.com/astral-sh/ruff/pull/12008 which didn't consider the two character newline after the line continuation character. For example, consider the following code highlighted with whitespaces: ```py call(foo # comment \\r\n \r\n def bar():\r\n ....pass\r\n ``` The lexer is at `def` when it's running the re-lexing logic and trying to move back to a newline character. It encounters `\n` and it's being escaped (incorrect) but `\r` is being escaped, so it moves the lexer to `\n` character. This creates an overlap in token ranges which causes the panic. ``` Name 0..4 Lpar 4..5 Name 5..8 Comment 9..20 NonLogicalNewline 20..22 <-- overlap between Newline 21..22 <-- these two tokens NonLogicalNewline 22..23 Def 23..26 ... ``` fixes: #12028 ## Test Plan Add a test case with line continuation and windows style newline character.	2024-06-26 14:00:48 +05:30
Dhruv Manilawala	7cb2619ef5	Add syntax error for empty type parameter list (#12030 ) ## Summary (I'm pretty sure I added this in the parser re-write but must've got lost in the rebase?) This PR raises a syntax error if the type parameter list is empty. As per the grammar, there should be at least one type parameter: ``` type_params: \| invalid_type_params \| '[' type_param_seq ']' type_param_seq: ','.type_param+ [','] ``` Verified via the builtin `ast` module as well: ```console $ python3.13 -m ast parser/_.py Traceback (most recent call last): [..] File "parser/_.py", line 1 def foo[](): ^ SyntaxError: Type parameter list cannot be empty ``` ## Test Plan Add inline test cases and update the snapshots.	2024-06-26 08:10:35 +05:30
Dhruv Manilawala	7109214b57	Update parser tests to validate token ranges (#12019 ) ## Summary This PR updates the parser test infrastructure to validate the token ranges. From the code documentation: ``` /// Verifies that: /// * the ranges are strictly increasing when loop the tokens in insertion order /// * all ranges are within the length of the source code ``` Follow-up from #12016 and #12017 resolves: #11938 ## Test Plan Make sure that there are no failures.	2024-06-25 08:14:28 +00:00
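The two checks quoted in #12019 can be sketched in a few lines of Python. This is only an illustration, not the test helper from the Rust codebase: the `Token` tuple and `validate_token_ranges` are hypothetical names, and "strictly increasing" is read here as "each token starts at or after the end of the previous one".

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    """Hypothetical stand-in for a lexer token: a kind plus a half-open range."""
    kind: str
    start: int
    end: int


def validate_token_ranges(tokens: List[Token], source_len: int) -> List[str]:
    """Return one message per range problem; an empty list means the stream is valid."""
    problems = []
    previous = None
    for token in tokens:
        # Check 1: every range must lie within the source text.
        if token.end > source_len:
            problems.append(
                f"{token.kind} {token.start}..{token.end} is outside the source"
            )
        # Check 2: ranges must not go backwards or overlap (assumed reading
        # of "strictly increasing in insertion order").
        if previous is not None and token.start < previous.end:
            problems.append(
                f"{token.kind} {token.start}..{token.end} overlaps "
                f"{previous.kind} {previous.start}..{previous.end}"
            )
        previous = token
    return problems


# Token stream taken from the #12035 description, which contains one overlap.
stream = [
    Token("Name", 0, 4),
    Token("Lpar", 4, 5),
    Token("Name", 5, 8),
    Token("Comment", 9, 20),
    Token("NonLogicalNewline", 20, 22),
    Token("Newline", 21, 22),
    Token("NonLogicalNewline", 22, 23),
    Token("Def", 23, 26),
]
problems = validate_token_ranges(stream, source_len=40)
```

Running it on the stream from #12035 reports the `Newline 21..22` token as overlapping the preceding `NonLogicalNewline 20..22`, which is the kind of failure the updated tests are meant to catch. The `source_len=40` value is an arbitrary placeholder, since the original source length is not given.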
Dhruv Manilawala	d930e97212	Do not include newline for unterminated string range (#12017 ) ## Summary This PR updates the unterminated string error range to not include the final newline character. This is a follow-up to #12016 and is required for #12019. This is not done when the unterminated string runs to the end of the file (rather than to a newline character). The unterminated f-string range is correct. ### Why is this required for #12019 ? Because otherwise the token ranges will overlap. For example: ```py f"{" f"{foo!r" ``` Here, the re-lexing logic recovers from an unterminated f-string and thus emits a `Newline` token for the newline at the end of the first line. But currently the `Unknown` and `Newline` tokens would overlap because the range of the `Unknown` token (the unterminated string literal) would include the newline character. ## Test Plan Update and validate the snapshot.	2024-06-25 08:10:07 +00:00
Dhruv Manilawala	9c1b6ec411	Use correct range to highlight line continuation error (#12016 ) ## Summary This PR fixes the range highlighted for the line continuation error. Previously, it would highlight an incorrect range: ``` 1 \| call(a, b, \\\ \| ^^ Syntax Error: unexpected character after line continuation character 2 \| 3 \| def bar(): \| ``` And now: ``` \| 1 \| call(a, b, \\\ \| ^ Syntax Error: unexpected character after line continuation character 2 \| 3 \| def bar(): \| ``` This is implemented by no longer updating the token range for the `Unknown` token, which is emitted when there's a lexical error. Instead, the `push_error` helper method is responsible for updating the range to the error location. This actually becomes a requirement, as can be seen in follow-up PRs. ## Test Plan Update and validate the snapshot.	2024-06-25 13:35:24 +05:30
Dhruv Manilawala	68a8978454	Consider line continuation character for re-lexing (#12008 ) ## Summary This PR fixes a bug where the re-lexing logic didn't consider the line continuation character being present before the newline character. This meant that the lexer was being moved back to a newline character which is actually ignored via `\`. Consider the following code: ```py f'middle {'string':\ 'format spec'} ``` The old token stream is: ``` ... Colon 18..19 FStringMiddle 19..29 (flags = F_STRING) Newline 20..21 Indent 21..29 String 29..42 Rbrace 42..43 ... ``` Notice how the ranges overlap between the `FStringMiddle` token and the tokens emitted after moving the lexer backwards. After this fix, the lexer is not moved backwards in this scenario, and the new token stream is: ``` FStringStart 0..2 (flags = F_STRING) FStringMiddle 2..9 (flags = F_STRING) Lbrace 9..10 String 10..18 Colon 18..19 FStringMiddle 19..29 (flags = F_STRING) FStringEnd 29..30 (flags = F_STRING) Name 30..36 Name 37..41 Unknown 41..44 Newline 44..45 ``` fixes: #12004 ## Test Plan Add test cases and update the snapshots.	2024-06-25 02:13:54 +00:00
renovate[bot]	53a80a5c11	Update Rust crate rustc-hash to v2 (#12001 )	2024-06-23 20:46:42 -04:00
Dhruv Manilawala	81160320de	Manual impl of `Debug` on `Token` (#11958 ) ## Summary I look at the token stream a lot, not specifically in the playground but in the terminal output and it's annoying to scroll a lot to find specific location. Most of the information is also redundant. The final format we end up with is: `<kind> <range> (flags = ...)` e.g., `String 0..4 (flags = BYTE_STRING)` where the flags part is only populated if there are any flags set.	2024-06-22 04:18:24 +00:00
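The compact format described in #11958 can be mimicked with a small Python helper. This is a sketch of the output shape only, not the Rust `Debug` implementation: `format_token` is a hypothetical name, and joining several flags with ` | ` is an assumption, since the commit message only shows a single flag.

```python
from typing import Tuple


def format_token(kind: str, start: int, end: int, flags: Tuple[str, ...] = ()) -> str:
    """Render a token as `<kind> <range>`, adding `(flags = ...)` only when flags are set."""
    text = f"{kind} {start}..{end}"
    if flags:
        # Assumption: multiple flags are joined with " | ".
        text += f" (flags = {' | '.join(flags)})"
    return text


with_flags = format_token("String", 0, 4, ("BYTE_STRING",))
without_flags = format_token("Name", 0, 4)
```

Here `with_flags` reproduces the `String 0..4 (flags = BYTE_STRING)` example from the commit message, and `without_flags` shows that the flags part is dropped entirely when no flags are set.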
Dhruv Manilawala	27ebff36ec	Remove `Token::is_trivia` method (#11962 ) Sorry, a leftover from my rebase	2024-06-21 10:24:42 +00:00
Dhruv Manilawala	96da136e6a	Move token and error structs into related modules (#11957 ) ## Summary This PR does some housekeeping by moving certain structs into related modules. Specifically, 1. Move `LexicalError` from `lexer.rs` to `error.rs` which also contains the `ParseError` 2. Move `Token`, `TokenFlags` and `TokenValue` from `lexer.rs` to `token.rs`	2024-06-21 10:07:19 +00:00
Dhruv Manilawala	4667d8697c	Remove duplication around `is_trivia` functions (#11956 ) ## Summary This PR removes the duplication around `is_trivia` functions. There are two of them in the codebase: 1. In `pycodestyle`, it's for newline, indent, dedent, non-logical newline and comment 2. In the parser, it's for non-logical newline and comment The `TokenKind::is_trivia` method used (1) but that's not correct in that context. So, this PR introduces a new `is_non_logical_token` helper method for the `pycodestyle` crate and updates the `TokenKind::is_trivia` implementation with (2). This also means we can remove `Token::is_trivia` method and the standalone `token_source::is_trivia` function and use the one on `TokenKind`. ## Test Plan `cargo insta test`	2024-06-21 10:02:40 +00:00
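The difference between the two trivia definitions in #11956 is easiest to see as two sets. The sketch below models token kinds as plain strings, using the names that appear in the token streams on this page. The function names follow the commit message, but the bodies are only an illustration of the described behavior, not the Rust code.

```python
# (2) Parser definition: only non-logical newlines and comments are trivia.
PARSER_TRIVIA = frozenset({"NonLogicalNewline", "Comment"})

# (1) pycodestyle definition: additionally treats newline, indent and dedent
# as non-logical tokens.
PYCODESTYLE_NON_LOGICAL = PARSER_TRIVIA | {"Newline", "Indent", "Dedent"}


def is_trivia(kind: str) -> bool:
    """Mirror of the updated `TokenKind::is_trivia` behavior, i.e. definition (2)."""
    return kind in PARSER_TRIVIA


def is_non_logical_token(kind: str) -> bool:
    """Mirror of the new pycodestyle helper, i.e. definition (1)."""
    return kind in PYCODESTYLE_NON_LOGICAL


# `Newline` is where the two definitions disagree.
newline_is_trivia = is_trivia("Newline")
newline_is_non_logical = is_non_logical_token("Newline")
```

A `Newline` token is a non-logical token for `pycodestyle` but not trivia for the parser, which is why a single shared `is_trivia` could not serve both callers.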
Dhruv Manilawala	ed948eaefb	Avoid moving back the lexer for triple-quoted fstring (#11939 ) ## Summary This PR avoids moving back the lexer for a triple-quoted f-string during the re-lexing phase. The reason this is a problem is that for a triple-quoted f-string the newlines are part of the f-string itself, specifically they'll be part of the `FStringMiddle` token. So, if we moved the lexer back, there would be a `Newline` token whose range would be in between an `FStringMiddle` token. This creates a panic in downstream usage. fixes: #11937 ## Test Plan Add test cases and validate the snapshots.	2024-06-20 16:27:36 +05:30
Dhruv Manilawala	b617d90651	Update `E999` to show all syntax errors (#11900 ) ## Summary This PR updates the linter to show all the parse errors as diagnostics instead of just the first one. Note that this doesn't affect the parse error displayed as error log message. This will be removed in a follow-up PR. ### Breaking? I don't think this is a breaking change even though this might give more diagnostics. The main reason is that this shouldn't affect any users because it'll only give additional diagnostics in the case of multiple syntax errors. ## Test Plan Add an integration test case which would raise more than one parse error.	2024-06-19 13:09:54 +05:30
Dhruv Manilawala	cdc7c71449	Avoid consuming trailing whitespace during re-lexing (#11933 ) ## Summary This PR updates the re-lexing logic to avoid consuming the trailing whitespace and move the lexer explicitly to the last newline character encountered while moving backwards. Consider the following code snippet as taken from the test case highlighted with whitespace (`.`) and newline (`\n`) characters: ```py # There are trailing whitespace before the newline character but those whitespaces are # part of the comment token f"""hello {x # comment....\n # ^ y = 1\n ``` The parser is at `y` when it's trying to recover from an unclosed `{`, so it calls into the re-lexing logic which tries to move the lexer back to the end of the previous line. But, as it consumed all whitespaces it moved the lexer to the location marked by `^` in the above code snippet. But, those whitespaces are part of the comment token. This means that the range for the two tokens were overlapping which introduced the panic. Note that this is only a bug when there's a comment with a trailing whitespace otherwise it's fine to move the lexer to the whitespace character. This is because the lexer would just skip the whitespace otherwise. Nevertheless, this PR updates the logic to move it explicitly to the newline character in all cases. fixes: #11929 ## Test Plan Add test cases and update the snapshot. Make sure that it doesn't panic on the code snippet in the linked issue.	2024-06-19 12:14:18 +05:30
Dhruv Manilawala	1e0642fac8	Use re-lexing for normal list parsing (#11871 ) ## Summary This PR is a follow-up on #11845 to add the re-lexing logic for normal list parsing. A normal list parsing is basically parsing elements without any separator in between i.e., there can only be trivia tokens in between the two elements. Currently, this is only being used for parsing assignment statement and f-string elements. Assignment statements cannot be in a parenthesized context, but f-string can have curly braces so this PR is specifically for them. I don't think this is an ideal recovery but the problem is that both lexer and parser could add an error for f-strings. If the lexer adds an error it'll emit an `Unknown` token instead while the parser adds the error directly. I think we'd need to move all f-string errors to be emitted by the parser instead. This way the parser can correctly inform the lexer that it's out of an f-string and then the lexer can pop the current f-string context out of the stack. ## Test Plan Add test cases, update the snapshots, and run the fuzzer.	2024-06-18 12:14:41 +05:30
Dhruv Manilawala	8499abfa7f	Implement re-lexing logic for better error recovery (#11845 ) ## Summary This PR implements the re-lexing logic in the parser. This logic is only applied when recovering from an error during list parsing. The logic is as follows: 1. During list parsing, if an unexpected token is encountered and it detects that an outer context can understand it and thus recover from it, it invokes the re-lexing logic in the lexer 2. This logic first checks if the lexer is in a parenthesized context and returns if it's not. Thus, the logic is a no-op if the lexer isn't in a parenthesized context 3. It then reduces the nesting level by 1. It shouldn't reset it to 0 because otherwise the recovery from nested list parsing will be incorrect 4. Then, it tries to find the last newline character going backwards from the current position of the lexer. This skips any whitespace, but if it encounters any character other than a newline or whitespace, it aborts. 5. Now, if there's a newline character, then it needs to be re-lexed in a logical context, which means that the lexer needs to emit it as a `Newline` token instead of `NonLogicalNewline`. 6. If the re-lexing gives a different token than the current one, the token source needs to update its token collection to remove all the tokens which come after the new current position. It turns out that the list parsing isn't that happy with the results, so it requires some re-arranging such that the following two errors are raised correctly: 1. Expected comma 2. Recovery context error For (1), the following scenarios need to be considered: * Missing comma between two elements * Half parsed element because the grammar doesn't allow it (for example, named expressions) For (2), the following scenarios need to be considered: 1. If the parser is at a comma which means that there's a missing element otherwise the comma would've been consumed by the first `eat` call above. 
And, the parser doesn't take the re-lexing route on a comma token. 2. If it's the first element and the current token is not a comma which means that it's an invalid element. resolves: #11640 ## Test Plan - [x] Update existing test snapshots and validate them - [x] Add additional test cases specific to the re-lexing logic and validate the snapshots - [x] Run the fuzzer on 3000+ valid inputs - [x] Run the fuzzer on invalid inputs - [x] Run the parser on various open source projects - [x] Make sure the ecosystem changes are none	2024-06-17 06:47:00 +00:00
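Steps 2 to 4 of the re-lexing logic in #11845 can be sketched as a backwards scan over the raw source. This is a simplified Python illustration, not the Rust lexer: `find_relex_position` and its parameters are hypothetical, and it deliberately ignores the cases handled by the later fixes listed on this page (line continuation characters, two-character line endings, comments with trailing whitespace, and triple-quoted f-strings).

```python
from typing import Optional, Tuple


def find_relex_position(source: str, offset: int, nesting: int) -> Tuple[int, Optional[int]]:
    """Return (new_nesting, newline_offset).

    `newline_offset` is None when there is nothing to re-lex.
    """
    # Step 2: outside a parenthesized context the logic is a no-op.
    if nesting == 0:
        return nesting, None

    # Step 3: leave exactly one nesting level (never reset to 0 directly).
    nesting -= 1

    # Step 4: walk backwards from the current position, skipping whitespace,
    # remembering the earliest newline seen, and stopping at anything else.
    newline_at = None
    i = offset
    while i > 0:
        ch = source[i - 1]
        if ch in "\r\n":
            newline_at = i - 1
        elif ch not in " \t":
            break
        i -= 1
    return nesting, newline_at


source = "call(a, b\n\ndef bar():\n    pass\n"
# The parser is at `def` inside the unclosed call, so nesting is 1.
result = find_relex_position(source, source.index("def"), nesting=1)
```

For this input the scan walks back over both newline characters and stops at `b`, so the lexer would be moved to the newline right after `call(a, b` and would emit it as a logical `Newline` (step 5). Truncating the token collection (step 6) is left out of the sketch.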


304 Commits