This PR attempts to improve the placement of own-line comments between
branches in the setting where the comment is more indented than the
preceding node.
There are two main changes.
### First change: Preceding node has leading content
If the preceding node has leading content, we now regard the comment as
automatically _less_ indented than the preceding node, and format
accordingly.
For example,
```python
if True: preceding_node
    # leading on `else`, not trailing on `preceding_node`
else: ...
```
This is more compatible with `black`, although there is a (presumably
very uncommon) edge case:
```python
if True:
    this;that
        # leading on `else`, but trailing in `black`
else: ...
```
I'm sort of okay with this - presumably if one wanted a comment for
those semicolon-separated statements, one should have put it _above_
them, and if one wanted a comment only for `that`, then it ought to have
been on the same line?
### Second change: searching for last child in body
While searching for the (recursively) last child in the body of the
preceding _branch_, we implicitly assumed that the preceding node had to
have a body to begin the recursion. But actually, in the base case, the
preceding node _is_ the last child in the body of the preceding branch.
So, for example:
```python
if True:
something
last_child_but_no_body
# leading on else for `main` but trailing in this PR
else: ...
```
### More examples
The table below is an attempt to summarize the changes in behavior. The
rows alternate between an example snippet with `while` and the same
example with `if` - in the former case we do _not_ have an `else` node
and in the latter we do.
Notice that:
1. On `main` our handling of `if` vs. `while` is not consistent, whereas
it is consistent in the present PR
2. On `main` we disagree with `black` in all cases except the last
example, whereas the present PR agrees with `black` in all cases (though
see above for a wonky edge case where we disagree).
<table>
<tr>
<th>Original</th>
<th><code>main</code></th>
<th>This PR</th>
<th><code>black</code></th>
</tr>
<tr>
<td>
<pre lang="python">
while True:
    pass
        # comment
else:
    pass
</pre>
</td>
<td>
<pre lang="python">
while True:
    pass
else:
    # comment
    pass
</pre>
</td>
<td>
<pre lang="python">
while True:
    pass
    # comment
else:
    pass
</pre>
</td>
<td>
<pre lang="python">
while True:
    pass
    # comment
else:
    pass
</pre>
</td>
</tr>
<tr>
<td>
<pre lang="python">
if True:
    pass
        # comment
else:
    pass
</pre>
</td>
<td>
<pre lang="python">
if True:
    pass
    # comment
else:
    pass
</pre>
</td>
<td>
<pre lang="python">
if True:
    pass
    # comment
else:
    pass
</pre>
</td>
<td>
<pre lang="python">
if True:
    pass
    # comment
else:
    pass
</pre>
</td>
</tr>
<tr>
<td>
<pre lang="python">
while True: pass
# comment
else:
    pass
</pre>
</td>
<td>
<pre lang="python">
while True:
    pass
# comment
else:
    pass
</pre>
</td>
<td>
<pre lang="python">
while True:
    pass
# comment
else:
    pass
</pre>
</td>
<td>
<pre lang="python">
while True:
    pass
# comment
else:
    pass
</pre>
</td>
</tr>
<tr>
<td>
<pre lang="python">
if True: pass
# comment
else:
    pass
</pre>
</td>
<td>
<pre lang="python">
if True:
    pass
# comment
else:
    pass
</pre>
</td>
<td>
<pre lang="python">
if True:
    pass
# comment
else:
    pass
</pre>
</td>
<td>
<pre lang="python">
if True:
    pass
# comment
else:
    pass
</pre>
</td>
</tr>
<tr>
<td>
<pre lang="python">
while True: pass
    # comment
else:
    pass
</pre>
</td>
<td>
<pre lang="python">
while True:
    pass
else:
    # comment
    pass
</pre>
</td>
<td>
<pre lang="python">
while True:
    pass
    # comment
else:
    pass
</pre>
</td>
<td>
<pre lang="python">
while True:
    pass
    # comment
else:
    pass
</pre>
</td>
</tr>
<tr>
<td>
<pre lang="python">
if True: pass
    # comment
else:
    pass
</pre>
</td>
<td>
<pre lang="python">
if True:
    pass
    # comment
else:
    pass
</pre>
</td>
<td>
<pre lang="python">
if True:
    pass
    # comment
else:
    pass
</pre>
</td>
<td>
<pre lang="python">
if True:
    pass
    # comment
else:
    pass
</pre>
</td>
</tr>
</table>
Summary
--
This PR fixes #17796 by taking the approach mentioned in
https://github.com/astral-sh/ruff/issues/17796#issuecomment-2847943862
of simply recursing into the `MatchAs` patterns when checking if we need
parentheses. This allows us to reuse the parentheses of the inner
pattern instead of also breaking the `MatchAs` pattern itself:
```diff
 match class_pattern:
     case Class(xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx) as capture:
         pass
-    case (
-        Class(xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx) as capture
-    ):
+    case Class(
+        xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+    ) as capture:
         pass
-    case (
-        Class(
-            xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
-        ) as capture
-    ):
+    case Class(
+        xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+    ) as capture:
         pass
     case (
         Class(
@@ -685,13 +683,11 @@
 match sequence_pattern_brackets:
     case [xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx] as capture:
         pass
-    case (
-        [xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx] as capture
-    ):
+    case [
+        xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+    ] as capture:
         pass
-    case (
-        [
-            xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
-        ] as capture
-    ):
+    case [
+        xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+    ] as capture:
         pass
```
I haven't really resolved the question of whether it's always okay to
recurse, but I'm hoping the ecosystem check on this PR might shed some
light on that.
Test Plan
--
New tests based on the issue, plus a review of the ecosystem check on this PR
Summary
--
This is a first step toward fixing #9745. After reviewing our open
issues and several Black issues and PRs, I personally found the function
case the most compelling, especially with very long argument lists:
```py
def func(
    self,
    arg1: int,
    arg2: bool,
    arg3: bool,
    arg4: float,
    arg5: bool,
) -> tuple[...]:

    if arg2 and arg3:
        raise ValueError
```
or many annotations:
```py
def function(
    self, data: torch.Tensor | tuple[torch.Tensor, ...], other_argument: int
) -> torch.Tensor | tuple[torch.Tensor, ...]:

    do_something(data)
    return something
```
I think docstrings help the situation substantially both because syntax
highlighting will usually give a very clear separation between the
annotations and the docstring and because we already allow a blank line
_after_ the docstring:
```py
def function(
    self, data: torch.Tensor | tuple[torch.Tensor, ...], other_argument: int
) -> torch.Tensor | tuple[torch.Tensor, ...]:
    """
    A function doing something.

    And a longer description of the things it does.
    """

    do_something(data)
    return something
```
There are still other comments on #9745, such as [this one] with 9
upvotes, where users specifically request blank lines in all block
types, or at least including conditionals and loops. I'm sympathetic to
that case as well, even if personally I don't find an [example] like
this:
```py
if blah:
    # Do some stuff that is logically related
    data = get_data()

    # Do some different stuff that is logically related
    results = calculate_results()

    return results
```
to be much more readable than:
```py
if blah:
    # Do some stuff that is logically related
    data = get_data()
    # Do some different stuff that is logically related
    results = calculate_results()
    return results
```
I'm probably just used to the latter from the formatters I've used, but
I do prefer it. I also think that functions are the least susceptible to
the accidental introduction of a newline after refactoring described in
Micha's [comment] on #8893.
I actually considered further restricting this change to functions with
multiline headers. I don't think very short functions like:
```py
def foo():
    return 1
```
benefit nearly as much from the allowed newline, but I just went with
any function without a docstring for now. I guess a marginal case like:
```py
def foo(a_long_parameter: ALongType, b_long_parameter: BLongType) -> CLongType:
    return 1
```
might be a good argument for not restricting it.
I caused a couple of syntax errors before adding special handling for
the ellipsis-only case, so I suspect that there are some other
interesting edge cases that may need to be handled better.
Test Plan
--
Existing tests, plus a few simple new ones. As noted above, I suspect
that we may need a few more for edge cases I haven't considered.
[this one]:
https://github.com/astral-sh/ruff/issues/9745#issuecomment-2876771400
[example]:
https://github.com/psf/black/issues/902#issuecomment-1562154809
[comment]:
https://github.com/astral-sh/ruff/issues/8893#issuecomment-1867259744
When formatting clause headers for clauses that are not their own node,
like an `else` clause or `finally` clause, we begin searching for the
keyword at the end of the previous statement. However, if the previous
statement ended in a semicolon, this caused a panic because we only
expected trivia between the end of the last statement and the keyword.
This PR adjusts the starting point of our search for the keyword to
begin after the optional semicolon in these cases.
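For illustration, a hypothetical minimal input of the shape described
(the actual reproducer is in the linked issue):
```python
# The statement before `else` ends in a semicolon, so the keyword search
# used to start at the `;` and hit a non-trivia token before `else`.
if True:
    x = 1;
else:
    x = 2
```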
Closes #21065
## Summary
I spun this out from #21005 because I thought it might be helpful
separately. It just renders a nice `Diagnostic` for syntax errors
pointing to the source of the error. This seemed a bit more helpful to
me than just the byte offset when working on #21005, and we had most of
the code around after #20443 anyway.
## Test Plan
This doesn't actually affect any passing tests, but here's an example of
the additional output I got when I broke the spacing after the `in`
token:
```
error[internal-error]: Expected 'in', found name
--> /home/brent/astral/ruff/crates/ruff_python_formatter/resources/test/fixtures/black/cases/cantfit.py:50:79
|
48 | need_more_to_make_the_line_long_enough,
49 | )
50 | del ([], name_1, name_2), [(), [], name_4, name_3], name_1[[name_2 for name_1 inname_0]]
| ^^^^^^^^
51 | del ()
|
```
I just appended this to the other existing output for now.
## Summary
Fixes #20774 by tracking whether an `InterpolatedStringState` element is
nested inside another interpolated element. This feels like kind of a
naive fix, so I'm open to other ideas. But it resolves the problem in
the issue and clears up the syntax error in the black compatibility
test, without affecting many other cases.
The other affected case is actually interesting too because the
[input](96b156303b/crates/ruff_python_formatter/resources/test/fixtures/ruff/expression/fstring.py (L707))
is invalid, but the previous quote selection fixed the invalid syntax:
```pycon
Python 3.11.13 (main, Sep 2 2025, 14:20:25) [Clang 20.1.4 ] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> f'{1: abcd "{'aa'}" }' # input
File "<stdin>", line 1
f'{1: abcd "{'aa'}" }'
^^
SyntaxError: f-string: expecting '}'
>>> f'{1: abcd "{"aa"}" }' # old output
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: Invalid format specifier ' abcd "aa" ' for object of type 'int'
>>> f'{1: abcd "{'aa'}" }' # new output
File "<stdin>", line 1
f'{1: abcd "{'aa'}" }'
^^
SyntaxError: f-string: expecting '}'
```
We now preserve the invalid syntax in the input.
Unfortunately, this also seems to be another edge case I didn't consider
in https://github.com/astral-sh/ruff/pull/20867 because we don't flag
this as a syntax error after 0.14.1:
<details><summary>Shell output</summary>
<p>
```
> uvx ruff@0.14.0 check --ignore ALL --target-version py311 - <<EOF
f'{1: abcd "{'aa'}" }'
EOF
invalid-syntax: Cannot reuse outer quote character in f-strings on Python 3.11 (syntax was added in Python 3.12)
--> -:1:14
|
1 | f'{1: abcd "{'aa'}" }'
| ^
|
Found 1 error.
> uvx ruff@0.14.1 check --ignore ALL --target-version py311 - <<EOF
f'{1: abcd "{'aa'}" }'
EOF
All checks passed!
> uvx python@3.11 -m ast <<EOF
f'{1: abcd "{'aa'}" }'
EOF
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/home/brent/.local/share/uv/python/cpython-3.11.13-linux-x86_64-gnu/lib/python3.11/ast.py", line 1752, in <module>
main()
File "/home/brent/.local/share/uv/python/cpython-3.11.13-linux-x86_64-gnu/lib/python3.11/ast.py", line 1748, in main
tree = parse(source, args.infile.name, args.mode, type_comments=args.no_type_comments)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/brent/.local/share/uv/python/cpython-3.11.13-linux-x86_64-gnu/lib/python3.11/ast.py", line 50, in parse
return compile(source, filename, mode, flags,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<stdin>", line 1
f'{1: abcd "{'aa'}" }'
^^
SyntaxError: f-string: expecting '}'
```
</p>
</details>
I assumed that was the same `ParseError` as the one caused by
`f"{1:""}"`, but this is a nested interpolation inside of the format
spec.
## Test Plan
New test copied from the black compatibility test. I guess this is a
duplicate now; I started working on this branch before the new black
tests were imported, so I could delete the separate test in our fixtures
if that's preferable.
Summary
--
Fixes #20844 by refining the unsupported syntax error check for
[PEP 701] f-strings before Python 3.12 to allow backslash escapes and
escaped outer quotes in the format spec part of f-strings. These are
only disallowed within the f-string expression part on earlier versions.
Using the examples from the PR:
```pycon
>>> f"{1:\x64}"
'1'
>>> f"{1:\"d\"}"
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: Invalid format specifier '"d"' for object of type 'int'
```
Note that the second case is a runtime error, but this is actually
avoidable if you override `__format__`, so despite being pretty weird,
this could actually be a valid use case.
```pycon
>>> class C:
...     def __format__(*args, **kwargs): return "<C>"
...
>>> f"{C():\"d\"}"
'<C>'
```
At first I thought narrowing the range we check to exclude the format
spec would only work for escapes, but it turns out that cases like
`f"{1:""}"` are already covered by an existing `ParseError`, so we can
just narrow the range of both our escape and quote checks.
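Concretely (my own illustrative snippets, not taken from the PR):
```python
# Escape in the format spec: no longer flagged before 3.12
ok = f"{1:\x64}"

# Escape in the expression part: still flagged before 3.12
# bad = f"{'\x64'}"
```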
Our comment check also seems to be working correctly because it's based
on the actual tokens. A case like
[this](https://play.ruff.rs/9f1c2ff2-cd8e-4ad7-9f40-56c0a524209f):
```python
f"""{1:# }"""
```
doesn't include a comment token; instead, the `#` is part of an
`InterpolatedStringLiteralElement`.
Test Plan
--
New inline parser tests
[PEP 701]: https://peps.python.org/pep-0701/
Summary
--
This PR implements the black preview style from
https://github.com/psf/black/pull/4720. As of Python 3.14, you're
allowed to omit the parentheses around groups of exceptions, as long as
there's no `as` binding:
**3.13**
```pycon
Python 3.13.4 (main, Jun 4 2025, 17:37:06) [Clang 20.1.4 ] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> try: ...
... except (Exception, BaseException): ...
...
Ellipsis
>>> try: ...
... except Exception, BaseException: ...
...
File "<python-input-1>", line 2
except Exception, BaseException: ...
^^^^^^^^^^^^^^^^^^^^^^^^
SyntaxError: multiple exception types must be parenthesized
```
**3.14**
```pycon
Python 3.14.0rc2 (main, Sep 2 2025, 14:20:56) [Clang 20.1.4 ] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> try: ...
... except Exception, BaseException: ...
...
Ellipsis
>>> try: ...
... except (Exception, BaseException): ...
...
Ellipsis
>>> try: ...
... except Exception, BaseException as e: ...
...
File "<python-input-2>", line 2
except Exception, BaseException as e: ...
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
SyntaxError: multiple exception types must be parenthesized when using 'as'
```
I think this ended up being pretty straightforward, at least once Micha
showed me where to start :)
Test Plan
--
New tests
At first I thought we were deviating from black in how we handle
comments within the exception type tuple, but I think this applies to
how we format all tuples, not specifically to the new preview style.
Summary
--
```shell
git clone git@github.com:psf/black.git ../other/black
crates/ruff_python_formatter/resources/test/fixtures/import_black_tests.py ../other/black
```
Then ran our tests and accepted the snapshots.
I had to make a small fix to our tuple normalization logic for `del`
statements in the second commit; otherwise the tests were panicking at a
changed AST. I think the new implementation is closer to the intention
described in the nearby comment anyway, though.
The first commit adds the new Python, settings, and `.expect` files, the
next three commits make some small
fixes to help get the tests running, and then the fifth commit accepts
all but one of the new snapshots. The last commit includes the new
unsupported syntax error for one f-string example, tracked in #20774.
Test Plan
--
Newly imported tests. I went through all of the new snapshots and added
review comments below. I think they're all expected, except a few cases
I wasn't 100% sure about.
This PR resolves the issue noticed in
https://github.com/astral-sh/ruff/pull/20777#discussion_r2417233227.
Namely, cases like this were being flagged as syntax errors despite
being perfectly valid on Python 3.8:
```pycon
Python 3.8.20 (default, Oct 2 2024, 16:34:12)
[Clang 18.1.8 ] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> with (open("foo.txt", "w")): ...
...
Ellipsis
>>> with (open("foo.txt", "w")) as f: print(f)
...
<_io.TextIOWrapper name='foo.txt' mode='w' encoding='UTF-8'>
```
The second of these was already allowed but not the first:
```shell
> ruff check --target-version py38 --ignore ALL - <<EOF
with (open("foo.txt", "w")): ...
with (open("foo.txt", "w")) as f: print(f)
EOF
invalid-syntax: Cannot use parentheses within a `with` statement on Python 3.8 (syntax was added in Python 3.9)
--> -:1:6
|
1 | with (open("foo.txt", "w")): ...
| ^
2 | with (open("foo.txt", "w")) as f: print(f)
|
Found 1 error.
```
There was some discussion of related cases in
https://github.com/astral-sh/ruff/pull/16523#discussion_r1984657793, but
it seems I overlooked the single-element case when flagging tuples. As
suggested in the other thread, we can just check if there's more than
one element or a trailing comma, which is what triggers tuple parsing on
<=3.8, and avoid the false positives.
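For illustration (my own examples of the distinction):
```python
with (open("a.txt")): ...                  # single parenthesized item: fine on 3.8
with (open("a.txt"),): ...                 # trailing comma makes it a tuple: flagged
with (open("a.txt"), open("b.txt")): ...   # multiple elements: flagged
```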
## Summary
Based on the suggestion in
https://github.com/astral-sh/ruff/issues/20774#issuecomment-3383153511,
I added rendering of unsupported syntax errors in our `format` test.
In support of this, I added a `DummyFileResolver` type to `ruff_db` to
pass to `DisplayDiagnostics::new` (first commit). Another option would
obviously be implementing this directly in the fixtures, but we'd have
to import a `NotebookIndex` somehow; either by depending directly on
`ruff_notebook` or re-exporting it from `ruff_db`. I thought it might be
convenient elsewhere to have a dummy resolver, for example in the
parser, where we currently have a separate rendering pipeline
[copied](https://github.com/astral-sh/ruff/blob/main/crates/ruff_python_parser/tests/fixtures.rs#L321)
from our old rendering code in `ruff_linter`. I also briefly tried
implementing a `TestDb` in the formatter since I noticed the
`ruff_python_formatter::db` module, but that was turning into a lot more
code than the dummy resolver.
We could also push this a bit further if we wanted. I didn't add the new
snapshots to the black compatibility tests or to the preview snapshots,
for example. I thought it was kind of noisy enough (and helpful enough)
already, though. We could also use a shorter diagnostic format, but the
full output seems most useful once we accept this initial large batch of
changes.
## Test Plan
I went through the baseline snapshots pretty quickly, but they all
looked reasonable to me, with one exception I noted below. I also tested
that the case from #20774 produces a new unsupported syntax error.
Summary
--
Closes #19467 and also removes the warning about using Python 3.14
without preview enabled.
I also bumped `PythonVersion::default` to 3.10 because Python 3.9
reaches EOL this month, but we could also defer that for now if we
wanted.
The first three commits are related to the `latest` bump to 3.14; the
fourth commit bumps the default to 3.10.
Note that this PR also bumps the default Python version for ty to 3.10
because
there was a test asserting that it stays in sync with
`ast::PythonVersion`.
Test Plan
--
Existing tests
I spot-checked the ecosystem report, and I believe these are all
expected. Inbits doesn't specify a target Python version, so I guess
we're applying the default. UP007, UP035, and UP045 all use the new
default value to emit new diagnostics.
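For example (my own sketch of why the default bump surfaces new
diagnostics):
```python
from typing import Optional

# With the default target version now 3.10, UP045 can suggest the
# PEP 604 spelling `int | None` even without a configured target:
def f(x: Optional[int]) -> Optional[int]:
    return x
```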
## Summary
Removes the `module_ptr` field from `AstNodeRef` in release mode, and
changes `NodeIndex` to a `NonZeroU32` to reduce the size of
`Option<AstNodeRef<_>>` fields.
I believe CI runs in debug mode, so this won't show up in the memory
report, but this reduces memory by ~2% in release mode.
As of [this cpython PR](https://github.com/python/cpython/pull/135996),
it is not allowed to concatenate t-strings with non-t-strings,
implicitly or explicitly. Expressions such as `"foo" t"{bar}"` are now
syntax errors.
This PR updates some AST nodes and parsing to reflect this change.
The structural change is that `TStringPart` is no longer needed, since,
as in the case of `BytesStringLiteral`, the only possibilities are that
we have a single `TString` or a vector of such (representing an implicit
concatenation of t-strings). This removes a level of nesting from many
AST expressions (which is what all the snapshot changes reflect), and
simplifies some logic in the implementation of visitors, for example.
The other change of note is in the parser. When we meet an implicit
concatenation of string-like literals, we now count the number of
t-string literals. If these do not exhaust the total number of
implicitly concatenated pieces, then we emit a syntax error. To recover
from this syntax error, we encode any t-string pieces as _invalid_
string literals (which means we flag them as invalid, record their
range, and record the value as `""`). Note that if at least one of the
pieces is an f-string we prefer to parse the entire string as an
f-string; otherwise we parse it as a string.
This logic is exactly the same as how we currently treat
`BytesStringLiteral` parsing and error recovery - and carries with it
the same pros and cons.
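A sketch of the new rules (my own examples; t-strings require Python
3.14):
```python
name = "world"

# Still fine: implicit concatenation of t-strings with t-strings
greeting = t"hello " t"{name}"

# Now a syntax error: a t-string implicitly concatenated with a plain
# string; the t-string piece is recovered as an invalid, empty literal
# greeting = "hello " t"{name}"
```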
Finally, note that I have not implemented any changes in the
implementation of the formatter. As far as I can tell, none are needed.
I did change a few of the fixtures so that we are always concatenating
t-strings with t-strings.
Closes #18671
Note that while this has, I believe, always been invalid syntax, it was
reported as a different syntax error until Python 3.12:
Python 3.11:
```pycon
>>> x = 1
>>> f"{x! s}"
File "<stdin>", line 1
f"{x! s}"
^
SyntaxError: f-string: invalid conversion character: expected 's', 'r', or 'a'
```
Python 3.12:
```pycon
>>> x = 1
>>> f"{x! s}"
File "<stdin>", line 1
f"{x! s}"
^^^
SyntaxError: f-string: conversion type must come right after the exclamanation mark
```
## Summary
Garbage collect ASTs once we are done checking a given file. Queries
with a cross-file dependency on the AST will reparse the file on demand.
This reduces ty's peak memory usage by ~20-30%.
The primary change of this PR is adding a `node_index` field to every
AST node, which is assigned by the parser. `ParsedModule` can use this to
create a flat index of AST nodes any time the file is parsed (or
reparsed). This allows `AstNodeRef` to simply index into the current
instance of the `ParsedModule`, instead of storing a pointer directly.
The indices are somewhat hackily (using an atomic integer) assigned by
the `parsed_module` query instead of by the parser directly. Assigning
the indices in source-order in the (recursive) parser turns out to be
difficult, and collecting the nodes during semantic indexing is
impossible as `SemanticIndex` does not hold onto a specific
`ParsedModuleRef`, which the pointers in the flat AST are tied to. This
means that we have to do an extra AST traversal to assign and collect
the nodes into a flat index, but the small performance impact (~3% on
cold runs) seems worth it for the memory savings.
Part of https://github.com/astral-sh/ty/issues/214.
This PR implements template strings (t-strings) in the parser and
formatter for Ruff.
Minimal changes necessary to compile were made in other parts of the code (e.g. ty, the linter, etc.). These will be covered properly in follow-up PRs.
## Summary
This PR stabilizes the fix for
https://github.com/astral-sh/ruff/issues/14001
We try to only make breaking formatting changes once a year. However,
the plan was to release this fix as part of Ruff 0.9 but I somehow
missed it when promoting all other formatter changes.
I think it's worth making an exception here considering that this is a
bug fix, it improves readability, and it should be rare
(very few files in a single project). Our version policy explicitly
allows breaking formatter changes in any minor release and the idea of
only making breaking formatter changes once a year is mainly to avoid
multiple releases throughout the year that introduce large formatter
changes.
Closes https://github.com/astral-sh/ruff/issues/14001
## Test Plan
Updated snapshot
## Summary
This should give us better coverage for the unsupported syntax error
features and increases our confidence that the formatter doesn't
accidentally introduce new unsupported syntax errors.
A feature like this would have been very useful when working on
f-string formatting, where it took a lot of iteration to find all
Python 3.11-or-older incompatibilities.
## Test Plan
I applied my changes on top of
https://github.com/astral-sh/ruff/pull/16523 and removed the target
version check in the with-statement formatting code. As expected, the
integration tests now failed.
## Summary
This is part of the preparation for detecting syntax errors in the
parser from https://github.com/astral-sh/ruff/pull/16090/. As suggested
in [this
comment](https://github.com/astral-sh/ruff/pull/16090/#discussion_r1953084509),
I started working on a `ParseOptions` struct that could be stored in the
parser. For this initial refactor, I only made it hold the existing
`Mode` option, but for syntax errors, we will also need it to have a
`PythonVersion`. For that use case, I'm picturing something like a
`ParseOptions::with_python_version` method, so you can extend the
current calls to something like
```rust
ParseOptions::from(mode).with_python_version(settings.target_version)
```
But I thought it was worth adding `ParseOptions` alone without changing
any other behavior first.
Most of the diff is just updating call sites taking `Mode` to take
`ParseOptions::from(Mode)` or those taking `PySourceType`s to take
`ParseOptions::from(PySourceType)`. The interesting changes are in the
new `parser/options.rs` file and smaller parts of `parser/mod.rs` and
`ruff_python_parser/src/lib.rs`.
## Test Plan
Existing tests, this should not change any behavior.
## Summary
This PR updates the formatter and linter to use the `PythonVersion`
struct from the `ruff_python_ast` crate internally. While this doesn't
remove the need for the `linter::PythonVersion` enum, it does remove the
`formatter::PythonVersion` enum and limits the use in the linter to
deserializing from CLI arguments and config files and moves most of the
remaining methods to the `ast::PythonVersion` struct.
## Test Plan
Existing tests, with some inputs and outputs updated to reflect the new
(de)serialization format. I think these are test-specific and shouldn't
affect any external (de)serialization.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
This is another follow-up to #15726 and #15778, extending the
quote-preserving behavior to f-strings and deleting the now-unused
`Generator::quote` field.
## Details
I also made one unrelated change to `rules/flynt/helpers.rs` to remove a
`to_string` call for making a `Box<str>` and tweaked some arguments to
some of the `Generator::unparse_f_string` methods to make the code
easier to follow, in my opinion. Happy to revert especially the latter
of these if needed.
Unfortunately this still does not fix the issue in #9660, which appears
to be more of an escaping issue than a quote-preservation issue. After
#15726, the result is now `a = f'# {"".join([])}' if 1 else ""` instead
of `a = f"# {''.join([])}" if 1 else ""` (single quotes on the outside
now), but we still don't have the desired behavior of double quotes
everywhere on Python 3.12+. I added a test for this but split it off
into another branch since it ended up being unaddressed here, but my
`dbg!` statements showed the correct preferred quotes going into
[`UnicodeEscape::with_preferred_quote`](https://github.com/astral-sh/ruff/blob/main/crates/ruff_python_literal/src/escape.rs#L54).
## Test Plan
Existing rule and `Generator` tests.
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
This is a very closely related follow-up to #15726, adding the same
quote-preserving behavior to bytestrings. Only one rule (UP018) was
affected this time, and it was easy to mirror the plain string changes.
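For instance (a sketch, assuming UP018's native-literals rewrite):
```python
# The generated fix now keeps the original quote style of the bytestring
# instead of defaulting to double quotes:
x = bytes(b'data')  # fix: x = b'data' (single quotes preserved)
y = bytes(b"data")  # fix: y = b"data"
```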
## Test Plan
Existing tests
## Summary
This is a first step toward fixing #7799 by using the quoting style
stored in the `flags` field on `ast::StringLiteral`s to select a quoting
style. This PR does not include support for f-strings or byte strings.
Several rules also needed small updates to pass along existing quoting
styles instead of using `StringLiteralFlags::default()`. The remaining
snapshot changes are intentional and should preserve the quotes from the
input strings.
## Test Plan
Existing tests with some accepted updates, plus a few new RUF055 tests
for raw strings.
---------
Co-authored-by: Alex Waygood <alex.waygood@gmail.com>
## Summary
Fixes https://github.com/astral-sh/ruff/issues/14778
The formatter incorrectly removed the inner implicitly concatenated
string for the following single-line f-string:
```py
f"{'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' 'a' if True else ""}"
# formatted
f"{ if True else ''}"
```
This happened because I changed the `RemoveSoftlinesBuffer` in
https://github.com/astral-sh/ruff/pull/14489 to remove any content
wrapped in `if_group_breaks`. After all, it emulates an *all flat*
layout. This works fine when `if_group_breaks` is only used to **add**
content if the gorup breaks. It doesn't work if the same content is
rendered differently depending on if the group fits using
`if_group_breaks` and `if_groups_fits` because the enclosing `group`
might still *break* if the entire content exceeds the line-length limit.
This PR fixes this by unwrapping any `if_group_fits` content by removing
the `if_group_fits` start and end tags.
## Test Plan
added test
## Summary
This PR fixes a bug in the f-string formatting to not consider the
escaped newlines for `is_multiline`. This is done by checking if the
f-string is triple-quoted or not, similar to normal string literals.
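For example (my own sketch of the distinction):
```python
x = 1

# Contains an escaped newline but is not treated as multiline:
a = f"one \
two {x}"

# Triple-quoted and genuinely multiline:
b = f"""one
two {x}"""
```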
This is not required to be gated behind preview because the logic change
for `is_multiline` was added in
https://github.com/astral-sh/ruff/pull/14454.
## Test Plan
Add a test case which formats differently on `main`:
https://play.ruff.rs/ea3c55c2-f0fe-474e-b6b8-e3365e0ede5e
## Summary
fixes: #14608
The logic that was only applied for 3.12+ target version needs to be
applied for other versions as well.
## Test Plan
I've moved the existing test cases for 3.12 only to `f_string.py` so
that it's tested against the default target version.
I think we should probably enable testing for two target versions
(pre-3.12 and 3.12), but it won't highlight any issue because the parser
doesn't consider this. Maybe we should enable this once we have
target-version-specific syntax errors in place
(https://github.com/astral-sh/ruff/issues/6591).
## Summary
fixes: #13813
This PR fixes a bug in the formatting of assignment statements when the
value is an f-string.
This is resolved by using custom best-fit layouts if the f-string is (a)
not already a flat f-string (which could never be multiline) and (b) not
a multiline string (which could never be flattened). So, it is used in
cases like the following:
```py
aaaaaaaaaaaaaaaaaa = f"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
expression}moreeeeeeeeeeeeeeeee"
```
Which is (a) `FStringLayout::Multiline` and (b) not a multiline string.
There are various other examples in the PR diff along with additional
explanation and context as code comments.
## Test Plan
Add multiple test cases for various scenarios.
When there is a function or class definition at the end of a suite
followed by the beginning of an alternative block, we have to insert a
single empty line between them.
In the if-else-statement example below, we insert an empty line after
the `foo` in the if-block, but none after the else-block `foo`, since in
the latter case the enclosing suite already adds empty lines.
```python
if sys.version_info >= (3, 10):
    def foo():
        return "new"

else:
    def foo():
        return "old"


class Bar:
    pass
```
To do so, we track whether the current suite is the last one in the
current statement with a new option on the suite kind.
Fixes #12199
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
This PR updates the entire parser stack in multiple ways:
### Make the lexer lazy
* https://github.com/astral-sh/ruff/pull/11244
* https://github.com/astral-sh/ruff/pull/11473
Previously, Ruff's lexer would act as an iterator. The parser would
collect all the tokens in a vector first and then process the tokens to
create the syntax tree.
The first task in this project is to update the entire parsing flow to
make the lexer lazy. This includes the `Lexer`, `TokenSource`, and
`Parser`. For context, the `TokenSource` is a wrapper around the `Lexer`
to filter out the trivia tokens[^1]. Now, the parser will ask the token
source to get the next token and only then the lexer will continue and
emit the token. This means that the lexer needs to be aware of the
"current" token. When the `next_token` is called, the current token will
be updated with the newly lexed token.
The main motivation to make the lexer lazy is to allow re-lexing a token
in a different context. This is going to be really useful for making
the parser error resilient. For example, currently the emitted tokens
remain the same even if the parser can recover from an unclosed
parenthesis. This is important because the lexer emits a
`NonLogicalNewline` in parenthesized context while a normal `Newline` in
non-parenthesized context. These different kinds of newlines are also
used to emit the indentation tokens, which are important for the parser
as they're used to determine the start and end of a block.
Additionally, this allows us to implement the following functionalities:
1. Checkpoint - rewind infrastructure: The idea here is to create a
checkpoint and continue lexing. At a later point, this checkpoint can be
used to rewind the lexer back to the provided checkpoint.
2. Remove the `SoftKeywordTransformer` and instead use lookahead or
speculative parsing to determine whether a soft keyword is a keyword or
an identifier
3. Remove the `Tok` enum. The `Tok` enum represents the tokens emitted
by the lexer but it contains owned data which makes it expensive to
clone. The new `TokenKind` enum just represents the type of token which
is very cheap.
This brings up a question as to how the parser will get the owned value
which was stored on `Tok`. This will be solved by introducing a new
`TokenValue` enum which only contains a subset of token kinds which has
the owned value. This is stored on the lexer and is requested by the
parser when it wants to process the data. For example:
8196720f80/crates/ruff_python_parser/src/parser/expression.rs (L1260-L1262)
[^1]: Trivia tokens are `NonLogicalNewline` and `Comment`
### Remove `SoftKeywordTransformer`
* https://github.com/astral-sh/ruff/pull/11441
* https://github.com/astral-sh/ruff/pull/11459
* https://github.com/astral-sh/ruff/pull/11442
* https://github.com/astral-sh/ruff/pull/11443
* https://github.com/astral-sh/ruff/pull/11474
For context,
https://github.com/RustPython/RustPython/pull/4519/files#diff-5de40045e78e794aa5ab0b8aacf531aa477daf826d31ca129467703855408220
added support for soft keywords in the parser which uses infinite
lookahead to classify a soft keyword as a keyword or an identifier. This
is a brilliant idea as it basically wraps the existing Lexer and works
on top of it, which means that the logic for lexing and re-lexing a soft
keyword remains separate. The change here is to remove
`SoftKeywordTransformer` and let the parser determine this based on
context, lookahead and speculative parsing.
* **Context:** The transformer needs to know the position of the lexer
between it being at a statement position or a simple statement position.
This is because a `match` token starts a compound statement while a
`type` token starts a simple statement. **The parser already knows
this.**
* **Lookahead:** Now that the parser knows the context it can perform
lookahead of up to two tokens to classify the soft keyword. The logic
for this is mentioned in the PRs implementing it for the `type` and
`match` soft keywords.
* **Speculative parsing:** This is where the checkpoint - rewind
infrastructure helps. For `match` soft keyword, there are certain cases
for which we can't classify based on lookahead. The idea here is to
create a checkpoint and keep parsing. Based on whether the parsing was
successful and what tokens are ahead we can classify the remaining
cases. Refer to #11443 for more details.
If the soft keyword is being parsed in an identifier context, it'll be
converted to an identifier and the emitted token will be updated as
well. Refer to
8196720f80/crates/ruff_python_parser/src/parser/expression.rs (L487-L491).
The `case` soft keyword doesn't require any special handling because
it'll be a keyword only in the context of a match statement.
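To illustrate the ambiguity (my own example):
```python
import re

# Keyword: `match` in statement position, followed by a subject and `:`
match ["quit"]:
    case [command]:
        print(command)

# Identifier: the same token in the same statement position
match = re.match(r"\d+", "123")
print(match)
```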
### Update the parser API
* https://github.com/astral-sh/ruff/pull/11494
* https://github.com/astral-sh/ruff/pull/11505
Now that the lexer is in sync with the parser, and the parser helps to
determine whether a soft keyword is a keyword or an identifier, the
lexer cannot be used on its own. The reason being that it's not
sensitive to the context (which is correct). This means that the parser
API needs to be updated to not allow any access to the lexer.
Previously, there were multiple ways to parse the source code:
1. Passing the source code itself
2. Or, passing the tokens
Now that the lexer and parser are working together, the API
corresponding to (2) cannot exist. The final API is mentioned in this
PR description: https://github.com/astral-sh/ruff/pull/11494.
### Refactor the downstream tools (linter and formatter)
* https://github.com/astral-sh/ruff/pull/11511
* https://github.com/astral-sh/ruff/pull/11515
* https://github.com/astral-sh/ruff/pull/11529
* https://github.com/astral-sh/ruff/pull/11562
* https://github.com/astral-sh/ruff/pull/11592
And, the final set of changes involves updating all references of the
lexer and `Tok` enum. This was done in two parts:
1. Update all the references in a way that doesn't require any changes
from this PR i.e., it can be done independently
* https://github.com/astral-sh/ruff/pull/11402
* https://github.com/astral-sh/ruff/pull/11406
* https://github.com/astral-sh/ruff/pull/11418
* https://github.com/astral-sh/ruff/pull/11419
* https://github.com/astral-sh/ruff/pull/11420
* https://github.com/astral-sh/ruff/pull/11424
2. Update all the remaining references to use the changes made in this
PR
For (2), there were various strategies used:
1. Introduce a new `Tokens` struct which wraps the token vector and adds
methods to query a certain subset of tokens. These include:
1. `up_to_first_unknown` which replaces the `tokenize` function
2. `in_range` and `after` which replace the `lex_starts_at` function
where the former returns the tokens within the given range while the
latter returns all the tokens after the given offset
2. Introduce a new `TokenFlags` which is a set of flags to query certain
information from a token. Currently, this information is only limited to
any string type token but can be expanded to include other information
in the future as needed. https://github.com/astral-sh/ruff/pull/11578
3. Move the `CommentRanges` to the parsed output because this
information is common to both the linter and the formatter. This removes
the need for `tokens_and_ranges` function.
## Test Plan
- [x] Update and verify the test snapshots
- [x] Make sure the entire test suite is passing
- [x] Make sure there are no changes in the ecosystem checks
- [x] Run the fuzzer on the parser
- [x] Run this change on dozens of open-source projects
### Running this change on dozens of open-source projects
Refer to the PR description to get the list of open source projects used
for testing.
Now, the following tests were done between `main` and this branch:
1. Compare the output of `--select=E999` (syntax errors)
2. Compare the output of default rule selection
3. Compare the output of `--select=ALL`
**Conclusion: all output were same**
## What's next?
The next step is to introduce re-lexing logic and update the parser to
feed the recovery information to the lexer so that it can emit the
correct token. This moves us one step closer to having error resilience
in the parser and provides Ruff the possibility to lint even if the
source code contains syntax errors.
## Summary
This PR fixes the bug where the formatter would format an f-string and
could potentially change the AST.
For a triple-quoted f-string, the element can't be formatted into
multiline if it has a format specifier because otherwise the newline
would be treated as part of the format specifier.
Given the following f-string:
```python
f"""aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbb ccccccccccc {
variable:.3f} ddddddddddddddd eeeeeeee"""
```
The formatter sees that the f-string is already multiline, so it assumes
that it can contain line breaks, i.e., be broken into multiple lines. But
in this specific case we can't format it as:
```python
f"""aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbb ccccccccccc {
variable:.3f
} ddddddddddddddd eeeeeeee"""
```
Because the format specifier string would become ".3f\n", which is not
the original string (`.3f`).
If the original source code already contained a newline, they'll be
preserved. For example:
```python
f"""aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbb ccccccccccc {
variable:.3f
} ddddddddddddddd eeeeeeee"""
```
The above will be formatted as:
```py
f"""aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbb ccccccccccc {variable:.3f
} ddddddddddddddd eeeeeeee"""
```
Note that the newline after `.3f` is part of the format specifier which
needs to be preserved.
The Python version is irrelevant in this case.
fixes: #10040
## Test Plan
Add some test cases to verify this behavior.
(Supersedes #9152, authored by @LaBatata101)
## Summary
This PR replaces the current parser generated from LALRPOP with a
hand-written recursive descent parser.
It also updates the grammar for [PEP
646](https://peps.python.org/pep-0646/) so that the parser outputs the
correct AST. For example, in `data[*x]`, the index expression is now a
tuple with a single starred expression instead of just a starred
expression.
Beyond the performance improvements, the parser is also error resilient
and can provide better error messages. The behavior as seen by any
downstream tools isn't changed. That is, the linter and formatter can
still assume that the parser will _stop_ at the first syntax error. This
will be updated in the following months.
For more details about the change here, refer to the PRs corresponding to
the individual commits and the release blog post.
## Test Plan
Write _lots_ and _lots_ of tests for both valid and invalid syntax and
verify the output.
## Acknowledgements
- @MichaReiser for reviewing 100+ parser PRs and continuously providing
guidance throughout the project
- @LaBatata101 for initiating the transition to a hand-written parser in
#9152
- @addisoncrump for implementing the fuzzer which helped
[catch](https://github.com/astral-sh/ruff/pull/10903)
[a](https://github.com/astral-sh/ruff/pull/10910)
[lot](https://github.com/astral-sh/ruff/pull/10966)
[of](https://github.com/astral-sh/ruff/pull/10896)
[bugs](https://github.com/astral-sh/ruff/pull/10877)
---------
Co-authored-by: Victor Hugo Gomes <labatata101@linuxmail.org>
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
This is a follow up on https://github.com/astral-sh/ruff/pull/10492
I incorrectly assumed that `subscript.value.end()` always points past
the value. However, this isn't the case for parenthesized values where
the end "ends" before the parentheses.
## Test Plan
I added new tests for the parenthesized case.
## Summary
This PR fixes an instability where formatting a subscript
whose `slice` is not an `ExprSlice` and that has a trailing
end-of-line comment after its opening `[` required two formatting passes
to be stable.
The fix is to associate the trailing end-of-line comment as dangling
comment on `[` to preserve its position, similar to how Ruff does it for
other parenthesized expressions.
This also matches how trailing end-of-line subscript comments are
handled when the `slice` is an `ExprSlice`.
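A hypothetical input of the affected shape (my own example):
```python
data = {"key": 1}
key = "key"

# End-of-line comment after the opening `[`, where the slice is a plain name
value = data[  # comment
    key
]
```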
Fixes https://github.com/astral-sh/ruff/issues/10355
## Versioning
Shipping this as part of a patch release is fine because:
* It fixes a stability issue
* It doesn't impact already formatted code because Ruff would already
have moved the comment to the end of the line (instead of preserving
it)
## Test Plan
Added tests
## Summary
I used `codespell` and `gramma` to identify misspellings and grammar
errors throughout the codebase and fixed them. I tried not to make any
controversial changes, but feel free to revert as you see fit.
## Summary
This PR changes how we format `with` statements with a single with item
for Python 3.8 or older. This change is not compatible with Black.
This is how we format a single-item with statement today
```python
def run(data_path, model_uri):
    with pyspark.sql.SparkSession.builder.config(
        key="spark.python.worker.reuse", value=True
    ).config(key="spark.ui.enabled", value=False).master(
        "local-cluster[2, 1, 1024]"
    ).getOrCreate():
        # ignore spark log output
        spark.sparkContext.setLogLevel("OFF")
        print(score_model(spark, data_path, model_uri))
```
This is different than how we would format the same expression if it is
inside any other clause header (`while`, `if`, ...):
```python
def run(data_path, model_uri):
    while (
        pyspark.sql.SparkSession.builder.config(
            key="spark.python.worker.reuse", value=True
        )
        .config(key="spark.ui.enabled", value=False)
        .master("local-cluster[2, 1, 1024]")
        .getOrCreate()
    ):
        # ignore spark log output
        spark.sparkContext.setLogLevel("OFF")
        print(score_model(spark, data_path, model_uri))
```
Which seems inconsistent to me.
This PR changes the formatting of the single-item `with` for Python 3.8 or
older to match that of other clause headers.
```python
def run(data_path, model_uri):
    with (
        pyspark.sql.SparkSession.builder.config(
            key="spark.python.worker.reuse", value=True
        )
        .config(key="spark.ui.enabled", value=False)
        .master("local-cluster[2, 1, 1024]")
        .getOrCreate()
    ):
        # ignore spark log output
        spark.sparkContext.setLogLevel("OFF")
        print(score_model(spark, data_path, model_uri))
```
According to our versioning policy, this style change is gated behind a
preview flag.
## Test Plan
See added tests.
## Summary
Fixes https://github.com/astral-sh/ruff/issues/10267
The issue with the current formatting is that the formatter flips
between the `SingleParenthesizedContextManager` and
`ParenthesizeIfExpands` or `SingleWithTarget` layouts because they use
incompatible formatting (`SingleParenthesizedContextManager`:
`maybe_parenthesize_expression(context)` vs. `ParenthesizeIfExpands`:
`parenthesize_if_expands(item)` and `SingleWithTarget`:
`optional_parentheses(item)`).
The fix is to ensure that the layouts between which the formatter flips
when adding or removing parentheses are the same. I do this by
introducing a new `SingleWithoutTarget` layout that uses the same
formatting as `SingleParenthesizedContextManager` if it has no target
and by preferring `SingleWithoutTarget` over `ParenthesizeIfExpands` or
`SingleWithTarget`.
## Formatting change
The downside is that we now use `maybe_parenthesize_expression` over
`parenthesize_if_expands` for expressions where
`can_omit_optional_parentheses` returns `false`. This can lead to stable
formatting changes. I only found one formatting change in our ecosystem
check and, unfortunately, this is necessary to fix the instability (and
instability fixes are okay to have as part of minor changes according to
our versioning policy).
The benefit of the change is that `with` items with a single context
manager and without a target are now formatted identically to how the
same expression would be formatted in other clause headers.
## Test Plan
I ran the ecosystem check locally
## Summary
Fixes the handling of end-of-line comments that belong to `**kwargs`
when the `**kwargs` comes after a slash.
The issue was that we failed to include the `**kwargs` start position
when determining the start of the next node coming after the `/`.
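A hypothetical signature of the affected shape:
```python
def f(
    positional,
    /,
    **kwargs,  # end-of-line comment that was previously mishandled
):
    ...
```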
Fixes https://github.com/astral-sh/ruff/issues/10281
## Test Plan
Added test
## Summary
_This is a preview-only feature and is available using the `--preview`
command-line flag._
With the implementation of [PEP 701] in Python 3.12, f-strings can now
be broken into multiple lines, can contain comments, and can re-use the
same quote character. Currently, no other Python formatter formats the
f-strings so there's some discussion which needs to happen in defining
the style used for f-string formatting. Relevant discussion:
https://github.com/astral-sh/ruff/discussions/9785
The goal for this PR is to add minimal support for f-string formatting.
This would be to format the expressions within the replacement fields without
introducing any major style changes.
### Newlines
The heuristic for adding newlines is similar to that of
[Prettier](https://prettier.io/docs/en/next/rationale.html#template-literals)
where the formatter would only split an expression in the replacement
field across multiple lines if there was already a line break within the
replacement field.
In other words, the formatter would not add any newlines unless they
were already present i.e., they were added by the user. This makes
breaking any expression inside an f-string optional and in control of
the user. For example,
```python
# We wouldn't break this
aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa { aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc } cccccccccc"
# But, we would break the following as there's already a newline
aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa {
aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc } cccccccccc"
```
If there are comments in any of the replacement fields of the f-string,
then it will always be a multi-line f-string, in which case the formatter
would prefer to break expressions i.e., introduce newlines. For example,
```python
x = f"{ # comment
a }"
```
### Quotes
The logic for formatting quotes remains unchanged. The existing logic is
used to determine the necessary quote char and is used accordingly.
Now, if the expression inside an f-string is itself string-like, then
we need to make sure to preserve the existing quote and not change it to
the preferred quote unless the target version is 3.12 or later. For example,
```python
f"outer {'inner'} outer"
# For pre 3.12, preserve the single quote
f"outer {'inner'} outer"
# While for 3.12 and later, the quotes can be changed
f"outer {"inner"} outer"
```
But, for triple-quoted strings, we can re-use the same quote char unless
the inner string is itself a triple-quoted string.
```python
f"""outer {"inner"} outer""" # valid
f"""outer {'''inner'''} outer""" # preserve the single quote char for the inner string
```
### Debug expressions
If debug expressions are present in the replacement field of a f-string,
then the whitespace needs to be preserved as it will be rendered as-is
(for example, `f"{ x = }"`). If there are any nested f-strings, then
the whitespace in them needs to be preserved as well which means that
we'll stop formatting the f-string as soon as we encounter a debug
expression.
```python
f"outer { x = !s :.3f}"
# ^^
# We can remove these whitespaces
```
Now, the whitespace doesn't need to be preserved around conversion spec
and format specifiers, so we'll format them as usual but we won't be
formatting any nested f-string within the format specifier.
### Miscellaneous
- The
[`hug_parens_with_braces_and_square_brackets`](https://github.com/astral-sh/ruff/issues/8279)
preview style isn't implemented w.r.t. the f-string curly braces.
- The
[indentation](https://github.com/astral-sh/ruff/discussions/9785#discussioncomment-8470590)
is always relative to the f-string containing statement
## Test Plan
* Add new test cases
* Review existing snapshot changes
* Review the ecosystem changes
[PEP 701]: https://peps.python.org/pep-0701/
## Summary
This PR implements the `blank_line_after_nested_stub_class` preview
style in the formatter.
The logic is divided into 3 parts:
1. In between preceding and following nodes at top level and nested
suite
2. When there's a trailing comment after the class
3. When there is no following node from (1), which is the case when it's
the last or the only node in a suite
We handle (3) with `FormatLeadingAlternateBranchComments`.
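A sketch of the style in a stub file (my own example, assuming a `.pyi`
context):
```python
class Outer:
    class Nested: ...

    # the preview style inserts the blank line above
    field: int
```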
## Test Plan
- Add new test cases and update existing snapshots
- Checked the `typeshed` diff
fixes: #8891