Python/ruff - ruff - Gitea: Git with a cup of tea

Commit Graph

Author	SHA1	Message	Date
Dhruv Manilawala	33ac2867b7	Use non-parenthesized range for `DebugText` (#9953 ) ## Summary This PR fixes the `DebugText` implementation to use the expression range instead of the parenthesized range. Taking the following code snippet as an example: ```python x = 1 print(f"{ ( x ) = }") ``` The output of running it would be: ``` ( x ) = 1 ``` Notice that the whitespace between the parentheses and the expression is preserved as is. Currently, we don't preserve this information in the AST, which defeats the purpose of `DebugText`, as the main purpose of the struct is to preserve whitespace _around_ the expression. This is also problematic when generating code from the AST node, because the generator then has no information about the parentheses or the whitespace between them and the expression, which leads to the removal of the parentheses in the generated code. I noticed this while working on the f-string formatting, where the debug text would be used to preserve the text surrounding the expression in the presence of a debug expression. The parentheses were being dropped, which made me realize that the problem is instead in the parser. ## Test Plan 1. Add a test case for the parser 2. Add a test case for the generator	2024-02-12 23:00:02 +05:30
Charlie Marsh	49fe1b85f2	Reduce size of `Expr` from 80 to 64 bytes (#9900 ) ## Summary This PR reduces the size of `Expr` from 80 to 64 bytes, by reducing the sizes of... - `ExprCall` from 72 to 56 bytes, by using boxed slices for `Arguments`. - `ExprCompare` from 64 to 48 bytes, by using boxed slices for its various vectors. In testing, the parser gets a bit faster, and the linter benchmarks improve quite a bit.	2024-02-09 02:53:13 +00:00
Micha Reiser	fe7d965334	Reduce `Result<Tok, LexicalError>` size by using `Box<str>` instead of `String` (#9885 )	2024-02-08 20:36:22 +00:00
Micha Reiser	688177ff6a	Use Rust 1.76 (#9897 )	2024-02-08 18:20:08 +00:00
Micha Reiser	47ad7b4500	Approximate tokens len (#9546 )	2024-01-19 17:39:37 +01:00
Micha Reiser	f192c72596	Remove type parameter from `parse_*` methods (#9466 )	2024-01-11 19:41:19 +01:00
Charlie Marsh	48e04cc2c8	Add row and column numbers to formatted parse errors (#9321 ) ## Summary We now render parse errors in the formatter identically to those in the linter, e.g.: ``` ❯ cargo run -p ruff_cli -- format foo.py error: Failed to parse foo.py:1:17: Unexpected token '=' ``` Closes https://github.com/astral-sh/ruff/issues/8338. Closes https://github.com/astral-sh/ruff/issues/9311.	2023-12-31 07:10:45 -05:00
Charlie Marsh	e80260a3c5	Remove source path from parser errors (#9322 ) ## Summary I always found it odd that we had to pass this in, since it's really higher-level context for the error. The awkwardness is further evidenced by the fact that we pass in fake values everywhere (even outside of tests). The source path isn't actually used to display the error; it's only accessed elsewhere to _re-display_ the error in certain cases. This PR instead passes the path directly in those cases.	2023-12-30 20:33:05 +00:00
Charlie Marsh	97e9d3c54f	Use `Display` for formatter parse errors (#9316 ) ## Summary This helps a bit with (but does not close) the issues described in https://github.com/astral-sh/ruff/issues/9311. E.g., now, we at least see: `error: Failed to format main.py: source contains syntax errors: invalid syntax. Got unexpected token '=' at byte offset 20`.	2023-12-29 22:26:57 +00:00
Micha Reiser	7e390d3772	Move `ParenthesizedExpr` to `ruff_python_parser` (#8987 )	2023-12-04 05:36:28 +00:00
Charlie Marsh	20782ab02c	Support type alias statements in simple statement positions (#8916 ) ## Summary Our `SoftKeywordTokenizer` only respected soft keywords in compound statement positions -- for example, at the start of a logical line: ```python type X = int ``` However, type aliases can also appear in simple statement positions, like: ```python class Class: type X = int ``` (Note that `match` and `case` are _not_ valid keywords in such positions.) This PR upgrades the tokenizer to track both kinds of valid positions. Closes https://github.com/astral-sh/ruff/issues/8900. Closes https://github.com/astral-sh/ruff/issues/8899. ## Test Plan `cargo test`	2023-11-30 19:15:19 +00:00
konsti	14e65afdc6	Update to Rust 1.74 and use new clippy lints table (#8722 ) Update to [Rust 1.74](https://blog.rust-lang.org/2023/11/16/Rust-1.74.0.html) and use the new clippy lints table. The update itself introduced a new clippy lint about superfluous hashes in raw strings, so those hashes were removed. I moved our lint config from `rustflags` to the newly stabilized [workspace.lints](https://doc.rust-lang.org/stable/cargo/reference/workspaces.html#the-lints-table). One consequence is that we have to set `unsafe_code = "warn"` instead of "forbid" because the latter now actually bans unsafe code: ``` error[E0453]: allow(unsafe_code) incompatible with previous forbid --> crates/ruff_source_file/src/newlines.rs:62:17 \| 62 \| #[allow(unsafe_code)] \| ^^^^^^^^^^^ overruled by previous forbid \| = note: `forbid` lint level was set on command line ``` --------- Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>	2023-11-16 18:12:46 -05:00
Charlie Marsh	d6a4283003	Fix range of unparenthesized tuple subject in match statement (#8101 ) ## Summary This was just a bug in the parser ranges, probably since it was initially implemented. Given `match n % 3, n % 5: ...`, the "subject" (i.e., the tuple of two binary operators) was using the entire range of the `match` statement. Closes https://github.com/astral-sh/ruff/issues/8091. ## Test Plan `cargo test`	2023-10-22 19:58:33 -04:00
Dhruv Manilawala	43883b7a15	Disallow f-strings in match pattern literal (#7857 ) ## Summary This PR fixes a bug to disallow f-strings in match pattern literal. ``` literal_pattern ::= signed_number \| signed_number "+" NUMBER \| signed_number "-" NUMBER \| strings \| "None" \| "True" \| "False" \| signed_number: NUMBER \| "-" NUMBER ``` Source: https://docs.python.org/3/reference/compound_stmts.html#grammar-token-python-grammar-literal_pattern Also, ```console $ python /tmp/t.py File "/tmp/t.py", line 4 case "hello " f"{name}": ^^^^^^^^^^^^^^^^^^ SyntaxError: patterns may only match literals and attribute lookups ``` ## Test Plan Update existing test case and accordingly the snapshots. Also, add a new test case to verify that the parser does raise an error.	2023-10-09 10:11:08 +00:00
Dhruv Manilawala	709abd534a	Fix lexing single-quoted f-string with multi-line format spec (#7787 ) ## Summary Reported at https://github.com/python/cpython/issues/110259 ## Test Plan Add test cases for the fix and update the snapshots	2023-10-05 23:12:09 +05:30
konsti	3ccd1d580d	Use crates.io unicode_names2 0.6.0 (#6478 ) Update `unicode_names2` to the crates.io release 0.6.0, removing a git dependency.	2023-10-02 18:17:38 -04:00
Dhruv Manilawala	e62e245c61	Add support for PEP 701 (#7376 ) ## Summary This PR adds support for PEP 701 in Ruff. This is a rollup PR of all the other individual PRs. The separate PRs were created for logic separation and code reviews. Refer to each pull request for a detailed description of the change. Refer to the PR description for the list of pull requests within this PR. ## Test Plan ### Formatter ecosystem checks Explanation for the change in ecosystem check: https://github.com/astral-sh/ruff/pull/7597#issue-1908878183 #### `main` ``` \| project \| similarity index \| total files \| changed files \| \|--------------\|------------------:\|------------------:\|------------------:\| \| cpython \| 0.76083 \| 1789 \| 1631 \| \| django \| 0.99983 \| 2760 \| 36 \| \| transformers \| 0.99963 \| 2587 \| 319 \| \| twine \| 1.00000 \| 33 \| 0 \| \| typeshed \| 0.99983 \| 3496 \| 18 \| \| warehouse \| 0.99967 \| 648 \| 15 \| \| zulip \| 0.99972 \| 1437 \| 21 \| ``` #### `dhruv/pep-701` ``` \| project \| similarity index \| total files \| changed files \| \|--------------\|------------------:\|------------------:\|------------------:\| \| cpython \| 0.76051 \| 1789 \| 1632 \| \| django \| 0.99983 \| 2760 \| 36 \| \| transformers \| 0.99963 \| 2587 \| 319 \| \| twine \| 1.00000 \| 33 \| 0 \| \| typeshed \| 0.99983 \| 3496 \| 18 \| \| warehouse \| 0.99967 \| 648 \| 15 \| \| zulip \| 0.99972 \| 1437 \| 21 \| ```	2023-09-29 02:55:39 +00:00
konsti	4d16e2308d	Formatter and parser refactoring (#7569 ) I got confused and refactored a bit, now the naming should be more consistent. This is the basis for the range formatting work. Changes: * `format_module` -> `format_module_source` (format a string) * `format_node` -> `format_module_ast` (format a program parsed into an AST) * Added `parse_ok_tokens` that takes `Token` instead of `Result<Token>` * Call the source code `source` consistently * Added a `tokens_and_ranges` helper * `python_ast` -> `module` (because that's the type)	2023-09-26 15:29:43 +02:00
Dhruv Manilawala	1adde24133	Rename parser mode from `Jupyter` to `Ipython` (#7153 )	2023-09-05 14:12:26 +00:00
Charlie Marsh	68f605e80a	Fix `WithItem` ranges for parenthesized, non-`as` items (#6782 ) ## Summary This PR attempts to address a problem in the parser related to the ranges of `WithItem` nodes in certain contexts -- specifically, `WithItem` nodes in parentheses that do not have an `as` token after them. For example, [here](https://play.ruff.rs/71be2d0b-2a04-4c7e-9082-e72bff152679): ```python with (a, b): pass ``` The range of the `WithItem` `a` is set to the range of `(a, b)`, as is the range of the `WithItem` `b`. In other words, when we have this kind of sequence, we use the range of the entire parenthesized context, rather than the ranges of the items themselves. Note that this also applies to cases [like](https://play.ruff.rs/c551e8e9-c3db-4b74-8cc6-7c4e3bf3713a): ```python with (a, b, c as d): pass ``` You can see the issue in the parser here: ```rust #[inline] WithItemsNoAs: Vec<ast::WithItem> = { <location:@L> <all:OneOrMore<Test<"all">>> <end_location:@R> => { all.into_iter().map(\|context_expr\| ast::WithItem { context_expr, optional_vars: None, range: (location..end_location).into() }).collect() }, } ``` Fixing this issue is... very tricky. The naive approach is to use the range of the `context_expr` as the range for the `WithItem`, but that range will be incorrect when the `context_expr` is itself parenthesized. For example, _that_ solution would fail here, since the range of the first `WithItem` would be that of `a`, rather than `(a)`: ```python with ((a), b): pass ``` The `with` parsing in general is highly precarious due to ambiguities in the grammar. Changing it in _any_ way seems to lead to an ambiguous grammar that LALRPOP fails to translate. Consensus seems to be that we don't really understand _why_ the current grammar works (i.e., _how_ it avoids these ambiguities as-is). The solution implemented here is to avoid changing the grammar itself, and instead change the shape of the nodes returned by various rules in the grammar.
Specifically, everywhere that we return `Expr`, we instead return `ParenthesizedExpr`, which includes a parenthesized range and the underlying `Expr` itself. (If an `Expr` isn't parenthesized, the ranges will be equivalent.) In `WithItemsNoAs`, we can then use the parenthesized range as the range for the `WithItem`.	2023-08-31 16:21:29 +01:00
Charlie Marsh	a70807e1e1	Expand `NamedExpr` range to include full range of parenthesized value (#6632 ) ## Summary Given: ```python if ( x := ( # 4 y # 5 ) # 6 ): pass ``` It turns out the parser ended the range of the `NamedExpr` at the end of `y`, rather than the end of the parenthesis that encloses `y`. This just seems like a bug -- the range should be from the start of the name on the left, to the end of the parenthesized node on the right. ## Test Plan `cargo test`	2023-08-17 14:34:05 +00:00
Micha Reiser	9584f613b9	Remove `allow(pedantic)` from formatter (#6549 )	2023-08-14 14:02:06 +02:00
Dhruv Manilawala	6a64f2289b	Rename `Magic` to `IpyEscape` (#6395 ) ## Summary This PR renames the `MagicCommand` token to `IpyEscapeCommand` token and `MagicKind` to `IpyEscapeKind` type to better reflect the purpose of the token and type. Similarly, it renames the AST nodes from `LineMagic` to `IpyEscapeCommand` prefixed with `Stmt`/`Expr` wherever necessary. It also makes renames from using `jupyter_magic` to `ipython_escape_commands` in various function names. The mode value is still `Mode::Jupyter` because the escape commands are part of the IPython syntax but the lexing/parsing is done for a Jupyter notebook. ### Motivation behind the rename: * IPython codebase defines it as "EscapeCommand" / "Escape Sequences": * Escape Sequences: `292e3a2345/IPython/core/inputtransformer2.py (L329-L333)` * Escape command: `292e3a2345/IPython/core/inputtransformer2.py (L410-L411)` * The word "magic" is used mainly for the actual magic commands i.e., the ones starting with `%`/`%%` (https://ipython.readthedocs.io/en/stable/interactive/reference.html#magic-command-system). So, this avoids any confusion between the Magic token (`%`, `%%`) and the escape command itself. ## Test Plan * `cargo test` to make sure all renames are done correctly. * `grep` for `jupyter_escape`/`magic` to make sure all renames are done correctly.	2023-08-09 13:28:18 +00:00
Dhruv Manilawala	e257c5af32	Add support for help end IPython escape commands (#6358 ) ## Summary This PR adds support for a stricter version of help end escape commands[^1] in the parser. By stricter, I mean that the escape tokens are only at the end of the command and there are no tokens at the start. This makes it difficult to implement it in the lexer without having to do a lot of lookaheads or keeping track of previous tokens. Now, as we're adding this in the parser, the lexer needs to recognize and emit a new token for `?`. So, a `Question` token is added, which will be recognized only in `Jupyter` mode. The conditions applied are the same as the ones in the original implementation in the IPython codebase (which is a regex): * There can only be either 1 or 2 question mark(s) at the end * The node before the question mark can be a `Name`, `Attribute`, `Subscript` (only with integer constants in slice position), or any combination of the 3 nodes. ## Test Plan Added test cases for various combinations of the possible nodes in the command value position and updated the snapshots. fixes: #6359 fixes: #5030 (This is the final piece) [^1]: https://github.com/astral-sh/ruff/pull/6272#issue-1833094281	2023-08-09 10:28:52 +05:30
Micha Reiser	debfca3a11	Remove `Parse` trait (#6235 )	2023-08-01 18:35:03 +02:00
Micha Reiser	f45e8645d7	Remove unused parser modes ## Summary This PR removes the `Interactive` and `FunctionType` parser modes that are unused by ruff. ## Test Plan `cargo test`	2023-08-01 13:10:07 +02:00
Micha Reiser	40f54375cb	Pull in RustPython parser (#6099 )	2023-07-27 09:29:11 +00:00

27 Commits
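
Two of the Python behaviours that the commit messages above rely on can be checked against CPython directly. The sketch below is illustrative only: it does not use Ruff or its parser, and the helper names (`debug_text_demo`, `is_valid_python`, `F_STRING_PATTERN`) are made up for this example. It shows that the `=` debug specifier echoes the parentheses and surrounding whitespace verbatim (the text `DebugText` is meant to preserve, per #9953), and that an f-string inside a `match` literal pattern is rejected as a syntax error (the case #7857 makes the parser reject).

```python
# Illustrative sketch using CPython itself, not Ruff's parser.
# All names defined here are hypothetical helpers for this example.


def debug_text_demo() -> str:
    """Return the result of an f-string that uses the `=` debug specifier."""
    x = 1
    # The text between `{` and `=` (plus the whitespace after `=`) is
    # echoed verbatim, so the parentheses and spaces survive.
    return f"{ ( x ) = }"


def is_valid_python(source: str) -> bool:
    """Report whether CPython can compile `source` as a module."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# A literal pattern that concatenates a plain string with an f-string.
F_STRING_PATTERN = (
    'name = "world"\n'
    'match "hello world":\n'
    '    case "hello " f"{name}":\n'
    "        pass\n"
)

if __name__ == "__main__":
    print(repr(debug_text_demo()))
    print(is_valid_python(F_STRING_PATTERN))
```

The source is compiled rather than executed, so the check only exercises the grammar and nothing in the snippet under test runs.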