Python/ruff - ruff - Gitea: Git with a cup of tea

Commit Graph

Author	SHA1	Message	Date
Micha Reiser	8237d4670c	Fix `\r` and `\r\n` handling in t- and f-string debug texts (#18673 )	2025-06-15 06:53:06 +01:00
David Peter	b913f568c4	[ty] Mark generated files as such in .gitattributes (#18195 ) ## Summary See comment here: https://github.com/astral-sh/ruff/pull/18156#discussion_r2095850586	2025-05-19 16:50:48 +02:00
Max Mynter	a01f25107a	[`pyupgrade`] Preserve parenthesis when fixing native literals containing newlines (`UP018`) (#17220 )	2025-04-24 08:48:02 +02:00
Douglas Creager	8e3633f55a	Auto-generate AST boilerplate (#15544 ) This PR replaces most of the hard-coded AST definitions with a generation script, similar to what happens in `rust_python_formatter`. I've replaced every "rote" definition that I could find, where the content is entirely boilerplate and only depends on what syntax nodes there are and which groups they belong to. This is a pretty massive diff, but it's entirely a refactoring. It should make absolutely no changes to the API or implementation. In particular, this required adding some configuration knobs that let us override default auto-generated names where they don't line up with types that we created previously by hand. ## Test plan There should be no changes outside of the `rust_python_ast` crate, which verifies that there were no API changes as a result of the auto-generation. Aggressive `cargo clippy` and `uvx pre-commit` runs after each commit in the branch. --------- Co-authored-by: Micha Reiser <micha@reiser.io> Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>	2025-01-17 14:23:02 -05:00
Dhruv Manilawala	47c9ed07f2	Consider 2-character EOL before line continuation (#12035 ) ## Summary This PR fixes a bug introduced in https://github.com/astral-sh/ruff/pull/12008 which didn't consider the two character newline after the line continuation character. For example, consider the following code highlighted with whitespaces: ```py call(foo # comment \\r\n \r\n def bar():\r\n ....pass\r\n ``` The lexer is at `def` when it's running the re-lexing logic and trying to move back to a newline character. It encounters `\n` and it's being escaped (incorrect) but `\r` is being escaped, so it moves the lexer to `\n` character. This creates an overlap in token ranges which causes the panic. ``` Name 0..4 Lpar 4..5 Name 5..8 Comment 9..20 NonLogicalNewline 20..22 <-- overlap between Newline 21..22 <-- these two tokens NonLogicalNewline 22..23 Def 23..26 ... ``` fixes: #12028 ## Test Plan Add a test case with line continuation and windows style newline character.	2024-06-26 14:00:48 +05:30
Dhruv Manilawala	8499abfa7f	Implement re-lexing logic for better error recovery (#11845 ) ## Summary This PR implements the re-lexing logic in the parser. This logic is only applied when recovering from an error during list parsing. The logic is as follows: 1. During list parsing, if an unexpected token is encountered and it detects that an outer context can understand it and thus recover from it, it invokes the re-lexing logic in the lexer 2. This logic first checks if the lexer is in a parenthesized context and returns if it's not. Thus, the logic is a no-op if the lexer isn't in a parenthesized context 3. It then reduces the nesting level by 1. It shouldn't reset it to 0 because otherwise the recovery from nested list parsing will be incorrect 4. Then, it tries to find last newline character going backwards from the current position of the lexer. This avoids any whitespaces but if it encounters any character other than newline or whitespace, it aborts. 5. Now, if there's a newline character, then it needs to be re-lexed in a logical context which means that the lexer needs to emit it as a `Newline` token instead of `NonLogicalNewline`. 6. If the re-lexing gives a different token than the current one, the token source needs to update it's token collection to remove all the tokens which comes after the new current position. It turns out that the list parsing isn't that happy with the results so it requires some re-arranging such that the following two errors are raised correctly: 1. Expected comma 2. Recovery context error For (1), the following scenarios needs to be considered: * Missing comma between two elements * Half parsed element because the grammar doesn't allow it (for example, named expressions) For (2), the following scenarios needs to be considered: 1. If the parser is at a comma which means that there's a missing element otherwise the comma would've been consumed by the first `eat` call above. And, the parser doesn't take the re-lexing route on a comma token. 2. If it's the first element and the current token is not a comma which means that it's an invalid element. resolves: #11640 ## Test Plan - [x] Update existing test snapshots and validate them - [x] Add additional test cases specific to the re-lexing logic and validate the snapshots - [x] Run the fuzzer on 3000+ valid inputs - [x] Run the fuzzer on invalid inputs - [x] Run the parser on various open source projects - [x] Make sure the ecosystem changes are none	2024-06-17 06:47:00 +00:00
Dhruv Manilawala	13ffb5bc19	Replace LALRPOP parser with hand-written parser (#10036 ) (Supersedes #9152, authored by @LaBatata101) ## Summary This PR replaces the current parser generated from LALRPOP to a hand-written recursive descent parser. It also updates the grammar for [PEP 646](https://peps.python.org/pep-0646/) so that the parser outputs the correct AST. For example, in `data[*x]`, the index expression is now a tuple with a single starred expression instead of just a starred expression. Beyond the performance improvements, the parser is also error resilient and can provide better error messages. The behavior as seen by any downstream tools isn't changed. That is, the linter and formatter can still assume that the parser will _stop_ at the first syntax error. This will be updated in the following months. For more details about the change here, refer to the PR corresponding to the individual commits and the release blog post. ## Test Plan Write _lots_ and _lots_ of tests for both valid and invalid syntax and verify the output. ## Acknowledgements - @MichaReiser for reviewing 100+ parser PRs and continuously providing guidance throughout the project - @LaBatata101 for initiating the transition to a hand-written parser in #9152 - @addisoncrump for implementing the fuzzer which helped [catch](https://github.com/astral-sh/ruff/pull/10903) [a](https://github.com/astral-sh/ruff/pull/10910) [lot](https://github.com/astral-sh/ruff/pull/10966) [of](https://github.com/astral-sh/ruff/pull/10896) [bugs](https://github.com/astral-sh/ruff/pull/10877) --------- Co-authored-by: Victor Hugo Gomes <labatata101@linuxmail.org> Co-authored-by: Micha Reiser <micha@reiser.io>	2024-04-18 17:57:39 +05:30
Auguste Lalande	b117f33075	[`pycodestyle`] Implement `blank-line-at-end-of-file` (`W391`) (#10243 ) ## Summary Implements the [blank line at end of file](https://pycodestyle.pycqa.org/en/latest/intro.html#error-codes) rule (W391) from pycodestyle. Renamed to TooManyNewlinesAtEndOfFile for clarity. ## Test Plan New fixtures have been added Part of #2402	2024-03-11 22:07:59 -04:00
Andrew Gallant	d9845a2628	format doctests in docstrings (#8811 ) ## Summary This PR adds opt-in support for formatting doctests in docstrings. This reflects initial support and it is intended to add support for Markdown and reStructuredText Python code blocks in the future. But I believe this PR lays the groundwork, and future additions for Markdown and reST should be less costly to add. It's strongly recommended to review this PR commit-by-commit. The last few commits in particular implement the bulk of the work here and represent the denser portions. Some things worth mentioning: * The formatter is itself not perfect, and it is possible for it to produce invalid Python code. Because of this, reformatted code snippets are checked for Python validity. If they aren't valid, then we (unfortunately silently) bail on formatting that code snippet. * There are a couple places where it would be nice to at least warn the user that doctest formatting failed, but it wasn't clear to me what the best way to do that is. * I haven't yet run this in anger on a real world code base. I think that should happen before merging. Closes #7146 ## Test Plan * [x] Pass the local test suite. * [x] Scrutinize ecosystem changes. * [x] Run this formatter on extant code and scrutinize the results. (e.g., CPython, numpy.)	2023-11-27 11:14:55 -05:00
Charlie Marsh	5849a75223	Rename `ruff` crate to `ruff_linter` (#7529 )	2023-09-20 08:38:27 +02:00
Thomas de Zeeuw	0b963ddcfa	Add unreachable code rule (#5384 ) Co-authored-by: Thomas de Zeeuw <thomas@astral.sh> Co-authored-by: Micha Reiser <micha@reiser.io>	2023-07-04 14:27:23 +00:00
Micha Reiser	f72ed255e5	chore: Use `LF` on all platforms (#3005 ) I worked on #2993 and ran into issues that the formatter tests are failing on Windows because `writeln!` emits `\n` as line terminator on all platforms, but `git` on Windows converted the line endings in the snapshots to `\r\n`. I then tried to replicate the issue on my Windows machine and was surprised that all linter snapshot tests are failing on my machine. I figured out after some time that it is due to my global git config keeping the input line endings rather than converting to `\r\n`. Luckily, I've been made aware of #2033 which introduced an "override" for the `assert_yaml_snapshot` macro that normalizes new lines, by splitting the formatted string using the platform-specific newline character. This is a clever approach and gives nice diffs for multiline fixes but makes assumptions about the setup contributors use and requires special care whenever we use line endings inside of tests. I recommend that we remove the special new line handling and use `.gitattributes` to enforce the use of `LF` on all platforms [guide](https://docs.github.com/en/get-started/getting-started-with-git/configuring-git-to-handle-line-endings). This gives us platform agnostic tests without having to worry about line endings in our tests or different git configurations. ## Note It may be necessary for Windows contributors to run the following command to update the line endings of their files ```bash git rm --cached -r . git reset --hard ```	2023-02-20 20:13:37 +00:00

12 Commits