Python/ruff - ruff - Gitea: Git with a cup of tea

Commit Graph

Author	SHA1	Message	Date
Micha Reiser	5109b50bb3	Use `CompactString` for `Identifier` (#12101 )	2024-07-01 10:06:02 +02:00
Micha Reiser	a6d892b1f4	Split `CallPath` into `QualifiedName` and `UnqualifiedName` (#10210 ) ## Summary Charlie can probably explain this better than I but it turns out, `CallPath` is used for two different things: * To represent unqualified names like `version` where `version` can be a local variable or imported (e.g. `from sys import version` where the full qualified name is `sys.version`) * To represent resolved, full qualified names This PR splits `CallPath` into two types to make this destinction clear. > Note: I haven't renamed all `call_path` variables to `qualified_name` or `unqualified_name`. I can do that if that's welcomed but I first want to get feedback on the approach and naming overall. ## Test Plan `cargo test`	2024-03-04 09:06:51 +00:00
Micha Reiser	e725b6fdaf	CallPath newtype wrapper (#10201 ) ## Summary This PR changes the `CallPath` type alias to a newtype wrapper. A newtype wrapper allows us to limit the API and to experiment with alternative ways to implement matching on `CallPath`s. ## Test Plan `cargo test`	2024-03-03 16:54:24 +01:00
Dhruv Manilawala	017e829115	Update string nodes for implicit concatenation (#7927 ) ## Summary This PR updates the string nodes (`ExprStringLiteral`, `ExprBytesLiteral`, and `ExprFString`) to account for implicit string concatenation. ### Motivation In Python, implicit string concatenation are joined while parsing because the interpreter doesn't require the information for each part. While that's feasible for an interpreter, it falls short for a static analysis tool where having such information is more useful. Currently, various parts of the code uses the lexer to get the individual string parts. One of the main challenge this solves is that of string formatting. Currently, the formatter relies on the lexer to get the individual string parts, and formats them including the comments accordingly. But, with PEP 701, f-string can also contain comments. Without this change, it becomes very difficult to add support for f-string formatting. ### Implementation The initial proposal was made in this discussion: https://github.com/astral-sh/ruff/discussions/6183#discussioncomment-6591993. There were various AST designs which were explored for this task which are available in the linked internal document[^1]. The selected variant was the one where the nodes were kept as it is except that the `implicit_concatenated` field was removed and instead a new struct was added to the `Expr*` struct. This would be a private struct would contain the actual implementation of how the AST is designed for both single and implicitly concatenated strings. This implementation is achieved through an enum with two variants: `Single` and `Concatenated` to avoid allocating a vector even for single strings. There are various public methods available on the value struct to query certain information regarding the node. The nodes are structured in the following way: ``` ExprStringLiteral - "foo" "bar" \|- StringLiteral - "foo" \|- StringLiteral - "bar" ExprBytesLiteral - b"foo" b"bar" \|- BytesLiteral - b"foo" \|- BytesLiteral - b"bar" ExprFString - "foo" f"bar {x}" \|- FStringPart::Literal - "foo" \|- FStringPart::FString - f"bar {x}" \|- StringLiteral - "bar " \|- FormattedValue - "x" ``` [^1]: Internal document: https://www.notion.so/astral-sh/Implicit-String-Concatenation-e036345dc48943f89e416c087bf6f6d9?pvs=4 #### Visitor The way the nodes are structured is that the entire string, including all the parts that are implicitly concatenation, is a single node containing individual nodes for the parts. The previous section has a representation of that tree for all the string nodes. This means that new visitor methods are added to visit the individual parts of string, bytes, and f-strings for `Visitor`, `PreorderVisitor`, and `Transformer`. ## Test Plan - `cargo insta test --workspace --all-features --unreferenced reject` - Verify that the ecosystem results are unchanged	2023-11-24 17:55:41 -06:00
Dhruv Manilawala	230c9ce236	Split `Constant` to individual literal nodes (#8064 ) ## Summary This PR splits the `Constant` enum as individual literal nodes. It introduces the following new nodes for each variant: * `ExprStringLiteral` * `ExprBytesLiteral` * `ExprNumberLiteral` * `ExprBooleanLiteral` * `ExprNoneLiteral` * `ExprEllipsisLiteral` The main motivation behind this refactor is to introduce the new AST node for implicit string concatenation in the coming PR. The elements of that node will be either a string literal, bytes literal or a f-string which can be implemented using an enum. This means that a string or bytes literal cannot be represented by `Constant::Str` / `Constant::Bytes` which creates an inconsistency. This PR avoids that inconsistency by splitting the constant nodes into it's own literal nodes, literal being the more appropriate naming convention from a static analysis tool perspective. This also makes working with literals in the linter and formatter much more ergonomic like, for example, if one would want to check if this is a string literal, it can be done easily using `Expr::is_string_literal_expr` or matching against `Expr::StringLiteral` as oppose to matching against the `ExprConstant` and enum `Constant`. A few AST helper methods can be simplified as well which will be done in a follow-up PR. This introduces a new `Expr::is_literal_expr` method which is the same as `Expr::is_constant_expr`. There are also intermediary changes related to implicit string concatenation which are quiet less. This is done so as to avoid having a huge PR which this already is. ## Test Plan 1. Verify and update all of the existing snapshots (parser, visitor) 2. Verify that the ecosystem check output remains unchanged for both the linter and formatter ### Formatter ecosystem check #### `main` \| project \| similarity index \| total files \| changed files \| \|----------------\|------------------:\|------------------:\|------------------:\| \| cpython \| 0.75803 \| 1799 \| 1647 \| \| django \| 0.99983 \| 2772 \| 34 \| \| home-assistant \| 0.99953 \| 10596 \| 186 \| \| poetry \| 0.99891 \| 317 \| 17 \| \| transformers \| 0.99966 \| 2657 \| 330 \| \| twine \| 1.00000 \| 33 \| 0 \| \| typeshed \| 0.99978 \| 3669 \| 20 \| \| warehouse \| 0.99977 \| 654 \| 13 \| \| zulip \| 0.99970 \| 1459 \| 22 \| #### `dhruv/constant-to-literal` \| project \| similarity index \| total files \| changed files \| \|----------------\|------------------:\|------------------:\|------------------:\| \| cpython \| 0.75803 \| 1799 \| 1647 \| \| django \| 0.99983 \| 2772 \| 34 \| \| home-assistant \| 0.99953 \| 10596 \| 186 \| \| poetry \| 0.99891 \| 317 \| 17 \| \| transformers \| 0.99966 \| 2657 \| 330 \| \| twine \| 1.00000 \| 33 \| 0 \| \| typeshed \| 0.99978 \| 3669 \| 20 \| \| warehouse \| 0.99977 \| 654 \| 13 \| \| zulip \| 0.99970 \| 1459 \| 22 \|	2023-10-30 12:13:23 +05:30
Micha Reiser	b34278e0cd	Introduce `LinterSettings` ## Stack Summary This stack splits `Settings` into `FormatterSettings` and `LinterSettings` and moves it into `ruff_workspace`. This change is necessary to add the `FormatterSettings` to `Settings` without adding `ruff_python_formatter` as a dependency to `ruff_linter` (and the linter should not contain the formatter settings). A quick overview of our settings struct at play: * `Options`: 1:1 representation of the options in the `pyproject.toml` or `ruff.toml`. Used for deserialization. * `Configuration`: Resolved `Options`, potentially merged from multiple configurations (when using `extend`). The representation is very close if not identical to the `Options`. * `Settings`: The resolved configuration that uses a data format optimized for reading. Optional fields are initialized with their default values. Initialized by `Configuration::into_settings` . The goal of this stack is to split `Settings` into tool-specific resolved `Settings` that are independent of each other. This comes at the advantage that the individual crates don't need to know anything about the other tools. The downside is that information gets duplicated between `Settings`. Right now the duplication is minimal (`line-length`, `tab-width`) but we may need to come up with a solution if more expensive data needs sharing. This stack focuses on `Settings`. Splitting `Configuration` into some smaller structs is something I'll follow up on later. ## PR Summary This PR extracts the linter-specific settings into a new `LinterSettings` struct and adds it as a `linter` field to the `Settings` struct. This is in preparation for moving `Settings` from `ruff_linter` to `ruff_workspace` ## Test Plan `cargo test`	2023-09-20 17:02:34 +02:00
Charlie Marsh	5849a75223	Rename `ruff` crate to `ruff_linter` (#7529 )	2023-09-20 08:38:27 +02:00

7 Commits