## Summary
This PR splits the `Constant` enum as individual literal nodes. It
introduces the following new nodes for each variant:
* `ExprStringLiteral`
* `ExprBytesLiteral`
* `ExprNumberLiteral`
* `ExprBooleanLiteral`
* `ExprNoneLiteral`
* `ExprEllipsisLiteral`
The main motivation behind this refactor is to introduce the new AST
node for implicit string concatenation in the coming PR. The elements of
that node will be either a string literal, bytes literal or a f-string
which can be implemented using an enum. This means that a string or
bytes literal cannot be represented by `Constant::Str` /
`Constant::Bytes` which creates an inconsistency.
This PR avoids that inconsistency by splitting the constant nodes into
it's own literal nodes, literal being the more appropriate naming
convention from a static analysis tool perspective.
This also makes working with literals in the linter and formatter much
more ergonomic like, for example, if one would want to check if this is
a string literal, it can be done easily using
`Expr::is_string_literal_expr` or matching against `Expr::StringLiteral`
as oppose to matching against the `ExprConstant` and enum `Constant`. A
few AST helper methods can be simplified as well which will be done in a
follow-up PR.
This introduces a new `Expr::is_literal_expr` method which is the same
as `Expr::is_constant_expr`. There are also intermediary changes related
to implicit string concatenation which are quiet less. This is done so
as to avoid having a huge PR which this already is.
## Test Plan
1. Verify and update all of the existing snapshots (parser, visitor)
2. Verify that the ecosystem check output remains **unchanged** for both
the linter and formatter
### Formatter ecosystem check
#### `main`
| project | similarity index | total files | changed files |
|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75803 | 1799 | 1647 |
| django | 0.99983 | 2772 | 34 |
| home-assistant | 0.99953 | 10596 | 186 |
| poetry | 0.99891 | 317 | 17 |
| transformers | 0.99966 | 2657 | 330 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99978 | 3669 | 20 |
| warehouse | 0.99977 | 654 | 13 |
| zulip | 0.99970 | 1459 | 22 |
#### `dhruv/constant-to-literal`
| project | similarity index | total files | changed files |
|----------------|------------------:|------------------:|------------------:|
| cpython | 0.75803 | 1799 | 1647 |
| django | 0.99983 | 2772 | 34 |
| home-assistant | 0.99953 | 10596 | 186 |
| poetry | 0.99891 | 317 | 17 |
| transformers | 0.99966 | 2657 | 330 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99978 | 3669 | 20 |
| warehouse | 0.99977 | 654 | 13 |
| zulip | 0.99970 | 1459 | 22 |
## Summary
This makes use of memchr and other methods to parse the strings
(hopefully) faster. It might also be worth converting the
`parse_fstring_middle` helper to use similar techniques, but I did not
implement it in this PR.
## Test Plan
This was tested using the existing tests and passed all of them.
## Summary
This PR fixes the bug where the value of a string node type includes the
escaped mac/windows newline character.
Note that the token value still includes them, it's only removed when
parsing the string content.
## Test Plan
Add new test cases for the string node type to check that the escapes
aren't being included in the string value.
fixes: #7723
## Summary
The motivation here is that this enables us to implement `Ranged` in
crates that don't depend on `ruff_python_ast`.
Largely a mechanical refactor with a lot of regex, Clippy help, and
manual fixups.
## Test Plan
`cargo test`
## Summary
I noticed some inconsistencies around uses of `.range.start()`, structs
that have a `TextRange` field but don't implement `Ranged`, etc.
## Test Plan
`cargo test`
## Summary
Per the discussion in
https://github.com/astral-sh/ruff/discussions/6183, this PR adds an
`implicit_concatenated` flag to the string and bytes constant variants.
It's not actually _used_ anywhere as of this PR, but it is covered by
the tests.
Specifically, we now use a struct for the string and bytes cases, along
with the `Expr::FString` node. That struct holds the value, plus the
flag:
```rust
#[derive(Clone, Debug, PartialEq, is_macro::Is)]
pub enum Constant {
Str(StringConstant),
Bytes(BytesConstant),
...
}
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct StringConstant {
/// The string value as resolved by the parser (i.e., without quotes, or escape sequences, or
/// implicit concatenations).
pub value: String,
/// Whether the string contains multiple string tokens that were implicitly concatenated.
pub implicit_concatenated: bool,
}
impl Deref for StringConstant {
type Target = str;
fn deref(&self) -> &Self::Target {
self.value.as_str()
}
}
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct BytesConstant {
/// The bytes value as resolved by the parser (i.e., without quotes, or escape sequences, or
/// implicit concatenations).
pub value: Vec<u8>,
/// Whether the string contains multiple string tokens that were implicitly concatenated.
pub implicit_concatenated: bool,
}
impl Deref for BytesConstant {
type Target = [u8];
fn deref(&self) -> &Self::Target {
self.value.as_slice()
}
}
```
## Test Plan
`cargo test`