## Summary
Fixes https://github.com/astral-sh/ruff/issues/19771
Fixes incorrect parsing of Unicode named escape sequences like `Hey
\N{snowman}` in `FormatString`, which were being split into separate
literal and field parts instead of being treated as a single literal
unit.
## Problem
The `FormatString` parser incorrectly handles Unicode named escape
sequences:
- **Current**: `Hey \N{snowman}` is parsed into two parts: `Literal("Hey
\N")` and `Field("snowman")`
- **Expected**: `Hey \N{snowman}` should be parsed into one part:
`Literal("Hey \N{snowman}")`

This affects rules that rely on proper format string parsing, such as
the f-string conversion fix for `UP032`.
## Solution
I modified `parse_literal` to detect and handle Unicode named escape
sequences before parsing single characters (sketched below):
- Introduced a flag to track when a backslash is "available" to escape
something.
- When the flag is `true` and the remaining text starts with `N{`, try
to parse the complete Unicode escape sequence as one unit, and reset the
flag to `false` after a successful parse.
- Reset the flag to `false` whenever the backslash has already been
consumed.
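A minimal sketch of the flag logic, under simplified assumptions: the real `parse_literal` also emits `Field` parts and validates the escape name, while this version only accumulates literal text.

```rust
/// Simplified sketch of the backslash-availability flag described above.
fn parse_literal(text: &str) -> String {
    let mut chars = text.chars().peekable();
    let mut literal = String::new();
    // True while a preceding backslash is still "available" to escape
    // whatever comes next.
    let mut backslash_available = false;

    while let Some(c) = chars.next() {
        if backslash_available && c == 'N' && chars.peek() == Some(&'{') {
            // Consume the entire `N{...}` sequence as one literal unit.
            literal.push(c);
            for inner in chars.by_ref() {
                literal.push(inner);
                if inner == '}' {
                    break;
                }
            }
            backslash_available = false;
            continue;
        }
        // A backslash is only "available" if it wasn't itself escaped,
        // so in `\\N{...}` the braces are left to be parsed as a field.
        backslash_available = c == '\\' && !backslash_available;
        literal.push(c);
    }
    literal
}

fn main() {
    // The named escape stays inside a single literal part.
    assert_eq!(parse_literal(r"Hey \N{snowman}"), r"Hey \N{snowman}");
}
```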
## Manual Verification
`"\N{angle}AOB = {angle}°".format(angle=180)`
**Result**
```diff
def foo():
- "\N{angle}AOB = {angle}°".format(angle=180)
+ f"\N{angle}AOB = {180}°"
Would fix 1 error.
```
`"\N{snowman} {snowman}".format(snowman=1)`
**Result**
```diff
def foo():
- "\N{snowman} {snowman}".format(snowman=1)
+ f"\N{snowman} {1}"
Would fix 1 error.
```
`"\\N{snowman} {snowman}".format(snowman=1)`
**Result**
```diff
def foo():
- "\\N{snowman} {snowman}".format(snowman=1)
+ f"\\N{1} {1}"
Would fix 1 error.
```
## Test Plan
- Added test cases (happy path, invalid input, edge case) for
`FormatString` when parsing Unicode escape sequences.
- Updated snapshots.
## Summary
This PR fixes a bug where the checker would request the tokens at an
offset that is invalid with respect to the source code.
Taking the source code from the linked issue as an example:
```py
relese_version :"0.0is 64"
```
Now, this isn't really a valid type annotation, but that's exactly the
kind of input this PR guards against: regardless of whether it's valid
or not, Ruff shouldn't panic.

The checker would visit the parsed type annotation (`0.0is 64`) and try
to detect any violations. Certain rules request the tokens for that
annotation, but the request would fail because, for the original source
code, the lexer only produces a single `String` token covering that
range. This worked before because the lexer was re-invoked for each
rule.
The solution is to store the parsed type annotation on the checker when
it's in a typing context and, if available, use the tokens from that
instead. This is enforced by a new API on the checker for getting the
tokens (sketched below).
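A hypothetical sketch of that API's shape; the field and type names here are illustrative stand-ins, not ruff's actual internals.

```rust
// Stub types so the sketch stands alone.
struct Tokens;

struct ParsedAnnotation {
    tokens: Tokens,
}

struct Checker<'a> {
    /// Tokens lexed from the module's source code.
    module_tokens: &'a Tokens,
    /// Set while visiting a string type annotation in a typing context.
    parsed_annotation: Option<&'a ParsedAnnotation>,
}

impl<'a> Checker<'a> {
    /// Single entry point for rules: prefer the tokens of the parsed
    /// annotation (whose offsets match the annotation's own source),
    /// falling back to the module's tokens otherwise.
    fn tokens(&self) -> &'a Tokens {
        match self.parsed_annotation {
            Some(annotation) => &annotation.tokens,
            None => self.module_tokens,
        }
    }
}
```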
But this means there are now two ways to get the tokens via the checker
API. I want to restrict this in a follow-up PR (#11741) to expose only
`tokens` and `comment_ranges` as methods and to limit access to the
parsed source code.
fixes: #11736
## Test Plan
- [x] Add a test case for `F632` rule and update the snapshot
- [x] Check all affected rules
- [x] No ecosystem changes
## Summary
Given a format string like `"{x} {x}".format(x=foo())`, we should avoid
converting to an f-string, since doing so would require repeating the
function call (`f"{foo()} {foo()}"`), which could introduce side
effects.
Closes https://github.com/astral-sh/ruff/issues/10258.
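A tiny sketch of the detection this implies; the helper name is hypothetical, and the real rule would additionally check whether the repeated argument is an expression with potential side effects (such as a call).

```rust
use std::collections::HashSet;

/// Hypothetical helper: does any field name appear more than once?
fn has_repeated_placeholder(field_names: &[&str]) -> bool {
    let mut seen = HashSet::new();
    // `insert` returns false when the name was already present.
    field_names.iter().any(|name| !seen.insert(*name))
}

fn main() {
    // `"{x} {x}".format(x=foo())` references `x` twice, so the fix is
    // skipped rather than duplicating the `foo()` call.
    assert!(has_repeated_placeholder(&["x", "x"]));
    assert!(!has_repeated_placeholder(&["x", "y"]));
}
```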
## Summary
This PR resolves an issue raised in
https://github.com/astral-sh/ruff/discussions/7810, whereby we wouldn't
fix an f-string that exceeds the line length _even if_ the resultant
code is _shorter_ than the current code.

As part of this change, I've also refactored and extracted some common
logic we use around ensuring that a fix doesn't violate the line-length
rules (sketched below).
## Test Plan
`cargo test`