Emit `MagicCommand` token when it is the assignment value[^1] i.e., on
the right side of an assignment statement.
Examples:
```python
pwd = !pwd
foo = %timeit a = b
bar = %timeit a % 3
baz = %matplotlib \
inline"
```
[^1]: Only `%` and `!` are valid in that position, other magic kinds are
not valid
Previously, empty lambda arguments (e.g. `lambda: 1`) would get the
range of the entire expression, which leads to incorrect comment
placement. Now empty lambda arguments get an empty range between the
`lambda` and the `:` tokens.
The old if layout couldn't differentiate between an else block with a
single if statement and an elif statement. Additionally we getting rid
of the recursion in favor of a single if struct with a vec of elif/else
children. This is accompanied by a big refactoring in ruff which removes
a bunch of TODOs and false negatives.
Lex Jupyter line magic with `Mode::Jupyter`
This PR adds a new token `MagicCommand`[^1] which the lexer will
recognize when in `Mode::Jupyter`. The rules for the lexer is as
follows:
1. Given that we are at the start of line, skip the indentation and look
for [characters that represent the start of a magic
command](635815e8f1/IPython/core/inputtransformer2.py (L335-L346)),
determine the magic kind and capture all the characters following it as
the command string.
2. If the command extends multiple lines, the lexer will skip the line
continuation character (`\`) but only if it's followed by a newline
(`\n` or `\r`). The reason to skip this only in case of newline is
because they can occur in the command string which we should not skip:
```rust
// Skip this backslash
// v
// !pwd \
// && ls -a | sed 's/^/\\ /'
// ^^
// Don't skip these backslashes
```
3. The parser, when in `Mode::Jupyter`, will filter these tokens before
the parsing begins. There is a small caveat when the magic command is
indented. In the following example, when the parser filters out magic
command, it'll throw an indentation error:
```python
for i in range(5):
!ls
# What the parser will see
for i in range(5):
```
[^1]: I would prefer to have some other name as this not only represent
a line magic (`%`) but also shell command (`!`), help command (`?`) and
others. In original implementation, it's named as ["IPython
Syntax"](635815e8f1/IPython/core/inputtransformer2.py (L332))
Extends #95Closes#82
Adds parsing of new `type` soft keyword for defining type aliases.
Supports type alias statements as defined in PEP 695 e.g.
```python
type IntOrStr = int | str
type ListOrSet[T] = list[T] | set[T]
type AnimalOrVegetable = Animal | "Vegetable"
type RecursiveList[T] = T | list[RecursiveList[T]]
```
All type parameter kinds are supported as in #95.
Builds on soft keyword abstractions introduced in https://github.com/RustPython/RustPython/pull/4519
This removes the ASDL code generation in favor of handwriting the AST.
The motivations for moving away from the ASDL are:
* CPython compatibility is no longer a goal
* The ASDL grammar isn't as expressive as we would like
* The codegen scripts have a high complexity which makes extensions time
consuming
* We don't make heavy use of code generation (compared to e.g.
RustPython that generates Pyo3 bindings, a fold implementation etc).
We may want to revisit a grammar based code generation in the future,
e.g. by using [ungrammar](https://github.com/rust-analyzer/ungrammar)
## Summary
This PR adds `TextRange` to `Identifier`. Right now, the AST only
includes ranges for identifiers in certain cases (`Expr::Name`,
`Keyword`, etc.), namely when the identifier comprises an entire AST
node. In Ruff, we do additional ad-hoc lexing to extract identifiers
from source code.
One frequent example: given a function `def f(): ...`, we lex to find
the range of `f`, for use in diagnostics.
Another: `except ValueError as e`, for which the AST doesn't include a
range for `e`.
Note that, as an optimization, we avoid storing the `TextRange` for
`Expr::Name`, since it's already included.