Lex Jupyter line magic with `Mode::Jupyter`
This PR adds a new token `MagicCommand`[^1] which the lexer will
recognize when in `Mode::Jupyter`. The rules for the lexer are as
follows:
1. Given that we are at the start of a line, skip the indentation and look
for [characters that represent the start of a magic
command](635815e8f1/IPython/core/inputtransformer2.py (L335-L346)),
determine the magic kind and capture all the characters following it as
the command string.
2. If the command spans multiple lines, the lexer will skip the line
continuation character (`\`), but only if it's followed by a newline
(`\n` or `\r`). The backslash is skipped only in that case because
backslashes can also occur inside the command string, where they must be kept:
```rust
// Skip this backslash
//      v
// !pwd \
//   && ls -a | sed 's/^/\\ /'
//                       ^^
// Don't skip these backslashes
```
3. The parser, when in `Mode::Jupyter`, will filter these tokens before
parsing begins. There is a small caveat when the magic command is
indented. In the following example, when the parser filters out the magic
command, it will raise an indentation error:
```python
for i in range(5):
    !ls

# What the parser will see
for i in range(5):
```
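The three rules above can be sketched in Python. This is only an illustration under stated assumptions, not the Rust implementation in this PR: `lex_magic_command`, `filter_magic_commands`, and `MAGIC_PREFIXES` are hypothetical names, the prefix list covers only the three kinds mentioned in this description, consuming the newline together with the continuation backslash is assumed, and CPython's `ast.parse` stands in for the parser.

```python
import ast

# Illustrative subset of prefixes (assumption); IPython defines more kinds.
MAGIC_PREFIXES = ("%", "!", "?")


def lex_magic_command(source, start=0):
    """Lex a magic command on the line that begins at offset `start`.

    Returns (kind, command, end_offset), or None if the line is not one.
    """
    pos = start
    # Rule 1: skip indentation, then look for a magic prefix.
    while pos < len(source) and source[pos] in " \t":
        pos += 1
    if pos >= len(source) or source[pos] not in MAGIC_PREFIXES:
        return None
    kind = source[pos]
    pos += 1

    command = []
    while pos < len(source) and source[pos] not in "\r\n":
        ch = source[pos]
        if ch == "\\" and source[pos + 1 : pos + 2] in ("\r", "\n"):
            # Rule 2: a backslash directly before a newline is a line
            # continuation. Skip it and the newline (assumed behavior).
            pos += 3 if source.startswith("\r\n", pos + 1) else 2
            continue
        # Any other backslash belongs to the command string.
        command.append(ch)
        pos += 1
    return kind, "".join(command), pos


def filter_magic_commands(source):
    """Drop magic command lines (a text-level stand-in for token filtering)."""
    kept, pos = [], 0
    while pos < len(source):
        lexed = lex_magic_command(source, pos)
        if lexed is None:
            newline = source.find("\n", pos)
            end = len(source) if newline == -1 else newline + 1
            kept.append(source[pos:end])
        else:
            end = lexed[2]
            # Also drop the newline that terminates the command.
            if source.startswith("\r\n", end):
                end += 2
            elif source[end : end + 1] in ("\r", "\n"):
                end += 1
        pos = end
    return "".join(kept)


shell = "!pwd \\\n&& ls -a | sed 's/^/\\\\ /'\n"
print(lex_magic_command(shell)[:2])

# Rule 3 caveat: removing the indented command leaves an empty loop body.
cell = "for i in range(5):\n    !ls\n"
try:
    ast.parse(filter_magic_commands(cell))
except SyntaxError as error:  # IndentationError is a subclass of SyntaxError
    print(type(error).__name__)
```

Working on offsets rather than on pre-split lines keeps a continued command together as one unit, which is why the filter asks the lexer where the command ends instead of dropping a single line.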
[^1]: I would prefer a different name, as this token represents not only
a line magic (`%`) but also a shell command (`!`), a help command (`?`), and
others. In the original implementation, it's named ["IPython
Syntax"](635815e8f1/IPython/core/inputtransformer2.py (L332))
Extends #95
Closes #82
Adds parsing of the new `type` soft keyword for defining type aliases.
Supports type alias statements as defined in PEP 695, e.g.:
```python
type IntOrStr = int | str
type ListOrSet[T] = list[T] | set[T]
type AnimalOrVegetable = Animal | "Vegetable"
type RecursiveList[T] = T | list[RecursiveList[T]]
```
All type parameter kinds are supported as in #95.
Builds on soft keyword abstractions introduced in https://github.com/RustPython/RustPython/pull/4519
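
Because `type` is a soft keyword, it has to remain usable as an ordinary identifier (`type(x)`, `type = 1`). One way to picture the disambiguation is a lookahead over the first tokens of a statement. The sketch below uses Python's standard `tokenize` module. `is_type_alias_statement` is a hypothetical helper, and the lookahead rule (a name followed by `=` or `[`) is an assumption for illustration, not necessarily the exact rule this PR implements.

```python
import io
import tokenize


def is_type_alias_statement(line):
    """Guess whether `line` starts a `type` alias statement.

    Assumed rule: `type` is treated as a keyword only when it is the first
    token and is followed by a name and then `=` or `[`.
    """
    tokens = list(tokenize.generate_tokens(io.StringIO(line).readline))
    if len(tokens) < 3:
        return False
    first, second, third = tokens[:3]
    return (
        first.type == tokenize.NAME
        and first.string == "type"
        and second.type == tokenize.NAME
        and third.type == tokenize.OP
        and third.string in ("=", "[")
    )


examples = [
    "type IntOrStr = int | str",
    "type ListOrSet[T] = list[T] | set[T]",
    "type = 1",
    "print(type(3))",
]
for line in examples:
    print(f"{line!r}: {is_type_alias_statement(line)}")
```

The check only inspects tokens, so it runs on Python versions that predate PEP 695: the tokenizer still sees `type` as a plain name there.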
This removes the ASDL code generation in favor of handwriting the AST.
The motivations for moving away from the ASDL are:
* CPython compatibility is no longer a goal
* The ASDL grammar isn't as expressive as we would like
* The codegen scripts are highly complex, which makes extensions
time-consuming
* We don't make heavy use of code generation (compared to, e.g.,
RustPython, which generates PyO3 bindings, a fold implementation, etc.).
We may want to revisit grammar-based code generation in the future,
e.g. by using [ungrammar](https://github.com/rust-analyzer/ungrammar).
## Summary
This PR adds `TextRange` to `Identifier`. Right now, the AST only
includes ranges for identifiers in certain cases (`Expr::Name`,
`Keyword`, etc.), namely when the identifier comprises an entire AST
node. In Ruff, we do additional ad-hoc lexing to extract identifiers
from source code.
One frequent example: given a function `def f(): ...`, we lex to find
the range of `f`, for use in diagnostics.
Another: `except ValueError as e`, for which the AST doesn't include a
range for `e`.
Note that, as an optimization, we avoid storing the `TextRange` for
`Expr::Name`, since it's already included.
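
To illustrate the shape of the change, here is a small Python model. The real types are Rust structs, so `TextRange`, `Identifier`, and `diagnostic` below are simplified stand-ins, and the offsets are written by hand for the two examples above. With the range stored on the identifier, a diagnostic can point at the name directly instead of re-lexing the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextRange:
    start: int  # inclusive offset into the source
    end: int    # exclusive offset into the source


@dataclass(frozen=True)
class Identifier:
    id: str
    range: TextRange


def diagnostic(source, identifier, message):
    """Format a message that points at the identifier's own range."""
    r = identifier.range
    # The stored range must cover exactly the identifier text.
    assert source[r.start:r.end] == identifier.id
    return f"{r.start}..{r.end}: {message}: `{identifier.id}`"


# `def f(): ...`: the name `f` is not an AST node of its own.
function_source = "def f(): ..."
function_name = Identifier("f", TextRange(4, 5))
print(diagnostic(function_source, function_name, "unused function"))

# `except ValueError as e`: the bound name `e` has no node either.
handler_source = "except ValueError as e:"
bound_name = Identifier("e", TextRange(21, 22))
print(diagnostic(handler_source, bound_name, "unused exception variable"))
```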
* Move `range` from `Attributed` to `Node`s
* No Attributed + custom for Range PoC
* Generate all located variants, generate enum implementations
* Implement `Copy` on simple enums
* Move `Suite` to `ranged` and `located`
* Update tests
---------
Co-authored-by: Jeong YunWon <jeong@youknowone.org>