Lex Jupyter line magic with `Mode::Jupyter`
This PR adds a new token `MagicCommand`[^1] which the lexer will
recognize when in `Mode::Jupyter`. The rules for the lexer are as
follows:
1. Given that we are at the start of line, skip the indentation and look
for [characters that represent the start of a magic
command](635815e8f1/IPython/core/inputtransformer2.py (L335-L346)),
determine the magic kind and capture all the characters following it as
the command string.
2. If the command extends over multiple lines, the lexer will skip the line
continuation character (`\`), but only if it's followed by a newline
(`\n` or `\r`). Any other backslash is kept, because backslashes can
also occur inside the command string itself:
```rust
// Skip this backslash
// v
// !pwd \
// && ls -a | sed 's/^/\\ /'
// ^^
// Don't skip these backslashes
```
3. The parser, when in `Mode::Jupyter`, will filter these tokens out before
parsing begins. There is a small caveat when the magic command is
indented. In the following example, once the parser filters out the magic
command, the loop body is empty and it'll throw an indentation error:
```python
for i in range(5):
    !ls

# What the parser will see
for i in range(5):
```
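The line continuation rule from step 2 can be sketched as below. This is a minimal illustration and not the lexer code from this PR: `capture_command` is a hypothetical helper that receives the text after the magic prefix and returns the command string, and treating `\r\n` as a single newline is an assumption.

```rust
/// Hypothetical helper (not the lexer from this PR): collects the command
/// string that follows a magic prefix such as `!` or `%`.
fn capture_command(source: &str) -> String {
    let mut value = String::new();
    let mut chars = source.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            '\\' => match chars.peek().copied() {
                // Line continuation: drop the backslash and the newline after it.
                Some('\n') => {
                    chars.next();
                }
                Some('\r') => {
                    chars.next();
                    // Assumption: treat `\r\n` as a single newline.
                    if chars.peek().copied() == Some('\n') {
                        chars.next();
                    }
                }
                // Any other backslash belongs to the command string itself.
                _ => value.push('\\'),
            },
            // An unescaped newline ends the magic command.
            '\n' | '\r' => break,
            _ => value.push(c),
        }
    }
    value
}
```

Only the backslash directly before a newline is dropped, so the escaped backslashes in the `sed` expression from the example above survive unchanged.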
[^1]: I would prefer to have some other name, as this represents not only
a line magic (`%`) but also a shell command (`!`), a help command (`?`) and
others. In the original implementation, it's named ["IPython
Syntax"](635815e8f1/IPython/core/inputtransformer2.py (L332))
Extends #95. Closes #82.
Adds parsing of the new `type` soft keyword for defining type aliases.
Supports type alias statements as defined in PEP 695, e.g.:
```python
type IntOrStr = int | str
type ListOrSet[T] = list[T] | set[T]
type AnimalOrVegetable = Animal | "Vegetable"
type RecursiveList[T] = T | list[RecursiveList[T]]
```
All type parameter kinds are supported as in #95.
Builds on soft keyword abstractions introduced in https://github.com/RustPython/RustPython/pull/4519
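Because `type` is a soft keyword, it must still be usable as an ordinary name (`type(x)`, `type = 1`). One way to disambiguate is sketched below. The heuristic, the `Tok` enum, and `is_type_alias_start` are all assumptions for illustration and are not taken from this PR: `type` is treated as a keyword only when it starts a logical line and is followed by a name and then `=` or `[`.

```rust
/// Simplified stand-in for the lexer's token type (hypothetical).
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Type,
    Name(String),
    Equal,
    Lsqb,
    Lpar,
    Other,
}

/// Assumed heuristic: `line` holds the tokens of one logical line. `type`
/// starts a type alias statement only when it is the first token and is
/// followed by a name and then `=` or `[`.
fn is_type_alias_start(line: &[Tok]) -> bool {
    matches!(
        line,
        [Tok::Type, Tok::Name(_), Tok::Equal | Tok::Lsqb, ..]
    )
}
```

Under this assumption, `type IntOrStr = ...` and `type ListOrSet[T] = ...` are recognized as alias statements, while `type(x)` and `type = 1` keep `type` as a plain identifier.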
This removes the ASDL code generation in favor of handwriting the AST.
The motivations for moving away from the ASDL are:
* CPython compatibility is no longer a goal
* The ASDL grammar isn't as expressive as we would like
* The codegen scripts are complex, which makes extensions
time-consuming
* We don't make heavy use of code generation (compared to e.g.
RustPython, which generates PyO3 bindings, a fold implementation, etc.).
We may want to revisit grammar-based code generation in the future,
e.g. by using [ungrammar](https://github.com/rust-analyzer/ungrammar).
This adds the missing implementation of `Ranged` for `TextRange` itself:
```rust
impl Ranged for TextRange {
    fn range(&self) -> TextRange {
        *self
    }
}
```
This allows e.g. using `has_comments` with arbitrary ranges instead of
just a node.
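To illustrate why this impl is useful, here is a self-contained sketch with simplified stand-in types. `TextRange`, `Node`, and this version of `has_comments` are assumptions made for the example and do not mirror the real signatures; the point is only that a function generic over `T: Ranged` accepts a bare range once the impl exists.

```rust
/// Simplified stand-in for the real `TextRange` (assumption for illustration).
#[derive(Debug, Clone, Copy, PartialEq)]
struct TextRange {
    start: u32,
    end: u32,
}

trait Ranged {
    fn range(&self) -> TextRange;
}

// Same shape as the impl above: a range is trivially its own range.
impl Ranged for TextRange {
    fn range(&self) -> TextRange {
        *self
    }
}

/// Hypothetical AST node that stores its range.
struct Node {
    range: TextRange,
}

impl Ranged for Node {
    fn range(&self) -> TextRange {
        self.range
    }
}

/// Hypothetical stand-in for `has_comments`: reports whether any comment
/// offset falls inside the item's range (start inclusive, end exclusive).
fn has_comments<T: Ranged>(item: &T, comment_offsets: &[u32]) -> bool {
    let range = item.range();
    comment_offsets
        .iter()
        .any(|&offset| range.start <= offset && offset < range.end)
}
```

With this in place, `has_comments(&node, ...)` and `has_comments(&TextRange { .. }, ...)` both type-check against the same generic bound.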
It also adds `.venv` to the `.gitignore`.
In the example below, `arg` is `&Expr`, i.e. a reference to a `Ranged`
type, but `entries()` wants a `T: Ranged`. This adds the missing bridge impl.
```rust
let all_args = format_with(|f| {
    f.join_comma_separated()
        .entries(
            // We have the parentheses from the call so the arguments never need any
            args.iter()
                .map(|arg| (arg, arg.format().with_options(Parenthesize::Never))),
        )
        .nodes(keywords.iter())
        .finish()
});
```
```
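The bridge impl is presumably a blanket impl of `Ranged` for references. A minimal sketch under that assumption follows; the types are simplified stand-ins and `collect_ranges` is a hypothetical substitute for `entries()`, which takes items by value and requires `T: Ranged`.

```rust
/// Simplified stand-in for the real `TextRange` (assumption for illustration).
#[derive(Debug, Clone, Copy, PartialEq)]
struct TextRange {
    start: u32,
    end: u32,
}

trait Ranged {
    fn range(&self) -> TextRange;
}

/// Hypothetical, simplified expression node.
struct Expr {
    range: TextRange,
}

impl Ranged for Expr {
    fn range(&self) -> TextRange {
        self.range
    }
}

// Assumed shape of the bridge impl: a reference to a ranged value is ranged.
impl<T: Ranged> Ranged for &T {
    fn range(&self) -> TextRange {
        // `self` is `&&T` here; deref coercion forwards the call to `T`.
        T::range(self)
    }
}

/// Hypothetical stand-in for `entries()`: consumes items that are `Ranged`.
fn collect_ranges<T: Ranged>(items: impl Iterator<Item = T>) -> Vec<TextRange> {
    items.map(|item| item.range()).collect()
}
```

Without the blanket impl, `collect_ranges(args.iter())` would fail to compile, because `args.iter()` yields `&Expr` and only `Expr` implements `Ranged`.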
## Summary
This PR adds `TextRange` to `Identifier`. Right now, the AST only
includes ranges for identifiers in certain cases (`Expr::Name`,
`Keyword`, etc.), namely when the identifier comprises an entire AST
node. In Ruff, we do additional ad-hoc lexing to extract identifiers
from source code.
One frequent example: given a function `def f(): ...`, we lex to find
the range of `f`, for use in diagnostics.
Another: `except ValueError as e`, for which the AST doesn't include a
range for `e`.
Note that, as an optimization, we avoid storing the `TextRange` for
`Expr::Name`, since it's already included.
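As a rough sketch of the idea (simplified stand-in types, not the actual AST definitions from this PR), an identifier that carries its own range lets a diagnostic point at the name directly instead of re-lexing the source. `FunctionDef` and `diagnostic_snippet` are hypothetical, and byte offsets are stored as `usize` here for brevity.

```rust
/// Simplified stand-in for the real `TextRange` (byte offsets, assumption).
#[derive(Debug, Clone, Copy, PartialEq)]
struct TextRange {
    start: usize,
    end: usize,
}

/// Sketch of an identifier that carries its own source range.
#[derive(Debug, Clone, PartialEq)]
struct Identifier {
    id: String,
    range: TextRange,
}

/// Hypothetical, simplified function definition node.
struct FunctionDef {
    name: Identifier,
    range: TextRange,
}

/// Hypothetical diagnostic helper: returns the text to underline by slicing
/// the source with the identifier's range, with no additional lexing.
fn diagnostic_snippet<'a>(source: &'a str, def: &FunctionDef) -> &'a str {
    &source[def.name.range.start..def.name.range.end]
}
```

For `def f(): ...`, the node spans the whole statement while `name.range` covers only `f`, which is the span a diagnostic would want to highlight.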