Commit Graph

11 Commits

Author SHA1 Message Date
Amethyst Reese 8fb29eafb8
[ruff] improve handling of intermixed comments inside from-imports (#20561)
Resolves a crash when attempting to format code like:

```
from x import (a as # whatever
b)
```

Reworks the way comments are associated with nodes when parsing modules,
so that all possible comment positions can be retained and reproduced during
formatting.

Overall follows Black's formatting style for multi-line import statements.

Fixes issue #19138
2025-10-07 08:14:09 -07:00
Ibraheem Ahmed c9dff5c7d5
[ty] AST garbage collection (#18482)
## Summary

Garbage collect ASTs once we are done checking a given file. Queries
with a cross-file dependency on the AST will reparse the file on demand.
This reduces ty's peak memory usage by ~20-30%.

The primary change of this PR is adding a `node_index` field to every
AST node, that is assigned by the parser. `ParsedModule` can use this to
create a flat index of AST nodes any time the file is parsed (or
reparsed). This allows `AstNodeRef` to simply index into the current
instance of the `ParsedModule`, instead of storing a pointer directly.

The indices are somewhat hackily (using an atomic integer) assigned by
the `parsed_module` query instead of by the parser directly. Assigning
the indices in source-order in the (recursive) parser turns out to be
difficult, and collecting the nodes during semantic indexing is
impossible as `SemanticIndex` does not hold onto a specific
`ParsedModuleRef`, which the pointers in the flat AST are tied to. This
means that we have to do an extra AST traversal to assign and collect
the nodes into a flat index, but the small performance impact (~3% on
cold runs) seems worth it for the memory savings.

Part of https://github.com/astral-sh/ty/issues/214.
2025-06-13 08:40:11 -04:00
Charlie Marsh c71ff7eae1
Avoid printing continuations within import identifiers (#7744)
## Summary

It turns out that _some_ identifiers can contain newlines --
specifically, dot-delimited import identifiers, like:
```python
import foo\
    .bar
```

At present, we print all identifiers verbatim, which causes us to retain
the `\` in the formatted output. This also leads to violating some debug
assertions (see the linked issue, though that's a symptom of this
formatting failure).

This PR adds detection for import identifiers that contain newlines, and
formats them via `text` (slow) rather than `source_code_slice` (fast) in
those cases.

Closes https://github.com/astral-sh/ruff/issues/7734.

## Test Plan

`cargo test`
2023-10-02 09:51:07 -04:00
Micha Reiser c05e4628b1
Introduce Token element (#7048) 2023-09-02 10:05:47 +02:00
Charlie Marsh edb9b0c62a
Use the formatter prelude in more files (#6882)
Removes a bunch of imports that are made redundant by the prelude.
2023-08-25 16:51:07 -04:00
Micha Reiser 40f54375cb
Pull in RustPython parser (#6099) 2023-07-27 09:29:11 +00:00
Micha Reiser 2cf00fee96
Remove parser dependency from ruff-python-ast (#6096) 2023-07-26 17:47:22 +02:00
konsti 787e2fd49d
Format import statements (#5493)
## Summary

Format import statements in all their variants. Specifically, this
implemented formatting `StmtImport`, `StmtImportFrom` and `Alias`.

## Test Plan

I added some custom snapshots, even though this has been covered well by
black's tests.
2023-07-04 07:07:20 +00:00
Micha Reiser bcf745c5ba
Replace verbatim text with `NOT_YET_IMPLEMENTED` (#4904)
<!--
Thank you for contributing to Ruff! To help us out with reviewing, please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

This PR replaces the `verbatim_text` builder with a `not_yet_implemented` builder that emits `NOT_YET_IMPLEMENTED_<NodeKind>` for not yet implemented nodes. 

The motivation for this change is that partially formatting compound statements can result in incorrectly indented code, which is a syntax error:

```python
def func_no_args():
  a; b; c
  if True: raise RuntimeError
  if False: ...
  for i in range(10):
    print(i)
    continue
```

Get's reformatted to

```python
def func_no_args():
    a; b; c
    if True: raise RuntimeError
    if False: ...
    for i in range(10):
    print(i)
    continue
```

because our formatter does not yet support `for` statements and just inserts the text from the source. 

## Downsides

Using an identifier will not work in all situations. For example, an identifier is invalid in an `Arguments ` position. That's why I kept `verbatim_text` around and e.g. use it in the `Arguments` formatting logic where incorrect indentations are impossible (to my knowledge). Meaning, `verbatim_text` we can opt in to `verbatim_text` when we want to iterate quickly on nodes that we don't want to provide a full implementation yet and using an identifier would be invalid. 

## Upsides

Running this on main discovered stability issues with the newline handling that were previously "hidden" because of the verbatim formatting. I guess that's an upside :)

## Test Plan

None?
2023-06-07 14:57:25 +02:00
konstin 9bf168c0a4
Use dummy verbatim formatter for all nodes (#4755) 2023-06-01 08:25:26 +00:00
konstin 0945803427
Generate FormatRule definitions (#4724)
* Generate FormatRule definitions

* Generate verbatim output

* pub(crate) everything

* clippy fix

* Update crates/ruff_python_formatter/src/lib.rs

Co-authored-by: Micha Reiser <micha@reiser.io>

* Update crates/ruff_python_formatter/src/lib.rs

Co-authored-by: Micha Reiser <micha@reiser.io>

* stub out with Ok(()) again

* Update crates/ruff_python_formatter/src/lib.rs

Co-authored-by: Micha Reiser <micha@reiser.io>

* PyFormatContext::{contents, locator} with `#[allow(unused)]`

* Can't leak private type

* remove commented code

* Fix ruff errors

* pub struct Format{node} due to rust rules

---------

Co-authored-by: Julian LaNeve <lanevejulian@gmail.com>
Co-authored-by: Micha Reiser <micha@reiser.io>
2023-06-01 08:38:53 +02:00