Commit Graph

15 Commits

Author SHA1 Message Date
Alex Waygood c80c1712f0
[red-knot] Vendor typeshed's stdlib (#11340)
This PR vendors typeshed!

-  The first commit vendors the stdlib directory from typeshed into a new crates/red_knot/vendored_typeshed directory.
-  The second commit adjusts various linting config files to make sure that the vendored code is excluded from typo checks, formatting checks, etc.
-  The LICENSE and README.md files are also vendored, but all other directories and files (stubs, scripts, tests, test_cases, etc.) are excluded. We should have no need for them (except possibly stubs/, discussed in more depth below).
-  Similar to the way pyright has a commit.txt file in its vendored copy of typeshed, to indicate which typeshed commit the vendored code corresponds to, I've also added a crates/red_knot/vendored_typeshed/source_commit.txt file in the third commit of this PR.

One open question is: should we vendor the stdlib and stubs directories, or just the stdlib directory? The stubs/ directory contains stubs for 162 third-party packages outside the stdlib. Mypy and typeshed_client1 only vendor the stdlib directory; pyright and pyre vendor both the stdlib and stubs directories; pytype vendors the entire typeshed repo (scripts/, tests/ and all).

In this PR, I've chosen to copy mypy and typeshed_client. Unlike vendoring the stdlib, which is unavoidable if we want to do typechecking of the stdlib, it's not strictly necessary to vendor the stubs directory: each subdirectory in stubs is published to PyPI as a standalone stubs distribution that can be (uv)-pip-installed into a virtual environment. It might be useful for our users if we vendored those stubs anyway, but there are costs as well as benefits to doing so (apart from just the sheer amount of vendored code in the ruff repository), so I'd rather consider it separately.
2024-05-09 12:44:53 +01:00
renovate[bot] b7fe2b57de
Update pre-commit dependencies (#11296) 2024-05-06 01:20:31 +00:00
Dhruv Manilawala 13ffb5bc19
Replace LALRPOP parser with hand-written parser (#10036)
(Supersedes #9152, authored by @LaBatata101)

## Summary

This PR replaces the current parser generated from LALRPOP to a
hand-written recursive descent parser.

It also updates the grammar for [PEP
646](https://peps.python.org/pep-0646/) so that the parser outputs the
correct AST. For example, in `data[*x]`, the index expression is now a
tuple with a single starred expression instead of just a starred
expression.

Beyond the performance improvements, the parser is also error resilient
and can provide better error messages. The behavior as seen by any
downstream tools isn't changed. That is, the linter and formatter can
still assume that the parser will _stop_ at the first syntax error. This
will be updated in the following months.

For more details about the change here, refer to the PR corresponding to
the individual commits and the release blog post.

## Test Plan

Write _lots_ and _lots_ of tests for both valid and invalid syntax and
verify the output.

## Acknowledgements

- @MichaReiser for reviewing 100+ parser PRs and continuously providing
guidance throughout the project
- @LaBatata101 for initiating the transition to a hand-written parser in
#9152
- @addisoncrump for implementing the fuzzer which helped
[catch](https://github.com/astral-sh/ruff/pull/10903)
[a](https://github.com/astral-sh/ruff/pull/10910)
[lot](https://github.com/astral-sh/ruff/pull/10966)
[of](https://github.com/astral-sh/ruff/pull/10896)
[bugs](https://github.com/astral-sh/ruff/pull/10877)

---------

Co-authored-by: Victor Hugo Gomes <labatata101@linuxmail.org>
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-04-18 17:57:39 +05:30
renovate[bot] 3fd22973da
Update pre-commit dependencies (#10822) 2024-04-08 21:31:38 +01:00
Zanie Blue 5b3e922050
Upgrade pre-commit dependencies (#8518) 2023-11-06 10:08:22 -06:00
Dimitri Papadopoulos Orfanos efe7c393d1
Fix typos found by codespell (#5607)
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:

- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->

## Summary

Fix typos found by
[codespell](https://github.com/codespell-project/codespell).

I have left out `memoize` for now (see #5606).
<!-- What's the purpose of the change? What does it do, and why? -->

## Test Plan

CI tests.
<!-- How was it tested? -->
2023-07-08 12:33:18 +02:00
Thomas de Zeeuw e7316c1cc6
Consider ignore-names in all pep8 naming rules (#5079)
## Summary

This changes all remaining pep8 naming rules to consider the
`ingore-names` argument.

Closes #5050

## Test Plan

Added new tests.
2023-06-14 16:57:09 +02:00
konstin 775326790e
Fix pre-commit typos action (#4876)
The typos pre-commit action would also edit test fixtures and snapshots. Unfortunately the pre-commit action also doesn't respect _typos.toml and typos action doesn't allow for an exclude key, so i've added a top level exclude key. I have confirmed that this does stop typos from rewriting my fixtures and snapshots
2023-06-06 08:06:48 +02:00
Calum Young f0f4bf2929
Move typos to pre-commit config (#4148) 2023-04-29 12:13:35 -04:00
Tom Kuson 3e81403fbe
Add pygrep-hooks documentation (#4131) 2023-04-27 18:33:07 +00:00
Rob Young a6a7584d79
Implement `flake8-bandit` shell injection rules (#3924) 2023-04-13 14:45:27 -04:00
Charlie Marsh 6f97e2c457
Split list of users into top-level and dedicated section (#2986) 2023-02-17 13:36:32 +00:00
Charlie Marsh ca49b00e55
Add initial formatter implementation (#2883)
# Summary

This PR contains the code for the autoformatter proof-of-concept.

## Crate structure

The primary formatting hook is the `fmt` function in `crates/ruff_python_formatter/src/lib.rs`.

The current formatter approach is outlined in `crates/ruff_python_formatter/src/lib.rs`, and is structured as follows:

- Tokenize the code using the RustPython lexer.
- In `crates/ruff_python_formatter/src/trivia.rs`, extract a variety of trivia tokens from the token stream. These include comments, trailing commas, and empty lines.
- Generate the AST via the RustPython parser.
- In `crates/ruff_python_formatter/src/cst.rs`, convert the AST to a CST structure. As of now, the CST is nearly identical to the AST, except that every node gets a `trivia` vector. But we might want to modify it further.
- In `crates/ruff_python_formatter/src/attachment.rs`, attach each trivia token to the corresponding CST node. The logic for this is mostly in `decorate_trivia` and is ported almost directly from Prettier (given each token, find its preceding, following, and enclosing nodes, then attach the token to the appropriate node in a second pass).
- In `crates/ruff_python_formatter/src/newlines.rs`, normalize newlines to match Black’s preferences. This involves traversing the CST and inserting or removing `TriviaToken` values as we go.
- Call `format!` on the CST, which delegates to type-specific formatter implementations (e.g., `crates/ruff_python_formatter/src/format/stmt.rs` for `Stmt` nodes, and similar for `Expr` nodes; the others are trivial). Those type-specific implementations delegate to kind-specific functions (e.g., `format_func_def`).

## Testing and iteration

The formatter is being developed against the Black test suite, which was copied over in-full to `crates/ruff_python_formatter/resources/test/fixtures/black`.

The Black fixtures had to be modified to create `[insta](https://github.com/mitsuhiko/insta)`-compatible snapshots, which now exist in the repo.

My approach thus far has been to try and improve coverage by tackling fixtures one-by-one.

## What works, and what doesn’t

- *Most* nodes are supported at a basic level (though there are a few stragglers at time of writing, like `StmtKind::Try`).
- Newlines are properly preserved in most cases.
- Magic trailing commas are properly preserved in some (but not all) cases.
- Trivial leading and trailing standalone comments mostly work (although maybe not at the end of a file).
- Inline comments, and comments within expressions, often don’t work -- they work in a few cases, but it’s one-off right now. (We’re probably associating them with the “right” nodes more often than we are actually rendering them in the right place.)
- We don’t properly normalize string quotes. (At present, we just repeat any constants verbatim.)
- We’re mishandling a bunch of wrapping cases (if we treat Black as the reference implementation). Here are a few examples (demonstrating Black's stable behavior):

```py
# In some cases, if the end expression is "self-closing" (functions,
# lists, dictionaries, sets, subscript accesses, and any length-two
# boolean operations that end in these elments), Black
# will wrap like this...
if some_expression and f(
    b,
    c,
    d,
):
    pass

# ...whereas we do this:
if (
    some_expression
    and f(
        b,
        c,
        d,
    )
):
    pass

# If function arguments can fit on a single line, then Black will
# format them like this, rather than exploding them vertically.
if f(
    a, b, c, d, e, f, g, ...
):
    pass
```

- We don’t properly preserve parentheses in all cases. Black preserves parentheses in some but not all cases.
2023-02-15 04:06:35 +00:00
Charlie Marsh 3ef1c2e303
Add `rome_formatter` fork as `ruff_formatter` (#2872)
The Ruff autoformatter is going to be based on an intermediate representation (IR) formatted via [Wadler's algorithm](https://homepages.inf.ed.ac.uk/wadler/papers/prettier/prettier.pdf). This is architecturally similar to [Rome](https://github.com/rome/tools), Prettier, [Skip](https://github.com/skiplang/skip/blob/master/src/tools/printer/printer.sk), and others.

This PR adds a fork of the `rome_formatter` crate from [Rome](https://github.com/rome/tools), renamed here to `ruff_formatter`, which provides generic definitions for a formatter IR as well as a generic IR printer. (We've also pulled in `rome_rowan`, `rome_text_size`, and `rome_text_edit`, though some of these will be removed in future PRs.)

Why fork? `rome_formatter` contains code that's specific to Rome's AST representation (e.g., it relies on a fork of rust-analyzer's `rowan`), and we'll likely want to support different abstractions and formatting capabilities (there are already a few changes coming in future PRs). Once we've dropped `ruff_rowan` and trimmed down `ruff_formatter` to the code we currently need, it's also not a huge surface area to maintain and update.
2023-02-14 19:22:55 -05:00
Charlie Marsh 12ed1837ee
Ignore typos in snapshots (#2609) 2023-02-06 15:43:03 -05:00