Commit Graph

47 Commits

Author SHA1 Message Date
Charlie Marsh 4782675bf9
Remove lexer-based comment range detection (#5785)
## Summary

I'm doing some unrelated profiling, and I noticed that this method is
actually measurable on the CPython benchmark -- it's > 1% of execution
time. We don't need to lex here, we already know the ranges of all
comments, so we can just do a simple binary search for overlap, which
brings the method down to 0%.

## Test Plan

`cargo test`
2023-07-16 01:03:27 +00:00
Charlie Marsh 4dee49d6fa
Run nightly Clippy over the Ruff repo (#5670)
## Summary

This is the result of running `cargo +nightly clippy --workspace
--all-targets --all-features -- -D warnings` and fixing all violations.
Just wanted to see if there were any interesting new checks on nightly
👀
2023-07-10 23:44:38 -04:00
Micha Reiser 6ba9d5d5a4
Upgrade RustPython (#5334) 2023-06-23 20:39:47 +00:00
Charlie Marsh 6331598511
Upgrade `RustPython` to access ranged names (#5194)
## Summary

In https://github.com/astral-sh/RustPython-Parser/pull/8, we modified
RustPython to include ranges for any identifiers that aren't
`Expr::Name` (which already has an identifier).

For example, the `e` in `except ValueError as e` was previously
un-ranged. To extract its range, we had to do some lexing of our own.
This change should improve performance and let us remove a bunch of
code.

## Test Plan

`cargo test`
2023-06-20 15:43:38 +00:00
Charlie Marsh 36e01ad6eb
Upgrade RustPython (#5192)
## Summary

This PR upgrade RustPython to pull in the changes to `Arguments` (zip
defaults with their identifiers) and all the renames to `CmpOp` and
friends.
2023-06-19 21:09:53 +00:00
Thomas de Zeeuw e3c12764f8
Only use a single cache file per Python package (#5117)
## Summary

This changes the caching design from one cache file per source file, to
one cache file per package. This greatly reduces the amount of cache
files that are opened and written, while maintaining roughly the same
(combined) size as bincode is very compact.

Below are some very much not scientific performance tests. It uses
projects/sources to check:

* small.py: single, 31 bytes Python file with 2 errors.
* test.py: single, 43k Python file with 8 errors.
* fastapi: FastAPI repo, 1134 files checked, 0 errors.

Source   | Before # files | After # files | Before size | After size
-------|-------|-------|-------|-------
small.py | 1              | 1             | 20 K        | 20 K
test.py  | 1              | 1             | 60 K        | 60 K
fastapi  | 1134           | 518           | 4.5 M       | 2.3 M

One question that might come up is why fastapi still has 518 cache files
and not 1? That is because this is using the existing package
resolution, which sees examples, docs, etc. as separate from the "main"
source code (in the fastapi directory in the repo). In this future it
might be worth consider switching to a one cache file per repo strategy.

This new design is not perfect and does have a number of known issues.
First, like the old design it doesn't remove the cache for a source file
that has been (re)moved until `ruff clean` is called.

Second, this currently uses a large mutex around the mutation of the
package cache (e.g. inserting result). This could be (or become) a
bottleneck. It's future work to test and improve this (if needed).

Third, currently the packages and opened and stored in a sequential
loop, this could be done parallel. This is also future work.


## Test Plan

Run `ruff check` (with caching enabled) twice on any Python source code
and it should produce the same results.
2023-06-19 17:46:13 +02:00
Charlie Marsh 2b82caa163
Detect continuations at start-of-file (#5173)
## Summary

Given:

```python
\
import os
```

Deleting `import os` leaves a syntax error: a file can't end in a
continuation. We have code to handle this case, but it failed to pick up
continuations at the _very start_ of a file.

Closes #5156.
2023-06-19 00:09:02 -04:00
Charlie Marsh 716cab2f19
Run `rustfmt` on nightly to clean up erroneous comments (#5106)
## Summary

This PR runs `rustfmt` with a few nightly options as a one-time fix to
catch some malformatted comments. I ended up just running with:

```toml
condense_wildcard_suffixes = true
edition = "2021"
max_width = 100
normalize_comments = true
normalize_doc_attributes = true
reorder_impl_items = true
unstable_features = true
use_field_init_shorthand = true
```

Since these all seem like reasonable things to fix, so may as well while
I'm here.
2023-06-15 00:19:05 +00:00
Addison Crump 70e6c212d9
Improve ruff_parse_simple to find UTF-8 violations (#5008)
Improves the `ruff_parse_simple` fuzz harness by adding checks for
parsed locations to ensure they all lie on UTF-8 character boundaries.
This will allow for faster identification of issues like #5004.

This also adds additional details for Apple M1 users and clarifies the
importance of using `init-fuzzer.sh` (thanks for the feedback,
@jasikpark 🙂).
2023-06-12 12:10:23 -04:00
Charlie Marsh 445e1723ab
Use `Stmt::parse` in lieu of `Suite` unwraps (#5002) 2023-06-10 04:55:31 +00:00
Charlie Marsh 1d756dc3a7
Move Python whitespace utilities into new `ruff_python_whitespace` crate (#4993)
## Summary

`ruff_newlines` becomes `ruff_python_whitespace`, and includes the
existing "universal newline" handlers alongside the Python
whitespace-specific utilities.
2023-06-10 00:59:57 +00:00
Micha Reiser 39a1f3980f
Upgrade RustPython (#4900) 2023-06-08 05:53:14 +00:00
Charlie Marsh d1b8fe6af2
Fix round-tripping of nested functions (#4875) 2023-06-05 16:13:08 -04:00
Charlie Marsh ab26f2dc9d
Use saturating_sub in more token-walking methods (#4773) 2023-06-01 17:16:32 -04:00
Charlie Marsh 9d0ffd33ca
Move universal newline handling into its own crate (#4729) 2023-05-31 12:00:47 -04:00
Micha Reiser 6c1ff6a85f
Upgrade RustPython (#4747) 2023-05-31 08:26:35 +00:00
Micha Reiser 0cd453bdf0
Generic "comment to node" association logic (#4642) 2023-05-30 09:28:01 +00:00
Micha Reiser 85f094f592
Improve `Message` sorting performance (#4624) 2023-05-24 16:34:48 +02:00
Charlie Marsh 19c4b7bee6
Rename ruff_python_semantic's `Context` struct to `SemanticModel` (#4565) 2023-05-22 02:35:03 +00:00
Charlie Marsh e9c6f16c56
Move unparse utility methods onto Generator (#4497) 2023-05-18 15:00:46 +00:00
Charlie Marsh d3b18345c5
Move triple-quoted string detection into `Indexer` method (#4495) 2023-05-18 14:42:05 +00:00
Charlie Marsh 73efbeb581
Invert quote-style when generating code within f-strings (#4487) 2023-05-18 14:33:33 +00:00
Charlie Marsh e8e66f3824
Remove unnecessary path prefixes (#4492) 2023-05-18 10:19:09 -04:00
Jeong, YunWon 4b05ca1198
Specialize ConversionFlag (#4450) 2023-05-16 18:00:13 +02:00
Charlie Marsh f0465bf106
Emit non-logical newlines for "empty" lines (#4444) 2023-05-16 14:58:56 +00:00
Jeong, YunWon badade3ccc
Impl `Default` for `SourceLocation` (#4328)
Co-authored-by: Micha Reiser <micha@reiser.io>
2023-05-16 07:03:43 +00:00
Micha Reiser fa26860296
Refactor range from `Attributed` to `Node`s (#4422) 2023-05-16 06:36:32 +00:00
Jonathan Plasse c10a4535b9
Disallow `unreachable_pub` (#4314) 2023-05-11 18:00:00 -04:00
Jeong, YunWon be6e00ef6e
Re-integrate RustPython parser repository (#4359)
Co-authored-by: Micha Reiser <micha@reiser.io>
2023-05-11 07:47:17 +00:00
Micha Reiser e04ef42334
Use `memchr` to speedup newline search on x86 (#3985) 2023-04-26 20:15:47 +01:00
Micha Reiser f3e6ddda62
perf(logical-lines): Various small perf improvements (#4022) 2023-04-26 20:10:35 +01:00
Micha Reiser cab65b25da
Replace row/column based `Location` with byte-offsets. (#3931) 2023-04-26 18:11:02 +00:00
Micha Reiser e8aebee3f6
Pretty print `Diagnostic`s in snapshot tests (#3906) 2023-04-11 09:03:00 +00:00
Micha Reiser c33c9dc585
Introduce SourceFile to avoid cloning the message filename (#3904) 2023-04-11 08:28:55 +00:00
Micha Reiser 056c212975
Render code frame with context (#3901) 2023-04-11 10:22:11 +02:00
Micha Reiser 381203c084
Store source code on message (#3897) 2023-04-11 07:57:36 +00:00
Micha Reiser 76c47a9a43
Cheap cloneable LineIndex (#3896) 2023-04-11 07:33:40 +00:00
Evan Rittenhouse abaf0a198d
Ensure that tab characters aren't in multi-line strings before throwing a violation (#3837) 2023-04-06 22:25:40 -04:00
Charlie Marsh d822e08111
Move `CallPath` into its own module (#3847) 2023-04-01 11:25:04 -04:00
Micha Reiser 595cd065f3
Reduce explcit clones (#3793) 2023-03-29 15:15:14 +02:00
Micha Reiser f68c26a506
perf(pycodestyle): Initialize Stylist from tokens (#3757) 2023-03-28 11:53:35 +02:00
Charlie Marsh c2750a59ab
Implement an iterator for universal newlines (#3454)
# Summary

We need to support CR line endings (as opposed to LF and CRLF line endings, which are already supported). They're rare, but they do appear in Python code, and we tend to panic on any file that uses them.

Our `Locator` abstraction now supports CR line endings. However, Rust's `str#lines` implementation does _not_.

This PR adds a `UniversalNewlineIterator` implementation that respects all of CR, LF, and CRLF line endings, and plugs it into most of the `.lines()` call sites.

As an alternative design, it could be nice if we could leverage `Locator` for this. We've already computed all of the line endings, so we could probably iterate much more efficiently?

# Test Plan

Largely relying on automated testing, however, also ran over some known failure cases, like #3404.
2023-03-13 00:01:29 -04:00
Micha Reiser d2988043af
perf: Optimize UTF8/ASCII byte offset index (#3439) 2023-03-11 13:12:10 +01:00
Charlie Marsh 0a9d259f9c
Remove copied `core` modules from `ruff_python_formatter` (#3371) 2023-03-08 19:03:40 +00:00
Charlie Marsh 130e733023
Implement `From<Located>` for `Range` (#3377) 2023-03-08 18:50:20 +00:00
Charlie Marsh ff2c0dd491
Use shared `leading_quote` implementation in ruff_python_formatter (#3396) 2023-03-08 18:21:59 +00:00
Charlie Marsh bad6bdda1f
Create a `rust_python_ast` crate (#3370)
This PR productionizes @MichaReiser's suggestion in https://github.com/charliermarsh/ruff/issues/1820#issuecomment-1440204423, by creating a separate crate for the `ast` module (`rust_python_ast`). This will enable us to further split up the `ruff` crate, as we'll be able to create (e.g.) separate sub-linter crates that have access to these common AST utilities.

This was mostly a straightforward copy (with adjustments to module imports), as the few dependencies that _did_ require modifications were handled in #3366, #3367, and #3368.
2023-03-07 15:18:40 +00:00