Commit Graph

16 Commits

Author SHA1 Message Date
Thomas de Zeeuw e3c12764f8
Only use a single cache file per Python package (#5117)
## Summary

This changes the caching design from one cache file per source file, to
one cache file per package. This greatly reduces the amount of cache
files that are opened and written, while maintaining roughly the same
(combined) size as bincode is very compact.

Below are some very much not scientific performance tests. It uses
projects/sources to check:

* small.py: single, 31 bytes Python file with 2 errors.
* test.py: single, 43k Python file with 8 errors.
* fastapi: FastAPI repo, 1134 files checked, 0 errors.

Source   | Before # files | After # files | Before size | After size
-------|-------|-------|-------|-------
small.py | 1              | 1             | 20 K        | 20 K
test.py  | 1              | 1             | 60 K        | 60 K
fastapi  | 1134           | 518           | 4.5 M       | 2.3 M

One question that might come up is why fastapi still has 518 cache files
and not 1? That is because this is using the existing package
resolution, which sees examples, docs, etc. as separate from the "main"
source code (in the fastapi directory in the repo). In this future it
might be worth consider switching to a one cache file per repo strategy.

This new design is not perfect and does have a number of known issues.
First, like the old design it doesn't remove the cache for a source file
that has been (re)moved until `ruff clean` is called.

Second, this currently uses a large mutex around the mutation of the
package cache (e.g. inserting result). This could be (or become) a
bottleneck. It's future work to test and improve this (if needed).

Third, currently the packages and opened and stored in a sequential
loop, this could be done parallel. This is also future work.


## Test Plan

Run `ruff check` (with caching enabled) twice on any Python source code
and it should produce the same results.
2023-06-19 17:46:13 +02:00
Charlie Marsh 9d0ffd33ca
Move universal newline handling into its own crate (#4729) 2023-05-31 12:00:47 -04:00
Micha Reiser 6c1ff6a85f
Upgrade RustPython (#4747) 2023-05-31 08:26:35 +00:00
Micha Reiser 0cd453bdf0
Generic "comment to node" association logic (#4642) 2023-05-30 09:28:01 +00:00
Micha Reiser 85f094f592
Improve `Message` sorting performance (#4624) 2023-05-24 16:34:48 +02:00
Charlie Marsh 19c4b7bee6
Rename ruff_python_semantic's `Context` struct to `SemanticModel` (#4565) 2023-05-22 02:35:03 +00:00
Charlie Marsh 73efbeb581
Invert quote-style when generating code within f-strings (#4487) 2023-05-18 14:33:33 +00:00
Jeong, YunWon badade3ccc
Impl `Default` for `SourceLocation` (#4328)
Co-authored-by: Micha Reiser <micha@reiser.io>
2023-05-16 07:03:43 +00:00
Micha Reiser e04ef42334
Use `memchr` to speedup newline search on x86 (#3985) 2023-04-26 20:15:47 +01:00
Micha Reiser cab65b25da
Replace row/column based `Location` with byte-offsets. (#3931) 2023-04-26 18:11:02 +00:00
Micha Reiser c33c9dc585
Introduce SourceFile to avoid cloning the message filename (#3904) 2023-04-11 08:28:55 +00:00
Micha Reiser 056c212975
Render code frame with context (#3901) 2023-04-11 10:22:11 +02:00
Micha Reiser 381203c084
Store source code on message (#3897) 2023-04-11 07:57:36 +00:00
Micha Reiser 76c47a9a43
Cheap cloneable LineIndex (#3896) 2023-04-11 07:33:40 +00:00
Micha Reiser f68c26a506
perf(pycodestyle): Initialize Stylist from tokens (#3757) 2023-03-28 11:53:35 +02:00
Charlie Marsh bad6bdda1f
Create a `rust_python_ast` crate (#3370)
This PR productionizes @MichaReiser's suggestion in https://github.com/charliermarsh/ruff/issues/1820#issuecomment-1440204423, by creating a separate crate for the `ast` module (`rust_python_ast`). This will enable us to further split up the `ruff` crate, as we'll be able to create (e.g.) separate sub-linter crates that have access to these common AST utilities.

This was mostly a straightforward copy (with adjustments to module imports), as the few dependencies that _did_ require modifications were handled in #3366, #3367, and #3368.
2023-03-07 15:18:40 +00:00