Commit Graph

28 Commits

Author SHA1 Message Date
Micha Reiser 92c5f62ec0
[ty] Enable LRU collection for parsed module (#21749) 2025-12-03 12:16:18 +01:00
Ibraheem Ahmed 7abc41727b
[ty] Shrink size of `AstNodeRef` (#20028)
## Summary

Removes the `module_ptr` field from `AstNodeRef` in release mode, and
change `NodeIndex` to a `NonZeroU32` to reduce the size of
`Option<AstNodeRef<_>>` fields.

I believe CI runs in debug mode, so this won't show up in the memory
report, but this reduces memory by ~2% in release mode.
2025-08-22 17:03:22 -04:00
Ibraheem Ahmed 21ac16db85
[ty] Avoid overcounting shared memory usage (#19773)
## Summary

Use a global tracker to avoid double counting `Arc` instances.
2025-08-06 15:32:02 -04:00
Ibraheem Ahmed 8f8c39c435
Simplify `get_size2` usage (#19643)
## Summary

These were added in the 0.5.0 release.
2025-07-30 15:31:37 -04:00
Ibraheem Ahmed 6f7b1c9bb3
[ty] Add environment variable to dump Salsa memory usage stats (#18928)
## Summary

Setting `TY_MEMORY_REPORT=full` will generate and print a memory usage
report to the CLI after a `ty check` run:

```
=======SALSA STRUCTS=======
`Definition`                                       metadata=7.24MB   fields=17.38MB  count=181062
`Expression`                                       metadata=4.45MB   fields=5.94MB   count=92804
`member_lookup_with_policy_::interned_arguments`   metadata=1.97MB   fields=2.25MB   count=35176
...
=======SALSA QUERIES=======
`File -> ty_python_semantic::semantic_index::SemanticIndex`
    metadata=11.46MB  fields=88.86MB  count=1638
`Definition -> ty_python_semantic::types::infer::TypeInference`
    metadata=24.52MB  fields=86.68MB  count=146018
`File -> ruff_db::parsed::ParsedModule`
    metadata=0.12MB   fields=69.06MB  count=1642
...
=======SALSA SUMMARY=======
TOTAL MEMORY USAGE: 577.61MB
    struct metadata = 29.00MB
    struct fields = 35.68MB
    memo metadata = 103.87MB
    memo fields = 409.06MB
```

Eventually, we should integrate these numbers into CI in some form. The
one limitation currently is that heap allocations in salsa structs (e.g.
interned values) are not tracked, but memoized values should have full
coverage. We may also want a peak memory usage counter (that accounts
for non-salsa memory), but that is relatively simple to profile manually
(e.g. `time -v ty check`) and would require a compile-time option to
avoid runtime overhead.
2025-06-26 21:27:51 +00:00
Ibraheem Ahmed c9dff5c7d5
[ty] AST garbage collection (#18482)
## Summary

Garbage collect ASTs once we are done checking a given file. Queries
with a cross-file dependency on the AST will reparse the file on demand.
This reduces ty's peak memory usage by ~20-30%.

The primary change of this PR is adding a `node_index` field to every
AST node, that is assigned by the parser. `ParsedModule` can use this to
create a flat index of AST nodes any time the file is parsed (or
reparsed). This allows `AstNodeRef` to simply index into the current
instance of the `ParsedModule`, instead of storing a pointer directly.

The indices are somewhat hackily (using an atomic integer) assigned by
the `parsed_module` query instead of by the parser directly. Assigning
the indices in source-order in the (recursive) parser turns out to be
difficult, and collecting the nodes during semantic indexing is
impossible as `SemanticIndex` does not hold onto a specific
`ParsedModuleRef`, which the pointers in the flat AST are tied to. This
means that we have to do an extra AST traversal to assign and collect
the nodes into a flat index, but the small performance impact (~3% on
cold runs) seems worth it for the memory savings.

Part of https://github.com/astral-sh/ty/issues/214.
2025-06-13 08:40:11 -04:00
Ibraheem Ahmed 8531f4b3ca
[ty] Add infrastructure for AST garbage collection (#18445)
## Summary

https://github.com/astral-sh/ty/issues/214 will require a couple
invasive changes that I would like to get merged even before garbage
collection is fully implemented (to avoid rebasing):
- `ParsedModule` can no longer be dereferenced directly. Instead you
need to load a `ParsedModuleRef` to access the AST, which requires a
reference to the salsa database (as it may require re-parsing the AST if
it was collected).
- `AstNodeRef` can only be dereferenced with the `node` method, which
takes a reference to the `ParsedModuleRef`. This allows us to encode the
fact that ASTs do not live as long as the database and may be collected
as soon a given instance of a `ParsedModuleRef` is dropped. There are a
number of places where we currently merge the `'db` and `'ast`
lifetimes, so this requires giving some types/functions two separate
lifetime parameters.
2025-06-05 11:43:18 -04:00
Micha Reiser 9ae698fe30
Switch to Rust 2024 edition (#18129) 2025-05-16 13:25:28 +02:00
Micha Reiser 6cd8a49638
[ty] Update salsa (#17964) 2025-05-09 11:54:07 +02:00
Brent Westbrook 9c47b6dbb0
[red-knot] Detect version-related syntax errors (#16379)
## Summary
This PR extends version-related syntax error detection to red-knot. The
main changes here are:

1. Passing `ParseOptions` specifying a `PythonVersion` to parser calls
2. Adding a `python_version` method to the `Db` trait to make this
possible
3. Converting `UnsupportedSyntaxError`s to `Diagnostic`s
4. Updating existing mdtests  to avoid unrelated syntax errors

My initial draft of (1) and (2) in #16090 instead tried passing a
`PythonVersion` down to every parser call, but @MichaReiser suggested
the `Db` approach instead
[here](https://github.com/astral-sh/ruff/pull/16090#discussion_r1969198407),
and I think it turned out much nicer.

All of the new `python_version` methods look like this:

```rust
fn python_version(&self) -> ruff_python_ast::PythonVersion {
    Program::get(self).python_version(self)
}
```

with the exception of the `TestDb` in `ruff_db`, which hard-codes
`PythonVersion::latest()`.

## Test Plan

Existing mdtests, plus a new mdtest to see at least one of the new
diagnostics.
2025-04-17 14:00:30 -04:00
Micha Reiser 3150812ac4
[red-knot] Add 'Format document' to playground (#17217)
## Summary
This is more "because we can" than something we need. 

But since we're already building an "almost IDE" 

## Test Plan



https://github.com/user-attachments/assets/3a4bdad1-ba32-455a-9909-cfeb8caa1b28
2025-04-07 09:26:03 +02:00
Micha Reiser c100d519e9
[internal]: Upgrade salsa (#16794)
## Summary

Another salsa upgrade. 

The main motivation is to stay on a recent salsa version because there
are still a lot of breaking changes happening.
The most significant changes in this update:

* Salsa no longer derives `Debug` by default. It now requires
`interned(debug)` (or similar)
* This version ships the foundation for garbage collecting interned
values. However, this comes at the cost that queries now track which
interned values they created (or read). The micro benchmarks in the
salsa repo showed a significant perf regression. Will see if this also
visible in our benchmarks.

## Test Plan

`cargo test`
2025-03-17 11:05:54 +01:00
Micha Reiser ce0018c3cb
Add `OsSystem` support to mdtests (#16518)
## Summary

This PR introduces a new mdtest option `system` that can either be
`in-memory` or `os`
where `in-memory` is the default.

The motivation for supporting `os` is so that we can write OS/system
specific tests
with mdtests. Specifically, I want to write mdtests for the module
resolver,
testing that module resolution is case sensitive. 

## Test Plan

I tested that the case-sensitive module resolver test start failing when
setting `system = "os"`
2025-03-06 10:41:40 +01:00
Brent Westbrook 37fbe58b13
Document `LinterResult::has_syntax_error` and add `Parsed::has_no_syntax_errors` (#16443)
Summary
--

This is a follow up addressing the comments on #16425. As @dhruvmanila
pointed out, the naming is a bit tricky. I went with `has_no_errors` to
try to differentiate it from `is_valid`. It actually ends up negated in
most uses, so it would be more convenient to have `has_any_errors` or
`has_errors`, but I thought it would sound too much like the opposite of
`is_valid` in that case. I'm definitely open to suggestions here.

Test Plan
--

Existing tests.
2025-03-04 08:35:38 -05:00
Ibraheem Ahmed 69d86d1d69
Transition to salsa coarse-grained tracked structs (#15763)
## Summary

Transition to using coarse-grained tracked structs (depends on
https://github.com/salsa-rs/salsa/pull/657). For now, this PR doesn't
add any `#[tracked]` fields, meaning that any changes cause the entire
struct to be invalidated. It also changes `AstNodeRef` to be
compared/hashed by pointer address, instead of performing a deep AST
comparison.

## Test Plan

This yields a 10-15% improvement on my machine (though weirdly some runs
were 5-10% without being flagged as inconsistent by criterion, is there
some non-determinism involved?). It's possible that some of this is
unrelated, I'll try applying the patch to the current salsa version to
make sure.

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
2025-02-11 11:38:50 +01:00
Micha Reiser 653c09001a
Use an empty vendored file system in Ruff (#13436)
## Summary

This PR changes removes the typeshed stubs from the vendored file system
shipped with ruff
and instead ships an empty "typeshed".

Making the typeshed files optional required extracting the typshed files
into a new `ruff_vendored` crate. I do like this even if all our builds
always include typeshed because it means `red_knot_python_semantic`
contains less code that needs compiling.

This also allows us to use deflate because the compression algorithm
doesn't matter for an archive containing a single, empty file.

## Test Plan

`cargo test`

I verified with ` cargo tree -f "{p} {f}" -p <package> ` that:

* red_knot_wasm: enables `deflate` compression
* red_knot: enables `zstd` compression
* `ruff`: uses stored


I'm not quiet sure how to build the binary that maturin builds but
comparing the release artifact size with `strip = true` shows a `1.5MB`
size reduction

---------

Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2024-09-21 16:31:42 +00:00
Dhruv Manilawala 551ed2706b
[red-knot] Simplify virtual file support (#13043)
## Summary

This PR simplifies the virtual file support in the red knot core,
specifically:

* Update `File::add_virtual_file` method to `File::virtual_file` which
will always create a new virtual file and override the existing entry in
the lookup table
* Add `VirtualFile` which is a wrapper around `File` and provides
methods to increment the file revision / close the virtual file
* Add a new `File::try_virtual_file` to lookup the `VirtualFile` from
`Files`
* Add `File::sync_virtual_path` which takes in the `SystemVirtualPath`,
looks up the `VirtualFile` for it and calls the `sync` method to
increment the file revision
* Removes the `virtual_path_metadata` method on `System` trait

## Test Plan

- [x] Make sure the existing red knot tests pass
- [x] Updated code works well with the LSP
2024-08-23 07:04:15 +00:00
Micha Reiser dc6aafecc2
Setup tracing and document tracing usage (#12730) 2024-08-08 06:28:40 +00:00
Micha Reiser e18b4e42d3
[red-knot] Upgrade to the *new* *new* salsa (#12406) 2024-07-29 07:21:24 +00:00
Dhruv Manilawala 6f4db8675b
[red-knot] Add support for untitled files (#12492)
## Summary

This PR adds support for untitled files in the Red Knot project.

Refer to the [design
discussion](https://github.com/astral-sh/ruff/discussions/12336) for
more details.

### Changes
* The `parsed_module` always assumes that the `SystemVirtual` path is of
`PySourceType::Python`.
* For the module resolver, as suggested, I went ahead by adding a new
`SystemOrVendoredPath` enum and renamed `FilePathRef` to
`SystemOrVendoredPathRef` (happy to consider better names here).
* The `file_to_module` query would return if it's a
`FilePath::SystemVirtual` variant because a virtual file doesn't belong
to any module.
* The sync implementation for the system virtual path is basically the
same as that of system path except that it uses the
`virtual_path_metadata`. The reason for this is that the system
(language server) would provide the metadata on whether it still exists
or not and if it exists, the corresponding metadata.

For point (1), VS Code would use `Untitled-1` for Python files and
`Untitled-1.ipynb` for Jupyter Notebooks. We could use this distinction
to determine whether the source type is `Python` or `Ipynb`.

## Test Plan

Added test cases in #12526
2024-07-26 18:13:31 +05:30
Micha Reiser ac04380f36
[red-knot] Rename `FileSystem` to `System` (#12214) 2024-07-09 07:20:51 +00:00
Micha Reiser 37f260b5af
Introduce `HasTy` trait and `SemanticModel` facade (#11963) 2024-07-01 14:48:27 +02:00
Micha Reiser b456051be8
[red-knot] Add tracing to Salsa queries (#11949) 2024-06-20 13:33:41 +02:00
Alex Waygood 1d73d60bd3
[red-knot]: Add a VendoredFileSystem implementation (#11863)
Co-authored-by: Micha Reiser <micha@reiser.io>
2024-06-18 15:43:39 +00:00
Micha Reiser f666d79cd7
red-knot: Symbol table (#11860) 2024-06-18 13:10:45 +00:00
Micha Reiser 98b13b9844
red-knot: Add a method to resolve a file for an arbitrary `VfsPath` (#11826)
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
2024-06-18 12:03:30 +00:00
Micha Reiser 22b6488550
red-knot: Add directory support to `MemoryFileSystem` (#11825) 2024-06-13 07:48:28 +00:00
Micha Reiser d4dd96d1f4
red-knot: `source_text`, `line_index`, and `parsed_module` queries (#11822) 2024-06-13 07:37:02 +00:00