## Summary
This PR updates Salsa to pull in Ibraheem's multithreading improvements (https://github.com/salsa-rs/salsa/pull/921).
## Performance
A small regression for single-threaded benchmarks is expected because
papaya is slightly slower than a `Mutex<FxHashMap>` in the uncontested
case (~10%). However, this shouldn't matter as much in practice because:
1. Salsa has a fast-path when only using 1 DB instance which is the
common case in production. This fast-path is not impacted by the changes
but we measure the slow paths in our benchmarks (because we use multiple
db instances)
2. Fixing the 10x slowdown for the congested case (multi threading)
outweights the downsides of a 10% perf regression for single threaded
use cases, especially considering that ty is heavily multi threaded.
## Test Plan
`cargo test`
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
As discussed on Slack, there was an issue with the walltime metrics with
`divan`. The issue was fixed with the latest version of `cargo-codspeed`
and `codpseed-divan-compat`:
https://github.com/CodSpeedHQ/codspeed-rust/releases/tag/v3.0.0.
This PR updates all crates related to CodSpeed. A performance increase
of the following benchmarks is expected, as now the correct metric will
be used.
```
crates/ruff_benchmark/benches/ty_walltime.rs::multithreaded[pydantic]
crates/ruff_benchmark/benches/ty_walltime.rs::small[altair]
crates/ruff_benchmark/benches/ty_walltime.rs::small[freqtrade]
crates/ruff_benchmark/benches/ty_walltime.rs::small[pydantic]
crates/ruff_benchmark/benches/ty_walltime.rs::small[tanjun]
```
Once this is merged, we will update the historic data of the affected
benchmark on the CodSpeed UI, so that no false positives will appear.
## Summary
Setting `TY_MEMORY_REPORT=full` will generate and print a memory usage
report to the CLI after a `ty check` run:
```
=======SALSA STRUCTS=======
`Definition` metadata=7.24MB fields=17.38MB count=181062
`Expression` metadata=4.45MB fields=5.94MB count=92804
`member_lookup_with_policy_::interned_arguments` metadata=1.97MB fields=2.25MB count=35176
...
=======SALSA QUERIES=======
`File -> ty_python_semantic::semantic_index::SemanticIndex`
metadata=11.46MB fields=88.86MB count=1638
`Definition -> ty_python_semantic::types::infer::TypeInference`
metadata=24.52MB fields=86.68MB count=146018
`File -> ruff_db::parsed::ParsedModule`
metadata=0.12MB fields=69.06MB count=1642
...
=======SALSA SUMMARY=======
TOTAL MEMORY USAGE: 577.61MB
struct metadata = 29.00MB
struct fields = 35.68MB
memo metadata = 103.87MB
memo fields = 409.06MB
```
Eventually, we should integrate these numbers into CI in some form. The
one limitation currently is that heap allocations in salsa structs (e.g.
interned values) are not tracked, but memoized values should have full
coverage. We may also want a peak memory usage counter (that accounts
for non-salsa memory), but that is relatively simple to profile manually
(e.g. `time -v ty check`) and would require a compile-time option to
avoid runtime overhead.
## Summary
Garbage collect ASTs once we are done checking a given file. Queries
with a cross-file dependency on the AST will reparse the file on demand.
This reduces ty's peak memory usage by ~20-30%.
The primary change of this PR is adding a `node_index` field to every
AST node, that is assigned by the parser. `ParsedModule` can use this to
create a flat index of AST nodes any time the file is parsed (or
reparsed). This allows `AstNodeRef` to simply index into the current
instance of the `ParsedModule`, instead of storing a pointer directly.
The indices are somewhat hackily (using an atomic integer) assigned by
the `parsed_module` query instead of by the parser directly. Assigning
the indices in source-order in the (recursive) parser turns out to be
difficult, and collecting the nodes during semantic indexing is
impossible as `SemanticIndex` does not hold onto a specific
`ParsedModuleRef`, which the pointers in the flat AST are tied to. This
means that we have to do an extra AST traversal to assign and collect
the nodes into a flat index, but the small performance impact (~3% on
cold runs) seems worth it for the memory savings.
Part of https://github.com/astral-sh/ty/issues/214.
My editor runs `rustfmt` on save to format Rust code, not `cargo fmt`.
With our recent bump to the Rust 2024 edition, the formatting that
`rustfmt`/`cargo fmt` applies changed. Unfortunately, `rustfmt` and
`cargo fmt` have different behaviors for determining which edition to
use when formatting: `cargo fmt` looks for the Rust edition in
`Cargo.toml`, whereas `rustfmt` looks for it in `rustfmt.toml`. As a
result, whenever I save, I have to remember to manually run `cargo fmt`
before committing/pushing.
There is an open issue asking for `rustfmt` to also look at `Cargo.toml`
when it's present (https://github.com/rust-lang/rust.vim/issues/368),
but it seems like they "closed" that issue just by bumping the default
edition (six years ago, from 2015 to 2018).
In the meantime, this PR adds a `rustfmt.toml` file with our current
Rust edition so that both invocation have the same behavior. I don't
love that this duplicates information in `Cargo.toml`, but I've added a
reminder comment there to hopefully ensure that we bump the edition in
both places three years from now.
## Summary
This PR deletes the `DiagnosticKind` type by inlining its three fields
(`name`, `body`, and `suggestion`) into three other diagnostic types:
`Diagnostic`, `DiagnosticMessage`, and `CacheMessage`.
Instead of deferring to an internal `DiagnosticKind`, both `Diagnostic`
and `DiagnosticMessage` now have their own macro-generated `AsRule`
implementations.
This should make both https://github.com/astral-sh/ruff/pull/18051 and
another follow-up PR changing the type of `name` on `CacheMessage`
easier since its type will be able to change separately from
`Diagnostic` and `DiagnosticMessage`.
## Test Plan
Existing tests
## Summary
* Add initial support for `typing.dataclass_transform`
* Support decorating a function decorator with `@dataclass_transform(…)`
(used by `attrs`, `strawberry`)
* Support decorating a metaclass with `@dataclass_transform(…)` (used by
`pydantic`, but doesn't work yet, because we don't seem to model
`__new__` calls correctly?)
* *No* support yet for decorating base classes with
`@dataclass_transform(…)`. I haven't figured out how this even supposed
to work. And haven't seen it being used.
* Add `strawberry` as an ecosystem project, as it makes heavy use of
`@dataclass_transform`
## Test Plan
New Markdown tests
We weren't really using `chrono` for anything other than getting the
current time and formatting it for logs.
Unfortunately, this doesn't quite get us to a point where `chrono`
can be removed. From what I can tell, we're still bringing it via
[`tracing-subscriber`](https://docs.rs/tracing-subscriber/latest/tracing_subscriber/)
and
[`quick-junit`](https://docs.rs/quick-junit/latest/quick_junit/).
`tracing-subscriber` does have an
[issue open about Jiff](https://github.com/tokio-rs/tracing/discussions/3128),
but there's no movement on it.
Normally I'd suggest holding off on this since it doesn't get us all of
the way there and it would be better to avoid bringing in two datetime
libraries, but we are, it appears, already there. In particular,
`env_logger` brings in Jiff. So this PR doesn't really make anything
worse, but it does bring us closer to an all-Jiff world.
This causes spurious query cycles.
This PR also includes an update to Salsa, which gives us db events on
cycle iteration, so we can write tests asserting the absence of a cycle.
Putting this up to confirm that it does what it should:
* undirty the release.yml by including action-commits in the config
* add persist-credentials=false hardening