mirror of
https://github.com/astral-sh/ruff
synced 2026-01-22 14:00:51 -05:00
## Summary This PR uses the new `Diagnostic` type for rendering formatter diagnostics. This allows the formatter to inherit all of the output formats already implemented in the linter and ty. For example, here's the new `full` output format, with the formatting diff displayed using the same infrastructure as the linter: <img width="592" height="364" alt="image" src="https://github.com/user-attachments/assets/6d09817d-3f27-4960-aa8b-41ba47fb4dc0" /> <details><summary>Resolved TODOs</summary> <p> ~~There are several limitiations/todos here still, especially around the `OutputFormat` type~~: - [x] A few literal `todo!`s for the remaining `OutputFormat`s without matching `DiagnosticFormat`s - [x] The default output format is `full` instead of something more concise like the current output - [x] Some of the output formats (namely JSON) have information that doesn't make much sense for these diagnostics The first of these is definitely resolved, and I think the other two are as well, based on discussion on the design document. In brief, we're okay inheriting the default `OutputFormat` and can separate the global option into `lint.output-format` and `format.output-format` in the future, if needed; and we're okay including redundant information in the non-human-readable output formats. My last major concern is with the performance of the new code, as discussed in the `Benchmarks` section below. A smaller question is whether we should use `Diagnostic`s for formatting errors too. I think the answer to this is yes, in line with changes we're making in the linter too. I still need to implement that here. </p> </details> <details><summary>Benchmarks</summary> <p> The values in the table are from a large benchmark on the CPython 3.10 code base, which involves checking 2011 files, 1872 of which need to be reformatted. `stable` corresponds to the same code used on `main`, while `preview-full` and `preview-concise` use the new `Diagnostic` code gated behind `--preview` for the `full` and `concise` output formats, respectively. `stable-diff` uses the `--diff` to compare the two diff rendering approaches. See the full hyperfine command below for more details. For a sense of scale, the `stable` output format produces 1873 lines on stdout, compared to 855,278 for `preview-full` and 857,798 for `stable-diff`. | Command | Mean [ms] | Min [ms] | Max [ms] | Relative | |:------------------|--------------:|---------:|---------:|-------------:| | `stable` | 201.2 ± 6.8 | 192.9 | 220.6 | 1.00 | | `preview-full` | 9113.2 ± 31.2 | 9076.1 | 9152.0 | 45.29 ± 1.54 | | `preview-concise` | 214.2 ± 1.4 | 212.0 | 217.6 | 1.06 ± 0.04 | | `stable-diff` | 3308.6 ± 20.2 | 3278.6 | 3341.8 | 16.44 ± 0.56 | In summary, the `preview-concise` diagnostics are ~6% slower than the stable output format, increasing the average runtime from 201.2 ms to 214.2 ms. The `full` preview diagnostics are much more expensive, taking over 9113.2 ms to complete, which is ~3x more expensive even than the stable diffs produced by the `--diff` flag. My main takeaways here are: 1. Rendering `Edit`s is much more expensive than rendering the diffs from `--diff` 2. Constructing `Edit`s actually isn't too bad ### Constructing `Edit`s I also took a closer look at `Edit` construction by modifying the code and repeating the `preview-concise` benchmark and found that the main issue is constructing a `SourceFile` for use in the `Edit` rendering. Commenting out the `Edit` construction itself has basically no effect: | Command | Mean [ms] | Min [ms] | Max [ms] | Relative | |:----------|------------:|---------:|---------:|------------:| | `stable` | 197.5 ± 1.6 | 195.0 | 200.3 | 1.00 | | `no-edit` | 208.9 ± 2.2 | 204.8 | 212.2 | 1.06 ± 0.01 | However, also omitting the source text from the `SourceFile` construction resolves the slowdown compared to `stable`. So it seems that copying the full source text into a `SourceFile` is the main cause of the slowdown for non-`full` diagnostics. | Command | Mean [ms] | Min [ms] | Max [ms] | Relative | |:-----------------|------------:|---------:|---------:|------------:| | `stable` | 202.4 ± 2.9 | 197.6 | 207.9 | 1.00 | | `no-source-text` | 202.7 ± 3.3 | 196.3 | 209.1 | 1.00 ± 0.02 | ### Rendering diffs The main difference between `stable-diff` and `preview-full` seems to be the diffing strategy we use from `similar`. Both versions use the same algorithm, but in the existing [`CodeDiff`](https://github.com/astral-sh/ruff/blob/main/crates/ruff_linter/src/source_kind.rs#L259) rendering for the `--diff` flag, we only do line-level diffing, whereas for `Diagnostic`s we use `TextDiff::iter_inline_changes` to highlight word-level changes too. Skipping the word diff for `Diagnostic`s closes most of the gap: | Command | Mean [s] | Min [s] | Max [s] | Relative | |:---|---:|---:|---:|---:| | `stable-diff` | 3.323 ± 0.015 | 3.297 | 3.341 | 1.00 | | `preview-full` | 3.654 ± 0.019 | 3.618 | 3.682 | 1.10 ± 0.01 | (In some repeated runs, I've seen as small as a ~5% difference, down from 10% in the table) This doesn't actually change any of our snapshots, but it would obviously change the rendered result in a terminal since we wouldn't highlight the specific words that changed within a line. Another much smaller change that we can try is removing the deadline from the `iter_inline_changes` call. It looks like there's a fair amount of overhead from the default 500 ms deadline for computing these, and using `iter_inline_changes(op, None)` (`None` for the optional deadline argument) improves the runtime quite a bit: | Command | Mean [s] | Min [s] | Max [s] | Relative | |:---|---:|---:|---:|---:| | `stable-diff` | 3.322 ± 0.013 | 3.298 | 3.341 | 1.00 | | `preview-full` | 5.296 ± 0.030 | 5.251 | 5.366 | 1.59 ± 0.01 | <hr> <details><summary>hyperfine command</summary> ```shell cargo build --release --bin ruff && hyperfine --ignore-failure --warmup 10 --export-markdown /tmp/table.md \ -n stable -n preview-full -n preview-concise -n stable-diff \ "./target/release/ruff format --check ./crates/ruff_linter/resources/test/cpython/ --no-cache" \ "./target/release/ruff format --check ./crates/ruff_linter/resources/test/cpython/ --no-cache --preview --output-format=full" \ "./target/release/ruff format --check ./crates/ruff_linter/resources/test/cpython/ --no-cache --preview --output-format=concise" \ "./target/release/ruff format --check ./crates/ruff_linter/resources/test/cpython/ --no-cache --diff" ``` </details> </p> </details> ## Test Plan Some new CLI tests and manual testing
224 lines
7.0 KiB
Rust
224 lines
7.0 KiB
Rust
use std::any::Any;
|
|
use std::backtrace::BacktraceStatus;
|
|
use std::cell::Cell;
|
|
use std::panic::Location;
|
|
use std::sync::OnceLock;
|
|
|
|
#[derive(Debug)]
|
|
pub struct PanicError {
|
|
pub location: Option<String>,
|
|
pub payload: Payload,
|
|
pub backtrace: Option<std::backtrace::Backtrace>,
|
|
pub salsa_backtrace: Option<salsa::Backtrace>,
|
|
}
|
|
|
|
#[derive(Debug)]
|
|
pub struct Payload(Box<dyn std::any::Any + Send>);
|
|
|
|
impl Payload {
|
|
pub fn as_str(&self) -> Option<&str> {
|
|
if let Some(s) = self.0.downcast_ref::<String>() {
|
|
Some(s)
|
|
} else if let Some(s) = self.0.downcast_ref::<&str>() {
|
|
Some(s)
|
|
} else {
|
|
None
|
|
}
|
|
}
|
|
|
|
pub fn downcast_ref<R: Any>(&self) -> Option<&R> {
|
|
self.0.downcast_ref::<R>()
|
|
}
|
|
}
|
|
|
|
impl PanicError {
|
|
pub fn to_diagnostic_message(&self, path: Option<impl std::fmt::Display>) -> String {
|
|
use std::fmt::Write;
|
|
|
|
let mut message = String::new();
|
|
message.push_str("Panicked");
|
|
|
|
if let Some(location) = &self.location {
|
|
let _ = write!(&mut message, " at {location}");
|
|
}
|
|
|
|
if let Some(path) = path {
|
|
let _ = write!(&mut message, " when checking `{path}`");
|
|
}
|
|
|
|
if let Some(payload) = self.payload.as_str() {
|
|
let _ = write!(&mut message, ": `{payload}`");
|
|
}
|
|
|
|
message
|
|
}
|
|
}
|
|
|
|
impl std::fmt::Display for PanicError {
|
|
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
|
|
write!(f, "panicked at")?;
|
|
if let Some(location) = &self.location {
|
|
write!(f, " {location}")?;
|
|
}
|
|
if let Some(payload) = self.payload.as_str() {
|
|
write!(f, ":\n{payload}")?;
|
|
}
|
|
if let Some(query_trace) = self.salsa_backtrace.as_ref() {
|
|
let _ = writeln!(f, "{query_trace}");
|
|
}
|
|
|
|
if let Some(backtrace) = &self.backtrace {
|
|
match backtrace.status() {
|
|
BacktraceStatus::Disabled => {
|
|
writeln!(
|
|
f,
|
|
"\nrun with `RUST_BACKTRACE=1` environment variable to display a backtrace"
|
|
)?;
|
|
}
|
|
BacktraceStatus::Captured => {
|
|
writeln!(f, "\nBacktrace: {backtrace}")?;
|
|
}
|
|
_ => {}
|
|
}
|
|
}
|
|
|
|
Ok(())
|
|
}
|
|
}
|
|
|
|
#[derive(Default)]
|
|
struct CapturedPanicInfo {
|
|
backtrace: Option<std::backtrace::Backtrace>,
|
|
location: Option<String>,
|
|
salsa_backtrace: Option<salsa::Backtrace>,
|
|
}
|
|
|
|
thread_local! {
|
|
static CAPTURE_PANIC_INFO: Cell<bool> = const { Cell::new(false) };
|
|
static LAST_BACKTRACE: Cell<CapturedPanicInfo> = const {
|
|
Cell::new(CapturedPanicInfo { backtrace: None, location: None, salsa_backtrace: None })
|
|
};
|
|
}
|
|
|
|
fn install_hook() {
|
|
static ONCE: OnceLock<()> = OnceLock::new();
|
|
ONCE.get_or_init(|| {
|
|
let prev = std::panic::take_hook();
|
|
std::panic::set_hook(Box::new(move |info| {
|
|
let should_capture = CAPTURE_PANIC_INFO.with(Cell::get);
|
|
if !should_capture {
|
|
return (*prev)(info);
|
|
}
|
|
|
|
let location = info.location().map(Location::to_string);
|
|
let backtrace = Some(std::backtrace::Backtrace::capture());
|
|
|
|
LAST_BACKTRACE.set(CapturedPanicInfo {
|
|
backtrace,
|
|
location,
|
|
salsa_backtrace: salsa::Backtrace::capture(),
|
|
});
|
|
}));
|
|
});
|
|
}
|
|
|
|
/// Invokes a closure, capturing and returning the cause of an unwinding panic if one occurs.
|
|
///
|
|
/// ### Thread safety
|
|
///
|
|
/// This is implemented by installing a custom [panic hook](std::panic::set_hook). This panic hook
|
|
/// is a global resource. The hook that we install captures panic info in a thread-safe manner,
|
|
/// and also ensures that any threads that are _not_ currently using this `catch_unwind` wrapper
|
|
/// still use the previous hook (typically the default hook, which prints out panic information to
|
|
/// stderr).
|
|
///
|
|
/// We assume that there is nothing else running in this process that needs to install a competing
|
|
/// panic hook. We are careful to install our custom hook only once, and we do not ever restore
|
|
/// the previous hook (since you can always retain the previous hook's behavior by not calling this
|
|
/// wrapper).
|
|
pub fn catch_unwind<F, R>(f: F) -> Result<R, PanicError>
|
|
where
|
|
F: FnOnce() -> R + std::panic::UnwindSafe,
|
|
{
|
|
install_hook();
|
|
let prev_should_capture = CAPTURE_PANIC_INFO.replace(true);
|
|
let result = std::panic::catch_unwind(f).map_err(|payload| {
|
|
// Try to get the backtrace and location from our custom panic hook.
|
|
// The custom panic hook only runs once when `panic!` is called (or similar). It doesn't
|
|
// run when the panic is propagated with `std::panic::resume_unwind`. The panic hook
|
|
// is also not called when the panic is raised with `std::panic::resum_unwind` as is the
|
|
// case for salsa unwinds (see the ignored test below).
|
|
// Because of that, always take the payload from `catch_unwind` because it may have been transformed
|
|
// by an inner `std::panic::catch_unwind` handlers and only use the information
|
|
// from the custom handler to enrich the error with the backtrace and location.
|
|
let CapturedPanicInfo {
|
|
location,
|
|
backtrace,
|
|
salsa_backtrace,
|
|
} = LAST_BACKTRACE.with(Cell::take);
|
|
|
|
PanicError {
|
|
location,
|
|
payload: Payload(payload),
|
|
backtrace,
|
|
salsa_backtrace,
|
|
}
|
|
});
|
|
CAPTURE_PANIC_INFO.set(prev_should_capture);
|
|
result
|
|
}
|
|
|
|
#[cfg(test)]
|
|
mod tests {
|
|
use salsa::{Database, Durability};
|
|
|
|
#[test]
|
|
#[ignore = "super::catch_unwind installs a custom panic handler, which could effect test isolation"]
|
|
fn no_backtrace_for_salsa_cancelled() {
|
|
#[salsa::input]
|
|
struct Input {
|
|
value: u32,
|
|
}
|
|
|
|
#[salsa::tracked]
|
|
fn test_query(db: &dyn Database, input: Input) -> u32 {
|
|
loop {
|
|
// This should throw a cancelled error
|
|
let _ = input.value(db);
|
|
}
|
|
}
|
|
|
|
let db = salsa::DatabaseImpl::new();
|
|
|
|
let input = Input::new(&db, 42);
|
|
|
|
let result = std::thread::scope(move |scope| {
|
|
{
|
|
let mut db = db.clone();
|
|
scope.spawn(move || {
|
|
// This will cancel the other thread by throwing a `salsa::Cancelled` error.
|
|
db.synthetic_write(Durability::MEDIUM);
|
|
});
|
|
}
|
|
|
|
{
|
|
scope.spawn(move || {
|
|
super::catch_unwind(|| {
|
|
test_query(&db, input);
|
|
})
|
|
})
|
|
}
|
|
.join()
|
|
.unwrap()
|
|
});
|
|
|
|
match result {
|
|
Ok(_) => panic!("Expected query to panic"),
|
|
Err(err) => {
|
|
// Panics triggered with `resume_unwind` have no backtrace.
|
|
assert!(err.backtrace.is_none());
|
|
}
|
|
}
|
|
}
|
|
}
|