Commit Graph

12 Commits

Author SHA1 Message Date
Micha Reiser 341c2698a7
Run doctests as part of CI pipeline (#9939) 2024-02-12 10:18:58 +01:00
konsti 14e65afdc6
Update to Rust 1.74 and use new clippy lints table (#8722)
Update to [Rust
1.74](https://blog.rust-lang.org/2023/11/16/Rust-1.74.0.html) and use
the new clippy lints table.

The update itself introduced a new clippy lint about superfluous hashes
in raw strings, which got removed.

I moved our lint config from `rustflags` to the newly stabilized
[workspace.lints](https://doc.rust-lang.org/stable/cargo/reference/workspaces.html#the-lints-table).
One consequence is that we have to `unsafe_code = "warn"` instead of
"forbid" because the latter now actually bans unsafe code:

```
error[E0453]: allow(unsafe_code) incompatible with previous forbid
  --> crates/ruff_source_file/src/newlines.rs:62:17
   |
62 |         #[allow(unsafe_code)]
   |                 ^^^^^^^^^^^ overruled by previous forbid
   |
   = note: `forbid` lint level was set on command line
```

---------

Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
2023-11-16 18:12:46 -05:00
Charlie Marsh 93b5d8a0fb
Implement our own small-integer optimization (#7584)
## Summary

This is a follow-up to #7469 that attempts to achieve similar gains, but
without introducing malachite. Instead, this PR removes the `BigInt`
type altogether, instead opting for a simple enum that allows us to
store small integers directly and only allocate for values greater than
`i64`:

```rust
/// A Python integer literal. Represents both small (fits in an `i64`) and large integers.
#[derive(Clone, PartialEq, Eq, Hash)]
pub struct Int(Number);

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Number {
    /// A "small" number that can be represented as an `i64`.
    Small(i64),
    /// A "large" number that cannot be represented as an `i64`.
    Big(Box<str>),
}

impl std::fmt::Display for Number {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Number::Small(value) => write!(f, "{value}"),
            Number::Big(value) => write!(f, "{value}"),
        }
    }
}
```

We typically don't care about numbers greater than `isize` -- our only
uses are comparisons against small constants (like `1`, `2`, `3`, etc.),
so there's no real loss of information, except in one or two rules where
we're now a little more conservative (with the worst-case being that we
don't flag, e.g., an `itertools.pairwise` that uses an extremely large
value for the slice start constant). For simplicity, a few diagnostics
now show a dedicated message when they see integers that are out of the
supported range (e.g., `outdated-version-block`).

An additional benefit here is that we get to remove a few dependencies,
especially `num-bigint`.

## Test Plan

`cargo test`
2023-09-25 15:13:21 +00:00
Charlie Marsh 768686148f
Add support for unions to our Python builtins type system (#6541)
## Summary

Fixes some TODOs introduced in
https://github.com/astral-sh/ruff/pull/6538. In short, given an
expression like `1 if x > 0 else "Hello, world!"`, we now return a union
type that says the expression can resolve to either an `int` or a `str`.
The system remains very limited, it only works for obvious primitive
types, and there's no attempt to do inference on any more complex
variables. (If any expression yields `Unknown` or `TypeError`, we
propagate that result throughout and abort on the client's end.)
2023-08-13 18:00:50 -04:00
Micha Reiser 40f54375cb
Pull in RustPython parser (#6099) 2023-07-27 09:29:11 +00:00
Micha Reiser 2cf00fee96
Remove parser dependency from ruff-python-ast (#6096) 2023-07-26 17:47:22 +02:00
Charlie Marsh ed72c027a3
Replace `NoHashHasher` usages with `FxHashMap` (#6049)
## Summary

I had always assumed that `NoHashHasher` would be faster when using
integer keys, but benchmarking shows otherwise:

```
linter/default-rules/numpy/globals.py
                        time:   [66.544 µs 66.606 µs 66.678 µs]
                        thrpt:  [44.253 MiB/s 44.300 MiB/s 44.342 MiB/s]
                 change:
                        time:   [-0.1843% +0.1087% +0.3718%] (p = 0.46 > 0.05)
                        thrpt:  [-0.3704% -0.1086% +0.1847%]
                        No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
linter/default-rules/pydantic/types.py
                        time:   [1.3787 ms 1.3811 ms 1.3837 ms]
                        thrpt:  [18.431 MiB/s 18.466 MiB/s 18.498 MiB/s]
                 change:
                        time:   [-0.4827% -0.1074% +0.1927%] (p = 0.56 > 0.05)
                        thrpt:  [-0.1924% +0.1075% +0.4850%]
                        No change in performance detected.
linter/default-rules/numpy/ctypeslib.py
                        time:   [624.82 µs 625.96 µs 627.17 µs]
                        thrpt:  [26.550 MiB/s 26.601 MiB/s 26.650 MiB/s]
                 change:
                        time:   [-0.7071% -0.4908% -0.2736%] (p = 0.00 < 0.05)
                        thrpt:  [+0.2744% +0.4932% +0.7122%]
                        Change within noise threshold.
linter/default-rules/large/dataset.py
                        time:   [3.1585 ms 3.1634 ms 3.1685 ms]
                        thrpt:  [12.840 MiB/s 12.861 MiB/s 12.880 MiB/s]
                 change:
                        time:   [-1.5338% -1.3463% -1.1476%] (p = 0.00 < 0.05)
                        thrpt:  [+1.1610% +1.3647% +1.5577%]
                        Performance has improved.

linter/all-rules/numpy/globals.py
                        time:   [140.17 µs 140.37 µs 140.58 µs]
                        thrpt:  [20.989 MiB/s 21.020 MiB/s 21.051 MiB/s]
                 change:
                        time:   [-0.1066% +0.3140% +0.7479%] (p = 0.14 > 0.05)
                        thrpt:  [-0.7423% -0.3130% +0.1067%]
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe
linter/all-rules/pydantic/types.py
                        time:   [2.7030 ms 2.7069 ms 2.7112 ms]
                        thrpt:  [9.4064 MiB/s 9.4216 MiB/s 9.4351 MiB/s]
                 change:
                        time:   [-0.6721% -0.4874% -0.2974%] (p = 0.00 < 0.05)
                        thrpt:  [+0.2982% +0.4898% +0.6766%]
                        Change within noise threshold.
Found 14 outliers among 100 measurements (14.00%)
  12 (12.00%) high mild
  2 (2.00%) high severe
linter/all-rules/numpy/ctypeslib.py
                        time:   [1.4709 ms 1.4727 ms 1.4749 ms]
                        thrpt:  [11.290 MiB/s 11.306 MiB/s 11.320 MiB/s]
                 change:
                        time:   [-1.1617% -0.9766% -0.8094%] (p = 0.00 < 0.05)
                        thrpt:  [+0.8160% +0.9862% +1.1754%]
                        Change within noise threshold.
Found 12 outliers among 100 measurements (12.00%)
  9 (9.00%) high mild
  3 (3.00%) high severe
linter/all-rules/large/dataset.py
                        time:   [5.8086 ms 5.8163 ms 5.8240 ms]
                        thrpt:  [6.9854 MiB/s 6.9946 MiB/s 7.0038 MiB/s]
                 change:
                        time:   [-1.5651% -1.3536% -1.1584%] (p = 0.00 < 0.05)
                        thrpt:  [+1.1720% +1.3721% +1.5900%]
                        Performance has improved.
```

My guess is that `NoHashHasher` underperforms because the keys are not
randomly distributed...

Anyway, it's a ~1% (significant) performance gain on some of the above,
plus we get to remove a dependency.
2023-07-24 23:41:57 +00:00
Charlie Marsh 68b6d30c46
Use consistent `Cargo.toml` metadata in all crates (#5015) 2023-06-12 00:02:40 +00:00
Charlie Marsh ea31229be0
Track `TYPE_CHECKING` blocks in `Importer` (#4593) 2023-05-30 16:18:10 +00:00
Micha Reiser 652c644c2a
Introduce `ruff_index` crate (#4597) 2023-05-23 17:40:35 +02:00
Micha Reiser cab65b25da
Replace row/column based `Location` with byte-offsets. (#3931) 2023-04-26 18:11:02 +00:00
Charlie Marsh d919adc13c
Introduce a `ruff_python_semantic` crate (#3865) 2023-04-04 16:50:47 +00:00