Commit Graph

167 Commits

Author SHA1 Message Date
Charlie Marsh
8d1610d960 Implement Display on formatter structs (#6983)
Feedback from
https://github.com/astral-sh/ruff/pull/6948#discussion_r1308260021.
2023-08-29 16:57:26 +00:00
Charlie Marsh
fad23bbe60 Add a --check flag to the formatter CLI (#6982)
## Summary

Returns an exit code of 1 if any files would be reformatted:

```
ruff on  charlie/format-check:main [$?⇡] is 📦 v0.0.286 via 🐍 v3.11.2 via 🦀 v1.72.0
❯ cargo run -p ruff_cli -- format foo.py --check
   Compiling ruff_cli v0.0.286 (/Users/crmarsh/workspace/ruff/crates/ruff_cli)
    Finished dev [unoptimized + debuginfo] target(s) in 1.69s
     Running `target/debug/ruff format foo.py --check`
warning: `ruff format` is a work-in-progress, subject to change at any time, and intended only for experimentation.
1 file would be reformatted
ruff on  charlie/format-check:main [$?⇡] is 📦 v0.0.286 via 🐍 v3.11.2 via 🦀 v1.72.0 took 2s
❯ echo $?
1
```
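
Conceptually, check mode boils down to mapping the changed-file count to an exit code. A minimal sketch (not the actual `ruff_cli` code; names assumed):

```rust
use std::process::ExitCode;

// Hypothetical sketch: report how many files differ and exit non-zero if any do.
fn check_mode_exit(changed: usize) -> ExitCode {
    if changed > 0 {
        let plural = if changed == 1 { "" } else { "s" };
        println!("{changed} file{plural} would be reformatted");
        ExitCode::FAILURE
    } else {
        ExitCode::SUCCESS
    }
}
```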

Closes #6966.
2023-08-29 12:40:00 -04:00
Charlie Marsh
25c374856a Move stdin formatting to its own command file (#6981)
## Summary

This is similar to `commands::check` vs. `commands::check_stdin`, and
gets the logic out of the parent file (`lib.rs`). It also ensures that
we avoid formatting files that should be excluded when `--force-exclude`
is provided.
2023-08-29 16:06:10 +00:00
Charlie Marsh
34221346c1 Rename run.rs command to check.rs (#6980)
The CLI command is called "check", so this is more consistent (and
consistent with the pattern used in other commands).
2023-08-29 15:52:06 +00:00
Charlie Marsh
1439bb592e Report number of changed files on the format CLI (#6948)
## Summary

Very basic summary:

<img width="962" alt="Screen Shot 2023-08-28 at 1 17 37 PM"
src="https://github.com/astral-sh/ruff/assets/1309177/53537aca-7579-44d8-855b-f4553affae50">

If you run with `--verbose`, we'll also show you the timing:

<img width="962" alt="Screen Shot 2023-08-28 at 1 17 58 PM"
src="https://github.com/astral-sh/ruff/assets/1309177/63cbd13e-9462-4e49-b3a3-c6663a7ad41c">
2023-08-28 18:42:31 -04:00
Charlie Marsh
ec575188c4 Narrow the supported options on the format CLI (#6944)
## Summary

Ensures that we only show supported options:

<img width="1228" alt="Screen Shot 2023-08-28 at 11 03 16 AM"
src="https://github.com/astral-sh/ruff/assets/1309177/50fb7595-dc30-43d2-a7e4-c0103acc15b9">

For now, I'm not super focused on DRYing up the CLI.
2023-08-28 15:28:22 +00:00
Charlie Marsh
58f5f27dc3 Add TOML files to SourceType (#6929)
## Summary

This PR adds a higher-level enum (`SourceType`) around `PySourceType` to
allow us to use the same detection path to handle TOML files. Right now,
we have ad hoc `is_pyproject_toml` checks littered around, and some
codepaths are omitting that logic altogether (like `add_noqa`). Instead,
we should always be required to check the source type and handle TOML
files as appropriate.
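
A rough sketch of the shape this describes (variant and helper names are assumptions, not the PR's exact definitions):

```rust
use std::path::Path;

// Hypothetical sketch of the higher-level enum; the real definitions may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PySourceType { Python, Stub, Ipynb }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TomlSourceType { Pyproject, Poetry, Pipfile }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SourceType {
    Python(PySourceType),
    Toml(TomlSourceType),
}

impl SourceType {
    /// A single detection path for both Python and TOML files.
    fn from_path(path: &Path) -> Option<SourceType> {
        match path.file_name()?.to_str()? {
            "pyproject.toml" => Some(SourceType::Toml(TomlSourceType::Pyproject)),
            "poetry.lock" => Some(SourceType::Toml(TomlSourceType::Poetry)),
            "Pipfile" => Some(SourceType::Toml(TomlSourceType::Pipfile)),
            name if name.ends_with(".ipynb") => Some(SourceType::Python(PySourceType::Ipynb)),
            name if name.ends_with(".pyi") => Some(SourceType::Python(PySourceType::Stub)),
            name if name.ends_with(".py") => Some(SourceType::Python(PySourceType::Python)),
            _ => None,
        }
    }
}
```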

This PR will also help with our pre-commit capabilities. If we add
`toml` to pre-commit (to support `pyproject.toml`), pre-commit will
start to pass _other_ files to Ruff (along with `poetry.lock` and
`Pipfile` -- see
[identify](b59996304f/identify/extensions.py (L355))).
By detecting those files and handling those cases, we avoid attempting
to parse them as Python files, which would lead to pre-commit errors.
(We tried to add `toml` to pre-commit here
(https://github.com/astral-sh/ruff-pre-commit/pull/44), but had to
revert here (https://github.com/astral-sh/ruff-pre-commit/pull/45) as it
led to the pre-commit hook attempting to parse `poetry.lock` files as
Python files.)
2023-08-28 15:01:48 +00:00
konsti
e615870659 Unify line size settings between ruff and the formatter (#6873) 2023-08-28 06:44:56 +00:00
Micha Reiser
a6aa16630d Move Configuration to ruff_workspace crate (#6920) 2023-08-28 06:21:35 +00:00
Charlie Marsh
6bc1ba6d62 Use stdin for formatter when --stdin-filename is provided (#6926)
## Summary

Just making the formatter CLI more consistent with the linter -- e.g.,
we now use stdin on invocations like `cat foo.py | cargo run -p ruff_cli
-- format -- --stdin-filename=foo.py`, instead of _only_ relying on the
`-` file (and use the same helper as the linter to facilitate this).
2023-08-27 20:32:18 +00:00
Charlie Marsh
cd47368ae4 Use consistent formatting for user-facing formatter errors (#6925)
## Summary

This PR changes the alpha formatter CLI to use the same format for
errors as the linter, e.g.:

<img width="868" alt="Screen Shot 2023-08-27 at 4 03 30 PM"
src="https://github.com/astral-sh/ruff/assets/1309177/9f3dea37-593b-4788-a0c0-e64bcf0d0560">
2023-08-27 20:22:06 +00:00
Charlie Marsh
a871714705 Collect result in format CLI (#6924) 2023-08-27 20:02:18 +00:00
konsti
c2413dcd2c Add prototype of ruff format for projects (#6871)
**Summary** Add recursive formatting based on `ruff check` file
discovery for `ruff format`, as a prototype for the formatter alpha.
This allows e.g. `ruff format ../projects/django/`. It's still lacking
support for any settings except line length.

Note just like the existing `ruff format` this will become part of the
production build, i.e. you'll be able to use it - hidden by default and
with a prominent warning - with `ruff format .` after the next release.

Error handling works in my manual tests (the colors do also work):

```
$  target/debug/ruff format scripts/
warning: `ruff format` is a work-in-progress, subject to change at any time, and intended for internal use only.
```
(the above changes `add_rule.py` where we have the wrong bin op
breaking)

```
$ target/debug/ruff format ../projects/django/
warning: `ruff format` is a work-in-progress, subject to change at any time, and intended for internal use only.
Failed to format /home/konsti/projects/django/tests/test_runner_apps/tagged/tests_syntax_error.py: source contains syntax errors: ParseError { error: UnrecognizedToken(Name { name: "syntax_error" }, None), offset: 131, source_path: "<filename>" }
```

```
$ target/debug/ruff format a
warning: `ruff format` is a work-in-progress, subject to change at any time, and intended for internal use only.
Failed to read /home/konsti/ruff/a/d.py: Permission denied (os error 13)
```

**Test Plan** Missing! I'm not sure if it's worth building tests at this
stage or what they should look like.
2023-08-27 19:12:18 +00:00
Micha Reiser
3f3494ad44 Implement ConfigProcessor on non-ref type (#6915) 2023-08-27 15:03:11 +02:00
Dhruv Manilawala
d1f07008f7 Rename Notebook related symbols (#6862)
This PR renames the following symbols:

* `PySourceType::Jupyter` -> `PySourceType::Ipynb`
* `SourceKind::Jupyter` -> `SourceKind::IpyNotebook`
* `JupyterIndex` -> `NotebookIndex`
2023-08-25 11:40:54 +05:30
konsti
aafde6db28 Remove some indexing (#6728)
**Summary** A common pattern in the code used to be
```rust
if statements.len() != 1 {
    return;
}
use_single_entry(statements[0])?;
```
which can be better expressed as
```rust
let [statement] = statements else {
    return;
};
use_single_entry(statement)?;
```

Direct indexing can cause panics if you don't manually take care of
checking the length, while matching (such as if-let or let-else) can
never panic.

This isn't a complete refactor, I've just removed some of the obvious
cases. I've specifically looked for `.len() != 1` and fixed those.

**Test Plan** No functional changes
2023-08-21 16:56:15 +02:00
Micha Reiser
ea72d5feba Refactor SourceKind to store file content (#6640) 2023-08-18 13:45:38 +00:00
Charlie Marsh
2aeb27334d Avoid cloning source code multiple times (#6629)
## Summary

In working on https://github.com/astral-sh/ruff/pull/6628, I noticed
that we clone the source code contents, potentially multiple times,
prior to linting. The issue is that `SourceKind::Python` takes a
`String`, so we first have to provide it with a `String`. In the stdin
case, that means cloning. However, on top of this, we then have to clone
`source_kind.contents()` because `SourceKind` gets mutated. So for
stdin, we end up cloning twice. For non-stdin, we end up cloning once,
but unnecessarily (since the _contents_ don't get mutated, only the
kind).

This PR removes the `String` from `source_kind`, instead requiring that
we parse it out elsewhere. It reduces the number of clones down to 1 for
Jupyter Notebooks, and zero otherwise.
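
A sketch of the before/after shape (assumed, simplified definitions):

```rust
// Simplified stand-in for the notebook type.
struct Notebook;

// Before: the kind owned the source text, so callers had to hand over a
// String (cloning in the stdin case) even when nothing was mutated.
enum SourceKindBefore {
    Python(String),
    IpyNotebook(Notebook),
}

// After: the contents live outside the kind, so only the notebook path
// (which actually transforms the source) ever needs its own copy.
enum SourceKindAfter {
    Python,
    IpyNotebook(Notebook),
}
```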
2023-08-18 09:32:18 -04:00
Charlie Marsh
98b9f2e705 Respect .ipynb and .pyi sources when linting from stdin (#6628)
## Summary

When running Ruff from stdin, we were always falling back to the default
source type, even if the user specified a path (as is the case when
running from the LSP). This PR wires up the source type inference, which
means we now get the expected result when checking `.pyi` and `.ipynb`
files.

Closes #6627.

## Test Plan

Verified that `cat
crates/ruff/resources/test/fixtures/jupyter/valid.ipynb | cargo run -p
ruff_cli -- --force-exclude --no-cache --no-fix --isolated --select ALL
--stdin-filename foo.ipynb -` yielded the expected results (and differs
from the errors you get if you omit the filename).

Verified that `cat foo.pyi | cargo run -p ruff_cli -- --force-exclude
--no-cache --no-fix --format json --isolated --select TCH
--stdin-filename path/to/foo.pyi -` yielded no errors.
2023-08-16 20:33:59 +00:00
Dhruv Manilawala
32fa05765a Use Jupyter mode while parsing Notebook files (#5552)
## Summary

Enable using the new `Mode::Jupyter` for the tokenizer/parser to parse
Jupyter line magic tokens.

The individual call to the lexer i.e., `lex_starts_at` done by various
rules should consider the context of the source code (is this content
from a Jupyter Notebook?). Thus, a new field `source_type` (of type
`PySourceType`) is added to `Checker` which is being passed around as an
argument to the relevant functions. This is then used to determine the
`Mode` for the lexer.
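
The dispatch is roughly this sketch (names assumed; the real code threads `source_type` through the `Checker`):

```rust
// Hypothetical sketch of choosing the lexer mode from the source type.
#[derive(Clone, Copy, PartialEq, Eq)]
enum PySourceType { Python, Stub, Jupyter }

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Mode { Module, Jupyter }

fn lexer_mode(source_type: PySourceType) -> Mode {
    match source_type {
        // Notebook cells may contain line magics like `%matplotlib inline`.
        PySourceType::Jupyter => Mode::Jupyter,
        _ => Mode::Module,
    }
}
```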

## Test Plan

Add new test cases to make sure that the magic statement is considered
while generating the diagnostic and autofix:
* For `I001`, if there's a magic statement in between two import blocks,
they should be sorted independently

fixes: #6090
2023-08-05 00:32:07 +00:00
konsti
1031bb6550 Formatter: Add SourceType to context to enable special formatting for stub files (#6331)
**Summary** This adds the information about whether we're in a `.py` Python
source file or a `.pyi` stub file, to enable people working on #5822 and
related issues.

I'm not completely happy with `Default` for something that depends on
the input.

**Test Plan** None, this is currently unused; I'm leaving this to the first
implementation of stub-file-specific formatting.

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
2023-08-04 11:52:26 +00:00
konsti
1df7e9831b Replace .map_or(false, $closure) with .is_some_and(closure) (#6244)
**Summary**
[Option::is_some_and](https://doc.rust-lang.org/stable/std/option/enum.Option.html#method.is_some_and)
and
[Result::is_ok_and](https://doc.rust-lang.org/std/result/enum.Result.html#method.is_ok_and)
are new methods in Rust 1.70. I find them way more readable than
`.map_or(false, ...)`.

The changes are `s/.map_or(false,/.is_some_and(/g`, then manually
switching to `is_ok_and` where the value is a Result rather than an
Option.
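
For illustration, the before/after shape (Rust 1.70+):

```rust
fn main() {
    let opt: Option<i32> = Some(3);
    // Before:
    assert!(opt.map_or(false, |x| x > 0));
    // After:
    assert!(opt.is_some_and(|x| x > 0));

    // And for Results:
    let res: Result<i32, ()> = Ok(3);
    assert!(res.is_ok_and(|x| x > 0));
}
```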

**Test Plan** n/a
2023-08-01 19:29:42 +02:00
konsti
e52b636da0 Log configuration in ruff_dev (#6193)
**Summary** This includes two changes:
 * Allow setting `-v` in `ruff_dev`, using the `ruff_cli` implementation
 * `debug!` which ruff configuration strategy was used

This is a byproduct of debugging #6187.

**Test Plan** n/a
2023-07-31 17:52:38 +00:00
Charlie Marsh
4231ed2fc3 Skip partial duplicates when applying multi-edit fixes (#6144)
## Summary

Right now, if we have two fixes that have an overlapping edit, but not
an _identical_ set of edits, they'll conflict, causing us to do another
linter traversal. Here, I've enabled the fixer to support partially
overlapping edits, which (as an example) lets us greatly reduce the
number of iterations required in the test suite.

The most common case here is that in which a bunch of edits need to
import some symbol, and then use that symbol, but in different ways. In
that case, all edits will have a common fix (to import the symbol), but
deviate in some way. With this change, we can do all of those edits in
one pass.

Note that the simplest way to enable this was to store sorted edits on
`Fix`. We don't allow modifying the edits on `Fix` once it's
constructed, so this is an easy change, and allows us to avoid a bunch
of clones and traversals later on.
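
A simplified sketch of the idea (ruff's actual `Fix`/`Edit` types and conflict rules are richer): apply fixes in order, treating identical edits (e.g. the shared "add import" edit) as compatible, and only skipping fixes whose edits partially overlap something already applied.

```rust
#[derive(Clone, Copy, PartialEq, Eq)]
struct Edit {
    start: usize,
    end: usize,
}

fn apply_compatible(fixes: &[Vec<Edit>]) -> Vec<Edit> {
    let mut applied: Vec<Edit> = Vec::new();
    for edits in fixes {
        let conflicts = edits.iter().any(|edit| {
            applied.iter().any(|prior| {
                // Identical edits are deduplicated, not conflicting.
                prior != edit && edit.start < prior.end && prior.start < edit.end
            })
        });
        if !conflicts {
            for edit in edits {
                if !applied.contains(edit) {
                    applied.push(*edit);
                }
            }
        }
    }
    applied
}
```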

Closes #5800.
2023-07-29 12:11:57 +00:00
Dhruv Manilawala
3c99fbf808 Implement --diff for Jupyter Notebooks (#6149)
## Summary

Implement `--diff` for Jupyter Notebooks

## Test Plan

1. Use `crates/ruff/resources/test/fixtures/jupyter/isort.ipynb` as a
   test case and add a markdown cell in between the code cells to check
   that the diff outputs the correct cell index.
2. Run the command:
`cargo run --bin ruff --package ruff_cli -- check --no-cache --isolated
--select=ALL crates/ruff/resources/test/fixtures/jupyter/isort.ipynb
--fix --diff`

<details><summary>Example output:</summary>
<p>

```diff
--- /Users/dhruv/playground/ruff/notebooks/test.ipynb:cell 0
+++ /Users/dhruv/playground/ruff/notebooks/test.ipynb:cell 0
@@ -1,3 +0,0 @@
-from pathlib import Path
-import random
-import math
--- /Users/dhruv/playground/ruff/notebooks/test.ipynb:cell 4
+++ /Users/dhruv/playground/ruff/notebooks/test.ipynb:cell 4
@@ -1,5 +1,3 @@
-from typing import Any
-import collections
 # Newline should be added here
 def foo():
     pass

--- /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 8
+++ /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 8
@@ -1,8 +1,7 @@
 import pprint
 import tempfile
 
-from IPython import display
 import matplotlib.pyplot as plt
-
 import tensorflow as tf
-import tensorflow_datasets as tfds
+import tensorflow_datasets as tfds
+from IPython import display
--- /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 10
+++ /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 10
@@ -1,5 +1,4 @@
 import tensorflow_models as tfm
 
 # These are not in the tfm public API for v2.9. They will be available in v2.10
-from official.vision.serving import export_saved_model_lib
-import official.core.train_lib
+from official.vision.serving import export_saved_model_lib
--- /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 13
+++ /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 13
@@ -1,5 +1,5 @@
-exp_config = tfm.core.exp_factory.get_exp_config('resnet_imagenet')
-tfds_name = 'cifar10'
+exp_config = tfm.core.exp_factory.get_exp_config("resnet_imagenet")
+tfds_name = "cifar10"
 ds,ds_info = tfds.load(
 tfds_name,
 with_info=True)
--- /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 15
+++ /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 15
@@ -6,12 +6,12 @@
 # Configure training and testing data
 batch_size = 128
 
-exp_config.task.train_data.input_path = ''
+exp_config.task.train_data.input_path = ""
 exp_config.task.train_data.tfds_name = tfds_name
-exp_config.task.train_data.tfds_split = 'train'
+exp_config.task.train_data.tfds_split = "train"
 exp_config.task.train_data.global_batch_size = batch_size
 
-exp_config.task.validation_data.input_path = ''
+exp_config.task.validation_data.input_path = ""
 exp_config.task.validation_data.tfds_name = tfds_name
-exp_config.task.validation_data.tfds_split = 'test'
+exp_config.task.validation_data.tfds_split = "test"
 exp_config.task.validation_data.global_batch_size = batch_size
--- /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 17
+++ /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 17
@@ -1,16 +1,16 @@
 logical_device_names = [logical_device.name for logical_device in tf.config.list_logical_devices()]
 
-if 'GPU' in ''.join(logical_device_names):
-  print('This may be broken in Colab.')
-  device = 'GPU'
-elif 'TPU' in ''.join(logical_device_names):
-  print('This may be broken in Colab.')
-  device = 'TPU'
+if "GPU" in "".join(logical_device_names):
+  print("This may be broken in Colab.")
+  device = "GPU"
+elif "TPU" in "".join(logical_device_names):
+  print("This may be broken in Colab.")
+  device = "TPU"
 else:
-  print('Running on CPU is slow, so only train for a few steps.')
-  device = 'CPU'
+  print("Running on CPU is slow, so only train for a few steps.")
+  device = "CPU"
 
-if device=='CPU':
+if device=="CPU":
   train_steps = 20
   exp_config.trainer.steps_per_loop = 5
 else:
@@ -20,9 +20,9 @@
 exp_config.trainer.summary_interval = 100
 exp_config.trainer.checkpoint_interval = train_steps
 exp_config.trainer.validation_interval = 1000
-exp_config.trainer.validation_steps =  ds_info.splits['test'].num_examples // batch_size
+exp_config.trainer.validation_steps =  ds_info.splits["test"].num_examples // batch_size
 exp_config.trainer.train_steps = train_steps
-exp_config.trainer.optimizer_config.learning_rate.type = 'cosine'
+exp_config.trainer.optimizer_config.learning_rate.type = "cosine"
 exp_config.trainer.optimizer_config.learning_rate.cosine.decay_steps = train_steps
 exp_config.trainer.optimizer_config.learning_rate.cosine.initial_learning_rate = 0.1
 exp_config.trainer.optimizer_config.warmup.linear.warmup_steps = 100
--- /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 21
+++ /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 21
@@ -1,14 +1,14 @@
 logical_device_names = [logical_device.name for logical_device in tf.config.list_logical_devices()]
 
 if exp_config.runtime.mixed_precision_dtype == tf.float16:
-    tf.keras.mixed_precision.set_global_policy('mixed_float16')
+    tf.keras.mixed_precision.set_global_policy("mixed_float16")
 
-if 'GPU' in ''.join(logical_device_names):
+if "GPU" in "".join(logical_device_names):
   distribution_strategy = tf.distribute.MirroredStrategy()
-elif 'TPU' in ''.join(logical_device_names):
+elif "TPU" in "".join(logical_device_names):
   tf.tpu.experimental.initialize_tpu_system()
-  tpu = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='/device:TPU_SYSTEM:0')
+  tpu = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="/device:TPU_SYSTEM:0")
   distribution_strategy = tf.distribute.experimental.TPUStrategy(tpu)
 else:
-  print('Warning: this will be really slow.')
+  print("Warning: this will be really slow.")
   distribution_strategy = tf.distribute.OneDeviceStrategy(logical_device_names[0])
--- /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 23
+++ /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 23
@@ -1,5 +1,3 @@
 with distribution_strategy.scope():
   model_dir = tempfile.mkdtemp()
   task = tfm.core.task_factory.get_task(exp_config.task, logging_dir=model_dir)
-
-#  tf.keras.utils.plot_model(task.build_model(), show_shapes=True)
--- /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 24
+++ /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 24
@@ -1,4 +1,4 @@
 for images, labels in task.build_inputs(exp_config.task.train_data).take(1):
   print()
-  print(f'images.shape: {str(images.shape):16}  images.dtype: {images.dtype!r}')
-  print(f'labels.shape: {str(labels.shape):16}  labels.dtype: {labels.dtype!r}')
+  print(f"images.shape: {images.shape!s:16}  images.dtype: {images.dtype!r}")
+  print(f"labels.shape: {labels.shape!s:16}  labels.dtype: {labels.dtype!r}")
--- /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 27
+++ /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 27
@@ -1 +1 @@
-plt.hist(images.numpy().flatten());
+plt.hist(images.numpy().flatten())
--- /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 29
+++ /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 29
@@ -1,2 +1,2 @@
-label_info = ds_info.features['label']
+label_info = ds_info.features["label"]
 label_info.int2str(1)
--- /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 31
+++ /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 31
@@ -10,9 +10,6 @@
     if predictions is None:
       plt.title(label_info.int2str(labels[i]))
     else:
-      if labels[i] == predictions[i]:
-        color = 'g'
-      else:
-        color = 'r'
+      color = "g" if labels[i] == predictions[i] else "r"
       plt.title(label_info.int2str(predictions[i]), color=color)
     plt.axis("off")
--- /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 35
+++ /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 35
@@ -1,3 +1,3 @@
-plt.figure(figsize=(10, 10));
+plt.figure(figsize=(10, 10))
 for images, labels in task.build_inputs(exp_config.task.validation_data).take(1):
   show_batch(images, labels)
--- /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 37
+++ /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 37
@@ -1,7 +1,7 @@
 model, eval_logs = tfm.core.train_lib.run_experiment(
     distribution_strategy=distribution_strategy,
     task=task,
-    mode='train_and_eval',
+    mode="train_and_eval",
     params=exp_config,
     model_dir=model_dir,
     run_post_eval=True)
--- /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 38
+++ /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 38
@@ -1 +0,0 @@
-#  tf.keras.utils.plot_model(model, show_shapes=True)
--- /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 40
+++ /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 40
@@ -1,4 +1,4 @@
 for key, value in eval_logs.items():
     if isinstance(value, tf.Tensor):
       value = value.numpy()
-    print(f'{key:20}: {value:.3f}')
+    print(f"{key:20}: {value:.3f}")
--- /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 42
+++ /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 42
@@ -4,5 +4,5 @@
 
 show_batch(images, labels, tf.cast(predictions, tf.int32))
 
-if device=='CPU':
-  plt.suptitle('The model was only trained for a few steps, it is not expected to do well.')
+if device=="CPU":
+  plt.suptitle("The model was only trained for a few steps, it is not expected to do well.")
--- /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 45
+++ /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 45
@@ -1,8 +1,8 @@
 # Saving and exporting the trained model
 export_saved_model_lib.export_inference_graph(
-    input_type='image_tensor',
+    input_type="image_tensor",
     batch_size=1,
     input_image_size=[32, 32],
     params=exp_config,
     checkpoint_path=tf.train.latest_checkpoint(model_dir),
-    export_dir='./export/')
+    export_dir="./export/")
--- /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 47
+++ /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 47
@@ -1,3 +1,3 @@
 # Importing SavedModel
-imported = tf.saved_model.load('./export/')
-model_fn = imported.signatures['serving_default']
+imported = tf.saved_model.load("./export/")
+model_fn = imported.signatures["serving_default"]
--- /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 49
+++ /Users/dhruv/playground/ruff/notebooks/image_classification.ipynb:cell 49
@@ -1,10 +1,10 @@
 plt.figure(figsize=(10, 10))
-for data in tfds.load('cifar10', split='test').batch(12).take(1):
+for data in tfds.load("cifar10", split="test").batch(12).take(1):
   predictions = []
-  for image in data['image']:
-    index = tf.argmax(model_fn(image[tf.newaxis, ...])['logits'], axis=1)[0]
+  for image in data["image"]:
+    index = tf.argmax(model_fn(image[tf.newaxis, ...])["logits"], axis=1)[0]
     predictions.append(index)
-  show_batch(data['image'], data['label'], predictions)
+  show_batch(data["image"], data["label"], predictions)
 
-  if device=='CPU':
-    plt.suptitle('The model was only trained for a few steps, it is not expected to do better than random.')
+  if device=="CPU":
+    plt.suptitle("The model was only trained for a few steps, it is not expected to do better than random.")

Would fix 61 errors.
```

</p>
</details> 

resolves: #4727
2023-07-29 04:22:56 +00:00
Micha Reiser
2cf00fee96 Remove parser dependency from ruff-python-ast (#6096) 2023-07-26 17:47:22 +02:00
Zanie Blue
3000a47fe8 Include file permissions in key for cached files (#5901)
Reimplements https://github.com/astral-sh/ruff/pull/3104
Closes https://github.com/astral-sh/ruff/issues/5726

Note that we will generate the hash for a cache key twice in normal
operation. Once to check for the cached item and again to update the
cache. We could optimize this by generating the hash once in
`diagnostics::lint_file` and passing the `u64` into `get` and `update`.
We'd probably want to wrap it in a `CacheKeyHash` enum for type safety.
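
A sketch of that hypothetical optimization (all names assumed; nothing here is in the PR):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical newtype for the precomputed key, to be computed once in
// diagnostics::lint_file and passed to both `get` and `update`.
#[derive(Clone, Copy, PartialEq, Eq)]
struct CacheKeyHash(u64);

#[cfg(unix)]
fn cache_key(meta: &std::fs::Metadata) -> CacheKeyHash {
    use std::os::unix::fs::MetadataExt;
    let mut hasher = DefaultHasher::new();
    meta.mtime().hash(&mut hasher);
    // Per this PR: include the permission bits so `chmod +x` invalidates the entry.
    meta.mode().hash(&mut hasher);
    CacheKeyHash(hasher.finish())
}
```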

## Test plan

Unit tests for Windows and Unix.

Manual test with case from issue

```
❯ touch fake.py
❯ chmod +x fake.py
❯ ./target/debug/ruff --select EXE fake.py
fake.py:1:1: EXE002 The file is executable but no shebang is present
Found 1 error.
❯ chmod -x fake.py
❯ ./target/debug/ruff --select EXE fake.py
```
2023-07-25 17:06:47 +00:00
Charlie Marsh
057faabcdd Use Flags::intersects rather than Flags::contains (#6007)
## Summary

This is equivalent for a single flag, but I think it's more likely to be
correct when the bitflags are modified -- the primary reason being that
we sometimes define flags as the union of other flags, e.g.:

```rust
const ANNOTATION = Self::TYPING_ONLY_ANNOTATION.bits() | Self::RUNTIME_ANNOTATION.bits();
```

In this case, `flags.contains(Flag::ANNOTATION)` requires that _both_
flags in the union are set, whereas `flags.intersects(Flag::ANNOTATION)`
requires that _at least one_ flag is set.
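
A runnable illustration (assuming the `bitflags` crate, which ruff uses):

```rust
use bitflags::bitflags;

bitflags! {
    struct Flags: u8 {
        const TYPING_ONLY_ANNOTATION = 1 << 0;
        const RUNTIME_ANNOTATION = 1 << 1;
        // A union flag, as in the example above.
        const ANNOTATION = Self::TYPING_ONLY_ANNOTATION.bits() | Self::RUNTIME_ANNOTATION.bits();
    }
}

fn main() {
    let flags = Flags::RUNTIME_ANNOTATION;
    // `contains` requires *both* bits of the union to be set:
    assert!(!flags.contains(Flags::ANNOTATION));
    // `intersects` requires *at least one*:
    assert!(flags.intersects(Flags::ANNOTATION));
}
```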
2023-07-23 02:59:31 +00:00
konsti
92f471a666 Handle io errors gracefully (#5611)
## Summary

It can happen that we can't read a file (a Python file, a Jupyter
notebook, or a pyproject.toml), which needs to be handled, and handled
consistently, for all file types. Instead of using `Err` or `error!`, we
emit E602 with the IO error as the message and continue. This PR makes
sure we handle all three cases consistently and emit E602.

I'm not convinced that it should be possible to disable IO errors, but we
now handle the regular case consistently and at least print warnings
consistently.

I went with `warn!` but i can change them all to `error!`, too.

It also checks the error case when a pyproject.toml is not readable. The
error message is not very helpful, but it's now a bit clearer that ruff
itself failed, as opposed to this being a diagnostic.

## Examples

This is how an Err of `run` looks now:


![image](https://github.com/astral-sh/ruff/assets/6826232/890f7ab2-2309-4b6f-a4b3-67161947cc83)

With an unreadable file and `IOError` disabled:


![image](https://github.com/astral-sh/ruff/assets/6826232/fd3d6959-fa23-4ddf-b2e5-8d6022df54b1)

(we lint zero files, but we count files before linting, not during, so we
exit 0)

I'm not sure if it should (or if we should take a different path with
manual ExitStatus), but this currently also triggers when `files` is
empty:


![image](https://github.com/astral-sh/ruff/assets/6826232/f7ede301-41b5-4743-97fd-49149f750337)

## Test Plan

Unix only: Create a temporary directory with files with permissions
`000` (not readable by the owner) and run on that directory. Since this
breaks the assumptions of most of the test code (single file, `ruff`
instead of `ruff_cli`), the test code is rather cumbersome and looks a
bit misplaced; I'm happy about suggestions to fit it in closer with the
other tests or streamline it in other ways. I added another test for
when the entire directory is not readable.
2023-07-20 11:30:14 +02:00
Dhruv Manilawala
7e6b472c5b Make lint_only aware of the source kind (#5876) 2023-07-19 09:29:35 +05:30
Dhruv Manilawala
e9771c9c63 Ignore Jupyter Notebooks for --add-noqa (#5727) 2023-07-13 13:26:47 +05:30
Charlie Marsh
4dee49d6fa Run nightly Clippy over the Ruff repo (#5670)
## Summary

This is the result of running `cargo +nightly clippy --workspace
--all-targets --all-features -- -D warnings` and fixing all violations.
Just wanted to see if there were any interesting new checks on nightly
👀
2023-07-10 23:44:38 -04:00
Aarni Koskela
24bcbb85a1 Rework upstream categories so we can all_rules() (#5591)
## Summary

This PR reworks the `upstream_categories` mechanism that is only used
for documentation purposes to make it easier to generate docs using
`all_rules()`. The new implementation also relies on "tribal knowledge"
about rule codes, so it's not the best implementation, but it moves us
forward.

Another option would be to change the rule-defining proc macros to allow
configuring an optional `RuleCategory`, but that seems more heavy-handed
and possibly unnecessary in the long run...

Draft since this builds on #5439.

cc @charliermarsh :)
2023-07-10 09:41:26 -04:00
Charlie Marsh
a1c559eaa4 Only run pyproject.toml lint rules when enabled (#5578)
## Summary

I was testing some changes on Airflow, and I realized that we _always_
run the `pyproject.toml` validation rules, even if they're not enabled.
This PR gates them behind the appropriate enablement flags.

## Test Plan

- Ran: `cargo run -p ruff_cli -- check ../airflow -n`. Verified that no
RUF200 violations were raised.
- Run: `cargo run -p ruff_cli -- check ../airflow -n --select RUF200`.
Verified that two RUF200 violations were raised.
2023-07-08 11:05:05 -04:00
konsti
b22e6c3d38 Extend ruff_dev formatter script to compute statistics and format a project (#5492)
## Summary

This extends the `ruff_dev` formatter script util. Instead of only doing
stability checks, you can now choose different compatible options on the
CLI and get statistics.

* It adds an option that formats all files that ruff would check, to allow
looking at an entire black-formatted repository with `git diff`
* It computes the [Jaccard
index](https://en.wikipedia.org/wiki/Jaccard_index) as a measure of
deviation between input and output, which is useful as a single-number
metric for assessing our current deviations from black (see the sketch
after this list)
* It adds progress bars to both single-project and multi-project mode
* It adds an option to write the multi-project output to a file
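
A minimal sketch of a line-set Jaccard index (the script may compute it differently, e.g. over diff chunks):

```rust
use std::collections::HashSet;

/// |A ∩ B| / |A ∪ B| over the sets of lines in the input and output.
fn jaccard_index(input: &str, output: &str) -> f64 {
    let a: HashSet<&str> = input.lines().collect();
    let b: HashSet<&str> = output.lines().collect();
    let intersection = a.intersection(&b).count();
    let union = a.union(&b).count();
    if union == 0 {
        1.0 // two empty files are identical
    } else {
        intersection as f64 / union as f64
    }
}
```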

Sample usage:

```
$ cargo run --bin ruff_dev -- format-dev --stability-check crates/ruff/resources/test/cpython
$ cargo run --bin ruff_dev -- format-dev --stability-check /home/konsti/projects/django
Syntax error in /home/konsti/projects/django/tests/test_runner_apps/tagged/tests_syntax_error.py: source contains syntax errors (parser error): BaseError { error: UnrecognizedToken(Name { name: "syntax_error" }, None), offset: 131, source_path: "<filename>" }
Found 0 stability errors in 2755 files (jaccard index 0.911) in 9.75s
$ cargo run --bin ruff_dev -- format-dev --write /home/konsti/projects/django
```

Options:

```
Several utils related to the formatter which can be run on one or more repositories. The selected set of files in a repository is the same as for `ruff check`.

* Check formatter stability: Format a repository twice and ensure that the first and second formatting look the same.
* Format: Format the files in a repository to be able to check them with `git diff`.
* Statistics: Compute the Jaccard index between the (assumed to be black-formatted) input and the ruff-formatted output.

Usage: ruff_dev format-dev [OPTIONS] [FILES]...

Arguments:
  [FILES]...
          Like `ruff check`'s files. See `--multi-project` if you want to format an ecosystem checkout

Options:
      --stability-check
          Check stability
          
          We want to ensure that once formatted content stays the same when formatted again, which is known as formatter stability or formatter idempotency, and that the formatter prints syntactically valid code. As our test cases cover only a limited amount of code, this allows checking entire repositories.

      --write
          Format the files. Without this flag, the python files are not modified

      --format <FORMAT>
          Control the verbosity of the output
          
          [default: default]

          Possible values:
          - minimal: Filenames only
          - default: Filenames and reduced diff
          - full:    Full diff and invalid code

  -x, --exit-first-error
          Print only the first error and exit; `-x` is the same as in pytest

      --multi-project
          Checks each project inside a directory, useful e.g. if you want to check all of the ecosystem checkouts

      --error-file <ERROR_FILE>
          Write all errors to this file in addition to stdout. Only used in multi-project mode
```

## Test Plan

I ran this on django (2755 files, jaccard index 0.911) and discovered a
magic trailing comma problem and that we really needed to implement
import formatting. I ran the script on cpython to identify
https://github.com/astral-sh/ruff/pull/5558.
2023-07-07 11:30:12 +00:00
Aarni Koskela
d7214e77e6 Add ruff rule --all subcommand (with JSON output) (#5059)
## Summary

This adds a `ruff rule --all` switch that prints out a human-readable
Markdown or a machine-readable JSON document of the lint rules known to
Ruff.

I needed a machine-readable document of the rules [for a
project](https://github.com/astral-sh/ruff/discussions/5078), and
figured it could be useful for other people – or tooling! – to be able
to interrogate Ruff about its arcane knowledge.

The JSON output is an array of the same objects printed by `ruff rule
--format=json`.

## Test Plan

I ran `ruff rule --all --format=json`. I think more might be needed, but
maybe a snapshot test is overkill?
2023-07-04 19:45:38 +00:00
Charlie Marsh
952c623102 Avoid returning first-match for rule prefixes (#5511)
Closes #5495, but there's a TODO here to improve this further. The
current `from_code` implementation feels really indirect.
2023-07-04 19:23:05 +00:00
Anders Kaseorg
df13e69c3c Format let-else with rustfmt nightly (#5461)
Support for `let…else` formatting was just merged to nightly
(rust-lang/rust#113225). Rerun `cargo fmt` with Rust nightly 2023-07-02
to pick this up. Followup to #939.

Signed-off-by: Anders Kaseorg <andersk@mit.edu>
2023-07-03 02:13:35 +00:00
Charlie Marsh
1d2d015bc5 Make standard input detection robust to invalid arguments (#5393)
## Summary

This PR fixes a silent failure that manifested itself in
https://github.com/astral-sh/ruff-vscode/issues/238. In short, if the
user provided invalid arguments to Ruff in the VS Code extension (like
`"ruff.args": ["a"]`), then we generated something like the following
command:

```console
/path/to/ruff --force-exclude --no-cache --no-fix --format json - --fix a --stdin-filename /path/to/file.py
```

Since this contains both `-` and `a` as the "input files", Ruff would
treat this as if we're linting the files names `-` and `a`, rather than
linting standard input.

This PR modifies our standard input detection to force standard input
when `--stdin-filename` is present or at least one file is `-`. (We then
warn and ignore the others.)
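
The rule reduces to something like this sketch (assumed signature):

```rust
use std::path::{Path, PathBuf};

// Force standard input if --stdin-filename was passed, or if any positional
// "file" is `-`; extra positional arguments are warned about and ignored.
fn is_stdin(files: &[PathBuf], stdin_filename: Option<&Path>) -> bool {
    stdin_filename.is_some() || files.iter().any(|file| file == Path::new("-"))
}
```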
2023-06-28 14:52:23 +00:00
Charlie Marsh
032b967b05 Enable --watch for Jupyter notebooks (#5394)
## Summary

The list of extensions that support watching is hard-coded
(unfortunately); this PR adds `.ipynb` to the list.
2023-06-27 12:53:47 -04:00
Dhruv Manilawala
2fc38d81e6 Experimental release for Jupyter notebook integration (#5363)
## Summary

Experimental release for Jupyter Notebook integration.

Currently, this requires a user to explicitly opt in using the
[include](https://beta.ruff.rs/docs/settings/#include) configuration:

```toml
[tool.ruff]
include = ["*.py", "*.pyi", "**/pyproject.toml", "*.ipynb"]
```

Or, a user can pass in the file directly:

```sh
ruff check path/to/notebook.ipynb
```

For known limitations, please refer to #5188.

## Test Plan

The following command should work without the `--all-features` flag:

```sh
cargo dev round-trip /path/to/notebook.ipynb
```

The following command should work with the above config file along with
`select = ["ALL"]`:

```sh
cargo run --bin ruff -- check --no-cache --config=../test-repos/openai-cookbook/pyproject.toml --fix ../test-repos/openai-cookbook/
```

Passing the Jupyter notebook directly:

```sh
cargo run --bin ruff -- check --no-cache --isolated --select=ALL --fix ../test-repos/openai-cookbook/examples/Classification_using_embeddings.ipynb
```
2023-06-26 21:22:42 +05:30
Micha Reiser
dd0d1afb66 Create PyFormatOptions

## Summary

This PR adds a new `PyFormatOptions` struct that stores the Python formatter options.
The new options aren't used yet, with the exception of magic trailing commas and the options passed to the printer.
I'll follow up with more PRs that use the new options (e.g. `QuoteStyle`).
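
A hypothetical shape for such an options struct (field names and defaults assumed, not the PR's exact definition):

```rust
#[derive(Clone, Debug)]
struct PyFormatOptions {
    /// Line width the printer wraps at.
    line_width: u16,
    /// Whether a trailing comma forces multi-line formatting.
    magic_trailing_comma: bool,
    /// Quote preference (a later follow-up per the summary).
    prefer_double_quotes: bool,
}

impl Default for PyFormatOptions {
    fn default() -> Self {
        Self {
            line_width: 88,
            magic_trailing_comma: true,
            prefer_double_quotes: true,
        }
    }
}
```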

## Test Plan

`cargo test` I'll follow up with a new PR that adds support for overriding the options in our fixture tests.
2023-06-26 14:02:17 +02:00
Thomas de Zeeuw
1c638264b2 Keep track of when files are last seen in the cache (#5214)
## Summary

And remove cached files that we haven't seen for a certain period of
time, currently 30 days.

For the last-seen timestamp we actually use a `u64`; it's smaller on disk
than `SystemTime` (whose size is OS-dependent) and fits in an `AtomicU64`,
which we can update without locks.
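
A sketch of the timestamp handling (field and constant names assumed):

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

const MAX_AGE_SECS: u64 = 30 * 24 * 60 * 60; // 30 days

struct FileCacheEntry {
    last_seen: AtomicU64, // seconds since the Unix epoch
}

impl FileCacheEntry {
    fn touch(&self) {
        let now = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("clock before 1970")
            .as_secs();
        // A plain atomic store: no lock needed to bump the timestamp.
        self.last_seen.store(now, Ordering::Relaxed);
    }

    fn is_expired(&self, now: u64) -> bool {
        now.saturating_sub(self.last_seen.load(Ordering::Relaxed)) > MAX_AGE_SECS
    }
}
```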

## Test Plan

Added a new unit test, run by `cargo test`.
2023-06-23 15:40:35 +02:00
Charlie Marsh
6b8b318d6b Use mod tests consistently (#5278)
As per the Rust documentation.
2023-06-22 01:50:28 +00:00
Charlie Marsh
bf1a94ee54 Initialize caches for packages and standalone files (#5237)
## Summary

While fixing https://github.com/astral-sh/ruff/pull/5233, I noticed that
in FastAPI, 343 out of 823 files weren't hitting the cache. It turns out
these are standalone files in the documentation that lack a "package
root". Later, when looking up the cache entries, we fallback to the
package directory.

This PR ensures that we initialize the cache for both kinds of files:
those that are in a package, and those that aren't.

The total size of the FastAPI cache for me is now 388K. I also suspect
that this approach is much faster than the initial version, since before,
we were probably initializing one cache per _directory_.

## Test Plan

Ran `cargo run -p ruff_cli -- check ../fastapi --verbose`; verified
that, on second execution, there were no "Checking" entries in the logs.
2023-06-21 17:29:09 +00:00
Charlie Marsh
621e9ace88 Use package roots rather than package members for cache initialization (#5233)
## Summary

This is a proper fix for the issue patched-over in
https://github.com/astral-sh/ruff/pull/5229, thanks to an extremely
helpful repro from @tlambert03 in that thread. It looks like we were
using the keys of `package_roots` rather than the values to initialize
the cache -- but it's a map from package to package root.
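
In sketch form (types assumed):

```rust
use std::collections::HashMap;
use std::path::PathBuf;

// package_roots maps package -> package root; the cache must be keyed by
// the *values* (the roots), not the keys (the packages).
fn roots_to_initialize(package_roots: &HashMap<PathBuf, Option<PathBuf>>) -> Vec<&PathBuf> {
    package_roots.values().flatten().collect()
}
```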

## Test Plan

Reverted #5229, then ran through the plan that @tlambert03 included in
https://github.com/astral-sh/ruff/pull/5229#issuecomment-1599723226.
Verified the panic before but not after this change.
2023-06-20 21:21:45 -04:00
Charlie Marsh
1a2bd984f2 Avoid .unwrap() on cache access (#5229)
## Summary

I haven't been able to determine why / when this is happening, but in
some cases, users are reporting that this `unwrap()` is causing a panic.
It's fine to just return `None` here and fall back to "No cache",
certainly better than panicking (while we figure out the edge case).
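
In other words (a minimal sketch, not the cache's real types):

```rust
use std::collections::HashMap;

// Returning None here means "no cache": lint from scratch instead of panicking.
fn lookup(caches: &HashMap<u64, Vec<u8>>, key: u64) -> Option<&[u8]> {
    // Previously: caches.get(&key).unwrap()
    caches.get(&key).map(Vec::as_slice)
}
```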

Closes #5225.

Closes #5228.
2023-06-20 19:01:21 -04:00
Dhruv Manilawala
6f7d3cc798 Add option (-o/--output-file) to write output to a file (#4950)
## Summary

A new CLI option (`-o`/`--output-file`) to write output to a file
instead of stdout.

The major change is to remove the manual lock acquired on stdout. The
argument is that the output is buffered, and thus the lock is acquired
only when writing a block (8 KB). As per the benchmark below, there is a
slight performance penalty.

Reference:
https://rustmagazine.org/issue-3/javascript-compiler/#printing-is-slow
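
The writer setup is roughly this sketch (assumed function, not the PR's exact code):

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::Path;

// With a BufWriter, the underlying stream is only touched when an ~8 KiB
// block is flushed, so no manual stdout lock is needed.
fn make_writer(output_file: Option<&Path>) -> io::Result<Box<dyn Write>> {
    Ok(match output_file {
        Some(path) => Box::new(BufWriter::new(File::create(path)?)),
        None => Box::new(BufWriter::new(io::stdout())),
    })
}
```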

## Benchmarks

_Output is truncated to only contain useful information:_

Command: `check --isolated --no-cache --select=ALL --show-source
./test-repos/cpython"`

Latest HEAD (361d45f2b2) with and without
the manual lock on stdout:

```console
Benchmark 1: With lock
  Time (mean ± σ):      5.687 s ±  0.075 s    [User: 17.110 s, System: 0.486 s]
  Range (min … max):    5.615 s …  5.860 s    10 runs

Benchmark 2: Without lock
  Time (mean ± σ):      5.719 s ±  0.064 s    [User: 17.095 s, System: 0.491 s]
  Range (min … max):    5.640 s …  5.865 s    10 runs

Summary
  (1) ran 1.01 ± 0.02 times faster than (2)
```

This PR:

```console
Benchmark 1: This PR
  Time (mean ± σ):      5.855 s ±  0.058 s    [User: 17.197 s, System: 0.491 s]
  Range (min … max):    5.786 s …  5.987 s    10 runs
 
Benchmark 2: Latest HEAD with lock
  Time (mean ± σ):      5.645 s ±  0.033 s    [User: 16.922 s, System: 0.495 s]
  Range (min … max):    5.600 s …  5.712 s    10 runs
 
Summary
  (2) ran 1.04 ± 0.01 times faster than (1)
```

## Test Plan

Run all of the commands which gives output with and without the
`--output-file=ruff.out` option:
* `--show-settings`
* `--show-files`
* `--show-fixes`
* `--diff`
* `--select=ALL`
* `--select=All --show-source`
* `--watch` (only stdout allowed)

resolves: #4754
2023-06-20 22:16:49 +05:30
Thomas de Zeeuw
17f1ecd56e Open cache files in parallel (#5120)
## Summary

Open cache files in parallel (again); this brings performance back to roughly that of the old implementation.
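
A sketch of the parallel open (assuming `rayon`, which ruff already uses elsewhere):

```rust
use rayon::prelude::*;
use std::path::PathBuf;

fn open_all(paths: &[PathBuf]) -> Vec<std::io::Result<Vec<u8>>> {
    // Read each cache file on the thread pool instead of sequentially.
    paths.par_iter().map(std::fs::read).collect()
}
```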

## Test Plan

Existing tests should keep working.
2023-06-20 17:43:09 +02:00
Dhruv Manilawala
48f4f2d63d Maintain consistency when deserializing to JSON (#5114)
## Summary

Maintain consistency while deserializing Jupyter notebooks to JSON. The
following changes were made:

1. Use a string array to store the source value, as that's the default
(5781720423/nbformat/v4/nbjson.py (L56-L57))
2. Remove unused structs and enums
3. Reorder the keys into alphabetical order, as that's the default.
(5781720423/nbformat/v4/nbjson.py (L51))

### Side effect

Removing the `preserve_order` feature means that the order of keys in
JSON output (`--format json`) will be in alphabetical order. This is
because the value is represented using `serde_json::Value` which
internally is a `BTreeMap`, thus sorting it as per the string key. For
posterity if this turns out to be not ideal, then we could define a
struct representing the JSON object and the order of struct fields will
determine the order in the JSON string.
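
For example (assuming `serde_json` without the `preserve_order` feature):

```rust
fn main() {
    let value: serde_json::Value =
        serde_json::from_str(r#"{"nbformat": 4, "cells": [], "metadata": {}}"#).unwrap();
    // Objects are backed by a BTreeMap, so keys serialize alphabetically:
    assert_eq!(value.to_string(), r#"{"cells":[],"metadata":{},"nbformat":4}"#);
}
```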

## Test Plan

Add a test case to assert the raw JSON string.
2023-06-19 23:47:56 +05:30