As of [this cpython PR](https://github.com/python/cpython/pull/135996),
it is not allowed to concatenate t-strings with non-t-strings,
implicitly or explicitly. Expressions such as `"foo" t"{bar}"` are now
syntax errors.
This PR updates some AST nodes and parsing to reflect this change.
The structural change is that `TStringPart` is no longer needed, since,
as in the case of `BytesStringLiteral`, the only possibilities are that
we have a single `TString` or a vector of such (representing an implicit
concatenation of t-strings). This removes a level of nesting from many
AST expressions (which is what all the snapshot changes reflect), and
simplifies some logic in the implementation of visitors, for example.
The other change of note is in the parser. When we meet an implicit
concatenation of string-like literals, we now count the number of
t-string literals. If these do not exhaust the total number of
implicitly concatenated pieces, then we emit a syntax error. To recover
from this syntax error, we encode any t-string pieces as _invalid_
string literals (which means we flag them as invalid, record their
range, and record the value as `""`). Note that if at least one of the
pieces is an f-string we prefer to parse the entire string as an
f-string; otherwise we parse it as a string.
This logic is exactly the same as how we currently treat
`BytesStringLiteral` parsing and error recovery - and carries with it
the same pros and cons.
Finally, note that I have not implemented any changes in the
implementation of the formatter. As far as I can tell, none are needed.
I did change a few of the fixtures so that we are always concatenating
t-strings with t-strings.
## Summary
Per @ntBre in https://github.com/astral-sh/ruff/pull/19111, it would be
a good idea to make the tests no longer have these syntax errors, so
this PR updates the tests and snapshots.
`B031` gave me a lot of trouble since the ending test of declaring a
function named `groupby` makes it so that inside other functions, it's
unclear which `groupby` is referred to since it depends on when the
function is called. To fix it I made each function have it's own `from
itertools import groupby` so there's no more ambiguity.
This PR implements template strings (t-strings) in the parser and
formatter for Ruff.
Minimal changes necessary to compile were made in other parts of the code (e.g. ty, the linter, etc.). These will be covered properly in follow-up PRs.
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
Fixes#17798
<!-- What's the purpose of the change? What does it do, and why? -->
## Test Plan
Snapshot tests
<!-- How was it tested? -->
## Summary
Closes#17112. Allows passing in string and list-of-strings literals
into `subprocess.run` (and related) calls without marking them as
untrusted input:
```py
import subprocess
subprocess.run("true")
# "instant" named expressions are also allowed
subprocess.run(c := "ls")
```
## Test Plan
Added test cases covering new behavior, passed with `cargo nextest run`.
## Summary
Stop flagging each invocation of `django.utils.safestring.mark_safe`
(also available at, `django.utils.html.mark_safe`) as an error.
Instead, allow string literals as valid uses for `mark_safe`.
Also, update the documentation, pointing at
`django.utils.html.format_html` for dynamic content generation use
cases.
Closes#16702
## Test Plan
I verified several possible uses, but string literals, are still
flagged.
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
`RUF035` has been backported into bandit as `S704` in this
[PR](https://github.com/PyCQA/bandit/pull/1225)
This moves the rule and its corresponding setting to the `flake8-bandit`
category
## Test Plan
`cargo nextest run`
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
Permits suspicious imports (the `S4` namespaced diagnostics) from stub
files.
Closes#15207.
## Test Plan
Added tests and ran `cargo nextest run`. The test files are copied from
the `.py` variants.
## Summary
A follow up PR on https://github.com/astral-sh/ruff/issues/14991
Ruff ignores hardcoded passwords for typed variables. Add a rule to
catch passwords in typed code bases
## Test Plan
Includes 2 more test typed variables
## Summary
S113 exists because `requests` doesn't have a default timeout, so
request without timeout may hang indefinitely
> B113: Test for missing requests timeout
This plugin test checks for requests or httpx calls without a timeout
specified.
>
> Nearly all production code should use this parameter in nearly all
requests, **Failure to do so can cause your program to hang
indefinitely.**
But httpx has default timeout 5s, so S113 for httpx request without
`timeout` argument is a false positive, only valid case would be
`timeout=None`.
https://www.python-httpx.org/advanced/timeouts/
> HTTPX is careful to enforce timeouts everywhere by default.
>
> The default behavior is to raise a TimeoutException after 5 seconds of
network inactivity.
## Test Plan
snap updated
## Summary
Bandit now also reports `B113` on `httpx`
(https://github.com/PyCQA/bandit/pull/1060). This PR implements the same
logic, to detect missing or `None` timeouts for `httpx` alongside
`requests`.
## Test Plan
Snapshot tests.
## Summary
This PR updates the `StringLike::FString` variant to use `ExprFString`
instead of `FStringLiteralElement`.
For context, the reason it used `FStringLiteralElement` is that the node
is actually the string part of an f-string ("foo" in `f"foo{x}"`). But,
this is inconsistent with other variants where the captured value is the
_entire_ string.
This is also problematic w.r.t. implicitly concatenated strings. Any
rules which work with `StringLike::FString` doesn't account for the
string part in an implicitly concatenated f-strings. For example, we
don't flag confusable character in the first part of `"𝐁ad" f"𝐁ad
string"`, but only the second part
(https://play.ruff.rs/16071c4c-a1dd-4920-b56f-e2ce2f69c843).
### Update `PYI053`
_This is included in this PR because otherwise it requires a temporary
workaround to be compatible with the old logic._
This PR also updates the `PYI053` (`string-or-bytes-too-long`) rule for
f-string to consider _all_ the visible characters in a f-string,
including the ones which are implicitly concatenated. This is consistent
with implicitly concatenated strings and bytes.
For example,
```python
def foo(
# We count all the characters here
arg1: str = '51 character ' 'stringgggggggggggggggggggggggggggggggg',
# But not here because of the `{x}` replacement field which _breaks_ them up into two chunks
arg2: str = f'51 character {x} stringgggggggggggggggggggggggggggggggggggggggggggg',
) -> None: ...
```
This PR fixes it to consider all _visible_ characters inside an f-string
which includes expressions as well.
fixes: #10310fixes: #10307
## Test Plan
Add new test cases and update the snapshots.
## Review
To facilitate the review process, the change have been split into two
commits: one which has the code change while the other has the test
cases and updated snapshots.
## Summary
Django's `mark_safe` can also be used as a decorator, so we should
detect usages of `@mark_safe` for the purpose of the relevant Bandit
rule.
Closes https://github.com/astral-sh/ruff/issues/9780.
## Summary
Hey there 👋 thanks for this great project!
On python code looking like the following
```
import yaml
from yaml.loader import SafeLoader
with MY_FILE_PATH.open("r") as my_file:
my_data = yaml.load(my_file, Loader=SafeLoader)
```
ruff reports this error:
```
S506 Probable use of unsafe loader `SafeLoader` with `yaml.load`. Allows instantiation of arbitrary objects. Consider `yaml.safe_load`.
```
This PR is an attempt to support SafeLoader being imported for either
`yaml` or `yaml.loader`
Disclaimer:
I am not familiar with Rust so this is likely not the better way of
doing it. Interested in hearing how to adapt this PR to provide similar
behavior in a better way
## Test Plan
The S506.py file was updated accordingly to cover the use cases and test
were confirmed to pass with this change.
Rebase of #6365 authored by @davidszotten.
## Summary
This PR updates the AST structure for an f-string elements.
The main **motivation** behind this change is to have a dedicated node
for the string part of an f-string. Previously, the existing
`ExprStringLiteral` node was used for this purpose which isn't exactly
correct. The `ExprStringLiteral` node should include the quotes as well
in the range but the f-string literal element doesn't include the quote
as it's a specific part within an f-string. For example,
```python
f"foo {x}"
# ^^^^
# This is the literal part of an f-string
```
The introduction of `FStringElement` enum is helpful which represent
either the literal part or the expression part of an f-string.
### Rule Updates
This means that there'll be two nodes representing a string depending on
the context. One for a normal string literal while the other is a string
literal within an f-string. The AST checker is updated to accommodate
this change. The rules which work on string literal are updated to check
on the literal part of f-string as well.
#### Notes
1. The `Expr::is_literal_expr` method would check for
`ExprStringLiteral` and return true if so. But now that we don't
represent the literal part of an f-string using that node, this improves
the method's behavior and confines to the actual expression. We do have
the `FStringElement::is_literal` method.
2. We avoid checking if we're in a f-string context before adding to
`string_type_definitions` because the f-string literal is now a
dedicated node and not part of `Expr`.
3. Annotations cannot use f-string so we avoid changing any rules which
work on annotation and checks for `ExprStringLiteral`.
## Test Plan
- All references of `Expr::StringLiteral` were checked to see if any of
the rules require updating to account for the f-string literal element
node.
- New test cases are added for rules which check against the literal
part of an f-string.
- Check the ecosystem results and ensure it remains unchanged.
## Performance
There's a performance penalty in the parser. The reason for this remains
unknown as it seems that the generated assembly code is now different
for the `__reduce154` function. The reduce function body is just popping
the `ParenthesizedExpr` on top of the stack and pushing it with the new
location.
- The size of `FStringElement` enum is the same as `Expr` which is what
it replaces in `FString::format_spec`
- The size of `FStringExpressionElement` is the same as
`ExprFormattedValue` which is what it replaces
I tried reducing the `Expr` enum from 80 bytes to 72 bytes but it hardly
resulted in any performance gain. The difference can be seen here:
- Original profile: https://share.firefox.dev/3Taa7ES
- Profile after boxing some node fields:
https://share.firefox.dev/3GsNXpD
### Backtracking
I tried backtracking the changes to see if any of the isolated change
produced this regression. The problem here is that the overall change is
so small that there's only a single checkpoint where I can backtrack and
that checkpoint results in the same regression. This checkpoint is to
revert using `Expr` to the `FString::format_spec` field. After this
point, the change would revert back to the original implementation.
## Review process
The review process is similar to #7927. The first set of commits update
the node structure, parser, and related AST files. Then, further commits
update the linter and formatter part to account for the AST change.
---------
Co-authored-by: David Szotten <davidszotten@gmail.com>
Part of #1646.
## Summary
Implement `S505`
([`weak_cryptographic_key`](https://bandit.readthedocs.io/en/latest/plugins/b505_weak_cryptographic_key.html))
rule from `bandit`.
For this rule, `bandit` [reports the issue
with](https://github.com/PyCQA/bandit/blob/1.7.5/bandit/plugins/weak_cryptographic_key.py#L47-L56):
- medium severity for DSA/RSA < 2048 bits and EC < 224 bits
- high severity for DSA/RSA < 1024 bits and EC < 160 bits
Since Ruff does not handle severities for `bandit`-related rules, we
could either report the issue if we have lower values than medium
severity, or lower values than high one. Two reasons led me to choose
the first option:
- a medium severity issue is still a security issue we would want to
report to the user, who can then decide to either handle the issue or
ignore it
- `bandit` [maps the EC key algorithms to their respective key lengths
in
bits](https://github.com/PyCQA/bandit/blob/1.7.5/bandit/plugins/weak_cryptographic_key.py#L112-L133),
but there is no value below 160 bits, so technically `bandit` would
never report medium severity issues for EC keys, only high ones
Another consideration is that as shared just above, for EC key
algorithms, `bandit` has a mapping to map the algorithms to their
respective key lengths. In the implementation in Ruff, I rather went
with an explicit list of EC algorithms known to be vulnerable (which
would thus be reported) rather than implementing a mapping to retrieve
the associated key length and comparing it with the minimum value.
## Test Plan
Snapshot tests from
https://github.com/PyCQA/bandit/blob/1.7.5/examples/weak_cryptographic_key_sizes.py.
## Summary
This is a follow-up to #7469 that attempts to achieve similar gains, but
without introducing malachite. Instead, this PR removes the `BigInt`
type altogether, instead opting for a simple enum that allows us to
store small integers directly and only allocate for values greater than
`i64`:
```rust
/// A Python integer literal. Represents both small (fits in an `i64`) and large integers.
#[derive(Clone, PartialEq, Eq, Hash)]
pub struct Int(Number);
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Number {
/// A "small" number that can be represented as an `i64`.
Small(i64),
/// A "large" number that cannot be represented as an `i64`.
Big(Box<str>),
}
impl std::fmt::Display for Number {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
Number::Small(value) => write!(f, "{value}"),
Number::Big(value) => write!(f, "{value}"),
}
}
}
```
We typically don't care about numbers greater than `isize` -- our only
uses are comparisons against small constants (like `1`, `2`, `3`, etc.),
so there's no real loss of information, except in one or two rules where
we're now a little more conservative (with the worst-case being that we
don't flag, e.g., an `itertools.pairwise` that uses an extremely large
value for the slice start constant). For simplicity, a few diagnostics
now show a dedicated message when they see integers that are out of the
supported range (e.g., `outdated-version-block`).
An additional benefit here is that we get to remove a few dependencies,
especially `num-bigint`.
## Test Plan
`cargo test`