mirror of
https://github.com/astral-sh/ruff
synced 2026-01-23 06:20:55 -05:00
## Summary

_This is a preview-only feature, available with the `--preview` command-line flag._

With the implementation of [PEP 701] in Python 3.12, f-strings can now be broken into multiple lines, can contain comments, and can re-use the same quote character. Currently, no other Python formatter formats f-strings, so the style used for f-string formatting still needs to be discussed and defined. Relevant discussion: https://github.com/astral-sh/ruff/discussions/9785

The goal of this PR is to add minimal support for f-string formatting, that is, to format the expressions within replacement fields without introducing any major style changes.

### Newlines

The heuristic for adding newlines is similar to that of [Prettier](https://prettier.io/docs/en/next/rationale.html#template-literals): the formatter only splits an expression in the replacement field across multiple lines if there was already a line break within the replacement field. In other words, the formatter does not add any newlines unless they were already present, i.e., they were added by the user. This makes breaking any expression inside an f-string optional and under the user's control. For example,

```python
# We wouldn't break this
aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa { aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc } cccccccccc"

# But, we would break the following as there's already a newline
aaaaaaaaaaa = f"asaaaaaaaaaaaaaaaa {
    aaaaaaaaaaaa + bbbbbbbbbbbb + ccccccccccccccc } cccccccccc"
```

If there are comments in any of the replacement fields of the f-string, then it is always a multi-line f-string, in which case the formatter prefers to break expressions, i.e., introduce newlines. For example,

```python
x = f"{ # comment
    a }"
```

### Quotes

The logic for formatting quotes remains unchanged. The existing logic determines the necessary quote character, which is then used accordingly.

Now, if the expression inside an f-string is itself a string literal, then we need to preserve the existing quote and not change it to the preferred quote unless the target version is 3.12 or later. For example,

```python
f"outer {'inner'} outer"

# For pre 3.12, preserve the single quote
f"outer {'inner'} outer"

# While for 3.12 and later, the quotes can be changed
f"outer {"inner"} outer"
```

But, for triple-quoted strings, we can re-use the same quote char unless the inner string is itself a triple-quoted string.

```python
f"""outer {"inner"} outer"""  # valid
f"""outer {'''inner'''} outer"""  # preserve the single quote char for the inner string
```

### Debug expressions

If debug expressions are present in the replacement field of an f-string, then the whitespace needs to be preserved because it is rendered as is (for example, `f"{ x = }"`). If there are any nested f-strings, then the whitespace in them needs to be preserved as well, which means that we stop formatting the f-string as soon as we encounter a debug expression.

```python
f"outer {   x  =  !s  :.3f}"
#                   ^^
#                   We can remove these whitespaces
```

Now, the whitespace doesn't need to be preserved around the conversion spec and format specifiers, so we format them as usual, but we don't format any nested f-string within the format specifier.

### Miscellaneous

- The [`hug_parens_with_braces_and_square_brackets`](https://github.com/astral-sh/ruff/issues/8279) preview style isn't implemented w.r.t. the f-string curly braces.
- The [indentation](https://github.com/astral-sh/ruff/discussions/9785#discussioncomment-8470590) is always relative to the statement containing the f-string.

## Test Plan

* Add new test cases
* Review existing snapshot changes
* Review the ecosystem changes

[PEP 701]: https://peps.python.org/pep-0701/
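The newline and quote rules described in the summary can be read as two small decision functions. The following is a minimal, self-contained sketch of that reading, not the code from this PR: `Quote`, `NestedString`, `inner_quote`, and `may_break_replacement_field` are hypothetical names, and the sketch assumes that the preferred quote is the one used by the enclosing f-string and that the inner string contains no quote characters of its own.

```rust
/// Quote characters available to a Python string literal.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
enum Quote {
    Single,
    Double,
}

/// Hypothetical summary of a string literal nested inside a replacement field.
#[derive(Copy, Clone, Debug)]
struct NestedString {
    /// Quote character the user wrote for the inner string.
    existing: Quote,
    /// Whether the inner string is itself triple-quoted.
    inner_triple_quoted: bool,
    /// Whether the enclosing f-string is triple-quoted.
    outer_triple_quoted: bool,
}

/// Picks the quote character for a string nested in an f-string.
///
/// Assumption: `target_supports_pep701` is true for Python 3.12 and later.
fn inner_quote(nested: NestedString, preferred: Quote, target_supports_pep701: bool) -> Quote {
    // With PEP 701 the inner string may re-use the outer quote character.
    if target_supports_pep701 {
        return preferred;
    }
    // Before 3.12, a plain string inside a triple-quoted f-string can still
    // use the preferred quote without terminating the outer string.
    if nested.outer_triple_quoted && !nested.inner_triple_quoted {
        return preferred;
    }
    // Otherwise keep what the user wrote.
    nested.existing
}

/// Returns `true` if the formatter may split the expression in a replacement
/// field, i.e. if the user already placed a line break inside the field.
///
/// A comment inside a replacement field always ends at a line break, so this
/// check also covers fields that contain comments.
fn may_break_replacement_field(field_source: &str) -> bool {
    field_source.contains('\n') || field_source.contains('\r')
}
```

For `f"outer {'inner'} outer"` targeting a version before 3.12, `inner_quote` keeps the single quote, while the same input targeting 3.12 yields the preferred double quote.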
421 lines
12 KiB
Rust
use crate::comments::Comments;
use crate::other::f_string::FStringContext;
use crate::string::QuoteChar;
use crate::PyFormatOptions;
use ruff_formatter::{Buffer, FormatContext, GroupId, IndentWidth, SourceCode};
use ruff_source_file::Locator;
use std::fmt::{Debug, Formatter};
use std::ops::{Deref, DerefMut};

#[derive(Clone)]
pub struct PyFormatContext<'a> {
    options: PyFormatOptions,
    contents: &'a str,
    comments: Comments<'a>,
    node_level: NodeLevel,
    indent_level: IndentLevel,
    /// Set to a non-None value when the formatter is running on a code
    /// snippet within a docstring. The value should be the quote character of the
    /// docstring containing the code snippet.
    ///
    /// Various parts of the formatter may inspect this state to change how it
    /// works. For example, multi-line strings will always be written with a
    /// quote style that is inverted from the one here in order to ensure that
    /// the formatted Python code will be valid.
    docstring: Option<QuoteChar>,
    /// The state of the formatter with respect to f-strings.
    f_string_state: FStringState,
}

impl<'a> PyFormatContext<'a> {
    pub(crate) fn new(options: PyFormatOptions, contents: &'a str, comments: Comments<'a>) -> Self {
        Self {
            options,
            contents,
            comments,
            node_level: NodeLevel::TopLevel(TopLevelStatementPosition::Other),
            indent_level: IndentLevel::new(0),
            docstring: None,
            f_string_state: FStringState::Outside,
        }
    }

    pub(crate) fn source(&self) -> &'a str {
        self.contents
    }

    #[allow(unused)]
    pub(crate) fn locator(&self) -> Locator<'a> {
        Locator::new(self.contents)
    }

    pub(crate) fn set_node_level(&mut self, level: NodeLevel) {
        self.node_level = level;
    }

    pub(crate) fn node_level(&self) -> NodeLevel {
        self.node_level
    }

    pub(crate) fn set_indent_level(&mut self, level: IndentLevel) {
        self.indent_level = level;
    }

    pub(crate) fn indent_level(&self) -> IndentLevel {
        self.indent_level
    }

    pub(crate) fn comments(&self) -> &Comments<'a> {
        &self.comments
    }

    /// Returns a non-None value only if the formatter is running on a code
    /// snippet within a docstring.
    ///
    /// The quote character returned corresponds to the quoting used for the
    /// docstring containing the code snippet currently being formatted.
    pub(crate) fn docstring(&self) -> Option<QuoteChar> {
        self.docstring
    }

    /// Return a new context suitable for formatting code snippets within a
    /// docstring.
    ///
    /// The quote character given should correspond to the quote character used
    /// for the docstring containing the code snippets.
    pub(crate) fn in_docstring(self, quote: QuoteChar) -> PyFormatContext<'a> {
        PyFormatContext {
            docstring: Some(quote),
            ..self
        }
    }

    pub(crate) fn f_string_state(&self) -> FStringState {
        self.f_string_state
    }

    pub(crate) fn set_f_string_state(&mut self, f_string_state: FStringState) {
        self.f_string_state = f_string_state;
    }

    /// Returns `true` if preview mode is enabled.
    pub(crate) const fn is_preview(&self) -> bool {
        self.options.preview().is_enabled()
    }
}

impl FormatContext for PyFormatContext<'_> {
    type Options = PyFormatOptions;

    fn options(&self) -> &Self::Options {
        &self.options
    }

    fn source_code(&self) -> SourceCode {
        SourceCode::new(self.contents)
    }
}

impl Debug for PyFormatContext<'_> {
    fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
        f.debug_struct("PyFormatContext")
            .field("options", &self.options)
            .field("comments", &self.comments.debug(self.source_code()))
            .field("node_level", &self.node_level)
            .field("source", &self.contents)
            .finish()
    }
}

#[derive(Copy, Clone, Debug, Default)]
pub(crate) enum FStringState {
    /// The formatter is inside an f-string expression element i.e., between the
    /// curly braces in `f"foo {x}"`.
    ///
    /// The containing `FStringContext` is the surrounding f-string context.
    InsideExpressionElement(FStringContext),
    /// The formatter is outside an f-string.
    #[default]
    Outside,
}

/// The position of a top-level statement in the module.
#[derive(Copy, Clone, Debug, Eq, PartialEq, Default)]
pub(crate) enum TopLevelStatementPosition {
    /// This is the last top-level statement in the module.
    Last,
    /// Any other top-level statement.
    #[default]
    Other,
}

/// What's the enclosing level of the outer node.
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub(crate) enum NodeLevel {
    /// Formatting statements on the module level.
    TopLevel(TopLevelStatementPosition),

    /// Formatting the body statements of a [compound statement](https://docs.python.org/3/reference/compound_stmts.html#compound-statements)
    /// (`if`, `while`, `match`, etc.).
    CompoundStatement,

    /// The root or any sub-expression.
    Expression(Option<GroupId>),

    /// Formatting nodes that are enclosed by a parenthesized (any `[]`, `{}` or `()`) expression.
    ParenthesizedExpression,
}

impl Default for NodeLevel {
    fn default() -> Self {
        Self::TopLevel(TopLevelStatementPosition::Other)
    }
}

impl NodeLevel {
    /// Returns `true` if the expression is in a parenthesized context.
    pub(crate) const fn is_parenthesized(self) -> bool {
        matches!(
            self,
            NodeLevel::Expression(Some(_)) | NodeLevel::ParenthesizedExpression
        )
    }

    /// Returns `true` if this is the last top-level statement in the module.
    pub(crate) const fn is_last_top_level_statement(self) -> bool {
        matches!(self, NodeLevel::TopLevel(TopLevelStatementPosition::Last))
    }
}

/// Change the [`NodeLevel`] of the formatter for the lifetime of this struct
pub(crate) struct WithNodeLevel<'ast, 'buf, B>
where
    B: Buffer<Context = PyFormatContext<'ast>>,
{
    buffer: &'buf mut B,
    saved_level: NodeLevel,
}

impl<'ast, 'buf, B> WithNodeLevel<'ast, 'buf, B>
where
    B: Buffer<Context = PyFormatContext<'ast>>,
{
    pub(crate) fn new(level: NodeLevel, buffer: &'buf mut B) -> Self {
        let context = buffer.state_mut().context_mut();
        let saved_level = context.node_level();

        context.set_node_level(level);

        Self {
            buffer,
            saved_level,
        }
    }
}

impl<'ast, 'buf, B> Deref for WithNodeLevel<'ast, 'buf, B>
where
    B: Buffer<Context = PyFormatContext<'ast>>,
{
    type Target = B;

    fn deref(&self) -> &Self::Target {
        self.buffer
    }
}

impl<'ast, 'buf, B> DerefMut for WithNodeLevel<'ast, 'buf, B>
where
    B: Buffer<Context = PyFormatContext<'ast>>,
{
    fn deref_mut(&mut self) -> &mut Self::Target {
        self.buffer
    }
}

impl<'ast, B> Drop for WithNodeLevel<'ast, '_, B>
where
    B: Buffer<Context = PyFormatContext<'ast>>,
{
    fn drop(&mut self) {
        self.buffer
            .state_mut()
            .context_mut()
            .set_node_level(self.saved_level);
    }
}

/// The current indent level of the formatter.
///
/// One can determine the width of the indent itself (in number of ASCII
/// space characters) by multiplying the indent level by the configured indent
/// width.
///
/// This is specifically used inside the docstring code formatter for
/// implementing its "dynamic" line width mode. Namely, in the nested call to
/// the formatter, when "dynamic" mode is enabled, the line width is set to
/// `max(1, line_width - indent_level * indent_width)`, where `line_width` in
/// this context is the global line width setting.
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub(crate) struct IndentLevel {
    /// The numeric level. It is incremented for every whole indent in Python
    /// source code.
    ///
    /// Note that the first indentation level is actually 1, since this starts
    /// at 0 and is incremented when the first top-level statement is seen. So
    /// even though the first top-level statement in Python source will have no
    /// indentation, its indentation level is 1.
    level: u16,
}

impl IndentLevel {
    /// Returns a new indent level for the given value.
    pub(crate) fn new(level: u16) -> IndentLevel {
        IndentLevel { level }
    }

    /// Returns the next indent level.
    pub(crate) fn increment(self) -> IndentLevel {
        IndentLevel {
            level: self.level.saturating_add(1),
        }
    }

    /// Convert this indent level into a specific number of ASCII whitespace
    /// characters based on the given indent width.
    pub(crate) fn to_ascii_spaces(self, width: IndentWidth) -> u16 {
        let width = u16::try_from(width.value()).unwrap_or(u16::MAX);
        // Why the subtraction? IndentLevel starts at 0 and asks for the "next"
        // indent level before seeing the first top-level statement. So it's
        // always 1 more than what we expect it to be.
        let level = self.level.saturating_sub(1);
        width.saturating_mul(level)
    }
}

/// Change the [`IndentLevel`] of the formatter for the lifetime of this
/// struct.
pub(crate) struct WithIndentLevel<'a, B, D>
where
    D: DerefMut<Target = B>,
    B: Buffer<Context = PyFormatContext<'a>>,
{
    buffer: D,
    saved_level: IndentLevel,
}

impl<'a, B, D> WithIndentLevel<'a, B, D>
where
    D: DerefMut<Target = B>,
    B: Buffer<Context = PyFormatContext<'a>>,
{
    pub(crate) fn new(level: IndentLevel, mut buffer: D) -> Self {
        let context = buffer.state_mut().context_mut();
        let saved_level = context.indent_level();

        context.set_indent_level(level);

        Self {
            buffer,
            saved_level,
        }
    }
}

impl<'a, B, D> Deref for WithIndentLevel<'a, B, D>
where
    D: DerefMut<Target = B>,
    B: Buffer<Context = PyFormatContext<'a>>,
{
    type Target = B;

    fn deref(&self) -> &Self::Target {
        &self.buffer
    }
}

impl<'a, B, D> DerefMut for WithIndentLevel<'a, B, D>
where
    D: DerefMut<Target = B>,
    B: Buffer<Context = PyFormatContext<'a>>,
{
    fn deref_mut(&mut self) -> &mut Self::Target {
        &mut self.buffer
    }
}

impl<'a, B, D> Drop for WithIndentLevel<'a, B, D>
where
    D: DerefMut<Target = B>,
    B: Buffer<Context = PyFormatContext<'a>>,
{
    fn drop(&mut self) {
        self.buffer
            .state_mut()
            .context_mut()
            .set_indent_level(self.saved_level);
    }
}

/// Change the [`FStringState`] of the formatter for the lifetime of this
/// struct.
pub(crate) struct WithFStringState<'a, B, D>
where
    D: DerefMut<Target = B>,
    B: Buffer<Context = PyFormatContext<'a>>,
{
    buffer: D,
    saved_location: FStringState,
}

impl<'a, B, D> WithFStringState<'a, B, D>
where
    D: DerefMut<Target = B>,
    B: Buffer<Context = PyFormatContext<'a>>,
{
    pub(crate) fn new(expr_location: FStringState, mut buffer: D) -> Self {
        let context = buffer.state_mut().context_mut();
        let saved_location = context.f_string_state();

        context.set_f_string_state(expr_location);

        Self {
            buffer,
            saved_location,
        }
    }
}

impl<'a, B, D> Deref for WithFStringState<'a, B, D>
where
    D: DerefMut<Target = B>,
    B: Buffer<Context = PyFormatContext<'a>>,
{
    type Target = B;

    fn deref(&self) -> &Self::Target {
        &self.buffer
    }
}

impl<'a, B, D> DerefMut for WithFStringState<'a, B, D>
where
    D: DerefMut<Target = B>,
    B: Buffer<Context = PyFormatContext<'a>>,
{
    fn deref_mut(&mut self) -> &mut Self::Target {
        &mut self.buffer
    }
}

impl<'a, B, D> Drop for WithFStringState<'a, B, D>
where
    D: DerefMut<Target = B>,
    B: Buffer<Context = PyFormatContext<'a>>,
{
    fn drop(&mut self) {
        self.buffer
            .state_mut()
            .context_mut()
            .set_f_string_state(self.saved_location);
    }
}
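
`WithNodeLevel`, `WithIndentLevel`, and `WithFStringState` all follow the same save, set, and restore-on-drop pattern. The sketch below shows that pattern in isolation, under simplifying assumptions: `Level`, `Context`, `WithLevel`, and `format_nested` are hypothetical stand-ins, and the guard wraps the context directly instead of a `Buffer` from `ruff_formatter`.

```rust
use std::ops::{Deref, DerefMut};

/// Hypothetical stand-in for `NodeLevel`.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
enum Level {
    TopLevel,
    Expression,
}

/// Hypothetical stand-in for the formatter context.
#[derive(Debug)]
struct Context {
    level: Level,
    output: Vec<String>,
}

/// Changes the level of `Context` for the lifetime of this guard.
struct WithLevel<'ctx> {
    context: &'ctx mut Context,
    saved: Level,
}

impl<'ctx> WithLevel<'ctx> {
    fn new(level: Level, context: &'ctx mut Context) -> Self {
        // Remember the current level so that `drop` can restore it.
        let saved = context.level;
        context.level = level;
        Self { context, saved }
    }
}

impl Deref for WithLevel<'_> {
    type Target = Context;

    fn deref(&self) -> &Context {
        self.context
    }
}

impl DerefMut for WithLevel<'_> {
    fn deref_mut(&mut self) -> &mut Context {
        self.context
    }
}

impl Drop for WithLevel<'_> {
    fn drop(&mut self) {
        // Restore the level that was active before the guard was created.
        self.context.level = self.saved;
    }
}

/// Runs a nested step at `Level::Expression` and returns the level that was
/// visible while the guard was alive.
fn format_nested(context: &mut Context) -> Level {
    let mut guard = WithLevel::new(Level::Expression, context);
    // The guard dereferences to the context, so it can be used in its place.
    guard.output.push(String::from("nested"));
    let seen = guard.level;
    seen
    // `guard` is dropped here, which restores the previous level.
}
```

Because the restore happens in `Drop`, the previous level comes back on every exit path, including early returns with `?`, without the caller having to remember to reset it.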
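
The off-by-one handling in `IndentLevel::to_ascii_spaces` is easier to follow with concrete numbers. The sketch below re-creates the arithmetic with plain `u16` values. The tuple struct and the `dynamic_line_width` helper are hypothetical simplifications, and clamping the result so that it never drops below 1 is an assumption about the intended behavior of the "dynamic" line width mode.

```rust
/// Simplified stand-in for `IndentLevel` that takes the indent width as a
/// plain `u16` instead of `IndentWidth`.
#[derive(Copy, Clone, Debug, PartialEq, Eq)]
struct IndentLevel(u16);

impl IndentLevel {
    /// Returns the next indent level.
    fn increment(self) -> Self {
        IndentLevel(self.0.saturating_add(1))
    }

    /// Number of ASCII spaces for this level. The level is one higher than
    /// the visible nesting depth, hence the subtraction.
    fn to_ascii_spaces(self, indent_width: u16) -> u16 {
        indent_width.saturating_mul(self.0.saturating_sub(1))
    }
}

/// Hypothetical helper: the line width left for a nested formatter call once
/// the current indentation is subtracted, never smaller than 1.
fn dynamic_line_width(line_width: u16, level: IndentLevel, indent_width: u16) -> u16 {
    line_width
        .saturating_sub(level.to_ascii_spaces(indent_width))
        .max(1)
}
```

With an indent width of 4, a top-level statement (level 1) maps to 0 spaces and a statement nested two blocks deep (level 3) maps to 8 spaces, so a global line width of 88 leaves 80 columns for the nested call.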