mirror of
https://github.com/astral-sh/ruff
synced 2026-01-08 15:14:19 -05:00
Disallow implicit concatenation of t-strings and other string types (#19485)
As of [this cpython PR](https://github.com/python/cpython/pull/135996), it is not allowed to concatenate t-strings with non-t-strings, implicitly or explicitly. Expressions such as `"foo" t"{bar}"` are now syntax errors.

This PR updates some AST nodes and parsing to reflect this change.

The structural change is that `TStringPart` is no longer needed, since, as in the case of `BytesStringLiteral`, the only possibilities are that we have a single `TString` or a vector of such (representing an implicit concatenation of t-strings). This removes a level of nesting from many AST expressions (which is what all the snapshot changes reflect), and simplifies some logic in the implementation of visitors, for example.

The other change of note is in the parser. When we meet an implicit concatenation of string-like literals, we now count the number of t-string literals. If these do not exhaust the total number of implicitly concatenated pieces, then we emit a syntax error. To recover from this syntax error, we encode any t-string pieces as _invalid_ string literals (which means we flag them as invalid, record their range, and record the value as `""`). Note that if at least one of the pieces is an f-string we prefer to parse the entire string as an f-string; otherwise we parse it as a string.

This logic is exactly the same as how we currently treat `BytesStringLiteral` parsing and error recovery - and carries with it the same pros and cons.

Finally, note that I have not implemented any changes in the implementation of the formatter. As far as I can tell, none are needed. I did change a few of the fixtures so that we are always concatenating t-strings with t-strings.
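The parser rule described above can be modelled in a few lines of Python. This is only an illustrative sketch, not Ruff's implementation (which is written in Rust): the names `Kind`, `Piece`, `Parsed`, and `parse_concatenation`, as well as the error message text, are hypothetical and chosen for the example. It assumes the pieces have already been lexed and tagged as string, f-string, or t-string, and it ignores bytes literals.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Kind(Enum):
    STRING = "string"
    FSTRING = "f-string"
    TSTRING = "t-string"


@dataclass
class Piece:
    kind: Kind
    source: str                  # literal as written, e.g. 't"{bar}"'
    invalid: bool = False        # set only during error recovery
    value: Optional[str] = None  # "" for pieces recovered as invalid strings


@dataclass
class Parsed:
    node: str                    # "TString", "FString", or "StringLiteral"
    pieces: List[Piece]
    error: Optional[str] = None  # hypothetical message; None when valid


def parse_concatenation(pieces: List[Piece]) -> Parsed:
    """Classify an implicit concatenation of string-like literals."""
    t_count = sum(1 for p in pieces if p.kind is Kind.TSTRING)
    has_fstring = any(p.kind is Kind.FSTRING for p in pieces)
    # Node used when the result is not a t-string: prefer an f-string
    # if at least one piece is an f-string, otherwise a plain string.
    fallback = "FString" if has_fstring else "StringLiteral"

    if t_count == 0:
        # No t-strings involved: ordinary concatenation, no error.
        return Parsed(fallback, pieces)
    if t_count == len(pieces):
        # The t-strings exhaust all pieces: a valid t-string concatenation.
        return Parsed("TString", pieces)

    # Mixed concatenation: report an error and recover by re-encoding
    # every t-string piece as an invalid string literal with value "".
    recovered = [
        Piece(Kind.STRING, p.source, invalid=True, value="")
        if p.kind is Kind.TSTRING
        else p
        for p in pieces
    ]
    return Parsed(
        fallback,
        recovered,
        error="cannot mix t-string literals with string or f-string literals",
    )


# Example: the expression `"foo" t"{bar}"` from the description above.
mixed = parse_concatenation(
    [Piece(Kind.STRING, '"foo"'), Piece(Kind.TSTRING, 't"{bar}"')]
)
print(mixed.node, mixed.error)
```

In this model the error case still yields a usable node, which mirrors the recovery strategy in the description: downstream consumers see a string or f-string whose former t-string pieces are flagged as invalid rather than a missing expression.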
@@ -110,16 +110,13 @@ f"{10 + len('bar')=}" f'{10 + len("bar")=}'
 # T-strings
 ##############################################################################

-# Escape `{` and `}` when merging a t-string with a string
-"a {not_a_variable}" t"b {10}" "c"
-
 # Join, and break expressions
 t"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa{
 expression
-}bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb" t"cccccccccccccccccccc {20999}" "more"
+}bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb" t"cccccccccccccccccccc {20999}" t"more"

 # Join, but don't break the expressions
-t"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa{expression}bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb" t"cccccccccccccccccccc {20999}" "more"
+t"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa{expression}bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb" t"cccccccccccccccccccc {20999}" t"more"

 t"test{
 expression
@@ -177,22 +174,11 @@ t"test" tR"test"

 "single" f""""single"""

-"single" t""""single"""
+t"single" t""""single"""

 b"single" b"""triple"""


-##############################################################################
-# Don't join t-strings and f-strings
-##############################################################################
-
-t"{interp}" f"{expr}"
-
-f"{expr}" t"{interp}"
-
-f"{expr}" "string" t"{interp}"
-
-
 ##############################################################################
 # Join strings in with statements
 ##############################################################################
@@ -521,9 +507,6 @@ f"{10 + len('bar')=}" f'{10 + len("bar")=}'
 # T-strings
 ##############################################################################

-# Escape `{` and `}` when merging a t-string with a string
-t"a {{not_a_variable}}b {10}c"
-
 # Join, and break expressions
 t"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa{
 expression
@@ -583,22 +566,11 @@ t"test" Rt"test"

 "single" f""""single"""

-"single" t""""single"""
+t"single" t""""single"""

 b"single" b"""triple"""


-##############################################################################
-# Don't join t-strings and f-strings
-##############################################################################
-
-t"{interp}" f"{expr}"
-
-f"{expr}" t"{interp}"
-
-f"{expr}" "string" t"{interp}"
-
-
 ##############################################################################
 # Join strings in with statements
 ##############################################################################
@@ -905,7 +877,7 @@ f"aaaaaaaaaaaaaaaa \
 ```diff
 --- Stable
 +++ Preview
-@@ -302,9 +302,12 @@
+@@ -288,9 +288,12 @@
 ##############################################################################
 # Use can_omit_optional_parentheses layout to avoid an instability where the formatter
 # picks the can_omit_optional_parentheses layout when the strings are joined.

@@ -351,7 +351,7 @@ a[
 b
 ] = (
 t"ccccc{
-expression}ccccccccccc" "cccccccccccccccccccccccc" # comment
+expression}ccccccccccc" t"cccccccccccccccccccccccc" # comment
 )

 # Same but starting with a joined string. They should both result in the same formatting.
@@ -367,7 +367,7 @@ a[
 aaaaaaa,
 b
 ] = t"ccccc{
-expression}ccccccccccc" "ccccccccccccccccccccccccccccccccccccccccccc" # comment
+expression}ccccccccccc" t"ccccccccccccccccccccccccccccccccccccccccccc" # comment


 # Split an overlong target, but join the string if it fits
@@ -376,7 +376,7 @@ a[
 b
 ].bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb = (
 t"ccccc{
-expression}ccccccccccc" "cccccccccccccccccccccccccccccc" # comment
+expression}ccccccccccc" t"cccccccccccccccccccccccccccccc" # comment
 )

 # Split both if necessary and keep multiline
@@ -385,66 +385,66 @@ a[
 b
 ].bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb = (
 t"ccccc{
-expression}cccccccccccccccccccccccccccccccc" "ccccccccccccccccccccccccccccccc" # comment
+expression}cccccccccccccccccccccccccccccccc" t"ccccccccccccccccccccccccccccccc" # comment
 )

 # Don't inline t-strings that contain expressions that are guaranteed to split, e.b. because of a magic trailing comma
 aaaaaaaaaaaaaaaaaa = t"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
 [a,]
-}" "moreeeeeeeeeeeeeeeeeeee" "test" # comment
+}" t"moreeeeeeeeeeeeeeeeeeee" t"test" # comment

 aaaaaaaaaaaaaaaaaa = (
 t"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
 [a,]
-}" "moreeeeeeeeeeeeeeeeeeee" "test" # comment
+}" t"moreeeeeeeeeeeeeeeeeeee" t"test" # comment
 )

 aaaaa[aaaaaaaaaaa] = t"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
 [a,]
-}" "moreeeeeeeeeeeeeeeeeeee" "test" # comment
+}" t"moreeeeeeeeeeeeeeeeeeee" t"test" # comment

 aaaaa[aaaaaaaaaaa] = (t"testeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee{
 [a,]
-}" "moreeeeeeeeeeeeeeeeeeee" "test" # comment
+}" t"moreeeeeeeeeeeeeeeeeeee" t"test" # comment
 )

 # Don't inline t-strings that contain commented expressions
 aaaaaaaaaaaaaaaaaa = (
 t"testeeeeeeeeeeeeeeeeeeeeeeeee{[
 a # comment
-]}" "moreeeeeeeeeeeeeeeeeetest" # comment
+]}" t"moreeeeeeeeeeeeeeeeeetest" # comment
 )

 aaaaa[aaaaaaaaaaa] = (
 t"testeeeeeeeeeeeeeeeeeeeeeeeee{[
 a # comment
-]}" "moreeeeeeeeeeeeeeeeeetest" # comment
+]}" t"moreeeeeeeeeeeeeeeeeetest" # comment
 )

 # Don't inline t-strings with multiline debug expressions:
 aaaaaaaaaaaaaaaaaa = (
 t"testeeeeeeeeeeeeeeeeeeeeeeeee{
-a=}" "moreeeeeeeeeeeeeeeeeetest" # comment
+a=}" t"moreeeeeeeeeeeeeeeeeetest" # comment
 )

 aaaaaaaaaaaaaaaaaa = (
 t"testeeeeeeeeeeeeeeeeeeeeeeeee{a +
-b=}" "moreeeeeeeeeeeeeeeeeetest" # comment
+b=}" t"moreeeeeeeeeeeeeeeeeetest" # comment
 )

 aaaaaaaaaaaaaaaaaa = (
 t"testeeeeeeeeeeeeeeeeeeeeeeeee{a
-=}" "moreeeeeeeeeeeeeeeeeetest" # comment
+=}" t"moreeeeeeeeeeeeeeeeeetest" # comment
 )

 aaaaa[aaaaaaaaaaa] = (
 t"testeeeeeeeeeeeeeeeeeeeeeeeee{
-a=}" "moreeeeeeeeeeeeeeeeeetest" # comment
+a=}" t"moreeeeeeeeeeeeeeeeeetest" # comment
 )

 aaaaa[aaaaaaaaaaa] = (
 t"testeeeeeeeeeeeeeeeeeeeeeeeee{a
-=}" "moreeeeeeeeeeeeeeeeeetest" # comment
+=}" t"moreeeeeeeeeeeeeeeeeetest" # comment
 )


@@ -505,7 +505,7 @@ a = (
 )

 logger.error(
-f"Failed to run task {task} for job"
+f"Failed to run task {task} for job"
 f"with id {str(job.id)}" # type: ignore[union-attr]
 )

@@ -909,7 +909,7 @@ a[aaaaaaa, b] = t"ccccc{expression}ccccccccccccccccccccccccccccccccccc" # comme
 # The string gets parenthesized because it, with the inlined comment, exceeds the line length limit.
 a[aaaaaaa, b] = (
 t"ccccc{expression}ccccccccccc"
-"ccccccccccccccccccccccccccccccccccccccccccc"
+t"ccccccccccccccccccccccccccccccccccccccccccc"
 ) # comment


@@ -925,7 +925,7 @@ a[
 aaaaaaa, b
 ].bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb = (
 t"ccccc{expression}cccccccccccccccccccccccccccccccc"
-"ccccccccccccccccccccccccccccccc"
+t"ccccccccccccccccccccccccccccccc"
 ) # comment

 # Don't inline t-strings that contain expressions that are guaranteed to split, e.b. because of a magic trailing comma
@@ -935,8 +935,8 @@ aaaaaaaaaaaaaaaaaa = (
 a,
 ]
 }"
-"moreeeeeeeeeeeeeeeeeeee"
-"test"
+t"moreeeeeeeeeeeeeeeeeeee"
+t"test"
 ) # comment

 aaaaaaaaaaaaaaaaaa = (
@@ -945,8 +945,8 @@ aaaaaaaaaaaaaaaaaa = (
 a,
 ]
 }"
-"moreeeeeeeeeeeeeeeeeeee"
-"test" # comment
+t"moreeeeeeeeeeeeeeeeeeee"
+t"test" # comment
 )

 aaaaa[aaaaaaaaaaa] = (
@@ -955,8 +955,8 @@ aaaaa[aaaaaaaaaaa] = (
 a,
 ]
 }"
-"moreeeeeeeeeeeeeeeeeeee"
-"test"
+t"moreeeeeeeeeeeeeeeeeeee"
+t"test"
 ) # comment

 aaaaa[aaaaaaaaaaa] = (
@@ -965,8 +965,8 @@ aaaaa[aaaaaaaaaaa] = (
 a,
 ]
 }"
-"moreeeeeeeeeeeeeeeeeeee"
-"test" # comment
+t"moreeeeeeeeeeeeeeeeeeee"
+t"test" # comment
 )

 # Don't inline t-strings that contain commented expressions
@@ -976,7 +976,7 @@ aaaaaaaaaaaaaaaaaa = (
 a # comment
 ]
 }"
-"moreeeeeeeeeeeeeeeeeetest" # comment
+t"moreeeeeeeeeeeeeeeeeetest" # comment
 )

 aaaaa[aaaaaaaaaaa] = (
@@ -985,38 +985,38 @@ aaaaa[aaaaaaaaaaa] = (
 a # comment
 ]
 }"
-"moreeeeeeeeeeeeeeeeeetest" # comment
+t"moreeeeeeeeeeeeeeeeeetest" # comment
 )

 # Don't inline t-strings with multiline debug expressions:
 aaaaaaaaaaaaaaaaaa = (
 t"testeeeeeeeeeeeeeeeeeeeeeeeee{
 a=}"
-"moreeeeeeeeeeeeeeeeeetest" # comment
+t"moreeeeeeeeeeeeeeeeeetest" # comment
 )

 aaaaaaaaaaaaaaaaaa = (
 t"testeeeeeeeeeeeeeeeeeeeeeeeee{a +
 b=}"
-"moreeeeeeeeeeeeeeeeeetest" # comment
+t"moreeeeeeeeeeeeeeeeeetest" # comment
 )

 aaaaaaaaaaaaaaaaaa = (
 t"testeeeeeeeeeeeeeeeeeeeeeeeee{a
 =}"
-"moreeeeeeeeeeeeeeeeeetest" # comment
+t"moreeeeeeeeeeeeeeeeeetest" # comment
 )

 aaaaa[aaaaaaaaaaa] = (
 t"testeeeeeeeeeeeeeeeeeeeeeeeee{
 a=}"
-"moreeeeeeeeeeeeeeeeeetest" # comment
+t"moreeeeeeeeeeeeeeeeeetest" # comment
 )

 aaaaa[aaaaaaaaaaa] = (
 t"testeeeeeeeeeeeeeeeeeeeeeeeee{a
 =}"
-"moreeeeeeeeeeeeeeeeeetest" # comment
+t"moreeeeeeeeeeeeeeeeeetest" # comment
 )



@@ -14,21 +14,21 @@ rt"Not-so-tricky \"quote"

 # Regression test for tstrings dropping comments
 result_f = (
-'Traceback (most recent call last):\n'
+t'Traceback (most recent call last):\n'
 t' File "{__file__}", line {lineno_f+5}, in _check_recursive_traceback_display\n'
-' f()\n'
+t' f()\n'
 t' File "{__file__}", line {lineno_f+1}, in f\n'
-' f()\n'
+t' f()\n'
 t' File "{__file__}", line {lineno_f+1}, in f\n'
-' f()\n'
+t' f()\n'
 t' File "{__file__}", line {lineno_f+1}, in f\n'
-' f()\n'
+t' f()\n'
 # XXX: The following line changes depending on whether the tests
 # are run through the interactive interpreter or with -m
 # It also varies depending on the platform (stack size)
 # Fortunately, we don't care about exactness here, so we use regex
-r' \[Previous line repeated (\d+) more times\]' '\n'
-'RecursionError: maximum recursion depth exceeded\n'
+rt' \[Previous line repeated (\d+) more times\]' t'\n'
+t'RecursionError: maximum recursion depth exceeded\n'
 )


@@ -37,7 +37,7 @@ result_f = (
 (
 t'{1}'
 # comment 1
-''
+t''
 )

 (
@@ -661,7 +661,7 @@ hello {

 # Implicit concatenated t-string containing quotes
 _ = (
-'This string should change its quotes to double quotes'
+t'This string should change its quotes to double quotes'
 t'This string uses double quotes in an expression {"it's a quote"}'
 t'This t-string does not use any quotes.'
 )
@@ -761,22 +761,22 @@ rt"Not-so-tricky \"quote"

 # Regression test for tstrings dropping comments
 result_f = (
-"Traceback (most recent call last):\n"
+t"Traceback (most recent call last):\n"
 t' File "{__file__}", line {lineno_f + 5}, in _check_recursive_traceback_display\n'
-" f()\n"
+t" f()\n"
 t' File "{__file__}", line {lineno_f + 1}, in f\n'
-" f()\n"
+t" f()\n"
 t' File "{__file__}", line {lineno_f + 1}, in f\n'
-" f()\n"
+t" f()\n"
 t' File "{__file__}", line {lineno_f + 1}, in f\n'
-" f()\n"
+t" f()\n"
 # XXX: The following line changes depending on whether the tests
 # are run through the interactive interpreter or with -m
 # It also varies depending on the platform (stack size)
 # Fortunately, we don't care about exactness here, so we use regex
-r" \[Previous line repeated (\d+) more times\]"
-"\n"
-"RecursionError: maximum recursion depth exceeded\n"
+rt" \[Previous line repeated (\d+) more times\]"
+t"\n"
+t"RecursionError: maximum recursion depth exceeded\n"
 )


@@ -785,7 +785,7 @@ result_f = (
 (
 t"{1}"
 # comment 1
-""
+t""
 )

 (
@@ -1463,7 +1463,7 @@ hello {

 # Implicit concatenated t-string containing quotes
 _ = (
-"This string should change its quotes to double quotes"
+t"This string should change its quotes to double quotes"
 t"This string uses double quotes in an expression {"it's a quote"}"
 t"This t-string does not use any quotes."
 )