[red-knot] Preliminary `NamedTuple` support (#17738)

## Summary

Adds preliminary support for `NamedTuple`s, including:
* No false positives when constructing a `NamedTuple` object
* Correct signature for the synthesized `__new__` method, i.e. proper
checking of constructor calls
* A patched MRO (`NamedTuple` => `tuple`), mainly to make type inference
of named attributes possible, but also to better reflect the runtime
MRO.

All of this works:
```py
from typing import NamedTuple

class Person(NamedTuple):
    id: int
    name: str
    age: int | None = None

alice = Person(1, "Alice", 42)
alice = Person(id=1, name="Alice", age=42)

reveal_type(alice.id)  # revealed: int
reveal_type(alice.name)  # revealed: str
reveal_type(alice.age)  # revealed: int | None

# error: [missing-argument]
Person(3)

# error: [too-many-positional-arguments]
Person(3, "Eve", 99, "extra")

# error: [invalid-argument-type]
Person(id="3", name="Eve")
```
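
For reference, the runtime MRO that the patched MRO is meant to reflect can be checked with a small sketch. This is a plain CPython illustration, not red-knot output, and the two-field `Person` class here is a reduced version of the one above:

```python
from typing import NamedTuple


class Person(NamedTuple):
    id: int
    name: str


# At runtime, `NamedTuple` itself does not appear in the MRO:
# the generated class is a direct subclass of `tuple`.
mro = Person.__mro__
print(mro)

alice = Person(1, "Alice")
is_tuple = isinstance(alice, tuple)
print(is_tuple)
```

This is why modelling `NamedTuple` => `tuple` in the MRO is a reasonable approximation of what happens at runtime.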

Not included:
* Type inference for index-based access
* Support for the functional `MyTuple = NamedTuple("MyTuple", […])`
syntax
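
To make the two gaps concrete, here is a minimal runtime sketch of both features. `Point` and its fields are hypothetical names chosen for illustration only; the snippet shows CPython behavior, not what red-knot currently infers:

```python
from typing import NamedTuple

# Functional syntax: field names and types are passed as a list of pairs.
Point = NamedTuple("Point", [("x", int), ("y", int)])

p = Point(1, 2)

# Index-based access yields the same values as attribute access.
first = p[0]
second = p[1]
same = (p[0], p[1]) == (p.x, p.y)
print(first, second, same)
```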

## Test Plan

New Markdown tests

## Ecosystem analysis

```
                          Diagnostic Analysis Report                           
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━┓
┃ Diagnostic ID                     ┃ Severity ┃ Removed ┃ Added ┃ Net Change ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━┩
│ lint:call-non-callable            │ error    │       0 │     3 │         +3 │
│ lint:call-possibly-unbound-method │ warning  │       0 │     4 │         +4 │
│ lint:invalid-argument-type        │ error    │       0 │    72 │        +72 │
│ lint:invalid-context-manager      │ error    │       0 │     2 │         +2 │
│ lint:invalid-return-type          │ error    │       0 │     2 │         +2 │
│ lint:missing-argument             │ error    │       0 │    46 │        +46 │
│ lint:no-matching-overload         │ error    │   19121 │     0 │     -19121 │
│ lint:not-iterable                 │ error    │       0 │     6 │         +6 │
│ lint:possibly-unbound-attribute   │ warning  │      13 │    32 │        +19 │
│ lint:redundant-cast               │ warning  │       0 │     1 │         +1 │
│ lint:unresolved-attribute         │ error    │       0 │    10 │        +10 │
│ lint:unsupported-operator         │ error    │       3 │     9 │         +6 │
│ lint:unused-ignore-comment        │ warning  │      15 │     4 │        -11 │
├───────────────────────────────────┼──────────┼─────────┼───────┼────────────┤
│ TOTAL                             │          │   19152 │   191 │     -18961 │
└───────────────────────────────────┴──────────┴─────────┴───────┴────────────┘

Analysis complete. Found 13 unique diagnostic IDs.
Total diagnostics removed: 19152
Total diagnostics added: 191
Net change: -18961
```

I uploaded the full ecosystem diff (ignoring the 19k
`no-matching-overload` diagnostics)
[here](https://shark.fish/diff-namedtuple.html).

* Some new `missing-argument` false positives arise because named
tuples are often constructed with argument unpacking, as in
`MyNamedTuple(*fields)`, which we do not understand yet.
* Some new `unresolved-attribute` false positives arise because
methods like `_replace` are not available yet.
* Many of the `invalid-argument-type` diagnostics look like true
positives.
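
The two false-positive patterns above look roughly like this at runtime. `Version` is a hypothetical class used only for illustration; the snippet shows CPython behavior and does not come from the ecosystem projects:

```python
from typing import NamedTuple


class Version(NamedTuple):  # hypothetical example class
    major: int
    minor: int
    patch: int = 0


# Construction via unpacking: the number of arguments is only known at runtime.
fields = [1, 2, 3]
v = Version(*fields)

# `_replace` is generated at runtime and returns a new tuple
# with the given fields replaced.
bumped = v._replace(minor=3)

# `_asdict` is another method generated at runtime.
as_dict = v._asdict()
print(v, bumped, as_dict)
```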

---------

Co-authored-by: Douglas Creager <dcreager@dcreager.net>
David Peter 2025-04-30 22:52:04 +02:00 committed by GitHub
parent d33a503686
commit 03d8679adf
4 changed files with 276 additions and 116 deletions


@ -0,0 +1,120 @@
# `NamedTuple`
`NamedTuple` is a type-safe way to define named tuples: tuples whose fields can be accessed by
name, and not just by their numeric position within the tuple.
## `typing.NamedTuple`
### Basics
```py
from typing import NamedTuple
class Person(NamedTuple):
    id: int
    name: str
    age: int | None = None
alice = Person(1, "Alice", 42)
alice = Person(id=1, name="Alice", age=42)
bob = Person(2, "Bob")
bob = Person(id=2, name="Bob")
reveal_type(alice.id) # revealed: int
reveal_type(alice.name) # revealed: str
reveal_type(alice.age) # revealed: int | None
# TODO: These should reveal the types of the fields
reveal_type(alice[0]) # revealed: Unknown
reveal_type(alice[1]) # revealed: Unknown
reveal_type(alice[2]) # revealed: Unknown
# error: [missing-argument]
Person(3)
# error: [too-many-positional-arguments]
Person(3, "Eve", 99, "extra")
# error: [invalid-argument-type]
Person(id="3", name="Eve")
```
Alternative functional syntax:
```py
Person2 = NamedTuple("Person", [("id", int), ("name", str)])
alice2 = Person2(1, "Alice")
# TODO: should be an error
Person2(1)
reveal_type(alice2.id) # revealed: @Todo(GenericAlias instance)
reveal_type(alice2.name) # revealed: @Todo(GenericAlias instance)
```
### Multiple Inheritance
Multiple inheritance is not supported for `NamedTuple` classes:
```py
from typing import NamedTuple
# This should ideally emit a diagnostic
class C(NamedTuple, object):
    id: int
    name: str
```
### Inheriting from a `NamedTuple`
Inheriting from a `NamedTuple` is supported, but new fields on the subclass will not be part of the
synthesized `__new__` signature:
```py
from typing import NamedTuple
class User(NamedTuple):
    id: int
    name: str
class SuperUser(User):
    level: int
# This is fine:
alice = SuperUser(1, "Alice")
reveal_type(alice.level) # revealed: int
# This is an error because `level` is not part of the signature:
# error: [too-many-positional-arguments]
alice = SuperUser(1, "Alice", 3)
```
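The runtime behavior that this signature mirrors can be sketched as follows. This is a plain CPython illustration rather than an mdtest case, and the `rejected` flag is a helper introduced only for this example:

```python
from typing import NamedTuple


class User(NamedTuple):
    id: int
    name: str


class SuperUser(User):
    level: int  # a plain annotation; it does not add a tuple field


# The subclass inherits the field list of `User` unchanged.
fields = SuperUser._fields

# The inherited `__new__` only accepts `id` and `name`.
try:
    SuperUser(1, "Alice", 3)
except TypeError:
    rejected = True
else:
    rejected = False

print(fields, rejected)
```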
### Generic named tuples
```toml
[environment]
python-version = "3.12"
```
```py
from typing import NamedTuple
class Property[T](NamedTuple):
    name: str
    value: T
# TODO: this should be supported (no error, revealed type of `Property[float]`)
# error: [invalid-argument-type]
reveal_type(Property("height", 3.4)) # revealed: Property[Unknown]
```
## `collections.namedtuple`
```py
from collections import namedtuple
Person = namedtuple("Person", ["id", "name", "age"], defaults=[None])
alice = Person(1, "Alice", 42)
bob = Person(2, "Bob")
```
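As a runtime reference for the `defaults` argument used above: defaults are applied to the rightmost fields, so a single default covers `age` only. The sketch below is a CPython illustration and says nothing about what red-knot infers for these calls:

```python
from collections import namedtuple

Person = namedtuple("Person", ["id", "name", "age"], defaults=[None])

alice = Person(1, "Alice", 42)
bob = Person(2, "Bob")  # `age` falls back to its default

# Mapping of field names to default values, for fields that have one.
defaults = Person._field_defaults
print(alice, bob, defaults)
```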


@ -101,6 +101,42 @@ fn inheritance_cycle_initial<'db>(
None
}
/// A category of classes with code generation capabilities (with synthesized methods).
#[derive(Clone, Copy, Debug, PartialEq)]
enum CodeGeneratorKind {
/// Classes decorated with `@dataclass` or similar dataclass-like decorators
DataclassLike,
/// Classes inheriting from `typing.NamedTuple`
NamedTuple,
}
impl CodeGeneratorKind {
fn from_class(db: &dyn Db, class: ClassLiteral<'_>) -> Option<Self> {
if CodeGeneratorKind::DataclassLike.matches(db, class) {
Some(CodeGeneratorKind::DataclassLike)
} else if CodeGeneratorKind::NamedTuple.matches(db, class) {
Some(CodeGeneratorKind::NamedTuple)
} else {
None
}
}
fn matches<'db>(self, db: &'db dyn Db, class: ClassLiteral<'db>) -> bool {
match self {
Self::DataclassLike => {
class.dataclass_params(db).is_some()
|| class
.try_metaclass(db)
.is_ok_and(|(_, transformer_params)| transformer_params.is_some())
}
Self::NamedTuple => class.explicit_bases(db).iter().any(|base| {
base.into_class_literal()
.is_some_and(|c| c.is_known(db, KnownClass::NamedTuple))
}),
}
}
}
/// A specialization of a generic class with a particular assignment of types to typevars.
#[salsa::interned(debug)]
pub struct GenericAlias<'db> {
@ -986,26 +1022,91 @@ impl<'db> ClassLiteral<'db> {
});
if symbol.symbol.is_unbound() {
if let Some(dataclass_member) = self.own_dataclass_member(db, specialization, name) {
return Symbol::bound(dataclass_member).into();
if let Some(synthesized_member) = self.own_synthesized_member(db, specialization, name)
{
return Symbol::bound(synthesized_member).into();
}
}
symbol
}
/// Returns the type of a synthesized dataclass member like `__init__` or `__lt__`.
fn own_dataclass_member(
/// Returns the type of a synthesized dataclass member like `__init__` or `__lt__`, or
/// a synthesized `__new__` method for a `NamedTuple`.
fn own_synthesized_member(
self,
db: &'db dyn Db,
specialization: Option<Specialization<'db>>,
name: &str,
) -> Option<Type<'db>> {
let params = self.dataclass_params(db);
let has_dataclass_param = |param| params.is_some_and(|params| params.contains(param));
let dataclass_params = self.dataclass_params(db);
let has_dataclass_param =
|param| dataclass_params.is_some_and(|params| params.contains(param));
match name {
"__init__" => {
let field_policy = CodeGeneratorKind::from_class(db, self)?;
let signature_from_fields = |mut parameters: Vec<_>| {
for (name, (mut attr_ty, mut default_ty)) in
self.fields(db, specialization, field_policy)
{
// The descriptor handling below is guarded by this fully-static check, because dynamic
// types like `Any` are valid (data) descriptors: since they have all possible attributes,
// they also have a (callable) `__set__` method. The problem is that we can't determine
// the type of the value parameter this way. Instead, we want to use the dynamic type
// itself in this case, so we skip the special descriptor handling.
if attr_ty.is_fully_static(db) {
let dunder_set = attr_ty.class_member(db, "__set__".into());
if let Some(dunder_set) = dunder_set.symbol.ignore_possibly_unbound() {
// This type of this attribute is a data descriptor. Instead of overwriting the
// descriptor attribute, data-classes will (implicitly) call the `__set__` method
// of the descriptor. This means that the synthesized `__init__` parameter for
// this attribute is determined by possible `value` parameter types with which
// the `__set__` method can be called. We build a union of all possible options
// to account for possible overloads.
let mut value_types = UnionBuilder::new(db);
for signature in &dunder_set.signatures(db) {
for overload in signature {
if let Some(value_param) = overload.parameters().get_positional(2) {
value_types = value_types.add(
value_param.annotated_type().unwrap_or_else(Type::unknown),
);
} else if overload.parameters().is_gradual() {
value_types = value_types.add(Type::unknown());
}
}
}
attr_ty = value_types.build();
// The default value of the attribute is *not* determined by the right hand side
// of the class-body assignment. Instead, the runtime invokes `__get__` on the
// descriptor, as if it had been called on the class itself, i.e. it passes `None`
// for the `instance` argument.
if let Some(ref mut default_ty) = default_ty {
*default_ty = default_ty
.try_call_dunder_get(db, Type::none(db), Type::ClassLiteral(self))
.map(|(return_ty, _)| return_ty)
.unwrap_or_else(Type::unknown);
}
}
}
let mut parameter =
Parameter::positional_or_keyword(name).with_annotated_type(attr_ty);
if let Some(default_ty) = default_ty {
parameter = parameter.with_default_type(default_ty);
}
parameters.push(parameter);
}
let signature = Signature::new(Parameters::new(parameters), Some(Type::none(db)));
Some(Type::Callable(CallableType::single(db, signature)))
};
match (field_policy, name) {
(CodeGeneratorKind::DataclassLike, "__init__") => {
let has_synthesized_dunder_init = has_dataclass_param(DataclassParams::INIT)
|| self
.try_metaclass(db)
@ -1015,77 +1116,14 @@ impl<'db> ClassLiteral<'db> {
return None;
}
let mut parameters = vec![];
for (name, (mut attr_ty, mut default_ty)) in
self.dataclass_fields(db, specialization)
{
// The descriptor handling below is guarded by this fully-static check, because dynamic
// types like `Any` are valid (data) descriptors: since they have all possible attributes,
// they also have a (callable) `__set__` method. The problem is that we can't determine
// the type of the value parameter this way. Instead, we want to use the dynamic type
// itself in this case, so we skip the special descriptor handling.
if attr_ty.is_fully_static(db) {
let dunder_set = attr_ty.class_member(db, "__set__".into());
if let Some(dunder_set) = dunder_set.symbol.ignore_possibly_unbound() {
// This type of this attribute is a data descriptor. Instead of overwriting the
// descriptor attribute, data-classes will (implicitly) call the `__set__` method
// of the descriptor. This means that the synthesized `__init__` parameter for
// this attribute is determined by possible `value` parameter types with which
// the `__set__` method can be called. We build a union of all possible options
// to account for possible overloads.
let mut value_types = UnionBuilder::new(db);
for signature in &dunder_set.signatures(db) {
for overload in signature {
if let Some(value_param) =
overload.parameters().get_positional(2)
{
value_types = value_types.add(
value_param
.annotated_type()
.unwrap_or_else(Type::unknown),
);
} else if overload.parameters().is_gradual() {
value_types = value_types.add(Type::unknown());
}
}
}
attr_ty = value_types.build();
// The default value of the attribute is *not* determined by the right hand side
// of the class-body assignment. Instead, the runtime invokes `__get__` on the
// descriptor, as if it had been called on the class itself, i.e. it passes `None`
// for the `instance` argument.
if let Some(ref mut default_ty) = default_ty {
*default_ty = default_ty
.try_call_dunder_get(
db,
Type::none(db),
Type::ClassLiteral(self),
)
.map(|(return_ty, _)| return_ty)
.unwrap_or_else(Type::unknown);
}
}
}
let mut parameter =
Parameter::positional_or_keyword(name).with_annotated_type(attr_ty);
if let Some(default_ty) = default_ty {
parameter = parameter.with_default_type(default_ty);
}
parameters.push(parameter);
}
let init_signature =
Signature::new(Parameters::new(parameters), Some(Type::none(db)));
Some(Type::Callable(CallableType::single(db, init_signature)))
signature_from_fields(vec![])
}
"__lt__" | "__le__" | "__gt__" | "__ge__" => {
(CodeGeneratorKind::NamedTuple, "__new__") => {
let cls_parameter = Parameter::positional_or_keyword(Name::new_static("cls"))
.with_annotated_type(KnownClass::Type.to_instance(db));
signature_from_fields(vec![cls_parameter])
}
(CodeGeneratorKind::DataclassLike, "__lt__" | "__le__" | "__gt__" | "__ge__") => {
if !has_dataclass_param(DataclassParams::ORDER) {
return None;
}
@ -1106,27 +1144,27 @@ impl<'db> ClassLiteral<'db> {
}
}
fn is_dataclass(self, db: &'db dyn Db) -> bool {
self.dataclass_params(db).is_some()
|| self
.try_metaclass(db)
.is_ok_and(|(_, transformer_params)| transformer_params.is_some())
}
/// Returns a list of all annotated attributes defined in this class, or any of its superclasses.
///
/// See [`ClassLiteral::own_dataclass_fields`] for more details.
fn dataclass_fields(
/// See [`ClassLiteral::own_fields`] for more details.
fn fields(
self,
db: &'db dyn Db,
specialization: Option<Specialization<'db>>,
field_policy: CodeGeneratorKind,
) -> FxOrderMap<Name, (Type<'db>, Option<Type<'db>>)> {
let dataclasses_in_mro: Vec<_> = self
if field_policy == CodeGeneratorKind::NamedTuple {
// NamedTuples do not allow multiple inheritance, so it is sufficient to enumerate the
// fields of this class only.
return self.own_fields(db);
}
let matching_classes_in_mro: Vec<_> = self
.iter_mro(db, specialization)
.filter_map(|superclass| {
if let Some(class) = superclass.into_class() {
let class_literal = class.class_literal(db).0;
if class_literal.is_dataclass(db) {
if field_policy.matches(db, class_literal) {
Some(class_literal)
} else {
None
@ -1138,10 +1176,10 @@ impl<'db> ClassLiteral<'db> {
// We need to collect into a `Vec` here because we iterate the MRO in reverse order
.collect();
dataclasses_in_mro
matching_classes_in_mro
.into_iter()
.rev()
.flat_map(|class| class.own_dataclass_fields(db))
.flat_map(|class| class.own_fields(db))
// We collect into a FxOrderMap here to deduplicate attributes
.collect()
}
@ -1157,10 +1195,7 @@ impl<'db> ClassLiteral<'db> {
/// y: str = "a"
/// ```
/// we return a map `{"x": (int, None), "y": (str, Some(Literal["a"]))}`.
fn own_dataclass_fields(
self,
db: &'db dyn Db,
) -> FxOrderMap<Name, (Type<'db>, Option<Type<'db>>)> {
fn own_fields(self, db: &'db dyn Db) -> FxOrderMap<Name, (Type<'db>, Option<Type<'db>>)> {
let mut attributes = FxOrderMap::default();
let class_body_scope = self.body_scope(db);
@ -1925,6 +1960,7 @@ pub enum KnownClass {
TypeVarTuple,
TypeAliasType,
NoDefaultType,
NamedTuple,
NewType,
SupportsIndex,
// Collections
@ -2011,6 +2047,8 @@ impl<'db> KnownClass {
| Self::Float
| Self::Enum
| Self::ABCMeta
// Empty tuples are AlwaysFalse; non-empty tuples are AlwaysTrue
| Self::NamedTuple
// Evaluating `NotImplementedType` in a boolean context was deprecated in Python 3.9
// and raises a `TypeError` in Python >=3.14
// (see https://docs.python.org/3/library/constants.html#NotImplemented)
@ -2071,6 +2109,7 @@ impl<'db> KnownClass {
| Self::TypeVarTuple
| Self::TypeAliasType
| Self::NoDefaultType
| Self::NamedTuple
| Self::NewType
| Self::ChainMap
| Self::Counter
@ -2118,6 +2157,7 @@ impl<'db> KnownClass {
Self::UnionType => "UnionType",
Self::MethodWrapperType => "MethodWrapperType",
Self::WrapperDescriptorType => "WrapperDescriptorType",
Self::NamedTuple => "NamedTuple",
Self::NoneType => "NoneType",
Self::SpecialForm => "_SpecialForm",
Self::TypeVar => "TypeVar",
@ -2305,6 +2345,7 @@ impl<'db> KnownClass {
Self::Any
| Self::SpecialForm
| Self::TypeVar
| Self::NamedTuple
| Self::StdlibAlias
| Self::SupportsIndex => KnownModule::Typing,
Self::TypeAliasType
@ -2397,6 +2438,7 @@ impl<'db> KnownClass {
| Self::Enum
| Self::ABCMeta
| Self::Super
| Self::NamedTuple
| Self::NewType => false,
}
}
@ -2457,6 +2499,7 @@ impl<'db> KnownClass {
| Self::ABCMeta
| Self::Super
| Self::UnionType
| Self::NamedTuple
| Self::NewType => false,
}
}
@ -2498,6 +2541,7 @@ impl<'db> KnownClass {
"UnionType" => Self::UnionType,
"MethodWrapperType" => Self::MethodWrapperType,
"WrapperDescriptorType" => Self::WrapperDescriptorType,
"NamedTuple" => Self::NamedTuple,
"NewType" => Self::NewType,
"TypeAliasType" => Self::TypeAliasType,
"TypeVar" => Self::TypeVar,
@ -2586,6 +2630,7 @@ impl<'db> KnownClass {
| Self::ParamSpecArgs
| Self::ParamSpecKwargs
| Self::TypeVarTuple
| Self::NamedTuple
| Self::NewType => matches!(module, KnownModule::Typing | KnownModule::TypingExtensions),
}
}


@ -72,11 +72,15 @@ impl<'db> ClassBase<'db> {
pub(super) fn try_from_type(db: &'db dyn Db, ty: Type<'db>) -> Option<Self> {
match ty {
Type::Dynamic(dynamic) => Some(Self::Dynamic(dynamic)),
Type::ClassLiteral(literal) => Some(if literal.is_known(db, KnownClass::Any) {
Self::Dynamic(DynamicType::Any)
} else {
Self::Class(literal.default_specialization(db))
}),
Type::ClassLiteral(literal) => {
if literal.is_known(db, KnownClass::Any) {
Some(Self::Dynamic(DynamicType::Any))
} else if literal.is_known(db, KnownClass::NamedTuple) {
Self::try_from_type(db, KnownClass::Tuple.to_class_literal(db))
} else {
Some(Self::Class(literal.default_specialization(db)))
}
}
Type::GenericAlias(generic) => Some(Self::Class(ClassType::Generic(generic))),
Type::NominalInstance(instance)
if instance.class().is_known(db, KnownClass::GenericAlias) =>


@ -59,22 +59,13 @@ type KeyDiagnosticFields = (
Severity,
);
static EXPECTED_TOMLLIB_DIAGNOSTICS: &[KeyDiagnosticFields] = &[
(
DiagnosticId::lint("no-matching-overload"),
Some("/src/tomllib/_parser.py"),
Some(2329..2358),
"No overload of bound method `__init__` matches arguments",
Severity::Error,
),
(
DiagnosticId::lint("unused-ignore-comment"),
Some("/src/tomllib/_parser.py"),
Some(22299..22333),
"Unused blanket `type: ignore` directive",
Severity::Warning,
),
];
static EXPECTED_TOMLLIB_DIAGNOSTICS: &[KeyDiagnosticFields] = &[(
DiagnosticId::lint("unused-ignore-comment"),
Some("/src/tomllib/_parser.py"),
Some(22299..22333),
"Unused blanket `type: ignore` directive",
Severity::Warning,
)];
fn tomllib_path(file: &TestFile) -> SystemPathBuf {
SystemPathBuf::from("src").join(file.name())