ruff/crates/red_knot_python_semantic/resources/mdtest/call/methods.md
David Peter d2e034adcd [red-knot] Method calls and the descriptor protocol (#16121)
## Summary

This PR achieves the following:

* Add support for checking method calls, and inferring return types from
method calls. For example:
  ```py
  reveal_type("abcde".find("abc"))  # revealed: int
  reveal_type("foo".encode(encoding="utf-8"))  # revealed: bytes
  
  "abcde".find(123)  # error: [invalid-argument-type]
  
  class C:
      def f(self) -> int:
          pass
  
  reveal_type(C.f)  # revealed: <function `f`>
  reveal_type(C().f)  # revealed: <bound method: `f` of `C`>
  
  C.f()  # error: [missing-argument]
  reveal_type(C().f())  # revealed: int
  ```
* Implement the descriptor protocol, i.e. properly call the `__get__`
method when a descriptor object is accessed through a class object or an
instance of a class. For example:
  ```py
  from typing import Literal
  
  class Ten:
      def __get__(self, instance: object, owner: type | None = None) -> Literal[10]:
          return 10
  
  class C:
      ten: Ten = Ten()
  
  reveal_type(C.ten)  # revealed: Literal[10]
  reveal_type(C().ten)  # revealed: Literal[10]
  ```
* Add support for member lookup on intersection types.
* Support type inference for `inspect.getattr_static(obj, attr)` calls.
This was mostly used as a debugging tool during development, but seems
more generally useful. It can be used to bypass the descriptor protocol.
For the example above:
  ```py
  from inspect import getattr_static
  
  reveal_type(getattr_static(C, "ten"))  # revealed: Ten
  ```
* Add a new `Type::Callable(…)` variant with the following sub-variants:
  * `Type::Callable(CallableType::BoundMethod(…))`: represents bound
    method objects, e.g. `C().f` above
  * `Type::Callable(CallableType::MethodWrapperDunderGet(…))`: represents
    `f.__get__` where `f` is a function
  * `Type::Callable(WrapperDescriptorDunderGet)`: represents
    `FunctionType.__get__`
* Add new known classes:
  * `types.MethodType`
  * `types.MethodWrapperType`
  * `types.WrapperDescriptorType`
  * `builtins.range`
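
The descriptor example from the summary can also be observed at runtime. The following is a minimal sketch in plain Python (no type checker involved) that mirrors the `Ten` and `C` classes above. The variable names `class_access`, `instance_access` and `raw` are introduced here for illustration only. It shows that ordinary attribute access invokes `__get__`, while `inspect.getattr_static` returns the descriptor object itself:

```python
from inspect import getattr_static
from typing import Literal, Optional


class Ten:
    # Non-data descriptor: only `__get__` is defined.
    def __get__(self, instance: object, owner: Optional[type] = None) -> Literal[10]:
        return 10


class C:
    ten: Ten = Ten()


# Ordinary attribute access goes through the descriptor protocol.
class_access = C.ten
instance_access = C().ten

# `getattr_static` bypasses the protocol and yields the raw descriptor object.
raw = getattr_static(C, "ten")

print(class_access, instance_access, type(raw).__name__)
```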

## Performance analysis

On this branch, we do more work. We need to do more call checking, since
we now check all method calls. We also need to do ~twice as many member
lookups, because we need to check if a `__get__` attribute exists on
accessed members.

A brief analysis on `tomllib` shows that we now call `Type::call` 1780
times, compared to 612 calls before.

## Limitations

* Data descriptors are not yet supported, i.e. we do not infer correct
types for descriptor attribute accesses in `Store` context and do not
check writes to descriptor attributes. I felt like this was something
that could be split out as a follow-up without risking a major
architectural change.
* We currently distinguish between `Type::member` (with descriptor
protocol) and `Type::static_member` (without descriptor protocol). The
former corresponds to `obj.attr`, the latter corresponds to
`getattr_static(obj, "attr")`. However, to model some details correctly,
we would also need to distinguish between a static member lookup *with*
and *without* instance variables. The lookup without instance variables
corresponds to `find_name_in_mro`
[here](https://docs.python.org/3/howto/descriptor.html#invocation-from-an-instance).
We currently approximate both using `Type::static_member`, which leads to two
open TODOs. Changing this would be a larger refactoring of
`Type::own_instance_member`, so I chose to leave it out of this PR.
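
To illustrate the distinction drawn in the second limitation, here is a rough runtime sketch. `find_name_in_mro` is adapted from the pure-Python equivalent in the descriptor guide linked above. The `Example` class and the variable names are hypothetical and only serve this illustration; none of this corresponds to red-knot's actual implementation.

```python
from inspect import getattr_static

_MISSING = object()


def find_name_in_mro(cls: type, name: str, default: object = _MISSING) -> object:
    """Look up `name` in the class hierarchy only, ignoring instance variables."""
    for base in cls.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return default


class Example:
    attr = "class-level"


obj = Example()
obj.__dict__["attr"] = "instance-level"  # shadow the class attribute

# Static lookup *with* instance variables: consults the instance's `__dict__`.
with_instance_vars = getattr_static(obj, "attr")

# Static lookup *without* instance variables: only walks the MRO of the class.
without_instance_vars = find_name_in_mro(type(obj), "attr")

print(with_instance_vars)
print(without_instance_vars)
```

The two lookups disagree as soon as an instance variable shadows a class attribute, which is the case a single static lookup cannot model for both purposes.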

## Test Plan

* New `call/methods.md` test suite for method calls
* New tests in `descriptor_protocol.md`
* New `call/getattr_static.md` test suite for `inspect.getattr_static`
* Various updated tests
2025-02-20 23:22:26 +01:00

Methods

Background: Functions as descriptors

Note: See also this related section in the descriptor guide: Functions and methods.

Say we have a simple class C with a function definition f inside its body:

class C:
    def f(self, x: int) -> str:
        return "a"

Whenever we access the f attribute through the class object itself (C.f) or through an instance (C().f), this access happens via the descriptor protocol. Functions are (non-data) descriptors because they implement a __get__ method. This is crucial in making sure that method calls work as expected. In general, the signature of the __get__ method in the descriptor protocol is __get__(self, instance, owner). The self argument is the descriptor object itself (f). The passed value for the instance argument depends on whether the attribute is accessed from the class object (in which case it is None), or from an instance (in which case it is the instance of type C). The owner argument is the class itself (C of type Literal[C]). To summarize:

  • C.f is equivalent to getattr_static(C, "f").__get__(None, C)
  • C().f is equivalent to getattr_static(C, "f").__get__(C(), C)

Here, inspect.getattr_static is used to bypass the descriptor protocol and directly access the function attribute. The special __get__ method on functions works as follows: if the instance argument is None (the former case), __get__ simply returns the function itself. Otherwise (the latter case), it returns a bound method object:

from inspect import getattr_static

reveal_type(getattr_static(C, "f"))  # revealed: Literal[f]

reveal_type(getattr_static(C, "f").__get__)  # revealed: <method-wrapper `__get__` of `f`>

reveal_type(getattr_static(C, "f").__get__(None, C))  # revealed: Literal[f]
reveal_type(getattr_static(C, "f").__get__(C(), C))  # revealed: <bound method `f` of `C`>

In conclusion, this is why we see the following two types when accessing the f attribute on the class object C and on an instance C():

reveal_type(C.f)  # revealed: Literal[f]
reveal_type(C().f)  # revealed: <bound method `f` of `C`>
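
The same equivalences can be checked at runtime. This is a small sketch in plain Python, with `raw_function`, `via_class` and `via_instance` as names chosen for the illustration:

```python
from inspect import getattr_static
from types import MethodType


class C:
    def f(self, x: int) -> str:
        return "a"


raw_function = getattr_static(C, "f")
instance = C()

# Access through the class: `__get__(None, C)` hands back the function itself.
via_class = raw_function.__get__(None, C)

# Access through an instance: `__get__(instance, C)` creates a bound method.
via_instance = raw_function.__get__(instance, C)

print(type(via_class).__name__)     # function
print(type(via_instance).__name__)  # method
print(isinstance(via_instance, MethodType))
```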

A bound method is a callable object that holds a reference to the instance that it was accessed on (can be inspected via __self__) and to the function object that it wraps (can be inspected via __func__):

bound_method = C().f

reveal_type(bound_method.__self__)  # revealed: C
reveal_type(bound_method.__func__)  # revealed: Literal[f]

When we call the bound method, the instance is implicitly passed as the first argument (self):

reveal_type(C().f(1))  # revealed: str
reveal_type(bound_method(1))  # revealed: str

When we call the function object itself, we need to pass the instance explicitly:

C.f(1)  # error: [missing-argument]

reveal_type(C.f(C(), 1))  # revealed: str
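
As a rough mental model of the behavior described so far, the binding logic can be emulated in pure Python. This sketch is loosely based on the descriptor guide. `PyFunction` and `PyBoundMethod` are hypothetical stand-ins for the built-in function and method types, not how CPython or red-knot implement them:

```python
class PyBoundMethod:
    """Hypothetical stand-in for `types.MethodType`."""

    def __init__(self, func, instance):
        self.__func__ = func
        self.__self__ = instance

    def __call__(self, *args, **kwargs):
        # The stored instance is passed implicitly as the first argument.
        return self.__func__(self.__self__, *args, **kwargs)


class PyFunction:
    """Hypothetical stand-in for a function object with a `__get__` method."""

    def __init__(self, func):
        self._func = func

    def __call__(self, *args, **kwargs):
        return self._func(*args, **kwargs)

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed through the class: return the function-like object itself.
            return self
        # Accessed through an instance: bind the instance.
        return PyBoundMethod(self, instance)


def _f(self, x: int) -> str:
    return "a"


class C:
    f = PyFunction(_f)


bound = C().f
print(bound(1))        # instance passed implicitly
print(C.f(C(), 1))     # instance passed explicitly
```

Calling `C.f(1)` in this model fails with a `TypeError` for the same reason the type checker reports `missing-argument` above: nothing supplies `self`.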

When we access methods from derived classes, they will be bound to instances of the derived class:

class D(C):
    pass

reveal_type(D().f)  # revealed: <bound method `f` of `D`>

If we access an attribute on a bound method object itself, it will defer to types.MethodType:

reveal_type(bound_method.__hash__)  # revealed: <bound method `__hash__` of `MethodType`>

If an attribute is not available on the bound method object, it will be looked up on the underlying function object. We model this explicitly, which means that we can access __kwdefaults__ on bound methods, even though it is not available on types.MethodType:

reveal_type(bound_method.__kwdefaults__)  # revealed: @Todo(generics) | None
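
The fallback to the underlying function can be observed at runtime as well. This is a minimal sketch assuming CPython; the keyword-only parameter `flag` is added here purely so that `__kwdefaults__` has something to show:

```python
import types


class C:
    def f(self, x: int, *, flag: bool = True) -> str:
        return "a"


bound_method = C().f

# `__kwdefaults__` is not among the attributes of `types.MethodType` itself ...
defined_on_method_type = "__kwdefaults__" in dir(types.MethodType)

# ... but the lookup on the bound method falls back to the wrapped function.
kwdefaults = bound_method.__kwdefaults__

print(defined_on_method_type)
print(kwdefaults)
```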

Basic method calls on class objects and instances

class Base:
    def method_on_base(self, x: int | None) -> str:
        return "a"

class Derived(Base):
    def method_on_derived(self, x: bytes) -> tuple[int, str]:
        return (1, "a")

reveal_type(Base().method_on_base(1))  # revealed: str
reveal_type(Base.method_on_base(Base(), 1))  # revealed: str

Base().method_on_base("incorrect")  # error: [invalid-argument-type]
Base().method_on_base()  # error: [missing-argument]
Base().method_on_base(1, 2)  # error: [too-many-positional-arguments]

reveal_type(Derived().method_on_base(1))  # revealed: str
reveal_type(Derived().method_on_derived(b"abc"))  # revealed: tuple[int, str]
reveal_type(Derived.method_on_base(Derived(), 1))  # revealed: str
reveal_type(Derived.method_on_derived(Derived(), b"abc"))  # revealed: tuple[int, str]

Method calls on literals

Boolean literals

reveal_type(True.bit_length())  # revealed: int
reveal_type(True.as_integer_ratio())  # revealed: tuple[int, Literal[1]]

Integer literals

reveal_type((42).bit_length())  # revealed: int

String literals

reveal_type("abcde".find("abc"))  # revealed: int
reveal_type("foo".encode(encoding="utf-8"))  # revealed: bytes

"abcde".find(123)  # error: [invalid-argument-type]

Bytes literals

reveal_type(b"abcde".startswith(b"abc"))  # revealed: bool

Method calls on LiteralString

from typing_extensions import LiteralString

def f(s: LiteralString) -> None:
    reveal_type(s.find("a"))  # revealed: int

Method calls on tuple

def f(t: tuple[int, str]) -> None:
    reveal_type(t.index("a"))  # revealed: int

Method calls on unions

from typing import Any

class A:
    def f(self) -> int:
        return 1

class B:
    def f(self) -> str:
        return "a"

def f(a_or_b: A | B, any_or_a: Any | A):
    reveal_type(a_or_b.f)  # revealed: <bound method `f` of `A`> | <bound method `f` of `B`>
    reveal_type(a_or_b.f())  # revealed: int | str

    reveal_type(any_or_a.f)  # revealed: Any | <bound method `f` of `A`>
    reveal_type(any_or_a.f())  # revealed: Any | int

Method calls on KnownInstance types

[environment]
python-version = "3.12"

type IntOrStr = int | str

reveal_type(IntOrStr.__or__)  # revealed: <bound method `__or__` of `typing.TypeAliasType`>

Error cases: Calling __get__ for methods

The __get__ method on types.FunctionType has the following overloaded signature in typeshed:

from types import FunctionType, MethodType
from typing import overload

@overload
def __get__(self, instance: None, owner: type, /) -> FunctionType: ...
@overload
def __get__(self, instance: object, owner: type | None = None, /) -> MethodType: ...

Here, we test that this signature is enforced correctly:

from inspect import getattr_static

class C:
    def f(self, x: int) -> str:
        return "a"

method_wrapper = getattr_static(C, "f").__get__

reveal_type(method_wrapper)  # revealed: <method-wrapper `__get__` of `f`>

# All of these are fine:
method_wrapper(C(), C)
method_wrapper(C())
method_wrapper(C(), None)
method_wrapper(None, C)

# Passing `None` without an `owner` argument is an
# error: [missing-argument] "No argument provided for required parameter `owner`"
method_wrapper(None)

# Passing something that is not assignable to `type` as the `owner` argument is an
# error: [invalid-argument-type] "Object of type `Literal[1]` cannot be assigned to parameter 2 (`owner`); expected type `type`"
method_wrapper(None, 1)

# Passing `None` as the `owner` argument when `instance` is `None` is an
# error: [invalid-argument-type] "Object of type `None` cannot be assigned to parameter 2 (`owner`); expected type `type`"
method_wrapper(None, None)

# Calling `__get__` without any arguments is an
# error: [missing-argument] "No argument provided for required parameter `instance`"
method_wrapper()

# Calling `__get__` with too many positional arguments is an
# error: [too-many-positional-arguments] "Too many positional arguments: expected 2, got 3"
method_wrapper(C(), C, "one too many")
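
For comparison, the following sketch records which of these calls fail at runtime. It assumes CPython, and the helper `raises_type_error` is a hypothetical name introduced for this illustration. As far as I can tell, the runtime does not validate the type of the `owner` argument, so `method_wrapper(None, 1)` is accepted at runtime even though the typeshed signature rejects it:

```python
class C:
    def f(self, x: int) -> str:
        return "a"


# Fetch the raw function from the class dictionary, bypassing the descriptor protocol.
method_wrapper = C.__dict__["f"].__get__


def raises_type_error(*args: object) -> bool:
    """Return True if calling `method_wrapper(*args)` raises `TypeError`."""
    try:
        method_wrapper(*args)
    except TypeError:
        return True
    return False


# Maps a description of the argument list to whether the call failed.
results = {
    "(C(), C)": raises_type_error(C(), C),
    "(C(),)": raises_type_error(C()),
    "(C(), None)": raises_type_error(C(), None),
    "(None, C)": raises_type_error(None, C),
    "(None,)": raises_type_error(None),
    "(None, 1)": raises_type_error(None, 1),
    "(None, None)": raises_type_error(None, None),
    "()": raises_type_error(),
    "(C(), C, 'one too many')": raises_type_error(C(), C, "one too many"),
}

for call, failed in results.items():
    print(f"method_wrapper{call}: {'TypeError' if failed else 'ok'}")
```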