ruff/crates/ty_python_semantic/mdtest.py
David Peter dfd6ed0524 [ty] mdtests with external dependencies (#20904)
## Summary

This PR adds the ability to write mdtests that specify external
dependencies in a `[project]` section of TOML blocks. For example, here
is a test that makes sure we understand Pydantic's dataclass-transform
setup:

````markdown
```toml
[environment]
python-version = "3.12"
python-platform = "linux"

[project]
dependencies = ["pydantic==2.12.2"]
```

```py
from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str

user = User(id=1, name="Alice")
reveal_type(user.id)  # revealed: int
reveal_type(user.name)  # revealed: str

# error: [missing-argument] "No argument provided for required parameter `name`"
invalid_user = User(id=2)
```
````

## How?

Using the `python-version` and the `dependencies` fields from the
Markdown section, we generate a `pyproject.toml` file, write it to a
temporary directory, and use `uv sync` to install the dependencies into
a virtual environment. We then copy the Python source files from that
venv's `site-packages` folder to a corresponding directory structure in
the in-memory filesystem. Finally, we configure the search paths
accordingly, and run the mdtest as usual.
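
To make this concrete, here is a sketch of what the generated
`pyproject.toml` could look like for the Pydantic example above. The
project name, version, and the exact `requires-python` bound are
assumptions; only the `dependencies` entry comes verbatim from the test:

```toml
[project]
# Hypothetical placeholder name; the actual generator may choose differently:
name = "mdtest"
version = "0.1.0"
# Derived from `python-version = "3.12"` in the [environment] section:
requires-python = ">=3.12"
# Taken verbatim from the test's `dependencies` field:
dependencies = ["pydantic==2.12.2"]
```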

I fully understand that there are valid concerns here:
* Doesn't this require network access? (yes, it does)
* Is this fast enough? (`uv` caching makes this almost unnoticeable,
actually)
* Is this deterministic? ~~(probably not, package resolution can depend
on the platform you're on)~~ (yes, hopefully)

For these reasons, this first version is opt-in. ~~We don't even run
these tests in CI (even though they worked fine in a previous iteration
of this PR).~~ You need to set `MDTEST_EXTERNAL=1`, or use the new
`-e`/`--enable-external` command-line option of the `mdtest.py` runner.
For example:
```bash
# Skip mdtests with external dependencies (default):
uv run crates/ty_python_semantic/mdtest.py

# Run all mdtests, including those with external dependencies:
uv run crates/ty_python_semantic/mdtest.py -e

# Only run the `pydantic` tests. Use `-e` to make sure they are not skipped:
uv run crates/ty_python_semantic/mdtest.py -e pydantic
```

## Why?

I believe that this can be a useful addition to our testing strategy,
one that sits somewhere between ecosystem tests and normal mdtests.
Ecosystem tests cover much more code, but they have the disadvantage
that we only see second- or third-order effects via diagnostic diffs. If
we unexpectedly gain or lose type coverage somewhere, we might not even
notice (assuming the gradual guarantee holds and ecosystem code is
mostly correct). Another disadvantage of ecosystem checks is that they
only exercise checked-in code, which is usually correct. However, we
also want to test what happens on incorrect code, like the code that
momentarily exists in an editor before it is fixed. At the other end of
the spectrum we have normal mdtests, which have the disadvantage that
they do not reflect the reality of complex real-world code. We
experience this whenever we're surprised by an ecosystem report on a PR.

That said, these tests should not be seen as a replacement for either
approach. For example, we should still strive to write detailed,
self-contained mdtests for user-reported issues. But we might use this
new layer for regression tests, or simply as a debugging tool. It can
also serve to document our support for popular third-party libraries.

## Test Plan

* I've been using this locally for a couple of weeks now.
* `uv run crates/ty_python_semantic/mdtest.py -e`

"""A runner for Markdown-based tests for ty"""
# /// script
# requires-python = ">=3.11"
# dependencies = [
# "rich",
# "watchfiles",
# ]
# ///
from __future__ import annotations
import argparse
import json
import os
import subprocess
import sys
import threading
from pathlib import Path
from typing import Final, Literal, assert_never
from rich.console import Console
from watchfiles import Change, watch
CRATE_NAME: Final = "ty_python_semantic"
CRATE_ROOT: Final = Path(__file__).resolve().parent
TY_VENDORED: Final = CRATE_ROOT.parent / "ty_vendored"
DIRS_TO_WATCH: Final = (
CRATE_ROOT,
TY_VENDORED,
CRATE_ROOT.parent / "ty_test/src",
)
MDTEST_DIR: Final = CRATE_ROOT / "resources" / "mdtest"
MDTEST_README: Final = CRATE_ROOT / "resources" / "README.md"


class MDTestRunner:
    mdtest_executable: Path | None
    console: Console
    filters: list[str]
    enable_external: bool

    def __init__(self, filters: list[str] | None, enable_external: bool) -> None:
        self.mdtest_executable = None
        self.console = Console()
        self.filters = [
            f.removesuffix(".md").replace("/", "_").replace("-", "_")
            for f in (filters or [])
        ]
        self.enable_external = enable_external
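
    # Build (but don't run) the mdtest binary via `cargo test --no-run`,
    # returning cargo's combined stdout/stderr.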
    def _run_cargo_test(self, *, message_format: Literal["human", "json"]) -> str:
        return subprocess.check_output(
            [
                "cargo",
                "test",
                "--package",
                CRATE_NAME,
                "--no-run",
                "--color=always",
                "--test=mdtest",
                "--message-format",
                message_format,
            ],
            cwd=CRATE_ROOT,
            env=dict(os.environ, CLI_COLOR="1"),
            stderr=subprocess.STDOUT,
            text=True,
        )
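
    # Recompile the tests, then locate the freshly built mdtest executable.
    # Returns True if compilation succeeded.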
    def _recompile_tests(
        self, status_message: str, *, message_on_success: bool = True
    ) -> bool:
        with self.console.status(status_message):
            # Run it with 'human' format in case there are errors:
            try:
                self._run_cargo_test(message_format="human")
            except subprocess.CalledProcessError as e:
                print(e.output)
                return False

            # Run it again with 'json' format to find the mdtest executable:
            try:
                json_output = self._run_cargo_test(message_format="json")
            except subprocess.CalledProcessError as _:
                # `cargo test` can still fail if something changed in between the two runs.
                # Here we don't have a human-readable output, so just show a generic message:
                self.console.print("[red]Error[/red]: Failed to compile tests")
                return False

            if json_output:
                self._get_executable_path_from_json(json_output)

        if message_on_success:
            self.console.print("[dim]Tests compiled successfully[/dim]")

        return True
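
    # Scan cargo's JSON-line output for the artifact record of the `mdtest`
    # test target and remember the path to its executable.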
    def _get_executable_path_from_json(self, json_output: str) -> None:
        for json_line in json_output.splitlines():
            try:
                data = json.loads(json_line)
            except json.JSONDecodeError:
                continue
            if data.get("target", {}).get("name") == "mdtest":
                self.mdtest_executable = Path(data["executable"])
                break
        else:
            raise RuntimeError(
                "Could not find mdtest executable after successful compilation"
            )
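
    # Run the compiled mdtest binary with the given arguments, forcing colored
    # output and keeping insta from writing snapshot files.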
    def _run_mdtest(
        self, arguments: list[str] | None = None, *, capture_output: bool = False
    ) -> subprocess.CompletedProcess:
        assert self.mdtest_executable is not None
        arguments = arguments or []
        return subprocess.run(
            [self.mdtest_executable, *arguments],
            cwd=CRATE_ROOT,
            env=dict(
                os.environ,
                CLICOLOR_FORCE="1",
                INSTA_FORCE_PASS="1",
                INSTA_OUTPUT="none",
                MDTEST_EXTERNAL="1" if self.enable_external else "0",
            ),
            capture_output=capture_output,
            text=True,
            check=False,
        )
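
    # Turn a Markdown path like 'loops/for.md' into the mangled fragment
    # 'loops_for' used in generated test names.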
    def _mangle_path(self, markdown_file: Path) -> str:
        return (
            markdown_file.as_posix()
            .replace("/", "_")
            .replace("-", "_")
            .removesuffix(".md")
        )
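
    # Run the single test corresponding to `markdown_file` and report the result.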
    def _run_mdtests_for_file(self, markdown_file: Path) -> None:
        path_mangled = self._mangle_path(markdown_file)
        test_name = f"mdtest__{path_mangled}"
        output = self._run_mdtest(["--exact", test_name], capture_output=True)
        if output.returncode == 0:
            if "running 0 tests\n" in output.stdout:
                self.console.log(
                    f"[yellow]Warning[/yellow]: No tests were executed with filter '{test_name}'"
                )
            else:
                self.console.print(
                    f"Test for [bold green]{markdown_file}[/bold green] succeeded"
                )
        else:
            self.console.print()
            self.console.rule(
                f"Test for [bold red]{markdown_file}[/bold red] failed",
                style="gray",
            )
            self._print_trimmed_cargo_test_output(
                output.stdout + output.stderr, test_name
            )

    def _print_trimmed_cargo_test_output(self, output: str, test_name: str) -> None:
        # Skip 'cargo test' boilerplate at the beginning:
        lines = output.splitlines()
        start_index = 0
        for i, line in enumerate(lines):
            if f"{test_name} stdout" in line:
                start_index = i
                break

        for line in lines[start_index + 1 :]:
            if "MDTEST_TEST_FILTER" in line:
                continue
            if line.strip() == "-" * 50:
                # Skip 'cargo test' boilerplate at the end
                break
            print(line)
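
    # Watch the relevant source directories, recompiling and re-running the
    # affected tests whenever Rust code, vendored typeshed stubs, or Markdown
    # test files change. Any line on stdin re-runs the (filtered) test suite.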
    def watch(self):
        def keyboard_input() -> None:
            for _ in sys.stdin:
                # This is silly, but there is no other way to inject events into
                # the main `watch` loop. We use changes to the `README.md` file
                # as a trigger to re-run all mdtests:
                MDTEST_README.touch()

        input_thread = threading.Thread(target=keyboard_input, daemon=True)
        input_thread.start()

        self._recompile_tests("Compiling tests...", message_on_success=False)
        self._run_mdtest(self.filters)

        self.console.print("[dim]Ready to watch for changes...[/dim]")

        for changes in watch(*DIRS_TO_WATCH):
            new_md_files = set()
            changed_md_files = set()

            rust_code_has_changed = False
            vendored_typeshed_has_changed = False

            for change, path_str in changes:
                path = Path(path_str)

                # See above: `README.md` changes trigger a full re-run of all tests
                if path == MDTEST_README:
                    self._run_mdtest(self.filters)
                    continue

                match path.suffix:
                    case ".rs":
                        rust_code_has_changed = True
                    case ".pyi" if path.is_relative_to(TY_VENDORED):
                        vendored_typeshed_has_changed = True
                    case ".md":
                        pass
                    case _:
                        continue

                try:
                    relative_path = Path(path).relative_to(MDTEST_DIR)
                except ValueError:
                    continue

                match change:
                    case Change.added:
                        # When saving a file, some editors (looking at you, Vim) might first
                        # save the file with a temporary name (e.g. `file.md~`) and then rename
                        # it to the final name. This creates a `deleted` and `added` change.
                        # We treat those files as `changed` here.
                        if (Change.deleted, path_str) in changes:
                            changed_md_files.add(relative_path)
                        else:
                            new_md_files.add(relative_path)
                    case Change.modified:
                        changed_md_files.add(relative_path)
                    case Change.deleted:
                        # No need to do anything when a Markdown test is deleted
                        pass
                    case _ as unreachable:
                        assert_never(unreachable)

            if rust_code_has_changed:
                if self._recompile_tests("Rust code has changed, recompiling tests..."):
                    self._run_mdtest(self.filters)
            elif vendored_typeshed_has_changed:
                if self._recompile_tests(
                    "Vendored typeshed has changed, recompiling tests..."
                ):
                    self._run_mdtest(self.filters)
            elif new_md_files:
                files = " ".join(file.as_posix() for file in new_md_files)
                self._recompile_tests(
                    f"New Markdown test [yellow]{files}[/yellow] detected, recompiling tests..."
                )

            for path in new_md_files | changed_md_files:
                self._run_mdtests_for_file(path)


def main() -> None:
    parser = argparse.ArgumentParser(
        description="A runner for Markdown-based tests for ty"
    )
    parser.add_argument(
        "filters",
        nargs="*",
        help="Partial paths or mangled names, e.g., 'loops/for.md' or 'loops_for'",
    )
    parser.add_argument(
        "--enable-external",
        "-e",
        action="store_true",
        help="Enable tests with external dependencies",
    )
    args = parser.parse_args()

    try:
        runner = MDTestRunner(
            filters=args.filters, enable_external=args.enable_external
        )
        runner.watch()
    except KeyboardInterrupt:
        print()


if __name__ == "__main__":
    main()