//! First, some terminology: //! //! * A "place" is semantically a location where a value can be read or written, and syntactically, //! an expression that can be the target of an assignment, e.g. `x`, `x[0]`, `x.y`. (The term is //! borrowed from Rust). In Python syntax, an expression like `f().x` is also allowed as the //! target so it can be called a place, but we do not record declarations / bindings like `f().x: //! int`, `f().x = ...`. Type checking itself can be done by recording only assignments to names, //! but in order to perform type narrowing by attribute/subscript assignments, they must also be //! recorded. //! //! * A "binding" gives a new value to a place. This includes many different Python statements //! (assignment statements of course, but also imports, `def` and `class` statements, `as` //! clauses in `with` and `except` statements, match patterns, and others) and even one //! expression kind (named expressions). It notably does not include annotated assignment //! statements without a right-hand side value; these do not assign any new value to the place. //! We consider function parameters to be bindings as well, since (from the perspective of the //! function's internal scope), a function parameter begins the scope bound to a value. //! //! * A "declaration" establishes an upper bound type for the values that a variable may be //! permitted to take on. Annotated assignment statements (with or without an RHS value) are //! declarations; annotated function parameters are also declarations. We consider `def` and //! `class` statements to also be declarations, so as to prohibit accidentally shadowing them. //! //! Annotated assignments with a right-hand side, and annotated function parameters, are both //! bindings and declarations. //! //! We use [`Definition`] as the universal term (and Salsa tracked struct) encompassing both //! bindings and declarations. (This sacrifices a bit of type safety in exchange for improved //! performance via fewer Salsa tracked structs and queries, since most declarations -- typed //! parameters and annotated assignments with RHS -- are both bindings and declarations.) //! //! At any given use of a variable, we can ask about both its "declared type" and its "inferred //! type". These may be different, but the inferred type must always be assignable to the declared //! type; that is, the declared type is always wider, and the inferred type may be more precise. If //! we see an invalid assignment, we emit a diagnostic and abandon our inferred type, deferring to //! the declared type (this allows an explicit annotation to override bad inference, without a //! cast), maintaining the invariant. //! //! The **inferred type** represents the most precise type we believe encompasses all possible //! values for the variable at a given use. It is based on a union of the bindings which can reach //! that use through some control flow path, and the narrowing constraints that control flow must //! have passed through between the binding and the use. For example, in this code: //! //! ```python //! x = 1 if flag else None //! if x is not None: //! use(x) //! ``` //! //! For the use of `x` on the third line, the inferred type should be `Literal[1]`. This is based //! on the binding on the first line, which assigns the type `Literal[1] | None`, and the narrowing //! constraint on the second line, which rules out the type `None`, since control flow must pass //! through this constraint to reach the use in question. //! //! The **declared type** represents the code author's declaration (usually through a type //! annotation) that a given variable should not be assigned any type outside the declared type. In //! our model, declared types are also control-flow-sensitive; we allow the code author to //! explicitly redeclare the same variable with a different type. So for a given binding of a //! variable, we will want to ask which declarations of that variable can reach that binding, in //! order to determine whether the binding is permitted, or should be a type error. For example: //! //! ```python //! from pathlib import Path //! def f(path: str): //! path: Path = Path(path) //! ``` //! //! In this function, the initial declared type of `path` is `str`, meaning that the assignment //! `path = Path(path)` would be a type error, since it assigns to `path` a value whose type is not //! assignable to `str`. This is the purpose of declared types: they prevent accidental assignment //! of the wrong type to a variable. //! //! But in some cases it is useful to "shadow" or "redeclare" a variable with a new type, and we //! permit this, as long as it is done with an explicit re-annotation. So `path: Path = //! Path(path)`, with the explicit `: Path` annotation, is permitted. //! //! The general rule is that whatever declaration(s) can reach a given binding determine the //! validity of that binding. If there is a path in which the place is not declared, that is a //! declaration of `Unknown`. If multiple declarations can reach a binding, we union them, but by //! default we also issue a type error, since this implicit union of declared types may hide an //! error. //! //! To support type inference, we build a map from each use of a place to the bindings live at //! that use, and the type narrowing constraints that apply to each binding. //! //! Let's take this code sample: //! //! ```python //! x = 1 //! x = 2 //! y = x //! if flag: //! x = 3 //! else: //! x = 4 //! z = x //! ``` //! //! In this snippet, we have four bindings of `x` (the statements assigning `1`, `2`, `3`, and `4` //! to it), and two uses of `x` (the `y = x` and `z = x` assignments). The first binding of `x` //! does not reach any use, because it's immediately replaced by the second binding, before any use //! happens. (A linter could thus flag the statement `x = 1` as likely superfluous.) //! //! The first use of `x` has one live binding: the assignment `x = 2`. //! //! Things get a bit more complex when we have branches. We will definitely take either the `if` or //! the `else` branch. Thus, the second use of `x` has two live bindings: `x = 3` and `x = 4`. The //! `x = 2` assignment is no longer visible, because it must be replaced by either `x = 3` or `x = //! 4`, no matter which branch was taken. We don't know which branch was taken, so we must consider //! both bindings as live, which means eventually we would (in type inference) look at these two //! bindings and infer a type of `Literal[3, 4]` -- the union of `Literal[3]` and `Literal[4]` -- //! for the second use of `x`. //! //! So that's one question our use-def map needs to answer: given a specific use of a place, which //! binding(s) can reach that use. In [`AstIds`](crate::semantic_index::ast_ids::AstIds) we number //! all uses (that means a `Name`/`ExprAttribute`/`ExprSubscript` node with `Load` context) //! so we have a `ScopedUseId` to efficiently represent each use. //! //! We also need to know, for a given definition of a place, what type narrowing constraints apply //! to it. For instance, in this code sample: //! //! ```python //! x = 1 if flag else None //! if x is not None: //! use(x) //! ``` //! //! At the use of `x`, the live binding of `x` is `1 if flag else None`, which would infer as the //! type `Literal[1] | None`. But the constraint `x is not None` dominates this use, which means we //! can rule out the possibility that `x` is `None` here, which should give us the type //! `Literal[1]` for this use. //! //! For declared types, we need to be able to answer the question "given a binding to a place, //! which declarations of that place can reach the binding?" This allows us to emit a diagnostic //! if the binding is attempting to bind a value of a type that is not assignable to the declared //! type for that place, at that point in control flow. //! //! We also need to know, given a declaration of a place, what the inferred type of that place is //! at that point. This allows us to emit a diagnostic in a case like `x = "foo"; x: int`. The //! binding `x = "foo"` occurs before the declaration `x: int`, so according to our //! control-flow-sensitive interpretation of declarations, the assignment is not an error. But the //! declaration is an error, since it would violate the "inferred type must be assignable to //! declared type" rule. //! //! Another case we need to handle is when a place is referenced from a different scope (for //! example, an import or a nonlocal reference). We call this "public" use of a place. For public //! use of a place, we prefer the declared type, if there are any declarations of that place; if //! not, we fall back to the inferred type. So we also need to know which declarations and bindings //! can reach the end of the scope. //! //! Technically, public use of a place could occur from any point in control flow of the scope //! where the place is defined (via inline imports and import cycles, in the case of an import, or //! via a function call partway through the local scope that ends up using a place from the scope //! via a global or nonlocal reference.) But modeling this fully accurately requires whole-program //! analysis that isn't tractable for an efficient analysis, since it means a given place could //! have a different type every place it's referenced throughout the program, depending on the //! shape of arbitrarily-sized call/import graphs. So we follow other Python type checkers in //! making the simplifying assumption that usually the scope will finish execution before its //! places are made visible to other scopes; for instance, most imports will import from a //! complete module, not a partially-executed module. (We may want to get a little smarter than //! this in the future for some closures, but for now this is where we start.) //! //! The data structure we build to answer these questions is the `UseDefMap`. It has a //! `bindings_by_use` vector of [`Bindings`] indexed by [`ScopedUseId`], a //! `declarations_by_binding` vector of [`Declarations`] indexed by [`ScopedDefinitionId`], a //! `bindings_by_declaration` vector of [`Bindings`] indexed by [`ScopedDefinitionId`], and //! `public_bindings` and `public_definitions` vectors indexed by [`ScopedPlaceId`]. The values in //! each of these vectors are (in principle) a list of live bindings at that use/definition, or at //! the end of the scope for that place, with a list of the dominating constraints for each //! binding. //! //! In order to avoid vectors-of-vectors-of-vectors and all the allocations that would entail, we //! don't actually store these "list of visible definitions" as a vector of [`Definition`]. //! Instead, [`Bindings`] and [`Declarations`] are structs which use bit-sets to track //! definitions (and constraints, in the case of bindings) in terms of [`ScopedDefinitionId`] and //! [`ScopedPredicateId`], which are indices into the `all_definitions` and `predicates` //! indexvecs in the [`UseDefMap`]. //! //! There is another special kind of possible "definition" for a place: there might be a path from //! the scope entry to a given use in which the place is never bound. We model this with a special //! "unbound/undeclared" definition (a [`DefinitionState::Undefined`] entry at the start of the //! `all_definitions` vector). If that sentinel definition is present in the live bindings at a //! given use, it means that there is a possible path through control flow in which that place is //! unbound. Similarly, if that sentinel is present in the live declarations, it means that the //! place is (possibly) undeclared. //! //! To build a [`UseDefMap`], the [`UseDefMapBuilder`] is notified of each new use, definition, and //! constraint as they are encountered by the //! [`SemanticIndexBuilder`](crate::semantic_index::builder::SemanticIndexBuilder) AST visit. For //! each place, the builder tracks the `PlaceState` (`Bindings` and `Declarations`) for that place. //! When we hit a use or definition of a place, we record the necessary parts of the current state //! for that place that we need for that use or definition. When we reach the end of the scope, it //! records the state for each place as the public definitions of that place. //! //! ```python //! x = 1 //! x = 2 //! y = x //! if flag: //! x = 3 //! else: //! x = 4 //! z = x //! ``` //! //! Let's walk through the above example. Initially we do not have any record of `x`. When we add //! the new place (before we process the first binding), we create a new undefined `PlaceState` //! which has a single live binding (the "unbound" definition) and a single live declaration (the //! "undeclared" definition). When we see `x = 1`, we record that as the sole live binding of `x`. //! The "unbound" binding is no longer visible. Then we see `x = 2`, and we replace `x = 1` as the //! sole live binding of `x`. When we get to `y = x`, we record that the live bindings for that use //! of `x` are just the `x = 2` definition. //! //! Then we hit the `if` branch. We visit the `test` node (`flag` in this case), since that will //! happen regardless. Then we take a pre-branch snapshot of the current state for all places, //! which we'll need later. Then we record `flag` as a possible constraint on the current binding //! (`x = 2`), and go ahead and visit the `if` body. When we see `x = 3`, it replaces `x = 2` //! (constrained by `flag`) as the sole live binding of `x`. At the end of the `if` body, we take //! another snapshot of the current place state; we'll call this the post-if-body snapshot. //! //! Now we need to visit the `else` clause. The conditions when entering the `else` clause should //! be the pre-if conditions; if we are entering the `else` clause, we know that the `if` test //! failed and we didn't execute the `if` body. So we first reset the builder to the pre-if state, //! using the snapshot we took previously (meaning we now have `x = 2` as the sole binding for `x` //! again), and record a *negative* `flag` constraint for all live bindings (`x = 2`). We then //! visit the `else` clause, where `x = 4` replaces `x = 2` as the sole live binding of `x`. //! //! Now we reach the end of the if/else, and want to visit the following code. The state here needs //! to reflect that we might have gone through the `if` branch, or we might have gone through the //! `else` branch, and we don't know which. So we need to "merge" our current builder state //! (reflecting the end-of-else state, with `x = 4` as the only live binding) with our post-if-body //! snapshot (which has `x = 3` as the only live binding). The result of this merge is that we now //! have two live bindings of `x`: `x = 3` and `x = 4`. //! //! Another piece of information that the `UseDefMap` needs to provide are reachability constraints. //! See [`reachability_constraints.rs`] for more details, in particular how they apply to bindings. //! //! The [`UseDefMapBuilder`] itself just exposes methods for taking a snapshot, resetting to a //! snapshot, and merging a snapshot into the current state. The logic using these methods lives in //! [`SemanticIndexBuilder`](crate::semantic_index::builder::SemanticIndexBuilder), e.g. where it //! visits a `StmtIf` node. use ruff_index::{IndexVec, newtype_index}; use rustc_hash::FxHashMap; use self::place_state::{ Bindings, Declarations, EagerSnapshot, LiveBindingsIterator, LiveDeclaration, LiveDeclarationsIterator, PlaceState, ScopedDefinitionId, }; use crate::node_key::NodeKey; use crate::place::BoundnessAnalysis; use crate::semantic_index::ast_ids::ScopedUseId; use crate::semantic_index::definition::{Definition, DefinitionState}; use crate::semantic_index::narrowing_constraints::{ ConstraintKey, NarrowingConstraints, NarrowingConstraintsBuilder, NarrowingConstraintsIterator, }; use crate::semantic_index::place::{ FileScopeId, PlaceExpr, PlaceExprWithFlags, ScopeKind, ScopedPlaceId, }; use crate::semantic_index::predicate::{ Predicate, PredicateOrLiteral, Predicates, PredicatesBuilder, ScopedPredicateId, }; use crate::semantic_index::reachability_constraints::{ ReachabilityConstraints, ReachabilityConstraintsBuilder, ScopedReachabilityConstraintId, }; use crate::semantic_index::use_def::place_state::PreviousDefinitions; use crate::semantic_index::{EagerSnapshotResult, SemanticIndex}; use crate::types::{IntersectionBuilder, Truthiness, Type, infer_narrowing_constraint}; mod place_state; /// Applicable definitions and constraints for every use of a name. #[derive(Debug, PartialEq, Eq, salsa::Update, get_size2::GetSize)] pub(crate) struct UseDefMap<'db> { /// Array of [`Definition`] in this scope. Only the first entry should be [`DefinitionState::Undefined`]; /// this represents the implicit "unbound"/"undeclared" definition of every place. all_definitions: IndexVec>, /// Array of predicates in this scope. predicates: Predicates<'db>, /// Array of narrowing constraints in this scope. narrowing_constraints: NarrowingConstraints, /// Array of reachability constraints in this scope. reachability_constraints: ReachabilityConstraints, /// [`Bindings`] reaching a [`ScopedUseId`]. bindings_by_use: IndexVec, /// Tracks whether or not a given AST node is reachable from the start of the scope. node_reachability: FxHashMap, /// If the definition is a binding (only) -- `x = 1` for example -- then we need /// [`Declarations`] to know whether this binding is permitted by the live declarations. /// /// If the definition is both a declaration and a binding -- `x: int = 1` for example -- then /// we don't actually need anything here, all we'll need to validate is that our own RHS is a /// valid assignment to our own annotation. declarations_by_binding: FxHashMap, Declarations>, /// If the definition is a declaration (only) -- `x: int` for example -- then we need /// [`Bindings`] to know whether this declaration is consistent with the previously /// inferred type. /// /// If the definition is both a declaration and a binding -- `x: int = 1` for example -- then /// we don't actually need anything here, all we'll need to validate is that our own RHS is a /// valid assignment to our own annotation. bindings_by_declaration: FxHashMap, Bindings>, /// [`PlaceState`] visible at end of scope for each place. end_of_scope_places: IndexVec, /// All potentially reachable bindings and declarations, for each place. reachable_definitions: IndexVec, /// Snapshot of bindings in this scope that can be used to resolve a reference in a nested /// eager scope. eager_snapshots: EagerSnapshots, /// Whether or not the end of the scope is reachable. /// /// This is used to check if the function can implicitly return `None`. /// For example: /// ```py /// def f(cond: bool) -> int | None: /// if cond: /// return 1 /// /// def g() -> int: /// if True: /// return 1 /// ``` /// /// Function `f` may implicitly return `None`, but `g` cannot. /// /// This is used by [`UseDefMap::can_implicitly_return_none`]. end_of_scope_reachability: ScopedReachabilityConstraintId, } pub(crate) enum ApplicableConstraints<'map, 'db> { UnboundBinding(ConstraintsIterator<'map, 'db>), ConstrainedBindings(BindingWithConstraintsIterator<'map, 'db>), } impl<'db> UseDefMap<'db> { pub(crate) fn bindings_at_use( &self, use_id: ScopedUseId, ) -> BindingWithConstraintsIterator<'_, 'db> { self.bindings_iterator( &self.bindings_by_use[use_id], BoundnessAnalysis::BasedOnUnboundVisibility, ) } pub(crate) fn applicable_constraints( &self, constraint_key: ConstraintKey, enclosing_scope: FileScopeId, expr: &PlaceExpr, index: &'db SemanticIndex, ) -> ApplicableConstraints<'_, 'db> { match constraint_key { ConstraintKey::NarrowingConstraint(constraint) => { ApplicableConstraints::UnboundBinding(ConstraintsIterator { predicates: &self.predicates, constraint_ids: self.narrowing_constraints.iter_predicates(constraint), }) } ConstraintKey::EagerNestedScope(nested_scope) => { let EagerSnapshotResult::FoundBindings(bindings) = index.eager_snapshot(enclosing_scope, expr, nested_scope) else { unreachable!( "The result of `SemanticIndex::eager_snapshot` must be `FoundBindings`" ) }; ApplicableConstraints::ConstrainedBindings(bindings) } ConstraintKey::UseId(use_id) => { ApplicableConstraints::ConstrainedBindings(self.bindings_at_use(use_id)) } } } pub(super) fn is_reachable( &self, db: &dyn crate::Db, reachability: ScopedReachabilityConstraintId, ) -> bool { self.reachability_constraints .evaluate(db, &self.predicates, reachability) .may_be_true() } /// Check whether or not a given expression is reachable from the start of the scope. This /// is a local analysis which does not capture the possibility that the entire scope might /// be unreachable. Use [`super::SemanticIndex::is_node_reachable`] for the global /// analysis. #[track_caller] pub(super) fn is_node_reachable(&self, db: &dyn crate::Db, node_key: NodeKey) -> bool { self .reachability_constraints .evaluate( db, &self.predicates, *self .node_reachability .get(&node_key) .expect("`is_node_reachable` should only be called on AST nodes with recorded reachability"), ) .may_be_true() } pub(crate) fn end_of_scope_bindings( &self, place: ScopedPlaceId, ) -> BindingWithConstraintsIterator<'_, 'db> { self.bindings_iterator( self.end_of_scope_places[place].bindings(), BoundnessAnalysis::BasedOnUnboundVisibility, ) } pub(crate) fn all_reachable_bindings( &self, place: ScopedPlaceId, ) -> BindingWithConstraintsIterator<'_, 'db> { self.bindings_iterator( &self.reachable_definitions[place].bindings, BoundnessAnalysis::AssumeBound, ) } pub(crate) fn eager_snapshot( &self, eager_bindings: ScopedEagerSnapshotId, ) -> EagerSnapshotResult<'_, 'db> { match self.eager_snapshots.get(eager_bindings) { Some(EagerSnapshot::Constraint(constraint)) => { EagerSnapshotResult::FoundConstraint(*constraint) } Some(EagerSnapshot::Bindings(bindings)) => EagerSnapshotResult::FoundBindings( self.bindings_iterator(bindings, BoundnessAnalysis::BasedOnUnboundVisibility), ), None => EagerSnapshotResult::NotFound, } } pub(crate) fn bindings_at_declaration( &self, declaration: Definition<'db>, ) -> BindingWithConstraintsIterator<'_, 'db> { self.bindings_iterator( &self.bindings_by_declaration[&declaration], BoundnessAnalysis::BasedOnUnboundVisibility, ) } pub(crate) fn declarations_at_binding( &self, binding: Definition<'db>, ) -> DeclarationsIterator<'_, 'db> { self.declarations_iterator( &self.declarations_by_binding[&binding], BoundnessAnalysis::BasedOnUnboundVisibility, ) } pub(crate) fn end_of_scope_declarations<'map>( &'map self, place: ScopedPlaceId, ) -> DeclarationsIterator<'map, 'db> { let declarations = self.end_of_scope_places[place].declarations(); self.declarations_iterator(declarations, BoundnessAnalysis::BasedOnUnboundVisibility) } pub(crate) fn all_reachable_declarations( &self, place: ScopedPlaceId, ) -> DeclarationsIterator<'_, 'db> { let declarations = &self.reachable_definitions[place].declarations; self.declarations_iterator(declarations, BoundnessAnalysis::AssumeBound) } pub(crate) fn all_end_of_scope_declarations<'map>( &'map self, ) -> impl Iterator)> + 'map { (0..self.end_of_scope_places.len()) .map(ScopedPlaceId::from_usize) .map(|place_id| (place_id, self.end_of_scope_declarations(place_id))) } pub(crate) fn all_end_of_scope_bindings<'map>( &'map self, ) -> impl Iterator)> + 'map { (0..self.end_of_scope_places.len()) .map(ScopedPlaceId::from_usize) .map(|place_id| (place_id, self.end_of_scope_bindings(place_id))) } /// This function is intended to be called only once inside `TypeInferenceBuilder::infer_function_body`. pub(crate) fn can_implicitly_return_none(&self, db: &dyn crate::Db) -> bool { !self .reachability_constraints .evaluate(db, &self.predicates, self.end_of_scope_reachability) .is_always_false() } pub(crate) fn is_binding_reachable( &self, db: &dyn crate::Db, binding: &BindingWithConstraints<'_, 'db>, ) -> Truthiness { self.reachability_constraints.evaluate( db, &self.predicates, binding.reachability_constraint, ) } fn bindings_iterator<'map>( &'map self, bindings: &'map Bindings, boundness_analysis: BoundnessAnalysis, ) -> BindingWithConstraintsIterator<'map, 'db> { BindingWithConstraintsIterator { all_definitions: &self.all_definitions, predicates: &self.predicates, narrowing_constraints: &self.narrowing_constraints, reachability_constraints: &self.reachability_constraints, boundness_analysis, inner: bindings.iter(), } } fn declarations_iterator<'map>( &'map self, declarations: &'map Declarations, boundness_analysis: BoundnessAnalysis, ) -> DeclarationsIterator<'map, 'db> { DeclarationsIterator { all_definitions: &self.all_definitions, predicates: &self.predicates, reachability_constraints: &self.reachability_constraints, boundness_analysis, inner: declarations.iter(), } } } /// Uniquely identifies a snapshot of a place state that can be used to resolve a reference in a /// nested eager scope. /// /// An eager scope has its entire body executed immediately at the location where it is defined. /// For any free references in the nested scope, we use the bindings that are visible at the point /// where the nested scope is defined, instead of using the public type of the place. /// /// There is a unique ID for each distinct [`EagerSnapshotKey`] in the file. #[newtype_index] #[derive(get_size2::GetSize)] pub(crate) struct ScopedEagerSnapshotId; #[derive(Clone, Copy, Debug, Eq, Hash, PartialEq, get_size2::GetSize)] pub(crate) struct EagerSnapshotKey { /// The enclosing scope containing the bindings pub(crate) enclosing_scope: FileScopeId, /// The referenced place (in the enclosing scope) pub(crate) enclosing_place: ScopedPlaceId, /// The nested eager scope containing the reference pub(crate) nested_scope: FileScopeId, } /// A snapshot of place states that can be used to resolve a reference in a nested eager scope. type EagerSnapshots = IndexVec; #[derive(Debug)] pub(crate) struct BindingWithConstraintsIterator<'map, 'db> { all_definitions: &'map IndexVec>, pub(crate) predicates: &'map Predicates<'db>, pub(crate) narrowing_constraints: &'map NarrowingConstraints, pub(crate) reachability_constraints: &'map ReachabilityConstraints, pub(crate) boundness_analysis: BoundnessAnalysis, inner: LiveBindingsIterator<'map>, } impl<'map, 'db> Iterator for BindingWithConstraintsIterator<'map, 'db> { type Item = BindingWithConstraints<'map, 'db>; fn next(&mut self) -> Option { let predicates = self.predicates; let narrowing_constraints = self.narrowing_constraints; self.inner .next() .map(|live_binding| BindingWithConstraints { binding: self.all_definitions[live_binding.binding], narrowing_constraint: ConstraintsIterator { predicates, constraint_ids: narrowing_constraints .iter_predicates(live_binding.narrowing_constraint), }, reachability_constraint: live_binding.reachability_constraint, }) } } impl std::iter::FusedIterator for BindingWithConstraintsIterator<'_, '_> {} pub(crate) struct BindingWithConstraints<'map, 'db> { pub(crate) binding: DefinitionState<'db>, pub(crate) narrowing_constraint: ConstraintsIterator<'map, 'db>, pub(crate) reachability_constraint: ScopedReachabilityConstraintId, } pub(crate) struct ConstraintsIterator<'map, 'db> { predicates: &'map Predicates<'db>, constraint_ids: NarrowingConstraintsIterator<'map>, } impl<'db> Iterator for ConstraintsIterator<'_, 'db> { type Item = Predicate<'db>; fn next(&mut self) -> Option { self.constraint_ids .next() .map(|narrowing_constraint| self.predicates[narrowing_constraint.predicate()]) } } impl std::iter::FusedIterator for ConstraintsIterator<'_, '_> {} impl<'db> ConstraintsIterator<'_, 'db> { pub(crate) fn narrow( self, db: &'db dyn crate::Db, base_ty: Type<'db>, place: ScopedPlaceId, ) -> Type<'db> { let constraint_tys: Vec<_> = self .filter_map(|constraint| infer_narrowing_constraint(db, constraint, place)) .collect(); if constraint_tys.is_empty() { base_ty } else { constraint_tys .into_iter() .rev() .fold( IntersectionBuilder::new(db).add_positive(base_ty), IntersectionBuilder::add_positive, ) .build() } } } #[derive(Clone)] pub(crate) struct DeclarationsIterator<'map, 'db> { all_definitions: &'map IndexVec>, pub(crate) predicates: &'map Predicates<'db>, pub(crate) reachability_constraints: &'map ReachabilityConstraints, pub(crate) boundness_analysis: BoundnessAnalysis, inner: LiveDeclarationsIterator<'map>, } pub(crate) struct DeclarationWithConstraint<'db> { pub(crate) declaration: DefinitionState<'db>, pub(crate) reachability_constraint: ScopedReachabilityConstraintId, } impl<'db> Iterator for DeclarationsIterator<'_, 'db> { type Item = DeclarationWithConstraint<'db>; fn next(&mut self) -> Option { self.inner.next().map( |LiveDeclaration { declaration, reachability_constraint, }| { DeclarationWithConstraint { declaration: self.all_definitions[*declaration], reachability_constraint: *reachability_constraint, } }, ) } } impl std::iter::FusedIterator for DeclarationsIterator<'_, '_> {} #[derive(Debug, PartialEq, Eq, salsa::Update, get_size2::GetSize)] struct ReachableDefinitions { bindings: Bindings, declarations: Declarations, } /// A snapshot of the definitions and constraints state at a particular point in control flow. #[derive(Clone, Debug)] pub(super) struct FlowSnapshot { place_states: IndexVec, reachability: ScopedReachabilityConstraintId, } #[derive(Debug)] pub(super) struct UseDefMapBuilder<'db> { /// Append-only array of [`DefinitionState`]. all_definitions: IndexVec>, /// Builder of predicates. pub(super) predicates: PredicatesBuilder<'db>, /// Builder of narrowing constraints. pub(super) narrowing_constraints: NarrowingConstraintsBuilder, /// Builder of reachability constraints. pub(super) reachability_constraints: ReachabilityConstraintsBuilder, /// Live bindings at each so-far-recorded use. bindings_by_use: IndexVec, /// Tracks whether or not the current point in control flow is reachable from the /// start of the scope. pub(super) reachability: ScopedReachabilityConstraintId, /// Tracks whether or not a given AST node is reachable from the start of the scope. node_reachability: FxHashMap, /// Live declarations for each so-far-recorded binding. declarations_by_binding: FxHashMap, Declarations>, /// Live bindings for each so-far-recorded declaration. bindings_by_declaration: FxHashMap, Bindings>, /// Currently live bindings and declarations for each place. place_states: IndexVec, /// All potentially reachable bindings and declarations, for each place. reachable_definitions: IndexVec, /// Snapshots of place states in this scope that can be used to resolve a reference in a /// nested eager scope. eager_snapshots: EagerSnapshots, /// Is this a class scope? is_class_scope: bool, } impl<'db> UseDefMapBuilder<'db> { pub(super) fn new(is_class_scope: bool) -> Self { Self { all_definitions: IndexVec::from_iter([DefinitionState::Undefined]), predicates: PredicatesBuilder::default(), narrowing_constraints: NarrowingConstraintsBuilder::default(), reachability_constraints: ReachabilityConstraintsBuilder::default(), bindings_by_use: IndexVec::new(), reachability: ScopedReachabilityConstraintId::ALWAYS_TRUE, node_reachability: FxHashMap::default(), declarations_by_binding: FxHashMap::default(), bindings_by_declaration: FxHashMap::default(), place_states: IndexVec::new(), reachable_definitions: IndexVec::new(), eager_snapshots: EagerSnapshots::default(), is_class_scope, } } pub(super) fn mark_unreachable(&mut self) { self.reachability = ScopedReachabilityConstraintId::ALWAYS_FALSE; for state in &mut self.place_states { state.record_reachability_constraint( &mut self.reachability_constraints, ScopedReachabilityConstraintId::ALWAYS_FALSE, ); } } pub(super) fn add_place(&mut self, place: ScopedPlaceId) { let new_place = self .place_states .push(PlaceState::undefined(self.reachability)); debug_assert_eq!(place, new_place); let new_place = self.reachable_definitions.push(ReachableDefinitions { bindings: Bindings::unbound(self.reachability), declarations: Declarations::undeclared(self.reachability), }); debug_assert_eq!(place, new_place); } pub(super) fn record_binding( &mut self, place: ScopedPlaceId, binding: Definition<'db>, is_place_name: bool, ) { let def_id = self.all_definitions.push(DefinitionState::Defined(binding)); let place_state = &mut self.place_states[place]; self.declarations_by_binding .insert(binding, place_state.declarations().clone()); place_state.record_binding( def_id, self.reachability, self.is_class_scope, is_place_name, ); self.reachable_definitions[place].bindings.record_binding( def_id, self.reachability, self.is_class_scope, is_place_name, PreviousDefinitions::AreKept, ); } pub(super) fn add_predicate( &mut self, predicate: PredicateOrLiteral<'db>, ) -> ScopedPredicateId { match predicate { PredicateOrLiteral::Predicate(predicate) => self.predicates.add_predicate(predicate), PredicateOrLiteral::Literal(true) => ScopedPredicateId::ALWAYS_TRUE, PredicateOrLiteral::Literal(false) => ScopedPredicateId::ALWAYS_FALSE, } } pub(super) fn record_narrowing_constraint(&mut self, predicate: ScopedPredicateId) { if predicate == ScopedPredicateId::ALWAYS_TRUE || predicate == ScopedPredicateId::ALWAYS_FALSE { // No need to record a narrowing constraint for `True` or `False`. return; } let narrowing_constraint = predicate.into(); for state in &mut self.place_states { state .record_narrowing_constraint(&mut self.narrowing_constraints, narrowing_constraint); } } /// Snapshot the state of a single place at the current point in control flow. /// /// This is only used for `*`-import reachability constraints, which are handled differently /// to most other reachability constraints. See the doc-comment for /// [`Self::record_and_negate_star_import_reachability_constraint`] for more details. pub(super) fn single_place_snapshot(&self, place: ScopedPlaceId) -> PlaceState { self.place_states[place].clone() } /// This method exists solely for handling `*`-import reachability constraints. /// /// The reason why we add reachability constraints for [`Definition`]s created by `*` imports /// is laid out in the doc-comment for `StarImportPlaceholderPredicate`. But treating these /// reachability constraints in the use-def map the same way as all other reachability constraints /// was shown to lead to [significant regressions] for small codebases where typeshed /// dominates. (Although `*` imports are not common generally, they are used in several /// important places by typeshed.) /// /// To solve these regressions, it was observed that we could do significantly less work for /// `*`-import definitions. We do a number of things differently here to our normal handling of /// reachability constraints: /// /// - We only apply and negate the reachability constraints to a single symbol, rather than to /// all symbols. This is possible here because, unlike most definitions, we know in advance that /// exactly one definition occurs inside the "if-true" predicate branch, and we know exactly /// which definition it is. /// /// - We only snapshot the state for a single place prior to the definition, rather than doing /// expensive calls to [`Self::snapshot`]. Again, this is possible because we know /// that only a single definition occurs inside the "if-predicate-true" predicate branch. /// /// - Normally we take care to check whether an "if-predicate-true" branch or an /// "if-predicate-false" branch contains a terminal statement: these can affect the reachability /// of symbols defined inside either branch. However, in the case of `*`-import definitions, /// this is unnecessary (and therefore not done in this method), since we know that a `*`-import /// predicate cannot create a terminal statement inside either branch. /// /// [significant regressions]: https://github.com/astral-sh/ruff/pull/17286#issuecomment-2786755746 pub(super) fn record_and_negate_star_import_reachability_constraint( &mut self, reachability_id: ScopedReachabilityConstraintId, symbol: ScopedPlaceId, pre_definition_state: PlaceState, ) { let negated_reachability_id = self .reachability_constraints .add_not_constraint(reachability_id); let mut post_definition_state = std::mem::replace(&mut self.place_states[symbol], pre_definition_state); post_definition_state .record_reachability_constraint(&mut self.reachability_constraints, reachability_id); self.place_states[symbol].record_reachability_constraint( &mut self.reachability_constraints, negated_reachability_id, ); self.place_states[symbol].merge( post_definition_state, &mut self.narrowing_constraints, &mut self.reachability_constraints, ); } pub(super) fn record_reachability_constraint( &mut self, constraint: ScopedReachabilityConstraintId, ) { self.reachability = self .reachability_constraints .add_and_constraint(self.reachability, constraint); for state in &mut self.place_states { state.record_reachability_constraint(&mut self.reachability_constraints, constraint); } } pub(super) fn record_declaration( &mut self, place: ScopedPlaceId, declaration: Definition<'db>, ) { let def_id = self .all_definitions .push(DefinitionState::Defined(declaration)); let place_state = &mut self.place_states[place]; self.bindings_by_declaration .insert(declaration, place_state.bindings().clone()); place_state.record_declaration(def_id, self.reachability); self.reachable_definitions[place] .declarations .record_declaration(def_id, self.reachability, PreviousDefinitions::AreKept); } pub(super) fn record_declaration_and_binding( &mut self, place: ScopedPlaceId, definition: Definition<'db>, is_place_name: bool, ) { // We don't need to store anything in self.bindings_by_declaration or // self.declarations_by_binding. let def_id = self .all_definitions .push(DefinitionState::Defined(definition)); let place_state = &mut self.place_states[place]; place_state.record_declaration(def_id, self.reachability); place_state.record_binding( def_id, self.reachability, self.is_class_scope, is_place_name, ); self.reachable_definitions[place] .declarations .record_declaration(def_id, self.reachability, PreviousDefinitions::AreKept); self.reachable_definitions[place].bindings.record_binding( def_id, self.reachability, self.is_class_scope, is_place_name, PreviousDefinitions::AreKept, ); } pub(super) fn delete_binding(&mut self, place: ScopedPlaceId, is_place_name: bool) { let def_id = self.all_definitions.push(DefinitionState::Deleted); let place_state = &mut self.place_states[place]; place_state.record_binding( def_id, self.reachability, self.is_class_scope, is_place_name, ); } pub(super) fn record_use( &mut self, place: ScopedPlaceId, use_id: ScopedUseId, node_key: NodeKey, ) { // We have a use of a place; clone the current bindings for that place, and record them // as the live bindings for this use. let new_use = self .bindings_by_use .push(self.place_states[place].bindings().clone()); debug_assert_eq!(use_id, new_use); // Track reachability of all uses of places to silence `unresolved-reference` // diagnostics in unreachable code. self.record_node_reachability(node_key); } pub(super) fn record_node_reachability(&mut self, node_key: NodeKey) { self.node_reachability.insert(node_key, self.reachability); } pub(super) fn snapshot_eager_state( &mut self, enclosing_place: ScopedPlaceId, scope: ScopeKind, enclosing_place_expr: &PlaceExprWithFlags, ) -> ScopedEagerSnapshotId { // Names bound in class scopes are never visible to nested scopes (but attributes/subscripts are visible), // so we never need to save eager scope bindings in a class scope. if (scope.is_class() && enclosing_place_expr.is_name()) || !enclosing_place_expr.is_bound() { self.eager_snapshots.push(EagerSnapshot::Constraint( self.place_states[enclosing_place] .bindings() .unbound_narrowing_constraint(), )) } else { self.eager_snapshots.push(EagerSnapshot::Bindings( self.place_states[enclosing_place].bindings().clone(), )) } } /// Take a snapshot of the current visible-places state. pub(super) fn snapshot(&self) -> FlowSnapshot { FlowSnapshot { place_states: self.place_states.clone(), reachability: self.reachability, } } /// Restore the current builder places state to the given snapshot. pub(super) fn restore(&mut self, snapshot: FlowSnapshot) { // We never remove places from `place_states` (it's an IndexVec, and the place // IDs must line up), so the current number of known places must always be equal to or // greater than the number of known places in a previously-taken snapshot. let num_places = self.place_states.len(); debug_assert!(num_places >= snapshot.place_states.len()); // Restore the current visible-definitions state to the given snapshot. self.place_states = snapshot.place_states; self.reachability = snapshot.reachability; // If the snapshot we are restoring is missing some places we've recorded since, we need // to fill them in so the place IDs continue to line up. Since they don't exist in the // snapshot, the correct state to fill them in with is "undefined". self.place_states .resize(num_places, PlaceState::undefined(self.reachability)); } /// Merge the given snapshot into the current state, reflecting that we might have taken either /// path to get here. The new state for each place should include definitions from both the /// prior state and the snapshot. pub(super) fn merge(&mut self, snapshot: FlowSnapshot) { // As an optimization, if we know statically that either of the snapshots is always // unreachable, we can leave it out of the merged result entirely. Note that we cannot // perform any type inference at this point, so this is largely limited to unreachability // via terminal statements. If a flow's reachability depends on an expression in the code, // we will include the flow in the merged result; the reachability constraints of its // bindings will include this reachability condition, so that later during type inference, // we can determine whether any particular binding is non-visible due to unreachability. if snapshot.reachability == ScopedReachabilityConstraintId::ALWAYS_FALSE { return; } if self.reachability == ScopedReachabilityConstraintId::ALWAYS_FALSE { self.restore(snapshot); return; } // We never remove places from `place_states` (it's an IndexVec, and the place // IDs must line up), so the current number of known places must always be equal to or // greater than the number of known places in a previously-taken snapshot. debug_assert!(self.place_states.len() >= snapshot.place_states.len()); let mut snapshot_definitions_iter = snapshot.place_states.into_iter(); for current in &mut self.place_states { if let Some(snapshot) = snapshot_definitions_iter.next() { current.merge( snapshot, &mut self.narrowing_constraints, &mut self.reachability_constraints, ); } else { current.merge( PlaceState::undefined(snapshot.reachability), &mut self.narrowing_constraints, &mut self.reachability_constraints, ); // Place not present in snapshot, so it's unbound/undeclared from that path. } } self.reachability = self .reachability_constraints .add_or_constraint(self.reachability, snapshot.reachability); } pub(super) fn finish(mut self) -> UseDefMap<'db> { self.all_definitions.shrink_to_fit(); self.place_states.shrink_to_fit(); self.reachable_definitions.shrink_to_fit(); self.bindings_by_use.shrink_to_fit(); self.node_reachability.shrink_to_fit(); self.declarations_by_binding.shrink_to_fit(); self.bindings_by_declaration.shrink_to_fit(); self.eager_snapshots.shrink_to_fit(); UseDefMap { all_definitions: self.all_definitions, predicates: self.predicates.build(), narrowing_constraints: self.narrowing_constraints.build(), reachability_constraints: self.reachability_constraints.build(), bindings_by_use: self.bindings_by_use, node_reachability: self.node_reachability, end_of_scope_places: self.place_states, reachable_definitions: self.reachable_definitions, declarations_by_binding: self.declarations_by_binding, bindings_by_declaration: self.bindings_by_declaration, eager_snapshots: self.eager_snapshots, end_of_scope_reachability: self.reachability, } } }