ruff/crates/ty_python_semantic/src/semantic_index/use_def.rs

//! First, some terminology:
//!
//! * A "place" is semantically a location where a value can be read or written, and syntactically,
//!   an expression that can be the target of an assignment, e.g. `x`, `x[0]`, `x.y`. (The term is
//!   borrowed from Rust.) In Python syntax, an expression like `f().x` is also allowed as the
//!   target so it can be called a place, but we do not record declarations / bindings like `f().x:
//!   int`, `f().x = ...`. Type checking itself can be done by recording only assignments to names,
//!   but in order to perform type narrowing by attribute/subscript assignments, they must also be
//!   recorded.
//!
//! * A "binding" gives a new value to a place. This includes many different Python statements
//!   (assignment statements of course, but also imports, `def` and `class` statements, `as`
//!   clauses in `with` and `except` statements, match patterns, and others) and even one
//!   expression kind (named expressions). It notably does not include annotated assignment
//!   statements without a right-hand side value; these do not assign any new value to the place.
//!   We consider function parameters to be bindings as well, since (from the perspective of the
//!   function's internal scope), a function parameter begins the scope bound to a value.
//!
//! * A "declaration" establishes an upper bound type for the values that a variable may be
//!   permitted to take on. Annotated assignment statements (with or without an RHS value) are
//!   declarations; annotated function parameters are also declarations. We consider `def` and
//!   `class` statements to also be declarations, so as to prohibit accidentally shadowing them.
//!
//! Annotated assignments with a right-hand side, and annotated function parameters, are both
//! bindings and declarations.
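//!
//! As a rough illustration of the terminology above (this snippet is a sketch written for this
//! comment, not taken from the test suite), here is how some common statements are classified:
//!
//! ```python
//! import os          # binding
//! x = 1              # binding only
//! y: int             # declaration only; no value is assigned to `y`
//! z: int = 1         # both a binding and a declaration
//!
//! def f(a, b: str):  # `f` is both bound and declared by the `def` statement
//!     ...            # inside `f`: `a` is a binding only, `b` is a binding and a declaration
//! ```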
//!
//! We use [`Definition`] as the universal term (and Salsa tracked struct) encompassing both
//! bindings and declarations. (This sacrifices a bit of type safety in exchange for improved
//! performance via fewer Salsa tracked structs and queries, since most declarations -- typed
//! parameters and annotated assignments with RHS -- are both bindings and declarations.)
//!
//! At any given use of a variable, we can ask about both its "declared type" and its "inferred
//! type". These may be different, but the inferred type must always be assignable to the declared
//! type; that is, the declared type is always wider, and the inferred type may be more precise. If
//! we see an invalid assignment, we emit a diagnostic and abandon our inferred type, deferring to
//! the declared type (this allows an explicit annotation to override bad inference, without a
//! cast), maintaining the invariant.
//!
//! The **inferred type** represents the most precise type we believe encompasses all possible
//! values for the variable at a given use. It is based on a union of the bindings which can reach
//! that use through some control flow path, and the narrowing constraints that control flow must
//! have passed through between the binding and the use. For example, in this code:
//!
//! ```python
//! x = 1 if flag else None
//! if x is not None:
//!     use(x)
//! ```
//!
//! For the use of `x` on the third line, the inferred type should be `Literal[1]`. This is based
//! on the binding on the first line, which assigns the type `Literal[1] | None`, and the narrowing
//! constraint on the second line, which rules out the type `None`, since control flow must pass
//! through this constraint to reach the use in question.
//!
//! The **declared type** represents the code author's declaration (usually through a type
//! annotation) that a given variable should not be assigned any type outside the declared type. In
//! our model, declared types are also control-flow-sensitive; we allow the code author to
//! explicitly redeclare the same variable with a different type. So for a given binding of a
//! variable, we will want to ask which declarations of that variable can reach that binding, in
//! order to determine whether the binding is permitted, or should be a type error. For example:
//!
//! ```python
//! from pathlib import Path
//! def f(path: str):
//!     path: Path = Path(path)
//! ```
//!
//! In this function, the initial declared type of `path` is `str`, meaning that the assignment
//! `path = Path(path)` would be a type error, since it assigns to `path` a value whose type is not
//! assignable to `str`. This is the purpose of declared types: they prevent accidental assignment
//! of the wrong type to a variable.
//!
//! But in some cases it is useful to "shadow" or "redeclare" a variable with a new type, and we
//! permit this, as long as it is done with an explicit re-annotation. So `path: Path =
//! Path(path)`, with the explicit `: Path` annotation, is permitted.
//!
//! The general rule is that whatever declaration(s) can reach a given binding determine the
//! validity of that binding. If there is a path in which the place is not declared, that is a
//! declaration of `Unknown`. If multiple declarations can reach a binding, we union them, but by
//! default we also issue a type error, since this implicit union of declared types may hide an
//! error.
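//!
//! A hypothetical snippet (written for illustration, with made-up names) in which more than one
//! declaration reaches a binding:
//!
//! ```python
//! def f(flag: bool):
//!     if flag:
//!         x: int
//!     else:
//!         x: str
//!     # Both `x: int` and `x: str` reach this binding, so the declared type is the union
//!     # `int | str`, and by default the implicit union is also reported as an error.
//!     x = 1
//!
//!     if flag:
//!         y: int
//!     # `y: int` reaches this binding, and so does the path on which `y` was never declared,
//!     # which counts as a declaration of `Unknown`.
//!     y = 1
//! ```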
//!
//! To support type inference, we build a map from each use of a place to the bindings live at
//! that use, and the type narrowing constraints that apply to each binding.
//!
//! Let's take this code sample:
//!
//! ```python
//! x = 1
//! x = 2
//! y = x
//! if flag:
//!     x = 3
//! else:
//!     x = 4
//! z = x
//! ```
//!
//! In this snippet, we have four bindings of `x` (the statements assigning `1`, `2`, `3`, and `4`
//! to it), and two uses of `x` (the `y = x` and `z = x` assignments). The first binding of `x`
//! does not reach any use, because it's immediately replaced by the second binding, before any use
//! happens. (A linter could thus flag the statement `x = 1` as likely superfluous.)
//!
//! The first use of `x` has one live binding: the assignment `x = 2`.
//!
//! Things get a bit more complex when we have branches. We will definitely take either the `if` or
//! the `else` branch. Thus, the second use of `x` has two live bindings: `x = 3` and `x = 4`. The
//! `x = 2` assignment is no longer visible, because it must be replaced by either `x = 3` or `x =
//! 4`, no matter which branch was taken. We don't know which branch was taken, so we must consider
//! both bindings as live, which means eventually we would (in type inference) look at these two
//! bindings and infer a type of `Literal[3, 4]` -- the union of `Literal[3]` and `Literal[4]` --
//! for the second use of `x`.
//!
//! So that's one question our use-def map needs to answer: given a specific use of a place, which
//! binding(s) can reach that use. In [`AstIds`](crate::semantic_index::ast_ids::AstIds) we number
//! all uses (that means a `Name`/`ExprAttribute`/`ExprSubscript` node with `Load` context)
//! so we have a `ScopedUseId` to efficiently represent each use.
//!
//! We also need to know, for a given definition of a place, what type narrowing constraints apply
//! to it. For instance, in this code sample:
//!
//! ```python
//! x = 1 if flag else None
//! if x is not None:
//!     use(x)
//! ```
//!
//! At the use of `x`, the live binding of `x` is `1 if flag else None`, which would infer as the
//! type `Literal[1] | None`. But the constraint `x is not None` dominates this use, which means we
//! can rule out the possibility that `x` is `None` here, which should give us the type
//! `Literal[1]` for this use.
//!
//! For declared types, we need to be able to answer the question "given a binding to a place,
//! which declarations of that place can reach the binding?" This allows us to emit a diagnostic
//! if the binding is attempting to bind a value of a type that is not assignable to the declared
//! type for that place, at that point in control flow.
//!
//! We also need to know, given a declaration of a place, what the inferred type of that place is
//! at that point. This allows us to emit a diagnostic in a case like `x = "foo"; x: int`. The
//! binding `x = "foo"` occurs before the declaration `x: int`, so according to our
//! control-flow-sensitive interpretation of declarations, the assignment is not an error. But the
//! declaration is an error, since it would violate the "inferred type must be assignable to
//! declared type" rule.
//!
//! Another case we need to handle is when a place is referenced from a different scope (for
//! example, an import or a nonlocal reference). We call this "public" use of a place. For public
//! use of a place, we prefer the declared type, if there are any declarations of that place; if
//! not, we fall back to the inferred type. So we also need to know which declarations and bindings
//! can reach the end of the scope.
//!
//! Technically, public use of a place could occur from any point in control flow of the scope
//! where the place is defined (via inline imports and import cycles, in the case of an import, or
//! via a function call partway through the local scope that ends up using a place from the scope
//! via a global or nonlocal reference). But modeling this fully accurately requires whole-program
//! analysis, which isn't tractable for an efficient type checker, since it means a given place
//! could have a different type at every location where it is referenced throughout the program,
//! depending on the shape of arbitrarily-sized call/import graphs. So we follow other Python type
//! checkers in making the simplifying assumption that usually the scope will finish execution
//! before its places are made visible to other scopes; for instance, most imports will import
//! from a complete module, not a partially-executed module. (We may want to get a little smarter
//! than this in the future for some closures, but for now this is where we start.)
//!
//! The data structure we build to answer these questions is the `UseDefMap`. It has a
//! `bindings_by_use` vector of [`Bindings`] indexed by [`ScopedUseId`], a
//! `declarations_by_binding` map from each binding [`Definition`] to its live [`Declarations`], a
//! `bindings_by_declaration` map from each declaration [`Definition`] to its live [`Bindings`],
//! and an `end_of_scope_places` vector of `PlaceState` indexed by [`ScopedPlaceId`]. The values in
//! each of these is (in principle) a list of live definitions at that use/definition, or at
//! the end of the scope for that place, with a list of the dominating constraints for each
//! binding.
//!
//! In order to avoid vectors-of-vectors-of-vectors and all the allocations that would entail, we
//! don't actually store each "list of visible definitions" as a vector of [`Definition`].
//! Instead, [`Bindings`] and [`Declarations`] are structs which use bit-sets to track
//! definitions (and constraints, in the case of bindings) in terms of [`ScopedDefinitionId`] and
//! [`ScopedPredicateId`], which are indices into the `all_definitions` and `predicates`
//! indexvecs in the [`UseDefMap`].
//!
//! There is another special kind of possible "definition" for a place: there might be a path from
//! the scope entry to a given use in which the place is never bound. We model this with a special
//! "unbound/undeclared" definition (a [`DefinitionState::Undefined`] entry at the start of the
//! `all_definitions` vector). If that sentinel definition is present in the live bindings at a
//! given use, it means that there is a possible path through control flow in which that place is
//! unbound. Similarly, if that sentinel is present in the live declarations, it means that the
//! place is (possibly) undeclared.
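//!
//! For example, in the following sketch (illustrative only), the sentinel is among the live
//! bindings at the use of `x`, because the `if` body may not have executed:
//!
//! ```python
//! if flag:
//!     x = 1
//! use(x)  # live bindings of `x`: `x = 1` and the "unbound" sentinel
//! ```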
//!
//! To build a [`UseDefMap`], the [`UseDefMapBuilder`] is notified of each new use, definition, and
//! constraint as they are encountered by the
//! [`SemanticIndexBuilder`](crate::semantic_index::builder::SemanticIndexBuilder) AST visit. For
//! each place, the builder tracks the `PlaceState` (`Bindings` and `Declarations`) for that place.
//! When we hit a use or definition of a place, we record the parts of the current state for that
//! place that we need for that use or definition. When we reach the end of the scope, we record
//! the state for each place as the public (end-of-scope) definitions of that place.
//!
//! ```python
//! x = 1
//! x = 2
//! y = x
//! if flag:
//!     x = 3
//! else:
//!     x = 4
//! z = x
//! ```
//!
//! Let's walk through the above example. Initially we do not have any record of `x`. When we add
//! the new place (before we process the first binding), we create a new undefined `PlaceState`
//! which has a single live binding (the "unbound" definition) and a single live declaration (the
//! "undeclared" definition). When we see `x = 1`, we record that as the sole live binding of `x`.
//! The "unbound" binding is no longer visible. Then we see `x = 2`, and we replace `x = 1` as the
//! sole live binding of `x`. When we get to `y = x`, we record that the live bindings for that use
//! of `x` are just the `x = 2` definition.
//!
//! Then we hit the `if` branch. We visit the `test` node (`flag` in this case), since that will
//! happen regardless. Then we take a pre-branch snapshot of the current state for all places,
//! which we'll need later. Then we record `flag` as a possible constraint on the current binding
//! (`x = 2`), and go ahead and visit the `if` body. When we see `x = 3`, it replaces `x = 2`
//! (constrained by `flag`) as the sole live binding of `x`. At the end of the `if` body, we take
//! another snapshot of the current place state; we'll call this the post-if-body snapshot.
//!
//! Now we need to visit the `else` clause. The conditions when entering the `else` clause should
//! be the pre-if conditions; if we are entering the `else` clause, we know that the `if` test
//! failed and we didn't execute the `if` body. So we first reset the builder to the pre-if state,
//! using the snapshot we took previously (meaning we now have `x = 2` as the sole binding for `x`
//! again), and record a *negative* `flag` constraint for all live bindings (`x = 2`). We then
//! visit the `else` clause, where `x = 4` replaces `x = 2` as the sole live binding of `x`.
//!
//! Now we reach the end of the if/else, and want to visit the following code. The state here needs
//! to reflect that we might have gone through the `if` branch, or we might have gone through the
//! `else` branch, and we don't know which. So we need to "merge" our current builder state
//! (reflecting the end-of-else state, with `x = 4` as the only live binding) with our post-if-body
//! snapshot (which has `x = 3` as the only live binding). The result of this merge is that we now
//! have two live bindings of `x`: `x = 3` and `x = 4`.
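//!
//! To summarize the walkthrough, here is the same example annotated with the set of live bindings
//! of `x` after each statement (the comments are an informal sketch of the builder state, not
//! actual output):
//!
//! ```python
//! x = 1      # {x = 1}
//! x = 2      # {x = 2}
//! y = x      # this use of `x` records {x = 2}
//! if flag:   # pre-branch snapshot: {x = 2}
//!     x = 3  # {x = 3}; saved as the post-if-body snapshot
//! else:      # reset to the pre-branch snapshot: {x = 2}, with `flag` negated
//!     x = 4  # {x = 4}
//! z = x      # merged with the post-if-body snapshot: this use records {x = 3, x = 4}
//! ```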
//!
//! The `UseDefMap` also needs to provide reachability constraints.
//! See [`reachability_constraints.rs`] for more details, in particular how they apply to bindings.
//!
//! The [`UseDefMapBuilder`] itself just exposes methods for taking a snapshot, resetting to a
//! snapshot, and merging a snapshot into the current state. The logic using these methods lives in
//! [`SemanticIndexBuilder`](crate::semantic_index::builder::SemanticIndexBuilder), e.g. where it
//! visits a `StmtIf` node.

use ruff_index::{IndexVec, newtype_index};
use rustc_hash::FxHashMap;

use self::place_state::{
    Bindings, Declarations, EagerSnapshot, LiveBindingsIterator, LiveDeclaration,
    LiveDeclarationsIterator, PlaceState, ScopedDefinitionId,
};
use crate::node_key::NodeKey;
use crate::place::BoundnessAnalysis;
use crate::semantic_index::ast_ids::ScopedUseId;
use crate::semantic_index::definition::{Definition, DefinitionState};
use crate::semantic_index::narrowing_constraints::{
    ConstraintKey, NarrowingConstraints, NarrowingConstraintsBuilder, NarrowingConstraintsIterator,
};
use crate::semantic_index::place::{
    FileScopeId, PlaceExpr, PlaceExprWithFlags, ScopeKind, ScopedPlaceId,
};
use crate::semantic_index::predicate::{
    Predicate, PredicateOrLiteral, Predicates, PredicatesBuilder, ScopedPredicateId,
};
use crate::semantic_index::reachability_constraints::{
    ReachabilityConstraints, ReachabilityConstraintsBuilder, ScopedReachabilityConstraintId,
};
use crate::semantic_index::use_def::place_state::PreviousDefinitions;
use crate::semantic_index::{EagerSnapshotResult, SemanticIndex};
use crate::types::{IntersectionBuilder, Truthiness, Type, infer_narrowing_constraint};

mod place_state;

/// Applicable definitions and constraints for every use of a place.
#[derive(Debug, PartialEq, Eq, salsa::Update, get_size2::GetSize)]
pub(crate) struct UseDefMap<'db> {
    /// Array of [`DefinitionState`] in this scope. Only the first entry should be [`DefinitionState::Undefined`];
    /// this represents the implicit "unbound"/"undeclared" definition of every place.
    all_definitions: IndexVec<ScopedDefinitionId, DefinitionState<'db>>,

    /// Array of predicates in this scope.
    predicates: Predicates<'db>,

    /// Array of narrowing constraints in this scope.
    narrowing_constraints: NarrowingConstraints,

    /// Array of reachability constraints in this scope.
    reachability_constraints: ReachabilityConstraints,

    /// [`Bindings`] reaching a [`ScopedUseId`].
    bindings_by_use: IndexVec<ScopedUseId, Bindings>,

    /// Tracks whether or not a given AST node is reachable from the start of the scope.
    node_reachability: FxHashMap<NodeKey, ScopedReachabilityConstraintId>,

    /// If the definition is a binding (only) -- `x = 1` for example -- then we need
    /// [`Declarations`] to know whether this binding is permitted by the live declarations.
    ///
    /// If the definition is both a declaration and a binding -- `x: int = 1` for example -- then
    /// we don't actually need anything here, all we'll need to validate is that our own RHS is a
    /// valid assignment to our own annotation.
    declarations_by_binding: FxHashMap<Definition<'db>, Declarations>,

    /// If the definition is a declaration (only) -- `x: int` for example -- then we need
    /// [`Bindings`] to know whether this declaration is consistent with the previously
    /// inferred type.
    ///
    /// If the definition is both a declaration and a binding -- `x: int = 1` for example -- then
    /// we don't actually need anything here, all we'll need to validate is that our own RHS is a
    /// valid assignment to our own annotation.
    bindings_by_declaration: FxHashMap<Definition<'db>, Bindings>,

    /// [`PlaceState`] visible at end of scope for each place.
    end_of_scope_places: IndexVec<ScopedPlaceId, PlaceState>,

    /// All potentially reachable bindings and declarations, for each place.
    reachable_definitions: IndexVec<ScopedPlaceId, ReachableDefinitions>,

    /// Snapshot of bindings in this scope that can be used to resolve a reference in a nested
    /// eager scope.
    eager_snapshots: EagerSnapshots,

    /// Whether or not the end of the scope is reachable.
    ///
    /// This is used to check if the function can implicitly return `None`.
    /// For example:
    /// ```py
    /// def f(cond: bool) -> int | None:
    ///     if cond:
    ///         return 1
    ///
    /// def g() -> int:
    ///     if True:
    ///         return 1
    /// ```
    ///
    /// Function `f` may implicitly return `None`, but `g` cannot.
    ///
    /// This is used by [`UseDefMap::can_implicitly_return_none`].
    end_of_scope_reachability: ScopedReachabilityConstraintId,
}

pub(crate) enum ApplicableConstraints<'map, 'db> {
    UnboundBinding(ConstraintsIterator<'map, 'db>),
    ConstrainedBindings(BindingWithConstraintsIterator<'map, 'db>),
}

impl<'db> UseDefMap<'db> {
    pub(crate) fn bindings_at_use(
        &self,
        use_id: ScopedUseId,
    ) -> BindingWithConstraintsIterator<'_, 'db> {
        self.bindings_iterator(
            &self.bindings_by_use[use_id],
            BoundnessAnalysis::BasedOnUnboundVisibility,
        )
    }

    pub(crate) fn applicable_constraints(
        &self,
        constraint_key: ConstraintKey,
        enclosing_scope: FileScopeId,
        expr: &PlaceExpr,
        index: &'db SemanticIndex,
    ) -> ApplicableConstraints<'_, 'db> {
        match constraint_key {
            ConstraintKey::NarrowingConstraint(constraint) => {
                ApplicableConstraints::UnboundBinding(ConstraintsIterator {
                    predicates: &self.predicates,
                    constraint_ids: self.narrowing_constraints.iter_predicates(constraint),
                })
            }
            ConstraintKey::EagerNestedScope(nested_scope) => {
                let EagerSnapshotResult::FoundBindings(bindings) =
                    index.eager_snapshot(enclosing_scope, expr, nested_scope)
                else {
                    unreachable!(
                        "The result of `SemanticIndex::eager_snapshot` must be `FoundBindings`"
                    )
                };
                ApplicableConstraints::ConstrainedBindings(bindings)
            }
            ConstraintKey::UseId(use_id) => {
                ApplicableConstraints::ConstrainedBindings(self.bindings_at_use(use_id))
            }
        }
    }

    pub(super) fn is_reachable(
        &self,
        db: &dyn crate::Db,
        reachability: ScopedReachabilityConstraintId,
    ) -> bool {
        self.reachability_constraints
            .evaluate(db, &self.predicates, reachability)
            .may_be_true()
    }

    /// Check whether or not a given expression is reachable from the start of the scope. This
    /// is a local analysis which does not capture the possibility that the entire scope might
    /// be unreachable. Use [`super::SemanticIndex::is_node_reachable`] for the global
    /// analysis.
    #[track_caller]
    pub(super) fn is_node_reachable(&self, db: &dyn crate::Db, node_key: NodeKey) -> bool {
        self
            .reachability_constraints
            .evaluate(
                db,
                &self.predicates,
                *self
                    .node_reachability
                    .get(&node_key)
                    .expect("`is_node_reachable` should only be called on AST nodes with recorded reachability"),
            )
            .may_be_true()
    }

    pub(crate) fn end_of_scope_bindings(
        &self,
        place: ScopedPlaceId,
    ) -> BindingWithConstraintsIterator<'_, 'db> {
        self.bindings_iterator(
            self.end_of_scope_places[place].bindings(),
            BoundnessAnalysis::BasedOnUnboundVisibility,
        )
    }

    pub(crate) fn all_reachable_bindings(
        &self,
        place: ScopedPlaceId,
    ) -> BindingWithConstraintsIterator<'_, 'db> {
        self.bindings_iterator(
            &self.reachable_definitions[place].bindings,
            BoundnessAnalysis::AssumeBound,
        )
    }

    pub(crate) fn eager_snapshot(
        &self,
        eager_bindings: ScopedEagerSnapshotId,
    ) -> EagerSnapshotResult<'_, 'db> {
        match self.eager_snapshots.get(eager_bindings) {
            Some(EagerSnapshot::Constraint(constraint)) => {
                EagerSnapshotResult::FoundConstraint(*constraint)
            }
            Some(EagerSnapshot::Bindings(bindings)) => EagerSnapshotResult::FoundBindings(
                self.bindings_iterator(bindings, BoundnessAnalysis::BasedOnUnboundVisibility),
            ),
            None => EagerSnapshotResult::NotFound,
        }
    }

    pub(crate) fn bindings_at_declaration(
        &self,
        declaration: Definition<'db>,
    ) -> BindingWithConstraintsIterator<'_, 'db> {
        self.bindings_iterator(
            &self.bindings_by_declaration[&declaration],
            BoundnessAnalysis::BasedOnUnboundVisibility,
        )
    }

    pub(crate) fn declarations_at_binding(
        &self,
        binding: Definition<'db>,
    ) -> DeclarationsIterator<'_, 'db> {
        self.declarations_iterator(
            &self.declarations_by_binding[&binding],
            BoundnessAnalysis::BasedOnUnboundVisibility,
        )
    }

    pub(crate) fn end_of_scope_declarations<'map>(
        &'map self,
        place: ScopedPlaceId,
    ) -> DeclarationsIterator<'map, 'db> {
        let declarations = self.end_of_scope_places[place].declarations();
        self.declarations_iterator(declarations, BoundnessAnalysis::BasedOnUnboundVisibility)
    }

    pub(crate) fn all_reachable_declarations(
        &self,
        place: ScopedPlaceId,
    ) -> DeclarationsIterator<'_, 'db> {
        let declarations = &self.reachable_definitions[place].declarations;
        self.declarations_iterator(declarations, BoundnessAnalysis::AssumeBound)
    }

    pub(crate) fn all_end_of_scope_declarations<'map>(
        &'map self,
    ) -> impl Iterator<Item = (ScopedPlaceId, DeclarationsIterator<'map, 'db>)> + 'map {
        (0..self.end_of_scope_places.len())
            .map(ScopedPlaceId::from_usize)
            .map(|place_id| (place_id, self.end_of_scope_declarations(place_id)))
    }

    pub(crate) fn all_end_of_scope_bindings<'map>(
        &'map self,
    ) -> impl Iterator<Item = (ScopedPlaceId, BindingWithConstraintsIterator<'map, 'db>)> + 'map
    {
        (0..self.end_of_scope_places.len())
            .map(ScopedPlaceId::from_usize)
            .map(|place_id| (place_id, self.end_of_scope_bindings(place_id)))
    }

    /// This function is intended to be called only once inside `TypeInferenceBuilder::infer_function_body`.
    pub(crate) fn can_implicitly_return_none(&self, db: &dyn crate::Db) -> bool {
        !self
            .reachability_constraints
            .evaluate(db, &self.predicates, self.end_of_scope_reachability)
            .is_always_false()
    }

    pub(crate) fn is_binding_reachable(
        &self,
        db: &dyn crate::Db,
        binding: &BindingWithConstraints<'_, 'db>,
    ) -> Truthiness {
        self.reachability_constraints.evaluate(
            db,
            &self.predicates,
            binding.reachability_constraint,
        )
    }

    fn bindings_iterator<'map>(
        &'map self,
        bindings: &'map Bindings,
        boundness_analysis: BoundnessAnalysis,
    ) -> BindingWithConstraintsIterator<'map, 'db> {
        BindingWithConstraintsIterator {
            all_definitions: &self.all_definitions,
            predicates: &self.predicates,
            narrowing_constraints: &self.narrowing_constraints,
            reachability_constraints: &self.reachability_constraints,
            boundness_analysis,
            inner: bindings.iter(),
        }
    }

    fn declarations_iterator<'map>(
        &'map self,
        declarations: &'map Declarations,
        boundness_analysis: BoundnessAnalysis,
    ) -> DeclarationsIterator<'map, 'db> {
        DeclarationsIterator {
            all_definitions: &self.all_definitions,
            predicates: &self.predicates,
            reachability_constraints: &self.reachability_constraints,
            boundness_analysis,
            inner: declarations.iter(),
        }
    }
}

/// Uniquely identifies a snapshot of a place state that can be used to resolve a reference in a
/// nested eager scope.
///
/// An eager scope has its entire body executed immediately at the location where it is defined.
/// For any free references in the nested scope, we use the bindings that are visible at the point
/// where the nested scope is defined, instead of using the public type of the place.
///
/// There is a unique ID for each distinct [`EagerSnapshotKey`] in the file.
#[newtype_index]
#[derive(get_size2::GetSize)]
pub(crate) struct ScopedEagerSnapshotId;

#[derive(Clone, Copy, Debug, Eq, Hash, PartialEq, get_size2::GetSize)]
pub(crate) struct EagerSnapshotKey {
    /// The enclosing scope containing the bindings
    pub(crate) enclosing_scope: FileScopeId,
    /// The referenced place (in the enclosing scope)
    pub(crate) enclosing_place: ScopedPlaceId,
    /// The nested eager scope containing the reference
    pub(crate) nested_scope: FileScopeId,
}

/// Snapshots of place states that can be used to resolve references in nested eager scopes.
type EagerSnapshots = IndexVec<ScopedEagerSnapshotId, EagerSnapshot>;

#[derive(Debug)]
pub(crate) struct BindingWithConstraintsIterator<'map, 'db> {
    all_definitions: &'map IndexVec<ScopedDefinitionId, DefinitionState<'db>>,
    pub(crate) predicates: &'map Predicates<'db>,
    pub(crate) narrowing_constraints: &'map NarrowingConstraints,
    pub(crate) reachability_constraints: &'map ReachabilityConstraints,
    pub(crate) boundness_analysis: BoundnessAnalysis,
    inner: LiveBindingsIterator<'map>,
}

impl<'map, 'db> Iterator for BindingWithConstraintsIterator<'map, 'db> {
    type Item = BindingWithConstraints<'map, 'db>;

    fn next(&mut self) -> Option<Self::Item> {
        let predicates = self.predicates;
        let narrowing_constraints = self.narrowing_constraints;

        self.inner
            .next()
            .map(|live_binding| BindingWithConstraints {
                binding: self.all_definitions[live_binding.binding],
                narrowing_constraint: ConstraintsIterator {
                    predicates,
                    constraint_ids: narrowing_constraints
                        .iter_predicates(live_binding.narrowing_constraint),
                },
                reachability_constraint: live_binding.reachability_constraint,
            })
    }
}

impl std::iter::FusedIterator for BindingWithConstraintsIterator<'_, '_> {}

pub(crate) struct BindingWithConstraints<'map, 'db> {
    pub(crate) binding: DefinitionState<'db>,
    pub(crate) narrowing_constraint: ConstraintsIterator<'map, 'db>,
    pub(crate) reachability_constraint: ScopedReachabilityConstraintId,
}

pub(crate) struct ConstraintsIterator<'map, 'db> {
    predicates: &'map Predicates<'db>,
    constraint_ids: NarrowingConstraintsIterator<'map>,
}

impl<'db> Iterator for ConstraintsIterator<'_, 'db> {
    type Item = Predicate<'db>;

    fn next(&mut self) -> Option<Self::Item> {
        self.constraint_ids
            .next()
            .map(|narrowing_constraint| self.predicates[narrowing_constraint.predicate()])
    }
}

impl std::iter::FusedIterator for ConstraintsIterator<'_, '_> {}

impl<'db> ConstraintsIterator<'_, 'db> {
    /// Narrow `base_ty` by intersecting it with the type implied by each narrowing constraint
    /// that applies to `place`. Returns `base_ty` unchanged if no constraint narrows it.
    pub(crate) fn narrow(
        self,
        db: &'db dyn crate::Db,
        base_ty: Type<'db>,
        place: ScopedPlaceId,
    ) -> Type<'db> {
        let constraint_tys: Vec<_> = self
            .filter_map(|constraint| infer_narrowing_constraint(db, constraint, place))
            .collect();

        if constraint_tys.is_empty() {
            base_ty
        } else {
            constraint_tys
                .into_iter()
                .rev()
                .fold(
                    IntersectionBuilder::new(db).add_positive(base_ty),
                    IntersectionBuilder::add_positive,
                )
                .build()
        }
    }
}

#[derive(Clone)]
pub(crate) struct DeclarationsIterator<'map, 'db> {
    all_definitions: &'map IndexVec<ScopedDefinitionId, DefinitionState<'db>>,
    pub(crate) predicates: &'map Predicates<'db>,
    pub(crate) reachability_constraints: &'map ReachabilityConstraints,
    pub(crate) boundness_analysis: BoundnessAnalysis,
    inner: LiveDeclarationsIterator<'map>,
}

pub(crate) struct DeclarationWithConstraint<'db> {
    pub(crate) declaration: DefinitionState<'db>,
    pub(crate) reachability_constraint: ScopedReachabilityConstraintId,
}

impl<'db> Iterator for DeclarationsIterator<'_, 'db> {
    type Item = DeclarationWithConstraint<'db>;

    fn next(&mut self) -> Option<Self::Item> {
        self.inner.next().map(
            |LiveDeclaration {
                 declaration,
                 reachability_constraint,
             }| {
                DeclarationWithConstraint {
                    declaration: self.all_definitions[*declaration],
                    reachability_constraint: *reachability_constraint,
                }
            },
        )
    }
}

impl std::iter::FusedIterator for DeclarationsIterator<'_, '_> {}

/// All potentially reachable bindings and declarations for a single place.
#[derive(Debug, PartialEq, Eq, salsa::Update, get_size2::GetSize)]
struct ReachableDefinitions {
    bindings: Bindings,
    declarations: Declarations,
}

/// A snapshot of the definitions and constraints state at a particular point in control flow.
#[derive(Clone, Debug)]
pub(super) struct FlowSnapshot {
    place_states: IndexVec<ScopedPlaceId, PlaceState>,
    reachability: ScopedReachabilityConstraintId,
}

#[derive(Debug)]
pub(super) struct UseDefMapBuilder<'db> {
    /// Append-only array of [`DefinitionState`].
    all_definitions: IndexVec<ScopedDefinitionId, DefinitionState<'db>>,

    /// Builder of predicates.
    pub(super) predicates: PredicatesBuilder<'db>,

    /// Builder of narrowing constraints.
    pub(super) narrowing_constraints: NarrowingConstraintsBuilder,

    /// Builder of reachability constraints.
    pub(super) reachability_constraints: ReachabilityConstraintsBuilder,

    /// Live bindings at each so-far-recorded use.
    bindings_by_use: IndexVec<ScopedUseId, Bindings>,

    /// Tracks whether or not the current point in control flow is reachable from the
    /// start of the scope.
    pub(super) reachability: ScopedReachabilityConstraintId,

    /// Tracks whether or not a given AST node is reachable from the start of the scope.
    node_reachability: FxHashMap<NodeKey, ScopedReachabilityConstraintId>,

    /// Live declarations for each so-far-recorded binding.
    declarations_by_binding: FxHashMap<Definition<'db>, Declarations>,

    /// Live bindings for each so-far-recorded declaration.
    bindings_by_declaration: FxHashMap<Definition<'db>, Bindings>,

    /// Currently live bindings and declarations for each place.
    place_states: IndexVec<ScopedPlaceId, PlaceState>,

    /// All potentially reachable bindings and declarations, for each place.
    reachable_definitions: IndexVec<ScopedPlaceId, ReachableDefinitions>,

    /// Snapshots of place states in this scope that can be used to resolve a reference in a
    /// nested eager scope.
    eager_snapshots: EagerSnapshots,

    /// Is this a class scope?
    is_class_scope: bool,
}

impl<'db> UseDefMapBuilder<'db> {
    pub(super) fn new(is_class_scope: bool) -> Self {
        Self {
            all_definitions: IndexVec::from_iter([DefinitionState::Undefined]),
            predicates: PredicatesBuilder::default(),
            narrowing_constraints: NarrowingConstraintsBuilder::default(),
            reachability_constraints: ReachabilityConstraintsBuilder::default(),
            bindings_by_use: IndexVec::new(),
            reachability: ScopedReachabilityConstraintId::ALWAYS_TRUE,
            node_reachability: FxHashMap::default(),
            declarations_by_binding: FxHashMap::default(),
            bindings_by_declaration: FxHashMap::default(),
            place_states: IndexVec::new(),
            reachable_definitions: IndexVec::new(),
            eager_snapshots: EagerSnapshots::default(),
            is_class_scope,
        }
    }
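
    /// Mark the current point in control flow as unreachable, and record an always-false
    /// reachability constraint on the state of every place.
    pub(super) fn mark_unreachable(&mut self) {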
        self.reachability = ScopedReachabilityConstraintId::ALWAYS_FALSE;

        for state in &mut self.place_states {
            state.record_reachability_constraint(
                &mut self.reachability_constraints,
                ScopedReachabilityConstraintId::ALWAYS_FALSE,
            );
        }
    }

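    /// Register a newly added place, which starts out undefined (unbound and undeclared).
    ///
    /// Places must be added in ID order: `place` has to be the next unused [`ScopedPlaceId`],
    /// which is checked with debug assertions.
    pub(super) fn add_place(&mut self, place: ScopedPlaceId) {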
        let new_place = self
            .place_states
            .push(PlaceState::undefined(self.reachability));
        debug_assert_eq!(place, new_place);
        let new_place = self.reachable_definitions.push(ReachableDefinitions {
            bindings: Bindings::unbound(self.reachability),
            declarations: Declarations::undeclared(self.reachability),
        });
        debug_assert_eq!(place, new_place);
    }

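    /// Record `binding` as a new binding of `place` at the current point in control flow.
    ///
    /// The declarations that are live at this point are saved in `declarations_by_binding`.
    pub(super) fn record_binding(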
        &mut self,
        place: ScopedPlaceId,
        binding: Definition<'db>,
        is_place_name: bool,
    ) {
        let def_id = self.all_definitions.push(DefinitionState::Defined(binding));
        let place_state = &mut self.place_states[place];
        self.declarations_by_binding
            .insert(binding, place_state.declarations().clone());
        place_state.record_binding(
            def_id,
            self.reachability,
            self.is_class_scope,
            is_place_name,
        );

        self.reachable_definitions[place].bindings.record_binding(
            def_id,
            self.reachability,
            self.is_class_scope,
            is_place_name,
            PreviousDefinitions::AreKept,
        );
    }

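    /// Add a predicate and return its ID. Literal `true` and `false` predicates map to the
    /// fixed `ALWAYS_TRUE` and `ALWAYS_FALSE` IDs instead of being stored.
    pub(super) fn add_predicate(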
        &mut self,
        predicate: PredicateOrLiteral<'db>,
    ) -> ScopedPredicateId {
        match predicate {
            PredicateOrLiteral::Predicate(predicate) => self.predicates.add_predicate(predicate),
            PredicateOrLiteral::Literal(true) => ScopedPredicateId::ALWAYS_TRUE,
            PredicateOrLiteral::Literal(false) => ScopedPredicateId::ALWAYS_FALSE,
        }
    }

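    /// Record `predicate` as a narrowing constraint on the state of every place.
    ///
    /// The always-true and always-false predicates are skipped. Note that the constraint is
    /// attached to all places, not only those the predicate mentions;
    /// [`ConstraintsIterator::narrow`] later skips constraints that yield no type for the place
    /// being narrowed.
    ///
    /// A sketch of the effect, assuming the caller records the `if` test as a predicate before
    /// visiting the body (the Python input is hypothetical):
    ///
    /// ```py
    /// def f(x: int | None, y: str) -> None:
    ///     if x is not None:
    ///         # The states of both `x` and `y` now carry the `x is not None` constraint;
    ///         # only `x` is expected to be narrowed by it.
    ///         print(x, y)
    /// ```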
    pub(super) fn record_narrowing_constraint(&mut self, predicate: ScopedPredicateId) {
        if predicate == ScopedPredicateId::ALWAYS_TRUE
            || predicate == ScopedPredicateId::ALWAYS_FALSE
        {
            // No need to record a narrowing constraint for `True` or `False`.
            return;
        }

        let narrowing_constraint = predicate.into();
        for state in &mut self.place_states {
            state
                .record_narrowing_constraint(&mut self.narrowing_constraints, narrowing_constraint);
        }
    }

    /// Snapshot the state of a single place at the current point in control flow.
    ///
    /// This is only used for `*`-import reachability constraints, which are handled differently
    /// to most other reachability constraints. See the doc-comment for
    /// [`Self::record_and_negate_star_import_reachability_constraint`] for more details.
    pub(super) fn single_place_snapshot(&self, place: ScopedPlaceId) -> PlaceState {
        self.place_states[place].clone()
    }

    /// This method exists solely for handling `*`-import reachability constraints.
    ///
    /// The reason why we add reachability constraints for [`Definition`]s created by `*` imports
    /// is laid out in the doc-comment for `StarImportPlaceholderPredicate`. But treating these
    /// reachability constraints in the use-def map the same way as all other reachability constraints
    /// was shown to lead to [significant regressions] for small codebases where typeshed
    /// dominates. (Although `*` imports are not common generally, they are used in several
    /// important places by typeshed.)
    ///
    /// To solve these regressions, it was observed that we could do significantly less work for
    /// `*`-import definitions. We do a number of things differently here to our normal handling of
    /// reachability constraints:
    ///
    /// - We only apply and negate the reachability constraints to a single symbol, rather than to
    ///   all symbols. This is possible here because, unlike most definitions, we know in advance that
    ///   exactly one definition occurs inside the "if-predicate-true" branch, and we know exactly
    ///   which definition it is.
    ///
    /// - We only snapshot the state for a single place prior to the definition, rather than doing
    ///   expensive calls to [`Self::snapshot`]. Again, this is possible because we know
    ///   that only a single definition occurs inside the "if-predicate-true" branch.
    ///
    /// - Normally we take care to check whether an "if-predicate-true" branch or an
    ///   "if-predicate-false" branch contains a terminal statement: these can affect the reachability
    ///   of symbols defined inside either branch. However, in the case of `*`-import definitions,
    ///   this is unnecessary (and therefore not done in this method), since we know that a `*`-import
    ///   predicate cannot create a terminal statement inside either branch.
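    ///
    /// As a rough mental model (a sketch only: `<placeholder>` stands for the `*`-import
    /// predicate, and `a` and `X` are hypothetical names), `from a import *` is treated for a
    /// single imported symbol `X` much like:
    ///
    /// ```py
    /// if <placeholder>:
    ///     from a import X
    /// ```
    ///
    /// This method then plays the role of the merge at the end of that `if`: the state of `X`
    /// after the definition is constrained by the predicate, the state from before the
    /// definition is constrained by its negation, and the two states are merged.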
    ///
    /// [significant regressions]: https://github.com/astral-sh/ruff/pull/17286#issuecomment-2786755746
    pub(super) fn record_and_negate_star_import_reachability_constraint(
        &mut self,
        reachability_id: ScopedReachabilityConstraintId,
        symbol: ScopedPlaceId,
        pre_definition_state: PlaceState,
    ) {
        let negated_reachability_id = self
            .reachability_constraints
            .add_not_constraint(reachability_id);

        let mut post_definition_state =
            std::mem::replace(&mut self.place_states[symbol], pre_definition_state);

        post_definition_state
            .record_reachability_constraint(&mut self.reachability_constraints, reachability_id);

        self.place_states[symbol].record_reachability_constraint(
            &mut self.reachability_constraints,
            negated_reachability_id,
        );

        self.place_states[symbol].merge(
            post_definition_state,
            &mut self.narrowing_constraints,
            &mut self.reachability_constraints,
        );
    }

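    /// AND `constraint` into the reachability of the current point in control flow, and record
    /// it on the state of every place.
    pub(super) fn record_reachability_constraint(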
        &mut self,
        constraint: ScopedReachabilityConstraintId,
    ) {
        self.reachability = self
            .reachability_constraints
            .add_and_constraint(self.reachability, constraint);

        for state in &mut self.place_states {
            state.record_reachability_constraint(&mut self.reachability_constraints, constraint);
        }
    }

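    /// Record `declaration` as a new declaration of `place` at the current point in control flow.
    ///
    /// The bindings that are live at this point are saved in `bindings_by_declaration`.
    pub(super) fn record_declaration(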
        &mut self,
        place: ScopedPlaceId,
        declaration: Definition<'db>,
    ) {
        let def_id = self
            .all_definitions
            .push(DefinitionState::Defined(declaration));
        let place_state = &mut self.place_states[place];
        self.bindings_by_declaration
            .insert(declaration, place_state.bindings().clone());
        place_state.record_declaration(def_id, self.reachability);

        self.reachable_definitions[place]
            .declarations
            .record_declaration(def_id, self.reachability, PreviousDefinitions::AreKept);
    }

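    /// Record `definition` as both a declaration and a binding of `place`, using a single
    /// [`ScopedDefinitionId`] for both roles.
    pub(super) fn record_declaration_and_binding(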
        &mut self,
        place: ScopedPlaceId,
        definition: Definition<'db>,
        is_place_name: bool,
    ) {
        // We don't need to store anything in self.bindings_by_declaration or
        // self.declarations_by_binding.
        let def_id = self
            .all_definitions
            .push(DefinitionState::Defined(definition));
        let place_state = &mut self.place_states[place];
        place_state.record_declaration(def_id, self.reachability);
        place_state.record_binding(
            def_id,
            self.reachability,
            self.is_class_scope,
            is_place_name,
        );

        self.reachable_definitions[place]
            .declarations
            .record_declaration(def_id, self.reachability, PreviousDefinitions::AreKept);
        self.reachable_definitions[place].bindings.record_binding(
            def_id,
            self.reachability,
            self.is_class_scope,
            is_place_name,
            PreviousDefinitions::AreKept,
        );
    }

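    /// Record that `place` is deleted at the current point in control flow.
    ///
    /// The deletion is stored as a [`DefinitionState::Deleted`] entry and recorded on the place
    /// state through the same path as a binding. Unlike [`Self::record_binding`], it does not
    /// update `reachable_definitions`.
    ///
    /// Illustration, assuming this is invoked for a `del` statement (hypothetical input, which
    /// deliberately raises `NameError` at runtime):
    ///
    /// ```py
    /// x = 1
    /// del x
    /// print(x)  # the deletion is what is recorded as live for `x` at this use
    /// ```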
    pub(super) fn delete_binding(&mut self, place: ScopedPlaceId, is_place_name: bool) {
        let def_id = self.all_definitions.push(DefinitionState::Deleted);
        let place_state = &mut self.place_states[place];
        place_state.record_binding(
            def_id,
            self.reachability,
            self.is_class_scope,
            is_place_name,
        );
    }

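    /// Record a use of `place`, capturing the bindings that are live at this point.
    ///
    /// `use_id` must be the next unused [`ScopedUseId`]; this is checked with a debug assertion.
    ///
    /// Illustration (hypothetical input; the comments describe what each use would capture):
    ///
    /// ```py
    /// x = 1
    /// print(x)  # live bindings of `x`: `x = 1`
    /// x = 2
    /// print(x)  # live bindings of `x`: `x = 2`
    /// ```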
    pub(super) fn record_use(
        &mut self,
        place: ScopedPlaceId,
        use_id: ScopedUseId,
        node_key: NodeKey,
    ) {
        // We have a use of a place; clone the current bindings for that place, and record them
        // as the live bindings for this use.
        let new_use = self
            .bindings_by_use
            .push(self.place_states[place].bindings().clone());
        debug_assert_eq!(use_id, new_use);

        // Track reachability of all uses of places to silence `unresolved-reference`
        // diagnostics in unreachable code.
        self.record_node_reachability(node_key);
    }

    /// Record the reachability of the current point in control flow for the AST node `node_key`.
    pub(super) fn record_node_reachability(&mut self, node_key: NodeKey) {
        self.node_reachability.insert(node_key, self.reachability);
    }

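    /// Snapshot the state of `enclosing_place` so that a reference in a nested eager scope can
    /// be resolved against it.
    ///
    /// If the enclosing scope is a class scope and the place is a name, or if the place is not
    /// bound, only a narrowing constraint (`unbound_narrowing_constraint`) is stored. Otherwise
    /// the live bindings are cloned.
    ///
    /// Sketch, assuming a comprehension is one such nested eager scope (hypothetical input):
    ///
    /// ```py
    /// x = 1
    /// ys = [x for _ in range(3)]  # `x` is resolved against the snapshot where `x = 1` is live
    /// x = 2
    /// ```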
    pub(super) fn snapshot_eager_state(
        &mut self,
        enclosing_place: ScopedPlaceId,
        scope: ScopeKind,
        enclosing_place_expr: &PlaceExprWithFlags,
    ) -> ScopedEagerSnapshotId {
        // Names bound in class scopes are never visible to nested scopes (but attributes/subscripts are visible),
        // so we never need to save eager scope bindings in a class scope.
        if (scope.is_class() && enclosing_place_expr.is_name()) || !enclosing_place_expr.is_bound()
        {
            self.eager_snapshots.push(EagerSnapshot::Constraint(
                self.place_states[enclosing_place]
                    .bindings()
                    .unbound_narrowing_constraint(),
            ))
        } else {
            self.eager_snapshots.push(EagerSnapshot::Bindings(
                self.place_states[enclosing_place].bindings().clone(),
            ))
        }
    }

    /// Take a snapshot of the current visible-places state.
    pub(super) fn snapshot(&self) -> FlowSnapshot {
        FlowSnapshot {
            place_states: self.place_states.clone(),
            reachability: self.reachability,
        }
    }

    /// Restore the builder's current place states and reachability to the given snapshot.
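    ///
    /// Places first recorded after the snapshot was taken are reset to "undefined". Sketch,
    /// assuming the snapshot was taken just before the `if` and is restored before the `else`
    /// branch is visited (hypothetical input):
    ///
    /// ```py
    /// def f(flag: bool) -> None:
    ///     if flag:
    ///         y = 1  # `y` becomes a known place here, after the snapshot
    ///     else:
    ///         # After restoring the pre-`if` snapshot, `y` still has a slot, filled in as
    ///         # undefined, so place IDs keep lining up.
    ///         pass
    /// ```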
    pub(super) fn restore(&mut self, snapshot: FlowSnapshot) {
        // We never remove places from `place_states` (it's an IndexVec, and the place
        // IDs must line up), so the current number of known places must always be equal to or
        // greater than the number of known places in a previously-taken snapshot.
        let num_places = self.place_states.len();
        debug_assert!(num_places >= snapshot.place_states.len());

        // Restore the current visible-definitions state to the given snapshot.
        self.place_states = snapshot.place_states;
        self.reachability = snapshot.reachability;

        // If the snapshot we are restoring is missing some places we've recorded since, we need
        // to fill them in so the place IDs continue to line up. Since they don't exist in the
        // snapshot, the correct state to fill them in with is "undefined".
        self.place_states
            .resize(num_places, PlaceState::undefined(self.reachability));
    }

    /// Merge the given snapshot into the current state, reflecting that we might have taken either
    /// path to get here. The new state for each place should include definitions from both the
    /// prior state and the snapshot.
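    ///
    /// Sketch (hypothetical input; the comments describe the expected state after the merge,
    /// assuming the builder merges the branch states at the end of each `if` statement):
    ///
    /// ```py
    /// def f(flag: bool) -> None:
    ///     if flag:
    ///         x = 1
    ///     else:
    ///         x = "a"
    ///     # Both bindings of `x` are live here.
    ///     print(x)
    ///
    ///
    /// def g(flag: bool) -> None:
    ///     if flag:
    ///         x = 1
    ///     else:
    ///         return
    ///     # The `else` flow ends in a terminal statement, so it is statically unreachable
    ///     # and left out of the merge: only `x = 1` is live here.
    ///     print(x)
    /// ```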
    pub(super) fn merge(&mut self, snapshot: FlowSnapshot) {
        // As an optimization, if we know statically that either of the snapshots is always
        // unreachable, we can leave it out of the merged result entirely. Note that we cannot
        // perform any type inference at this point, so this is largely limited to unreachability
        // via terminal statements. If a flow's reachability depends on an expression in the code,
        // we will include the flow in the merged result; the reachability constraints of its
        // bindings will include this reachability condition, so that later during type inference,
        // we can determine whether any particular binding is non-visible due to unreachability.
        if snapshot.reachability == ScopedReachabilityConstraintId::ALWAYS_FALSE {
            return;
        }
        if self.reachability == ScopedReachabilityConstraintId::ALWAYS_FALSE {
            self.restore(snapshot);
            return;
        }

        // We never remove places from `place_states` (it's an IndexVec, and the place
        // IDs must line up), so the current number of known places must always be equal to or
        // greater than the number of known places in a previously-taken snapshot.
        debug_assert!(self.place_states.len() >= snapshot.place_states.len());

        let mut snapshot_definitions_iter = snapshot.place_states.into_iter();
        for current in &mut self.place_states {
            if let Some(snapshot) = snapshot_definitions_iter.next() {
                current.merge(
                    snapshot,
                    &mut self.narrowing_constraints,
                    &mut self.reachability_constraints,
                );
            } else {
                // Place not present in snapshot, so it's unbound/undeclared from that path.
                current.merge(
                    PlaceState::undefined(snapshot.reachability),
                    &mut self.narrowing_constraints,
                    &mut self.reachability_constraints,
                );
            }
        }

        self.reachability = self
            .reachability_constraints
            .add_or_constraint(self.reachability, snapshot.reachability);
    }

    /// Shrink the collected data to fit and build the final [`UseDefMap`].
    pub(super) fn finish(mut self) -> UseDefMap<'db> {
        self.all_definitions.shrink_to_fit();
        self.place_states.shrink_to_fit();
        self.reachable_definitions.shrink_to_fit();
        self.bindings_by_use.shrink_to_fit();
        self.node_reachability.shrink_to_fit();
        self.declarations_by_binding.shrink_to_fit();
        self.bindings_by_declaration.shrink_to_fit();
        self.eager_snapshots.shrink_to_fit();

        UseDefMap {
            all_definitions: self.all_definitions,
            predicates: self.predicates.build(),
            narrowing_constraints: self.narrowing_constraints.build(),
            reachability_constraints: self.reachability_constraints.build(),
            bindings_by_use: self.bindings_by_use,
            node_reachability: self.node_reachability,
            end_of_scope_places: self.place_states,
            reachable_definitions: self.reachable_definitions,
            declarations_by_binding: self.declarations_by_binding,
            bindings_by_declaration: self.bindings_by_declaration,
            eager_snapshots: self.eager_snapshots,
            end_of_scope_reachability: self.reachability,
        }
    }
}