uv/crates/uv-build-backend/src
konsti 4ac78f673b
Build backend: Switch to custom glob-walkdir implementation (#9013)
When doing a directory traversal for source dist inclusion, we want to
offer the user include and exclude options, and we want to avoid
traversing irrelevant directories. The latter is important for
performance, especially on network file systems, but also with large
data directories, or (not-included) directories with other permissions.
To support this, we introduce `GlobDirFilter`, which uses a DFA from
regex_automata to determine whether any children of a directory can be
included and skips the directory if not.

The globs are based on PEP 639. The syntax is more restricted than glob
or globset, but it's standardized. I chose it over glob or globset
because we're already using this syntax for `project.license-files` a
required by PEP 639, so it makes sense to use the same globs for all
includes (see e.g.
4f52a3bb62/pyproject.toml (L36-L48)
for example with same semantics for include and exclude)

### Semantics

Glob semantics are complex due to mixing directories and files,
expectations around simplicity and our need to exclude most of the tree
in the project from traversal. The current draft uses a syntax that
optimizes for simple default use cases for the start.

#### includes

Glob expressions which files and directories to include in the source
distribution.

Includes are anchored, which means that `pyproject.toml` includes only
`<project root>/pyproject.toml`. Use for example `assets/**/sample.csv`
to include for all
`sample.csv` files in `<project root>/assets` or any child directory. To
recursively include
all files under a directory, use a `/**` suffix, e.g. `src/**`. For
performance and
reproducibility, avoid unanchored matches such as `**/sample.csv`.

The glob syntax is the reduced portable glob from
[PEP 639](https://peps.python.org/pep-0639/#add-license-FILES-key).

#### excludes

Glob expressions which files and directories to exclude from the
previous source
distribution includes.

Excludes are not, which means that `__pycache__` excludes all
directories named
`__pycache__` and it's children anywhere. To anchor a directory, use a
`/` prefix, e.g.,
`/dist` will exclude only `<project root>/dist`.

The glob syntax is the reduced portable glob from
[PEP 639](https://peps.python.org/pep-0639/#add-license-FILES-key).
2024-11-14 13:14:58 +00:00
..
metadata Don't allow non-string email in authors (#8520) 2024-10-24 15:25:52 +00:00
lib.rs Build backend: Switch to custom glob-walkdir implementation (#9013) 2024-11-14 13:14:58 +00:00
metadata.rs Build backend: Switch to custom glob-walkdir implementation (#9013) 2024-11-14 13:14:58 +00:00
tests.rs Build backend: Switch to custom glob-walkdir implementation (#9013) 2024-11-14 13:14:58 +00:00