uv/crates/uv-globfilter
konsti a43333351e
Build backend: Allow escaping in globs (#13313)
PEP 639 does not allow any characters that aren't in either their
limited glob syntax or the alphanumeric Unicode characters. This means
there's no way to express a glob such as `**/@test` for the excludes.

We extend the glob syntax from PEP 639 by introducing backslash escapes,
which can escape all characters but path separators (forward and
backwards slashes) to be parsed verbatim.

This means we have two glob parsers: The strict PEP 639 parser for
`project.license-files`, and our extended parser for `tool.uv`, with a
slight difference if you need to use special characters, to both adhere
to PEP 639 and to support cases such as #13280.

Fixes #13280
2025-05-07 18:31:41 +02:00
..
src Build backend: Allow escaping in globs (#13313) 2025-05-07 18:31:41 +02:00
Cargo.toml Build backend: Allow escaping in globs (#13313) 2025-05-07 18:31:41 +02:00
README.md Build backend: Add reference docs and schema (#12803) 2025-04-21 12:27:49 +02:00

README.md

globfilter

Portable directory walking with includes and excludes.

Motivating example: You want to allow the user to select paths within a project.

include = ["src", "License.txt", "resources/icons/*.svg"]
exclude = ["target", "/dist", ".cache", "*.tmp"]

When traversing the directory, you can use GlobDirFilter::from_globs(...)?.match_directory(&relative) skip directories that never match in WalkDirs filter_entry.

Syntax

This crate supports the cross-language, restricted glob syntax from PEP 639:

  • Alphanumeric characters, underscores (_), hyphens (-) and dots (.) are matched verbatim.
  • The special glob characters are:
    • *: Matches any number of characters except path separators
    • ?: Matches a single character except the path separator
    • **: Matches any number of characters including path separators
    • [], containing only the verbatim matched characters: Matches a single of the characters contained. Within [...], the hyphen indicates a locale-agnostic range (e.g., a-z, order based on Unicode code points). Hyphens at the start or end are matched literally.
  • The path separator is the forward slash character (/). Patterns are relative to the given directory, a leading slash character for absolute paths is not supported.
  • Parent directory indicators (..) are not allowed.

These rules mean that matching the backslash (\) is forbidden, which avoid collisions with the windows path separator.