Matching Patterns
- Represent shell-style glob wildcards (
*,?) as an inductive type - Implement a backtracking pattern matcher using the Chain of Responsibility pattern
- Use
partialfor functions that cannot be proved terminating - Build a runnable program that tests patterns against file names
- Command shells let you match filenames with patterns like
*.leanordata?.csv - We'll build a glob matcher from scratch
- Like Python's
fnmatchmodule, but implemented as a recursive descent parser - This combines pattern matching from types with recursion from basics
Pattern Elements
i
-- Pattern elements for glob matching
inductive Elem where
| Lit : Char → Elem -- literal character
| Any : Elem -- match exactly one char (?)
| Wild : Elem -- match zero or more chars (*)
deriving Repr, BEq
inductivedefines a new sum type with three constructors- Like a Python
Enumbut each variant can carry data
- Like a Python
Elem.Lit cmatches a literal characterElem.Anymatches exactly one character of any kind- Shell syntax:
?
- Shell syntax:
Elem.Wildmatches zero or more characters- Shell syntax:
*
- Shell syntax:
deriving Repr, BEqauto-generates printing and equality instances- Like Python's
__repr__and__eq__but generated by the compiler
- Like Python's
The Glob Algorithm
i
-- Chain of Responsibility: consume the matched prefix, return remaining chars.
-- Uses partial because Wild tries every suffix without a decreasing measure.
partial def glob : List Elem → List Char → Bool
| [], [] => true
| [], _ => false
| Elem.Lit c :: ps, c' :: cs => c == c' && glob ps cs
| Elem.Lit _ :: _, [] => false
| Elem.Any :: ps, _ :: cs => glob ps cs
| Elem.Any :: _, [] => false
| Elem.Wild :: [], _ => true -- * at end matches anything
| Elem.Wild :: ps, cs => -- try every split point
(List.range (cs.length + 1)).any fun i =>
glob ps (cs.drop i)
globtakes a pattern (list ofElem) and a string (list ofChar), returnsBool- The algorithm is Chain of Responsibility:
- Each pattern element tries to match the front of the input
- If it matches, it passes the remaining input to the rest of the pattern
- Cases:
[], []: empty pattern matches empty input — success[], _: empty pattern with non-empty input — failElem.Lit cvsc': match if characters are equal, then recurseElem.Lit _vs[]: literal needs a character but there are noneElem.Anyvs_ :: cs: consume one character, recurseElem.Anyvs[]: wildcard needs a characterElem.Wildvs[]at end of pattern: matches anythingElem.Wildvscs: try every split point from 0 to length
- The
Wildcase usesList.rangeto generate split points and.anyto try each- This is the backtracking: try matching 0 chars, then 1, then 2, etc.
cs.length + 1(notcs.length) ensures the range includescs.length— the case where*consumes all remaining characters- Like trying different amounts of input consumed by
*
cs.drop iskips the firsticharacters for the rest of the pattern
The partial Keyword
- Lean requires every recursive function to be provably terminating
- Normally you use structural recursion (the argument gets smaller each call)
globcannot be proved terminating by Lean's automatic checker- The
Wildcase callsglob ps (cs.drop i)where the arguments don't structurally decrease
- The
partialtells Lean: "trust me, this function terminates, don't try to prove it"- The function will still run correctly at runtime
- But Lean won't try to generate a termination proof
- Only use
partialwhen you're sure the function terminates- An infinite loop in
partialcode will hang the compiler
- An infinite loop in
Termination Without partial
i
-- Structural recursion: Lean proves termination automatically
-- because the list argument shrinks with each recursive call
def mySum : List Nat → Nat
| [] => 0
| x :: xs => x + mySum xs
#eval mySum [1, 2, 3, 4, 5]
-- termination_by: tell Lean which expression decreases each call
-- Use this when the termination measure is not the argument itself
-- decreasing_by provides a tactic to discharge the proof obligation
def repeatStr (s : String) : Nat → String
| 0 => ""
| n + 1 => s ++ repeatStr s n
#eval repeatStr "ab" 3
-- partial: opt out of termination checking entirely
-- Use only when you are certain the function terminates
-- and cannot express the measure with termination_by
partial def findFirst (p : α → Bool) : List α → Option α
| [] => none
| x :: xs => if p x then some x else findFirst p xs
#eval findFirst (· > 3) [1, 2, 3, 4, 5]
i
15
"ababab"
some 4
- For application programmers, there are three options when Lean rejects your recursive function:
- Structural recursion: match on the argument directly so Lean can see it shrinks — no annotation needed
termination_by: supply a decreasing measure (aNatexpression that gets smaller each call); Lean checks the proof automaticallypartial: opt out of termination checking when the measure is too complex to express
- Prefer structural recursion or
termination_bywhen possible: Lean's guarantee that your function terminates is a real correctness property partialgives up that guarantee — the function is unchecked at the type level, even if you are confident it terminates in practice
String Matching Wrapper
i
-- Wrap glob for whole-string matching
def matchGlob (pat : List Elem) (s : String) : Bool :=
glob pat s.toList
matchGlobconverts between theStringtype and theList Charinternal representations.toListconverts aStringto aList Char- Like Python's
list("hello")→['h', 'e', 'l', 'l', 'o']
- Like Python's
- The function is pure: same inputs always produce the same result
Building Literal Patterns
i
-- Helper to build a Lit element list from a string
def litPat (s : String) : List Elem := s.toList.map Elem.Lit
litPatbuilds a pattern that matches a literal string exactlys.toList.map Elem.Litconverts every character to aLitelement- Like Python's
[Elem.Lit(c) for c in s]
- Like Python's
- Useful for building patterns programmatically
Parsing Pattern Strings
i
-- Parse a glob pattern string into a list of Elem
def parsePat (s : String) : List Elem :=
s.toList.filterMap fun c =>
if c == '?' then some Elem.Any
else if c == '*' then some Elem.Wild
else some (Elem.Lit c)
parsePatconverts shell syntax (?and*) intoElemvaluesfilterMapapplies a function and discardsnoneresults- Like Python's
[y for x in items if (y := f(x)) is not None](walrus operator, Python 3.8+)
- Like Python's
- Every character produces
some Elem, so nothing is filtered- We use
filterMaprather thanmapto show the pattern for optional parsing (DEBT)
- We use
?becomesElem.Any,*becomesElem.Wild, everything else becomesElem.Lit- Unlike
litPat, this function understands wildcard syntax
Tests
i
-- Tests ---------------------------------------------------------------
-- Literal matching
#guard matchGlob (litPat "hello") "hello"
#guard !matchGlob (litPat "foo") "bar"
-- Wildcard at end
#guard matchGlob (litPat "hi" ++ [Elem.Wild]) "hiXYZ"
#guard matchGlob [Elem.Wild] ""
#guard matchGlob [Elem.Wild] "anything"
-- Wildcard at start
#guard matchGlob ([Elem.Wild] ++ litPat ".txt") "report.txt"
#guard !matchGlob ([Elem.Wild] ++ litPat ".txt") "report.csv"
-- Any single character
#guard matchGlob [Elem.Any] "x"
#guard !matchGlob [Elem.Any] ""
#guard !matchGlob [Elem.Any] "xy"
-- Combined patterns using parsePat
#guard matchGlob (parsePat "*.html") "index.html"
#guard matchGlob (parsePat "*.h*") "header.hpp"
#guard matchGlob (parsePat "*.h??") "header.hpp"
#guard !matchGlob (parsePat "*.h??") "main.cpp"
#guard matchGlob (parsePat "d?ta.*") "data.txt"
#guard !matchGlob (parsePat "d?ta.*") "date.txt"
#guardassertions are checked at compile time- All must evaluate to
trueor compilation fails
- All must evaluate to
- Test categories:
- Literal matching: exact string matches
- Wildcard at end:
"hi*"matches"hiXYZ"but not"foo" - Wildcard at start:
"*.txt"matches"report.txt"but not"report.csv" - Any single character:
"?"matches"x"but not""or"xy" - Combined patterns:
"*.html","*.h*","*.h??","d?ta.*"
!matchGlobtests that the pattern does not match- Like Python's
assert not fnmatch(...)
- Like Python's
Running the Program
i
-- Entry point: test a few patterns interactively
-- Run with: lean --run glob/code.lean
def main : IO Unit := do
let patterns : List (String × String) := [
("*.lean", "basics/sum_list.lean"),
("*.lean", "types/option.lean"),
("*.out", "basics/sum_list.out"),
("*.lean", "README.md")
]
for (patStr, target) in patterns do
let pat := parsePat patStr
let result := if matchGlob pat target then "✓" else "✗"
IO.println s!"{result} '{patStr}' matches '{target}'"
IO.println "Done."
mainloops over a list of(pattern, target)pairs and reports matchesfor (patStr, target) in patterns doiterates with destructuring- Like Python's
for pat_str, target in patterns:
- Like Python's
IO.printlnoutputs a line to the terminal- Compile and type-check with:
lake env lean glob/code.lean
- Run interactively with:
lean --run glob/code.lean
Exercises
Fix: Literal Match Returns Wrong
i
-- Fix the literal matching: it always returns true
def matchLit (c c' : Char) : Bool :=
true
#guard matchLit 'a' 'a'
#guard !matchLit 'a' 'b'
hint
- The function always returns
trueregardless of the characters - Replace
truewith the actual comparisonc == c'
Fix: Any Doesn't Handle Empty
i
-- This says [Elem.Any] matches empty input, but it shouldn't
inductive Elem where | Any | Lit (c : Char) | Wild
deriving Repr, BEq
def matchAny (es : List Elem) (cs : List Char) : Bool :=
match es, cs with
| Elem.Any :: _, [] => true
| Elem.Any :: _, _ :: cs' => true
| _, _ => false
#guard matchAny [Elem.Any] ['x']
#guard !matchAny [Elem.Any] []
hint
- The pattern
[Elem.Any]matches exactly one character - An empty string has zero characters, so it should return
false - Add a case for
Elem.Any :: _with[]
Fix: Missing partial
i
-- Lean reports "fail to show termination"; add the missing keyword
inductive Elem where | Lit (c : Char) | Any | Wild
def glob : List Elem → List Char → Bool
| [], [] => true
| [], _ => false
| Elem.Wild :: [], _ => true
| Elem.Wild :: ps, cs =>
(List.range (cs.length + 1)).any fun i =>
glob ps (cs.drop i)
| _, _ => false
#guard glob [Elem.Lit 'x'] ['x']
hint
- Lean reports "fail to show termination" because the
Wildcase recurses oncs.drop i - Add the
partialkeyword beforedef
Fix: Wrong Match Order
i
-- The match arms are in the wrong order: empty match should be tried first
inductive Elem where | Lit (c : Char) | Any | Wild
def glob : List Elem → List Char → Bool
| Elem.Wild :: [], _ => true
| [], [] => true
| [], _ => false
| _, _ => false
#guard glob [] []
#guard !glob [] ['x']
hint
Elem.Wild :: [], _matches anything when Wild is the last pattern element- But the empty case should be checked first: empty pattern with empty input is success
- Swap the two cases so
[], []comes beforeElem.Wild :: [], _
Fix: Double Star in parsePat
i
-- One case is unreachable; '?' should map to Any, not Wild
inductive Elem where | Lit (c : Char) | Any | Wild
def parsePat (s : String) : List Elem :=
s.toList.filterMap fun c =>
if c == '*' then some Elem.Wild
else if c == '*' then some Elem.Any
else some (Elem.Lit c)
#guard parsePat "?" == [Elem.Any]
#guard parsePat "*" == [Elem.Wild]
hint
- The
ifchain checksc == '*'twice — the second branch is never reached - The second
'*'check is dead code, and'?'is never handled - Change the second
c == '*'toc == '?'
Fix: Backtracking Uses Wrong Range
i
-- Wild backtracking range is off by one; should try consuming ALL chars
inductive Elem where | Lit (c : Char) | Any | Wild
partial def glob : List Elem → List Char → Bool
| Elem.Wild :: [], _ => true
| Elem.Wild :: ps, cs =>
(List.range cs.length).any fun i =>
glob ps (cs.drop i)
| [], [] => true
| _, _ => false
-- This should match (Wild consumes all of "abc")
#guard glob [Elem.Wild, Elem.Lit 'c'] ['a', 'b', 'c']
hint
List.range (cs.length)generates[0, 1, ..., n-1], missing the case where Wild consumes everything- Pass
cs.length + 1instead so the range includesn(consume all remaining characters)
Write: Filter by Pattern
i
-- Write a function that filters a list of strings by a glob pattern
inductive Elem where | Lit (c : Char) | Any | Wild
deriving Repr, BEq
partial def glob : List Elem → List Char → Bool
| [], [] => true
| [], _ => false
| Elem.Wild :: [], _ => true
| Elem.Wild :: ps, cs =>
(List.range (cs.length + 1)).any fun i =>
glob ps (cs.drop i)
| Elem.Lit c :: ps, c' :: cs => c == c' && glob ps cs
| Elem.Lit _ :: _, [] => false
| Elem.Any :: ps, _ :: cs => glob ps cs
| Elem.Any :: _, [] => false
def matchGlob (pat : List Elem) (s : String) : Bool :=
glob pat s.toList
def parsePat (s : String) : List Elem :=
s.toList.filterMap fun c =>
if c == '?' then some Elem.Any
else if c == '*' then some Elem.Wild
else some (Elem.Lit c)
def filterGlob (patStr : String) (files : List String) : List String :=
[]
#guard filterGlob "*.lean" ["a.lean", "b.txt", "c.lean"] == ["a.lean", "c.lean"]
hint
- Use
List.filterwith a function that callsmatchGlob - The filter function should be
fun s => matchGlob (parsePat patStr) s filterGlob "*.lean" ["a.lean", "b.txt", "c.lean"]should return["a.lean", "c.lean"]
Write: Check All Match
i
-- Write a function that checks if a pattern matches every string in a list
inductive Elem where | Lit (c : Char) | Any | Wild
deriving Repr, BEq
partial def glob : List Elem → List Char → Bool
| [], [] => true
| [], _ => false
| Elem.Wild :: [], _ => true
| Elem.Wild :: ps, cs =>
(List.range (cs.length + 1)).any fun i =>
glob ps (cs.drop i)
| Elem.Lit c :: ps, c' :: cs => c == c' && glob ps cs
| Elem.Lit _ :: _, [] => false
| Elem.Any :: ps, _ :: cs => glob ps cs
| Elem.Any :: _, [] => false
def matchGlob (pat : List Elem) (s : String) : Bool :=
glob pat s.toList
def parsePat (s : String) : List Elem :=
s.toList.filterMap fun c =>
if c == '?' then some Elem.Any
else if c == '*' then some Elem.Wild
else some (Elem.Lit c)
def matchesAll (patStr : String) (files : List String) : Bool :=
false
#guard matchesAll "*.lean" ["a.lean", "b.lean"]
#guard !matchesAll "*.lean" ["a.lean", "b.txt"]
hint
- Use
List.allto check that every string in the list matches - The predicate is
fun s => matchGlob (parsePat patStr) s List.allreturnstrueif every element satisfies the predicate (like Python'sall())
Write: Count Matches
i
-- Write a function that counts how many strings match a pattern
inductive Elem where | Lit (c : Char) | Any | Wild
deriving Repr, BEq
partial def glob : List Elem → List Char → Bool
| [], [] => true
| [], _ => false
| Elem.Wild :: [], _ => true
| Elem.Wild :: ps, cs =>
(List.range (cs.length + 1)).any fun i =>
glob ps (cs.drop i)
| Elem.Lit c :: ps, c' :: cs => c == c' && glob ps cs
| Elem.Lit _ :: _, [] => false
| Elem.Any :: ps, _ :: cs => glob ps cs
| Elem.Any :: _, [] => false
def matchGlob (pat : List Elem) (s : String) : Bool :=
glob pat s.toList
def parsePat (s : String) : List Elem :=
s.toList.filterMap fun c =>
if c == '?' then some Elem.Any
else if c == '*' then some Elem.Wild
else some (Elem.Lit c)
def countMatches (patStr : String) (files : List String) : Nat :=
0
#guard countMatches "*.lean" ["a.lean", "b.lean", "c.txt"] == 2
#guard countMatches "*.txt" ["a.lean", "b.lean", "c.txt"] == 1
hint
- Filter the list first, then take
List.lengthof the result - Or use
List.foldlto count matches in one pass countMatches "*.lean" ["a.lean", "b.lean", "c.txt"]should return2
Write: Pattern to String
i
-- Write a function that converts a list of Elem back to a pattern string
inductive Elem where | Lit (c : Char) | Any | Wild
deriving Repr, BEq
def patternToString (pat : List Elem) : String :=
""
#guard patternToString [Elem.Lit 'h', Elem.Any, Elem.Lit 't', Elem.Wild] == "h?t*"
#guard patternToString [Elem.Wild, Elem.Lit '.', Elem.Lit 't', Elem.Lit 'x', Elem.Lit 't'] == "*.txt"
#guard patternToString [] == ""
hint
- Reverse of
parsePat: convertElemvalues back to?and*characters - Use
List.mapwith a function that pattern-matches on eachElem Elem.Lit c→ the character itself,Elem.Any→'?',Elem.Wild→'*'- Then use
String.ofListto collect the result
Write: Find Wildcard Positions
i
-- Write a function that finds all positions of Wild elements in a pattern
inductive Elem where | Lit (c : Char) | Any | Wild
deriving Repr, BEq
def wildcardPositions (pat : List Elem) : List Nat :=
[]
#guard wildcardPositions [Elem.Lit 'a', Elem.Wild, Elem.Lit 'b', Elem.Wild] == [1, 3]
#guard wildcardPositions [Elem.Lit 'x'] == []
#guard wildcardPositions [Elem.Wild] == [0]
hint
- Find all indices where
Elem.Wildappears in the pattern List.findIdxs?returns a list of indices matching a predicate- The predicate is
fun e => e == Elem.Wild
Write: Has No Wildcards
i
-- Write a function that checks if a pattern has no wildcards (no * or ?)
inductive Elem where | Lit (c : Char) | Any | Wild
deriving Repr, BEq
def hasNoWildcards (pat : List Elem) : Bool :=
true
#guard hasNoWildcards [Elem.Lit 'a', Elem.Lit 'b']
#guard !hasNoWildcards [Elem.Lit 'a', Elem.Wild]
#guard !hasNoWildcards [Elem.Any]
#guard hasNoWildcards []
hint
- Check if the pattern contains no
Elem.Wildand noElem.Anyelements - Use
List.allwith a function that checks each element isElem.Lit _ - Can pattern-match:
fun e => match e with | Elem.Lit _ => true | _ => false