Software Design by Example 7: Pattern Matching
Every piece of technical writing I’ve ever done has been shaped by the work of Brian Kernighan. The Elements of Programming Style, Software Tools in Pascal, The Unix Programming Environment, and The C Programming Language didn’t just teach me how to design software (as opposed to just writing code); the clarity of Kernighan’s explanations gave me a model to imitate and a standard to strive for.
It was therefore one of the proudest moments of my life when Kernighan agreed to contribute a chapter to Beautiful Code in 2006. The subject he chose was regular expression matching: more specifically, the compact regular expression matcher that Rob Pike wrote in 1998 for The Practice of Programming. It only matched the patterns shown below, but as Kernighan wrote, “This is quite a useful class; in my own experience of using regular expressions on a day-to-day basis, it easily accounts for 95 percent of all instances.”
| Meaning | Character |
| --- | --- |
| Any literal character c | c |
| Any single character | . |
| Beginning of input | ^ |
| End of input | $ |
| Zero or more of the previous character | * |
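To make the five rules in the table concrete, here is a minimal sketch of a recursive matcher for exactly this class of patterns. It is loosely modeled on the structure Kernighan describes rather than copied from it: the function names `match`, `match_here`, and `match_star` are illustrative choices, the code works on Python strings instead of C character pointers, and `*` tries the shortest match first, which is one possible design among several.

```python
def match(pattern: str, text: str) -> bool:
    """Return True if pattern matches anywhere in text."""
    if pattern.startswith("^"):
        # An anchored pattern may only match at the start of the input.
        return match_here(pattern[1:], text)
    # Otherwise try every starting position, including the empty
    # suffix at the very end (so that patterns like "x*" can match).
    for start in range(len(text) + 1):
        if match_here(pattern, text[start:]):
            return True
    return False


def match_here(pattern: str, text: str) -> bool:
    """Return True if pattern matches at the very start of text."""
    if not pattern:
        return True  # Nothing left to match.
    if len(pattern) > 1 and pattern[1] == "*":
        return match_star(pattern[0], pattern[2:], text)
    if pattern == "$":
        return text == ""  # End anchor matches only at end of input.
    if text and pattern[0] in (".", text[0]):
        # "." matches any single character; otherwise require equality.
        return match_here(pattern[1:], text[1:])
    return False


def match_star(char: str, pattern: str, text: str) -> bool:
    """Match zero or more copies of char, then the rest of pattern."""
    while True:
        if match_here(pattern, text):
            return True
        if not text or char not in (".", text[0]):
            return False
        text = text[1:]  # Consume one more copy of char and retry.


print(match("a.c", "xxabcxx"))   # True
print(match("^abc", "xabc"))     # False
print(match("ab*c$", "abbbc"))   # True
```

Because `match_star` checks the rest of the pattern before consuming another character, it finds the shortest run that works. That is enough to answer "does this match?", though a matcher that had to report the matched text would need to decide between shortest and longest runs explicitly.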
Terms defined: base class, Chain of Responsibility pattern, child (in a tree), coupling, depth-first, derived class, Document Object Model, eager matching, greedy algorithm, lazy matching, node, Open-Closed Principle, polymorphism, query selector, regular expression, scope creep, test-driven development.