Software Design by Example in Python 4: Matching Patterns
As I wrote last year, every piece of technical writing I’ve ever done has been shaped by the work of Brian Kernighan. His books didn’t just teach me how to design software; the clarity of his explanations gave me a model to imitate and a standard to strive for.
It was therefore one of the proudest moments of my professional life when Kernighan agreed to contribute a chapter on matching regular expressions to Beautiful Code in 2006. His short C program only matched the patterns shown below, but as Kernighan wrote, “This is quite a useful class; in my own experience of using regular expressions on a day-to-day basis, it easily accounts for 95 percent of all instances.”
| Meaning | Character |
|---|---|
| Any literal character c | c |
| Any single character | . |
| Beginning of input | ^ |
| End of input | $ |
| Zero or more of the previous character | * |
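To make the table concrete, here is a minimal Python sketch of a matcher for these five constructs. It follows the general recursive shape of Kernighan's C program but is not a transcription of it; the function names `match`, `match_here`, and `match_star` are my own choices for this illustration.

```python
def match(pattern: str, text: str) -> bool:
    """Return True if pattern matches anywhere in text."""
    if pattern.startswith("^"):
        # Anchored: the rest of the pattern must match at the very start.
        return match_here(pattern[1:], text)
    # Unanchored: try every starting position, including the empty suffix.
    for start in range(len(text) + 1):
        if match_here(pattern, text[start:]):
            return True
    return False


def match_here(pattern: str, text: str) -> bool:
    """Return True if pattern matches at the beginning of text."""
    if not pattern:
        return True  # nothing left to match
    if len(pattern) > 1 and pattern[1] == "*":
        return match_star(pattern[0], pattern[2:], text)
    if pattern == "$":
        return not text  # end of input
    if text and pattern[0] in (".", text[0]):
        return match_here(pattern[1:], text[1:])
    return False


def match_star(char: str, pattern: str, text: str) -> bool:
    """Match zero or more occurrences of char, then the rest of pattern."""
    i = 0
    while True:
        # Try the rest of the pattern after consuming i copies of char.
        if match_here(pattern, text[i:]):
            return True
        if i < len(text) and char in (".", text[i]):
            i += 1
        else:
            return False
```

With these definitions, `match("a*b", "aaab")` and `match("^ab$", "ab")` succeed, while `match("^ab$", "abc")` fails because `$` requires the input to end after `ab`.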
After teaching this material last spring, I decided that Chapter 4: Matching Patterns would cover the patterns used in globbing filenames rather than the subset of regular expressions for text that the JavaScript version covers. Re-implementing this still turned out to be a natural way to introduce the Open-Closed Principle, the Chain of Responsibility and Null Object design patterns, some refactoring steps, and the idea of test-driven development, which isn’t bad for a one-hour lesson.
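As a rough sketch of how Chain of Responsibility and Null Object can fit together in a glob-style matcher, consider the classes below. The names `Match`, `Null`, `Lit`, and `Any` and the `_match` helper are assumptions made for this illustration and may not line up with the chapter's final code. Each matcher handles its own piece of the text and hands the remainder to the next matcher in the chain, and `Null` ends the chain so that no matcher has to check whether a successor exists.

```python
class Match:
    """Parent class: stores the next matcher in the chain."""

    def __init__(self, rest=None):
        # Null Object: a do-nothing successor instead of a None check.
        self.rest = rest if rest is not None else Null()

    def match(self, text):
        # The whole chain matches only if it consumes all of the text.
        return self._match(text, 0) == len(text)

    def _match(self, text, start):
        raise NotImplementedError


class Null(Match):
    """Ends the chain: consumes nothing and reports where it stopped."""

    def __init__(self):
        self.rest = None

    def _match(self, text, start):
        return start


class Lit(Match):
    """Match a fixed string of literal characters."""

    def __init__(self, chars, rest=None):
        super().__init__(rest)
        self.chars = chars

    def _match(self, text, start):
        end = start + len(self.chars)
        if text[start:end] != self.chars:
            return None
        return self.rest._match(text, end)


class Any(Match):
    """Match zero or more characters, like '*' in a filename glob."""

    def _match(self, text, start):
        # Try the shortest span first, then progressively longer ones.
        for i in range(start, len(text) + 1):
            end = self.rest._match(text, i)
            if end == len(text):
                return end
        return None
```

Under these assumptions the glob `a*c` is written `Lit("a", Any(Lit("c")))`, which matches `"abc"` but not `"abd"`. Adding a new kind of pattern means adding a new child class rather than editing the existing ones, which is the Open-Closed Principle in miniature.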
Terms defined: Chain of Responsibility pattern, child class, Extract Parent Class refactoring, globbing, greedy matching, helper method, inheritance, lazy matching, literal (in parsing), Null Object pattern, refactor, regular expression, signature, technical debt, test-driven development.