I promised in the last article to move on to DrProject's ticketing system, but there are still a couple of issues around its wiki that need further description. The first is how wiki text is transformed into HTML; the second is why this is harder to do in batch mode than you'd think.
Ward Cunningham created the first wiki in 1994-95 so that people could easily edit hypertext over the web. The only input widget he trusted browsers to support was the text input box. Since editing HTML tags by hand is tedious and error-prone, he adopted and modified the notational conventions that people were using for email and other plain-text documents:
Wiki markup simplifies input, but creates a new problem: somehow, pages have to be parsed, then transformed into HTML for display. The parser that DrProject inherited from Trac used a pile of regular expressions to match and extract bits of text. This worked well enough, but was difficult to extend---every time we tried to add a new feature, we found our new regexp conflicted with some of the existing ones, or made some piece of formatting ambiguous.
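To make the extension problem concrete, here is a minimal sketch of the regexp-pile approach. It is not Trac's or DrProject's actual code: the two patterns, the `RULES` table, and the `render` function are hypothetical, and assume a markup in which three single quotes mean bold and two mean italic.

```python
import html
import re

# Hypothetical rules for illustration only; not Trac's real patterns.
# Each rule pairs a compiled pattern with its HTML replacement.
RULES = [
    (re.compile(r"'''(.+?)'''"), r"<strong>\1</strong>"),  # '''bold'''
    (re.compile(r"''(.+?)''"), r"<em>\1</em>"),            # ''italic''
]


def render(text, rules=RULES):
    """Escape HTML metacharacters, then apply each rule in list order."""
    out = html.escape(text, quote=False)  # leave quote characters alone
    for pattern, replacement in rules:
        out = pattern.sub(replacement, out)
    return out


sample = "'''bold''' and ''italic''"

# Bold is tried first, so both kinds of markup are recognized.
good = render(sample)

# With the same rules in the opposite order, the italic pattern
# consumes part of the bold markup and the bold rule never matches.
bad = render(sample, rules=list(reversed(RULES)))
```

The two rules only work because bold is tried before italic. Every new pattern added to such a list has to be checked against all the existing ones in the same way, which is the kind of conflict described above.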
A graduate student named Liam Stewart (now with Idée) therefore wrote an entirely new parser using Greg Ewing's Plex toolkit. Like yacc and (many) other parser generators, Plex takes a grammar specification as input, and compiles it into executable code that will parse the language that grammar specifies. Developers can embed instructions in the grammar to create a parse tree as a side effect of analyzing input. Here's a bit of our grammar (Bol means "beginning of line", Eol means "end of line", and---well, the details aren't important):