Archive for November, 2006

DrProject Internals: Testing

November 19th, 2006

In two of the earlier postings in this series, I said that you can’t add security to a system after it has been built: instead, you have to design it in, right from the start. The same is true of testing: if you don’t think about how you’re going to test your application while you’re designing it, the odds are very good that you’ll build something that can’t (or cannot easily) be tested.

Trac was one of those. Its developers had written some unit tests, but they only covered a small part of the code in the version we started with in May 2005. Given that we were going to have high turnover in our development team (students rotating in and out on a term-by-term basis), we had to have better coverage, or we’d become mired in a downward spiral of “fixing A breaks B, fixing B breaks C, fixing C breaks A”.

But testing web applications is harder than testing classical command-line applications because web apps consist of several collaborating processes: a browser, a web server, the CGI program (or servlet container and servlets), the database server, and filesystem-dependent components like Subversion (Figure 1).

Figure 1

This causes three problems; in increasing order of severity, they are:

  1. Unit testing libraries like JUnit (and its clones in other languages) aren’t built to handle this: as the word “library” implies, they’re made up of code that’s meant to be called within a process. Despite the ubiquity of multi-process applications, most debuggers and testing libraries cannot track “calls” between processes.
  2. Configuring a test environment is a pain: you have to set up a database server and Subversion repository, clear the browser’s cache, make sure the right stanzas are in your Apache configuration file, and so on.
  3. Running tests is slow. In order to ensure that tests are independent, you have to create an entirely new fixture for each test. This means reinitializing the database, erasing and re-creating the contents of the Subversion repository, and so on, which can take several seconds per test. That translates into an hour or more for a thousand tests, which means that developers won’t run them routinely while they’re coding, and might not even run them before checking changes in.

The first step in fixing this is to get rid of the browser and web server. One way to do this (shown in Figure 2) is to replace the browser with a Python script that generates HTTP requests as multi-line strings and passes them to a “fake CGI” library via a normal method call. After invoking our actual program, the fake CGI library passes the text of an HTTP response back to our script, which then checks that the right values are present (about which more in a moment). The “fake CGI” library’s job is to emulate the environment the web app under test would see if it was being invoked as a CGI by Apache: environment variables are set, standard input and output are replaced by string I/O objects, and so on, so that the web app has no (easy) way of knowing that it’s being invoked via function call, rather than being forked.

Figure 2
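Here is a minimal sketch of the fake CGI idea, assuming the web app has a single entry-point function that reads its request from environment variables and standard input and prints its response. The names `fake_cgi_environment`, `run_fake_cgi`, and `toy_app` are made up for illustration and are not DrProject code; a real CGI response would also use CRLF line endings, which this sketch ignores.

```python
import io
import os
import sys
from contextlib import contextmanager


@contextmanager
def fake_cgi_environment(method, path, query="", body=""):
    """Temporarily make this process look like a CGI child of a web server."""
    saved_env = dict(os.environ)
    saved_stdin, saved_stdout = sys.stdin, sys.stdout
    os.environ.update({
        "REQUEST_METHOD": method,
        "PATH_INFO": path,
        "QUERY_STRING": query,
        "CONTENT_LENGTH": str(len(body)),  # good enough for ASCII test bodies
    })
    sys.stdin, sys.stdout = io.StringIO(body), io.StringIO()
    try:
        yield sys.stdout
    finally:
        # Always restore the real streams and environment, even on failure.
        sys.stdin, sys.stdout = saved_stdin, saved_stdout
        os.environ.clear()
        os.environ.update(saved_env)


def run_fake_cgi(app_main, method, path, query="", body=""):
    """Call the app in-process and return (headers_text, body_text)."""
    with fake_cgi_environment(method, path, query, body) as captured:
        app_main()
        raw = captured.getvalue()
    headers, _, payload = raw.partition("\n\n")
    return headers, payload


def toy_app():
    """Hypothetical stand-in for the real CGI entry point."""
    user = os.environ["QUERY_STRING"].partition("=")[2]
    print("Content-Type: text/html")
    print()
    print(f"<div id='login-status'>Logged in as {user}</div>")
```

A test would then call something like `run_fake_cgi(toy_app, "GET", "/login", query="user=gvwilson")` and inspect the returned body, with no browser or web server involved.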

Why go through this rigmarole? Why not just have a top-level function in the web app that takes a URL, a dictionary full of header keys and values, and a string containing the POST data, and check the HTML page it generates? The answer is that structuring our tests in this way allows us to run them both in this test harness, and against a real system. By replacing the fake CGI adapter with code that sends the HTTP request through a socket connected to an actual web server, and reads that server’s response, we can check that our application still does what it’s supposed to when it’s actually deployed. The tests will run much more slowly, but that’s OK: if we’ve done our job properly, we’ll have caught most of the problems in our faked environment, where debugging is much easier to do.

Now, how to check the result of the test? We’re expecting HTML, which is just text, so why not store the HTML page we want in the test and do a string comparison? OK, it was a rhetorical question—if we do that, then every time we make any change at all to the format of an HTML page, we have to rewrite every test that produces it. Even something as simple as introducing white space, or changing the order of attributes within a tag, will break string comparison.

A better strategy is to add unique IDs to significant elements in the HTML page, and only check the contents of those elements. For example, if we’re testing login, then somewhere on the page there ought to be an element like this:

Logged in as gvwilson (logout | preferences)

We can find that pretty easily with an XPath query, or by crawling the DOM tree produced by parsing the HTML ourselves [1]. We can then move the div around without breaking any of our tests; if we were a little more polite about formatting its internals (i.e., if we used something symbolic to highlight the user name, and trusted CSS to do the formatting), we’d be in even better shape.
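As a rough illustration of checking by ID rather than by string comparison, the sketch below parses a well-formed page with Python's standard `xml.etree.ElementTree` and pulls out one element's text. The `login-status` ID and the surrounding markup are assumptions for the example, not what DrProject actually emits.

```python
import xml.etree.ElementTree as ET

# Hypothetical page; only the element with the known id matters to the test.
PAGE = """<html><body>
  <div class="banner"><p>Welcome back!</p></div>
  <div id="login-status">Logged in as <b>gvwilson</b>
    (<a href="/logout">logout</a> | <a href="/prefs">preferences</a>)</div>
</body></html>"""


def text_of_element(page, element_id):
    """Return the whitespace-normalized text of the element with the given id."""
    root = ET.fromstring(page)
    node = root.find(f".//*[@id='{element_id}']")
    if node is None:
        raise AssertionError(f"no element with id {element_id!r}")
    # Join the text of the element and its children, then collapse whitespace.
    return " ".join("".join(node.itertext()).split())
```

Because the check ignores tags, attribute order, and white space inside the element, the layout around it can change without breaking the test.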

We’ve still only addressed half of our overall problem, though: our web application is still talking to a database, and to Subversion, and reinitializing those each time we run a test is still sloooooooow. We solve this by moving the database into memory, and replacing Subversion with a mock object (Figure 3).

Figure 3

Let’s start with the database. Most applications rely on an external database server, which is just a long-lived process that manages data on disk. An increasingly-popular alternative is the embedded database, in which the database manipulation code runs inside the user’s application as a normal library. Berkeley DB (now owned by Oracle) and SQLite (still open source) are probably the best known of these; their advocates claim they are simpler to build and faster to run, although there are lots of advantages to the server model as well.

From a testing point of view, the great advantage of embedded databases is that they can be told to store data in memory, rather than on disk. This would be a silly thing to do in a production environment (after all, the whole point of a database is that it persists), but in a testing environment, it can speed things up by a factor of a thousand or more, since the hard drive never has to come into play. The cost of doing this is that you have to either commit to using one database in both environments, or avoid using the “improvements” that different databases have added to SQL.
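With SQLite, asking for an in-memory database is a matter of connecting to the special name `:memory:`. The sketch below uses Python's standard `sqlite3` module; the `ticket` table is a toy schema invented for the example.

```python
import sqlite3


def fresh_database():
    """Create a brand-new in-memory database with a toy schema (hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ticket (id INTEGER PRIMARY KEY, summary TEXT)")
    return conn


def test_ticket_insert():
    # Each test builds its own fixture, so tests cannot interfere with each other.
    conn = fresh_database()
    conn.execute("INSERT INTO ticket (summary) VALUES (?)", ("Login fails",))
    rows = conn.execute("SELECT id, summary FROM ticket").fetchall()
    assert rows == [(1, "Login fails")]
    conn.close()
```

Every call to `fresh_database()` returns an empty, independent database, which is what makes a new fixture per test affordable.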

We initially thought we could do the first—SQLite seemed like it would be fast enough for us even in a production environment, and it was easy to set it up to run in memory. However, we tripped over a concurrency bug in the Python/SQLite bindings in July 2005; lacking the skills or time required to fix it, we decided we’d use PostgreSQL for production, and SQLite for testing. As part of a big refactoring effort in January 2006, Chris Lenz introduced a sort-of kind-of object/relational mapping layer that hid most of the differences between the two. I’d like to go back one day and revisit this, but what we have is good enough for now.

What about Subversion? There’s only one of it, and it doesn’t support in-memory operation, so it would seem that we’d have to create a repository and check in a bunch of files every time we wanted to run a single unit test. What saved us was the fact that DrProject only uses a small fraction of Subversion’s capabilities: our unit tests don’t have to be able to exercise commit, branch, properties, or anything else that actually changes the repository.

Our approach was therefore to create a mock object that implemented just the subset of Subversion’s API that we cared about. Its “repository” consists of a dictionary of lists of lists (which should have been a dictionary of objects, but hey, nobody’s ferpect); its interface knows how to pull out the contents of files and directories by revision number, the items affected by a change set, and the “check-in” comments. Typing in one of these repositories takes 10 or 15 minutes, but we only need three or four to test everything we care about.
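To give a flavor of what such a mock might look like, here is a small sketch that uses a dictionary of plain records instead of lists of lists. The class name, method names, and sample data are all assumptions for illustration; they are not DrProject's mock or Subversion's real API, and file deletions are not modeled.

```python
class MockRepository:
    """Read-only stand-in for the small slice of Subversion a test needs."""

    def __init__(self, revisions):
        # revisions: {rev_number: {"author", "message", "files": {path: text}}}
        self._revisions = revisions

    def youngest_revision(self):
        return max(self._revisions)

    def log_message(self, rev):
        return self._revisions[rev]["message"]

    def changed_paths(self, rev):
        """Paths touched by the change set that created `rev`."""
        return sorted(self._revisions[rev]["files"])

    def file_contents(self, path, rev):
        """Return the file as of `rev`: the latest change at or before it."""
        for r in range(rev, 0, -1):
            files = self._revisions.get(r, {}).get("files", {})
            if path in files:
                return files[path]
        raise KeyError(f"{path} does not exist at revision {rev}")


# One hand-typed "repository" with two revisions.
SAMPLE = MockRepository({
    1: {"author": "olga", "message": "Initial import",
        "files": {"README": "v1\n", "src/main.py": "print('hi')\n"}},
    2: {"author": "maxim", "message": "Fix greeting, see #12",
        "files": {"src/main.py": "print('hello')\n"}},
})
```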

With all of these changes, DrProject zips through its tests quickly enough that developers actually will run the test suite before checking in changes to the code. The downside is the loss of fidelity: the system we’re testing is a close cousin to what we’re deploying, but not exactly the same. However, this is a good economic tradeoff: we may miss a few bugs because we’re not using a real Subversion repository, or because our fake CGI layer doesn’t translate HTTP requests exactly the same way Apache and Python’s libraries do, but we catch (and prevent) a lot more by making testing cheap.

And now for the bad news. We haven’t implemented all of this yet—in particular, the fake CGI layer was never completed, and the mock Subversion repository has fallen behind the rest of the code. This is mostly my fault: over the past sixteen months, I’ve been too distracted by other things to check the source on a regular basis. Partly, though, it’s a reflection of the fact that most developers (even ones I’ve had a chance to brainwash) don’t take testing as seriously as they should. Getting DrProject to the point where one command will run tests that exercise 90% or more of its capabilities is my #1 priority for the summer of 2007.


[1] Assuming we’re generating well-formed HTML, which of course we should be.


Later: Titus Brown posted a short note on bypassing the server when testing apps running on WSGI. I’d really like to try this out…

DrProject, Learning

DrProject Internals: Subversion

November 16th, 2006

It’s finally time to look at how DrProject integrates with Subversion. “Integrates” is the key word here: whereas we (and Trac’s designers before us) had a free hand with the ticketing system and wiki, Subversion and other version control systems are complex enough that we have to base our design on what they can do, rather than what we might want.

Lucky for us, Subversion’s designers had lots of experience with previous version control systems, and so were careful to provide tools that would make integration easy. The best way to appreciate these tools is to compare the Bad Old Days (CVS in the early 1990s) with our modern utopia. The first time I had to mess around with it, the source code for CVS was a tangled mess, so tangled that the best (possibly only) way to fetch a list of recent commit messages was to run the command in a sub-shell and parse its output. Think for a moment about what that involves:

  1. My application formats a string containing a CVS command.
  2. It passes that string to a shell running in a sub-process.
  3. That shell starts another process to execute the cvs program (unless the PATH variable has been mangled too badly by all this forking and exec’ing).
  4. The cvs program calls a bunch of C functions (some of which might actually start sub-shells of their own, but that’s another story) to extract information from the versioning files and metadata in the repository.
  5. My application reads that command’s output as a list of strings and runs it through a handwritten parser that (hopefully) extracts dates, user IDs, and commit messages.

Subversion’s design makes the first three steps unnecessary. It has a well-defined C API [1], which provides functions for doing (almost) all user-level operations. Command-line programs like svn and svnadmin call these functions, but Subversion also provides adapter libraries to make them available to Python, Java, and other languages. As a result, programmers don’t have to fork sub-processes, or parse strings; they can instead call a function, and get a data structure as a result.

All right: what information do project members actually want about their project’s repository and its contents? “What’s there?” (i.e., a listing of available files) is pretty obvious, along with what’s in particular files, what used to be in them, and a list of change sets. If we’re showing what used to be in files, we ought to show the differences too; and if we’re showing change sets, we ought to provide a multi-file view of the overall differences.

Hm… What about access control? How are we going to ensure that only people who are members of a project [2] can view the contents of the project’s repository? And what exactly do we mean by “the project’s repository”, anyway? Is there going to be one repository for each of the projects DrProject is managing, or would it be simpler and/or safer to partition one big repository into project-sized chunks?

Subversion supports the latter: you can create an access file that gives particular users read and/or write permission for sub-directories within a repository. However, this is what Joel Spolsky famously called a leaky abstraction. To see why, consider a situation in which Olga can read and write both the red and green directories in a repository, but Maxim can only read and write the green directory. If Olga commits changes to red/reddish.java and green/greenish.java in a single operation, what should we show Maxim when he asks to view the change set? We can hide the contents of, and changes to, the file he’s not allowed to view, but he’ll still be able to read Olga’s commit message, which may (if Olga is conscientious) tell him a lot about what’s going on in parts of the world he’s not supposed to know anything about.

We therefore decided to use one repository per project. Each of these repositories has its own access file; when users are added to or removed from the project, DrProject modifies the access file appropriately [3]. This means that even if people bypass DrProject, and try to access repositories using Eclipse, command-line programs, and other clients, their access rights will always be what we want them to be.
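To make the idea concrete, here is a sketch of regenerating a per-repository access file from a project's membership list. It assumes the file uses the layout of Subversion's path-based authorization files (a `[/]` section, `user = rw` lines, and `* =` to deny everyone else); the function name and the shape of the `members` dictionary are invented, and DrProject's real code may look quite different.

```python
def render_access_file(members):
    """Render an access file granting each member the listed rights on '/'.

    members maps user name -> "r" or "rw"; everyone else gets no access.
    """
    lines = ["[/]"]
    for user in sorted(members):
        rights = members[user]
        if rights not in ("r", "rw"):
            raise ValueError(f"unknown access level {rights!r} for {user}")
        lines.append(f"{user} = {rights}")
    lines.append("* =")  # an empty value means no access for anyone not listed
    return "\n".join(lines) + "\n"
```

Rewriting the whole file from the membership list each time, rather than patching it line by line, keeps the file and the database from drifting apart as long as all changes go through this one function.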

One thing DrProject does not provide is a way for users to modify the repository over the web. In particular, users cannot edit or commit files through their browser. We left this out for several reasons:

  1. It opens up a channel for attack: if the DrProject CGI is able to modify the repository, then anyone who subverts the CGI can do a lot of damage to the project’s core resource.
  2. We didn’t believe anyone would ever actually do any significant code editing through an HTML text editing box. (This may change in future, as rich editing controls become common; even today, it’d be nice to be able to add comments to commits after the fact.)
  3. Implementing it—in particular, implementing conflict resolution—would have at least tripled the complexity of this part of DrProject.
  4. Nobody else’s system has it, so we figured there couldn’t be a crying need ;-)

It turns out that accessing a repository via the Subversion API is a lot slower than querying a PostgreSQL database. To keep things zippy, DrProject caches the information it gets about the repository in the database, so that future lookups will be faster. This information is WORM (write-once, read-many): once it’s in the database, it stays there forever, and is never changed (except in those very rare cases when someone actually does edit a commit log message after the fact outside DrProject, in which case the database information is resynchronized).
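The caching pattern itself is simple enough to sketch. In the example below, `slow_lookup` stands in for whatever call fetches change set data through the Subversion bindings, and an SQLite table stands in for the production database; the class, table, and column names are all hypothetical.

```python
import sqlite3


class ChangesetCache:
    """Write-once cache of per-revision data in front of a slow lookup."""

    def __init__(self, conn, slow_lookup):
        self._conn = conn
        self._slow_lookup = slow_lookup  # callable: rev -> (author, message)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS changeset "
            "(rev INTEGER PRIMARY KEY, author TEXT, message TEXT)"
        )

    def get(self, rev):
        row = self._conn.execute(
            "SELECT author, message FROM changeset WHERE rev = ?", (rev,)
        ).fetchone()
        if row is None:
            # Cache miss: ask the repository once, then remember the answer.
            row = tuple(self._slow_lookup(rev))
            self._conn.execute(
                "INSERT INTO changeset (rev, author, message) VALUES (?, ?, ?)",
                (rev, *row),
            )
            self._conn.commit()
        return row

    def resync(self, rev):
        """Drop one entry, e.g. after a log message was edited outside the app."""
        self._conn.execute("DELETE FROM changeset WHERE rev = ?", (rev,))
        self._conn.commit()
```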

This isn’t as big a disk hog as you might think, since most of what’s in a repository is never viewed over the web. However, we are a little worried about what might happen if we provide a web services API, so that people can write scripts to pull data out of DrProject. While a human being might not click on dozens of links to pull up all the files, revisions, change sets, and differences in a project, a script very easily could. We’ll see…

So how well is it working? Pretty well, although students don’t seem to use DrProject’s Subversion browser very much. One reason might be that they don’t need it: in small projects, done by small teams, on short timescales, the history of a project isn’t that important. Another reason may be that desktop tools (command-line programs and Eclipse plugins) give them a richer experience. Still, it does seem to be running smoothly, and the wiki formatting of commit messages, which automatically creates links to tickets, is something I personally rely on a fair bit.


[1] Yes, Subversion is written in C, for both execution speed and portability. The last may seem like an oxymoron, given the bajillions of #ifdefs programmers have to use to actually make their C portable, but these have the advantage of being well understood. Getting C++ or Java to work on multiple platforms is actually no easier.

[2] More specifically, have a role with respect to that project that includes the capability to view the contents of the repository.

[3] This is only true if administrators use the functions in DrProject to edit user permissions. If someone edits the underlying database directly via SQL, it obviously won’t have the desired side effect of updating the Subversion access file.

DrProject, Learning

TUCOWS on spam on CBC: on tonight

November 13th, 2006

The folks from TUCOWS will be talking about spam on the CBC tonight (Nov 13).

Announcements

Requirements as Tickets (or, Hierarchy to the Rescue)

November 11th, 2006

Several of the small companies we’ve spoken to recently have asked whether it would make sense to use an issue tracker to manage requirements. It’s superficially sensible: if you can create tickets for feature requests, why not create them for the needs that drive those features?

After pondering this for a (short) while, we think it’s workable, but with caveats. The first is that stakeholders must be able to express relationships between tickets, such as “Feature X is needed to satisfy business need Y”, “P is an exception to the general rule Q,” or, “E must be done before F and G, which in turn must both be done before H can be started”. Injecting these cross-references must be really, really easy: stakeholders cannot be required to type in lots of ticket numbers (they might be willing to do this once, but they won’t keep them all up to date as tickets are split, merged, added, and removed during the course of the project).
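One reason explicit relationships are worth the trouble is that a tool can then compute with them. As a small sketch, the "E before F and G, which must both be done before H" example can be stored as a dependency map and ordered with Python's standard `graphlib` module (Python 3.9 or later); the single-letter ticket names come from the example above, and the dictionary layout is just one possible representation.

```python
from graphlib import TopologicalSorter

# Each key maps to the set of tickets that must be finished before it.
must_precede = {
    "F": {"E"},
    "G": {"E"},
    "H": {"F", "G"},
}

# static_order() yields every ticket after all of its prerequisites.
order = list(TopologicalSorter(must_precede).static_order())
```

Here `order` starts with E and ends with H; F and G may come out in either order, since nothing relates them to each other.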

The second caveat is that the system must give stakeholders a way to organize the tickets into a sensible linear order. Hypertext is all very well, but wandering around a twisty maze of little requirements is not an effective way to understand a complex system. Linearization gives people who already know the problem architecture a way to help people who don’t know it learn it.

We’re still thinking about the first (tag-directed proximitization in an unstructured graph plus link-drawing?), but we have an idea for the second. Right now, most ticketing systems’ standard display is a flat list sorted by ticket ID, filing date, priority, or something of that ilk (Figure 1).

Figure 1

What if stakeholders had another view, in which they could organize tickets like this:

Tagging System
  281: In-place editing
    407: Keyword completion
    399: Explicit “submit”
  117: Colorization
    145: Red-purple-blue spectrum
    214: Must be legible when printed B&W

A little AJAX ought to be enough to let users move things up and down, drag them left and right, or create a heading that isn’t associated with an actual ticket. The system would store this structure as a first-class entity; tickets that hadn’t yet been manually organized wouldn’t appear in it, while tickets that had been closed would be crossed out until a human being said, “Yeah, take it out of the hierarchy.”
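Storing the structure as a first-class entity could be as simple as a tree whose nodes are either tickets or bare headings. The sketch below is one hypothetical shape for it; `OutlineNode` and `render` are invented names, and marking closed tickets with a `[closed]` prefix is a plain-text stand-in for crossing them out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class OutlineNode:
    title: str
    ticket_id: Optional[int] = None   # None marks a heading with no ticket
    closed: bool = False
    children: List["OutlineNode"] = field(default_factory=list)


def render(node, depth=0):
    """Return the outline as indented lines, marking closed tickets."""
    label = node.title if node.ticket_id is None else f"{node.ticket_id}: {node.title}"
    if node.closed:
        label = f"[closed] {label}"
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


outline = OutlineNode("Tagging System", children=[
    OutlineNode("In-place editing", 281, children=[
        OutlineNode("Keyword completion", 407),
        OutlineNode('Explicit "submit"', 399),
    ]),
    OutlineNode("Colorization", 117, children=[
        OutlineNode("Red-purple-blue spectrum", 145, closed=True),
        OutlineNode("Must be legible when printed B&W", 214),
    ]),
])
```

Calling `"\n".join(render(outline))` reproduces the indented view above, with ticket 145 flagged as closed but still in place until someone removes it.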

I don’t know if this would be a sensible way to organize issue tickets (which often relate to several components of the system). I think it has more of a chance of working with requirements; I’d be interested in hearing what you think.

DrProject, Learning, Research

Expressing Temporal “Type” Information in Programs

November 11th, 2006

Several students were in my office this week explaining how their programs work.  In each case, much of the information was temporal: “this happens, then that, then this other thing”.  Sequence diagrams and the like were designed to capture this, but I don’t know any programming language that allows programmers to specify constraints of this kind on their variables.

Here’s an example of what I mean. Jorma Sajaniemi has been doing some interesting work on the roles of variables in novices’ programs (see in particular his list of roles).  One such role is a “one-way flag”, which, as its name suggests, starts with one value, and is assigned a different one exactly once, after which it can’t go back to having the original value.  Another role is a “most-wanted holder”, which keeps track of something like the maximum value seen so far.

In both cases, it would be very helpful to be able to specify the intended use of the variable in a machine-checkable way, but I don’t know how to do this. Implementing the behavior is easy: I can write a one-way flag class, or a most-wanted holder, in just a few lines of Java or Python. But how do I tell the compiler and runtime system what the variable “means”? I can sort of see how to do it in the first case (mumble mumble finite state machines mumble mumble), but in the second, I need to tell the compiler or runtime about some dataflow constraints (“variable X may only be assigned values taken from Y; other values cannot be assigned to X, even if they are type-compatible”).
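For the record, the "easy" half looks something like the sketch below. The class and method names are my own, and this only enforces the behavior at runtime inside each class; it says nothing to a compiler about what the variable is for, which is the hard part.

```python
class OneWayFlag:
    """Starts False, can be set to True, and can never go back."""

    def __init__(self):
        self._raised = False

    def raise_flag(self):
        # Raising an already-raised flag is harmless; there is no way to lower it.
        self._raised = True

    def __bool__(self):
        return self._raised


class MostWantedHolder:
    """Remembers the best value offered so far (the maximum, by default)."""

    def __init__(self, better=lambda new, old: new > old):
        self._better = better
        self._has_value = False
        self._value = None

    def offer(self, candidate):
        # Keep the candidate only if it is the first value or beats the current one.
        if not self._has_value or self._better(candidate, self._value):
            self._value, self._has_value = candidate, True

    @property
    def value(self):
        if not self._has_value:
            raise ValueError("no value has been offered yet")
        return self._value
```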

If anyone knows of languages that allow such constraints to be captured or expressed, I’d welcome pointers.

Research

Setting Up Yet Again

November 10th, 2006

I now have a Windows PC on my desktop — using the Mac laptop as my main computer was making my back and neck ache. Here’s what I’ve installed so far (in order):

  • Firefox
  • Thunderbird
  • Microsoft Office
  • Cygwin (including GNU Make and MiKTeX)
  • Acrobat (for reading PDFs)
  • MWSnap (for screen captures)
  • Eclipse (for Java, and as a memory stress tester ;-) )
  • Wing (for Python)
  • Emacs (I know, I know…)
  • A printer driver
  • PuTTY
  • SQLite

Everything else I need I got by checking out a repository from one of (quick count) four machines.

Elapsed time: about two hours, during most of which I was reading mail and talking to people.

Later: I have since added:

  • MiKTeX
  • Ghostscript and Ghostview
  • FreeRip
  • PDFCreator
  • Java 1.5
  • MyUninstall
  • Gnumeric (to read old grades files)

Uncategorized

CONWISE 2007

November 10th, 2006

The 2007 edition of a Conference for an Ontario Network of Women in Science and Engineering (CONWISE 2007) will be held at the University of Western Ontario on March 2-4, 2007.  It’d be great to have a strong turnout from U of T…

Equity, Learning

Can’t Get (Directly) There From Here

November 9th, 2006

One of the projects I’m contributing to these days is writing a first-year Computer Science textbook using Python. We’re using DrProject to manage it: after all, LaTeX files are really just another kind of source code, and what better way to keep track of who’s supposed to be doing what than ticketing?

Well, since you asked… The truth is, we’re storing to-do information in two ways: as tickets in DrProject, and as specially-formatted text in the LaTeX. All the big items use the former; the little notes to ourselves like, “This sentence is cheesy,” are inside the .tex files like this:

\FIXME{This sentence is cheesy.}

Embedding “tickets” in the source is a bad idea for several reasons. First, the embedded items are invisible to DrProject: they can’t be searched, ordered, assigned to particular users, and so on [1]. Second, whenever you store information in two places, you run the risk of duplication, contradiction, or omission. Right now, we have no way of knowing which of our FIXMEs are also recorded as tickets, and which aren’t; we could ask people to file a ticket each time they create a FIXME, and delete the FIXME when they close the ticket, but that’s a lot of extra work.

Despite these problems, embedding little notes in code is such a popular working practice that Eclipse and other IDEs have tools to collate and present markers of this kind. The reason is simple: embedding in code is easy. Even if you have Mylar installed, so that you can work with your ticketing system from within Eclipse [2], throwing a TODO comment or a FIXME macro into your source file disrupts your train of thought—your flow—much less than filing a ticket [3].

There’s another, subtler issue here as well. Suppose you did want to file a ticket to say that a particular sentence was cheesy, or that a particularly complex assignment statement should be refactored. Would you quote it in the ticket? There goes your flow, but what else can you do? You can’t point to it (e.g., quote file name and line number), because the text or code in question might be reorganized between the time the ticket is created, and the time someone gets to it. The only thing to do is to follow where literate programming led, and Javadoc half-heartedly followed: store the “documentation” with the “code”, and tell ourselves that it’s the least of the available evils.

But wait: our source files are under version control, aren’t they? And DrProject can see the version control repository. Why can’t DrProject scan the files in the repository, extract the FIXMEs and TODOs, and turn them into tickets? Better yet, why not have it look for FIXMEs like the one above and insert an automatically-generated ticket ID, i.e., turn the FIXME into something like this:

\FIXME[179]{This sentence is cheesy.}

Those “tickets” can then be managed like any others: if someone closes one in the database, DrProject can delete the corresponding line from the source file, and vice versa.
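The mechanical part of that scan is not hard to sketch. The code below finds untagged `\FIXME{...}` macros, gives each one the next free ticket number, and reports what it created; the function name, the regular expressions, and the idea of passing in `next_id` are all assumptions for the example, and nested braces inside a FIXME are not handled.

```python
import re

# Matches \FIXME{...} with no ticket number yet; nested braces are not handled.
UNTAGGED_FIXME = re.compile(r"\\FIXME\{([^{}]*)\}")
# Matches \FIXME[123]{...}, i.e. a FIXME that already has a ticket number.
TAGGED_FIXME = re.compile(r"\\FIXME\[(\d+)\]\{([^{}]*)\}")


def assign_ticket_ids(source, next_id):
    """Give every untagged FIXME an ID; return (new_source, {id: text})."""
    created = {}

    def tag(match):
        nonlocal next_id
        ticket_id, next_id = next_id, next_id + 1
        created[ticket_id] = match.group(1)
        return f"\\FIXME[{ticket_id}]{{{match.group(1)}}}"

    return UNTAGGED_FIXME.sub(tag, source), created
```

FIXMEs that already carry a number are left alone, so running the scan twice over the same file creates no duplicate tickets.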

What I’m really proposing is to treat information-in-the-database and information-in-the-repository on an equal footing. At some level of abstraction (which we have to define and implement), it shouldn’t matter how or where the ticket is stored. All that really matters is the information it contains, and the operations users can perform on it. If it’s easiest for them to enter that data by adding a line to their source code, great—we can handle that. If there’s enough data to justify them switching tools (e.g., a one-page description of how to reproduce a complicated synchronization bug), we can support that too.

It’s tempting, but it won’t work. The problem is that the editors people use when they’re working with source code are unstructured. The editor I’m using to create this posting knows about HTML; if I press the < key on my keyboard, it adds the string &lt; to the file. In contrast, the editor in Eclipse lets me put whatever I want in my Java files, even text that can't possibly be legal Java. We would therefore have to trust users (a) to format their FIXMEs and TODOs exactly the right way when initially adding them to files, and (b) not to mess up any of the information the system added. Experience with first-generation CASE tools and similar systems proves (at least to me) that people will get both wrong often enough to find the system a hindrance rather than a help.

Teaching Eclipse's editor how to format a \FIXME[...]{...} in a .tex file is not the right answer: different issue tracking tools will have different conventions, and anyway, what do we do when someone wants to add \CODEREVIEW{...} or \QUESTION{...} or something else? The right answer is to allow developers to create custom micro-editors and bind them to particular flavors of micro-content. The document then becomes an assemblage of strongly-typed elements, the presence of which causes display/modify/diff/merge handlers [4] to be loaded and run.

So once again it comes back to extensible programming. We separate models, views, and controllers when we're building tools for other people, but we still, in the early 21st Century, insist that our files be unstructured text. It's easy to see the cost of changing: legacy tools would stop working, and we might have to (shock horror) put Vi or Emacs out to pasture. I think it's time we started thinking about the cost of staying stuck in the 1970s; I think we ought to start paying attention to all the neat tools that aren't feasible to build because we're afraid of embracing the very future that we've dedicated ourselves to creating.


[1] OK, we could add another parameter to the \FIXME macro to record a user ID, but then we'd have to validate it. And what about priority? Should \FIXME have as many parameters as tickets have fields? Brr...

[2] DrProject doesn't support Mylar yet, though there is a plugin for Trac. If anyone is looking for a challenging, useful CSC49X project...

[3] The problem isn't the time it takes to fix whatever you've noticed; the problem is that you have to put aside whatever problem you were thinking about to do so. People talk about "pushing" and "popping" issues on a mental stack, but the brain doesn't actually work that way: lots of studies have shown that it takes several minutes to get back in a flow state after any significant interruption.

[4] Display and modify should be obvious; diff and merge are needed so as not to discourage users from putting files containing such content under version control. (I would give you real cash money right now for an Excel merge-and-diff tool, and no, export as CSV and use text diff is not an answer.)

DrProject, Extensible Programming, Learning, Research

UML Debugging Moves to SourceForge

November 9th, 2006

A former CSC49X project that aimed to build a visual debugger for UML sequence diagrams has moved to SourceForge.  Congratulations to Mike Liu — I’m sure more hands would be welcome ;-)

Announcements, Learning

CSC49X Projects Winter 2007

November 8th, 2006

Week 9: time to start thinking about next term’s projects. Here’s what’s on the table:

  • The Online Marking tool, which gives TAs the ability to mark up student code on the web. This is now being used in a couple of courses, and there was lots of interest at DemoCamp in modifying it to work as a general online code review tool. Lillian Angel will be heading up the team, under the direction of Jennifer Campbell; two more bodies are needed.
  • Our Blackboard Plugin project, which displays stats on the grades being given in different sections of large courses. This plugin may be done by Christmas, but there are lots more to write—Java, anyone?
  • The MailClouds project is using tag clouds to show the history of conversations between businesses and their customers.
  • We’re also visualizing genealogical data for REED, the Records of Early English Drama project, using Flash, asynchronous XML, and other sexy technologies. Rich web GUIs… mmm…
  • Mylar is a tool that (among other things) allows Eclipse to connect directly to issue-tracking systems. There is apparently already a plugin for Trac; we’d like to adapt it for DrProject.
  • One new project: improving the GUI of a tool for visualizing brain scan data. Prof. Stephen Strother’s team at the Rotman-Baycrest is studying the effects of strokes and other impairments on elderly patients; they have lots of data, and need better ways to look at it.
  • Another new project: an API (with implementations) for an abstract spreadsheet connection layer that does for Excel, Gnumeric, and OpenOffice what ODBC, JDBC, and Python’s DB-API do for databases.

If you have at least two 300-level CS courses, and your GPA is 3.0 or higher, please give me a shout.

Uncategorized