A year and a bit ago, I posted pictures of DrProject‘s database schema and architecture. I’m happy with the former — the only thing I’d do differently is make it larger and add comments (or change field names to be more descriptive) — but the latter still bugs me. It doesn’t carry as much information as a napkin sketch of a user interface, much less an architectural blueprint. I’ve looked at lots of books on software architecture (Reekie & McAdam and Gorton being my favorites), but I still haven’t found a notation that seems like good value for time. If you know of one, I’d welcome a pointer…
Monthly Archives: April 2008
Someone’s a Fan
Google Mentoring Videos
Via Lin Zhou:
From: Susan Dorward
Google has started offering coaching to most of their employees, and I’ve been lucky enough to be approved as one of their coaches. They recently sponsored a series of talks given by their coaches and they have posted videos of these talks up on YouTube. My talk is about patterns I’ve seen when coaching technical women, and suggestions around how we can handle these common issues.
If you are interested, you can access the talks by going to YouTube.com and searching for “Google techtalks coaching”. There are currently six talks there, including one that I gave. Here are the topics:
Student Impressions of haXe
Two of the students in this term’s consulting course, Robert Beghian and Tom Plaskon, spent the term working with haXe. Their impressions of it are below…
HaXe is a high-level, object-oriented programming language for the development of web applications. Programs written in haXe can be compiled for several platforms, including JavaScript, Flash, and Neko. For this reason the official website calls haXe the “web oriented universal language.” HaXe’s syntax is similar to that of Java and ActionScript. However, despite these promising characteristics, haXe has many drawbacks. I will focus on three areas in my critique of haXe: documentation, tool support, and libraries.
The API documentation on the official haXe website is difficult to find and incomplete. It is not even present in the top-level navigation of the website (even though there is clearly enough room). To find the API one must scroll through dozens of screens on the reference page to reach the link at the very bottom (with no prior mention that it is even there). Many of the API pages have no explanatory text at all. The methods provided by a library are merely listed, leaving one wondering what they do and what the purpose of the library is. There is very little documentation anywhere other than the official site. Even worse, Amazon lists only a single book on haXe. The product information page gives the book's length as 600 pages. It would be helpful if the entire haXe website provided 100 pages of documentation, especially considering that it is not available elsewhere.
The tool support in haXe is very poor. Although there is an Eclipse plugin for haXe, it adds barely any functionality. It does not support opening definitions, nor does it even have a hot-key for compilation, which has to be done using the context menu on the project. However, these are fairly minor issues in comparison to the biggest shortfall of the tool support for haXe, namely, that it does not have a symbolic debugger! The only method of debugging a haXe program is to catch exceptions and print them with trace statements. Furthermore, exceptions are not even automatically caught at the top of the execution stack, so if an exception does get that far, the programmer is simply shown a blank, white screen. Even JavaScript has an error console built in to every modern browser.
Lastly, I would like to discuss haXe’s libraries. The standard library contains only the most basic XML support, allowing you to parse an XML string into a tree and navigate this tree using the familial attributes of a node (parent and children). There is no built-in support for any of the W3C DOM features, not even the getElementById and getElementsByTagName functions. There is only a single library, by Daniel Cassidy, which implements the XPath W3C standard. This library is in an alpha state, and although I could find the documentation for it, I could not find out where to download the library. In comparison, Python has several XML libraries (PyXML, libxml2, ElementTree, and 4suite), all of which have full XPath support. Web programmers expect a language to have excellent XML support. The official haXe website lists twenty-one libraries. Ten of these libraries are by two authors. That is to say, nearly 50% of all the libraries listed on the official website were committed by two programmers. Thankfully, there is a standard MySQL library, or I would say that would be the killing blow for the mainstream use of haXe.
Although haXe has many desirable traits (including support for many platforms and familiar syntax), one can only assume that the claim of it being the universal language refers to its multi-platform support and not its pervasiveness. The contributors to the haXe community appear to be few, and this shows. In comparison to Python with its endless libraries and PHP with its plethora of user-submitted samples and tutorials (the bottom of every page on PHP.net contains a wealth of very helpful, user-submitted information), haXe is likely to be found wanting by the average web programmer. HaXe’s lack of good documentation makes it difficult for new users to learn quickly, while its lack of tool support (especially a debugger) and libraries makes it difficult to be very productive with it.
Zis Is Cursed, Zat Is Cursed
Spring has definitely started in Toronto — it’s 23 degrees outside (yes, Celsius), and the sky is that slightly dusty blue that I always associate with hot dogs, frisbees, and sunburn. It helps put this week’s rejections in perspective:
- NSERC turned down a proposal to study the usefulness of integrated IM, configurable ticketing, and continuous documentation in web-based software project portals.
- ITCDF (an internal program at U of T) turned down a proposal to see whether short screencasts (including ones contributed by students themselves) could take the place of conventional help pages.
Coming on top of NSERC’s rejection of my Discovery grant (which would have looked at the impact of reverse test oracles on uptake of test-driven development) and Google’s rejection of a proposal similar to the IM/ticketing/docs one I sent to NSERC, it’s making me wonder if I’ve been hexed. I know there’s a lot of randomness in academic grant review procedures, but being 0 for 4 (or 5, if you count the initial submission of the big NSERC proposal) definitely hints at a pattern.
Oh well — it’s still a sunny day out there, and in a couple of hours I get to go home and tickle my daughter’s feet…
Consulting Course Videos
There’s a metric shoeful of demo videos up on the consulting course web site now showing off what students did this term. I (or rather Samira and Jeremy, two of my grad students) would be particularly interested in comments on their tool for threading project histories — it’s going to be the focus of their research, and I think it’s a very cool idea.
Introducing Stack Overflow
Jeff Atwood (who didn’t like Beautiful Code, but is a nice guy anyway) and Joel Spolsky (best known these days for being Joel Spolsky) are jointly launching a new company called stackoverflow.com, which will be a Q&A/ask-the-experts/find-what-you-need site for programmers. Bookmarked…
Pogy Travel Crib
Madeleine is now a year, two weeks, and one day old. Still only has two teeth, but loves gnawing on bananas (and occasionally fingers); walkin’ up a storm, and we’re the proud owners of two hand-painted… um… paintings.
So this seems like as good a time as any to put in a plug for the Pogy Travel Crib:

It folds up into a disk about the size of two frisbees stacked on one another, weighs half a kilo, sets up in seconds, and is easy to clean. If you’re doing the rounds of friends & family with a newborn, I can’t say enough good things about it.
Integration Irony
We’ve been having a problem recently with self-registration in the new version of DrProject. Would-be users fill in an oh-so-familiar form (preferred ID, email address, password); their data is then held in a queue for an admin to approve. However, when the admin clicks “approve”, DrProject reports “user already exists in password file”.
Yesterday, David Wolever managed to track it down.
1. Two users are being confirmed at once (i.e., there are two or more requests pending, the admin has selected “approve” for both, then clicks “submit” on the form).
2. One works fine.
3. The other triggers an exception for some reason (usually missing information).
4. The exception causes the database transaction to roll back (good), but the first user’s ID and password are in the external password file (bad).
Yes, we will improve pre-transaction validation so that #3 happens less frequently, but we’re still left with the basic problem: we can’t make operations across two things (in this case, the database and the file system) atomic. We could make up a list of file operations to be undone in case of a database transaction failure (i.e., roll our own transaction system—bleah), or do the file operations first and proceed to the database transaction only if the file op succeeds (more code, which means more places where developers could forget to do things), or move passwords for self-registered users into the database (which makes administration of large portals harder: managing credentials in multiple credential stores is a project unto itself).
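For the curious, here is a minimal sketch of the first option, the "list of file operations to be undone". It is not DrProject's code: `approve_all`, the `users` table, and the `id:hash` line format are all made up for illustration, and sqlite3 stands in for whatever database a portal actually uses. It also isn't crash-safe; if the process dies between the file write and the rollback, the file is still wrong.

```python
import os
import sqlite3


def approve_all(conn: sqlite3.Connection, password_path: str, requests) -> None:
    """Approve every (user_id, email, password_hash) request, or none of them.

    Hypothetical sketch: snapshot the password file first, and restore the
    snapshot if the database transaction fails for any reason.
    """
    snapshot = ""
    if os.path.exists(password_path):
        with open(password_path) as f:
            snapshot = f.read()
    try:
        with conn:  # sqlite3 commits on success, rolls back on an exception
            for user_id, email, password_hash in requests:
                # File operation, outside the database's control.
                with open(password_path, "a") as f:
                    f.write(f"{user_id}:{password_hash}\n")
                # Database operation, inside the transaction.
                conn.execute(
                    "INSERT INTO users (id, email) VALUES (?, ?)",
                    (user_id, email),
                )
    except Exception:
        # Compensating action: put the password file back the way it was.
        with open(password_path, "w") as f:
            f.write(snapshot)
        raise
```

Even this toy version shows why the option is unappealing: every new kind of file operation needs its own undo step, and someone has to remember to write it.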
I’m sure we’ll come up with something, but until we do, I’m just going to savor the irony of it all. Four years ago, when we forked from Trac, we faced a similar problem. Should each of a portal’s projects have its own database and version control repository, or should we use one DB for all projects (but separate repositories), or one DB and one repository? We eventually realized that one DB shared by all projects was the only option that made sense, even though it meant more hacking on the tables inherited from Trac, because multiple DBs would require us to build our own atomicity layer. It’s ironic that trying to keep all the user credentials in one place (a file on disk, accessed only by a setuid validation program), has gotten us back to the same point.
Later: at Alan Rosenthal’s suggestion, David Wolever has “solved” this particular problem by making operations on the credentials file idempotent: if DrProject tries to add data that’s already there, nothing happens and no error is reported. It’s not a general solution—in fact, I’m sure that one day someone somewhere will curse us for special-casing this—but it’s good enough to get 3.0.1 out the door.
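A rough sketch of what an idempotent add might look like, assuming a plain-text credentials file with one `id:hash` entry per line. The real file format, and the setuid program that guards it, are not shown here; `add_credential` is a hypothetical name, not DrProject's.

```python
import os


def add_credential(path: str, user_id: str, password_hash: str) -> bool:
    """Append `user_id:password_hash` to the file unless user_id is already there.

    Returns True if a line was written and False if the call was a no-op.
    Hypothetical sketch; the one-entry-per-line format is an assumption.
    """
    if os.path.exists(path):
        with open(path) as f:
            for line in f:
                if line.split(":", 1)[0] == user_id:
                    # Already present: do nothing and report no error.
                    return False
    with open(path, "a") as f:
        f.write(f"{user_id}:{password_hash}\n")
    return True
```

Note that a second call with the same ID but a different hash is also silently ignored in this sketch, which is exactly the kind of special-casing someone may curse later.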
SPOC
Regarding the idea of reproducible research, I stumbled over I-SPOC while looking through Google Summer of Code stuff. From their pitch:
The overall goal of this project is to build computational and social infrastructure to support the use of a new form of scientific communication called a SPOC (Scientific Paper with Open Communication). A SPOC combines a standard academic paper with open source computational models written in any publicly accessible computer language. SPOCs will (i) link computational results with the models that produce them, allowing independent verification and validation (ii) create incentives for cleaner, more transparent code and for the sharing of code (iii) enable others to extend and improve existing computational models and to verify model robustness (iv) bring computational models to life allowing faculty, students, and other scholars to see dynamic phenomena emerge and (v) have an enormous effect on the teaching of science.
The reality isn’t (yet) as impressive as the vision, but it’s still intriguing. I think there’s some great work in requirements engineering waiting to be done here: is reproducibility both necessary and sufficient for scientists to regard their peers’ computational work as science? If so, what must a tool do or provide in order to satisfy that need? If not, what are the requirements, and why?
