Monthly Archives: June 2006

Mother Tongues and the Vietnam of Programming

Overnight links:

  • A New Scientist article reports: “…the native language you speak may determine how your brain solves mathematical puzzles, according to a new study. Brain scans have revealed that Chinese speakers rely more on visual regions than English speakers when comparing numbers and doing sums.” I wonder if it applies to programming?
  • Ted Neward on whether object/relational mapping is the Vietnam of computing. No, he doesn’t mean “proudly independent with a rich cultural history”. (Commentary from Ned Batchelder here.)
  • There’s a workshop on the social side of large-scale software development in Banff in November. Oughta be interesting…

The Overnight Link Roundup

Next… Design by Contract? (Please)

I was pleasantly surprised a few years ago when programmers (particularly open source programmers) actually started writing unit tests. XP is usually given the lion’s share of the credit, but I think that JUnit was the real reason: it was just enough structure to get people in, and had a perfect balance between simplicity and extensibility.

What I’d like to see catch on next is design by contract. Most people who’ve encountered it think it’s about enforcing modularity, or about making sure that derived classes respect the rules of their parents. What I like, though, is its temporal application: making sure that Version 3.1.2 of a class has the same externally-visible behavior as Version 3.1.1, so that upgrading is guaranteed not to break things. I’d particularly like this right now, as we’re banging our heads once more against “slight” differences between successive versions of what’s supposed to be the same damn library. The release notes for 3.1.2 of—let’s call it “Fred”, shall we?—don’t mention any backward-incompatible changes, but somewhere eight levels down in the call stack, something is passing an Element where it should be passing a string, and everything after that is blowing up.

Given pre- and post-conditions (a big given, I’ll grant you), a tool that checks an RPM, JAR file, or Egg against an existing installation, and reports discrepancies, shouldn’t be all that hard to build—the general problem may be undecidable, but even something conservative, that draws human attention to possibly problematic mutations, would go a long way toward making lives like mine a little better.
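To make the idea concrete, here is a minimal sketch of the most conservative version of such a check. It compares only the public method signatures of two versions of a class, which is far weaker than comparing real pre- and post-conditions, and it would not catch a change in what a method passes along internally. The classes Fred_3_1_1 and Fred_3_1_2 and the helper names are hypothetical stand-ins, not part of any real library.

```python
import inspect


def public_api(cls):
    """Map each public callable on cls to its signature, rendered as a string."""
    api = {}
    for name, member in inspect.getmembers(cls, callable):
        if name.startswith("_"):
            continue  # skip private and dunder members
        api[name] = str(inspect.signature(member))
    return api


def api_discrepancies(old_cls, new_cls):
    """Return human-readable differences that might break existing callers."""
    old, new = public_api(old_cls), public_api(new_cls)
    problems = []
    for name in sorted(old):
        if name not in new:
            problems.append(f"removed: {name}{old[name]}")
        elif old[name] != new[name]:
            problems.append(f"changed: {name}{old[name]} -> {name}{new[name]}")
    return problems


# Hypothetical stand-ins for two releases of "Fred".
class Fred_3_1_1:
    def render(self, text: str) -> str:
        return text.upper()

    def close(self):
        pass


class Fred_3_1_2:
    def render(self, element: "Element") -> str:
        return str(element).upper()


if __name__ == "__main__":
    for line in api_discrepancies(Fred_3_1_1, Fred_3_1_2):
        print(line)
```

Anything this reports is only a prompt for a human to look closer; a renamed parameter is flagged just as loudly as a genuinely incompatible one.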

Update: Aaron Bingham’s presentation from EuroPython 2006.

Why DrProject Is Slow

Billy Chun has been investigating why DrProject is so slow (5.1 seconds per request). As regular readers will know, we’re running it as a pure CGI: a new Python interpreter is forked for every request. The fork itself costs about half a second, and importing all the libraries takes about the same, but the real culprit turns out to be the 1.9 seconds (yes, 1.9 seconds) spent in Plex.nfa_to_dfa. Plex is the parser generator we used to create a parser for our wiki pages; according to its internal documentation, nfa_to_dfa translates non-deterministic finite automata into deterministic ones. In plain English, we’re recompiling our wiki syntax parser every single time a page is requested.

Oops.

So how much improvement can we expect? Well, when Billy ran DrProject with SCGI, the time per request dropped from 5.1 seconds to 0.6 seconds: no fork, no re-import, no regeneration of the wiki parser, and a few other things as well. That’s better than a factor of eight, and will get us well within our target performance range. He’s writing up an installation guide right now, and will then look at whether we can cache the Plex-generated parser somehow. These changes won’t be in this week’s release candidate, but ought to make it into the final 1.0.

Onward!

Pop vs. Soda?

This map of generic names for soft drinks is fascinating.  I can understand why “Coke” is a generic name in the south, with “pop” predominating in the north, but why the four disconnected islands of “soda”?  Do they represent the remnants of some earlier wave of migration that was overwhelmed by later arrivals?  Are there economic factors at play?  The public wants to know, dammit!

(via Aetiology once again)

DrProject 1.0: 98% and climbing

We’re edging up on DrProject 1.0’s first release candidate: 98% of our tickets are closed. If all goes well tomorrow and Wednesday, it should be available for trial on June 30, with the “real” release following a week or two later (depending on feedback). We still have performance problems to solve, and a couple of bug fixes that we aren’t going to merge into this release (too hard to unit test, and therefore too likely to destabilize the release), but I’m pleased, very pleased.

MDA vs. RonR: top-down vs. bottom-up?

Model-driven architecture, or MDA, is the latest darling of those who would have us program by describing our system at a high level in something other than code, then generate something runnable automatically.  Ruby on Rails is an MVC web application development framework that favors convention over configuration: for example, it maps your objects to database tables in a particular way, so that you don’t have to worry about it, and can’t mess it up.
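As a rough illustration of what “convention over configuration” means for the object-to-table mapping, here is a sketch in Python rather than Ruby. It is not Rails code: table_name_for is a hypothetical helper, and its pluralization rule (just append “s”) is a deliberate simplification of what a real framework does.

```python
import re


def table_name_for(cls):
    """Derive a table name from a class name, e.g. LineItem -> line_items."""
    # Insert an underscore before every capital letter except the first.
    snake = re.sub(r"(?<!^)(?=[A-Z])", "_", cls.__name__).lower()
    return snake + "s"  # naive plural; an assumption for illustration only


class Ticket:
    pass


class LineItem:
    pass


if __name__ == "__main__":
    for model in (Ticket, LineItem):
        print(model.__name__, "->", table_name_for(model))
```

The point is that nobody writes the mapping down: the name of the class is the configuration.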

My question is, are these really top-down vs. bottom-up approaches to the same geeky Nirvana? I have only played around with RonR so far, but at times, it really does feel like I’m modeling, rather than programming.  Once I know what my objects are, and how they relate to one another, much of what happens next happens automagically.  Yes, I still have to fill in details like the maximum size of a file upload by hand, but I’d have to do that with MDA, too.  Call me meatloaf, but it seems to me that if you slapped some WYSIWYG modeling tools on top of RonR, you could stick an MDA badge on the combination…

Perforce: For beginners only…

Perforce certainly has a stellar reputation amongst software developers. The conventional wisdom is Perforce for real projects, and your choice of open source version control system if you cannot afford to foot the Perforce bill. This naturally leads to the idea that Perforce is for power users, whereas systems like CVS or Subversion are for beginners. After a few days of using Perforce, however (I had only ever used Subversion and CVS in the past), I noticed a striking contrast between the two groups that seems to contradict this.

Consider the standard use case for a version control system: editing files, and then committing those changes back to the repository. If you are using Subversion, you simply edit the files in place. When you are ready to commit, you issue a ‘status’ command to get a quick overview of files that have been changed, files that have been added, and files that have been deleted. You may choose to add some files that are not under version control, revert some files that shouldn’t have been edited, and then issue the commit command to finalize your session.

Now consider the same case under Perforce. You go to edit your first file, only to realize that it is read-only, requiring you to first ‘open it for edit’. After a number of edits, you decide you need to add a new module to your program. Don’t forget to add this file to Perforce, because as far as I know, there is no way in Perforce to get a list of files that are not under version control. How many times has someone in your organization broken the build because they forgot to add a new file to Perforce? You then decide that you want to run a script to edit a large number of files. After you have modified these files to be read-write, you run the script, but then realize Perforce has not picked up the changes in the ‘Pending Changelists’ screen. Of course not, because you didn’t explicitly open these files for edit! When you are ready, you issue the ‘submit’ command to check in your changes.
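The missing “what have I forgotten to add?” report is, at heart, a set difference between the files on disk and the files the version control system knows about. The sketch below assumes you can obtain the list of tracked paths by some means; it deliberately does not show how to get that list out of Perforce, and untracked_files is a hypothetical helper, not a feature of any tool.

```python
import tempfile
from pathlib import Path


def untracked_files(workspace, tracked):
    """Return files under workspace whose relative paths are not in tracked."""
    root = Path(workspace)
    on_disk = {
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file()
    }
    return sorted(on_disk - set(tracked))


if __name__ == "__main__":
    # Build a throwaway workspace with one tracked and one forgotten file.
    with tempfile.TemporaryDirectory() as workspace:
        Path(workspace, "main.py").write_text("print('hello')\n")
        Path(workspace, "new_module.py").write_text("# forgot to add me\n")
        print(untracked_files(workspace, tracked=["main.py"]))
```

Running something like this before every submit would catch the forgotten-file build break, which is exactly the safety net a ‘status’ command provides for free.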

Can you spot the difference? With Subversion, the version control system is not forcing you to be aware of your changes until you are ready to commit. With Perforce, you constantly have to be on top of your changes during development. Perforce is forcing you to think about version control while programming. While Subversion silently sits in the background, Perforce is constantly poking you in the back whenever you want to do something. Could you get any work done if someone was doing that in real life?

The real issue boils down to Power Users vs Beginners. Perforce holds your hand and puts in restrictive measures to ensure you don’t break things. This is great for beginners, but quickly becomes annoying and frustrating for experienced users.

What about the other features of Perforce? Surely they must justify the $800 per head price tag? Maybe, but as a developer, I spend 95% of my time in the edit-resolve-commit cycle. Get out of the way and let me work. Now who broke that build…