Archive

Archive for November, 2006

Not on the Shelves (Version 3)

November 30th, 2006

Every couple of years, I indulge in a bit of sympathetic magic by putting together a list of books I want someone to write, so that I can review them. If nothing else, doing this helps me figure out what I currently think is and isn’t important in our profession. Previous versions were written in 1997 and 2003; the latest revision lists 16 books that would sell well while making the world a better place.

Books

Writing Blackboard Plugins

November 30th, 2006

Two of my students (Billy Chun and Darren Jung) spent the term writing a plugin for Blackboard, the Java-based learning management system (LMS) that the University of Toronto is moving to. Here are their experiences:
Billy Chun:

When first attempting to develop a ‘Hello World’ plug-in for Blackboard, I needed to do some research to find the basic structure of a Blackboard plug-in. The Blackboard developer site (http://www.blackboard.com/extend/dev/) had some sample code available, but only contained the JSP files for the plug-in. Further searching the site led me to the Blackboard SDK, and here was where I obtained that information.

With the SDK in hand, I started to create the plug-in. The sample plug-in found accompanying the SDK provided excellent reference to follow when coding my own. The documentation PDF for the sample provided a fairly simple explanation of the structure of the plug-in and the code found in the sample JSP and XML manifest files. Although this documentation followed a specific method in generating the packaged plug-in, it was still simple to follow.

Two difficult areas when learning Blackboard were reading the general developer documentation and finding APIs. The general developer documentation explained the plug-in structure quite well, but the syntactic details for some aspects were puzzling. Two examples of this ambiguity were the explanations of the bb-manifest file and the tag libraries. The documentation did explain what was or was not required, but gave fairly brief information about items like expected input values. I had to google to find details on this. Next, finding APIs for Blackboard classes would have been difficult if it were not for the help of Andrew Wang (Blackboard developer over at UTM). Nowhere on the Blackboard developer site could I find a link to APIs.

A good suggestion I would make when learning Blackboard would be to get a demo of the system with someone experienced in developing Blackboard plug-ins. The demo Andrew presented really gave me a head start as a beginner into creating future plug-ins.

Darren Jung:

Experiencing Blackboard was a great opportunity for me. In my case, I started late because of my personal situation, and moreover, I had no previous experience with JSP files or JavaScript. So I was frustrated: I didn’t know what was going on, and I had no idea what Blackboard even was!

Billy, my partner on this project, helped me a lot. He showed me where to get references, and explained them to me. After that I felt that I was going in the right direction.

My difficulty came at the very start: I had no idea at all, and I felt lost. Getting help from Billy was a great choice, because when anyone is dropped into a whole new world, he or she has to work out what’s going on in that world. And that takes a LOT of time if you’re alone and don’t get help from people in that new world.

From that, I’ve written my bb_manifest.xml and .jsp files. Getting ‘Hello world’ from the server was quite easy. One more comment for users of Mac OS X. I need to investigate why this is so later, but I’ve found that when I build the .war file on my Windows XP machine, the file size is well below 70 KB, and it took about 5 seconds to upload the building block. But on my Mac (a 2.0 GHz MacBook), the .war file is about 20 megabytes, and it took more than 10 minutes to upload the plug-in! I guess this is partly because Windows uses .zip and Mac uses .jar files as external libraries, but I’m not sure at this point.
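
One way to chase down a size difference like the one Darren describes: a .war file is an ordinary zip archive, so a few lines of Python can list its largest entries and show what is being bundled on one machine but not the other. This is only a sketch; the function name and the archive paths are made up for illustration.

```python
import zipfile


def largest_entries(archive_path, count=10):
    """Return (uncompressed size in bytes, entry name) for the biggest entries.

    Works on any zip-format archive, which includes .war and .jar files.
    """
    with zipfile.ZipFile(archive_path) as archive:
        sizes = [(info.file_size, info.filename) for info in archive.infolist()]
    # Largest first; ties are broken by name because tuples compare element-wise.
    return sorted(sizes, reverse=True)[:count]


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python war_sizes.py helloworld.war
    for path in sys.argv[1:]:
        for size, name in largest_entries(path):
            print("%10d  %s" % (size, name))
```

Running it against the Windows-built and the Mac-built archives and comparing the two listings should show quickly whether extra libraries are being packed into the larger one.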

Uncategorized

Software Carpentry article in CiSE

November 28th, 2006

The November 2006 issue of Computing in Science and Engineering includes an article I wrote titled “Software Carpentry: Getting Scientists to Write Better Code by Making Them More Productive”. It’s available to subscribers only on the magazine’s web site, but they have kindly given me permission to post it here. I think it’s out too late to bring November’s stats up above October’s, but I could be surprised.

November Statistics

Software Carpentry

Presto and Responsibility

November 28th, 2006

TechCrunch recently profiled Presto, which allows people who don’t have Internet access to receive photos and email. (Special printer plugs into phone line; people with ‘net access email yourname@presto.com; printer prints.) It reinforces a theme I’ve been hitting in class with increasing regularity: the fastest-growing market in the industrialized world is old people, so if you’re looking for a startup idea, try to think of something a 65-year-old would give you money for.


Later: Jon Udell posted some interesting discussion of using modern technology to make the whole cost of things visible to consumers. I think this is another area with huge potential for growth…

Uncategorized

Psiphon in the News Again

November 27th, 2006

Psiphon has made the news again — this time at O’Reilly. Yay!

Uncategorized

NPR Industriosphere

November 26th, 2006

Brian Hayes (author of Infrastructure, and of many fine articles in American Scientist) reports that National Public Radio is launching a new series on the “industriosphere”, which you can listen to online.

Announcements

DrProject Internals: Email

November 25th, 2006

And now, DrProject’s email. This was the first completely new subsystem we added after we forked; running Mailman in parallel with Trac worked well enough while we were bootstrapping, but the fact that neither system knew about the other made both a pain to use. We couldn’t, for example, run a single search query that would hit both a project’s wiki and its mailing list archive; we also had to do a little dance to keep project and mailing list memberships in sync, and we never even tried to modify Mailman so that wiki-syntax shortcuts to tickets and pages in mail messages would be turned into links automatically.

The broad outline of our design was straightforward:

  1. We didn’t want to compete with people’s existing mail clients, so DrProject would provide relay and archiving only: there’d be no way to compose or receive messages from within DrProject, and no way to send mail to individuals (only to projects). That simplified things at least as much as only supporting reads simplified the Subversion repository viewer.
  2. We could safely assume that the Linux host running DrProject had a mail transfer agent (MTA) such as sendmail. We could further require that whoever installed DrProject be able to modify the MTA’s configuration to route messages for particular addresses to a program we provided. In particular, we could tell the MTA to store all messages for A+B@host.name in a directory called /drproject/A/B. In this scheme, A identifies a particular instance of DrProject on the host, while B identifies the project within that instance. For us, A would be a course ID, like csc408, and B would identify the student team.
  3. Every time the DrProject CGI ran, it could check the spool directories for new messages. If it found any, it could copy them into the database, index them for searching, and forward them to the members of the project the message was for.
  4. The wiki parser could be modified to recognize @123@ as a shortcut to message #123. We decided to use an ‘@’ before and after to avoid worrying about the possible ambiguity in abc@123.yourhost.com.
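
Items 2 and 4 of that outline are easy to sketch in Python. This is illustrative rather than DrProject’s actual code: the characters allowed in instance and project names, the function names, and the /mail/ URL prefix are all assumptions.

```python
import re
from pathlib import PurePosixPath

# Spool root taken from the outline above.
SPOOL_ROOT = PurePosixPath("/drproject")

# instance+project@host. The permitted characters are an assumption.
ADDRESS_RE = re.compile(
    r"^(?P<instance>[A-Za-z0-9_-]+)\+(?P<project>[A-Za-z0-9_-]+)@(?P<host>[A-Za-z0-9.-]+)$"
)

# The @123@ shortcut: digits with an '@' on both sides.
MESSAGE_LINK_RE = re.compile(r"@(\d+)@")


def spool_dir(address):
    """Map 'A+B@host.name' to /drproject/A/B, or None if the address doesn't fit."""
    match = ADDRESS_RE.match(address)
    if match is None:
        return None
    return SPOOL_ROOT / match.group("instance") / match.group("project")


def link_messages(text, url_prefix="/mail/"):
    """Turn each @123@ shortcut into an HTML link (url_prefix is hypothetical)."""
    def replace(match):
        number = match.group(1)
        return '<a href="%s%s">message #%s</a>' % (url_prefix, number, number)
    return MESSAGE_LINK_RE.sub(replace, text)
```

Requiring the trailing ‘@’ is what keeps an ordinary address such as abc@123.yourhost.com from being mistaken for a shortcut: the digits there are followed by a dot, so the pattern does not match.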

Simple enough—but as always, there were a lot of sharp-toothed details lurking in the underbrush. First (and simplest), the program invoked by the MTA to copy messages into the spooling directory, and the DrProject CGI, had to lock files at the proper times, so that DrProject wouldn’t try to read a message while the handler invoked by the MTA was still writing it.
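
A minimal sketch of that locking discipline, assuming a Unix host and advisory locks via flock. The function names are made up, and this is not the code DrProject ships.

```python
import fcntl


def spool_message(path, raw_message):
    """Run by the MTA's handler: write one message while holding an exclusive lock."""
    with open(path, "ab") as handle:
        # Blocks until no other process holds a lock on this file.
        fcntl.flock(handle, fcntl.LOCK_EX)
        handle.write(raw_message)
        handle.flush()
    # Closing the file releases the lock.


def read_spooled_message(path):
    """Run by the web process: wait for any writer to finish, then read."""
    with open(path, "rb") as handle:
        # A shared lock cannot be granted while a writer holds the exclusive one.
        fcntl.flock(handle, fcntl.LOCK_SH)
        return handle.read()
```

Advisory locks only help if both programs take them, and there is still a small window between the writer creating the file and locking it. A reader that finds an empty file should treat it as not ready yet; writing to a temporary name and renaming it into place is a common alternative that avoids the window altogether.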

Second, we needed a way to prevent the project lists from being spammed. After some discussion, we decided to use whitelisting: every user would have to tell DrProject the addresses from which she wanted to be able to send mail, and select one of those for DrProject to forward mail to. The procedure we’re currently using is far from original:

  1. After logging in, the user goes to the preferences page and enters the email address she wants to register.
  2. DrProject stores that address in an UnconfirmedEmail table, then sends a message to that address requesting validation.
  3. Once the user gets the message and validates the address, it is added to the set associated with her account. One of those must always be marked for forwarding: all mail sent to projects of which she is a member will be forwarded to that address. She can turn forwarding on or off on a per-project basis, but we didn’t see any reason to allow mail to different projects to be forwarded to different addresses.
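
The three steps above can be sketched with an in-memory stand-in for the database tables. Everything here is hypothetical apart from the UnconfirmedEmail idea itself: the class and method names are invented, the use of a random token as the validation key is an assumption, and lower-casing whole addresses is a simplification.

```python
import secrets


class EmailRegistry:
    """Toy model of address whitelisting with confirmation."""

    def __init__(self):
        self.unconfirmed = {}  # token -> (user, address); plays the UnconfirmedEmail table
        self.confirmed = {}    # user -> set of validated addresses
        self.forward_to = {}   # user -> the one address mail is forwarded to

    def request(self, user, address):
        """Step 2: remember the address and return the token to mail to it."""
        token = secrets.token_urlsafe(16)
        self.unconfirmed[token] = (user, address.lower())
        return token

    def confirm(self, token):
        """Step 3: move a validated address into the user's set."""
        entry = self.unconfirmed.pop(token, None)
        if entry is None:
            return False  # unknown or already-used token
        user, address = entry
        self.confirmed.setdefault(user, set()).add(address)
        # The first validated address becomes the forwarding address by default.
        self.forward_to.setdefault(user, address)
        return True

    def may_post(self, address):
        """Whitelist check applied to the sender of an incoming message."""
        address = address.lower()
        return any(address in known for known in self.confirmed.values())
```

Sending the validation message and handling the user’s reply are left out; the point is only that an address cannot be used for posting until its token comes back.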

There’s still a bit of room for abuse here: if I told DrProject that yourname@yourhost.com was my address, but never replied to the validation request, you wouldn’t be able to claim it. We figured that was a pretty minor issue, and that it could be resolved by divine (i.e., administrative) intervention, so we didn’t worry about it.

What we did have to worry about was exactly what constituted “membership” in a project for the purpose of message forwarding. Our authorization scheme doesn’t actually include a notion of “membership”; instead, every user has a role (possibly a default role) with respect to each project, and each role is a collection of capabilities. Should roles have a MEMBERSHIP attribute? Or should we infer “membership-for-the-purpose-of-mail-forwarding” from something else?

We went with the latter: if your role with respect to a project gives you MAIL_POST privileges, then messages to the project list are forwarded to you. MAIL_VIEW isn’t enough, since we may want to give anonymous users the ability to read the archives of “public” projects, but don’t want a special-case rule saying, “Forward to anyone with this capability unless they’re anonymous or nobody.”
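
In code, the rule amounts to a capability check rather than a membership flag. A small sketch, in which the role names and the shape of the data are assumptions (only MAIL_POST and MAIL_VIEW come from the scheme described above):

```python
MAIL_VIEW = "MAIL_VIEW"
MAIL_POST = "MAIL_POST"

# Each role is a collection of capabilities. The role names are hypothetical.
ROLES = {
    "developer": {MAIL_VIEW, MAIL_POST},
    "anonymous": {MAIL_VIEW},
}


def forwarding_recipients(project_roles, muted=frozenset()):
    """Return the users who should receive mail sent to one project.

    project_roles maps user name -> role name for that project; muted holds
    users who have turned forwarding off for it.
    """
    return sorted(
        user
        for user, role in project_roles.items()
        if MAIL_POST in ROLES[role] and user not in muted
    )
```

Anonymous readers drop out without any special case, simply because their role lacks MAIL_POST.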

It all worked well under test, but failed when we first deployed it last fall. The problem turned out to be some missing quotes in a shell script—the commands all worked when run directly from an interactive shell prompt, but failed when the script was invoked. Once that was fixed, we began noticing that messages would sometimes be delayed for hours—even days—before being delivered.

That one turned out to be a simple oversight. DrProject is a long-lived CGI (we actually use SCGI); when it’s not actually processing an HTTP request, it just sits and waits. That means that it only looks for new mail messages when someone interacts with it over the web (e.g., files a ticket or views a wiki page). Messages sent to project lists were therefore piling up until someone went to check on them, at which point they were all forwarded.

The solution we’re now using is a cron job that sends a dummy HTTP request to DrProject every two minutes or so. It was a simple thing to write, but we’re still unhappy with it, since it’s difficult for developers to test, and is yet another scraplet that administrators have to remember to deploy and restart. I’d like to fold the cron job into the SCGI process some day, but it’s well down my wish list.
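
For the record, the kind of script such a cron job might run is only a few lines. This is a sketch rather than our actual script; the URL, the file path, and the function name are placeholders.

```python
import sys
import urllib.request

# Example crontab entry (path and URL are hypothetical):
#   */2 * * * * python3 /usr/local/bin/poke_drproject.py http://localhost/drproject/


def poke(url, timeout=10):
    """Send a throwaway GET so the long-lived process wakes up and checks for mail."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


if __name__ == "__main__" and len(sys.argv) > 1:
    poke(sys.argv[1])
```

Note that urlopen raises an exception on connection failures and on HTTP error statuses, so cron will mail the administrator a traceback if the server is down, which is arguably a feature.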


Later: how could I have forgotten the address rewriting problem? We’re currently hosting course-related instances of DrProject on Stanley, a medium-hefty server donated by the kind folks at the Jonah Group. For the first few weeks of term, mail forwarded by DrProject had instance+project@stanley.cs.toronto.edu as a return address. The problem was, the CS department’s mailer was rewriting this as instance+project@cs.toronto.edu. That makes perfect sense for mail from real people (you probably don’t care that the machine I compose my messages on is jalkelainen.cs.toronto.edu), but since the department’s mail server didn’t know about DrProject’s project-related addressing, anyone who just hit “reply” to a forwarded message got a bounce-back. The “solution” (and yes, I think the quotes are justified) is to take advantage of the fact that drproject.org is hosted at stanley.cs.toronto.edu, and use instance+project@drproject.org as the return address. It’s these kinds of integration issues that make real software hard…

DrProject, Learning

DemoCamp 11

November 21st, 2006

A handful of early bloggers are calling DemoCamp 11 a “failure”.  I’m not sure why: I thought it had more interesting content than 9 and 10.  The lead-off, AutoSSL, was an interesting idea (auto-provision of SSL certificates to small devices in the home, like security cameras); there was too much slideware, and some technical problems, but I think we should start being more open-minded about the former, and they handled the latter with aplomb.

Andrew Reynolds demo’d Selenium next. It’s a very cool tool for testing web applications via the browser; he didn’t build it, but again, I think we’d do well to be open-minded about that — I’d certainly welcome more people standing up to say, “I found this really cool thing that made my life a zillion times easier, and you oughta know about it.”

The Design Bibliography wiki had nice tabs ;-) , and Sunir motivated it well; I agree with other commentators that it’s a very crowded space, but that doesn’t mean there isn’t room for new ideas.  My Studio Assistant was a prototype of a CMS for really nontechnical people; its creator wants to put together a team to build it, and market it to artists, craftspeople, galleries, and so on.  The only one that left me dissatisfied was Firestoker; after waiting a year for their demo, I’m still not sure what it actually is.

So overall, I’d give it three and a half stars out of five.  About a third of the audience was newcomers (which is good), and the discussion in the pub afterward was worth staying out on a cold November evening for.  I look forward to January’s…

DemoCamp

CSC49X Projects for Winter 2007 (final)

November 20th, 2006

Week 11, and most of the projects for next term have been lined up. Here’s the current road map:

  • The Online Marking tool, which gives TAs the ability to mark up student code on the web. This is now being used in a couple of courses, and there was lots of interest at DemoCamp in modifying it to work as a general online code review tool. Lillian Angel, Martin Williams, and Jane Shen will be working under the direction of Jennifer Campbell.
  • We’re also visualizing genealogical data for REED, the Records of Early English Drama project, using Flash, asynchronous XML, and other sexy technologies. Apple Viriyakattiyaporn, Melissa Luu, and Lisa Ly will be doing the work.
  • Prof. Stephen Strother’s team at the Rotman-Baycrest is studying the effects of strokes and other impairments on elderly patients; they have lots of data, and need a Java GUI for looking at it. David Wright and Richard Zhou will be building one for him.
  • Andrey Petrov and Maria Khomenko will be fixing up some file locking issues in SQLite. I’m pretty excited about this one, as it’ll be only the second time a 49X team has been contributing directly to a major open source project.
  • David Chen, Winson Chung, and Mikhail Temkine are going to help Paul Gries get VPython and a graphics package into shape for use in the new CSC120 course.
  • Debbie Winter and Rick Valenzano are going to study game theory and evolution under the direction of Gary Baumgartner.
  • Muhammad Ali will build the information visualization half of a status dashboard for DrProject.

15 students, 7 projects (8 if you count Igor Foox’s field trials of UTest, and 9 if you count ongoing enhancements of DrProject). I’m looking forward to it…

Uncategorized

Slow Growth is Still Growth

November 19th, 2006

We discovered late last week that a team at TUCOWS is using DrProject.  That got me wondering who else had downloaded it, so I trawled through some Apache server logs and came up with 56 distinct hosts in Canada, Japan, China, Taiwan, and the United States, including:

It ain’t quite world conquest, but it’s a start ;-)
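
For anyone who wants to do the same kind of log trawl, here is a rough sketch. It assumes the logs are in Apache’s Common Log Format (or the combined format, which starts the same way) and that downloads can be recognized by a fragment of the request path; the fragment and the function name are placeholders, not what I actually ran.

```python
import re

# host ident user [time] "METHOD path protocol" status ...
LOG_LINE_RE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "(?:GET|HEAD) (\S+)[^"]*" (\d{3})'
)


def distinct_hosts(lines, path_fragment):
    """Collect the hosts that successfully requested a path containing path_fragment."""
    hosts = set()
    for line in lines:
        match = LOG_LINE_RE.match(line)
        if match is None:
            continue  # malformed line or a method we don't care about
        host, path, status = match.groups()
        if path_fragment in path and status == "200":
            hosts.add(host)
    return hosts
```

Feeding it an open log file, as in `distinct_hosts(open("access.log"), "drproject")`, gives the set of hosts; mapping those to countries needs a reverse DNS or GeoIP lookup, which is not shown here.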

DrProject