Over on the Basie blog, Florian has posted an idea about using AJAX to get around one of the most annoying problems in DrProject: timeouts during lengthy batch creation of projects and/or users. Basically, his plan is to have the browser send one create request at a time, instead of sending a batch to the server and asking it to do them all at once (which often led to timeouts, since each creation can take about one second, and classes often have a hundred students or more). It’s theoretically less efficient (more network traffic, and N rewrites of the .htaccess file instead of one), but “slower and always works” is better than “faster and sometimes fails”. He’d welcome your comments…
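The one-request-at-a-time idea can be sketched roughly as follows. This is a minimal illustration in Python rather than browser-side JavaScript. The names `create_one_at_a_time` and `send_create` are hypothetical, and nothing here is taken from DrProject or from Florian's post. The point is only that each creation is its own short request, so a failure affects one item and the rest of the batch still goes through.

```python
from typing import Callable, Iterable, List, Tuple


def create_one_at_a_time(
    names: Iterable[str],
    send_create: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Issue one create request per name instead of one big batch.

    `send_create` is a hypothetical stand-in for whatever sends a single
    create request to the server; it should return True on success.
    """
    created: List[str] = []
    failed: List[str] = []
    for name in names:
        try:
            ok = send_create(name)
        except Exception:
            # A timeout or network error only costs us this one item.
            ok = False
        (created if ok else failed).append(name)
    return created, failed


if __name__ == "__main__":
    # Fake sender for demonstration only: pretend that "bob" fails.
    def fake_send(name: str) -> bool:
        return name != "bob"

    done, retry = create_one_at_a_time(["alice", "bob", "carol"], fake_send)
    print("created:", done)
    print("to retry:", retry)
```

The list of failures can be shown to the user or resubmitted, which is where "slower and always works" pays off.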
Basie
As one term ends, so another begins, and we’re very excited to have another good team of summer interns this year:
- Mike Conley, Severin Gehwolf, Amanda Manarin, and Nelle Varoquaux are working on OLM for Karen Reid
- Ainsley Lawson, Samar Sabie, Sarah Strong, and Maria Yancheva are doing climate change projects for Steve Easterbrook
- Eran Henig, Bill Konrad, Phyllis Lee, and Florian Shkurti are working on the Django-based rewrite of DrProject for me, with Ashwin Bhat, Leigh Honeywell, and a few others helping out
- Blaise Alleyne is doing a Google Summer of Code project with Drupal
We’re kicking things off on Monday the 11th—stay tuned for progress reports.
Basie
Over on the Basie blog, Zuzel Vera Pacheco writes:
This week I presented cdocs at the Scientific Student Day as part of the thesis’ requirements. It won two awards: Best Work and Best Written Work. Also I got the membership to the Cuban Society of Mathematics and Computer Science and a Spanish edition of Introduction to Programming Using Java as prizes. I owe you the English translation of the paper and the presentation.
Congratulations, Zuzel—it’s well deserved.
Basie
As I’ve mentioned previously, we’re moving DrProject over to Django. Work started in September, and the second release is coming up soon. The schedule highlights some of the differences between doing development with full-time developers, and doing it with students who are working 1/5 or 1/4 time:
- Friday March 20: feature freeze. Anything that adds functionality must be up on review board by the close of business. Anything that’s posted in rough shape just to get it in, or without tests, will be rejected.
- Friday April 3: code freeze. Integration, testing, and bug fixing must be wrapped up by the close of business.
- Thursday April 9: release. We’ll spend the week between code freeze and release tidying up, asking friends to test our installers and setup instructions, etc.
I’m pretty pleased with the state of the reworked code; looking forward to reactions from the rest of the world.
Basie, DrProject
I blogged a couple of weeks ago about this term’s consulting projects. Here are a few more details:
- Hanieh Bastani is working with AutoDesk to create animation software with realistic flesh and bone models.
- Botond Ballo, David Cooper, Eran Henig, Bill Konrad, Derek Kwok, Phyllis Lee, and Kosta Zabashta are porting DrProject to Django. They’re joined by Heather Grant (University of Alberta), Zuzel Vera Pacheco (University of Havana), Dan Servos (Lakehead University), and Jason Whyne (University of Waterloo).
- Arnold Binas and Alex Levinshtein are building a better friend finder for Chapters/Indigo, while Laurent Charlin and Maksims Volkovs are building them a better recommendation engine.
- Aran Donohue and Veronica Quinones are working with James Leung of the University of Alberta to build an abstraction library in Python so that we can plug different version control systems into applications.
- Fan Dong is continuing his work on the usability of scientific workflow tools.
- Andriy Borzenko and Cameron Gorrie are using a small suite of programs to gauge the usability of different parallel programming systems.
- Valdas Bancewicz and Anatoliy Kats are writing image-to-object matching algorithms on GPUs for MDA.
- Eben Hailemariam is writing fluid flow simulation code on GPUs for the Civil Engineering department.
- Bijoy Mandal and Weichen Wang are helping Prof. Matthew Roorda (Civil Engineering) visualize real-time traffic data.
- Torsten Hahmann is doing geospatial reasoning for the ILUTE project in the Civil Engineering department.
- Phillipa Gill and Lee Zamparo are investigating better spam filtering techniques for MessageLabs.
- Ziad Hatahet and Vivek Lakshmanan are looking at the feasibility of moving the TUCOWS mail hosting service to NFS v4.
- Mohammad Jalali and Rory Tulk are seeing whether we can automatically generate REST APIs from ORM metadata. (This one might be renegotiated, since there’s been more prior work than I’d realized.)
- Zachary Kincaid might be implementing a domain-specific scripting language—details to follow.
- Matthew Ansell is building wizards for Mirarco.
- Michalis Famelis is helping StereoLogic build tools to reverse engineer business processes.
- Nick Shim and (probably) Olga Irzak are putting ClearCanvas’s medical imaging software on a Microsoft Surface.
- Andrew Trusty is preparing the first open source release of Psiphon.
- Carolyn MacLeod is extending our survey of how scientists use computers.
- Hooman Bahador, Nikola Kramaric, and Ainsley Lawson are helping Prof. Ian Spence (Psychology) study the effects of video games on attentional field of view.
- Michael Reimer is trying to get the Tesseract OCR software to work as an accessibility aid.
- Jennifer Ruttan is working with Mozilla to adapt Ubiquity as an accessibility aid.
It’s going to be a busy term, but I’m looking forward to it.
Basie, Teaching
One of the things I teach my students is that the real purpose of a schedule is to tell you when to start cutting corners and dropping features. The ticker on my web site tells me I have 489 days left in my contract with the university; I signed up hoping to study ways of teaching second-stage novices [1] how to be better programmers, but after four failed attempts to get NSERC funding [2], it’s time to lower my sights. Here are the things I’d like to finish off before my stint at U of T is over:
- Help Samira Abdi, Jeremy Handcock, and Carolyn MacLeod finish their Master’s theses, and get Aran Donohue, Alecia Fowler, Alicia Grubb, Zachary Kincaid, Jason Montojo, and Rory Tulk through theirs.
- Publish Practical Programming (the “CS-1 in Python” book that Jennifer Campbell, Paul Gries, Jason Montojo, and I have been writing). It’s currently in beta, and due for release in a month or so; we’d like to do a Python 3 update in a year or so, but that’s likely to slip.
- Finish the study of how scientists actually use computers. Data from the initial survey is now being processed; we’ll put together a follow-up survey in the next couple of months, write a “popular science” paper for American Scientist in the spring, present results at the SECSE workshop in Vancouver in May, and submit a paper by year’s end.
- Co-edit a special issue of Computing in Science & Engineering on “Software Engineering and Computational Science”. Andy Lumsdaine and I have four articles lined up, and are looking for two more—if you’d like to volunteer, please give me a shout.
- Submit a proposal for a professional master’s degree in Computer Science to U of T’s School of Graduate Studies. This is mostly a matter of filling in forms, but that’s kind of like looking at Everest and saying, “It’s mostly a matter of going uphill.”
- “Finish” a much-improved DrProject. I originally planned to use it as a platform for research, as well as teaching; there isn’t enough time left for that, but I still hope to make it easier for software engineering instructors to introduce students to modern tools.
- Rewrite Software Carpentry. Tina Yee has translated some of the lectures into MATLAB; the next step is to make the whole thing look like it was written in the 21st Century [3].
Everything else has to go by the boards. In particular:
- I have resigned from my contributing editor post at Doctor Dobb’s Journal. It was a lot of fun, and I really enjoyed working with Jon Erickson, but as I said back in October, I’d rather not do it than do it badly.
- The software developers’ reading group I’d planned to start this January isn’t going to happen. I’d really like something to pick up the slack now that DemoCamp seems to have stalled (if only to provide an excuse to get together with former students on a regular basis), but someone else is going to have to organize it.
- After this term, I’m going to stop supervising student projects (except those directly relevant to DrProject and/or Software Carpentry). Next to 10:00 am coffee breaks with the lecturers, this is the part of university life I enjoy the most, but there just isn’t time…
- The Software Project Coloring Book (my attempt to write down everything I try to teach undergraduates about real-world software development) is being put back on the shelf. I have written 35,000 words, but those were the easy bits: conservatively, I’d need 4-6 months of full-time work to finish it off.
On the upside, Sadie got me some biking gear for Christmas, so now I’ll have to shed the twenty pounds I’ve picked up in the last couple of years, and I get to start taking our daughter to music classes every week. To quote a friend, it isn’t what I planned—it’s better.
[1] People who already know how to write programs, but not how to develop applications. I’m specifically interested in undergraduate Computer Science students, and graduate students in other disciplines.
[2] Companies like Nitido, the Jonah Group, Idee, and Rogers have kindly donated a few thousand dollars each to keep things like DrProject going, as have several of my fellow professors, but a $24K grant from The MathWorks is the only “research” funding I’ve been able to raise.
[3] As I said yesterday, I’m looking for a mentor in the Toronto area who can show me how to do this.
Basie, DrProject, Practical Programming, Research, Software Carpentry, Teaching, Writing
There’s a quote (attributed to various people—I’d welcome a pointer to the original) to the effect that if you show me your code, I don’t know what you’re doing, but if you show me your data structures, I’ll understand. To figure out just how far our students got rebuilding DrProject on top of Django this term, I asked one of them to generate a schema diagram for the database tables. The result, included below, was created by running the following commands in a virtual environment:
$ svn checkout http://django-command-extensions.googlecode.com/svn/trunk/ django-extensions
$ cd django-extensions
$ python setup.py install
$ django graph_models -ag > schema.dot
$ dot -o schema.png -Tpng schema.dot

(Note: I moved the three tables floating in the bottom middle from the upper right corner to make it more printable.)
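For readers who have not seen the intermediate format, the `schema.dot` file that `dot` turns into a picture is plain text. The sketch below writes a similar file by hand for a made-up schema. The function `schema_to_dot` and the `Project` and `Ticket` tables are hypothetical illustrations, not DrProject's actual tables and not the output of `graph_models`; table names are assumed to be plain alphanumeric identifiers.

```python
from typing import Dict, List, Tuple


def schema_to_dot(
    tables: Dict[str, List[str]],
    foreign_keys: List[Tuple[str, str]],
) -> str:
    """Render tables and foreign-key links as Graphviz DOT text."""
    lines = ["digraph schema {", "  node [shape=record];"]
    for name in sorted(tables):
        # "\l" is Graphviz's left-justified line break inside a label.
        fields = "\\l".join(tables[name]) + "\\l"
        lines.append(f'  {name} [label="{{{name}|{fields}}}"];')
    for source, target in foreign_keys:
        # One edge per foreign key, pointing at the referenced table.
        lines.append(f"  {source} -> {target};")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical two-table schema, for illustration only.
    example = schema_to_dot(
        {"Project": ["id", "name"], "Ticket": ["id", "project_id", "title"]},
        [("Ticket", "Project")],
    )
    print(example)
```

Saving the printed text as `schema.dot` and running the `dot` command shown above on it should produce a small two-box diagram.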
Basie
Since September, half a dozen students at four universities have been rebuilding DrProject (our lightweight classroom-friendly replacement for Trac) on top of Django (a Rails-like web programming framework written in Python). What’s made this project different—and IMHO better—is the use of code reviews. Blake Winton, a local Python hacker, reviewed every single commit that came into the SVN repository in the first couple of weeks of the project. Thanks to his example, the students started reviewing each other’s work as well (Jeff Balogh, a two-time Google Summer of Code veteran who’s starting a full-time job with Mozilla in January, being the most prominent culprit).
It made a huge difference to productivity and code quality, and we’d like to do it again next term, but are wondering how best to implement it. We managed reviews this term by having each commit diff echoed to a mailing list; a self-appointed reviewer would reply to the email with comments, the author of the diff would reply, others would chip in, etc. I thought it worked pretty well (especially relative to the near-zero setup cost), but some students said in the post mortem that some commits fell through the cracks, while others said they found it hard to track what was going on, since code review threads often turned into design discussions without a signal going to the larger group.
So, my question is, what could/should we do next term without either a big investment in infrastructure, or weeks of retraining? (I have no objection in principle to doing either, but since students only work 10 hours a week on this project, and usually have four other courses on the go, I have to focus on the absolute smallest thing that could possibly work.) One suggestion has been to prepare a diff and send it to the reviewers’ mailing list before committing, so that reviews happen before code goes into the repo. Another is to pseudo-randomly assign commits to other team members for review (so that nothing gets dropped on the floor), and to use a “three strikes” rule to promote discussion from the review list to the dev list. What would you suggest? What do you think would work for a larger group (say, a class of 50 students, working in teams of 5, each team doing an 8-hour-a-week term-long project in parallel)?
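To make the second suggestion concrete, here is a minimal sketch of pseudo-random reviewer assignment. Everything in it is an assumption for illustration: the function names, the choice of a hash of the commit ID as the source of "randomness", and especially the reading of "three strikes" as "a review thread with three or more replies moves to the dev list".

```python
import hashlib
from typing import Sequence


def pick_reviewer(commit_id: str, author: str, team: Sequence[str]) -> str:
    """Deterministically pick a reviewer other than the commit's author.

    Hashing the commit ID spreads reviews across the team while letting
    anyone recompute who was assigned to a given commit.
    """
    candidates = sorted(member for member in team if member != author)
    if not candidates:
        raise ValueError("need at least one team member besides the author")
    digest = hashlib.sha256(commit_id.encode("utf-8")).hexdigest()
    return candidates[int(digest, 16) % len(candidates)]


def should_promote(reply_count: int, strikes: int = 3) -> bool:
    """Assumed reading of 'three strikes': promote after `strikes` replies."""
    return reply_count >= strikes


if __name__ == "__main__":
    # Hypothetical team and commit IDs, for illustration only.
    team = ["ana", "ben", "chi", "dev", "eli"]
    for commit in ["r101", "r102", "r103"]:
        print(commit, "->", pick_reviewer(commit, author="ana", team=team))
```

Because the assignment depends only on the commit ID and the team list, it needs no extra infrastructure beyond the script that already echoes diffs to the mailing list.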
Basie, Teaching