Archive for March, 2009

Seven Signs of Bogus Science

March 25th, 2009

Nice article in the Chronicle of Higher Education on signs of bogus science:

  1. The discoverer pitches the claim directly to the media.
  2. The discoverer says that a powerful establishment is trying to suppress his or her work.
  3. The scientific effect involved is always at the very limit of detection.
  4. Evidence for a discovery is anecdotal.
  5. The discoverer says a belief is credible because it has endured for centuries.
  6. The discoverer has worked in isolation.
  7. The discoverer must propose new laws of nature to explain an observation.

There have obviously been great scientists and great scientific discoveries that showed some of these signs, but they’ve been remembered precisely because they were rarities. Now, think about the claims made for [name of recent innovation in IT goes here]: how many of these signs do they show?

Research

Open Notebook Science Badges

March 25th, 2009

I blogged last summer about creating a badging scheme for open science. Turns out it’s been done: ONS Claims has badges for four flavors of open science. Two sets are available in various sizes, all variations on the themes below:

[Badge images: "All content" and "Selected content", each in an "Immediate release" and a "Delayed release" version]

Here’s hoping they’re widely adopted.

Software Carpentry

Recent Reading

March 25th, 2009

Another bunch of papers and books:

  • Sfetsos, Stamelos, Angelis, and Deligiannis: “An experimental investigation of personality types impact on pair effectiveness in pair programming.” Empirical Software Engineering, 14:187-226, 2009. The authors had 70 undergraduates do pair programming and measured effectiveness in terms of communication, velocity, design correctness, passed acceptance tests, and general satisfaction with partners. Half the pairs were people with similar personalities (as reported by Myers-Briggs Type Indicator, or MBTI), while the other half were dissimilar. The result: dissimilar pairs outperformed similar pairs in almost all ways. There are lots of flaws in the study (MBTI is deeply flawed, it’s not clear if the study was double-blind, and there are many uncontrolled confounding factors), but it’s still a very interesting result.
  • Draper, Kessler, and Riesenfeld: “A History of Computing Course with a Technical Focus.” Proc. SIGCSE 2009. Describes a history of computing course at the University of Utah that included “period” programming assignments (such as an auction/bidding game in old-style Fortran). I think this would be very cool.
  • Enbody, Punch, and McCullen: “Python CS1 as Preparation for C++ CS2.” Proc. SIGCSE 2009. Measures the impact of switching to Python as a first language at Michigan State University by looking at students’ performance in the subsequent (unchanged) C++ course; there was no statistically significant difference, which is either encouraging or disheartening.
  • Bollen, Van de Sompel, Hagberg, and Chute: “A principal component analysis of 39 scientific impact measures.” http://arxiv.org/abs/0902.2183. The impact of scientific publication has traditionally been measured via citation counts, but dozens of more sophisticated metrics have been proposed. In this paper, the authors performed a principal component analysis of the rankings produced by 39 measures based on citation and usage log data, then clustered the metrics. The most important result is that Impact Factor (the most commonly used metric) is an outlier: it doesn’t produce the same rankings as most of its competitors, and should therefore be “used with caution” (academese for “discarded”).
  • Boyle, Cavnor, Killcoyne, and Shmulevich: “Systems biology driven software design for the research enterprise.” BMC Bioinformatics, 9:295, 2008. Describes a sophisticated third-generation architecture for supporting biomedical research. It reads like a catalog of enterprise-scale buzzwords, but that’s what it takes to do modern science.
  • Chen, Cheng, Hsieh, and Wu: “Exception handling refactorings: Directed by goals and driven by bug fixing.” Journal of Systems and Software, 82:333-345, 2009. Describes four increasingly-useful levels of exception handling, introduces some related “code smells”, describes appropriate refactorings, and presents a case study. Nice work—I’ll borrow from it in my next software engineering course.
  • Little and Miller: “Keyword programming in Java.” Automated Software Engineering, 16:37-71, 2009. Describes a tool that translates small sets of unordered keywords into legal Java expressions by matching against context, e.g., “letter at message[i]” becomes message.charAt(i). I don’t know how useful the tool would be to experienced programmers, but the fact that it works at all is an indication of how much redundancy (or entropy) there is in most programming languages.
  • Allspaw: The Art of Capacity Planning. O’Reilly, 2009. Allspaw draws on his practical experience at Flickr to describe how to measure, deploy, and manage web application infrastructure to avoid bottlenecks. Unfortunately, he chose not to include even the most basic mathematical material (Little’s Law, M/M/1 queues, etc.). I was quite disappointed.
  • Hazzan and Dubinsky: Agile Software Engineering. Springer-Verlag, 2008. A textbook on agile software development aimed at both undergraduate classes and industrial training courses. All the expected topics are there, along with questions to ponder and references to relevant literature. Dry as toast, but I’ll still borrow a few ideas for my next course.
  • Monson-Haefel (ed.): 97 Things Every Software Architect Should Know. O’Reilly, 2009. Mostly motherhood and apple pie: I agree software architects (and most other adults) should know, understand, believe, and practice these things, but with only a handful of exceptions, page-and-a-half descriptions of them aren’t going to make a difference to whether they do or not. On the upside, there are twice as many female contributors as there were in Beautiful Code.
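
For readers who haven’t met the queueing basics mentioned in the Allspaw item above, here is a minimal sketch in Python. It is not from the book: the function name mm1_metrics and the example rates are illustrative assumptions, and the formulas assume the textbook M/M/1 model (Poisson arrivals, exponentially distributed service times, one server). Little’s Law says the mean number of requests in the system equals the arrival rate times the mean time each request spends there.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state metrics for a textbook M/M/1 queue.

    arrival_rate and service_rate are in requests per unit time.
    Returns (utilization, mean_in_system, mean_time_in_system).
    The name and signature are hypothetical, chosen for illustration.
    """
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        # The queue grows without bound; there is no steady state.
        raise ValueError("arrival rate must be below service rate")

    utilization = arrival_rate / service_rate                # rho = lambda / mu
    mean_in_system = utilization / (1.0 - utilization)       # L = rho / (1 - rho)
    mean_time_in_system = 1.0 / (service_rate - arrival_rate)  # W = 1 / (mu - lambda)
    return utilization, mean_in_system, mean_time_in_system


if __name__ == "__main__":
    # Example rates are made up: a server that can handle 100 requests/s.
    for rate in (50, 80, 95):
        rho, L, W = mm1_metrics(rate, 100)
        # Little's Law check: L should equal arrival_rate * W.
        print(rate, round(rho, 2), round(L, 2), round(W, 3), round(rate * W, 2))
```

With 80 requests/s arriving at a server that can handle 100 requests/s, each request spends about 0.05 s in the system on average; at 95 requests/s that rises to about 0.2 s. That nonlinearity near saturation is the sort of thing the missing math would have explained.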

Books, Research

Zimmer’s Visions

March 25th, 2009

Carl Zimmer has a lengthy post on the future of science journalism that’s well worth reading.  I have many of the same misgivings regarding writing about programming; as the cost of entry goes down, so does the signal-to-noise ratio, even among the “professionals”.

Writing

GSoC 2009 Now Accepting Student Applications

March 23rd, 2009

The title says it all: Google Summer of Code is now accepting applications from students for the 2009 program. Please read the FAQ for details, and check out the ideas pages of the 150 open source software organizations that are involved this year.

Announcements

What I’d Like To Do Next

March 22nd, 2009

I’ve been thinking about what I’d like to do when I leave U of T, and I think I can use what I enjoy most about my present job—mentoring students—to make some company more attractive to talented Canadian Computer Science grads, while at the same time improving CS education in this country. Here’s my pitch.

Most schools have a third or fourth year “Intro to Software Engineering” course where students work in teams on an extended project that’s meant to teach them real-world development skills. In my experience, these courses aren’t particularly effective because:

  • the content is usually the traditional (ritual-intensive) stuff found in textbooks, which bears only a passing resemblance to the way good teams actually get things done,
  • many of the important aspects of modern software development, such as working in distributed teams, are absent completely, and
  • the faculty teaching them usually don’t have much real-world experience themselves.

I propose talking half a dozen schools across Canada into running their courses in tandem, so that each team has members from three or four schools. I’m well positioned to do this: I’ve been running project courses at U of T since 2002, and I’m prototyping the “tandem” idea this term with students from Waterloo, Lakehead, Alberta, and Havana on my team.

I think a lot of SE profs will like this idea, particularly if someone else (me) is doing the organizing. I also think it’ll give students a much better education: if we go from Newfoundland to Vancouver Island, and include schools in Quebec and New Brunswick, students will be exposed to time zone and language issues (and also have a chance to make links with peers across the country, something that they currently have no way to do).

As part of this, I would offer training courses for profs and grad students on how to teach team-based software engineering courses. I think there’s a lot of demand for this, and my experience in industry, open source, and university education gives me at least a fighting chance of pulling it off.

Finally, I would start writing and blogging about what’s new and exciting in computer science, as opposed to the computer industry. Magazines like New Scientist, Discover, and American Scientist cover everything from math and physics to biology and medicine, but not computer science. Similarly, there are dozens or hundreds of good blogs about developments in evolutionary biology, cosmology, and neuropsychology, but nothing accessible to the educated layperson about computer science. I think a weekly screed about AI, computer graphics, algorithm theory, software engineering, and computer systems would have a lot of readers; I’d certainly like a chance to find out.

The trick, of course, is finding someone to pay for this. The last two and a half years have taught me just how slowly regular channels move, so I would instead look for corporate sponsorship. The main thing the sponsoring company would get out of this is a high-value recruiting channel. Want to know who’s worth interviewing? Look at the students who do well in this course, or ask the profs who are taking part in it who their stars are. The sponsor would also increase its mindshare: students think Microsoft and Google are destination employers because they’ve used those companies’ software. If their core SE course is organized by a Canadian company, and they’re using tools provided by it, they’re more likely to think of it as a good place to work. Similarly, if that company is providing real value to software engineering faculty, those faculty are more likely to steer students toward the company.

More broadly, I think this could help create a stronger software development culture right across Canada. Students at Toronto don’t know what students at Queen’s or Waterloo learn, much less what’s going on at Dalhousie or UBC, and vice versa. The same is unfortunately true at the faculty level as well: there simply aren’t very many venues for cross-Canada discussion about CS education. The real long-term value of something like this would be to strengthen ties between programs from coast to coast, which would benefit everyone.

Uncategorized

Why I Read Less Science Fiction Than I Used To

March 22nd, 2009

I picked up a copy of Stephen Baxter’s Vacuum Diagrams on Friday (I wasn’t in the mood to read any tech stuff, and Gears of the City hasn’t arrived yet), and reading it reminded me why I don’t enjoy “future history” science fiction as much as I used to. In Baxter’s stories, humanity spends the next hundred thousand years spreading to the stars, adapting to ever-weirder environments along the way. Meanwhile, here and now, climate change is happening faster than the IPCC predicted, and the consequences look grimmer by the day. It’s sort of like the “uncanny valley”: if a story is far enough away from reality to be seen as pure escapism, I can lose myself in it, but if it combines real(ish) engineering with brittle Heinleinian techno-optimism, I can’t help but think of the tragedies my daughter (age two) is likely to see in her lifetime, and that kind of spoils the fun.

Later: several commenters have recommended other SF authors (some of whom I’ve read, some of whom I haven’t). I’m grateful for the pointers, but I’m still intrigued by the uncanny valley effect: I’m comfortable with well-written fantasy, or deliberately retro SF, but anything in which technology saves us from our own shortsightedness makes me genuinely angry.  Maybe what I’m really looking for is a near-future SF Grapes of Wrath, which I admit is setting the bar pretty high…

Writing

Survey: Theory vs. Practice

March 19th, 2009

A couple of profs (one in Sweden, the other in New Zealand) have put together a short survey to find out whether programmers write code the way they’ve been told is The Way, or ignore the theory and follow practices learned in the school of hard knocks. It only takes 10 minutes, and we all might learn something.

Research

Actual Meanings of Common Java Exceptions

March 19th, 2009

Keywords Speak Louder Than Words

March 19th, 2009

Mozilla is taking education pretty seriously these days (see previous posts). The latest news from David Humphrey is a brand new Bugzilla keyword for tagging tickets: student-project. This tag will help students find things to work on; as he says, good candidates are:

  • not in the critical path of a release
  • not so far from a release that they have no chance of getting review or attention from the community.
  • things [developers] would do if [they] had time, which probably means [they] care about them happening, know they could be done, etc.
  • scoped for academic time frames.
  • things [developers] know the community will be willing to help mentor.

Here’s hoping other organizations pick this up (and not just during GSoC).

Learning