Someone once said, "Never set yourself achievable goals." I'm willing to bet he died a lonely, bitter man, and I have no desire to follow in his footsteps. However, I do have to figure out what the Software Carpentry course should try to teach scientists about programming for the web. I have to teach something: getting data from the web, and making data and useful programs accessible to others, are high on everyone's wish list, and crucial to being a more effective scientist.
But even basic web programming would be a course unto itself. At best, there's room in Software Carpentry for two hours of lecture and four hours of practical work. Students will be comfortable with strings, lists, loops, conditionals, and functions; they will have just met regular expressions, XML, and databases, and their grasp of these topics will still be pretty shaky. It's hard to cover even simple CGI programming with those constraints, and morally questionable as well---there are just too many ways they could leave their machines open to compromise.
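To make the safety worry concrete, here is a minimal sketch (my own invention, not anything from the course) of the classic hazard in naive CGI-style code: echoing user input straight into a page lets a visitor inject markup or script, while a one-line escaping call neutralizes it.

```python
# Illustrative only: a toy "CGI handler" that builds an HTML response
# from a query parameter. The function names are hypothetical.
import html

def greet_unsafe(name):
    # Naive version: user input is interpolated into HTML verbatim,
    # so a malicious "name" can inject a <script> tag into the page.
    return "<p>Hello, %s!</p>" % name

def greet_safe(name):
    # Escaping special characters renders injected markup harmless.
    return "<p>Hello, %s!</p>" % html.escape(name)

payload = "<script>alert('gotcha')</script>"
print(greet_unsafe(payload))  # the script tag survives intact
print(greet_safe(payload))    # arrives as inert &lt;script&gt;... text
```

Teaching students to spot and close holes like this (never mind shell injection, path traversal, and the rest) takes more time than two hours allows.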
I'm tempted to introduce them to an RSS-based mashup tool like Yahoo! Pipes, but (a) it's closed-source and proprietary (which means it could disappear at any moment, like FuseCal and so many others), and (b) getting data from your own machine onto the web requires either exactly the skills we don't have time to cover in those two hours, or a friendly sys admin with time on her hands. Is there an open source tool that does similar things? It'd have to be much (much, much) lighter weight than scientific workflow tools like Taverna, allow users to mix local and remote data, processing, and services, and be nearly trivial to extend. Any pointers? Or is anyone interested in helping write such a beast?