The following exchange (lightly edited) took place on Twitter a few days ago:
- Titus Brown: bash is disastrous for pipelines! very hard to rerun entire analysis from bash script.
- Titus Brown: I want/we need dependency management in scientific computing workflows & pipelines. bash doesn't provide.
- Lorin Hochstein: I am always saddened at the poor state of scientific workflow tools. Isn't that why we invented computers?
- Konrad Hinsen: We use completely inadequate tools and notations for computational science.
- Titus Brown: nothing personal, but I'm wary of you & Greg Wilson when you make comments like this.
- Titus Brown: you're not wrong, but I have watched people try to build new/better sci sw & fail & fail.
- Titus Brown: I now believe wholeheartedly in iterative approach: small steps. But, more generally, when > 5% of scientists use the tools we have, I will acknowledge need for new tools.
- David Soergel: I get it, but also better tools ⇒ more users!
- Titus Brown: you are (in my experience) largely mistaken, good sir.
- David Soergel: Really? Git (&GitHub) profoundly better than CVS&Subversion; lots more scientists use it...
There are a lot of claims and assumptions in these ten tweets. I've frequently made similar claims (hence Titus's wariness), but after working with scientists daily for six years, I'm less sure of myself. Are today's tools and notations for computational science actually inadequate? Do fewer than 5% of scientists use the tools we have? Do better tools actually generate more users? Is Git really better than CVS or Subversion? People who do empirical studies of software engineers would answer, respectively, "We don't know how to measure that," "We don't know," "Unproven," and "That study hasn't been done, but probably not for most users."
The fact that those questions popped into my head has made me realize that I might finally be an engineer. Consider:
- I want to teach computer science students the scientific method so that they will build tools and choose working practices based on evidence rather than anecdote.
- I want the Python and Julia communities to user-test features à la Stefik et al. before adding them to the language.
- I want Software Carpentry and Data Carpentry to do more assessment in 2016 to find out what's effective and where we can make improvements.
- I'm going to work full-time on instructor training for the next twelve months because I believe that if we apply educational research in the classroom, teaching and learning will both be improved.
What ties these together is the belief that if we start with "I don't know, but I can find out," we can make our world better. Using the scientific method to improve the universe instead of merely understanding it is as good a definition of engineering as I know. Among all the thoughts prompted by this difficult year, the discovery that I might finally be thinking like an engineer three decades after earning a degree in the subject is oddly comforting.