Snailz

Earlier this year I put together notes for a companion to the JavaScript and Python versions of Software Design by Example aimed at research software engineers. I shelved the project because the (lack of) reaction to the first two books convinced me that this isn’t what most people are looking for, but I still hope that one of the software packages I built along the way might be useful.

Snailz is a set of synthetic data generators that simulate the collection, storage, and analysis of data related to snails in the Pacific Northwest that are growing unusually large as a result of exposure to pollution.

The first diagram below shows the schema of the database that holds the results; the second shows how various scripts and parameter files interact to create that database. Along the way, the scripts also create CSV files describing the designs of genomic assay plates and both messy and tidy versions of readings for those plates. The whole process is documented on the package site, and all the code is open source.

Figure 1: Snailz database schema

Figure 2: Snailz data generation workflow
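
To give a flavor of what generating a plate design plus messy and tidy readings can look like, here is a minimal sketch. It is not the snailz code and does not use its API: the function names, the 4x4 plate size, the control/treated labels, and the means and noise level are all assumptions made up for illustration.

```python
import csv
import io
import random

# Hypothetical illustration only: none of these names come from the snailz package.
ROWS = "ABCD"
COLS = range(1, 5)


def make_design(rng, control_fraction=0.25):
    """Assign each well of a small plate to 'control' or 'treated'."""
    design = {}
    for row in ROWS:
        for col in COLS:
            kind = "control" if rng.random() < control_fraction else "treated"
            design[(row, col)] = kind
    return design


def make_readings(rng, design, control_mean=1.0, treated_mean=3.0, noise=0.2):
    """Draw a noisy reading for each well, centred on its group's mean."""
    means = {"control": control_mean, "treated": treated_mean}
    return {
        well: round(rng.gauss(means[kind], noise), 2)
        for well, kind in design.items()
    }


def tidy_csv(design, readings):
    """One row per well: the long ('tidy') layout."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["row", "col", "treatment", "reading"])
    for (row, col), kind in design.items():
        writer.writerow([row, col, kind, readings[(row, col)]])
    return out.getvalue()


def messy_csv(readings):
    """Plate-shaped grid with a title line, as an instrument might export it."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Plate readings"])
    writer.writerow([""] + list(COLS))
    for row in ROWS:
        writer.writerow([row] + [readings[(row, col)] for col in COLS])
    return out.getvalue()


rng = random.Random(1234)  # fixed seed so the synthetic data is reproducible
design = make_design(rng)
readings = make_readings(rng, design)
tidy = tidy_csv(design, readings)
messy = messy_csv(readings)
```

Passing in a seeded `random.Random` instead of using the module-level functions means the same seed always regenerates the same files, which matters when synthetic data is used in lessons or tests.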