Whatever Happened to TidyBlocks?

TidyBlocks is a Scratch-like tool for doing basic data science. Originally built by Maya Gans, it was overhauled in 2020, after which volunteers translated its interface into several (human) languages. We were excited by its potential, but:

  1. We had reached the limits of what the Blockly toolkit could do without some serious extension work. (In particular, there’s no comprehensible way to represent joins using the available styles of blocks.)

  2. Nobody was willing to fund further development. The overhaul in 2020 took about 300-400 hours of volunteer time; while I would have liked to continue, I didn’t see a way forward without fixing #1 above, and that couldn’t be done without financial support.

I still think the idea is a good one: the user testing we did showed that the interface is immediately comprehensible to anyone who has used Scratch (which these days means most middle school kids and their teachers), and after watching my daughter plod through their school’s “data literacy” module, I think we need something better. I hope someone, some day, will find a way to make it happen.

I Hope They Would Have Liked It

My mum would have been 94 today, and my sister would have been 57. Mum knit every day for more than 80 years and Sylvia collected toy mice, so I got this. I suspect both would have disapproved but secretly been pleased.

Knitting Mouse tattoo

What Everyone in Tech Should Know About Teaching and Learning

I have just posted a 40-minute video of my talk “What Everyone In Tech Should Know About Teaching and Learning”. It’s a quick tour of the most popular material from Teaching Tech Together, which in turn is based on training I originally developed at the Carpentries and for RStudio’s instructor training and certification program. The slides are available under a Creative Commons license, and I’m always happy to deliver it as a lunch-and-learn. I hope you find it useful—feedback is always welcome.

Software Engineering's Greatest Hits

I have just posted a 30-minute video of my talk “Software Engineering’s Greatest Hits”. In brief: software engineering researchers have learned a lot over the last 50 years, but most working programmers don’t even know that knowledge exists. I think the way to close that knowledge gap is to teach a bit of data science to undergraduate computer scientists so that they’ll understand what claims are actually being made, and then tell them what we currently think we know. To make this work, I think we have to teach data science using software engineering data and examples—there are lots of good generic data science courses out there, but most people learn best and fastest when the examples are directly relevant to their own domain. A course like this would fit into the curriculum and be culturally defensible (“Look, math!”) and I think it would also be very popular with students (“Look, data science!”). And if I’ve learned anything in the last 20 years, it’s that simply presenting the results bounces off: if it was going to work, it would have by now. I hope you’ll enjoy the talk - comments and feedback are very welcome.

Beneath Coriandel

My novel Beneath Coriandel is now for sale on Amazon in Canada, the UK, the US, and elsewhere. I hope you enjoy reading it as much as I enjoyed writing it.

Cover of 'Beneath Coriandel'

A young man descends into the tombs beneath the city to slay a monster, a woman plots to steal her niece’s youth, a ghost explains how he earned his name, and a magician wonders why she can’t get an old nursery rhyme out of her head. With spies, swordplay, betrayal, forbidden love, and a philosophically inclined pair of boots, Beneath Coriandel will appeal to anyone who enjoyed The Curse of Chalion, The Lies of Locke Lamora, or The Innkeeper’s Song.

That Seems Simple to Me

1997

PF: Hi, I’m a programmer. I just founded an online retail company. What do you two do?

GD: Hi, I’m a graphic designer. I know how to select and arrange text and images in ways that are appealing, informative, and usable.

MA: And I’m in marketing. I know how to identify people’s needs and craft messages that will appeal to them.

PF: Huh. Those both seem much simpler to me than programming. I don’t think I need to hire you.

2003

GD & MA: So, how did your startup do?

PF: It tanked. People kept saying the site was ugly and confusing. I guess I should have hired you back in ‘97 after all.

2012

PF: Hi, I’m a programmer. I just founded an online education company. What do you do?

TE: Hi, I’m a teacher. I know how to select and arrange learning materials in ways that are appealing, informative, and effective.

PF: Huh. That seems simpler to me than programming…

2021

PF: And what do you do?

ML: Machine learning.

PF: You mean you can automatically make lessons more effective and tell us how to make videos and autograded exercises more engaging?

ML: Uh, what?

PF: Cool! When can you start?

A Proficiency Test for Research Software Engineers

Back in 2014, I worked with Mike Jackson, James Hetherington, and Andrew Turner to develop a skills assessment for people who wanted to use the DiRAC supercomputing facility. It never really caught on—it’s politically impossible to tell professors who have already paid for machine time that their grad students don’t know enough about software development to use it efficiently—but I’m publishing it here in the hope that it will spur others to share what people ought to know and how we can tell if they know it.

1. Introduction

  1. You have 90 minutes to complete the following tasks.

  2. You may use the web, man pages, and any other resources you would usually use when programming.

  3. If you are completing the assessment with colleagues, you are free to discuss hints, tips and the usage of commands, but do not share the answers to tasks.

  4. If at any point you would like clarification, please do not hesitate to ask.

  5. If at any point you are unable to complete a task, feel free to move on to the next task. The major tasks (shell, shell scripts, testing, code review) are all independent. You may ask for help before moving on, but doing so will be considered equivalent to not completing that task.

Clone the exam repository at [URL provided]. This serves as both a repository and your working copy. You will do all of your work in this local repository and use version control to commit your solutions as you go.

2. Shell

Use the cd command to go into the python/ directory, which contains two subdirectories. Within that directory, use a single shell command to write a list of all files in or below it whose names end in .dat to a file called all-dat-files.txt.

Create a file called shell-command.txt containing the command you ran to get the list of files, add this to the version control repository, and commit it.

Add all-dat-files.txt to the version control repository and commit it.

3. Shell Scripts

Write a script called do-many.XX (scripting language of choice) that runs power2.py for the inputs listed in the file input.d and saves the output in a file called output.d. The output file must contain the following when you are done:

16
8
2
1
8
1
32
2
1

You do not need to do error-checking on the command-line parameters, i.e., you may assume that they are all non-negative integers.

Add your do-many.XX script to the version control repository and commit it.
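
As one hedged illustration, the task in this section could be approached in Python along the following lines. This is a sketch, not the exam's reference answer: the function name run_many is hypothetical, and it assumes that input.d holds one integer per line and that power2.py prints its results to standard output.

```python
import subprocess
import sys


def run_many(script, input_path, output_path):
    """Run `script` once per non-blank line of `input_path`, collecting stdout.

    Assumption: each line of the input file is a single command-line argument.
    """
    with open(input_path) as reader, open(output_path, "w") as writer:
        for line in reader:
            value = line.strip()
            if not value:
                continue  # skip blank lines
            # Use the current interpreter so the script need not be executable.
            result = subprocess.run(
                [sys.executable, script, value],
                capture_output=True,
                text=True,
                check=True,  # raise if the script exits with an error
            )
            writer.write(result.stdout)
```

Under those assumptions, calling run_many("power2.py", "input.d", "output.d") from the directory that holds both files would write each run's output to output.d in input order.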

4. Testing

analyze.py contains a function called running_total that is supposed to calculate the total of each strictly increasing sequence of integers in a list:

running_total([1, 2, 1, 8, 9, 2])       == [3, 18, 2]
running_total([1, 3, 4, 2, 5, 4, 6, 9]) == [8, 7, 19]

test_analyze.py contains a unit test implementing the first example above. Write four more unit tests that you think are most important to run to test this function. Do not test for cases of invalid input (e.g., inputs that are strings, lists of lists, or an input that isn’t a flat list of numbers). You can run your tests using the command:

py.test test_analyze.py

Add test_analyze.py to the version control repository.
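
For readers without access to the exam repository, here is a minimal sketch of a function that matches the two examples in this section, followed by two tests of the kind the task asks for. The real analyze.py is not reproduced in this post, so this implementation and the test names are assumptions.

```python
def running_total(values):
    """Return the sum of each maximal strictly increasing run in `values`."""
    totals = []
    previous = None
    for value in values:
        if previous is not None and value > previous:
            # Still climbing: add to the current run's total.
            totals[-1] += value
        else:
            # First value, or not larger than its predecessor: start a new run.
            totals.append(value)
        previous = value
    return totals


def test_empty_list_has_no_runs():
    assert running_total([]) == []


def test_equal_neighbours_start_a_new_run():
    # "Strictly increasing" means a repeated value ends the current run.
    assert running_total([2, 2, 3]) == [2, 5]
```

Other cases worth considering include a single-element list, a list that increases throughout, and one that decreases throughout.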

5. Scale Testing

What is the approach you would use to check the scaling of a program called long_run.py? Explain your method in 50 words.

Create a file called scale-test.txt with the commands you would use and add this to the version control repository.

6. Code Review

The programs power2.py, power2.c, and power2.f each take a single non-negative integer as a command line argument and produce the powers of two that total to that number. For example:

./power2.py 27

produces:

16
8
2
1

Choose the language you are most comfortable with and change this program in at least 3 ways to improve its readability, understandability, and modularity without changing its behaviour.

Commit your changes to the version control repository.
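
The original power2 programs are not included in this post. As a rough sketch of the behaviour described above, and of the kind of small, named, documented function a reviewer might hope to end up with, the decomposition could be written like this in Python. The function name powers_of_two is hypothetical.

```python
def powers_of_two(number):
    """Return the powers of two that sum to `number`, largest first.

    Assumes `number` is a non-negative integer, as the task states.
    """
    powers = []
    bit = 1
    while bit <= number:
        if number & bit:
            # This bit is set in the binary representation of `number`.
            powers.append(bit)
        bit <<= 1
    # Bits were collected smallest first; reverse to match the sample output.
    return powers[::-1]


def format_powers(number):
    """Return the powers as text, one per line, as in the sample output."""
    return "\n".join(str(power) for power in powers_of_two(number))
```

Separating the calculation from the formatting is one possible modularity improvement; the exam itself leaves the choice of changes to the candidate.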

7. SSH keypairs

Assume you have an account with the username me on a remote host called server.dirac.org. You want to use an SSH public/private keypair and associated passphrase to access the remote host. What commands would you use to create your SSH keypair and to update the remote host with your public key?

Create a file called ssh-keypair.txt with the commands you would use and add this to the version control repository.

8. Secure copy

Assume that in your me account on server.dirac.org you have 100 files called detector001.dat, …, detector099.dat in a directory called data immediately below your home directory. What commands would you use to securely copy the directory that contains all these files from server.dirac.org to your local computer?

If a few files have been changed within this directory on your server.dirac.org account, what command would you use to copy over only the changed files?

Create a file called secure-copy.txt with the commands you would use and add this to the version control repository.

9. Archiving

Create a file called save-work.txt containing a single command to create an archive called either exam.tar.gz or exam.zip in your home directory with the files you have created. This archive must not contain the .git directory.

Run save-work.txt and email exam.tar.gz or exam.zip to your examiner.

A Magic USB Drive

I dreamed again last night about a magic USB drive. If I held it in my hand and thought hard about the files I wanted, then plugged the drive into my computer, the files would be there. A finished version of the YA novel I’ve been working on sporadically, the tech books I’m either struggling with or just wishing for—they would just be there, fully formed and ready to share with the world, along with the complete source code of the extensible programming tools I’ve wanted for the last 20 years and the albums Ike Quebec never recorded.

Then the exuberance of the birds in our back yard woke me up and I had to return to a world in which my hands hurt when I type and my brain hurts when I word. Iffy won’t escape from Antarctica by magic; the errors and inconsistencies in STJS and BST won’t correct themselves, and as for the programming tools and jazz, you can only have everything you want in dreams and sometimes not even then.

But the birds sound happy and the rain has stopped long enough that I can get in a good bike ride. I may not dream about things like that, but I enjoy them. Time to start my day…

Two Books

Hi - I'm Greg Wilson, and this short talk describes two new open access resources you might want to use in undergraduate software engineering classes.

The first, called Software Tools in JavaScript, teaches software design by showing students how to build simple versions of twenty widely-used software development tools. The second, called Building Software Together, is a guide to working in a small team for a single term. Both books are freely available under a Creative Commons license.

Here's how it all started. Back in the early 2000s I taught a course on software architecture three times at the University of Toronto and then told the department they should cancel it. The problem was a lack of material: between them, the eight books I had on my shelf with "software architecture" in the title spent a total of less than 30 pages describing actual systems. They explained how to gather architectural requirements, how to maintain them, and how to communicate them, but didn't show readers what real systems looked like.

In frustration, I mailed 200 programmers and asked them to write a chapter each describing the most beautiful piece of software they'd ever seen. We wound up with 34 contributions, which were published as a book called Beautiful Code. It won a Jolt Award and raised quite a bit of money for charity, but it wasn't useful as a textbook: the pieces of software contributors described ranged in size from a few lines of code to multi-million line systems, and required far more background knowledge than undergrads were likely to have.

Our second attempt was a pair of books called The Architecture of Open Source Applications. The entries were much more uniform in level, but the contributors used such a wide range of languages that once again most students would have struggled to follow along. On the upside, all of the content was (and is) open access.

The irony is, I learned most of what I know about software design from a series of books that avoided all of these problems. The trilogy that Brian Kernighan and colleagues wrote in the early 1980s didn't just teach people C and Unix: they taught my generation how to think about programming by showing us how to build the tools we used to program.

Which brings us to Software Tools in JavaScript. Its aim is to teach software design by walking readers through the construction of simple versions of tools they actually use. The book uses JavaScript rather than Java, Python, or something more modern because JavaScript is the one unavoidable language - and these days, it's actually not bad.

The material covers twenty tools, ranging from a very (very) simple version control system to a browser layout engine, an interactive debugger, and a style checker.

Each entry is short enough to cover in one or two lectures. Together, they introduce some common design patterns, show students how to test complex systems by mocking or stubbing components, and introduce them to tools they should probably be using anyway.

There are lots of starting points for homework exercises, mostly of the form "add this feature to tool X" or "build a very simple version of tool Y from scratch". If your students do the latter, we'd be very happy to include their work - with full attribution - in Volume 2. And if you think of a tool that we haven't covered but should, like a fuzz tester or accessibility auditor, please dive in.

Meanwhile, whenever I run into one of my former project students, they tell me that the most useful thing I taught them wasn't about code: it was about how to work in a team. I think this is more important with each passing year: we used to say, "Move fast and break things," and while I don't know if we did the former, it's pretty clear at this point that we've done a lot of breaking.

The second book, Building Software Together, is what a small team of undergrads needs to know to get through their first semester-long project. It talks about version control and continuous integration, but it's mostly about the human side of software engineering. How do you run a meeting? How do you review someone else's code? How do you handle deadlines when you're juggling assignments in several different courses whose lecturers don't seem to talk to each other?

I'm still dissatisfied with some parts of this book. In particular, I don't think the chapters on security and inclusivity are going to persuade someone who doesn't already care about these things that they should. I've taught ethics to engineers, and I think that in a lot of cases, they tune out while we're talking, tell us what they think we expect to hear, and carry on as before. If you or your students can think of better approaches for these important topics, please let me know - I'm happy to rewrite.

To wrap up, these two books don't have to be used together, but they are designed to complement one another. They are both free to use under a Creative Commons - Attribution - NonCommercial license, which means you can copy or modify them however you want as long as you link back to the originals and don't try to make any money from it. (If print editions do appear, 100% of the royalties will go to support the Red Door Shelter in Toronto.) Contributions are very welcome, and all contributors will be acknowledged.

You can find the books online at https://stjs.tech/ and https://buildtogether.tech/...

...and you can reach me at gvwilson@third-bit.com. Thank you for listening.

In the wake of posts about Shopify's support for white nationalists and DataCamp's attempts to cover up sexual harassment, I have had to disable comments on this blog. Please email me if you'd like to get in touch.