A primary goal of many CS education projects is to determine the extent to which a given intervention has had an impact on student learning. However, computing lacks valid assessments for pedagogical or research purposes. Without such valid assessments, it is difficult to accurately measure student learning or establish a relationship between the instructional setting and learning outcomes.
We developed the Foundational CS1 (FCS1) Assessment instrument, the first assessment instrument for introductory computer science concepts that is applicable across a variety of current pedagogies and programming languages. We applied methods from educational and psychological test development, adapting them as necessary to fit the disciplinary context. We conducted a large scale empirical study to demonstrate that pseudo-code was an appropriate mechanism for achieving programming language independence. Finally, we established the validity of the assessment using a multi-faceted argument, combining interview data, statistical analysis of results on the assessment, and CS1 exam scores.
People have been studying how we learn programming for even longer than they've been studying how we do it, and while the two aren't exactly the same, there's a lot of overlap in both methodologies and findings. Some of the best work I know has come out of the group at Georgia Tech led by Mark Guzdial (who is also a prolific and informative blogger). In this paper, he and his student Allison Tew present the results of a multi-year project to develop an instrument for assessing how well students have learned basic programming concepts, regardless of whether they learned in Java, Python, or MATLAB. The long-term goal is to create a concept inventory for computing similar to those that have been developed in physics, biology, and other sciences.
I think this is critically important work that deserves a lot more attention from the software engineering community as a whole, not just the portion of it that is also interested in teaching. To paraphrase Dobzhansky, nothing in software engineering makes sense except in light of human psychology, so while measuring outputs like bugs per module is important, we won't know why some people are so much more productive than others until we get a handle on what people actually know.
Originally posted at Never Work in Theory.