Archive for the ‘Books’ Category

Recent Reading

March 25th, 2009

Another bunch of papers and books:

  • Sfetsos, Stamelos, Angelis, and Deligiannis: “An experimental investigation of personality types impact on pair effectiveness in pair programming.” Empirical Software Engineering, 14:187-226, 2009. The authors had 70 undergraduates do pair programming and measured effectiveness in terms of communication, velocity, design correctness, passed acceptance tests, and general satisfaction with partners. Half the pairs were people with similar personalities (as reported by Myers-Briggs Type Indicator, or MBTI), while the other half were dissimilar. The result: dissimilar pairs outperformed similar pairs in almost all ways. There are lots of flaws in the study (MBTI is deeply flawed, it’s not clear if the study was double-blind, and there are many uncontrolled confounding factors), but it’s still a very interesting result.
  • Draper, Kessler, and Riesenfeld: “A History of Computing Course with a Technical Focus.” Proc. SIGCSE 2009. Describes a history of computing course at the University of Utah that included “period” programming assignments (such as an auction/bidding game in old-style Fortran). I think this would be very cool.
  • Enbody, Punch, and McCullen: “Python CS1 as Preparation for C++ CS2.” Proc. SIGCSE 2009. Measures the impact of switching to Python as a first language at Michigan State University by looking at students’ performance in the subsequent (unchanged) C++ course; there was no statistically significant difference, which is either encouraging or disheartening.
  • Bollen, Van de Sompel, Hagberg, and Chute: “A principal component analysis of 39 scientific impact measures.” http://arxiv.org/abs/0902.2183. The impact of scientific publication has traditionally been measured via citation counts, but dozens of more sophisticated metrics have been proposed. In this paper, the authors performed a principal component analysis of the rankings produced by 39 measures based on citation and usage log data, then clustered the metrics. The most important result is that Impact Factor (the most commonly used metric) is an outlier: it doesn’t produce the same rankings as most of its competitors, and should therefore be “used with caution” (academese for “discarded”).
  • Boyle, Cavnor, Killcoyne, and Shmulevich: “Systems biology driven software design for the research enterprise.” BMC Bioinformatics, 9:295, 2008. Describes a sophisticated third-generation architecture for supporting biomedical research. It reads like a catalog of enterprise-scale buzzwords, but that’s what it takes to do modern science.
  • Chen, Cheng, Hsieh, and Wu: “Exception handling refactorings: Directed by goals and driven by bug fixing.” Journal of Systems and Software, 82:333-345, 2009. Describes four increasingly-useful levels of exception handling, introduces some related “code smells”, describes appropriate refactorings, and presents a case study. Nice work—I’ll borrow from it in my next software engineering course.
  • Little and Miller: “Keyword programming in Java.” Automated Software Engineering, 16:37-71, 2009. Describes a tool that translates small sets of unordered keywords into legal Java expressions by matching against context, e.g., “letter at message[i]” becomes message.charAt(i). I don’t know how useful the tool would be to experienced programmers, but the fact that it works at all is an indication of how much redundancy (or entropy) there is in most programming languages.
  • Allspaw: The Art of Capacity Planning. O’Reilly, 2009. Allspaw draws on his practical experience at Flickr to describe how to measure, deploy, and manage web application infrastructure to avoid bottlenecks. Unfortunately, he chose not to include even the most basic mathematical material (Little’s Law, M/M/1 queues, etc.). I was quite disappointed.
  • Hazzan and Dubinsky: Agile Software Engineering. Springer-Verlag, 2008. A textbook on agile software development aimed at both undergraduate classes and industrial training courses. All the expected topics are there, along with questions to ponder and references to relevant literature. Dry as toast, but I’ll still borrow a few ideas for my next course.
  • Monson-Haefel (ed.): 97 Things Every Software Architect Should Know. O’Reilly, 2009. Mostly motherhood and apple pie: I agree software architects (and most other adults) should know, understand, believe, and practice these things, but with only a handful of exceptions, page-and-a-half descriptions of them aren’t going to make a difference to whether they do or not. On the upside, there are twice as many female contributors as there were in Beautiful Code.
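
For anyone who hasn’t met the basic queueing results mentioned in the Allspaw entry above, here is a rough Python sketch of the standard M/M/1 formulas and Little’s Law. It is an illustration only, not material from the book: the function name `mm1_metrics` and the example rates are made up, and the formulas assume a single server, Poisson arrivals, exponential service times, and an arrival rate below the service rate.

```python
def mm1_metrics(arrival_rate: float, service_rate: float) -> dict:
    """Steady-state metrics for an M/M/1 queue (hypothetical helper).

    arrival_rate: mean arrivals per unit time (lambda)
    service_rate: mean completions per unit time when busy (mu)
    """
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")

    utilization = arrival_rate / service_rate                   # rho = lambda / mu
    mean_in_system = utilization / (1.0 - utilization)          # L = rho / (1 - rho)
    mean_time_in_system = 1.0 / (service_rate - arrival_rate)   # W = 1 / (mu - lambda)

    return {
        "utilization": utilization,
        "mean_in_system": mean_in_system,
        "mean_time_in_system": mean_time_in_system,
    }
```

With 8 requests per second arriving at a server that can handle 10 per second, `mm1_metrics(8.0, 10.0)` gives a utilization of about 0.8, about 4 requests in the system on average, and about half a second spent in the system per request. Little’s Law (L = λW) ties the last two together: 4 = 8 × 0.5.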

Books, Research

Reading Update

February 15th, 2009

Back in December, I blogged the books I was planning to read in January and February. Here are the quick summaries:

  • Glut: Mastering Information Through the Ages: too much “gosh wow” for me (and also too much shaky science).
  • The Online Learning Idea Book: to paraphrase, what was new wasn’t interesting, what was interesting wasn’t new.
  • Patterns for Fault-Tolerant Software: nice set of patterns; pity there weren’t examples to bring them to life for those of us who don’t have the author’s first-hand experience of real-time problems and their solutions.
  • Practical API Design: still working on it—started with some philosophy (never a good sign), but by Chapter 3 gets into meatier stuff.
  • e-Learning and the Science of Instruction: on pg 42, the authors say, “Shavelson and Towne eloquently [my emphasis] summarize the argument for evidence-based practice in education: ‘One cannot expect reform efforts in education to have significant effects without research-based knowledge to guide them.’” By comparison with the rest of the book, that is eloquent: there might be good ideas lurking in its turgid Brezhnevian prose, but I gave up before finding them.
  • Workflows for e-Science: like most collections, hit and miss—there are a few thought-provoking articles, but most spend their time repeating things readers will already know, or describing successes without any reflection on failures.

I also read:

Thankfully, there are lots of interesting books coming out in the next few months. I’ll be lucky if I get to read any three of them, of course, but I can always dream:

(Ironically, I now have to be much more careful which technical books I gamble on: since I’m no longer DDJ’s book review editor, and don’t have any research money to spend on books, I have to buy them myself. How barbaric… :-)

Books

Segaran on the Excluded Middle

December 18th, 2008

Nice post from Toby Segaran (author of a very good book called Programming Collective Intelligence) that discusses the “excluded middle” of technical books—worth reading if you’re thinking about writing anything for the geek market.

Books

Random Library Entries

December 10th, 2008

As I noted back in May, I’m using LibraryThing to keep track of my reading these days. The “Library” tab on my site now displays a random selection of books that I’ve enjoyed; hope you enjoy ’em too.

Books

Adam’s review of “Clean Code”

December 8th, 2008

Adam Goucher has posted a review of Robert Martin’s Clean Code. It’s much more detailed than mine, but reaches the same conclusion: good book, worth reading.

Books

Next Term’s Technical Reading

December 5th, 2008

I’m no longer DDJ’s tame book reviewer, but reading is an addiction that’s very hard to break.  Here’s my queue for January and February:

Books

Two Others

October 19th, 2008

In amongst a bunch of programming books, I also found time this summer to read a few for fun, like the latest in the Temeraire series. The two I enjoyed most were Felix Gilman’s debut novel Thunderer, and Michael Chabon’s Gentlemen of the Road. The former is a match for K. J. Bishop’s The Etched City (think China Mieville, but less self-consciously anxious to impress); the latter, like The Yiddish Policemen’s Union, is proof that Chabon has finally learned that too much is not a good thing. I wouldn’t call either book deep, but they’re both nice and chewy.

Books

So Far Behind

October 10th, 2008

I’d like to start this lengthy set of reviews by apologizing to the authors and editors who’ve been waiting so patiently for me to tell the world about their work. Buying a house, getting married, selling a house, and taking on a new batch of graduate students is an explanation, but it’s not an excuse.

I’d also like to apologize for the fact that these reviews aren’t as detailed as they ought to be. The distractions listed above are part of the reason; the real cause is that academic life doesn’t leave me time for programming, so I’m not able or qualified to delve into technical detail the way I used to. I miss it a lot, and worry that in another couple of years, I will be too out of touch to be able to guide my students. (And that a couple of years after that, I’ll have to wear a tie to work…)

For now, though, I have a pile o’ books here that you might want to read, and the one you’ll probably like most is Charles Petzold’s The Annotated Turing. My first reaction when I heard about it was, “Why didn’t I think of that?” and my second was, “I wonder if he can pull it off?” The answer to the latter is definitely “yes”, and I expect to see “AT” on shelves beside Gödel, Escher, Bach and other thinkalong books in years to come.

Petzold’s idea is simple: take Alan Turing’s classic paper “On Computable Numbers, with an Application to the Entscheidungsproblem” (the paper for which he invented the Turing machine) and interpolate enough explanation to make it accessible to a lay reader. The original paper is broken into chunks ranging in size from a line or two to half a page, and typeset on gray. In between, Petzold’s commentary explains the background to Turing’s work, why his “machine” has the features it does, what the significance of various parts of the proof is, and so on. It would be a great text for a sophomore course on computability, but it’s also simply a fun read for anyone who’s curious about the intellectual underpinnings of our field. Five out of five, and I hope it inspires some imitators.

I’d like a lot of people to imitate Neal Ford’s The Productive Programmer as well. In fact, I’d like people to imitate Mr. Ford, and the whole point of this book is to make that easy. In a little over 200 pages, he describes the things he does that allow him to produce more working software per day than most of his peers. Some are micro-level tricks, like using clipboards that can hold multiple items. Others, like running code through state-of-the-art static analysis tools or practicing test-driven development, are higher level, but no more or less important. I came away from the book feeling like I’d just watched one of those cooking shows where you get to see exactly how a great pastry chef makes a pie crust that tastes so much better than yours. I’ll probably never do everything Ford recommends, but I’ve already switched to a better desktop shortcut tool, and don’t plan to switch back.

“Uncle Bob” Martin’s Clean Code is similar in spirit, and just as worthwhile. Where Ford touches on everything a developer does in a working day, Martin focuses on what developers produce: code. I expected most of the topics, such as choosing good variable names and information hiding. What pleasantly surprised me was how many new things Martin had to say about them, and how well his examples illustrated his points. I particularly liked Chapter 14, in which he refactors a Java class for handling command-line arguments step by step. It’s the clearest explanation of what refactoring is actually for that I’ve ever read, and I’m already using it in my software engineering classes. Don’t let the table of contents fool you: no matter how experienced you are, there’s enough in here to make owning a copy worthwhile.

Kent Beck’s Implementation Patterns draws on an equal depth of experience, but focuses more on the ideas that go into the code. The author says in the introduction that it’s meant to sit between the Gang of Four’s classic Design Patterns and a Java language manual. I think he does himself a disservice: what he’s actually done is catalog the mental building blocks people use to write sequential imperative software. The chapter on “Methods”, for example, includes a few paragraphs on each of 23 micro-patterns, including:

  • Composed Method: Compose methods out of calls to other methods.
  • Intention-Revealing Name: Name methods after what they are intended to do.
  • Conversion Constructor: For most conversions, provide a method on the converted object’s class that takes the source object as a parameter.
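
To make those three concrete, here is a minimal sketch in Python rather than the book’s Java. None of it comes from Beck: the temperature classes and function names are hypothetical, chosen only so that each micro-pattern shows up once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Celsius:
    degrees: float


@dataclass(frozen=True)
class Fahrenheit:
    degrees: float

    # Conversion Constructor: the converted object's class provides a
    # method that takes the source object as its parameter.
    @classmethod
    def from_celsius(cls, source: Celsius) -> "Fahrenheit":
        return cls(source.degrees * 9 / 5 + 32)


# Intention-Revealing Name: the name says what the function is for,
# not how it does it.
def is_freezing(reading: Celsius) -> bool:
    return reading.degrees <= 0.0


def format_celsius(reading: Celsius) -> str:
    return f"{reading.degrees:.1f} C"


# Composed Method: the body is just calls to other small, named methods.
def describe(reading: Celsius) -> str:
    label = "freezing" if is_freezing(reading) else "above freezing"
    converted = Fahrenheit.from_celsius(reading)
    return f"{format_celsius(reading)} = {converted.degrees:.1f} F ({label})"
```

Calling `describe(Celsius(100.0))` returns `"100.0 C = 212.0 F (above freezing)"`. The point is less the arithmetic than the shape: each function does one nameable thing, and the top-level one reads like a summary of the others.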

It’s tempting to say, “Well, everyone knows that,” but of course everyone doesn’t, and even if they did, categorizing and naming the obvious often reveals a lot that isn’t. As I read the book, I thought about how cool it would be if the status bar in Eclipse could tell me which of these micro-patterns I was using in real time as I typed. It would be a great teaching tool, and would keep a lot of corner-cutting programmers (myself included) honest.

Stepping back for a moment, Ford, Martin, and Beck’s books are all trying to teach a way of seeing the world. This is much harder than teaching the syntax of Python 3.0 or how to configure Basie, and it’s very easy for authors who try it to start preaching. (I know, because that’s what I do.) FM&B all have very definite opinions on how you should think when you’re programming; what makes all three books worthwhile is that they set these opinions on the dinner table and hand you a knife and fork, rather than trying to force-feed you or persuade you that yes, you really do like pickled beets.

Lindberg’s Intellectual Property and Open Source also tries to convey a particular way of thinking. In this case, the “way” is the one embodied in America’s legal code, which, like every other legal code, is contradictory, biased, and out of date. As you can guess from the title, Lindberg’s target is software developers who know at most a few basic terms (and are probably even confused about some of those). The book is divided into two parts: an eight-chapter introduction to IP law that covers patents, trademarks, copyrights, trade secrets, and their interaction with open source, and six “how to” chapters to help you figure out who owns your idea (and patches that other people submit), apply a license to your code, skirt around the landmines of reverse engineering, and formalize your project.

The writing is clear, and the examples accessible; my only complaints are that some of Lindberg’s analogies are a bit of a stretch, and that like most books in this area, his only covers the US. Those quibbles aside, I really enjoyed it, and think it deserves a place beside Karl Fogel’s Producing Open Source Software on every open source developer’s bookshelf.

Next up are five Pythonic books. Younker’s Foundations of Agile Python Development and Ziadé’s Expert Python Programming overlap in many ways: both have chapter-long introductions to version control, talk about packaging Python applications for distribution, preach the agile gospel, and so on. The major difference is that Ziadé’s book devotes more space to the advanced features of Python itself, while Younker devotes that space to database programming and setting up build farms. It would be worth browsing either a few months after starting your first big Python project, just to make sure you hadn’t missed anything, but if you have read The Pragmatic Programmer or any of its kin, you will already have seen half or more of their contents. In addition, Ziadé’s book could use a closer proof-reading: some of the examples have been incorrectly indented during typesetting, and if you don’t already understand decorators, the description in Chapter 2 isn’t going to make a lot of sense.
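
For readers who are in that position with decorators, the core idea fits in a few lines: a decorator is a function that takes a function and returns a replacement for it. The sketch below is not taken from either book; `count_calls` and `add` are made-up names used purely for illustration.

```python
import functools


def count_calls(func):
    """Decorator that records how many times func has been called."""

    @functools.wraps(func)  # keep the original function's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)

    wrapper.calls = 0  # counter stored as an attribute on the wrapper
    return wrapper


@count_calls  # equivalent to writing: add = count_calls(add)
def add(a, b):
    return a + b
```

After two calls to `add`, `add.calls` is 2, and `add.__name__` is still `"add"` thanks to `functools.wraps`.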

Copeland’s Essential SQLAlchemy and Bennett’s Practical Django Projects aren’t about Python per se, but rather about two popular programming tools built on top of it. SQLAlchemy is a full-featured object/relational mapping tool that does a very good job of managing persistence, thanks in large part to creative use of Python’s metaprogramming features. I’ve never used more than a small subset of SQLAlchemy’s features, but this book laid out the rest (especially inheritance handling) clearly and in a logical order.

Django, on the other hand, is the most popular of several Rails-style web application frameworks for Python (but uses its own ORM, rather than SQLAlchemy, which tells you all you need to know about why Python’s various offerings are still eating Rails’ dust). While Bennett’s book was written before the final Django 1.0 release, the examples all seem to work with 1.0. The writing is clear, and it isn’t bedevilled by the typos that made Holovaty and Kaplan-Moss’s Definitive Guide to Django so frustrating. Like Essential SQLAlchemy, it is a solid, if somewhat predictable, introduction to its subject: here’s how to install, here’s a “hello, world” application, here’s what you need to know to write something that’s actually useful, and so on. I wouldn’t have minded a few more screenshots, but on the other hand, not having them did force me to actually run more of the code.

The last book in this batch is Kinser’s Python for Bioinformatics. The second part of the title is more important to the author than the first: Kinser’s aim is clearly to help scientists do things like analyze gene sequences, and Python is “just” a useful tool for doing that. Thus, there are chapters on dynamic programming and text mining, rather than on generators or building distribution packages. I think his just-in-time approach will work well for his intended audience, and the extensive examples are a good way for programmers to learn a little bioinformatics.

Python is my favorite language, but I’m paying more attention to JavaScript with each passing day. I don’t particularly like it, but it has become the C of the internet: the thing that everything else depends on. I was therefore more than a little excited to get Douglas Crockford’s JavaScript: The Good Parts in the (physical) mail. Crockford contributed a very entertaining chapter to Beautiful Code, and knows as much about JavaScript as anyone. In this book, he “…digs through a pile of good intentions and blunders to give you a detailed look at all of the genuinely elegant parts of JavaScript…”

At least, that’s what the blurb on the back promises. The actual content was a mixed bag, ranging from unnecessarily-detailed descriptions of the language’s syntax (complete with railroad diagrams) and commentary on the APIs of some built-in types to fairly advanced discussion of how closures and objects work, and how best to use them. A month after finishing it, I’m still not sure who the intended audience is: many parts will leave newcomers bewildered, while experienced programmers will frequently be bored. It isn’t even really the guide to good practice that the title and blurb suggest, as there is too little discussion of why you would do things one way or another.

Adams et al’s The Art & Science of JavaScript seems to have a better idea of who it’s for and what its readers already know. The subtitle, “Inspirational, cutting-edge JavaScript from the world’s best”, is not far off the mark. The book’s seven chapters walk the reader through building a 3D maze, metaprogramming, debugging with Firebug, and other topics. There are plenty of annotated code samples, lots of full-color pictures, and most importantly, a refreshing sense of, “Gosh, isn’t this cool!” It definitely shouldn’t be anyone’s first (or even second) book on JavaScript, but if you’re already comfortable with AJAX and drawing on a browser canvas, there are plenty of ideas here for you.

Gill et al’s Mastering Dojo is a nice counterpoint to the Adams book. For those who haven’t run into it yet, Dojo is a large (some would say “overly large”) JavaScript library for building client-side web applications. Comparable in size to early editions of the Microsoft Foundation Classes or Java’s Swing GUI library, it hides most of the differences between various browsers and does what it can to shield programmers from the legacy of design decisions made in the language’s early days. Like Copeland’s book on SQLAlchemy, this one does what O’Reilly books do best: it lays out everything a programmer needs to know to use Dojo effectively in a readable order, provides plenty of examples, and isn’t shy about describing its limitations and ways to work around them. I would have liked more illustrations, but the index is well-organized, and the examples are well explained.

While I haven’t been paying as much attention to Windows PowerShell (formerly Monad) as I have to JavaScript, I suspect it will have just as much impact on programmers’ lives in the long run. If you haven’t seen it, PowerShell takes the Unix pipe-and-filter model to the next level by allowing components to pass streams of objects around. It might sound like a small change, but it’s not: being able to pipe complex data structures through a bunch of filters, and to take advantage of polymorphism, allows PowerShell to do some pretty amazing things.
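
The difference from a plain-text pipeline can be sketched with Python generators. This is an analogy, not PowerShell: the `FileInfo` and `ProcessInfo` classes and the three filter functions are hypothetical, and Python’s duck typing stands in for the polymorphism that PowerShell gets from .NET objects.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class FileInfo:
    name: str
    size: int  # bytes on disk


@dataclass
class ProcessInfo:
    name: str
    size: int  # bytes of memory in use


def larger_than(items: Iterable, threshold: int) -> Iterator:
    """Filter stage: pass along any object whose .size exceeds threshold."""
    for item in items:
        if item.size > threshold:
            yield item


def sort_by_size(items: Iterable) -> Iterator:
    """Sort stage: largest first. It must consume its whole input first."""
    return iter(sorted(items, key=lambda item: item.size, reverse=True))


def names(items: Iterable) -> Iterator[str]:
    """Projection stage: keep only the .name of each object."""
    for item in items:
        yield item.name


def biggest_names(items: Iterable, threshold: int) -> list:
    """Chain the stages, like `source | filter | sort | select` in a shell."""
    return list(names(sort_by_size(larger_than(items, threshold))))
```

Because each stage receives objects rather than lines of text, no stage has to re-parse its input, and the same filters work on files and processes alike as long as both expose `name` and `size`.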

Kumaravel et al’s Professional Windows PowerShell Programming assumes readers already get this, and spends most of its time explaining how to extend PowerShell with new capabilities. You’ll need to know a bit about .NET programming to follow the examples, but the payoff is being able to build new power tools with just a few dozen lines of code. It definitely isn’t your grandmother’s command line any more…

Finally, there is Allemang and Hendler’s Semantic Web for the Working Ontologist, a (very) detailed introduction to the semantic web’s approach to modeling data. You won’t find a lot of code in the traditional sense in this book; instead, the authors present one real-world data management problem after another, and show how to represent and solve it using RDF, SPARQL, and related technologies. A friend of mine with a master’s degree in library science littered her copy of this book with sticky notes, some of which had double exclamation marks on them. I wasn’t quite as enthusiastic, but that’s probably just a reflection of the fact that I don’t usually have data complex enough to need this depth of analysis. If I ever get around to rewriting Data Crunching, though, I’ll go through this book again very carefully: while the authors occasionally lose the forest in the trees, they are very careful to motivate every new twist and wrinkle they introduce, and their “challenge problems” do a good job of testing the reader’s understanding of the material.

And that’s all: fifteen books, read over five months, and reviewed in as many days. As I said at the outset, I don’t have time to program any more, so I may have missed some crucial details. If so, I apologize in advance; corrections are always welcome. Until then, it’s a beautiful day outside, and I’m going to take my daughter to the park to play on the slide. The big slide, mind you, not the little one; it really does make a difference, and I’m very happy to be re-learning it.


  1. Dean Allemang and James Hendler: Semantic Web for the Working Ontologist. Morgan Kaufmann, 2008, 0123735564, 352 pages.
  2. Cameron Adams, James Edwards, Christian Heilmann, Michael Mahemoff, Ara Pehlivanian, Dan Webb, and Simon Willison: The Art & Science of JavaScript. SitePoint, 2007, 0980285844, 300 pages.
  3. Kent Beck: Implementation Patterns. Addison-Wesley, 2007, 0321413091, 176 pages.
  4. James Bennett: Practical Django Projects. Apress, 2008, 1590599969, 256 pages.
  5. Rick Copeland: Essential SQLAlchemy. O’Reilly, 2008, 0596516142, 230 pages.
  6. Douglas Crockford: JavaScript: The Good Parts. O’Reilly, 2008, 0596517742, 170 pages.
  7. Neal Ford: The Productive Programmer. O’Reilly, 2008, 0596519788, 222 pages.
  8. Rawld Gill, Craig Riecke, and Alex Russell: Mastering Dojo. Pragmatic Bookshelf, 2008, 1934356115, 568 pages.
  9. Jason Kinser: Python for Bioinformatics. Jones & Bartlett, 2008, 0763751863, 417 pages.
  10. Arul Kumaravel, Jon White, Maixin Li, Scott Happell, Guohui Xie, and Krishna C. Vutukuri: Professional Windows PowerShell Programming. Wrox, 2008, 0470173939, 336 pages.
  11. Van Lindberg: Intellectual Property and Open Source. O’Reilly, 2008, 0596517963, 390 pages.
  12. Robert C. Martin: Clean Code: A Handbook of Agile Software Craftsmanship. Prentice Hall PTR, 2008, 0132350882, 464 pages.
  13. Charles Petzold: The Annotated Turing. Wiley, 2008, 0470229055, 384 pages.
  14. Jeff Younker: Foundations of Agile Python Development. Apress, 2008, 1590599810, 416 pages.
  15. Tarek Ziadé: Expert Python Programming. Packt Publishing, 2008, 184719494X, 372 pages.

Books

Adam Reviews “Automated Defect Prevention”

July 1st, 2008

Adam Goucher’s review of Automated Defect Prevention is up at his blog.

Books

Writing a Technical Book

June 17th, 2008

Baron Schwartz has put up a lengthy post describing what it’s like to write a technical book (which I found through Matt Doar’s smaller, but more graphical, post). I’m doing this for the fifth time right now (two solo, two edited, the current one collaborative).  I haven’t been keeping a word count log for this one, but I did for Data Crunching (and I used to for my fiction, back when I was actually writing some), and I found it a great way to keep myself honest.  I also found the big downward spikes showing days when I did nothing but rip out inadequate text rather therapeutic… :-)

Books