Archive for September, 2004

Solitude Is More Productive

September 11th, 2004

Email, blogs, instant messaging… I don’t have enough willpower to turn them off, or to ignore them if they’re turned on. “I’ll only check mail at the top of the hour,” I tell myself, but then I hit a bug, or a compile takes more than ten seconds, or I have to check the Python or Java API documentation anyway, and, well, what if something really important arrived and I didn’t get to it for a whole hour? A whole hour?

That’s why I’m most productive when I am off the net for two or three hours a day. I have most of the documentation I need on my laptop, so I can search it without exposing myself to networked temptation. A desk, a comfortable chair, some natural light, a power point (because the battery in this refurbished Gateway machine only lasts 90 minutes when the screen brightness is cranked up to readable levels), and the soft, steady hum of an air circulation system, and I’m set. Oh, and a kettle and a box of mixed teas, so that when my eight a.m. cuppa runs out, I can make myself another. As others have discovered, that’s all it takes to get into the flow. What works for you?

Uncategorized

Is Groovy Dead?

September 11th, 2004

No, but it isn’t healthy, and I see no signs of the situation improving. For those who haven’t heard of it, Groovy is a scripting language designed to run on the JVM. It has (or had; see below) a Java-like syntax, but is dynamically typed like Lisp, Perl, and Python, and has some features like closures that Java lacks. Groovy is the first language since Java to receive official blessing from Sun in the form of a Java Community Process standardization committee. Lots of people have been talking about it at conferences recently, and according to traffic on the developers’ list, four (4) books about it are now in the works.

I’ve been interested in language design for years, so earlier this year, I became HP’s semi-official representative on the committee. I was very excited, both because Groovy promised to remove the impedance mismatch that makes ported languages like Jython awkward, and because of the chance to be in at the start of something that promised to be fairly widely used.

The excitement wore off as soon as my students and I started trying to use Groovy. There’s very little documentation, and much of what exists is out of date. OK, that’s forgivable; early days and all that. But then we discovered that the language itself is a seemingly random collection of various developers’ favorite features from other languages, all thrown together in the digital equivalent of Sunday stew. My favorite (and I use that word ironically) is that whitespace is sometimes significant: putting the opening curly brace of a block on a new line will completely change the meaning of a block of code, but only in some cases. And then there are the parenthesization rules: a method call with zero arguments must have parentheses, a call with exactly one argument can omit them, but a call with two or more arguments has to have them again.

I wouldn’t mind this if there were any sign that matters were going to improve. Instead, Groovy’s developers spent the summer throwing one feature after another into the mix, with very little discussion about how they all might interact, and no visible sign of quality control. The problem is that no-one is acting as BDFL (Benevolent Dictator For Life), in the way that Guido van Rossum does for Python. Consider this post, for example; in a single three-page post, Guido summarizes discussion about a contentious feature, and explains why he has decided that it ought to be done a certain way. He knew when he wrote this message that it was going to disappoint a lot of people, but he also knew that saying “No” is an important part of a language designer’s job. Groovy’s core developers seem to be so excited by the possibilities of what they could add to their language that they’re not willing to throw anything away, or leave anything out.

So, is Groovy dead? Not yet. Several other people have pointed out on its mailing lists that it seems to have lost momentum over the summer. With luck, someone will step in who’s willing to break people’s hearts in order to get a useful, usable language out the door. As far as Hippo is concerned, though, we’re not going to take the chance: we’ll use Jython for scripting.

Uncategorized

Doveryay, no proveryay

September 9th, 2004

I don’t often quote Ronald Reagan, but in this case, he’s quoting an old Russian proverb, so I guess it’s OK. Doveryay, no proveryay: trust, but verify. We’re running a training session for the incoming Hippo students on Saturday. Two of the outgoing students spent some time last week installing the required software on the departmental machines, so this morning, I decided to give it a test drive.

Eclipse wouldn’t run—bad permissions. Subversion wouldn’t either—missing some shared libraries. And I’m still trying to get HSQLDB to play nicely on my Windows machine. The students who are helping me seem a little crestfallen, but I actually think we’re doing pretty well. You always run into problems like this when you’re trying to set up a shared development environment. Yesterday, for example, one of my co-workers at Hewlett-Packard and I spent an hour tracking down a problem in an Ant configuration file. Luddites would cite this as further proof that modern tools actually slow programmers down by hiding too much of what’s Really Going On, but in my experience, a little configuration pain now and again repays itself many times over when you’re actually programming.

But you have to verify. You should trust your software, but you have to verify that it is doing what you think it is. Human-readable configurations help, as do integrity-checking tools, logging tools, and all the other debugging aids that we usually don’t take the time to put into the products we build. We’ve made a start on this in Hippo, but there’s a lot more to do…
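
Here is a minimal sketch of what "verify" could look like for a lab install, assuming all we want to know is whether each required tool can be found and is executable on the search path. The tool names in the example are placeholders, not Hippo's actual requirements, and `check_tools` and `report` are hypothetical helpers written for this illustration. It would catch the bad-permissions case above, but not a missing shared library, which only shows up when the program is actually run.

```python
import shutil


def check_tools(names, search_path=None):
    """Map each tool name to the path it resolves to, or None if absent.

    shutil.which only returns files that exist AND are executable, so a
    tool installed with bad permissions shows up as missing.
    """
    return {name: shutil.which(name, path=search_path) for name in names}


def report(results):
    """Turn the results of check_tools into human-readable lines."""
    lines = []
    for name in sorted(results):
        location = results[name]
        if location:
            lines.append(f"ok       {name} -> {location}")
        else:
            lines.append(f"MISSING  {name}")
    return lines


if __name__ == "__main__":
    # Placeholder tool names; substitute whatever the course really needs.
    for line in report(check_tools(["java", "svn", "ant"])):
        print(line)
```

Running something like this on each departmental machine before the training session is cheap, and the output is exactly the kind of human-readable check that is easy to skip and painful to do without.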

Uncategorized

Better is Harder than New

September 7th, 2004

I gave notice at Hewlett-Packard last Friday; I’ve enjoyed working with the Select Access team a lot over the past four years, and have learned a tremendous amount from them, but it’s time for me to focus on a few personal projects (including Hippo; I really would like to learn how it works).

As part of unwinding my commitments at HP, I’m going to resign from JSR-241, the committee that is supposed to be standardizing a new JVM-based scripting language called Groovy. I would probably have done this anyway; developers have been adding an average of one major new feature every ten days for the past three months, without any direction or discipline that I can see. It’s kind of sad—there was a lot of excitement about the language back in the spring, and despite four of its developers now having book deals, I think it’s going to be stillborn.

Coincidentally, David Ascher (of ActiveState) introduced me to Mark Hahn last week. Mark is the driving force behind PyCS, a new Python-inspired language. Mark was interested in the paper I wrote about extensible programming systems, and wanted me on board. We swapped some mail, and I commented on his proposal to use XML as an intermediate format. I closed my comments with:

Which leads me to ask, what is the compelling feature that will make programmers ache to switch from [name of language here] to PyCS? “A tidied up Python with some concurrency constructs” sounds like an improvement over what I’m using now, but not a particularly big one. Putting the “power tools” that Lisp (esp. Scheme) developers have enjoyed for the last quarter century into the hands of the masses—now that is compelling…

That got me thinking about the difference between being new, and being better, and why the latter is so much harder to do. According to Donald Mackenzie (a sociologist at the University of Edinburgh who provided seed money for the second book I worked on), a new technology has to be roughly three times better in order to displace an old one. “Better” can mean faster, cheaper, safer, more reliable, or (I guess) more exciting; the key point is, the difference has to be dramatic to be compelling. A 10% reduction in cost simply isn’t enough for people to give up the comfort of the known.

That’s why I’ve lost interest in Groovy and PyCS: I can’t think of any way in which they’re that much better than what’s out there already. Groovy adds some of Ruby’s features to Java (but does it badly: whitespace is sometimes significant, and my students found its inconsistencies very frustrating); PyCS adds prototypes and concurrency to Python…but so what? Is that enough to persuade people to change languages? To learn a new set of libraries? To bet months of their working lives on a new language’s longevity? I doubt it.

I think these groups would do better to take a leap into the unknown, and build something genuinely different. Sure, it’ll probably fail (i.e., will probably only ever have a few hundred users), but the “slightly better X” approach seems guaranteed to fail. I mean, even steamrollers like Java and .NET took four or five years to establish themselves; if you’re designing something to be a better X in 2004, its chances of being better than the competition in 2009 seem pretty slim…

So why Hippo? At first glance, it’s a “slightly better X”, where “X” equals SourceForge. Version control, mailing lists, user management… there is (deliberately) nothing new in its feature set…

…except batch operations. Hippo is designed to handle courses, in which each student (or small group of students) does her work in a mini-project of her own. This means that instructors have to be able to create dozens or hundreds of more-or-less identical projects. To do this on SourceForge (or GForge, or any of the other clone systems), you must either click and type ’til your fingers go numb, or write a script that throws SQL at the underlying database. No-one would put up with the former; the latter requires skills that many instructors don’t have, and by completely bypassing the system’s data model layer, pretty much guarantees that someone, some day, will corrupt the underlying database beyond repair.
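
To make the point about going through the data model layer concrete, here is a minimal sketch, assuming a toy in-memory store. `ProjectStore` and `create_course_projects` are hypothetical names invented for this illustration, not Hippo's actual API, and the course and student names in the usage note are made up. The idea is that every project in the batch passes through the same validation a single project would, and a failure part-way through leaves nothing half-created.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectStore:
    """Hypothetical stand-in for a data model layer (not Hippo's real API)."""
    projects: dict = field(default_factory=dict)

    def create_project(self, name, owner):
        # The model layer enforces rules that raw SQL would bypass.
        if not name or not owner:
            raise ValueError("project name and owner are required")
        if name in self.projects:
            raise ValueError(f"project {name!r} already exists")
        self.projects[name] = {"owner": owner}


def create_course_projects(store, course, students):
    """Create one project per student, or none at all if any would fail."""
    # Stage the whole batch against a copy of the current state.
    staged = ProjectStore(dict(store.projects))
    names = []
    for student in students:
        name = f"{course}-{student}"
        staged.create_project(name, owner=student)
        names.append(name)
    # Commit only after every project in the batch has been validated.
    store.projects = staged.projects
    return names
```

With this in place, `create_course_projects(ProjectStore(), "csc207", ["alice", "bob"])` returns the two new project names, and a batch containing a duplicate raises `ValueError` without touching the store. A real system would use a database transaction rather than a copied dictionary, but the shape is the same.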

So, batch operations will be Hippo’s “must have” feature. We hope it’ll win hearts and minds for its usability, too, but as Joel Spolsky points out in his latest article, people will forgive a little clunkiness if you give them something they really want.

Now, back to getting HSQLDB to play nicely…

Uncategorized

They’re All Eighth Bolts

September 6th, 2004

Years ago, I was putting together a pine futon frame I’d purchased from Ikea. It was a pretty simple thing—just four uprights and some crosspieces. The whole thing was held together by eight bolts hidden in pre-drilled recesses.

The first seven bolts went in easily, but the eighth bolt took more than half an hour. The problem was getting the bolt to thread through the nut when both pieces were hidden from sight inside the aforementioned recesses. I got up and did odd jobs three times to calm myself down while trying to get the damn thing seated.

There are a lot of eighth bolts in software development—things that should take a couple of minutes, but instead chew up an entire afternoon. Most of them have to do with getting third-party software (i.e., stuff you didn’t write yourself) to play nicely. Never mind algorithms (which is what my computer science courses taught me life was about), or pushing data around, or user interface design, or analyzing requirements, or anything else: anyone building a real modern grown-up application is going to spend days—days!—trying to make sense of out-of-date documentation and idiosyncratic configuration files.

As you’ve probably guessed by now, I’m knee-deep in exactly this process. I’ve been trying to set up two PCs (my girlfriend’s desktop machine, and my laptop) so that I can do Hippo development. I’m logging the process, and am starting to worry that my description will frighten students away. Did you know that Eclipse sometimes pays attention to some environment variables, but only some, and only sometimes? Or that if you install Tomcat on Windows XP with Service Pack 2, it can fail silently because Windows decides that running it would be a security risk? If you reboot after the install, Windows will put up a dialog telling you what it’s doing, but nothing in the install process tells you that you have to do this.

And then there are databases. I’m sure relational databases have to be as complex as they are. Right? They have to be, because otherwise I’d have to believe that database developers are all cruel misanthropes. Hibernate’s developers recommend HSQLDB (formerly Hypersonic), a pure-Java RDBMS, but only one process can connect to it at once, which means that if you want to check what your servlet has done to the database, you have to shut Tomcat down, fire up a viewer, take a peek, then shut down the viewer and re-start Tomcat. It isn’t difficult, but it makes debugging even more painful than it normally is.

OK, so what about Postgres? We’re planning to use it as our production server, so why not download the beta version of the Postgres Windows installer, run it, and carry on? Well, Postgres insists on creating a new user account for security reasons… And then the Cygwin version of Postgres has trouble talking to the version you just downloaded… and then, and then, and then. I’m sure it’s all doable, but jeez, does it really have to hurt this much?

So, two observations. First, it isn’t just Hippo; the Select Access product that I’ve worked on for the past four and a half years has run into the same issues over and over again. (Anecdote: at the first Select Access training course, I apologized to the sales engineers for the product’s complexity. They all laughed. “You have no idea how much simpler this is to install than anything else we’ve ever worked with,” one of them said. “It only takes a day to get it going.”)

Second, where is academic computer science in all of this? When and where are we teaching students that this is important, and that getting it right (or at least not getting it wrong) matters more than adding any number of WYSIWYG peer-to-peer object-oriented voice-activated features to a project?

Back to it…

Uncategorized

Book Review: Decompiling Java

September 2nd, 2004

Godfrey Nolan: Decompiling Java. APress, 2004, 1590592654, 264 pages.


I was excited when
I read the description of this book on APress’s site, since
decompilation is one of those subjects (like debugging and linking)
that has been crying out for a good book for years. Unfortunately,
this book isn’t good: it rambles, it repeats itself, and too many of
the decompilers it references are out of date or have vanished
completely. This isn’t just the author’s fault (better editing would
have made a difference as well), but for now, if you want to make sense
of this particular corner of the programming world, you should look
for some other guide than this.

Books

Book Review: Foundations of Python Network Programming

September 2nd, 2004

John Goerzen: Foundations of Python Network Programming. APress, 1590593715, 512 pages.


Goerzen’s Foundations of Python Network Programming
looks at how to
handle several common protocols, including HTTP, SMTP, and FTP.
Goerzen doesn’t delve deeply into how servers work,
concentrating instead on how to build clients that use these
protocols.

Throughout,
Goerzen builds solutions to complex problems one step at a time,
explaining each addition or modification along the way. He
occasionally assumes more background knowledge than most readers of
this book are likely to have, but only occasionally, and makes up for
it by providing both clear code, and clear explanations of why this
particular function has to do things in a particular order, or why
that one really ought to be multithreaded. I’ve already folded down
the corners of quite a few pages, and expect I’ll refer to this book
often in the coming months.

Books

Book Review: How Tomcat Works

September 2nd, 2004

Budi Kurniawan and Paul Deck: How Tomcat Works. BrainySoftware.com, 2004, 097521280X, 450 pages.


Kurniawan and Deck’s How Tomcat Works is a narrower
book than some, but
seems to be driven by the universal need to make sense of things. The book
delivers exactly what its title promises: a detailed, step-by-step
explanation of how the world’s most popular Java servlet container
works. The authors start with a naive web server that does nothing
except serve static HTML pages until it’s told to stop. From that
humble beginning, they build up to a full-blown servlet container one
feature at a time. Each time they add code, they explain what it’s
doing, and (more importantly) _why_ it’s needed. Their English is
occasionally strained, and there were paragraphs I had to read several
times to understand, but this book is nevertheless an invaluable
resource for every servlet programmer who wants to know more about her
world.

Books

Book Review: Joel on Software

September 2nd, 2004

Joel Spolsky: Joel on Software. APress, 2004, 1590593898, 362 pages.

Joel Spolsky’s Joel on Software
collects some of the witty, insightful articles he has written over
the past four years. If you’re a developer, Spolsky’s weblog is a
must-read: his observations on hiring programmers, measuring how well
a dev team is doing its job, the API wars, and other topics are always
entertaining and informative. Over the course of forty-five short
chapters, he ranges from the specific to the general and back again,
tossing out pithy observations on the commoditization of the operating
system, why you need to hire more testers, and why NIH (the
not-invented-here syndrome) isn’t necessarily a bad thing. Most of
this material is still available on-line, but having it in one place,
edited, with an index, is probably the best twenty-five dollars you’ll
spend this year.

Books

Book Review: Better, Faster, Lighter Java

September 2nd, 2004

Bruce A. Tate and Justin Gehtland: Better, Faster, Lighter Java. O’Reilly & Associates, 2004, 0596006764, 243 pages.


I sometimes think that Ecclesiastes must have been a programmer. “The
thing that hath been, it is that which shall be…and there is no new
thing under the sun.” I don’t know about you, but it sure reminds me
of the never-ending debate between people who want comprehensive
frameworks so that they can assemble applications out of tried and
tested components, and those who want to keep their libraries light,
agile, and (most importantly) comprehensible, even if it means
building more of each application by hand.

This new book from O’Reilly is the latest salvo in this war’s Java
front. By now, most developers would agree that Enterprise JavaBeans
(EJB) is a mess. As Tate and Gehtland point out, any system which
requires you to write half a dozen classes and XML deployment descriptors
in order to implement a simple integer counter must have taken a wrong
turn somewhere. The alternative they propose is certainly attractive:
Hibernate (www.hibernate.org) to persist POJOs (Plain Old Java
Objects), and Spring (www.springframework.org) as a container, are a
fraction of the size of full-blown EJB, but can do most of the things
that most programmers want most of the time.

But a book isn’t just a point of view; it’s a presentation of a
point of view, and that’s where this one left me unsatisfied. First,
there were more paragraphs than I wanted on the future of the industry
(they seem to feel that any company with more than a dozen employees
is a swamp of self-serving hypocrisy), and too many
motherhood-and-apple-pie statements about the virtues of simplicity
and transparency. Simple in whose eyes? Transparent by what measure?
I’m pretty sure the programmer they mock on pg. 17 thought his code
was both…

The authors do much better when they get into the specific technical
problems of EJB, and the ways in which lighter frameworks are better.
However, once they start doing this, they aren’t as critical of their
favored alternative as they are of EJB. For example, five of my
students at the University of Toronto have been working with Hibernate
for the last four months. We definitely prefer it to JDO, or to
rolling our own persistence layer, but we have still had to bend some
of our data model out of shape to fit Hibernate’s needs, and debugging
mapping files is about as much fun as, well, debugging EJB deployment
descriptors.

I came away from this book feeling that the authors never made up
their minds whether they were writing a polemic, or a technical
how-to. A little of one mixed into the other would have been fine,
but the balancing act they seemed to be striving for just didn’t work
for me. Everything they discuss is worth knowing, but this book may
not be the best way to pick it up.

Books