Go On, Convince Me

(Posted on behalf of Alex Krizhevsky.)

I and a few other students have spent the last four weeks working on DrProject, a project management tool derived from Trac, and written in Python, that has been under development at the University of Toronto since 2002. We are here for the entire summer under the supervision of Karen Reid and Greg Wilson, whom we like so much that a small movement (OK, just me) has emerged with the noble goal of making the stuffed likeness of Greg DrProject’s official mascot of.

One of the tasks with which I have not been particularly involved is porting the database code in DrProject from Trac’s built-in object-relational mapper (ORM) to one that is becoming somewhat of a standard for Python applications, called SQLAlchemy. So what have I learned? I’ve learned that there exist these things called object-relational mappers. But I should have known about them earlier, because my dad spent the last few years writing one in Java (he was unhappy with Hibernate, the most popular Java ORM/persistence framework).

I’m still having trouble convincing myself that these things offer a distinct advantage over plain-old SQL, at least when we have to write the object definitions by hand. That’s a pain. Why shouldn’t the objects be generated from the table definitions? The table definitions are right there. ORMs like SQLAlchemy make the code harder to modify, too, as any change in the database definition must be (manually) reflected in the object definitions. (Unless you use the ORM to define the database. But it seems like a database should be independent of the application accessing it. And defining the database with an ORM is redundant if you have more than one application accessing the database. Plus it forces you to learn a new database definition language.)

And another thing! SQL is much more portable than ORMs. ORMs are vastly different, and they each have their own syntax. But SQL is, for the most part, standardized. Switching to SQLAlchemy is taking us quite a bit of effort. But switching from MySQL to PostgreSQL, for example, would involve much less manual labour. It would be easier on our backs, and we’d live longer. But I concede that this is somewhat of an apples-to-oranges comparison.

And a third thing! I don’t trust the SQL generated by my ORM. Who knows what it’s doing? Is it running one query or 100? (see: N+1 selects problem.) Is the query efficient?

I get this much: the reason objects are nice is because they have types. You can access a field of an object mapped to a database table and it will be a date or an int or whatever it’s supposed to be. In Java this is a real benefit, because the only alternative is to run an SQL query and get back an Object[] with a value for each column. Then, you litter your code with casts. Such code quickly becomes very ugly and hard to read and love. But Python doesn’t have this problem. It’s untyped.

Here’s another pro-ORM argument I get: lots of very short, trivial SQL statements littered throughout the code make its purpose hard to understand. I agree with this, and here I see a clear advantage of ORMs. But often the queries get complicated, and then we’re back to writing complex SQL, except with objects. This is insanely stupid, and if I weren’t so in control of my emotions I’d be inclined to say something more emotional here. But since I’m not emotional, I don’t need to calm down. So on to the next sentence: it can be argued that such code is always the result of a poor mapping between objects and database tables. A mapping that doesn’t capture all of the relationships. Maybe. Sometimes. I still suspect that complex queries are at times unavoidable. Like I said, I haven’t been involved with a lot of ORM code. But anyone can blog. It’s a whole Internet universe of mediocrity, and I’m participating!

So here’s a final mediocre thought: ORMs that generate object definitions might be worthwhile, I’m not sure. But ORMs like SQLAlchemy that force us to write our own object definitions are stupid and should not be used. I’d give more mediocre reasons, but this blog entry is already long enough. But please, convince me that they are totally awesome.

In the wake of posts about Shopify's support for white nationalists and DataCamp's attempts to cover up sexual harassment
I have had to disable comments on this blog. Please email me if you'd like to get in touch.