Calling the author naive is, I think, uncharitable. I've also written my own ORM...

DavidMcLaughlin · on June 15, 2011

> Yes, the ORM layer makes it easier to have structured queries that can be cached . . . it also makes it harder to have one-off queries that can be tuned easily based on exactly what data is needed and tweaked to convince the database to generate the right query plan, and it makes it much harder to look at some DB stats, identify a poorly-performing query, and then map it back to the code that generated that query.

The problem I have with your post is that you are repeatedly mistaking the high level idea of an ORM with the (seemingly) Spartan implementations that you have used.

Here is an ORM that provides support for automatically profiling the performance of queries over a period of time:

http://squeryl.org/performance-profiling.html

Sample output:

http://squeryl.org/profileOfH2Tests.html

I cannot begin to tell you how much time this has saved me when optimising performance of my webapp.

Note that that particular ORM also has the advantage of having type safe queries - i.e. it can tell at compile time if there's a syntax error in your query (subject to bugs in the ORM :)) - even in dynamically generated queries. In practice this is a fantastic feature as it is so much safer than building up SQL queries with string manipulation and dealing with multiple code paths that depend on user input. The test paths alone in such code (even if you have a "query builder" layer) are the stuff of nightmares.

There are many features missing from Squeryl though that I've had in other ORMs because it makes different trade-offs. But this is what you do when you choose a library, and it's important to understand what trade-offs you're making upfront... otherwise you might find yourself writing off an entire approach to software development as an anti-pattern because you picked the wrong library.

akeefer · on June 15, 2011

I think you're missing my point on the performance side; yes an ORM layer can help you identify slow queries, but it's pretty much the database query plan that will tell you why it's slow. Is it doing a full table scan instead of using an index? Is it applying joins in a sub-optimal order? Are the statistics just off, which causes it to use a bad query plan? At that point you're already in SQL/DBA land, but now you have to map that knowledge back to the ORM layer to fix things.

My experience with type-safe query layers is that they tend to be incomplete; they simply don't let you generate the full range of SQL queries because you're restricted by the language's type system. That said, I'm not particularly familiar with squeryl (and Scala's type system is certainly more expressive than most statically-typed languages), so I can't say what it's limitations are, I can only make general statements.

Anyway, I think it's fair to say it's difficult to talk about ORM generally due to the differences between frameworks and approaches. So I'll try to phrase things more clearly, and say that I think the author's original intent, and the part I agree with, is the fundamental premise that ORM abstractions are inherently leaky and that performance needs often result in a desire to go around the ORM framework to handle something more natively in SQL. Some ORM frameworks embrace those limitations, and allow you to use them when you want to and to work around them when you don't; other frameworks fight that limitation and attempt to swallow the world such that you never have to leave the ORM framework, and those frameworks tend to be the ones that become frustrating to work with.

So if I were to attempt to charitably read the original post, I'd say that perhaps saying it's an "antipattern" is taking it too far, but saying that it's a fundamentally flawed, leaky abstraction is totally accurate, and that recognizing that it's fundamentally leaky means that you, as a developer, should probably take that into account in your application design and your library selection, and that there are some techniques that might help you to do that.

lloeki · on June 16, 2011

In this discussion, everyone seem to broadly assume that it's an all-or-nothing affair.

More precisely, it's a cost-benefit decision. If most queries you will make are hampered by the ORM then by all means, don't use one. But if like in many (most?) situations, an ORM greatly abstracts and eases design and development for 90%+ of your operations and you have like 10% queries to be either tuned or handwritten (even if it has to be handwritten against multiple database types to preserve portability), then an ORM is a net benefit.

Compare it to inline asm in C, or C modules in Python: the higher level stuff makes it efficient to work with top level concepts 90% of the time, but sometimes you have to go down to be actually efficient, or even simply be able to do something, even if that means losing some form of independence (which would then mandate writing the same function for a different platform if you want to preserve portability). Not only it is not an anti-pattern, by no means does it mean either that the abstraction is fundamentally flawed.

The very idea that "going around" an ORM is somehow proving that ORMs are flawed is simply wrong. There are problems that ORMs are built to solve, and there are problems they can't ever solve. "Going around" is part of the deal because it's not a "work around", it's a "work together".

This is very visible in the article, especially the moment the author states that "I claim that the abstraction of ORM breaks down not for 20% of projects, but close to 100% of them". Indeed this is the case, but for close to 100% of those close-to-100%-projects where it "fails", the ORM is helpful for managing 90% of data access implementation. The 10% remainder may need to be implemented at a lower level, but wouldn't it be silly to spend a lot of time on those 90% of code that would get used 10% of the time? This is what ORMs are about, and saves a lot of time to develop the 10% of code that is critical both in usage volume and in performance. Of course such ratios are highly project dependent, and this is what warrants a thoughtful analysis to select the right tool for each task, of which there can be multiple in a single project, or even _object_. ORM, just as NoSQL, is simply not the end-all be-all solution, yet that does not make it a very valid pattern any less.

(edit: cosmetic/typo)