Why, oh why, is the industry so wedded to this disaster of a concept? None of the previous answers adequately addresses @johnny's concerns. They are all half-baked hand-waving justifications that steer well clear of the concrete issues facing database programmers.
@TheTXI's answer is typical of the responses I get when asking the same question: separation of layers. What is that supposed to mean? Why do I want to separate my layers? Or, more to the point, how does it benefit me to create an additional layer that is different to the relational layer and yet is supposed to be a canonical mapping from that layer?
Furthermore, (@johnny's as-yet unanswered point) how does this protect us from change? If the database changes, the ORM layer will almost certainly have to follow suit. In fact, the default practice of not modeling join tables at the object level makes this even worse, because when the table inevitably grows some extra columns, you not only add some extra fields to the object model, but you also change its topology and force a round of code rewrites! This is a disaster for change management and is a classic example of how a misguided view of relations (thinking they represent objects) courts disaster. This simply wouldn't have occurred if you just mapped the relational database directly into congruent notions in the programming language (think LINQ-to-SQL, and not LINQ-to-EF).
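To make the topology point concrete, here is a rough Python sketch. Every name in it (`Member`, `Group`, `Membership`, `MembershipRow`, `joined_on`) is hypothetical and not taken from any particular ORM; it only assumes a members/groups schema with a join table between them.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

# ORM-style model: the join table is hidden behind a plain collection.
@dataclass
class Group:
    name: str

@dataclass
class Member:
    name: str
    groups: List[Group] = field(default_factory=list)  # join table is invisible

def group_names_v1(member: Member) -> List[str]:
    return [g.name for g in member.groups]

# The join table gains a joined_on column, so it now has to become a class,
# and the shape of the object graph changes with it.
@dataclass
class Membership:
    group: Group
    joined_on: date

@dataclass
class MemberV2:
    name: str
    memberships: List[Membership] = field(default_factory=list)

def group_names_v2(member: MemberV2) -> List[str]:
    # Every caller that walked member.groups must now go through memberships.
    return [m.group.name for m in member.memberships]

# Congruent mapping: the join table is a row type from day one.
@dataclass(frozen=True)
class MembershipRow:
    member_id: int
    group_id: int
    joined_on: Optional[date] = None  # new column: one new field, same shape
```

In the first model the new column forces `member.groups` to turn into `member.memberships`, which touches every traversal. In the congruent model the same change is a single added field.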
The biggest unanswered question in this space, the elephant in the room, is: what problem is ORM supposed to be solving? And don't say "object-relational impedance mismatch". That's just another hand-waving fob-off. Explain why there is an impedance mismatch, and why it should be the database that comes to the language rather than the language going to the database. My explanation is that most programming languages suck at expressing and working with relational data, but that this is a historical reality that is beginning to slip away (LINQ-to-SQL was the first baby-step in that direction), not a fundamental principle on which to base sound architecture.
There's a reason that ORMs have become so complex, with lazy-loading, caching, a bewildering array of persistence and ownership semantics, etc. And there's a reason that for all this extra pounding away at the keyboard, they still fail to efficiently solve basic problems like, "Which pairs of members share more than one group?" The relational model was conceived at a time when network and hierarchical models were buckling at the knees under the weight of such problems; it was a breath of fresh air. Now we all seem to yearn to go back to our old sandpit full of cat-pee, and we think we've invented something new (as long as we hold our noses).
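For what it's worth, the "pairs of members" question is a one-expression self-join once the membership relation is treated as a plain set of tuples. The following is only a sketch: the data and the function name `pairs_sharing_groups` are invented for illustration, and a real system would push this work down to the database rather than loop in memory.

```python
from collections import Counter

# Hypothetical membership relation: (member, group) tuples, like a join table.
membership = {
    ("alice", "chess"), ("alice", "hiking"), ("alice", "choir"),
    ("bob", "chess"), ("bob", "hiking"),
    ("carol", "choir"), ("carol", "chess"),
    ("dave", "hiking"),
}

def pairs_sharing_groups(membership, more_than=1):
    """Self-join membership on group, then count shared groups per member pair."""
    shared = Counter(
        (m1, m2)
        for (m1, g1) in membership
        for (m2, g2) in membership
        if g1 == g2 and m1 < m2  # m1 < m2 keeps each unordered pair once
    )
    return {pair: n for pair, n in shared.items() if n > more_than}

result = pairs_sharing_groups(membership)
# result == {("alice", "bob"): 2, ("alice", "carol"): 2}
```

The same question asked of an object graph typically means walking each member's groups and each group's members, which is where the lazy-loading and caching machinery starts to hurt.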
(I fully expect to be liberally down-voted on this answer. But please leave a comment when you do. I don't mind being told I'm wrong, as long as I know why.)
EDIT: Thank you @Chris for taking the time to comment. It gives me some concrete points to address... (Note that while I frequently address @Chris below, I am not trying to take him to task specifically; his responses are typical of the kinds of comments I hear all the time when discussing the subject. So I hope he doesn't take my criticisms as a personal affront; they are not intended that way, and I do genuinely appreciate the time he took to respond.)
First off, let me clear up some misconceptions evident in @Chris's comments and answer.
- I do not advocate raw SQL in code, for all the obvious reasons, and some not so obvious ones (e.g., SQL is neither an algebra nor a calculus, which makes functional decomposition virtually impossible).
- I do not advocate monolithic application design. Layers are, in general, a good thing.
- I do not advocate polluting object models with lots of line-noise such as special fields, methods, and attributes. Frankly, however, this is a strawman, since domain/object models only exist in the ORM universe. Now, I know LINQ-to-SQL has all these classes with lots of noisy bits in them, but they are just behind-the-scenes plumbing; you don't edit that code, and you generally shouldn't even look at it.
Now some objections to the objections:
- The assertion that applications can be built independently of the database is unfounded. By and large, ORMs are just a canonical mapping onto the data layer (Tables Foo and Bar become classes Foo and Bar, and table FooBar becomes some kind of torrid affair between classes Foo and Bar). There isn't much wiggle room in this mapping, and so any change to the data model will almost certainly require a corresponding change to the object model. This is a good thing in my view, since an object that radically diverged from the corresponding database model would be nothing more than an additional maintenance headache for all concerned.
- Once the illusion that ORMs engender data-model-independence is discarded, all protestations about the evils of direct coupling to the data model become moot. But I'd like to pursue this a little further than to simply dismiss it. Coupling is an essential feature of system design. At some point, decisions and assumptions have to be made. You can't program everything using a single "Things" table. You have to decide that your domain contains certain specific concepts and then create schemas and code that respect those concepts, treat them as first-class citizens, hard-code them. The idea that applications ought to be independent of the database is misguided. The database is (or ought to be) the purest representation of a business's knowledge (I know that this isn't always the case, and I will address this later). Coupling to this representation ought to provide the strongest guarantee of resilience, since such a data model will only change when the business itself undergoes some intrinsic change. In short, coupling to a well-designed database schema is a very good thing.
- Layering is not an end in its own right. It is good because it achieves some specified goal. The preceding points show that layering between the database and the app in the way ORM does is neither effective nor necessary to achieve the real goal of resilience to change. That resilience is achieved through good database design.
- @Chris asserts that letting the database dictate things stymies OO design. This is true enough, but it is only interesting if OO design is the best way to model knowledge. The nearly complete failure of OODBMSs in the marketplace hints that this is not the case. The relational model, with its predicate-logic foundation, possesses the same expressive power as OO design without incurring the graph-theoretic complexities of OO models.
- @Chris's objection to the relational model on the grounds that it doesn't solve today's problems (hence the NoSQL movement) is completely off the mark. NoSQL means "No SQL", not, "No relational model". Unfortunately even proponents of the NoSQL movement seem to be quite clueless in this regard. SQL has deep flaws, many of which can be traced to its radical departure from the relational model. To say that we should abandon the relational model because SQL sucks is a rather blatant case of throwing the baby out with the bathwater.
- Failure to use an ORM does not triple the effort of building an application. This is a ludicrous claim, and even @Chris seems to be holding the back door open on it with a backhanded compliment to the codegen alternative. Codegen tools such as LINQ-to-SQL's sqlmetal are a perfect solution for anyone who isn't wedded to the dogma that the application's data model absolutely has to be different to the database's data model.
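As a rough illustration of the codegen idea (this is not how sqlmetal works internally, just a toy sketch using Python's standard library), row classes can be derived mechanically from the schema. The function `generate_row_classes`, the type table `SQL_TO_PY`, and the `Foo`/`Bar`/`FooBar` columns are all assumptions made up for the example; the SQL here lives only in the tooling, not in application code.

```python
import sqlite3
from dataclasses import fields, make_dataclass

# Assumed, deliberately tiny mapping from SQLite column types to Python types.
SQL_TO_PY = {"INTEGER": int, "TEXT": str, "REAL": float}

def generate_row_classes(conn):
    """Build one dataclass per table, with one field per column."""
    classes = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        classes[table] = make_dataclass(
            table,
            [(name, SQL_TO_PY.get(ctype.upper(), object))
             for _, name, ctype, *_ in columns],
        )
    return classes

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Foo (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Bar (id INTEGER PRIMARY KEY, weight REAL);
    CREATE TABLE FooBar (foo_id INTEGER, bar_id INTEGER, note TEXT);
""")
classes = generate_row_classes(conn)
foo = classes["Foo"](1, "example")
```

Note that `FooBar` gets its own class like any other table, so a new column on it later is just a new field after regenerating.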
My own experience with ORMs has been that they work great in tutorials and cause endless pain and frustration in the real world. With LINQ-to-SQL fixing many of the problems that motivated ORMs in the first place, I see no reason to put myself through that kind of torture.
One major problem remains: the current crop of SQL databases doesn't offer any meaningful degree of control over the separation of physical and logical layers. The mapping from a table to stuff on the disk is largely fixed and entirely under the control of the SQL DBMS. This was not part of the plan for the relational model, which explicitly separated the two, and allowed for the definition of a consistent logical representation of data that could be stored on disk in a completely different structure than was suggested by the logical model. For instance, a system (or DBA) would be free to physically denormalise a highly normalised logical model for performance reasons. Because SQL engines don't allow this separation of concerns, it is common to denormalise or otherwise torture the logical model through sheer necessity. As a result, logical models can't always be exactly as they should be, and so the ideal of using the database as the purest representation of knowledge cannot be fully realised. In practice, however, designers generally stick to a canonical mapping from database to domain model anyway, because anything else is just too painful to maintain.