I've almost finished my Data Mapper, but now I'm at the point where it comes to relationships.
I will try to illustrate my ideas here. I wasn't able to find good
Before I even start, I'll assume you've read Fowler's PoEAA book from beginning to end. =) I'll also assume you've already thought through the first issues you face when dealing with ORMs. An easy one to highlight: calling a DataMapper multiple times with the same identifier should always return the same object (read as IdentityMap).
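To make the IdentityMap point concrete, here is a minimal sketch in Python (the thread doesn't pin down a language). `Company`, `CompanyMapper` and the in-memory `rows` dict are made-up names standing in for your own classes and database; this is one possible shape, not how any particular ORM does it.

```python
from dataclasses import dataclass


@dataclass
class Company:
    id: int
    name: str


class CompanyMapper:
    """Hypothetical mapper; `rows` stands in for a database table."""

    def __init__(self, rows):
        self._rows = rows            # {id: {"name": ...}}, fake storage
        self._identity_map = {}      # id -> Company that was already loaded
        self.queries = 0             # counts simulated database hits

    def find(self, company_id):
        # Return the cached instance if this id was loaded before.
        if company_id in self._identity_map:
            return self._identity_map[company_id]
        self.queries += 1
        row = self._rows[company_id]
        company = Company(id=company_id, name=row["name"])
        self._identity_map[company_id] = company
        return company


mapper = CompanyMapper({1: {"name": "Acme"}})
first = mapper.find(1)
second = mapper.find(1)   # same object as `first`, no second lookup
```

Because both calls hand back the very same instance, a change made through `first` is visible through `second`, which is the whole point of the pattern.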
Important: Is a Data Mapper allowed to use another Data Mapper?
It is only possible to have one DataMapper access another one if the second is held as a weak reference.
Let's say in most cases you need both the company object AND the address object, because you always display them together in a list. In this case, the CompanyDataMapper not only fetches company objects, but also runs an SQL query with a JOIN to get all the fields of the address object. Finally, it iterates over the record set and feeds new objects with their corresponding values, assigning the address object to the company object.
The problem you're describing sounds simple, but it is fairly complex behind the scenes.
First of all, you shouldn't have a getAddress(Company); you're better off using Proxy objects. A proxy is a non-initialized representation of a given instance. In this case, a Proxy holds a reference to the entry you're looking for. It must extend your original class and provide an initialization method, together with a related DataMapper to load it.
The second part, about JOINing and loading multiple objects at once, is the job of a Hydrator. Hydrators receive a flat structure of rows and columns and convert it into an object graph. But this really opens a separate issue: if you're dealing purely with objects, why are you fetching tables? Taking an object-fetching approach would lead you to implement a sort of OQL (Object Query Language).
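A rough sketch of such a Proxy, in Python for brevity. Everything here (`Address`, `AddressMapper`, `load_into`, the dict used as a fake table) is a hypothetical name chosen for illustration. The proxy subclasses the real class, knows only the identifier, and asks its mapper for the rest the first time a missing field is read.

```python
class Address:
    def __init__(self, id, street):
        self.id = id
        self.street = street


class AddressMapper:
    """Hypothetical mapper over a dict standing in for the address table."""

    def __init__(self, rows):
        self._rows = rows
        self.queries = 0             # counts simulated database hits

    def load_into(self, address):
        self.queries += 1
        address.street = self._rows[address.id]["street"]


class AddressProxy(Address):
    """Extends Address but starts out knowing only the identifier."""

    def __init__(self, id, mapper):
        # Deliberately skip Address.__init__: nothing but the id is known yet.
        self.id = id
        self._mapper = mapper
        self._initialized = False

    def _initialize(self):
        if not self._initialized:
            self._initialized = True
            self._mapper.load_into(self)

    def __getattr__(self, name):
        # Python calls this only when normal lookup fails,
        # which here means the field has not been loaded yet.
        if name.startswith("_"):
            raise AttributeError(name)
        self._initialize()
        return object.__getattribute__(self, name)


mapper = AddressMapper({7: {"street": "1 Main St"}})
address = AddressProxy(7, mapper)   # no query so far
street = address.street             # first real access triggers the load
```

Code that receives `address` can treat it as a plain `Address`; reading `id` never touches the mapper, and reading `street` loads the row exactly once.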
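As an illustration of what a Hydrator does, here is a small Python sketch. The column names (`company_id`, `company_name`, `address_id`, `street`) and the two classes are assumptions for the example; the rows are what a JOIN between a company table and an address table might return.

```python
from dataclasses import dataclass, field


@dataclass
class Address:
    id: int
    street: str


@dataclass
class Company:
    id: int
    name: str
    addresses: list = field(default_factory=list)


def hydrate_companies(rows):
    """Turn flat JOIN rows into Company objects with nested Address objects."""
    companies = {}   # company id -> Company, keeps first-seen order
    for row in rows:
        company = companies.get(row["company_id"])
        if company is None:
            company = Company(row["company_id"], row["company_name"])
            companies[company.id] = company
        # A LEFT JOIN yields NULL address columns for companies without one.
        if row["address_id"] is not None:
            company.addresses.append(Address(row["address_id"], row["street"]))
    return list(companies.values())


rows = [
    {"company_id": 1, "company_name": "Acme", "address_id": 10, "street": "1 Main St"},
    {"company_id": 1, "company_name": "Acme", "address_id": 11, "street": "2 Side St"},
    {"company_id": 2, "company_name": "Globex", "address_id": None, "street": None},
]
companies = hydrate_companies(rows)
```

The company columns repeat on every row of the result set, so the hydrator has to recognise a company it has already built and attach further addresses to it instead of creating a duplicate.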
Important: Is this true? Must I send out 100 queries to get 100 address objects, if I have 10 companies with 10 addresses each?
Dealing with a collection of objects is a nightmare in PHP. Yes, the language sucks a lot for the lack of a powerful collection implementation. Basically, you are required to deal with different situations here:
- new instance and all elements in this list of elements are new
- new instance and all elements in this list of elements are pre-existent
- new instance and elements in this list of elements are mixed between new and pre-existent
- pre-existing instance and not touching anything on the list of elements
- pre-existing instance and manipulating items on the list
I'm being very simplistic here, but the main point I want to highlight is the need for a Collection object. There are two kinds: one that deals with new lists and one that deals with existing lists. The one that deals with existing lists needs to be able to load the collection as soon as you try to access anything inside it. That's the only way to avoid n + 1 issues.
This also highlights the next big problem you'd have to deal with. Associations can be uni-directional or bi-directional. If Company knows about Address but Address has no idea about Company, the association is uni-directional; if a User is part of many Groups and each Group contains many Users, it is bi-directional. Things easily become a nightmare here, and that's why you need Mapping patterns to properly understand what's going on.
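A minimal Python sketch of the second kind of collection, the one wrapping an existing list. `LazyCollection` and the `loader` callable are hypothetical names; in a real mapper the loader would run a single query for all elements belonging to the owner.

```python
class LazyCollection:
    """Collection for a pre-existing owner; loads all items on first access."""

    def __init__(self, loader):
        self._loader = loader      # callable returning the full list of items
        self._items = None         # None means "not loaded yet"

    def _load(self):
        if self._items is None:
            self._items = list(self._loader())
        return self._items

    def __iter__(self):
        return iter(self._load())

    def __len__(self):
        return len(self._load())

    def __getitem__(self, index):
        return self._load()[index]

    def append(self, item):
        self._load().append(item)


queries = []

def load_addresses():
    # Stand-in for one query fetching every address of company 1.
    queries.append("addresses of company 1")
    return ["1 Main St", "2 Side St"]

addresses = LazyCollection(load_addresses)   # nothing is loaded here
count = len(addresses)                       # first access loads the whole list
```

Any access, whether iterating, counting or indexing, fills the whole list in one go, so the elements are never fetched one at a time.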
Dealing with many-to-many is just the same as dealing with collections in general.
There is an important part you haven't considered yet. If I build my entire object graph (Company and Address) and I decide to persist them... does it need to persist both, or do I have to manually tell it what I want to persist? Both ways have different sets of problems. Let's assume you want the first approach. You've just entered what I consider one of the most complex design patterns to implement: UnitOfWork. Then you'd have to deal with sorting the order in which entities are written so you don't run into constraint problems (read up on Topological Sorting for how to solve this). If you take the second approach, you may easily end up in a situation where it feels like your tool is broken, mainly because it's very easy to leave your object graph in an inconsistent state.
Finally... are you planning ANY support for inheritance? If so, your entire planning just moved to a whole new level. =( Trying to explain would take me a book. But I can point to some design patterns you can look at: Concrete Table Inheritance (1 class, 1 table), Single Table Inheritance (N classes, 1 table) and Class Table Inheritance (N classes, M tables).
I could go into depth on many different points here, but ORMs normally make heads explode. I'll stop for now.
PS: I'm one of the core developers of Doctrine ORM. Unless you're doing this for study purposes, don't bother trying to create another one. It's extremely complex and time consuming, and it demands lots and lots of planning on how things should work before you even code the first line. As a matter of fact, we planned Doctrine ORM for 2 years and took 1 year to reliably implement the core functionality. I'm not discouraging you, but as Fowler said in his ORM hate article, it's a complex solution for an even more complex problem.
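For the ordering part, a small Python sketch using `graphlib.TopologicalSorter` from the standard library (Python 3.9+). The entity names and the `commit_order` helper are invented for the example, and a real UnitOfWork has far more to do than this; the snippet only shows how the write order falls out of the foreign-key dependencies.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def commit_order(dependencies):
    """Return entity names in an order that is safe to INSERT.

    `dependencies` maps each entity to the entities it holds a foreign
    key to, which therefore have to be written first.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical schema: Address references Company, Company references Country.
order = commit_order({
    "Address": {"Company"},
    "Company": {"Country"},
    "Country": set(),
})
# order == ["Country", "Company", "Address"]
```

If two entities reference each other, `static_order()` raises `graphlib.CycleError`, which is a fair picture of the real problem: circular foreign keys need extra handling such as inserting one side with a NULL and updating it afterwards.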
I am looking forward to any answers you'll get on this topic, but in the meantime why not just hop over to Amazon (or your local books dealer) and finally buy
These books contain the original patterns you have been pointed to in several of your questions and are considered reference works in Design Patterns and Software Architecture.
I too am working through this issue. To start with, I have adapted the Data Mapper pattern from Matt Zandstra's PHP Objects, Patterns, and Practice (2d Ed). I see now that a new edition has come out
Perhaps the most ingenious part of the setup is the "Collections" objects. I am not sure what language you are using, so I'll spare you the details. Suffice it to say that PHP has an Iterator interface that makes it possible to load an array (map, in other languages) at first and transform the raw data into objects (hydrate?) on the fly, while looping.
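To give a rough idea of the mechanism without the PHP details, here is a Python sketch of a collection that keeps the raw rows and builds each object only when the loop reaches it. `HydratingCollection`, `Company` and `make_company` are names made up for this example and are not taken from the book.

```python
from dataclasses import dataclass


@dataclass
class Company:
    id: int
    name: str


class HydratingCollection:
    """Holds raw rows and converts each to an object only when it is reached."""

    def __init__(self, rows, factory):
        self._rows = rows          # raw result set, e.g. a list of dicts
        self._factory = factory    # callable turning one row into an object
        self._objects = {}         # position -> object already built

    def __len__(self):
        return len(self._rows)

    def __getitem__(self, index):
        # Integer positions only in this sketch; build once, then reuse.
        if index not in self._objects:
            self._objects[index] = self._factory(self._rows[index])
        return self._objects[index]

    def __iter__(self):
        for index in range(len(self._rows)):
            yield self[index]


def make_company(row):
    return Company(row["id"], row["name"])


rows = [{"id": 1, "name": "Acme"}, {"id": 2, "name": "Globex"}]
companies = HydratingCollection(rows, make_company)
names = [company.name for company in companies]   # objects are built during the loop
```

If the loop stops early, the remaining rows are never turned into objects, and an object that was already built is reused on the next pass.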
Like you, I am struggling with how to load relationships. What I have found so far is that I can write my massive JOIN query in the Mapper class and create both a dehydrated collection for the target object and sneak in the data on the related objects at the same time.
I really dislike "Lazy Load" because it leads to so many database queries. It offends my perfectionist sensibilities to know that I am using tens or hundreds of queries to accomplish what could be done in one.
I, too, am looking forward to more answers.