Suppose I have
public class Product: Entity
{
public IList- Items { get; set; }
}
Suppose I want to find an item with max someth
I think that this is a difficult question that has no hard and fast answer.
A key to one answer is to analyze Aggregates and Associations as discussed in Domain-Driven Design. The point is that either you load the children together with the parent or you load them separately.
When you load them together with the parent (Product in your example), the parent controls all access to the children, including retrieval and write operations. A corrolary to this is that there must be no repository for the children - data access is managed by the parent's repository.
So to answer one of your questions: "why do I need this collection in Product at all?" Maybe you don't, but if you do, that would mean that Items would always be loaded when you load a Product. You could implement a Max method that would simply find the Max by looking over all Items in the list. That may not be the most performant implementation, but that would be the way to do it if Product was an Aggregate Root.
What if Product is not an Aggregate Root? Well, the first thing to do is to remove the Items property from Product. You will then need some sort of Service that can retrieve the Items associated with the Product. Such a Service could also have a GetMaxItemSmth method.
Something like this:
public class ProductService
{
private readonly IItemRepository itemRepository;
public ProductService (IItemRepository itemRepository)
{
this.itemRepository = itemRepository;
}
public IEnumerable<Item> GetMaxItemSmth(Product product)
{
var max = this.itemRepository.GetMaxItemSmth(product);
// Do something interesting here
return max;
}
}
That is pretty close to your extension method, but with the notable difference that the repository should be an instance injected into the Service. Static stuff is never good for modeling purposes.
As it stands here, the ProductService is a pretty thin wrapper around the Repository itself, so it may be redundant. Often, however, it turns out to be a good place to add other interesting behavior, as I have tried to hint at with my code comment.
(Disclaimer, I am just starting to get a grasp on DDD. or at least believe doing it :) )
I will second Mark on this one and emphasize 2 point that took me some times to realize.
The point is that either you load the children together with the parent or you load them separately
The difficult part is to think about the aggregate for your problem at hand and not to focus the DB structure supporting it.
An example that emphasizes this point i customer.Orders. Do you really need all the orders of your customer for adding a new order? usually not. what if she has 1 millin of them?
You might need something like OutstandingAmount or AmountBuyedLastMonth in order to fulfill some scenarios like "AcceptNewOrder" or ApplyCustomerCareProgram.
What if Product is not an Aggregate Root?
i.e. are you going to manipulate the item or the product?
If it is the product, do you need the ItemWithMaxSomething or do you need MaxSomethingOfItemsInProduct?
Given that you really need the item with maxSomething in your scenario, then you need to know what it means in terms of database operation in order to choose the right implementation, either through a service or a property.
For example if a product has a huge number of items, a solution might be to have the ID of the Item recorded with the product in the db instead of iterating over the all list.
The difficult part for me in DDD is to define the right aggregates. I feel more and more that if I need to rely on lazy loading then I might have overseen some context boundary.
hope this helps :)
I believe in terms of DDD, whenever you are having problems like this, you should first ask yourself if your entity was designed properly.
If you say that Product has a list of Items. You are saying that Items is a part of the Product aggregate. That means that, if you perform data changes on the Product, you are changing the items too. In this case, your Product and it's items are required to be transactionally consistent. That means that changes to one or another should always cascade over the entire Product aggregate, and the change should be ATOMIC. Meaning that, if you changed the Product's name and the name of one of it's Items and if the database commit of the Item's name works, but fails on the Product's name, the Item's name should be rolled back.
This is the fact that Aggregates should represent consistency boundaries, not compositional convenience.
If it does not make sense in your domain to require changes on Items and changes on the Product to be transactionally consistent, then Product should not hold a reference to the Items.
You are still allowed to model the relationship between Product and items, you just shouldn't have a direct reference. Instead, you want to have an indirect reference, that is, Product will have a list of Item Ids.
The choice between having a direct reference and an indirect reference should be based first on the question of transactional consistency. Once you have answered that, if it seemed that you needed the transactional consistency, you must then further ask if it could lead to scalability and performance issues.
If you have too many items for too many products, this could scale and perform badly. In that case, you should consider eventual consistency. This is when you still only have an indirect reference from Product to items, but with some other mechanism, you guarantee that at some future point in time (hopefully as soon as possible), the Product and the Items will be in a consistent state. The example would be that, as Items balances are changed, the Products total balance increases, while each item is being one by one altered, the Product might not exactly have the right Total Balance, but as soon as all items will have finished changing, the Product will update itself to reflect the new Total Balance and thus return to a consistent state.
That last choice is harder to make, you have to determine if it is acceptable to have eventual consistency in order to avoid the scalability and performance problems, or if the cost is too high and you'd rather have transactional consistency and live with the scalability and performance issues.
Now, once you have indirect references to Items, how do you perform GetMaxItemSmth()?
In this case, I believe the best way is to use the double dispatch pattern. You create an ItemProcessor class:
public class ItemProcessor
{
private readonly IItemRepository _itemRepo;
public ItemProcessor(IItemRepository itemRepo)
{
_itemRepo = itemRepo;
}
public Item GetMaxItemSmth(Product product)
{
// Here you are free to implement the logic as performant as possible, or as slowly
// as you want.
// Slow version
//Item maxItem = _itemRepo.GetById(product.Items[0]);
//for(int i = 1; i < product.Items.Length; i++)
//{
// Item item = _itemRepo.GetById(product.Items[i]);
// if(item > maxItem) maxItem = item;
//}
//Fast version
Item maxItem = _itemRepo.GetMaxItemSmth();
return maxItem;
}
}
And it's corresponding interface:
public interface IItemProcessor
{
Item GetMaxItemSmth(Product product);
}
Which will be responsible for performing the logic you need that involves working with both your Product data and other related entities data. Or this could host any kind of complicated logic that spans multiple entities and don't quite fit in on any one entity per say, because of how it requires data that span multiple entities.
Than, on your Product entity you add:
public class Product
{
private List<string> _items; // indirect reference to the Items Product is associated with
public List<string> Items
{
get
{
return _items;
}
}
public Product(List<string> items)
{
_items = items;
}
public Item GetMaxItemSmth(IItemProcessor itemProcessor)
{
return itemProcessor.GetMaxItemSmth(this);
}
}
NOTE: If you only need to query the Max items and get a value back, not an Entity, you should bypass this method altogether. Create an IFinder that has a GetMaxItemSmth that returns your specialised read model. It's ok to have a separate model only for querying, and a set of Finder classes that perform specialized queries to retrieve such specialized read model. As you must remember, Aggregates only exist for the purpose of data change. Repositories only work on Aggregates. Therefore, if no data change, no need for either Aggregates or Repositories.
You can now do that with NHibernate 5 directly without specific code ! It won't load the whole collection into memory.
See https://github.com/nhibernate/nhibernate-core/blob/master/releasenotes.txt
Build 5.0.0
=============================
** Highlights
...
* Entities collections can be queried with .AsQueryable() Linq extension without being fully loaded.
...
Having an Items
collection and having GetXXX()
methods are both correct.
To be pure, your Entities shouldn't have direct access to Repositories. However, they can have an indirect reference via a Query Specification. Check out page 229 of Eric Evans' book. Something like this:
public class Product
{
public IList<Item> Items {get;}
public int GetMaxItemSmth()
{
return new ProductItemQuerySpecifications().GetMaxSomething(this);
}
}
public class ProductItemQuerySpecifications()
{
public int GetMaxSomething(product)
{
var respository = MyContainer.Resolve<IProductRespository>();
return respository.GetMaxSomething(product);
}
}
How you get a reference to the Repository is your choice (DI, Service Locator, etc). Whilst this removes the direct reference between Entity and Respository, it doesn't reduce the LoC.
Generally, I'd only introduce it early if I knew that the number of GetXXX()
methods will cause problems in the future. Otherwise, I'd leave it for a future refactoring exercise.
Another way you can solve this problem is to track it all in the aggregate root. If Product and Item are both part of the same aggregate, with Product being the root, then all access to the Items is controlled via Product. So in your AddItem method, compare the new Item to the current max item and replace it if need be. Maintain it where it's needed within Product so you don't have to run the SQL query at all. This is one reason why defining aggregates promotes encapsulation.