Question
I would like to discuss whether it makes sense to use two-way embedding instead of one-way embedding when modeling an N:M relationship in MongoDB.
Let's say we have two entities: a Product can belong to a few Categories, and a Category can have a very large number of Products.
Two Way Embedding
If we use this approach, our categories would look like this:
{
  _id: 1,
  name: "Baby",
  products: [2]
}
{
  _id: 2,
  name: "Electronics",
  products: [1, 2]
}
And products:
{
  _id: 1,
  name: "HDMI Cable",
  categories: [2]
}
{
  _id: 2,
  name: "Babyphone",
  categories: [1, 2]
}
Queries:
If we would like to fetch products that belong to a specific category:
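// Assumes the Node.js driver in an async context, where findOne() and toArray() return Promises.
const category = await categoriesCollection.findOne({name: "Electronics"});
const products = await productsCollection.find({_id: {$in: category.products}}).toArray();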
If we would like to fetch the categories that a specific product belongs to:
const product = await productsCollection.findOne({name: "Babyphone"});
const categories = await categoriesCollection.find({_id: {$in: product.categories}}).toArray();
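One consequence of the two-way layout is that every link is stored in two documents, so assigning a product to a category takes two writes. Below is a minimal sketch of that write path, assuming the Node.js driver. The helper name linkProductToCategory is hypothetical and not part of any library.

```javascript
// Hypothetical helper: link a product and a category under two-way embedding.
// Both collections are assumed to be Node.js driver Collection objects.
async function linkProductToCategory(productsCollection, categoriesCollection, productId, categoryId) {
  // $addToSet appends the value only if the array does not already contain it.
  await productsCollection.updateOne(
    { _id: productId },
    { $addToSet: { categories: categoryId } }
  );
  // Second write: the mirror entry on the category side.
  await categoriesCollection.updateOne(
    { _id: categoryId },
    { $addToSet: { products: productId } }
  );
}
```

The two updates touch different documents, so unless they run inside a transaction, a failure between them would leave the two arrays out of sync.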
One Way Embedding
Because a product will probably belong to only two or three categories, while a category can have millions of products, I would embed the category IDs in the products and not the other way round. That way we can be sure that a document never reaches the 16 MB maximum document size.
Our products will look the same as above, but categories will no longer have a "products" field.
If we would like to fetch categories for a specific product, our query stays the same as above:
const product = await productsCollection.findOne({name: "Babyphone"});
const categories = await categoriesCollection.find({_id: {$in: product.categories}}).toArray();
In the other direction, when we fetch the products for a specific category, our query changes to this:
const category = await categoriesCollection.findOne({name: "Electronics"});
const products = await productsCollection.find({categories: category._id}).toArray();
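The query above relies on the fact that an equality match on an array field matches any document whose array contains that value. To illustrate that both lookup styles select the same products, here is a small in-memory sketch using plain arrays instead of MongoDB. The function names are hypothetical, and the sample categories keep their "products" arrays only so the two styles can be compared side by side.

```javascript
// Sample data taken from the documents shown earlier.
const sampleCategories = [
  { _id: 1, name: "Baby", products: [2] },
  { _id: 2, name: "Electronics", products: [1, 2] },
];
const sampleProducts = [
  { _id: 1, name: "HDMI Cable", categories: [2] },
  { _id: 2, name: "Babyphone", categories: [1, 2] },
];

// Two-way style: follow the id list stored on the category (like $in on _id).
function productsViaCategoryList(categoryName) {
  const category = sampleCategories.find((c) => c.name === categoryName);
  return sampleProducts.filter((p) => category.products.includes(p._id));
}

// One-way style: keep products whose categories array contains the category id.
function productsViaProductArray(categoryName) {
  const category = sampleCategories.find((c) => c.name === categoryName);
  return sampleProducts.filter((p) => p.categories.includes(category._id));
}
```

Calling either function with "Electronics" selects both sample products, and with "Baby" only the Babyphone.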
In our products collection, we put a (multikey) index on the categories array, so this query can use the index instead of scanning the whole collection.
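Creating that index could look roughly like this. It is a sketch assuming the Node.js driver, and the helper name ensureCategoriesIndex is hypothetical.

```javascript
// Hypothetical helper; productsCollection is assumed to be a Collection
// object from the Node.js driver.
async function ensureCategoriesIndex(productsCollection) {
  // An ordinary ascending index on the array field. MongoDB treats it as a
  // multikey index automatically once a document stores an array there.
  return productsCollection.createIndex({ categories: 1 });
}
```

No special option is needed to make the index multikey; the same createIndex call is used for scalar and array fields.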
My conclusion
One-way embedding seems like the better solution to me: we won't reach the maximum document size, and it doesn't seem to have any(?) disadvantages compared to the two-way embedding approach. Why would anyone ever want to use two-way embedding? Am I missing something? Performance-wise, it should be almost the same, shouldn't it?
What do you think?
Source: https://stackoverflow.com/questions/59898018/two-way-embedding-vs-one-way-embedding-in-mongodb-many-to-many