Mongodb database Schema Design with shared data

时光总嘲笑我的痴心妄想 提交于 2019-12-01 09:37:26
Sim

Your challenge comes from the fact that Prop_Info must be retrieved by both queries. This makes it difficult to figure out which Mongo collection it should live in.

In MongoDB, you create your document schema with the ideal goal for a single document to have all the information you need given your query patterns. In the case where you need to have the same data D (such as Prop_Info in your case) returned by two separate queries against two separate collections A and B, you have to choose between the following three strategies:

  1. Duplicate D in the documents of both A and B, and enforce consistency with your code. This is typically the design choice of high-performance systems that want to eliminate the need for a second query even if that comes at the cost of additional code complexity on the insert/update side and with some potential consistency problems since Mongo is not ACID.

  2. Put D in A and store a reference (DBRef or some other combination of identifying fields) in B so that you can get to it with a second query. This is typically the design choice when the number of queries to A exceeds the number of queries to B. It keeps D local to the more frequently queried collection. In this schema design pattern you only need to make a second query when you query B.

  3. Put D in a new collection C and make a second query to it from both A and B. This is typically the design choice in the face of very uncertain future requirements where it is not clear what the trade-offs would be if you go with (1) or (2) above. It is the most "relational-like" schema and the one that will force you to make a second query when you query both A and B.

Which strategy you choose depends on your domain, the query patterns, the support you get from your object-relational mapping (ORM) framework (if you use one), and last but not least, your preference.

In the situations I've encountered, I've never chosen (3). I've used (1) in high-performance situations (analytics systems). I've used (2) everywhere else since the query access patterns have made it obvious where the "shared" data should live.

Once you pick your strategy, if you still need assistance, post another SO question that specifically focuses on the schema design problem given the chosen strategy.

Three final tips:

  1. If the shared data D has a relationship multiplicity greater than 1 use an array. You can index entire arrays and you can query precisely inside arrays using $elemMatch.

  2. To update D in strategy (1) or (2) use MongoDB's atomic modifier operations, many of which are designed to operate on arrays.

  3. This SO question covers the DBRef two query pattern in @Stennie's answer. (@Stennie works for 10gen, the markers of MongoDB.)

Good luck!

mongotop

You may need just one collection with all the documents, of course you will end up by having too many repeated field but this is the trick to scale well. For the Relations the type one to many and one to one you will just remove the identifier and put the rest of the attributes because MongoDB will take care of the primary key. ('I love MongoDB for that').

For the many to many relationship that you have between Tenant and System you will have to change it to an array in MongoDB data structure.

coll{
Tenant : 'value',
tenant_info : 'value',
Sys_info: 'value' ,
auth_info: 'value' ,
Prop_info : array [ 'value','value',''value....]
}

You are still thinking in relational databases. MongoDB, however, is a document-oriented database.

  1. artificial ID numbers are usually not needed, because every document automatically has a _id field, which is a GUID (as good as guaranteed to be globally unique).
  2. relation tables should not be used in MongoDB. n-type relations are made with arrays fields instead. So when 1 system has N authorizations it uses, your system document should have a field "authorization" which is an array of the object IDs of the authorizations it has. Yes, that would be a horrible violation of the normalization rules of relational databases. But you don't have a relational database here. In MongoDB it is practical to represent N-relations with arrays, because arrays are transparent to the query language.
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!