MongoDB Schema for Posts and Shares

Submitted by 痴心易碎 on 2019-12-12 04:54:28

Question


I am new to MongoDB and NoSQL concepts, and I am stuck deciding how to model the schema so that it best serves my purpose.

I need to design the schema so that my end result is Posts and Shares sorted by time. I considered two options:


Option 1: Separate collections for Posts and Shares

Schema for the Post collection:

var mongoose = require('mongoose');

var postSchema = mongoose.Schema({
   postText: String,
   postedBy: String, 
   privacy: Number,
   updatedOn: { type: Date, default: Date.now }        
}, { collection: 'posts' }); 

Schema for the Share collection:

var shareSchema = mongoose.Schema({
   dis_Id: { type: mongoose.Schema.Types.ObjectId }, // Id of post that is shared
   shareBy: { type: mongoose.Schema.Types.ObjectId },
   shareText: String,
   share_privacy: Number,
   shareOn: { type: Date, default: Date.now }
}, { collection: 'shares' });

Option 2: Embed the Share in the Post itself

New schema for Post:

var postSchema = mongoose.Schema({
  postText: String,
  postedBy: String,
  updatedOn: { type: Date, default: Date.now }, 
  privacy: Number,
  share: {
    shareBy: { type: mongoose.Schema.Types.ObjectId },
    shareText: String, 
    share_privacy: Number,
    shareOn: { type: Date } 
  }       
}, { collection: 'posts' });

Which of these is the better choice? Option 1 makes querying harder because there are no joins in MongoDB, while Option 2 duplicates the same post data for every share, which could grow to billions of documents with hundreds of thousands of users.


Answer 1:


OK, I suggest the following approach:

  1. Since you already have the username, you can retrieve that user's posts in sorted order using sort.

  2. Iterating over each post, you can fetch its shares in sorted order using the same kind of sort.
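
The two steps above can be sketched roughly as follows. This is only an illustration, not a definitive implementation: the function name fetchPostsWithShares is hypothetical, Post and Share are assumed to be Mongoose models compiled from the question's postSchema and shareSchema, and postedBy is assumed to be the field that holds the username.

```javascript
// Sketch only: Post and Share are assumed to be Mongoose models built from
// the question's schemas; they are passed in rather than created here.
async function fetchPostsWithShares(Post, Share, username) {
  // Step 1: this user's posts, newest first.
  const posts = await Post.find({ postedBy: username })
    .sort({ updatedOn: -1 })
    .lean();

  // Step 2: for each post, its shares, newest first.
  const result = [];
  for (const post of posts) {
    const shares = await Share.find({ dis_Id: post._id })
      .sort({ shareOn: -1 })
      .lean();
    result.push({ post, shares });
  }
  return result;
}
```

Note that this issues one share query per post, so it is probably only reasonable when each page of posts is small.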

The key here is understanding the indexes you will be setting. I suggest the following indexes:

post_schema: compound index on {username, updatedOn} (the username being the postedBy field in the question's schema)

share_schema: compound index on {dis_Id, shareOn}

Without these compound indexes, each query has to scan and sort far more documents than it returns, and your application will not scale to a large number of records.
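
As a minimal sketch of how these indexes might be declared with Mongoose, assuming postSchema and shareSchema are the schema objects from the question and that postedBy is the field holding the username (the helper name addFeedIndexes is hypothetical):

```javascript
// Sketch only: declares the two suggested compound indexes on the schemas
// from the question. The schemas are passed in as parameters.
function addFeedIndexes(postSchema, shareSchema) {
  // Equality on the author, then newest-first ordering by update time.
  postSchema.index({ postedBy: 1, updatedOn: -1 });

  // Equality on the shared post's id, then newest-first ordering by share time.
  shareSchema.index({ dis_Id: 1, shareOn: -1 });
}
```

The helper would be called before the schemas are compiled with mongoose.model(...). Putting the equality field first and the sort field second is what should let a query such as find({ postedBy: ... }).sort({ updatedOn: -1 }) use the index for both filtering and ordering.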




Answer 2:


Embedded documents are easy to work with because you get all the required data together, so option 2 is good in that respect. But if you are concerned about documents growing beyond the 16 MB document size limit, go with option 1. In that case, do not use an aggregation query to join the two collections, because that will be a time-consuming operation: it first performs all the other operations and only then performs the skip. Instead, query each collection individually and build the complete response yourself with some custom logic.
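
The "custom logic" could look something like the sketch below, which assumes the posts and shares have already been fetched by two separate queries and uses the updatedOn and shareOn fields from the question's schemas. The function name buildFeed and the shape of the returned items are hypothetical.

```javascript
// Sketch only: merges the results of two separate queries into a single
// feed sorted by time, newest first. Inputs are plain arrays of documents.
function buildFeed(posts, shares) {
  const items = [
    ...posts.map(p => ({ kind: 'post', time: p.updatedOn, doc: p })),
    ...shares.map(s => ({ kind: 'share', time: s.shareOn, doc: s })),
  ];
  // Wrapping in Date handles both Date objects and date strings.
  items.sort((a, b) => new Date(b.time) - new Date(a.time));
  return items;
}
```

For example, buildFeed(postsFromQuery, sharesFromQuery) yields one list in which each entry is tagged as a post or a share, so the caller can render them in a single timeline.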



Source: https://stackoverflow.com/questions/46262800/mongodb-schema-for-posts-and-shares
