MongoDB database schema design

隐瞒了意图╮ 2020-12-23 12:30

I have a website with 500k users (running on SQL Server 2008). I now want to include activity streams of users and their friends. After testing a few things on SQL Server it

2 answers
  •  时光说笑
    2020-12-23 13:17

    I'd go with the following structure:

    1. Use one collection, Actions, for all actions that happened.

    2. Use another collection, Subscribers, for who follows whom.

    3. Use a third collection, Newsfeed, for each user's news feed; its items are fanned out from the Actions collection.
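
    To make the three collections concrete, here is a minimal sketch of what one document in each might look like, written as plain Python dicts. Every field name (actor_id, verb, object_id, follower_id, followee_id, owner_id, action_id) is an assumption for illustration; the answer does not prescribe a document layout.

```python
from datetime import datetime, timezone

# Hypothetical document shapes; all field names are illustrative assumptions.

# Actions: one document per thing that happened.
action = {
    "_id": 1,
    "actor_id": "alice",
    "verb": "posted",
    "object_id": "photo:42",
    "created_at": datetime(2020, 12, 23, 12, 30, tzinfo=timezone.utc),
}

# Subscribers: one document per "follower follows followee" edge.
subscriber = {"follower_id": "bob", "followee_id": "alice"}

# Newsfeed: one document per (feed owner, action) pair, written by the fan-out.
newsfeed_item = {
    "owner_id": subscriber["follower_id"],
    "action_id": action["_id"],
    "created_at": action["created_at"],
}
```

    Storing only a reference to the action in each Newsfeed item keeps the fanned-out copies small; copying a few display fields into the item is a possible trade-off if reads should avoid a second lookup.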

    The Newsfeed collection will be populated by a worker process that asynchronously processes new Actions, so news feeds won't update in real time. I disagree with Geert-Jan that real-time delivery is important; I believe most users won't mind even a minute of delay in most (not all) applications. For real-time delivery, I'd choose a completely different architecture.
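
    The asynchronous worker can be sketched roughly as follows. This is an in-memory illustration only: Python lists and a deque stand in for the MongoDB collections and the job queue, and the names record_action and run_worker_once are hypothetical.

```python
from collections import deque

# In-memory stand-ins for the Subscribers and Newsfeed collections (assumption:
# a real system would read and write MongoDB collections instead).
subscribers = [
    {"follower_id": "bob", "followee_id": "alice"},
    {"follower_id": "carol", "followee_id": "alice"},
    {"follower_id": "alice", "followee_id": "bob"},
]
newsfeed = []
pending_actions = deque()  # actions recorded but not yet fanned out


def record_action(action):
    """Called by the web request: only enqueue, do no fan-out work here."""
    pending_actions.append(action)


def run_worker_once():
    """Drain the queue, copying each action into its actor's followers' feeds."""
    written = 0
    while pending_actions:
        action = pending_actions.popleft()
        for edge in subscribers:
            if edge["followee_id"] == action["actor_id"]:
                newsfeed.append(
                    {"owner_id": edge["follower_id"], "action_id": action["_id"]}
                )
                written += 1
    return written


record_action({"_id": 1, "actor_id": "alice", "verb": "posted"})
feed_before = list(newsfeed)  # still empty: the worker has not run yet
written = run_worker_once()   # now bob and carol each get one item
```

    The gap between record_action and run_worker_once is the delay the answer refers to: the feeds stay stale until the worker gets around to the action.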

    If you have a very large number of consumers, the fan-out can take a while, true. On the other hand, putting the consumers right into the object won't work with very large follower counts either, and it will create overly large objects that take up a lot of index space.

    Most importantly, however, the fan-out design is much more flexible and allows relevancy scoring, filtering, etc. I have just recently written a blog post about news feed schema design with MongoDB where I explain some of that flexibility in greater detail.
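
    As a rough illustration of that flexibility, the fan-out step can filter and score items per follower before writing them. The rules below (skip muted verbs, boost close friends) and all field names are invented for this sketch and are not taken from the blog post.

```python
def score(action, follower):
    """Hypothetical relevancy rule: actions by close friends rank higher."""
    return 2.0 if action["actor_id"] in follower["close_friends"] else 1.0


def fan_out(action, followers):
    """Build the Newsfeed items for one action, filtering per follower."""
    items = []
    for follower in followers:
        if action["verb"] in follower["muted_verbs"]:
            continue  # this follower opted out of this kind of activity
        items.append(
            {
                "owner_id": follower["_id"],
                "action_id": action["_id"],
                "score": score(action, follower),
            }
        )
    return items


followers = [
    {"_id": "bob", "close_friends": {"alice"}, "muted_verbs": set()},
    {"_id": "carol", "close_friends": set(), "muted_verbs": {"liked"}},
]
items = fan_out({"_id": 7, "actor_id": "alice", "verb": "liked"}, followers)
```

    Because each follower gets their own item, the score and the decision to include it can differ per user, which is hard to do when the follower list is embedded in the action itself.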

    Speaking of flexibility, I'd be careful with the activitystrea.ms spec. It seems to make sense as a specification for interop between different providers, but I wouldn't store all that verbose information in my database unless you intend to aggregate activities from various applications.
