What I need:
Suppose you're using MongoDB and you have a collection called users, and each user has a "following" array with user
I can't see any other way either; I implemented something like this before and didn't have a problem.
In your case, it should be something like this, where you pass a given user's $follower_ids
array as an argument to your function:
$query = array("status_owner_id" => array('$in' => $follower_ids));
$cursor = $mongo->yourdb->statuses->find($query);
And if you index statuses on the owner ID field (status_owner_id in the query above), and you have enough RAM to hold the index, you'd get the results really fast.
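To make the read path concrete, here is a minimal sketch in plain Python rather than real driver calls. The in-memory list stands in for the statuses collection, the dictionary plays the role of the index on status_owner_id, and the `time` field and the function names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the statuses collection.
statuses = [
    {"status_owner_id": 2, "text": "hello from user 2", "time": 10},
    {"status_owner_id": 3, "text": "hello from user 3", "time": 12},
    {"status_owner_id": 4, "text": "not followed", "time": 11},
    {"status_owner_id": 2, "text": "second post", "time": 15},
]


def build_owner_index(docs):
    """Group statuses by owner, playing the role of an index on status_owner_id."""
    index = defaultdict(list)
    for doc in docs:
        index[doc["status_owner_id"]].append(doc)
    return index


def timeline_on_read(index, follower_ids, limit=20):
    """Rough equivalent of the $in query: gather, then sort newest first."""
    matched = [doc for owner in follower_ids for doc in index.get(owner, [])]
    matched.sort(key=lambda d: d["time"], reverse=True)
    return matched[:limit]


owner_index = build_owner_index(statuses)
feed = timeline_on_read(owner_index, [2, 3])
```

Note that the merge and sort happen on every read, so the cost grows with the number of people a user follows.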
Hope it helps, Sinan.
What you tried is what everybody thinks of first; however, it's not really easy to scale. You can always add more servers, use sharding, and so on, but if you have millions of users and people who follow lots of other people, this solution becomes really hard to execute.
There is another solution, which is basically to do the aggregation when someone posts a status. Facebook uses this idea, and it may be easier to scale: if someone is following 25,000 people, he will still see his list of statuses pretty quickly, and your server won't have to "fight" to retrieve the data.
You will have a users collection, and each user will have a statuses array. Let's say you have user1 and user2, and user1 follows user2. When user2 posts a status, it is saved in user1's array of statuses AND in user2's array of statuses. You will use more storage, which with MongoDB means more memory. At Facebook they use Hadoop with HBase for the main storage, and then they have huge arrays of servers with lots of memory.
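The write-time aggregation described above could be sketched roughly like this. It is plain Python over an in-memory dictionary, not MongoDB code; the document layout and the helper names `followers_of` and `post_status` are assumptions for illustration only.

```python
# Hypothetical in-memory stand-in for the users collection.
users = {
    "user1": {"following": ["user2"], "statuses": []},
    "user2": {"following": [], "statuses": []},
}


def followers_of(users, author_id):
    """Find everyone whose "following" array contains the author."""
    return [uid for uid, doc in users.items() if author_id in doc["following"]]


def post_status(users, author_id, text, time):
    """Copy the new status into the author's array and every follower's array."""
    entry = {"from": author_id, "text": text, "time": time}
    for uid in [author_id] + followers_of(users, author_id):
        # Prepending keeps each array newest-first, assuming posts arrive in time order.
        users[uid]["statuses"].insert(0, dict(entry))


post_status(users, "user2", "first", 1)
post_status(users, "user2", "second", 2)
```

Reading a timeline is then just reading one user's statuses array, with no merge or sort at read time.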
One drawback is that if you delete a status, you have to delete it everywhere. The major advantage of this solution is that each user has an array of statuses already in order! In the previous solution, if you follow 3 users, you need to grab all their feeds, then sort them, then render them.
[Edit] As Shekhar points out in the comments, Mongo has a document size limit. You need to create a statuses collection and save the status twice, once for user2 and once for user1, with fromId, toId, status, and time fields.
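A rough sketch of the separate statuses collection, again as plain Python over an in-memory list rather than driver calls. The field names fromId, toId, status, and time come from the edit above; the function names, and the choice to identify a status by (fromId, time) when deleting, are assumptions.

```python
# Hypothetical in-memory stand-in for the statuses collection.
status_rows = []


def save_status(rows, from_id, recipient_ids, status, time):
    """Store one copy per recipient, plus one for the author."""
    for to_id in [from_id] + list(recipient_ids):
        rows.append({"fromId": from_id, "toId": to_id, "status": status, "time": time})


def timeline_for(rows, user_id):
    """A user's timeline is every row addressed to them, newest first."""
    mine = [r for r in rows if r["toId"] == user_id]
    return sorted(mine, key=lambda r: r["time"], reverse=True)


def delete_status(rows, from_id, time):
    """Remove every copy; matching on (fromId, time) is an assumption."""
    rows[:] = [r for r in rows if not (r["fromId"] == from_id and r["time"] == time)]


save_status(status_rows, "user2", ["user1"], "hello", 1)
save_status(status_rows, "user2", ["user1"], "oops", 2)
delete_status(status_rows, "user2", 2)
```

In a real deployment, a timeline query on toId sorted by time would presumably want an index covering those two fields.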
Yeah, I do the exact same thing. See what Dwight Merriman suggested on his blog.
http://dmerr.tumblr.com/post/463694595/just-for-fun-a-single-server-twitter-design