How to structure a DynamoDB database to allow queries for trending posts?

天涯浪人 2021-02-09 14:26

I am planning on using the following formula to calculate "trending" posts:

Trending Score = (p - 1) / (t + 2)^1.5

p = votes (points) from us
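
As a minimal sketch, the formula above can be written as a small Python function. The text here is cut off before `t` is defined, so treating `t` as the post's age in hours is an assumption, and the names `trending_score`, `points` and `age_hours` are illustrative only.

```python
def trending_score(points: int, age_hours: float) -> float:
    """Trending Score = (p - 1) / (t + 2)^1.5.

    ASSUMPTION: `age_hours` stands in for t; the original text does not
    define t, so the unit (hours) is a guess made for illustration.
    """
    return (points - 1) / (age_hours + 2) ** 1.5


if __name__ == "__main__":
    # Same vote count, different ages: the older post scores lower.
    print(trending_score(points=9, age_hours=2))
    print(trending_score(points=9, age_hours=14))
```

Because the score depends on the current time, it changes even when no new votes arrive, so a stored score goes stale unless it is recomputed.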

1 answer
  • 2021-02-09 14:30

    I'll start with a note on your comment about timestamp vs. post_id.
    Since you are going to use DynamoDB as your post_id generator, you have a scalability issue right there. IDs generated that way are inherently unscalable, and you are better off using a date object. If you need to create posts at a very high rate, you can start by reading how Twitter does it: http://blog.twitter.com/2010/announcing-snowflake

    Now let's get back to your trending check:
    I believe your scenario is misusing DynamoDB.
    Let's say you have one hot category that contains most of the posts. You will basically have to scan all of its posts (since the data isn't spread well), look at each post's points, and do the comparisons on your server. This will either not work at all or be very expensive, since each check will probably use up all of your reserved read capacity units.

    The DynamoDB approach for this type of trend check is MapReduce.
    Read here how to implement it: http://aws.typepad.com/aws/2012/01/aws-howto-using-amazon-elastic-mapreduce-with-dynamodb.html

    I can't say how long a run will take, but I believe you will find this approach scalable, though you won't be able to run it often.

    On another note, you could keep a list of the "top 10/100" trending questions and update it in "real time" whenever a post is upvoted: get the list, check whether it needs to be updated with the newly upvoted question, and save it back to the DB if needed.
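
The get/check/save cycle for the top list could look roughly like the sketch below. It only covers the in-memory "check" step; reading the list from the table and writing it back are left out. The names `update_top_list` and `Entry`, and the list of `(post_id, score)` pairs, are hypothetical choices for illustration, not something from the answer.

```python
from typing import List, Tuple

Entry = Tuple[str, float]  # (post_id, score); a hypothetical layout


def update_top_list(
    top: List[Entry], post_id: str, score: float, limit: int = 10
) -> Tuple[List[Entry], bool]:
    """Return the updated top list and whether it differs from `top`."""
    # Drop any stale entry for this post, then add its fresh score.
    candidates = [entry for entry in top if entry[0] != post_id]
    candidates.append((post_id, score))
    # Highest score first; keep only the best `limit` entries.
    candidates.sort(key=lambda entry: entry[1], reverse=True)
    new_top = candidates[:limit]
    return new_top, new_top != top


if __name__ == "__main__":
    current = [("a", 5.0), ("b", 3.0)]
    new_top, changed = update_top_list(current, "c", 4.0, limit=2)
    if changed:
        # Only here would the list be saved back to the database.
        print(new_top)
```

Returning a `changed` flag means a write happens only when the list actually changes. If several upvotes are processed concurrently, one save can overwrite another, so the write would need some form of guard such as a conditional write.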
