Schema for User Ratings - Key/Value DB

暖寄归人 2021-02-02 18:15

We're using MongoDB and I'm figuring out a schema for storing Ratings.

  • Ratings will have values of 1-5.
  • I want to store other values such as from
4 Answers
  • 2021-02-02 18:30

    First of all, 'Dictionary in User Class' is not a good idea. Why? Adding a rating means pushing a new item onto the array, so the document keeps growing; when it no longer fits in its allocated space, the old copy is removed and the document is written somewhere else, which is called "moving a document". Moving documents is slow, and MongoDB is not great at reusing empty space, so moving documents around a lot can leave large swaths of empty space in the data file (paraphrased from the book 'MongoDB: The Definitive Guide').

    So what is the correct solution? Assume you have a collection named Blogs and want to implement ratings for your blog posts while also keeping track of every individual user's rating.

    The schema for a blog document would look like:

    {
       _id : ....,
       title: ....,
       ....
       rateCount : 0,
       rateValue : 0,
       rateAverage: 0
    }
    

    You need another collection (Rates) with this document schema:

    {
        _id: ....,
        userId: ....,
        postId:....,
        value: ..., //1 to 5
        date:....   
    }
    

    And you need to define a proper index for it:

    // Speeds up the check for whether a user has already rated a post.
    // (createIndex replaces ensureIndex, which is deprecated.)
    db.Rates.createIndex({userId : 1, postId : 1})

    When a user wants to rate a post, first check whether that user has already rated it. Assuming the user is 'user1', the query would be:

    var ratedBefore = db.Rates.find({userId : 'user1', postId : 'post1'}).count()
    

    Then, based on ratedBefore: if the user has not rated the post yet, insert a new rate document into the Rates collection and update the blog post's counters; otherwise the user is not allowed to rate again.

    if(!ratedBefore)
    {
        var postId = 'post1'; // this id should be passed in by the client driver
        var userId = 'user1'; // this id should be passed in by the client driver
        var rateValue = 1; // to 5
        var rate = 
        {       
           userId: userId,
           postId: postId,
           value: rateValue,
           date:new Date()  
        };
    
        db.Rates.insert(rate);
        db.Blog.update({"_id" : postId}, {$inc : {'rateCount' : 1, 'rateValue' : rateValue}});
    }
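
    The flow above can be sketched without a database to make the rule explicit: one rating per user and post, and two counters that only ever grow. This is a minimal sketch, not a driver example; the Map objects are in-memory stand-ins for the Rates and Blog collections, and ratePost is a hypothetical helper name.

```javascript
// In-memory stand-ins for the Rates and Blog collections (hypothetical, for illustration only).
const rates = new Map(); // key: "userId|postId" -> rate document
const blogs = new Map([["post1", { _id: "post1", rateCount: 0, rateValue: 0 }]]);

// Records a rating once per (userId, postId) pair and updates the post's counters.
// Returns true if the rating was stored, false if the user had already rated the post.
function ratePost(userId, postId, value) {
  if (!Number.isInteger(value) || value < 1 || value > 5) {
    throw new RangeError("rating must be an integer from 1 to 5");
  }
  const blog = blogs.get(postId);
  if (!blog) throw new Error("unknown post: " + postId);

  const key = `${userId}|${postId}`;
  if (rates.has(key)) return false; // same role as the ratedBefore check

  rates.set(key, { userId, postId, value, date: new Date() });
  blog.rateCount += 1;     // mirrors $inc on rateCount
  blog.rateValue += value; // mirrors $inc on rateValue
  return true;
}
```

    Against a real database the check and the insert are separate steps, so two concurrent requests from the same user could both pass the check. Declaring the userId/postId index as unique is one way to guard against that.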
    

    What about rateAverage? I strongly recommend calculating it on the client side from rateCount and rateValue. It is easy to update rateAverage with a MongoDB query, but you shouldn't. Why? The client can handle this kind of work trivially, and storing the average on every blog document costs an unnecessary update operation.

    The average would be calculated as:

    var blog = db.Blog.findOne({"_id" : "post1"});
    var avg = blog.rateValue / blog.rateCount;
    print(avg);
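
    One detail worth handling on the client: a post that nobody has rated yet has rateCount 0, and 0 / 0 is NaN in JavaScript. A small sketch of a guarded version follows; averageRating is a hypothetical helper name and returning null for unrated posts is an assumption about what the caller wants.

```javascript
// Computes the average from the two counters stored on the blog document.
// Returns null when the post has not been rated yet, instead of NaN.
function averageRating(blog) {
  if (!blog || blog.rateCount === 0) return null;
  return blog.rateValue / blog.rateCount;
}

console.log(averageRating({ rateCount: 4, rateValue: 18 })); // 4.5
console.log(averageRating({ rateCount: 0, rateValue: 0 }));  // null
```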
    

    With this approach you get good performance from MongoDB, and you keep track of every rating by user, post, and date.

  • 2021-02-02 18:32

    The code below can be used to get the average rating for a user.

    db.ratings.aggregate([
      {
        // Replace '$user' with the _id of the user whose ratings you want.
        $match: { rated: '$user' }
      },
      {
        $group: {
          _id: "$rated",
          average: { $avg: "$rating" }
        }
      }
    ])
    
  • 2021-02-02 18:40

    I would do it a bit differently: have a User class and a Rating class, and aggregate the number of ratings and the rating average.

    The Rating class

    This is a bit of pseudo code, but the meaning should be obvious.

    {
      _id: ObjectId(…),
      rating: Integer,
      rater: User._id,
      rated: User._id,
      date: ISODate()
    }
    

    To do the aggregation efficiently, you should at least create an index on rated:

    db.ratings.createIndex({rated:1})
    

    Now you can choose between two approaches: either you calculate the number of ratings and the average periodically, say once an hour, and store them in a collection, say rate_averages, or you calculate those values on demand.

    Precalculated

    db.ratings.aggregate(
      // Aggregation
      [
        {
          $group: {
            _id: "$rated",
            ratings: { $sum: 1 },
            average: { $avg: "$rating" }
          }
        },
        // Write the results to the rate_averages collection
        { $out: 'rate_averages' }
      ]
    )
    

    A document in the rate_averages collection will then look like this:

    {
      _id:User._id,
      ratings: Integer,
      average: Float
    }
    

    and is easily queryable for the individual user's values, as _id is indexed automatically.

    On demand

    You'd use the same Rating documents and almost the same aggregation query. The differences: a $match stage is added so that only the ratings for the user in question are processed, and the $out stage is dropped so that the resulting document is returned directly:

    db.ratings.aggregate([
      {
        $match: { rated: <_id of the user we want the values for> }
      },
      {
        $group: {
          _id: "$rated",
          ratings: { $sum: 1 },
          average: { $avg: "$rating" }
        }
      }
    ])
    

    which would return a single document as shown for the user in question.
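
    To make clear what the $match and $group stages compute, here is a plain JavaScript sketch of the same calculation over an array of rating documents. The sample data and the statsFor name are made up for illustration; the server does this work for you when you run the aggregation.

```javascript
// Sample rating documents in the shape described above (values are made up).
const ratings = [
  { rater: "u1", rated: "u9", rating: 5 },
  { rater: "u2", rated: "u9", rating: 3 },
  { rater: "u3", rated: "u7", rating: 4 },
];

// Plain-JavaScript equivalent of $match on `rated` followed by
// $group with { $sum: 1 } and { $avg: "$rating" }.
function statsFor(userId, docs) {
  const matched = docs.filter((d) => d.rated === userId); // $match
  if (matched.length === 0) return null; // the aggregation would return no document
  const total = matched.reduce((sum, d) => sum + d.rating, 0);
  return {
    _id: userId,
    ratings: matched.length,         // { $sum: 1 }
    average: total / matched.length, // { $avg: "$rating" }
  };
}

console.log(statsFor("u9", ratings)); // ratings: 2, average: 4
```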

    With this approach and a proper data model, you can even do such things as "How many ratings were given by a specific user on a given date?" or "What are the most active raters/the most rated?" quite easily.

    Please read the aggregation framework docs for further details. You might find the data modeling docs useful, too.

  • 2021-02-02 18:40

    My solution is quite simple: similar to your 3rd option, but simpler. Let's say we have 3 models: Book, User and Rating. I added a new field called totalRated, an array of int, to the Book model to store the number of ratings per value; the rating value is the array index + 1.

    Your rating system goes from 1 to 5, so totalRated means:

    • [total1star, total2star, total3star, total4star, total5star]

    Every time a user rates a Book, I create a document in the Rating collection and increase the matching counter by 1 (the element at index rating - 1 of the totalRated array).

    The result is:

    • rateCount is now the sum of all items in the array.
    • rateAverage is the sum of (index + 1) * value over the array, divided by rateCount.
    • We can also get the number of ratings for each value, since the value is index + 1.

    Step by step

    By default, this should be:

    // Book Document
    {
     _id,
     totalRated: [0, 0, 0, 0, 0],
     ...otherFields
    }
    
    • If user1 rates this book 5 stars, the document should now be:
    {
     _id,
     totalRated: [0, 0, 0, 0, 1],
     ...otherFields
    }
    
    • If user2 rates this book 4 stars, the document should now be:
    {
     _id,
     totalRated: [0, 0, 0, 1, 1],
     ...otherFields
    }
    
    • If user3 rates this book 4 stars, the document should now be:
    {
     _id,
     totalRated: [0, 0, 0, 2, 1],
     ...otherFields
    }
    
    • rateCount = 0 + 0 + 0 + 2 + 1 = 3
    • rateAverage = (0*1 + 0*2 + 0*3 + 2*4 + 1*5)/3 = 13/3 = 4.3333...

    Note: You can replace the int array with an array of objects, where the key is the rating value and the value is the number of ratings, but an int array is enough for me.
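
    The bookkeeping above fits in two small functions. This is a sketch in plain JavaScript under the same layout (index = stars - 1); addRating and summarize are hypothetical helper names, and returning null as the average of an unrated book is an assumption.

```javascript
// Returns a copy of the histogram with one more rating of `stars` (1 to 5).
function addRating(totalRated, stars) {
  if (!Number.isInteger(stars) || stars < 1 || stars > totalRated.length) {
    throw new RangeError("stars out of range");
  }
  const next = totalRated.slice();
  next[stars - 1] += 1; // index = stars - 1
  return next;
}

// Derives rateCount and rateAverage from the histogram.
function summarize(totalRated) {
  const rateCount = totalRated.reduce((sum, n) => sum + n, 0);
  const weighted = totalRated.reduce((sum, n, i) => sum + n * (i + 1), 0);
  return { rateCount, rateAverage: rateCount === 0 ? null : weighted / rateCount };
}

// Replays the three ratings from the walkthrough: 5, 4, 4.
let totalRated = [0, 0, 0, 0, 0];
for (const stars of [5, 4, 4]) totalRated = addRating(totalRated, stars);
console.log(totalRated);            // [ 0, 0, 0, 2, 1 ]
console.log(summarize(totalRated)); // rateCount 3, rateAverage 13/3 (about 4.33)
```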
