Realtime database data structure modeling

Question

We have a chat system with an analytics dashboard. Currently the dashboard shows the most frequently said sentences. The model looks like this:


messages
    --key1
       -text: "who are you"
    --key2
       -text: "hello"
    --key3
       -text: "who are you"

A database trigger runs every time a new message is inserted and stores a count like this:


stat
   --topPhrases
     --keyA
        --phrase: "who are you"
        --count: 2
     --key
        --phrase: "hello"
        --count: 1

Our dashboard queries this data and displays it as the top sentences used.

The problem is that we now need to add a date element. Currently this model answers "top said sentences ever by people".

What we now want to answer is "top said sentences today, this week, this month"

So we probably need to restructure the stat data model. Please advise.

Answer 1:

The common recommendation is to store the data that your app needs to display. So if you want to display top sentences for today, for this week, and for this month, that means storing precisely those aggregates: the top sentences by day, week, and month.

A simple model for storing these is to keep your current list, and then add a similar list for each aggregation level and each interval:

stats
   --topPhrases
     --keyA
        --phrase: "who are you"
        --count: 2
     --key
        --phrase: "hello"
        --count: 1
   --topPhrases_byDay
     --20190607
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --20190608
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
   --topPhrases_byWeek
     --201922
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --201923
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
   --topPhrases_byMonth
     --201905
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --201906
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
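
To fill this structure, the trigger has to work out which day, week, and month a message belongs to. Here is a minimal sketch of that step, under some assumptions that are mine and not part of the model above: timestamps are handled in UTC, weeks follow ISO 8601 numbering, and each phrase gets a key derived from a hash (the function names `bucket_keys`, `phrase_key`, and `counter_paths` are hypothetical).

```python
from datetime import datetime, timezone
import hashlib


def bucket_keys(ts: datetime) -> dict:
    """Return the day, week, and month bucket keys for a timestamp (in UTC)."""
    ts = ts.astimezone(timezone.utc)
    # Assumption: ISO 8601 week numbering; the ISO year can differ from the
    # calendar year in the first and last days of a year.
    iso_year, iso_week, _ = ts.isocalendar()
    return {
        "topPhrases_byDay": ts.strftime("%Y%m%d"),
        "topPhrases_byWeek": f"{iso_year}{iso_week:02d}",
        "topPhrases_byMonth": ts.strftime("%Y%m"),
    }


def phrase_key(phrase: str) -> str:
    """Hypothetical stable key for a phrase (a short hash of its text)."""
    return hashlib.sha1(phrase.encode("utf-8")).hexdigest()[:16]


def counter_paths(phrase: str, ts: datetime) -> list:
    """List every counter location under `stats` that one new message touches."""
    key = phrase_key(phrase)
    paths = [f"stats/topPhrases/{key}"]  # the existing all-time counter
    for node, bucket in bucket_keys(ts).items():
        paths.append(f"stats/{node}/{bucket}/{key}")
    return paths
```

For a message sent on 7 June 2019 (UTC), `bucket_keys` gives `20190607`, `201923`, and `201906`, so the trigger would increment four counters: the all-time one plus one per aggregation level.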

Alternatively, store all aggregations as a single list, and use prefixes to indicate their aggregation level (and the format of the rest of the key):

stats
   --topPhrases
     --keyA
        --phrase: "who are you"
        --count: 2
     --key
        --phrase: "hello"
        --count: 1
     --day_20190607
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --day_20190608
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --week_201922
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --week_201923
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --month_201905
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --month_201906
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
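
With prefixed keys, the prefix tells you the aggregation level and the zero-padded date part keeps keys of the same level in chronological order when sorted as strings. A rough sketch of building and selecting such keys, assuming UTC and ISO 8601 week numbers (both function names are hypothetical); in the Realtime Database itself, a query ordered by key with start and end bounds could play the role of the filter shown here.

```python
from datetime import datetime, timezone


def prefixed_bucket_keys(ts: datetime) -> list:
    """Return the day_, week_, and month_ keys for a timestamp (in UTC)."""
    ts = ts.astimezone(timezone.utc)
    # Assumption: ISO 8601 week numbering.
    iso_year, iso_week, _ = ts.isocalendar()
    return [
        "day_" + ts.strftime("%Y%m%d"),
        f"week_{iso_year}{iso_week:02d}",
        "month_" + ts.strftime("%Y%m"),
    ]


def buckets_with_prefix(keys, prefix: str) -> list:
    """Select the keys of one aggregation level, sorted oldest to newest."""
    return sorted(k for k in keys if k.startswith(prefix))
```

Because every key of a level has the same length, plain string sorting is enough to get the intervals in date order.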

You're definitely duplicating a lot of data here, but the advantage of these models is that displaying the stats to a user is now trivial. That's a common trade-off with NoSQL databases: writing data becomes more complex and more (duplicate) data is stored, but reading the data becomes trivial, and thus very scalable.
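
To illustrate how little work is left on the read side, here is a small sketch that assumes one bucket (for example today's) has already been loaded into a plain dictionary shaped like the nodes above. The function name `top_phrases` and the `limit` parameter are hypothetical.

```python
def top_phrases(bucket: dict, limit: int = 10) -> list:
    """Return (phrase, count) pairs from one bucket, highest count first."""
    entries = sorted(bucket.values(), key=lambda e: e["count"], reverse=True)
    return [(e["phrase"], e["count"]) for e in entries[:limit]]


# One bucket, shaped like the sample data in the models above.
today = {
    "keyA": {"phrase": "who are you", "count": 2},
    "key": {"phrase": "hello", "count": 1},
}
ranking = top_phrases(today, limit=2)
```

The dashboard only needs to pick the bucket for the interval it is showing and sort by `count`; no scan over the raw messages is required.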

Source: https://stackoverflow.com/questions/56503645/realtime-database-data-structure-modeling

Tags

firebase

firebase-realtime-database