Realtime database data structure modeling

寵の児 submitted on 2021-01-29 14:29:06

Question


We have a chat system with an analytics dashboard. Currently the dashboard shows the most frequently said sentences. The message model looks like this:


messages
    --key1
       -text: "who are you"
    --key2
       -text: "hello"
    --key3
       -text: "who are you"

A database trigger fires every time a new message is inserted and stores a count like this:


stat
   --topPhrases
     --keyA
        --phrase: "who are you"
        --count: 2
     --key
        --phrase: "hello"
        --count: 1

Our dashboard queries this data and displays it as the top sentences used.

The problem is that we now need to add a date element. The current model answers "what are the top sentences people have ever said?"

What we now want to answer is "what are the top sentences said today, this week, and this month?"

So we probably need to restructure the stat data model. Please advise.


Answer 1:


The common recommendation is to store the data that your app needs to display. So if you want to display top sentences for today, for this week, and for this month, that means storing precisely those aggregates: the top sentences by day, week, and month.

A simple model for storing these is to keep your current structure, and then repeat it for each aggregation level and each interval:

stats
   --topPhrases
     --keyA
        --phrase: "who are you"
        --count: 2
     --key
        --phrase: "hello"
        --count: 1
   --topPhrases_byDay
     --20190607
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --20190608
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
   --topPhrases_byWeek
     --201922
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --201923
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
   --topPhrases_byMonth
     --201905
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --201906
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
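
To keep these nodes up to date, the trigger has to bump the same phrase counter in four places on every new message. Here's a minimal sketch of how the paths could be derived from a message timestamp. It only builds path strings and doesn't call any database SDK. The function names are hypothetical, and it assumes UTC days and ISO week numbering, neither of which is specified in the question:

```python
from datetime import datetime, timezone
from typing import Dict, List


def bucket_keys(ts: datetime) -> Dict[str, str]:
    """Map each aggregation node to the interval key for a timezone-aware timestamp."""
    ts = ts.astimezone(timezone.utc)  # assumption: intervals are cut in UTC
    iso_year, iso_week, _ = ts.isocalendar()  # assumption: ISO 8601 weeks
    return {
        "topPhrases_byDay": ts.strftime("%Y%m%d"),
        "topPhrases_byWeek": f"{iso_year}{iso_week:02d}",
        "topPhrases_byMonth": ts.strftime("%Y%m"),
    }


def counter_paths(phrase_key: str, ts: datetime) -> List[str]:
    """List every counter the trigger would increment for one message."""
    paths = [f"stats/topPhrases/{phrase_key}/count"]  # the existing all-time counter
    for node, bucket in bucket_keys(ts).items():
        paths.append(f"stats/{node}/{bucket}/{phrase_key}/count")
    return paths


for path in counter_paths("keyA", datetime(2019, 6, 7, 12, 0, tzinfo=timezone.utc)):
    print(path)
```

For a message sent on 2019-06-07 this yields the day key 20190607, the week key 201923, and the month key 201906, matching the keys in the tree above.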

Alternatively, store all aggregations as a single list, and use prefixes to indicate their aggregation level (and the format of the rest of the key):

stats
   --topPhrases
     --keyA
        --phrase: "who are you"
        --count: 2
     --key
        --phrase: "hello"
        --count: 1
     --day_20190607
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --day_20190608
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --week_201922
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --week_201923
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --month_201905
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --month_201906
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
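
With the prefixed variant, the prefix tells you both the aggregation level and how to parse the rest of the key, and all intervals of one level sort next to each other. A rough sketch of building those keys and picking out one level from a plain dict standing in for the node. The helper names are made up for illustration, and ISO week numbering is again an assumption:

```python
from datetime import date
from typing import Dict, List


def prefixed_keys(d: date) -> List[str]:
    """Build the day_, week_ and month_ keys for one calendar date."""
    iso_year, iso_week, _ = d.isocalendar()  # assumption: ISO 8601 weeks
    return [
        f"day_{d:%Y%m%d}",
        f"week_{iso_year}{iso_week:02d}",
        f"month_{d:%Y%m}",
    ]


def select_level(node: Dict[str, dict], prefix: str) -> Dict[str, dict]:
    """Keep only the children whose key starts with the given prefix, in key order."""
    return {key: node[key] for key in sorted(node) if key.startswith(prefix)}


print(prefixed_keys(date(2019, 6, 7)))

# In-memory stand-in for the single list of aggregations.
node = {
    "keyA": {"phrase": "who are you", "count": 2},
    "day_20190607": {},
    "day_20190608": {},
    "week_201923": {},
    "month_201906": {},
}
print(list(select_level(node, "day_")))
```

Because the keys of one level share a prefix and the date part is zero-padded, a key-ordered range over that prefix returns the intervals in chronological order.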

You're definitely duplicating a lot of data here, but the advantage of these models is that displaying the stats to a user is now trivial. That's a common trade-off with NoSQL databases: writing data becomes more complex and more (duplicate) data is stored, but reading the data becomes trivial and thus very scalable.
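
To illustrate how little work is left on the read side: once the dashboard has loaded the node for one interval, showing the top sentences is just a sort on count. This sketch works on a plain dict shaped like one of the interval nodes above, and the function name is hypothetical. Against the database itself you'd more likely order by the count child and limit the result rather than sort on the client:

```python
from typing import Dict, List, Tuple


def top_phrases(bucket: Dict[str, dict], n: int = 10) -> List[Tuple[str, int]]:
    """Return up to n (phrase, count) pairs from one interval node, highest count first."""
    entries = sorted(bucket.values(), key=lambda entry: entry["count"], reverse=True)
    return [(entry["phrase"], entry["count"]) for entry in entries[:n]]


# One interval node, e.g. the contents of topPhrases_byDay/20190607.
bucket = {
    "keyA": {"phrase": "who are you", "count": 2},
    "key": {"phrase": "hello", "count": 1},
}
print(top_phrases(bucket, n=1))  # [('who are you', 2)]
```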



Source: https://stackoverflow.com/questions/56503645/realtime-database-data-structure-modeling
