How Do I Aggregate Data By Day and Still Respect Timezone?

逝去的感伤 2021-02-14 06:27

We are currently using a summary table that aggregates information for our users on an hourly basis in UTC time. The problem we are having is that this table is becoming too la

3 Answers
  • 2021-02-14 07:21

    I'm assuming you've gone through all the partitioning considerations, such as partitioning by user.

    I can see several solutions to your problem, depending on the usage pattern.

    1. Aggregate data per day, using the timezone each user has selected. If the timezone changes, programmatically recalculate the aggregate for that partner. This is plausible if timezone changes are infrequent and if some delay in the data is acceptable when a user changes timezones.

    2. If you have relatively few measures, you may maintain 24 columns for each measure, each holding the daily aggregate for that measure in a different timezone.

    3. If timezone changes are frequent and there are numerous measures, it seems like 24 different aggregate tables would be the way to go.
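
As a rough illustration of option 1, the sketch below rolls hourly UTC buckets up into calendar days for one whole-hour UTC offset. The function name `daily_totals` and the `(hour_utc, value)` row shape are assumptions made for this example, not part of the original schema. Re-running it with a new offset is what "recalculating the aggregate" would amount to after a timezone change.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def daily_totals(hourly_rows, utc_offset_hours):
    """Sum hourly UTC buckets into local calendar days.

    hourly_rows: iterable of (hour_utc, value), where hour_utc is a
    timezone-aware datetime in UTC (hypothetical row shape).
    utc_offset_hours: whole-hour offset of the user's timezone from UTC.
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    totals = defaultdict(int)
    for hour_utc, value in hourly_rows:
        # Convert the bucket start to local time, then keep only the date.
        local_day = hour_utc.astimezone(local_tz).date()
        totals[local_day] += value
    return dict(totals)


# Two hourly buckets that straddle midnight UTC.
rows = [
    (datetime(2021, 2, 14, 23, tzinfo=timezone.utc), 5),
    (datetime(2021, 2, 15, 1, tzinfo=timezone.utc), 7),
]
utc_view = daily_totals(rows, 0)       # buckets fall on two different days
pacific_view = daily_totals(rows, -8)  # both buckets fall on the same local day
```

Fixed offsets keep the sketch dependency-free; a real deployment would probably need named timezones so that daylight saving changes are handled.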

  • 2021-02-14 07:33

    Summarise the data in tables with a timeoffset column and a "day" field (a date) that is the day for that particular summary line. Index on (timeoffset, day, other relevant fields), clustered if possible (presumably PostgreSQL has clustered indexes?), and all should be well.
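
A minimal sketch of that layout follows, using Python's built-in `sqlite3` module only so the example is self-contained (the answer itself refers to PostgreSQL). The table name `daily_summary`, the `user_id` and `value_sum` columns, and the 24 whole-hour offsets from -12 to +11 are all assumptions for illustration. The `WITHOUT ROWID` primary key stands in for a clustered index on (timeoffset, day, user_id).

```python
import sqlite3
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def build_daily_summary(conn, hourly_rows, offsets=range(-12, 12)):
    """Fill daily_summary from (user_id, hour_utc, value) rows (hypothetical shape)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS daily_summary (
            timeoffset INTEGER NOT NULL,  -- whole hours relative to UTC
            day        TEXT    NOT NULL,  -- local calendar day, ISO format
            user_id    INTEGER NOT NULL,
            value_sum  INTEGER NOT NULL,
            PRIMARY KEY (timeoffset, day, user_id)
        ) WITHOUT ROWID
        """
    )
    totals = defaultdict(int)
    for user_id, hour_utc, value in hourly_rows:
        for offset in offsets:
            # Shift the UTC bucket by the offset and keep the resulting date.
            local_day = (hour_utc + timedelta(hours=offset)).date().isoformat()
            totals[(offset, local_day, user_id)] += value
    conn.executemany(
        "INSERT INTO daily_summary VALUES (?, ?, ?, ?)",
        [(o, d, u, v) for (o, d, u), v in totals.items()],
    )
    conn.commit()


def user_days(conn, user_id, offset, first_day, last_day):
    """Return (day, value_sum) rows for one user in one offset, ordered by day."""
    return conn.execute(
        "SELECT day, value_sum FROM daily_summary "
        "WHERE timeoffset = ? AND day BETWEEN ? AND ? AND user_id = ? "
        "ORDER BY day",
        (offset, first_day, last_day, user_id),
    ).fetchall()


conn = sqlite3.connect(":memory:")
build_daily_summary(conn, [
    (1, datetime(2021, 2, 14, 23, tzinfo=timezone.utc), 5),
    (1, datetime(2021, 2, 15, 1, tzinfo=timezone.utc), 7),
])
utc_days = user_days(conn, 1, 0, "2021-02-14", "2021-02-15")
plus_two = user_days(conn, 1, 2, "2021-02-14", "2021-02-15")
```

The trade-off is storage: every hourly row contributes to one summary row per offset, so the table is up to 24 times larger than a single-timezone daily summary, in exchange for reads that never need to touch the hourly data.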

  • 2021-02-14 07:35

    I ran into this problem too. My solution: values of date type are stored in the local timezone, while values of datetime type are stored in UTC, because the statistics are indexed by local date. Another reason is that, for now, we only have local data.
