Which database to choose (Cassandra, MongoDB, ?) for storing and querying event / log / metrics data?

Backend · Unresolved

陌清茗 2021-02-14 09:31

In SQL terms we're storing data like this:

table events (
  id
  timestamp
  dimension1
  dimension2
  dimension3
  etc.
)

All dimension value

3 Answers

醉话见心 (original poster)

2021-02-14 10:17

Was also looking at MongoDB, but their "group()" function has severe limitations as far as I could read (max of 10,000 rows).

To clarify, the limit is 10,000 rows returned. In your example, this will work for up to 10,000 combinations of dimension1/dimension2. If your result is larger than that, you can use the slower Map / Reduce instead. Note that if you're running a query with more than 10k results, it may be best to use Map / Reduce and save this data. 10k is a large query result to otherwise just "throw away".
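To make "rows returned" concrete, here is a minimal Python sketch that counts the distinct dimension1/dimension2 combinations in a handful of events. It does not talk to MongoDB at all; the sample values ("web", "US", and so on) and the helper name group_counts are made up for illustration. The point is only that the limit applies to the number of groups, not to the number of events scanned.

```python
from collections import Counter

# Hypothetical sample events; the field names follow the table in the question.
events = [
    {"dimension1": "web", "dimension2": "US"},
    {"dimension1": "web", "dimension2": "US"},
    {"dimension1": "web", "dimension2": "DE"},
    {"dimension1": "api", "dimension2": "US"},
]


def group_counts(rows, keys=("dimension1", "dimension2")):
    """Count events per distinct combination of the given keys."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


counts = group_counts(events)

# Each distinct combination becomes one returned row, so this is the
# number that has to stay within the 10,000-row limit.
returned_rows = len(counts)
```

Four events collapse into three groups here, so a grouped query over them would return three rows regardless of how many raw events share each combination.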

Do you have experience with any of these databases, and would you recommend it as a solution to the problem described above?

Many people actually use MongoDB to do this type of summary "real-time", but they do it using "counters" instead of "aggregation". Instead of "rolling-up" detailed data, they'll do a regular insert and then they'll increment some counters.

In particular, they use atomic modifiers like $inc and $push to update data atomically in a single request.
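The counter approach described above could look roughly like this in Python. This is a sketch under assumptions, not a recommended schema: the hourly bucket, the "hour" and "count" field names, and the helper names counter_update and record are all invented for the example. The two collection arguments are expected to behave like PyMongo collections (insert_one and update_one with upsert=True).

```python
from datetime import datetime, timezone


def counter_update(event, dims=("dimension1", "dimension2")):
    """Build the (filter, update) pair that bumps the counter for one event.

    The hourly bucket and the "count" field are assumptions made for this
    sketch; the thread does not prescribe a layout.
    """
    hour = event["timestamp"].replace(minute=0, second=0, microsecond=0)
    key = {"hour": hour}
    key.update({d: event[d] for d in dims})
    return key, {"$inc": {"count": 1}}


def record(events_coll, counters_coll, event):
    """Do the regular insert, then increment the matching counter."""
    events_coll.insert_one(dict(event))  # copy so the caller's dict is untouched
    flt, upd = counter_update(event)
    # upsert=True creates the counter document the first time a key is seen;
    # $inc then applies atomically on the server in that single request.
    counters_coll.update_one(flt, upd, upsert=True)


sample = {
    "timestamp": datetime(2021, 2, 14, 9, 31, tzinfo=timezone.utc),
    "dimension1": "web",
    "dimension2": "US",
}
flt, upd = counter_update(sample)
```

Reading a summary then becomes a plain lookup on the counters collection instead of an aggregation over the raw events. The trade-off is that you have to decide up front which dimension combinations and time buckets to count.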

Take a look at hummingbird for someone doing this right now. There's also an open source event-logging system backed by MongoDB: Graylog2. ServerDensity also does server event logging backed by MongoDB.

Looking at these may give you some inspiration for the types of logging you want to do.
