Realtime querying/aggregating millions of records - Hadoop? HBase? Cassandra?

不思量自难忘° 2021-01-31 06:30

I have a solution that can be parallelized, but I don't (yet) have experience with Hadoop/NoSQL, and I'm not sure which solution is best for my needs. In theory, if I had unl

5 answers
  •  醉梦人生
    2021-01-31 07:23

    This is a serious problem with no immediate good solution in the open-source space. In the commercial space, MPP databases such as Greenplum or Netezza should do. Ideally you would want Google's Dremel (the engine behind BigQuery). We are developing an open-source clone, but it will take some time. Regardless of the engine used, I think the solution should hold the whole dataset in memory, which should also give you an idea of the cluster size you need.
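
The in-memory sizing idea above can be turned into a rough back-of-the-envelope estimate. The sketch below is only illustrative: the function name `nodes_needed`, the overhead factor, the usable-RAM fraction, and the sample figures are all assumptions, not numbers from the question or from any particular engine.

```python
import math


def nodes_needed(num_records, bytes_per_record, ram_per_node_gb,
                 overhead_factor=2.0, usable_fraction=0.7):
    """Roughly estimate how many nodes can hold the whole dataset in RAM.

    overhead_factor: assumed blow-up from raw record size to in-memory
        size (indexes, object headers, replication); tune for your engine.
    usable_fraction: assumed share of each node's RAM left for data
        after the OS and the engine itself.
    """
    dataset_bytes = num_records * bytes_per_record * overhead_factor
    usable_bytes_per_node = ram_per_node_gb * 1024 ** 3 * usable_fraction
    # Round up, and never report fewer than one node.
    return max(1, math.ceil(dataset_bytes / usable_bytes_per_node))


# Hypothetical workload: 500 million records of ~200 bytes on 64 GB nodes.
print(nodes_needed(500_000_000, 200, 64))
```

Measuring the real in-memory footprint of a sample of your records is a better basis for `overhead_factor` than the default guessed here.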
