MongoDB: What's the point of using MapReduce without parallelism?

灰色年华 2021-02-05 15:51

Quoting http://www.mongodb.org/display/DOCS/MapReduce#MapReduce-Parallelism

As of right now, MapReduce jobs on a single mongod process are single thre

3 Answers
  • 2021-02-05 16:31

    The main reason to use MapReduce over simpler, more traditional queries is that it can do things (namely aggregation) that plain queries cannot.

    Once you need aggregation, there are two options using MongoDB: MapReduce and the group command. The group command is analogous to SQL's "group by" and is limited in that it has to return all its results in a single database response. That means group can only be used when you have less than 4MB of results. MapReduce, on the other hand, can do anything a "group by" can, but outputs results to a new collection so results can be as large as needed.

    Also, parallelism is coming, so it's good to have some practice :)
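
To make the aggregation point in this answer concrete, here is a minimal pure-Python sketch of the map/emit/reduce pattern computing a "group by" style total. It does not call MongoDB at all: the `orders` data and the `map_order`, `reduce_amounts`, and `map_reduce` names are hypothetical, and the returned dict only stands in for the output collection that MapReduce would write to.

```python
from collections import defaultdict

# Hypothetical sample documents; not taken from the thread.
orders = [
    {"customer": "ann", "amount": 30},
    {"customer": "bob", "amount": 15},
    {"customer": "ann", "amount": 20},
]


def map_order(doc):
    """Map step: emit one (key, value) pair per document."""
    yield doc["customer"], doc["amount"]


def reduce_amounts(key, values):
    """Reduce step: fold all values emitted for one key into one value."""
    return sum(values)


def map_reduce(docs, map_fn, reduce_fn):
    """Group emitted values by key, then reduce each group."""
    emitted = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            emitted[key].append(value)
    # This dict stands in for the output collection (an assumption of the sketch).
    return {key: reduce_fn(key, values) for key, values in emitted.items()}


totals = map_reduce(orders, map_order, reduce_amounts)
print(totals)  # {'ann': 50, 'bob': 15}
```

The result has one entry per distinct key, which is why writing it to a separate collection, rather than returning it in a single response, matters once the number of keys grows.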

  • 2021-02-05 16:34

    Super-fast map/reduce is on the roadmap. It will not be in the 1.6 release (the summer release), so it will likely arrive late this year.

  • 2021-02-05 16:44

    M/R is already parallel in MongoDB if you're running a sharded cluster. This is the main point of M/R anyway - to put the computation on the same node as the data.
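
The idea of running the computation next to the data can be sketched in plain Python as well. This is only an illustration under stated assumptions, not MongoDB's implementation: each list in the hypothetical `shards` variable stands in for the documents held by one node, worker threads stand in for the separate nodes, and the sketch assumes that partial results are merged by applying the same reduce function again.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shards: each inner list stands in for one node's documents.
shards = [
    [{"customer": "ann", "amount": 30}, {"customer": "bob", "amount": 15}],
    [{"customer": "ann", "amount": 20}, {"customer": "cy", "amount": 5}],
]


def reduce_amounts(key, values):
    """Reduce step; summing is safe to apply again to partial sums."""
    return sum(values)


def map_reduce_on_shard(docs):
    """Run map and reduce locally, touching only this shard's documents."""
    emitted = defaultdict(list)
    for doc in docs:
        emitted[doc["customer"]].append(doc["amount"])
    return {key: reduce_amounts(key, values) for key, values in emitted.items()}


def sharded_map_reduce(all_shards):
    """Process every shard concurrently, then re-reduce the partial results."""
    with ThreadPoolExecutor(max_workers=len(all_shards)) as pool:
        partials = list(pool.map(map_reduce_on_shard, all_shards))
    merged = defaultdict(list)
    for partial in partials:
        for key, value in partial.items():
            merged[key].append(value)
    return {key: reduce_amounts(key, values) for key, values in merged.items()}


shard_totals = sharded_map_reduce(shards)
print(shard_totals)  # {'ann': 50, 'bob': 15, 'cy': 5}
```

Only the small per-shard summaries cross shard boundaries here, never the raw documents. The merge step works because the reduce function gives the same answer whether it sees raw amounts or partial sums.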
