How to use Cassandra's MapReduce with or without Pig?


抹茶落季 2021-02-13 05:31

Can someone explain how MapReduce works with Cassandra 0.6? I've read through the word count example, but I don't quite follow what's happening on the Cassandra end vs. the "

3 answers

南旧 (original poster)

2021-02-13 06:11

The advantage of using a direct InputFormat from Cassandra is that it streams the data efficiently, which is a very big win. Each input split covers a range of tokens and is read off the disk at full bandwidth: no seeking, no complex querying. I don't think it knows about locality, though; that is, I don't think each TaskTracker prefers input splits served by a Cassandra process on the same node.
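To make the "each input split covers a range of tokens" idea concrete, here is a minimal sketch in Python. It is not Cassandra's actual split logic: the function names `token_splits` and `rows_in_split` are hypothetical, and the ring size of 2^127 assumes the RandomPartitioner. It only shows how a token ring can be cut into contiguous ranges that a mapper can each scan in one sequential pass.

```python
# Illustrative sketch only, not the real Cassandra InputFormat.
# Assumption: tokens live in [0, 2**127), as with the RandomPartitioner.
RING_SIZE = 2 ** 127


def token_splits(num_splits, ring_size=RING_SIZE):
    """Divide [0, ring_size) into contiguous, non-overlapping (start, end) ranges."""
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    bounds = [ring_size * i // num_splits for i in range(num_splits + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def rows_in_split(sorted_rows, split):
    """Yield rows whose token is in [start, end).

    sorted_rows is an iterable of (token, row) pairs ordered by token,
    so the scan is a single forward pass with no random lookups.
    """
    start, end = split
    for token, row in sorted_rows:
        if token >= end:
            break  # everything after this belongs to a later split
        if token >= start:
            yield row
```

Each mapper would receive one `(start, end)` pair and read only the rows in that range, which is why no seeking or per-row querying is needed.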

You can try using Pig with the STREAM operator as a hack until more direct Hadoop Streaming support is in place.
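As a rough illustration of the STREAM hack, the sketch below is the kind of external script Pig could pipe rows through with `STREAM ... THROUGH`. It assumes Pig's default streaming format of one tab-delimited tuple per line, and it assumes each row arrives as a hypothetical `key<TAB>text` pair; the function names are made up for this example. It does the map side of a word count.

```python
import sys


def tokenize(lines):
    """Turn tab-delimited 'key<TAB>text' lines into 'word<TAB>1' lines."""
    for line in lines:
        fields = line.rstrip("\n").split("\t", 1)
        if len(fields) < 2:
            continue  # skip rows that have no text column
        for word in fields[1].split():
            yield f"{word.lower()}\t1"


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read rows from stdin and write one 'word<TAB>1' line per word."""
    for out in tokenize(stdin):
        stdout.write(out + "\n")
```

To run it as a streaming script, end the file with a call to `main()`. The grouping and summing of the emitted pairs would then be done back in Pig.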
