Does Hadoop split the data based on the number of mappers set in the program? That is, for a data set of 500 MB, if the number of mappers is 200 (assuming that the Ha
If 200 map tasks are running for 500 MB of data, check the size of each individual input file. If a file is smaller than the HDFS block size (64 MB by default), Hadoop runs a separate map task for that file, so 200 small files yield 200 mappers regardless of the number set in the program.
Normally we merge the small files into larger files (larger than the block size) to reduce the number of map tasks.
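To make the arithmetic concrete, here is a rough sketch of how the map task count falls out of the input files. It assumes the common FileInputFormat behavior (at least one split per file, and roughly one split per block for larger files); the function name and the exact split logic are illustrative, not Hadoop's actual code:

```python
import math

def estimate_map_tasks(file_sizes_mb, block_mb=64):
    """Rough estimate of map tasks: one split per block,
    and at least one split for every file, however small."""
    return sum(max(1, math.ceil(size / block_mb))
               for size in file_sizes_mb)

# 200 small files of 2.5 MB each -> one mapper per file
print(estimate_map_tasks([2.5] * 200))   # 200

# The same 500 MB merged into a single file -> one mapper per 64 MB block
print(estimate_map_tasks([500]))         # 8
```

This is why merging matters: the same 500 MB of data goes from 200 map tasks down to about 8 once it sits in a single file split by block size.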