How the data is split in Hadoop

不知归路 2021-01-31 12:17

Does Hadoop split the data based on the number of mappers set in the program? That is, given a data set of size 500 MB, if the number of mappers is 200 (assuming that the Ha

5 answers
  •  迷失自我
    2021-01-31 12:44

    If 200 mappers are running for 500 MB of data, check the size of each individual input file. If a file is smaller than the block size (64 MB), Hadoop runs a separate map task for that file.

    Normally we merge the smaller files into one large file (larger than the block size).
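To make the arithmetic concrete, here is a rough sketch of how the split count could be estimated per file. It is not Hadoop code: `count_splits` and its parameters are hypothetical names, the 64 MB block size is taken from the answer above, and the logic only approximates what `FileInputFormat` does (split size clamped between a minimum and maximum, with a 10% slack on the last split). It also assumes non-empty, splittable files.

```python
MB = 1024 * 1024
SPLIT_SLOP = 1.1  # the last split may be up to 10% larger than the split size


def count_splits(file_size, block_size=64 * MB, min_size=1, max_size=2**63 - 1):
    """Estimate the number of input splits (one map task each) for one file.

    Hypothetical helper for illustration; assumes file_size > 0 and a
    splittable (e.g. uncompressed) file.
    """
    split_size = max(min_size, min(max_size, block_size))
    remaining = file_size
    splits = 0
    # Carve off full-size splits while the remainder is clearly larger than one split.
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    # Whatever is left becomes the final (possibly short) split.
    if remaining != 0:
        splits += 1
    return splits


# 500 MB stored as 200 small files: every file gets its own map task.
small_files = [500 * MB // 200] * 200
maps_for_small_files = sum(count_splits(size) for size in small_files)

# The same 500 MB merged into a single file: split along block boundaries.
maps_for_merged_file = count_splits(500 * MB)

print(maps_for_small_files)  # 200
print(maps_for_merged_file)  # 8
```

Under these assumptions the 200 small files yield 200 map tasks, while the merged 500 MB file yields only 8, which is why merging small files is the usual advice.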
