How to set the number of partitions for newAPIHadoopFile?

Asked by 北荒 on 2021-02-04 17:14

The \"old\" SparkContext.hadoopFile takes a minPartitions argument, which is a hint for the number of partitions:

    def hadoopFile[K, V](
        path: String,
        inputFormatClass: Class[_ <: InputFormat[K, V]],
        keyClass: Class[K],
        valueClass: Class[V],
        minPartitions: Int = defaultMinPartitions): RDD[(K, V)]

But SparkContext.newAPIHadoopFile has no such parameter. How can I control the number of partitions with the new API?
1 Answer
  • 2021-02-04 18:07

    The function newAPIHadoopFile lets you pass in a Hadoop Configuration object, and in that configuration you can set mapred.max.split.size to cap the size of each input split (and thus increase the number of partitions).

    Even though this property is in the old mapred namespace, there is seemingly no new-API replacement for it, so I would expect the new API to still respect the variable.
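    A minimal sketch of the approach (the HDFS path and the 64 MB split size are hypothetical, and `sc` is assumed to be an existing SparkContext):

        import org.apache.hadoop.conf.Configuration
        import org.apache.hadoop.io.{LongWritable, Text}
        import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

        // Start from the context's Hadoop configuration so other settings carry over.
        val conf = new Configuration(sc.hadoopConfiguration)
        // Cap each input split at 64 MB; smaller splits mean more partitions.
        conf.set("mapred.max.split.size", (64 * 1024 * 1024).toString)

        val rdd = sc.newAPIHadoopFile(
          "hdfs:///data/input.txt",   // hypothetical path
          classOf[TextInputFormat],
          classOf[LongWritable],
          classOf[Text],
          conf)

    After loading, you can check `rdd.getNumPartitions` to see whether the split-size setting took effect; `rdd.repartition(n)` remains a fallback if it did not.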
