pyspark Loading multiple partitioned files in a single load

陌清茗 2021-01-23 02:25

I am trying to load multiple files in a single load. They are all partitioned files. When I tried it with 1 file it worked, but when I listed 24 files, it gave me this error

1 Answer
  • 2021-01-23 03:08

    As explained in the official documentation, to read multiple files, you should pass a list:

    path – optional string or a list of string for file-system backed data sources.

    So in your case:

    # Pass the list itself as `path`; do not unpack it.
    df = (sqlContext.read
        .format('orc')
        .options(basePath=basePath)
        .load(path=paths))
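
For illustration, here is one way `paths` might be built for 24 partition directories. The directory layout (`/data/events/hour=NN`) and the name `base_path` are hypothetical and not taken from the question; the only point is that `paths` ends up as a plain Python list of strings, which is what gets passed to `load`.

```python
# Hypothetical layout: one directory per hour under a common base path.
base_path = "/data/events"  # assumed value, stands in for basePath

# Build the 24 partition paths as a plain list of strings.
paths = ["{}/hour={:02d}".format(base_path, h) for h in range(24)]

print(len(paths))   # 24
print(paths[0])     # /data/events/hour=00
print(paths[-1])    # /data/events/hour=23
```

A list built this way can be handed to `load(path=paths)` unchanged.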
    

    Argument unpacking (*) would make sense only if load were defined with variadic arguments, for example:

    def load(self, *paths):
        ...
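
The difference can be checked without Spark. The sketch below uses two stand-in functions, `load_with_list` and `load_variadic`; both names are made up for this example and neither is part of PySpark. They only mimic the two signature styles discussed above.

```python
def load_with_list(path=None):
    """Stand-in for a loader taking ONE argument: a string or a list of strings."""
    if isinstance(path, str):
        return [path]
    return list(path or [])


def load_variadic(*paths):
    """Stand-in for a hypothetical loader defined with variadic arguments."""
    return list(paths)


paths = ["a.orc", "b.orc", "c.orc"]  # placeholder file names

# The list is passed as a single argument.
assert load_with_list(path=paths) == paths

# Unpacking fits only the variadic definition: each path becomes its own argument.
assert load_variadic(*paths) == paths

# Unpacking into the single-argument definition fails once there is more than one path.
try:
    load_with_list(*paths)
except TypeError as exc:
    print("TypeError:", exc)

# With exactly one path, the unpacked call happens to work.
assert load_with_list(*["a.orc"]) == ["a.orc"]
```

The last line may explain why a single file loaded fine while a longer list did not, assuming the original call unpacked the list.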
    