Where does spark look for text files?

执念已碎 2021-02-08 09:29

I thought that loading text files is done only from workers / within the cluster (you just need to make sure all workers have access to the same path, either by having that text

2 answers
  •  粉色の甜心
    2021-02-08 09:48

    The really short version of the answer: if you reference "file://...", the path should be accessible on all nodes in your cluster, including the driver program. Sometimes some of the work happens on the workers. The usual way around this is to avoid local files altogether and use something like S3, HDFS, or another network filesystem instead. There is also the sc.addFile method, which can be used to distribute a file from the driver to all of the other nodes; you then use SparkFiles.get to resolve the download location.
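The sc.addFile route mentioned above might look roughly like the following in PySpark. This is only a sketch under assumptions: `read_on_executors`, `driver_path`, and `num_tasks` are hypothetical names made up for illustration, and `sc` is assumed to be an already running SparkContext.

```python
import os


def read_on_executors(sc, driver_path, num_tasks=2):
    """Ship a file that exists only on the driver to the other nodes, then read it inside tasks.

    `sc` is an existing SparkContext; `driver_path` is a hypothetical
    path on the driver machine.
    """
    # Imported here so this module can be loaded on a machine without PySpark.
    from pyspark import SparkFiles

    # Register the file with Spark so it is downloaded to every node of the job.
    sc.addFile(driver_path)
    name = os.path.basename(driver_path)

    def read_copy(_):
        # SparkFiles.get resolves the location of this node's downloaded copy.
        with open(SparkFiles.get(name)) as f:
            return f.read()

    # Run one small task per partition; each task reads its node-local copy.
    return sc.parallelize(range(num_tasks), num_tasks).map(read_copy).collect()
```

Calling `read_on_executors(sc, "/path/on/driver/lookup.txt")` (path hypothetical) would return the file contents once per task. This suits small side files such as lookup tables; for the main input data, a shared filesystem such as HDFS or S3 is usually the better fit, as the answer says.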
