I thought that loading text files is done only from workers / within the cluster (you just need to make sure all workers have access to the same path, either by having that text …
So the short version of the answer is: if you reference "file://...", the file needs to be accessible at that path on all nodes in your cluster, including the driver program, since some bits of the work happen on the driver and some on the workers. Generally the way around this is just not using local files, and instead using something like S3, HDFS, or another network filesystem. Alternatively, there is the sc.addFile method, which can be used to distribute a file from the driver to all of the other nodes (you then use SparkFiles.get to resolve its download location on each node).
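
For concreteness, here is a rough PySpark sketch of both approaches; the paths, file names, and app name below are made up for illustration, not taken from your setup:

```python
from pyspark import SparkContext, SparkFiles

sc = SparkContext(appName="local-file-example")

# Option 1: a "file://" path only works if the same file exists at that path
# on the driver AND on every worker node (or you point at S3/HDFS instead).
rdd = sc.textFile("file:///data/shared/input.txt")   # hypothetical path present on all nodes
# rdd = sc.textFile("hdfs:///data/input.txt")        # a network filesystem avoids the problem

# Option 2: ship a single local file from the driver to every node with addFile,
# then resolve its per-node download location with SparkFiles.get inside tasks.
sc.addFile("/home/me/lookup.txt")  # hypothetical path that only needs to exist on the driver

def line_is_known(line):
    path = SparkFiles.get("lookup.txt")  # local path on whichever node runs this task
    with open(path) as f:
        known = {l.strip() for l in f}
    return line.strip() in known

print(rdd.filter(line_is_known).count())
```

The addFile route is handy for small side files (lookup tables, config, models) while the main dataset still lives on a shared or distributed filesystem.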