Is there a way to prevent PySpark from creating several small files when writing a DataFrame to a JSON file?
If I run:
df.write.format('json').save(
df1.rdd.repartition(1).write.json('myfile.json')
That would be nice, but it isn't available: an RDD has no write attribute, only a DataFrame does. See this related question: https://stackoverflow.com/a/33311467/2843520
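
One commonly used workaround is to reduce the DataFrame itself to a single partition before writing, rather than going through the RDD. The following is only a minimal sketch under assumptions: the helper name write_single_json and the default "overwrite" mode are made up for illustration, and df is assumed to be a pyspark.sql.DataFrame.

```python
def write_single_json(df, path, mode="overwrite"):
    """Write `df` as JSON under `path` using a single partition.

    Assumption: `df` is a pyspark.sql.DataFrame. The helper name and the
    default save mode are illustrative, not taken from the original post.
    """
    # coalesce(1) merges all partitions into one without a full shuffle,
    # so Spark writes a single part-* file into the output directory.
    df.coalesce(1).write.mode(mode).json(path)
```

Calling write_single_json(df, 'myfile.json') still produces a directory named myfile.json, but it should contain only one part-* data file (plus Spark's marker files). Since all rows pass through a single task, this is best kept for output small enough for one executor to handle. df.repartition(1) is an alternative that does a full shuffle but keeps the upstream stages parallel.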