I am using Spark SQL for reading and writing Parquet files.
But in some cases, I need to write the DataFrame
as a text file instead of JSON or
Using Databricks Spark-CSV, you can save a DataFrame directly to a CSV file and load it back from a CSV file afterwards, like this:
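    import org.apache.spark.sql.DataFrame;
    import org.apache.spark.sql.SQLContext;

    // sc is assumed to be an existing Spark context
    SQLContext sqlContext = new SQLContext(sc);

    // Load the CSV, treating the first line as a header and inferring column types
    DataFrame df = sqlContext.read()
        .format("com.databricks.spark.csv")
        .option("inferSchema", "true")
        .option("header", "true")
        .load("cars.csv");

    // Write two columns back out as gzip-compressed CSV with a header line
    df.select("year", "model").write()
        .format("com.databricks.spark.csv")
        .option("header", "true")
        .option("codec", "org.apache.hadoop.io.compress.GzipCodec")
        .save("newcars.csv");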
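To make it clearer what "saving as CSV text" amounts to, here is a minimal plain-Java sketch of turning rows into delimited text lines with a header. It does not use Spark, and it is not how Spark-CSV is implemented; the class name `CsvTextSketch`, its methods, and the quoting rule (quote a field only when it contains a comma, a double quote, or a line break) are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CsvTextSketch {

    // Quote a field only when it contains a comma, a quote, or a line break.
    // This is a common CSV convention, assumed here for illustration.
    public static String escape(String field) {
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        // Embedded quotes are doubled, then the whole field is wrapped in quotes.
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Join the escaped fields of one row into a single comma-separated line.
    public static String toLine(List<String> fields) {
        List<String> escaped = new ArrayList<>();
        for (String field : fields) {
            escaped.add(escape(field));
        }
        return String.join(",", escaped);
    }

    // Render a header followed by one text line per row.
    public static List<String> render(List<String> header, List<List<String>> rows) {
        List<String> lines = new ArrayList<>();
        lines.add(toLine(header));
        for (List<String> row : rows) {
            lines.add(toLine(row));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical sample data, loosely modeled on the year/model columns above.
        List<String> header = Arrays.asList("year", "model");
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("2012", "S"),
                Arrays.asList("1997", "E350, \"Super\""));
        for (String line : render(header, rows)) {
            System.out.println(line);
        }
    }
}
```

Each row becomes exactly one line of text, which is why a CSV writer is often a reasonable answer when the goal is a text file rather than Parquet.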