I have the following simple Scala class, which I will later modify to fit some machine learning models.
I need to create a jar file out of this as I am going to run these m
I noticed later that you are going to write to a text file. Spark's .format("text")
does not support any type except String/Text, so to achieve your goal you first need to convert all the columns to String and then store the result:
df.rdd.map(_.toString().replace("[","").replace("]", "")).saveAsTextFile("textfilename")
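If you would rather stay in the DataFrame API than drop down to the RDD, a rough equivalent is to cast every column to string, join them into a single column, and use the text writer. This is only a sketch: the sample data, the column names, and the output path textfilename_sketch are made up for illustration, and it assumes a comma is an acceptable separator.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, concat_ws}

val spark = SparkSession.builder
  .appName("Text Output Sketch")
  .config("spark.master", "local")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data standing in for the real DataFrame.
val df = Seq((1, "alice", 3.5), (2, "bob", 4.0)).toDF("id", "name", "score")

// Cast every column to string and join them into one column,
// because the text data source accepts exactly one string column.
val asText = df.select(
  concat_ws(",", df.columns.map(c => col(c).cast("string")): _*).as("value")
)

// Hypothetical output directory; "overwrite" lets the sketch be rerun.
asText.write.mode("overwrite").text("textfilename_sketch")
```

Note that concat_ws skips null values, so rows containing nulls end up with fewer fields; if that matters, the nulls would have to be replaced with a placeholder first.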
If you can consider other file-based formats for storing the data, you keep the benefit of typed columns, for example with CSV or JSON. Here is a working code example for CSV, based on your csv file.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .appName("Simple Application")
  .config("spark.master", "local")
  .getOrCreate()
import spark.implicits._
// Read the CSV with a header row and let Spark infer the column types.
val df = spark.read
  .format("csv")
  .option("delimiter", ",")
  .option("header", "true")
  .option("inferSchema", "true")
  .option("dateFormat", "yyyy-MM-dd")
  .load("datat.csv")
df.printSchema()
df.show()
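// Write the data back out as tab-separated values with a header row.
// (inferSchema only applies when reading, so it is not set here.)
df.write
  .format("csv")
  .option("header", "true")
  .option("delimiter", "\t")
  .option("timestampFormat", "yyyy-MM-dd HH:mm:ss")
  .option("escape", "\\")
  .save("another")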
There is no need for a custom encoder/decoder.
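The same should hold for JSON, the other format mentioned above. A minimal sketch, assuming made-up sample data and a hypothetical output directory another_json:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .appName("JSON Output Sketch")
  .config("spark.master", "local")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data standing in for the real DataFrame.
val df = Seq((1, "alice", 3.5), (2, "bob", 4.0)).toDF("id", "name", "score")

// Each row is written as one JSON object per line; no manual string conversion.
df.write.mode("overwrite").json("another_json")

// Reading the files back lets Spark infer the column types from the JSON values.
val restored = spark.read.json("another_json")
restored.printSchema()
restored.show()
```

Because JSON keeps the field names with every record, no header option is needed, at the cost of larger files than CSV.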