Reading CSV into a Spark DataFrame with timestamp and date types

南笙 2021-02-18 16:42

It's CDH with Spark 1.6.

I am trying to import this hypothetical CSV into an Apache Spark DataFrame:

$ hadoop fs -cat test.csv
a,b,c,201         


        
2 Answers
  •  眼角桃花
    2021-02-18 16:50

    It's not really elegant, but you can convert the timestamp column to a date like this (see the last line):

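    import org.apache.spark.sql.functions.expr

    // Load the CSV with the spark-csv package; with no header, columns are named C0, C1, ...
    val textData = sqlContext.read.format("com.databricks.spark.csv")
        .option("header", "false")
        .option("delimiter", ",")
        .option("dateFormat", "yyyy-MM-dd")
        .option("inferSchema", "true")
        .option("nullValue", "null")
        .load("test.csv")
        // Replace the inferred timestamp column C4 with its date part
        .withColumn("C4", expr("""to_date(C4)"""))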
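The same conversion can probably be written with the typed column functions instead of a SQL expression string. The sketch below assumes the column was inferred as a timestamp, as in the answer above; `timestampToDate` is a hypothetical helper name introduced only for illustration, while `col` and `to_date` come from `org.apache.spark.sql.functions`.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, to_date}

// Hypothetical helper (not part of Spark or spark-csv): replaces the named
// column with its date part, so the column's type becomes DateType.
def timestampToDate(df: DataFrame, column: String): DataFrame =
  df.withColumn(column, to_date(col(column)))
```

With the DataFrame loaded above, this would be called as `timestampToDate(textData, "C4")`. Casting with `col("C4").cast("date")` should give the same result for a timestamp column.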
    
