How to convert unix timestamp to date in Spark

伪装坚强ぢ 2020-12-01 11:42

I have a data frame with a column of Unix timestamps (e.g. 1435655706000), and I want to convert it to a date with the format 'yyyy-MM-dd'. I've tried nscala-time but it doesn't work.

  • 2020-12-01 12:21

    What you can do is:

    input.withColumn("time", concat(from_unixtime(input.col("COL_WITH_UNIX_TIME")/1000,
    "yyyy-MM-dd'T'HH:mm:ss"), typedLit("."), substring(input.col("COL_WITH_UNIX_TIME"), 11, 3),
    typedLit("Z")))

    where time is the new column name and COL_WITH_UNIX_TIME is the name of the column you want to convert. This keeps the millisecond part of the timestamp, so the result is more precise and has the form "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'".
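To sanity-check what that expression should produce, here is a minimal plain-Java sketch (no Spark involved) of the same idea. The class and method names are made up for illustration, and it assumes UTC, whereas Spark formats in its own configured time zone. It also shows why substring(..., 11, 3) picks out the milliseconds: Spark's substring is 1-based, so on a 13-digit timestamp it takes the last three digits.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Spark: reproduces the target format in plain Java.
class MillisToIso {
    // 'Z' is a literal in this pattern, so the formatter is pinned to UTC (an assumption).
    private static final DateTimeFormatter ISO_MILLIS =
        DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'").withZone(ZoneOffset.UTC);

    static String toIsoMillis(long epochMillis) {
        return ISO_MILLIS.format(Instant.ofEpochMilli(epochMillis));
    }

    // Mirrors substring(col, 11, 3): characters 11..13 (1-based) of the decimal string.
    // Only valid for 13-digit timestamps; shorter values would throw here.
    static String millisPartBySubstring(long epochMillis) {
        return Long.toString(epochMillis).substring(10, 13);
    }

    public static void main(String[] args) {
        long ts = 1435655706123L; // the question's value with 123 ms added for illustration
        System.out.println(toIsoMillis(ts));           // 2015-06-30T09:15:06.123Z
        System.out.println(millisPartBySubstring(ts)); // 123
    }
}
```

If the Spark output differs only in the hour (or the day), the session time zone is the likely cause rather than the expression itself.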

  • 2020-12-01 12:32

    You can use the following syntax in Java:

    // input (assumed name): the Dataset<Row> holding the timestamp column
    input.select("timestamp")
                .withColumn("date", date_format(col("timestamp").divide(1000).cast(DataTypes.TimestampType), "yyyyMMdd").cast(DataTypes.IntegerType))
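As a rough check of what the integer "date" column should contain, the same conversion can be sketched in plain Java. The names below are illustrative only, and UTC is an assumption; Spark would use its own time zone setting.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: epoch milliseconds -> yyyyMMdd as an int (UTC assumed).
class DateKey {
    private static final DateTimeFormatter BASIC =
        DateTimeFormatter.ofPattern("yyyyMMdd").withZone(ZoneOffset.UTC);

    static int toDateKey(long epochMillis) {
        // Format the instant as eight digits, then parse them as a number.
        return Integer.parseInt(BASIC.format(Instant.ofEpochMilli(epochMillis)));
    }

    public static void main(String[] args) {
        System.out.println(toDateKey(1435655706000L)); // 20150630
    }
}
```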
  • 2020-12-01 12:39

    Since Spark 1.5, there is a built-in function, from_unixtime, for this. It expects seconds, so use ts/1000 if the column holds milliseconds:

    val df = sqlContext.sql("select from_unixtime(ts,'yyyy-MM-dd') as `ts` from mr")

    Please check Spark 1.5.2 API Doc for more info.
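One pitfall with these pattern strings is letter case: in Java-style date patterns, lowercase y is the calendar year while uppercase Y is the week-based year, and the two disagree around New Year. The sketch below uses java.time with US week rules to show the effect; the class name is made up, and how a given Spark version treats Y may differ, so treat this as an illustration of the pattern letters rather than of Spark itself.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical demo class: calendar year (y) versus week-based year (Y).
class WeekYearPitfall {
    static String format(String pattern, LocalDate date) {
        // Locale.US fixes the week rules: weeks start on Sunday, first week needs 1 day.
        return DateTimeFormatter.ofPattern(pattern, Locale.US).format(date);
    }

    public static void main(String[] args) {
        LocalDate d = LocalDate.of(2015, 12, 29);
        System.out.println(format("yyyy-MM-dd", d)); // 2015-12-29
        System.out.println(format("YYYY-MM-dd", d)); // 2016-12-29 (already in week 1 of 2016)
    }
}
```

For most of the year both patterns print the same thing, which is why the mistake tends to go unnoticed until late December.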

  • 2020-12-01 12:43

    Here it is using the Scala DataFrame functions from_unixtime and to_date:

    import org.apache.spark.sql.functions._

    // NOTE: divide by 1000 required if milliseconds
    // e.g. 1446846655609 -> 2015-11-06 21:50:55 -> 2015-11-06
    mr.select(to_date(from_unixtime($"ts" / 1000)))
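The note about dividing by 1000 matters more than it may look. A small plain-Java sketch (hypothetical class name, UTC assumed) shows what happens when a millisecond value is interpreted as seconds:

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical demo: interpreting the same number as seconds vs. milliseconds.
class SecondsVsMillis {
    static LocalDate dateFromSeconds(long epochSeconds) {
        return Instant.ofEpochSecond(epochSeconds).atZone(ZoneOffset.UTC).toLocalDate();
    }

    public static void main(String[] args) {
        long ms = 1446846655609L;
        // Correct: convert milliseconds to seconds first.
        System.out.println(dateFromSeconds(ms / 1000)); // 2015-11-06
        // Wrong: milliseconds passed as seconds land tens of thousands of years ahead.
        System.out.println(dateFromSeconds(ms).getYear());
    }
}
```

So if a converted column shows five-digit years, a missing division is the first thing to check.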
  • 2020-12-01 12:43

    I solved this using the joda-time library, by mapping over the DataFrame and converting each DateTime into a String:

    import org.joda.time._
    val time_col = sqlContext.sql("select ts from mr")
                             .map(line => new DateTime(line(0)).toString("yyyy-MM-dd"))
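Keep in mind that Joda's new DateTime(millis) uses the JVM's default time zone, so the resulting day can differ between machines. The plain-Java sketch below (java.time instead of Joda, with a made-up class name) shows the same instant falling on two different dates:

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical demo: the calendar day of an instant depends on the zone used to format it.
class ZoneMatters {
    static String dayIn(long epochMillis, String zoneId) {
        return DateTimeFormatter.ofPattern("yyyy-MM-dd")
            .withZone(ZoneId.of(zoneId))
            .format(Instant.ofEpochMilli(epochMillis));
    }

    public static void main(String[] args) {
        long ts = 1446846655609L; // 21:50:55 UTC on 2015-11-06
        System.out.println(dayIn(ts, "UTC"));        // 2015-11-06
        System.out.println(dayIn(ts, "Asia/Tokyo")); // 2015-11-07 (UTC+9)
    }
}
```

Passing an explicit zone when constructing the DateTime is one way to make the output reproducible across executors.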
  • 2020-12-01 12:46
    You need to import the following libraries:

    import org.joda.time.{DateTime, DateTimeZone}
    import org.joda.time.format.DateTimeFormat

    val stri = new DateTime(timeInMillisec).toString("yyyy/MM/dd")

    Or, adjusted to your case:

    // assumes ts is a Long column of epoch milliseconds
    val time_col = sqlContext.sql("select ts from mr")
                             .map(line => new DateTime(line.getLong(0)).toString("yyyy/MM/dd"))

    There could be another way:

      import com.github.nscala_time.time.Imports._
      // start from the epoch and add the timestamp (threshold, in milliseconds) converted to seconds
      val date = (new DateTime(0L) + ((threshold.toDouble)/1000).toInt.seconds )
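A caution about converting timestamps to Int, as the snippets above do: a millisecond timestamp does not fit in 32 bits, while the same value in seconds does (until 2038). Here is a minimal plain-Java sketch with illustrative names; Scala's Long.toInt wraps around silently in the same way as the Java cast shown.

```java
// Hypothetical demo: which timestamp units survive a conversion to a 32-bit int.
class IntOverflow {
    static boolean fitsInInt(long value) {
        return value >= Integer.MIN_VALUE && value <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long ms = 1435655706000L;
        System.out.println(fitsInInt(ms));        // false: milliseconds overflow an int
        System.out.println(fitsInInt(ms / 1000)); // true: seconds still fit
        // The narrowing cast does not fail; it silently produces an unrelated number.
        System.out.println((int) ms);
    }
}
```

Keeping the value as a Long until after the division avoids the problem.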

    Hope this helps :)
