Any idea why I am getting the result below?
scala> val b = to_timestamp($"DATETIME", "ddMMMYYYY:HH:mm:ss")
b: org.apache.spark.sql.Column = to_timesta
Try the code below.
I created a sample DataFrame "df" for the table:
+---+-------------------+
| id| date|
+---+-------------------+
| 1| 01JAN2017:01:02:03|
| 2| 15MAR2017:01:02:03|
| 3|02APR2015:23:24:25 |
+---+-------------------+
// Parse the string to epoch seconds with a lowercase-y pattern, then cast to a timestamp.
val t_s = unix_timestamp($"date", "ddMMMyyyy:HH:mm:ss").cast("timestamp")
df.withColumn("ts", t_s).show()
+---+-------------------+--------------------+
| id| date| ts|
+---+-------------------+--------------------+
| 1| 01JAN2017:01:02:03|2017-01-01 01:02:...|
| 2| 15MAR2017:01:02:03|2017-03-15 01:02:...|
| 3|02APR2015:23:24:25 |2015-04-02 23:24:...|
+---+-------------------+--------------------+
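To make the two steps in that expression concrete, here is a minimal sketch in plain Scala, with no Spark involved. The helper name `toEpochSeconds` is hypothetical, and the fixed UTC zone is an assumption made only so the output is repeatable; Spark uses the session time zone instead.

```scala
import java.text.SimpleDateFormat
import java.time.Instant
import java.util.{Locale, TimeZone}

object EpochSecondsSketch {
  // Hypothetical helper: parse `text` with `pattern` and return whole seconds since the epoch.
  def toEpochSeconds(text: String, pattern: String): Long = {
    val fmt = new SimpleDateFormat(pattern, Locale.ENGLISH)
    fmt.setTimeZone(TimeZone.getTimeZone("UTC")) // assumption: fixed zone for repeatable output
    fmt.parse(text).getTime / 1000L              // milliseconds to seconds
  }

  // Step 1: string to epoch seconds (roughly what unix_timestamp does).
  val seconds: Long = toEpochSeconds("01JAN2017:01:02:03", "ddMMMyyyy:HH:mm:ss")

  // Step 2: epoch seconds to a point in time (roughly what the cast does).
  val instant: Instant = Instant.ofEpochSecond(seconds)

  def main(args: Array[String]): Unit = {
    println(seconds) // 1483232523
    println(instant) // 2017-01-01T01:02:03Z
  }
}
```

Because the intermediate value is in whole seconds, any sub-second part of the input would be lost on this route.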
Use y (year), not Y (week year):
spark.sql("SELECT to_timestamp('04JUN2018:10:11:12', 'ddMMMyyyy:HH:mm:ss')").show
// +--------------------------------------------------------+
// |to_timestamp('04JUN2018:10:11:12', 'ddMMMyyyy:HH:mm:ss')|
// +--------------------------------------------------------+
// | 2018-06-04 10:11:12|
// +--------------------------------------------------------+
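The difference between the two letters is easiest to see near a year boundary. The following is a small sketch using `java.time` directly rather than Spark; the object name is made up for illustration, and `Locale.US` is an assumption (week-year rules depend on the locale).

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import java.util.Locale

object YearVsWeekYear {
  // 31 December 2017 was a Sunday. Under US week rules it already
  // belongs to the first week of 2018.
  val lastDayOf2017: LocalDate = LocalDate.of(2017, 12, 31)

  // Lowercase y: the calendar year.
  val calendarYear: String =
    DateTimeFormatter.ofPattern("yyyy", Locale.US).format(lastDayOf2017)

  // Uppercase Y: the week-based year.
  val weekYear: String =
    DateTimeFormatter.ofPattern("YYYY", Locale.US).format(lastDayOf2017)

  def main(args: Array[String]): Unit = {
    println(s"yyyy -> $calendarYear") // yyyy -> 2017
    println(s"YYYY -> $weekYear")     // YYYY -> 2018
  }
}
```

For most dates the two values agree, which is why a Y pattern can appear to work until it meets a date close to New Year, or is used for parsing, where a week year alone does not identify a calendar date.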
Another example, this time with an AM/PM marker. Pair a with h (clock hour, 1-12); H is the hour of day (0-23):
scala> sql("select to_timestamp('12/08/2020 1:24:21 AM', 'MM/dd/yyyy h:mm:ss a')").show
+-------------------------------------------------------------+
|to_timestamp('12/08/2020 1:24:21 AM', 'MM/dd/yyyy h:mm:ss a')|
+-------------------------------------------------------------+
| 2020-12-08 01:24:21|
+-------------------------------------------------------------+
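An AM input does not show whether the AM/PM marker is really being applied, so it is worth checking a PM value as well. Below is a minimal sketch with `java.time`, outside Spark, using the 12-hour letter h together with a. The object name and the `Locale.US` choice are assumptions for illustration.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import java.util.Locale

object ClockHourSketch {
  // h = clock hour of AM/PM (1-12), a = AM/PM marker.
  val fmt: DateTimeFormatter =
    DateTimeFormatter.ofPattern("MM/dd/yyyy h:mm:ss a", Locale.US)

  val morning: LocalDateTime = LocalDateTime.parse("12/08/2020 1:24:21 AM", fmt)
  val afternoon: LocalDateTime = LocalDateTime.parse("12/08/2020 1:24:21 PM", fmt)

  def main(args: Array[String]): Unit = {
    println(morning)   // 2020-12-08T01:24:21
    println(afternoon) // 2020-12-08T13:24:21
  }
}
```

With h and a combined, the PM input lands at 13:24:21, which is the behaviour you would normally want from a 12-hour source format.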
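Try this UDF:
import java.text.SimpleDateFormat
import org.apache.spark.sql.functions.{lit, udf}

// Re-formats a date string from pattern cFormat to pattern rFormat.
val changeDtFmt = udf { (cFormat: String, rFormat: String, date: String) =>
  val formatterOld = new SimpleDateFormat(cFormat)
  val formatterNew = new SimpleDateFormat(rFormat)
  formatterNew.format(formatterOld.parse(date))
}

// Note: the new "ts" column is a string, not a timestamp.
sourceRawData.
  withColumn("ts",
    changeDtFmt(lit("ddMMMyyyy:HH:mm:ss"), lit("yyyy-MM-dd HH:mm:ss"), $"DATETIME")).
  show(6, false)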
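One thing to watch with a UDF like this is that `SimpleDateFormat.parse` throws on null or malformed input, which would fail the whole job. As a possible variation (not part of the answer above), the core logic can be written as a plain function that returns an `Option`. The name `reformat` and the `Locale.ENGLISH` choice are assumptions for this sketch.

```scala
import java.text.SimpleDateFormat
import java.util.Locale
import scala.util.Try

object SafeReformatSketch {
  // Hypothetical helper: returns None for null or unparseable input instead of throwing.
  def reformat(fromPattern: String, toPattern: String, date: String): Option[String] =
    Option(date).flatMap { d =>
      Try {
        // SimpleDateFormat is not thread-safe, so build fresh instances per call.
        val in = new SimpleDateFormat(fromPattern, Locale.ENGLISH)
        val out = new SimpleDateFormat(toPattern, Locale.ENGLISH)
        out.format(in.parse(d))
      }.toOption
    }

  def main(args: Array[String]): Unit = {
    println(reformat("ddMMMyyyy:HH:mm:ss", "yyyy-MM-dd HH:mm:ss", "01JAN2017:01:02:03"))
    println(reformat("ddMMMyyyy:HH:mm:ss", "yyyy-MM-dd HH:mm:ss", "not a date"))
  }
}
```

A function of this shape can be wrapped in `udf` the same way as the original; Spark should then surface `None` as a null in the resulting column, though that is worth verifying on your Spark version.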