I have a PySpark dataframe with a string column in the format MM-dd-yyyy and I am attempting to convert it into a date column.
I tried the strptime() approach, but it did not work for me. I found a cleaner solution using cast:
from pyspark.sql.types import DateType

# Cast the timestamp string column to DateType; the time portion is dropped
spark_df1 = spark_df.withColumn("record_date", spark_df['order_submitted_date'].cast(DateType()))

# Below is the result
spark_df1.select('order_submitted_date', 'record_date').show(10, False)
+---------------------+-----------+
|order_submitted_date |record_date|
+---------------------+-----------+
|2015-08-19 12:54:16.0|2015-08-19 |
|2016-04-14 13:55:50.0|2016-04-14 |
|2013-10-11 18:23:36.0|2013-10-11 |
|2015-08-19 20:18:55.0|2015-08-19 |
|2015-08-20 12:07:40.0|2015-08-20 |
|2013-10-11 21:24:12.0|2013-10-11 |
|2013-10-11 23:29:28.0|2013-10-11 |
|2015-08-20 16:59:35.0|2015-08-20 |
|2015-08-20 17:32:03.0|2015-08-20 |
|2016-04-13 16:56:21.0|2016-04-13 |
+---------------------+-----------+
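
Note that the cast works above because the source strings are already in a yyyy-MM-dd timestamp form. If the strings really are in MM-dd-yyyy, as in the original question, a plain cast may not parse them. A minimal sketch using to_date with an explicit pattern, assuming Spark 3.x datetime pattern syntax and made-up sample data:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical sample data in the MM-dd-yyyy string format from the question
df = spark.createDataFrame([("08-19-2015",), ("04-14-2016",)], ["order_submitted_date"])

# to_date with an explicit pattern parses MM-dd-yyyy strings into a DateType column
df = df.withColumn("record_date", F.to_date(F.col("order_submitted_date"), "MM-dd-yyyy"))
df.select("order_submitted_date", "record_date").show(truncate=False)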