I have an RDD containing a timestamp named time of type Long:
root
|-- id: string (nullable = true)
|-- value1: string (nullable = true)
Not sure if this is exactly what you meant/needed, but I've felt the same struggle dealing with dates/timestamps in Spark SQL, and the only thing I came up with was casting a string to a timestamp, since it seems impossible (to me) to have a Date type in Spark SQL.
Anyway, this is my code to accomplish something similar to what you need (with a Long in place of a String):
val mySQL = sqlContext.sql(
  "select cast(yourLong as timestamp) as time_cast" +
  ", count(1) total" +
  " from logs" +
  " group by cast(yourLong as timestamp)"
)
val result = mySQL.map(x => (x(0).toString, x(1).toString))
and the output is something like this:
(2009-12-18 10:09:28.0,7)
(2009-12-18 05:55:14.0,1)
(2009-12-18 16:02:50.0,2)
(2009-12-18 09:32:32.0,2)
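(Side note: the query assumes the RDD has already been registered as a table named logs. A minimal sketch of that setup, with a made-up case class and field names, could look like this:

case class Log(id: String, value1: String, yourLong: Long)   // field names are my guess

val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._   // Spark 1.3+, needed for toDF()

// build a DataFrame from the RDD and register it so the SQL above can find "logs"
val logs = sc.parallelize(Seq(Log("id1", "v1", 1261130968000L))).toDF()   // placeholder row
logs.registerTempTable("logs")

On Spark versions before 1.3 the registration step is a bit different, so adapt as needed.)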
Could this be useful for you as well, even though I'm using Timestamp and not Date?
Hope it helps,
FF
EDIT: to test a "single" cast from Long to Timestamp, I tried this simple change:
val mySQL = sqlContext.sql(
  "select cast(1430838439 as timestamp) as time_cast" +
  ", count(1) total" +
  " from logs" +
  " group by cast(1430838439 as timestamp)"
)
val result = mySQL.map(x => (x(0), x(1)))
and everything ran fine, with the result:
(1970-01-17 14:27:18.439,4) // 4 because I have 4 rows in my table
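A side note on units: 1430838439 read as epoch seconds would fall in May 2015, but the cast produced 1970-01-17 14:27:18.439, so here the value is apparently being treated as milliseconds. If your Long holds epoch seconds and you get 1970-ish dates like this, try cast(yourLong * 1000 as timestamp) (or divide by 1000 in the opposite case); this behaviour can vary across Spark/Hive versions, so verify on yours.

And if you're on Spark 1.3+ you can do the same grouping through the DataFrame API instead of raw SQL; a sketch, assuming a DataFrame logs with a Long column yourLong as above:

import org.apache.spark.sql.functions.col

// cast the long to a timestamp column, then count rows per distinct timestamp
val byTime = logs
  .withColumn("time_cast", col("yourLong").cast("timestamp"))
  .groupBy("time_cast")
  .count()

byTime.collect().foreach(println)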