I need to write a Spark SQL query with an inner select and PARTITION BY, but I keep getting an AnalysisException. I have already spent a few hours on this, and other approaches have not worked either.
I believe the issue is in the window specification:

```sql
over (partition by deviceId order by timestamp)
```
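For context, a minimal query using that spec might look like the following sketch; the table name `events` and its columns are assumptions, not from your post:

```sql
-- Hypothetical streaming source: events(deviceId STRING, timestamp TIMESTAMP, signal DOUBLE)
SELECT deviceId,
       timestamp,
       signal,
       count(*) OVER (PARTITION BY deviceId ORDER BY timestamp) AS cnt
FROM events
-- Against a streaming Dataset this fails with an AnalysisException along the lines of:
-- "Non-time-based windows are not supported on streaming DataFrames/Datasets"
```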
In Structured Streaming, the partition needs to be over a time-based column, which in your case is `timestamp`. The following should work:

```sql
over (partition by timestamp order by timestamp)
```
Of course, that does not address the intent of your query. You might attempt the following, though it is unclear whether Spark supports it:

```sql
over (partition by timestamp, deviceId order by timestamp)
```
Even if Spark does support that, it would still change the semantics of your query.
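If the underlying goal is a per-device aggregate over time, Structured Streaming does support time-based grouping via the built-in `window()` grouping function, as an alternative to an `OVER` clause. A sketch, again assuming a hypothetical `events` table with `deviceId` and `timestamp` columns:

```sql
-- Count events per device in tumbling 10-minute event-time windows
SELECT window(timestamp, '10 minutes') AS time_window,
       deviceId,
       count(*) AS cnt
FROM events
GROUP BY window(timestamp, '10 minutes'), deviceId
```

Unlike an analytic window function, this produces one row per (time window, device) group rather than one row per input row, so it is only a substitute when an aggregate is actually what you need.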
Update

Here is a definitive source from Tathagata Das, a core committer on Spark Streaming: http://apache-spark-user-list.1001560.n3.nabble.com/Does-partition-by-and-order-by-works-only-in-stateful-case-td31816.html