Spark - Non-time-based windows are not supported on streaming DataFrames/Datasets

予麋鹿 2021-01-06 08:31

I need to write a Spark SQL query with an inner select and partition by. The problem is that I get an AnalysisException. I have already spent a few hours on this, but with other approaches I hav…

1 Answer
  • 2021-01-06 09:08

    I believe the issue is in the windowing specification:

    over (partition by deviceId order by timestamp) 
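
    For reference, a minimal, self-contained sketch (in Scala, using the built-in rate source purely as a stand-in for the real streaming input; the deviceId and timestamp column names are taken from the query above) that reproduces the problem: starting a streaming query with this ranking window fails with the AnalysisException from the title.

        import org.apache.spark.sql.SparkSession

        object NonTimeWindowRepro {
          def main(args: Array[String]): Unit = {
            val spark = SparkSession.builder()
              .appName("non-time-based-window-repro")
              .master("local[*]")
              .getOrCreate()

            // The "rate" source emits (timestamp, value) rows; rename value to
            // deviceId so it matches the query in the question. This is only a
            // stand-in for the real streaming source.
            val events = spark.readStream
              .format("rate")
              .option("rowsPerSecond", "5")
              .load()
              .withColumnRenamed("value", "deviceId")

            events.createOrReplaceTempView("events")

            // Ranking window partitioned by a non-time column over a streaming
            // relation: starting this query throws
            // AnalysisException: Non-time-based windows are not supported on
            // streaming DataFrames/Datasets
            val ranked = spark.sql(
              """SELECT deviceId, timestamp,
                |       row_number() OVER (PARTITION BY deviceId ORDER BY timestamp) AS rn
                |FROM events""".stripMargin)

            ranked.writeStream
              .format("console")
              .outputMode("append")
              .start()
              .awaitTermination()
          }
        }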
    

    The partition would need to be over a time-based column, in your case timestamp. The following should work:

    over (partition by timestamp order by timestamp) 
    

    That will of course not address the intent of your query. The following might be attempted, but it is unclear whether Spark would support it:

    over (partition by timestamp, deviceId order by timestamp) 
    

    Even if Spark does support that, it would still change the semantics of your query.
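
    For comparison, the time-based windowing that streaming does support is the window() grouping on the event-time column. Below is a minimal sketch, assuming the same deviceId and timestamp columns and reusing the rate source stand-in from the sketch above: it counts events per deviceId in 10-minute event-time windows. This is a different operation with different semantics than a per-device ranking, but it illustrates what "time-based" means here.

        import org.apache.spark.sql.SparkSession
        import org.apache.spark.sql.functions.{col, window}

        object TimeBasedWindowSketch {
          def main(args: Array[String]): Unit = {
            val spark = SparkSession.builder()
              .appName("time-based-window-sketch")
              .master("local[*]")
              .getOrCreate()

            // Same stand-in streaming source as in the sketch above.
            val events = spark.readStream
              .format("rate")
              .option("rowsPerSecond", "5")
              .load()
              .withColumnRenamed("value", "deviceId")

            // Time-based windowing that streaming supports: group by an
            // event-time window plus deviceId. The watermark bounds state
            // and allows append output.
            val counts = events
              .withWatermark("timestamp", "10 minutes")
              .groupBy(window(col("timestamp"), "10 minutes"), col("deviceId"))
              .count()

            counts.writeStream
              .format("console")
              .outputMode("append")
              .start()
              .awaitTermination()
          }
        }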

    Update

    Here is a definitive source from Tathagata Das, who is a core committer on Spark Streaming: http://apache-spark-user-list.1001560.n3.nabble.com/Does-partition-by-and-order-by-works-only-in-stateful-case-td31816.html
