I have a DataFrame and I want to roll the data up into 7-day buckets and apply some aggregations to some of the columns.
I have a pyspark sql DataFrame like ------
<
The error kind of says everything:
py4j.protocol.Py4JJavaError: An error occurred while calling o138.select.
: org.apache.spark.sql.AnalysisException: Could not resolve window function 'min'. Note that, using window functions currently requires a HiveContext;
You'll need a version of Spark that supports Hive (i.e. built with Hive support); then you can declare a HiveContext.
In Scala:
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
and then use that context to perform your window function.
In Python:
# sc is an existing SparkContext.
from pyspark.sql import HiveContext
sqlContext = HiveContext(sc)
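Once you have a Hive-enabled context, the `min(...) OVER (...)` query resolves. To illustrate what such a window function actually computes without needing a Spark installation, here is the same kind of rolling 7-day minimum run against SQLite from the standard library (window functions are standard SQL; the table and column names here are made up for the example):

```python
import sqlite3

# Hypothetical data: (day, value) pairs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (day INTEGER, value INTEGER)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [(1, 5), (2, 3), (4, 7), (9, 2), (10, 8)])

# Rolling 7-day minimum: for each row, min(value) over the 6 preceding
# days plus the current day. This is the kind of query that fails on a
# plain SQLContext but works once window functions are supported.
rows = conn.execute("""
    SELECT day,
           min(value) OVER (ORDER BY day
                            RANGE BETWEEN 6 PRECEDING AND CURRENT ROW)
    FROM events
    ORDER BY day
""").fetchall()
print(rows)  # [(1, 5), (2, 3), (4, 3), (9, 2), (10, 2)]
```

The same frame clause (`RANGE BETWEEN 6 PRECEDING AND CURRENT ROW`, ordered by a day column) carries over to the Spark SQL query once the HiveContext is in place.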
You can read further about the difference between SQLContext and HiveContext here.
SparkSQL has a SQLContext and a HiveContext. HiveContext is a superset of SQLContext, and the Spark community suggests using the HiveContext. You can see that when you run spark-shell, your interactive driver application: it automatically creates a SparkContext defined as sc and a HiveContext defined as sqlContext. The HiveContext allows you to execute SQL queries as well as Hive commands. The same behavior occurs for pyspark.
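As an aside, the 7-day roll-up from the question amounts to bucketing each row by `(date - first_date) // 7` and aggregating within each bucket. A plain-Python sketch of that grouping logic (dates, values, and the choice of `min` as the aggregation are all hypothetical):

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: (event_date, value) pairs.
rows = [
    (date(2016, 1, 1), 10),
    (date(2016, 1, 3), 5),
    (date(2016, 1, 9), 7),
    (date(2016, 1, 16), 2),
]

# Bucket each row into a 7-day window counted from the earliest date --
# the same grouping a 7-day roll-up in Spark performs.
start = min(d for d, _ in rows)
buckets = defaultdict(list)
for d, v in rows:
    buckets[(d - start).days // 7].append(v)

# min() stands in for whatever aggregation is needed per bucket.
weekly_min = {week: min(vs) for week, vs in sorted(buckets.items())}
print(weekly_min)  # {0: 5, 1: 7, 2: 2}
```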