Window function is not working on PySpark SQLContext

夕颜 2020-12-22 08:54

I have a data frame and I want to roll the data up into 7-day windows and run some aggregate functions on some of the columns.

I have a pyspark sql dataframe like this:

1 Answer
  • 2020-12-22 09:09

    The error more or less says everything:

    py4j.protocol.Py4JJavaError: An error occurred while calling o138.select.
    : org.apache.spark.sql.AnalysisException: Could not resolve window function 'min'. Note that, using window functions currently requires a HiveContext;
    

    You'll need a version of Spark that supports Hive (built with Hive support); then you can declare a HiveContext:

    // Scala
    val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
    

    and then use that context to perform your window function.

    In Python:

    # sc is an existing SparkContext.
    from pyspark.sql import HiveContext
    sqlContext = HiveContext(sc)
    

    You can read further about the difference between SQLContext and HiveContext here.

    SparkSQL has a SQLContext and a HiveContext. HiveContext is a super set of the SQLContext. The Spark community suggest using the HiveContext. You can see that when you run spark-shell, which is your interactive driver application, it automatically creates a SparkContext defined as sc and a HiveContext defined as sqlContext. The HiveContext allows you to execute SQL queries as well as Hive commands. The same behavior occurs for pyspark.
