PySpark window function fn.avg() only outputs the same data

孤街醉人 posted on 2019-12-13 03:38:38

Question


Here is my code:

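import pandas as pd
from pyspark.sql import SparkSession  # needed for SparkSession.builder below
import pyspark.sql.functions as fn
from pyspark.sql.functions import isnan, isnull
from pyspark.sql.functions import lit
from pyspark.sql.window import Window

spark = SparkSession.builder.appName(" ").getOrCreate()
# Raw string so the backslashes in the Windows path are taken literally
file = r"D:\project\HistoryData.csv"
lines = pd.read_csv(file)
# The original passed an undefined name `cc`; the pandas DataFrame `lines` is assumed to be what was meant
spark_df = spark.createDataFrame(lines, ['id', 'time', 'average', 'max', 'min'])

# Frame: previous row, current row, next row, within each "time" partition, ordered by "id"
temp = Window.partitionBy("time").orderBy("id").rowsBetween(-1, 1)

df = spark_df.withColumn("movingAvg", fn.avg("average").over(temp))

df.show()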

But the output is not what I expected: movingAvg shows the same data, and some of the data has disappeared.
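
One possible explanation, offered as an assumption because the actual output and the contents of HistoryData.csv are not shown: if every row has a distinct "time" value, then partitionBy("time") puts each row in its own partition, the frame rowsBetween(-1, 1) only ever contains the current row, and the average equals the row's own "average" value. The plain-Python sketch below emulates that windowed average on made-up sample rows so it can be run without Spark; the helper name moving_avg and the sample data are hypothetical.

```python
from collections import defaultdict


def moving_avg(rows, partition_key, order_key, value_key, before=1, after=1):
    """Emulate avg(value) over partitionBy(partition_key).orderBy(order_key)
    .rowsBetween(-before, after) on a list of dicts (hypothetical helper)."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_key]].append(row)

    out = []
    for part in partitions.values():
        part = sorted(part, key=lambda r: r[order_key])
        for i, row in enumerate(part):
            # The frame is clipped at the partition boundaries.
            frame = part[max(0, i - before): i + after + 1]
            avg = sum(r[value_key] for r in frame) / len(frame)
            out.append({**row, "movingAvg": avg})
    return out


# Made-up sample data: every row has a different "time".
unique_times = [
    {"id": 1, "time": "t1", "average": 10.0},
    {"id": 2, "time": "t2", "average": 20.0},
    {"id": 3, "time": "t3", "average": 30.0},
]
# Each partition holds one row, so movingAvg just repeats "average".
same = moving_avg(unique_times, "time", "id", "average")

# Same rows, but all sharing one "time": now the frame spans neighbours.
shared_time = [dict(r, time="t1") for r in unique_times]
smoothed = moving_avg(shared_time, "time", "id", "average")

print([r["movingAvg"] for r in same])
print([r["movingAvg"] for r in smoothed])
```

If that is the cause, partitioning by a column that actually groups several rows (or ordering by "time" within a coarser partition) would be the thing to try. Separately, df.show() prints only the first 20 rows by default, which may be why some rows appear to be missing; df.show(n) with a larger n prints more.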

Source: https://stackoverflow.com/questions/51316635/pysparks-window-functions-fn-avg-only-output-same-data
