I am trying some code in Spark (pyspark) for an assignment. This is my first time using this environment, so I'm surely missing something…
I have a simple dataset called c_views.
Are there other simple ways to achieve the result?
from operator import add
c_views.reduceByKey(add)
or if you prefer lambda expressions:
c_views.reduceByKey(lambda x, y: x + y)
I do not understand exactly what I have to code in the function.
It has to be a function that takes two values of the same type as the values in your RDD and returns a value of that same type. It also has to be associative, which means the final result cannot depend on how you arrange the parentheses.
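To see what that two-argument function does, here is a plain-Python sketch of the semantics of `reduceByKey` (no Spark required); the sample `pairs` data is made up, standing in for whatever key-value pairs `c_views` holds:

```python
from operator import add
from functools import reduce
from itertools import groupby

# Hypothetical sample data standing in for c_views: (key, count) pairs.
pairs = [("en", 3), ("fr", 1), ("en", 2), ("de", 4), ("en", 5)]

def reduce_by_key(pairs, func):
    """Sketch of what reduceByKey does: group the values by key, then
    fold each group with the supplied two-argument function."""
    grouped = sorted(pairs, key=lambda kv: kv[0])
    return {k: reduce(func, (v for _, v in g))
            for k, g in groupby(grouped, key=lambda kv: kv[0])}

print(reduce_by_key(pairs, add))
# {'de': 4, 'en': 10, 'fr': 1}

# Associativity means Spark may combine partial sums in any order:
assert (3 + 2) + 5 == 3 + (2 + 5)
```

In real Spark the values for one key may live on different partitions, so the function is applied first within each partition and then across partitions; that is exactly why it must be associative.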