How to overwrite entire existing column in Spark dataframe with new column?

抹茶落季 2021-02-20 04:19

I want to overwrite a Spark DataFrame column with a new column that is a binary flag.

I tried directly overwriting the column id2, but why is it not working like an in-place operati

3 Answers

粉色の甜心 (original poster)

2021-02-20 05:15
As stated above, it's not possible to overwrite a DataFrame object: a DataFrame is an immutable collection, so every transformation returns a new DataFrame.

The quickest way to achieve the desired effect is to use withColumn:
```
df = df.withColumn("col", some_expression)  # some_expression: any Column expression
```
where col is the name of the column you want to "replace". After this runs, the variable df refers to a new DataFrame in which column col holds the new values. If you want to keep the original DataFrame around, assign the result to a new variable instead.

In your case it could look like this:
```
df2 = df2.withColumn("id2", (df2.id2 > 0) & (df2.id2 != float('nan')))
```
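I've added the comparison to NaN because I'm assuming you don't want NaN treated as greater than 0. Spark orders NaN above every other numeric value, so id2 > 0 on its own would be true for NaN rows.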