How to do row processing and item assignment in Dask


Similar unanswered question: Row by row processing of a Dask DataFrame

I'm working with dataframes that are millions of rows long, and so now I'm trying to have al

1 Answer

鱼传尺愫

2021-01-15 16:45
Dask DataFrame does not support efficient row-by-row iteration or item assignment. These workflows rarely scale well in general, and they are also quite slow in pandas itself.

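Instead, consider a vectorized operation such as the Series.where method, which keeps a value where a condition holds and substitutes another value where it does not. Here is a minimal example: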
```
In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'x': [1, 2, 3], 'y': [3, 2, 1]})

In [3]: import dask.dataframe as dd

In [4]: ddf = dd.from_pandas(df, npartitions=2)

In [5]: ddf['z'] = ddf.x.where(ddf.x > ddf.y, ddf.y)

In [6]: ddf.compute()
Out[6]:
   x  y  z
0  1  3  3
1  2  2  2
2  3  1  3
```