问题
I have a pandas df that's indexed by id
and date
. I would like to run some regressions for each id in parallel using dask. I know dask splits the df into N partitions but is there a way to force it to split by id
column? This way when I do map_partitions
I can simply apply my rolling regression function to each partition.
来源:https://stackoverflow.com/questions/51698459/specify-how-to-partition-dask-dataframe