Python: Apply a function to multiple subsets of a dataframe (stored in a dictionary)

梦想与她 submitted on 2021-01-28 11:16:07

Question


Regards,

Apologies if this question appears to be a duplicate of other questions, but I could not find an answer that addresses my exact problem.

I split a dataframe, called "data", into multiple subsets that are stored in a dictionary of dataframes named "dfs" as follows:

# Partition DF

dfs = {}
chunk = 5

for n in range((data.shape[0] // chunk + 1)):
    df_temp = data.iloc[n*chunk:(n+1)*chunk]
    df_temp = df_temp.reset_index(drop=True)
    dfs[n] = df_temp
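
For illustration, here is a minimal sketch of the same partitioning on a small made-up frame. The sample `data` below is an assumption, since the original frame is not shown. Stepping `range` by `chunk` is one way to avoid the empty trailing dataframe that the loop above produces whenever the row count is an exact multiple of `chunk`.

```python
import pandas as pd

# Hypothetical sample data; the real "data" frame is not shown in the question.
data = pd.DataFrame({"x": range(10)})
chunk = 5

# Walk the row positions in strides of `chunk`, numbering the pieces 0, 1, 2, ...
dfs = {
    n: data.iloc[start:start + chunk].reset_index(drop=True)
    for n, start in enumerate(range(0, len(data), chunk))
}

# Show how many rows ended up in each piece.
print({key: len(df) for key, df in dfs.items()})
```

With 10 rows and `chunk = 5`, this yields two dataframes of five rows each, whereas `range(data.shape[0] // chunk + 1)` would add a third, empty one.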

Now, I would like to apply a pre-defined helper function called "fun_c" to EACH of the dataframes (that are stored in the dictionary object called "dfs").

Is it correct to apply the function to dfs in one go, as follows?

result = fun_c(dfs)

If not, what would be the correct way of doing this?


Answer 1:


It depends on the output you're looking for:

  • If you want a dict in the output, apply the function to each dict item and keep the keys
result = {key: fun_c(val) for key, val in dfs.items()}
  • If you want a list of dataframes/values in the output, apply the function to each dict value
result = [fun_c(val) for val in dfs.values()]
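
A small runnable sketch of both variants, under stated assumptions: the `fun_c` below is a hypothetical stand-in that takes a single dataframe and returns the sum of a made-up column `x`, and the two dataframes are invented for the example.

```python
import pandas as pd

# Hypothetical stand-in for the asker's helper: takes ONE dataframe.
def fun_c(df):
    return int(df["x"].sum())

# Hypothetical dictionary of dataframes, shaped like "dfs" in the question.
dfs = {
    0: pd.DataFrame({"x": [1, 2, 3]}),
    1: pd.DataFrame({"x": [4, 5]}),
}

# Dict output: results stay attached to the original keys.
result_dict = {key: fun_c(val) for key, val in dfs.items()}

# List output: results in the dict's insertion order, keys dropped.
result_list = [fun_c(val) for val in dfs.values()]

print(result_dict)  # {0: 6, 1: 9}
print(result_list)  # [6, 9]
```

The dict form is usually the safer choice when the results need to be matched back to the chunk they came from.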

Passing the whole dictionary to the helper isn't wrong either, provided the helper is written to accept a dict; you can iterate however you like inside it:

def fun_c(dfs):

    result = None
    # either
    for key, val in dfs.items():
        pass
    # or
    for val in dfs.values():
        pass
    return result
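
A filled-in version of that skeleton might look like the sketch below. The per-dataframe computation (counting rows) is only a placeholder assumption, as are the sample dataframes; the real body of `fun_c` is not shown in the question.

```python
import pandas as pd

def fun_c(dfs):
    """Accept the whole dict of dataframes and return one result per key."""
    result = {}
    for key, val in dfs.items():
        # Placeholder computation (an assumption): number of rows in each dataframe.
        result[key] = len(val)
    return result

# Hypothetical dictionary of dataframes for the example.
dfs = {
    0: pd.DataFrame({"x": [1, 2, 3]}),
    1: pd.DataFrame({"x": [4]}),
}

result = fun_c(dfs)
print(result)  # {0: 3, 1: 1}
```

Written this way, `result = fun_c(dfs)` from the question works as is, because the iteration happens inside the helper.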

Let me know if this helps!




Answer 2:


Since you want this:

Now, I would like to apply a pre-defined helper function called "fun_c" to EACH of the dataframes (that are stored in the dictionary object called "dfs").

Let's say your dataframe dict looks like this and your helper function takes in a single dataframe.

dfs = {0: df0, 1: df1, 2: df2, 3: df3}

Let's iterate through the dictionary, apply the fun_c function on each of the dataframes, and save the results in another dictionary having the same keys:

dfs_result = {k: fun_c(v) for k, v in dfs.items()}


Source: https://stackoverflow.com/questions/60134460/python-apply-a-function-to-multiple-subsets-of-a-dataframe-stored-in-a-diction
