Pandas column of lists, create a row for each list element

有刺的猬 2020-11-22 06:59

I have a dataframe where some cells contain lists of multiple values. Rather than storing multiple values in a cell, I'd like to expand the dataframe so that each item in t

10 answers
  •  情话喂你
    2020-11-22 07:27

    Very late answer, but I want to add this:

    A fast solution using vanilla Python that also takes care of the sample_num column in the OP's example. On my own large dataset, with over 10 million rows and a result of 28 million rows, it takes only about 38 seconds. The accepted solution breaks down completely with that amount of data and raises a memory error on my system, which has 128 GB of RAM.

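    import pandas as pd

    # Use a clean 0..n-1 index so row positions can serve as merge keys.
    df = df.reset_index(drop=True)
    lstcol = df.lstcol.values
    lstcollist = []  # flattened list elements
    indexlist = []   # originating row position of each element
    countlist = []   # position of each element within its own list
    for ii in range(len(lstcol)):
        lstcollist.extend(lstcol[ii])
        indexlist.extend([ii] * len(lstcol[ii]))
        countlist.extend(range(len(lstcol[ii])))
    # Join the flattened values back onto the remaining columns by row position.
    exploded = pd.DataFrame({"lstcol": lstcollist, "lstcol_num": countlist}, index=indexlist)
    df = pd.merge(df.drop("lstcol", axis=1), exploded,
                  left_index=True, right_index=True).reset_index(drop=True)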
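
    To see what this loop-and-merge approach produces, here is a minimal sketch on a tiny frame. The helper name explode_list_column, its pos_col parameter, and the sample data are all made up for illustration; only the column names lstcol and lstcol_num come from the snippet above.

```python
import pandas as pd


def explode_list_column(df, col, pos_col=None):
    """Return a copy of df with one row per element of the lists in `col`.

    Hypothetical helper for illustration. If `pos_col` is given, it names a
    new column holding each element's position within its original list.
    """
    df = df.reset_index(drop=True)  # row positions become the merge keys
    values, row_ids, positions = [], [], []
    for row_id, items in enumerate(df[col]):
        values.extend(items)
        row_ids.extend([row_id] * len(items))
        positions.extend(range(len(items)))

    data = {col: values}
    if pos_col is not None:
        data[pos_col] = positions
    exploded = pd.DataFrame(data, index=row_ids)

    # Inner join on the index repeats the other columns once per list element.
    return (
        df.drop(columns=col)
        .merge(exploded, left_index=True, right_index=True)
        .reset_index(drop=True)
    )


# Made-up sample data: three rows whose lists hold six elements in total.
sample = pd.DataFrame({"name": ["a", "b", "c"], "lstcol": [[1, 2], [3], [4, 5, 6]]})
result = explode_list_column(sample, "lstcol", pos_col="lstcol_num")
print(result)
```

    One thing to keep in mind: pd.merge defaults to an inner join, so a row whose list is empty contributes no index entries and disappears from the result. If those rows need to survive, passing how="left" to the merge should keep them, with NaN in the list column.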
    
