Filling an empty Python DataFrame using loops

终归单人心 2021-01-05 06:29

Let's say I want to create an empty DataFrame and fill it with values from a loop.

import pandas as pd
import numpy as np

years = [2013, 2014, 2015]
dn=pd.Data         


        
2 Answers
  • 2021-01-05 07:15
    import pandas as pd
    
    years = [2013, 2014, 2015]
    dn = []  # collect one single-column DataFrame per year
    for year in years:
        df1 = pd.DataFrame({'Incidents': ['C', 'B', 'A'],
                            year: [1, 1, 1],
                            }).set_index('Incidents')
        dn.append(df1)
    # Concatenate once, column-wise, after the loop
    dn = pd.concat(dn, axis=1)
    print(dn)
    

    yields

               2013  2014  2015
    Incidents                  
    C             1     1     1
    B             1     1     1
    A             1     1     1
    

    Note that calling pd.concat once, outside the loop, is more time-efficient than calling it on every iteration of the loop.

    Each call to pd.concat allocates space for a new DataFrame and copies all the data from each component DataFrame into it. If you call pd.concat inside the for-loop, you end up doing on the order of n**2 copies, where n is the number of years.

    If you instead accumulate the partial DataFrames in a list and call pd.concat once outside the loop, pandas only needs to perform n copies to make dn.
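
    To make the contrast concrete, here is a minimal sketch that builds the same table both ways. The helper make_frame and the names grown, frames, and combined are made up for this illustration and are not part of the original answer; the sketch shows that the two patterns give the same result, so the only difference is how much copying happens along the way.

```python
import pandas as pd

years = [2013, 2014, 2015]


def make_frame(year):
    # Hypothetical helper: one single-column frame per year, mirroring the answer's df1
    return pd.DataFrame({'Incidents': ['C', 'B', 'A'],
                         year: [1, 1, 1]}).set_index('Incidents')


# Pattern to avoid: concat inside the loop re-copies everything accumulated so far
grown = make_frame(years[0])
for year in years[1:]:
    grown = pd.concat([grown, make_frame(year)], axis=1)

# Preferred pattern: collect the pieces in a list, then concat once
frames = [make_frame(year) for year in years]
combined = pd.concat(frames, axis=1)

# Both approaches produce the same table
print(grown.equals(combined))
```

    With three years the difference is negligible; it only starts to matter as the number of pieces grows.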

  • 2021-01-05 07:20

    As far as I know, you should avoid adding rows to the DataFrame one at a time, because it is slow.

    What I usually do is:

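    import pandas as pd

    n = 5  # example number of iterations; use your own value
    l1 = []
    l2 = []
    
    for i in range(n):
        v1 = i        # placeholder: compute value v1 here
        v2 = i ** 2   # placeholder: compute value v2 here
        l1.append(v1)
        l2.append(v2)
    
    # Build the DataFrame once, after the loop has finished
    d = pd.DataFrame()
    d['l1'] = l1
    d['l2'] = l2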
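
    A possible variant of the same idea, offered only as a sketch: instead of keeping one list per column, collect one dict per row and pass the list of dicts to pd.DataFrame once at the end. The row count n, the name rows, and the values computed in the loop are placeholders assumed for illustration.

```python
import pandas as pd

n = 5  # assumed example row count

rows = []
for i in range(n):
    # Placeholder computations; substitute the real per-row values
    rows.append({'l1': i, 'l2': i ** 2})

# A list of dicts becomes one row per dict, with the keys as column names
d = pd.DataFrame(rows)
print(d)
```

    This keeps the values for each row together, which can be easier to read when there are many columns. Either way, the DataFrame is built once rather than grown inside the loop.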
    