Append an empty row in dataframe using pandas

忘了有多久 2021-02-01 13:25

I am trying to append an empty row at the end of a DataFrame but have been unable to do so. I have tried to understand how the pandas append function works, and I still don't get it.

8 Answers
  • 2021-02-01 13:53

    The code below worked for me.

    df.append(pd.Series([np.nan]), ignore_index = True)
    
  • 2021-02-01 14:00

    Add a new pandas.Series using pandas.DataFrame.append().

    If you wish to specify the name (AKA the "index") of the new row, use:

    df.append(pandas.Series(name='NameOfNewRow'))
    

    If you don't wish to name the new row, use:

    df.append(pandas.Series(), ignore_index=True)
    

    where df is your pandas.DataFrame.
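
    Note that DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so the calls above only run on older versions. A minimal sketch of one possible substitute, using reindex to add an all-NaN row without modifying df in place (the sample data and the variable names named and unnamed are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical sample frame; any DataFrame with a unique index would do.
df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]}, index=["r1", "r2"])

# Named row: reindex to the old labels plus the new one.
# Labels that did not exist before are filled with NaN.
named = df.reindex([*df.index, "NameOfNewRow"])

# Unnamed row: drop the old labels, then extend the 0..n-1 index by one.
unnamed = df.reset_index(drop=True).reindex(pd.RangeIndex(len(df) + 1))

print(named)
print(unnamed)
```

    Like append, reindex returns a new frame. Integer columns are upcast to float because NaN has no integer representation.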

  • 2021-02-01 14:00

    Assuming your df.index is sorted, you can use:

    df.loc[df.index.max() + 1] = None
    

    It handles different index and column types well.
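
    For the plain integer-index case, the idea can be sketched as follows. The sample data is made up for illustration, and the resulting column dtypes may differ between pandas versions, so none are claimed here:

```python
import pandas as pd

# Hypothetical sample data with a default, sorted integer index (0, 1).
df = pd.DataFrame({"speed": [23, 42], "text": ["a", "b"]})

# Assigning to a label that does not exist yet enlarges the frame by one row;
# every cell of that row is set to a missing value.
df.loc[df.index.max() + 1] = None

new_row = df.iloc[-1]
print(df)
```

    Unlike append, this modifies df in place.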

    [EDIT] It works with a pd.DatetimeIndex if there is a constant frequency; otherwise the new index value must be specified exactly, e.g.:

    df.loc[df.index.max() + pd.Timedelta(milliseconds=1)] = None
    

    Long example:

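    # pd.date_range builds the index; the DatetimeIndex(start=..., freq=..., periods=...)
    # constructor form was removed in pandas 1.0.
    df = pd.DataFrame([[pd.Timestamp(12432423), 23, 'text_field']],
                        columns=["timestamp", "speed", "text"],
                        index=pd.date_range(start='2111-11-11', freq='ms', periods=1))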
    df.info()
    

    <class 'pandas.core.frame.DataFrame'>
    DatetimeIndex: 1 entries, 2111-11-11 to 2111-11-11
    Freq: L
    Data columns (total 3 columns):
    timestamp    1 non-null datetime64[ns]
    speed        1 non-null int64
    text         1 non-null object
    dtypes: datetime64[ns](1), int64(1), object(1)
    memory usage: 32.0+ bytes

    # Add one frequency step (Timestamp + int is not supported in pandas 1.0+).
    df.loc[df.index.max() + df.index.freq] = None
    df.info()
    

    <class 'pandas.core.frame.DataFrame'>
    DatetimeIndex: 2 entries, 2111-11-11 00:00:00 to 2111-11-11 00:00:00.001000
    Data columns (total 3 columns):
    timestamp    1 non-null datetime64[ns]
    speed        1 non-null float64
    text         1 non-null object
    dtypes: datetime64[ns](1), float64(1), object(1)
    memory usage: 64.0+ bytes

    df.head()
    
                                                timestamp  speed        text
    2111-11-11 00:00:00.000 1970-01-01 00:00:00.012432423   23.0  text_field
    2111-11-11 00:00:00.001                           NaT    NaN         NaN
    
  • 2021-02-01 14:01

    Assuming df is your dataframe,

    df_prime = pd.concat([df, pd.DataFrame([[np.nan] * df.shape[1]], columns=df.columns)], ignore_index=True)
    

    where df_prime equals df with an additional last row of NaNs.

    Note that pd.concat is slow, so if you need this functionality in a loop, it is best to avoid it. In that case, assuming your index is incremental, you can use:

    df.loc[df.iloc[-1].name + 1,:] = np.nan
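
    If several empty rows are needed, another option is to build them all at once and call pd.concat a single time rather than once per iteration. A rough sketch, where append_empty_rows is a hypothetical helper name and the sample data is made up:

```python
import numpy as np
import pandas as pd


def append_empty_rows(df: pd.DataFrame, n: int) -> pd.DataFrame:
    """Hypothetical helper: return a copy of df with n all-NaN rows at the end."""
    # One frame holding all n empty rows, with the same columns as df.
    empty = pd.DataFrame(np.nan, index=range(n), columns=df.columns)
    # A single concat call; ignore_index renumbers the rows 0..len-1.
    return pd.concat([df, empty], ignore_index=True)


df = pd.DataFrame({"a": [1, 2], "b": [3.0, 4.0]})
df_prime = append_empty_rows(df, 3)
print(df_prime)
```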
    
  • 2021-02-01 14:04

    You can add a new series, and name it at the same time. The name will be the index of the new row, and all the values will automatically be NaN.

    df.append(pd.Series(name='Afterthought'))
    
  • 2021-02-01 14:07

    Append an "empty" row to a data frame and fill selected cells:

    Generate an empty data frame (no rows, just columns a and b):

    import pandas as pd
    col_names = ["a", "b"]
    df = pd.DataFrame(columns=col_names)
    

    Append an empty row at the end of the data frame:

    df = df.append(pd.Series(), ignore_index = True)
    

    Now fill the empty cell at the end (len(df)-1) of the data frame in column a:

    df.loc[[len(df)-1],'a'] = 123
    

    Result:

         a    b
    0  123  NaN
    

    And of course one can iterate over the rows and fill cells:

    col_names = ["a", "b"]
    df = pd.DataFrame(columns=col_names)
    for x in range(5):
        df = df.append(pd.Series(), ignore_index=True)
        df.loc[[len(df) - 1], 'a'] = 123
    

    Result:

         a    b
    0  123  NaN
    1  123  NaN
    2  123  NaN
    3  123  NaN
    4  123  NaN
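
    On pandas 2.0 and later, where DataFrame.append no longer exists, the same loop could be sketched by collecting the rows in a plain list and building the frame once at the end. This is an alternative technique, not the code above; the list name rows is an assumption:

```python
import pandas as pd

col_names = ["a", "b"]
rows = []  # one dict per row; keys that are left out become NaN

for x in range(5):
    # Only column "a" is filled, so "b" stays empty for this row.
    rows.append({"a": 123})

# Build the frame once, forcing both columns to exist.
df = pd.DataFrame(rows, columns=col_names)
print(df)
```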
    