Append an empty row in dataframe using pandas

前端未结


I am trying to append an empty row at the end of a DataFrame but have been unable to do so. I have also tried to understand how the pandas append function works, but I am still not getting it.

8 answers

轻奢々

2021-02-01 13:53
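The code below worked for me (note that DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so it needs an older version).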
```
df.append(pd.Series([np.nan]), ignore_index = True)
```

忘掉有多难

2021-02-01 14:00
Add a new pandas.Series using pandas.DataFrame.append().

If you wish to specify the name (AKA the "index") of the new row, use:
```
df.append(pandas.Series(name='NameOfNewRow'))
```
If you don't wish to name the new row, use:
```
df.append(pandas.Series(), ignore_index=True)
```
where df is your pandas.DataFrame.
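On pandas 2.0 and later, where DataFrame.append is no longer available, the same two cases can be approximated with pd.concat. The following is a minimal sketch rather than the answer's original method; the helper name append_empty_row and the sample frame are made up for illustration.

```python
import numpy as np
import pandas as pd


def append_empty_row(df, name=None):
    """Return a new frame with one all-NaN row added at the end.

    Hypothetical helper: with name=None the result is renumbered from 0
    (like ignore_index=True); otherwise name becomes the new row's label.
    """
    label = 0 if name is None else name
    empty = pd.DataFrame(np.nan, index=[label], columns=df.columns)
    return pd.concat([df, empty], ignore_index=name is None)


# Illustrative data, not taken from the answer above.
df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]}, index=["r1", "r2"])
named = append_empty_row(df, name="NameOfNewRow")
unnamed = append_empty_row(df)
```

Like DataFrame.append, this returns a new frame and leaves df itself unchanged, so the result has to be assigned to a name.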

我寻月下人不归

2021-02-01 14:00
Assuming your df.index is sorted, you can use:
```
df.loc[df.index.max() + 1] = None
```
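It handles different index and column types well.

[EDIT] This also works with a pd.DatetimeIndex that has a constant frequency (on pandas 1.0 and later, add df.index.freq instead of 1, because adding an integer to a Timestamp is no longer supported). Otherwise, the new index label must be specified exactly, e.g.: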
```
df.loc[df.index.max() + pd.Timedelta(milliseconds=1)] = None
```
A longer example:
```
df = pd.DataFrame([[pd.Timestamp(12432423), 23, 'text_field']],
                  columns=["timestamp", "speed", "text"],
                  # pd.DatetimeIndex(start=...) is not accepted by pandas >= 1.0; pd.date_range builds the same index
                  index=pd.date_range(start='2111-11-11', freq='ms', periods=1))
df.info()
```
    <class 'pandas.core.frame.DataFrame'>
    DatetimeIndex: 1 entries, 2111-11-11 to 2111-11-11
    Freq: L
    Data columns (total 3 columns):
    timestamp    1 non-null datetime64[ns]
    speed        1 non-null int64
    text         1 non-null object
    dtypes: datetime64[ns](1), int64(1), object(1)
    memory usage: 32.0+ bytes
```
# On pandas < 1.0 this was written as df.index.max() + 1
df.loc[df.index.max() + df.index.freq] = None
df.info()
```
    <class 'pandas.core.frame.DataFrame'>
    DatetimeIndex: 2 entries, 2111-11-11 00:00:00 to 2111-11-11 00:00:00.001000
    Data columns (total 3 columns):
    timestamp    1 non-null datetime64[ns]
    speed        1 non-null float64
    text         1 non-null object
    dtypes: datetime64[ns](1), float64(1), object(1)
    memory usage: 64.0+ bytes
```
df.head()

                                            timestamp  speed        text
2111-11-11 00:00:00.000 1970-01-01 00:00:00.012432423   23.0  text_field
2111-11-11 00:00:00.001                           NaT    NaN         NaN
```
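The dtype change visible in the df.info() output (speed goes from int64 to float64) is a general consequence of adding an all-missing row, since NaN cannot be stored in a plain integer column. A small sketch on an integer index, using made-up data rather than the frame above, shows the same effect:

```python
import pandas as pd

# Illustrative frame with a default integer index (not the answer's data).
df = pd.DataFrame({"speed": [23, 42], "text": ["x", "y"]})
dtype_before = df["speed"].dtype

# Assigning to a label that does not exist yet enlarges the frame by one row.
df.loc[df.index.max() + 1] = None
dtype_after = df["speed"].dtype

print(dtype_before, "->", dtype_after)
print(df)
```

If the integer dtype matters, the column has to be converted back after the empty row has been filled in.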

半阙折子戏

2021-02-01 14:01
Assuming df is your dataframe,
```
df_prime = pd.concat([df, pd.DataFrame([[np.nan] * df.shape[1]], columns=df.columns)], ignore_index=True)
```
where df_prime equals df with an additional last row of NaNs.

Note that pd.concat returns a new DataFrame on each call, which makes it slow inside a loop, so it's best to avoid it there. In that case, assuming your index is an incrementing integer index, you can use:
```
df.loc[df.iloc[-1].name + 1,:] = np.nan
```
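When several empty rows are needed, one alternative to growing the frame inside a loop is to create them all in a single call. The sketch below assumes the default 0..n-1 integer index (the "incremental" index mentioned above); n_extra and the sample data are illustrative only.

```python
import pandas as pd

# Illustrative frame with the default RangeIndex 0..1.
df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

n_extra = 3  # hypothetical number of empty rows to add

# Reindexing onto a longer RangeIndex keeps the existing rows and
# creates the missing labels as all-NaN rows in one step.
df_padded = df.reindex(pd.RangeIndex(len(df) + n_extra))
print(df_padded)
```

This relies on the existing labels being exactly 0..len(df)-1; with any other index, the new labels would have to be listed explicitly.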

悲哀的现实

2021-02-01 14:04
You can add a new series, and name it at the same time. The name will be the index of the new row, and all the values will automatically be NaN.
```
df.append(pd.Series(name='Afterthought'))
```

孤独总比滥情好

2021-02-01 14:07

Append an "empty" row to a data frame and fill selected cells:

Generate an empty data frame (no rows, just columns a and b):

```
import pandas as pd
col_names = ["a", "b"]
df = pd.DataFrame(columns=col_names)
```

Append an empty row at the end of the data frame:

```
df = df.append(pd.Series(), ignore_index=True)
```

Now fill the empty cell at the end (len(df)-1) of the data frame in column a:

```
df.loc[[len(df)-1], 'a'] = 123
```

Result:

```
     a    b
0  123  NaN
```

And of course one can iterate over the rows and fill cells:

```
col_names = ["a", "b"]
df = pd.DataFrame(columns=col_names)
for x in range(0, 5):
    df = df.append(pd.Series(), ignore_index=True)
    df.loc[[len(df)-1], 'a'] = 123
```

Result:

```
     a    b
0  123  NaN
1  123  NaN
2  123  NaN
3  123  NaN
4  123  NaN
```
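If the values are already known while looping, a common alternative is to collect the rows first and build the frame once at the end, which avoids calling append repeatedly. This is a sketch of that pattern aimed at the same result as above, not the answer's own code:

```python
import pandas as pd

col_names = ["a", "b"]

# Collect one dict per row; keys that are left out become NaN.
rows = []
for x in range(0, 5):
    rows.append({"a": 123})

# Build the frame in one step, with the column order fixed by col_names.
df = pd.DataFrame(rows, columns=col_names)
print(df)
```

Because the frame is created only once, this version does not depend on DataFrame.append and also runs on pandas 2.0 and later.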
