问题
New to pandas development. How do I forward fill a DataFrame with the value contained in one previously seen column?
Self-contained example:
import pandas as pd
import numpy as np
O = [1, np.nan, 5, np.nan]
H = [5, np.nan, 5, np.nan]
L = [1, np.nan, 2, np.nan]
C = [5, np.nan, 2, np.nan]
timestamps = ["2017-07-23 03:13:00", "2017-07-23 03:14:00", "2017-07-23 03:15:00", "2017-07-23 03:16:00"]
dict = {'Open': O, 'High': H, 'Low': L, 'Close': C}
df = pd.DataFrame(index=timestamps, data=dict)
ohlc = df[['Open', 'High', 'Low', 'Close']]
This yields the following DataFrame:
print(ohlc)
Open High Low Close
2017-07-23 03:13:00 1.0 5.0 1.0 5.0
2017-07-23 03:14:00 NaN NaN NaN NaN
2017-07-23 03:15:00 5.0 5.0 2.0 2.0
2017-07-23 03:16:00 NaN NaN NaN NaN
I want to go from that last DataFrame to something like this:
Open High Low Close
2017-07-23 03:13:00 1.0 5.0 1.0 5.0
2017-07-23 03:14:00 5.0 5.0 5.0 5.0
2017-07-23 03:15:00 5.0 5.0 2.0 2.0
2017-07-23 03:16:00 2.0 2.0 2.0 2.0
So that the previously-seen value in 'Close' forward fills entire rows until there's a new populated row seen. It's simple enough to fill column 'Close' like so:
column2fill = 'Close'
ohlc[column2fill] = ohlc[column2fill].ffill()
print(ohlc)
Open High Low Close
2017-07-23 03:13:00 1.0 5.0 1.0 5.0
2017-07-23 03:14:00 NaN NaN NaN 5.0
2017-07-23 03:15:00 5.0 5.0 2.0 2.0
2017-07-23 03:16:00 NaN NaN NaN 2.0
But is there a way to fill across the 03:14:00 and 03:16:00 rows with the 'Close' value of those rows? And is there a way to do it in one step using one forward fill instead of filling the 'Close' column first?
回答1:
It seems you need assign with ffill
and then bfill
per row by axis=1
, but necessary full NaN
s rows:
df = ohlc.assign(Close=ohlc['Close'].ffill()).bfill(axis=1)
print (df)
Open High Low Close
2017-07-23 03:13:00 1.0 5.0 1.0 5.0
2017-07-23 03:14:00 5.0 5.0 5.0 5.0
2017-07-23 03:15:00 5.0 5.0 2.0 2.0
2017-07-23 03:16:00 2.0 2.0 2.0 2.0
What is same as:
ohlc['Close'] = ohlc['Close'].ffill()
df = ohlc.bfill(axis=1)
print (df)
Open High Low Close
2017-07-23 03:13:00 1.0 5.0 1.0 5.0
2017-07-23 03:14:00 5.0 5.0 5.0 5.0
2017-07-23 03:15:00 5.0 5.0 2.0 2.0
2017-07-23 03:16:00 2.0 2.0 2.0 2.0
来源:https://stackoverflow.com/questions/45262040/python-pandas-forward-filling-entire-rows-with-value-of-one-previous-column