New to pandas development. How do I forward fill a DataFrame with the value contained in one previously seen column?
Self-contained example:
import pandas as pd
import numpy as np
O = [1, np.nan, 5, np.nan]
H = [5, np.nan, 5, np.nan]
L = [1, np.nan, 2, np.nan]
C = [5, np.nan, 2, np.nan]
timestamps = ["2017-07-23 03:13:00", "2017-07-23 03:14:00", "2017-07-23 03:15:00", "2017-07-23 03:16:00"]
dict = {'Open': O, 'High': H, 'Low': L, 'Close': C}
df = pd.DataFrame(index=timestamps, data=dict)
ohlc = df[['Open', 'High', 'Low', 'Close']]
This yields the following DataFrame:
print(ohlc)
Open High Low Close
2017-07-23 03:13:00 1.0 5.0 1.0 5.0
2017-07-23 03:14:00 NaN NaN NaN NaN
2017-07-23 03:15:00 5.0 5.0 2.0 2.0
2017-07-23 03:16:00 NaN NaN NaN NaN
I want to go from that last DataFrame to something like this:
Open High Low Close
2017-07-23 03:13:00 1.0 5.0 1.0 5.0
2017-07-23 03:14:00 5.0 5.0 5.0 5.0
2017-07-23 03:15:00 5.0 5.0 2.0 2.0
2017-07-23 03:16:00 2.0 2.0 2.0 2.0
So that the previously-seen value in 'Close' forward fills entire rows until there's a new populated row seen. It's simple enough to fill column 'Close' like so:
column2fill = 'Close'
ohlc[column2fill] = ohlc[column2fill].ffill()
print(ohlc)
Open High Low Close
2017-07-23 03:13:00 1.0 5.0 1.0 5.0
2017-07-23 03:14:00 NaN NaN NaN 5.0
2017-07-23 03:15:00 5.0 5.0 2.0 2.0
2017-07-23 03:16:00 NaN NaN NaN 2.0
But is there a way to fill across the 03:14:00 and 03:16:00 rows with the 'Close' value of those rows? And is there a way to do it in one step using one forward fill instead of filling the 'Close' column first?
It seems you need assign
with ffill
and then bfill
per row by axis=1
, but necessary full NaN
s rows:
df = ohlc.assign(Close=ohlc['Close'].ffill()).bfill(axis=1)
print (df)
Open High Low Close
2017-07-23 03:13:00 1.0 5.0 1.0 5.0
2017-07-23 03:14:00 5.0 5.0 5.0 5.0
2017-07-23 03:15:00 5.0 5.0 2.0 2.0
2017-07-23 03:16:00 2.0 2.0 2.0 2.0
What is same as:
ohlc['Close'] = ohlc['Close'].ffill()
df = ohlc.bfill(axis=1)
print (df)
Open High Low Close
2017-07-23 03:13:00 1.0 5.0 1.0 5.0
2017-07-23 03:14:00 5.0 5.0 5.0 5.0
2017-07-23 03:15:00 5.0 5.0 2.0 2.0
2017-07-23 03:16:00 2.0 2.0 2.0 2.0
来源:https://stackoverflow.com/questions/45262040/python-pandas-forward-filling-entire-rows-with-value-of-one-previous-column