Python: Filling Missing Values with Interpolation
# Import the libraries.
import pandas as pd
import numpy as np
# Create the dates (month-end frequency; pandas 2.2+ spells this alias "ME").
time_index = pd.date_range("01/01/2010", periods=5, freq="M")
# Create a DataFrame and use the dates as its index
dataframe = pd.DataFrame(index=time_index)
# Create a feature with missing values
dataframe["Sales"] = [1.0,2.0,np.nan,np.nan,5.0]
dataframe
Sales
2010-01-31 1.0
2010-02-28 2.0
2010-03-31 NaN
2010-04-30 NaN
2010-05-31 5.0
Interpolation
# Interpolate the missing values (linear by default)
dataframe.interpolate()
Sales
2010-01-31 1.0
2010-02-28 2.0
2010-03-31 3.0
2010-04-30 4.0
2010-05-31 5.0
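The default method, `linear`, treats the rows as equally spaced and ignores the dates in the index, which is why the gaps above are filled with exactly 3.0 and 4.0. As a minimal sketch, assuming the spacing between dates should matter, `method="time"` weights by the actual time gaps instead. The names `sales`, `by_position`, and `by_time` are illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Same month-end dates as above, written out explicitly.
dates = pd.to_datetime(
    ["2010-01-31", "2010-02-28", "2010-03-31", "2010-04-30", "2010-05-31"]
)
sales = pd.Series([1.0, 2.0, np.nan, np.nan, 5.0], index=dates, name="Sales")

# Default linear interpolation: rows are assumed to be evenly spaced.
by_position = sales.interpolate()

# Time-weighted interpolation: uses the time elapsed between index entries.
by_time = sales.interpolate(method="time")

print(pd.DataFrame({"linear": by_position, "time": by_time}))
```

Because February is shorter than the other months, the time-weighted values come out slightly different from 3.0 and 4.0.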
Forward fill
# Forward fill: replace each missing value with the last observed value before it
dataframe.ffill()
Sales
2010-01-31 1.0
2010-02-28 2.0
2010-03-31 2.0
2010-04-30 2.0
2010-05-31 5.0
Backward fill
# Backward fill: replace each missing value with the next observed value after it
dataframe.bfill()
Sales
2010-01-31 1.0
2010-02-28 2.0
2010-03-31 5.0
2010-04-30 5.0
2010-05-31 5.0
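Forward fill cannot fill a gap at the very start of the data, and backward fill cannot fill one at the very end, because there is no neighbouring value to copy from. The sketch below uses a made-up series (`values` is hypothetical, not the Sales data above) to show this, and chains the two calls as one possible way to cover both edges.

```python
import numpy as np
import pandas as pd

# Hypothetical series with a missing value at each end and one in the middle.
values = pd.Series([np.nan, 1.0, np.nan, 3.0, np.nan])

# Forward fill leaves the leading NaN untouched.
forward = values.ffill()

# Backward fill leaves the trailing NaN untouched.
backward = values.bfill()

# Forward fill first, then backward fill whatever is still missing.
combined = values.ffill().bfill()

print(pd.DataFrame({"ffill": forward, "bfill": backward, "both": combined}))
```

Whether copying values across the edges is acceptable depends on the data. For a time series, a backward fill uses information from the future.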
Quadratic interpolation
# method="quadratic" performs quadratic interpolation (requires SciPy); try it when the data are nonlinear
dataframe.interpolate(method="quadratic")
Sales
2010-01-31 1.000000
2010-02-28 2.000000
2010-03-31 3.059808
2010-04-30 4.038069
2010-05-31 5.000000
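The quadratic results are not round numbers because this method uses the actual dates in the index, and the month ends are not evenly spaced. As a rough check, assuming days since the first date as the x-axis, the sketch below fits a parabola through the three known points with NumPy and evaluates it at the missing dates. This is not how pandas computes the result internally (it delegates to SciPy). The two agree here only because exactly three values are known. The names `days`, `known`, and `filled` are illustrative.

```python
import numpy as np
import pandas as pd

dates = pd.to_datetime(
    ["2010-01-31", "2010-02-28", "2010-03-31", "2010-04-30", "2010-05-31"]
)
sales = pd.Series([1.0, 2.0, np.nan, np.nan, 5.0], index=dates, name="Sales")

# Use days elapsed since the first date as the x-axis: 0, 28, 59, 89, 120.
days = (dates - dates[0]).days.to_numpy(dtype=float)
known = sales.notna().to_numpy()

# Fit the unique parabola that passes through the three known points.
coeffs = np.polyfit(days[known], sales.to_numpy()[known], deg=2)

# Evaluate the parabola only at the dates where Sales is missing.
filled = sales.copy()
filled.loc[~known] = np.polyval(coeffs, days[~known])

print(filled)
```

The filled values match the output above to about six decimal places (3.059808 and 4.038069).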
# Limit the number of consecutive missing values to fill
dataframe.interpolate(limit=1, limit_direction="forward")
Sales
2010-01-31 1.0
2010-02-28 2.0
2010-03-31 3.0
2010-04-30 NaN
2010-05-31 5.0
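With `limit=1` and `limit_direction="forward"`, only the first missing value after an observed one is filled. The sketch below shows the other two values accepted by `limit_direction`, using the same Sales data. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

dates = pd.to_datetime(
    ["2010-01-31", "2010-02-28", "2010-03-31", "2010-04-30", "2010-05-31"]
)
sales = pd.Series([1.0, 2.0, np.nan, np.nan, 5.0], index=dates, name="Sales")

# Fill at most one missing value, working backward from the next observed value.
backward = sales.interpolate(limit=1, limit_direction="backward")

# Fill at most one missing value in each direction around the gap.
both = sales.interpolate(limit=1, limit_direction="both")

print(pd.DataFrame({"backward": backward, "both": both}))
```

In the backward case the March value stays missing and April is filled. With `"both"`, the two-value gap is filled completely, because each missing value is within one step of an observed value.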
Source: CSDN
Author: 御剑归一
Link: https://blog.csdn.net/wj1298250240/article/details/103774463