Subset data points outside confidence interval

心不动则不痛 submitted on 2020-01-05 05:53:04

Question


Using the same example as in this previous question (code pasted below), we can get the 95% CI with the summary_table function from statsmodels.stats.outliers_influence. How can I then subset only the data points (x and y) that fall outside the confidence interval?

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import summary_table

# Generate synthetic measurements
n = 100
x = np.linspace(0, 10, n)
e = np.random.normal(size=n)
y = 1 + 0.5*x + 2*e
X = sm.add_constant(x)
re = sm.OLS(y, X).fit()
st, data, ss2 = summary_table(re, alpha=0.05)
predict_ci_low, predict_ci_upp = data[:, 6:8].T

Answer 1:


It might be a bit late for this, but you could put the data in a pandas.DataFrame and filter it with a boolean mask. Assuming I understood your question correctly:

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import summary_table
import matplotlib.pyplot as plot

## Import pandas
import pandas as pd

# Generate synthetic measurements
n = 100
x = np.linspace(0, 10, n)
e = np.random.normal(size=n)
y = 1 + 0.5*x + 2*e
X = sm.add_constant(x)
re = sm.OLS(y, X).fit()
st, data, ss2 = summary_table(re, alpha=0.05)

# Make prediction
prediction = re.predict(X)
predict_ci_low, predict_ci_upp = data[:, 6:8].T

# Put y and x in a pd.DataFrame
df = pd.DataFrame(y).set_index(x)

# Get the y values that are out of the ci intervals. This could be done directly in the df indexer
out_up = y > predict_ci_upp
out_down = y < predict_ci_low

# Plot everything
plot.plot(x, y, label = 'train')
plot.plot(df[out_up], marker = 'o', linewidth = 0)
plot.plot(df[out_down], marker = 'o', linewidth = 0)
plot.plot(x, prediction, label = 'prediction')
plot.plot(x, predict_ci_upp, label = 'ci_up')
plot.plot(x, predict_ci_low, label = 'ci_low')
plot.legend(loc='best')

Here is the resulting plot: [image not preserved]
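
If the goal is only to get the x and y values outside the interval, without plotting, the two boolean masks can be combined and used to index both arrays directly. Below is a minimal sketch using NumPy only. The helper name `subset_outside` and the toy arrays are hypothetical, chosen for illustration; they stand in for x, y, predict_ci_low and predict_ci_upp from the code above and are not part of statsmodels.

```python
import numpy as np

def subset_outside(x, y, low, upp):
    """Return the (x, y) pairs whose y lies strictly outside [low, upp].

    Hypothetical helper for illustration; all inputs must have the same length.
    """
    x, y, low, upp = map(np.asarray, (x, y, low, upp))
    outside = (y < low) | (y > upp)  # element-wise OR of the two masks
    return x[outside], y[outside]

# Toy values standing in for x, y, predict_ci_low, predict_ci_upp
x_demo = np.array([0.0, 1.0, 2.0, 3.0])
y_demo = np.array([1.0, 5.0, 2.0, -3.0])
low_demo = np.array([0.0, 0.0, 0.0, 0.0])
upp_demo = np.array([4.0, 4.0, 4.0, 4.0])

x_out, y_out = subset_outside(x_demo, y_demo, low_demo, upp_demo)
print(x_out, y_out)
```

With the variables from the answer, the equivalent call would be subset_outside(x, y, predict_ci_low, predict_ci_upp). The same combined mask should also work in the DataFrame indexer, as in df[out_up | out_down].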



Source: https://stackoverflow.com/questions/50585837/subset-data-points-outside-confidence-interval
