statsmodels

Statsmodels score

只愿长相守 submitted on 2019-12-25 00:35:09
Question: I am running a logistic regression using statsmodels and am trying to find the score of my regression. The documentation doesn't really provide much information about the score method, unlike sklearn, which allows the user to pass a test dataset with the y values and the regression coefficients, i.e. lr.score(test_data, target). Which parameters should I pass to statsmodels' score function, and how? Documentation: http://statsmodels.sourceforge.net/stable/generated/statsmodels.discrete

NaN in data frame: when first observation of time series is NaN, frontfill with first available, otherwise carry over last / previous observation

孤街醉人 submitted on 2019-12-24 17:33:56
Question: I am performing an ADF test with statsmodels. The value series can have missing observations. In fact, I drop the analysis if the fraction of NaNs is larger than c. However, if the series passes that check, I run into the problem that adfuller cannot deal with missing data. Since this is training data with a minimum frame size, I would like to do the following: 1) if x(t=0) = NaN, then find the next non-NaN value (t>0); 2) otherwise, if x(t) = NaN, then x(t) = x(t-1). So I am compromising here my
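
The two rules in the question can be sketched with pandas alone: a forward fill applies rule 2, and a backward fill afterwards only touches the leading NaNs that the forward fill could not reach, which is rule 1. The helper name `fill_for_adf` and the sample series are hypothetical.

```python
import numpy as np
import pandas as pd


def fill_for_adf(series: pd.Series) -> pd.Series:
    """Carry the last observation forward, then back-fill any leading NaNs."""
    # ffill: x(t) = x(t-1) wherever x(t) is NaN and an earlier value exists.
    # bfill: only leading NaNs remain after ffill, so they take the first
    # available observation.
    return series.ffill().bfill()


x = pd.Series([np.nan, np.nan, 1.0, np.nan, 3.0, np.nan])
filled = fill_for_adf(x)
print(filled.tolist())
```

The order matters: calling `bfill` first would fill interior gaps with the next observation instead of the previous one.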

Evaluate slope and error for specific category for statsmodels ols fit

谁说胖子不能爱 submitted on 2019-12-24 15:08:58
Question: I have a dataframe df with the following fields: weight, length, and animal. The first two are continuous variables, while animal is a categorical variable with the values cat, dog, and snake. I'd like to estimate a relationship between weight and length, but this needs to be conditioned on the type of animal, so I interact the length variable with the animal categorical variable. model = ols(formula='weight ~ length * animal', data=df) results = model.fit() How can I programmatically
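
With treatment coding, the slope for a non-reference animal is the `length` coefficient plus the matching interaction coefficient, and its standard error follows from a linear contrast on the fitted parameters. The sketch below is one way to compute both with `results.t_test`. The synthetic data and the helper `category_slope` are hypothetical, and the interaction column is located by substring match because its exact name depends on the formula backend.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

rng = np.random.default_rng(0)

# Synthetic data (an assumption): each animal has its own slope.
true_slopes = {"cat": 1.0, "dog": 2.0, "snake": 0.5}
animal = np.repeat(list(true_slopes), 50)
length = rng.uniform(1.0, 10.0, size=animal.size)
slope = np.array([true_slopes[a] for a in animal])
weight = 3.0 + slope * length + rng.normal(0.0, 0.1, size=animal.size)
df = pd.DataFrame({"weight": weight, "length": length, "animal": animal})

results = ols(formula="weight ~ length * animal", data=df).fit()


def category_slope(results, level):
    """Return (slope, standard error) of length for one animal level."""
    names = list(results.params.index)
    contrast = np.zeros(len(names))
    contrast[names.index("length")] = 1.0
    # The reference level (here "cat") has no interaction column,
    # so its slope is the plain length coefficient.
    for i, name in enumerate(names):
        if "length" in name and f"T.{level}]" in name:
            contrast[i] = 1.0
    test = results.t_test(contrast)
    return float(np.squeeze(test.effect)), float(np.squeeze(test.sd))


dog_slope, dog_se = category_slope(results, "dog")
cat_slope, cat_se = category_slope(results, "cat")
print("dog:", dog_slope, dog_se)
print("cat:", cat_slope, cat_se)
```

Using a contrast rather than adding standard errors by hand accounts for the covariance between the two coefficients.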

Why I got 'The computed initial AR coefficients are not stationary' while using aic_min_order?

柔情痞子 submitted on 2019-12-24 10:54:08
Question: I generate some data like [1, 6, 1, 6, 1, 6] and add normally distributed noise. I use arma_order_select_ic to select the order, then use aic_min_order to fit the ARMA model. Sometimes the model works well, but sometimes it raises a ValueError. ValueError: The computed initial AR coefficients are not stationary Here is my code. import statsmodels.api as sm import numpy as np x = [1 if i%2 == 0 else 6 for i in range(50)] eta = np.random.normal(0, 0.01, 50) x = x + eta res = sm.tsa.stattools

r in stats.linregress compared to r-squared in statsmodels

社会主义新天地 submitted on 2019-12-24 09:26:08
Question: I'm working on a program to investigate the correlation between magnitude and redshift for some quasars, and I'm using statsmodels and scipy.stats.linregress to compute the statistics of the data: statsmodels to compute r-squared (among other parameters), and stats.linregress to compute r (among others). Some example output is: W1 r-squared: 0.855715 W1 r-value : 0.414026 W2 r-squared: 0.861169 W2 r-value : 0.517381 W3 r-squared: 0.874051 W3 r-value : 0.418523 W4 r-squared: 0.856747 W4 r
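
For a simple regression that includes an intercept, R-squared equals the square of the correlation coefficient r, so the figures in the excerpt (for example 0.855715 against 0.414026 squared, roughly 0.171) cannot both come from the same model. The question's code is truncated, so the cause is unknown; one possibility, stated here as an assumption, is that `sm.OLS` was called without `sm.add_constant`, in which case no intercept is fitted and statsmodels reports an uncentered R-squared. The NumPy-only sketch below reproduces that kind of gap on synthetic stand-in data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumptions): x plays redshift, y plays magnitude.
x = rng.uniform(0.0, 5.0, size=200)
y = 18.0 + 0.3 * x + rng.normal(0.0, 1.0, size=200)

# Pearson correlation, the quantity linregress reports as rvalue.
r = np.corrcoef(x, y)[0, 1]

# Regression with an intercept: centered R-squared.
X1 = np.column_stack([np.ones_like(x), x])
beta1, *_ = np.linalg.lstsq(X1, y, rcond=None)
resid1 = y - X1 @ beta1
r2_centered = 1.0 - (resid1 @ resid1) / np.sum((y - y.mean()) ** 2)

# Regression through the origin: uncentered R-squared,
# measured against sum(y**2) rather than the variance of y.
X0 = x[:, None]
beta0, *_ = np.linalg.lstsq(X0, y, rcond=None)
resid0 = y - X0 @ beta0
r2_uncentered = 1.0 - (resid0 @ resid0) / (y @ y)

print("r squared:", r ** 2)
print("centered R-squared:", r2_centered)
print("uncentered R-squared:", r2_uncentered)
```

When y has a mean far from zero, as magnitudes do, the uncentered value can be large even though the linear relationship is weak.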

python statsmodels linear regression

早过忘川 submitted on 2019-12-24 06:23:14
Question: I am attempting to build a linear regression model based on pre-project data and ultimately to calculate some modeled data so that I can compare pre- and post-project data... Can anyone tell me what the best practice is, or whether I may be off in the weeds somewhere... For starters: import statsmodels.api as sm import numpy as np import pandas as pd import matplotlib.pyplot as plt ng = pd.read_csv('C:/Users/ngDataBaseline.csv', thousands=',', index_col='Date', parse_dates=True) ng.head() This

Python error: len() of unsized object while using statsmodels with one row of data

删除回忆录丶 submitted on 2019-12-24 02:22:54
Question: I'm able to use statsmodels' WLS (weighted least squares regression) fine when I have lots of data points. However, I seem to have a problem with the numpy arrays when I try to use WLS for a single sample from the dataset. What I mean is: if I have a dataset X, which is a 2D array with lots of rows, WLS works fine, but not if I try to run it on a single row. You'll see what I mean in the code below: import sys from sklearn.externals.six.moves import xrange from sklearn.metrics import
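
The question's code is cut off, so the exact cause cannot be confirmed. A common source of "len() of unsized object" is that integer indexing drops a dimension: one row of a 2D array becomes 1D, and one element of a 1D target becomes a 0-d value with no length. The NumPy-only sketch below shows that behaviour and the slice-based alternative that keeps the dimensions; the arrays are hypothetical.

```python
import numpy as np

X = np.arange(12.0).reshape(4, 3)   # 4 observations, 3 features
y = np.arange(4.0)                  # 4 targets

row = X[0]                   # shape (3,): the row axis is gone
target = np.asarray(y[0])    # shape (): a 0-d array

try:
    len(target)
    message = ""
except TypeError as exc:
    # A 0-d array has no length, which is what the error message reports.
    message = str(exc)

# Slicing keeps the observation axis, so the shapes stay (n_obs, n_features)
# and (n_obs,) even when n_obs is 1.
row_2d = X[0:1]
target_1d = y[0:1]

print(row.shape, target.shape, message)
print(row_2d.shape, target_1d.shape)
```

Whether a regression on a single observation is statistically meaningful is a separate matter from getting the shapes right.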

Python Statsmodels: OLS regressor not predicting

99封情书 submitted on 2019-12-24 02:19:14
Question: I wrote the following piece of code, but I just cannot get the 'predict' method to work: import statsmodels.api as sm from statsmodels.formula.api import ols ols_model = ols('Consumption ~ Disposable_Income', df).fit() My 'df' is a pandas dataframe with the column headings 'Consumption' and 'Disposable_Income'. When I run, for example, ols_model.predict([1000.0]) I get: "TypeError: list indices must be integers, not str" When I run, for example, ols_model.predict(df['Disposable_Income'].values) I
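
A model fitted through the formula interface looks up predictors by column name, so `predict` needs data with a `Disposable_Income` column rather than a bare list or array. The sketch below passes a one-row DataFrame; the synthetic data and its coefficients are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

rng = np.random.default_rng(0)

# Synthetic data (an assumption) using the column names from the question.
income = rng.uniform(500.0, 2000.0, size=100)
df = pd.DataFrame({
    "Disposable_Income": income,
    "Consumption": 50.0 + 0.8 * income + rng.normal(0.0, 5.0, size=100),
})

ols_model = ols("Consumption ~ Disposable_Income", df).fit()

# New data must carry the same column name used in the formula.
new_data = pd.DataFrame({"Disposable_Income": [1000.0]})
prediction = ols_model.predict(new_data)

print(prediction.iloc[0])
```

The intercept is added automatically by the formula, so the new data only needs the predictor column.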