statsmodels | 易学教程

Statsmodels score

Question: I am running a logistic regression using statsmodels and am trying to find the score of my regression. The documentation doesn't say much about the score method, unlike sklearn, which lets the user pass a test dataset with the y values and the regression coefficients, i.e. lr.score(test_data, target). What parameters should I pass to statsmodels' score function, and how? Documentation: http://statsmodels.sourceforge.net/stable/generated/statsmodels.discrete

NaN in data frame: when first observation of time series is NaN, frontfill with first available, otherwise carry over last / previous observation

Question: I am performing an ADF test with statsmodels. The value series can have missing observations. In fact, I drop the analysis if the fraction of NaNs is larger than c. However, if the series makes it through, I run into the problem that adfuller cannot deal with missing data. Since this is training data with a minimum frame size, I would like to do the following: 1) if x(t=0) = NaN, then find the next non-NaN value (t>0); 2) otherwise, if x(t) = NaN, then x(t) = x(t-1). So I am compromising here my
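The two rules above can be sketched with pandas, assuming the data is held in a Series (the sample values are made up). A forward fill carries the previous observation over every interior gap, and a following backward fill only touches the leading NaNs that the forward fill could not reach:

```python
import numpy as np
import pandas as pd

# Made-up series with leading, interior, and trailing NaNs.
s = pd.Series([np.nan, np.nan, 1.0, np.nan, 3.0, np.nan])

# Rule 2: carry the previous observation forward.
# Rule 1: fill any remaining leading NaNs with the first available value.
filled = s.ffill().bfill()
```

The same chain works column by column on a DataFrame, since `ffill` and `bfill` operate along the index by default.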

Evaluate slope and error for specific category for statsmodels ols fit

Question: I have a dataframe df with the following fields: weight, length, and animal. The first two are continuous variables, while animal is a categorical variable with the values cat, dog, and snake. I'd like to estimate a relationship between weight and length, but this needs to be conditioned on the type of animal, so I interact the length variable with the animal categorical variable:
    model = ols(formula='weight ~ length * animal', data=df)
    results = model.fit()
How can I programmatically
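One possible approach, sketched on synthetic data: with treatment coding, the slope for a non-baseline animal is the `length` coefficient plus the matching interaction coefficient, and its standard error follows from the coefficient covariance matrix. The helper `slope_and_se`, the synthetic values, and the assumption that `cat` is the baseline level are all illustrative, not taken from the question.

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

# Synthetic data, made up for illustration.
rng = np.random.default_rng(1)
animals = np.repeat(["cat", "dog", "snake"], 30)
length = rng.uniform(10, 100, size=animals.size)
true_slope = {"cat": 0.05, "dog": 0.30, "snake": 0.02}
weight = (np.array([true_slope[a] for a in animals]) * length
          + rng.normal(0, 0.5, size=animals.size))
df = pd.DataFrame({"weight": weight, "length": length, "animal": animals})

results = ols(formula="weight ~ length * animal", data=df).fit()


def slope_and_se(results, animal, baseline="cat"):
    """Hypothetical helper: slope of weight on length for one animal.

    Builds a contrast vector c so that slope = c @ params and
    se = sqrt(c @ cov @ c). Assumes `baseline` is the reference level.
    """
    names = list(results.params.index)
    contrast = np.zeros(len(names))
    contrast[names.index("length")] = 1.0
    if animal != baseline:
        # Find the interaction term without assuming the factor order.
        wanted = {"length", f"animal[T.{animal}]"}
        match = [n for n in names if set(n.split(":")) == wanted]
        contrast[names.index(match[0])] = 1.0
    slope = float(contrast @ results.params.values)
    se = float(np.sqrt(contrast @ results.cov_params().values @ contrast))
    return slope, se


dog_slope, dog_se = slope_and_se(results, "dog")
cat_slope, cat_se = slope_and_se(results, "cat")
```

Because the model interacts both the intercept and the slope with `animal`, each per-animal slope matches what a separate regression on that animal's rows would give; the standard errors differ because the pooled model shares one residual variance.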

Why do I get 'The computed initial AR coefficients are not stationary' when using aic_min_order?

Question: I generate data like [1, 6, 1, 6, 1, 6] and add normally distributed noise. I use arma_order_select_ic to select the order, and then use aic_min_order to fit the ARMA model. Sometimes the model works well, but sometimes it raises a ValueError:
    ValueError: The computed initial AR coefficients are not stationary
Here is my code:
    import statsmodels.api as sm
    import numpy as np
    x = [1 if i%2 == 0 else 6 for i in range(50)]
    eta = np.random.normal(0, 0.01, 50)
    x = x + eta
    res = sm.tsa.stattools
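One plausible contributing factor, offered as an assumption rather than a confirmed diagnosis: a series that alternates almost perfectly has a lag-1 autocorrelation close to -1, which puts an AR(1) coefficient right at the edge of the stationary region |phi| < 1. The NumPy-only sketch below rebuilds the data from the question (with a fixed seed added for reproducibility) and measures that autocorrelation:

```python
import numpy as np

# Same construction as in the question, with a seeded generator (an addition).
rng = np.random.default_rng(0)
x = np.array([1 if i % 2 == 0 else 6 for i in range(50)], dtype=float)
x = x + rng.normal(0, 0.01, 50)

# Sample lag-1 autocorrelation of the demeaned series.
d = x - x.mean()
lag1_autocorr = float(np.sum(d[:-1] * d[1:]) / np.sum(d * d))
```

With noise this small, `lag1_autocorr` sits very near -1, so start values computed from the data can easily land on or beyond the stationarity boundary.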

r in stats.linregress compared to r-squared in statsmodels

Question: I'm working on a program to investigate the correlation between magnitude and redshift for some quasars, and I'm using statsmodels and scipy.stats.linregress to compute the statistics of the data: statsmodels to compute r-squared (among other parameters), and stats.linregress to compute r (among others). Some example output is:
    W1 r-squared: 0.855715
    W1 r-value : 0.414026
    W2 r-squared: 0.861169
    W2 r-value : 0.517381
    W3 r-squared: 0.874051
    W3 r-value : 0.418523
    W4 r-squared: 0.856747
    W4 r

python statsmodels linear regression

Question: I am attempting to build a linear regression model based on pre-project data, and ultimately to calculate some modeled data so that I can compare pre- and post-project data... Can anyone tell me what the best practice is? Otherwise I may be off in the weeds somewhere... For starters:
    import statsmodels.api as sm
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    ng = pd.read_csv('C:/Users/ngDataBaseline.csv', thousands=',', index_col='Date', parse_dates=True)
    ng.head()
This

Python error: len() of unsized object while using statsmodels with one row of data

Question: I'm able to use statsmodels' WLS (weighted least squares regression) fine when I have lots of data points. However, I seem to have a problem with the numpy arrays when I try to use WLS on a single sample from the dataset. What I mean is: if I have a dataset X that is a 2D array with lots of rows, WLS works fine, but not if I try to run it on a single row. You'll see what I mean in the code below:
    import sys
    from sklearn.externals.six.moves import xrange
    from sklearn.metrics import
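The question's code is cut off, so the following is only a guess at the mechanism: picking a single row or a single weight with integer indexing drops a dimension, and calling `len()` on a zero-dimensional array raises exactly this error. A NumPy-only sketch with made-up arrays:

```python
import numpy as np

# Made-up design matrix and weights.
X = np.arange(12, dtype=float).reshape(4, 3)
w = np.array([1.0, 2.0, 3.0, 4.0])

row_1d = X[0]                # shape (3,): the row dimension is lost
row_2d = X[0:1]              # shape (1, 3): slicing keeps the data 2D
row_2d_alt = np.atleast_2d(X[0])  # also shape (1, 3)

zero_d = np.asarray(w[0])    # shape (): a zero-dimensional array
message = ""
try:
    len(zero_d)
except TypeError as exc:
    message = str(exc)       # the error text from the question title

one_d = np.atleast_1d(w[0])  # shape (1,): safe to pass where a sequence is expected
```

Keeping the single sample as a (1, k) array and its weight as a length-1 array avoids the dimension loss. Whether a one-row fit is statistically meaningful is a separate matter.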

Python Statsmodels: OLS regressor not predicting

Question: I wrote the following piece of code, but I just cannot get the 'predict' method to work:
    import statsmodels.api as sm
    from statsmodels.formula.api import ols
    ols_model = ols('Consumption ~ Disposable_Income', df).fit()
My 'df' is a pandas dataframe with column headings 'Consumption' and 'Disposable_Income'. When I run, for example, ols_model.predict([1000.0]) I get: "TypeError: list indices must be integers, not str". When I run, for example, ols_model.predict(df['Disposable_Income'].values) I
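A model fitted through the formula interface looks up predictors by name, so `predict` needs input that carries the column name from the formula. A minimal sketch, assuming synthetic data in place of the question's `df`:

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols

# Synthetic stand-in for the question's dataframe.
rng = np.random.default_rng(0)
income = rng.uniform(500, 2000, 50)
df = pd.DataFrame({
    "Disposable_Income": income,
    "Consumption": 100 + 0.8 * income + rng.normal(0, 10, 50),
})

ols_model = ols("Consumption ~ Disposable_Income", df).fit()

# New data must use the same column name as the formula.
new_data = pd.DataFrame({"Disposable_Income": [1000.0]})
prediction = ols_model.predict(new_data)
```

A bare list or NumPy array has no column names, which is consistent with the "list indices must be integers, not str" error quoted above.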