Question
I'm trying to run a Poisson model, like this:
import statsmodels.api as sm
import statsmodels.formula.api as smf
poisson_model_xg = smf.glm(formula="xG ~ home + team + opponent", data=xg_model_data,
                           family=sm.families.Poisson()).fit()
I'm getting the following error:
ValueError: endog has evaluated to an array with multiple columns that has shape (760, 9). This occurs when the variable converted to endog is non-numeric (e.g., bool or str).
But I can't figure out what it means, since every column in my dataframe appears to be numeric:
xg_model_data.apply(lambda s: pd.to_numeric(s, errors='coerce').notnull().all())
Out[10]:
goals       True
xG          True
team        True
opponent    True
home        True
dtype: bool
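Answer 1:
Solved. The problem was not the content of the values but the dtype of the columns. Every column, including xG, was stored as object, which the formula interface treats as categorical, so the response was expanded into multiple columns: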
xg_model_data.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 760 entries, 0 to 759
Data columns (total 5 columns):
 #   Column    Non-Null Count  Dtype
---  ------    --------------  -----
 0   goals     760 non-null    object
 1   xG        760 non-null    object
 2   team      760 non-null    object
 3   opponent  760 non-null    object
 4   home      760 non-null    object
dtypes: object(5)
memory usage: 55.6+ KB
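To illustrate why the check in the question was misleading, here is a minimal sketch on a made-up three-row frame (the values and the names `df`, `parseable` and `is_numeric_dtype` are hypothetical, not the original data). It shows that a column can hold values that all parse as numbers while its dtype is still object:

```python
import pandas as pd

# Hypothetical toy data: numbers stored as strings, forced to dtype object
# to mimic the situation shown by xg_model_data.info() above.
df = pd.DataFrame(
    {"goals": ["1", "0", "2"], "xG": ["1.3", "0.4", "2.1"], "home": ["1", "0", "1"]},
    dtype=object,
)

# Content check (same idea as in the question): can every value be parsed?
parseable = df.apply(lambda s: pd.to_numeric(s, errors="coerce").notnull().all())

# Dtype check: is the column itself stored with a numeric dtype?
is_numeric_dtype = df.dtypes.apply(pd.api.types.is_numeric_dtype)

print(parseable)         # True for every column
print(is_numeric_dtype)  # False for every column
```

The first check only says the values could be converted. The second one reflects how the columns are actually stored, which is what matters when the formula is evaluated.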
After I applied pd.to_numeric() to the columns that should be numeric, the dataframe looks like this, and the Poisson model fits without the error:
xg_model_data.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 760 entries, 0 to 759
Data columns (total 5 columns):
 #   Column    Non-Null Count  Dtype
---  ------    --------------  -----
 0   goals     760 non-null    int64
 1   xG        760 non-null    float64
 2   team      760 non-null    object
 3   opponent  760 non-null    object
 4   home      760 non-null    int64
dtypes: float64(1), int64(2), object(2)
memory usage: 55.6+ KB
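The conversion step could look roughly like the sketch below. The exact code was not posted, so this is an assumption: the toy rows and the `numeric_cols` list are hypothetical, and only `goals`, `xG` and `home` are converted, matching the dtypes in the output above.

```python
import pandas as pd

# Hypothetical stand-in for xg_model_data, with every column stored as object.
xg_model_data = pd.DataFrame(
    {
        "goals": ["1", "0"],
        "xG": ["1.3", "0.4"],
        "team": ["A", "B"],        # made-up team labels
        "opponent": ["B", "A"],
        "home": ["1", "0"],
    },
    dtype=object,
)

# Convert only the columns that should be numeric; team and opponent stay as object.
numeric_cols = ["goals", "xG", "home"]
xg_model_data[numeric_cols] = xg_model_data[numeric_cols].apply(pd.to_numeric)

print(xg_model_data.dtypes)
```

Leaving `team` and `opponent` as object is intentional here: as predictors on the right-hand side of the formula they are treated as categorical, while the response `xG` has to be numeric.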
Source: https://stackoverflow.com/questions/64584416/python-endog-has-evaluated-to-an-array-with-multiple-columns