Question
I am using GridSearchCV for cross-validation of a linear regression (not a classifier and not logistic regression).
I also use StandardScaler to normalize X.
My dataframe has 17 features (X) and 5 targets (y), with around 1150 rows (observations).
I keep getting the error ValueError: continuous is not supported and have run out of options.
Here is some code (assume all imports are done properly):
soilM = pd.read_csv('C:/training.csv', index_col=0)
soilM = getDummiedSoilDepth(soilM)  # transform text values into 0s and 1s
soilM = soilM.drop('Depth', axis=1)
soil = soilM.iloc[:, -22:]
X_train, X_test, Ca_train, Ca_test, P_train, P_test, pH_train, pH_test, SOC_train, SOC_test, Sand_train, Sand_test = splitTrainTestAdv(soil)
scores = ['precision', 'recall']
for score in scores:
    for model in MODELS.keys():
        print model, score
        performParameterSelection(model, score, X_test, Ca_test, X_train, Ca_train)
def performParameterSelection(model_name, criteria, X_test, y_test, X_train, y_train):
    model, param_grid = MODELS[model_name]
    gs = GridSearchCV(model, param_grid, n_jobs=1, cv=5, verbose=1, scoring='%s_weighted' % criteria)
    gs.fit(X_train, y_train)
    print(gs.best_params_)
    for params, mean_score, scores in gs.grid_scores_:
        print("%0.3f (+/-%0.03f) for %r"
              % (mean_score, scores.std() * 2, params))
    y_true, y_pred = y_test, gs.predict(X_test)
    print(classification_report(y_true, y_pred))
MODELS = {
    'lasso': (
        linear_model.Lasso(),
        {'alpha': [0.95]}
    ),
    'ridge': (
        linear_model.Ridge(),
        {'alpha': [0.01]}
    ),
    'elasticnet': (
        linear_model.ElasticNet(),
        {
            'alpha': [0.6],
            'l1_ratio': [0.4]
        }
    ),
    'svr': (
        svm.SVR(),
        {
            'C': [5.0],
            'epsilon': [0.1],
            'kernel': ['linear']
        }
    )
}
def performLasso(X_train, y_train, X_test, parameter):
    alpha = parameter[0]
    model = linear_model.Lasso(alpha=alpha, normalize=True)  # pass alpha to Lasso
    model.fit(X_train, y_train)
    return model.predict(X_test)
def splitTrainTestAdv(df):
    y = df.iloc[:, -5:].copy()   # last 5 columns (the targets)
    X1 = df.iloc[:, :-5].copy()  # everything except the last 5 columns
    Ca = y['Ca'].copy()
    P = y['P'].copy()
    pH = y['pH'].copy()
    SOC = y['SOC'].copy()
    Sand = y['Sand'].copy()
    # Scaling and sampling; the same random_state keeps the row split aligned across targets
    X = StandardScaler(copy=False).fit_transform(X1)
    X_train, X_test, Ca_train, Ca_test = train_test_split(X, Ca, test_size=0.2, random_state=0)
    _, _, P_train, P_test = train_test_split(X, P, test_size=0.2, random_state=0)
    _, _, pH_train, pH_test = train_test_split(X, pH, test_size=0.2, random_state=0)
    _, _, SOC_train, SOC_test = train_test_split(X, SOC, test_size=0.2, random_state=0)
    _, _, Sand_train, Sand_test = train_test_split(X, Sand, test_size=0.2, random_state=0)
    return X_train, X_test, Ca_train, Ca_test, P_train, P_test, pH_train, pH_test, SOC_train, SOC_test, Sand_train, Sand_test
These are the main pieces of the code.
This is the main part of the error output:
ValueError                                Traceback (most recent call last)
<ipython-input-90-1315d47e2551> in <module>()
     20 print '####################'
     21 print featuresV[1]
---> 22 performParameterSelection(model, score, X_test, Ca_test, X_train, Ca_train)
     23 print featuresV[2]
     24 performParameterSelection(model, score, X_test, P_test, X_train, P_train)

<ipython-input-41-7075e1a49412> in performParameterSelection(model_name, criteria, X_test, y_test, X_train, y_train)
     12 # cv=5 - constant; verbose - keep writing
     13
---> 14 gs.fit(X_train, y_train)  # Will get grid scores with outputs from ALL models described above
     15
     16 #pprint(sorted(gs.grid_scores_, key=lambda x: -x.mean_validation_score))

C:\Users\Tony\Anaconda\lib\site-packages\sklearn\grid_search.pyc in fit(self, X, y)
    730
    731 """
--> 732 return self._fit(X, y, ParameterGrid(self.param_grid))

...

     90 if (y_type not in ["binary", "multiclass", "multilabel-indicator",
     91                    "multilabel-sequences"]):
---> 92     raise ValueError("{0} is not supported".format(y_type))
     93
     94 if y_type in ["binary", "multiclass"]:

ValueError: continuous is not supported
Here is some data from soil.head(15). It does not show all the columns, but the code should behave the same way with 8 features instead of 17. As for the targets: these are the last 5 columns, though the code above uses only one of them (Ca).
BSAN BSAS BSAV CTI ELEV EVI LSTD LSTN REF1 REF2 ... RELI Subsoil Topsoil TMAP TMFI Ca P pH SOC Sand
PIDN
92RkYor6 -0.405797 -0.563636 -0.806271 -0.228241 -0.691982 1.653790 -0.605889 0.627488 -0.856727 0.056586 ... -0.062181 0 1 0.896228 1.651807 -0.394962 0.031291 0.488676 -0.389042 0.630347
nPv9P04t -0.688406 -0.709091 -0.739082 -0.189180 1.185523 0.395773 -0.381748 -0.338928 -0.774545 -0.818182 ... 2.995923 1 0 1.539208 1.618022 -0.460044 -0.366432 -0.549490 0.204798 -1.162260
oCASbXEx -0.623188 -0.654545 -0.727884 -0.155835 0.711136 0.517493 -0.035002 -0.092554 -0.725818 -0.651206 ... -0.300034 1 0 0.286952 0.657765 0.259613 -0.407934 0.591558 -0.529688 -0.793082
xq94dGBz -0.746377 -0.781818 -0.862262 -0.340427 0.791314 0.672741 -0.665032 -0.128613 -0.853091 -0.741187 ... -0.418960 0 1 0.276740 0.678724 -0.467854 -0.245386 -0.577548 -0.428111 -0.130845
GYSYA8Yf -0.862319 -0.836364 -0.783875 -0.020427 4.715590 0.473032 -1.321194 -2.560069 -0.791273 -0.827458 ... 2.299354 1 0 0.583042 1.825040 1.442361 -0.328389 0.797320 -0.443738 -0.892037
G4e9Ahvi -0.710145 -0.736364 -0.727884 -0.175122 -1.003786 0.744898 -0.678329 0.851702 -0.661818 -0.474954 ... -0.300034 1 0 1.544703 1.641861 -0.355335 -0.079380 -0.287610 -0.256209 0.287810
SHU443XO -0.579710 -0.736364 -0.963046 -0.536744 -0.179733 1.793003 -0.914052 0.291898 -0.966545 -0.086271 ... 0.260618 0 1 1.840689 2.223996 -0.499961 0.155796 -0.886192 -0.107749 0.942435
oAeygDKu -0.152174 -0.154545 -0.134378 1.252267 -0.796659 -0.155977 1.309391 0.642680 -0.205818 -0.341373 ... -0.537887 1 0 -0.320335 0.429981 -0.441821 -0.352598 0.339031 -0.826609 1.650344
agBvYkUI -0.724638 -0.790909 -0.839866 0.114245 1.363697 0.726676 -1.687885 0.060034 -0.706909 -0.523191 ... 1.127081 1 0 1.254782 0.972442 -0.505456 -0.345681 -1.774712 0.071966 -1.207931
8ujcZd8d -0.427536 -0.600000 -0.806271 -0.667808 -1.208686 2.008018 -1.276453 1.203854 -0.698182 0.224490 ... 0.107713 0 1 0.288463 0.013744 -0.362277 -0.338764 0.039740 -0.232768 0.451467
hqO5LhmQ -0.644928 -0.690909 -0.772676 -0.195877 1.138753 0.390671 0.145537 -0.544813 -0.722909 -0.729128 ... -0.537887 0 1 0.153926 0.422784 -0.460333 -0.300721 -0.063142 -0.607825 1.208852
QsfH8CWp -0.449275 -0.618182 -0.862262 -0.512923 -0.712027 1.537901 -0.665190 0.595265 -0.884364 -0.103896 ... -0.028203 1 0 0.896228 1.651807 -0.475953 -0.252303 -0.128612 -0.670335 0.786391
5hhEGbrX -0.260870 -0.290909 -0.335946 -0.175122 -0.749889 0.400146 0.299908 0.567983 -0.423273 -0.244898 ... -0.520897 1 0 0.249117 0.907095 -0.142446 -0.397558 0.423206 -0.412483 -0.678903
XlJWsmdz -0.768116 -0.800000 -0.873460 -0.737115 0.682183 1.013848 -1.013065 -0.376346 -0.837818 -0.544527 ... 1.619776 1 0 0.942437 1.482143 -0.358517 1.283256 -0.072494 -0.490620 -0.899649
FY3riRgw -0.818841 -0.863636 -0.873460 -0.739177 1.715590 1.434402 -1.669818 -0.090647 -0.874909 -0.388683 ... 3.182807 0 1 1.254782 0.972442 -0.333063 0.020916 -0.942309 1.314342 -0.690321
15 rows × 22 columns
Answer 1:
Your error, continuous is not supported,
tells me you are trying to apply something from the classification domain to a regression problem.
At least one thing catches my eye, given that your target is continuous:
scores = ['precision', 'recall']
To start with, both of these have nothing to do with regression (as @zero323 pointed out in a comment to your question): they are accuracy measures for classification. When GridSearchCV applies such a scorer, it checks the type of y, finds it is continuous, and raises exactly the error you see. Pick any regression score that suits your needs from the sklearn docs page on model evaluation, section "3.3.1.1. Common cases: predefined values".
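As an illustration (a minimal sketch, not your code verbatim), here is the same kind of grid search with a regression scorer. It reuses X_train and Ca_train from your question, with an assumed alpha grid, and targets the old sklearn.grid_search API implied by your traceback; in modern scikit-learn the import is sklearn.model_selection and error-based scorers are negated, e.g. 'neg_mean_squared_error':

from sklearn import linear_model
from sklearn.grid_search import GridSearchCV  # sklearn.model_selection in newer versions

# 'r2' is a predefined regression scorer, so the type-of-target check passes
param_grid = {'alpha': [0.01, 0.1, 1.0]}
gs = GridSearchCV(linear_model.Ridge(), param_grid, cv=5, scoring='r2')
gs.fit(X_train, Ca_train)  # a continuous y is fine for a regression scorer
print(gs.best_params_)
print(gs.best_score_)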
As far as the rest of the code is concerned, I would strongly encourage you to rewrite it from scratch: one chunk for Lasso, one for Ridge, one for ElasticNet, and one for SVR. (Why run Ridge and Lasso separately from ElasticNet at all, when both are special cases of ElasticNet?) This will take no more than 10-15 lines of code; see the sketch below. Only after you have made sure that all of them execute, that optimal hyperparameters are found, and that the regression metrics you want are calculated, would I attempt to optimize the code and put everything together in a loop.
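For example, here is a sketch of the per-model chunks I have in mind (my own illustration, with assumed hyperparameter grids). A single ElasticNet search spans the Ridge/Lasso continuum via l1_ratio (values near 0 behave like Ridge, l1_ratio=1 is exactly Lasso), plus a separate chunk for SVR:

from sklearn import linear_model, svm
from sklearn.grid_search import GridSearchCV  # sklearn.model_selection in newer versions

# One ElasticNet grid instead of three separate linear models
enet_grid = {'alpha': [0.01, 0.1, 0.6, 1.0], 'l1_ratio': [0.1, 0.4, 0.7, 1.0]}
gs_enet = GridSearchCV(linear_model.ElasticNet(), enet_grid, cv=5, scoring='r2')
gs_enet.fit(X_train, Ca_train)
print(gs_enet.best_params_)

# Separate chunk for SVR
svr_grid = {'C': [1.0, 5.0], 'epsilon': [0.1], 'kernel': ['linear']}
gs_svr = GridSearchCV(svm.SVR(), svr_grid, cv=5, scoring='r2')
gs_svr.fit(X_train, Ca_train)
print(gs_svr.best_params_)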
PS: how are these loops supposed to run:
for score in scores:
    for model in MODELS.keys():
prior to MODELS being defined?
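That is, the MODELS dictionary has to exist before the loop body executes. A minimal sketch of the required ordering, reusing the names from the question:

from sklearn import linear_model

# Define the registry first...
MODELS = {
    'lasso': (linear_model.Lasso(), {'alpha': [0.95]}),
    'ridge': (linear_model.Ridge(), {'alpha': [0.01]}),
}

# ...then iterate over it
for model in MODELS.keys():
    print(model)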
Source: https://stackoverflow.com/questions/33047525/valueerror-continuous-is-not-supported