Why am I getting 'Error in weights * y : non-numeric argument to binary operator' in my logistic regression?

Posted by 我怕爱的太早我们不能终老 on 2020-05-31 06:57:10

Question


I want to fit a logistic regression to my dataset. I use:

glm.fit=glm(direccion~Profit, data=datos, family=binomial)

    Minute  ecopet  TASA10  direccion   Minute  cl1     Day         Profit  
1   571     2160     5       1          571    51.85    2015-02-20  -0.03   
2   572     2160     5       1          572    51.92    2015-02-20   0.04   
3   573     2160     5       1          573    51.84    2015-02-20  -0.04   
4   574     2160     5       1          574    51.77    2015-02-20  -0.11   
5   575     2160     10      1          575    51.69    2015-02-20  -0.19   
6   576     2165     5       1          576    51.69    2015-02-20  -0.16   
7   577     2165    -5       0          577    51.64    2015-02-20  -0.28   
8   578     2165    -10      0          578    51.47    2015-02-20  -0.37   
9   579     2165    -10      0          579    51.41    2015-02-20  -0.36   
10  580     2170    -15      0          580    51.44    2015-02-20  -0.25   
11  581     2170    -30      0          581    51.48    2015-02-20  -0.21   
12  582     2160    -20      0          582    51.52    2015-02-20  -0.12   
13  583     2155    -5       0          583    51.56    2015-02-20   0.09   
14  584     2155    -5       0          584    51.51    2015-02-20   0.10   
15  585     2155    -5       0          585    51.44    2015-02-20   0.00   
16  586     2140     10      1          586    51.30    2015-02-20  -0.18   
17  587     2140     10      1          587    51.31    2015-02-20  -0.21   
18  588     2150     0       0          588    51.31    2015-02-20  -0.25

As you can see, the variable 'direccion' is a binary variable and is the dependent variable in my logistic regression. It is 1 whenever the variable 'TASA10' is positive and 0 otherwise. The problem is that after I run the code, I get:

'Error in weights * y : non-numeric argument to binary operator'

Do you know why that is?

Thanks!!


Answer 1:


It appears that the direccion column is a character column rather than a numeric one. You can verify by running str(datos); you'll see something like

'data.frame':   18 obs. of  8 variables:
 $ Minute   : int  571 572 573 574 575 576 577 578 579 580 ...
 $ ecopet   : int  2160 2160 2160 2160 2160 2165 2165 2165 2165 2170 ...
 $ TASA10   : int  5 5 5 5 10 5 -5 -10 -10 -15 ...
 $ direccion: chr  "1" "1" "1" "1" ...
 $ Minute.1 : int  571 572 573 574 575 576 577 578 579 580 ...
 $ cl1      : num  51.9 51.9 51.8 51.8 51.7 ...
 $ Day      : Factor w/ 1 level "2015-02-20": 1 1 1 1 1 1 1 1 1 1 ...
 $ Profit   : num  -0.03 0.04 -0.04 -0.11 -0.19 -0.16 -0.28 -0.37 -0.36 -0.25 ...

In particular note the type of the direccion column. This can be fixed by running

datos$direccion <- as.numeric(datos$direccion)

If it is a factor instead, as.numeric() alone would return the internal level codes (1 and 2) rather than the original 0 and 1 values, so convert through character first:

datos$direccion <- as.numeric(as.character(datos$direccion))
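To see why the extra as.character() step matters, here is a small sketch using a made-up factor `f` (not taken from the question's data):

```r
# A factor whose labels are the strings "0" and "1"
f <- factor(c("1", "1", "0"))

# as.numeric() on a factor returns the internal level codes,
# where level "0" is code 1 and level "1" is code 2
codes <- as.numeric(f)
print(codes)   # 2 2 1

# Going through character first recovers the original values
values <- as.numeric(as.character(f))
print(values)  # 1 1 0
```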

Even better, look back in your pipeline at the code that generates this data frame and fix it so that the column is encoded as numeric rather than as a string.
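As a rough end-to-end check, the following sketch rebuilds only the direccion and Profit columns from the question, assuming direccion was read in as character (the question does not show how datos was created), reproduces the error, and then fits the model after the conversion:

```r
# Two columns copied from the question; direccion is deliberately character
datos <- data.frame(
  direccion = c("1", "1", "1", "1", "1", "1", "0", "0", "0",
                "0", "0", "0", "0", "0", "0", "1", "1", "0"),
  Profit = c(-0.03, 0.04, -0.04, -0.11, -0.19, -0.16, -0.28, -0.37, -0.36,
             -0.25, -0.21, -0.12, 0.09, 0.10, 0.00, -0.18, -0.21, -0.25),
  stringsAsFactors = FALSE
)

# With a character response, glm() stops; capture the message instead of aborting
err <- tryCatch(
  glm(direccion ~ Profit, data = datos, family = binomial),
  error = function(e) conditionMessage(e)
)
print(err)

# After converting the response to numeric 0/1, the same call fits
datos$direccion <- as.numeric(datos$direccion)
fit <- glm(direccion ~ Profit, data = datos, family = binomial)
print(coef(fit))
```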




Answer 2:


With family=binomial, glm() expects the response to be numeric or a factor. It does not know how to deal with a character response.

You could write a simple factorize function that turns all character (chr) columns into factors, while leaving the other columns as they are:

# Return the column as a factor if it is character; otherwise return it unchanged
factorize = function(column, df){
  if (is.character(df[[column]])){
    out = as.factor(df[[column]])
  } else { # numeric, factor, etc.
    out = df[[column]]
  }
  return(out)
}

# Apply to every column of datos, then restore the original column names
store.colnames = colnames(datos)
datos = lapply(store.colnames, function(column) factorize(column, datos))
datos = as.data.frame(datos)
colnames(datos) = store.colnames

The code could be much prettier but it will get the job done and I just wanted to illustrate the point.
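A more compact version of the same idea, sketched on a made-up data frame `toy` (the column names are illustrative only), assigns into `toy[]` so the data frame structure and column names are kept:

```r
# Hypothetical data frame with one character column
toy <- data.frame(
  id  = 1:3,
  grp = c("a", "b", "a"),
  val = c(0.5, 1.5, 2.5),
  stringsAsFactors = FALSE
)

# Convert only the character columns to factors; leave the rest untouched
toy[] <- lapply(toy, function(col) if (is.character(col)) as.factor(col) else col)

print(sapply(toy, class))
```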

Alternatively, you could just change a single column to factor type:

datos$direccion = as.factor(datos$direccion)
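When the response is a factor, the binomial family treats the first level as failure and all other levels as success, so it is worth checking the level order. The sketch below uses a small made-up stand-in for datos and compares the factor fit with the numeric one:

```r
# Hypothetical stand-in for datos with a character 0/1 response
datos <- data.frame(
  direccion = c("1", "1", "0", "1", "0", "0", "1", "0"),
  Profit    = c(0.04, -0.03, -0.28, -0.11, 0.09, -0.37, -0.25, -0.16),
  stringsAsFactors = FALSE
)

# Reference fit with a numeric 0/1 response
num_fit <- glm(as.numeric(direccion) ~ Profit, data = datos, family = binomial)

# Factor response: levels are "0" then "1", so "1" is modeled as success
datos$direccion <- as.factor(datos$direccion)
print(levels(datos$direccion))
fac_fit <- glm(direccion ~ Profit, data = datos, family = binomial)

# Both encodings should give the same coefficients
same <- isTRUE(all.equal(unname(coef(num_fit)), unname(coef(fac_fit))))
print(same)
```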

Hope that helps!



Source: https://stackoverflow.com/questions/32953750/why-am-i-getting-error-in-weights-y-non-numeric-argument-to-binary-operator
