cor shows only NA or 1 for correlations - Why?

别那么骄傲 2020-12-08 07:03

I'm running cor() on a data.frame with all numeric values and I'm getting this as the result:

       price exprice...
price      1         


        
6 Answers
  • 2020-12-08 07:08

    A very simple and correct answer:

    Tell cor() to ignore the NAs with the use argument, e.g.:

    cor(data$price, data$exprice, use = "complete.obs")
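
    A minimal sketch of the effect, assuming a made-up data frame (the values are invented for illustration; only the column names price and exprice come from the question):

    ```r
    # Hypothetical data: each column has one missing value
    data <- data.frame(
      price   = c(10, 12, NA, 15, 18),
      exprice = c(11, 14, 14, NA, 17)
    )

    # Default (use = "everything"): a single NA makes the result NA
    r_default <- cor(data$price, data$exprice)

    # Rows containing an NA are dropped before the correlation is computed
    r_complete <- cor(data$price, data$exprice, use = "complete.obs")

    r_default
    r_complete
    ```

    Here only the three rows without any NA contribute to r_complete, so it is worth checking how many rows survive before trusting the coefficient.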
    
  • 2020-12-08 07:24

    The NA can actually have two causes. One is that there is an NA in your data. The other is that one of the variables is constant: its standard deviation is then zero, so the correlation is undefined and cor() returns NA.
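
    To find out which of the two causes applies, something like the following can help. It is only a sketch on a hypothetical data frame df (the names df, x, y and k are made up here):

    ```r
    # Hypothetical data frame: x has a missing value, k is constant
    df <- data.frame(
      x = c(1, 2, NA, 4),
      y = c(2, 1, 4, 3),
      k = c(5, 5, 5, 5)
    )

    # Cause 1: number of NAs in each column
    na_count <- colSums(is.na(df))

    # Cause 2: columns whose standard deviation is zero (NAs ignored)
    is_constant <- sapply(df, function(col) sd(col, na.rm = TRUE) == 0)

    na_count
    is_constant
    ```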

  • In my case I was using more than two variables, and this worked better for me:

    cor(x = as.matrix(tbl), method = "pearson", use = "pairwise.complete.obs")
    

    However, ?cor notes a caveat:

    If use has the value "pairwise.complete.obs" then the correlation or covariance between each pair of variables is computed using all complete pairs of observations on those variables. This can result in covariance or correlation matrices which are not positive semi-definite, as well as NA entries if there are no complete pairs for that pair of variables.
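
    One way to see what "all complete pairs" means in practice is to count the observations behind each entry of the matrix. This is a sketch on an invented table (tbl and its values are assumptions for illustration):

    ```r
    # Hypothetical table with NAs in different rows of different columns
    tbl <- data.frame(
      a = c(1, 2, 3, 4, NA, 6),
      b = c(2, 1, 4, NA, 5, 7),
      c = c(6, 5, NA, 3, 2, 1)
    )

    # Number of complete pairs used for each entry with "pairwise.complete.obs"
    n_pairs <- crossprod(!is.na(as.matrix(tbl)))

    # Number of rows without any NA, i.e. what "complete.obs" would use
    n_complete <- sum(complete.cases(tbl))

    n_pairs
    n_complete
    ```

    Each off-diagonal entry of n_pairs can be based on a different subset of rows, which is why the resulting matrix is not guaranteed to be positive semi-definite.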

  • 2020-12-08 07:25

    The 1s are because everything is perfectly correlated with itself, and the NAs are because there are NAs in your variables.

    You have to tell R how to handle missing values, because with the default (use = "everything") any pair of variables that contains an NA gets an NA coefficient.

    You can change this behavior with the use argument to cor(); see ?cor for details.

  • 2020-12-08 07:26

    NAs also appear if a variable has zero variance (all of its elements are equal); see for instance:

    cor(cbind(a=runif(10),b=rep(1,10)))
    

    which returns:

       a  b
    a  1 NA
    b NA  1
    Warning message:
    In cor(cbind(a = runif(10), b = rep(1, 10))) :
      the standard deviation is zero
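
    If the constant columns carry no information for your analysis, one option is to drop them before calling cor(). A sketch, assuming a matrix m without missing values (m, keep and the seed are made up for this example):

    ```r
    set.seed(1)  # arbitrary seed, only to make the sketch reproducible
    m <- cbind(a = runif(10), b = rep(1, 10), c = runif(10))

    # Keep only the columns whose standard deviation is non-zero
    keep <- apply(m, 2, sd) > 0

    # drop = FALSE keeps the result a matrix even if one column remains
    r <- cor(m[, keep, drop = FALSE])
    r
    ```

    With NAs in the data, sd() would itself return NA, so na.rm = TRUE would be needed in the apply() call.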
    