Complete.obs of cor() function

悲&欢浪女 2021-02-07 20:42

I am establishing a correlation matrix for my data, which looks like this

df <- structure(list(V1 = c(56, 123, 546, 26, 62, 6, NA, NA, NA, 15
), V2 = c(21, 23         


        
1 answer
  • 2021-02-07 21:22

    Look at the help file for cor, i.e. ?cor. In particular,

    If ‘use’ is ‘"everything"’, ‘NA’s will propagate conceptually, i.e., a resulting value will be ‘NA’ whenever one of its contributing observations is ‘NA’.

    If ‘use’ is ‘"all.obs"’, then the presence of missing observations will produce an error. If ‘use’ is ‘"complete.obs"’ then missing values are handled by casewise deletion (and if there are no complete cases, that gives an error).

    To get a better feel for what is going on, create an (even) simpler example:

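    df1 <- df[1:5, 1:3]  # first five rows, first three columns
    # Pairwise deletion: each pair of columns uses the rows where both are non-NA
    cor(df1, use = "pairwise.complete.obs", method = "pearson")
    # Casewise deletion: any row containing an NA is dropped first
    cor(df1, use = "complete.obs", method = "pearson")
    # Rows 3 to 5 only; no use argument, so this relies on those rows having no NA
    cor(df1[3:5, ], method = "pearson")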
    

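    So, when we use complete.obs, we discard the entire row if an NA is present in any column. In my example, this means we discard rows 1 and 2, which is why the second and third calls give the same matrix. pairwise.complete.obs, however, works one pair of columns at a time: when calculating the correlation between V1 and V2 it uses every row in which both V1 and V2 are non-NA, even if another column is NA in that row.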
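Since the df in the question is cut off, here is a small self-contained sketch of the same comparison. The data frame `toy` and its values are made up for illustration and are not the asker's data; the only functions used are base R's `cor()` and `complete.cases()`.

```r
# Hypothetical toy data (NOT the df from the question):
# V1 is NA in row 5, V2 is NA in row 4, V3 has no NA
toy <- data.frame(
  V1 = c(1, 2, 3, 4, NA),
  V2 = c(2, 4, 7, NA, 10),
  V3 = c(5, 3, 8, 1, 2)
)

# Casewise deletion: only rows with no NA in any column are used (rows 1 to 3)
r_complete <- cor(toy, use = "complete.obs")

# Dropping the incomplete rows by hand gives the same matrix
r_manual <- cor(toy[complete.cases(toy), ])
all.equal(r_complete, r_manual)

# Pairwise deletion: each pair of columns uses the rows where both are non-NA
r_pairwise <- cor(toy, use = "pairwise.complete.obs")

# For V2 and V3 that means every row except row 4
r_v2v3 <- cor(toy$V2[-4], toy$V3[-4])
all.equal(r_pairwise["V2", "V3"], r_v2v3)

# Default use = "everything": a pair involving a column with an NA gives NA
r_everything <- cor(toy)
r_everything["V1", "V2"]
```

One thing to keep in mind with pairwise deletion is that different entries of the matrix can be based on different subsets of rows, so the entries are not always directly comparable.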
