Reason for unexpected output in subsetting data frame - R

滥情空心 2021-01-22 14:48

I have the data frame "a" and it has a variable called "VAL". I want to count the elements where the value of VAL is 23 or 24.

I used two pieces of code, which worked OK:

1 Answer
  • 2021-01-22 15:32

    Working through an example shows where it is going wrong:

    a <- data.frame(VAL=c(1,1,1,23,24))
    a
    #  VAL
    #1   1
    #2   1
    #3   1
    #4  23
    #5  24
    

    These work:

    a$VAL %in% c(23,24)
    #[1] FALSE FALSE FALSE  TRUE  TRUE
    a$VAL==23 | a$VAL==24
    #[1] FALSE FALSE FALSE  TRUE  TRUE
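
    Since the goal is a count rather than a logical vector, either working test can be wrapped in sum(). The lines below are a minimal sketch of that, reusing the example data frame from above; the name n_matches is only illustrative and is not from the question:

    ```r
    # Same example data frame as above
    a <- data.frame(VAL = c(1, 1, 1, 23, 24))

    # TRUE counts as 1 and FALSE as 0, so sum() counts the matching elements
    n_matches <- sum(a$VAL %in% c(23, 24))
    n_matches
    #[1] 2

    # Equivalent: subset the rows first, then count them
    nrow(a[a$VAL %in% c(23, 24), , drop = FALSE])
    #[1] 2
    ```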
    

    The following doesn't work because of vector recycling in the comparison. Take note of the warning message below:

    a$VAL ==c(23,24)
    #[1] FALSE FALSE FALSE FALSE FALSE
    #Warning message:
    #In a$VAL == c(23, 24) :
    #  longer object length is not a multiple of shorter object length
    

    This last bit of code recycles what you are testing against and is basically comparing:

    c( 1,  1,  1, 23, 24) #to
    c(23, 24, 23, 24, 23)
    

    ...so you don't get any rows returned. Changing the order to c(24, 23) will instead compare:

    c( 1,  1,  1, 23, 24) #to
    c(24, 23, 24, 23, 24)
    

    ...and you will get two rows returned (which gives the intended result by pure luck, but it is not appropriate to use).
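
    To see the recycling and the "pure luck" concretely, here is a small sketch. rep_len() is used only to write out the recycled vector by hand, and the data frame b is a hypothetical reordering of the same values, not something from the question:

    ```r
    a <- data.frame(VAL = c(1, 1, 1, 23, 24))

    # Writing out by hand what == does with the shorter vector
    recycled <- rep_len(c(23, 24), length.out = nrow(a))
    recycled
    #[1] 23 24 23 24 23
    a$VAL == recycled
    #[1] FALSE FALSE FALSE FALSE FALSE

    # Reversed order happens to line up with this particular data
    lucky <- sum(suppressWarnings(a$VAL == c(24, 23)))
    lucky
    #[1] 2

    # Hypothetical data: same values, different row order
    b <- data.frame(VAL = c(23, 24, 1, 1, 1))
    unlucky <- sum(suppressWarnings(b$VAL == c(24, 23)))
    unlucky
    #[1] 0

    # %in% does not depend on row order
    sum(b$VAL %in% c(23, 24))
    #[1] 2
    ```

    The count from == changes with the order of the rows, while %in% gives 2 in both cases, which is why %in% (or the explicit | version) is the safer choice here.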
