How to select R data.table rows based on a substring match (à la SQL LIKE)

难免孤独 2020-12-04 07:44

I have a data.table with a character column and want to select only the rows whose value in that column contains a given substring. This is the equivalent of SQL's WHERE x LIKE '%substring%'.

2 Answers
  • 2020-12-04 08:29

    data.table has a like function.

    > Months[like(Name,"mb")]
            Name Number
    1: September      9
    2:  November     11
    3:  December     12
    

    Or use the %like% operator, which reads more nicely:

    > Months[Name %like% "mb"]
            Name Number
    1: September      9
    2:  November     11
    3:  December     12
    

    Note that %like% and like() use grepl (which returns a logical vector) rather than grep (which returns integer positions). That way the result can be combined with other logical conditions:

    > Months[Number<12 & Name %like% "mb"]
            Name Number
    1: September      9
    2:  November     11
    

    and you get the power of regular expression search (not just % or * wildcard), too.
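
    As a rough sketch of the regular-expression point, the anchors ^ and $ can stand in for SQL patterns such as LIKE 'J%' or LIKE '%ber'. The construction of Months from month.name is an assumption here, since the answer does not show how the table was built.

    ```r
    library(data.table)

    # Assumed construction of the example table (not shown in the answer)
    Months <- data.table(Name = month.name, Number = 1:12)

    # The pattern is a regular expression: "^" anchors to the start of the
    # string, roughly like SQL's LIKE 'J%'
    starts_with_J <- Months[Name %like% "^J"]

    # "$" anchors to the end of the string, roughly like SQL's LIKE '%ber'
    ends_with_ber <- Months[Name %like% "ber$"]

    print(starts_with_J)
    print(ends_with_ber)
    ```

    Because the pattern is a regex, characters such as . or ( have special meaning and need escaping if they are meant literally.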

  • 2020-12-04 08:34

    The %in% operator does not do partial string matching. It tests whether whole values occur in another set of values, e.g. "a" %in% c("a","b","c").
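
    A small illustration of the difference, using base R's built-in month.name vector as stand-in data (the variable names are only for this sketch):

    ```r
    month_names <- month.name

    # %in% compares whole values, so only an exact match counts
    exact_hit   <- "December" %in% month_names
    partial_hit <- "mb" %in% month_names

    # grepl() tests each element for the substring instead
    substring_hits <- grepl("mb", month_names)

    print(exact_hit)
    print(partial_hit)
    print(month_names[substring_hits])
    ```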

    To do partial string matching you can use the grep() function. grep() returns the integer positions of the elements of the column that contain "mb"; you then subset the rows by that index:

    Months[grep("mb", Name)]    # data.table syntax slightly easier
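
    To make the index-based approach concrete, here is a minimal sketch. It assumes Months is built from month.name, which the thread does not show. It also shows one way to add a second condition, by switching to grepl(), since the integer vector from grep() cannot be combined with & directly.

    ```r
    library(data.table)

    # Assumed construction of the example table (not shown in the thread)
    Months <- data.table(Name = month.name, Number = 1:12)

    # grep() returns the integer row positions that match the pattern
    idx <- grep("mb", Months$Name)
    by_index <- Months[idx]

    # grepl() returns a logical vector, which can be combined with
    # other conditions using &
    combined <- Months[grepl("mb", Name) & Number < 12]

    print(idx)
    print(by_index)
    print(combined)
    ```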
    