R grep regular expression using elements in a vector (FOLLOW UP)

后端未结

关注

 3  1442

被撕碎了的回忆 2021-01-26 23:58

Following up on this question, I have another example where I cannot use the accepted answer.

Again, I want to find each of the exact group elements in the

3条回答

盖世英雄少女心 (楼主)

2021-01-27 00:25

Try

lapply(groups, function(g)
  grep(gsub("\\+", "\\\\+", paste0(g, "$")), labs, value = TRUE))
# [[1]]
# [1] "Beijing -- T0 -- BC-89 + CN"     
# [2] "Beijing -- T24 -- BC-89 + CN"    
# [3] "Beijing -- T0 -- BC-89 + CN"     
# [4] "Zhangjiakou -- T0 -- BC-89 + CN" 
# [5] "Beijing -- T0 -- BC-89 + CN"     
# [6] "Beijing -- T0 -- BC-89 + CN"     
# [7] "Beijing -- T24 -- BC-89 + CN"    
# [8] "Beijing -- T24 -- BC-89 + CN"    
# [9] "Zhangjiakou -- T0 -- BC-89 + CN" 
# [10] "Zhangjiakou -- T0 -- BC-89 + CN" 
# [11] "Zhangjiakou -- T24 -- BC-89 + CN"
# [12] "Zhangjiakou -- T24 -- BC-89 + CN"
# 
# [[2]]
# [1] "Beijing -- T0 -- BC-89 + CN with 2% DD + 1.6% ZC"     
# [2] "Beijing -- T24 -- BC-89 + CN with 2% DD + 1.6% ZC"    
# [3] "Beijing -- T0 -- BC-89 + CN with 2% DD + 1.6% ZC"     
# [4] "Zhangjiakou -- T0 -- BC-89 + CN with 2% DD + 1.6% ZC" 
# [5] "Beijing -- T0 -- BC-89 + CN with 2% DD + 1.6% ZC"     
# [6] "Beijing -- T24 -- BC-89 + CN with 2% DD + 1.6% ZC"    
# [7] "Zhangjiakou -- T0 -- BC-89 + CN with 2% DD + 1.6% ZC" 
# [8] "Zhangjiakou -- T24 -- BC-89 + CN with 2% DD + 1.6% ZC"
# 
# [[3]]
# [1] "Beijing -- T0 -- BC-89 with 2% Puricare + 5% Merquat + CN"    
# [2] "Beijing -- T24 -- BC-89 with 2% Puricare + 5% Merquat + CN"   
# [3] "Beijing -- T0 -- BC-89 with 2% Puricare + 5% Merquat + CN"    
# [4] "Zhangjiakou -- T0 -- BC-89 with 2% Puricare + 5% Merquat + CN"

The problem with your approach is that, e.g., groups[1] is "BC-89 + CN", which contains +, having particular meaning in regular expressions. Given only this, adding fixed = TRUE in grep would fix the issue, but then $ would lose its effect. So what I did is escaping + in the group names first.

Alternatively, and relating to your linked answer, you could do

lapply(groups, function(g)
  grep(paste0(g, "$"), paste0(labs, "$"), value = TRUE, fixed = TRUE))

0 讨论(0)

查看其它3个回答