Extract the last word between | |

Submitted by ☆樱花仙子☆ on 2019-12-04 01:09:20

Question


I have the following dataset

> head(names$SAMPLE_ID)
[1] "Bacteria|Proteobacteria|Gammaproteobacteria|Pseudomonadales|Moraxellaceae|Acinetobacter|"
[2] "Bacteria|Firmicutes|Bacilli|Bacillales|Bacillaceae|Bacillus|"                            
[3] "Bacteria|Proteobacteria|Gammaproteobacteria|Pasteurellales|Pasteurellaceae|Haemophilus|" 
[4] "Bacteria|Firmicutes|Bacilli|Lactobacillales|Streptococcaceae|Streptococcus|"             
[5] "Bacteria|Firmicutes|Bacilli|Lactobacillales|Streptococcaceae|Streptococcus|"             
[6] "Bacteria|Firmicutes|Bacilli|Lactobacillales|Streptococcaceae|Streptococcus|" 

I want to extract the last word between the | delimiters as a new variable, i.e.

Acinetobacter
Bacillus
Haemophilus

I have tried using

library(stringr)
names$sample2 <-   str_match(names$SAMPLE_ID, "|.*?|")

Answer 1:


We can use

library(stringi)
stri_extract_last_regex(v1, '\\w+')
#[1] "Acinetobacter"

data

v1 <- "Bacteria|Proteobacteria|Gammaproteobacteria|Pseudomonadales|Moraxellaceae|Acinetobacter|"



Answer 2:


Using just base R:

myvar <- gsub("^..*\\|(\\w+)\\|$", "\\1", names$SAMPLE_ID)
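As a quick check, the same idea can be tried on a small data frame. This is only a sketch: the data frame `names` and its `SAMPLE_ID` column follow the question, but the two sample rows and the new column name `genus` are made up for illustration. `sub()` is used instead of `gsub()` because the anchored pattern can match at most once per string.

```r
# Hypothetical stand-in for the question's data frame
names <- data.frame(
  SAMPLE_ID = c(
    "Bacteria|Proteobacteria|Gammaproteobacteria|Pseudomonadales|Moraxellaceae|Acinetobacter|",
    "Bacteria|Firmicutes|Bacilli|Bacillales|Bacillaceae|Bacillus|"
  ),
  stringsAsFactors = FALSE
)

# Capture the run of word characters between the last two pipes
# and replace the whole string with that captured group
names$genus <- sub("^.*\\|(\\w+)\\|$", "\\1", names$SAMPLE_ID)
names$genus
```

Note that a string that does not match the pattern is returned unchanged, so it may be worth checking the result for leftover | characters.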



Answer 3:


^.*\\|\\K.*?(?=\\|)

Use \K to drop everything matched before it from the final match. Also pass perl = TRUE, since \K and lookaheads need the Perl-compatible engine. See the demo:

https://regex101.com/r/fM9lY3/45

x <- c("Bacteria|Firmicutes|Bacilli|Lactobacillales|Streptococcaceae|Streptococcus|",
       "Bacteria|Firmicutes|Bacilli|Lactobacillales|Streptococcaceae|Streptococcus|" )

unlist(regmatches(x, gregexpr('^.*\\|\\K.*?(?=\\|)', x, perl = TRUE)))
# [1] "Streptococcus" "Streptococcus"
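To see what \K contributes, the pattern can be run with and without it. This is a small sketch on a shortened, made-up input string; the variable names `without_k` and `with_k` are only for illustration.

```r
x <- "Bacteria|Firmicutes|Bacilli|Bacillus|"

# Without \K the reported match starts at the beginning of the string
without_k <- regmatches(x, regexpr("^.*\\|.*?(?=\\|)", x, perl = TRUE))

# With \K the text matched so far is discarded, leaving only the last field
with_k <- regmatches(x, regexpr("^.*\\|\\K.*?(?=\\|)", x, perl = TRUE))

without_k
with_k
```

Since each string yields at most one match here, regexpr() is enough and the result does not need unlist().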



Answer 4:


The ending is all you need: [^|]+(?=\|$)

Per @RichardScriven:

Which in R would be regmatches(x, regexpr("[^|]+(?=\\|$)", x, perl = TRUE))
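A minimal runnable version of that call, assuming `x` is a character vector like the one in the question (the two values below are copied from it, and `last_field` is just an illustrative name):

```r
x <- c(
  "Bacteria|Firmicutes|Bacilli|Bacillales|Bacillaceae|Bacillus|",
  "Bacteria|Firmicutes|Bacilli|Lactobacillales|Streptococcaceae|Streptococcus|"
)

# [^|]+ matches a run of non-pipe characters; the lookahead (?=\|$)
# requires that run to be followed by the final pipe without consuming it
m <- regexpr("[^|]+(?=\\|$)", x, perl = TRUE)
last_field <- regmatches(x, m)
last_field
```

One caveat: regmatches() silently drops elements that have no match, so the result can be shorter than `x` if some strings do not end in |. That matters when assigning the result to a data frame column.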




Answer 5:


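You can also use the stringr package in this case. Here is the code:

library(stringr)

v <- "Bacteria| Proteobacteria|Gammaproteobacteria|Pseudomonadales|Moraxellaceae|Acinetobacter|"

# Replace every literal | with a space
v1 <- str_replace_all(v, "\\|", " ")

# The trailing | leaves a trailing space, so the last real word is at position -2
word(v1, -2)

Here v is the input string. The idea is to replace every | with a space and then pick out the last word of the string with word(). The index is -2 rather than -1 because the final | turns into a trailing space.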
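The same "break the string into words and keep the last one" idea can also be sketched in base R. This is an alternative to the stringr code above, not a translation of it: it splits on the pipe directly instead of replacing it with spaces. The name `last_word` is only illustrative.

```r
v <- "Bacteria|Proteobacteria|Gammaproteobacteria|Pseudomonadales|Moraxellaceae|Acinetobacter|"

# Split on the literal pipe; strsplit() does not return a trailing empty
# piece for a separator at the end, so the last piece is the last word
parts <- strsplit(v, "|", fixed = TRUE)

# Take the final piece of each split string (works for vectors of strings too)
last_word <- vapply(parts, function(p) p[length(p)], character(1))
last_word
```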



Source: https://stackoverflow.com/questions/34342380/extract-the-last-word-between
