stringr | 易学教程

Obtaining first word in the string [duplicate]

Question: This question already has answers here: Extract first word from a column and insert into new column [duplicate] (3 answers). Closed 2 years ago. I would like to extract the first string from each element of a vector. For example, y <- c('london/hilss', 'newyork/hills', 'paris/jjk'). I want to get the string before the symbol "/", i.e., the locations london, newyork, paris. Answer 1: A very simple approach with gsub: gsub("/.*", '', y) [1] "london" "newyork" "paris" Answer 2: Your example is simple, for a more general case like y<
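
A minimal base R sketch of the approach in Answer 1, assuming the vector y from the question. It uses sub() rather than gsub(), since only one replacement per element is needed, and shows a split-based alternative; the variable names first_part and first_split are illustrative.

```r
# Sample vector from the question
y <- c("london/hilss", "newyork/hills", "paris/jjk")

# "/.*" matches the first slash and everything after it;
# sub() performs a single replacement per element
first_part <- sub("/.*", "", y)

# Alternative: split on a literal "/" and keep the first piece of each element
first_split <- vapply(strsplit(y, "/", fixed = TRUE), `[`, character(1), 1)

print(first_part)
```

Both variants leave an element unchanged when it contains no slash.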

Extract a sample of words around a particular word using stringr in R

I've seen a couple of similar questions posted on SO regarding this topic, but they seem to be worded improperly (example) or in a different language (example). In my scenario, I consider everything that is surrounded by white space to be a word. Emoticons, numbers, strings of letters that aren't really words: I don't care. I just want to get some context around the string that was found without having to read the entire file to figure out if it's a valid match. I tried using the following, but it takes a while to run if you've got a long text file: text <- "He served both as Attorney
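
The asker's own code is cut off above, so the following is only a sketch of one possible approach, not the original attempt. It treats anything delimited by white space as a word, as the question describes. The helper context_window, its parameter n, and the sample sentence are all hypothetical.

```r
# Hypothetical helper: return up to `n` whitespace-delimited words on each side
# of every exact occurrence of `target`
context_window <- function(text, target, n = 2) {
  words <- strsplit(text, "[[:space:]]+")[[1]]
  words <- words[nzchar(words)]            # drop an empty token from leading white space
  hits <- which(words == target)           # positions of exact word matches
  vapply(hits, function(i) {
    lo <- max(1, i - n)                    # clamp the window to the start of the text
    hi <- min(length(words), i + n)        # clamp the window to the end of the text
    paste(words[lo:hi], collapse = " ")
  }, character(1))
}

# Made-up sample sentence, not the text from the question
sample_text <- "the quick brown fox jumps over the lazy dog"
print(context_window(sample_text, "fox", n = 2))
```

Splitting once and indexing into the word vector avoids rescanning the text for every match, which may help with long files.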

Extracting a number following specific text in R

Question: I have a data frame which contains a column full of text. I need to capture the number (most likely 1 to 4 digits long, but potentially any number of digits) that follows a certain phrase, namely 'Floor Area' or 'floor area'. My data will look something like the following: "A beautiful flat on the 3rd floor with floor area: 50 sqm and a lift" "Newbuild flat. Floor Area: 30 sq.m" "6 bed house with floor area 50 sqm, lot area 25 sqm" If I try to extract just the number or if I look
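
One way to capture the number after the phrase, sketched in base R on the three sample strings from the question. The pattern and the variable names (listings, floor_area) are assumptions for illustration. It matches "floor area" case-insensitively, allows an optional colon and spaces, and captures the digits that follow.

```r
# Sample strings from the question
listings <- c(
  "A beautiful flat on the 3rd floor with floor area: 50 sqm and a lift",
  "Newbuild flat. Floor Area: 30 sq.m",
  "6 bed house with floor area 50 sqm, lot area 25 sqm"
)

# "floor area", an optional colon, optional white space, then one or more digits (captured)
pattern <- "floor area:?[[:space:]]*([0-9]+)"
m <- regmatches(listings, regexec(pattern, listings, ignore.case = TRUE))

# Each element of m is c(full match, captured group), or empty when nothing matched
floor_area <- vapply(
  m,
  function(x) if (length(x) >= 2) as.numeric(x[2]) else NA_real_,
  numeric(1)
)

print(floor_area)
```

Anchoring on the phrase keeps the "3rd" in the first string and the "lot area 25" in the third from being picked up.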

Installation of packages ‘stringr’ and ‘stringi’ had non-zero exit status

Please help me to install the stringr and stringi packages in R. The result is: install.packages("stringi") Installing package into ‘C:/Users/kozlovpy/Documents/R/win-library/3.2’ (as ‘lib’ is unspecified) trying URL 'https://mran.revolutionanalytics.com/snapshot/2015-08-27/bin/windows/contrib/3.2/stringi_0.5-5.zip' Error in download.file(url, destfile, method, mode = "wb", ...) : cannot open URL 'https://mran.revolutionanalytics.com/snapshot/2015-08-27/bin/windows/contrib/3.2/stringi_0.5-5.zip' In addition: Warning message: In download.file(url, destfile, method, mode = "wb", ...) :

How to extract everything until first occurrence of pattern

I'm trying to use the stringr package in R to extract everything from a string up until the first occurrence of an underscore. What I've tried: str_extract("L0_123_abc", ".+?(?<=_)") returns "L0_". Close but no cigar. How do I get this one? Also, ideally I'd like something that's easy to extend so that I can get the information in between the 1st and 2nd underscore and get the information after the 3rd underscore. To get L0, you may use > library(stringr) > str_extract("L0_123_abc", "[^_]+") [1] "L0" The [^_]+ matches 1 or more chars other than _. Also, you may split the string with _ : x <- str
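
The split-based idea can be sketched in base R as follows. This is an illustration using strsplit(), not the truncated code from the answer; the names before_first, parts, and between_1_2 are made up for the example.

```r
x <- "L0_123_abc"

# Everything before the first underscore: drop the first "_" and all that follows
before_first <- sub("_.*", "", x)

# Split on a literal underscore to reach any field by position
parts <- strsplit(x, "_", fixed = TRUE)[[1]]

# Text between the 1st and 2nd underscore is the second field
between_1_2 <- parts[2]

print(before_first)
print(parts)
```

Indexing past the last field (for example parts[4] on this sample, which has only two underscores) yields NA, so a length check is advisable before relying on a given position.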

Extract the last word between | |

Regular Expression in Base R Regex to identify email address

Question: I am trying to use the stringr library to extract emails from a big, messy file. str_match doesn't allow perl=TRUE, and I can't figure out the escape characters to get it to work. Can someone recommend a relatively robust regex that would work in the context below? c("larry@gmail.com", "larry-sally@sally.com", "larry@sally.larry.com")->emails "SomeRegex"->regex str_match(emails, regex) Answer 1: > "^[[:alnum:].-_]+@[[:alnum:].-]+$"->regex > str_match(emails, regex) [,1] [1,] "larry@gmail.com" [2,]
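
A hedged variation on the pattern in Answer 1, checked here with base R's grepl(). Inside a bracket expression, `.-_` is read as a range rather than as three literal characters, so this sketch puts the hyphen last in the class. The pattern is deliberately loose and is not a full address validator.

```r
# Sample addresses from the question
emails <- c("larry@gmail.com", "larry-sally@sally.com", "larry@sally.larry.com")

# Hyphen placed last in each class so it is taken literally, not as a range operator
regex <- "^[[:alnum:]._-]+@[[:alnum:].-]+$"

is_email <- grepl(regex, emails)
print(is_email)

# A string with spaces and no "@" should not match
print(grepl(regex, "not an email"))
```

Because the pattern is anchored with ^ and $, it tests whole strings. Pulling addresses out of longer text would need the anchors removed.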

R regex gsub separate letters and numbers

I have a string that mixes letters and numbers: "The sample is 22mg". I'd like to split strings where a number is immediately followed by a letter, like this: "The sample is 22 mg". I've tried this: gsub('[0-9]+[[aA-zZ]]', '[0-9]+ [[aA-zZ]]', 'This is a test 22mg') but am not getting the desired results. Any suggestions? You need to use capturing parentheses in the regular expression and group references in the replacement. For example: gsub('([0-9])([[:alpha:]])', '\\1 \\2', 'This is a test 22mg') There's nothing R-specific here; the R help for regex and gsub should be of some use. You need
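
The capture-group answer can be tried as below, together with a lookahead variant that needs perl = TRUE. The second form and the extra test vector are illustrative additions, not part of the original answer.

```r
x <- "The sample is 22mg"

# Capture the digit and the letter, then put a space between the two groups
with_groups <- gsub("([0-9])([[:alpha:]])", "\\1 \\2", x)

# Variant: match only the digit and use a lookahead to require a following letter
with_lookahead <- gsub("([0-9])(?=[[:alpha:]])", "\\1 ", x, perl = TRUE)

# Made-up vector showing that gsub() is vectorised and handles repeated boundaries
more <- gsub("([0-9])([[:alpha:]])", "\\1 \\2", c("5kg", "no digits", "a1b2c"))

print(with_groups)
print(more)
```

The original attempt failed because the replacement string is taken literally apart from backreferences, so a pattern such as '[0-9]+' placed there is inserted as text.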

Detect multiple strings with dplyr and stringr

Question: I'm trying to combine dplyr and stringr to detect multiple patterns in a data frame. I want to use dplyr as I want to test a number of different columns. Here's some sample data: test.data <- data.frame(item = c("Apple", "Bear", "Orange", "Pear", "Two Apples")) fruit <- c("Apple", "Orange", "Pear") test.data item 1 Apple 2 Bear 3 Orange 4 Pear 5 Two Apples What I would like to use is something like: test.data <- test.data %>% mutate(is.fruit = str_detect(item, fruit)) and receive item is.fruit
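
The expected output is cut off above, so this is a sketch under one assumption: an item counts as fruit when it contains any of the patterns. It collapses the patterns into a single alternation and uses base R's grepl(); the same collapsed pattern can be passed to str_detect inside mutate. The variable name pattern is illustrative.

```r
# Sample data from the question
test.data <- data.frame(item = c("Apple", "Bear", "Orange", "Pear", "Two Apples"))
fruit <- c("Apple", "Orange", "Pear")

# Build one regular expression that matches any of the fruit names: "Apple|Orange|Pear"
pattern <- paste(fruit, collapse = "|")

# TRUE where the item contains at least one of the names (case-sensitive)
test.data$is.fruit <- grepl(pattern, test.data$item)

print(test.data)
```

Passing the vector fruit directly would pair patterns with rows element by element instead of testing every pattern against every row, which is why the patterns are collapsed first. Under this substring rule "Two Apples" counts as fruit. An exact-match rule would instead use item %in% fruit.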

R: How to ignore case when using str_detect?

The stringr package provides good string functions. To search for a string (ignoring case) one could use stringr::str_detect('TOYOTA subaru', ignore.case('toyota')). This works but gives the warning: Please use (fixed|coll|regex)(x, ignore_case = TRUE) instead of ignore.case(x). What is the right way of rewriting it? You can use the regex function (or fixed, as @lmo comments, depending on what you need) to make the pattern, as detailed in ?modifiers or ?str_detect (see the description of the pattern parameter): library(stringr) str_detect('TOYOTA subaru', regex('toyota', ignore_case = TRUE)) # [1] TRUE the search
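
For comparison, a base R sketch of the same case-insensitive search, which needs no packages. These are base equivalents shown for illustration, not the stringr call from the answer.

```r
x <- "TOYOTA subaru"

# Regular-expression match that ignores case
found_regex <- grepl("toyota", x, ignore.case = TRUE)

# Literal (non-regex) match after lower-casing the input
found_fixed <- grepl("toyota", tolower(x), fixed = TRUE)

# Without the case option the search is case-sensitive and fails here
found_strict <- grepl("toyota", x)

print(c(found_regex, found_fixed, found_strict))
```

The fixed = TRUE form may be preferable when the search term can contain characters that have a special meaning in regular expressions.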