stringr

RegEx and stringr package

有些话、适合烂在心里 submitted on 2020-01-15 12:12:53
Question: I am an R newbie and am having trouble with my programming homework. The input is a poem: poem <- c( "Am Tag, an dem das L verschwand,", "da war die Luft voll Klagen.", "Den Dichtern, ach, verschlug es glatt", "ihr Singen und ihr Sagen.", "Nun gut. Sie haben sich gefasst.", "Man sieht sie wieder schreiben.", "Jedoch:", "Solang das L nicht wiederkehrt,", "muß alles Flickwerk beiben.") Now I need to extract all the capital letters and combine them into one word. I am doing this with the following
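The excerpt stops before the asker's own attempt, so the following is only a minimal sketch of one way to do this with stringr, not the asker's code. It assumes `[[:upper:]]` is an acceptable definition of "capital letter"; the names `capitals` and `word` are illustrative.

```r
library(stringr)

poem <- c(
  "Am Tag, an dem das L verschwand,",
  "da war die Luft voll Klagen.",
  "Den Dichtern, ach, verschlug es glatt",
  "ihr Singen und ihr Sagen.",
  "Nun gut. Sie haben sich gefasst.",
  "Man sieht sie wieder schreiben.",
  "Jedoch:",
  "Solang das L nicht wiederkehrt,",
  "muß alles Flickwerk beiben."
)

# str_extract_all() returns a list with one character vector per line
capitals <- str_extract_all(poem, "[[:upper:]]")

# Flatten the list and glue all letters together into a single string
word <- str_c(unlist(capitals), collapse = "")
word
```

Base R's `regmatches()` with `gregexpr()` would be an alternative to `str_extract_all()` here.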

Using str_detect (or some other function) and some way to loop through a list to essentially perform a vlookup

醉酒当歌 submitted on 2020-01-15 10:14:34
Question: I have been searching for a way to do this, and although some results on here seem similar, nothing seems to be working, nor can I find a method that will loop through a list like a VLOOKUP in Excel. I apologize if I have missed it. I am trying to add a new column to a data set with mutate. It should look at one column using str_replace (or some other function if necessary) and then loop through another list. I want to replace what it finds with the corresponding value in another
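One possible way to get VLOOKUP-like behavior is to pass a named character vector to `str_replace_all()`, where the names are the patterns and the values are the replacements. The sketch below uses made-up data (`df`, `lookup`, and the column names are hypothetical, since the question's data is not shown) and plain column assignment instead of `mutate()` to keep the dependencies small.

```r
library(stringr)

# Hypothetical lookup table: names are patterns, values are replacements
lookup <- c("NY" = "New York", "CA" = "California")

# Hypothetical data set
df <- data.frame(office = c("Office NY", "Office CA", "Office TX"),
                 stringsAsFactors = FALSE)

# Each pattern in `lookup` is applied in turn to every element of the column;
# strings that match no pattern are returned unchanged
df$office_full <- str_replace_all(df$office, lookup)
df
```

Because the names are treated as regular expressions and applied one after another, patterns with special characters may need `fixed()` or escaping, and an earlier replacement can be matched by a later pattern.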

Convert HTML Entity to proper character R

流过昼夜 submitted on 2020-01-11 11:30:07
Question: Does anyone know of a generic function in R that can convert an HTML entity such as &auml; to its Unicode character ä? I have seen some functions that take in ä and convert it to a normal character. Any help would be appreciated. Thanks. Edit: Below is one record of the data; I probably have over 1 million records. Is there an easier solution other than reading the data into a massive vector and changing each element? wine/name: 1999 Domaine Robert Chevillon Nuits St. Georges 1er Cru Les Vaucrains
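As a rough illustration of what such a conversion involves, here is a base R sketch. `decode_entities()` is a hypothetical helper written for this example, and its table of named entities is a small hand-picked assumption, not a complete list; a real data set would likely need a full entity table or an HTML parser.

```r
# Hypothetical helper: decodes decimal numeric entities plus a small,
# hand-picked table of named entities (an assumption, not a complete list).
decode_entities <- function(x,
                            named = c(auml = "\u00e4", ouml = "\u00f6",
                                      uuml = "\u00fc", amp = "&")) {
  # Decimal numeric entities such as &#228;
  m <- gregexpr("&#[0-9]+;", x)
  regmatches(x, m) <- lapply(regmatches(x, m), function(hits) {
    codes <- as.integer(gsub("[^0-9]", "", hits))
    vapply(codes, intToUtf8, character(1))
  })
  # Named entities from the lookup table; &amp; comes last so that
  # text like "&amp;auml;" is not decoded twice
  for (nm in names(named)) {
    x <- gsub(paste0("&", nm, ";"), named[[nm]], x, fixed = TRUE)
  }
  x
}

out <- decode_entities(c("Gr&auml;fin", "Gr&#228;fin", "R &amp; D"))
out
```

The function is vectorized, so it can be applied to a whole column at once rather than element by element.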

Concatenate previous and latter words to a word that match a condition in R

走远了吗. submitted on 2020-01-04 00:19:12
Question: I need to concatenate the words before and after each word that meets a condition, specifically each word that contains a comma. vector <- c("Paulsen", "Kehr,", "Diego", "Schalper", "Sepúlveda,", "Diego") #I know how to get which elements meet my condition: grepl(",", vector) #[1] FALSE TRUE FALSE FALSE TRUE FALSE Desired output: print(vector_ok) #[1] "Paulsen Kehr, Diego", "Schalper Sepúlveda, Diego" Thanks in advance! Answer 1: You can use grep() to get the positions of the
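The answer is cut off after mentioning `grep()`, so the following is a guess at the general idea rather than the answerer's exact code. It assumes that no comma-bearing word sits at the very start or end of the vector; `idx` is an illustrative name.

```r
vector <- c("Paulsen", "Kehr,", "Diego", "Schalper", "Sepúlveda,", "Diego")

# Positions of the elements that contain a comma
idx <- grep(",", vector)

# paste() is vectorized, so each match is joined with its left and right neighbor
vector_ok <- paste(vector[idx - 1], vector[idx], vector[idx + 1])
vector_ok
```

If a match could occur at position 1 or at the last position, `idx - 1` or `idx + 1` would fall outside the vector, and those cases would need separate handling.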

Extract text in parentheses in R

时光毁灭记忆、已成空白 submitted on 2020-01-02 03:41:09
Question: Two related questions. I have vectors of text data such as "a(b)jk(p)" "ipq" "e(ijkl)" and want to easily separate it into a vector containing the text OUTSIDE the parentheses: "ajk" "ipq" "e" and a vector containing the text INSIDE the parentheses: "bp" "" "ijkl" Is there any easy way to do this? An added difficulty is that these strings can get quite large and contain a large (unlimited) number of parentheses. Thus, I can't simply grab the text "pre/post" the parentheses and need a smarter solution. Answer 1:
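The answer text is missing from the excerpt. One way to approach the problem in base R is sketched below, under the assumption that parentheses are balanced and never nested; `outside` and `inside` are illustrative names.

```r
x <- c("a(b)jk(p)", "ipq", "e(ijkl)")

# Text OUTSIDE: delete every "(...)" group, however many there are
outside <- gsub("\\([^()]*\\)", "", x)

# Text INSIDE: find every run of characters enclosed by "(" and ")",
# then collapse the pieces found in each string into one string
pieces <- regmatches(x, gregexpr("(?<=\\()[^()]*(?=\\))", x, perl = TRUE))
inside <- vapply(pieces, paste, character(1), collapse = "")

outside
inside
```

Strings without parentheses yield an empty string in `inside`, which matches the output the question asks for.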

Why is stringr changing encoding when manipulating strings?

你说的曾经没有我的故事 submitted on 2020-01-02 02:31:07
Question: There is this strange behavior of stringr that is really annoying me: without any warning, stringr changes the encoding of some strings that contain non-ASCII characters, in my case ø, å, æ, é and some others. If you str_trim a character vector, the elements with such letters are converted to a new encoding. letter1 <- readline('Gimme an ASCII character!') # try q or a letter2 <- readline('Gimme a non-ASCII character!') # try ø or é Letters <- c(letter1, letter2) Encoding(Letters) #
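The behavior described can be reproduced without interactive input. The sketch below is an assumption-laden illustration: it builds a Latin-1 string by hand (the byte `\xf8` is ø in Latin-1) instead of using `readline()`, so the starting encoding does not depend on the platform.

```r
library(stringr)

# "q" is plain ASCII; "\xf8" is the single Latin-1 byte for ø
x <- c("q", "\xf8")
Encoding(x) <- "latin1"   # ASCII elements are never marked, so only x[2] changes
Encoding(x)

# stringr hands its work to stringi, which returns UTF-8 strings
y <- str_trim(x)
Encoding(y)

# The declared encoding differs, but the character itself is the same
y[2] == "\u00f8"
```

If a downstream step needs the native encoding again, base R's `enc2native()` or `iconv()` can convert the result back.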

R: How to ignore case when using str_detect?

ε祈祈猫儿з submitted on 2020-01-01 04:04:55
Question: The stringr package provides good string functions. To search for a string (ignoring case), one could use stringr::str_detect('TOYOTA subaru',ignore.case('toyota')) This works but gives the warning: Please use (fixed|coll|regex)(x, ignore_case = TRUE) instead of ignore.case(x) What is the right way of rewriting it? Answer 1: You can use the regex function (or fixed, as @lmo's comment suggests, depending on what you need) to build the pattern, as detailed in ?modifiers or ?str_detect (see the instruction for pattern parameter
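Following the wording of the warning itself, the call can be rewritten by wrapping the pattern in `regex()` or `fixed()` with `ignore_case = TRUE`. A short sketch, with illustrative variable names:

```r
library(stringr)

# Treat the pattern as a regular expression, ignoring case
res_regex <- str_detect("TOYOTA subaru", regex("toyota", ignore_case = TRUE))

# Treat the pattern as a literal string, ignoring case
res_fixed <- str_detect("TOYOTA subaru", fixed("toyota", ignore_case = TRUE))

res_regex
res_fixed
```

`fixed()` is the better fit when the search term may contain characters that have a special meaning in regular expressions, such as `.` or `(`.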

Split a character vector into individual characters? (opposite of paste or stringr::str_c)

吃可爱长大的小学妹 submitted on 2019-12-27 12:08:13
Question: An incredibly basic question in R, yet the solution isn't clear. How do I split a character vector into its individual characters, i.e. the opposite of paste(..., sep='') or stringr::str_c() ? Is there anything less clunky than this: sapply(1:26, function(i) { substr("ABCDEFGHIJKLMNOPQRSTUVWXYZ",i,i) } ) "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S" "T" "U" "V" "W" "X" "Y" "Z" Can it be done otherwise, e.g. with strsplit() , stringr::* or anything else? Answer 1: Yes, strsplit
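The answer is truncated right after naming `strsplit`, so the following is a plausible reading of it rather than a quotation. Splitting on the empty string breaks a string into single characters; `chars` is an illustrative name.

```r
# strsplit() returns a list with one element per input string,
# so [[1]] picks out the characters of the first (and only) string
chars <- strsplit("ABCDEFGHIJKLMNOPQRSTUVWXYZ", "")[[1]]
chars
```

For a vector with several strings, dropping the `[[1]]` keeps the full list, with one character vector per input string.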

Creating Groups with Dplyr's “group_by” then Using Stringr to Find Differences Between Groups

末鹿安然 submitted on 2019-12-25 08:28:50
Question: Using the example below, I want to group the data frame by CaseWorker, then Client, and then determine for each Client group whether the list of tasks in "Task" is the same as the list of tasks in "Task2". I would be happy with a simple true or false, or better yet, if each task that is in "Task2" but not in "Task" could be extracted and displayed in a new column or data frame. So basically I need to make sure "Task" and "Task2" contain the same entries for each individual Client. I would like to stick
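The question's example data is not included in the excerpt, so the sketch below uses made-up rows and base R instead of dplyr. Everything in `df`, along with the output column names `same_tasks` and `missing_from_Task`, is hypothetical. The idea is to split the rows by CaseWorker and Client and compare the two task columns as sets within each group.

```r
# Hypothetical data: the question's actual data frame is not shown
df <- data.frame(
  CaseWorker = c("Ann", "Ann", "Ann", "Bob", "Bob"),
  Client     = c("C1", "C1", "C2", "C3", "C3"),
  Task       = c("call", "visit", "call", "email", "visit"),
  Task2      = c("visit", "call", "email", "email", "visit"),
  stringsAsFactors = FALSE
)

# One data frame per CaseWorker/Client combination that actually occurs
groups <- split(df, list(df$CaseWorker, df$Client), drop = TRUE)

result <- do.call(rbind, lapply(groups, function(g) {
  extra <- setdiff(g$Task2, g$Task)   # tasks in Task2 that Task lacks
  data.frame(
    CaseWorker        = g$CaseWorker[1],
    Client            = g$Client[1],
    same_tasks        = setequal(g$Task, g$Task2),
    missing_from_Task = paste(extra, collapse = ", "),
    stringsAsFactors  = FALSE
  )
}))
rownames(result) <- NULL
result
```

Set comparison ignores order and duplicates; if repeated tasks matter, sorting both columns and comparing them with `identical()` would be a stricter check.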