stringr | 易学教程

R remove only “[” “]” from string

Question: I have something like this: test [1] "[0 30.5 4.5 10.5 2 35 22.999999999999996 29 5.500000000000001 23.5 18 23.5 44.5 3 44.5 44.00000000000001 43 27 42 35.5 19.5 44.00000000000001 1 0 31 34 18 1.5 26 6 45.99999999999999 10.5 9.5 24 20 42.5 14.5 45.5 20.499999999999996 150 45.5 0 4.5 22.5 4 9 8 0 0 15.5 30.5 7 5.500000000000001 12.5 33.5 15 500 22.5 18 43 4.5 26 23.5 16 4.5 7.5 32 0 0 18.5 33 31 14.5 21.5 0 40 0 0 43.49999999999999 22.999999999999996]" and I would like to remove [ and ] (first
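One way to approach this, sketched in base R under the assumption that the goal is to drop every literal `[` and `]` and then parse the numbers. The shortened input and the names `cleaned` and `values` are illustrative and do not come from the question.

```r
# Shortened stand-in for the string in the question (illustrative values).
test <- "[0 30.5 4.5 10.5 2 35]"

# Escape the brackets so the regex engine treats them as literal characters.
cleaned <- gsub("\\[|\\]", "", test)

# Optional follow-up: split on single spaces and convert to numbers.
values <- as.numeric(strsplit(cleaned, " ", fixed = TRUE)[[1]])
```

With stringr loaded, `str_remove_all(test, "\\[|\\]")` should give the same cleaned string.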

Change language encoding in existing Df (not on import)

Question: I'm looking to "correct" the encoding of a data frame with a mix of English and French. I'm not loading it from a .csv but from an API, so I won't be able to change the encoding on import. df <- tibble(ID = 1:4, text = c("engish", "pour la mise en Å“uvre dâ€™une ville", "SÃ©curitÃ© de l'information - Ouverture des donnÃ©es", "Directeur GÃ©nÃ©ral")) Encoding(df$text) [1] "unknown" "latin1" "latin1" "latin1" Using this function from the proustr package changes the encoding, but not the characters: pattern_quote
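The garbled strings look like UTF-8 text that was decoded as Windows-1252 somewhere upstream. Under that assumption, a possible repair is to re-encode each character to the single byte it came from and then declare those bytes to be UTF-8. This is a base R sketch with illustrative sample values (written with `\u` escapes so they do not depend on the file's encoding). It assumes the local iconv knows the name `windows-1252`.

```r
# Mojibake samples: "Sécurité" and "œuvre" after a wrong Windows-1252 decode.
garbled <- c("english", "S\u00c3\u00a9curit\u00c3\u00a9", "\u00c5\u201cuvre")

# Step 1: map each character back to the single byte it was decoded from.
bytes <- iconv(garbled, from = "UTF-8", to = "windows-1252")

# Step 2: declare that those bytes are really UTF-8.
Encoding(bytes) <- "UTF-8"
repaired <- bytes
```

Strings containing characters that have no Windows-1252 byte come back as `NA` from `iconv()`, so it is worth checking for `NA` before overwriting the original column.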

purrr pmap to read max column name by column name number

Question: I have this dataset: library(dplyr) Problem <- tibble(name = c("Angela", "Claire", "Justin", "Bob", "Gil"), status_1 = c("Registered", "No Action", "Completed", "Denied", "No Action"), status_2 = c("Withdrawn", "No Action", "Registered", "No Action", "Exempt"), status_3 = c("No Action", "Registered", "Withdrawn", "No Action", "No Action")) I want to make a column that has everyone's current status. If the person has ever completed the course, they are completed. If they were ever exempt, they

R stringR RegExp strategy for grouping like expressions without prior knowledge

Question: I've got a list of 50K+ part numbers. I need to group them by their Product Type. Part numbers are typically near each other in sequence, although they're not perfectly sequential. The product description is always similar, but does not follow optimum rules. Let me illustrate with the following table.

| PartNo  | Description   | ProductType |
|---------|---------------|-------------|
| A000443 | Water Bottle  | Water       |
| A000445 | Contain Water | Water       |
| A000448 | WaterBotHold  | Water       |
| HRZ55   | Hershey

R regex - extract words beginning with @ symbol

Question: I'm trying to extract Twitter handles from tweets using R's stringr package. For example, suppose I want to get all words in a vector that begin with "A". I can do this like so: library(stringr) # Get all words that begin with "A" str_extract_all(c("hAi", "hi Ahello Ame"), "(?<=\\b)A[^\\s]+") [[1]] character(0) [[2]] [1] "Ahello" "Ame" Great. Now let's try the same thing using "@" instead of "A": str_extract_all(c("h@i", "hi @hello @me"), "(?<=\\b)\\@[^\\s]+") [[1]] [1] "@i" [[2]] character(0)
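The surprising result follows from how `\b` works: it matches only between a word character and a non-word character. Since "@" is a non-word character, a boundary exists in front of it only when a word character precedes it, which is the case in "h@i" but not in " @hello". A possible alternative, sketched here in base R, is to require that no word character precedes the "@". The pattern and the name `handle_pattern` are illustrative, and `\w+` is only a rough approximation of what counts as a handle.

```r
tweets <- c("h@i", "hi @hello @me")

# "@" followed by word characters, but only when no word character precedes it.
handle_pattern <- "(?<!\\w)@\\w+"

# perl = TRUE is needed for the lookbehind.
handles <- regmatches(tweets, gregexpr(handle_pattern, tweets, perl = TRUE))
```

Here `handles[[1]]` is empty and `handles[[2]]` holds "@hello" and "@me". The same pattern should also work with `str_extract_all()`.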

Export csv with ISO-8859-1 encoding instead of UTF-8

Question: I struggle with encoding in CSV exports. I'm from the Netherlands and we use quite a few tremas (e.g. ë, ï) and accents (e.g. é, ó). This causes trouble when exporting to CSV and opening the file in Excel. I'm on macOS Mojave. I've tried multiple encoding functions like the following. library(stringr) library(readr) test <- c("Argentinië", "België", "Haïti") test %>% stringi::stri_conv(., "UTF-8", "ISO-8859-1") %>% write.csv2("~/Downloads/test.csv") But still, this causes weird characters: Answer 1:

How to give Backslash as replacement in R string replace [duplicate]

Question: I need to replace ">" with "\". Example: "a>b" should be changed to "a\b". I have tried gsub: > test <- "a>b" > gsub(">","\\",test, fixed = TRUE) [1] "a\\b" I have tried stringr's str_replace: > library(stringr) > str_replace(test,">","\\") [1] "ab" I have tried stringi
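The `gsub()` attempt may already be doing the right thing: `print()` shows a backslash in its escaped form, so "a\\b" on the console is a three-character string containing one backslash. A small base R sketch to check this, with illustrative names `literal` and `via_regex`:

```r
test <- "a>b"

# fixed = TRUE: the replacement is taken literally, so "\\" is one backslash.
literal <- gsub(">", "\\", test, fixed = TRUE)

# Regex mode: the replacement itself needs an escaped backslash.
via_regex <- gsub(">", "\\\\", test)

cat(literal, "\n")   # prints: a\b
nchar(literal)       # 3
```

`cat()` and `nchar()` show the real contents, whereas auto-printing shows the escaped representation.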

How to detect substrings from multiple lists within a string in R

Question: I am trying to find out whether a string called "values" contains substrings from two different lists. This is my current code: for (i in 1:length(value)) { for (j in 1:length(city)) { if (str_detect(value[i], city[j])) { for (k in 1:length(school)) { if (str_detect(value[i], school[k])) { ........................................................... } } } } } city and school are separate vectors of different lengths, each containing string elements. city <- c("Madrid", "London"
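The nested loops can be replaced by a small helper that asks, for each string, whether any term from a list occurs in it. This is a base R sketch. The helper `contains_any` and the sample contents of `value` and `school` are hypothetical, and only "Madrid" and "London" come from the question.

```r
value  <- c("Madrid Central School", "London", "Paris Academy")  # hypothetical
city   <- c("Madrid", "London")
school <- c("School", "Academy")                                 # hypothetical

# TRUE for each element of `text` that contains at least one of `terms`.
contains_any <- function(text, terms) {
  vapply(
    text,
    function(one_text) {
      any(vapply(terms, function(term) grepl(term, one_text, fixed = TRUE),
                 logical(1)))
    },
    logical(1),
    USE.NAMES = FALSE
  )
}

has_city   <- contains_any(value, city)
has_school <- contains_any(value, school)
has_both   <- has_city & has_school
```

Matching with `fixed = TRUE` treats the terms as plain text, so names containing regex metacharacters need no escaping.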

Why are my row names dropped and how to avoid it?

Question: I want to replace a certain string with another in a data frame. Here is some sample code: table_ex <- data.frame(row.names = c("row 1", "row 2", "row 3")) table_ex$year1 <- 3:1 table_ex$year2 <- c("NaN", 5, "NaN %") table_ex$year3 <- c("NaN %", 7, "NaN %") remove_symb <- function(yolo){stringr::str_replace(yolo, 'NaN %|NaN', '')} table_ex <- mutate_all(table_ex, funs(remove_symb)) Doing the above drops my row names. I understand I could use an lapply function, but I'm wondering why are the
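For reference, the `lapply` route mentioned in the question can be sketched in base R as follows. Assigning into `table_ex[]` replaces the columns while keeping the data frame's existing attributes, row names included. The helper here uses `sub()` as a stand-in for `stringr::str_replace()`, which is an assumption made to keep the example dependency-free.

```r
table_ex <- data.frame(row.names = c("row 1", "row 2", "row 3"))
table_ex$year1 <- 3:1
table_ex$year2 <- c("NaN", 5, "NaN %")
table_ex$year3 <- c("NaN %", 7, "NaN %")

# Remove the first "NaN %" or "NaN" in each element (returns character).
remove_symb <- function(column) sub("NaN %|NaN", "", column)

# `[]` keeps the data frame shell, so the row names survive.
table_ex[] <- lapply(table_ex, remove_symb)
```

Note that every column, including `year1`, comes back as character after this step.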

find word near another using stringr

Question: I have a simple problem. Consider this example: library(dplyr) library(stringr) dataframe <- data_frame(mytext = c('stackoverflow is pretty good my friend', 'but sometimes pretty bad as well')) # A tibble: 2 x 1 mytext <chr> 1 stackoverflow is pretty good my friend 2 but sometimes pretty bad as well I want to count the number of times stackoverflow is near good. I use the following regex but it does not work. dataframe %>% mutate(mycount = str_count(mytext, regex('stackoverflow(?:\\w+){0,5
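One likely reason the pattern fails is that `(?:\\w+)` leaves no room for the spaces between words. A possible alternative, sketched in base R, lets each intervening word carry its leading separator. It assumes that "near" means at most five words in between (read off the truncated `{0,5`) and that "good" must follow "stackoverflow". The names `near_pattern` and `mycount` are illustrative.

```r
mytext <- c("stackoverflow is pretty good my friend",
            "but sometimes pretty bad as well")

# "stackoverflow", then up to five (separator + word) pairs, then "good".
near_pattern <- "\\bstackoverflow(?:\\W+\\w+){0,5}?\\W+good\\b"

# Count the matches per string.
mycount <- lengths(regmatches(mytext, gregexpr(near_pattern, mytext, perl = TRUE)))
```

The same pattern should be usable in `str_count(mytext, near_pattern)` inside `mutate()`.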