stringr

R remove only “[” “]” from string

自古美人都是妖i, posted on 2019-12-13 04:55:50
Question: I have something like:
    test
    [1] "[0 30.5 4.5 10.5 2 35 22.999999999999996 29 5.500000000000001 23.5 18 23.5 44.5 3 44.5 44.00000000000001 43 27 42 35.5 19.5 44.00000000000001 1 0 31 34 18 1.5 26 6 45.99999999999999 10.5 9.5 24 20 42.5 14.5 45.5 20.499999999999996 150 45.5 0 4.5 22.5 4 9 8 0 0 15.5 30.5 7 5.500000000000001 12.5 33.5 15 500 22.5 18 43 4.5 26 23.5 16 4.5 7.5 32 0 0 18.5 33 31 14.5 21.5 0 40 0 0 43.49999999999999 22.999999999999996]"
And I would like to remove [ and ] (first
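One possible approach, sketched in base R on a shortened, hypothetical input (the real element is much longer): escape both brackets in the pattern and replace them with an empty string. The numeric conversion at the end is an optional extra, not something the question asks for. With stringr loaded, `str_remove_all(test, "\\[|\\]")` should give the same cleaned string.

```r
# Hypothetical shortened input; the vector element in the question is much longer.
test <- "[0 30.5 4.5 10.5 2 35]"

# "[" and "]" are regex metacharacters, so both are escaped.
cleaned <- gsub("\\[|\\]", "", test)

# Optional: split on single spaces and convert the pieces to numbers.
values <- as.numeric(strsplit(cleaned, " ", fixed = TRUE)[[1]])

print(cleaned)
print(values)
```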

Change language encoding in existing Df (not on import)

一曲冷凌霜, posted on 2019-12-13 03:00:40
Question: I'm looking to "correct" the encoding of a data frame with a mix of English and French. I'm not loading it from a .csv but from an API, so I won't be able to change the encoding on import.
    df <- tibble(ID = 1:4,
                 text = c("engish", "pour la mise en Å“uvre d’une ville", "Sécurité de l'information - Ouverture des données", "Directeur Général"))
    Encoding(df$text)
    [1] "unknown" "latin1" "latin1" "latin1"
Using this function from the proustr package changes the encoding, but not the characters: pattern_quote

purrr pmap to read max column name by column name number

北战南征, posted on 2019-12-13 02:49:35
Question: I have this dataset:
    library(dplyr)
    Problem <- tibble(name = c("Angela", "Claire", "Justin", "Bob", "Gil"),
                      status_1 = c("Registered", "No Action", "Completed", "Denied", "No Action"),
                      status_2 = c("Withdrawn", "No Action", "Registered", "No Action", "Exempt"),
                      status_3 = c("No Action", "Registered", "Withdrawn", "No Action", "No Action"))
I want to make a column that has everyone's current status. If the person has ever completed the course, they are completed. If they were ever exempt, they

R stringR RegExp strategy for grouping like expressions without prior knowledge

血红的双手。, posted on 2019-12-13 01:03:45
Question: I've got a list of 50K+ part numbers. I need to group them by their Product Type. Part numbers are typically near each other in sequence, although they're not perfectly sequential. The product description is always similar, but does not follow optimum rules. Let me illustrate with the following table.
| PartNo  | Description   | ProductType |
|---------|---------------|-------------|
| A000443 | Water Bottle  | Water       |
| A000445 | Contain Water | Water       |
| A000448 | WaterBotHold  | Water       |
| HRZ55   | Hershey

R regex - extract words beginning with @ symbol

与世无争的帅哥, posted on 2019-12-12 10:46:17
Question: I'm trying to extract Twitter handles from tweets using R's stringr package. For example, suppose I want to get all words in a vector that begin with "A". I can do this like so:
    library(stringr)
    # Get all words that begin with "A"
    str_extract_all(c("hAi", "hi Ahello Ame"), "(?<=\\b)A[^\\s]+")
    [[1]]
    character(0)
    [[2]]
    [1] "Ahello" "Ame"
Great. Now let's try the same thing using "@" instead of "A":
    str_extract_all(c("h@i", "hi @hello @me"), "(?<=\\b)\\@[^\\s]+")
    [[1]]
    [1] "@i"
    [[2]]
    character(0)
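A likely explanation for the output above: `\b` matches only between a word character and a non-word character. "@" is a non-word character, so there is a boundary in front of it in "h@i" (after "h") but none after a space or at the start of the string. A minimal base R sketch of one alternative uses a negative lookbehind, `(?<!\\w)`, which requires that "@" is not preceded by a word character. The pattern is an illustrative assumption rather than the accepted answer; the same pattern should also work in `str_extract_all()`.

```r
tweets <- c("h@i", "hi @hello @me")

# (?<!\\w) : the character before "@" must not be a word character,
#            so "@" at the string start or after a space qualifies.
# @\\w+    : the handle itself (letters, digits, underscore).
m <- gregexpr("(?<!\\w)@\\w+", tweets, perl = TRUE)
handles <- regmatches(tweets, m)

print(handles)
```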

Export csv with ISO-8859-1 encoding instead of UTF-8

荒凉一梦, posted on 2019-12-11 10:08:12
Question: I struggle with encoding in CSV exports. I'm from the Netherlands, and we use quite a few tremas (e.g. ë, ï) and accents (e.g. é, ó). This causes trouble when exporting to CSV and opening the file in Excel. I'm on macOS Mojave. I've tried multiple encoding functions, like the following:
    library(stringr)
    library(readr)
    test <- c("Argentinië", "België", "Haïti")
    test %>% stringi::stri_conv(., "UTF-8", "ISO-8859-1") %>% write.csv2("~/Downloads/test.csv")
But this still produces weird characters:
Answer 1:

How to give Backslash as replacement in R string replace [duplicate]

笑着哭i, posted on 2019-12-11 05:52:51
Question: This question already has answers here: R: How to replace space (' ') in string with a *single* backslash and space ('\ ') (2 answers); How do I deal with special characters like \^$.?*|+()[{ in my regex? (2 answers). Closed 2 years ago.
I need to replace ">" with "\". Example: "a>b" should be changed to "a\b".
I have tried gsub:
    > test <- "a>b"
    > gsub(">","\\",test, fixed = TRUE)
    [1] "a\\b"
I have tried stringr's str_replace:
    > library(stringr)
    > str_replace(test,">","\\")
    [1] "ab"
I have tried Stringi
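A short base R sketch of what is probably happening in the gsub attempt above: the result already contains a single backslash, and `print()` merely displays it in escaped form. Checking the length or writing the raw characters makes this visible. For the stringr attempt, the replacement string treats the backslash as an escape character, so `str_replace(test, ">", "\\\\")` is the form usually suggested.

```r
test <- "a>b"

# With fixed = TRUE the replacement is taken literally: one backslash.
result <- gsub(">", "\\", test, fixed = TRUE)

print(result)       # print() shows the backslash escaped, as in the question
writeLines(result)  # writeLines() writes the raw characters instead
n_chars <- nchar(result)
print(n_chars)      # three characters: "a", one backslash, "b"
```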

How to detect substrings from multiple lists within a string in R

天大地大妈咪最大, posted on 2019-12-11 04:09:22
Question: I am trying to find out whether a string called "value" contains substrings from two different lists. This is my current code:
    for (i in 1:length(value)) {
      for (j in 1:length(city)) {
        if (str_detect(value[i], city[j]) == TRUE) {
          for (k in 1:length(school)) {
            if (str_detect(value[i], school[k]) == TRUE) {
              ...........................................................
            }
          }
        }
      }
    }
city and school are separate vectors of different lengths, each containing string elements.
    city <- c("Madrid", "London"
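The nested loops can often be replaced by two vectorized checks. The sketch below is base R with made-up data: the contents of `value` and `school` are hypothetical because the question is cut off, and the elided loop body is not reproduced. It joins each list into one alternation pattern, which assumes the entries contain no regex metacharacters.

```r
# Hypothetical data; only the first two city names appear in the question.
value  <- c("Madrid Central School", "London Bridge", "Oxford High School")
city   <- c("Madrid", "London")
school <- c("School", "College")

# One pattern per list: "Madrid|London" and "School|College".
has_city   <- grepl(paste(city, collapse = "|"), value)
has_school <- grepl(paste(school, collapse = "|"), value)

# TRUE where a string contains an entry from both lists.
both <- has_city & has_school
print(both)
```

With stringr, `str_detect(value, paste(city, collapse = "|"))` plays the same role as `grepl()` here.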

Why are my row names dropped and how to avoid it?

梦想的初衷, posted on 2019-12-11 03:34:15
Question: I want to replace a certain string with another in a data frame. Here is some sample code:
    table_ex <- data.frame(row.names = c("row 1", "row 2", "row 3"))
    table_ex$year1 <- 3:1
    table_ex$year2 <- c("NaN", 5, "NaN %")
    table_ex$year3 <- c("NaN %", 7, "NaN %")
    remove_symb <- function(yolo){stringr::str_replace(yolo, 'NaN %|NaN', '')}
    table_ex <- mutate_all(table_ex, funs(remove_symb))
Doing the above drops my row names. I understand I could use an lapply function, but I'm wondering why are the
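The question reports that `mutate_all()` dropped the row names; tidyverse tools generally do not treat row names as data, and the exact behavior may differ between dplyr versions. A minimal sketch of the lapply route the asker mentions, in base R only: assigning into `table_ex[]` replaces the columns while keeping the data frame's attributes, including row names. `sub()` stands in for `str_replace()` here, since both replace only the first match.

```r
table_ex <- data.frame(row.names = c("row 1", "row 2", "row 3"))
table_ex$year1 <- 3:1
table_ex$year2 <- c("NaN", 5, "NaN %")
table_ex$year3 <- c("NaN %", 7, "NaN %")

# Base R stand-in for the asker's remove_symb(); returns character vectors.
remove_symb <- function(col) sub("NaN %|NaN", "", col)

# table_ex[] <- ... overwrites the columns but keeps the row names.
# Note that year1 becomes character as a side effect of sub().
table_ex[] <- lapply(table_ex, remove_symb)

print(table_ex)
```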

find word near another using stringr

三世轮回, posted on 2019-12-10 23:57:29
Question: I have a simple problem. Consider this example:
    library(dplyr)
    library(stringr)
    dataframe <- data_frame(mytext = c('stackoverflow is pretty good my friend', 'but sometimes pretty bad as well'))
    # A tibble: 2 x 1
      mytext
      <chr>
    1 stackoverflow is pretty good my friend
    2 but sometimes pretty bad as well
I want to count the number of times stackoverflow is near good. I use the following regex, but it does not work:
    dataframe %>% mutate(mycount = str_count(mytext, regex('stackoverflow(?:\\w+){0,5
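One thing visible in the truncated pattern is that `(?:\\w+)` leaves no room for the spaces between words. The base R sketch below is an assumption about the intent, not a completion of the asker's regex: it treats "near" as "followed by good within at most five intervening words" and allows non-word characters between the words. The same pattern should also be usable with `str_count()`.

```r
mytext <- c("stackoverflow is pretty good my friend",
            "but sometimes pretty bad as well")

# (?:\\W+\\w+){0,5}? : up to five intervening words, each preceded by
#                      at least one non-word character (lazy match).
# \\W+good\\b        : then a separator and the whole word "good".
pattern <- "\\bstackoverflow(?:\\W+\\w+){0,5}?\\W+good\\b"

matches <- regmatches(mytext, gregexpr(pattern, mytext, perl = TRUE))
mycount <- lengths(matches)  # number of matches per string
print(mycount)
```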