stringi

Filter by multiple patterns with filter() and str_detect()

孤街浪徒 submitted on 2019-12-01 08:11:37
I would like to filter a dataframe using filter() and str_detect() matching for multiple patterns without multiple str_detect() function calls. In the example below I would like to filter the dataframe df to show only rows containing the letters a, f and o.
    df <- data.frame(numbers = 1:52, letters = letters)
    df %>%
      filter(
        str_detect(.$letters, "a") |
        str_detect(.$letters, "f") |
        str_detect(.$letters, "o")
      )
    #   numbers letters
    # 1       1       a
    # 2       6       f
    # 3      15       o
    # 4      27       a
    # 5      32       f
    # 6      41       o
I have attempted the following
    df %>%
      filter(
        str_detect(.$letters, c("a", "f", "o"))
      )
    #   numbers letters
    # 1       1       a
    # 2      15       o
    # 3      32
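One possible way to get the first result with a single str_detect() call is to collapse the patterns into one regular expression with alternation. This is only a minimal sketch, assuming dplyr and stringr are installed; the names `patterns` and `result` are illustrative and do not come from the original post.

```r
library(dplyr)
library(stringr)

df <- data.frame(numbers = 1:52, letters = letters)

# Hypothetical helper vector: the patterns to match
patterns <- c("a", "f", "o")

# Collapse into the single regex "a|f|o" so one call checks every row
# against all patterns, instead of recycling the pattern vector row by row
result <- df %>%
  filter(str_detect(letters, paste(patterns, collapse = "|")))

print(result)
```

The vectorised attempt in the question returns fewer rows because str_detect() pairs each row with one recycled pattern rather than testing every pattern against every row. For exact matches on whole values, `letters %in% patterns` would be an alternative that avoids regular expressions altogether.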

Retrieving sentence score based on values of words in a dictionary

删除回忆录丶 submitted on 2019-12-01 06:33:49
Edited df and dict. I have a data frame containing sentences:
    df <- data_frame(text = c("I love pandas", "I hate monkeys", "pandas pandas pandas", "monkeys monkeys"))
And a dictionary containing words and their corresponding scores:
    dict <- data_frame(word = c("love", "hate", "pandas", "monkeys"), score = c(1,-1,1,-1))
I want to append a column "score" to df that would sum the score for each sentence. Expected results:
      text                 score
    1 I love pandas            2
    2 I hate monkeys          -2
    3 pandas pandas pandas     3
    4 monkeys monkeys         -2
Update: here are the results so far. Akrun's methods, Suggestion 1:
    df %>% mutate(score
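The scoring described above can be sketched as a word lookup followed by a per-sentence sum. This is not one of the methods from the thread, only an illustration under stated assumptions: it uses base data.frame() in place of data_frame(), relies on stringi's stri_extract_all_words() for tokenising, and the names `lookup` and `words` are hypothetical.

```r
library(stringi)

df <- data.frame(
  text = c("I love pandas", "I hate monkeys",
           "pandas pandas pandas", "monkeys monkeys"),
  stringsAsFactors = FALSE
)
dict <- data.frame(
  word = c("love", "hate", "pandas", "monkeys"),
  score = c(1, -1, 1, -1),
  stringsAsFactors = FALSE
)

# Named vector so that a word can be looked up by name (assumed helper)
lookup <- setNames(dict$score, dict$word)

# One character vector of words per sentence
words <- stri_extract_all_words(df$text)

# Words missing from the dictionary yield NA and are dropped from the sum
df$score <- vapply(words, function(w) sum(lookup[w], na.rm = TRUE), numeric(1))

print(df)
```

Matching here is case-sensitive and exact, so "Pandas" or "panda" would score zero unless the text is normalised first.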

Retrieving sentence score based on values of words in a dictionary

半世苍凉 submitted on 2019-12-01 03:20:55
Question: Edited df and dict. I have a data frame containing sentences:
    df <- data_frame(text = c("I love pandas", "I hate monkeys", "pandas pandas pandas", "monkeys monkeys"))
And a dictionary containing words and their corresponding scores:
    dict <- data_frame(word = c("love", "hate", "pandas", "monkeys"), score = c(1,-1,1,-1))
I want to append a column "score" to df that would sum the score for each sentence. Expected results:
      text                 score
    1 I love pandas            2
    2 I hate monkeys          -2
    3 pandas pandas pandas     3
    4

How to install stringi library from archive and install the local icu52l.zip

帅比萌擦擦* submitted on 2019-12-01 03:09:47
We're bumbling through making some R code work in a production environment and as part of that we're installing some R packages as follows:
    # Default directories and mirrors
    WORKING_DIR <- "/srv/foo/bar/baz"
    LIB_DIR <- paste( WORKING_DIR, "libs", sep="/" )
    setwd(WORKING_DIR)
    stringi.loc <- paste( WORKING_DIR, "stringi_0.4-1.tar.gz", sep="/" )
This might not be the most elegant way of installing R packages but it seems to work okay for us (any other tips on R package management would be welcome but a bit late at this stage :). However, the stringi package seems to depend on the icu52l package,

gsub speed vs pattern length

时光怂恿深爱的人放手 submitted on 2019-12-01 02:07:55
I've been using gsub extensively lately, and I noticed that short patterns run faster than long ones, which is not surprising. Here is fully reproducible code:
    library(microbenchmark)
    set.seed(12345)
    n = 0
    rpt = seq(20, 1461, 20)
    msecFF = numeric(length(rpt))
    msecFT = numeric(length(rpt))
    inp = rep("aaaaaaaaaa",15000)
    for (i in rpt) {
      n = n + 1
      print(n)
      patt = paste(rep("a", rpt[n]), collapse = "")
      #time = microbenchmark(func(count[1:10000,12], patt, "b"), times = 10)
      timeFF = microbenchmark(gsub(patt, "b", inp, fixed=F), times = 10)
      msecFF[n] = mean(timeFF$time)/1000000.
      timeFT =

Filter by multiple patterns with filter() and str_detect()

隐身守侯 submitted on 2019-11-30 17:25:21
Question: I would like to filter a dataframe using filter() and str_detect() matching for multiple patterns without multiple str_detect() function calls. In the example below I would like to filter the dataframe df to show only rows containing the letters a, f and o.
    df <- data.frame(numbers = 1:52, letters = letters)
    df %>%
      filter(
        str_detect(.$letters, "a") |
        str_detect(.$letters, "f") |
        str_detect(.$letters, "o")
      )
    #   numbers letters
    # 1       1       a
    # 2       6       f
    # 3      15       o
    # 4      27       a
    # 5      32       f
    # 6      41       o
I have attempted the

object 'C_stri_join' not found - Using knitr in Rstudio

拥有回忆 submitted on 2019-11-28 14:14:43
When using the Knit button in RStudio I get an error: object 'C_stri_join' not found. Here is an example:
    ---
    title: "Sample Document"
    output:
      html_document:
        toc: true
        theme: united
    ---
    <!--
    %\VignetteEngine{knitr::knitr}
    %\VignetteIndexEntry{Basic test}
    -->
    Here we go
    ```{r}
    x <- 1
    str(x)
    ```
The error is as follows:
    Error in stri_c(..., sep = sep, collapse = collapse, ignore_null = TRUE) :
      object 'C_stri_join' not found
    Calls: suppressPackageStartupMessages ... evaluate_call -> handle_output -> <Anonymous> -> str_c -> stri_c
This comes after a recent update of my R packages:
    > sessionInfo()

package 'stringi' does not work after updating to R3.2.1

自古美人都是妖i submitted on 2019-11-28 07:27:58
I saw a version of this question posted, but still did not see the answer. I am trying to use ggplot2 but get the following errors (everything worked this morning using R 3.0.2 "Frisbee Sailing" with RStudio version 0.98.1102). I updated both R and RStudio and now get the following:
    library(ggplot2)
    Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) :
      there is no package called ‘stringi’
    Error: package or namespace load failed for ‘ggplot2’
So naturally I tried:
    > install.packages('stringi')
    There is a binary version available but the source version is later: binary

Cannot install stringi since Xcode Command Line Tools update

不羁岁月 submitted on 2019-11-28 05:44:13
Question:
    System: macOS Sierra 10.12.6
    Xcode: 9.2 (2347)
    R: 3.4.0
    RStudio: 1.1.383
I'm attempting to install the latest version of stringi (1.1.6). This has not been possible since the most recent update to Xcode. The error received is "configure: error: C compiler cannot create executables", with full output here:
    Installing package into ‘/usr/local/lib/R/3.4/site-library’
    (as ‘lib’ is unspecified)
    trying URL 'http://cran.rstudio.com/src/contrib/stringi_1.1.6.tar.gz'
    Content type 'application/x-gzip' length

Error in R: (Package which is only available in source form, and may need compilation of C/C++/Fortran)

笑着哭i submitted on 2019-11-27 04:36:29
I'm trying to install the 'yaml' and 'stringi' packages in RStudio, and it keeps giving me these errors:
    > install.packages("stringi")
    Package which is only available in source form, and may need compilation of C/C++/Fortran: ‘stringi’
    These will not be installed
or
    > install.packages('yaml')
    Package which is only available in source form, and may need compilation of C/C++/Fortran: ‘yaml’
    These will not be installed
How can I get these to install properly?
LyzandeR: The error is due to R being unable to find a binary version of the package on CRAN, instead only finding a source version of the