subset

Filtering rows in R unexpectedly removes NAs when using subset or dplyr::filter

扶醉桌前 submitted on 2020-06-08 17:45:59
Question: I have a dataset df and I would like to remove all rows for which variable y has the value a. Variable y also contains some NAs: df <- data.frame(x=1:3, y=c('a', NA, 'c')) I can achieve this using R's indexing syntax like this: df[df$y!='a',] x y 2 <NA> 3 c Note this returns both the NA and the value c, which is what I want. However, when I try the same thing using subset or dplyr::filter, the NA gets stripped out: subset(df, y!='a') x y 3 c dplyr::filter(df, y!='a') x y 3 c Why
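The behaviour described in this question can be reproduced with a minimal base R sketch. It relies on two facts: comparing against NA yields NA, and subset() treats a missing condition as false. The is.na() workaround is one possible approach, not necessarily the answer given in the original thread; the variable names cond, dropped and kept are illustrative.

```r
df <- data.frame(x = 1:3, y = c("a", NA, "c"))

# Comparing with NA yields NA rather than TRUE or FALSE
cond <- df$y != "a"   # FALSE NA TRUE

# subset() keeps only rows where the condition is TRUE,
# so the row with a missing y is dropped
dropped <- subset(df, y != "a")

# Test for NA explicitly to keep rows whose y is missing
kept <- subset(df, is.na(y) | y != "a")
```

The same condition, is.na(y) | y != "a", can also be passed to dplyr::filter.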

Conditionally removing duplicates in R

ぃ、小莉子 submitted on 2020-05-28 10:49:27
Question: I have a dataset in which I need to conditionally remove duplicated rows based on values in another column. Specifically, I need to delete any row where size = 0 only if SampleID is duplicated. SampleID<-c("a", "a", "b", "b", "b", "c", "d", "d", "e") size<-c(0, 1, 1, 2, 3, 0, 0, 1, 0) data<-data.frame(SampleID, size) I want to delete rows with: SampleID size a 0 d 0 And keep: SampleID size a 1 b 1 b 2 b 3 c 0 d 1 e 0 Note: the actual dataset is very large, so I am not looking for a way to just
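One vectorised way to express the rule above in base R is sketched here. It is an illustration under the stated rule, not the thread's accepted answer; is_dup and result are illustrative names. Note the assumption that a duplicated SampleID has at least one non-zero row: if every row of a duplicated SampleID had size 0, this sketch would remove all of them.

```r
SampleID <- c("a", "a", "b", "b", "b", "c", "d", "d", "e")
size <- c(0, 1, 1, 2, 3, 0, 0, 1, 0)
data <- data.frame(SampleID, size)

# TRUE for every row whose SampleID occurs more than once
# (duplicated() alone would miss the first occurrence)
is_dup <- duplicated(data$SampleID) | duplicated(data$SampleID, fromLast = TRUE)

# Drop rows that have size 0 and belong to a duplicated SampleID
result <- data[!(data$size == 0 & is_dup), ]
```

Because no loop over rows is involved, the same expression should scale to a large data frame.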

R dplyr subset with missing columns

吃可爱长大的小学妹 submitted on 2020-05-28 08:35:09
Question: I have the following code and would like to select columns into a new data.frame. library(dplyr) df = data.frame( Manhattan=c(1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0), Brooklyn=c(0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0), The_Bronx=c(1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0), Staten_Island=c(0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0), "2012"=c("P", "P", "P", "P", "P", "P", "P", "P", "P", "P", "Q", "Q", "Q", "Q", "Q", "Q", "Q", "Q",

Why does 'out of bounds' indexing differ between a matrix and a data.frame?

谁都会走 submitted on 2020-05-25 11:26:13
Question: I'm sure this is kind of basic, but I'd just like to really understand the logic of R data structures here. If I subset a matrix by an index that is out of bounds, I get exactly that error: m <- matrix(data = c("foo", "bar"), nrow = 1) m[2,] # Error in m[2, ] : subscript out of bounds If I do the same to a data frame, however, I get all NA rows: df <- data.frame(foo = "foo", bar = "bar") df[2,] # foo bar # NA <NA> <NA> If I subset into a non-existent data frame column I get the familiar df[, 3] # Error

Subset a data frame based on user-input, Shiny

非 Y 不嫁゛ submitted on 2020-05-11 04:30:53
Question: I'm trying to build a Shiny app that subsets a data frame (to only include rows where a categorical variable matches the user-selected input from the UI) before the data is processed in Server and then visualized in the UI. I've tried several different methods but I keep getting errors, e.g. "object of type 'closure' is not subsettable". Then when I try to cast the reactive user input with target <- toString(reactive({input$value})) I get the following error: "Error in as.vector(x, "character")

CF 1132A, 1132B, 1132C, 1132D, 1132E, 1132F (Round 61 A, B, C, D, E, F) solutions

孤者浪人 submitted on 2020-05-09 10:56:06
A. Regular bracket sequence
A string is called a bracket sequence if it does not contain any characters other than "(" and ")". A bracket sequence is called regular if it is possible to obtain a correct arithmetic expression by inserting characters "+" and "1" into this sequence. For example, "", "(())" and "()()" are regular bracket sequences; "))" and ")((" are bracket sequences (but not regular ones), and "(a)" and "(1)+(1)" are not bracket sequences at all. You have a number of strings; each string is a bracket sequence of length 2. So, overall you have cnt_1 strings "(("
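The definition of a regular bracket sequence above can be checked with a running balance counter. This is a sketch of the definition only, not a solution to the full problem; is_regular is a hypothetical helper name, and the input is assumed to contain only "(" and ")".

```r
# Returns TRUE if s is a regular bracket sequence.
# Assumes s contains only the characters "(" and ")".
is_regular <- function(s) {
  chars <- if (nzchar(s)) strsplit(s, "")[[1]] else character(0)
  balance <- 0
  for (ch in chars) {
    if (ch == "(") {
      balance <- balance + 1
    } else {
      balance <- balance - 1
    }
    # A closing bracket with no matching opener makes the sequence irregular
    if (balance < 0) return(FALSE)
  }
  # Every opener must have been closed
  balance == 0
}
```

Applied to the examples in the statement, is_regular accepts "", "(())" and "()()" and rejects "))" and ")((".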

Finding all Coprime subset upto a number N

旧巷老猫 submitted on 2020-05-08 19:48:25
Question: Suppose I have numbers 1 to N and I want to divide them into subsets based on the following criteria: Each number can be present in only 1 subset. The elements of each subset must be mutually coprime. The total number of subsets must be minimized. My approach is to find all primes up to N using the Sieve of Eratosthenes and then divide them accordingly into subsets. For example for N=5, I can have two subsets at minimum {1,2,3,5} and {4}. But I am unsure how to distribute the elements in subsets so
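The mutual-coprimality criterion above can be verified for any candidate subset with a small helper. The sketch below only checks a proposed subset; it does not construct the minimal partition the question asks for. The names gcd and is_coprime_set are illustrative and not taken from the original post.

```r
# Greatest common divisor by the Euclidean algorithm
gcd <- function(a, b) {
  while (b != 0) {
    remainder <- a %% b
    a <- b
    b <- remainder
  }
  a
}

# TRUE if every pair of elements in v has gcd 1
is_coprime_set <- function(v) {
  if (length(v) < 2) return(TRUE)   # zero or one element is trivially coprime
  pairs <- combn(v, 2)              # one column per unordered pair
  all(apply(pairs, 2, function(p) gcd(p[1], p[2]) == 1))
}
```

For the N=5 example in the question, is_coprime_set(c(1, 2, 3, 5)) and is_coprime_set(4) both hold, while a subset such as c(2, 4) fails the check.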