subset | 易学教程

Subset a list of multiple data frames based on either row or column match

Question

    library(tidyverse)
    library(dplyr)

A list of data frames where some have matches to the vector and some don't:

    lsdf <- list(
      list1 = head(mtcars),
      list2 = as.data.frame(t(head(mtcars))) %>% rownames_to_column(., var = "ID"),
      list3 = head(starwars)
    )

The vector of names that match some of the data frames:

    vec <- c("mpg", "wt", "am")

If I do it manually, step by step, it looks like this:

    df1 <- lsdf$list1 %>% select(vec) # column-name matches
    df2 <- lsdf$list2 %>% filter(ID %in% vec) # values

Finding the mean of a subset

Question

I have made a subset of the data frame 'Indometh' called 'indo':

    indo
       Subject time conc
    1        1 0.25 1.50
    13       2 0.50 1.63
    24       3 0.50 1.49
    25       3 0.75 1.16
    34       4 0.25 1.85
    35       4 0.50 1.39
    36       4 0.75 1.02
    46       5 0.50 1.04
    57       6 0.50 1.44
    58       6 0.75 1.03

I want to find the average concentration for the subset. I have tried the following code, but to no avail:

    mean(subset(indo, conc > 1 & conc < 2))

I know summary(indo) will show the mean of the concentration, but I wanted to know if there was another way I could do this

Implementing set operations (intersection, union, difference, etc.) with Java's Set

Show me the code

    package com.lun.util;

    import java.util.HashSet;
    import java.util.Set;

    public class SetUtils {

        public static <T> Set<T> union(Set<T> setA, Set<T> setB) {
            Set<T> tmp = new HashSet<T>(setA);
            tmp.addAll(setB);
            return tmp;
        }

        public static <T> Set<T> intersection(Set<T> setA, Set<T> setB) {
            Set<T> tmp = new HashSet<T>();
            for (T x : setA)
                if (setB.contains(x))
                    tmp.add(x);
            return tmp;
        }

        public static <T> Set<T> difference(Set<T> setA, Set<T> setB) {
            Set<T> tmp = new HashSet<T>(setA);
            tmp.removeAll(setB);
            return tmp;
        }

        public static <T> Set<T> symDifference(Set<T> setA, Set<T> setB) {
            Set<T> tmpA;
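The excerpt cuts off inside symDifference, so the author's version is not visible here. As a rough, independent sketch of what a symmetric difference computes (the elements found in exactly one of the two sets), something like the following could work. The class name SetOpsSketch and the one-pass approach are assumptions made for illustration, not the original implementation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical class name; not part of the com.lun.util package shown in the excerpt.
public class SetOpsSketch {

    // Returns a new set with the elements that are in exactly one of setA and setB.
    // Neither input set is modified.
    public static <T> Set<T> symDifference(Set<T> setA, Set<T> setB) {
        Set<T> result = new HashSet<>(setA);
        for (T x : setB) {
            // add() returns false when x is already present, meaning x is in both sets,
            // so it must not appear in the symmetric difference.
            if (!result.add(x)) {
                result.remove(x);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> a = new HashSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> b = new HashSet<>(Arrays.asList(3, 4));
        // 3 is shared by both sets, so it is left out of the result.
        System.out.println(symDifference(a, b));
    }
}
```

Copying setA first keeps the method free of side effects, in the same spirit as the union and difference methods in the excerpt.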

Concatenate expressions to subset a dataframe

Question

I am attempting to create a function that will calculate the mean of a column in a subsetted data frame. The trick here is that I always want to have a couple of subsetting conditions and then have the option to pass more conditions to the function to further subset the data frame. Suppose my data look like this:

    dat <- data.frame(var1 = rep(letters, 26), var2 = rep(letters, each = 26), var3 = runif(26^2))
    head(dat)
      var1 var2      var3
    1    a    a 0.7506109
    2    b    a 0.7763748
    3    c    a 0.6014976
    4    d    a 0.6229010

R subset with condition using %in% or ==. Which one should be used? [duplicate]

Question

This question already has answers here: Subset dataframe by multiple logical conditions of rows to remove (8 answers). Closed 5 years ago.

Usually, if I want to subset a data frame conditional on some values of a variable, I use subset and %in%:

    x <- data.frame(u=1:10,v=LETTERS[1:10])
    x
    subset(x, v %in% c("A","D"))

Now I have found out that == also gives the same result:

    subset(x, v == c("A","D"))

I'm just wondering whether they are identical or whether there is a reason to prefer one over the other.

Create an index that increases after each gap in otherwise regularly-increasing row

Question

I've got a samples data frame which contains some readings at regularly spaced timestamps (1 sec. interval).

       TS                    Pressure Temperature
    [...]
    8  2014-08-26 00:18:26.8 105      30
    9  2014-08-26 00:18:27.8 108      32
    10 2014-08-26 00:18:28.8 109.9    31
    11 2014-08-26 00:34:20.8 109      20
    12 2014-08-26 00:34:21.8 100      24
    13 2014-08-26 00:34:22.8 95       22
    [...]

I only have records during some events of interest (e.g. when Pressure < 110) and don't have any records outside of these events. I want to give a unique ID

Recover original array from all subsets

Question

You are given all subset sums of an array. You are then supposed to recover the original array from the subset sums provided. Every element in the original array is guaranteed to be non-negative and less than 10^5. There are no more than 20 elements in the original array. The original array is also sorted. The input is guaranteed to be valid.

Example 1

If the subset sums provided are these:

    0 1 5 6 6 7 11 12

We can quickly deduce that the size of the original array is 3 since there are 8 (2^3)
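One common way to approach this kind of problem, sketched here under the stated guarantees (non-negative elements, valid input), is to treat the sums as a multiset and peel off the smallest unexplained sum repeatedly: once the empty subset's 0 is removed, the smallest remaining sum has to be the next element, and every sum it forms with the already-known sums can then be discarded. The class name SubsetSumRecovery and its method names are assumptions made for illustration; the excerpt does not show the asker's or any answer's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical class name, used only for this sketch.
public class SubsetSumRecovery {

    // Recovers the original (sorted, non-negative) array from the multiset of all its subset sums.
    public static List<Integer> recover(int[] subsetSums) {
        // Multiset of sums not yet explained: value -> how many times it still occurs.
        TreeMap<Integer, Integer> remaining = new TreeMap<>();
        for (int s : subsetSums) {
            remaining.merge(s, 1, Integer::sum);
        }

        // Sums that can be formed from the elements recovered so far; starts with the empty subset.
        List<Integer> known = new ArrayList<>();
        known.add(0);
        removeOne(remaining, 0);

        List<Integer> elements = new ArrayList<>();
        while (!remaining.isEmpty()) {
            // With non-negative elements, the smallest unexplained sum is itself an element.
            int x = remaining.firstKey();
            elements.add(x);

            // Every known sum combined with x is now explained; iterate over a fixed prefix
            // because the list grows while we loop.
            int size = known.size();
            for (int i = 0; i < size; i++) {
                int s = known.get(i) + x;
                removeOne(remaining, s);
                known.add(s);
            }
        }
        return elements;
    }

    // Removes a single occurrence of key from the multiset, failing loudly on invalid input.
    private static void removeOne(TreeMap<Integer, Integer> counts, int key) {
        Integer c = counts.get(key);
        if (c == null) {
            throw new IllegalArgumentException("Input is not a valid multiset of subset sums");
        }
        if (c == 1) {
            counts.remove(key);
        } else {
            counts.put(key, c - 1);
        }
    }

    public static void main(String[] args) {
        int[] sums = {0, 1, 5, 6, 6, 7, 11, 12};
        System.out.println(recover(sums)); // prints [1, 5, 6]
    }
}
```

For the example sums above this yields 1, 5 and 6, whose subsets do produce 0 1 5 6 6 7 11 12. Because the smallest remaining sum is always taken first, the elements come out in ascending order. With at most 20 elements there are at most 2^20 sums, so the TreeMap-based multiset stays manageable.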

Ruby: Array contained in Array, any order [duplicate]

Question

This question already has answers here: Check if an array is subset of another array in Ruby (4 answers). Closed 3 years ago.

Suppose I have the following Ruby code:

    array_1 = ['a', 'b']
    array_2 = ['a', 'b', 'c']

    some_function(array_1, array_2) # => True
    some_function(array_2, array_1) # => False
    some_function(['a', 'b'], ['a', 'd']) # => False
    some_function(['x', 'y'], array_2) # => False

I am pretty much looking for some_function to return True when Parameter 2 contains all of the elements

Subset data.table based on value in column of type list

Question

I currently have a data.table with one column of type list. This list can contain different values, NULL among them. I tried to subset the data.table to keep only the rows for which this column has the value NULL. My attempts are below (for the example, I named the column "ColofTypeList"):

    DT[is.null(ColofTypeList)]

It returns an Empty data.table. Then I tried:

    DT[ColofTypeList == NULL]

It returns the following error (I expected an error):

    Error in