tapply | 易学教程

How to assign a counter to a specific subset of a data.frame which is defined by a factor combination?

Question: I have a data frame with some factor variables. I want to add a new vector to this data frame that holds an index for each subset defined by those factor variables.

    data <- data.frame(fac1 = factor(rep(1:2, 5)),
                       fac2 = sample(letters[1:3], 10, replace = TRUE))

This gives me something like:

       fac1 fac2
    1     1    a
    2     2    c
    3     1    b
    4     2    a
    5     1    c
    6     2    b
    7     1    a
    8     2    a
    9     1    b
    10    2    c

What I want is a combination counter that counts the occurrences of each factor combination, like this:

      fac1 fac2 counter
    1    1    a       1
    2    2    c
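One common base R way to build such a counter is `ave()` with `seq_along`. The sketch below is only an illustration, not the answer from the original thread. It assumes a fixed `fac2` vector (copied from the printed example above) in place of `sample()`, so the result is reproducible.

```r
# fac2 is fixed to the values printed in the question, instead of sample(),
# so that the counter is the same on every run (an assumption for illustration).
data <- data.frame(
  fac1 = factor(rep(1:2, 5)),
  fac2 = c("a", "c", "b", "a", "c", "b", "a", "a", "b", "c")
)

# ave() applies FUN within each fac1/fac2 combination and returns a vector
# aligned with the original rows; seq_along() numbers the rows of each group.
data$counter <- ave(seq_len(nrow(data)), data$fac1, data$fac2, FUN = seq_along)

print(data)
```

Because `ave()` returns one value per row rather than one per group, the result can be assigned straight back to the data frame.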

sum multiple columns by group with tapply

Question: I wanted to sum individual columns by group, and my first thought was to use tapply. However, I cannot get tapply to work. Can tapply be used to sum multiple columns? If not, why not? I have searched the internet extensively and found numerous similar questions posted as far back as 2008. However, none of those questions have been answered directly. Instead, the responses invariably suggest using a different function. Below is an example data set for which I wish to sum apples by state,
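tapply is typically called on one vector at a time, so a simple way to cover several columns is to call it once per column. The sketch below uses a hypothetical data frame, because the question's data set is cut off in this excerpt. The `state` and `apples` names come from the question; `cherries` and all values are made up.

```r
# Hypothetical data: only "state" and "apples" are named in the question.
fruit <- data.frame(
  state    = c("AA", "AA", "BB", "BB", "BB"),
  apples   = c(1, 2, 3, 4, 5),
  cherries = c(10, 20, 30, 40, 50)
)

# Call tapply once per numeric column; sapply binds the per-group sums
# into a matrix with one row per state and one column per variable.
sums <- sapply(fruit[c("apples", "cherries")],
               function(col) tapply(col, fruit$state, sum))

# aggregate() computes the same totals and returns a data frame instead.
sums_df <- aggregate(cbind(apples, cherries) ~ state, data = fruit, FUN = sum)

print(sums)
print(sums_df)
```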

Calculate mean value of sets of 4 sub locations from multiple location from a larger matrix

Question: I am doing a data analysis on wall thickness measurements of circular tubes. I have the following matrix:

    > head(datIn, 12)
       Component Tube.number Measurement.location Sub.location Interval  Unit      Start
    1         In           1                    1            A      121 U6100  7/25/2000
    2         In           1                    1            A      122 U6100  5/24/2001
    3         In           1                    1            A      222 U6200  1/19/2001
    4         In           1                    1            A      321 U6300   6/1/2000
    5         In           1                    1            A      223 U6200  5/22/2002
    6         In           1                    1            A      323 U6300  6/18/2002
    7         In           1                    1            A       21 U6200  10/1/1997
    8         In           1                    1            A      221 U6200   6/3/2000
    9         In           1                    1            A      322 U6300 12/11/2000
    10        In           1                    1            B      122

R - Get number of values per group without counting NAs

Question: I'm trying to count the number of values per group in a column without counting the NAs. I tried using "length", but I can't figure out how to tell "length" to ignore the NAs when counting values per group. I've found similar problems but couldn't figure out how to apply the solutions to my case:

    Length of columns excluding NA in r
    http://r.789695.n4.nabble.com/Length-of-vector-without-NA-s-td2552208.html

I've created a minimal working example to illustrate
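A minimal sketch of one way to do this with tapply: sum the logical vector `!is.na(v)` instead of calling `length`. The data frame here is hypothetical, since the question's own example is cut off in this excerpt.

```r
# Hypothetical data: column names and values are made up for illustration.
df <- data.frame(
  group = c("a", "a", "a", "b", "b", "c"),
  value = c(1, NA, 3, NA, NA, 6)
)

# length() would count the NAs too; summing !is.na(v) counts only
# the non-missing values within each group.
counts <- tapply(df$value, df$group, function(v) sum(!is.na(v)))

print(counts)
```

A group that contains only NAs gets a count of 0 rather than being dropped.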

Scale all values depending on group [duplicate]

Question: This question already has answers here: group by and scale/normalize a column in r (2 answers). Closed 2 years ago.

I have a dataframe similar to this one:

    ID <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
    p1 <- c(21000, 23400, 26800, 2345, 23464, 34563, 456433, 56543, 34543, 3524, 353, 3432, 4542, 6343, 4534)
    p2 <- c(234235, 2342342, 32, 23432, 23423, 2342342, 34, 2343, 23434, 23434, 34, 234, 2343, 34, 5)
    my.df <- data.frame(ID, p1, p2)

Now I would like to scale the values in p1 and p2 depending on
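The excerpt is cut off before it names the grouping variable. The sketch below assumes that the group is `ID` and that "scale" means centring to mean 0 and scaling to standard deviation 1 within each group. Both are assumptions, and `scale_vec` is a hypothetical helper name.

```r
ID <- c(1,1,1,1,1,2,2,2,2,2,3,3,3,3,3)
p1 <- c(21000, 23400, 26800, 2345, 23464, 34563, 456433, 56543, 34543, 3524,
        353, 3432, 4542, 6343, 4534)
p2 <- c(234235, 2342342, 32, 23432, 23423, 2342342, 34, 2343, 23434, 23434,
        34, 234, 2343, 34, 5)
my.df <- data.frame(ID, p1, p2)

# Hypothetical helper: scale() returns a one-column matrix, so convert
# it back to a plain numeric vector.
scale_vec <- function(v) as.numeric(scale(v))

# ave() applies the helper within each ID and keeps the original row order.
my.df$p1_scaled <- ave(my.df$p1, my.df$ID, FUN = scale_vec)
my.df$p2_scaled <- ave(my.df$p2, my.df$ID, FUN = scale_vec)

print(head(my.df))
```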

understanding difference in results between dplyr group_by vs tapply

Question: I was expecting to see the same results from these two runs, but they are different. This makes me question whether I really understand how the dplyr code works (I have read pretty much everything I can find about dplyr in the package and online). Can anyone explain why the results are different, or how to obtain similar results?

    library(dplyr)
    x <- iris
    x <- x %>%
      group_by(Species, Sepal.Width) %>%
      summarise(freq = n()) %>%
      summarise(mean_by_group = mean(Sepal.Width))
    print(x)
    x <- iris
    x

Using variations of `apply` in R

Question: Often in research we have to produce a summary table. I would like to create a table using tapply in R. The only problem is that I have 40 variables, and I would like to perform the same operation on all 40 of them. Here is an example of the data:

    Age  Wt Ht Type
     79 134 66    C
     67 199 64    C
     39 135 78    T
     92 149 61    C
     33 138 75    T
     68 139 71    C
     95 198 62    T
     65 132 65    T
     56 138 81    C
     71 193 78    T

Essentially I would like to get it to produce the means of all the variables given the Type. It should
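A minimal sketch of one way to get the mean of every variable by Type without writing 40 separate tapply calls. It uses the ten sample rows from the question and swaps tapply for `aggregate()` with the `.` formula shorthand. The data frame name `dat` is an assumption.

```r
# The sample rows from the question; "dat" is a made-up name.
dat <- data.frame(
  Age  = c(79, 67, 39, 92, 33, 68, 95, 65, 56, 71),
  Wt   = c(134, 199, 135, 149, 138, 139, 198, 132, 138, 193),
  Ht   = c(66, 64, 78, 61, 75, 71, 62, 65, 81, 78),
  Type = c("C", "C", "T", "C", "T", "C", "T", "T", "C", "T")
)

# "." on the left-hand side stands for every column not named on the right,
# so the same call still works when there are 40 variables.
means <- aggregate(. ~ Type, data = dat, FUN = mean)

print(means)
```

The formula interface of `aggregate()` drops rows with missing values by default, which matters if the real data contain NAs.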

How to use tapply() within a for loop and print output in R?

Question: I am using tapply() to apply a function to my data:

    Myrepfun <- function(x,n){
      nstudents <- replicate(1000,sum(sample(x, size=n,replace=TRUE)))
      quantile(nstudents,probs=0.95)
    }
    tapply(weight,schoolcode,Myrepfun,n=2)

I would like to use this within a for loop and print out the output. I have tried the following and I get the error message:

    Error: unexpected symbol in "for(n in 12:13) (t=tapply(ow,sc,ndropfunction,n,p=0.95) output

    for(n in 1:25) {t=tapply(weight,schoolcode,Myrepfun,n,p=0.95)
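A minimal sketch of the loop, under some assumptions: `weight` and `schoolcode` are not shown in the question, so small hypothetical vectors stand in for them. The `p = 0.95` argument is left out because `Myrepfun` as defined takes only `x` and `n`. Two points matter: each statement inside the braces goes on its own line (or is separated by `;`), and values are not auto-printed inside a loop, so `print()` is called explicitly.

```r
set.seed(1)  # makes the resampling reproducible

# Hypothetical data standing in for the question's weight and schoolcode.
weight     <- c(50, 55, 60, 65, 70, 75, 80, 85)
schoolcode <- c("s1", "s1", "s1", "s1", "s2", "s2", "s2", "s2")

# Function from the question: 95th percentile of the resampled sums.
Myrepfun <- function(x, n) {
  nstudents <- replicate(1000, sum(sample(x, size = n, replace = TRUE)))
  quantile(nstudents, probs = 0.95)
}

results <- list()
for (n in 1:3) {
  res <- tapply(weight, schoolcode, Myrepfun, n = n)
  results[[as.character(n)]] <- res  # keep each result for later use
  print(res)                         # explicit print inside the loop
}
```

Storing each result in a list keeps the per-`n` output available after the loop ends. Naming the variable `res` rather than `t` avoids masking the base function `t()`.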

What is the difference between the functions tapply and ave?

Question: I can't wrap my mind around the ave function. I read the help and searched the net, but I still cannot understand what it does. I understand that it applies some function to a subset of observations, but not in the same way as, for example, tapply. Could someone please enlighten me, perhaps with a small example? Thanks, and excuse me for perhaps an unusual request.

Answer 1: tapply returns a single result for each factor level. ave also produces a single result per factor level, but it copies this value to
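A small side-by-side sketch of the two functions on made-up data shows the difference in the shape of the result. The vector names and values are assumptions for illustration.

```r
# Made-up data: five observations in two groups.
x <- c(1, 2, 3, 10, 20)
g <- factor(c("a", "a", "a", "b", "b"))

# tapply: one value per level of g (length 2 here).
per_group <- tapply(x, g, mean)

# ave: same length as x, with each group's mean placed at the
# positions of that group's observations.
per_row <- ave(x, g, FUN = mean)

print(per_group)
print(per_row)
```

The `ave()` result can be stored as a new column next to `x`, while the `tapply()` result is a summary with one entry per group.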