Working across sub-lists with apply() functions

Question

I am trying to bootstrap the proportional occurrence of diet items for 7 individuals and calculate the sd() of those proportions.

Let's say there are 9 prey items on the menu.

Diet <- c("Beaver","Bird", "Bobcat","Coyote", "Deer", "Elk",
    "Porcupine", "Raccoon",   "SmMamm")

And that these prey items are eaten by 7 different individuals of the same species

Inds <- c("P01", "P02", "P03", "P04", "P05", "P06", "P07")

My goal is to bootstrap the proportional occurrence of each diet item for each individual.
The loop below generates five diets for each individual, each diet containing N = 20 feedings sampled with replacement. The data are stored as a list of individuals, each of which contains a list of the sampled diets.

BootIndDiet <- list()
IndTotboot <- list()
for (i in Inds) {
    for (j in 1:5) {
        BootIndDiet[[j]] <- prop.table(table(sample(Diet, 20, replace = TRUE)))
    }
    IndTotboot[[i]] <- BootIndDiet
}

Below I have included the first two diets of individual P07 as an example of the loop's results:

$P07
$P07[[1]]

   Beaver      Bird    Bobcat      Deer       Elk 
     0.05      0.15      0.20      0.10      0.15 
Porcupine   Raccoon    SmMamm 
     0.15      0.15      0.05 

$P07[[2]]

   Beaver      Bird    Bobcat    Coyote      Deer 
     0.15      0.10      0.20      0.05      0.05 
      Elk Porcupine   Raccoon    SmMamm 
     0.05      0.20      0.10      0.10

I then want to calculate the sd() of the proportional occurrence of each species for each individual. Equivalently, for each individual (P01 - P07) I want the sd() of the proportional occurrence of each prey species across the 5 diets.

While my loop above runs, I suspect there is a better way (possibly using the boot() function) that avoids lists...

While I have only included 5 samples (bootstraps) for each individual here, I hope to generate 10000.

Suggestions on a different strategy, or on how to apply sd() across sub-lists, are greatly appreciated.

Answer 1:

I'd try to obtain an array (instead of a nested list) in this way:

    IndTotboot <- array(
        replicate(5 * length(Inds),
                  prop.table(table(sample(as.factor(Diet), 20, replace = TRUE))),
                  simplify = TRUE),
        dim = c(length(Diet), 5, length(Inds)),
        dimnames = list(Diet, NULL, Inds))

With replicate you can evaluate an expression a given number of times and store the results as an array (if possible). I added as.factor around Diet to make sure that table keeps track of every diet item, even those with a frequency of 0.
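
To see why the factor conversion matters, here is a small sketch comparing table() on a character vector with table() on a factor that carries all nine levels. The sample vector `s` is made up for illustration.

```r
Diet <- c("Beaver", "Bird", "Bobcat", "Coyote", "Deer", "Elk",
          "Porcupine", "Raccoon", "SmMamm")

# Hypothetical sample in which only three prey items occur
s <- c("Deer", "Deer", "Elk", "Bird")

# Character input: only the observed items are tabulated
tab_chr <- table(s)
length(tab_chr)   # 3

# Factor input with explicit levels: unobserved items get a count of 0
tab_fac <- table(factor(s, levels = Diet))
length(tab_fac)   # 9

# Proportions now always have the same nine entries, in the same order
prop.table(tab_fac)
```

Because every replicate then has the same length and order, the results can be stacked into a regular array without any name matching.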

The resulting IndTotboot object is a 3-dimensional array in which the first index is the diet item, the second the bootstrap replicate, and the third the individual. From there you can use apply in the standard way.

Edit:

If you try str(IndTotboot) you get:

    > str(IndTotboot)
     num [1:9, 1:5, 1:7] 0.1 0.15 0.15 0.1 0.1 0.1 0.15 0.05 0.1 0.15 ...
     - attr(*, "dimnames")=List of 3
       ..$ : chr [1:9] "Beaver" "Bird" "Bobcat" "Coyote" ...
       ..$ : NULL
       ..$ : chr [1:7] "P01" "P02" "P03" "P04" ...

The first line is the most important. It says num [1:9, 1:5, 1:7], which means a 9x5x7 array. The rest indicates the dimnames, the names of the dimensions, which is a list. They are the generalization of the rownames and the colnames for a matrix.
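
As a sketch of how the dimnames can be used, the array can be sliced by diet item or individual name. The seed and the variable names `p07` and `deer_p07` are arbitrary choices for this illustration.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible

Diet <- c("Beaver", "Bird", "Bobcat", "Coyote", "Deer", "Elk",
          "Porcupine", "Raccoon", "SmMamm")
Inds <- c("P01", "P02", "P03", "P04", "P05", "P06", "P07")

IndTotboot <- array(
  replicate(5 * length(Inds),
            prop.table(table(sample(as.factor(Diet), 20, replace = TRUE))),
            simplify = TRUE),
  dim = c(length(Diet), 5, length(Inds)),
  dimnames = list(Diet, NULL, Inds))

# All five bootstrap diets of individual P07: a 9 x 5 matrix
p07 <- IndTotboot[, , "P07"]

# Proportion of Deer in each of P07's five diets: a vector of length 5
deer_p07 <- IndTotboot["Deer", , "P07"]

# Each bootstrap diet is a set of proportions, so every column sums to 1
colSums(p07)
```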

Now, to obtain the sd for every diet item and individual, you just use apply:

    apply(IndTotboot,MARGIN=c(1,3),sd)
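
The result is a 9 x 7 matrix with prey items as rows and individuals as columns. The following sketch puts the pieces together with the number of replicates as a parameter, so it can be raised toward the 10000 mentioned in the question. The name `n_boot`, the seed, and the value 1000 are assumptions made for this illustration.

```r
set.seed(42)  # arbitrary seed for reproducibility

Diet <- c("Beaver", "Bird", "Bobcat", "Coyote", "Deer", "Elk",
          "Porcupine", "Raccoon", "SmMamm")
Inds <- c("P01", "P02", "P03", "P04", "P05", "P06", "P07")
n_boot <- 1000  # hypothetical parameter; raise to 10000 for the full run

IndTotboot <- array(
  replicate(n_boot * length(Inds),
            prop.table(table(sample(as.factor(Diet), 20, replace = TRUE))),
            simplify = TRUE),
  dim = c(length(Diet), n_boot, length(Inds)),
  dimnames = list(Diet, NULL, Inds))

# sd across the replicate dimension, keeping diet item (1) and individual (3)
sd_mat <- apply(IndTotboot, MARGIN = c(1, 3), sd)
dim(sd_mat)        # 9 7
round(sd_mat, 3)

# Sanity check: every item is drawn with probability 1/9, so the binomial sd
# of a proportion out of 20 feedings is sqrt((1/9) * (8/9) / 20), about 0.070
sqrt((1/9) * (8/9) / 20)
```

Since all individuals are sampled from the same uniform menu here, the entries of `sd_mat` should all be close to that theoretical value.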

Answer 2:

The following may be useful:

dd = data.frame(sapply(IndTotboot, function(x)x))

maindf = data.frame(Var1=as.character(), Freq=as.numeric())

for(rr in 1:nrow(dd)) for (cc in 1:ncol(dd)){
        maindf= merge(maindf, data.frame(dd[rr,cc]), all=TRUE)
}

> head(maindf, 10)
     Var1 Freq
1  Beaver 0.05
2  Beaver 0.10
3  Beaver 0.15
4  Beaver 0.20
5  Beaver 0.30
6    Bird 0.05
7    Bird 0.10
8    Bird 0.15
9    Bird 0.20
10   Bird 0.25


with(maindf, tapply(Freq, Var1, sd))
    Beaver       Bird     Bobcat     Coyote        Elk  Porcupine    Raccoon     SmMamm       Deer 
0.09617692 0.09354143 0.09354143 0.09354143 0.09354143 0.06454972 0.07905694 0.10801234 0.07905694

For each individual:

counter=1
for (cc in 1:ncol(dd)){
    maindf = data.frame(Var1=as.character(), Freq=as.numeric())
    for(rr in 1:nrow(dd)){
        maindf= merge(maindf, data.frame(dd[rr,cc]), all=TRUE)
    }
    cat("\nFor individual number: ",counter,"\n"); counter=counter+1
    print(with(maindf, tapply(Freq, Var1, sd)))
}


For individual number:  1 
    Beaver       Bird     Bobcat     Coyote        Elk  Porcupine    Raccoon     SmMamm       Deer 
0.05000000 0.07637626 0.05000000 0.06454972 0.03535534 0.05000000 0.05000000 0.07637626 0.05000000 

For individual number:  2 
    Beaver       Bird     Bobcat     Coyote       Deer        Elk  Porcupine    Raccoon     SmMamm 
0.05000000 0.10801234 0.03535534 0.05000000 0.09128709         NA 0.03535534 0.07637626 0.13149778 

For individual number:  3 
    Beaver       Bird     Bobcat     Coyote       Deer        Elk  Porcupine    Raccoon     SmMamm 
0.03535534 0.07637626 0.12583057 0.03535534 0.03535534 0.06454972 0.05000000 0.10606602 0.06454972 

For individual number:  4 
    Beaver       Bird     Bobcat     Coyote       Deer        Elk  Porcupine    Raccoon     SmMamm 
0.05000000 0.05000000 0.05000000 0.03535534 0.10408330 0.07905694 0.05000000 0.03535534 0.10408330 

For individual number:  5 
    Beaver       Bird     Bobcat     Coyote       Deer        Elk  Porcupine    Raccoon     SmMamm 
0.05000000 0.03535534 0.05000000 0.03535534 0.03535534 0.03535534 0.05000000 0.05000000 0.07071068 

For individual number:  6 
    Beaver     Coyote       Deer        Elk  Porcupine    Raccoon     SmMamm     Bobcat       Bird 
0.10000000 0.07637626 0.03535534 0.05000000 0.07071068 0.03535534 0.05000000 0.10408330 0.10606602 

For individual number:  7 
    Beaver       Bird     Bobcat     Coyote       Deer        Elk  Porcupine    Raccoon     SmMamm 
0.03535534 0.05000000 0.10408330 0.07637626 0.05000000 0.13228757 0.07637626 0.03535534 0.05000000

For each individual and species:

maindf = data.frame(Var1=as.character(), Freq=as.numeric(), ind=as.numeric())
counter=1
for (cc in 1:ncol(dd)){
    for(rr in 1:nrow(dd)){
        maindf= merge(maindf, cbind(data.frame(dd[rr,cc]),ind=counter), all=TRUE)
    }
    counter=counter+1
}
with(maindf, tapply(Freq, list(Var1,ind), sd))

                   1          2          3          4          5          6          7
Beaver    0.05000000 0.05000000 0.03535534 0.05000000 0.05000000 0.10000000 0.03535534
Bird      0.07637626 0.10801234 0.07637626 0.05000000 0.03535534 0.10606602 0.05000000
Bobcat    0.05000000 0.03535534 0.12583057 0.05000000 0.05000000 0.10408330 0.10408330
Coyote    0.06454972 0.05000000 0.03535534 0.03535534 0.03535534 0.07637626 0.07637626
Elk       0.03535534         NA 0.06454972 0.07905694 0.03535534 0.05000000 0.13228757
Porcupine 0.05000000 0.03535534 0.05000000 0.05000000 0.05000000 0.07071068 0.07637626
Raccoon   0.05000000 0.07637626 0.10606602 0.03535534 0.05000000 0.03535534 0.03535534
SmMamm    0.07637626 0.13149778 0.06454972 0.10408330 0.07071068 0.05000000 0.05000000
Deer      0.05000000 0.09128709 0.03535534 0.10408330 0.03535534 0.03535534 0.05000000
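
One caveat with the merge-based approach: merge(..., all = TRUE) matches on both Var1 and Freq, so repeated (item, proportion) pairs appear to collapse into a single row, and prey items absent from a sample never contribute a zero. That is consistent with the NA for Elk above, which is what sd returns for a single value. As an alternative sketch that works directly on the nested list from the question, each sample can be expanded to all nine prey items so that every replicate, including zeros and repeats, enters the sd. The helper name `align_diet` and the seed are hypothetical.

```r
set.seed(7)  # arbitrary seed for reproducibility

Diet <- c("Beaver", "Bird", "Bobcat", "Coyote", "Deer", "Elk",
          "Porcupine", "Raccoon", "SmMamm")
Inds <- c("P01", "P02", "P03", "P04", "P05", "P06", "P07")

# Nested list with the same structure as the question's loop produces
IndTotboot <- list()
for (i in Inds) {
  IndTotboot[[i]] <- lapply(1:5, function(j)
    prop.table(table(sample(Diet, 20, replace = TRUE))))
}

# Hypothetical helper: expand one table to all prey items, filling gaps with 0
align_diet <- function(tab) {
  out <- setNames(numeric(length(Diet)), Diet)
  out[names(tab)] <- as.numeric(tab)
  out
}

# For each individual: build a 9 x 5 matrix of diets, then take row-wise sd
sd_by_ind <- sapply(IndTotboot, function(diets) {
  m <- sapply(diets, align_diet)
  apply(m, 1, sd)
})
sd_by_ind  # 9 x 7 matrix: prey items in rows, individuals in columns
```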

Source: https://stackoverflow.com/questions/25595786/working-across-sub-lists-with-apply-functions

Tags

apply

boot

nested-lists