How to handle more than multiple sets of data in R programming?

    cuts <- cut(data$Time, breaks=seq(0, max(data$Time)+400, 400))
    by(data$Oxytocin, cuts, mean)

but this would only work for one person's data....

1 answer

渐次进展

2021-01-26 13:48
Here's a solution using the IRanges package.

idx assumes your data format is Time, data, Time, data, and so on, so it creates the indices 1, 3, 5, ..., ncol(df)-1 (the Time columns).
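As a small illustration of that layout, here is a sketch with a made-up data frame. The column names and values are hypothetical, chosen only to show which columns idx picks out.

```r
# Hypothetical wide-format data: two subjects, each with a Time column
# followed by a measurement column (names and values are made up).
df <- data.frame(
    Time1     = c(0, 100, 450, 900),
    Oxytocin1 = c(0.5, 0.7, 0.9, 0.4),
    Time2     = c(0, 200, 600, 1300),
    Oxytocin2 = c(0.8, 0.6, 0.2, 1.1)
)

# Odd-numbered columns hold the times; the matching data sit at idx + 1.
idx <- seq(1, ncol(df), by = 2)
idx                  # 1 3
names(df)[idx]       # "Time1" "Time2"
names(df)[idx + 1]   # "Oxytocin1" "Oxytocin2"
```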

ir1 holds the intervals you want the means for. Each has width 401 (for example 0 to 400, both ends included), and together they cover 0 to max(Time) for each Time column (here columns 1 and 3).

ir2 holds the values of the corresponding Time column, each as an interval of width 1.

Then I get the overlaps of ir1 with ir2, which tell me which time points in ir2 fall into which interval of ir1. From those I calculate the mean per interval and output a data.frame.
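To make the overlap step concrete, here is a minimal sketch on made-up values. The vectors tm and value are hypothetical, and the sketch assumes the IRanges package from Bioconductor is installed.

```r
library(IRanges)

# Hypothetical time points and measurements for one subject.
tm    <- c(10L, 350L, 500L)
value <- c(0.5, 0.7, 0.9)

ir1 <- IRanges(start = seq(0, max(tm), by = 401), width = 401)  # 0-400, 401-801
ir2 <- IRanges(start = tm, width = 1)                           # one range per time point

hits <- findOverlaps(ir1, ir2)
queryHits(hits)     # interval index for each hit: 1 1 2
subjectHits(hits)   # time-point index for each hit: 1 2 3

# Mean of the measurements that fall into each interval.
means <- tapply(value[subjectHits(hits)], queryHits(hits), mean)
```

Indexing the values by subjectHits(hits) keeps each measurement paired with its own interval even if the time points are not sorted.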
```
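library(IRanges)

idx <- seq(1, ncol(df), by=2)  # Time columns: 1, 3, 5, ...
o <- lapply(idx, function(i) {
    # 401-wide intervals: 0-400, 401-801, ...
    ir1 <- IRanges(start=seq(0, max(df[[i]]), by=401), width=401)
    # each time point as a width-1 interval
    ir2 <- IRanges(start=df[[i]], width=1)
    hits <- findOverlaps(ir1, ir2)
    # mean of the data column (i+1) per interval; subjectHits pairs each value with its hit
    d <- data.frame(mean=tapply(df[[i+1]][subjectHits(hits)], queryHits(hits), mean))
    cbind(as.data.frame(ir1), d)
})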

> o
# [[1]]
#   start  end width      mean
# 1     0  400   401 0.6750000
# 2   401  801   401 0.8050000
# 3   802 1202   401 0.8750000
# 4  1203 1603   401 0.2285333

# [[2]]
#   start  end width    mean
# 1     0  400   401 0.73508
# 2   401  801   401 0.13408
# 3   802 1202   401 0.26408
# 4  1203 1603   401 1.06408
# 5  1604 2004   401 3.06408
```
For each Time column, you get one list element: a data.frame with the intervals and the mean for each interval.
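If installing IRanges is not an option, the cut-based idea from the question can be applied to each Time/data pair in base R. The following is only a sketch under assumptions: bin_means is a hypothetical helper, the data frame is made up, shorter series are assumed to be padded with NA, and the bins are half-open 400-wide intervals ([0, 400), [400, 800), ...) rather than the 401-wide intervals used above.

```r
# Hypothetical helper: mean of `value` within consecutive bins of `width`.
bin_means <- function(time, value, width = 400) {
    keep  <- !is.na(time) & !is.na(value)   # drop NA padding, if any
    time  <- time[keep]
    value <- value[keep]
    breaks <- seq(0, max(time) + width, by = width)
    # right = FALSE gives bins [0, 400), [400, 800), ... so time 0 is included
    bins <- cut(time, breaks = breaks, right = FALSE)
    tapply(value, bins, mean)               # empty bins come back as NA
}

# Made-up data in the Time, data, Time, data, ... layout.
df <- data.frame(
    Time1 = c(0, 100, 450, 900),  Oxytocin1 = c(0.5, 0.7, 0.9, 0.4),
    Time2 = c(0, 200, 600, 1300), Oxytocin2 = c(0.8, 0.6, 0.2, 1.1)
)

idx <- seq(1, ncol(df), by = 2)
res <- lapply(idx, function(i) bin_means(df[[i]], df[[i + 1]]))
```

Here res has one element per Time column, each a named vector of bin means. Unlike the IRanges version, intervals with no observations are kept and reported as NA.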