R running average for non-time data

余生分开走 2021-01-06 15:42

This is the plot I have now.

It's generated from this code:



        
1 answer
  • 2021-01-06 16:07

    One way to get a running mean is geom_smooth() with the loess local regression method. With degree=0, loess fits a local constant, which amounts to a weighted moving average. To demonstrate, I created a fake genomic dataset using R's random-number functions. The span parameter is the fraction of the data points used for each local fit: values closer to 1.0 make the running mean smoother, and values closer to 1/(number of data points) make it rougher.

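    # Load ggplot2 for plotting.
    library(ggplot2)

    # Create example data.
    set.seed(27182)
    
    # chr01: noise plus a broad bump across the middle 8000 points.
    y1 = rnorm(10000) + 
         c(rep(0, 1000), dnorm(seq(-2, 5, length.out=8000)) * 3, rep(0, 1000))
    # chr02: four segments with different means and spreads (6000 points).
    y2 = c(rnorm(2000), rnorm(1000, mean=1.5), rnorm(1000, mean=-1, sd=2), 
           rnorm(2000, sd=2))
    # chr03: pure noise (4000 points).
    y3 = rnorm(4000)
    # Sorted random positions, one block per chromosome.
    pos = c(sort(runif(10000, min=0, max=1e8)),
            sort(runif(6000,  min=0, max=6e7)),
            sort(runif(4000,  min=0, max=4e7)))
    chr = rep(c("chr01", "chr02", "chr03"), c(10000, 6000, 4000))
    
    data1 = data.frame(CHROM=chr, POS=pos, DIFF=c(y1, y2, y3))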
    
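    # Plot. degree=0 has to go through method.args to reach loess();
    # passed directly to geom_smooth() it is ignored with a warning.
    # linewidth needs ggplot2 >= 3.4.0; use size on older versions.
    p = ggplot(data1, aes(x=POS, y=DIFF)) +
        geom_point(alpha=0.1, size=1.5) +
        geom_smooth(colour="darkgoldenrod1", linewidth=1.5, method="loess",
            method.args=list(degree=0), span=0.1, se=FALSE) +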
        scale_x_continuous(breaks=seq(1e7, 3e8, 1e7), 
            labels=paste(seq(10, 300, 10)), expand=c(0, 0)) +
        xlab("Position, Megabases") +
        theme(axis.text.x=element_text(size=8)) +
        facet_grid(. ~ CHROM, scales="free", space="free")
    
    ggsave(filename="plot_1.png", plot=p, width=10, height=5, dpi=150)
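
    To see what the span setting does outside of ggplot2, here is a minimal sketch that calls stats::loess() directly with degree=0 on a small made-up series. The data, the two span values, and the roughness() helper are assumptions chosen for illustration; they are not part of the plot above.

    ```r
    # Small synthetic series: noise with a raised block in the middle.
    set.seed(1)
    n <- 500
    x <- sort(runif(n, min = 0, max = 1e7))
    y <- rnorm(n) + ifelse(x > 4e6 & x < 6e6, 2, 0)

    # degree = 0 fits a local constant, i.e. a weighted running mean.
    # span is the fraction of points that contribute to each local fit.
    fit_rough  <- loess(y ~ x, degree = 0, span = 0.05)
    fit_smooth <- loess(y ~ x, degree = 0, span = 0.5)

    # Hypothetical helper: spread of point-to-point changes in the fitted curve.
    roughness <- function(fit) sd(diff(fitted(fit)))

    roughness(fit_rough)
    roughness(fit_smooth)
    ```

    The smaller span should give the larger roughness value, since each local mean is averaged over fewer neighbouring points.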
    

