weight equivalent for geom_density2d

[愿得一人] 2021-01-31 11:58

Consider the following data:

   contesto       x       y perc
1       M01  81.370 255.659   22
2       M02  85.814 242.688   16
3       M03  73.204 240.526   33
         


        
2 Answers
  •  有刺的猬
    2021-01-31 12:10

    I think you're doing it right, provided your weights are the number of observations at each coordinate (or proportional to it). The function (geom_density2d) seems to expect one row per observation. Calling it on your original dataset and adjusting afterwards won't work, because there's no way to dynamically update the ggplot object: by then it has already modelled the density and contains only derived plot data.
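
    As a minimal sketch of that approach, the block below expands the three rows shown in the question so that each row appears perc times, then passes the expanded data to geom_density2d. The names toy and expanded are made up for illustration, the three rows are only the excerpt visible in the question (not the full data set), and the plotting step is skipped if ggplot2 is not installed.

```r
# Toy data: only the three rows visible in the question
toy <- data.frame(
  contesto = c("M01", "M02", "M03"),
  x        = c(81.370, 85.814, 73.204),
  y        = c(255.659, 242.688, 240.526),
  perc     = c(22, 16, 33)
)

# Repeat each row index perc times, so perc acts as an observation count
expanded <- toy[rep(seq_len(nrow(toy)), toy$perc), ]

# The expanded data has one row per (weighted) observation
nrow(expanded)  # 22 + 16 + 33 = 71

# Build the density plot from the expanded rows (assumes ggplot2 is available)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(expanded, ggplot2::aes(x = x, y = y)) +
    ggplot2::geom_density2d()
  # print(p) would draw the contours
}
```

    This only makes sense when perc can be read as a count (or rounded to one); fractional weights would need a different approach.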

    You might want to use data.table instead of with() if your real data set is large: in the benchmark below it is roughly 70 times faster on 1 million coordinates, each repeated 1 to 20 times (more than 10 million observations in total). It makes no practical difference for 660 observations, though, and with a large data set the plot will probably be your performance bottleneck anyway.

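    library(data.table)

    # 1 million coordinates, each with a repeat count (weight) between 1 and 20
    bigtable <- data.frame(x = runif(1e6), y = runif(1e6), perc = sample(1:20, 1e6, TRUE))

    # Base R: repeat each row index perc times, then subset the data frame
    system.time(rep.with.by <- with(bigtable, bigtable[rep(1:nrow(bigtable), perc), ]))
    #user  system elapsed 
    #11.67    0.18   11.92

    # data.table: repeat the x and y columns directly
    system.time(rep.with.dt <- data.table(bigtable)[, list(x = rep(x, perc), y = rep(y, perc))])
    #user  system elapsed 
    #0.12    0.05    0.18

    # Check they're the same
    sum(rep.with.dt$x) == sum(rep.with.by$x)
    #[1] TRUE

    # Output rows
    nrow(rep.with.dt)
    #[1] 10497966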
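
    If adding a dependency is not an option, the same column-wise repetition can be written in base R. The sketch below is an assumption-laden illustration rather than a benchmark: it uses a smaller, hypothetical table called small with an arbitrary seed, and only checks that repeating the columns gives the same rows as indexing the data frame. Whether it narrows the timing gap on your data is something to measure yourself.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Hypothetical small table with the same layout as bigtable
small <- data.frame(x = runif(1000), y = runif(1000),
                    perc = sample(1:20, 1000, TRUE))

# Row-index approach: subset the data frame with repeated row numbers
by_rows <- small[rep(seq_len(nrow(small)), small$perc), c("x", "y")]

# Column-wise approach: repeat each column vector directly
by_cols <- data.frame(x = rep(small$x, small$perc),
                      y = rep(small$y, small$perc))

# Both give one row per observation, in the same order
stopifnot(nrow(by_cols) == sum(small$perc))
stopifnot(all(by_rows$x == by_cols$x), all(by_rows$y == by_cols$y))
```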
    
