Calculate area of overlapping density plot by ggplot using R

Posted by 醉酒当歌 on 2019-12-01 19:50:23
Karop

I was looking for a way to do this for empirical data and ran into the problem of multiple intersections mentioned by user5878028. After some digging I found a very simple solution, even for a total R noob like me:

Install and load the packages "overlapping" (which performs the calculation) and "lattice" (which displays the result):

library(overlapping)
library(lattice)

Then define a variable "x" as a list that contains the two samples whose density distributions you want to compare. For this example, "data1" and "data2" are both columns of a data frame called "yourfile" (for instance, one read in from a text file):

x <- list(X1=yourfile$data1, X2=yourfile$data2)

Then just call overlap() with plot=TRUE, which draws the two densities and also displays the estimated % overlap. The numeric estimate is stored in the returned list as out$OV:

out <- overlap(x, plot=TRUE)

I hope this helps someone like it helped me!

[Figure: example overlap plot]

I will make a few base R plots, but the plots are not actually part of the solution. They are just there to confirm that I am getting the right answer.

You can get each of the density functions and solve for where they intersect.
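The code that follows expects a data frame `df` with `weight` and `sex` columns, which comes from the question and is not shown here. To try the steps without that data, a simulated stand-in could be built along these lines. The sample size, means and standard deviations are made-up values, so the numbers you get will differ from the result reported further down.

```r
## Hypothetical stand-in for the question's data frame `df`.
## The group size, means and standard deviations are assumptions
## chosen only so that the two densities overlap inside 40..80.
set.seed(1)
n <- 200
df <- data.frame(
  sex    = rep(c("F", "M"), each = n),
  weight = c(rnorm(n, mean = 55, sd = 5),   # simulated female weights
             rnorm(n, mean = 65, sd = 5))   # simulated male weights
)
str(df)
```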

##  Create the two density functions and display
##  (df is the data frame from the question, with columns weight and sex)
FDensity = approxfun(density(df$weight[df$sex=="F"], from=40, to=80))
MDensity = approxfun(density(df$weight[df$sex=="M"], from=40, to=80))
plot(FDensity, xlim=c(40,80), ylab="Density")
curve(MDensity, add=TRUE)

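Now solve for the intersection. uniroot() needs FminusM to have opposite signs at 40 and 80, which holds here because a different density is the larger one at each end of the range.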

## Solve for the intersection and plot to confirm
FminusM = function(x) { FDensity(x) - MDensity(x) }
Intersect = uniroot(FminusM, c(40, 80))$root
points(Intersect, FDensity(Intersect), pch=20, col="red")
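uniroot() returns only one root, so if the curves cross more than once (the multiple-intersections case mentioned above) a single call will miss the other crossings. One possible way to find them all, sketched here on simulated data rather than the question's `df`, is to evaluate the difference on a fine grid, look for sign changes, and refine each bracket with uniroot(). The grid size of 2001 points is an arbitrary choice.

```r
## Simulated inputs (assumption): two normal samples standing in
## for the female and male weights from the question.
set.seed(1)
f <- rnorm(200, mean = 55, sd = 5)
m <- rnorm(200, mean = 65, sd = 5)
FDensity <- approxfun(density(f, from = 40, to = 80))
MDensity <- approxfun(density(m, from = 40, to = 80))
FminusM  <- function(x) FDensity(x) - MDensity(x)

## Evaluate the difference on a grid and find where its sign flips.
grid  <- seq(40, 80, length.out = 2001)
d     <- FminusM(grid)
flips <- which(diff(sign(d)) != 0)

## Refine each bracketed crossing with uniroot().
crossings <- vapply(
  flips,
  function(i) uniroot(FminusM, c(grid[i], grid[i + 1]))$root,
  numeric(1)
)
crossings
```

If `crossings` has length one, the single uniroot() call above is enough. With more than one crossing, the lower curve changes several times and the two-piece integral in the next step no longer applies directly.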

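Now we can just integrate to get the area of the overlap. To the left of the intersection the male density is the lower curve, and to the right of it the female density is, so we integrate each one over its own side and add the two pieces.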

integrate(MDensity, 40, Intersect)$value +
    integrate(FDensity, Intersect, 80)$value
[1] 0.2952838
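The overlap is the area under whichever curve is lower at each point, so it can also be approximated without locating any intersections at all. The sketch below evaluates both kernel density estimates on the same grid, takes the pointwise minimum, and integrates with the trapezoidal rule. It uses simulated data (an assumption, not the question's `df`), so its result is not comparable to the value above.

```r
## Simulated inputs (assumption): two normal samples standing in
## for the female and male weights from the question.
set.seed(1)
f <- rnorm(200, mean = 55, sd = 5)
m <- rnorm(200, mean = 65, sd = 5)

## Evaluate both kernel density estimates on the same grid of x values.
dF <- density(f, from = 40, to = 80, n = 1024)
dM <- density(m, from = 40, to = 80, n = 1024)

## The overlap is the area under the lower of the two curves at each x.
lower <- pmin(dF$y, dM$y)

## Trapezoidal rule over the shared grid.
x <- dF$x
overlap_area <- sum(diff(x) * (head(lower, -1) + tail(lower, -1)) / 2)
overlap_area
```

Because it never relies on a single crossing point, this version gives the same kind of estimate however many times the curves intersect.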