R generate 2D histogram from raw data


I have some raw data in 2D, x, y as given below. I want to generate a 2D histogram from the data. Typically, dividing the x,y values into bins of size 0.5, and count the number

6 Answers

梦如初夏

2021-02-01 07:58

For completeness, you can also use the hist2d{gplots} function. It seems to be the most straightforward for a 2D plot:

library(gplots)

# data is in variable df

# define bin sizes
bin_size <- 0.5
# nbins should be whole numbers, so round the bin counts up
xbins <- ceiling((max(df$x) - min(df$x))/bin_size)
ybins <- ceiling((max(df$y) - min(df$y))/bin_size)

# create plot
hist2d(df, same.scale=TRUE, nbins=c(xbins, ybins))

[Figure: two-dimensional histogram of x,y data]

# if you want to retrieve the data for other purposes
df.hist2d <- hist2d(df, same.scale=TRUE, nbins=c(xbins, ybins), show=FALSE)
df.hist2d$counts
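If you only need the counts for bins of exactly 0.5, base R can produce them without any plotting package. The following is a minimal sketch rather than part of the answer above: the toy `df` and the names `x_breaks`, `y_breaks` and `counts` are assumptions made for illustration.

```r
# Toy data standing in for the asker's df (an assumption; the real data is not shown)
set.seed(1)
df <- data.frame(x = runif(200, 0, 3), y = runif(200, 0, 2))

bin_size <- 0.5

# Break points on a 0.5 grid that covers the full data range
x_breaks <- seq(floor(min(df$x) / bin_size) * bin_size,
                ceiling(max(df$x) / bin_size) * bin_size, by = bin_size)
y_breaks <- seq(floor(min(df$y) / bin_size) * bin_size,
                ceiling(max(df$y) / bin_size) * bin_size, by = bin_size)

# cut() assigns each value to a bin; table() cross-tabulates the two factors
# and keeps empty bins, because unused factor levels are retained
counts <- table(cut(df$x, x_breaks, include.lowest = TRUE),
                cut(df$y, y_breaks, include.lowest = TRUE))
counts
```

Rows correspond to x bins and columns to y bins, so `counts` can be converted with `as.data.frame(counts)` if a long format is more convenient.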


灰色年华

2021-02-01 08:00
Bivariate density estimates can be computed with MASS::kde2d or KernSmooth::bkde2D (both packages ship with the standard R distribution). The latter uses an algorithm based on the fast Fourier transform over a grid of points, and is very fast. The result can be plotted with contour or persp, or with similar functions in other graphing packages.

Using your data:
```
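require(KernSmooth)
# bkde2D expects a two-column numeric matrix, so convert the data frame first
z <- bkde2D(as.matrix(df), bandwidth = 0.5)
# z$x1 and z$x2 hold the grid coordinates, z$fhat the density estimates
persp(z$x1, z$x2, z$fhat)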
```
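For comparison, here is a minimal sketch of the MASS::kde2d route mentioned above. The simulated `df` and the 50 x 50 grid size are assumptions chosen for illustration, not values taken from the question.

```r
library(MASS)

# Simulated data (an assumption; stands in for the asker's df)
set.seed(42)
df <- data.frame(x = rnorm(500, mean = 5), y = rnorm(500, mean = 5))

# Kernel density estimate on a 50 x 50 grid;
# when h is omitted, the bandwidths default to bandwidth.nrd() of each variable
dens <- kde2d(df$x, df$y, n = 50)

# dens$x and dens$y are the grid coordinates, dens$z is the density matrix
str(dens)
```

The result is a list with components `x`, `y` and `z`, so it can be passed directly to `image(dens)`, `contour(dens)` or `persp(dens)`.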
死守一世寂寞

2021-02-01 08:03
I came to this page from http://www.r-bloggers.com/5-ways-to-do-2d-histograms-in-r/, which cites one of the answers above. It provides code samples for five methods:
- hist2d from the library gplots
- hexbin, hexbinplot from the library hexbin
- stat_bin2d from the library ggplot2
- kde2d from the library MASS
- the "hard way" solution listed above.
攒了一身酷

2021-02-01 08:10

The ggplot approach is elegant, fast, and pretty, as usual. But if you want to use base graphics (image, contour, persp) and display the actual frequencies (instead of a smoothed 2D kernel estimate), you first have to do the binning yourself and build a matrix of frequencies. Here's some code (not necessarily elegant, but pretty robust) that does 2D binning and generates plots somewhat similar to the ones above:

    require(mvtnorm)
    xy <- rmvnorm(1000,c(5,10),sigma=rbind(c(3,-2),c(-2,3)))

    nbins <- 20
    x.bin <- seq(floor(min(xy[,1])), ceiling(max(xy[,1])), length=nbins)
    y.bin <- seq(floor(min(xy[,2])), ceiling(max(xy[,2])), length=nbins)

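    # Tabulate the bin indices; the first two columns come back as factors
    freq <- as.data.frame(table(findInterval(xy[,1], x.bin), findInterval(xy[,2], y.bin)))
    # Go through as.character() to recover the bin indices, not the factor level codes
    freq[,1] <- as.numeric(as.character(freq[,1]))
    freq[,2] <- as.numeric(as.character(freq[,2]))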

    freq2D <- diag(nbins)*0
    freq2D[cbind(freq[,1], freq[,2])] <- freq[,3]

    par(mfrow=c(1,2))
    image(x.bin, y.bin, freq2D, col=topo.colors(max(freq2D)))
    contour(x.bin, y.bin, freq2D, add=TRUE, col=rgb(1,1,1,.7))

    palette(rainbow(max(freq2D)))
    # Colour each facet by the mean of its four corner values (drop first or last row/column)
    cols <- (freq2D[-1,-1] + freq2D[-1,-nbins] + freq2D[-nbins,-nbins] + freq2D[-nbins,-1])/4
    persp(freq2D, col=cols)


For a really fun time, try making an interactive, zoomable, 3D surface:

require(rgl)
surface3d(x.bin,y.bin,freq2D/10, col="red")


长发绾君心

2021-02-01 08:12

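    freq <- as.data.frame(table(findInterval(xy[,1], x.bin), findInterval(xy[,2], y.bin)))
    freq[,1] <- as.numeric(freq[,1])
    freq[,2] <- as.numeric(freq[,2])

This is probably wrong, since it destroys the original indices. The first two columns of freq are factors, so as.numeric() returns their internal level codes rather than the bin indices stored in the labels. The two differ whenever some bin index never occurs in the data. Converting with as.numeric(as.character(...)) keeps the indices.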
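A small one-dimensional illustration of the problem, using made-up bin indices (`idx` is hypothetical, not taken from the answer above):

```r
# Bin indices with gaps: no observation falls in bin 2 or bin 4
idx <- c(1, 3, 3, 5)
freq <- as.data.frame(table(idx))

as.numeric(freq$idx)                # level codes: 1 2 3
as.numeric(as.character(freq$idx))  # original bin indices: 1 3 5

# Alternative: fix the factor levels up front so empty bins show up with count 0
nbins <- 5
counts <- table(factor(idx, levels = seq_len(nbins)))
as.vector(counts)                   # 1 0 2 0 1
```

The same idea carries over to two dimensions: tabulating two factors that both have levels `seq_len(nbins)` gives the full frequency matrix directly, with no index conversion needed.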


谎友^

2021-02-01 08:14
If you want it with a 2D contour, you can also use the ggplot2 package. Some example code is shown in the question "gradient breaks in a ggplot stat_bin2d plot". Adjusted slightly:
```
x <- rnorm(10000)+5
y <- rnorm(10000)+5
df <- data.frame(x,y)
require(ggplot2)
p <- ggplot(df, aes(x, y)) 
p <- p + stat_bin2d(bins = 20)
p
```
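Since the question asks for bins of size 0.5 rather than a fixed number of bins, a variant of the code above can pass `binwidth` instead of `bins`. This is a sketch under the assumption that a reasonably recent ggplot2 is installed; `stat_bin_2d` is the underscore spelling of the same stat, and `bins` is just an illustrative variable name.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(x = rnorm(10000) + 5, y = rnorm(10000) + 5)

# binwidth sets the bin size along x and y directly (here 0.5 by 0.5)
p <- ggplot(df, aes(x, y)) + stat_bin_2d(binwidth = c(0.5, 0.5))

# layer_data() returns the computed bins, including a `count` column
bins <- layer_data(p, 1)
head(bins)
```

Printing `p` draws the plot, while `bins` gives access to the per-bin counts for other purposes.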