is there a simple way to define breaks instead of nbins for a 2d histogram (hist2d) in R?
I want to define the range for the x- and yaxis for a 2D histogram and the number of bins for each dimension.
My example:
# example data
x <- sample(-1:100, 2000, replace=T)
y <- sample(0:89, 2000, replace=T)
# create 2d histogram
h2 <- hist2d(x,y,nbins=c(23,19),xlim=c(-1,110), ylim=c(0,95),xlab='x',ylab='y',main='hist2d')
This results in this 2D histogram output 1
----------------------------
2-D Histogram Object
----------------------------
Call: hist2d(x = x, y = y, nbins = c(23, 19), xlab = "x", ylab = "y",
xlim = c(-1, 110), ylim = c(0, 95), main = "hist2d")
Number of data points: 2000
Number of grid bins: 23 x 19
X range: ( -1 , 100 )
Y range: ( 0 , 89 )
I need
X range: ( -1 , 110 )
Y range: ( 0 , 95 )
instead.
My attempt to define the xlim and ylim only extends the plot but does not define the axis range for the histogram. I know that there would be no data in the additional bins.
Is there a way to define
xbreaks = seq(-1,110,5)
ybreaks = seq(0,95,5)
instead of using nbins which divides the range from minimum to maximum into the given number of bins?
Thank you for your help
Revise the "hist2d" as follows
hist2d_range<-function (x, y = NULL, nbins = 200, same.scale = TRUE, na.rm = TRUE,
show = TRUE, col = c("black", heat.colors(12)), FUN = base::length,
xlab, ylab,range=NULL, ...)
{
if (is.null(y)) {
if (ncol(x) != 2)
stop("If y is ommitted, x must be a 2 column matirx")
y <- x[, 2]
x <- x[, 1]
}
if (length(nbins) == 1)
nbins <- rep(nbins, 2)
nas <- is.na(x) | is.na(y)
if (na.rm) {
x <- x[!nas]
y <- y[!nas]
}
else stop("missinig values not permitted if na.rm=FALSE")
if (same.scale) {
if(is.null(range))
{
x.cuts <- seq(from = min(x, y), to = max(x, y), length = nbins[1] +
1)
y.cuts <- seq(from = min(x, y), to = max(x, y), length = nbins[2] +
1)
}else{
x.cuts <- seq(from = range[1], to = range[2], length = nbins[1] + 1)
y.cuts <- seq(from = range[1], to = range[2], length = nbins[1] + 1)
}
}
else {
x.cuts <- seq(from = min(x), to = max(x), length = nbins[1] +
1)
y.cuts <- seq(from = min(y), to = max(y), length = nbins[2] +
1)
}
index.x <- cut(x, x.cuts, include.lowest = TRUE)
index.y <- cut(y, y.cuts, include.lowest = TRUE)
m <- tapply(x, list(index.x, index.y), FUN)
if (identical(FUN, base::length))
m[is.na(m)] <- 0
if (missing(xlab))
xlab <- deparse(substitute(xlab))
if (missing(ylab))
ylab <- deparse(substitute(ylab))
if (show)
image(x.cuts, y.cuts, m, col = col, xlab = xlab, ylab = ylab,
...)
midpoints <- function(x) (x[-1] + x[-length(x)])/2
retval <- list()
retval$counts <- m
retval$x.breaks = x.cuts
retval$y.breaks = y.cuts
retval$x = midpoints(x.cuts)
retval$y = midpoints(y.cuts)
retval$nobs = length(x)
retval$call <- match.call()
class(retval) <- "hist2d"
retval
}
This function has an additional argument "range". The revised point is as follows.
if(is.null(range))
{
x.cuts <- seq(from = min(x, y), to = max(x, y), length = nbins[1] +
1)
y.cuts <- seq(from = min(x, y), to = max(x, y), length = nbins[2] +
1)
}else{
x.cuts <- seq(from = range[1], to = range[2], length = nbins[1] + 1)
y.cuts <- seq(from = range[1], to = range[2], length = nbins[1] + 1)
}
I changed the code a little bit and this version should work the with explicitly defining the breaks for both axes. First you have to load the function. Then you can give the x.breaks
and y.breaks
options with x.breaks=seq(0,10,0.1)
.
If same.scale
is true, you only need x.breaks
The return value addionaly contains the number of bins and the relative counts.
Also, you can include a legend if wanted, by setting legend=TRUE
. For that you need to have the package Fields
hist2d_breaks = function (x, y = NULL, nbins = 200,same.scale = FALSE, na.rm = TRUE,
show = TRUE, col = c("black", heat.colors(12)), FUN = base::length,
xlab, ylab,x.breaks,y.breaks, ...)
{
if (is.null(y)) {
if (ncol(x) != 2)
stop("If y is ommitted, x must be a 2 column matirx")
y <- x[, 2]
x <- x[, 1]
}
if (length(nbins) == 1)
nbins <- rep(nbins, 2)
nas <- is.na(x) | is.na(y)
if (na.rm) {
x <- x[!nas]
y <- y[!nas]
}
else stop("missinig values not permitted if na.rm=FALSE")
if(same.scale){
x.cuts = x.breaks;
y.cuts = x.breaks;
}else{
x.cuts <- x.breaks
y.cuts <- y.breaks
}
index.x <- cut(x, x.cuts, include.lowest = TRUE)
index.y <- cut(y, y.cuts, include.lowest = TRUE)
m <- tapply(x, list(index.x, index.y), FUN)
if (identical(FUN, base::length))
m[is.na(m)] <- 0
if (missing(xlab))
xlab <- deparse(substitute(xlab))
if (missing(ylab))
ylab <- deparse(substitute(ylab))
if (show){
if(legend){
image.plot(x.cuts, y.cuts, m, col = col, xlab = xlab, ylab = ylab,
...)
}else{
image(x.cuts, y.cuts, m, col = col, xlab = xlab, ylab = ylab,
...)
}
}
midpoints <- function(x) (x[-1] + x[-length(x)])/2
retval <- list()
retval$counts <- m
retval$counts_rel <- m/max(m)
retval$x.breaks = x.cuts
retval$y.breaks = y.cuts
retval$x = midpoints(x.cuts)
retval$y = midpoints(y.cuts)
retval$nobs = length(x)
retval$bins = c(length(x.cuts),length(y.cuts))
retval$call <- match.call()
class(retval) <- "hist2d"
retval
}
The call of (my data) then brings the following:
hist2d_breaks(df,x.breaks=seq(0,10,1),y.breaks=seq(-10,10,1),legend=TRUE)
brings up the following plot
2D Histogram with breaks
来源:https://stackoverflow.com/questions/33634593/define-breaks-for-hist2d-in-r