问题
I am trying to make a contour ternary plot, with two groups, where the opacity (alpha) of the contours is a function of the density of the points (e.g. more points tightly clustered = less opaque).
I am stuck on one point.
My two groups (here A
and B
) have unequal group sizes (A
= 150 obs, B
= 50 obs),
this means points in one group are frequently alot more clustered,
when this is the case the opacity of group B
should be much higher relative to
group A
, since points in group B are much less dense. But it looks like opacity is scaled within groups instead of across groups.
My Question: is it possible to scale opacity to the density of points, where density is relative across both groups?
An example:
library(ggtern)
set.seed(1234)
# example data
df <- data.frame(X = c(runif(150, 0.7, 1),runif(50, 0, 0.3)),
Y = c(runif(150, 0, 0.3),runif(50, 0, 0.3)),
Z = c(runif(150, 0, 0.5),runif(50, 0.5, 1)),
D = c(rep("A", 150), rep("B", 50)))
# ternary plot
ggtern(df, aes(x = X,y = Y, z = Z, color = D)) +
stat_density_tern(aes(alpha = ..level.., fill = D),
geom = 'polygon',
bins = 10,
color = "grey") +
geom_point(alpha = 0.5) +
scale_colour_manual(values = c("tomato3", "turquoise4"))
# points are only displayed to show densities, I don't plan on showing
# points in the final plot
Given group B
points are much less dense I would expect the contours
to be more opaque than group A
.
Another option would be to use scale_colour_gradient()
, but I can't
see how to get two separate gradients (one for each of A
and B
) on a
single plot.
回答1:
I wished I would have an easier answer for you, but alas, I have not. However, I've found a quite hacky solution to your problem, by making a new stat and predefining breaks. Disclaimer: I don't use ggtern myself, so I don't know much about the specifics. The problem in general seems to be that the density is computed per group and the integral of densities are generally set to 1. We can solve this by adding a new stat that scales this for us.
The solution then seems deceivingly simple: multiply the calculated densities by the number of datapoints in the group, to get a density scaled to reflects group sizes. The only drawback would be that we have to change bins = 10
, which is calculated per group, with breaks = seq(start, end, by = somenumber)
to have absolute instead of relative breaks for the contours.
However, ggtern is quite the complicated package with it's own peculiarities that make it difficult to write a new stat function to work. There exists a list with 'approved stats' and ggtern will remove any layers that don't have their approval.
ggtern:::.approvedstat
identity confidence density_tern smooth_tern
"StatIdentity" "StatConfidenceTern" "StatDensityTern" "StatSmoothTern"
sum unique interpolate_tern mean_ellipse
"StatSum" "StatUnique" "StatInterpolateTern" "StatMeanEllipse"
hex_tern tri_tern
"StatHexTern" "StatTriTern"
So the first order of business would be to add an entry for our own stat (which we'll call StatDensityTern2
) to the approved stat list, but since this .approvedstat
is in the package namespace, we'll have to be a bit hacky to do this:
approveupdate <- c(ggtern:::.approvedstat, "density_tern2" = "StatDensityTern2")
assignInNamespace(".approvedstat", approveupdate, pos = "package:ggtern")
Now we can write our own StatDensityTern2
, that inherits functionality from StatDensityTern
, with a small update in how groups are computed. While writing this new stat, we need to take care that we load necessary packages and refer to internal functions correctly. We'll largely copy-paste from the existing StatDensityTern$compute_group
, but make a small adjustment to change z = as.vector(dens$z)
to z = as.vector(dens$z) * nrow(data)
before passing on the data to the contour function.
library(compositions)
library(rlang)
StatDensityTern2 <-
ggproto(
"StatDensityTern2",
StatDensityTern,
compute_group = function(
self, data, scales, na.rm = FALSE, n = 100, h = NULL,
bdl = 0, bdl.val = NA, contour = TRUE, base = "ilr", expand = 0.5,
weight = NULL, bins = NULL, binwidth = NULL, breaks = NULL
) {
if (!c(base) %in% c("identity", "ilr"))
stop("base must be either identity or ilr", call. = FALSE)
raes = self$required_aes
data[raes] = suppressWarnings(compositions::acomp(data[raes]))
data[raes][data[raes] <= bdl] = bdl.val[1]
data = remove_missing(data, vars = self$required_aes, na.rm = na.rm,
name = "StatDensityTern", finite = TRUE)
if (ggplot2:::empty(data))
return(data.frame())
coord = coord_tern()
f = get(base, mode = "function")
fInv = get(sprintf("%sInv", base), mode = "function")
if (base == "identity")
data = tlr2xy(data, coord, inverse = FALSE, scale = TRUE)
h = h %||% ggtern:::estimateBandwidth(base, data[which(colnames(data) %in%
raes)])
if (length(h) != 2)
h = rep(h[1], 2)
if (base != "identity" && diff(h) != 0)
warning("bandwidth 'h' has different x and y bandwiths for 'ilr', this may (probably will) introduce permutational artifacts depending on the ordering",
call. = FALSE)
data[raes[1:2]] = suppressWarnings(f(as.matrix(data[which(colnames(data) %in%
raes)])))
expand = if (length(expand) != 2)
rep(expand[1], 2)
else expand
rngxy = range(c(data$x, data$y))
rngx = scales:::expand_range(switch(base, identity = coord$limits$x,
rngxy), expand[1])
rngy = scales:::expand_range(switch(base, identity = coord$limits$y,
rngxy), expand[2])
dens = ggtern:::kde2d.weighted(data$x, data$y, h = h, n = n, lims = c(rngx,
rngy), w = data$weight)
# Here be relevant changes ------------------------------------------------
df = data.frame(expand.grid(x = dens$x, y = dens$y),
z = as.vector(dens$z) * nrow(data),
group = data$group[1])
# Here end relevant changes -----------------------------------------------
if (contour) {
df = StatContour$compute_panel(df, scales, bins = bins,
binwidth = binwidth, breaks = breaks)
}
else {
names(df) <- c("x", "y", "density", "group")
df$level <- 1
df$piece <- 1
}
if (base == "identity")
df = tlr2xy(df, coord, inverse = TRUE, scale = TRUE)
df[raes] = suppressWarnings(fInv(as.matrix(df[which(colnames(df) %in%
raes)])))
df
}
)
Now that we've written a new stat and have approved of the stat ourselves, we can use it in the following manner:
set.seed(1234)
# example data
df <- data.frame(X = c(runif(150, 0.7, 1),runif(50, 0, 0.3)),
Y = c(runif(150, 0, 0.3),runif(50, 0, 0.3)),
Z = c(runif(150, 0, 0.5),runif(50, 0.5, 1)),
D = c(rep("A", 150), rep("B", 50)))
ggtern(df, aes(x = X, y = Y, z = Z, color = D)) +
geom_polygon(aes(alpha = ..level.., fill = D),
stat = "DensityTern2",
breaks = seq(10, 150, by = 10),
color = "grey") +
geom_point(alpha = 0.5) +
scale_colour_manual(values = c("tomato3", "turquoise4"))
Which gave me the following plot:
Hope you found this useful!
来源:https://stackoverflow.com/questions/57175114/ternary-plot-scaling-opacity-across-groups