Question
I am combining two distinct plots into a grid layout with grid, as suggested by @lgautier, in rpy2 using Python. The top plot is a density plot and the bottom one a bar graph:
import pandas
from rpy2.robjects import r, Formula
from rpy2.robjects.lib import ggplot2, grid
from rpy2.robjects.lib.ggplot2 import aes_string
iris = r('iris')
# define layout: two rows, one column
lt = grid.layout(2, 1)
vp = grid.viewport(layout = lt)
vp.push()
# first plot (top row): faceted density
vp_p = grid.viewport(**{'layout.pos.row': 1, 'layout.pos.col':1})
p1 = ggplot2.ggplot(iris) + \
    ggplot2.geom_density(aes_string(x="Sepal.Width",
                                    colour="Species")) + \
    ggplot2.facet_wrap(Formula("~ Species"))
p1.plot(vp = vp_p)
# second plot (bottom row): dodged bar graph
mean_df = pandas.DataFrame({"Species": ["setosa", "virginica", "versicolor"],
                            "X": [10, 2, 30],
                            "Y": [5, 3, 4]})
mean_df = pandas.melt(mean_df, id_vars=["Species"])
# get_r_dataframe is not defined in this post; presumably it converts
# a pandas DataFrame into an R data.frame
r_mean_df = get_r_dataframe(mean_df)
p2 = ggplot2.ggplot(r_mean_df) + \
    ggplot2.geom_bar(aes_string(x="Species",
                                y="value",
                                group="variable",
                                colour="variable"),
                     position=ggplot2.position_dodge(),
                     stat="identity")
vp_p = grid.viewport(**{'layout.pos.row': 2, 'layout.pos.col':1})
p2.plot(vp = vp_p)
What I get is close to what I want, but the plots are not exactly aligned (shown by the arrows that I added):
I'd like the plot regions (not the legends) to match up exactly. How can that be achieved? The difference here is not large, but as you add conditions to the bar graph below, or make it a dodged bar graph with position_dodge, the differences can become very large and the plots are visibly misaligned.
The standard ggplot solution cannot easily be translated into rpy2:
arrange appears to be grid_arrange in gridExtra:
>>> gridExtra = importr("gridExtra")
>>> gridExtra.grid_arrange
<SignatureTranslatedFunction - Python:0x430f518 / R:0x396f678>
ggplotGrob is not accessible from ggplot2, but can be accessed like this:
>>> ggplot2.ggplot2.ggplotGrob
However, I have no idea how to access grid::unit.pmax:
>>> grid.unit
<bound method type.unit of <class 'rpy2.robjects.lib.grid.Unit'>>
>>> grid.unit("pmax")
Error in (function (x, units, data = NULL) :
argument "units" is missing, with no default
rpy2.rinterface.RRuntimeError: Error in (function (x, units, data = NULL) :
argument "units" is missing, with no default
So it's not clear how to translate the standard ggplot2 solution to rpy2.
Edit: as others pointed out, grid::unit.pmax is grid.unit_pmax. I still don't know how to access the widths parameter of grob objects in rpy2, though, which is necessary to set the widths of both plots to that of the wider plot. I have:
gA = ggplot2.ggplot2.ggplotGrob(p1)
gB = ggplot2.ggplot2.ggplotGrob(p2)
g = importr("grid")
print "gA: ", gA
maxWidth = g.unit_pmax(gA.widths[2:5], gB.widths[2:5])
gA.widths is not the correct syntax. The grob object gA prints as:
gA: TableGrob (8 x 13) "layout": 17 grobs
z cells name grob
1 0 ( 1- 8, 1-13) background rect[plot.background.rect.350]
2 1 ( 4- 4, 4- 4) panel-1 gTree[panel-1.gTree.239]
3 2 ( 4- 4, 7- 7) panel-2 gTree[panel-2.gTree.254]
4 3 ( 4- 4,10-10) panel-3 gTree[panel-3.gTree.269]
5 4 ( 3- 3, 4- 4) strip_t-1 absoluteGrob[strip.absoluteGrob.305]
6 5 ( 3- 3, 7- 7) strip_t-2 absoluteGrob[strip.absoluteGrob.311]
7 6 ( 3- 3,10-10) strip_t-3 absoluteGrob[strip.absoluteGrob.317]
8 7 ( 4- 4, 3- 3) axis_l-1 absoluteGrob[axis-l-1.absoluteGrob.297]
9 8 ( 4- 4, 6- 6) axis_l-2 zeroGrob[axis-l-2.zeroGrob.298]
10 9 ( 4- 4, 9- 9) axis_l-3 zeroGrob[axis-l-3.zeroGrob.299]
11 10 ( 5- 5, 4- 4) axis_b-1 absoluteGrob[axis-b-1.absoluteGrob.276]
12 11 ( 5- 5, 7- 7) axis_b-2 absoluteGrob[axis-b-2.absoluteGrob.283]
13 12 ( 5- 5,10-10) axis_b-3 absoluteGrob[axis-b-3.absoluteGrob.290]
14 13 ( 7- 7, 4-10) xlab text[axis.title.x.text.319]
15 14 ( 4- 4, 2- 2) ylab text[axis.title.y.text.321]
16 15 ( 4- 4,12-12) guide-box gtable[guide-box]
17 16 ( 2- 2, 4-10) title text[plot.title.text.348]
Update: I made some progress on accessing widths, but still cannot translate the solution. To set the widths of the grobs, I have:
# get grobs
gA = ggplot2.ggplot2.ggplotGrob(p1)
gB = ggplot2.ggplot2.ggplotGrob(p2)
g = importr("grid")
# get max width
maxWidth = g.unit_pmax(gA.rx2("widths")[2:5][0], gB.rx2("widths")[2:5][0])
print gA.rx2("widths")[2:5]
wA = gA.rx2("widths")[2:5]
wB = gB.rx2("widths")[2:5]
print "before: ", wA[0]
wA[0] = robj.ListVector(maxWidth)
print "After: ", wA[0]
print "before: ", wB[0]
wB[0] = robj.ListVector(maxWidth)
print "after:", wB[0]
gridExtra.grid_arrange(gA, gB, ncol=1)
It runs but does not work. The output is:
[[1]]
[1] 0.740361111111111cm
[[2]]
[1] 1null
[[3]]
[1] 0.127cm
before: [1] 0.740361111111111cm
After: [1] max(0.740361111111111cm, sum(1grobwidth, 0.15cm+0.1cm))
before: [1] sum(1grobwidth, 0.15cm+0.1cm)
after: [1] max(0.740361111111111cm, sum(1grobwidth, 0.15cm+0.1cm))
Update 2: as @baptiste pointed out, it would be helpful to show the pure R version of what I'm trying to reproduce in rpy2. Here's the pure R version:
library(ggplot2)
library(gridExtra)
df <- data.frame(Species=c("setosa", "virginica", "versicolor"), X=c(1,2,3), Y=c(10,20,30))
p1 <- ggplot(iris) + geom_density(aes(x=Sepal.Width, colour=Species))
# stat="identity" is needed because a y aesthetic is supplied
p2 <- ggplot(df) + geom_bar(aes(x=Species, y=X, colour=Species), stat="identity")
gA <- ggplotGrob(p1)
gB <- ggplotGrob(p2)
# element-wise maximum of the left-hand column widths of both plots
maxWidth = grid::unit.pmax(gA$widths[2:5], gB$widths[2:5])
gA$widths[2:5] <- as.list(maxWidth)
gB$widths[2:5] <- as.list(maxWidth)
grid.arrange(gA, gB, ncol=1)
I think this works in general for two panels with legends that have different facets in ggplot2, and I want to implement it in rpy2.
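To make the intent of the R snippet concrete, here is a minimal plain-Python illustration of the element-wise maximum it relies on. It assumes the widths are simple numbers in centimetres, which is a simplification: real grid units can be symbolic (for example 1null or 1grobwidth), so this only shows the idea and is not a replacement for grid::unit.pmax. The function name pmax and the sample widths are hypothetical.

```python
def pmax(widths_a, widths_b):
    """Element-wise maximum of two equal-length lists of widths (in cm)."""
    if len(widths_a) != len(widths_b):
        raise ValueError("width lists must have the same length")
    return [max(a, b) for a, b in zip(widths_a, widths_b)]


# Hypothetical column widths (cm) for the top and bottom plots.
widths_top = [0.74, 1.20, 0.13]
widths_bottom = [1.05, 0.90, 0.13]

# Both plots would then be given the same, widest, column widths.
shared = pmax(widths_top, widths_bottom)
print(shared)  # [1.05, 1.2, 0.13]
```

Assigning the same shared widths to both plots is what makes the plot regions line up.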
Update 3: I almost got it to work by building a FloatVector up one element at a time:
maxWidth = []
for x, y in zip(gA.rx2("widths")[2:5], gB.rx2("widths")[2:5]):
pmax = g.unit_pmax(x, y)
print "PMAX: ", pmax
val = pmax[1][0][0]
print "VAL->", val
maxWidth.append(val)
gA[gA.names.index("widths")][2:5] = robj.FloatVector(maxWidth)
gridExtra.grid_arrange(gA, gB, ncol=1)
However, this generates a segfault/core dump:
Error: VECTOR_ELT() can only be applied to a 'list', not a 'double'
*** longjmp causes uninitialized stack frame ***: python2.7 terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x37)[0x7f83742e2817]
/lib/x86_64-linux-gnu/libc.so.6(+0x10a78d)[0x7f83742e278d]
/lib/x86_64-linux-gnu/libc.so.6(__longjmp_chk+0x33)[0x7f83742e26f3]
...
7f837591e000-7f8375925000 r--s 00000000 fc:00 1977264 /usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache
7f8375926000-7f8375927000 rwxp 00000000 00:00 0
7f8375927000-7f8375929000 rw-p 00000000 00:00 0
7f8375929000-7f837592a000 r--p 00022000 fc:00 917959 /lib/x86_64-linux-gnu/ld-2.15.so
7f837592a000-7f837592c000 rw-p 00023000 fc:00 917959 /lib/x86_64-linux-gnu/ld-2.15.so
7ffff4b96000-7ffff4bd6000 rw-p 00000000 00:00 0 [stack]
7ffff4bff000-7ffff4c00000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Aborted (core dumped)
Update: the bounty has ended. I appreciate the answers received, but neither answer uses rpy2, and this is an rpy2 question, so technically the answers are not on topic. There is a plain R solution to this problem (even if there isn't a general solution, as @baptiste pointed out), and the question is simply how to translate it into rpy2.
Answer 1:
Aligning two plots becomes much trickier when facets are involved. I don't know if there is a general solution, even in R. Consider this scenario:
p1 <- ggplot(mtcars, aes(mpg, wt)) + geom_point() +
facet_wrap(~ cyl, ncol=2,scales="free")
p2 <- p1 + facet_null() + aes(colour=am) + ylab("this\nis taller")
gridExtra::grid.arrange(p1, p2)
With some work, you can compare the widths for the left axis, and the legends (which may or may not be present on the right side).
library(gtable)
# legend, if it exists, may be the second last item on the right,
# unless it's not on the right side.
locate_guide <- function(g){
right <- max(g$layout$r)
gg <- subset(g$layout, (grepl("guide", g$layout$name) & r == right - 1L) |
r == right)
sort(gg$r)
}
compare_left <- function(g1, g2){
w1 <- g1$widths[1:3]
w2 <- g2$widths[1:3]
unit.pmax(w1, w2)
}
align_lr <- function(g1, g2){
# align the left side
left <- compare_left(g1, g2)
g1$widths[1:3] <- g2$widths[1:3] <- left
# now deal with the right side
gl1 <- locate_guide(g1)
gl2 <- locate_guide(g2)
if(length(gl1) < length(gl2)){
g1$widths[[gl1]] <- max(g1$widths[gl1], g2$widths[gl2[2]]) +
g2$widths[gl2[1]]
}
if(length(gl2) < length(gl1)){
g2$widths[[gl2]] <- max(g2$widths[gl2], g1$widths[gl1[2]]) +
g1$widths[gl1[1]]
}
if(length(gl1) == length(gl2)){
g1$widths[[gl1]] <- g2$widths[[gl2]] <- unit.pmax(g1$widths[gl1], g2$widths[gl2])
}
grid.arrange(g1, g2)
}
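# convert the two plots above into gtables before aligning them
g1 <- ggplotGrob(p1)
g2 <- ggplotGrob(p2)
align_lr(g1, g2)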
Note that I haven't tested other cases; I'm sure it's very easy to break. As far as I understand from the docs, rpy2 provides a mechanism to use an arbitrary piece of R code, so the conversion should not be a problem.
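As a rough, untested sketch of that route, the pure R version from the question can be wrapped in an R function and evaluated through rpy2's robjects.r, which returns a callable Python object. The names R_ALIGN_SOURCE and load_align_function are hypothetical, and the R body is simply the question's own snippet, so it inherits the same limitations (in particular the hard-coded 2:5 column range).

```python
# R source for a function that equalises the left-hand column widths of two
# ggplot objects and stacks them. The body is the question's pure R version.
R_ALIGN_SOURCE = """
function(p1, p2) {
  gA <- ggplot2::ggplotGrob(p1)
  gB <- ggplot2::ggplotGrob(p2)
  maxWidth <- grid::unit.pmax(gA$widths[2:5], gB$widths[2:5])
  gA$widths[2:5] <- as.list(maxWidth)
  gB$widths[2:5] <- as.list(maxWidth)
  gridExtra::grid.arrange(gA, gB, ncol = 1)
}
"""


def load_align_function():
    """Evaluate the R source and return it as a Python-callable R function.

    rpy2 is imported lazily so that this module can be loaded (and the R
    source inspected) on a machine without rpy2 or R installed.
    """
    import rpy2.robjects as robjects
    return robjects.r(R_ALIGN_SOURCE)
```

With rpy2 and the R packages ggplot2 and gridExtra installed, calling load_align_function()(p1, p2) on the two rpy2 ggplot objects from the question should draw the aligned plots, since all grob manipulation then happens on the R side.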
Answer 2:
Split the legends from the plots (see "ggplot separate legend and plot"), then use grid.arrange:
library(gridExtra)
g_legend <- function(a.gplot){
tmp <- ggplot_gtable(ggplot_build(a.gplot))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
legend <- tmp$grobs[[leg]]
legend
}
legend1 <- g_legend(p1)
legend2 <- g_legend(p2)
grid.arrange(p1 + theme(legend.position = 'none'), legend1,
p2 + theme(legend.position = 'none'), legend2,
ncol=2, widths = c(5/6,1/6))
This is obviously the R implementation.
Answer 3:
Untested translation of the answer using gridExtra's grid.arrange(). The left sides of the plots (where the y-axis labels are) might not always be aligned, though.
from rpy2.robjects import FloatVector
from rpy2.robjects.packages import importr
gridextra = importr('gridExtra')
from rpy2.robjects.lib import ggplot2
_ggplot2 = ggplot2.ggplot2
def dollar(x, name): # should be included in rpy2.robjects, may be...
    # equivalent of R's x$name: look the element up by its name
    return x.rx2(name)
def g_legend(a_gplot):
    # build the gtable and pull out the grob named "guide-box" (the legend)
    tmp = _ggplot2.ggplot_gtable(_ggplot2.ggplot_build(a_gplot))
    leg = [dollar(x, 'name')[0] for x in dollar(tmp, 'grobs')].index('guide-box')
    legend = dollar(tmp, 'grobs')[leg]
    return legend
# p1 and p2 are the ggplot objects defined in the question
legend1 = g_legend(p1)
legend2 = g_legend(p2)
nolegend = ggplot2.theme(**{'legend.position': 'none'})
gridextra.grid_arrange(p1 + nolegend, legend1,
                       p2 + nolegend, legend2,
                       ncol=2, widths = FloatVector((5.0/6,1.0/6)))
Source: https://stackoverflow.com/questions/17736434/aligning-distinct-non-facet-plots-in-ggplot2-using-rpy2-in-python