Multi x-axis using ggplot to present z-scores, iq scores and raw data

天涯浪子 提交于 2021-01-24 19:32:05

问题


Just contextualizing, I work with psychometrics/psychological testing. I have a dataset formed of "points","percentile","z_real","z_normal","iq". I would like to have a single ggplot in which I could present the Z_score (from my raw data), the z_score (with an underlying normal distribution), and then have two supplementary x-axes with "raw score" and "iq scores". That's pretty common in statistics, as you can check it below

This is the current plot

This is the best solution I've got

That's the desired plot

I'm working with tidyverse and I would like to remain within it. Some previous posts helped me, such as this [one][5] and this [one][6].

Thank you. The (part of the) data and the codes are here:

ask_ds <- structure(list(points = c(17, 17, 2, 16, 11, 17, 20, 16, 19, 
                                    15, 9, 14, 14, 16, 13, 13, 22, 21, 25, 17, 17, 17, 20, 6, 11, 
                                    5, 10, 23, 21, 19, 11, 15, 13, 17, 17, 17, 9, 18, 12, 22, 21, 
                                    23, 8, 12, 6, 7, 22, 12, 21, 16, 12, 5, 19, 19, 21, 13, 12, 18, 
                                    22, 13, 21, 24, 23, 4, 12, 19, 2, 22, 14, 16, 24, 19, 7, 12, 
                                    18, 14, 9, 18, 7, 22, 3, 18, 14, 22, 15, 11, 15, 19, 25, 9, 19, 
                                    16, 13, 19, 14, 15, 20, 3, 23, 9, 7, 22, 9, 22, 17, 12, 14, 11, 
                                    7, 9, 19, 16, 19, 10, 24, 14, 12, 18, 19, 24, 22, 11, 11, 15, 
                                    17, 15, 11, 11, 14, 24, 14, 21, 16, 10, 19, 21, 15, 20, 15, 20, 
                                    22, 10, 22, 16, 14, 16, 8, 16, 9, 15), percentile = c(63, 61, 
                                                                                          1, 56, 25, 59, 80, 55, 74, 49, 17, 45, 44, 57, 36, 32, 90, 85, 
                                                                                          99, 63, 63, 60, 82, 7, 24, 5, 20, 93, 83, 75, 24, 50, 35, 64, 
                                                                                          62, 61, 16, 65, 28, 90, 85, 93, 12, 28, 7, 8, 90, 26, 87, 55, 
                                                                                          30, 4, 74, 73, 87, 33, 30, 67, 91, 35, 86, 95, 93, 3, 29, 75, 
                                                                                          1, 89, 44, 55, 96, 75, 8, 27, 66, 41, 13, 68, 9, 91, 2, 65, 44, 
                                                                                          89, 51, 22, 46, 74, 98, 13, 76, 58, 37, 73, 43, 50, 79, 2, 93, 
                                                                                          17, 8, 91, 14, 92, 60, 26, 43, 25, 10, 14, 73, 57, 76, 18, 96, 
                                                                                          40, 31, 70, 73, 96, 91, 24, 23, 47, 59, 51, 24, 22, 40, 96, 45, 
                                                                                          85, 57, 17, 73, 86, 50, 79, 48, 81, 91, 21, 89, 53, 43, 58, 11, 
                                                                                          53, 14, 47), z_real = structure(c(0.36, 0.36, -2.38, 0.18, -0.73, 
                                                                                                                            0.36, 0.91, 0.18, 0.73, 0, -1.1, -0.19, -0.19, 0.18, -0.37, -0.37, 
                                                                                                                            1.28, 1.09, 1.83, 0.36, 0.36, 0.36, 0.91, -1.65, -0.73, -1.83, 
                                                                                                                            -0.92, 1.46, 1.09, 0.73, -0.73, 0, -0.37, 0.36, 0.36, 0.36, -1.1, 
                                                                                                                            0.55, -0.55, 1.28, 1.09, 1.46, -1.28, -0.55, -1.65, -1.47, 1.28, 
                                                                                                                            -0.55, 1.09, 0.18, -0.55, -1.83, 0.73, 0.73, 1.09, -0.37, -0.55, 
                                                                                                                            0.55, 1.28, -0.37, 1.09, 1.64, 1.46, -2.01, -0.55, 0.73, -2.38, 
                                                                                                                            1.28, -0.19, 0.18, 1.64, 0.73, -1.47, -0.55, 0.55, -0.19, -1.1, 
                                                                                                                            0.55, -1.47, 1.28, -2.2, 0.55, -0.19, 1.28, 0, -0.73, 0, 0.73, 
                                                                                                                            1.83, -1.1, 0.73, 0.18, -0.37, 0.73, -0.19, 0, 0.91, -2.2, 1.46, 
                                                                                                                            -1.1, -1.47, 1.28, -1.1, 1.28, 0.36, -0.55, -0.19, -0.73, -1.47, 
                                                                                                                            -1.1, 0.73, 0.18, 0.73, -0.92, 1.64, -0.19, -0.55, 0.55, 0.73, 
                                                                                                                            1.64, 1.28, -0.73, -0.73, 0, 0.36, 0, -0.73, -0.73, -0.19, 1.64, 
                                                                                                                            -0.19, 1.09, 0.18, -0.92, 0.73, 1.09, 0, 0.91, 0, 0.91, 1.28, 
                                                                                                                            -0.92, 1.28, 0.18, -0.19, 0.18, -1.28, 0.18, -1.1, 0), .Dim = c(150L, 
                                                                                                                                                                                            1L), "`\`\`\`scaled:center\`\`\``" = 15.0143288084465, "`\`\`\`scaled:scale\`\`\``" = 5.47051980922509), 
                         z_normal = c(0.33, 0.28, -2.33, 0.15, -0.67, 0.23, 0.84, 
                                      0.13, 0.64, -0.03, -0.95, -0.13, -0.15, 0.18, -0.36, -0.47, 
                                      1.28, 1.04, 2.33, 0.33, 0.33, 0.25, 0.92, -1.48, -0.71, -1.64, 
                                      -0.84, 1.48, 0.95, 0.67, -0.71, 0, -0.39, 0.36, 0.31, 0.28, 
                                      -0.99, 0.39, -0.58, 1.28, 1.04, 1.48, -1.17, -0.58, -1.48, 
                                      -1.41, 1.28, -0.64, 1.13, 0.13, -0.52, -1.75, 0.64, 0.61, 
                                      1.13, -0.44, -0.52, 0.44, 1.34, -0.39, 1.08, 1.64, 1.48, 
                                      -1.88, -0.55, 0.67, -2.33, 1.23, -0.15, 0.13, 1.75, 0.67, 
                                      -1.41, -0.61, 0.41, -0.23, -1.13, 0.47, -1.34, 1.34, -2.05, 
                                      0.39, -0.15, 1.23, 0.03, -0.77, -0.1, 0.64, 2.05, -1.13, 
                                      0.71, 0.2, -0.33, 0.61, -0.18, 0, 0.81, -2.05, 1.48, -0.95, 
                                      -1.41, 1.34, -1.08, 1.41, 0.25, -0.64, -0.18, -0.67, -1.28, 
                                      -1.08, 0.61, 0.18, 0.71, -0.92, 1.75, -0.25, -0.5, 0.52, 
                                      0.61, 1.75, 1.34, -0.71, -0.74, -0.08, 0.23, 0.03, -0.71, 
                                      -0.77, -0.25, 1.75, -0.13, 1.04, 0.18, -0.95, 0.61, 1.08, 
                                      0, 0.81, -0.05, 0.88, 1.34, -0.81, 1.23, 0.08, -0.18, 0.2, 
                                      -1.23, 0.08, -1.08, -0.08), iq = c(104.98, 104.19, 65.1, 
                                                                         102.26, 89.88, 103.41, 112.62, 101.88, 109.65, 99.62, 85.69, 
                                                                         98.12, 97.74, 102.65, 94.62, 92.98, 119.22, 115.55, 134.9, 
                                                                         104.98, 104.98, 103.8, 113.73, 77.86, 89.41, 75.33, 87.38, 
                                                                         122.14, 114.31, 110.12, 89.41, 100, 94.22, 105.38, 104.58, 
                                                                         104.19, 85.08, 105.78, 91.26, 119.22, 115.55, 122.14, 82.38, 
                                                                         91.26, 77.86, 78.92, 119.22, 90.35, 116.9, 101.88, 92.13, 
                                                                         73.74, 109.65, 109.19, 116.9, 93.4, 92.13, 106.6, 120.11, 
                                                                         94.22, 116.2, 124.67, 122.14, 71.79, 91.7, 110.12, 65.1, 
                                                                         118.4, 97.74, 101.88, 126.26, 110.12, 78.92, 90.81, 106.19, 
                                                                         96.59, 83.1, 107.02, 79.89, 120.11, 69.19, 105.78, 97.74, 
                                                                         118.4, 100.38, 88.42, 98.49, 109.65, 130.81, 83.1, 110.59, 
                                                                         103.03, 95.02, 109.19, 97.35, 100, 112.1, 69.19, 122.14, 
                                                                         85.69, 78.92, 120.11, 83.8, 121.08, 103.8, 90.35, 97.35, 
                                                                         89.88, 80.78, 83.8, 109.19, 102.65, 110.59, 86.27, 126.26, 
                                                                         96.2, 92.56, 107.87, 109.19, 126.26, 120.11, 89.41, 88.92, 
                                                                         98.87, 103.41, 100.38, 89.41, 88.42, 96.2, 126.26, 98.12, 
                                                                         115.55, 102.65, 85.69, 109.19, 116.2, 100, 112.1, 99.25, 
                                                                         113.17, 120.11, 87.9, 118.4, 101.13, 97.35, 103.03, 81.6, 
                                                                         101.13, 83.8, 98.87)), row.names = c(NA, -150L), class = c("tbl_df", 
                                                                                                                                    "tbl", "data.frame"))
ask_ds %>% 
  select(z_real, z_normal) %>% 
  pivot_longer(everything()) %>% 
  ggplot(., aes(value, fill = name)) + 
  geom_density(alpha=.2) +
  scale_x_continuous(breaks=-3:3, labels=parse(text=paste(-3:3, '*sigma')) ,
                     "Standard deviation", 
                     sec.axis = sec_axis(trans = ~.* sd(ask_ds$points) +
                                           mean(ask_ds$points), 
                                         "Raw score")) 


gridExtra::grid.arrange(ask_ds %>% 
                          select(z_real, z_normal) %>% 
                          pivot_longer(everything()) %>% 
                          ggplot(., aes(value, fill = name)) + 
                          geom_density(alpha=.2) ,
                        
                        ggplot(ask_ds, aes(points)) +
                          geom_density()
                        
)

回答1:


I hadn't known about the patchwork thing - that's really cool, thanks @teunbrand. I have a less elegant solution, but thought I would post it just the same. Here, you're making the axes by hand inside the plotting region.

## turn your data into a stand alone dataset
plot.dat <- ask_ds %>% 
  select(z_real, z_normal) %>% 
  pivot_longer(everything()) 
## identify the values for the x-axis
xv <- c(-2, -1, 0, 1, 2)
## identify the middle values for the y-tick marks
yv1 <- c(-.025,-.025,-.025,-.025,-.025)
yv2 <- c(-.065,-.065,-.065,-.065,-.065)
## identify the transformations for the tick mark labels on other axes
rv <- xv* sd(ask_ds$points) + mean(ask_ds$points)
iq <- xv*15 + 100

## make the plot
ggplot() + 
  geom_density(data=plot.dat, aes(value, fill = name), alpha=.2) +
  ## add an abline that will serve as the raw data axis
  geom_abline(slope=0, intercept=-.025) + 
  ## add segments for the tick marks
  geom_segment(aes(x=xv, y=yv1 - .005, xend=xv, yend=yv1 + .005)) + 
  ## add the text labels
  geom_text(aes(x=xv, y = yv1 - .011, label=sprintf("%.2f", rv))) + 
  ## repeat a bit lower for the IQ axis
  geom_abline(slope=0, intercept=-.065) + 
  geom_segment(aes(x=xv, y=yv2 - .005, xend=xv, yend=yv2 + .005)) +  
  geom_text(aes(x=xv, y = yv2 - .011, label=sprintf("%.2f", iq))) + 
  ## format the actual x-axis
  scale_x_continuous(breaks=-3:3, labels=parse(text=paste(-3:3, '*sigma')) ,
                     "Standard deviation") + 
  ## add a y-axis to identify the two new scales.
  scale_y_continuous(sec.axis = sec_axis(~., 
                                        breaks = c(-.025, -.065), 
                                        labels = c("Raw Data", "IQ"))) + 
  ## put the legend on top - this keeps it from being pushed aside for the second
  ## y-axis labels. 
  theme(legend.position="top")




回答2:


I think the core of the question here is how to get multiple x-axes, so I'm going to answer that question instead of trying to understand what your data is about. There is no 'proper' way to introduce multiple x-axis in ggplot2, however, you can work around this by putting a bunch of miniscule height plots below each other. For this I recommend the patchwork package.

We first make the plot that we want to illustrate, I'll do this with some dummy data:

library(ggplot2)
library(patchwork)

set.seed(0)
df <- data.frame(
  x = rnorm(100, 100, 15)
)

dens <- ggplot(df, aes(x)) +
  geom_density() +
  labs(x = "IQ") +
  theme(axis.line.x = element_line())

Next, we'll make two dummy plots where the x-axis will be trained, but no geom is drawn. Also the aspect ratio is set to a small number, to make it appear as if this is just the axis line and no plot occured. This generally only works whenever there is a linear translation from one axis to the next. I'm no psychometrician but I thought that IQ was roughly ~N(100, 15), so we need to inverse transform this to the standard normal for a Z-score.

zscore_axis <- ggplot(df, aes((x - 100) / 15)) +
  labs(x = "Z-score") +
  theme(axis.line.x = element_line(),
        aspect.ratio = 1e-8)

quantile_axis <- ggplot(df, aes((x - 100) / 15)) +
  labs(x = "Quantile") +
  scale_x_continuous(breaks = qnorm(seq(0.1, 0.9, by = 0.1)),
                     labels = seq(0.1, 0.9, by = 0.1)) +
  theme(axis.line.x = element_line(),
        aspect.ratio = 1e-8)

Lastly, with patchwork magic we place all three plots below oneanother.

dens / zscore_axis / quantile_axis

Created on 2020-09-17 by the reprex package (v0.3.0)

So that is how I would make multiple x-axis. How this would play with your data, I'm not exactly sure. You can adapt to your own metrics, such as raw score, as long as they are a linear transformation of the data.



来源:https://stackoverflow.com/questions/63943336/multi-x-axis-using-ggplot-to-present-z-scores-iq-scores-and-raw-data

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!