quantile

ggplot2 How to create vertical line corresponding to quantile in geom_bar plot

三世轮回 submitted on 2019-12-07 18:45:06
Question: Currently, I can create a plot such as this:

    ggplot(df.Acc, aes(x = reorder(cities, -accidents), y = accidents)) +
      geom_bar(stat = "identity", fill = "steelblue", alpha = 0.75) +
      geom_hline(yintercept = 0, size = 0.4, color = "black")

It's a plot with, let's say, the number of bike accidents per year on the y-axis and the city name on the x-axis. I want to add a vertical line to separate the cities above the 70th percentile from those below it. So I've tried with

    > vlinAcc <- quantile(df

Obtaining nice cuts in Hmisc with cut2 (without the [ ) signs )

佐手、 submitted on 2019-12-07 13:22:55
Question: I'm currently trying to neatly cut data using the Hmisc package, as in the example below:

    dummy <- data.frame(important_variable=seq(1:1000))
    require(Hmisc)
    dummy$cuts <- cut2(dummy$important_variable, g = 4)

The produced cuts are correct with respect to the values:

      important_variable      cuts
    1                  1 [ 1, 251)
    2                  2 [ 1, 251)
    3                  3 [ 1, 251)
    4                  4 [ 1, 251)
    5                  5 [ 1, 251)
    6                  6 [ 1, 251)

    > table(dummy$cuts)
    [ 1, 251) [251, 501) [501, 751) [751,1000]
          250        250        250        250

However, I would like for the data to

Quantile regression and p-values - getting more decimal places

故事扮演 submitted on 2019-12-07 11:27:29
Question: Using R and the quantreg package, I am performing quantile regression analyses on my data. I can access the p-values using the se (standard error) estimator in the summary function, as below; however, I only get 5 decimal places and would like more.

    model <- rq(outcome ~ predictor)
    summary(model, se="ker")

    Call: rq(formula = outcome ~ predictor)

    tau: [1] 0.5

    Coefficients:
                Value    Std. Error t value  Pr(>|t|)
    (Intercept) 78.68182 2.89984    27.13312 0.00000
    predictor    0.22727 0.03885     5.84943 0

Differing quantiles: Boxplot vs. Violinplot

浪子不回头ぞ submitted on 2019-12-07 02:49:22
Question:

    require(ggplot2)
    require(cowplot)
    d = iris
    ggplot2::ggplot(d, aes(factor(0), Sepal.Length)) +
      geom_violin(fill = "black", alpha = 0.2, draw_quantiles = c(0.25, 0.5, 0.75), colour = "red", size = 1.5) +
      stat_boxplot(geom = 'errorbar', width = 0.1) +
      geom_boxplot(width = 0.2) +
      facet_grid(. ~ Species, scales = "free_x") +
      xlab("") +
      ylab(expression(paste("Value"))) +
      coord_cartesian(ylim = c(3.5, 9.5)) +
      scale_y_continuous(breaks = seq(4, 9, 1)) +
      theme(axis.text.x = element_blank(), axis.text.y =

Percentiles from VGAM

痞子三分冷 submitted on 2019-12-06 15:11:14
I am using the following example from the help pages of the VGAM package:

    library(VGAM)
    fit4 <- vgam(BMI ~ s(age, df = c(4, 2)), lms.bcn(zero = 1), data = bmi.nz, trace = TRUE)
    qtplot(fit4, percentiles = c(5, 50, 90, 99), main = "Quantiles", las = 1, xlim = c(15, 90), ylab = "BMI", lwd = 2, lcol = 4)

I am getting a proper graph with it. How can I avoid plotting the points on the graph? I also need to print the values of these percentiles at each of the ages 20, 30, 40, ..., 80 (separately, as a table). How can this be done? Can I use a ggplot()-style command rather than the qtplot() command here? Thanks for your help. How

Calculating Percentiles (Ruby)

丶灬走出姿态 submitted on 2019-12-06 10:50:58
Question: My code is based on the methods described here and here.

    def fraction?(number)
      number - number.truncate
    end

    def percentile(param_array, percentage)
      another_array = param_array.to_a.sort
      r = percentage.to_f * (param_array.size.to_f - 1) + 1
      if r <= 1 then return another_array[0]
      elsif r >= another_array.size then return another_array[another_array.size - 1]
      end
      ir = r.truncate
      another_array[ir] + fraction?((another_array[ir].to_f - another_array[ir - 1].to_f).abs)
    end

Example usage: test_array

Python Pandas - Quantile calculation manually

家住魔仙堡 submitted on 2019-12-06 08:33:37
Question: I am trying to calculate a quantile for a column's values manually, but the value I get from the formula does not match the output from Pandas. I looked around for different solutions but did not find the right answer.

    In [54]: df
    Out[54]:
          data1     data2 key1 key2
    0 -0.204708  1.393406    a  one
    1  0.478943  0.092908    a  two
    2  1.965781  1.246435    a  one

    In [55]: grouped = df.groupby('key1')

    In [56]: grouped['data1'].quantile(0.9)
    Out[56]:
    key1
    a    1.668413

using the
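
For reference, the Pandas output above is consistent with linear interpolation between order statistics, which is the documented default of `quantile` (`interpolation='linear'`). The following is only a minimal sketch under that assumption, using the standard library alone; the function name `linear_quantile` is hypothetical and is not part of Pandas.

```python
def linear_quantile(values, q):
    """Quantile by linear interpolation between neighbouring sorted values.

    Assumption: this mirrors the 'linear' interpolation rule; it is an
    illustrative helper, not a Pandas API.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be in [0, 1]")
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")

    pos = (len(ordered) - 1) * q      # fractional 0-based position
    lower = int(pos)                  # index of the value just below pos
    frac = pos - lower                # how far pos lies towards the next value

    if lower + 1 >= len(ordered):     # q == 1: nothing to interpolate towards
        return ordered[-1]
    return ordered[lower] + frac * (ordered[lower + 1] - ordered[lower])


# The three data1 values shown for key1 == 'a' in the question.
data1_group_a = [-0.204708, 0.478943, 1.965781]
print(round(linear_quantile(data1_group_a, 0.9), 6))  # 1.668413
```

With three values and q = 0.9, the position is (3 - 1) * 0.9 = 1.8, so the result lies 80% of the way from 0.478943 to 1.965781, which reproduces the 1.668413 reported by Pandas.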

Plot quantiles of distribution in ggplot2 with facets

梦想的初衷 submitted on 2019-12-06 05:39:06
I'm currently plotting a number of different distributions of first differences from a number of regression models in ggplot. To facilitate interpretation of the differences, I want to mark the 2.5% and 97.5% percentiles of each distribution. Since I will be doing quite a few plots, and because the data is grouped in two dimensions (model and type), I would like to define and plot the respective percentiles in the ggplot environment. Plotting the distributions using facets gets me exactly where I want, except for the percentiles. I could of course do this more manually, but I would ideally

ggplot2 How to create vertical line corresponding to quantile in geom_bar plot

这一生的挚爱 submitted on 2019-12-06 04:08:32
Currently, I can create a plot such as this:

    ggplot(df.Acc, aes(x = reorder(cities, -accidents), y = accidents)) +
      geom_bar(stat = "identity", fill = "steelblue", alpha = 0.75) +
      geom_hline(yintercept = 0, size = 0.4, color = "black")

It's a plot with, let's say, the number of bike accidents per year on the y-axis and the city name on the x-axis. I want to add a vertical line to separate the cities above the 70th percentile from those below it. So I've tried with

    > vlinAcc <- quantile(df.Cities$accidents, .70)
    > vlinAcc
         70%
    41.26589

This looks good, all the cities which have value of

Obtaining nice cuts in Hmisc with cut2 (without the [ ) signs )

牧云@^-^@ submitted on 2019-12-05 19:06:00
I'm currently trying to neatly cut data using the Hmisc package, as in the example below:

    dummy <- data.frame(important_variable=seq(1:1000))
    require(Hmisc)
    dummy$cuts <- cut2(dummy$important_variable, g = 4)

The produced cuts are correct with respect to the values:

      important_variable      cuts
    1                  1 [ 1, 251)
    2                  2 [ 1, 251)
    3                  3 [ 1, 251)
    4                  4 [ 1, 251)
    5                  5 [ 1, 251)
    6                  6 [ 1, 251)

    > table(dummy$cuts)
    [ 1, 251) [251, 501) [501, 751) [751,1000]
          250        250        250        250

However, I would like the data to be presented slightly differently. For instance, instead of [ 1, 251 ) [ 251, 501 ) I would prefer the