quantile | 易学教程

Quantile regression and p-values - getting more decimal places

Question: Using R and the quantreg package, I am performing quantile regression analyses on my data. I can get access to the p-values using the se (standard error) estimator in the summary function, as below; however, I only get 5 decimal places and would like more.

    model <- rq(outcome ~ predictor)
    summary(model, se="ker")

    Call: rq(formula = outcome ~ predictor)
    tau: [1] 0.5
    Coefficients:
                Value    Std. Error t value  Pr(>|t|)
    (Intercept) 78.68182 2.89984    27.13312 0.00000
    predictor    0.22727 0.03885     5.84943 0

Differing quantiles: Boxplot vs. Violinplot

Question:

    require(ggplot2)
    require(cowplot)
    d = iris
    ggplot2::ggplot(d, aes(factor(0), Sepal.Length)) +
      geom_violin(fill="black", alpha=0.2, draw_quantiles = c(0.25, 0.5, 0.75), colour = "red", size = 1.5) +
      stat_boxplot(geom ='errorbar', width = 0.1) +
      geom_boxplot(width = 0.2) +
      facet_grid(. ~ Species, scales = "free_x") +
      xlab("") +
      ylab (expression(paste("Value"))) +
      coord_cartesian(ylim = c(3.5,9.5)) +
      scale_y_continuous(breaks = seq(4, 9, 1)) +
      theme(axis.text.x=element_blank(), axis.text.y =

Percentiles from VGAM

I am using the following example from the help pages of the VGAM package:

    library(VGAM)
    fit4 <- vgam(BMI ~ s(age, df = c(4, 2)), lms.bcn(zero = 1), data = bmi.nz, trace = TRUE)
    qtplot(fit4, percentiles = c(5,50,90,99), main = "Quantiles", las = 1, xlim = c(15, 90), ylab = "BMI", lwd = 2, lcol = 4)

I am getting a proper graph with it. How can I keep the points from being plotted on the graph? I also need to print out the values of these percentiles at each of the ages 20, 30, 40, ..., 80 (separately, as a table). How can this be done? Can I use a ggplot() command rather than qtplot() here? Thanks for your help. How

Calculating Percentiles (Ruby)

Question: My code is based on the methods described here and here.

    def fraction?(number)
      number - number.truncate
    end

    def percentile(param_array, percentage)
      another_array = param_array.to_a.sort
      r = percentage.to_f * (param_array.size.to_f - 1) + 1
      if r <= 1 then return another_array[0]
      elsif r >= another_array.size then return another_array[another_array.size - 1]
      end
      ir = r.truncate
      another_array[ir] + fraction?((another_array[ir].to_f - another_array[ir - 1].to_f).abs)
    end

Example usage: test_array

Python Pandas - Quantile calculation manually

Question: I am trying to calculate the quantile of a column's values manually, but the value I get from the formula does not match the result that Pandas returns. I looked around for different solutions, but did not find the right answer.

    In [54]: df
    Out[54]:
          data1     data2 key1 key2
    0 -0.204708  1.393406    a  one
    1  0.478943  0.092908    a  two
    2  1.965781  1.246435    a  one

    In [55]: grouped = df.groupby('key1')

    In [56]: grouped['data1'].quantile(0.9)
    Out[56]:
    key1
    a    1.668413

using the
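For reference, the value shown in the excerpt is consistent with linear interpolation between order statistics, which is the default `interpolation='linear'` behaviour of `quantile` in pandas. The sketch below reproduces it by hand in plain Python; the function name `linear_quantile` is hypothetical, and the data are the three `data1` values of group `a` from the question.

```python
import math

def linear_quantile(values, q):
    """Quantile via linear interpolation between neighbouring order statistics.

    The fractional position in the sorted data is q * (n - 1), counted from 0.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)              # fractional, zero-based position
    lower = math.floor(pos)                   # index of the value just below
    upper = min(lower + 1, len(ordered) - 1)  # index of the value just above
    frac = pos - lower                        # distance travelled towards upper
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

# data1 values for key1 == 'a', copied from the question
data1 = [-0.204708, 0.478943, 1.965781]
result = linear_quantile(data1, 0.9)
print(round(result, 6))
```

With three values the position is 0.9 * 2 = 1.8, so the result lies 80% of the way from 0.478943 to 1.965781.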

Plot quantiles of distribution in ggplot2 with facets

I'm currently plotting a number of different distributions of first differences from a number of regression models in ggplot. To facilitate interpretation of the differences, I want to mark the 2.5% and the 97.5% percentiles of each distribution. Since I will be doing quite a few plots, and because the data is grouped in two dimensions (model and type), I would like to define and plot the respective percentiles in the ggplot environment. Plotting the distributions using facets gets me exactly where I want to be, except for the percentiles. I could of course do this more manually, but I would ideally
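As an illustration of the "more manual" route mentioned above, here is a minimal sketch (in Python rather than R) of computing the 2.5% and 97.5% cut points for every (model, type) cell before plotting. The row layout, the function name `interval_by_group`, and the toy data are all assumptions made for this example; it relies on `statistics.quantiles`, which needs at least two values per group.

```python
from collections import defaultdict
from statistics import quantiles

def interval_by_group(rows):
    """Return {(model, kind): (p2.5, p97.5)} for rows of (model, kind, value)."""
    groups = defaultdict(list)
    for model, kind, value in rows:
        groups[(model, kind)].append(value)

    bounds = {}
    for key, values in groups.items():
        # n=40 gives cut points at 1/40, 2/40, ..., 39/40;
        # the first is the 2.5% point and the last is the 97.5% point.
        cuts = quantiles(values, n=40, method="inclusive")
        bounds[key] = (cuts[0], cuts[-1])
    return bounds

# Hypothetical data: two facets with 41 evenly spaced values each
rows = [("m1", "a", float(v)) for v in range(41)]
rows += [("m2", "b", float(v)) for v in range(0, 82, 2)]

bounds = interval_by_group(rows)
for key, (low, high) in sorted(bounds.items()):
    print(key, low, high)
```

A per-facet table like this is the kind of summary that can then be handed to a plotting layer that draws one pair of vertical lines per panel.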

ggplot2 How to create vertical line corresponding to quantile in geom_bar plot

Currently, I can create a plot such as this (image: geom_bar):

    ggplot(df.Acc, aes(x = reorder(cities, -accidents), y = accidents)) +
      geom_bar(stat = "identity", fill="steelblue", alpha=0.75) +
      geom_hline(yintercept=0, size=0.4, color="black")

It's a plot with, let's say, the number of bike accidents per year on the y-axis and the city name on the x-axis. I want to add a vertical line to separate the cities above the 70th percentile from those below it. So I've tried with

    > vlinAcc <- quantile(df.Cities$accidents, .70)
    > vlinAcc
         70%
    41.26589

This looks good, all the cities which have value of
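One way to think about the vertical line: the quantile is a value on the y-axis, while the line needs a position on the discrete x-axis. When the bars are sorted in descending order and occupy positions 1, 2, 3, ..., the boundary sits half a step after the last bar whose value exceeds the threshold. The sketch below works this out in Python rather than R; the function name `split_position` and the city data are hypothetical, and the interpolation method is assumed to be the linear one.

```python
from statistics import quantiles

def split_position(accidents, percent=70):
    """Return (threshold, x) for bars sorted by descending value.

    x is the position between the last bar above the threshold and the
    first bar at or below it, assuming bars sit at positions 1, 2, 3, ...
    """
    values = list(accidents.values())
    # n=100 yields 99 cut points; index percent - 1 is the requested percentile
    threshold = quantiles(values, n=100, method="inclusive")[percent - 1]
    above = sum(1 for v in values if v > threshold)
    return threshold, above + 0.5

# Hypothetical accident counts for ten cities
accidents = {f"city{i}": 10 * i for i in range(1, 11)}

threshold, x = split_position(accidents)
print(threshold, x)
```

The returned x would then be the intercept of the vertical line, while the threshold itself would only be useful for a horizontal line.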

Obtaining nice cuts in Hmisc with cut2 (without the [ ) signs )

I'm currently trying to neatly cut data using the Hmisc package, as in the example below:

    dummy <- data.frame(important_variable=seq(1:1000))
    require(Hmisc)
    dummy$cuts <- cut2(dummy$important_variable, g = 4)

The produced cuts are correct with respect to the values:

      important_variable      cuts
    1                  1 [ 1, 251)
    2                  2 [ 1, 251)
    3                  3 [ 1, 251)
    4                  4 [ 1, 251)
    5                  5 [ 1, 251)
    6                  6 [ 1, 251)

    > table(dummy$cuts)
    [ 1, 251) [251, 501) [501, 751) [751,1000]
          250        250        250        250

However, I would like for the data to be presented slightly differently. For instance instead of [ 1, 251 ) [ 251, 501 ) I would prefer the