quantile

Determine a normal distribution given its quantile information

我的未来我决定 posted on 2019-12-05 18:14:51
I was wondering how I could have R tell me the SD (passed as an argument to R's built-in qnorm()) for a normal distribution whose 95% limit values are already known.

As an example, I know the two 95% limit values for my normal distribution are 158 and 168. In the R code below, the SD is shown as "x". If "y" (the result of this simple qnorm() call) needs to be (158, 168), can R tell me what x should be?

    y <- qnorm(c(.025,.975), 163, x)

A general procedure for Normal distribution

Suppose we have a Normal distribution X ~ N(mu, sigma), with unknown mean mu and unknown standard deviation
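
The idea behind such a procedure can be sketched outside R as well. Each known quantile q at probability p satisfies q = mu + sigma * z, where z is the standard normal quantile at p, so two known quantiles give two linear equations in mu and sigma. The following is a minimal Python sketch under that assumption, not the answer's own code; the function name normal_from_quantiles is hypothetical, and the limits 158 and 168 are taken to be the 2.5% and 97.5% quantiles as in the question.

```python
from statistics import NormalDist

def normal_from_quantiles(p1, q1, p2, q2):
    """Return (mu, sigma) of the normal whose p1- and p2-quantiles are q1 and q2."""
    std = NormalDist()            # standard normal N(0, 1)
    z1 = std.inv_cdf(p1)
    z2 = std.inv_cdf(p2)
    # q = mu + sigma * z holds at both points: solve the two linear equations.
    sigma = (q2 - q1) / (z2 - z1)
    mu = q1 - sigma * z1
    return mu, sigma

mu, sigma = normal_from_quantiles(0.025, 158, 0.975, 168)
print(mu, sigma)

# Round trip: the fitted distribution should reproduce the two limits.
fitted = NormalDist(mu, sigma)
print(fitted.inv_cdf(0.025), fitted.inv_cdf(0.975))
```

Because the two limits are symmetric around 163, the recovered mean agrees with the 163 used in the question's qnorm() call.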

Differing quantiles: Boxplot vs. Violinplot

吃可爱长大的小学妹 posted on 2019-12-05 06:41:58
    require(ggplot2)
    require(cowplot)
    d = iris
    ggplot2::ggplot(d, aes(factor(0), Sepal.Length)) +
      geom_violin(fill="black", alpha=0.2, draw_quantiles = c(0.25, 0.5, 0.75), colour = "red", size = 1.5) +
      stat_boxplot(geom ='errorbar', width = 0.1) +
      geom_boxplot(width = 0.2) +
      facet_grid(. ~ Species, scales = "free_x") +
      xlab("") +
      ylab(expression(paste("Value"))) +
      coord_cartesian(ylim = c(3.5,9.5)) +
      scale_y_continuous(breaks = seq(4, 9, 1)) +
      theme(axis.text.x=element_blank(),
            axis.text.y = element_text(size = rel(1.5)),
            axis.ticks.x = element_blank(),
            strip.background=element_rect(fill="black"),

Quantile functions in boost (C++)

独自空忆成欢 posted on 2019-12-04 18:41:30
Question: Judging from the documentation, Boost seems to offer quantile functions (inverse CDF functions) for both normal and gamma distributions, but it's not clear to me how I can actually use them. Could someone paste an example, please?

Answer 1: The quantile calculation is implemented as a free function. Here's an example:

    #include <boost/math/distributions/normal.hpp>

    boost::math::normal dist(0.0, 1.0);
    // 95% of distribution is below q:
    double q = quantile(dist, 0.95);

You can also get the complement

Calculating Percentiles (Ruby)

倾然丶 夕夏残阳落幕 posted on 2019-12-04 17:54:49
My code is based on the methods described here and here.

    def fraction?(number)
      number - number.truncate
    end

    def percentile(param_array, percentage)
      another_array = param_array.to_a.sort
      r = percentage.to_f * (param_array.size.to_f - 1) + 1
      if r <= 1 then return another_array[0]
      elsif r >= another_array.size then return another_array[another_array.size - 1]
      end
      ir = r.truncate
      another_array[ir] + fraction?((another_array[ir].to_f - another_array[ir - 1].to_f).abs)
    end

Example usage:

    test_array = [95.1772, 95.1567, 95.1937, 95.1959, 95.1442, 95.061, 95.1591, 95.1195, 95.1065, 95.0925, 95.199,
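
For comparison, here is a minimal Python sketch of the usual linear-interpolation percentile built on the same 1-based rank r = p * (n - 1) + 1. It is not a translation of the Ruby code above: it starts from the element just below the rank and adds the fractional part of the rank times the gap to the next element, whereas the snippet takes the fractional part of the gap itself. Whether that difference is what the question is about cannot be told from the truncated text, so treat this as an illustration only.

```python
import math

def percentile(values, fraction):
    """Linear-interpolation percentile; fraction is in [0, 1]."""
    data = sorted(values)
    if not data:
        raise ValueError("values must not be empty")
    # 1-based fractional rank, using the same formula as the snippet above.
    r = fraction * (len(data) - 1) + 1
    if r <= 1:
        return data[0]
    if r >= len(data):
        return data[-1]
    ir = math.floor(r)           # integer part of the rank
    frac = r - ir                # fractional part of the rank
    lower, upper = data[ir - 1], data[ir]   # neighbours in 0-based indexing
    return lower + frac * (upper - lower)

print(percentile([1, 2, 3, 4], 0.5))
print(percentile([15, 20, 35, 40, 50], 0.4))
```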

Python Pandas - Quantile calculation manually

天大地大妈咪最大 posted on 2019-12-04 11:38:25
I am trying to calculate the quantile of a column's values manually, but the value I get from the formula does not match the result that Pandas returns. I looked around for different solutions, but did not find the right answer.

    In [54]: df
    Out[54]:
          data1     data2 key1 key2
    0 -0.204708  1.393406    a  one
    1  0.478943  0.092908    a  two
    2  1.965781  1.246435    a  one

    In [55]: grouped = df.groupby('key1')

    In [56]: grouped['data1'].quantile(0.9)
    Out[56]:
    key1
    a    1.668413

Using the formula to find it manually, n is 3, as there are 3 values in the data1 column: quantile(n+1) applying the values
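
The value 1.668413 shown above can be reproduced by hand with linear interpolation at the 0-based position q * (n - 1) in the sorted data, rather than a position based on n + 1. The sketch below uses plain Python so the arithmetic is visible; linear_quantile is a hypothetical helper name, and the claim that this matches the Pandas default interpolation is an assumption that is checked here only against the three values from the question.

```python
def linear_quantile(values, q):
    """Quantile by linear interpolation at 0-based position q * (n - 1)."""
    data = sorted(values)
    pos = q * (len(data) - 1)            # for n = 3, q = 0.9: pos = 1.8
    lo = int(pos)                        # index of the lower neighbour
    hi = min(lo + 1, len(data) - 1)      # index of the upper neighbour
    return data[lo] + (pos - lo) * (data[hi] - data[lo])

data1 = [-0.204708, 0.478943, 1.965781]
result = linear_quantile(data1, 0.9)
# 0.478943 + 0.8 * (1.965781 - 0.478943)
print(round(result, 6))
```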

How can I compute statistics by decile groups in data.table

☆樱花仙子☆ posted on 2019-12-04 11:05:51
Question: I have a data.table and would like to compute stats by groups.

    R) set.seed(1)
    R) DT=data.table(a=rnorm(100),b=rnorm(100))

Those groups should be defined by

    R) quantile(DT$a,probs=seq(.1,.9,.1))
               10%            20%            30%            40%            50%            60%            70%            80%            90%
    -1.05265747329 -0.61386923071 -0.37534201964 -0.07670312896  0.11390916079  0.37707993057  0.58121734252  0.77125359976  1.18106507751

How can I compute, say, the average of b per bin? For example, if b=-.5 I am within [-0.61386923071,-0.37534201964], so in bin 3.

Answer 1: How about : >

How to replace outliers with the 5th and 95th percentile values in R

流过昼夜 posted on 2019-12-03 15:47:08
I'd like to replace all values in my relatively large R dataset that lie above the 95th or below the 5th percentile with those percentile values, respectively. My aim is to avoid simply cropping these outliers from the data entirely. Any advice would be much appreciated; I can't find any information on how to do this anywhere else.

This would do it.

    fun <- function(x){
      quantiles <- quantile( x, c(.05, .95 ) )
      x[ x < quantiles[1] ] <- quantiles[1]
      x[ x > quantiles[2] ] <- quantiles[2]
      x
    }
    fun( yourdata )

You can do it in one line of code using squish():

    d2 <- squish(d, quantile(d, c(
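
The same clamping idea (often called winsorizing) can be sketched in Python with only the standard library. This is an illustrative analogue of fun() above, not a port of squish(); the helper name clamp_to_5_95 is hypothetical, and statistics.quantiles with method="inclusive" is assumed to be an acceptable percentile definition here, which may differ slightly from the one R uses by default.

```python
import statistics

def clamp_to_5_95(values):
    """Replace values outside the 5th-95th percentile range with those limits."""
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(values, n=20, method="inclusive")
    low, high = cuts[0], cuts[-1]        # 5th and 95th percentiles
    return [min(max(v, low), high) for v in values]

data = list(range(1, 101))
clamped = clamp_to_5_95(data)
print(min(clamped), max(clamped))
```

Values inside the range are returned unchanged, and the output keeps the same length and order as the input, so no observations are dropped.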

Quantile functions in boost (C++)

有些话、适合烂在心里 posted on 2019-12-03 12:59:43
Judging from the documentation, Boost seems to offer quantile functions (inverse CDF functions) for both normal and gamma distributions, but it's not clear to me how I can actually use them. Could someone paste an example, please?

The quantile calculation is implemented as a free function. Here's an example:

    #include <boost/math/distributions/normal.hpp>

    boost::math::normal dist(0.0, 1.0);
    // 95% of distribution is below q:
    double q = quantile(dist, 0.95);

You can also get the complement (quantile from the right) using:

    // 95% of distribution is above qc:
    double qc = quantile(complement(dist, 0
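
To make the relationship between the two calls concrete, here is a small Python sketch using statistics.NormalDist from the standard library. It is an analogue rather than Boost code: Python has no complement() wrapper, so the right-tail quantile is written as inv_cdf(1 - p). The probability 0.95 is chosen for illustration and is not a reconstruction of the truncated argument above.

```python
from statistics import NormalDist

dist = NormalDist(mu=0.0, sigma=1.0)
p = 0.95  # illustrative probability

q = dist.inv_cdf(p)        # a fraction p of the distribution lies below q
qc = dist.inv_cdf(1 - p)   # a fraction p of the distribution lies above qc

print(q, qc)
```

For a symmetric distribution such as the standard normal, qc is simply -q. A dedicated complement form is mainly useful when p is very close to 1, where computing 1 - p explicitly loses precision.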

How can I compute statistics by decile groups in data.table

萝らか妹 posted on 2019-12-03 07:58:13
I have a data.table and would like to compute stats by groups.

    R) set.seed(1)
    R) DT=data.table(a=rnorm(100),b=rnorm(100))

Those groups should be defined by

    R) quantile(DT$a,probs=seq(.1,.9,.1))
               10%            20%            30%            40%            50%            60%            70%            80%            90%
    -1.05265747329 -0.61386923071 -0.37534201964 -0.07670312896  0.11390916079  0.37707993057  0.58121734252  0.77125359976  1.18106507751

How can I compute, say, the average of b per bin? For example, if b=-.5 I am within [-0.61386923071,-0.37534201964], so in bin 3.

How about :

    > DT[, mean(b), keyby=cut(a,quantile(a,probs=seq(.1,.9,.1)))]
                  cut          V1
    1:             NA -0.31359818
    2: (-1.05,-0.614] -0
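
The NA group in the output above arises because the breaks only run from the 10% to the 90% quantile, so values of a outside that range fall into no interval. A minimal Python sketch of the same binning idea follows, using only the standard library; it assigns every value to one of ten bins, including the two tails. The variable names are illustrative, the random numbers will not match R's set.seed(1), and method="inclusive" is an assumption about which quantile definition to use.

```python
import bisect
import random
import statistics
from collections import defaultdict

random.seed(1)
a = [random.gauss(0, 1) for _ in range(100)]
b = [random.gauss(0, 1) for _ in range(100)]

# Nine interior cut points (10%, 20%, ..., 90%) of a.
cuts = statistics.quantiles(a, n=10, method="inclusive")

groups = defaultdict(list)
for x, y in zip(a, b):
    # bisect_left yields right-closed bins (lo, hi], numbered 1..10, so values
    # below the 10% cut or above the 90% cut still land in a bin.
    decile = bisect.bisect_left(cuts, x) + 1
    groups[decile].append(y)

mean_b = {k: statistics.fmean(v) for k, v in sorted(groups.items())}
for k, m in mean_b.items():
    print(k, round(m, 4))
```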

Find top deciles from dataframe by group

不打扰是莪最后的温柔 posted on 2019-12-02 11:39:15
Question: I am attempting to create new variables using a function and lapply rather than working directly in the data with loops. I used to use Stata and would have solved this problem with a method similar to that discussed here. Since naming variables programmatically is so difficult, or at least awkward, in R (and it seems you can't use indexing with assign), I have left the naming process until after the lapply. I am then using a for loop to do the renaming prior to merging, and again for the merging.