quantile | 易学教程

Plotting Quantile regression with full range in ggplot using facet_wrap

Question: I would like to plot the quantile lines over the full range when using facet_wrap. The code is as follows:

    library(tidyverse)
    library(quantreg)

    mtcars %>%
      gather("variable", "value", -c(3, 10)) %>%
      ggplot(aes(value, disp)) +
      geom_point(aes(color = factor(gear))) +
      geom_quantile(quantiles = 0.5,
                    aes(group = factor(gear), color = factor(gear))) +
      facet_wrap(~variable, scales = "free")
    #> [multiple warnings removed for clarity]

Created on 2019-12-05 by the reprex package (v0.3.0)

purrr apply functions to lists in data.frame over specifed dimensions

Question: The following produces a data frame with <int[]> and <list[]> fields:

    library(tidyverse)
    set.seed(123)
    s <- 4
    data <- data.frame(
      lamda = c(5, 2, 3),
      meanlog = c(9, 10, 11),
      sdlog = c(2, 2.1, 2.2)
    )
    data2 <- data %>%
      mutate(
        freq = map(lamda, ~rpois(s, .x)),
        freqsev = map(freq, ~map(.x, function(k) rlnorm(k, meanlog, sdlog)))
      )

Output:

    as_tibble(data2)
      lamda meanlog sdlog freq      freqsev
      <dbl>   <dbl> <dbl> <list>    <list>
    1     5       9   2   <int [4]> <list [4]>
    2     2      10   2.1 <int [4]> <list [4]>
    3     3      11   2.2 <int

How to compute moving (or rolling, if you will) percentile/quantile for a 1d array in numpy?

Question: In pandas, we have pd.rolling_quantile(). And in numpy, we have np.percentile(), but I'm not sure how to do the rolling/moving version of it. To explain what I mean by moving/rolling percentile/quantile: given the array [1, 5, 7, 2, 4, 6, 9, 3, 8, 10], the moving 0.5 quantile (i.e. the moving 50th percentile) with window size 3 is:

    1
    5 - 1 5 7 -> 0.5 quantile = 5
    7 - 5 7 2 -> 5
    2 - 7 2 4 -> 4
    4 - 2 4 6 -> 4
    6 - 4 6 9 -> 6
    9 - 6 9 3 -> 6
    3 - 9 3 8 -> 8
    8 - 3 8 10 -> 8
    10

So [5, 5, 4, 4, 6, 6, 8, 8]
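One possible way to get the moving quantile described in this question, sketched under the assumption that NumPy 1.20 or later is available (for `sliding_window_view`). The helper name `rolling_quantile` is made up for illustration, and only full windows are evaluated, as in the example above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_quantile(values, window, q):
    """Quantile q of every full window of length `window` (hypothetical helper)."""
    arr = np.asarray(values, dtype=float)
    # One row per window, shape (len(arr) - window + 1, window); this is a view,
    # so the input data is not copied.
    windows = sliding_window_view(arr, window)
    # Reduce each row to its q-quantile (NumPy's default linear interpolation).
    return np.quantile(windows, q, axis=1)


data = [1, 5, 7, 2, 4, 6, 9, 3, 8, 10]
result = rolling_quantile(data, window=3, q=0.5)
print(result.tolist())  # [5.0, 5.0, 4.0, 4.0, 6.0, 6.0, 8.0, 8.0]
```

The view-based approach keeps the code short, but every window is still sorted independently, so for very long arrays and large windows a dedicated rolling implementation may be faster.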

Replace outliers by quantiles in R

Question: I have been trying to replace outliers (values more than 1.5*IQR beyond the upper/lower quantile) with the upper and lower quantile, using the following code:

    lower.quantile <- as.numeric(summary(loans$dINC_A)[2])
    lower.quantile
    [1] 9000
    upper.quantile <- as.numeric(summary(loans$dINC_A)[5])
    upper.quantile
    [1] 21240
    IQR <- upper.quantile - lower.quantile
    # I replace outliers by the lower/upper bound values
    loans$INC_A[ loans$dINC_A < (lower.quantile-1.5*IQR) ] <- lower.quantile
    loans$INC_A[ loans$dINC_A > (upper
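For comparison, the same idea can be sketched in Python with NumPy. This is an illustration, not the asker's code: the function name and the sample values are invented, the quartiles come from `np.quantile` with its default interpolation (which need not agree exactly with R's `summary()`), and the outlier test and the assignment both refer to the same array.

```python
import numpy as np


def replace_outliers_with_quartiles(values):
    """Replace points beyond the 1.5*IQR fences by the nearer quartile."""
    x = np.asarray(values, dtype=float)
    q1, q3 = np.quantile(x, [0.25, 0.75])
    iqr = q3 - q1
    out = x.copy()
    # Masks are computed on the original data, then applied to the copy.
    out[x < q1 - 1.5 * iqr] = q1
    out[x > q3 + 1.5 * iqr] = q3
    return out


incomes = [1, 2, 3, 4, 100]  # made-up sample with one high outlier
cleaned = replace_outliers_with_quartiles(incomes)
print(cleaned.tolist())  # [1.0, 2.0, 3.0, 4.0, 4.0]
```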

Compute sample statistics for a data vector with ties which is stored as a frequency table

Question: I am trying to get some summary statistics (mean, variance and quantiles) from a data vector with tied values. It is stored as a frequency distribution table: the unique data values var and the number of ties frequency. I know I could use the rep function to first expand the vector to its full format:

    xx <- rep(mydata$var, mydata$frequency)

and then compute the standard statistics:

    mean(xx)
    var(xx)
    quantile(xx)

But the frequencies are really large and I have many unique values, which makes the program really slow.
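A minimal Python sketch of how these statistics could be computed straight from the frequency table, without expanding it. The function name `freq_stats` and the sample table are assumptions for illustration. The variance uses the n - 1 denominator, and the quantiles use linear interpolation between order statistics (the default of both R's `quantile()` and `np.quantile`), locating each order statistic through the cumulative frequencies.

```python
import numpy as np


def freq_stats(values, counts, probs=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Mean, sample variance and quantiles from (value, frequency) pairs."""
    values = np.asarray(values, dtype=float)
    counts = np.asarray(counts, dtype=np.int64)
    order = np.argsort(values)
    values, counts = values[order], counts[order]

    n = counts.sum()
    mean = np.dot(values, counts) / n
    var = np.dot(counts, (values - mean) ** 2) / (n - 1)

    # cum[i] = number of expanded observations <= values[i]
    cum = np.cumsum(counts)
    # Fractional 0-based rank of each requested quantile in the expanded vector.
    h = (n - 1) * np.asarray(probs, dtype=float)
    lo = np.floor(h).astype(np.int64)
    hi = np.ceil(h).astype(np.int64)
    # The observation at 0-based rank r is the first value whose cum exceeds r.
    v_lo = values[np.searchsorted(cum, lo, side="right")]
    v_hi = values[np.searchsorted(cum, hi, side="right")]
    quantiles = v_lo + (h - lo) * (v_hi - v_lo)
    return mean, var, quantiles


table_values = [3.0, 1.0, 2.0, 5.0]  # made-up unique values
table_counts = [2, 4, 1, 3]          # made-up frequencies
mean, var, quantiles = freq_stats(table_values, table_counts)
print(mean, var, quantiles)
```

The cost depends on the number of unique values rather than on the total frequency, which is the point of avoiding the expansion.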

Continuous quantiles of a scatterplot

Question: I have a data set for which I graphed a regression (using ggplot2's stat_smooth):

    ggplot(data = mydf, aes(x=time, y=pdm)) +
      geom_point() +
      stat_smooth(col="red")

I'd also like to have the quantiles (if it's simpler, having only the quartiles will do) using the same method. All I manage to get is the following:

    ggplot(data = mydf, aes(x=time, y=pdm, z=surface)) +
      geom_point() +
      stat_smooth(col="red") +
      stat_quantile(quantiles = c(0.25,0.75))

Unfortunately, I can't put method="loess" in

Using cut2 from Hmisc to calculate cuts for different number of groups

Question: I was trying to calculate equal quantile cuts for a vector by using cut2 from Hmisc.

    library(Hmisc)
    c <- c(-4.18304,-3.18343,-2.93237,-2.82836,-2.13478,-2.01892,-1.88773,
           -1.83124,-1.74953,-1.74858,-0.63265,-0.59626,-0.5681)
    cut2(c, g=3, onlycuts=TRUE)
    [1] -4.18304 -2.01892 -1.74858 -0.56810

But I was expecting the following result (33%, 33%, 33%):

    [1] -4.18304 -2.13478 -1.74858 -0.56810

Should I still use cut2 or try something different? How can I make it work? Thanks for your advice.

Answer 1:

Inverse function of an unknown cumulative function

Question: I'm working with a data file whose observations are random values. I don't know the distribution of x (my observations). I'm using the density function to estimate the density, because I must apply a kernel estimation:

    T=density(datafile[,1],bw=sj,kernel="epanechnikov")

After this I must integrate the result, because I'm looking for a quantile (similar to VaR, 95%). For this I have 2 options: ecdf() and quantile(). Now I have the value of the quantile 95, but this is the data
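One way to read the question is: integrate the kernel density estimate numerically and invert the resulting CDF. The Python sketch below illustrates that reading only. The helper `kde_quantile`, the fixed bandwidth and the simulated sample are all assumptions; the bandwidth here is the half-width of the Epanechnikov kernel, which is not the same scaling as the `bw` argument of R's `density()`, and no bandwidth selection is attempted.

```python
import numpy as np


def kde_quantile(sample, prob, bandwidth, grid_size=1024):
    """Quantile of an Epanechnikov kernel density estimate (hypothetical helper)."""
    x = np.asarray(sample, dtype=float)
    # The kernel has support [-1, 1], so pad the grid by one bandwidth per side.
    grid = np.linspace(x.min() - bandwidth, x.max() + bandwidth, grid_size)
    u = (grid[:, None] - x[None, :]) / bandwidth
    kernel = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)
    density = kernel.sum(axis=1) / (x.size * bandwidth)
    # Cumulative trapezoidal integration of the density gives the CDF on the grid.
    steps = np.diff(grid) * 0.5 * (density[1:] + density[:-1])
    cdf = np.concatenate(([0.0], np.cumsum(steps)))
    cdf /= cdf[-1]  # remove the small discretisation error so the CDF ends at 1
    # Invert the CDF by interpolating the grid as a function of the CDF values.
    return float(np.interp(prob, cdf, grid))


rng = np.random.default_rng(0)
sample = rng.normal(size=2000)  # simulated stand-in for the data file
q95 = kde_quantile(sample, 0.95, bandwidth=0.3)
print(q95, np.quantile(sample, 0.95))
```

The smoothed quantile and the plain sample quantile should be close for a sample of this size; they differ more when the bandwidth is large relative to the spread of the data.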

Percentiles from VGAM

Question: I am using the following example from the help pages of the VGAM package:

    library(VGAM)
    fit4 <- vgam(BMI ~ s(age, df = c(4, 2)), lms.bcn(zero = 1),
                 data = bmi.nz, trace = TRUE)
    qtplot(fit4, percentiles = c(5,50,90,99), main = "Quantiles", las = 1,
           xlim = c(15, 90), ylab = "BMI", lwd = 2, lcol = 4)

I get a proper graph with it. How can I avoid plotting the points in the graph? I also need to print the values of these percentiles at each of the ages 20, 30, 40, ..., 80 (separately, as a table). How can this be

Getting SciPy quantiles to match Stata xtile function

Question: I've inherited some old Stata code (Stata 11) that uses the xtile function to categorize the observations in a vector by their quantiles (in this case, just the standard 5 quintiles: 20%, 40%, 60%, 80%, 100%). I'm trying to replicate a piece of the code in Python, and I am using the scipy.stats.mstats function mquantiles() for the computation. As near as I can tell from the Stata documentation and from searching online, the Stata xtile method tries to invert the empirical CDF of the data, and uses the equal