quantile

Plotting Quantile regression with full range in ggplot using facet_wrap

冷眼眸甩不掉的悲伤 submitted on 2019-12-11 18:45:07
Question: I would like to plot quantile regression lines across the full range when using facet_wrap. The code is as follows:

    library(tidyverse)
    library(quantreg)

    mtcars %>%
      gather("variable", "value", -c(3, 10)) %>%
      ggplot(aes(value, disp)) +
      geom_point(aes(color = factor(gear))) +
      geom_quantile(quantiles = 0.5, aes(group = factor(gear), color = factor(gear))) +
      facet_wrap(~variable, scales = "free")
    #> [multiple warnings removed for clarity]

Created on 2019-12-05 by the reprex package (v0.3.0)

purrr apply functions to lists in data.frame over specified dimensions

蹲街弑〆低调 submitted on 2019-12-11 17:53:29
Question: The following produces a data frame with <int[]> and <list[]> fields:

    library(tidyverse)
    set.seed(123)
    s <- 4
    data <- data.frame(
      lamda = c(5, 2, 3),
      meanlog = c(9, 10, 11),
      sdlog = c(2, 2.1, 2.2)
    )
    data2 <- data %>%
      mutate(
        freq = map(lamda, ~rpois(s, .x)),
        freqsev = map(freq, ~map(.x, function(k) rlnorm(k, meanlog, sdlog)))
      )

Output:

    as_tibble(data2)
      lamda meanlog sdlog freq      freqsev
      <dbl>   <dbl> <dbl> <list>    <list>
    1     5       9   2   <int [4]> <list [4]>
    2     2      10   2.1 <int [4]> <list [4]>
    3     3      11   2.2 <int

How to compute moving (or rolling, if you will) percentile/quantile for a 1d array in numpy?

て烟熏妆下的殇ゞ submitted on 2019-12-11 07:32:24
Question: pandas has pd.rolling_quantile(), and numpy has np.percentile(), but I'm not sure how to compute a rolling/moving version of the latter. To explain what I mean by moving/rolling percentile/quantile: given the array [1, 5, 7, 2, 4, 6, 9, 3, 8, 10], the moving 0.5 quantile (i.e. the moving 50th percentile) with window size 3 is:

    1
    5  - 1 5 7  -> 0.5 quantile = 5
    7  - 5 7 2  -> 5
    2  - 7 2 4  -> 4
    4  - 2 4 6  -> 4
    6  - 4 6 9  -> 6
    9  - 6 9 3  -> 6
    3  - 9 3 8  -> 8
    8  - 3 8 10 -> 8
    10

So [5, 5, 4, 4, 6, 6, 8, 8]
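The moving quantile described in this question can be sketched with NumPy alone. This is a minimal illustration, not an answer taken from the original thread: the helper name rolling_quantile is hypothetical, and it assumes NumPy 1.20 or later, where numpy.lib.stride_tricks.sliding_window_view is available.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_quantile(values, window, q):
    """Quantile q of every full window of length `window` (hypothetical helper)."""
    arr = np.asarray(values, dtype=float)
    # View of shape (len(arr) - window + 1, window); no data is copied here.
    windows = sliding_window_view(arr, window_shape=window)
    # Reduce along the window axis to get one quantile per window.
    return np.quantile(windows, q, axis=-1)


data = [1, 5, 7, 2, 4, 6, 9, 3, 8, 10]
result = rolling_quantile(data, window=3, q=0.5)
print(result)
```

For the array in the question this reproduces the eight values listed there. Only full windows are evaluated, so the output is shorter than the input by window - 1 elements.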

Replace outliers by quantiles in R

﹥>﹥吖頭↗ submitted on 2019-12-11 06:52:49
Question: I have been trying to replace outliers (values more than 1.5*IQR beyond the upper/lower quantile) with the upper and lower quantile, using the following code:

    lower.quantile <- as.numeric(summary(loans$dINC_A)[2])
    lower.quantile
    [1] 9000
    upper.quantile <- as.numeric(summary(loans$dINC_A)[5])
    > upper.quantile
    [1] 21240
    IQR <- upper.quantile - lower.quantile
    # I replace outliers by the lower/upper bound values
    loans$INC_A[ loans$dINC_A < (lower.quantile-1.5*IQR) ] <- lower.quantile
    loans$INC_A[ loans$dINC_A > (upper

Compute sample statistics for a data vector with ties which is stored as a frequency table

妖精的绣舞 submitted on 2019-12-11 04:21:21
Question: I am trying to get some summary statistics (mean, variance and quantiles) from a data vector with tied values. The data is stored as a frequency distribution table: the unique data values are in var and the number of ties for each value is in frequency. I know I could use the rep function to first expand the vector to its full format:

    xx <- rep(mydata$var, mydata$frequency)

and then use the standard functions:

    mean(xx)
    var(xx)
    quantile(xx)

But the frequencies are really large and I have many unique values, which makes the program really slow.
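One way to avoid expanding the vector is to work on the frequency table directly. The sketch below is written in Python with NumPy rather than R, and it is not taken from the original thread. The function name freq_stats and the toy data are hypothetical. It assumes integer counts, the sample variance with an n - 1 denominator (as in R's var), and quantiles obtained by linear interpolation between order statistics, which is the rule R's quantile() uses by default (type 7).

```python
import numpy as np


def freq_stats(values, counts, probs=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Mean, sample variance and quantiles from a frequency table (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    counts = np.asarray(counts, dtype=np.int64)

    # Sort the unique values so cumulative counts describe the ordered sample.
    order = np.argsort(values)
    values, counts = values[order], counts[order]

    n = counts.sum()
    mean = np.dot(values, counts) / n
    # Sample variance with the n - 1 denominator.
    var = np.dot(counts, (values - mean) ** 2) / (n - 1)

    # 0-based fractional position of each quantile in the implied sorted sample.
    h = (n - 1) * np.asarray(probs, dtype=float)
    lo = np.floor(h).astype(np.int64)
    hi = np.ceil(h).astype(np.int64)

    # The element at 0-based position i is the first value whose cumulative count exceeds i.
    cum = np.cumsum(counts)
    v_lo = values[np.searchsorted(cum, lo, side="right")]
    v_hi = values[np.searchsorted(cum, hi, side="right")]

    quantiles = v_lo + (h - lo) * (v_hi - v_lo)
    return mean, var, quantiles


# Toy frequency table, compared against the expanded vector as a sanity check.
vals = [3.0, 1.0, 2.0]
cnts = [3, 2, 1]
mean, var, qs = freq_stats(vals, cnts)
expanded = np.repeat(vals, cnts)
print(mean, var, qs)
```

The cost depends on the number of unique values, not on the total of the frequencies, which is the point of skipping the expansion step.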

Continuous quantiles of a scatterplot

有些话、适合烂在心里 submitted on 2019-12-10 19:31:45
Question: I have a data set for which I plotted a regression (using ggplot2's stat_smooth):

    ggplot(data = mydf, aes(x=time, y=pdm)) +
      geom_point() +
      stat_smooth(col="red")

I'd also like to have the quantiles (if it's simpler, only the quartiles will do) using the same method. All I have managed to get is the following:

    ggplot(data = mydf, aes(x=time, y=pdm, z=surface)) +
      geom_point() +
      stat_smooth(col="red") +
      stat_quantile(quantiles = c(0.25,0.75))

Unfortunately, I can't put method="loess" in

Using cut2 from Hmisc to calculate cuts for different number of groups

懵懂的女人 submitted on 2019-12-10 17:22:18
Question: I was trying to calculate equal quantile cuts for a vector by using cut2 from Hmisc.

    library(Hmisc)
    c <- c(-4.18304,-3.18343,-2.93237,-2.82836,-2.13478,-2.01892,-1.88773,
           -1.83124,-1.74953,-1.74858,-0.63265,-0.59626,-0.5681)
    cut2(c, g=3, onlycuts=TRUE)
    [1] -4.18304 -2.01892 -1.74858 -0.56810

But I was expecting the following result (33%, 33%, 33%):

    [1] -4.18304 -2.13478 -1.74858 -0.56810

Should I still use cut2 or try something different? How can I make it work? Thanks for your advice.

Answer 1:

Inverse function of an unknown cumulative function

空扰寡人 submitted on 2019-12-10 13:22:18
Question: I'm working with a data file whose observations are random values. I don't know the distribution of x (my observations). I'm using the function density to estimate the density, because I must apply a kernel estimation.

    T=density(datafile[,1],bw=sj,kernel="epanechnikov")

After this I must integrate the estimate, because I'm looking for a quantile (similar to VaR, 95%). For this I have two options: ecdf() and quantile(). Now I have the value of the 95% quantile, but this is the data

Percentiles from VGAM

白昼怎懂夜的黑 submitted on 2019-12-08 04:13:15
Question: I am using the following example from the help pages of the package VGAM:

    library(VGAM)
    fit4 <- vgam(BMI ~ s(age, df = c(4, 2)), lms.bcn(zero = 1), data = bmi.nz, trace = TRUE)
    qtplot(fit4, percentiles = c(5,50,90,99), main = "Quantiles", las = 1,
           xlim = c(15, 90), ylab = "BMI", lwd = 2, lcol = 4)

I get a proper graph with it. How can I avoid plotting the points on the graph? I also need to print out the values of these percentiles at each of the ages 20, 30, 40, ..., 80 (separately, as a table). How can this be

Getting SciPy quantiles to match Stata xtile function

流过昼夜 submitted on 2019-12-08 03:24:13
Question: I've inherited some old Stata code (Stata11) that uses the xtile function to categorize observations in a vector by their quantiles (in this case, just the standard 5 quintiles: 20%, 40%, 60%, 80%, 100%). I'm trying to replicate a piece of the code in Python, and I am using the scipy.stats.mstats function mquantiles() for the computation. As near as I can tell from the Stata documentation and from searching online, the Stata xtile method tries to invert the empirical CDF of the data, and uses the equal