quantile

Function to automatically create vector in a large list for each element of the large list

情到浓时终转凉″ submitted on 2019-12-24 19:55:30
Question: I have a single data frame with the following structure: A.Data is a numeric vector, and A.Quartile records which quartile (Q1, Q2, Q3, Q4) each value of A.Data belongs to. I used very similar code to create the quantiles and the Q each value belongs to:
quantile(x <- rnorm(1001))
list2env(setNames(as.list(quantile(x <- rnorm(1001))),paste0("Q",1:5)),.GlobalEnv)
Now (and here is my problem) I have a .csv that I imported into R, with more than 400

How to find in which quantile bin does a number fall

旧街凉风 submitted on 2019-12-23 03:45:16
Question: I know how to find the quantiles of an empirical distribution:
set.seed(1)
x = rnorm(100)
q = quantile(x, prob=seq(0,1,.01))
Is there a function that would give me the quantile bin that a number from the training set belongs to? In this example:
R) x[1]
[1] -0.6264538107
R) q
             0%              1%              2%              3%              4%              5%              6%              7%              8%
-2.214699887177 -1.991605177777 -1.808646490230 -1.532008555284 -1.472864960560 -1.381744198182 -1.282620249360 -1.255240516814 -1.226934277726
             9%             10%             11%             12%             13%             14%             15%             16%             17%
-1.137935552774 -1
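A minimal sketch of one way to answer this kind of question, written in Python with only the standard library and not in R. It assumes the bins are half-open intervals (previous cut, cut], and the helper name `quantile_bin` is hypothetical. The idea is a binary search of the value against the sorted cut points.

```python
import bisect
import random
import statistics

random.seed(1)
x = [random.gauss(0.0, 1.0) for _ in range(100)]

# 99 interior cut points (1%, 2%, ..., 99%). The "inclusive" method
# interpolates between the sample minimum and maximum.
cuts = statistics.quantiles(x, n=100, method="inclusive")


def quantile_bin(value, cut_points):
    """Return the 0-based bin index; bin k covers (cut k-1, cut k]."""
    # bisect_left counts the cut points strictly below the value.
    return bisect.bisect_left(cut_points, value)


bins = [quantile_bin(v, cuts) for v in x]
```

With 99 cut points the result is an index from 0 to 99, so the sample minimum lands in bin 0 and the sample maximum in bin 99.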

Understanding and implementing numerical integration with a quantile function in R

旧城冷巷雨未停 submitted on 2019-12-20 04:58:14
Question: I need to calculate the integral below using R. I managed to obtain the q_theta(x) function in R with quantile regression (package: quantreg).
matrix=structure(c(0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.2, 0.21, 0.22, 0.23, 0.24, 0.25, 0.26, 0.27, 0.28, 0.29, 0.3, 0.31, 0.32, 0.33, 0.34, 0.35, 0.36, 0.37, 0.38, 0.39, 0.4, 0.41, 0.42, 0.43, 0.44, 0.45, 0.46, 0.47, 0.48, 0.49, 0.5, 0.51, 0.52, 0.53, 0.54, 0.55, 0.56, 0.57,

finding quartiles

拟墨画扇 submitted on 2019-12-19 06:03:38
Question: I've written a program where the user can enter any number of values into a vector and it's supposed to return the quartiles, but I keep getting a "vector subscript out of range" error:
#include "stdafx.h"
#include <iostream>
#include <string>
#include <algorithm>
#include <iomanip>
#include <ios>
#include <vector>
int main () {
    using namespace std;
    cout << "Enter a list of numbers: ";
    vector<double> quantile;
    double x;
    //invariant: homework contains all the homework grades so far
    while (cin
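The program above is cut off, so the cause of the error cannot be read from it. As a hedged illustration only, here is a Python sketch (not the asker's C++) of quartiles computed with the median-of-halves convention, with an explicit length check before any indexing. The function names are hypothetical, and other quartile conventions give slightly different values.

```python
def median(sorted_vals):
    """Median of an already sorted, non-empty list."""
    n = len(sorted_vals)
    mid = n // 2
    if n % 2:
        return sorted_vals[mid]
    return (sorted_vals[mid - 1] + sorted_vals[mid]) / 2


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    if len(values) < 2:
        # Guard first: with fewer than two values a half would be empty,
        # and indexing into it is the kind of access that goes out of range.
        raise ValueError("need at least two values")
    s = sorted(values)
    half = len(s) // 2
    lower = s[:half]
    upper = s[half + len(s) % 2:]  # skip the middle element when n is odd
    return median(lower), median(s), median(upper)
```

For example, `quartiles([1, 2, 3, 4, 5, 6, 7, 8])` splits the data into `[1, 2, 3, 4]` and `[5, 6, 7, 8]` and takes the median of each half.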

TypeError: can't multiply sequence by non-int of type 'float' (python 2.7)

|▌冷眼眸甩不掉的悲伤 submitted on 2019-12-18 06:55:13
Question: I have a dataframe t_unit, which is the result of a pd.read_csv() function.
datetime          B18_LR_T  B18_B1_T
24/03/2016 09:00  21.274    21.179
24/03/2016 10:00  19.987    19.868
24/03/2016 11:00  21.632    21.417
24/03/2016 12:00  26.285    24.779
24/03/2016 13:00  26.897    24.779
I am resampling the dataframe to calculate the 5th and 95th percentiles with the code:
keys_actual = list(t_unit.columns.values)
for key in keys_actual:
    ts_wk = t_unit[key].resample('W-MON')
    ts_wk_05p = ts_wk.apply(lambda x: x.quantile(0

Plotting a line graph with scale.quantile()

别等时光非礼了梦想. submitted on 2019-12-13 06:35:45
Question: I am trying to plot a sorted array of normally distributed data so that it plots as a straight line. I would like to do this using a cumulative distribution function, which I think is also known as a quantile function. Unfortunately, I haven't found many examples that use the quantile scale. Here is my attempt to use the quantile scale: http://jsfiddle.net/tbcholla/hmFqJ/3/. I set up my x scale this way:
var x = d3.scale
    .quantile()
    .range(d3.range(0,width,1))//this will create an array from 0 to
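For context, the quantile function is the inverse of the cumulative distribution function, and pairing sorted data with theoretical normal quantiles is the usual normal Q-Q construction. The sketch below shows that construction in Python with the standard library. It is a swapped-in technique, not d3's quantile scale, and the plotting positions (i + 0.5) / n are one common choice assumed here.

```python
import random
from statistics import NormalDist

random.seed(0)
data = sorted(random.gauss(50.0, 10.0) for _ in range(200))
n = len(data)

# Theoretical standard-normal quantiles at the plotting positions (i + 0.5) / n.
std_normal = NormalDist()
theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]

# Each point is (theoretical quantile, observed value). For normal data
# these fall close to a straight line.
points = list(zip(theoretical, data))

# Least-squares line through the points: the slope estimates the standard
# deviation and the intercept estimates the mean.
mean_t = sum(theoretical) / n
mean_d = sum(data) / n
slope = sum((t - mean_t) * (d - mean_d) for t, d in points) / sum(
    (t - mean_t) ** 2 for t in theoretical
)
intercept = mean_d - slope * mean_t
```

Plotting `points` with an ordinary linear scale on both axes then gives the straight line, with no quantile scale needed.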

pandas: qcut error: ValueError: Bin edges must be unique:

[亡魂溺海] submitted on 2019-12-12 19:06:45
Question: I am trying to compute percentiles of two columns using the pandas qcut method, as below:
my_df['float_col_quantile'] = pd.qcut(my_df['float_col'], 100, labels=False)
my_df['int_col_quantile'] = pd.qcut(my_df['int_col'].astype(float), 100, labels=False)
The column float_col_quantile works fine, but the column int_col_quantile raises the following error. Any idea what I did wrong here? And how can I fix this problem? Thanks!
ValueError Traceback (most recent call last) <ipython-input-19
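This error usually appears when many tied values make several quantile edges identical, which integer columns are prone to. pandas' `qcut` accepts `duplicates="drop"` to merge such edges. The sketch below uses only the Python standard library (not pandas) and made-up data to show the repeated edges, then one workaround: rank the values first, breaking ties by position, and bin the ranks.

```python
import statistics

# Made-up, tie-heavy integer column with 100 values.
int_col = [0] * 60 + [1] * 25 + [2] * 10 + [3, 4, 5, 6, 7]

# Decile edges: with 60% zeros, most of the lower edges are all 0.
edges = statistics.quantiles(int_col, n=10, method="inclusive")
has_duplicate_edges = len(set(edges)) < len(edges)

# Workaround: give every row a unique rank. sorted() is stable, so ties
# are broken by original position. Then cut the ranks into equal bins.
order = sorted(range(len(int_col)), key=lambda i: int_col[i])
ranks = [0] * len(int_col)
for rank, idx in enumerate(order):
    ranks[idx] = rank

n_bins = 10
bin_of = [r * n_bins // len(int_col) for r in ranks]
```

The trade-off is that equal values can end up in different bins. Whether that is acceptable depends on what the bins are used for.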

Calculate quantiles for large data

坚强是说给别人听的谎言 submitted on 2019-12-12 18:33:56
Question: I have about 300 files, each containing 1000 time series realisations (~76 MB per file). I want to calculate the quantiles (0.05, 0.50, 0.95) at each time step from the full set of 300000 realisations. I cannot merge the realisations into one file because it would become too large. What's the most efficient way of doing this? Each matrix is generated by running a model; however, here is a sample containing random numbers:
x <- matrix(rexp(10000000, rate=.1), nrow=1000)
Answer 1: There are at
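One possible approach, sketched in Python and not taken from the truncated answer: a quantile at a given time step needs every realisation for that step, but not every time step at once. The data can therefore be processed in blocks of time steps, reading the same column slice from each file. The loader `load_columns` is hypothetical and generates toy data here; in practice it would read a column range from disk.

```python
import random
import statistics

# Toy sizes standing in for 300 files x 1000 realisations x many time steps.
N_FILES, ROWS_PER_FILE, N_STEPS = 3, 50, 8


def load_columns(file_id, start, stop):
    """Hypothetical loader: rows of one file, time steps [start, stop) only."""
    rng = random.Random(file_id)  # deterministic stand-in for a file on disk
    rows = [[rng.expovariate(0.1) for _ in range(N_STEPS)]
            for _ in range(ROWS_PER_FILE)]
    return [row[start:stop] for row in rows]


def quantiles_per_step(n_files, n_steps, block=4):
    """Return a (5%, 50%, 95%) tuple for every time step."""
    result = []
    for start in range(0, n_steps, block):
        stop = min(start + block, n_steps)
        columns = [[] for _ in range(stop - start)]
        # Gather this block of time steps from every file.
        for file_id in range(n_files):
            for row in load_columns(file_id, start, stop):
                for k, value in enumerate(row):
                    columns[k].append(value)
        for col in columns:
            # n=20 gives cut points at 5%, 10%, ..., 95%.
            cuts = statistics.quantiles(col, n=20, method="inclusive")
            result.append((cuts[0], cuts[9], cuts[18]))
    return result


step_quantiles = quantiles_per_step(N_FILES, N_STEPS)
```

Memory use is bounded by the block width times the total number of realisations, at the cost of opening each file once per block.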

How to replace outliers with the 5th and 95th percentile values in R

风流意气都作罢 submitted on 2019-12-12 08:12:15
Question: I'd like to replace all values in my relatively large R dataset that lie above the 95th or below the 5th percentile with those percentile values, respectively. My aim is to avoid simply cropping these outliers from the data entirely. Any advice would be much appreciated; I can't find any information on how to do this anywhere else.
Answer 1: This would do it.
fun <- function(x){
    quantiles <- quantile( x, c(.05, .95 ) )
    x[ x < quantiles[1] ] <- quantiles[1]
    x[ x > quantiles[2] ] <-
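The same idea (often called winsorizing) can be sketched in Python with the standard library, for readers who want a self-contained version. The function name `clamp_to_percentiles` is hypothetical, and `statistics.quantiles` requires Python 3.8 or later. Its "inclusive" method interpolates linearly between order statistics, which may not match every quantile definition.

```python
import statistics


def clamp_to_percentiles(values):
    """Replace values below the 5th / above the 95th percentile with those percentiles."""
    # n=20 yields cut points at 5%, 10%, ..., 95%; keep the first and last.
    cuts = statistics.quantiles(values, n=20, method="inclusive")
    lo, hi = cuts[0], cuts[-1]
    return [min(max(v, lo), hi) for v in values]


clamped = clamp_to_percentiles(list(range(101)))
```

The output has the same length as the input, so no observations are dropped. Only the tails are pulled in to the two percentile values.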

d3 quantile or quartile scale given the quartile values

十年热恋 submitted on 2019-12-12 05:33:28
Question: The current quantile scale takes all the input values as the domain to map to the output range. But if the data is extremely large, I want the processing to happen on the server, which then gives me the quartile values. So I get:
var quartiles=[5, 10, 15, 20, 25, 30, 35, 40, 45]; // 9 values with the mean (25) at the middle and standard deviations to each side
var valueToMark = 37;
Using d3, how do I correctly create a quantile scale and mark them all on a line given only the quantiles and value to mark? p