statistics

Getting descriptive statistics with (analytic) weighting using describe() in Python

梦想的初衷 posted on 2021-01-29 12:34:53
Question: I was trying to translate code from Stata to Python. The original code in Stata:

    by year, sort : summarize age [aweight = wt]

Normally a simple describe() call will do:

    dataframe.groupby("year")["age"].describe()

But I could not find a way to translate the aweight option into Python, i.e. to get descriptive statistics of a dataset under analytic (variance) weighting. Code to generate the dataset in Python:

    dataframe = {'year': [2016,2016,2020, 2020], 'age': [41,65, 35,28],
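One way to approach the weighting asked about above is to compute the weighted statistics per group by hand. The sketch below uses only the standard library. The `wt` values are hypothetical, since the dataset is cut off in the excerpt, and the standard deviation rests on the assumption that Stata rescales analytic weights to sum to the number of observations before applying the usual n - 1 correction. The helper name `weighted_describe` is also made up for illustration.

```python
import math
from collections import defaultdict

def weighted_describe(values, weights):
    """Weighted count, mean, std, min and max for one group of observations."""
    n = len(values)
    total_w = sum(weights)
    mean = sum(w * x for x, w in zip(values, weights)) / total_w
    # ASSUMPTION: weights are rescaled to sum to n, then the usual n - 1
    # correction is applied (believed to match Stata's aweight convention).
    ss = sum(w * (x - mean) ** 2 for x, w in zip(values, weights)) * n / total_w
    std = math.sqrt(ss / (n - 1)) if n > 1 else float("nan")
    return {"count": n, "mean": mean, "std": std,
            "min": min(values), "max": max(values)}

# 'year' and 'age' come from the question; the 'wt' values are HYPOTHETICAL.
data = {
    "year": [2016, 2016, 2020, 2020],
    "age": [41, 65, 35, 28],
    "wt": [1.2, 0.7, 0.8, 1.5],
}

# Group ages and weights by year, mirroring `by year, sort`.
groups = defaultdict(lambda: ([], []))
for year, age, wt in zip(data["year"], data["age"], data["wt"]):
    groups[year][0].append(age)
    groups[year][1].append(wt)

summary = {year: weighted_describe(ages, wts)
           for year, (ages, wts) in groups.items()}
for year, stats in summary.items():
    print(year, stats)
```

With equal weights the result reduces to the ordinary unweighted mean and sample standard deviation, which is a quick sanity check for the convention chosen here.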

How to convert percentage to z-score of normal distribution in C/C++?

微笑、不失礼 posted on 2021-01-29 11:54:15
Question: The goal is to say: "These values lie within a band of 95 % of values around the mean in a normal distribution." Now, I am trying to convert a percentage to a z-score so that I can get the precise range of values. Something like <lower bound, upper bound> would be enough. So I need something like:

    double z_score(double percentage) {
        // ...
    }
    // ...
    // according to https://en.wikipedia.org/wiki/68–95–99.7_rule
    z_score(68.27) == 1
    z_score(95.45) == 2
    z_score(99.73) == 3

I found an article
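The conversion described above amounts to evaluating the inverse normal CDF at 0.5 + p/2, where p is the central percentage expressed as a fraction. The question asks for C/C++; the sketch below only illustrates the formula in Python, using `statistics.NormalDist` from the standard library (Python 3.8+). The `band` helper is a hypothetical addition for turning the z-score into a <lower bound, upper bound> pair.

```python
from statistics import NormalDist

def z_score(percentage):
    """Return z such that `percentage` % of a normal distribution
    lies within z standard deviations of the mean."""
    if not 0 < percentage < 100:
        raise ValueError("percentage must be strictly between 0 and 100")
    central = percentage / 100.0
    # Half of the remaining mass sits in each tail, so the upper edge of
    # the central band is the (0.5 + central / 2) quantile.
    return NormalDist().inv_cdf(0.5 + central / 2.0)

def band(mean, sd, percentage):
    """Hypothetical helper: (lower, upper) bounds of the central band."""
    z = z_score(percentage)
    return mean - z * sd, mean + z * sd

for p in (68.27, 95.45, 99.73):
    print(p, round(z_score(p), 3))
```

The same formula carries over to C++ with any inverse normal CDF (quantile) routine.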

Calculating WAIC for models with multiple likelihood functions with pymc3

对着背影说爱祢 posted on 2021-01-29 11:45:01
Question: I am trying to predict the outcome of soccer games based on the number of goals scored, and I use the following model:

    with pm.Model() as model:
        # global model parameters
        h = pm.Normal('h', mu = mu, tau = tau)
        sd_a = pm.Gamma('sd_a', .1, .1)
        sd_d = pm.Gamma('sd_d', .1, .1)
        alpha = pm.Normal('alpha', mu=mu, tau = tau)
        # team-specific model parameters
        a_s = pm.Normal("a_s", mu=0, sd=sd_a, shape=n)
        d_s = pm.Normal("d_s", mu=0, sd=sd_d, shape=n)
        atts = pm.Deterministic('atts', a_s - tt.mean(a_s))
        defs =

Finding data trendlines with Ruby?

久未见 posted on 2021-01-29 10:31:21
Question: I have a dataset with user session numbers from my site which looks like:

    page_1 = [4,2,4,1,2,6,3,2,1,6,2,7,0,0,0]
    page_2 = [6,3,2,3,5,7,9,3,1,6,1,6,2,7,8]
    ...

And so on. I would like to find out whether each page has a positive or negative trendline in terms of growth; I would also like to get the pages that are growing or falling beyond a certain threshold. Python has a ton of solutions and libraries for this kind of task, yet Ruby has only one gem (trendline), which has no code in it. Before
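A trendline of the kind described above can be taken as the slope of an ordinary least-squares line fitted against the sample index. The sketch below shows the arithmetic in Python; it needs no library and ports to Ruby almost line for line. The function names and the threshold of 0.05 sessions per step are arbitrary choices for illustration.

```python
def trend_slope(values):
    """Least-squares slope of values against their index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    return sxy / sxx

def classify(values, threshold=0.05):
    """Label a series by comparing its slope with an (arbitrary) threshold."""
    slope = trend_slope(values)
    if slope > threshold:
        return "rising"
    if slope < -threshold:
        return "falling"
    return "flat"

page_1 = [4, 2, 4, 1, 2, 6, 3, 2, 1, 6, 2, 7, 0, 0, 0]
page_2 = [6, 3, 2, 3, 5, 7, 9, 3, 1, 6, 1, 6, 2, 7, 8]

for name, page in (("page_1", page_1), ("page_2", page_2)):
    print(name, trend_slope(page), classify(page))
```

The slope is in sessions per time step, so the threshold should be chosen in the same units, or the series normalized first if pages with very different traffic are compared.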

Why does my p-value equal 0 and my statistic equal 1 when I use the KS test in Python?

我的未来我决定 posted on 2021-01-29 10:05:22
Question: Thanks in advance to anyone who takes a look. My code is:

    import numpy as np
    from scipy.stats import kstest
    data=[31001, 38502, 40842, 40852, 43007, 47228, 48320, 50500, 54545, 57437, 60126, 65556, 71215, 78460, 81299, 96851, 106472, 108398, 118495, 130832, 141678, 155703, 180689, 218032, 222238, 239553, 250895, 274025, 298231, 330228, 330910, 352058, 362993, 369690, 382487, 397270, 414179, 454013, 504993, 518475, 531767, 551032, 782483, 913658, 1432195, 1712510, 2726323, 2777535, 3996759,
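The excerpt is cut off before the test is actually called, so what follows is only a guess at the cause. ASSUMPTION: the data were compared against the standard normal N(0, 1). Every value is then so far in the upper tail that the reference CDF is 1 at every data point, which forces the KS statistic to 1. The sketch below computes the one-sample KS statistic by hand with the standard library, using only the first few values from the question, and contrasts this with a normal distribution whose mean and standard deviation are taken from the sample.

```python
from statistics import NormalDist, mean, stdev

def ks_statistic(sample, dist):
    """One-sample Kolmogorov-Smirnov statistic against dist.cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = dist.cdf(x)
        # Largest gap between the empirical step function and the
        # reference CDF, checked on both sides of each step.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# First values from the question's data list.
sample = [31001, 38502, 40842, 40852, 43007, 47228, 48320, 50500]

# ASSUMED scenario: comparing raw data with the standard normal.
d_standard = ks_statistic(sample, NormalDist())

# Comparing with a normal whose parameters match the sample instead.
d_fitted = ks_statistic(sample, NormalDist(mean(sample), stdev(sample)))

print(d_standard, d_fitted)
```

Note that when the parameters are estimated from the same data, the standard KS p-value is no longer exact, so the second comparison is only indicative.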

Friedman Rank Sum Test in R: Not an unreplicated complete block design

六眼飞鱼酱① posted on 2021-01-29 08:35:17
Question: I run into an error trying to run the Friedman rank sum test on a data frame. What is a likely cause of this error, and how should I fix it?

Answer 1: Your function call didn't include named arguments, which is a little dangerous at times. Either put them in the order specified in the help page, which is function (y, groups, blocks, ...):

    friedman.test(new_frame$Detail, new_frame$brands, new_frame$factors)

or name them (you can keep your original order):

    friedman

How to implement 1D Kalman filter with other distribution?

对着背影说爱祢 posted on 2021-01-29 05:42:37
Question: I have been through the concept of the 1D Kalman filter, but the material mostly concentrates on equations derived from Gaussian distributions, using the equations in the picture "Gaussian Distribution equations" (they can be found in the following links: Pyata 1D Kalman Filter, 1D Kalman Filter, Sensor Fusion). I have several questions:

Question 1: How can I form the predict and update steps with other distributions (for example, the Bradford distribution)? I looked into the Bradford distribution and
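The picture referred to above is not reproduced in the excerpt. For reference, the sketch below assumes it shows the standard 1D Gaussian Kalman equations: the update step multiplies two Gaussians (prior and measurement), and the predict step adds a Gaussian motion to the current belief. The measurement, motion and noise values in the loop are hypothetical.

```python
def update(mean1, var1, mean2, var2):
    """Measurement update: product of two Gaussians, renormalized."""
    new_mean = (var2 * mean1 + var1 * mean2) / (var1 + var2)
    new_var = 1.0 / (1.0 / var1 + 1.0 / var2)
    return new_mean, new_var

def predict(mean1, var1, mean2, var2):
    """Motion prediction: means add, and so do the variances."""
    return mean1 + mean2, var1 + var2

# HYPOTHETICAL inputs, chosen only to exercise the two steps.
measurements = [5.0, 6.0, 7.0]
motions = [1.0, 1.0, 1.0]
measurement_var, motion_var = 4.0, 2.0

mu, var = 0.0, 1000.0  # vague initial belief
for z, u in zip(measurements, motions):
    mu, var = update(mu, var, z, measurement_var)
    mu, var = predict(mu, var, u, motion_var)
    print(mu, var)
```

These closed forms rely on the product and the sum of Gaussians staying Gaussian. For a distribution without that property, the same predict/update cycle generally has to be carried out numerically, for example on a discretized grid of states.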

Sample from custom distribution in R

霸气de小男生 posted on 2021-01-29 04:15:37
Question: I have implemented an alternate parameterization of the negative binomial distribution in R, like so (also see here):

    nb = function(n, l, a){
      first = choose((n + a - 1), a-1)
      second = (l/(l+a))^n
      third = (a/(l+a))^a
      return(first*second*third)
    }

Here n is the count, l (lambda) is the mean, and a is the overdispersion term. I would like to draw random samples from this distribution in order to validate my implementation of a negative binomial mixture model, but am not sure how to go about doing
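One generic way to draw from a custom discrete distribution like the one above is inverse-CDF sampling: draw a uniform number and walk up the support until the cumulative probability passes it. The sketch below illustrates this in Python rather than R. It rewrites the question's `nb` function with log-gamma terms (so that a need not be an integer), and the sampler name, the parameter values l = 3 and a = 2, and the seed are arbitrary choices for the example.

```python
import math
import random

def nb_pmf(n, l, a):
    """Same parameterization as the question's nb(): mean l, dispersion a.
    choose(n + a - 1, a - 1) is written with log-gamma for numerical safety."""
    log_p = (math.lgamma(n + a) - math.lgamma(a) - math.lgamma(n + 1)
             + n * math.log(l / (l + a)) + a * math.log(a / (l + a)))
    return math.exp(log_p)

def sample_nb(l, a, rng, max_n=100_000):
    """Inverse-CDF sampling: accumulate the pmf until it passes a uniform draw."""
    u = rng.random()
    n = 0
    cumulative = nb_pmf(0, l, a)
    # max_n guards against rounding keeping the running sum just below u.
    while cumulative < u and n < max_n:
        n += 1
        cumulative += nb_pmf(n, l, a)
    return n

rng = random.Random(0)  # fixed seed so the run is reproducible
draws = [sample_nb(3.0, 2.0, rng) for _ in range(20_000)]
sample_mean = sum(draws) / len(draws)
print(sample_mean)  # should land close to the mean parameter l
```

In R itself, `rnbinom(n, size = a, mu = l)` uses the same mean/dispersion parameterization and may be the simpler route for validation.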

How to plot a CDF function from a PDF in R

巧了我就是萌 posted on 2021-01-29 03:28:56
Question: I have the following function:

    fx <- function(x) {
      if(x >= 0 && x < 3) {
        res <- 0.2;
      } else if(x >=3 && x < 5) {
        res <- 0.05;
      } else if(x >= 5 && x < 6) {
        res <- 0.15;
      } else if(x >= 7 && x < 10) {
        res <- 0.05;
      } else {
        res <- 0;
      }
      return(res);
    }

How can I plot its CDF on the interval [0,10]?

Answer 1: To add a bit of accuracy to @Martin Schmelzer's answer: a cumulative distribution function (CDF) evaluated at x is the probability that X will take a value less than or equal to x. So to get
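Following the definition quoted in the answer, the CDF at x is the integral of the density from the lower end of the support up to x. The sketch below illustrates that computation in Python rather than R, using a midpoint-rule running sum on a grid over [0, 10]. The function name `cdf_grid` and the step count are assumptions for the example, and the plotting call is left out.

```python
def fx(x):
    """Piecewise-constant density from the question (zero on [6, 7))."""
    if 0 <= x < 3:
        return 0.2
    if 3 <= x < 5:
        return 0.05
    if 5 <= x < 6:
        return 0.15
    if 7 <= x < 10:
        return 0.05
    return 0.0

def cdf_grid(pdf, lower, upper, steps=1000):
    """Running midpoint-rule integral of pdf; returns grid points and CDF values."""
    h = (upper - lower) / steps
    xs, cdf = [lower], [0.0]
    total = 0.0
    for i in range(steps):
        total += pdf(lower + (i + 0.5) * h) * h  # area of one thin slice
        xs.append(lower + (i + 1) * h)
        cdf.append(total)
    return xs, cdf

xs, cdf = cdf_grid(fx, 0.0, 10.0)
for x in (3, 5, 6, 7, 10):
    print(x, round(cdf[x * 100], 4))
```

Plotting `cdf` against `xs` with any line-plot function then gives the curve; the flat stretch between 6 and 7 reflects the zero density there.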
