normal-distribution

Python: Sample from multivariate normal with N means and same covariance matrix

白昼怎懂夜的黑 posted on 2021-02-19 08:14:11
Question: Suppose I want to sample 10 times from multiple normal distributions with the same covariance matrix (identity) but different means, which are stored as rows of the following matrix:
    means = np.array([[1, 5, 2], [6, 2, 7], [1, 8, 2]])
How can I do that in the most efficient way possible (i.e. avoiding loops)? I tried
    scipy.stats.multivariate_normal(means, np.eye(2)).rvs(10)
and
    np.random.multivariate_normal(means, np.eye(2))
but both throw an error saying the mean should be 1D. Slow
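One loop-free way to approach the question above, sketched with NumPy only: because the covariance is the identity, every coordinate is independent, so standard normal noise can be drawn in a single call and shifted by the means through broadcasting. The output shape (10, 3, 3), the seed, and the names `n_draws`, `noise` and `samples` are choices made for this illustration and do not come from the original question.

```python
import numpy as np

# Means from the question: one 3-dimensional mean per row.
means = np.array([[1, 5, 2], [6, 2, 7], [1, 8, 2]])

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
n_draws = 10                    # hypothetical name for "sample 10 times"

# Identity covariance: independent unit-variance noise per coordinate.
# Shape is (n_draws, n_means, dim); `means` broadcasts over the first axis.
noise = rng.standard_normal(size=(n_draws, *means.shape))
samples = means + noise

print(samples.shape)  # (10, 3, 3)
```

Here `samples[i, j]` is the i-th draw from the distribution whose mean is `means[j]`. For a shared covariance other than the identity, the same noise could be multiplied by the transposed Cholesky factor first, for example `noise @ np.linalg.cholesky(cov).T`, before adding the means.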

Generate random data based on existing data

好久不见. posted on 2021-02-16 14:52:16
Question: Is there a way in Python to generate random data based on the distribution of already existing data? Here are the statistical parameters of my dataset:
                 Data
    count  209.000000
    mean     1.280144
    std      0.374602
    min      0.880000
    25%      1.060000
    50%      1.150000
    75%      1.400000
    max      4.140000
As it is not a normal distribution, it is not possible to do it with np.random.normal. Any ideas? Thank you.
Edit: Performing KDE:
    from sklearn.neighbors import KernelDensity
    # Gaussian KDE
    kde = KernelDensity(kernel='gaussian',
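Drawing from a Gaussian KDE amounts to picking an observed value at random and adding Gaussian noise whose standard deviation is the kernel bandwidth. A minimal NumPy sketch of that idea follows. The array `data` is a synthetic stand-in, since the 209 original observations are not shown, and both the helper name `sample_from_kde` and the Silverman rule-of-thumb bandwidth are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the real dataset: skewed, positive values (hypothetical).
data = 0.88 + rng.lognormal(mean=-1.2, sigma=0.7, size=209)

def sample_from_kde(data, n_samples, rng, bandwidth=None):
    """Draw samples from a Gaussian KDE fitted to 1-D `data`."""
    data = np.asarray(data, dtype=float)
    if bandwidth is None:
        # Silverman's rule of thumb for a Gaussian kernel (an assumption).
        q75, q25 = np.percentile(data, [75, 25])
        spread = min(data.std(ddof=1), (q75 - q25) / 1.34)
        bandwidth = 0.9 * spread * data.size ** (-1 / 5)
    # Pick kernel centres with replacement, then add kernel noise.
    centres = rng.choice(data, size=n_samples, replace=True)
    return centres + rng.normal(scale=bandwidth, size=n_samples)

new_data = sample_from_kde(data, n_samples=1000, rng=rng)
print(new_data.mean(), new_data.std())
```

Because the kernel adds symmetric noise, some generated values can fall slightly below the observed minimum; if the data has a hard lower bound, those draws would need to be rejected or reflected.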

PyMC3 passing stochastic covariance matrix to pm.MvNormal()

落爺英雄遲暮 posted on 2021-02-10 16:47:45
Question: I've tried to fit a simple 2D Gaussian model to observed data by using PyMC3.
    import numpy as np
    import pymc3 as pm
    n = 10000
    np.random.seed(0)
    X = np.random.multivariate_normal([0,0], [[1,0],[0,1]], n)
    with pm.Model() as model:
        # PRIORS
        mu = [pm.Uniform('mux', lower=-1, upper=1), pm.Uniform('muy', lower=-1, upper=1)]
        cov = np.array([[pm.Uniform('a11', lower=0.1, upper=2), 0], [0, pm.Uniform('a22', lower=0.1, upper=2)]])
        # LIKELIHOOD
        likelihood = pm.MvNormal('likelihood', mu=mu, cov=cov,

How to compute p-values from z-scores in R when the Z score is large (pvalue much below zero)?

丶灬走出姿态 posted on 2021-02-07 08:17:51
Question: In genetics very small p-values are common (for example 10^-400), and I am looking for a way to get very small two-tailed p-values in R when the z-score is large, for example:
    z = 40
    pvalue = 2 * pnorm(abs(z), lower.tail = FALSE)
This gives me zero instead of a very small value, which would be highly significant.
Answer 1: The inability to handle p-values less than about 10^(-308) (.Machine$double.xmin) is not really R's fault, but is rather a generic limitation of any computational system that uses double
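Since the limitation described in the answer applies to any double-precision environment, a common workaround is to stay on the log scale and report log10(p) instead of p. The sketch below is in Python rather than R and uses only the standard library. The helper name `log10_two_tailed_p` is hypothetical, the asymptotic tail expansion it uses is only accurate for large |z|, and the switch-over point of 10 is an arbitrary choice for this illustration.

```python
import math

def log10_two_tailed_p(z):
    """Return log10 of the two-tailed normal p-value for a z-score."""
    z = abs(z)
    if z < 10:
        # Still representable as a double: p = 2 * Q(z) = erfc(z / sqrt(2)).
        return math.log10(math.erfc(z / math.sqrt(2)))
    # Asymptotic expansion of the upper tail Q(z) for large z:
    # Q(z) ~ phi(z) / z * (1 - 1/z^2 + 3/z^4 - 15/z^6)
    log_phi = -0.5 * z * z - 0.5 * math.log(2 * math.pi)
    series = 1 - 1 / z**2 + 3 / z**4 - 15 / z**6
    log_q = log_phi - math.log(z) + math.log(series)
    # Two tails: add log(2), then convert natural log to base 10.
    return (math.log(2) + log_q) / math.log(10)

print(log10_two_tailed_p(40))  # about -349.1
```

The result says the p-value for z = 40 is roughly 10^-349, a number that cannot be stored as a double but whose logarithm is easy to compute and compare.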

Best way to plot smooth normal distribution in ggplot

我的未来我决定 posted on 2021-02-05 06:55:46
Question: I would like to plot a nice, 'approaching the limit'-looking normal PDF in ggplot. I found that to get a very symmetric and clean-looking plot, I had to crank up the number of samples to a rather large number; one million creates a great visualization. However, this is pretty slow, especially if I hope to work with Shiny at some point.
    df <- data.frame(c(rnorm(1000000)))
    ggplot(df, aes(df[1])) + geom_density()
Surely there is a better way to display something close to the ideal normal
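A smooth curve does not need random samples at all: the density can be evaluated directly on a grid of x values, which is what ggplot2's stat_function(fun = dnorm) does. The Python sketch below shows only the grid evaluation, using the standard library. The helper `normal_pdf`, the range [-4, 4] and the 201 grid points are illustrative assumptions, and the resulting `xs` and `ys` can be handed to any line-plotting routine.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# 201 evenly spaced points on [-4, 4] instead of a million random draws.
n_points = 201
xs = [-4 + 8 * i / (n_points - 1) for i in range(n_points)]
ys = [normal_pdf(x) for x in xs]

print(max(ys))  # peak at x = 0, about 0.3989
```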

shapiro.test(..) cannot deal with more than 5000 data points

旧时模样 posted on 2021-02-04 14:58:49
Question: In R, the shapiro.test() function cannot run if the sample size exceeds 5000:
    shapiro.test(rnorm(10^4))
Why is that? Can I get around this limitation?
Answer 1: This is a safety limitation. Please read this: Perform a Shapiro-Wilk Normality Test. Other tests of normality, such as the Kolmogorov-Smirnov test, do not have this limitation:
    ks.test(x = rnorm(10^4), y = 'pnorm', alternative = 'two.sided')
Source: https://stackoverflow.com/questions/17125458/shapiro-test-cannot-deal-with-more-than-5000-data-points
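To make the Kolmogorov-Smirnov alternative concrete, here is a rough Python sketch of the one-sample KS statistic against the standard normal CDF, which is the largest gap between the empirical and the theoretical distribution functions. It computes only the statistic, not a p-value, and the function name `ks_statistic_normal` and the two test samples are assumptions made for the example.

```python
import math
import numpy as np

def ks_statistic_normal(x):
    """One-sample KS statistic of `x` against the standard normal CDF."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    # Standard normal CDF via the error function.
    cdf = np.array([0.5 * (1 + math.erf(v / math.sqrt(2))) for v in x])
    ranks = np.arange(1, n + 1)
    d_plus = np.max(ranks / n - cdf)          # ECDF above the normal CDF
    d_minus = np.max(cdf - (ranks - 1) / n)   # ECDF below the normal CDF
    return max(d_plus, d_minus)

rng = np.random.default_rng(0)
d_normal = ks_statistic_normal(rng.standard_normal(10_000))
d_uniform = ks_statistic_normal(rng.uniform(-3, 3, size=10_000))
print(d_normal, d_uniform)
```

The statistic stays small for the normal sample and is much larger for the uniform one, and nothing in the computation limits the sample size to 5000.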

How to convert percentage to z-score of normal distribution in C/C++?

强颜欢笑 posted on 2021-01-29 12:51:43
Question: The goal is to say: "These values lie within a band of 95 % of values around the mean in a normal distribution." Now, I am trying to convert a percentage to a z-score so that I can get the precise range of values. Something like <lower bound, upper bound> would be enough. So I need something like
    double z_score(double percentage) {
        // ...
    }
    // ...
    // according to https://en.wikipedia.org/wiki/68–95–99.7_rule
    z_score(68.27) == 1
    z_score(95.45) == 2
    z_score(99.73) == 3
I found an article
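The relationship behind the requested function is z = Φ⁻¹((1 + p) / 2), where p is the central fraction and Φ⁻¹ is the inverse standard normal CDF: a central band of mass p leaves (1 - p) / 2 in each tail. The sketch below shows this in Python rather than C/C++, using `statistics.NormalDist` from the standard library; the input validation is an addition for the example, not something the question asks for.

```python
from statistics import NormalDist

def z_score(percentage):
    """z such that `percentage` % of a normal distribution lies within
    mean - z*sigma and mean + z*sigma."""
    if not 0 < percentage < 100:
        raise ValueError("percentage must be strictly between 0 and 100")
    central = percentage / 100.0
    # A central band of mass `central` leaves (1 - central) / 2 in each tail.
    return NormalDist().inv_cdf(0.5 + central / 2)

for pct in (68.27, 95.45, 99.73):
    print(pct, round(z_score(pct), 2))
```

The bounds for a given mean and standard deviation then follow as mean - z*sigma and mean + z*sigma. A C or C++ version would need its own inverse normal CDF routine in place of `inv_cdf`.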

How can I generate data which will show inverted bell curve for normal distribution

瘦欲@ posted on 2021-01-27 11:22:41
Question: I have generated random data that follows a normal distribution using the code below:
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns
    rng = np.random.default_rng()
    number_of_rows = 10000
    mu = 0
    sigma = 1
    data = rng.normal(loc=mu, scale=sigma, size=number_of_rows)
    dist_plot_data = sns.distplot(data, hist=False)
    plt.show()
The above code generates the below distribution plot as expected: If I want to create a distribution plot that is exactly an inverse curve like below
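The question is cut off before the target plot, so the following is only one possible reading: data whose density is an upside-down bell on a bounded interval, lowest at the mean and rising towards the edges. A rough rejection-sampling sketch under that assumption is shown here; the function name `sample_inverted_bell`, the interval of three standard deviations either side, and the acceptance rule 1 - exp(-z^2 / 2) are all choices made for this illustration.

```python
import numpy as np

def sample_inverted_bell(n, rng, mu=0.0, sigma=1.0, half_width=3.0):
    """Rejection-sample n values on [mu - half_width*sigma, mu + half_width*sigma]
    with density proportional to 1 - exp(-z^2 / 2), an upside-down bell."""
    out = np.empty(0)
    lo, hi = mu - half_width * sigma, mu + half_width * sigma
    while out.size < n:
        # Propose uniformly over the interval.
        x = rng.uniform(lo, hi, size=2 * n)
        z = (x - mu) / sigma
        # Acceptance probability is 0 at the mean and rises towards the edges.
        accept = rng.uniform(size=x.size) < 1 - np.exp(-0.5 * z**2)
        out = np.concatenate([out, x[accept]])
    return out[:n]

rng = np.random.default_rng(0)
data = sample_inverted_bell(10_000, rng)
counts, _ = np.histogram(data, bins=6, range=(-3, 3))
print(counts)  # fewest samples in the two central bins
```

Passing `data` to the same density plot as in the question should then show a dip at the mean instead of a peak, with kernel smoothing rounding off the hard edges at the ends of the interval.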