normal-distribution | 易学教程

Python: Sample from multivariate normal with N means and same covariance matrix

Question: Suppose I want to sample 10 times from multiple normal distributions that share the same covariance matrix (the identity) but have different means, which are stored as the rows of the following matrix:

    means = np.array([[1, 5, 2], [6, 2, 7], [1, 8, 2]])

How can I do that as efficiently as possible, i.e. without loops? I tried:

    scipy.stats.multivariate_normal(means, np.eye(2)).rvs(10)

and

    np.random.multivariate_normal(means, np.eye(2))

but both throw an error saying the mean should be 1D. Slow
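One loop-free approach, sketched here under assumptions rather than taken from the original thread, is to draw standard normal noise for all means at once and shift it by broadcasting. The helper name `sample_shared_cov` and the seed are illustrative. Note that the means are 3-dimensional, so the identity covariance has to be `np.eye(3)`, not `np.eye(2)`.

```python
import numpy as np

# Means from the question: one 3-dimensional mean per row.
means = np.array([[1, 5, 2], [6, 2, 7], [1, 8, 2]])
n_samples = 10  # draws per mean, as in the question


def sample_shared_cov(means, cov, n_samples, rng):
    """Draw n_samples from N(mean_i, cov) for every row mean_i, without a Python loop."""
    n_means, dim = means.shape
    # Standard normal noise: one block of n_samples draws per mean.
    z = rng.standard_normal((n_means, n_samples, dim))
    # If cov = L @ L.T, then z @ L.T has covariance cov.
    chol = np.linalg.cholesky(cov)
    # Broadcasting adds each mean to its own block of draws.
    return z @ chol.T + means[:, None, :]


rng = np.random.default_rng(0)  # seed chosen only for reproducibility
samples = sample_shared_cov(means, np.eye(3), n_samples, rng)
print(samples.shape)  # (3, 10, 3): (mean index, draw, dimension)
```

With an identity covariance the Cholesky factor is itself the identity, so the whole computation reduces to `rng.standard_normal((3, 10, 3)) + means[:, None, :]`; the factor is kept only so the sketch also covers other shared covariance matrices.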

Generate random data based on existing data

Question: Is there a way in Python to generate random data based on the distribution of already existing data? Here are the summary statistics of my dataset:

                 Data
    count  209.000000
    mean     1.280144
    std      0.374602
    min      0.880000
    25%      1.060000
    50%      1.150000
    75%      1.400000
    max      4.140000

Since the data are not normally distributed, np.random.normal will not work. Any ideas? Thank you.

Edit: Performing KDE:

    from sklearn.neighbors import KernelDensity
    # Gaussian KDE
    kde = KernelDensity(kernel='gaussian',
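Sampling from a Gaussian KDE is equivalent to a smoothed bootstrap: pick an observed value at random and add Gaussian noise with the kernel bandwidth. The sketch below uses only the standard library. The `observed` list is hypothetical stand-in data (the original dataset is not shown), `sample_like` is an illustrative name, and the bandwidth uses the common normal-reference rule of thumb, which is an assumption and not a setting from the question.

```python
import random
import statistics


def sample_like(data, n, seed=None):
    """Draw n values from a Gaussian KDE fitted to data (smoothed bootstrap)."""
    rng = random.Random(seed)
    # Normal-reference rule-of-thumb bandwidth: 1.06 * sd * n^(-1/5).
    bandwidth = 1.06 * statistics.stdev(data) * len(data) ** (-1 / 5)
    # Resample an observed point, then jitter it with the kernel.
    return [rng.choice(data) + rng.gauss(0.0, bandwidth) for _ in range(n)]


# Hypothetical right-skewed data standing in for the real dataset.
observed = [0.88, 0.95, 1.02, 1.06, 1.10, 1.15, 1.21, 1.30, 1.40, 1.75, 2.60, 4.14]
synthetic = sample_like(observed, 1000, seed=0)
print(statistics.mean(observed), statistics.mean(synthetic))
```

Because the kernel is unbounded, some synthetic values can fall slightly outside the observed minimum and maximum; clipping or a smaller bandwidth may be needed if that matters.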

PyMC3 passing stochastic covariance matrix to pm.MvNormal()

Question: I've tried to fit a simple 2D Gaussian model to observed data using PyMC3.

    import numpy as np
    import pymc3 as pm

    n = 10000
    np.random.seed(0)
    X = np.random.multivariate_normal([0, 0], [[1, 0], [0, 1]], n)

    with pm.Model() as model:
        # PRIORS
        mu = [pm.Uniform('mux', lower=-1, upper=1),
              pm.Uniform('muy', lower=-1, upper=1)]
        cov = np.array([[pm.Uniform('a11', lower=0.1, upper=2), 0],
                        [0, pm.Uniform('a22', lower=0.1, upper=2)]])
        # LIKELIHOOD
        likelihood = pm.MvNormal('likelihood', mu=mu, cov=cov,

How to compute p-values from z-scores in R when the Z score is large (pvalue much below zero)?

Question: In genetics, very small p-values are common (for example 10^-400), and I am looking for a way to get very small two-tailed p-values in R when the z-score is large, for example:

    z = 40
    pvalue = 2 * pnorm(abs(z), lower.tail = F)

This gives me zero instead of a very small, highly significant value.

Answer 1: The inability to handle p-values less than about 10^(-308) (.Machine$double.xmin) is not really R's fault, but is rather a generic limitation of any computational system that uses double
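The usual way around the double-precision floor is to work with the logarithm of the p-value. The following is a Python sketch of that idea, not the R code from the thread: `log_two_sided_p` is an illustrative name, the cutoff of 30 is an assumption chosen so that `math.erfc` has not yet underflowed, and beyond it the code uses the standard asymptotic expansion of the normal upper tail.

```python
import math


def log_two_sided_p(z):
    """Natural log of the two-sided normal p-value 2 * P(Z > |z|)."""
    z = abs(z)
    if z < 30.0:
        # The tail probability is still representable, so compute it directly.
        log_tail = math.log(0.5 * math.erfc(z / math.sqrt(2.0)))
    else:
        # Asymptotic expansion for large z:
        # P(Z > z) ~ phi(z) / z * (1 - 1/z^2 + 3/z^4 - 15/z^6)
        inv2 = 1.0 / (z * z)
        series = 1.0 - inv2 + 3.0 * inv2**2 - 15.0 * inv2**3
        log_tail = (-0.5 * z * z - math.log(z)
                    - 0.5 * math.log(2.0 * math.pi) + math.log(series))
    return math.log(2.0) + log_tail


log_p = log_two_sided_p(40)
log10_p = log_p / math.log(10.0)
exponent = math.floor(log10_p)
mantissa = 10.0 ** (log10_p - exponent)
print(f"p is about {mantissa:.2f}e{exponent}")
```

Reporting the mantissa and base-10 exponent separately keeps the value readable even though the p-value itself cannot be stored as a double.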

Best way to plot smooth normal distribution in ggplot

Question: I would like to plot a nice, 'approaching the limit'-looking normal pdf in ggplot. I found that to get a very symmetric and clean-looking plot, I had to crank up the number of samples to a rather large number; one million creates a great visualization. However, this is pretty slow, especially if I hope to work with Shiny at some point.

    df <- data.frame(c(rnorm(1000000)))
    ggplot(df, aes(df[1])) + geom_density()

Surely there is a better way to display something close to the ideal normal
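A smooth curve does not need any random samples: the density can be evaluated directly on a grid and the resulting points drawn as a line. The sketch below shows the idea in Python with the standard library only (it is not the ggplot code from the thread); the grid range of -4 to 4 and the 201 points are arbitrary assumptions.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# A few hundred evenly spaced points already give a visually smooth curve.
n_points = 201
xs = [-4.0 + 8.0 * i / (n_points - 1) for i in range(n_points)]
ys = [normal_pdf(x) for x in xs]

# xs and ys can now be handed to any line-plotting routine.
print(len(xs), max(ys))
```

Evaluating 201 points is effectively instantaneous, whereas a kernel density estimate over a million draws has to both generate and smooth the sample.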

shapiro.test(..) cannot deal with more than 5000 data points

Question: In R, the shapiro.test() function cannot run if the sample size exceeds 5000.

    shapiro.test(rnorm(10^4))

Why is that? Can I get around this limitation?

Answer 1: This is a safety limitation. Please read this: Perform a Shapiro-Wilk Normality Test. Other tests of normality, such as the Kolmogorov-Smirnov test, do not have this limitation:

    ks.test(x=rnorm(10^4), y='pnorm', alternative='two.sided')

Source: https://stackoverflow.com/questions/17125458/shapiro-test-cannot-deal-with-more-than-5000-data-points

How to convert percentage to z-score of normal distribution in C/C++?

Question: The goal is to say: "These values lie within a band of 95% of values around the mean in a normal distribution." I am trying to convert a percentage to a z-score so that I can get the precise range of values. Something like <lower bound, upper bound> would be enough. So I need something like

    double z_score(double percentage) {
        // ...
    }

    // ...

    // according to https://en.wikipedia.org/wiki/68–95–99.7_rule
    z_score(68.27) == 1
    z_score(95.45) == 2
    z_score(99.73) == 3

I found an article
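The conversion is an inverse-CDF lookup: a central band covering a fraction c of the distribution leaves (1 - c) / 2 in each tail, so the upper bound is the quantile at 0.5 + c / 2. The sketch below illustrates this in Python with `statistics.NormalDist` (Python 3.8+), not in C/C++ as the question asks; a C++ version would need its own inverse normal CDF routine.

```python
from statistics import NormalDist


def z_score(percentage):
    """z such that `percentage` % of a normal distribution lies within +/- z of the mean."""
    if not 0.0 < percentage < 100.0:
        raise ValueError("percentage must be strictly between 0 and 100")
    central = percentage / 100.0
    # Half of the remaining mass sits in each tail, so look up the upper quantile.
    return NormalDist().inv_cdf(0.5 + central / 2.0)


for pct in (68.27, 95.45, 99.73, 95.0):
    print(pct, round(z_score(pct), 3))
```

For a distribution with mean mu and standard deviation sigma, the band is then mu - z * sigma to mu + z * sigma.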

How can I generate data which will show inverted bell curve for normal distribution

Question: I have generated random data that follows a normal distribution using the code below:

    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns

    rng = np.random.default_rng()
    number_of_rows = 10000
    mu = 0
    sigma = 1
    data = rng.normal(loc=mu, scale=sigma, size=number_of_rows)

    dist_plot_data = sns.distplot(data, hist=False)
    plt.show()

The code above generates the distribution plot below, as expected: If I want to create a distribution plot that is exactly an inverse curve like below
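One possible reading of "inverted bell curve" is a density that is lowest at the mean and rises toward the edges of a bounded interval. Under that assumption, rejection sampling gives a short sketch: propose points uniformly and accept each with probability 1 - exp(-x^2 / 2). The function name, the interval of -3 to 3, and the seed are all illustrative choices, not taken from the original thread.

```python
import math
import random


def sample_inverted_bell(n, half_width=3.0, seed=None):
    """Rejection-sample n points with density proportional to 1 - exp(-x^2 / 2)."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n:
        # Uniform proposal on the bounded interval.
        x = rng.uniform(-half_width, half_width)
        # Acceptance probability is 0 at the centre and grows toward the edges.
        if rng.random() < 1.0 - math.exp(-0.5 * x * x):
            samples.append(x)
    return samples


data = sample_inverted_bell(10000, seed=0)
centre = sum(1 for x in data if abs(x) < 1.0)
edges = sum(1 for x in data if abs(x) > 2.0)
print(centre, edges)  # far fewer points near the centre than near the edges
```

The interval must be bounded because an inverted bell over the whole real line cannot be normalised into a probability density.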