Question
I measured the fluorescence intensity of thousands of particles and plotted a histogram, which shows two adjacent Gaussian curves. How can I use Python or one of its packages to separate them into two Gaussian curves and make two new plots?
Thank you.
Answer 1:
Basically, you need to infer the parameters of your Gaussian mixture. I will generate a similar dataset for illustration.
Generating mixtures with known parameters
from itertools import starmap
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib import mlab
sns.set(color_codes=True)
# inline plots in jupyter notebook
%matplotlib inline
# generate synthetic data from a mixture of two Gaussians with equal weights
# the solution below readily generalises to more components
nsamples = 10000
means = [30, 120]
sds = [10, 50]
weights = [0.5, 0.5]
draws = np.random.multinomial(nsamples, weights)
samples = np.concatenate(
list(starmap(np.random.normal, zip(means, sds, draws)))
)
Plot the distribution
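# sns.distplot is deprecated in recent seaborn releases; histplot is its replacement
sns.histplot(samples, kde=True, stat="density")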
Infer parameters
from sklearn.mixture import GaussianMixture
mixture = GaussianMixture(n_components=2).fit(samples.reshape(-1, 1))
means_hat = mixture.means_.flatten()
weights_hat = mixture.weights_.flatten()
sds_hat = np.sqrt(mixture.covariances_).flatten()
print(mixture.converged_)
print(means_hat)
print(sds_hat)
print(weights_hat)
We get:
True
[ 122.57524745 29.97741112]
[ 48.18013893 10.44561398]
[ 0.48559771 0.51440229]
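You can tweak GaussianMixture's hyper-parameters to improve the fit, but the estimates are already close to the true parameters. Note that the order of the fitted components is arbitrary: here the first component is the wide one with mean ≈ 122. Now we can plot each component (I'm only plotting the first one):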
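# mlab.normpdf was removed in matplotlib 3.1; scipy.stats.norm.pdf computes the same density
from scipy.stats import norm
mu1_h, sd1_h = means_hat[0], sds_hat[0]
x_axis = np.linspace(mu1_h-3*sd1_h, mu1_h+3*sd1_h, 1000)
plt.plot(x_axis, norm.pdf(x_axis, mu1_h, sd1_h))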
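To get the two separate plots the question asks for, one option is to draw each fitted component on its own axis, scaled by its mixture weight so the heights match each component's share of the histogram. This is only a sketch: the synthetic `samples` stand in for the measured intensities, and the fixed seeds and the hard assignment via `predict` are my own choices for illustration, not part of the original answer.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
from sklearn.mixture import GaussianMixture

# synthetic stand-in for the measured intensities (same parameters as above)
rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(30, 10, 5000), rng.normal(120, 50, 5000)])

X = samples.reshape(-1, 1)
mixture = GaussianMixture(n_components=2, random_state=0).fit(X)
means_hat = mixture.means_.flatten()
sds_hat = np.sqrt(mixture.covariances_).flatten()
weights_hat = mixture.weights_.flatten()

# hard-assign every observation to its most probable component
labels = mixture.predict(X)
counts = np.bincount(labels, minlength=2)

x_axis = np.linspace(samples.min(), samples.max(), 1000)
fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharex=True)
for k, ax in enumerate(axes):
    # weighted density of component k, i.e. its contribution to the mixture density
    ax.plot(x_axis, weights_hat[k] * norm.pdf(x_axis, means_hat[k], sds_hat[k]))
    ax.set_title(f"component {k}: mean={means_hat[k]:.1f}, sd={sds_hat[k]:.1f}, n={counts[k]}")
    ax.set_xlabel("intensity")
fig.tight_layout()
```

Where the two curves overlap, the hard assignment is necessarily uncertain; `mixture.predict_proba(X)` returns the per-component membership probabilities if you need them instead.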
P.S.
As a side note: it looks like you are dealing with constrained data (intensities cannot be negative), and your observations sit quite close to the left bound at zero. Gaussians may approximate your data well enough, but tread carefully, because a Gaussian is defined on the whole real line and will put some probability mass below zero.
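One common way to respect a lower bound at zero is to fit the mixture on log-transformed intensities, which amounts to modelling the data as a mixture of log-normals. The sketch below assumes strictly positive data and uses hypothetical log-normal parameters purely for illustration; whether this suits your measurements depends on how they are actually distributed.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# hypothetical strictly positive intensities: a mixture of two log-normals
rng = np.random.default_rng(0)
intensities = np.concatenate([
    rng.lognormal(mean=3.0, sigma=0.3, size=5000),
    rng.lognormal(mean=5.0, sigma=0.4, size=5000),
])

# fit the mixture on the log scale, where the data is unbounded
log_x = np.log(intensities).reshape(-1, 1)
log_mixture = GaussianMixture(n_components=2, random_state=0).fit(log_x)

# sort components by mean so the order is reproducible
order = np.argsort(log_mixture.means_.ravel())
log_means = log_mixture.means_.ravel()[order]
log_sds = np.sqrt(log_mixture.covariances_.ravel())[order]

# exp of a log-scale mean is the component's median on the original scale
medians = np.exp(log_means)
print(log_means, log_sds, medians)
```

The fitted parameters describe the components on the log scale, so any density you plot from them has to be drawn against log intensity, or transformed back with a log-normal density.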
Source: https://stackoverflow.com/questions/51318981/how-to-use-python-to-separate-two-gaussian-curves