Python: Generate random values from empirical distribution

無奈伤痛 2020-12-30 13:14

In Java, I usually rely on the org.apache.commons.math3.random.EmpiricalDistribution class to do the following:

  • Derive a probability distribution from observed
1 answer
  • 2020-12-30 14:05
    import numpy as np
    import scipy.stats
    import matplotlib.pyplot as plt
    
    # This represents the original "empirical" sample -- I fake it by
    # sampling from a normal distribution
    orig_sample_data = np.random.normal(size=10000)
    
    # Generate a KDE from the empirical sample
    sample_pdf = scipy.stats.gaussian_kde(orig_sample_data)
    
    # Sample new datapoints from the KDE
    new_sample_data = sample_pdf.resample(10000).T[:,0]
    
    # Histogram of initial empirical sample
    cnts, bins, p = plt.hist(orig_sample_data, label='original sample', bins=100,
                             histtype='step', linewidth=1.5, density=True)
    
    # Histogram of datapoints sampled from KDE
    plt.hist(new_sample_data, label='sample from KDE', bins=bins,
             histtype='step', linewidth=1.5, density=True)
    
    # Visualize the kde itself
    y_kde = sample_pdf(bins)
    plt.plot(bins, y_kde, label='KDE')
    plt.legend()
    plt.show(block=False)
    

    new_sample_data should be drawn from roughly the same distribution as the original data (to the degree that the KDE is a good approximation to the original distribution).
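
    If you want a number rather than a plot to judge how close the two samples are, one option is a two-sample Kolmogorov-Smirnov test. The snippet below is only a sketch: the fixed seeds and the name ks_result are my own additions for reproducibility, and it assumes a SciPy version whose resample() accepts a seed argument. A small KS statistic means the two empirical CDFs are close; it does not prove the KDE is a good model of the underlying distribution.

```python
import numpy as np
import scipy.stats

# Stand-in for the original "empirical" sample (seeded so the run is repeatable)
rng = np.random.default_rng(0)
orig_sample_data = rng.normal(size=10000)

# Fit the KDE and draw a new sample from it; resample() returns an array
# of shape (n_dimensions, size), so [0] picks out the single 1-D variable
sample_pdf = scipy.stats.gaussian_kde(orig_sample_data)
new_sample_data = sample_pdf.resample(10000, seed=0)[0]

# Two-sample KS test: the statistic is the largest gap between the two
# empirical CDFs (0 means identical, 1 means completely separated)
ks_result = scipy.stats.ks_2samp(orig_sample_data, new_sample_data)
print("KS statistic:", ks_result.statistic)
print("p-value:", ks_result.pvalue)
```

    Keep in mind that a Gaussian KDE smooths the data, so the resampled values have a slightly larger spread than the original sample. With a large sample the effect is small, but it can matter for small or sharply bounded data sets.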
