Matplotlib.pyplot.hist() very slow

孤独总比滥情好 2021-01-11 11:37

I'm plotting about 10,000 items in an array. They have around 1,000 unique values.

The plotting has been running for half an hour now. I made sure the rest of the code wo

8 answers
  • 2021-01-11 12:06

    If you are working with pandas, make sure the data you pass to plt.hist() is a 1-D Series rather than a DataFrame. This helped me out.
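A minimal sketch of that check, assuming a hypothetical DataFrame `df` with a single column named `value` (neither name comes from the question). Selecting the column yields a 1-D Series, which is what gets passed to `plt.hist()`:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data: 10,000 integer samples in one DataFrame column.
rng = np.random.default_rng(0)
df = pd.DataFrame({"value": rng.integers(0, 1000, size=10_000)})

# Selecting a column returns a 1-D Series; the DataFrame itself is 2-D.
series = df["value"]
print(df.ndim, series.ndim)

# Pass the Series, not the DataFrame, to plt.hist().
counts, edges, _ = plt.hist(series, bins=50)
```

If the DataFrame has exactly one column, `df.squeeze("columns")` is another way to obtain the Series.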

  • 2021-01-11 12:12

    To plot histograms using matplotlib quickly you need to pass the histtype='step' argument to pyplot.hist. For example:

    import matplotlib.pyplot as plt
    import numpy as np
    plt.hist(np.random.exponential(size=1000000), bins=10000)
    plt.show()
    

    takes ~15 seconds to draw and roughly 5-10 seconds to update when you pan or zoom.

    In contrast, plotting with histtype='step':

    plt.hist(np.random.exponential(size=1000000),bins=10000,histtype='step')
    plt.show()
    

    plots almost immediately and can be panned and zoomed with no delay.
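One likely reason for the difference is the number of artists matplotlib has to draw. The sketch below is an illustration, not a benchmark: it uses a smaller sample and only 200 bins, and counts the patches returned by each call. The variable names are made up for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
data = rng.exponential(size=100_000)
n_bins = 200

fig, (ax_bar, ax_step) = plt.subplots(1, 2)

# Default histtype='bar': the returned container holds one Rectangle per bin.
_, _, bar_patches = ax_bar.hist(data, bins=n_bins)

# histtype='step': the whole outline is returned as a single Polygon.
_, _, step_patches = ax_step.hist(data, bins=n_bins, histtype="step")

n_bar_artists = len(bar_patches)
n_step_artists = len(step_patches)
print(n_bar_artists, n_step_artists)
```

With 10,000 bins the default style would create 10,000 rectangles, while the step style still creates one polygon.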

  • 2021-01-11 12:14

    Plotting the histogram is instant once you flatten the NumPy array. Try the demo code below:

    import matplotlib.pyplot as plt
    import numpy as np
    
    array2d = np.random.random_sample((512,512))*100
    # Flatten the 2-D array to 1-D before plotting
    plt.hist(array2d.flatten())
    plt.hist(array2d.flatten(), bins=1000)
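Why flattening helps can be sketched as follows. When `hist` receives a 2-D array, it treats each column as a separate dataset, so a 512x512 array becomes 512 histograms drawn side by side. The example uses a small stand-in array (64x8, an assumption made to keep it fast) and compares the shape of the returned counts:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
array2d = rng.random((64, 8)) * 100  # small stand-in for the 512x512 array

fig, (ax_2d, ax_flat) = plt.subplots(1, 2)

# 2-D input: each of the 8 columns is histogrammed as its own dataset.
counts_2d, _, _ = ax_2d.hist(array2d, bins=10)

# Flattened input: a single dataset with a single set of bars.
counts_flat, _, _ = ax_flat.hist(array2d.flatten(), bins=10)

shape_2d = np.asarray(counts_2d).shape
shape_flat = np.asarray(counts_flat).shape
print(shape_2d, shape_flat)
```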
    
  • 2021-01-11 12:14

    I was facing the same problem with the pandas .hist() method. For me the solution was:

    pd.to_numeric(df['your_data']).hist()
    

    This worked instantly.
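A small sketch of what that conversion does, assuming the column holds numbers stored as strings (the sample values are invented, and the object dtype is set explicitly so the example does not depend on the pandas version):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import numpy as np
import pandas as pd

# Hypothetical column of numeric strings with an explicit object dtype.
df = pd.DataFrame(
    {"your_data": pd.Series(["1.5", "2.0", "3.25", "2.0"], dtype=object)}
)
dtype_before = df["your_data"].dtype

# pd.to_numeric parses the strings into a numeric dtype.
numeric = pd.to_numeric(df["your_data"])
dtype_after = numeric.dtype
print(dtype_before, "->", dtype_after)

# The histogram is then drawn from real numbers.
ax = numeric.hist(bins=3)
```

If some entries cannot be parsed, `pd.to_numeric(..., errors="coerce")` turns them into NaN instead of raising.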

  • 2021-01-11 12:17

    Importing seaborn somewhere in the code may cause pyplot.hist to take a really long time.

    If the problem is seaborn, it can be solved by resetting the matplotlib settings:

    import seaborn as sns
    sns.reset_orig()
    
  • 2021-01-11 12:22

    For me, the problem was that the dtype of the pandas Series, say S, was 'object' rather than 'float64'. After converting it with S = np.float64(S), plt.hist(S) is very quick!
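To check for this case before plotting, something like the following could be used. The helper `to_float_series` is hypothetical, and it uses `Series.astype` rather than the `np.float64(S)` call from the answer, so that the result stays a Series:

```python
import numpy as np
import pandas as pd

def to_float_series(s: pd.Series) -> pd.Series:
    """Convert an object-dtype Series to float64; leave other dtypes unchanged."""
    if s.dtype == object:
        return s.astype(np.float64)
    return s

# Hypothetical Series whose numbers ended up stored with dtype object.
S = pd.Series([1.0, 2.5, 4.0], dtype=object)
S_float = to_float_series(S)
print(S.dtype, "->", S_float.dtype)
```

Printing `S.dtype` first is a quick way to tell whether this is the cause of the slowdown.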
