How to plot a mean line on a distplot between 0 and the y value of the mean?

别来无恙 提交于 2020-12-06 04:16:26

问题


I have a distplot and I would like to plot a mean line that goes from 0 to the y value of the mean frequency. I want to do this, but have the line stop at when the distplot does. Why isn't there a simple parameter that does this? It would be very useful.

I have some code that gets me almost there:

plt.plot([x.mean(),x.mean()], [0, *what here?*])

This code plots a line just as I'd like except for my desired y-value. What would the correct math be to get the y max to stop at the frequency of the mean in the distplot? An example of one of my distplots is below using 0.6 as the y-max. It would be awesome if there was some math to make it stop at the y-value of the mean. I have tried dividing the mean by the count etc.


回答1:


ax.lines[0] gets the curve of the kde, of which you can extract the x and y data. np.interp then can find the height of the curve for a given x-value:

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

x = np.random.normal(np.tile(np.random.uniform(10, 30, 5), 50), 3)
ax = sns.kdeplot(x, shade=True, color='crimson')
kdeline = ax.lines[0]
mean = x.mean()
height = np.interp(mean, kdeline.get_xdata(), kdeline.get_ydata())
ax.vlines(mean, 0, height, color='crimson', ls=':')
ax.set_ylim(ymin=0)
plt.show()

The same approach can be extended to show the mean together with the standard deviation, or the median and the quartiles:

import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

x = np.random.normal(np.tile(np.random.uniform(10, 30, 5), 50), 3)
fig, axes = plt.subplots(ncols=2, figsize=(12, 4))
for ax in axes:
    sns.kdeplot(x, shade=True, color='crimson', ax=ax)
    kdeline = ax.lines[0]
    xs = kdeline.get_xdata()
    ys = kdeline.get_ydata()
    if ax == axes[0]:
        middle = x.mean()
        sdev = x.std()
        left = middle - sdev
        right = middle + sdev
        ax.set_title('Showing mean and sdev')
    else:
        left, middle, right = np.percentile(x, [25, 50, 75])
        ax.set_title('Showing median and quartiles')
    ax.vlines(middle, 0, np.interp(middle, xs, ys), color='crimson', ls=':')
    ax.fill_between(xs, 0, ys, where=(left <= xs) & (xs <= right), interpolate=True, facecolor='crimson', alpha=0.2)
    ax.set_ylim(ymin=0)
plt.show()

PS: for the mode of the kde:

    mode_idx = np.argmax(ys)
    ax.vlines(xs[mode_idx], 0, ys[mode_idx], color='lime', ls='--')



回答2:


With plt.get_ylim() you can get the limits of the current plot: [bottom, top].
So, in your case, you can extract the actual limits and save them in ylim, then draw the line:

fig, ax = plt.subplots()

ylim = ax.get_ylim()
ax.plot([x.mean(),x.mean()], ax.get_ylim())
ax.set_ylim(ylim)

As ax.plot changes the ylims afterwards, you have to re-set them with ax.set_ylim as above.



来源:https://stackoverflow.com/questions/63307440/how-to-plot-a-mean-line-on-a-distplot-between-0-and-the-y-value-of-the-mean

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!