I have a plot like the following (using plt.boxplot()):
Now, what I want is to plot a number showing how often each of those outliers occurred (preferably to the top r
ax.boxplot returns a dictionary of all the elements in the boxplot. The key you need here from that dict is 'fliers'.
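To see what that dictionary contains, here is a minimal sketch. The random data and the seed are assumptions made purely for illustration; the keys typically include 'boxes', 'medians', 'whiskers', 'caps', 'fliers' and 'means', each mapping to a list of artists.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: two columns, so two boxes
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 2))

fig, ax = plt.subplots()
boxdict = ax.boxplot(data)

# Each key maps to a list of matplotlib artists
for key, artists in boxdict.items():
    print(key, len(artists))
```

The 'fliers' list has one entry per box, even when a box has no outliers.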
In boxdict['fliers'], there are the Line2D instances that are used to plot the fliers. We can grab their x and y locations using .get_xdata() and .get_ydata().
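As a small illustration of those two calls, the sketch below draws a single box from a hand-made list (an assumption, chosen so that one value lies far outside the whiskers) and reads the flier coordinates back.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical sample: 100 lies far above the rest, so it is drawn as a flier
values = [1, 2, 3, 4, 5, 100]

fig, ax = plt.subplots()
boxdict = ax.boxplot(values)

# One Line2D per box; this plot has a single box
flier_line = boxdict['fliers'][0]
x = flier_line.get_xdata()  # x position of each flier (the box position)
y = flier_line.get_ydata()  # data value of each flier
print(x, y)
```

All fliers of one box share the same x value, since they sit at that box's position on the x axis.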
You can find all the unique y locations using a set, and then find the number of fliers plotted at each location using the list method .count().
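The counting step can be sketched on its own, without any plotting. The list below is made-up flier data. collections.Counter is shown as an alternative to the set and .count() approach, not as what the example in this answer uses; it gives the same counts in a single pass.

```python
from collections import Counter

# Hypothetical y values of the fliers for one box
yfliers = [1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0]

# Approach described above: unique values via set, then list.count()
counts_via_set = {uf: yfliers.count(uf) for uf in set(yfliers)}

# Alternative: Counter tallies every value in one pass
counts_via_counter = Counter(yfliers)

print(counts_via_set)
print(dict(counts_via_counter))
```

Both approaches compare floats for exact equality, which works here because repeated fliers carry identical values.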
Then it's just a case of using matplotlib's ax.text to add a text label to the plot.
Consider the following example:
import matplotlib.pyplot as plt
import numpy as np

# Some fake data
data = np.zeros((10000, 2))
data[0:4, 0] = 1
data[4:6, 0] = 2
data[6:10, 0] = 3
data[0:9, 1] = 1
data[9:14, 1] = 2
data[14:20, 1] = 3

# create figure and axes
fig, ax = plt.subplots(1)

# plot boxplot, grab dict
boxdict = ax.boxplot(data)

# the fliers from the dictionary
fliers = boxdict['fliers']

# loop over boxes in x direction
for j in range(len(fliers)):

    # the y and x positions of the fliers
    yfliers = boxdict['fliers'][j].get_ydata()
    xfliers = boxdict['fliers'][j].get_xdata()

    # the unique locations of fliers in y
    ufliers = set(yfliers)

    # loop over unique fliers
    for i, uf in enumerate(ufliers):

        # print number of fliers
        ax.text(xfliers[i] + 0.03, uf + 0.03, list(yfliers).count(uf))

plt.show()
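If you need this on more than one plot, the same steps can be wrapped in a function. The following is only a sketch: annotate_flier_counts and its offset parameter are hypothetical names, not part of matplotlib, and np.unique with return_counts=True is swapped in for the set and .count() combination used above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np


def annotate_flier_counts(ax, boxdict, offset=0.03):
    """Hypothetical helper: label each distinct flier value with its count.

    Returns one {value: count} dict per box.
    """
    all_counts = []
    for line in boxdict['fliers']:
        x = line.get_xdata()
        y = line.get_ydata()
        # Distinct flier values and how often each occurs
        values, counts = np.unique(y, return_counts=True)
        for value, count in zip(values, counts):
            # All fliers of a box share one x position, so x[0] is enough
            ax.text(x[0] + offset, value + offset, str(count))
        all_counts.append({float(v): int(c) for v, c in zip(values, counts)})
    return all_counts


# Same fake data as in the example above
data = np.zeros((10000, 2))
data[0:4, 0] = 1
data[4:6, 0] = 2
data[6:10, 0] = 3
data[0:9, 1] = 1
data[9:14, 1] = 2
data[14:20, 1] = 3

fig, ax = plt.subplots()
boxdict = ax.boxplot(data)
flier_counts = annotate_flier_counts(ax, boxdict)
print(flier_counts)
```

The sketch selects a non-interactive backend so it runs anywhere; drop that line and call plt.show(), or use fig.savefig, to look at the labelled plot.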