I'm trying to plot a large number (many thousands) of circle objects, and I don't have much experience working with Python. I'm interested in specifying the position, radius, and color of each circle.
The key here is to use a Collection. In your case, you want to make a PatchCollection.
Matplotlib optimizes drawing many similar artists through using collections. It's considerably faster than drawing each one individually. Furthermore, the plot won't contain thousands of individual artists, only one collection. This speeds up many other miscellaneous operations that need to operate on each artist every time the plot is drawn.
scatter actually is much faster than your current approach, as it will add a collection instead of separate artists. However, it also draws markers with a size that isn't in data coordinates.
To get around that, you can use the same approach scatter does, but create the collection manually.
As an example:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.collections
num = 5000
sizes = 0.2 * np.random.random(num)
xy = 50 * np.random.random((num, 2))
# Note that the patches won't be added to the axes, instead a collection will
patches = [plt.Circle(center, size) for center, size in zip(xy, sizes)]
fig, ax = plt.subplots()
coll = matplotlib.collections.PatchCollection(patches, facecolors='black')
ax.add_collection(coll)
ax.margins(0.01)
plt.show()
This renders quite smoothly for me. Just to prove that the circles are in data coordinates, note what happens if we zoom in on a narrow rectangle (note: this assumes that the aspect of the plot is set to auto): the circles stretch along with the axes and appear elongated.
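The example above gives every circle the same color (facecolors='black'). Since the question also asks about color, here is a minimal sketch of one way to color each circle individually, assuming the color comes from one scalar value per circle mapped through a colormap. The `values` array, the `viridis` colormap and the `make_colored_circles` helper name are illustrative assumptions, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs anywhere; drop this for an interactive window
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import PatchCollection
from matplotlib.patches import Circle


def make_colored_circles(ax, xy, radii, values, cmap="viridis"):
    """Add circles sized in data coordinates, colored by one scalar per circle."""
    patches = [Circle(center, radius) for center, radius in zip(xy, radii)]
    coll = PatchCollection(patches, cmap=cmap)
    coll.set_array(np.asarray(values))  # scalars are mapped to colors through the colormap
    ax.add_collection(coll)
    ax.autoscale_view()  # make sure the view limits cover the new collection
    return coll


rng = np.random.default_rng(0)
num = 5000
xy = 50 * rng.random((num, 2))
radii = 0.2 * rng.random(num)
values = rng.random(num)  # hypothetical per-circle data to color by

fig, ax = plt.subplots()
coll = make_colored_circles(ax, xy, radii, values)
fig.colorbar(coll, ax=ax)
fig.canvas.draw()  # render once; use plt.show() or fig.savefig(...) to actually look at it
```

If you already have explicit colors rather than scalar values, passing a sequence of colors as facecolors when creating the collection is an alternative.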
If you're really focused on speed, you can use an EllipseCollection as @tcaswell suggested.
An EllipseCollection will only make one path, but will scale and translate it at draw time to be in the places/sizes you specify.
The downside is that while the size can be in data coordinates, the circle will always be a circle, even if the aspect ratio of the plot isn't 1. (i.e. the circles won't stretch as they do in the figure above).
The advantage is that it's fast.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.collections
num = 5000
sizes = 0.4 * np.random.random(num)
xy = 50 * np.random.random((num, 2))
fig, ax = plt.subplots()
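coll = matplotlib.collections.EllipseCollection(
    sizes, sizes, np.zeros_like(sizes),  # widths, heights, angles
    offsets=xy, units='x',
    # Matplotlib releases before 3.6 spell this keyword transOffset
    offset_transform=ax.transData,
    facecolors='black')  # any other Collection keywords go here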
ax.add_collection(coll)
ax.margins(0.01)
plt.show()
Notice the difference as we zoom in on a similar region to the second figure. The circles get bigger (the size is in data coordinates), but remain circles instead of becoming elongated. They're not an accurate representation of a circle in "data" space.
To give some idea of the time difference, here's the time to create and draw a figure with the same 5000 circles with each of the three methods:
In [5]: %timeit time_plotting(circles)
1 loops, best of 3: 3.84 s per loop
In [6]: %timeit time_plotting(patch_collection)
1 loops, best of 3: 1.37 s per loop
In [7]: %timeit time_plotting(ellipse_collection)
1 loops, best of 3: 228 ms per loop
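The time_plotting helper and the three functions passed to it are not shown in the answer. The following is a hypothetical reconstruction, assuming each function adds the same circles to an axes and that "create and draw" means building a figure and rendering it once off-screen. The function signatures, the Agg backend and the `benchmark` wrapper are my assumptions, and `offset_transform` assumes Matplotlib 3.6 or later.

```python
import timeit

import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen so no window is needed
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import EllipseCollection, PatchCollection
from matplotlib.patches import Circle


def circles(ax, xy, radii):
    """Baseline: one separate artist per circle."""
    for center, radius in zip(xy, radii):
        ax.add_patch(Circle(center, radius, facecolor="black"))


def patch_collection(ax, xy, radii):
    """All circles as patches inside a single PatchCollection."""
    patches = [Circle(center, radius) for center, radius in zip(xy, radii)]
    ax.add_collection(PatchCollection(patches, facecolors="black"))


def ellipse_collection(ax, xy, radii):
    """One shared path, scaled and translated at draw time."""
    diameters = 2 * radii  # EllipseCollection takes full widths and heights
    ax.add_collection(EllipseCollection(
        diameters, diameters, np.zeros_like(diameters),
        offsets=xy, units="x",
        offset_transform=ax.transData,  # named transOffset before Matplotlib 3.6
        facecolors="black"))


def time_plotting(add_circles, xy, radii):
    """Create a figure, add the circles with the given method, and draw it once."""
    fig, ax = plt.subplots()
    add_circles(ax, xy, radii)
    ax.autoscale_view()
    fig.canvas.draw()
    plt.close(fig)


def benchmark(num=5000, repeat=3, seed=0):
    """Return the best wall-clock time in seconds for each method."""
    rng = np.random.default_rng(seed)
    xy = 50 * rng.random((num, 2))
    radii = 0.2 * rng.random(num)
    results = {}
    for method in (circles, patch_collection, ellipse_collection):
        timer = timeit.Timer(lambda: time_plotting(method, xy, radii))
        results[method.__name__] = min(timer.repeat(repeat=repeat, number=1))
    return results
```

Calling benchmark() returns the best time per method. Absolute numbers depend on the machine, the backend and the Matplotlib version, so expect them to differ from the timings quoted above.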
scatter is probably better for you than plt.Circle, though it won't make anything run faster.
# mp is assumed to be an alias for matplotlib.pyplot
for i in range(4):
    mp.scatter(xvals[i], yvals[i], s=rvals[i])
If you can deal with the circles being the same size, then mp.plot(xvals[i], yvals[i], marker='o') will be more performant.
But this is probably a matplotlib limitation, rather than a language limitation. There are excellent JavaScript libraries for plotting thousands of data points efficiently (d3.js). Maybe someone here will know of one that you can call from Python.
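The scatter suggestion above loops over the points, but scatter also accepts whole arrays, so the loop can be dropped and the result is a single collection. A minimal sketch with made-up data, assuming `plt` is matplotlib.pyplot. Keep in mind that `s` is the marker area in points squared, not a radius in data units, so the markers keep their on-screen size when you zoom.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs anywhere; drop this for an interactive window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
num = 4000
xvals = rng.random(num)
yvals = rng.random(num)
areas = 100 * rng.random(num)  # marker areas in points**2 (illustrative values)
colors = rng.random(num)       # one scalar per point, mapped through the colormap

fig, ax = plt.subplots()
# One call draws every marker and adds a single collection to the axes
points = ax.scatter(xvals, yvals, s=areas, c=colors, cmap="viridis")
fig.canvas.draw()
```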
You would certainly want to move ...gca() outside of your loop. You can also use a list comprehension.
fig = plt.figure()
ax = plt.gcf().gca()
[ax.add_artist(plt.Circle((xvals[q], yvals[q]), rvals[q], color=[0, 0, 0]))
 for q in range(4)]  # xrange(4) on Python 2
Below are some tests to generate 4,000 circles using the various methods:
xvals = [0,.1,.2,.3] * 1000
yvals = [0,.1,.2,.3] * 1000
rvals = [0,.1,.1,.1] * 1000
%%timeit -n5 fig = plt.figure(); ax = plt.gcf().gca()
for q in range(4000):
    circle1 = plt.Circle((xvals[q], yvals[q]), rvals[q], color=[0,0,0])
    plt.gcf().gca().add_artist(circle1)
5 loops, best of 3: 792 ms per loop
%%timeit -n5 fig = plt.figure(); ax = plt.gcf().gca()
for q in xrange(4000):
    ax.add_artist(plt.Circle((xvals[q],yvals[q]),rvals[q],color=[0,0,0]))
5 loops, best of 3: 779 ms per loop
%%timeit -n5 fig = plt.figure(); ax = plt.gcf().gca()
[ax.add_artist(plt.Circle((xvals[q],yvals[q]),rvals[q],color=[0,0,0])) for q in xrange(4000)]
5 loops, best of 3: 730 ms per loop
Not sure what you are really trying to do, or what your issues or concerns are, but here is a totally different method of plotting circles: make an SVG file like this and call it circles.svg:
<?xml version="1.0" standalone="no"?>
<svg width="500" height="300" version="1.1" xmlns="http://www.w3.org/2000/svg">
<circle cx="100" cy="175" r="200" stroke="lime" fill="coral" stroke-width="28"/>
<circle cx="25" cy="75" r="80" stroke="red" fill="yellow" stroke-width="5"/>
<circle cx="400" cy="280" r="20" stroke="black" fill="blue" stroke-width="10"/>
</svg>
and pass it to ImageMagick to make it into a PNG file like this:
convert circles.svg result.png
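For thousands of circles you would not write the SVG by hand. Here is a minimal sketch of generating it from Python with only the standard library, assuming each circle is a (cx, cy, r, fill) tuple. The `circles_to_svg` name, the random test data and the fixed canvas size are illustrative choices of mine.

```python
import random


def circles_to_svg(circles, width=500, height=300):
    """Return an SVG document string for an iterable of (cx, cy, r, fill) tuples."""
    lines = [
        '<?xml version="1.0" standalone="no"?>',
        f'<svg width="{width}" height="{height}" version="1.1" '
        'xmlns="http://www.w3.org/2000/svg">',
    ]
    for cx, cy, r, fill in circles:
        lines.append(
            f'<circle cx="{cx:.2f}" cy="{cy:.2f}" r="{r:.2f}" fill="{fill}"/>'
        )
    lines.append("</svg>")
    return "\n".join(lines) + "\n"


random.seed(0)
circles = [
    (
        random.uniform(0, 500),   # x position in pixels
        random.uniform(0, 300),   # y position in pixels
        random.uniform(1, 5),     # radius in pixels
        random.choice(["coral", "yellow", "blue"]),
    )
    for _ in range(5000)
]
svg_text = circles_to_svg(circles)
```

Writing svg_text to circles.svg then lets the convert command above turn it into a PNG.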