Describe the bug
I've found that plotting a lot of markers can be very slow. The problem seems to be a bug where a linear increase in the number of artists results in a seemingly exponential increase in the plotting time for a signal. The result is a sluggish interface. This is part of the reason that things like the find peaks interface can appear sluggish, especially when many different peaks are identified.
I discussed this a little bit in #3031
For example when I run the code:
import numpy as np
import hyperspy.api as hs

num_points = 400
points = np.random.randint(0, 100, (num_points, 2, 10, 10))

%%timeit
s = hs.signals.Signal2D(np.random.random((10, 10, 100, 100)))
# one Point marker object per point; each p[0], p[1] has the navigation shape (10, 10)
markers = [hs.plot.markers.Point(x=p[0], y=p[1], color="blue") for p in points]
for m in markers:
    s.add_marker(m, plot_marker=False, permanent=True, render_figure=False)
s.plot(vmin=0, vmax=1)
It takes around 0.2 ms to plot with 10 markers, and most of this time is related to setting up the signal. For 200 markers that time has increased to 1.86 seconds, most of which is spent in the plot function, more specifically the underlying _plot_signal function.
The issue is related to the repeated calls to the plt.scatter function. The issue is similar when adding many patches to the plot, as is the case with adding arrows. For the most part, though, it is less likely that someone adds hundreds of arrows to a plot than hundreds of points or lines.
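The effect of repeated scatter calls can be reproduced outside hyperspy. The following is only a rough sketch in plain matplotlib, not hyperspy's actual plotting code; the helper name draw_time and the choice of 200 points are illustrative assumptions, and the measured times will vary by machine and backend.

```python
import time

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
xy = rng.integers(0, 100, size=(200, 2))


def draw_time(fig):
    """Hypothetical helper: time one full redraw of a figure."""
    start = time.perf_counter()
    fig.canvas.draw()
    return time.perf_counter() - start


# One scatter call per point: 200 separate PathCollection artists.
fig_many, ax_many = plt.subplots()
for x, y in xy:
    ax_many.scatter(x, y, color="blue")

# One scatter call for all points: a single PathCollection artist.
fig_one, ax_one = plt.subplots()
ax_one.scatter(xy[:, 0], xy[:, 1], color="blue")

print(len(ax_many.collections), "artists, draw took", draw_time(fig_many), "s")
print(len(ax_one.collections), "artist, draw took", draw_time(fig_one), "s")
```

Both figures show the same points; only the number of artists that matplotlib has to manage and draw differs.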
Expected behavior
Markers should behave in a way similar to ragged signals in hyperspy. Not only does this streamline the progression from finding a feature --> creating a marker --> visualization, but it also greatly speeds up visualization by reducing the number of marker objects.
It also removes the requirement that every navigation position have the same number of markers, a requirement that, as we see above, is an extreme detriment to plotting speed.
For the point marker the change in the code is relatively small and non-breaking.
If we change this line of code:
https://github.com/hyperspy/hyperspy/blob/7f6b448b91da35b71774a52860c95ba69deeb41c/hyperspy/drawing/_markers/point.py#L89-L90
To:
self.marker.set_offsets(np.squeeze(np.transpose([self.get_data_position('x1'),self.get_data_position('y1')])))
With this change the plotting time no longer depends on the number of markers, and we can also use this syntax to plot markers:
import numpy as np
import hyperspy.api as hs

# object arrays: each navigation position holds its own 1D array of coordinates
x = np.empty((10, 10), dtype=object)
y = np.empty((10, 10), dtype=object)
points = np.random.randint(0, 100, (2, 10, 10, 500))
for i in np.ndindex(10, 10):
    x_ind = (0,) + i
    y_ind = (1,) + i
    x[i] = points[x_ind]
    y[i] = points[y_ind]

s = hs.signals.Signal2D(np.random.random((10, 10, 100, 100)))
markers = hs.plot.markers.Point(x=x, y=y, color="blue")  # only adds one ragged marker object to the dataset
s.add_marker(markers, plot_marker=False, permanent=True, render_figure=False)
s.plot(vmin=0, vmax=1)
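The idea behind the set_offsets change can be illustrated in plain matplotlib. This is only a minimal sketch and not hyperspy's implementation: the helper show_position, the navigation shape (3, 4) and the point counts are assumptions made for the example. A single scatter artist is reused, and its offsets are swapped for a differently sized array at each navigation position.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
nav_shape = (3, 4)

# Ragged data: each navigation position holds its own (n, 2) array of points.
offsets = np.empty(nav_shape, dtype=object)
for idx in np.ndindex(nav_shape):
    n_points = rng.integers(1, 50)
    offsets[idx] = rng.integers(0, 100, size=(n_points, 2))

fig, ax = plt.subplots()
ax.set_xlim(0, 100)
ax.set_ylim(0, 100)
collection = ax.scatter([], [], color="blue")  # one artist for all positions


def show_position(idx):
    """Hypothetical helper: display the points of one navigation index."""
    collection.set_offsets(offsets[idx])
    fig.canvas.draw_idle()


show_position((0, 0))
show_position((2, 3))  # a different number of points, same single artist
```

Because only the offsets of one PathCollection change, the number of artists stays at one regardless of how many points each position contains.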
Additional context
This does cause a problem with saving markers, similar to #2904, but we could solve it in much the same way.
If we want the same increase in speed for plotting lines (@hakonanes you are probably very interested in this), the equivalent idea would be to replace the set of Line2D objects with a PolyCollection.
I did some testing of that and the speedups are very comparable depending on the number of line segments.
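As a rough illustration of the collection approach for lines, here is a sketch using matplotlib's LineCollection. Note that this swaps in LineCollection for the PolyCollection named above, on the assumption that it is the closest fit for open line segments; it is not the code I tested, and the segment counts are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import LineCollection

rng = np.random.default_rng(0)

# 300 line segments, each stored as [[x0, y0], [x1, y1]].
segments = rng.uniform(0, 100, size=(300, 2, 2))

fig, ax = plt.subplots()
ax.set_xlim(0, 100)
ax.set_ylim(0, 100)

# A single artist holds every segment, instead of one Line2D per line.
lines = LineCollection(segments, colors="blue", linewidths=1)
ax.add_collection(lines)

# Moving to another navigation position only replaces the segment data;
# the number of segments is free to change.
new_segments = rng.uniform(0, 100, size=(120, 2, 2))
lines.set_segments(new_segments)
fig.canvas.draw()
```

As with the point marker, the axes hold one artist no matter how many lines are shown.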
I can make a PR with this change. I think this will really clean up some peakfinding/fitting workflows and make them seem much faster without much underlying change to the code.
type: bug