I am looking to find the peaks in some Gaussian-smoothed data that I have. I have looked at some of the peak detection methods available, but they require an input range over
>>> import numpy as np
>>> from scipy.signal import argrelextrema
>>> a = np.array([1,2,3,4,5,4,3,2,1,2,3,2,1,2,3,4,5,6,5,4,3,2,1])
>>> argrelextrema(a, np.greater)
(array([ 4, 10, 17]),)
>>> a[argrelextrema(a, np.greater)]
array([5, 3, 6])
If your input represents a noisy distribution, you can try smoothing it with NumPy's convolve function (np.convolve) before looking for extrema.
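As a rough sketch of that smoothing step: the snippet below builds a synthetic noisy sine wave, smooths it with a simple moving average via np.convolve, and compares how many local maxima argrelextrema reports before and after. The signal, the noise level, and the 25-sample window are all assumptions chosen for illustration, not values from the question.

```python
import numpy as np
from scipy.signal import argrelextrema

# Synthetic test signal (assumed for illustration): two periods of a sine plus noise
rng = np.random.default_rng(0)
t = np.linspace(0, 4 * np.pi, 400)
noisy = np.sin(t) + 0.2 * rng.standard_normal(t.size)

# Moving-average kernel; the window length is an arbitrary choice
window = 25
kernel = np.ones(window) / window

# mode="same" keeps the output the same length as the input
smoothed = np.convolve(noisy, kernel, mode="same")

raw_peaks = argrelextrema(noisy, np.greater)[0]
smooth_peaks = argrelextrema(smoothed, np.greater)[0]

# The smoothed signal should report far fewer spurious maxima
print(len(raw_peaks), len(smooth_peaks))
```

A Gaussian kernel could be substituted for the flat window. Note that mode="same" zero-pads the input, so extrema within about half a window of either end are less trustworthy.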
If you can exclude maxima at the edges of the array, you can always check whether an element is bigger than each of its neighbors:
import numpy as np
array = np.array([1,2,3,4,5,4,3,2,1,2,3,2,1,2,3,4,5,6,5,4,3,2,1])
# Check that each element is bigger than both of its neighbors, excluding edges
# (named is_max to avoid shadowing the built-in max):
is_max = (array[1:-1] > array[:-2]) & (array[1:-1] > array[2:])
# Print these values
print(array[1:-1][is_max])
# Locations of the maxima
print(np.arange(1, array.size-1)[is_max])
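If maxima at the edges do matter, one possible variant (a sketch, not part of the approach above) is to pad the array with -inf on both sides, so that an end point only has to beat its single real neighbor. The array c below is a made-up example.

```python
import numpy as np

# Hypothetical example data with maxima at both ends
c = np.array([5, 3, 4, 2, 6])

# Pad with -inf so the first and last elements get a "neighbor" they always beat
padded = np.concatenate(([-np.inf], c, [-np.inf]))

# Same strict comparison as before, now covering every element of c
is_max = (padded[1:-1] > padded[:-2]) & (padded[1:-1] > padded[2:])

print(np.flatnonzero(is_max))  # [0 2 4]
print(c[is_max])               # [5 4 6]
```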
SciPy provides a function, argrelextrema, that gets this task done:
import numpy as np
from scipy.signal import argrelextrema
a = np.array([1,2,3,4,5,4,3,2,1,2,3,2,1,2,3,4,5,6,5,4,3,2,1])
# determine the indices of the local maxima
max_ind = argrelextrema(a, np.greater)
# get the actual values using these indices
r = a[max_ind] # array([5, 3, 6])
That gives you the desired output for r.
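The same function also covers minima and wider neighborhoods. As a small sketch on the same array: passing np.less returns local minima, and the order argument sets how many points on each side a candidate must beat (the value 5 here is an arbitrary choice for illustration).

```python
import numpy as np
from scipy.signal import argrelextrema

a = np.array([1,2,3,4,5,4,3,2,1,2,3,2,1,2,3,4,5,6,5,4,3,2,1])

# Local minima: swap the comparator
min_ind = argrelextrema(a, np.less)[0]
print(min_ind)       # [ 8 12]

# Require each maximum to exceed its 5 nearest neighbors on both sides;
# the small bump at index 10 no longer qualifies
wide_max_ind = argrelextrema(a, np.greater, order=5)[0]
print(wide_max_ind)  # [ 4 17]
```

Because the comparison is strict, flat-topped peaks (runs of equal values) are not reported by np.greater.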
As of SciPy version 1.1, you can also use find_peaks. Below are two examples taken from the documentation itself.
Using the height argument, one can select all maxima above a certain threshold (in this example, all non-negative maxima; this can be very useful if one has to deal with a noisy baseline; if you want to find minima, just multiply your input by -1):
import matplotlib.pyplot as plt
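# Note: scipy.misc is deprecated since SciPy 1.10; on newer versions use
# from scipy.datasets import electrocardiogram
from scipy.misc import electrocardiogram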
from scipy.signal import find_peaks
import numpy as np
x = electrocardiogram()[2000:4000]
peaks, _ = find_peaks(x, height=0)
plt.plot(x)
plt.plot(peaks, x[peaks], "x")
plt.plot(np.zeros_like(x), "--", color="gray")
plt.show()
Another extremely helpful argument is distance, which defines the minimum distance between two peaks:
peaks, _ = find_peaks(x, distance=150)
# difference between peaks is >= 150
print(np.diff(peaks))
# prints [186 180 177 171 177 169 167 164 158 162 172]
plt.plot(x)
plt.plot(peaks, x[peaks], "x")
plt.show()
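For a version that runs without the ECG dataset or matplotlib, here is a minimal sketch applying the same arguments to the small array a from the earlier answers (the thresholds 4 and 8 are arbitrary values picked for this array). It also shows the negation trick for minima mentioned above.

```python
import numpy as np
from scipy.signal import find_peaks

a = np.array([1,2,3,4,5,4,3,2,1,2,3,2,1,2,3,4,5,6,5,4,3,2,1])

# Keep only peaks whose value is at least 4
tall, props = find_peaks(a, height=4)
print(tall)                   # [ 4 17]
print(props["peak_heights"])  # [5. 6.]

# Require at least 8 samples between neighboring peaks;
# the smaller peak at index 10 is dropped in favor of the one at 17
spaced, _ = find_peaks(a, distance=8)
print(spaced)                 # [ 4 17]

# Minima: negate the input and look for peaks
minima, _ = find_peaks(-a)
print(minima)                 # [ 8 12]
```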
If your original data is noisy, then using statistical methods is preferable, as not all peaks are going to be significant. For your a array, a possible solution is to use double differentials:
peaks = a[1:-1][np.diff(np.diff(a)) < 0]
# peaks = array([5, 3, 6])
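One caveat worth illustrating: a negative second difference marks concave points, which coincide with the peaks in a only because that array is piecewise linear. The sketch below contrasts it with checking for a sign change in the first difference, which is equivalent to the neighbor comparison shown earlier. The array b is a made-up counterexample and peaks_by_sign_change is a hypothetical helper name.

```python
import numpy as np

a = np.array([1,2,3,4,5,4,3,2,1,2,3,2,1,2,3,4,5,6,5,4,3,2,1])

# Hypothetical counterexample: strictly increasing but concave, so it has no peaks
b = np.array([0.0, 3.0, 5.0, 6.0, 6.5])

def peaks_by_sign_change(x):
    """Indices where the first difference flips from positive to negative."""
    d = np.diff(x)
    return np.flatnonzero((d[:-1] > 0) & (d[1:] < 0)) + 1

print(peaks_by_sign_change(a))           # [ 4 10 17]
print(peaks_by_sign_change(b))           # []

# The double-difference test flags every interior point of b as a "peak"
print(b[1:-1][np.diff(np.diff(b)) < 0])  # [3. 5. 6.]
```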