I am using the FFT function in NumPy to do some signal processing. I have an array called signal
which has one data point for each hour and has a total of 576 data poi
In general, the dimensional units of frequency from an FFT are the same as the dimensional units of the sample rate attributed to the data fed to the FFT, for example: per meter, per radian, per second, or in your case, per hour.
The frequency spacing per FFT result bin index is theSampleRate / N, with the same dimensional units as above, where N is the length of the full FFT (you might only be plotting half of this length in the case of strictly real data).
Note that each FFT result peak bin represents a filter with a non-zero bandwidth, so you might want to add some uncertainty or error bounds to the result points you map onto frequency values. Or even use an interpolation estimation method, if needed and appropriate for the source data.
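As one illustration of an interpolation estimate, here is a minimal sketch of parabolic interpolation on the log magnitude around the peak bin. The function name interpolated_peak_frequency and the test tone are hypothetical, and this is just one common technique, not necessarily the one meant above. It assumes a single dominant tone away from DC and Nyquist.

```python
import numpy as np

def interpolated_peak_frequency(x, sample_rate):
    """Estimate the dominant frequency of real-valued x.

    Returns (frequency, bin_width), both in the units of sample_rate.
    """
    n = len(x)
    mag = np.abs(np.fft.rfft(x * np.hanning(n)))
    # Skip the first and last bins so that both neighbours of the peak exist.
    k = int(np.argmax(mag[1:-1])) + 1
    a, b, c = np.log(mag[k - 1]), np.log(mag[k]), np.log(mag[k + 1])
    # Vertex of the parabola through the three points, as a fraction of a bin.
    offset = 0.5 * (a - c) / (a - 2.0 * b + c)
    bin_width = sample_rate / n
    return (k + offset) * bin_width, bin_width

# Hypothetical test: hourly samples, tone at 0.0845 cycles/hour,
# which falls between bins 48 and 49 for N = 576.
t = np.arange(576)
x = np.sin(2 * np.pi * 0.0845 * t)
freq, bin_width = interpolated_peak_frequency(x, sample_rate=1.0)
print(freq, "+/-", bin_width / 2, "cycles/hour")
```

Without interpolation, the honest way to report a peak is the bin centre plus or minus half a bin width, which is the second value returned here.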
Given sampling rate FSample and transform blocksize N, you can calculate the frequency resolution deltaF, sampling interval deltaT, and total capture time capT using the relationships:
deltaT = 1/FSample = capT/N
deltaF = 1/capT = FSample/N
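The two relationships above can be written as a small helper. This is only a sketch; the function name fft_scales and its argument names are made up for illustration.

```python
def fft_scales(f_sample, n):
    """Return (delta_t, delta_f, cap_t) for sampling rate f_sample and block size n."""
    delta_t = 1.0 / f_sample   # sampling interval
    cap_t = n * delta_t        # total capture time
    delta_f = f_sample / n     # frequency resolution (bin spacing)
    return delta_t, delta_f, cap_t

# One sample per hour, 576 samples: units are hours and cycles/hour.
delta_t, delta_f, cap_t = fft_scales(f_sample=1.0, n=576)
print(delta_t, delta_f, cap_t)
```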
Keep in mind also that the FFT returns values for frequencies from 0 to FSample, or equivalently from -FSample/2 to FSample/2. In your plot, you're already dropping the -FSample/2 to 0 part. NumPy includes a helper function to calculate all this for you: fftfreq.
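To see how fftfreq lays out the bins, here is a small sketch with a deliberately tiny, made-up block size of 8 samples taken one hour apart. The positive frequencies come first, followed by the negative ones; np.fft.fftshift reorders them so they run from negative to positive.

```python
import numpy as np

n, delta_t = 8, 1.0  # hypothetical: 8 samples, 1 hour apart

freqs = np.fft.fftfreq(n, d=delta_t)
print(freqs)     # [ 0.     0.125  0.25   0.375 -0.5   -0.375 -0.25  -0.125]

# Reorder so the axis runs from -FSample/2 up towards FSample/2.
centered = np.fft.fftshift(freqs)
print(centered)  # [-0.5   -0.375 -0.25  -0.125  0.     0.125  0.25   0.375]
```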
For your values of deltaT = 1 hour and N = 576, you get deltaF = 0.001736 cycles/hour = 0.04167 cycles/day, from -0.5 cycles/hour to 0.5 cycles/hour. So if you have a magnitude peak at, say, bin 48 (and bin 528), that corresponds to a frequency component at 48*deltaF = 0.0833 cycles/hour = 2 cycles/day.
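The same numbers can be checked with fftfreq directly. This is just a sketch of the arithmetic above; the variable names are mine.

```python
import numpy as np

n = 576
delta_t_hours = 1.0

freqs_per_hour = np.fft.fftfreq(n, d=delta_t_hours)        # cycles/hour
freqs_per_day = np.fft.fftfreq(n, d=delta_t_hours / 24.0)  # cycles/day

delta_f = freqs_per_hour[1]   # bin spacing, about 0.001736 cycles/hour
peak = freqs_per_day[48]      # about 2 cycles/day
mirror = freqs_per_day[528]   # the matching negative frequency, about -2 cycles/day
print(delta_f, peak, mirror)
```

Bin 528 is N - 48, so it holds the negative-frequency twin of bin 48. For real input the two carry the same magnitude, which is why the peak shows up in both.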
In general, you should apply a window function to your time domain data before calculating the FFT, to reduce spectral leakage. The Hann window is almost never a bad choice. You can also use the rfft function to skip the -FSample/2 to 0 part of the output. So then, your code would be:
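import numpy as np
import matplotlib.pyplot as plt

ft = np.fft.rfft(signal * np.hanning(len(signal)))  # FFT of the Hann-windowed real data
mgft = np.abs(ft)  # magnitude spectrum
# rfftfreq matches the rfft output; fftfreq(...)[:len(mgft)] would end with -0.5 for even N
xVals = np.fft.rfftfreq(len(signal), d=1.0)  # cycles/hour, or d=1.0/24 for cycles/day
plt.plot(xVals, mgft)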
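To try this without the real data, here is a sketch on a synthetic stand-in: a made-up 12-hour cycle plus an offset and some noise. It uses np.fft.rfftfreq, which returns the non-negative frequency axis matching the rfft output. Subtracting the mean before windowing is an extra step I've added so that the DC term doesn't dominate the peak search; drop it if you want to see the raw spectrum.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 576
t_hours = np.arange(n)

# Hypothetical stand-in for the real data: a 12-hour cycle, an offset, and noise.
signal = 5.0 + 2.0 * np.sin(2 * np.pi * t_hours / 12.0) + 0.5 * rng.standard_normal(n)

# Remove the mean (added assumption), apply a Hann window, take the real FFT.
windowed = (signal - signal.mean()) * np.hanning(n)
mgft = np.abs(np.fft.rfft(windowed))

freqs_per_day = np.fft.rfftfreq(n, d=1.0 / 24)  # cycles/day
peak_bin = int(np.argmax(mgft))
peak_freq = freqs_per_day[peak_bin]
print(peak_bin, peak_freq)
```

A 12-hour period is 1/12 cycles/hour, which lands on bin 48 for N = 576, i.e. 2 cycles/day.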
The result of the FFT doesn't map to hours, but to the frequencies contained in your dataset. It would help to see your transformed graph so we can see where the spikes are.
You might be getting a spike at the beginning of the transformed buffer, since you didn't do any windowing.
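To get a feel for what windowing changes, here is a rough sketch comparing leakage with and without a Hann window. The test tone is made up and deliberately placed halfway between two bins, which is the worst case for leakage. The helper name leakage_db and the choice of "far away" as bin 100 and up are arbitrary.

```python
import numpy as np

n = 576
t = np.arange(n)
# Hypothetical tone halfway between bins 48 and 49.
x = np.sin(2 * np.pi * 48.5 * t / n)

def leakage_db(samples, far_from=100):
    """Largest magnitude in bins >= far_from, in dB relative to the spectral peak."""
    mag = np.abs(np.fft.rfft(samples))
    return 20 * np.log10(mag[far_from:].max() / mag.max())

no_window_db = leakage_db(x)
hann_db = leakage_db(x * np.hanning(n))
print("no window:", no_window_db, "dB")
print("Hann:     ", hann_db, "dB")
```

The more negative the number, the less energy from the tone has smeared into distant bins.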