At first, this question can sound really stupid, but it is not in fundamental. Maybe, it can seem like unresolvable exactly by any algorithm, but I pretend to say it is.
You could perform a smoothing or lowpass filtering operation first, and find the locations of the local minima/maxima from the smoothed data. Then get the values of the minima and the maxima from the original data.
You could use a normal maximum/minimum filter, which finds all turning points, then filter the list of turning points by threshold.
I think what you really want to do is remove the "long-term variation" from the signal and look only at the "short term variation". This is can be done using the empirical mode decomposition. See Sec 2.3.2 of my thesis. (Alernately, Google around for "Empirical Mode Decomposition", "EMD", or "Hilbert-Huang Transform".)
Here's the EMD in action:
Notice the increasing generality as the EMD algorithm extracts components of the signal, starting at "most detailed" and ending with "most general trend". (Note there are apparently nine components - only a few are shown.)