Android audio FFT to display fundamental frequency

做~自己de王妃 submitted on 2019-11-30 10:19:20
PowerQian

I've recently been working on a project that requires almost the same thing. You probably don't need help anymore, but I'll share my thoughts anyway in case someone needs this in the future.

  1. I'm not sure whether the short-to-double function works, and I don't understand that snippet of code either. It was written for byte-to-double conversion.
  2. In the code "double[] micBufferData = new double[bufferSizeInBytes];" I think the size of micBufferData should be "bufferSizeInBytes / 2", since every 16-bit sample takes two bytes and micBufferData needs one element per sample.
  3. A radix-2 FFT does require a window size that is a power of 2. However, many implementations accept an input of arbitrary length and handle the rest themselves; the documentation of the implementation you use should state its input requirements. In your case, the size of the Complex array can be used as the FFT size. I don't know the details of your FFT algorithm, but I don't think the inverse transform is needed here.
  4. To use the code you gave at the end, you first have to find the peak index in the FFT result. I used a double array as input instead of Complex, so in my case it looks something like this:

    double maxVal = -1;
    int maxIndex = -1;
    for (int j = 0; j < mFftSize / 2; ++j) {
        // squared magnitude of bin j: re^2 + im^2
        double v = fftResult[2*j] * fftResult[2*j] + fftResult[2*j+1] * fftResult[2*j+1];
        if (v > maxVal) {
            maxVal = v;
            maxIndex = j;
        }
    }
    

    Index 2*j holds the real part and 2*j+1 the imaginary part of bin j. maxIndex is the index of the peak magnitude you want (more detail here); pass it to the ComputeFrequency function. The return value is the frequency you are looking for.
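Points 2 and 4 can be sketched roughly as follows. This is only an illustration, not the code from the question: the class and method names are made up, it assumes 16-bit little-endian PCM (what AudioRecord typically delivers with ENCODING_PCM_16BIT), and binToFrequency is my guess at what a ComputeFrequency-style helper does, namely index * sampleRate / fftSize.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper class, names are illustrative only.
final class PcmPeakSketch {

    // Convert 16-bit little-endian PCM bytes into doubles in [-1, 1).
    // The result has one element per sample, i.e. byteCount / 2 elements.
    static double[] pcm16ToDoubles(byte[] pcm, int byteCount) {
        ByteBuffer buf = ByteBuffer.wrap(pcm, 0, byteCount)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        double[] samples = new double[byteCount / 2];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = buf.getShort() / 32768.0;
        }
        return samples;
    }

    // Centre frequency in Hz of FFT bin `bin`
    // (assumed equivalent of a ComputeFrequency-style function).
    static double binToFrequency(int bin, double sampleRate, int fftSize) {
        return bin * sampleRate / fftSize;
    }
}
```

With a 1024-point FFT at 44100 Hz, a peak at maxIndex = 10 would map to about 430.7 Hz.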

Hopefully this helps someone.

You should pick an FFT window size depending on your time versus frequency resolution requirements, and not just use the audio buffer size when creating your FFT temp array.
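To make the trade-off concrete, here is a small sketch (hypothetical names, nothing from the question) that computes how much time one FFT window covers and how far apart its bins are. A longer window gives finer bins but reacts more slowly.

```java
// Hypothetical helper class, names are illustrative only.
final class FftWindowSketch {

    // Duration in milliseconds covered by one FFT window.
    static double windowMillis(int fftSize, double sampleRate) {
        return 1000.0 * fftSize / sampleRate;
    }

    // Spacing in Hz between adjacent FFT bins.
    static double binWidthHz(int fftSize, double sampleRate) {
        return sampleRate / fftSize;
    }
}
```

At 44100 Hz, 1024 samples cover about 23.2 ms with bins about 43.1 Hz apart, while 4096 samples cover about 92.9 ms with bins about 10.8 Hz apart.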

The array index is your int i, as used in your magnitude[i] print statement.

The fundamental pitch frequency of a musical note often differs from the frequency of the FFT magnitude peak (a harmonic can be stronger than the fundamental), so you may want to research some pitch estimation algorithms.
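As one example of such an algorithm, here is a bare-bones time-domain autocorrelation sketch. It is not something from the question or a recommendation of a specific method, just an illustration; the class name, method name and the minHz/maxHz search range are my own assumptions.

```java
// Hypothetical helper class, names are illustrative only.
final class AutocorrelationPitchSketch {

    // Estimate the fundamental in Hz by finding the lag with the largest
    // autocorrelation, searching only lags that correspond to minHz..maxHz.
    // Returns NaN if the search range is empty.
    static double estimatePitch(double[] samples, double sampleRate,
                                double minHz, double maxHz) {
        int minLag = Math.max(1, (int) Math.floor(sampleRate / maxHz));
        int maxLag = Math.min(samples.length - 1,
                              (int) Math.ceil(sampleRate / minHz));
        int bestLag = -1;
        double best = Double.NEGATIVE_INFINITY;
        for (int lag = minLag; lag <= maxLag; lag++) {
            double sum = 0.0;
            for (int i = 0; i + lag < samples.length; i++) {
                sum += samples[i] * samples[i + lag];
            }
            if (sum > best) {
                best = sum;
                bestLag = lag;
            }
        }
        return bestLag > 0 ? sampleRate / bestLag : Double.NaN;
    }
}
```

The result is quantised to whole-sample lags, so a real tuner would want to refine it (for instance by interpolating around the best lag), and the double loop is O(n * lags), which is fine for a sketch but slow for long buffers.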

I suspect that the strange results you're getting are because you might need to unpack the FFT. How this is done will depend on the library that you're using (see here for docs on how it's packed in GSL, for example). The packing may mean that the real and imaginary components are not in the positions in the array that you expect.
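To illustrate what unpacking can look like, here is a sketch for one specific layout: the half-complex format that GSL documents for its radix-2 real transform, where the real part of bin k is stored at index k and the imaginary part at index n-k. Whether your library uses this layout is an assumption you have to check against its documentation; the class and method names are made up.

```java
// Hypothetical helper class, names are illustrative only.
final class HalfComplexSketch {

    // Magnitudes of bins 0..n/2 from a half-complex packed real FFT.
    // ASSUMED layout: re[k] at data[k], im[k] at data[n - k];
    // bins 0 and n/2 are purely real. n must be even and at least 2.
    static double[] magnitudes(double[] data) {
        int n = data.length;
        double[] mag = new double[n / 2 + 1];
        mag[0] = Math.abs(data[0]);
        for (int k = 1; k < n / 2; k++) {
            mag[k] = Math.hypot(data[k], data[n - k]);
        }
        mag[n / 2] = Math.abs(data[n / 2]);
        return mag;
    }
}
```

If you read such an array as interleaved (re, im) pairs instead, the magnitudes come out wrong, which would explain strange peaks.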

For your other questions about window size and resolution: if you're creating a tuner, I'd suggest trying a window of roughly 20 ms (e.g. 1024 samples at 44.1 kHz, which is about 23 ms). A tuner needs quite fine frequency resolution, so you could try zero-padding by a factor of 8 or 16, which gives you a bin spacing of roughly 3-6 Hz.
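A minimal sketch of the zero-padding step, with made-up names: the captured frame is copied into a longer array whose tail stays zero, and that longer array is what goes into the FFT.

```java
import java.util.Arrays;

// Hypothetical helper class, names are illustrative only.
final class ZeroPadSketch {

    // Copy `frame` into a new array `factor` times longer;
    // Arrays.copyOf fills the extra elements with 0.0.
    static double[] zeroPad(double[] frame, int factor) {
        return Arrays.copyOf(frame, frame.length * factor);
    }

    // Bin spacing in Hz of the FFT taken over the padded array.
    static double binSpacingHz(int frameLength, int factor, double sampleRate) {
        return sampleRate / ((double) frameLength * factor);
    }
}
```

For 1024 samples at 44100 Hz, a factor of 8 gives about 5.4 Hz per bin and a factor of 16 about 2.7 Hz. Note that zero-padding only interpolates the spectrum more finely; it does not help to separate two tones that are closer together than the original window can resolve.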
