Converting a WAV file to a spectrogram

余生分开走 2021-02-02 04:11

Hi, I'm very new to this, so please bear with me. I am trying to convert a WAV file to a spectrogram but am not sure how to begin. I read on something that says to read t

1 answer
  • 2021-02-02 04:54

    You could also use the BASS.NET library, which provides all these features natively and is free for non-commercial use.

    The Visuals.CreateSpectrum3DVoicePrint method does exactly that.

    Feel free to ask for assistance if you're having a hard time using it.

    EDIT: here's a quick and dirty sample


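    using System;
    using System.Drawing;
    using System.Windows.Forms;
    using Un4seen.Bass;       // Bass, BASSTimer, BASSInit, BASSFlag
    using Un4seen.Bass.Misc;  // Visuals

    public partial class Form1 : Form
    {
        private int _handle;      // BASS stream handle
        private int _pos;         // x position of the next voice print column
        private BASSTimer _timer;
        private Visuals _visuals;

        public Form1()
        {
            InitializeComponent();
        }

        // Draws one column of the voice print per tick, then advances the position.
        private void timer_Tick(object sender, EventArgs e)
        {
            // Dispose the Graphics object after each draw so GDI handles are not leaked.
            // ClientRectangle is used because the Graphics origin is the control's own corner.
            using (Graphics g = pictureBox1.CreateGraphics())
            {
                _visuals.CreateSpectrum3DVoicePrint(_handle, g, pictureBox1.ClientRectangle,
                                                    Color.Cyan, Color.Green,
                                                    _pos, false, true);
            }
            _pos++;
            if (_pos >= pictureBox1.Width)
            {
                _pos = 0; // wrap around and start overwriting from the left
            }
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            string file = "..\\..\\mysong.mp3";
            if (Bass.BASS_Init(-1, 44100, BASSInit.BASS_DEVICE_DEFAULT, Handle))
            {
                _handle = Bass.BASS_StreamCreateFile(file, 0, 0, BASSFlag.BASS_DEFAULT);

                // A handle of 0 means the stream could not be created.
                if (_handle != 0 && Bass.BASS_ChannelPlay(_handle, false))
                {
                    _visuals = new Visuals();
                    _timer = new BASSTimer(100); // tick every 100 ms (10 times per second)
                    _timer.Tick += timer_Tick;
                    _timer.Start();
                }
            }
        }
    }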
    

    EDIT 2

    You can provide a file name, but you can also supply your own audio data, either through the other overload that accepts an IntPtr or by using Bass.BASS_StreamCreatePush together with Bass.BASS_StreamPutData.

    To compare spectrograms, you could do the following:

    • Resize the image to a smaller size and reduce the information further by dithering it to 8-bit (with a good algorithm, though)
    • Compare the two images
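
    The two steps above can be sketched roughly as follows. This is only a minimal illustration, not something BASS.NET provides: the class SpectrogramCompare and its methods are hypothetical names, and it assumes both spectrograms are already available as grayscale images stored as byte[height, width]. The dithering step is left out here; the sketch only shrinks the images by block averaging and scores the difference.

```csharp
using System;

// Hypothetical helper, not part of BASS.NET. Images are grayscale,
// stored as byte[height, width] (0 = black, 255 = white).
public static class SpectrogramCompare
{
    // Shrinks the image by averaging each factor x factor block into one pixel.
    // Rows and columns that do not fill a whole block are dropped.
    public static byte[,] Downsample(byte[,] src, int factor)
    {
        if (factor < 1) throw new ArgumentOutOfRangeException(nameof(factor));
        int h = src.GetLength(0) / factor;
        int w = src.GetLength(1) / factor;
        var dst = new byte[h, w];
        for (int y = 0; y < h; y++)
        {
            for (int x = 0; x < w; x++)
            {
                int sum = 0;
                for (int dy = 0; dy < factor; dy++)
                    for (int dx = 0; dx < factor; dx++)
                        sum += src[y * factor + dy, x * factor + dx];
                dst[y, x] = (byte)(sum / (factor * factor));
            }
        }
        return dst;
    }

    // Mean absolute difference per pixel: 0 means identical, 255 is the maximum.
    public static double MeanAbsoluteDifference(byte[,] a, byte[,] b)
    {
        if (a.GetLength(0) != b.GetLength(0) || a.GetLength(1) != b.GetLength(1))
            throw new ArgumentException("Images must have the same size.");
        if (a.Length == 0) return 0.0;
        long total = 0;
        for (int y = 0; y < a.GetLength(0); y++)
            for (int x = 0; x < a.GetLength(1); x++)
                total += Math.Abs(a[y, x] - b[y, x]);
        return (double)total / a.Length;
    }
}
```

    You would call Downsample on both images with the same factor and then pass the results to MeanAbsoluteDifference. What score counts as "similar enough" is something you would have to tune on your own data.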

    However, for comparing audio data I'd strongly suggest using fingerprints. Fingerprinting does roughly the same thing but is much more robust than my suggestion.

    Here's a fingerprinting library that is free to use:

    http://www.codeproject.com/Articles/206507/Duplicates-detector-via-audio-fingerprinting

    Not entirely sure it would work for small samples, though.

    EDIT 3

    I'm afraid I can't find the link where I read this, but that's what they do: they reduce the data and compare images, as in the example below (last image):

    (Note: this is not meant to be compared with image 1 at all. It is something else, shown only to illustrate why a lower resolution might give better results.)

    enter image description here

    (from http://blog.echonest.com/post/545323349/the-echo-nest-musical-fingerprint-enmfp)

    Now a very basic explanation of the process:

    Comparison source A:

    enter image description here

    Comparison source B: (I've just changed a region of A)

    enter image description here

    Comparison result:

    (done with Paint.NET by adding the two images above as layers and setting the second layer's blending mode to Difference instead of Normal)

    enter image description here

    If the fingerprints were identical, the resulting image would be completely black.
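
    The same "difference" comparison can be done in code instead of Paint.NET. The sketch below is an assumption-laden illustration: DifferenceBlend and its methods are hypothetical names, and the images are assumed to be grayscale byte[height, width] arrays of equal size rather than real bitmaps.

```csharp
using System;

// Hypothetical helper: grayscale equivalent of a "difference" layer blend.
public static class DifferenceBlend
{
    // Returns |a - b| for every pixel. Unchanged pixels come out as 0 (black).
    public static byte[,] Difference(byte[,] a, byte[,] b)
    {
        int h = a.GetLength(0), w = a.GetLength(1);
        if (b.GetLength(0) != h || b.GetLength(1) != w)
            throw new ArgumentException("Images must have the same size.");
        var diff = new byte[h, w];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                diff[y, x] = (byte)Math.Abs(a[y, x] - b[y, x]);
        return diff;
    }

    // True when every pixel of the difference image is 0, i.e. the inputs match exactly.
    public static bool IsAllBlack(byte[,] diff)
    {
        foreach (byte v in diff)
            if (v != 0) return false;
        return true;
    }

    // Counts pixels whose difference exceeds a tolerance, so small noise can be ignored.
    public static int CountAbove(byte[,] diff, byte tolerance)
    {
        int count = 0;
        foreach (byte v in diff)
            if (v > tolerance) count++;
        return count;
    }
}
```

    IsAllBlack is the strict test described above. In practice two recordings of the same sound rarely match pixel for pixel, so something like CountAbove with a tolerance is probably closer to what you want.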

    By reducing the data to an 8-bit image you make the comparison easier, but keep in mind that you will need a good dithering algorithm.

    This one is quite good:

    http://www.codeproject.com/Articles/66341/A-Simple-Yet-Quite-Powerful-Palette-Quantizer-in-C

    It's not on par with Photoshop's or HyperSnap's (which IMO is exceptional), but it might be enough for the task.

    And avoid Floyd–Steinberg dithering, or anything else that does error diffusion, at all costs.

    Here are some attempts at creating dithering algorithms: http://bisqwit.iki.fi/story/howto/dither/jy/
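
    As one example of dithering without error diffusion, here is a minimal sketch of ordered dithering with the standard 4x4 Bayer matrix. Note that this is a swapped-in technique for illustration: it is not the quantizer from the CodeProject article above, nor the algorithm from the linked page. OrderedDither is a hypothetical name, and the image is assumed to be grayscale byte[height, width]. Each output pixel depends only on its own value and position, so no error is carried over to neighbouring pixels.

```csharp
using System;

// Hypothetical helper: ordered (Bayer) dithering of a grayscale image.
public static class OrderedDither
{
    // Standard 4x4 Bayer index matrix (values 0..15).
    private static readonly int[,] Bayer4 =
    {
        {  0,  8,  2, 10 },
        { 12,  4, 14,  6 },
        {  3, 11,  1,  9 },
        { 15,  7, 13,  5 }
    };

    // Reduces the image to `levels` evenly spaced gray values (2..256).
    public static byte[,] Quantize(byte[,] src, int levels)
    {
        if (levels < 2 || levels > 256) throw new ArgumentOutOfRangeException(nameof(levels));
        int h = src.GetLength(0), w = src.GetLength(1);
        double step = 255.0 / (levels - 1);   // distance between output gray values
        var dst = new byte[h, w];
        for (int y = 0; y < h; y++)
        {
            for (int x = 0; x < w; x++)
            {
                // Position-dependent threshold strictly between 0 and 1.
                double threshold = (Bayer4[y % 4, x % 4] + 0.5) / 16.0;
                // Round down or up to a neighbouring level depending on the threshold.
                int index = (int)Math.Floor(src[y, x] / step + threshold);
                index = Math.Max(0, Math.Min(levels - 1, index));
                dst[y, x] = (byte)Math.Round(index * step);
            }
        }
        return dst;
    }
}
```

    Whether ordered dithering is good enough for fingerprint-style comparison is something you would need to check on your own spectrograms.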

    Take this with caution, as I'm not an expert in the field, but that's roughly how it's done.

    Go to https://dsp.stackexchange.com/ and ask a few questions there; you might get useful hints on achieving this.
