Normalizing audio, how to convert a float array to a byte array?

南旧 2021-02-04 21:52

Hi all, I am playing an audio file. I read it as a byte[] and then I need to normalize the audio by putting values into range of [-1,1]. I want to then put each flo

5 Answers
  • 2021-02-04 22:08

    This works:

    float number = 0.43f;
    byte[] array = BitConverter.GetBytes(number);
    

    What does not work for you?
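
    For completeness, here is a small sketch of the round trip, reading the bytes back with BitConverter.ToSingle. It is written as top-level statements (assuming C# 9 or later); the variable names are only illustrative.

```csharp
using System;

// Convert one float to its byte representation and back again.
float number = 0.43f;
byte[] bytes = BitConverter.GetBytes(number);      // a float always yields 4 bytes
float restored = BitConverter.ToSingle(bytes, 0);  // read 4 bytes starting at index 0

Console.WriteLine(bytes.Length);        // 4
Console.WriteLine(restored == number);  // True
```

    The byte order of the array follows the machine the code runs on, so this is only safe when the same machine (or one with the same byte order) reads the bytes back.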

  • 2021-02-04 22:09

    In a comment, you stated "I am playing audio file... I read it as byte[] and then I need to normalize audio by putting values into range of [-1,1] and then I need to put that byte[] back into playing audio player"

    I am making a big assumption here, but I'm guessing that the data you receive from ar.ReadData() is a byte array of 2-channel, 16-bit, 44.1 kHz PCM data. (Side note: are you using the Alvas.Audio library?) If that is the case, here is how to do what you want.

    Background

    First, a little background. A 2-channel, 16-bit PCM data stream looks like this:

       byte | 01 02 | 03 04 | 05 06 | 07 08 | 09 10 | 11 12 | ...
    channel | Left  | Right | Left  | Right | Left  | Right | ...
      frame |     First     |    Second     |     Third     | ...
     sample | 1st L | 1st R | 2nd L | 2nd R | 3rd L | 3rd R | ... etc.
    

    It's important here to take note of a few things:

    1. Since the audio data is 16-bit, a single sample from a single channel is a short (2 bytes), not an int (4 bytes), with a value in the range -32768 to 32767.
    2. This data is in little-endian representation. The .NET BitConverter class reads and writes in the host's byte order (reported by BitConverter.IsLittleEndian), so unless your architecture is also little-endian, you can't use it directly for the conversion.
    3. We don't have to split the data into per-channel streams, because we are normalizing both channels based on the single highest value from either channel.
    4. Converting a floating-point value to an integer value will result in quantization errors, so you probably want to use some sort of dithering (which is an entire topic in its own right).
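
    To make the layout concrete, here is a minimal sketch that decodes three stereo frames by hand. The byte values are made-up sample data, and the split into left and right arrays is only there to illustrate the interleaving; the normalization itself does not need it. It is written as top-level statements (assuming C# 9 or later).

```csharp
using System;

// Three stereo frames of 16-bit little-endian PCM (hypothetical sample data):
// frame 1: L = 1, R = -1; frame 2: L = 256, R = 32767; frame 3: L = -32768, R = 0
byte[] pcm =
{
    0x01, 0x00, 0xFF, 0xFF,
    0x00, 0x01, 0xFF, 0x7F,
    0x00, 0x80, 0x00, 0x00
};

const int bytesPerSample = 2;                         // 16-bit audio
const int channels = 2;                               // stereo
const int bytesPerFrame = bytesPerSample * channels;  // 4 bytes per frame

int frameCount = pcm.Length / bytesPerFrame;
short[] left = new short[frameCount];
short[] right = new short[frameCount];

for (int frame = 0; frame < frameCount; frame++)
{
    int offset = frame * bytesPerFrame;
    // Low byte first, then high byte: little-endian regardless of the host CPU.
    left[frame] = (short)(pcm[offset] | (pcm[offset + 1] << 8));
    right[frame] = (short)(pcm[offset + 2] | (pcm[offset + 3] << 8));
}

Console.WriteLine(string.Join(", ", left));   // 1, 256, -32768
Console.WriteLine(string.Join(", ", right));  // -1, 32767, 0
```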

    Helper Functions

    Before we jump into the actual normalization, let's make this easier on ourselves by writing a couple of helper functions to get a short from a byte[] and vice-versa:

    short GetShortFromLittleEndianBytes(byte[] data, int startIndex)
    {
        return (short)((data[startIndex + 1] << 8)
             | data[startIndex]);
    }
    
    byte[] GetLittleEndianBytesFromShort(short data)
    {
        byte[] b = new byte[2];
        b[0] = (byte)data;
        b[1] = (byte)(data >> 8 & 0xFF);
        return b;
    }
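
    As an alternative to hand-written helpers, newer runtimes ship System.Buffers.Binary.BinaryPrimitives, which reads and writes in a fixed byte order no matter what the host uses. The sketch below assumes .NET Core 2.1 or later (or the System.Memory package on .NET Framework) and C# 9 top-level statements; the sample values are arbitrary.

```csharp
using System;
using System.Buffers.Binary;

byte[] buffer = new byte[4];
short a = -2;
short b = 0x1234;

// Write two samples at byte offsets 0 and 2, always in little-endian order.
BinaryPrimitives.WriteInt16LittleEndian(buffer.AsSpan(0, 2), a);
BinaryPrimitives.WriteInt16LittleEndian(buffer.AsSpan(2, 2), b);

// Read them back from the same offsets.
short first = BinaryPrimitives.ReadInt16LittleEndian(buffer.AsSpan(0, 2));
short second = BinaryPrimitives.ReadInt16LittleEndian(buffer.AsSpan(2, 2));

Console.WriteLine(BitConverter.ToString(buffer));  // FE-FF-34-12
Console.WriteLine(first);                          // -2
Console.WriteLine(second);                         // 4660
```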
    

    Normalization

    An important distinction should be made here: audio normalization is not the same as statistical normalization. Here we are going to perform peak normalization on our audio data, amplifying the signal by a constant gain so that its loudest sample sits at the upper limit. To peak normalize audio data, we first find the largest absolute sample value (the peak), divide the upper limit (for 16-bit PCM data, this is 32767) by that peak to get a gain factor, and then multiply every sample by this gain. Adding a constant offset instead would only shift the waveform (a DC offset) without making it any louder.

    So, to normalize our audio data, first scan through it to find the peak magnitude:

    byte[] input = ar.ReadData();  // the function you used above
    float biggest = 0F;            // largest absolute sample value seen so far
    float sample;
    for (int i = 0; i < input.Length; i += 2)
    {
        // Compare magnitudes so that a loud negative sample counts as a peak too.
        sample = Math.Abs((float)GetShortFromLittleEndianBytes(input, i));
        if (sample > biggest) biggest = sample;
    }
    

    At this point, biggest contains the largest magnitude in our audio data. Now to perform the actual normalization, we divide 32767 by biggest to get the gain that brings the loudest sample exactly to full scale. Next we multiply each audio sample by this gain, effectively increasing the volume of each sample until our loudest sample is at the peak value.

    // Guard against pure silence (biggest == 0) to avoid dividing by zero.
    float gain = biggest > 0F ? 32767F / biggest : 1F;
    
    float[] data = new float[input.Length / 2];
    for (int i = 0; i < input.Length; i += 2)
    {
        data[i / 2] = (float)GetShortFromLittleEndianBytes(input, i) * gain;
    }
    

    The last step is to convert the samples from floating-point to integer values, and store them as little-endian shorts.

    byte[] output = new byte[input.Length];
    for (int i = 0; i < output.Length; i += 2)
    {
        byte[] tmp = GetLittleEndianBytesFromShort(Convert.ToInt16(data[i / 2]));
        output[i] = tmp[0];
        output[i + 1] = tmp[1];
    }
    

    And we're done! Now you can send the output byte array, which contains the normalized PCM data, to your audio player.

    As a final note, keep in mind that this code isn't the most efficient; you could combine several of these loops, and you could probably use Buffer.BlockCopy() for the array copying, as well as modifying your short to byte[] helper function to take a byte array as a parameter and copy the value directly into the array. I didn't do any of this so as to make it easier to see what's going on.
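
    A rough sketch of what such a combined version could look like is below. It scales each sample by a gain factor (full scale divided by the peak magnitude), which is how peak normalization is usually defined, and writes the result straight into the output array. NormalizePeak16 is a hypothetical name, the input bytes are made-up data, and the code assumes C# 9 top-level statements.

```csharp
using System;

// Peak-normalizes interleaved 16-bit little-endian PCM in two passes.
byte[] NormalizePeak16(byte[] input)
{
    // Pass 1: find the largest absolute sample value.
    int peak = 0;
    for (int i = 0; i + 1 < input.Length; i += 2)
    {
        int sample = (short)(input[i] | (input[i + 1] << 8));
        peak = Math.Max(peak, Math.Abs(sample));
    }

    // Start from a copy so silence (or a stray trailing byte) passes through unchanged.
    byte[] output = (byte[])input.Clone();
    if (peak == 0) return output;

    // Pass 2: scale every sample and store it as a little-endian short.
    float gain = 32767f / peak;
    for (int i = 0; i + 1 < input.Length; i += 2)
    {
        int sample = (short)(input[i] | (input[i + 1] << 8));
        int scaled = (int)Math.Round(sample * gain);
        scaled = Math.Max(short.MinValue, Math.Min(short.MaxValue, scaled));
        output[i] = (byte)scaled;             // low byte
        output[i + 1] = (byte)(scaled >> 8);  // high byte
    }
    return output;
}

byte[] quiet = { 0x00, 0x10, 0x00, 0xF0 };       // samples 4096 and -4096
byte[] loud = NormalizePeak16(quiet);
Console.WriteLine(BitConverter.ToString(loud));  // FF-7F-01-80
```

    The two samples come out as 32767 and -32767, so the relative shape of the signal is preserved and only the level changes.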

    And as I mentioned before, you should absolutely read up on dithering, as it will vastly improve the quality of your audio output.
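
    To give a flavour of what dithering involves, here is a very small sketch of one common variant, triangular (TPDF) dither, applied while rounding a float to a short. This is only one of several approaches and not a tuned implementation; QuantizeWithDither is a hypothetical helper, the seed and sample values are arbitrary, and the code assumes C# 9 top-level statements.

```csharp
using System;

// Quantizes a float sample (already scaled to the 16-bit range) to a short,
// adding triangular dither of up to +/-1 least significant bit before rounding.
short QuantizeWithDither(float value, Random rng)
{
    // The difference of two uniform values in [0, 1) has a triangular distribution in (-1, 1).
    double dither = rng.NextDouble() - rng.NextDouble();
    double rounded = Math.Round(value + dither);
    // Clamp so the added noise can never push a sample out of range.
    rounded = Math.Max(short.MinValue, Math.Min(short.MaxValue, rounded));
    return (short)rounded;
}

var random = new Random(12345);  // fixed seed only to make runs repeatable
float[] scaled = { 1000.4f, -2000.6f, 32767f };
foreach (float s in scaled)
{
    Console.WriteLine(QuantizeWithDither(s, random));
}
```

    Each output stays within about one and a half steps of its input; the point of the noise is to decorrelate the rounding error from the signal rather than to reduce it.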

    I've been working on an audio project myself, so I figured all this out through some trial-and-error; I hope it helps somebody somewhere.

  • 2021-02-04 22:19
    if (Math.Abs(sample) > biggest) biggest = sample;
    

    I would change this to:

    if (Math.Abs(sample) > biggest) biggest = Math.Abs(sample);
    

    Otherwise, if the sample with the largest magnitude is negative, you will multiply all values by a negative number.

  • 2021-02-04 22:22

    You can use Buffer.BlockCopy like this:

    float[] floats = new float[] { 0.43f, 0.45f, 0.47f };
    byte[] result = new byte[sizeof(float) * floats.Length];
    Buffer.BlockCopy(floats, 0, result, 0, result.Length);
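
    The same call works in the other direction. The sketch below copies the bytes back into a float array to show the round trip; it assumes C# 9 top-level statements, and the variable names are only illustrative.

```csharp
using System;

float[] floats = { 0.43f, 0.45f, 0.47f };

// float[] -> byte[]: the count argument of Buffer.BlockCopy is in bytes.
byte[] bytes = new byte[sizeof(float) * floats.Length];
Buffer.BlockCopy(floats, 0, bytes, 0, bytes.Length);

// byte[] -> float[]: the same call with source and destination swapped.
float[] restored = new float[bytes.Length / sizeof(float)];
Buffer.BlockCopy(bytes, 0, restored, 0, bytes.Length);

Console.WriteLine(bytes.Length);              // 12
Console.WriteLine(restored[1] == floats[1]);  // True
```

    Buffer.BlockCopy copies raw memory, so the bytes are in the host's byte order, just as with BitConverter.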
    
  • 2021-02-04 22:29

    You can change temp to a list of byte arrays to avoid overwriting it all the time.

        byte[] data = new byte[] { 1, 3, 5, 7, 9 };  // sample data
        IList<byte[]> temp = new List<byte[]>(data.Length);
        float biggest = 0;
    
        for (int i = 0; i < data.Length; i++)
        {
            if (data[i] > biggest)
                biggest = data[i];
        }
    
        for (int i = 0; i < data.Length; i++)
        {
            temp.Add(BitConverter.GetBytes(data[i] * (1 / biggest)));
        }
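
    If the data is 16-bit PCM rather than single bytes, the [-1, 1] range from the question is commonly reached by dividing each sample by 32768. The following is a sketch under that assumption; ToFloats and ToShorts are hypothetical helper names, the sample values are made up, and the code assumes C# 9 top-level statements.

```csharp
using System;

// Maps 16-bit samples to floats in [-1, 1) by dividing by 32768.
float[] ToFloats(short[] samples)
{
    float[] result = new float[samples.Length];
    for (int i = 0; i < samples.Length; i++)
    {
        result[i] = samples[i] / 32768f;
    }
    return result;
}

// Maps floats back to 16-bit samples, clamping anything outside the valid range.
short[] ToShorts(float[] values)
{
    short[] result = new short[values.Length];
    for (int i = 0; i < values.Length; i++)
    {
        double scaledValue = Math.Round(values[i] * 32768.0);
        result[i] = (short)Math.Max(short.MinValue, Math.Min(short.MaxValue, scaledValue));
    }
    return result;
}

short[] pcm = { 0, 16384, -32768, 32767 };
float[] normalized = ToFloats(pcm);
short[] roundTrip = ToShorts(normalized);
Console.WriteLine(string.Join(", ", roundTrip));  // 0, 16384, -32768, 32767
```

    Note that this only rescales the representation; it does not make quiet audio louder the way peak normalization does.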
    