Reading a single channel from a multi-channel wav file

误落风尘 2021-02-12 18:58

I need to extract the samples of a single channel from a wav file that will contain up to 12 (11.1 format) channels. I know that within a normal stereo file samples are interlea

1 answer
    小鲜肉 (original poster)
    2021-02-12 19:40

    Microsoft has defined a standard that covers up to 18 channels. Under it, the wav file's format chunk must use the "Extensible Format" layout (WAVE_FORMAT_EXTENSIBLE), which includes a "channel mask" field (dwChannelMask). This field is 4 bytes long (a uint) and has one bit set for each channel that is present, so it indicates which of the 18 channels are used within the file.

    The Master Channel Layout

    Below is the MCL, that is, the order in which existing channels should be interleaved, along with the bit value for each channel. If a channel is not present, the next channel that is present "drops down" into the place of the missing channel and takes its order number, but never its bit value. (Bit values are unique to each channel, whether or not the channel exists in the file.)

    Order | Bit | Channel
    
     1.     0x1  Front Left
     2.     0x2  Front Right
     3.     0x4  Front Center
     4.     0x8  Low Frequency (LFE)
     5.    0x10  Back Left (Surround Back Left)
     6.    0x20  Back Right (Surround Back Right)
     7.    0x40  Front Left of Center
     8.    0x80  Front Right of Center
     9.   0x100  Back Center
    10.   0x200  Side Left (Surround Left)
    11.   0x400  Side Right (Surround Right)
    12.   0x800  Top Center
    13.  0x1000  Top Front Left
    14.  0x2000  Top Front Center
    15.  0x4000  Top Front Right
    16.  0x8000  Top Back Left
    17. 0x10000  Top Back Center
    18. 0x20000  Top Back Right
    

    For example, if the channel mask is 0x63F (1599), this would indicate that the file contains 8 channels (FL, FR, FC, LFE, BL, BR, SL & SR).
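
    As a rough illustration of how a mask maps to channels, the sketch below walks the 18 bits in MCL order and lists the ones that are set. The class name ChannelMaskDemo, the method DescribeMask and the channel abbreviations are made up for this example.

```csharp
using System;
using System.Collections.Generic;

public static class ChannelMaskDemo
{
    // Channel abbreviations in Master Channel Layout order; index n corresponds to bit (1 << n).
    private static readonly string[] MclNames =
    {
        "FL", "FR", "FC", "LFE", "BL", "BR", "FLoC", "FRoC", "BC",
        "SL", "SR", "TC", "TFL", "TFC", "TFR", "TBL", "TBC", "TBR"
    };

    // Returns the abbreviations of the channels whose bits are set, in interleave order.
    public static List<string> DescribeMask(uint mask)
    {
        var present = new List<string>();

        for (var bit = 0; bit < MclNames.Length; bit++)
        {
            if ((mask & (1u << bit)) != 0)
            {
                present.Add(MclNames[bit]);
            }
        }

        return present;
    }
}
```

    Calling ChannelMaskDemo.DescribeMask(0x63F) yields the eight entries FL, FR, FC, LFE, BL, BR, SL, SR, in that order.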

    Reading and checking the Channel Mask

    To get the mask, read bytes 40 to 43 of the file (zero-based). This assumes a standard wav header in which the "fmt " chunk immediately follows the 12-byte RIFF header. For example:

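    // Requires: using System; using System.IO;
    var bytes = new byte[50];
    
    using (var stream = new FileStream("filepath...", FileMode.Open))
    {
        // Assumes a single Read call returns all 50 header bytes.
        stream.Read(bytes, 0, 50);
    }
    
    // dwChannelMask is stored little-endian; BitConverter uses the machine's byte order,
    // which matches on the usual little-endian hosts.
    var speakerMask = BitConverter.ToUInt32(bytes, 40);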
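
    The fixed offset only holds when "fmt " is the first chunk. As a hedged alternative, the sketch below scans the RIFF chunks for "fmt " and reads the mask only when the format tag is 0xFFFE (WAVE_FORMAT_EXTENSIBLE). The names WavHeader and ReadChannelMask are hypothetical, and a seekable stream is assumed.

```csharp
using System;
using System.IO;
using System.Text;

public static class WavHeader
{
    // Returns dwChannelMask, or null when no extensible "fmt " chunk is found.
    public static uint? ReadChannelMask(Stream stream)
    {
        using (var reader = new BinaryReader(stream, Encoding.ASCII, leaveOpen: true))
        {
            if (ReadTag(reader) != "RIFF") return null;
            reader.ReadUInt32(); // overall RIFF size, not needed here
            if (ReadTag(reader) != "WAVE") return null;

            while (stream.Position + 8 <= stream.Length)
            {
                var id = ReadTag(reader);
                var size = reader.ReadUInt32();
                var next = stream.Position + size + (size & 1); // chunks are padded to even sizes

                if (id == "fmt ")
                {
                    if (size < 40) return null;                      // too short to be extensible
                    if (reader.ReadUInt16() != 0xFFFE) return null;  // not WAVE_FORMAT_EXTENSIBLE
                    stream.Seek(18, SeekOrigin.Current);             // skip to chunk offset 20
                    return reader.ReadUInt32();                      // BinaryReader is little-endian
                }

                stream.Position = next;
            }

            return null;
        }
    }

    // Reads a four-character chunk identifier.
    private static string ReadTag(BinaryReader reader)
    {
        return Encoding.ASCII.GetString(reader.ReadBytes(4));
    }
}
```

    A null result is the case handled further down, where no channel mask exists and one has to be guessed from the channel count.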
    

    Then, you need to check if the desired channel actually exists. To do this, I'd suggest creating an enum (defined with [Flags]) that contains all the channels (and their respective values).

    [Flags]
    public enum Channels : uint
    {
        FrontLeft = 0x1,
        FrontRight = 0x2,
        FrontCenter = 0x4,
        Lfe = 0x8,
        BackLeft = 0x10,
        BackRight = 0x20,
        FrontLeftOfCenter = 0x40,
        FrontRightOfCenter = 0x80,
        BackCenter = 0x100,
        SideLeft = 0x200,
        SideRight = 0x400,
        TopCenter = 0x800,
        TopFrontLeft = 0x1000,
        TopFrontCenter = 0x2000,
        TopFrontRight = 0x4000,
        TopBackLeft = 0x8000,
        TopBackCenter = 0x10000,
        TopBackRight = 0x20000
    }
    

    And then finally check if the channel is present.
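
    The check itself is a bitwise AND between the mask and the channel's bit value. A minimal sketch follows; the helper name HasChannel is made up, and the enum is an abbreviated copy of the one above so that the block compiles on its own.

```csharp
using System;

[Flags]
public enum Channels : uint
{
    // Abbreviated copy of the enum above, limited to the members this sketch uses.
    FrontLeft = 0x1,
    FrontLeftOfCenter = 0x40,
    SideLeft = 0x200
}

public static class ChannelCheck
{
    // True when the channel's bit is set in the mask read from the header.
    public static bool HasChannel(uint speakerMask, Channels channel)
    {
        return (speakerMask & (uint)channel) != 0;
    }
}
```

    With the mask 0x63F, HasChannel returns true for Channels.SideLeft and false for Channels.FrontLeftOfCenter.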

    What if the Channel Mask doesn't exist?

    Create one yourself! Based on the file's channel count you will either have to guess which channels are used, or just blindly follow the MCL. In the code snippet below we do a bit of both:

    public static uint GetSpeakerMask(int channelCount)
    {
        // Assume setup of: FL, FR, FC, LFE, BL, BR, SL & SR. Otherwise MCL will use: FL, FR, FC, LFE, BL, BR, FLoC & FRoC.
        if (channelCount == 8)
        {
            return 0x63F; 
        }
    
        // Otherwise follow MCL (requires using System.Linq; supports at most 18 channels).
        uint mask = 0;
        var channels = Enum.GetValues(typeof(Channels)).Cast<uint>().ToArray();
    
        for (var i = 0; i < channelCount; i++)
        {
            mask |= channels[i];
        }
    
        return mask;
    }
    

    Extracting the samples

    To actually read samples of a particular channel, you do exactly the same as if the file were stereo, that is, you increment your loop's counter by frame size (in bytes).

    frameSize = (bitDepth / 8) * channelCount
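
    To make the arithmetic concrete, here is a small sketch of the frame size formula and of where a given frame starts inside the audio data. FrameMath, FrameSize and FrameStart are hypothetical names used only for this illustration.

```csharp
public static class FrameMath
{
    // Size in bytes of one frame, i.e. one sample for every channel.
    public static int FrameSize(int bitDepth, int channelCount)
    {
        return (bitDepth / 8) * channelCount;
    }

    // Byte position, relative to the start of the audio data, of the first byte of a frame.
    public static long FrameStart(long frameIndex, int bitDepth, int channelCount)
    {
        return frameIndex * FrameSize(bitDepth, channelCount);
    }
}
```

    For a 16-bit file with 8 channels the frame size is 16 bytes, so frame 100 starts at byte 1600 of the audio data; for a 24-bit file with 12 channels the frame size is 36 bytes.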
    

    You also need to offset your loop's starting index. This is where things become more complicated, as you have to start reading at the channel's position among the existing channels, times byte depth. With the 1-based order numbers used in the tables here, that offset is (order number - 1) * byte depth.

    What do I mean by "based on existing channels"? Well, you need to reassign the existing channels' order numbers starting from 1, incrementing the order for each channel that is present. For example, the channel mask 0x63F indicates that the FL, FR, FC, LFE, BL, BR, SL & SR channels are used, so the new order numbers for those channels look like this (note that the bit values are not changed, and never should be):

    Order | Bit | Channel
    
     1.     0x1  Front Left
     2.     0x2  Front Right
     3.     0x4  Front Center
     4.     0x8  Low Frequency (LFE)
     5.    0x10  Back Left (Surround Back Left)
     6.    0x20  Back Right (Surround Back Right)
     7.   0x200  Side Left (Surround Left)
     8.   0x400  Side Right (Surround Right)
    

    You'll notice that the FLoC, FRoC & BC are all missing, therefore the SL & SR channels "drop down" into the next lowest available order numbers, rather than using the SL & SR's default order (10, 11).
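
    One way to compute this "dropped down" position directly from the mask is to count the set bits below the channel's own bit. The sketch below does that and returns a zero-based position; ChannelOrder and IndexInFrame are hypothetical names, and channelBit is assumed to have exactly one bit set.

```csharp
public static class ChannelOrder
{
    // Zero-based interleave position of channelBit within speakerMask, or -1 if the channel is absent.
    public static int IndexInFrame(uint speakerMask, uint channelBit)
    {
        if ((speakerMask & channelBit) == 0)
        {
            return -1;
        }

        // Every set bit below channelBit is a channel that is interleaved before it.
        var lower = speakerMask & (channelBit - 1);
        var count = 0;

        while (lower != 0)
        {
            lower &= lower - 1; // clear the lowest set bit
            count++;
        }

        return count;
    }
}
```

    For the mask 0x63F, Side Left (0x200) gets position 6 and Side Right (0x400) gets position 7, which correspond to order numbers 7 and 8 in the table above.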

    Summing up

    So, to read the bytes of a single channel you'd need to do something similar to this:

    // This code only returns the bytes of a particular channel. It's up to you to convert the bytes to actual samples.
    // audioBytes is assumed to hold the whole audio data block; sampleEndIndex is exclusive.
    public static byte[] GetChannelBytes(byte[] audioBytes, uint speakerMask, Channels channelToRead, int bitDepth, uint sampleStartIndex, uint sampleEndIndex)
    {
        var channels = FindExistingChannels(speakerMask);
        var ch = GetChannelNumber(channelToRead, channels);
    
        if (ch < 0)
        {
            throw new ArgumentException("The requested channel is not present in the speaker mask.", nameof(channelToRead));
        }
    
        var byteDepth = bitDepth / 8;
        var chOffset = ch * byteDepth;
        var frameBytes = byteDepth * channels.Length;
        var startByteIncIndex = (long)sampleStartIndex * frameBytes;
        var endByteIncIndex = (long)sampleEndIndex * frameBytes;
        var outputBytesCount = endByteIncIndex - startByteIncIndex;
        var outputBytes = new byte[outputBytesCount / channels.Length];
        var i = 0;
    
        // j steps one frame at a time, starting at the requested channel's first byte.
        for (var j = startByteIncIndex + chOffset; j < endByteIncIndex; j += frameBytes)
        {
            for (var k = j; k < j + byteDepth; k++)
            {
                outputBytes[i] = audioBytes[k];
                i++;
            }
        }
    
        return outputBytes;
    }
    
    private static Channels[] FindExistingChannels(uint speakerMask)
    {
        var foundChannels = new List<Channels>();
    
        // Enum.GetValues returns the members in ascending bit order, which matches the MCL.
        foreach (Channels ch in Enum.GetValues(typeof(Channels)))
        {
            if ((speakerMask & (uint)ch) != 0)
            {
                foundChannels.Add(ch);
            }
        }
    
        return foundChannels.ToArray();
    }
    
    private static int GetChannelNumber(Channels input, Channels[] existingChannels)
    {
        for (var i = 0; i < existingChannels.Length; i++)
        {
            if (existingChannels[i] == input)
            {
                return i;
            }
        }
    
        return -1;
    }
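
    As for turning the returned bytes into samples, the sketch below covers one common case only: 16-bit little-endian PCM. The names SampleConversion and ToInt16Samples are hypothetical, and other bit depths or float data would need different handling.

```csharp
using System;

public static class SampleConversion
{
    // Interprets consecutive byte pairs as little-endian signed 16-bit samples.
    public static short[] ToInt16Samples(byte[] channelBytes)
    {
        var samples = new short[channelBytes.Length / 2];

        for (var i = 0; i < samples.Length; i++)
        {
            // Low byte first, then high byte; unchecked keeps the sign wrap-around explicit.
            samples[i] = unchecked((short)(channelBytes[2 * i] | (channelBytes[2 * i + 1] << 8)));
        }

        return samples;
    }
}
```

    For instance, the bytes 01 00 FF FF 00 80 decode to the samples 1, -1 and -32768.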
    
