Web Audio API: Layout to Achieve Panning for an Arbitrary Number of Sources

只愿长相守 submitted on 2021-01-28 09:20:30


I am trying to achieve user-controlled panning for any number of simultaneous web audio sources. The sources themselves are mono. I'm working in JavaScript with the Web Audio API (https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API).

Currently, the problem I'm running into is that I'm trying to use a multi-channel output (one for each source), but the channel interpretation is overriding my attempts at panning (see https://developer.mozilla.org/en-US/docs/Web/API/AudioNode/channelInterpretation), leading me to think I'm doing something wrong architecturally.

I'd like to leave things at a mostly conceptual level here, since I believe that's where my problem lies.

Current Setup

My approach uses a single node, called 'scriptNode' here, to handle the processing for every source. It outputs one channel per audio source, and one panner node is created per channel. The bundle size (the '=' segments) is the number of channels, which equals the number of sources. The graph looks like this:

scriptNode == splitter =+-- panner1 --+= merger == destination
                        \-- panner. --/
                        \-- panner. --/
                        \-- pannerN --/

Some miscellanea: I set scriptNode up with this call:

scriptNode = firstPart.audioCtx.createScriptProcessor(2048, 0, numParts);

Here numParts is the number of sources. I'm also setting scriptNode's channelCountMode to 'explicit' and channelInterpretation to 'speakers'. One of these settings may end up being important, but fiddling with them didn't tell me anything.

The Problem

When I test my code with this architecture, I get the following behavior depending on the number of parts I choose. The panning sliders are tied to the panner node's "pan" values for each respective source.

  • numParts=1 : Mono output, panning with a slider doesn't do anything but affect the volume of the output (stronger toward the middle). I imagine this is a byproduct of downmixing to mono from a panner.
  • numParts=2 : Stereo output, one hard left, one hard right. Panning both channels with a slider doesn't do anything.
  • numParts=3 : Same as =2, but third channel is silent.
  • numParts=4 : Similar to =2, now all channels work again, they are panned hard in an L/R/L/R order. Panning with a slider again doesn't do anything.

This behavior seems to fall in line with the channelInterpretation description, but what I want is to have panning work for each source separately, regardless of the number of channels I use. And I'd still like to use channels because each of my sources expects to write to a mono buffer.
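For reference, the pattern in the list above can be reproduced on paper. The sketch below is not Web Audio code. It is a plain function that applies the 'speakers' mixing rules, as I read them in the Web Audio spec, to one frame of N merged channels arriving at a stereo destination. The name mixFrameToStereo is made up for illustration, and a 2-channel destination is assumed.

```javascript
// Mix one frame of N channels down to [left, right], following the
// 'speakers' channel interpretation for a stereo destination (assumed).
function mixFrameToStereo(frame) {
  switch (frame.length) {
    case 1: // mono: the single channel feeds both speakers equally
      return [frame[0], frame[0]];
    case 2: // stereo: passed through unchanged
      return [frame[0], frame[1]];
    case 4: // quad (L, R, SL, SR): each surround folds into its front channel
      return [0.5 * (frame[0] + frame[2]), 0.5 * (frame[1] + frame[3])];
    case 6: { // 5.1 (L, R, C, LFE, SL, SR): LFE is dropped
      const k = Math.SQRT1_2;
      return [frame[0] + k * (frame[2] + frame[4]),
              frame[1] + k * (frame[2] + frame[5])];
    }
    default: // non-standard layouts (3, 5, ...) are treated as 'discrete':
      // the first two channels are kept and the rest are dropped
      return [frame[0] ?? 0, frame[1] ?? 0];
  }
}

// Send a unit impulse through each channel in turn to see where it lands.
function impulseResponse(numChannels) {
  return Array.from({ length: numChannels }, (_, i) => {
    const frame = new Array(numChannels).fill(0);
    frame[i] = 1;
    return mixFrameToStereo(frame);
  });
}
```

With four channels, impulseResponse(4) gives [0.5, 0], [0, 0.5], [0.5, 0], [0, 0.5], which is the L/R/L/R order, and with three channels the third one comes out as [0, 0], i.e. silent. Nothing in these rules looks at what a panner did upstream, which is consistent with the sliders having no effect.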

Is there an architectural tweak I can make to keep this multi-channel strategy and achieve what I'm looking for?

Code Snippets

Relevant parts of current code based on my statements above alongside attempts to fix the problem. Edit: Thanks to the comments below, I managed to find the issue. I called out the one-line fix so this code can be used as reference later.

Audio processing function. Only the first synth (source) sets up this callback:

function customAudioProcessCallback( audioProcessingEvent ) {
    // First synth - process audio for all synths!
    const outputBuffer = audioProcessingEvent.outputBuffer;

    for ( var i = 0; i < numParts; i++ ) {
        // Each part writes to one channel.
        synthParts[ i ].synthesize(outputBuffer.getChannelData( i ), outputBuffer.length);
    }
}


Relevant snippet of the play function:

function play() {
    const contextClass = (window.AudioContext || window.webkitAudioContext || window.mozAudioContext || window.oAudioContext || window.msAudioContext);
    synthParts[ 0 ].audioCtx = new contextClass();

    synthParts[ 0 ].scriptNode = synthParts[ 0 ].audioCtx.createScriptProcessor ? synthParts[ 0 ].audioCtx.createScriptProcessor(2048, 0, numParts+1) : synthParts[ 0 ].audioCtx.createJavaScriptNode(2048, 0, numParts+1); // 2048, 0 input channels, ? outputs
    synthParts[ 0 ].scriptNode.onaudioprocess = customAudioProcessCallback;
    synthParts[ 0 ].scriptNode.channelCountMode = 'explicit';
    synthParts[ 0 ].scriptNode.channelInterpretation = 'speakers';

    // Set up splitter and panners for all channels
    var splitter = synthParts[ 0 ].audioCtx.createChannelSplitter( numParts+1 );

    for ( var i = 0; i < numParts; i++ ) {
        panList[ i ] = synthParts[ 0 ].audioCtx.createStereoPanner();
        // pan is an AudioParam, so the number goes on its value property
        panList[ i ].pan.value = panValues[ i ];
    }

    // Connection:
    // scriptNode -- splitter -+-- panner1 --+- destination
    //                         \-- panner. --/
    //                         \-- pannerN --/

    synthParts[ 0 ].scriptNode.connect(splitter);

    for ( var i = 0; i < numParts; i++ ) {
        // Output i of the splitter carries mono channel i
        splitter.connect( panList[ i ], i);

        // This line used to read:
        //    panList[ i ].connect( synthParts[ 0 ].audioCtx.destination, 0, i );
        // However, this was connecting multiple parts to the input of the audio context destination, which is limited to 1 input. The correct call is below.
        panList[ i ].connect( synthParts[ 0 ].audioCtx.destination );
    }
}



A PannerNode always produces stereo output. When you connect the panner output to one of the inputs of a merger, the stereo output from the panner is downmixed to mono, which removes much of the panning effect.
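To put rough numbers on that, here is a small sketch of the arithmetic, assuming the StereoPannerNode from the question's code with a mono input (equal-power panning) and the 'speakers' stereo-to-mono rule of 0.5 * (L + R). The function names are mine, not part of the Web Audio API.

```javascript
// Equal-power gains a StereoPannerNode applies to a mono input.
function stereoPanGains(pan) {
  const p = Math.min(1, Math.max(-1, pan)); // pan is limited to [-1, 1]
  const x = (p + 1) / 2;                    // map [-1, 1] onto [0, 1]
  return {
    left: Math.cos(x * Math.PI / 2),
    right: Math.sin(x * Math.PI / 2),
  };
}

// Level that survives when the panned stereo pair is folded back to mono,
// which is what happens at each (mono) input of a ChannelMergerNode.
function monoLevelAfterDownmix(pan) {
  const { left, right } = stereoPanGains(pan);
  return 0.5 * (left + right);
}
```

monoLevelAfterDownmix returns about 0.71 at pan 0 and 0.5 at either extreme, so after the downmix the slider only changes the level, loudest in the middle. That matches the numParts=1 observation in the question.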

Some information is missing, but I don't see why you need a merger. You can send the stereo output from each panner directly to the destination. The destination will mix the stereo output from each panner appropriately, preserving the panning effect.
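A minimal sketch of that layout, assuming a source node with one mono channel per part (the question's scriptNode, for example). The helper name connectPanners and its parameters are hypothetical. Only standard calls are used: createChannelSplitter, createStereoPanner and connect.

```javascript
// Hypothetical helper: fan a multi-channel source out to one stereo panner
// per channel and connect every panner straight to the destination.
function connectPanners(ctx, source, panValues) {
  const splitter = ctx.createChannelSplitter(panValues.length);
  source.connect(splitter);

  return panValues.map((pan, i) => {
    const panner = ctx.createStereoPanner();
    panner.pan.value = pan;          // pan is an AudioParam, so set .value
    splitter.connect(panner, i);     // splitter output i is mono channel i
    panner.connect(ctx.destination); // no indices: the destination has one input
    return panner;                   // keep it so a slider can update pan later
  });
}
```

In a page this would be called as connectPanners(audioCtx, scriptNode, panValues), and a slider handler can then set panners[i].pan.value. All connections made to the destination's single input are summed, so each source keeps its own stereo position.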

