Using SoX to change the volume level of a range of time in an audio file

南方客 2021-02-15 12:24

I’d like to change the volume level of a particular time range/slice in an audio file using SoX.

Right now, I’m having to:

  1. Trim the original file three tim
2 Answers
  •  难免孤独
    2021-02-15 13:25

    For anyone who stumbles across this highly ranked thread, searching for a way to duck the middle of an audio file:

    I've been playing with SoX for ages and the method I built uses pipes to process each part without creating all those temporary files!

    The result is a single-line solution. You will need to set the timings yourself, so unless your fade timings are the same for all files, it may be useful to generate the line with an algorithm.

    I was pleased to get piping working, as I know this aspect has proved difficult for others. The command line options can be difficult to get right. However, I really didn't like the messy additional files that the alternative requires.

    By using the mix functionality, positioning each part with pad, and giving each section its own trim & fade, we can also avoid 'splice' here, which I really wasn't a fan of.


    A working single line example, tested in SoX 14.4.2 Windows:

    It fades (ducks) the audio by 6 dB at 2 seconds and returns to 0 dB at 5 seconds, using linear crossfades of 0.4 seconds:

    sox -m -t wav "|sox -V1 inputfile.wav -t wav - fade t 0 2.2 0.4" -t wav "|sox -V1 inputfile.wav -t wav - trim 1.8 fade t 0.4 3.4 0.4 gain -6 pad 1.8" -t wav "|sox -V1 inputfile.wav -t wav - trim 4.8 fade t 0.4 0 0 pad 4.8" outputfile.wav gain 9.542
    

    Let's make that a little more readable here by breaking it down into sections:

    Section 1 = full volume, Section 2 = ducked, Section 3 = full volume

    sox -m
        -t wav "|sox -V1 inputfile.wav -t wav - fade t 0 2.2 0.4" 
        -t wav "|sox -V1 inputfile.wav -t wav - trim 1.8 fade t 0.4 3.4 0.4 gain -6 pad 1.8"
        -t wav "|sox -V1 inputfile.wav -t wav - trim 4.8 fade t 0.4 0 0 pad 4.8"
        outputfile.wav gain 9.542
    

    Now, to break it down very thoroughly:

    '-m' .. says we're going to mix (this automatically reduces gain, see last parameter)

    '-t wav' .. says the piped command that follows will return a WAV (it seems the WAV header is being lost in the pipeline)

    Then.. the FIRST piped part (full volume before duck)

    '-V1' .. sets verbosity so that only errors are shown, which hides warnings. There would otherwise be a warning about not knowing the length of the output file for this section, because it is being piped out, but there should be no other warning from this operation

    then the input filename

    '-t wav' .. forces the output type

    '-' .. is the standard name for a piped output which will return to SoX command line

    'fade t 0 2.2 0.4' .. fades out the full volume section. t = linear. 0 = no fade in. Then, as we want the crossfade's halfway point to be at 2 seconds, the fade out finishes at 2.2 seconds and lasts 0.4 seconds (the fade-out position is the time at which the fade ENDS, not where it starts!)

    '-t wav' .. to advise type of next part - as above

    Then.. the SECOND piped part (the ducked section)

    '-V1' .. again, to ignore the output length warning - see above

    then the same input filename

    '-t wav' .. forces output type, as above

    '-' .. for piped output, see above

    'trim 1.8' .. because this middle section will hit the middle of the transition at 2 seconds, so (with a 0.4 second crossfade) the ducked audio file will start 0.2 seconds before that

    'fade t 0.4 3.4 0.4' .. to fade in the ducked section & fade back out again. So a 0.4 fade in. Then (the most complicated part) as the next crossfade will end at 5.2 seconds we must take that figure minus trimmed amount for this section, so 5.2-1.8=3.4 (again this is because fadeout position deals with the end timing of the fadeout)

    'gain -6' .. is the amount, in dB, by which we should duck

    'pad 1.8' .. must match the trim figure above, so that the same amount of silence is inserted at the start and this section stays in sync when the sections are mixed
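The arithmetic for the ducked section can be sketched in a few lines of Python. This is only an illustration of the reasoning above: the function name `ducked_section_timings` and its parameters are made up for this example and are not part of SoX.

```python
def ducked_section_timings(duck_start, duck_end, crossfade):
    """Return (trim, fade_stop, pad) in seconds for the ducked middle section.

    duck_start / duck_end are the halfway points of the two crossfades,
    as in the example above (2 and 5 seconds, 0.4 second crossfade).
    """
    half = crossfade / 2.0
    trim = duck_start - half              # start half a crossfade early
    fade_stop = (duck_end + half) - trim  # fade-out END, measured from the trimmed start
    pad = trim                            # silence re-inserted so the section lines up
    return round(trim, 3), round(fade_stop, 3), round(pad, 3)


print(ducked_section_timings(2.0, 5.0, 0.4))  # (1.8, 3.4, 1.8)
```

These are the same three figures used in 'trim 1.8', 'fade t 0.4 3.4 0.4' and 'pad 1.8'.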

    '-t wav' .. to advise type of next part - as above

    Then.. the THIRD piped part (return to full level)

    '-V1' .. again - see above

    then the same input filename

    '-t wav' .. to force output type, as above

    '-' .. for piped output, see above

    'trim 4.8' .. this final section will start at 5 seconds, but (with a 0.4 second crossfade) the audio will start 0.2 seconds before that

    'fade t 0.4 0 0' .. just fade in to this full volume section. No fade out

    'pad 4.8' .. must match the trim figure above, as explained above

    then the output filename

    'gain 9.542' .. looks tricky, but basically when you use "-m" to mix 3 files, SoX reduces the volume of each input to 1/3 (one third) to give headroom.

    Rather than defeating that, we boost the mixed result to 300%. We get the dB amount of 9.542 with this formula: 20*log(3)/log(10), which is the same as 20*log10(3)
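As a quick check of that figure, here is a small Python sketch. The helper name `mix_makeup_gain_db` is hypothetical; it simply evaluates the formula above for any number of mixed inputs, assuming each input is scaled to 1/n by the mix.

```python
import math


def mix_makeup_gain_db(num_inputs):
    """Gain in dB that undoes a 1/num_inputs volume reduction."""
    return 20.0 * math.log10(num_inputs)


print(round(mix_makeup_gain_db(3), 3))  # 9.542
```

If you mixed a different number of sections, you would need a different figure here (for example, about 6.021 dB for two inputs).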


    If you copy & paste the single line somewhere you can see it all easily, it's a lot less scary than the explanation!

    Final thought - I was initially concerned about whether the crossfades needed to be logarithmic rather than linear, but in my case, from listening to the results, linear has definitely given the sound I expected.
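One possible explanation for why linear works here: all three sections come from the same file, so during a crossfade the overlapping samples are identical and their gains simply add. The Python sketch below models that under simplifying assumptions (ideal linear ramps, and the 1/3 mix scaling treated as exactly cancelled by the final gain). The function `combined_gain` is purely illustrative and does not call SoX.

```python
def combined_gain(t, duck_start=2.0, duck_end=5.0, crossfade=0.4, duck_db=-6.0):
    """Net linear gain at time t (seconds) once the three sections are summed."""
    half = crossfade / 2.0
    duck = 10.0 ** (duck_db / 20.0)  # -6 dB is roughly 0.501

    def ramp(time, start):
        # Linear 0..1 ramp lasting `crossfade` seconds, beginning at `start`
        return min(1.0, max(0.0, (time - start) / crossfade))

    into_duck = ramp(t, duck_start - half)    # 1.8 s .. 2.2 s in the example
    out_of_duck = ramp(t, duck_end - half)    # 4.8 s .. 5.2 s in the example

    first = 1.0 - into_duck                          # section 1 fading out
    middle = duck * into_duck * (1.0 - out_of_duck)  # section 2 fading in, then out
    last = out_of_duck                               # section 3 fading in
    return first + middle + last


for t in (0.0, 2.0, 3.5, 5.0, 6.0):
    print(t, round(combined_gain(t), 3))
```

In this model the gain moves in a straight line from 1.0 down to about 0.5 and back, with no dip or bump at the crossover. Crossfades between unrelated signals behave differently, which is where non-linear curves are usually suggested.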

    You may like to try longer crossfades, or have the point of transition happening earlier or later but I hope that single line gives hope to anyone who thought many temporary files would be required!
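If you want to experiment with different timings, the line can be generated rather than typed. Below is a minimal Python sketch that rebuilds the exact command shown earlier from four numbers. The function `build_duck_command` and the helper `fmt` are hypothetical names for this example. It assumes file names without spaces or quote characters, and it only prints the command rather than running it.

```python
import math


def fmt(x):
    """Format a number with up to 3 decimals, dropping trailing zeros."""
    return f"{x:.3f}".rstrip("0").rstrip(".")


def build_duck_command(infile, outfile, duck_start, duck_end,
                       crossfade=0.4, duck_db=-6.0):
    """Build the three-section 'sox -m' line described above."""
    half = crossfade / 2.0
    if duck_start - half < 0 or duck_end - duck_start < crossfade:
        raise ValueError("crossfades would overlap or start before 0 seconds")

    first_stop = duck_start + half          # fade-out END of section 1
    mid_trim = duck_start - half            # section 2 trim and pad
    mid_stop = (duck_end + half) - mid_trim  # fade-out END of section 2
    last_trim = duck_end - half             # section 3 trim and pad
    makeup = 20.0 * math.log10(3)           # undo the 1/3 scaling of a 3-way mix

    x = fmt(crossfade)
    src = f"|sox -V1 {infile} -t wav -"
    parts = [
        f'"{src} fade t 0 {fmt(first_stop)} {x}"',
        f'"{src} trim {fmt(mid_trim)} fade t {x} {fmt(mid_stop)} {x} '
        f'gain {fmt(duck_db)} pad {fmt(mid_trim)}"',
        f'"{src} trim {fmt(last_trim)} fade t {x} 0 0 pad {fmt(last_trim)}"',
    ]
    joined = " ".join("-t wav " + p for p in parts)
    return f"sox -m {joined} {outfile} gain {fmt(makeup)}"


print(build_duck_command("inputfile.wav", "outputfile.wav", 2.0, 5.0))
```

With the values 2 and 5 this prints the same single line as the example above, so you can change the timings, crossfade length or duck amount and paste the result into your terminal.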

    Let me know if more clarification would help!

    [Image: Audacity waveform]
