Question
There are many questions on this topic (link1, link2, and link3). However, I am asking specifically about one probable solution and whether it has any drawbacks.
Problem Definition: the amix filter always performs "volume normalization", and this can't be turned off.
Reference: Please read the comments here by @Reino. He also opened a ticket on the FFmpeg forum to explain the situation.
Hacky Solution: amix=inputs=13:dropout_transition=1000,volume=13
Reference: Answered here, and also in the ticket.
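For clarity, here is a small sketch of how that filter string is put together for an arbitrary number of inputs. The helper name `amix_workaround_filter` is made up for illustration; only the `amix` options `inputs` and `dropout_transition` and the `volume` filter come from the solution above.

```python
def amix_workaround_filter(n_inputs, dropout_transition=1000):
    """Build the workaround filter string for n_inputs audio streams.

    Hypothetical helper: it only formats the string quoted above, where
    volume is set to the number of inputs to undo amix's 1/n scaling.
    """
    return (
        f"amix=inputs={n_inputs}"
        f":dropout_transition={dropout_transition}"
        f",volume={n_inputs}"
    )


print(amix_workaround_filter(13))
```

The resulting string would be passed to FFmpeg as the audio filter graph, for example via `-filter_complex`.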
Questions:
1) "amix scales each input's volume by 1/n where n = no. of active inputs. This is evaluated for each audio frame. So when an input drops out, the volume of the remaining inputs is scaled by a smaller amount, hence their volumes increase." (Reference)
For example, if I am merging 10 audio streams, then the 1st audio stream will be scaled by 1/10, the 2nd by 1/9, the 3rd by 1/8 ... the 9th by 1/2 and the 10th by 1. Have I understood this correctly, or am I missing something?
2) dropout_transition: The transition time, in seconds, for volume renormalization when an input stream ends. The default value is 2 seconds.
dropout_transition means it will SKIP the given number of seconds, right? So if I set dropout_transition=1000 (a very large number), then regardless of the video length FFmpeg will drop/skip the audio transition for that many seconds. Again, please correct me if I have made a wrong assumption.
3) I have tried many other solutions without any luck, and I am now relying heavily on the solution above. Is there any drawback to this hacky solution?
Answer 1:
"if I am merging 10 audio streams then, the 1st audio stream will be scaled by 1/10, 2nd by 1/9, 3rd by 1/8 .. 9th by 1/2 and last 10th by 1."
No. Let's say you have 4 inputs, which are 10, 7, 4 and 2 seconds long respectively, and let's set dropout_transition to 0. For the first 2 seconds, all inputs are active, so each input is scaled by 1/4. From 2 to 4 seconds, 3 inputs are active, so all active inputs (#1, 2, 3) are scaled by 1/3. From 4 to 7 seconds, only inputs 1 and 2 are active, so both are scaled by 1/2. And from 7 to 10 seconds, only input 1 is active, so it is scaled by 1, i.e. its volume is unchanged.
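The scenario above can be checked with a few lines of Python. This is only a sketch of the 1/n rule as described here, not FFmpeg's actual code; the function name `active_scale` is made up for illustration, and it assumes dropout_transition is 0.

```python
def active_scale(durations, t):
    """Return the scale applied to each input at time t (in seconds).

    Assumes dropout_transition=0: every input that is still playing is
    scaled by 1/n, where n is the number of inputs that have not ended.
    Inputs that have already ended contribute nothing (scale 0.0).
    """
    n_active = sum(1 for d in durations if t < d)
    return [1.0 / n_active if t < d else 0.0 for d in durations]


durations = [10, 7, 4, 2]  # input lengths in seconds, as in the scenario
for t in (1, 3, 5, 8):
    print(t, active_scale(durations, t))
```

Note that at any given moment all active inputs share the same scale; the scale depends on how many inputs are still playing, not on the position of an input in the list.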
"dropout_transition means it will SKIP given seconds, right?"
No. Continuing with the above scenario, let's say dropout_transition is 1 second. When input 4 ends, the scaling does not change from 1/4 to 1/3 immediately; it transitions gradually over 1 second.
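To make "gradually" concrete, here is a simplified model of the transition. It assumes the divisor moves linearly from n to n - 1 over the transition time; that linear shape is an assumption made for illustration, and the exact curve FFmpeg uses may differ. The function name `divisor_after_dropout` is hypothetical.

```python
def divisor_after_dropout(n_before, elapsed, dropout_transition):
    """Divisor applied to the remaining inputs after one input ends.

    n_before: number of active inputs just before the dropout
    elapsed: seconds since the input ended
    dropout_transition: transition time in seconds

    Assumption: the divisor ramps linearly from n_before down to
    n_before - 1 during the transition, then stays there.
    """
    if dropout_transition <= 0:
        return float(n_before - 1)  # no transition: the change is immediate
    progress = min(max(elapsed / dropout_transition, 0.0), 1.0)
    return n_before - progress


# Input 4 has just ended, dropout_transition is 1 second.
for elapsed in (0.0, 0.5, 1.0, 2.0):
    print(elapsed, 1.0 / divisor_after_dropout(4, elapsed, 1.0))
```

Under this model the scale of the remaining inputs starts at 1/4 when input 4 ends and reaches 1/3 one second later. Nothing is skipped; the option only controls how long that change takes.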
"Is there any drawback to the above hacky solution?"
In most cases, it's fine. If you're combining loud pieces of music, there will be some flattening of range, but it shouldn't matter in that case.
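As a rough illustration of why the workaround behaves this way, the sketch below estimates the net gain on a remaining input after the first input ends, combining amix's 1/n scaling with the trailing `volume` multiplier. It assumes a linear ramp of the divisor from n to n - 1 and models only a single dropout; both are simplifications, and `effective_gain` is a hypothetical name.

```python
def effective_gain(n_inputs, elapsed, dropout_transition, volume):
    """Approximate net gain on a remaining input after the first dropout.

    Assumptions (illustration only): dropout_transition > 0, the amix
    divisor ramps linearly from n_inputs to n_inputs - 1, and only one
    input has ended so far.
    """
    progress = min(max(elapsed / dropout_transition, 0.0), 1.0)
    divisor = n_inputs - progress
    return volume / divisor


# 13 inputs, 60 seconds after the first one ended, volume=13 afterwards.
print(effective_gain(13, 60, 1000, 13))  # very long transition
print(effective_gain(13, 60, 2, 13))     # default 2-second transition
```

With dropout_transition=1000 the divisor has barely moved after a minute, so the net gain stays close to 1. With the default of 2 seconds the divisor has already dropped to 12, so the same `volume=13` gives a gain of 13/12.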
Source: https://stackoverflow.com/questions/62367391/ffmpeg-amix-filter-always-does-volume-normalization-how-to-prevent-it-and-wha