FFMPEG amix filter volume issue with inputs of different duration
Solution 1
amix scales each input's volume by 1/n, where n is the number of active inputs. This is evaluated for each audio frame. So when an input drops out, the remaining inputs are attenuated less, hence their volumes increase.
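To make the 1/n behaviour concrete, here is a minimal Python sketch of an idealised model. It ignores the smoothing that dropout_transition applies, and the three input spans are hypothetical values chosen only for illustration.

```python
def active_inputs(t, spans):
    """Count inputs whose [start, end) span covers time t (in seconds)."""
    return sum(1 for start, end in spans if start <= t < end)


def amix_gain(t, spans):
    """Idealised per-input gain 1/n at time t; 0.0 when no input is active."""
    n = active_inputs(t, spans)
    return 1.0 / n if n else 0.0


# Hypothetical inputs: all start at 0 s but end at 10 s, 20 s and 30 s.
spans = [(0, 10), (0, 20), (0, 30)]

for t in (5, 15, 25):
    n = active_inputs(t, spans)
    print(f"t={t:>2}s active={n} gain per input={amix_gain(t, spans):.3f}")
```

In this model the gain applied to the longest input rises in steps each time another input ends, which matches the "gets louder towards the end" symptom described in the question.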
Changing the dropout_transition for all earlier inputs, as suggested in other answers, is one approach, but I think it will result in coarse volume modulations. A better method is to normalize the audio after the amix.
At present, you have two options: the loudnorm or the dynaudnorm filter. The latter is much faster.
The syntax is to add it after the amix, so
[aud11][aud12]amix=inputs=13:duration=first:dropout_transition=0,dynaudnorm"
Read the documentation if you wish to tweak parameters such as maximum volume or RMS mode normalization.
Solution 2
The latest version of FFMPEG includes the normalize parameter for the amix filter, which you can use to turn off the constantly changing normalization. Here's the documentation for it.
Your amix filter string can be changed to:
[aud12]amix=inputs=13:normalize=0
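If you generate the filter string from code, a small helper along these lines may be useful. This is only a sketch: the function name amix_filter and the label list are made up for illustration, and only the amix options inputs, duration and normalize are used. Keep in mind that with normalization turned off the inputs are simply summed, so loud inputs can clip unless you lower them before or after the mix.

```python
def amix_filter(labels, duration="longest", normalize=False):
    """Build an amix filter string for the given input pad labels.

    `labels` are pad names without brackets, e.g. "0:a" or "aud1".
    `normalize=False` emits normalize=0, i.e. no per-frame 1/n scaling.
    """
    pads = "".join(f"[{label}]" for label in labels)
    return (
        f"{pads}amix=inputs={len(labels)}"
        f":duration={duration}:normalize={int(normalize)}"
    )


# Labels mirroring the question: the base track plus 12 delayed clips.
labels = ["0:a"] + [f"aud{i}" for i in range(1, 13)]
filter_string = amix_filter(labels, duration="first")
print(filter_string)
```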
Solution 3
The solution seems to be a combination of "pre-amp", or multiplication, as Maxim puts it, AND you have to set dropout_transition >= max delay + max input length (or a very high number):
amix=inputs=13:dropout_transition=1000,volume=13
Notes:
- amix has to resample to float anyway, so there is no downside to adding the volume filter (which by default resamples to float, too). And since we're using floats, there's no clipping and (almost) no loss of precision.
- H/t to @Mulvya for the analysis, but their solution is frustratingly non-mathematical.
- I was originally trying to do this with sox, which was too slow. Sox's remix filter has the -m switch which disables the 1/n adjustment.
- While faster, ffmpeg seems to be using way more memory for the same task. YMMV. I didn't test this thoroughly, because I finally settled on a small python script which uses pydub's overlay function, and only keeps the final output file and one segment in memory (whereas ffmpeg and sox seem to keep all of the segments in memory).
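The rule from Solution 3 (dropout_transition >= max delay + max input length, followed by volume=N) can be computed rather than guessed. A minimal sketch, assuming you already know each input's delay in milliseconds (as passed to adelay) and its duration in seconds; the helper name and the example numbers are hypothetical.

```python
import math


def solution3_filter(delays_ms, durations_s):
    """Return 'amix=...,volume=N' with a dropout_transition (in seconds)
    of at least max delay + max input length, rounded up."""
    if len(delays_ms) != len(durations_s):
        raise ValueError("need one delay and one duration per input")
    n = len(delays_ms)
    transition = math.ceil(max(delays_ms) / 1000 + max(durations_s))
    return f"amix=inputs={n}:dropout_transition={transition},volume={n}"


# Hypothetical inputs: delays in ms, durations in seconds.
delays_ms = [0, 18241, 40848]
durations_s = [60.0, 12.5, 8.0]
print(solution3_filter(delays_ms, durations_s))
```

With these example values the transition is 40.848 + 60 = 100.848 s, rounded up to 101.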
Solution 4
I had the same problem but found a solution!
First, the problem: I had to mix a background music file with 3 different TTS voice pieces that start with different delays. At the end, the background sound was extremely loud.
I tried the suggested answer but it did not work for me; the end volume was still much higher. So my thought was: "All inputs must have the same length, so that the same number of inputs is active in the mix at all times."
Using apad with whole_len set on all TTS inputs, in combination with the -shortest option, did the job for me.
Example call:
ffmpeg -y \
  -nostats \
  -hide_banner \
  -v quiet \
  -hwaccel auto \
  -f image2pipe \
  -i pipe:0 \
  -i bgAudio.aac \
  -i TTS1.mp3 \
  -i TTS2.mp3 \
  -i TTS3.mp3 \
  -filter_complex "[1:a]loudnorm=I=-16:TP=-1.5:LRA=11:linear=false[a0];[2:a]loudnorm=I=-16:TP=-1.5:LRA=11:linear=false:dual_mono=true,adelay=7680|7680,apad=whole_len=2346240[a1];[3:a]loudnorm=I=-16:TP=-1.5:LRA=11:linear=false:dual_mono=true,adelay=14640|14640,apad=whole_len=2346240[a2];[4:a]loudnorm=I=-16:TP=-1.5:LRA=11:linear=false:dual_mono=true,adelay=3240|3240,apad=whole_len=2346240[a3];[a0][a1][a2][a3]amix=inputs=4:dropout_transition=0,asplit=6[audio0][audio1][audio2][audio3][audio4][audio5];[0:v]format=yuv420p,split=6[1080p][720p][480p][360p][240p][144p]" \
  -map "[audio0]" -map "[1080p]" -s 1920x1080 -shortest out1080p.mp4 \
  -map "[audio1]" -map "[720p]" -s 1280x720 -shortest out720p.mp4 \
  -map "[audio2]" -map "[480p]" -s 858x480 -shortest out480p.mp4 \
  -map "[audio3]" -map "[360p]" -s 640x360 -shortest out360p.mp4 \
  -map "[audio4]" -map "[240p]" -s 426x240 -shortest out240p.mp4 \
  -map "[audio5]" -map "[144p]" -s 256x144 -shortest out144p.mp4
Hope this helps someone!
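One detail worth spelling out for Solution 4: apad's whole_len is a number of samples, not a time. A small sketch of how such a value could be derived follows. The 48 kHz sample rate is my assumption (the answer does not state it); under that assumption, 2346240 samples corresponds to 48.88 s. The helper names are hypothetical, and the loudnorm stage from the full command is left out for brevity.

```python
def apad_whole_len(duration_s, sample_rate_hz):
    """Convert a target duration in seconds to a sample count for apad=whole_len."""
    return round(duration_s * sample_rate_hz)


def delayed_padded(label_in, label_out, delay_ms, whole_len):
    """Build one branch: delay both channels, then pad to a fixed sample count."""
    return (
        f"[{label_in}]adelay={delay_ms}|{delay_ms},"
        f"apad=whole_len={whole_len}[{label_out}]"
    )


# Assumed values: a 48.88 s target length at a 48000 Hz sample rate.
whole_len = apad_whole_len(48.88, 48000)
print(whole_len)
print(delayed_padded("2:a", "a1", 7680, whole_len))
```

Padding every voice input to the same sample count keeps the number of active amix inputs constant, which is the point of this answer.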
Solution 5
The solution I've found is to specify the volume for each track in descending order and use no normalization filter afterwards.
I use this example, where I mix the same audio file at different positions:
ffmpeg -vn -i test.mp3 -i test.mp3 -i test.mp3 -filter_complex "[0]adelay=0|0,volume=3[a];[1]adelay=2000|2000,volume=2[b];[2]adelay=4000|4000,volume=1[c];[a][b][c]amix=inputs=3:dropout_transition=0" -q:a 1 -acodec libmp3lame -y amix-volume.mp3
For more details, see this image. The first track is the normal mixing; the second is the one with volumes specified; the third is the original track. As we can see, the 2nd track appears to have a normal volume.
ffmpeg -vn -i test.mp3 -i test.mp3 -i test.mp3 -filter_complex "[0]adelay=0|0[a];[1]adelay=2000|2000[b];[2]adelay=4000|4000[c];[a][b][c]amix=inputs=3:dropout_transition=0" -q:a 1 -acodec libmp3lame -y amix-no-volume.mp3
ffmpeg -vn -i test.mp3 -i test.mp3 -i test.mp3 -filter_complex "[0]adelay=0|0,volume=3[a];[1]adelay=2000|2000,volume=2[b];[2]adelay=4000|4000,volume=1[c];[a][b][c]amix=inputs=3:dropout_transition=0" -q:a 1 -acodec libmp3lame -y amix-volume.mp3
I can't really understand why amix changes the volume; anyway, I had been digging around for a while for a good solution.
Comments
-
Stan Reshetnyk about 2 years
I noticed that the ffmpeg amix filter doesn't output a good result in a specific situation. It works fine if the input files have equal duration. In that case the volume is lowered by a constant factor and could be fixed with ",volume=2".
In my case I'm using files with different durations. The resulting volume is not good. The first mixed stream ends up with the lowest volume, and the last one with the highest. You can see in the image that the volume increases linearly over time.
My command:
ffmpeg -i temp_0.mp4 -i user_2123_10.mp4 -i user_2123_3.mp4 -i user_2123_4.mp4 -i user_2123_7.mp4 -i user_2123_5.mp4 -i user_2123_1.mp4 -i user_2123_8.mp4 -i user_2123_0.mp4 -i user_2123_6.mp4 -i user_2123_9.mp4 -i user_2123_2.mp4 -i user_2123_11.mp4 -filter_complex "[1:a]adelay=34741.0[aud1]; [2:a]adelay=18241.0[aud2];[3:a]adelay=20602.0[aud3]; [4:a]adelay=27852.0[aud4];[5:a]adelay=22941.0[aud5]; [6:a]adelay=13142.0[aud6];[7:a]adelay=29810.0[aud7]; [8:a]adelay=12.0[aud8];[9:a]adelay=25692.0[aud9]; [10:a]adelay=32143.002[aud10];[11:a]adelay=16101.0[aud11]; [12:a]adelay=40848.0[aud12]; [0:a][aud1][aud2][aud3][aud4][aud5][aud6][aud7] [aud8][aud9][aud10][aud11] [aud12]amix=inputs=13:duration=first:dropout_transition=0" -vcodec copy -y temp_1.mp4
That could be fixed by applying silence at the beginning and end of each clip, then they will have same duration and volume will be at the same level.
Please suggest how I can use amix to mix many inputs and ensure a constant volume level.
-
Stan Reshetnyk almost 7 years
@Xumo that is not my code, but I guess it is because 16 bits = 65536
-
Nabi K.A.Z. over 6 years
I think the problem is that the volume of the initial sound is diminished, not that the volume of the final sound has increased. Therefore, the correct solution should preserve the volume of the end sound and amplify the volume of the initial sound. But according to your screenshot, it seems that the volume of the initial sound is kept until the end and the entire volume is low.
-
Nabi K.A.Z. over 6 years
This just increases the volume of all parts throughout the length of the audio. The problem still exists: the volume at the start is lower than at the end of the audio.
-
Gyan about 6 years
If you wish to go this route, add apad to all audio inputs except one. Use duration=shortest in amix, and then volume=N. It will save having to calculate the transition time. Plus, during the transition the individual weights are still being modulated; it will just happen slowly with a large transition time.
-
Gyan almost 6 years
This will lead to clipping as well as possibly changes in dynamics if the input has a high volume to start with.
-
Ahmad Arslan almost 6 years
If I try to change -ac, it doesn't work; it always shows me 2.
-
mr_blond almost 6 years
@Gyan yes, it leads to clipping, but clipping is just another side of mixing. That's how every audio DAW works. So I think volume={num of inputs} is a nice solution, as long as you remember that you should prepare every input for the mix.
-
parse over 4 years
I have tried to apply those options to amix but I am still getting the same volume issue: the 1st input is low and the last one is louder!
ffmpeg -i video.mp4 -i voice1.mp3 -i voice2.mp3 -i voice3.mp3 -y -filter_complex "[a:0]volume=0[a1];[1]adelay=18400|18400[a2];[2]adelay=25800|25800[a3];[3]adelay=33700|33700[a4];[a1][a2][a3][a4]amix=inputs=4:duration=first:dropout_transition=0,dynaudnorm" -ar 44100 -ac 2 -b:a 192k -acodec libmp3lame -f mp3 mix.mp3
-
Coco over 4 years
Hello, do you know how to use this code in the FFmpeg command?
-
Stan Reshetnyk over 4 years
That is C/C++ code which my colleague used to create a Linux command line program, which I later used to mix the audio files.
-
Rick Mohr over 4 years
For me dropout_transition=1000 was the key; apad and duration=shortest were not needed.
-
ultraGentle over 2 years
This worked most simply, and in the least hacky way -- instead of increasing volume after amix decreases it, just tell it not to do that in the first place! I suggest adding which specific version of ffmpeg makes this option available.