I have two videos of the same exact length, and I would like to use ffmpeg to stack them into one video file.
How can I do this?
For 2 videos:
ffmpeg -i 1.mp4 -i 2.mp4 -filter_complex hstack out.mp4
For more videos (3 in this example):
ffmpeg -i 1.mp4 -i 2.mp4 -i 3.mp4 -filter_complex hstack=3 out.mp4
See this answer to this question for a newer, simpler way to do this.
Old version:
You should be able to do this using the pad, movie and overlay filters in FFmpeg. The command will look something like this:
ffmpeg -i top.mov -vf 'pad=iw:2*ih [top]; movie=bottom.mov [bottom]; [top][bottom] overlay=0:main_h/2' stacked.mov
First the movie that should be on top is padded to twice its height. Then the bottom movie is loaded. Then the bottom movie is overlaid on the padded top movie at an offset of half the padded movie's height.
Use the vstack (vertical), hstack (horizontal), or xstack (custom layout) filters. They are easier and faster than other methods.
Using the vstack filter.
ffmpeg -i input0 -i input1 -filter_complex vstack=inputs=2 output
Videos must have the same width.
Using the hstack filter.
ffmpeg -i input0 -i input1 -filter_complex hstack=inputs=2 output
Videos must have the same height.
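To check these requirements before stacking, you can read each input's dimensions with ffprobe. The following is a minimal sketch, not part of the original answer: `video_size` and `stackable` are hypothetical helper names, and the ffprobe call assumes the first video stream is the one you want to stack.

```sh
# Print "WIDTHxHEIGHT" for the first video stream of a file.
# (Hypothetical helper; assumes ffprobe is installed and on PATH.)
video_size() {
    ffprobe -v error -select_streams v:0 \
        -show_entries stream=width,height -of csv=s=x:p=0 "$1"
}

# Given two sizes such as 1280x720, print which stack filters
# would accept the pair without scaling: vstack, hstack, both, or none.
stackable() {
    w0=${1%x*}; h0=${1#*x}
    w1=${2%x*}; h1=${2#*x}
    out=""
    [ "$w0" = "$w1" ] && out="vstack"                 # same width
    [ "$h0" = "$h1" ] && out="${out:+$out }hstack"    # same height
    echo "${out:-none}"
}

# Example use with real files:
#   stackable "$(video_size input0)" "$(video_size input1)"
```

If the result is `none`, one of the inputs has to be scaled first.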
Using the pad filter. This example creates a 5px black border between the two sides.
ffmpeg -i input0 -i input1 -filter_complex "[0]pad=iw+5:color=black[left];[left][1]hstack=inputs=2" output
Add the amerge filter to combine the audio channels from both inputs:
ffmpeg -i input0 -i input1 -filter_complex "[0:v][1:v]vstack=inputs=2[v];[0:a][1:a]amerge=inputs=2[a]" -map "[v]" -map "[a]" -ac 2 output
This assumes each input contains a stereo audio stream.
-ac 2 is included to downmix to stereo in case both inputs contain multi-channel audio. For example, if both inputs are stereo, you would get a 4-channel output audio stream instead of stereo if you omit -ac 2.
Use amerge (or amix) and pan filters:
ffmpeg -i input0 -i input1 -filter_complex "[0:v][1:v]vstack=inputs=2[v];[0:a][1:a]amerge=inputs=2,pan=stereo|c0<c0+c1|c1<c2+c3[a]" -map "[v]" -map "[a]" output
This example will use the audio from input1:
ffmpeg -i input0 -i input1 -filter_complex "[0:v][1:v]vstack=inputs=2[v]" -map "[v]" -map 1:a output
If you mix inputs that have audio and inputs that do not have audio then amerge will fail because each input needs audio. You can add silent audio with the anullsrc filter to prevent this:
ffmpeg -i input0 -i input1 -filter_complex "[0:v][1:v]vstack=inputs=2[v];anullsrc[silent];[0:a][silent]amerge=inputs=2[a]" -map "[v]" -map "[a]" -ac 2 output.mp4
ffmpeg -i input0 -i input1 -i input2 -filter_complex "[0:v][1:v][2:v]hstack=inputs=3[v]" -map "[v]" output
For a vertical layout, use vstack instead of hstack.
ffmpeg -i input0 -i input1 -i input2 -i input3 -filter_complex "[0:v][1:v][2:v][3:v]xstack=inputs=4:layout=0_0|w0_0|0_h0|w0_h0[v]" -map "[v]" output
ffmpeg -i input0 -i input1 -i input2 -i input3 -filter_complex "[0:v][1:v]hstack=inputs=2[top];[2:v][3:v]hstack=inputs=2[bottom];[top][bottom]vstack=inputs=2[v]" -map "[v]" output
This syntax is easier to understand, but less efficient than using xstack as shown above.
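For larger grids, writing the xstack layout string by hand gets tedious. The sketch below generates it for a grid filled row by row. It is an illustration rather than anything from the original answer: `sum_of` and `xstack_layout` are hypothetical helper names, and the layout is only correct if every input has the same dimensions as input 0.

```sh
# Print N copies of TERM joined by "+", or 0 when N is 0.
# Example: "sum_of w0 2" prints w0+w0
sum_of() {
    term=$1; n=$2; s=""
    while [ "$n" -gt 0 ]; do
        s="${s:+$s+}$term"
        n=$((n - 1))
    done
    echo "${s:-0}"
}

# Print an xstack layout for a COLS x ROWS grid, filled left to right,
# then top to bottom. Assumes all inputs are the same size as input 0.
xstack_layout() {
    cols=$1; rows=$2; layout=""
    r=0
    while [ "$r" -lt "$rows" ]; do
        c=0
        while [ "$c" -lt "$cols" ]; do
            # x offset is c widths, y offset is r heights
            cell="$(sum_of w0 "$c")_$(sum_of h0 "$r")"
            layout="${layout:+$layout|}$cell"
            c=$((c + 1))
        done
        r=$((r + 1))
    done
    echo "$layout"
}

# Example use for a 3x2 grid of six inputs:
#   ... -filter_complex "[0:v][1:v][2:v][3:v][4:v][5:v]xstack=inputs=6:layout=$(xstack_layout 3 2)[v]" ...
```

For a 2x2 grid this prints the same `0_0|w0_0|0_h0|w0_h0` layout used in the xstack command above.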
Using the drawtext filter:
ffmpeg -i input0 -i input1 -i input2 -i input3 -filter_complex \
"[0]drawtext=text='vid0':fontsize=20:x=(w-text_w)/2:y=(h-text_h)/2[v0];
[1]drawtext=text='vid1':fontsize=20:x=(w-text_w)/2:y=(h-text_h)/2[v1];
[2]drawtext=text='vid2':fontsize=20:x=(w-text_w)/2:y=(h-text_h)/2[v2];
[3]drawtext=text='vid3':fontsize=20:x=(w-text_w)/2:y=(h-text_h)/2[v3];
[v0][v1][v2][v3]xstack=inputs=4:layout=0_0|w0_0|0_h0|w0_h0[v]" \
-map "[v]" output
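The drawtext chains above differ only in the input index, so they can be generated in a loop when there are many inputs. This is a rough sketch: `label_filters` is a hypothetical helper name, and the label text `vidN` simply mirrors the example above.

```sh
# Print one drawtext chain per input, each ending in ";".
# Input i is labelled "vid<i>" and its output pad is named [v<i>].
label_filters() {
    n=$1; i=0; graph=""
    while [ "$i" -lt "$n" ]; do
        graph="${graph}[$i]drawtext=text='vid$i':fontsize=20:x=(w-text_w)/2:y=(h-text_h)/2[v$i];"
        i=$((i + 1))
    done
    echo "$graph"
}

# Example use for four inputs:
#   graph="$(label_filters 4)[v0][v1][v2][v3]xstack=inputs=4:layout=0_0|w0_0|0_h0|w0_h0[v]"
#   ffmpeg -i input0 -i input1 -i input2 -i input3 -filter_complex "$graph" -map "[v]" output
```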
Since the videos need to have the same width for vstack, and the same height for hstack, you may need to scale one of the videos to match the other:
Simple scale filter example to set width of input0 to 640 and automatically set height while preserving the aspect ratio:
ffmpeg -i input0 -i input1 -filter_complex "[0:v]scale=640:-1[v0];[v0][1:v]vstack=inputs=2" output
For a more advanced method to fit any size video into a specific size while preserving aspect ratio see Resizing videos with ffmpeg to fit into static sized player.
You can also use the scale2ref filter to automatically resize one video to match the dimensions of the other.
This example will play the top left video while pausing the others. Once the top left video ends the top right video will play and so on.
Use the tpad, adelay, xstack, and amix filters:
ffmpeg -i top-left.mp4 -i top-right.mp4 -i bottom-left.mp4 -i bottom-right.mp4 -filter_complex "[1]tpad=start_mode=clone:start_duration=5[tr];[2]tpad=start_mode=clone:start_duration=10[bl];[3]tpad=start_mode=clone:start_duration=15[br];[0][tr][bl][br]xstack=inputs=4:layout=0_0|w0_0|0_h0|w0_h0[v];[1:a]adelay=5s:all=true[a1];[2:a]adelay=10s:all=true[a2];[3:a]adelay=15s:all=true[a3];[0:a][a1][a2][a3]amix=inputs=4[a]" -map "[v]" -map "[a]" output.mp4
This example assumes each input is 5 seconds duration. Adjust start_duration and adelay values as needed.
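When the inputs have different lengths, each clip's delay is the sum of the durations of the clips that play before it. A small sketch of that arithmetic follows. `start_offsets` is a hypothetical helper name, and it assumes durations are given in whole seconds, because shell arithmetic is integer-only.

```sh
# Print the start offset, in seconds, of each clip when the clips
# play one after another. Arguments are whole-second durations in
# playback order.
start_offsets() {
    offset=0; out=""
    for d in "$@"; do
        out="${out:+$out }$offset"
        offset=$((offset + d))
    done
    echo "$out"
}

# Example: four 5-second clips give the offsets used in the command above
# (0 for top-left, then 5, 10, 15 for the start_duration/adelay values):
#   start_offsets 5 5 5 5
```

A clip's duration can be read with `ffprobe -v error -show_entries format=duration -of csv=p=0 FILE`, but that prints a fractional value, so it would need rounding before being passed to this helper.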
This command requires FFmpeg 4.2 or newer.
If you don't like the complexity of xstack you can use several hstack/vstack instead as shown in Example 4: 2x2 grid.