Question
I need to merge multiple video files (with included audio) into a single video. I noticed that xfade was recently released and tried it, but I am running into an audio sync issue.
All videos have the same format, resolution, frame rate, bitrate, etc., for both video and audio.
Here is what I am using to merge 5 videos of various durations with 0.5 crossfade transitions:
ffmpeg \
-i v0.mp4 \
-i v1.mp4 \
-i v2.mp4 \
-i v3.mp4 \
-i v4.mp4 \
-filter_complex \
"[0][1]xfade=transition=fade:duration=0.5:offset=3.5[V01]; \
[V01][2]xfade=transition=fade:duration=0.5:offset=32.75[V02]; \
[V02][3]xfade=transition=fade:duration=0.5:offset=67.75[V03]; \
[V03][4]xfade=transition=fade:duration=0.5:offset=98.75[video]; \
[0:a][1:a]acrossfade=d=0.5:c1=tri:c2=tri[A01]; \
[A01][2:a]acrossfade=d=0.5:c1=tri:c2=tri[A02]; \
[A02][3:a]acrossfade=d=0.5:c1=tri:c2=tri[A03]; \
[A03][4:a]acrossfade=d=0.5:c1=tri:c2=tri[audio]" \
-vsync 0 -map "[video]" -map "[audio]" out.mp4
The command above generates a video with audio. The first and second segments are aligned with the audio, but starting with the second transition the sound is misaligned.
Answer 1:
Your offsets are incorrect. Try:
ffmpeg -i v0.mp4 -i v1.mp4 -i v2.mp4 -i v3.mp4 -i v4.mp4 -filter_complex \
"[0][1]xfade=transition=fade:duration=0.5:offset=3.5[V01]; \
[V01][2]xfade=transition=fade:duration=0.5:offset=12.1[V02]; \
[V02][3]xfade=transition=fade:duration=0.5:offset=15.1[V03]; \
[V03][4]xfade=transition=fade:duration=0.5:offset=22.59,format=yuv420p[video]; \
[0:a][1:a]acrossfade=d=0.5:c1=tri:c2=tri[A01]; \
[A01][2:a]acrossfade=d=0.5:c1=tri:c2=tri[A02]; \
[A02][3:a]acrossfade=d=0.5:c1=tri:c2=tri[A03]; \
[A03][4:a]acrossfade=d=0.5:c1=tri:c2=tri[audio]" \
-map "[video]" -map "[audio]" -movflags +faststart out.mp4
How to get the xfade offset values:

input | input duration | + | previous xfade offset | - | xfade duration | = offset
---|---|---|---|---|---|---
v0.mp4 | 4.00 | + | 0 | - | 0.5 | 3.5
v1.mp4 | 9.19 | + | 3.5 | - | 0.5 | 12.1
v2.mp4 | 3.41 | + | 12.1 | - | 0.5 | 15.1
v3.mp4 | 7.99 | + | 15.1 | - | 0.5 | 22.59
See xfade and acrossfade filter documentation for more info.
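The rule in the table above is just a running sum, so it can be expressed as a short helper. This is a sketch of my own (the function name and the example durations taken from the table are not from the answer); note that with the exact durations the second offset comes out as 12.19, which the answer appears to have rounded down to 12.1:

```python
def xfade_offsets(durations, fade=0.5):
    """Return one xfade offset per transition (len(durations) - 1 values).

    Each offset = current input's duration + previous offset - fade duration,
    matching the table: offsets are cumulative, not per-file.
    """
    offsets = []
    prev = 0.0
    for d in durations[:-1]:  # the last input has no outgoing transition
        prev = round(d + prev - fade, 2)
        offsets.append(prev)
    return offsets

# Durations of v0..v3 from the table, plus a length for the final input v4
print(xfade_offsets([4.00, 9.19, 3.41, 7.99, 5.00]))
# → [3.5, 12.19, 15.1, 22.59]
```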
Answer 2:
Automating the process helps avoid errors in calculating the offsets. I created a Python script that does the calculation and builds the filter graph for an input list of any size:
https://gist.github.com/royshil/369e175960718b5a03e40f279b131788
It will check the lengths of the video files (with ffprobe) to figure out the right offsets.
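The ffprobe step can be done with its JSON output mode. A minimal sketch of that probe step, assuming ffprobe is on the PATH (the helper names here are mine, not from the gist); splitting command-building from parsing keeps the parsing testable without running ffprobe:

```python
import json
import subprocess

def ffprobe_cmd(path):
    """Build an ffprobe invocation that prints only the container duration as JSON."""
    return ["ffprobe", "-v", "error",
            "-show_entries", "format=duration",
            "-of", "json", path]

def parse_duration(stdout):
    """Extract the duration (seconds, float) from ffprobe's JSON output."""
    return float(json.loads(stdout)["format"]["duration"])

def probe_duration(path):
    """Run ffprobe on a file and return its duration in seconds."""
    out = subprocess.run(ffprobe_cmd(path), capture_output=True,
                         text=True, check=True)
    return parse_duration(out.stdout)
```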
The crux of the matter is building the filter graph and calculating the offsets:
# Prepare the filter graph
video_fades = ""
audio_fades = ""
last_fade_output = "0:v"
last_audio_output = "0:a"
video_length = 0

for i in range(len(segments) - 1):
    # Video graph: chain the xfade operators together
    video_length += file_lengths[i]
    next_fade_output = "v%d%d" % (i, i + 1)
    video_fades += "[%s][%d:v]xfade=duration=0.5:offset=%.3f[%s]; " % \
        (last_fade_output, i + 1, video_length - 1, next_fade_output)
    last_fade_output = next_fade_output

    # Audio graph: chain the acrossfade operators together
    next_audio_output = "a%d%d" % (i, i + 1)
    audio_fades += "[%s][%d:a]acrossfade=d=1[%s]%s " % \
        (last_audio_output, i + 1, next_audio_output,
         ";" if (i + 1) < len(segments) - 1 else "")
    last_audio_output = next_audio_output
It may produce a filter graph such as
[0:v][1:v]xfade=duration=0.5:offset=42.511[v01];
[v01][2:v]xfade=duration=0.5:offset=908.517[v12];
[v12][3:v]xfade=duration=0.5:offset=1098.523[v23];
[v23][4:v]xfade=duration=0.5:offset=1234.523[v34];
[v34][5:v]xfade=duration=0.5:offset=2375.523[v45];
[v45][6:v]xfade=duration=0.5:offset=2472.526[v56];
[v56][7:v]xfade=duration=0.5:offset=2659.693[v67];
[0:a][1:a]acrossfade=d=1[a01];
[a01][2:a]acrossfade=d=1[a12];
[a12][3:a]acrossfade=d=1[a23];
[a23][4:a]acrossfade=d=1[a34];
[a34][5:a]acrossfade=d=1[a45];
[a45][6:a]acrossfade=d=1[a56];
[a56][7:a]acrossfade=d=1[a67]
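For readers who want to try the logic without ffprobe, the loop above can be wrapped into a self-contained function and exercised on made-up durations (the function name and test durations below are mine, not from the gist):

```python
def build_filter_graph(file_lengths):
    """Mirror of the gist's loop: returns (video_fades, audio_fades) strings."""
    video_fades = ""
    audio_fades = ""
    last_fade_output = "0:v"
    last_audio_output = "0:a"
    video_length = 0
    n = len(file_lengths)
    for i in range(n - 1):
        # xfade offset: cumulative length so far, minus 1s for the crossfade
        video_length += file_lengths[i]
        next_fade_output = "v%d%d" % (i, i + 1)
        video_fades += "[%s][%d:v]xfade=duration=0.5:offset=%.3f[%s]; " % \
            (last_fade_output, i + 1, video_length - 1, next_fade_output)
        last_fade_output = next_fade_output
        # acrossfade chain; no trailing ";" after the last filter
        next_audio_output = "a%d%d" % (i, i + 1)
        audio_fades += "[%s][%d:a]acrossfade=d=1[%s]%s " % \
            (last_audio_output, i + 1, next_audio_output,
             ";" if (i + 1) < n - 1 else "")
        last_audio_output = next_audio_output
    return video_fades, audio_fades

v, a = build_filter_graph([10.0, 20.0, 5.0])
print(v)  # [0:v][1:v]xfade=...offset=9.000[v01]; [v01][2:v]xfade=...offset=29.000[v12];
print(a)  # [0:a][1:a]acrossfade=d=1[a01]; [a01][2:a]acrossfade=d=1[a12]
```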
Source: https://stackoverflow.com/questions/63553906/merging-multiple-video-files-with-ffmpeg-and-xfade-filter