FFMPEG audio out of sync when transcoding (demuxing) from DV

Solution 1

I've finally solved the issue. It's overkill, but it works.

I realized that if I copy the .dv to any other container, the audio and video are out of sync. When I then cut that copy down to a 1-minute segment starting at the 51st minute (-ss 51:00 -t 60), it was still out of sync.

However, when I used the same cut (-ss 51:00 -t 60) on the original .dv, it was in sync! So I wrote a script that cut the .dv file into consecutive 1-second segments and saved each one to a separate file (yes, over 3600 files per .dv). No encoding, just a stream copy to a new container (AVI). Then I used -f concat to join the tiny files into one AVI file, which was now in sync! Any gaps are inaudible! All that was left was encoding to H.264 and AAC in MP4.
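The original script was not posted, so the following Python sketch only illustrates the segment-and-concat workflow described above. The file names (`segments/seg_00000.avi`, `list.txt`, `joined.avi`, `final.mp4`), the placement of -ss/-t before -i, and the `-safe 0` option on the concat demuxer are assumptions. The bitrates are borrowed from the commands elsewhere in this thread.

```python
import subprocess
from pathlib import Path

SEG_DIR = "segments"  # hypothetical working directory for the 1-second pieces


def segment_path(start: int) -> str:
    # Zero-padded names keep the segments in playback order.
    return f"{SEG_DIR}/seg_{start:05d}.avi"


def segment_cmd(src: str, start: int) -> list[str]:
    # Cut one second from the ORIGINAL .dv and stream-copy it into an AVI container.
    return ["ffmpeg", "-y", "-ss", str(start), "-t", "1", "-i", src,
            "-c", "copy", segment_path(start)]


def concat_list(total_seconds: int) -> str:
    # One "file '...'" line per segment, as read by the concat demuxer.
    return "".join(f"file '{segment_path(s)}'\n" for s in range(total_seconds))


def concat_cmd(list_file: str = "list.txt", out: str = "joined.avi") -> list[str]:
    # Join the segments without re-encoding.
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", list_file,
            "-c", "copy", out]


def encode_cmd(src: str = "joined.avi", out: str = "final.mp4") -> list[str]:
    # Final encode to H.264 video and AAC audio in MP4.
    return ["ffmpeg", "-y", "-i", src, "-c:v", "libx264", "-b:v", "4000k",
            "-c:a", "aac", "-b:a", "128k", out]


def run_all(src: str, total_seconds: int) -> None:
    # Runs the whole pipeline; requires ffmpeg on PATH.
    Path(SEG_DIR).mkdir(exist_ok=True)
    for s in range(total_seconds):
        subprocess.run(segment_cmd(src, s), check=True)
    Path("list.txt").write_text(concat_list(total_seconds))
    subprocess.run(concat_cmd(), check=True)
    subprocess.run(encode_cmd(), check=True)
```

For a tape of roughly one hour, something like `run_all("13.dv", 3646)` would produce the 3600+ segment files mentioned above before joining them.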

I ran the script on my home server, which was grinding through the 50 .dv files for a couple of days, but now it's done!

THANK YOU ALL FOR YOUR HELP! I've learned a lot about ffmpeg and a/v in general.

Solution 2

Here are three wildcard attempts at solving this issue:

Method 1a Use system time as timestamps

ffmpeg -use_wallclock_as_timestamps 1 -i input.dv \
       -c:v libx264 -b:v 4000k -c:a aac -b:a 128k -fflags +genpts method1.ts

Method 1b Use the resampler with a flag set to inject silence where the input audio timestamps have gaps

ffmpeg -i input.dv -c:v libx264 -b:v 4000k \
       -af "aresample=async=1:first_pts=0" -c:a aac -b:a 128k -fflags +genpts method1.ts

Method 2 Merge with dummy audio

ffmpeg -i input.dv -f lavfi -i "aevalsrc=0:c=2:s=48000" \
       -filter_complex "[0:a][1:a]amerge[a]" -map 0:v -map "[a]" -c:v libx264 -b:v 4000k -c:a aac -b:a 128k -ac 2 -shortest method2.ts

Method 3 Combination of the above

ffmpeg -use_wallclock_as_timestamps 1 -i input.dv -f lavfi -use_wallclock_as_timestamps 1 -i "aevalsrc=0:c=2:s=48000" \
       -filter_complex "[0:a][1:a]amerge[a]" -map 0:v -map "[a]"  -c:v libx264 -b:v 4000k -c:a aac -b:a 128k -ac 2 -shortest method3.ts

You can test each of them for a short duration by inserting -t N, e.g. -t 20 for a 20-second test.
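If the commands are being driven from a script, the short-test variant can be generated rather than edited by hand. This is a minimal sketch: `with_duration_limit` is a hypothetical helper, and it assumes the output file is the last argument, as it is in the commands above.

```python
import shlex


def with_duration_limit(cmd: list[str], seconds: int) -> list[str]:
    # Insert "-t <seconds>" as an output option, just before the output file.
    # Assumes the output file is the final argument of the command.
    return cmd[:-1] + ["-t", str(seconds), cmd[-1]]


# Method 1b from above, written as an argument list.
method_1b = ["ffmpeg", "-i", "input.dv", "-c:v", "libx264", "-b:v", "4000k",
             "-af", "aresample=async=1:first_pts=0", "-c:a", "aac", "-b:a", "128k",
             "-fflags", "+genpts", "method1.ts"]

# Print the 20-second test command so it can be pasted into a shell.
print(shlex.join(with_duration_limit(method_1b, 20)))
```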

If any of them work, we can then proceed to wrapping the output as MP4.

Author: Wojciech

Updated on September 18, 2022

Comments

  • Wojciech
    Wojciech over 1 year

    I've been stuck with this problem for months. I have over 50 DV tapes (from an old Sony camcorder) to convert to a more modern, usable format (most likely H.264). I started by pulling the files to my PC (via FireWire) using DVGRAB. There I had two options: pulling RAW data from the DV tape, resulting in a muxed file, or demuxing it and saving to a DVI file.

    That's where the problems started. Saving to a DVI file resulted in the audio being out of sync. I thought it was a problem with DVGRAB, so I saved the RAW files (which are synced correctly) and tried to process them with ffmpeg.

    It turns out that no matter how I demux them, the audio is always out of sync. BEFORE you say anything about the sampling frequency: the audio differences are of absolutely random length. An hour-long tape can have between 0.1 and 4 seconds of audio lag at the end.

    Here's an example file that I've split into separate audio and video files to check the differences.

    # ffprobe -i ./video_conversion/13.dv 
    ffprobe version 2.8.4 Copyright (c) 2007-2015 the FFmpeg developers
      built with gcc 5.3.0 (GCC)
      configuration: --prefix=/usr --disable-debug --disable-static --disable-stripping --enable-avisynth --enable-avresample --enable-fontconfig --enable-gnutls --enable-gpl --enable-ladspa --enable-libass --enable-libbluray --enable-libdcadec --enable-libfreetype --enable-libfribidi --enable-libgsm --enable-libmodplug --enable-libmp3lame --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libopenjpeg --enable-libopus --enable-libpulse --enable-libschroedinger --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libv4l2 --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-shared --enable-version3 --enable-x11grab
      libavutil      54. 31.100 / 54. 31.100
      libavcodec     56. 60.100 / 56. 60.100
      libavformat    56. 40.101 / 56. 40.101
      libavdevice    56.  4.100 / 56.  4.100
      libavfilter     5. 40.101 /  5. 40.101
      libavresample   2.  1.  0 /  2.  1.  0
      libswscale      3.  1.101 /  3.  1.101
      libswresample   1.  2.101 /  1.  2.101
      libpostproc    53.  3.100 / 53.  3.100
    [dv @ 0x864f2a0] Detected timecode is invalid
    [dv @ 0x864f2a0] Estimating duration from bitrate, this may be inaccurate
    Input #0, dv, from './video_conversion/13.dv':
      Duration: 01:00:45.80, start: 0.000000, bitrate: 28800 kb/s
        Stream #0:0: Video: dvvideo, yuv420p, 720x576 [SAR 16:15 DAR 4:3], 28800 kb/s, 25 fps, 25 tbr, 25 tbn, 25 tbc
        Stream #0:1: Audio: pcm_s16le, 48000 Hz, stereo, s16, 1536 kb/s
    
    # ffprobe -i ./video_conversion/tmp/13.mp4
    Input #0, mov,mp4,m4a,3gp,3g2,mj2, from './video_conversion/tmp/13.mp4':
      Metadata:
        major_brand     : isom
        minor_version   : 512
        compatible_brands: isomiso2avc1mp41
        encoder         : Lavf56.40.101
      Duration: 01:00:45.80, start: 0.000000, bitrate: 5685 kb/s
        Stream #0:0(und): Video: h264 (Main) (avc1 / 0x31637661), yuv420p, 720x576 [SAR 16:15 DAR 4:3], 5683 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc (default)
        Metadata:
          handler_name    : VideoHandler
    
    # ffprobe -i ./video_conversion/tmp/13.mp3
    [mp3 @ 0x954c2a0] Skipping 0 bytes of junk at 237.
    Input #0, mp3, from './video_conversion/tmp/13.mp3':
      Metadata:
        encoder         : Lavf56.40.101
      Duration: 01:00:44.35, start: 0.023021, bitrate: 128 kb/s
        Stream #0:0: Audio: mp3, 48000 Hz, stereo, s16p, 128 kb/s
        Metadata:
          encoder         : Lavc56.60
    

    This particular one differs by 1.448 seconds. As I said the differences vary greatly.
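As a rough cross-check of that figure, the two `Duration:` values printed by ffprobe above can be subtracted directly. This is only a sketch: the printed durations are rounded to hundredths of a second, so the result is approximate, and `to_seconds` is a helper introduced here for illustration.

```python
def to_seconds(stamp: str) -> float:
    # Convert an ffprobe-style "HH:MM:SS.ss" duration into seconds.
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


video = to_seconds("01:00:45.80")  # Duration of 13.mp4 (video only)
audio = to_seconds("01:00:44.35")  # Duration of 13.mp3 (audio only)
gap = video - audio

# Express the gap in seconds, as a share of the tape, and in 25 fps frames.
print(f"gap: {gap:.2f} s")
print(f"share of video length: {gap / video:.4%}")
print(f"frames at 25 fps: {gap * 25:.0f}")
```

The rounded durations give a gap of about 1.45 seconds, in line with the 1.448 seconds quoted above.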

    As for a solution: I could just stretch the audio and combine it with the video (I've tested that), but I can't be certain that the audio will be in sync in the middle of the recording.

    I think I've pinpointed the source of this behaviour. Whenever I turn the camera on or off (to start and stop recording), the video starts just a tiny bit sooner than the audio. So the more "fragments" there are on the tape, the more these differences add up.

    How can I fix this? Is there a way to demux the audio and video with timestamps, so that after conversion they will add up correctly? Or is there any way to fill these gaps in the audio, so that both streams are the same length to begin with?

    • Gyan
      Gyan about 8 years
      What's the command to demux the raw files?
    • Wojciech
      Wojciech about 8 years
      The raw .dv file is multiplexed by its nature. FFMPEG demuxes it by default when converting it to any container.
    • Gyan
      Gyan about 8 years
      Ok, rather, what's your conversion command? I forgot you're transcoding.
    • Wojciech
      Wojciech about 8 years
      I've tried a dozen combinations. Nothing special though: avconv -f dv -i ./46raw.dv -f mp4 -acodec libvo_aacenc -b:a 256k -vcodec libx264 -b:v 4000k -y ./46raw.aac.mp4
    • Gyan
      Gyan about 8 years
      avconv != ffmpeg. If it's just an offset issue, you can use -af adelay=1000|1000 where 1000 is delay in ms.
    • Wojciech
      Wojciech about 8 years
      Typing error. I'm using ffmpeg on one machine and avconv on the other. Either way it doesn't work. If it were an offset delay I wouldn't ask this question -_- It's a difference in length of the audio track of about 0.1-4s on a 3600-3700s long video.
    • Gyan
      Gyan about 8 years
      Add -copyts as an output flag and try. Check via playback for the sync and not the duration, as this flag will not pad the audio to equalize the duration. Also, unless you're using a version older than Dec '15, the internal AAC encoder is now stable and better than the VO encoder.
    • Wojciech
      Wojciech about 8 years
      Made two files, with and without -copyts. No difference. Still lagging. :( Any other ideas ?
    • Gyan
      Gyan about 8 years
      The raw files have good sync, right? How are you playing those?
    • Wojciech
      Wojciech about 8 years
      Obviously the raw files are good. If those were bad my question wouldn't make much sense. Played in mplayer they work fine. Any attempt to demux the audio and video streams, even a "copy", and putting them back into any container results in it being out of sync. The error gets bigger along the video length and reaches the 0.1-4s shift near the end.
    • Gyan
      Gyan about 8 years
      Wrap the raw to an AVI and check: ffmpeg -f dv -i ./46raw.dv -c copy -map 0 -y ./46raw.avi
    • Wojciech
      Wojciech about 8 years
      I wanted to answer right away, but for the sake of integrity I've checked it. Sorry... still out of sync. The problem lies in demuxing the streams.
    • Gyan
      Gyan about 8 years
      Can you lop off the first, say, 10 seconds of the raw file and share it? You'll have to use dd or something like it.
    • Gyan
      Gyan about 8 years
      Also, looks like the DV demuxer does not play well with missing or bad audio. Drop a line to @rhatr on twitter. He's one of the coders of the DV demuxer code.
    • Wojciech
      Wojciech about 8 years
      Well... Those are family videos belonging to my sister, so I don't really feel good about sharing them. Maybe I'll find a neutral fragment. Thanks for pointing me to one of the devs. I don't use twitter, but I guess I'll have to.
    • Gyan
      Gyan about 8 years
      Did you make progress?
    • Wojciech
      Wojciech about 8 years
      I've contacted Roman (@rhatr) and sent him a sample of the video. He struggled with it for over a week, but to no avail :( I'm really grateful for the time he offered, but this means that the matter is complicated :/ I'll try to check if other video editing software can handle it.
  • Wojciech
    Wojciech about 8 years
    Option 2: Simple filtergraph 'amerge' was expected to have exactly 1 input and 1 output. However, it had >1 input(s) and 1 output(s). Please adjust, or use a complex filtergraph (-filter_complex) instead. Option 1. Gives a lot of errors: [aac @ 0x9160040] Queue input is backward in time [mp4 @ 0x915e1c0] Non-monotonous DTS in output stream 0:1; previous: 70000289337917, current: 70000289337250; changing to 70000289337918. This may result in incorrect timestamps in the output file. And stops after about 90MB of an unplayable output file.
  • Gyan
    Gyan about 8 years
    Now, try the 3 commands. Also, test playback with ffplay i.e. ffplay method1.ts
  • Wojciech
    Wojciech about 8 years
    Options 1a and 3 produce 90MB and 20MB files respectively with little to no video. Options 1b and 2 produce the whole video, but do not help with regards to the delay :(
  • Gyan
    Gyan about 8 years
    Doing this blindly is futile. Can you send a bit of the raw file, say, 20 seconds, or enough to observe loss of sync with your original command?
  • Gyan
    Gyan about 8 years
    This is a good workaround but doesn't actually solve the sync issue since each DV to AVI wrapping is subject to the same error that you had when copying the whole .dv to .avi. What this workaround does is prevent the tiny discrepancies, if any, in each 1 second segment from cascading and accumulating since each second is a separate file. You'll still have a few of the AVIs where there's noticeable async, but those don't affect the remaining AVI segments. If you can, I'm still open to working on a short segment of the raw .dv to see if this can be accurately solved, and in one step.
  • Wojciech
    Wojciech about 8 years
    I am aware that the gaps are still there, but stretching the audio would be pretty much the same kind of solution. This is good enough for me. About the sample: there is little sense in sending a small sample, because the error is at most 3s in 1h, and that's less than 0.1%. I can't send you a whole file since these are my sister's family videos (she wouldn't approve). If I manage to get a blank tape I could make a fresh sample for you to work with (filming a movie on a TV would give you a good sync reference).
  • Gyan
    Gyan about 8 years
    My desired solution won't involve stretching audio. Raw DV doesn't have timestamps, but the audio is interleaved in sync, so my tinkering would be aimed at preserving that chronological relation. If you ever get the time, I'm ready to work with a sample.
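For anyone who wants to produce the kind of short raw sample requested in the comments (the first 10 or 20 seconds, cut with dd or similar), the byte count can be worked out from the frame size. This sketch assumes PAL DV at 144,000 bytes per frame and 25 frames per second, which matches the 28800 kb/s bitrate reported by ffprobe above (NTSC DV frames are 120,000 bytes instead). `copy_head` is a hypothetical stand-in for dd.

```python
PAL_DV_FRAME_BYTES = 144_000  # assumed: one PAL DV frame, audio interleaved
PAL_FPS = 25


def sample_size_bytes(seconds: int) -> int:
    # Whole frames only, so the sample ends on a frame boundary.
    return seconds * PAL_FPS * PAL_DV_FRAME_BYTES


def copy_head(src: str, dst: str, seconds: int) -> int:
    # Copy the first `seconds` of a raw PAL .dv file; returns bytes written.
    remaining = sample_size_bytes(seconds)
    written = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while remaining > 0:
            block = fin.read(min(PAL_DV_FRAME_BYTES, remaining))
            if not block:  # source is shorter than the requested sample
                break
            fout.write(block)
            written += len(block)
            remaining -= len(block)
    return written
```

A 10-second sample comes to 36,000,000 bytes, so `copy_head("46raw.dv", "sample.dv", 10)` would be roughly equivalent to a dd call with a 144,000-byte block size and a count of 250.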