Multiple audio tracks for HTML5 video

Solution 1

If you're willing to let all five tracks download, why not just mux them into the video? Video containers are not limited to a single audio track (even AVI supports multiple audio tracks). Syncing is then handled for you by the player, and you just enable or disable the audio tracks as needed.
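
A minimal sketch of the enable/disable step, assuming the browser exposes `audioTracks` on the video element (support is uneven, so feature-detect first). In the current HTML specification each `AudioTrack` carries a `language` string and a boolean `enabled` property. The helper name `selectAudioTrack` and the element id `movie` are made up for illustration.

```javascript
// Enable the track whose language matches and disable all others.
// `trackList` is expected to behave like an AudioTrackList: array-like,
// with `language` and `enabled` on every entry.
function selectAudioTrack(trackList, language) {
  let found = false;
  for (let i = 0; i < trackList.length; i++) {
    if (trackList[i].language === language) found = true;
  }
  // Unknown language: leave the current selection untouched.
  if (!found) return false;

  for (let i = 0; i < trackList.length; i++) {
    trackList[i].enabled = trackList[i].language === language;
  }
  return true;
}

// Browser usage (hypothetical element id; audioTracks may be undefined):
// const video = document.getElementById('movie');
// if (video.audioTracks) selectAudioTrack(video.audioTracks, 'fr');
```

Because the tracks live in one file, the player keeps them on a common timeline, so switching does not need any manual resynchronization.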

Solution 2

Synchronization between audio and video is far more complex than simply starting the audio and video at the same time. Sound cards play back at slightly different rates. (What is 44.1 kHz to me might actually be 44.095 kHz to you.)
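
To put a number on that, here is a small back-of-the-envelope calculation using the two rates from the example above. The function name `driftSeconds` and the one-hour duration are just illustrative choices.

```javascript
// How far does audio fall behind if the content assumes `nominalHz`
// but the device actually consumes samples at `actualHz`?
function driftSeconds(nominalHz, actualHz, contentSeconds) {
  // The content holds nominalHz * contentSeconds samples; a device clocked
  // at actualHz needs this long to play all of them.
  const playbackSeconds = (nominalHz * contentSeconds) / actualHz;
  return playbackSeconds - contentSeconds;
}

const perHour = driftSeconds(44100, 44095, 3600);
console.log(perHour.toFixed(3)); // "0.408", i.e. about 0.4 s per hour
```

At 25 frames per second that is roughly ten frames of lip-sync error after an hour, which is why something has to keep correcting the video against the audio clock.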

Often, the video is synchronized to the audio stream, but the player is what handles this. By loading up multiple objects for playback, you are effectively pulling them out of sync.

The only way this is going to work is if you can find a way to control the different audio streams from the video player. I don't know if this is possible.

Instead, I propose that you encode the video multiple times, once per audio stream. You can use FFmpeg for this, and even automate the process, depending on your workflow. Switching among the resulting files during playback becomes a problem, but most video players are robust enough to guess the byte offset for the current position in the file when given the bitrate.

If you only needed two languages, you could simply adjust the balance between a left and right stereo audio channel.
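
One way this could look with the Web Audio API, assuming language A was mixed into the left channel and language B into the right. Rather than a plain balance control, this sketch sends the chosen channel to both speakers. `createLanguageSelector` is a hypothetical helper; `source` would typically come from `ctx.createMediaElementSource(video)`.

```javascript
// Route one channel of a stereo source to both speakers.
// `ctx` is an AudioContext, `source` any AudioNode with stereo output.
function createLanguageSelector(ctx, source) {
  const splitter = ctx.createChannelSplitter(2);
  const merger = ctx.createChannelMerger(2);
  // One gain per input channel: index 0 = left language, 1 = right language.
  const gains = [ctx.createGain(), ctx.createGain()];

  source.connect(splitter);
  gains.forEach((gain, channel) => {
    splitter.connect(gain, channel); // pick one channel of the source
    gain.connect(merger, 0, 0);      // send it to the left speaker
    gain.connect(merger, 0, 1);      // and to the right speaker
  });
  merger.connect(ctx.destination);

  // Unmute the chosen channel and mute the other one.
  function select(channel) {
    gains.forEach((gain, i) => {
      gain.gain.value = i === channel ? 1 : 0;
    });
  }

  select(0);
  return { select, gains };
}

// Browser usage (assumes a <video> element in `video`):
// const ctx = new AudioContext();
// const selector = createLanguageSelector(ctx, ctx.createMediaElementSource(video));
// selector.select(1); // switch to the right-channel language
```

Since both languages travel inside the video's own audio stream, the player keeps them in sync; the cost is that each language is mono.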

Solution 3

It is doable with the Web Audio API.

Part of my program listens to video element events and stops or restarts audio tracks created using the Web Audio API. This lets me turn any of the tracks on and off in perfect sync.
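
A rough sketch of what such a setup might look like (not the answerer's actual code). It assumes every language has already been decoded into an `AudioBuffer`, for example with `ctx.decodeAudioData`, and that the video itself is silent or muted. `createTrackSwitcher` and the language keys are hypothetical names.

```javascript
// Keep decoded audio tracks locked to a <video> element.
// `ctx` is an AudioContext, `video` the video element,
// `buffers` maps a language code to a decoded AudioBuffer.
function createTrackSwitcher(ctx, video, buffers, initialLanguage) {
  const gains = {};
  let sources = [];

  for (const language of Object.keys(buffers)) {
    const gain = ctx.createGain();
    gain.gain.value = language === initialLanguage ? 1 : 0;
    gain.connect(ctx.destination);
    gains[language] = gain;
  }

  function stopAll() {
    sources.forEach((source) => source.stop());
    sources = [];
  }

  function startAll() {
    stopAll();
    const when = ctx.currentTime; // one shared start time for every track
    for (const language of Object.keys(buffers)) {
      // Buffer sources are one-shot, so create a fresh one per start.
      const source = ctx.createBufferSource();
      source.buffer = buffers[language];
      source.connect(gains[language]);
      source.start(when, video.currentTime); // begin at the video position
      sources.push(source);
    }
  }

  video.addEventListener('playing', startAll);
  video.addEventListener('pause', stopAll);
  video.addEventListener('waiting', stopAll);
  video.addEventListener('ended', stopAll);
  video.addEventListener('seeked', () => {
    if (!video.paused) startAll();
  });

  // Switching is just a gain change, so playback never restarts.
  function select(language) {
    if (!(language in gains)) return false;
    for (const key of Object.keys(gains)) {
      gains[key].gain.value = key === language ? 1 : 0;
    }
    return true;
  }

  return { select, gains };
}
```

All tracks run silently in parallel and only the gains change, so a switch takes effect without a gap. This sketch realigns only on play, pause, and seek events; it does not correct slow drift between the video clock and the audio output clock.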

There are some drawbacks.

Internet Explorer has no Web Audio API support; only Edge does.

The technique works with fully buffered audio only, and that's limiting. There are some problems with large files: https://bugs.chromium.org/p/chromium/issues/detail?id=71704

Author: Emphram Stavanger

Updated on June 09, 2022

Comments

  • Emphram Stavanger
    Emphram Stavanger about 2 years

    I'm building a video for my website with HTML5. Ideally, I'd have only one silent video file, and five different audio tracks in different languages that sync up with the video.

    Then I'd have a button that allows users to switch between audio tracks, even as the video is playing; and the correct audio track would come to life (without the video pausing or starting over or anything; much like a DVD audio track selection).

    I can do this quite simply in Flash, but I don't want to. There has to be a way to do this in pure HTML5 or HTML5+jQuery. I'm thinking you'd play all the audio files at 0 volume, and only increase the volume of the active track... but I don't know how to even do that, let alone handle it when the user pauses or rewinds the video...

    Thanks in advance!

  • Emphram Stavanger
    Emphram Stavanger over 12 years
    How exactly would I go about controlling that with JavaScript, then? :)
  • derobert
    derobert over 12 years
    @EmphramStavanger: whatwg.org/specs/web-apps/current-work/multipage/… gives an audioTracks property, and you can then use .enable() and .disable() on each track. Haven't tested this myself.
  • Emphram Stavanger
    Emphram Stavanger over 12 years
    Awesome, I'll check that out!
  • Nek
    Nek about 12 years
    audioTracks property is still not implemented
  • Adam Chwedyk
    Adam Chwedyk about 10 years
    audioTracks is implemented now in IE10 and IE11, example of usage here: msdn.microsoft.com/en-us/library/windows/apps/hh452774.aspx
  • SimonSimCity
    SimonSimCity almost 9 years
    Still - thanks for the hint for mediagroup. It now has a bit better support than at the time of your writing ;)
  • Nek
    Nek almost 9 years
    I'm glad to be of any help. Also nice to see those specs being brought into life :)
  • Duvrai
    Duvrai over 8 years
    audioTracks has also been in Safari since 6.1 (released back in 2013) caniuse.com/#feat=audiotracks
  • Nooneelse
    Nooneelse almost 8 years
    audioTracks / videoTracks are missing in Chrome
  • themihai
    themihai almost 7 years
    Why is it so hard for browsers to mux the streams on the fly into a single one before playing it, just like ffmpeg does?
  • themihai
    themihai almost 7 years
    mediagroup has been deprecated/unshipped (at least by Chrome)
  • Brad
    Brad almost 7 years
    @themihai When you have a video file with a video track and multiple audio tracks, they're all synchronized to a common reference point. When you have multiple files all separated, they aren't. You can't just line up 8 audio objects, tell them to play, and expect them to play simultaneously as they all have their own decoding, buffering, etc., and that's assuming the browser will even try to play them all simultaneously. In reality, you call play, and the browser does its thing when it feels like it (within a reasonable amount of time so as to seem instant).
  • Brad
    Brad almost 7 years
    @themihai With FFmpeg, you can take all of those tracks, mux them into the same file, and give them that common reference point. As an alternative that someone proposed here, the Web Audio API can be used to schedule playback in a sample-accurate way. That is, all of the audio is decoded first to PCM samples, which can then be started at the same time. At that point the clock reference is the output device.
  • Brad
    Brad almost 7 years
    @themihai My other point about synchronization on clock reference is that not every output device is going to play at exactly the same rate. If you have a video with no audio track, it's going to be played back as close to the frame rate as the software will get it. But if there is audio, the video may run a hair ahead or behind to match the audio. The sound card will have its own sample clock, and that's what will drive things. Each device is slightly different in speed, which matters more for longer content.
  • themihai
    themihai almost 7 years
    I believe that, just like ffmpeg can mux the streams into a single file, the browser could use a buffer to mux the streams on the fly/in memory before sending them for processing/playing. Technically you could actually do that using ffmpeg compiled to wasm and feeding the video source with the muxed streams.
  • Brad
    Brad almost 7 years
    @themihai Sure, if you use the Web Audio API, but you can't just use normal media elements and tell them to start all at the same time. Additionally, you'll want to synchronize the video to the audio, as I'm saying, or it will go out of sync eventually. On most devices that may take long enough that nobody notices, but it will happen.