Synchronization of data with video using WebRTC

javascript html video video-streaming webrtc

10,083

Solution 1

I suspect the amount of data per frame is fairly small. I would look at encoding it into a 2D barcode image and place it in each frame in a way so it is not removed by compression. Alternatively just encode timestamp like this.

Then on the player side you look at the image in a particular frame and get the data out or if it.

Solution 2

Ok, first lets get the video and audio using getUserMedia and lets make it raw data using

https://github.com/streamproc/MediaStreamRecorder

/*
 *
 *  Video Streamer
 *
 */


<script src="https://cdn.webrtc-experiment.com/MediaStreamRecorder.js"> </script>
<script>

// FIREFOX

var mediaConstraints = {
    audio: !!navigator.mozGetUserMedia, // don't forget audio!
    video: true                         // don't forget video!
};

navigator.getUserMedia(mediaConstraints, onMediaSuccess, onMediaError);

function onMediaSuccess(stream) {
    var mediaRecorder = new MediaStreamRecorder(stream);
    mediaRecorder.mimeType = 'video/webm';
    mediaRecorder.ondataavailable = function (blob) {
        // POST/PUT "Blob" using FormData/XHR2

    };
    mediaRecorder.start(3000);
}

function onMediaError(e) {
    console.error('media error', e);
}
</script>



// CHROME

var mediaConstraints = {
    audio: true,
    video: true
};

navigator.getUserMedia(mediaConstraints, onMediaSuccess, onMediaError);

function onMediaSuccess(stream) {
    var multiStreamRecorder = new MultiStreamRecorder(stream);
    multiStreamRecorder.video = yourVideoElement; // to get maximum accuracy
    multiStreamRecorder.audioChannels = 1;
    multiStreamRecorder.ondataavailable = function (blobs) {
        // blobs.audio
        // blobs.video
    };
    multiStreamRecorder.start(3000);
}

function onMediaError(e) {
    console.error('media error', e);
}

Now you can send the data through DataChannels and add your metadatas, in the receiver side:

/*
 *
 *  Video Receiver
 *
 */


 var ms = new MediaSource();

 var video = document.querySelector('video');
 video.src = window.URL.createObjectURL(ms);

 ms.addEventListener('sourceopen', function(e) {
   var sourceBuffer = ms.addSourceBuffer('video/webm; codecs="vorbis,vp8"');
   sourceBuffer.appendBuffer(/* Video chunks here */);
 }, false);

10,083

Author by

MaMazav

Works in the area of Computer Science since 2007, on the areas: compilers, graphics viewer, web.

Updated on June 30, 2022

Comments

MaMazav almost 2 years
I'm using WebRTC to send video from a server to client browser (using the native WebRTC API and an MCU WebRTC server like Kurento).

Before sending it to clients each frame of the video contained metadata (like subtitles or any other applicative content). I'm looking for a way to send this metadata to the client such that it remains synchronized (to the time it is actually presented). In addition I would like to be able to access this data from the client side (by Javascript).

Some options I thought about:
- Sending the data by WebRTC DataChannel. But I don't know how to ensure the data is synchronized on a per-frame basis. But I couldn't find a way to ensure the data sent by the data channel and the video channel is synchronize (again, I hope to get precision level of single frame).
- Sending the data manually to the client in some way (WebRTC DataChannel, websockets, etc.) with timestamps that match the video's timestamps. However, even if Kurento or other middle servers preserve the timestamp information in the video, according to the following answer there is no applicative way to get the video timestamps from the javascript: How can use the webRTC Javascript API to access the outgoing audio RTP timestamp at the sender and the incoming audio RTP timestamp at the receiver?. I thought about using the standard video element's timeupdate event, but I don't konw if it will work for precision level of frame, and I'm not sure what it means in a live video as in WebRTC.
- Sending the data manually and attach it to the video applicatively as another TextTrack. Then use the onenter and onexit to read it synchronizely: http://www.html5rocks.com/en/tutorials/track/basics/. It still requires precise timestamps, and I'm not sure how to know what are the timestamps and if Kurento pass them as-is.
- Using the statistics API of WebRTC to manually count frames (using getstats), and hope that the information provided by this API is precise.
What is the best way to do that, and how to solve the problems I mentioned in either way?

EDIT: Precise synchronization (in resolution of no more than a single frame ) of metadata with the appropriate frame is required.