Raw H264 frames in mpegts container using libavcodec

c h.264 libavcodec libavformat

35,346

I believe if you set the following, you will see video playback.

```
packet.flags |= AV_PKT_FLAG_KEY;
packet.pts = packet.dts = 0;
```

You should really set packet.flags according to the H.264 packet headers, so that only real key frames are flagged. You might try this fellow Stack Overflow user's suggestion for extracting that information directly from the stream.
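As a rough illustration of reading the flag from the packet headers: in an Annex B stream each NAL unit follows a 00 00 01 or 00 00 00 01 start code, and the low five bits of the next byte are the nal_unit_type (5 is an IDR slice, 7 an SPS, 8 a PPS). The sketch below assumes Annex B input; the helper names `start_code_len` and `h264_contains_idr` are made up for this example and are not part of libavcodec.

```c
#include <stddef.h>
#include <stdint.h>

/* Length of the Annex B start code at p (3 or 4 bytes), or 0 if there is none. */
static size_t start_code_len(const uint8_t *p, size_t len)
{
    if (len >= 3 && p[0] == 0 && p[1] == 0 && p[2] == 1)
        return 3;
    if (len >= 4 && p[0] == 0 && p[1] == 0 && p[2] == 0 && p[3] == 1)
        return 4;
    return 0;
}

/* Returns 1 if the buffer holds at least one IDR slice (nal_unit_type 5), else 0. */
static int h264_contains_idr(const uint8_t *buf, size_t len)
{
    size_t i = 0;

    while (i + 3 < len) {
        size_t sc = start_code_len(buf + i, len - i);
        if (sc == 0) {          /* not at a start code, keep scanning */
            i++;
            continue;
        }
        if (i + sc >= len)      /* start code with no NAL header after it */
            break;

        /* The low 5 bits of the NAL header byte are the nal_unit_type. */
        int nal_unit_type = buf[i + sc] & 0x1F;
        if (nal_unit_type == 5)
            return 1;

        i += sc + 1;
    }
    return 0;
}
```

With something like this, the packet flag could be set as `if (h264_contains_idr(data, size)) packet.flags |= AV_PKT_FLAG_KEY;`. Scanning the whole buffer matters when a frame arrives as SPS + PPS + IDR in one chunk, because the first NAL unit is then not the slice itself.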

If you are also adding audio, then correct pts/dts values become much more important, because they are what keeps the two streams in sync. I suggest you study this tutorial.
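To make the pts/dts point concrete: MPEG-TS timestamps count ticks of a 90 kHz clock, so a constant-rate stream advances by 90000 / fps ticks per frame. The following is only a sketch of that arithmetic; `frame_index_to_pts` and `ms_to_pts` are hypothetical helpers, and with libav you would more likely rescale into the stream's own time_base with av_rescale_q.

```c
#include <stdint.h>

#define MPEGTS_CLOCK_HZ 90000   /* MPEG-TS PTS/DTS tick rate */

/* PTS of frame number frame_index for a frame rate of fps_num / fps_den. */
static int64_t frame_index_to_pts(int64_t frame_index, int fps_num, int fps_den)
{
    /* seconds = frame_index * fps_den / fps_num; ticks = seconds * 90000 */
    return frame_index * MPEGTS_CLOCK_HZ * fps_den / fps_num;
}

/* PTS for a capture timestamp given in milliseconds (useful for variable frame rates). */
static int64_t ms_to_pts(int64_t ms)
{
    return ms * (MPEGTS_CLOCK_HZ / 1000);
}
```

At 15 fps this yields 0, 6000, 12000, and so on. If the stream has no B-frames (the SPS quoted in the comments starts with 67 42, which is Baseline profile), decode order equals display order, so setting dts equal to pts should keep the values strictly increasing.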

EDIT

I found time to extract what is working for me from my test app. For some reason, dts/pts values of zero work for me, but values other than 0 or AV_NOPTS_VALUE do not. I wonder if we have different versions of ffmpeg. I have the latest from git://git.videolan.org/ffmpeg.git.

fftest.cpp

```
// Targets the FFmpeg API that was current when this was written. Newer
// releases rename or remove several of these calls (for example,
// av_new_stream became avformat_new_stream).

#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <string>

#ifndef INT64_C
#define INT64_C(c) (c ## LL)
#define UINT64_C(c) (c ## ULL)
#endif

// Debug marker: prints file and line. Use the empty definition to silence it.
//#define _M
#define _M printf( "%s(%d) : MARKER\n", __FILE__, __LINE__ )

extern "C"
{
    #include "libavcodec/avcodec.h"
    #include "libavformat/avformat.h"
}


AVFormatContext *fc = 0;        // Output container
int vi = -1, waitkey = 1;       // Video stream index, waiting-for-key-frame flag

// < 0 = error
// 0 = I-Frame
// 1 = P-Frame
// 2 = B-Frame
// 3 = S-Frame
int getVopType( const void *p, int len )
{   
    if ( !p || 6 >= len )
        return -1;

    unsigned char *b = (unsigned char*)p;

    // Verify NAL marker (accept a 3 or 4 byte start code)
    if ( b[ 0 ] || b[ 1 ] || 0x01 != b[ 2 ] )
    {   b++;
        if ( b[ 0 ] || b[ 1 ] || 0x01 != b[ 2 ] )
            return -1;
    } // end if

    b += 3;

    // Verify VOP id (MPEG-4 Part 2 streams)
    if ( 0xb6 == *b )
    {   b++;
        return ( *b & 0xc0 ) >> 6;
    } // end if

    // H.264 NAL header byte. 0x65 is an IDR slice; 0x61 and 0x01 are
    // non-IDR slices, so treating them as P and B is only a heuristic.
    switch( *b )
    {   case 0x65 : return 0;
        case 0x61 : return 1;
        case 0x01 : return 2;
    } // end switch

    return -1;
}

void write_frame( const void* p, int len )
{
    if ( 0 > vi )
        return;

    AVStream *pst = fc->streams[ vi ];

    // Init packet
    AVPacket pkt;
    av_init_packet( &pkt );

    // Anything that is not recognized as P/B/S (including the SPS/PPS
    // header, which returns -1) is flagged as a key frame
    pkt.flags |= ( 0 >= getVopType( p, len ) ) ? AV_PKT_FLAG_KEY : 0;   
    pkt.stream_index = pst->index;
    pkt.data = (uint8_t*)p;
    pkt.size = len;

    // Wait for key frame
    if ( waitkey )
    {
        if ( 0 == ( pkt.flags & AV_PKT_FLAG_KEY ) )
            return;

        waitkey = 0;
    } // end if

    pkt.dts = AV_NOPTS_VALUE;
    pkt.pts = AV_NOPTS_VALUE;

//  av_write_frame( fc, &pkt );
    av_interleaved_write_frame( fc, &pkt );
}

void destroy()
{
    waitkey = 1;
    vi = -1;

    if ( !fc )
        return;

_M; av_write_trailer( fc );

    if ( fc->oformat && !( fc->oformat->flags & AVFMT_NOFILE ) && fc->pb )
        avio_close( fc->pb ); 

    // Free the stream
_M; av_free( fc );

    fc = 0;
_M; 
}

// Returns the NAL header byte that follows the start code, or -1 on error
int get_nal_type( void *p, int len )
{
    if ( !p || 5 >= len )
        return -1;

    unsigned char *b = (unsigned char*)p;

    // Verify NAL marker
    if ( b[ 0 ] || b[ 1 ] || 0x01 != b[ 2 ] )
    {   b++;
        if ( b[ 0 ] || b[ 1 ] || 0x01 != b[ 2 ] )
            return -1;
    } // end if

    b += 3;

    return *b;
}

int create( void *p, int len )
{
    // Only start the file on a frame that begins with an SPS (0x67)
    if ( 0x67 != get_nal_type( p, len ) )
        return -1;

    destroy();

    // The container is guessed from the file extension;
    // a name ending in .ts selects MPEG-TS instead of AVI
    const char *file = "test.avi";
    CodecID codec_id = CODEC_ID_H264;
//  CodecID codec_id = CODEC_ID_MPEG4;
    int br = 1000000;
    int w = 480;
    int h = 354;
    int fps = 15;

    // Create container
_M; AVOutputFormat *of = av_guess_format( 0, file, 0 );
    fc = avformat_alloc_context();
    fc->oformat = of;
    strcpy( fc->filename, file );

    // Add video stream
_M; AVStream *pst = av_new_stream( fc, 0 );
    vi = pst->index;

    AVCodecContext *pcc = pst->codec;
_M; avcodec_get_context_defaults2( pcc, AVMEDIA_TYPE_VIDEO );
    pcc->codec_type = AVMEDIA_TYPE_VIDEO;

    pcc->codec_id = codec_id;
    pcc->bit_rate = br;
    pcc->width = w;
    pcc->height = h;
    pcc->time_base.num = 1;
    pcc->time_base.den = fps;

    // Init container
_M; av_set_parameters( fc, 0 );

    if ( !( fc->oformat->flags & AVFMT_NOFILE ) )
        avio_open( &fc->pb, fc->filename, URL_WRONLY );

_M; av_write_header( fc );

_M; return 1;
}

int main( int argc, char** argv )
{
    int f = 0, sz = 0;
    char fname[ 256 ] = { 0 };
    char buf[ 128 * 1024 ];

    av_log_set_level( AV_LOG_ERROR );
    av_register_all();

    do
    {
        // Raw frames in v0.raw, v1.raw, v2.raw, ...
//      sprintf( fname, "rawvideo/v%d.raw", f++ );
        sprintf( fname, "frames/frame%d.bin", f++ );
        printf( "%s\n", fname );

        FILE *fd = fopen( fname, "rb" );
        if ( !fd )
            sz = 0;
        else
        {
            // Leave room for the zeroed padding libavcodec expects after the data
            sz = fread( buf, 1, sizeof( buf ) - FF_INPUT_BUFFER_PADDING_SIZE, fd );
            if ( 0 < sz )
            {
                memset( &buf[ sz ], 0, FF_INPUT_BUFFER_PADDING_SIZE );          

                if ( !fc )
                    create( buf, sz );

                if ( fc )
                    write_frame( buf, sz );

            } // end if

            fclose( fd );

        } // end else

    } while ( 0 < sz );

    destroy();
}
```


Author by

Ferenc Deak

Updated on June 08, 2020

Comments

Ferenc Deak about 4 years
I would really appreciate some help with the following issue:

I have a gadget with a camera that produces H264 compressed video frames, and these frames are sent to my application. The frames are not in a container; they are just raw data.

I want to use ffmpeg and libav functions to create a video file, which can be used later.

If I decode the frames, then encode them, everything works fine, I get a valid video file. (the decode/encode steps are the usual libav commands, nothing fancy here, I took them from the almighty internet, they are rock solid)... However, I waste a lot of time by decoding and encoding, so I would like to skip this step and directly put the frames in the output stream. Now, the problems come.

Here is the code I came up with for producing the encoding:
```
AVFrame* picture;

avpicture_fill((AVPicture*) picture, (uint8_t*)frameData, 
                 codecContext->pix_fmt, codecContext->width,
                 codecContext->height);
int outSize = avcodec_encode_video(codecContext, videoOutBuf, 
                 sizeof(videoOutBuf), picture);
if (outSize > 0) 
{
    AVPacket packet;
    av_init_packet(&packet);
    packet.pts = av_rescale_q(codecContext->coded_frame->pts,
                  codecContext->time_base, videoStream->time_base);
    if (codecContext->coded_frame->key_frame) 
    {
        packet.flags |= PKT_FLAG_KEY;
    }
    packet.stream_index = videoStream->index;
    packet.data =  videoOutBuf;
    packet.size =  outSize;

    av_interleaved_write_frame(context, &packet);
    put_flush_packet(context->pb);
}
```
Where the variables are like:

frameData is the decoded frame data that came from the camera (it was decoded in a previous step), and videoOutBuf is a plain uint8_t buffer for holding the encoded data.

I have modified the application so that it no longer decodes the frames, but simply passes the data through, like this:
```
    AVPacket packet;
    av_init_packet(&packet);

    packet.stream_index = videoStream->index;
    packet.data = (uint8_t*)frameData;
    packet.size = currentFrameSize;

    av_interleaved_write_frame(context, &packet);
    put_flush_packet(context->pb);
```
where

frameData is the raw H264 frame and currentFrameSize is the size of the raw H264 frame, i.e. the number of bytes I get from the gadget for every frame.

And suddenly the application is not working correctly anymore, the produced video is unplayable. This is obvious, since I was not setting a correct PTS for the packet. What I did was the following (I'm desperate, you can see it from this approach :) )
```
    packet.pts = timestamps[timestamp_counter ++];
```
where timestamps is actually a list of PTS values produced by the working code above and written to a file (yes, you read that properly: I logged all the PTS values for a 10 minute session and wanted to reuse them).

The application still does not work.

Now, here I am without any clue what to do, so here is the question:

I would like to create an "mpegts" video stream using libav functions, insert in the stream already encoded video frames and create a video file with it. How do I do it?

Thanks, f.
Ferenc Deak about 13 years

Sorry, this is not working :(
[libx264 @ 0x7fb9500243e0] invalid DTS: PTS is less than DTS
[mpegts @ 0x7fb950021650] Application provided invalid, non monotonically increasing dts to muxer in stream 0: 3600 >= 3600
[libx264 @ 0x7fb9500243e0] non-strictly-monotonic PTS -9223372036854775808 <= -9223372036854775808
[libx264 @ 0x7fb9500243e0] invalid DTS: PTS is less than DTS
bob2 about 13 years

I see that the failure is at ./libavformat/utils.c:2965. Can you try packet.pts = packet.dts = AV_NOPTS_VALUE to dodge that error? I'm realizing now that this may still be insufficient since you are dealing with raw H264 packets. Do you have all headers in your stream or just VOP, i.e. do all your frames start with 0x000001b6? And could you possibly post the code you use to initialize the video stream?
Ferenc Deak about 13 years

Start of the frames: no, the very first frame starts with 0x00 00 00 01 67 42 00 1E 8D 68 1E 0B FE 00 00 00 01 68 CE and the following ones start with 0x00 00 00 01 61 E0. The code to initialize the video stream is at: typewith.me/eHbYEyUS91 .... If it would help, I can put the binary frames somewhere. Do you need them?
bob2 about 13 years

0x00000167 is the SPS and 0x00000168 is the PPS, this whole thing (00 00 00 01 67 42 00 1E 8D 68 1E 0B FE 00 00 00 01 68 CE) should be set in the extradata field. If you want to zip up the header and a few frames, I'll make the change to the code above.
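One way to isolate those header bytes, sketched under the assumption that the first frame is Annex B data beginning with SPS and PPS: walk the NAL units and stop at the first one whose type is neither 7 (SPS) nor 8 (PPS). The helper names `next_start_code` and `h264_param_set_prefix_len` are invented for this example. The returned prefix is what would be copied into the codec context's extradata (with extradata_size set to match), although a later comment reports that doing so was not enough on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset of the next start code at or after pos, or len if none is found.
   *sc_len receives the start code length (3 or 4), or 0 if none. */
static size_t next_start_code(const uint8_t *buf, size_t len, size_t pos, size_t *sc_len)
{
    for (size_t i = pos; i + 3 <= len; i++) {
        if (buf[i] == 0 && buf[i + 1] == 0) {
            if (buf[i + 2] == 1) { *sc_len = 3; return i; }
            if (i + 4 <= len && buf[i + 2] == 0 && buf[i + 3] == 1) { *sc_len = 4; return i; }
        }
    }
    *sc_len = 0;
    return len;
}

/* Number of leading bytes made up only of SPS/PPS NAL units (start codes included).
   Returns 0 if the buffer does not begin with an SPS or PPS. */
static size_t h264_param_set_prefix_len(const uint8_t *buf, size_t len)
{
    size_t sc_len;
    size_t pos = next_start_code(buf, len, 0, &sc_len);

    if (pos != 0)
        return 0;                       /* must begin with a start code */

    while (pos < len) {
        size_t hdr = pos + sc_len;
        if (hdr >= len)
            break;                      /* start code without a NAL header */

        int type = buf[hdr] & 0x1F;     /* nal_unit_type */
        if (type != 7 && type != 8)
            return pos;                 /* first non-parameter-set NAL unit */

        pos = next_start_code(buf, len, hdr + 1, &sc_len);
    }
    return pos;
}
```

For the first frame quoted above, the prefix is the 19 bytes from 00 00 00 01 67 through 68 CE, assuming another NAL unit follows the PPS.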
bob2 about 13 years

The code basically worked for me as was, except for the file read buffer being too small. I got a nice video of your keyboard :) Still, the key frames weren't being detected quite right so I modified that. I'm not sure if there are built in functions in FFMpeg to do keyframe detection, would be nice. I also was apparently incorrect about setting extradata, when I did that, FFMpeg still refused to pull out the video parameters. I'll have to research into this more when I have time as I think I'll need similar functionality eventually myself. Having to decode the SEI directly would suck.
Ferenc Deak about 13 years

Thanks :) Now I'll have to integrate this into my program.
user1058600 about 12 years

Great example. I have a question about varying framerate. Imagine I have H.264 frames coming in from real-time conversational video. The frames have timestamps. What is the best way to encapsulate in MPEG-TS in real time to maintain the timing for subsequent playback?
Khushboo about 10 years

@user1058600 I want to record the h264 data inside my ios application. I am able to successfully play the live streaming I am getting from DVR. please help me in this.
Álvaro García about 7 years

It works fine but in my case, the total video time length is always up to 8 seconds. The video plays without any problem but vlc shows "playing 10:33/00:08" which is annoying.