Streaming H.264 over RTP with libavformat


H.264 is an encoding standard. It specifies how video data is compressed and stored in a format that can be decompressed into a video stream at a later point.

RTP is a transmission protocol. It specifies the format and ordering of packets that carry audio or video data, regardless of which encoder produced that data.
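
To make that distinction concrete, here is a minimal sketch of a parser for the fixed 12-byte RTP header as laid out in RFC 3550. The struct and function names are my own, not part of libavformat or GStreamer. Note what the header does not contain: nothing in it names the codec. A dynamic payload type such as 96 is only a number, and the receiver has to learn what it means from somewhere else.

```cpp
#include <cstddef>
#include <cstdint>

// Fields of the fixed 12-byte RTP header (RFC 3550, section 5.1).
// Hypothetical helper type for illustration only.
struct RtpHeader {
    std::uint8_t  version;          // 2 for current RTP
    bool          padding;
    bool          extension;
    std::uint8_t  csrc_count;
    bool          marker;           // for H.264: last packet of an access unit
    std::uint8_t  payload_type;     // 96..127 are dynamic, meaning is out of band
    std::uint16_t sequence_number;
    std::uint32_t timestamp;        // 90 kHz units for H.264 video
    std::uint32_t ssrc;
};

// Returns true and fills `out` if `data` starts with a plausible RTP header.
bool parse_rtp_header(const std::uint8_t* data, std::size_t size, RtpHeader& out) {
    if (data == nullptr || size < 12) return false;

    const std::uint8_t version = data[0] >> 6;
    if (version != 2) return false;

    out.version      = version;
    out.padding      = (data[0] & 0x20) != 0;
    out.extension    = (data[0] & 0x10) != 0;
    out.csrc_count   = data[0] & 0x0F;
    out.marker       = (data[1] & 0x80) != 0;
    out.payload_type = data[1] & 0x7F;

    // All multi-byte fields are in network byte order (big-endian).
    out.sequence_number =
        static_cast<std::uint16_t>((data[2] << 8) | data[3]);
    out.timestamp = (static_cast<std::uint32_t>(data[4]) << 24) |
                    (static_cast<std::uint32_t>(data[5]) << 16) |
                    (static_cast<std::uint32_t>(data[6]) << 8)  |
                     static_cast<std::uint32_t>(data[7]);
    out.ssrc      = (static_cast<std::uint32_t>(data[8]) << 24) |
                    (static_cast<std::uint32_t>(data[9]) << 16) |
                    (static_cast<std::uint32_t>(data[10]) << 8) |
                     static_cast<std::uint32_t>(data[11]);
    return true;
}
```

Running this over the first bytes of a captured UDP payload is a quick way to confirm that the sender is emitting RTP at all and to see which payload type it uses.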

GStreamer expects to receive data that conforms to the RTP protocol. Is your expectation that libavformat will produce RTP packets immediately readable by GStreamer warranted? Maybe GStreamer expects an additional stream description that would enable it to accept the streamed packets and decode them with the proper decoder? Maybe it requires an additional RTSP exchange or an SDP stream descriptor file?

The error message states pretty clearly that an RTP format has not been negotiated. "Caps" is shorthand for capabilities. The receiver needs to know the transmitter's capabilities to set up its receiving and decoding machinery correctly.
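
As an illustration of what those capabilities amount to, the sketch below turns an SDP `a=rtpmap` attribute into a GStreamer-style `application/x-rtp` caps string. The function name is hypothetical, and I am assuming the usual caps field names (`media`, `clock-rate`, `encoding-name`, `payload`); check them against the GStreamer RTP README referenced in the error message before relying on this.

```cpp
#include <cstdio>
#include <string>

// Builds a caps string from an SDP rtpmap attribute such as
// "a=rtpmap:96 H264/90000". Returns an empty string if the line
// does not match that shape. Illustrative helper, not a GStreamer API.
std::string rtpmap_to_caps(const std::string& media, const std::string& rtpmap) {
    int payload = 0;
    int clock_rate = 0;
    char encoding[32] = {0};

    // Expected shape: a=rtpmap:<payload> <encoding>/<clock-rate>
    const int matched = std::sscanf(rtpmap.c_str(), "a=rtpmap:%d %31[^/]/%d",
                                    &payload, encoding, &clock_rate);
    if (matched != 3) return std::string();

    return "application/x-rtp,media=" + media +
           ",clock-rate=" + std::to_string(clock_rate) +
           ",encoding-name=" + encoding +
           ",payload=" + std::to_string(payload);
}
```

For example, `rtpmap_to_caps("video", "a=rtpmap:96 H264/90000")` yields a string that can be placed between `udpsrc` and `rtph264depay` in a gst-launch pipeline.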

I strongly suggest at least trying to create an SDP file for your RTP stream. libavformat should be able to generate one for you.
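
libavformat exposes `av_sdp_create()` for this purpose. To show what such a file has to carry, here is a hand-written sketch that builds a minimal SDP description for a single H.264 stream. The function name is mine, and the defaults are assumptions: dynamic payload type 96, the 90 kHz clock that H.264 over RTP uses, and `packetization-mode=1`. It also omits `sprop-parameter-sets`, so it only works if the encoder sends SPS/PPS in-band.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Builds a minimal SDP description for one H.264 video stream sent to
// dest_ip:port. Hypothetical helper; verify the result against what
// av_sdp_create() produces for your actual stream.
std::string make_h264_sdp(const std::string& dest_ip, int port,
                          int payload_type = 96) {
    // Dynamic RTP payload types live in the range 96..127.
    assert(payload_type >= 96 && payload_type <= 127);

    std::ostringstream sdp;
    sdp << "v=0\r\n"
        << "o=- 0 0 IN IP4 127.0.0.1\r\n"
        << "s=H.264 test stream\r\n"
        << "c=IN IP4 " << dest_ip << "\r\n"
        << "t=0 0\r\n"
        << "m=video " << port << " RTP/AVP " << payload_type << "\r\n"
        // Ties the payload type number to a codec and clock rate.
        << "a=rtpmap:" << payload_type << " H264/90000\r\n"
        << "a=fmtp:" << payload_type << " packetization-mode=1\r\n";
    return sdp.str();
}
```

Writing the result of `make_h264_sdp("10.89.6.3", 49990)` to a file such as `stream.sdp` and opening that file in VLC gives the player the stream description it is asking for.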

Jacob Peddicord
Author by


Updated on July 20, 2022

Comments

  • Jacob Peddicord almost 2 years

    I've been trying over the past week to implement H.264 streaming over RTP, using x264 as an encoder and libavformat to pack and send the stream. Problem is, as far as I can tell it's not working correctly.

    Right now I'm just encoding random data (x264_picture_alloc) and extracting NAL frames from libx264. This is fairly simple:

    x264_picture_t pic_out;
    x264_nal_t* nals;
    int num_nals;
    int frame_size = x264_encoder_encode(this->encoder, &nals, &num_nals, this->pic_in, &pic_out);
    
    if (frame_size <= 0)
    {
        return frame_size;
    }
    
    // push NALs into the queue
    for (int i = 0; i < num_nals; i++)
    {
        // create a NAL storage unit
        NAL nal;
        nal.size = nals[i].i_payload;
        nal.payload = new uint8_t[nal.size];
        memcpy(nal.payload, nals[i].p_payload, nal.size);
    
        // push the storage into the NAL queue
        {
            // lock and push the NAL to the queue
            boost::mutex::scoped_lock lock(this->nal_lock);
            this->nal_queue.push(nal);
        }
    }
    

    nal_queue is used for safely passing frames over to a Streamer class which will then send the frames out. Right now it's not threaded, as I'm just testing to try to get this to work. Before encoding individual frames, I've made sure to initialize the encoder.

    But I don't believe x264 is the issue, as I can see frame data in the NALs it returns back. Streaming the data is accomplished with libavformat, which is first initialized in a Streamer class:

    Streamer::Streamer(Encoder* encoder, string rtp_address, int rtp_port, int width, int height, int fps, int bitrate)
    {
        this->encoder = encoder;
    
        // initialize the AV context
        this->ctx = avformat_alloc_context();
        if (!this->ctx)
        {
            throw runtime_error("Couldn't initialize AVFormat output context");
        }
    
        // get the output format
        this->fmt = av_guess_format("rtp", NULL, NULL);
        if (!this->fmt)
        {
            throw runtime_error("Unsuitable output format");
        }
        this->ctx->oformat = this->fmt;
    
        // try to open the RTP stream
        snprintf(this->ctx->filename, sizeof(this->ctx->filename), "rtp://%s:%d", rtp_address.c_str(), rtp_port);
        if (url_fopen(&(this->ctx->pb), this->ctx->filename, URL_WRONLY) < 0)
        {
            throw runtime_error("Couldn't open RTP output stream");
        }
    
        // add an H.264 stream
        this->stream = av_new_stream(this->ctx, 1);
        if (!this->stream)
        {
            throw runtime_error("Couldn't allocate H.264 stream");
        }
    
        // initialize codec
        AVCodecContext* c = this->stream->codec;
        c->codec_id = CODEC_ID_H264;
        c->codec_type = AVMEDIA_TYPE_VIDEO;
        c->bit_rate = bitrate;
        c->width = width;
        c->height = height;
        c->time_base.den = fps;
        c->time_base.num = 1;
    
        // write the header
        av_write_header(this->ctx);
    }
    

    This is where things seem to go wrong. av_write_header above seems to do absolutely nothing; I've used Wireshark to verify this. For reference, I use Streamer streamer(&enc, "10.89.6.3", 49990, 800, 600, 30, 40000); to initialize the Streamer instance, with enc being the Encoder object used to handle x264 previously.

    Now when I want to stream out a NAL, I use this:

    // grab a NAL
    NAL nal = this->encoder->nal_pop();
    cout << "NAL popped with size " << nal.size << endl;
    
    // initialize a packet
    AVPacket p;
    av_init_packet(&p);
    p.data = nal.payload;
    p.size = nal.size;
    p.stream_index = this->stream->index;
    
    // send it out
    av_write_frame(this->ctx, &p);
    

    At this point, I can see RTP data appearing over the network, and it looks like the frames I've been sending, even including a little copyright blob from x264. But, no player I've used has been able to make any sense of the data. VLC quits wanting an SDP description, which apparently isn't required.

    I then tried to play it through gst-launch:

    gst-launch udpsrc port=49990 ! rtph264depay ! decodebin ! xvimagesink

    This will sit waiting for UDP data, but when it is received, I get:

    ERROR: element /GstPipeline:pipeline0/GstRtpH264Depay:rtph264depay0: No RTP format was negotiated.
    Additional debug info:
    gstbasertpdepayload.c(372): gst_base_rtp_depayload_chain (): /GstPipeline:pipeline0/GstRtpH264Depay:rtph264depay0:
    Input buffers need to have RTP caps set on them. This is usually achieved by setting the 'caps' property of the upstream source element (often udpsrc or appsrc), or by putting a capsfilter element before the depayloader and setting the 'caps' property on that. Also see http://cgit.freedesktop.org/gstreamer/gst-plugins-good/tree/gst/rtp/README

    As I'm not using GStreamer to do the streaming itself, I'm not quite sure what it means by RTP caps. But it makes me wonder if I'm not sending enough information over RTP to describe the stream. I'm pretty new to video and I feel like there's some key thing I'm missing here. Any hints?

  • Jacob Peddicord about 12 years
    That's the thing -- I don't know, and I'm having trouble finding the information I need. From what I can tell, libavformat will pack things into an RTP stream for you (and will not send invalid packets -- I've tried). It doesn't do any RTSP negotiation; eventually this will be pointed at Feng or some other external application to handle RTSP streaming to clients. However, that doesn't explain why nothing can make heads or tails of the RTP stream that libavformat is generating.
  • George Skoptsov about 12 years
    You need to negotiate it somehow. Why not try to create an SDP file for your stream?
  • Jacob Peddicord about 12 years
    I gave that a go and I can get VLC to display a green screen -- whether that's correct or not I don't know, but it's a start. Will be working on it today, so we'll see if this was actually the issue.
  • George Skoptsov about 12 years
    Well, you're saying that you're encoding random data. Don't do that -- as a first step, read in a real image and encode that over and over.
  • George Skoptsov about 12 years
    Here's another suggestion. Why not start with code already provided in ffserver.c and add to that whatever features you need? Or at least you could use it for reference.
  • Jacob Peddicord about 12 years
    Turns out this was indeed the issue; no information about the stream was being provided to either VLC or GStreamer. I was able to get actual image data playing in VLC. For gst-launch, adding in the type and rate worked out as well: gst-launch udpsrc port=49990 ! application/x-rtp,clock-rate=90000,payload=96 ! rtph264depay ! decodebin ! xvimagesink