h.264 bytestream parsing


Solution 1

The H.264 Standard is a bit hard to read, so here are some tips.

  • Read Annex B; make sure your input starts with a start code
  • Read section 9.1 (parsing of Exp-Golomb codes): you will need it for all of the following
  • The slice header is described in section 7.3.3
  • "Frame number" is not encoded explicitly in the slice header; frame_num is close to what you probably want.
  • "Frame type" probably corresponds to slice_type (the second value in the slice header, so it is the easiest to parse; you should definitely start with this one)
  • "Quantization coefficient" - do you mean "quantization parameter"? If so, be prepared to write a full H.264 parser (or reuse an existing one). Look at section 9.3 to get an idea of the complexity of an H.264 parser.
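
To make the tips above concrete, here is a minimal Python sketch of reading the first few slice header fields. It is not a full parser. The function and class names are made up for illustration, and it assumes a single slice NAL unit (type 1 or 5) in Annex B format with separate_colour_plane_flag equal to 0. The width of frame_num comes from the SPS (log2_max_frame_num_minus4 + 4), so it is passed in as a parameter here instead of being parsed.

```python
# Slice type names from Table 7-6; values 5-9 repeat 0-4 and additionally
# signal that all slices of the picture have the same type.
SLICE_TYPES = ["P", "B", "I", "SP", "SI"]


def strip_start_code(data: bytes) -> bytes:
    """Drop the leading Annex B start code (00 00 01 or 00 00 00 01)."""
    if data.startswith(b"\x00\x00\x00\x01"):
        return data[4:]
    if data.startswith(b"\x00\x00\x01"):
        return data[3:]
    raise ValueError("input does not start with an Annex B start code")


def unescape_rbsp(payload: bytes) -> bytes:
    """Remove emulation prevention bytes: 00 00 03 becomes 00 00."""
    out = bytearray()
    zeros = 0
    for b in payload:
        if zeros >= 2 and b == 3:
            zeros = 0
            continue  # skip the emulation prevention byte
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


class BitReader:
    """Reads bits MSB-first from a byte string."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, in bits

    def read_bits(self, n: int) -> int:
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            value = (value << 1) | ((byte >> (7 - (self.pos & 7))) & 1)
            self.pos += 1
        return value

    def read_ue(self) -> int:
        """Unsigned Exp-Golomb code, ue(v), as described in section 9.1."""
        leading_zeros = 0
        while self.read_bits(1) == 0:
            leading_zeros += 1
        return (1 << leading_zeros) - 1 + self.read_bits(leading_zeros)


def parse_slice_start(data: bytes, frame_num_bits: int) -> dict:
    """Parse the first fields of a slice header (section 7.3.3).

    frame_num_bits is an assumption supplied by the caller; in a real
    stream it is log2_max_frame_num_minus4 + 4 from the active SPS.
    """
    nal = strip_start_code(data)
    nal_unit_type = nal[0] & 0x1F
    if nal_unit_type not in (1, 5):
        raise ValueError(f"not a slice NAL unit (type {nal_unit_type})")
    reader = BitReader(unescape_rbsp(nal[1:]))
    first_mb_in_slice = reader.read_ue()
    slice_type = reader.read_ue()
    pic_parameter_set_id = reader.read_ue()
    frame_num = reader.read_bits(frame_num_bits)
    return {
        "nal_unit_type": nal_unit_type,
        "first_mb_in_slice": first_mb_in_slice,
        "slice_type": slice_type,
        "slice_type_name": SLICE_TYPES[slice_type % 5],
        "pic_parameter_set_id": pic_parameter_set_id,
        "frame_num": frame_num,
    }


if __name__ == "__main__":
    # Hand-built example bytes: start code, NAL header 0x65, two payload bytes.
    print(parse_slice_start(bytes.fromhex("00000001658880"), frame_num_bits=4))
```

Everything up to frame_num only needs Exp-Golomb decoding. Getting the quantization parameter means continuing through the rest of the slice header (slice_qp_delta) and then into the macroblock layer, which is where the real parser work starts.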

Solution 2

The standard is very hard to read. You can try analyzing the source code of existing H.264 decoding software such as ffmpeg, with its C (C99) libraries. For example, there is the avcodec_decode_video2 function, documented here. You can get a full working C example (open a file, get the H.264 stream, iterate through frames, dump information, get the colorspace, save frames as raw PPM images, etc.) here. Alternatively, there is the great book "The H.264 Advanced Video Compression Standard", which explains the standard in "human language". Another option is to try the Elecard StreamEye Pro software (there is a trial version), which could give you an additional (visual) perspective.

Solution 3

Actually, in my opinion it is much better and easier to read the H.264 video coding documentation. ffmpeg is a very good library, but it contains a lot of optimized code. It is better to look at the reference implementation of the H.264 codec and the official documentation. http://iphome.hhi.de/suehring/tml/download/ is the link to the JM reference implementation. Try to separate the levels of the decoding process, starting with the transport layer that contains the NAL units (SPS, PPS, SEI, IDR, SLICE, etc.). Then you need to implement the VLC engine (mostly order-0 Exp-Golomb codes). Then comes a very difficult and powerful entropy coder called CABAC (Context-Adaptive Binary Arithmetic Coding), which is quite a tricky task. The demuxing process (which comes after unpacking the video data) is also complicated. You need to understand each of these modules completely. Good luck.
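
As a rough illustration of the first layer mentioned above, here is a small Python sketch that splits an Annex B byte stream into NAL units and labels the common types. The helper names are hypothetical, and the type table covers only a few nal_unit_type values. It stops at the NAL layer and does not touch Exp-Golomb or CABAC decoding.

```python
# A few common nal_unit_type values (Table 7-1 of the spec); not exhaustive.
NAL_TYPE_NAMES = {
    1: "non-IDR slice",
    5: "IDR slice",
    6: "SEI",
    7: "SPS",
    8: "PPS",
    9: "access unit delimiter",
}


def split_nal_units(stream: bytes) -> list:
    """Split an Annex B byte stream at 00 00 01 start codes."""
    units = []
    pos = stream.find(b"\x00\x00\x01")
    while pos != -1:
        start = pos + 3
        nxt = stream.find(b"\x00\x00\x01", start)
        end = len(stream) if nxt == -1 else nxt
        # Zero bytes right before the next start code are padding or the
        # first byte of a four-byte start code; a NAL unit never ends in 00.
        unit = stream[start:end].rstrip(b"\x00")
        if unit:
            units.append(unit)
        pos = nxt
    return units


def describe(unit: bytes) -> str:
    """Decode the one-byte NAL header: nal_ref_idc and nal_unit_type."""
    nal_ref_idc = (unit[0] >> 5) & 0x3
    nal_unit_type = unit[0] & 0x1F
    name = NAL_TYPE_NAMES.get(nal_unit_type, "other")
    return f"type={nal_unit_type} ({name}), nal_ref_idc={nal_ref_idc}"


if __name__ == "__main__":
    # Hand-built example: an SPS, a PPS and the start of an IDR slice.
    stream = bytes.fromhex("000000016742001e" "0000000168ce3880" "000001658880")
    for unit in split_nal_units(stream):
        print(describe(unit))
```

Keeping this layer separate means the later stages (Exp-Golomb, then CAVLC or CABAC) only ever see one NAL unit at a time.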

Author by

stemm

Updated on June 04, 2022

Comments

  • stemm
    stemm almost 2 years

    The input data is a byte array which represents an H.264 frame. The frame consists of a single slice (not a multislice frame).

    So, as I understand it, I can treat this frame as a slice. The slice has a header and slice data: macroblocks, each with its own header.

    So I have to parse that byte array to extract the frame number, frame type, and quantisation coefficient (as I understand it, each macroblock has its own coefficient, or am I wrong?)

    Could you advise me where I can get more detailed information about parsing H.264 frame bytes?

    (In fact I've read the standard, but it wasn't very specific, and I'm lost.)

    Thanks

    • stemm
      stemm about 13 years
      The input data is a byte array which represents an H.264 frame. The frame consists of a single slice (not a multislice frame). (These are the constraints of my problem.)
    • VitalyVal
      VitalyVal about 13 years
      Try to look at ISO/IEC 14496-15
    • anatolyg
      anatolyg about 13 years
      What is h.264m? I mean - is h.264m some extension to H.264?
    • stemm
      stemm about 13 years
      Oh... my mistake :) I meant H.264