NAL Units
Coded H.264 data is stored or transmitted as a series of packets known as Network Abstraction Layer Units (NAL Units or NALUs). Each NAL Unit consists of a 1-byte NALU header followed by a byte stream containing control information or coded video data. The header indicates the NALU type and its importance. Parameter Sets and slices that are used for reference, i.e. used to predict further frames, are considered important or high priority, since their loss could make it difficult to decode subsequent coded slices. Non-reference slices are considered less important to the decoder, since their loss will not affect any further decoding. This information can optionally be used to prioritise certain NALUs during transmission.
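As a minimal sketch of how the 1-byte header can be read, the snippet below splits it into the three fields defined by the H.264 syntax: `forbidden_zero_bit` (1 bit), `nal_ref_idc` (2 bits, non-zero for reference or otherwise important NALUs) and `nal_unit_type` (5 bits). The function name `parse_nal_header` is only an illustrative choice.

```python
def parse_nal_header(header_byte: int) -> dict:
    """Split the 1-byte NAL unit header into its three bit fields."""
    return {
        # Must be 0 in a conforming stream.
        "forbidden_zero_bit": (header_byte >> 7) & 0x01,
        # 0 means "not used for reference"; 1..3 mark more important NALUs.
        "nal_ref_idc": (header_byte >> 5) & 0x03,
        # E.g. 1 = non-IDR slice, 5 = IDR slice, 7 = SPS, 8 = PPS.
        "nal_unit_type": header_byte & 0x1F,
    }


if __name__ == "__main__":
    for byte in (0x67, 0x68, 0x65, 0x01):
        print(hex(byte), parse_nal_header(byte))
```

For example, a header byte of 0x67 decodes to `nal_ref_idc` 3 and `nal_unit_type` 7, i.e. a high-priority SPS, while 0x01 is a non-reference, non-IDR slice.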
![NAL Units and Parameter Sets in H.264](https://mymusing.co/wp-content/uploads/2022/08/h264_syntax.png)
A coded sequence begins with an Instantaneous Decoder Refresh (IDR) Access Unit, made up of one or more IDR slices, each of which is an Intra coded slice. This is followed by the default slice type, i.e. a non-IDR coded slice, and/or by Data Partitioned slices. These slice-carrying units are the Video Coding Layer (VCL) NAL units. Non-VCL NAL units include Parameter Sets; Supplemental Enhancement Information (SEI), i.e. parameters that may be useful for decoding and displaying video data but are not essential for correct decoding; and delimiters that define boundaries between coded sections.
Parameter Sets
Parameter Sets are NAL units that carry decoding parameters common to a number of coded slices. Sending these parameters independently of coded slices can improve efficiency, since common parameters need only be sent once. In a lossy transmission scenario, Parameter Sets may be sent with a higher quality of service using e.g. Forward Error Correction or a priority mechanism.
A Sequence Parameter Set (SPS) contains parameters common to an entire video sequence, such as the Profile and Level, the size of a video frame and certain decoder constraints such as the maximum number of reference frames. Each SPS has a unique identifier.
A Picture Parameter Set (PPS) contains common parameters that may apply to a sequence or subset of coded frames, such as the entropy coding type, the number of active reference pictures and initialization parameters.
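To make the SPS fields more concrete, here is a rough sketch that reads the first few of them: `profile_idc`, `level_idc` and the identifier `seq_parameter_set_id`, which is stored as an unsigned Exp-Golomb code. The helper names (`get_bit`, `read_ue`, `parse_sps_prefix`) and the sample bytes are illustrative assumptions. The sketch also assumes that no emulation prevention byte occurs within these first bytes, which would need handling in a full parser.

```python
PROFILE_NAMES = {66: "Baseline", 77: "Main", 88: "Extended", 100: "High"}


def get_bit(data: bytes, bit_pos: int) -> int:
    """Return the bit at bit_pos, counting from the most significant bit."""
    return (data[bit_pos // 8] >> (7 - bit_pos % 8)) & 1


def read_ue(data: bytes, bit_pos: int) -> tuple:
    """Decode one unsigned Exp-Golomb value; return (value, next_bit_pos)."""
    leading_zeros = 0
    while get_bit(data, bit_pos) == 0:
        leading_zeros += 1
        bit_pos += 1
    bit_pos += 1  # skip the terminating 1 bit
    suffix = 0
    for _ in range(leading_zeros):
        suffix = (suffix << 1) | get_bit(data, bit_pos)
        bit_pos += 1
    return (1 << leading_zeros) - 1 + suffix, bit_pos


def parse_sps_prefix(sps_nal: bytes) -> dict:
    """Read profile, level and SPS id from an SPS NAL unit (header included)."""
    if len(sps_nal) < 5 or (sps_nal[0] & 0x1F) != 7:
        raise ValueError("not an SPS NAL unit")
    profile_idc = sps_nal[1]
    # sps_nal[2] holds the constraint_set flags and reserved bits.
    level_idc = sps_nal[3]
    sps_id, _ = read_ue(sps_nal, 32)  # bit 32 is the first bit of byte 4
    return {
        "profile_idc": profile_idc,
        "profile_name": PROFILE_NAMES.get(profile_idc, "unknown"),
        "level": level_idc / 10,
        "seq_parameter_set_id": sps_id,
    }


if __name__ == "__main__":
    # Hypothetical SPS prefix: header 0x67, Baseline profile, level 3.0.
    print(parse_sps_prefix(bytes([0x67, 0x42, 0x00, 0x1E, 0xAB])))
```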
![](https://mymusing.co/wp-content/uploads/2022/09/nal_types.jpg)
NAL Size
As noted above, from the NAL perspective, the H.264 stream is just a sequence of NAL units. But how do we know where one NAL unit ends and another one starts?
Annex B
Annex B is a format in which every NAL unit is preceded by a so-called “start code”: either the three-byte sequence 0x000001 or the four-byte sequence 0x00000001. In this format, the H.264 stream might look as follows:
([0x000001][first NAL unit]) | ([0x000001][second NAL unit]) | ([0x000001][third NAL unit]) …
Two start codes exist because each suits different situations: the four-byte form is typically used at the start of an access unit and before parameter sets, while the three-byte form saves a byte elsewhere. The choice of 0x000001 and 0x00000001 as NAL unit separators was deliberate: such a sequence of bytes carries little information (recall Shannon’s formula for the amount of information), so the compression algorithm used in H.264 rarely produces it. Where the coded payload would nevertheless contain such a sequence, the encoder inserts an emulation prevention byte (0x03) after two consecutive zero bytes, so a start code can never appear inside a NAL unit.
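Splitting an Annex B stream on its start codes can be sketched as follows. The function name `split_annex_b` and the sample bytes are illustrative assumptions. The sketch searches only for the three-byte pattern, since the four-byte start code is the same pattern with an extra leading zero, and it strips trailing zero bytes from each unit, relying on the rule that a NAL unit never ends in 0x00.

```python
def split_annex_b(stream: bytes) -> list:
    """Return the NAL units (without start codes) found in an Annex B stream."""
    start_code = b"\x00\x00\x01"
    nal_units = []
    pos = stream.find(start_code)
    while pos != -1:
        start = pos + len(start_code)
        next_pos = stream.find(start_code, start)
        end = len(stream) if next_pos == -1 else next_pos
        # Trailing zeros belong to the next (four-byte) start code or to
        # padding, because the last byte of a NAL unit is never 0x00.
        nal_units.append(stream[start:end].rstrip(b"\x00"))
        pos = next_pos
    return nal_units


if __name__ == "__main__":
    sample = (
        b"\x00\x00\x00\x01\x67\x42"   # four-byte start code + SPS-like unit
        b"\x00\x00\x01\x68\xce"       # three-byte start code + PPS-like unit
        b"\x00\x00\x00\x01\x65\x88"   # four-byte start code + IDR-like unit
    )
    for nal in split_annex_b(sample):
        print(nal.hex())
```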
AVCC
In this format, each NAL unit is preceded by a nal_size field describing its length. That length may be stored in 1, 2 or 4 bytes, which is why an additional header (known as “extradata”, “sequence header” or “AVCDecoderConfigurationRecord”) must be present in the stream to specify how many bytes are used to store the NAL unit’s length. The stream then might look as follows:
([extradata]) | ([length] NALu) | ([length] NALu) | …
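Reading the length-prefixed NAL units can be sketched as below. The function name `split_avcc` is an illustrative assumption, and `length_size` is assumed to have already been obtained from the extradata. The length fields are read as big-endian integers, as is usual in ISO media formats.

```python
def split_avcc(payload: bytes, length_size: int) -> list:
    """Split a length-prefixed (AVCC) payload into its NAL units."""
    if length_size not in (1, 2, 4):
        raise ValueError("length_size must be 1, 2 or 4")
    nal_units = []
    pos = 0
    while pos < len(payload):
        if pos + length_size > len(payload):
            raise ValueError("truncated length field")
        nal_size = int.from_bytes(payload[pos:pos + length_size], "big")
        pos += length_size
        if pos + nal_size > len(payload):
            raise ValueError("NAL unit runs past the end of the payload")
        nal_units.append(payload[pos:pos + nal_size])
        pos += nal_size
    return nal_units


if __name__ == "__main__":
    sample = b"\x00\x00\x00\x02\x67\x42" + b"\x00\x00\x00\x03\x68\xce\x3c"
    for nal in split_avcc(sample, length_size=4):
        print(nal.hex())
```

Unlike the start code approach, no scanning is needed here: the reader jumps directly from one unit to the next.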
The syntax of the extradata is described in MPEG-4 Part 15 “Advanced Video Coding (AVC) file format” section 5.2.4.1:

    aligned(8) class AVCDecoderConfigurationRecord {
        unsigned int(8) configurationVersion = 1;
        unsigned int(8) AVCProfileIndication;
        unsigned int(8) profile_compatibility;
        unsigned int(8) AVCLevelIndication;
        bit(6) reserved = '111111'b;
        unsigned int(2) lengthSizeMinusOne;
        bit(3) reserved = '111'b;
        unsigned int(5) numOfSequenceParameterSets;
        for (i=0; i< numOfSequenceParameterSets; i++) {
            unsigned int(16) sequenceParameterSetLength;
            bit(8*sequenceParameterSetLength) sequenceParameterSetNALUnit;
        }
        unsigned int(8) numOfPictureParameterSets;
        for (i=0; i< numOfPictureParameterSets; i++) {
            unsigned int(16) pictureParameterSetLength;
            bit(8*pictureParameterSetLength) pictureParameterSetNALUnit;
        }
    }
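The record above can be walked field by field. The following is a minimal sketch rather than a complete parser: the function name `parse_avcc_extradata`, the dictionary keys and the sample bytes are illustrative assumptions, and bounds checking is kept to a minimum.

```python
def parse_avcc_extradata(extradata: bytes) -> dict:
    """Parse the fields of an AVCDecoderConfigurationRecord."""
    if len(extradata) < 7 or extradata[0] != 1:
        raise ValueError("not a version 1 AVCDecoderConfigurationRecord")
    record = {
        "profile": extradata[1],                  # AVCProfileIndication
        "profile_compatibility": extradata[2],
        "level": extradata[3],                    # AVCLevelIndication
        # Low 2 bits hold lengthSizeMinusOne; the top 6 bits are reserved.
        "length_size": (extradata[4] & 0x03) + 1,
        "sps": [],
        "pps": [],
    }
    pos = 5
    # Low 5 bits hold numOfSequenceParameterSets; the top 3 are reserved.
    num_sps = extradata[pos] & 0x1F
    pos += 1
    for _ in range(num_sps):
        size = int.from_bytes(extradata[pos:pos + 2], "big")
        pos += 2
        record["sps"].append(extradata[pos:pos + size])
        pos += size
    num_pps = extradata[pos]
    pos += 1
    for _ in range(num_pps):
        size = int.from_bytes(extradata[pos:pos + 2], "big")
        pos += 2
        record["pps"].append(extradata[pos:pos + size])
        pos += size
    return record


if __name__ == "__main__":
    # Hypothetical record: Baseline, level 3.0, 4-byte lengths, 1 SPS, 1 PPS.
    sample = bytes([0x01, 0x42, 0x00, 0x1E, 0xFF, 0xE1,
                    0x00, 0x04, 0x67, 0x42, 0x00, 0x1E,
                    0x01, 0x00, 0x02, 0x68, 0xCE])
    print(parse_avcc_extradata(sample))
```

The resulting `length_size` value is what a reader needs in order to interpret the length field in front of each NAL unit in the rest of the stream.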