There are three types of macroblocks:

  • I Macroblock – An I Macroblock (I MB) is predicted using intra prediction from neighboring samples in the current frame.
  • P Macroblock – A P Macroblock (P MB) is predicted from samples in a previously coded frame, which may be before or after the current picture in display order.
  • B Macroblock – Each partition in a B Macroblock (B MB) is predicted from samples in one or two previously coded frames, for example, one ‘past’ and one ‘future’.

Intra prediction

An intra (I) macroblock is coded without referring to any data outside the current slice. Every macroblock in an I slice is an I macroblock; I macroblocks may also occur in P and B slices. I macroblocks are coded using intra prediction, which uses samples from adjacent, previously coded blocks to predict the values in the current block. Only those samples that are actually available may be used to form a prediction.
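To illustrate how sample availability shapes the prediction, the following minimal sketch mimics the DC mode for a 4 × 4 luma block with 8-bit samples. DC mode is one of several intra modes in H.264 and is used here only as an example; the function name and the use of None to mark unavailable neighbours are assumptions made for this sketch.

```python
def dc_predict_4x4(top, left):
    """Return a 4x4 DC prediction block (8-bit samples).

    top  -- list of 4 reconstructed samples above the block, or None if unavailable
    left -- list of 4 reconstructed samples left of the block, or None if unavailable
    (None as the "unavailable" marker is an assumption of this sketch.)
    """
    if top is not None and left is not None:
        dc = (sum(top) + sum(left) + 4) >> 3   # rounded mean of 8 neighbours
    elif top is not None:
        dc = (sum(top) + 2) >> 2               # rounded mean of the 4 upper neighbours
    elif left is not None:
        dc = (sum(left) + 2) >> 2              # rounded mean of the 4 left neighbours
    else:
        dc = 128                               # no neighbours: mid-range value
    # Every sample of the prediction block takes the same DC value.
    return [[dc] * 4 for _ in range(4)]


pred = dc_predict_4x4([100, 100, 100, 100], [104, 104, 104, 104])
print(pred[0])
```

A block at the top-left corner of a slice has no available neighbours, so in this sketch it falls back to the constant value 128.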

In an intra macroblock, there are three choices of intra prediction block size for the luma component, namely 16 × 16, 8 × 8 or 4 × 4 (the 8 × 8 option is available only in the High profiles). A single prediction block is generated for each chroma component.

  • Smaller blocks: A smaller prediction block size (4 × 4) tends to give a more accurate prediction, i.e. the intra prediction for each block is a good match to the actual data in the block. This in turn means a smaller coded residual, so that fewer bits are required to code the quantized transform coefficients for the residual blocks. However, the choice of prediction for every 4 × 4 block must be signalled to the decoder, which means that more bits tend to be required to code the prediction choices.
  • Larger blocks: A larger prediction block size (16 × 16) tends to give a less accurate prediction, hence more residual data, but fewer bits are required to code the prediction choice itself.
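The signalling side of this trade-off can be made concrete by counting how many luma prediction choices must be conveyed per macroblock for each block size. This is a small illustrative calculation; the function name is hypothetical and the count ignores how efficiently each choice is actually coded.

```python
def luma_mode_choices(block_size):
    """Number of luma prediction blocks (one mode choice each) in a 16x16 macroblock."""
    return (16 // block_size) ** 2


for size in (16, 8, 4):
    print(f"{size} x {size}: {luma_mode_choices(size)} prediction choice(s) per macroblock")
```

With 4 × 4 blocks there are sixteen choices to signal instead of one, which is the overhead that the smaller residual has to outweigh.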

Inter prediction

Inter prediction is the process of predicting a block of luma and chroma samples from a picture that has previously been coded and transmitted, a reference picture. This involves selecting a prediction region, generating a prediction block and subtracting this from the original block of samples to form a residual that is then coded and transmitted. The block of samples to be predicted, a macroblock partition or sub-macroblock partition, can range in size from a complete macroblock, i.e. 16 × 16 luma samples and corresponding chroma samples, down to a 4 × 4 block of luma samples and corresponding chroma samples.
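The three steps (select a prediction region, form the prediction block, subtract to get the residual) can be sketched as follows. This is a simplified illustration that assumes an integer-sample offset, a luma-only picture stored as a list of rows, and a prediction region that lies entirely inside the reference picture; the function names are hypothetical.

```python
def predict_block(ref, x, y, offset, width, height):
    """Copy the width x height region of `ref` displaced by an integer offset (dx, dy).

    (x, y) is the top-left corner of the current block; no boundary handling is done.
    """
    dx, dy = offset
    return [[ref[y + dy + r][x + dx + c] for c in range(width)]
            for r in range(height)]


def residual(current, prediction):
    """Sample-wise difference between the original block and its prediction."""
    return [[a - b for a, b in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, prediction)]


# Synthetic 8x8 reference picture; the current 4x4 block at (2, 2) is an exact
# copy of the reference region displaced by (+1, -1).
ref = [[r * 8 + c for c in range(8)] for r in range(8)]
current = predict_block(ref, 2, 2, (1, -1), 4, 4)

good = residual(current, predict_block(ref, 2, 2, (1, -1), 4, 4))
poor = residual(current, predict_block(ref, 2, 2, (0, 0), 4, 4))
print(good[0], poor[0])
```

When the offset matches the true displacement the residual is all zeros; a poorer choice leaves a non-zero residual that would cost bits to code.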

The reference picture is chosen from a list of previously coded pictures, stored in a Decoded Picture Buffer, which may include pictures before and after the current picture in display order. The offset between the position of the current partition and the prediction region in the reference picture is a motion vector. The motion vector may point to integer, half- or quarter-sample positions in the luma component of the reference picture. Half- or quarter-sample positions are generated by interpolating the samples of the reference picture.
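As a rough sketch of the interpolation step, the code below computes a luma half-sample value along one row with the six-tap filter (1, −5, 20, 20, −5, 1)/32 used by H.264, and a quarter-sample value as the rounded average of two neighbouring values. It assumes 8-bit samples and works in one dimension only; the function names are hypothetical, and picture-edge handling and the two-dimensional centre position are left out.

```python
def clip8(value):
    """Clamp to the 8-bit sample range."""
    return max(0, min(255, value))


def half_sample(row, i):
    """Half-sample value between row[i] and row[i + 1].

    Needs two samples to the left of row[i] and two to the right of row[i + 1];
    edge handling is omitted in this sketch.
    """
    taps = (1, -5, 20, 20, -5, 1)
    acc = sum(t * row[i - 2 + k] for k, t in enumerate(taps))
    return clip8((acc + 16) >> 5)   # divide by 32 with rounding


def quarter_sample(a, b):
    """Quarter-sample value as the rounded average of two neighbouring values."""
    return (a + b + 1) >> 1


edge = [0, 0, 0, 0, 255, 255, 255, 255]
h = half_sample(edge, 3)          # halfway across the step edge
q = quarter_sample(edge[3], h)    # between the integer sample and the half sample
print(h, q)
```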

The prediction block may be generated from a single prediction region in a reference picture, for a P or B macroblock, or from two prediction regions in reference pictures, for a B macroblock. Each partition in an inter-coded macroblock is predicted from an area of the same size in a reference picture.
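For the two-region case, a minimal sketch of combining the regions is a rounded sample-wise average, which corresponds to the default (unweighted) behaviour in H.264; explicit weighting is also possible but is not shown. The function name is hypothetical, and both inputs are assumed to be equally sized blocks stored as lists of rows.

```python
def bi_predict(pred0, pred1):
    """Rounded average of two equally sized prediction regions (unweighted case)."""
    return [[(a + b + 1) >> 1 for a, b in zip(row0, row1)]
            for row0, row1 in zip(pred0, pred1)]


past = [[10, 20], [30, 40]]
future = [[20, 21], [30, 45]]
print(bi_predict(past, future))
```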

Macroblock partitions

Each 16 × 16 P or B macroblock may be predicted using a range of block sizes. The macroblock is split into one, two or four macroblock partitions:

  • one 16 × 16 macroblock partition (covering the whole MB),
  • two 8 × 16 partitions,
  • two 16 × 8 partitions or
  • four 8 × 8 partitions.

If the 8 × 8 partition size is chosen, then each 8 × 8 block of luma samples and associated chroma samples, a sub-macroblock, is split into one, two or four sub-macroblock partitions: one 8 × 8, two 4 × 8, two 8 × 4 or four 4 × 4 sub-macroblock partitions.
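The partitioning choices can be enumerated with a short sketch that tiles a block with equally sized rectangles. Sizes are written as (width, height) in luma samples; the sub-macroblock partition sizes 8 × 8, 4 × 8, 8 × 4 and 4 × 4 are those defined by H.264, while the names below are hypothetical and chosen for this illustration.

```python
MB_PARTITION_SIZES = [(16, 16), (8, 16), (16, 8), (8, 8)]
SUB_MB_PARTITION_SIZES = [(8, 8), (4, 8), (8, 4), (4, 4)]


def split(width, height, part_w, part_h):
    """List the (x, y, w, h) rectangles that tile a width x height block."""
    return [(x, y, part_w, part_h)
            for y in range(0, height, part_h)
            for x in range(0, width, part_w)]


for w, h in MB_PARTITION_SIZES:
    print(f"{w} x {h}: {len(split(16, 16, w, h))} macroblock partition(s)")

for w, h in SUB_MB_PARTITION_SIZES:
    print(f"{w} x {h}: {len(split(8, 8, w, h))} sub-macroblock partition(s)")
```

Each of the four sub-macroblocks can choose its split independently, so a single macroblock may mix, for example, one 8 × 8 sub-macroblock partition with four 4 × 4 ones.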

Macroblock Prediction in H.264
Macroblock partitions and sub-macroblock partitions

Each macroblock partition and sub-macroblock partition has one or two motion vectors (x, y), each pointing to an area of the same size in a reference frame that is used to predict the current partition. A partition in a P macroblock has one reference frame and one motion vector. A partition in a B macroblock has one or two reference frames and one or two corresponding motion vectors. Different macroblock partitions may be predicted from different reference frame(s), whereas the sub-macroblock partitions within one sub-macroblock share the same reference frame(s).
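One way to picture the motion data attached to a partition is a small record holding its size, reference frame indices and motion vectors, together with a check of the P and B rules just described. This is a hypothetical data structure for illustration, not the bitstream syntax; the field names and the "P"/"B" string labels are assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Partition:
    width: int
    height: int
    ref_frames: List[int]                   # one index per reference frame used
    motion_vectors: List[Tuple[int, int]]   # one (x, y) vector per reference frame


def is_valid(partition, mb_type):
    """Check the vector/reference counts allowed for a P or B macroblock partition."""
    count = len(partition.motion_vectors)
    if len(partition.ref_frames) != count:
        return False          # each motion vector needs a corresponding reference frame
    if mb_type == "P":
        return count == 1     # exactly one reference frame and one vector
    if mb_type == "B":
        return count in (1, 2)
    return False


single = Partition(16, 16, ref_frames=[0], motion_vectors=[(4, -2)])
double = Partition(8, 8, ref_frames=[0, 1], motion_vectors=[(1, 1), (-3, 0)])
print(is_valid(single, "P"), is_valid(double, "B"), is_valid(double, "P"))
```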