MPEG video sequences are made up of groups of pictures (GOPs), each comprising a preset number of coded frames, including one I frame and one or more P and B frames.
Pictures are equivalent to video frames or images. The I frame provides the initial reference to start the encoding process. The interleaving of I, P, and B frames in a video sequence is content-dependent.
For example, video conferencing applications may employ more B frames since there is little motion in the video. On the other hand, sports content with rapid or frequent motion may require more I frames in order to maintain good video quality.
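The frame interleaving described above can be sketched with a small helper. This is only an illustration: the function name `gop_pattern` and its parameters are hypothetical, the pattern is listed in display order, and the last frame is forced to be a P frame so that no B frame needs a reference outside the GOP (a closed-GOP simplification that real encoders need not follow).

```python
def gop_pattern(gop_length: int, b_between: int) -> str:
    """Return the frame types of one GOP in display order.

    gop_length: total number of frames in the GOP (assumed >= 1).
    b_between:  number of B frames placed between consecutive anchor
                (I or P) frames (assumed >= 0).
    """
    if gop_length < 1 or b_between < 0:
        raise ValueError("gop_length must be >= 1 and b_between >= 0")

    anchor_spacing = b_between + 1
    frames = []
    for i in range(gop_length):
        if i == 0:
            frames.append("I")      # each GOP starts with its single I frame
        elif i % anchor_spacing == 0:
            frames.append("P")      # forward-predicted anchor frame
        else:
            frames.append("B")      # bidirectionally predicted frame

    # Simplification: end the GOP on an anchor frame so every B frame
    # has a later reference inside the same GOP.
    if frames[-1] == "B":
        frames[-1] = "P"
    return "".join(frames)


print(gop_pattern(12, 2))  # low-motion style pattern with many B frames
print(gop_pattern(10, 0))  # pattern without B frames
```

Under these assumptions, a larger `b_between` corresponds to the B-frame-heavy patterns suggested for low-motion content, while a smaller `gop_length` yields more frequent I frames.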
GOP Length
Longer GOPs tend to suit low-motion video content because such content depends less on I frames: successive frames differ little, so P and B frames predict them well. Low-motion content also tends to allow more B frames in a GOP without sacrificing video quality. Using fewer I frames in turn improves the video compression efficiency. However, long GOPs may degrade error resilience, since an error in a reference frame can propagate until the next I frame. This may be a problem for streaming media and Blu-ray authoring. Longer GOPs also increase the latency in the transmission of the frames since the entire GOP must be assembled before transmission can occur.
On the other hand, choosing long GOPs or GOPs with more B frames for high-motion or high-action movies with frequent scene changes may degrade the video quality.
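The effect of GOP length on compression efficiency can be illustrated with a rough calculation. The sketch below is not a model of any real encoder: the function `average_frame_size` is hypothetical, and the default frame sizes (100, 40, and 15 units for I, P, and B frames) are made-up values chosen only to reflect that I frames are typically the largest and B frames the smallest.

```python
def average_frame_size(gop_length, b_between,
                       size_i=100.0, size_p=40.0, size_b=15.0):
    """Average coded size per frame for one GOP.

    The sizes are illustrative assumptions, in arbitrary units.
    The GOP holds one I frame, anchors every (b_between + 1) frames,
    and ends on a P frame (a closed-GOP simplification).
    """
    if gop_length < 1 or b_between < 0:
        raise ValueError("gop_length must be >= 1 and b_between >= 0")

    anchor_spacing = b_between + 1
    total = size_i  # every GOP starts with exactly one I frame
    for i in range(1, gop_length):
        is_anchor = (i % anchor_spacing == 0) or (i == gop_length - 1)
        total += size_p if is_anchor else size_b
    return total / gop_length


for n in (1, 6, 12, 24):
    print(n, round(average_frame_size(n, 2), 2))
```

With fixed sizes, the average always falls as the GOP grows, which matches the low-motion case. For high-motion content the P and B frame sizes would rise toward the I frame size (or quality would drop at a fixed bit rate), so the benefit of a long GOP shrinks; that could be explored by passing larger `size_p` and `size_b` values.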
Variable GOP Length
While a fixed GOP length gives the video sequence more structure, it is not mandatory. A GOP with a variable length adheres to the frame pattern but allows the flexibility of inserting an I frame when the video content demands it. For example, an I frame can be inserted when a new scene begins. This can lead to better compression efficiency than inserting I frames at fixed intervals.
Video conferencing applications typically do not require an I frame for every group of 10 frames because the content is relatively static with few scene changes.
By using fewer I frames, more B and P frames can be used to improve compression efficiency, and the GOPs become longer. However, there is a limit on the maximum GOP length because the P frames depend, directly or through earlier P frames, on the I frame as a reference.
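A variable-length GOP with a maximum length can be sketched as follows. This is a minimal illustration under stated assumptions: the per-frame `scene_change_scores` (values between 0 and 1), the `threshold`, and the function name `assign_frame_types` are all hypothetical, and B frames are left out to keep the logic short.

```python
def assign_frame_types(scene_change_scores, threshold=0.5, max_gop_length=30):
    """Choose I or P for each frame of a sequence.

    An I frame is inserted for the first frame, whenever the
    (hypothetical) scene-change score exceeds the threshold, or when
    the current GOP has already reached max_gop_length frames.
    """
    if max_gop_length < 1:
        raise ValueError("max_gop_length must be >= 1")

    types = []
    frames_in_gop = 0  # frames in the current GOP, including its I frame
    for index, score in enumerate(scene_change_scores):
        start_new_gop = (
            index == 0
            or score > threshold
            or frames_in_gop >= max_gop_length
        )
        if start_new_gop:
            types.append("I")
            frames_in_gop = 1
        else:
            types.append("P")
            frames_in_gop += 1
    return "".join(types)


# Static content with one scene change at frame 5, GOP capped at 4 frames.
scores = [0.0, 0.0, 0.0, 0.0, 0.0, 0.9, 0.0, 0.0]
print(assign_frame_types(scores, threshold=0.5, max_gop_length=4))
```

For static content such as video conferencing, the scores would rarely exceed the threshold, so the GOP length would be governed mostly by `max_gop_length`.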
An MPEG video bitstream may be accessed or switched at any time, such as during channel seeking (i.e., forward/reverse playback) and channel surfing. However, due to temporal prediction, a video decoder cannot start decoding a compressed video at a frame that is predicted from previous frames. The insertion of I frames allows random access to a video bitstream because they are encoded without any prediction from other frames. Thus, the latency in random access is inversely proportional to the rate of I frame insertion: the more often I frames occur, the sooner the decoder reaches one. However, the compression efficiency decreases as I frames are inserted more frequently because these frames employ only spatial compression.
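The relationship between I frame spacing and random access latency can be made concrete with a small calculation. The sketch assumes that I frames occur at a fixed interval, that the decoder joins the stream at a uniformly random frame, and that it simply waits for the next I frame; the function name `random_access_wait` and its parameters are hypothetical, and decoding and buffering delays are ignored.

```python
def random_access_wait(i_frame_interval, frame_rate):
    """Return (worst_case_seconds, mean_seconds) until the next I frame.

    i_frame_interval: number of frames from one I frame to the next.
    frame_rate:       frames per second.
    """
    if i_frame_interval < 1 or frame_rate <= 0:
        raise ValueError("i_frame_interval must be >= 1 and frame_rate > 0")

    # Joining exactly on an I frame costs no wait; joining on the frame
    # right after an I frame costs the longest wait.
    worst_frames = i_frame_interval - 1
    # Mean over all join positions 0 .. i_frame_interval - 1.
    mean_frames = (i_frame_interval - 1) / 2
    return worst_frames / frame_rate, mean_frames / frame_rate


for interval in (4, 16, 64):
    worst, mean = random_access_wait(interval, frame_rate=30)
    print(interval, round(worst, 3), round(mean, 3))
```

Under these assumptions both the worst-case and the mean wait grow linearly with the I frame interval, which is the inverse relationship to the I frame insertion rate noted above.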