What is Chroma Subsampling?

Chroma subsampling is the practice of encoding images with lower resolution for chroma information than for luma information.

The human visual system is much more sensitive to variations in brightness (luminance or luma) than in color (chrominance or chroma). A video system can therefore be optimized by devoting more bandwidth to the luma component (denoted Y’) than to the color-difference components Cb and Cr.

Chroma subsampling is used to reduce the amount of data in a video signal while having little or no visible impact on image quality. Luma (Y’) is a weighted sum of gamma-corrected R’G’B’ components, as opposed to linear luminance (Y). A gamma-corrected signal emulates the roughly logarithmic sensitivity of human vision, with more levels dedicated to the darker tones than to the lighter ones.

Subsampling is also known as downsampling, or sampling rate compression. If the input signal is not suitably bandlimited, subsampling results in aliasing and information loss, and the operation is not reversible. To avoid aliasing, most applications apply a low-pass filter before subsampling, which ensures that the signal is bandlimited.
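The filter-then-decimate idea can be sketched as follows. This is a minimal illustration, not a production resampler: it assumes a chroma plane stored as a list of rows with even width and height, and it uses a 2×2 box average as the low-pass filter. Real encoders may use other filters, and the function name `subsample_420` is hypothetical.

```python
def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging each 2x2 block.

    The averaging acts as a crude low-pass filter applied before decimation.
    Assumes `plane` is a list of equal-length rows with even width and height.
    """
    height, width = len(plane), len(plane[0])
    out = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            total = (plane[y][x] + plane[y][x + 1]
                     + plane[y + 1][x] + plane[y + 1][x + 1])
            row.append((total + 2) // 4)  # rounded integer mean of the block
        out.append(row)
    return out


# A 4x2 Cb plane shrinks to 2x1: one averaged sample per 2x2 block.
cb = [
    [100, 102, 200, 202],
    [104, 106, 204, 206],
]
print(subsample_420(cb))  # [[103, 203]]
```

Dropping every other sample without the averaging step would also halve the resolution, but fine chroma detail would then alias into the result.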

Subsampling Notation

Video created with chroma subsampling includes brightness information for every single pixel, but not color information. Color information is shared among adjacent pixels. The number of pixels that share the same color information is determined by the type of chroma subsampling. Common notation for representing chroma subsampling is J:a:b, where

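J indicates the number of luminance samples in each row of the reference block (the horizontal sampling reference). It is usually 4.
a indicates the number of chrominance samples (Cr, Cb) in the first row of J pixels. (Cb, Cr) are the digital equivalents of (U, V).
b describes how many additional chrominance samples are taken in the second (lower) row of J pixels; 0 means the lower row reuses the chroma samples of the first row.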
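As a rough sketch, the J:a:b digits can be turned into horizontal and vertical chroma subsampling factors. The helper name `subsampling_factors` is hypothetical, and the code assumes that b is either 0 or equal to a, which holds for the schemes discussed here.

```python
def subsampling_factors(notation):
    """Return (horizontal, vertical) chroma subsampling factors for 'J:a:b'.

    Assumes a > 0 and that b is either 0 or equal to a, which covers
    schemes such as 4:4:4, 4:2:2 and 4:2:0.
    """
    j, a, b = (int(part) for part in notation.split(":"))
    if a <= 0 or b not in (0, a):
        raise ValueError(f"unsupported scheme: {notation}")
    horizontal = j // a              # luma samples per chroma sample in a row
    vertical = 2 if b == 0 else 1    # b == 0: second row shares the first row's chroma
    return horizontal, vertical


for scheme in ("4:4:4", "4:2:2", "4:2:0"):
    print(scheme, subsampling_factors(scheme))
```

A factor of 1 means no subsampling in that direction, and a factor of 2 means one chroma sample is shared by two luma samples.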

4:4:4 and 4:2:2 Sampling

The numbers indicate the relative sampling rate of each component in the horizontal direction, i.e. for every 4 luminance samples there are 4 Cr and 4 Cb samples. 4:4:4 sampling preserves the full fidelity of the chrominance components. In 4:2:2 sampling, used by packed formats such as YUY2, the chrominance components have the same vertical resolution as the luma but half the horizontal resolution. The numbers 4:2:2 mean that for every 4 luminance samples in the horizontal direction there are 2 Cr and 2 Cb samples.

4:2:0 Sampling

In the 4:2:0 sampling format (used by planar formats such as YV12), Cr and Cb each have half the horizontal and half the vertical resolution of Y. The term ‘4:2:0’ is rather confusing because the numbers do not actually have a logical interpretation and appear to have been chosen historically as a ‘code’ to identify this particular sampling pattern. If the luminance component of the video signal is sampled at 13.5 MHz and the chrominance components at 6.75 MHz, the result is a 4:2:2 Y:Cr:Cb component signal.

Some of the more common chroma subsampling schemes are 4:4:4, 4:2:2, and 4:2:0, as shown in the figure below. 4:4:4 keeps chroma at full resolution, so no subsampling is performed.

The two rows contain eight luminance samples, four in each row. In 4:2:0 sampling, two samples of each chrominance component are present for every 8 pixels. In chroma subsampling notation, the first digit indicates the luma horizontal sampling reference. The second digit specifies the horizontal subsampling of Cb and Cr with respect to luma.

Sampling Types

Commonly used sampling types are:

4:4:4 : The original RGB data found in every pixel of a video frame is used to calculate Y’CbCr values for those pixels. As shown in the figure above, there is no change in the space needed to store the video. With 8 bits per sample, a 2×2 array of R’G’B’ pixels consumes 12 bytes.
4:2:2 : It maintains all of the information in the luma Y’ channel. Cb and Cr values are sampled at half the horizontal rate of the luma channel, so every other pixel in each line of a pixel array is stored without Cb and Cr information. It requires about one third (33%) less bandwidth and storage space than 4:4:4 sampling. When displaying 4:2:2 video, the missing Cb and Cr values are filled in from adjacent horizontal pixels (also called interpolation). The 12 bytes of R’G’B’ are reduced to 8, effecting 1.5:1 lossy compression.
4:2:0 : It maintains all of the information in the luma Y’ channel as before, but Cb and Cr are sampled at half the horizontal and half the vertical rate of luma. This results in an approximately 50% reduction in bandwidth and storage requirements. The 12 bytes of R’G’B’ are reduced to 6, effecting 2:1 lossy compression.
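The fill-in step described for 4:2:2 playback can be sketched with the simplest possible interpolation, which repeats each stored chroma sample for the neighboring pixel. This is an illustrative assumption: real decoders may use linear or longer interpolation filters, and the function name `upsample_row_422` is hypothetical.

```python
def upsample_row_422(chroma_row):
    """Rebuild a full-width chroma row by repeating each stored sample twice."""
    full = []
    for sample in chroma_row:
        full.extend([sample, sample])  # the neighbor reuses the stored value
    return full


# Two stored Cb samples cover a row of four pixels.
print(upsample_row_422([110, 180]))  # [110, 110, 180, 180]
```

For 4:2:0 material the same idea applies in both directions, so each stored chroma sample covers a 2×2 block of pixels.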

The luma sample usually takes precedence over the chroma samples because the human eye is more sensitive to brightness than color (e.g., hue, saturation). If a 4:2:0 video frame is progressively sampled with a rectangular size of W × H (where W is the width and H is the height of the frame in terms of luma samples), then the rectangular dimension of each chroma component array is reduced to W/2 × H/2.

Format	Sampling	Size
YUV 4:4:4	Typically 8 bits per Y, U, V plane; no subsampling	24 bits/pixel, 8 bits/sample
YUV 4:2:2	4 Y samples for every 2 U and 2 V; 2:1 horizontal subsampling; no vertical subsampling	16 bits/pixel, 8 bits/luma sample
YUV 4:2:0	4 Y samples for every 1 U and 1 V; 2:1 horizontal subsampling; 2:1 vertical subsampling	12 bits/pixel, 8 bits/luma sample
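The bits-per-pixel figures for these formats can be checked with a short calculation over a J × 2 pixel region. This is a sketch under the assumption of 8 bits per sample for every component; the function name `bits_per_pixel` is hypothetical.

```python
def bits_per_pixel(j, a, b, bits_per_sample=8):
    """Average bits per pixel for a J:a:b scheme over a J x 2 pixel region."""
    luma_samples = 2 * j            # two rows of J luma samples, one per pixel
    chroma_samples = 2 * (a + b)    # Cb and Cr: a in the first row, b in the second
    return bits_per_sample * (luma_samples + chroma_samples) / luma_samples


for scheme in [(4, 4, 4), (4, 2, 2), (4, 2, 0)]:
    print(scheme, bits_per_pixel(*scheme))
# (4, 4, 4) 24.0
# (4, 2, 2) 16.0
# (4, 2, 0) 12.0
```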

YUV Formats

Example

Consider an image with a resolution of 720 × 576 pixels, where each sample is represented with 8 bits. The Y resolution is therefore 720 × 576 samples.

In 4:4:4 sampling Cr, Cb resolution: 720 × 576 samples, each 8 bits
So total number of bits: 720 × 576 × 8 × 3 = 9953280 bits
In 4:2:0 Cr, Cb resolution: 360 × 288 samples, each 8 bits
Total number of bits: (720 × 576 × 8) + (360 × 288 × 8 × 2) = 4976640 bits

The 4:2:0 version requires half as many bits as the 4:4:4 version.
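The totals above can be reproduced with a small helper. This is a sketch that assumes the frame dimensions are divisible by the subsampling factors; the function name `frame_bits` and its parameters are hypothetical.

```python
def frame_bits(width, height, h_factor, v_factor, bits_per_sample=8):
    """Total bits for one frame with a luma plane and two chroma planes.

    h_factor and v_factor are the horizontal and vertical chroma subsampling
    factors (1, 1 for 4:4:4 and 2, 2 for 4:2:0). Assumes width and height
    are divisible by the factors.
    """
    luma = width * height
    chroma = (width // h_factor) * (height // v_factor)  # samples per chroma plane
    return (luma + 2 * chroma) * bits_per_sample


full = frame_bits(720, 576, 1, 1)   # 4:4:4
sub = frame_bits(720, 576, 2, 2)    # 4:2:0
print(full, sub, sub / full)        # 9953280 4976640 0.5
```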

4:2:0 sampling is sometimes described as 12 bits per pixel. The reason for this can be illustrated by examining a group of 4 pixels. The left-hand diagram shows 4:4:4 sampling. A total of 12 samples are required, 4 each of Y, Cr and Cb, requiring a total of 12 × 8 = 96 bits, i.e. an average of 96/4 = 24 bits per pixel. The right-hand diagram shows 4:2:0 sampling: 6 samples are required, 4 Y and one each of Cr and Cb, requiring a total of 6 × 8 = 48 bits, i.e. an average of 48/4 = 12 bits per pixel.