Chroma subsampling is the practice of encoding images with less resolution for chroma information than for luma information.
The human visual system is much more sensitive to variations in brightness (luminance or luma) than in color (chrominance or chroma). A video system can therefore be optimized by devoting more bandwidth to the luma component (denoted Y’) than to the color difference components Cb and Cr.
Chroma subsampling is used to reduce the amount of data in a video signal while having little or no visible impact on image quality. Luma (Y’) is the gamma-corrected counterpart of luminance (Y): it is computed as a weighted sum of the gamma-corrected R’G’B’ components. A gamma-corrected signal emulates the roughly logarithmic sensitivity of human vision, with more levels dedicated to the darker tones than to the lighter ones.
Video created with chroma subsampling includes brightness information for every single pixel, but not color information. Instead, color information is shared among adjacent pixels. The number of pixels that share the same color information is determined by the type of chroma subsampling. The common notation for chroma subsampling is J:a:b, where
- J is the horizontal sampling reference, i.e. the number of luminance samples in each row of a reference region that is two rows high. It is usually 4.
- a is the number of chrominance samples (Cr, Cb) in the first row of J pixels. (Cb, Cr) are the digital equivalents of the analog (U, V) components.
- b is the number of chrominance samples (Cr, Cb) in the second (lower) row of J pixels.
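The notation above can be turned into horizontal and vertical subsampling factors. The following is a minimal sketch, assuming the simplified reading given in the list (a and b count the chroma samples in the first and second rows of a region J pixels wide and 2 rows high). The function name `subsampling_factors` is hypothetical, and only the cases b = a and b = 0 are handled.

```python
def subsampling_factors(notation):
    """Return (horizontal, vertical) chroma subsampling factors for 'J:a:b'.

    Assumption: the simplified reading of the notation, where `a` and `b`
    are the chroma sample counts in the first and second rows of a region
    that is J luma samples wide and 2 rows high.
    """
    j, a, b = (int(part) for part in notation.split(":"))
    if a <= 0 or j % a != 0:
        raise ValueError(f"unsupported notation: {notation}")
    horizontal = j // a  # luma samples per chroma sample along a row
    if b == a:
        vertical = 1  # the second row carries its own chroma samples
    elif b == 0:
        vertical = 2  # the second row shares the first row's chroma samples
    else:
        raise ValueError(f"unsupported notation: {notation}")
    return horizontal, vertical


for name in ("4:4:4", "4:2:2", "4:2:0"):
    print(name, subsampling_factors(name))
```

A factor of 1 means no subsampling in that direction, and a factor of 2 means one chroma sample for every two luma samples.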
In 4:4:4, the numbers indicate the relative sampling rate of each component in the horizontal direction, i.e. for every 4 luminance samples there are 4 Cr and 4 Cb samples. 4:4:4 sampling preserves the full fidelity of the chrominance components. In 4:2:2 sampling, which is sometimes stored in the packed YUY2 format, the chrominance components have the same vertical resolution as the luma but half the horizontal resolution. The numbers 4:2:2 mean that for every 4 luminance samples in the horizontal direction there are 2 Cr and 2 Cb samples.
In the 4:2:0 sampling format (commonly stored as YV12), Cr and Cb each have half the horizontal and half the vertical resolution of Y. The term ‘4:2:0’ is rather confusing because the numbers do not have a straightforward interpretation as sampling ratios and appear to have been chosen historically as a ‘code’ to identify this particular sampling pattern. For example, if the luminance component of the video signal is sampled at 13.5 MHz, then each chrominance component is sampled at 6.75 MHz to produce a 4:2:2 Y:Cr:Cb component signal.
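The sampling-rate example can be checked with a short calculation. This is only an illustrative sketch: the helper name `chroma_sample_rate_mhz` is hypothetical, and it assumes the chroma rate scales with the ratio a/J of the horizontal digits.

```python
def chroma_sample_rate_mhz(luma_rate_mhz, j, a):
    """Scale the luma sampling rate by the horizontal chroma ratio a/J.

    Assumption: each chroma component is sampled at a/J times the luma rate.
    """
    return luma_rate_mhz * a / j


# 4:2:2 with luma sampled at 13.5 MHz
print(chroma_sample_rate_mhz(13.5, j=4, a=2))  # 6.75
```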
Some of the more common chroma subsampling schemes are 4:4:4, 4:2:2, and 4:2:0, as shown in the figure below. 4:4:4 keeps chroma at full resolution, so no subsampling is performed.
The two rows of the reference region hold eight luminance samples, four in each row. In 4:2:0 sampling, two samples of each chrominance component are present for every 8 pixels. In the chroma subsampling notation, the first digit gives the luma horizontal sampling reference, and the second digit specifies the horizontal subsampling of Cb and Cr with respect to luma.
Commonly used sampling types are
- 4:4:4 : The original RGB data found in every pixel of a video frame is used to calculate Y’C’bC’r values for those pixels. As shown in the figure above, there is no change in the space needed to store the video. With 8 bits per sample, a 2×2 array of R’G’B’ pixels consumes 12 bytes.
- 4:2:2 : It maintains all of the information in the luma Y’ channel. C’b and C’r values are sampled at half the horizontal rate of the luma channel, so every other pixel in each line of a pixel array is stored without C’b and C’r information. It requires about one third (33%) less bandwidth and storage space than 4:4:4 sampling. When displaying 4:2:2 video, the missing C’b and C’r values are filled in from adjacent horizontal pixels (a process called interpolation). The 12 bytes of R’G’B’ are reduced to 8, giving 1.5:1 lossy compression.
- 4:2:0 : It maintains all of the information in the luma Y’ channel as before, but C’b and C’r are sampled at half the horizontal rate and half the vertical rate of luma. It reduces bandwidth and storage requirements by 50% compared with 4:4:4 sampling. The 12 bytes of R’G’B’ are reduced to 6, giving 2:1 lossy compression.
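The byte counts quoted in the list can be reproduced with a small calculation. This is a sketch under stated assumptions: one byte (8 bits) per sample, a 2×2 pixel block, and a hypothetical helper `bytes_per_2x2_block` that takes the horizontal and vertical chroma subsampling factors.

```python
def bytes_per_2x2_block(h_factor, v_factor, bytes_per_sample=1):
    """Bytes needed to store a 2x2 pixel block (Y' plus C'b and C'r)."""
    luma_samples = 4  # one luma sample per pixel
    chroma_samples = (2 // h_factor) * (2 // v_factor)  # per chroma component
    return (luma_samples + 2 * chroma_samples) * bytes_per_sample


schemes = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}
full = bytes_per_2x2_block(1, 1)
for name, (h, v) in schemes.items():
    size = bytes_per_2x2_block(h, v)
    print(f"{name}: {size} bytes, compression {full / size:.1f}:1")
```

The result is 12, 8, and 6 bytes respectively, which corresponds to compression ratios of 1:1, 1.5:1, and 2:1 relative to 4:4:4.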
Luma is usually given priority over chroma because the human eye is more sensitive to brightness than to color (e.g., hue, saturation). If a 4:2:0 video frame is progressively sampled with a rectangular size of W × H (where W is the width and H is the height of the frame in terms of luma samples), then each chroma component array has dimensions W/2 × H/2.
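To make the W/2 × H/2 relationship concrete, the sketch below reduces one chroma plane to 4:2:0 and then rebuilds a full-size plane for display. It assumes even dimensions, averages each 2×2 block when downsampling, and repeats each sample when upsampling. These filters are illustrative choices only, since real encoders and decoders may use different ones, and the function names are hypothetical.

```python
def downsample_420(plane):
    """Average each 2x2 block of a chroma plane (a list of equal-length rows)."""
    height, width = len(plane), len(plane[0])
    if height % 2 or width % 2:
        raise ValueError("this sketch assumes even width and height")
    return [
        [
            # +2 rounds to nearest before the integer division by 4
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1] + 2) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def upsample_420(plane):
    """Rebuild a full-size plane by repeating each sample over a 2x2 block."""
    out = []
    for row in plane:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


cb = [
    [100, 102, 50, 54],
    [104, 106, 52, 56],
]
small = downsample_420(cb)     # 4x2 plane becomes 2x1
print(small)                   # [[103, 53]]
print(upsample_420(small))     # [[103, 103, 53, 53], [103, 103, 53, 53]]
```

The round trip does not restore the original values, which is why chroma subsampling is a lossy step.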
|YUV 4:4:4||Typically 8 bits per sample in each of the Y, U, and V planes|
|YUV 4:2:2||4 Y samples for every 2 U and 2 V; 2:1 horizontal subsampling; no vertical subsampling; 8 bits/luma sample|
|YUV 4:2:0||4 Y samples for every 1 U and 1 V; 2:1 horizontal subsampling; 2:1 vertical subsampling|
Consider an image with a resolution of 720 × 576 pixels, with each sample represented by 8 bits. The Y resolution is therefore 720 × 576 samples.
- In 4:4:4 sampling, the Cr and Cb resolution is 720 × 576 samples each, at 8 bits per sample.
Total number of bits: 720 × 576 × 8 × 3 = 9953280 bits
- In 4:2:0 sampling, the Cr and Cb resolution is 360 × 288 samples each, at 8 bits per sample.
Total number of bits: (720 × 576 × 8) + (360 × 288 × 8 × 2) = 4976640 bits
The 4:2:0 version requires half as many bits as the 4:4:4 version.
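The same arithmetic can be written as a small function. This is a sketch rather than a definitive implementation: `frame_bits` is a hypothetical helper, and it assumes the frame dimensions divide evenly by the subsampling factors.

```python
def frame_bits(width, height, h_factor, v_factor, bits_per_sample=8):
    """Total bits for one frame: a full luma plane plus two chroma planes."""
    luma_samples = width * height
    chroma_samples = (width // h_factor) * (height // v_factor)  # per component
    return (luma_samples + 2 * chroma_samples) * bits_per_sample


bits_444 = frame_bits(720, 576, 1, 1)
bits_420 = frame_bits(720, 576, 2, 2)
print(bits_444)             # 9953280
print(bits_420)             # 4976640
print(bits_420 / bits_444)  # 0.5
```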