Chroma subsampling is the practice of encoding images with lower resolution for chroma information than for luma information.

The human visual system is much more sensitive to variations in brightness (luminance or luma) than to variations in color (chrominance or chroma). A video system can therefore be optimized by devoting more bandwidth to the luma component (denoted Y’) than to the color difference components Cb and Cr. Chroma subsampling is used to reduce the amount of data in a video signal while having little or no visible impact on image quality. Luma (Y’) is the gamma-corrected counterpart of luminance (Y); strictly, it is computed as a weighted sum of gamma-corrected R’G’B’ components. A gamma-corrected signal emulates the roughly logarithmic sensitivity of human vision, with more code values dedicated to darker tones than to lighter ones.
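To make the split into luma and color difference components concrete, here is a minimal sketch of the conversion. The text does not name a standard, so the ITU-R BT.601 luma weights are used as an assumption, and the function name rgb_to_ycbcr is purely illustrative. Inputs are gamma-corrected R’G’B’ values in the range 0 to 1.

```python
# Assumption: BT.601 luma weights; the article does not specify a standard.
KR, KG, KB = 0.299, 0.587, 0.114

def rgb_to_ycbcr(r, g, b):
    """Convert gamma-corrected R'G'B' (each 0..1) to Y'CbCr.

    Y' lies in 0..1; Cb and Cr lie in -0.5..0.5.
    """
    y = KR * r + KG * g + KB * b      # luma: weighted sum of R'G'B'
    cb = (b - y) / (2 * (1 - KB))     # scaled blue difference, B' - Y'
    cr = (r - y) / (2 * (1 - KR))     # scaled red difference, R' - Y'
    return y, cb, cr

print(rgb_to_ycbcr(1.0, 1.0, 1.0))  # white: full luma, chroma near zero
print(rgb_to_ycbcr(1.0, 0.0, 0.0))  # pure red: Cr near its maximum
```

Neutral grays and white carry essentially no chroma, which is part of why reducing chroma resolution tends to go unnoticed in many images.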

Subsampling Notation

Video created with chroma subsampling includes brightness information for every single pixel, but not color information for every pixel. Instead, color information is shared among adjacent pixels. The number of pixels that share the same color information is determined by the type of chroma subsampling. The common notation for chroma subsampling is J:a:b, where

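  • J is the horizontal sampling reference: the number of luminance samples in each row of a conceptual region that is J pixels wide and 2 rows high. It is usually 4.
  • a indicates the number of chrominance samples (Cb, Cr) in the first row of J pixels. Cb and Cr are the digital counterparts of the analog U and V components.
  • b indicates how many additional chrominance samples are taken in the second row. A value of 0 means the second row shares the chroma of the first row.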

Some of the more common chroma subsampling schemes are 4:4:4, 4:2:2, and 4:2:0, as shown in the figure below. 4:4:4 retains full chroma resolution, so no subsampling is performed.

Chroma Subsampling

The two rows together hold eight luminance samples, four in each row. In 4:2:0 sampling, two chrominance samples (each a Cb, Cr pair) are present for every 8 pixels. The first digit of the chroma subsampling notation gives the luma horizontal sampling reference. The second digit specifies the horizontal subsampling of Cb and Cr with respect to luma.
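The counting implied by the J:a:b notation can be sketched in a few lines. The helper names below are illustrative, and the sketch assumes the conceptual region is J pixels wide and 2 rows high, with each chroma position carrying one Cb and one Cr value.

```python
from fractions import Fraction

def parse_subsampling(notation):
    """Split a 'J:a:b' string into three integers."""
    j, a, b = (int(part) for part in notation.split(":"))
    return j, a, b

def sample_counts(notation):
    """Count samples in the region J pixels wide and 2 rows high.

    Returns (luma samples, chroma positions, data size relative to J:J:J).
    """
    j, a, b = parse_subsampling(notation)
    luma = 2 * j                          # one Y' sample per pixel, two rows
    chroma_positions = a + b              # positions carrying a (Cb, Cr) pair
    total = luma + 2 * chroma_positions   # each position stores Cb and Cr
    full = 3 * (2 * j)                    # Y', Cb and Cr for every pixel
    return luma, chroma_positions, Fraction(total, full)

for scheme in ("4:4:4", "4:2:2", "4:2:0"):
    print(scheme, sample_counts(scheme))
```

For 4:2:0 this yields 8 luma samples and 2 chroma positions, which matches the count given above for a region of 8 pixels.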

Sampling Types

Commonly used sampling types are

  • 4:4:4 : The original R’G’B’ data in every pixel of a video frame is used to calculate Y’C’bC’r values for that pixel, and all three values are kept. As shown in the figure above, there is no reduction in the space needed to store the video. With 8 bits per sample, a 2×2 array of R’G’B’ pixels consumes 12 bytes, and so does its 4:4:4 Y’C’bC’r equivalent.
  • 4:2:2 : It maintains all of the information in the luma Y’ channel. C’b and C’r values are sampled at half the horizontal rate of the luma channel, so every other pixel in each line of a pixel array is stored without C’b and C’r information. It requires approximately 33% less bandwidth and storage space than 4:4:4 sampling. When displaying 4:2:2 video, the missing C’b and C’r values are filled in from horizontally adjacent pixels (a process called interpolation). The 12 bytes of R’G’B’ are reduced to 8, effecting 1.5:1 lossy compression.
  • 4:2:0 : It maintains all of the information in the luma Y’ channel, as before, but C’b and C’r are sampled at half the horizontal rate and half the vertical rate of luma. This results in approximately a 50% reduction in bandwidth and storage requirements. The 12 bytes of R’G’B’ are reduced to 6, effecting 2:1 lossy compression.
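The three sampling types can be illustrated with a small sketch that operates on one chroma plane stored as a list of rows. It makes several simplifying assumptions: the function names are hypothetical, subsampling keeps the top-left sample of each block (real encoders often filter or average instead), reconstruction repeats each stored sample rather than interpolating smoothly, and the plane dimensions are divisible by the subsampling factors.

```python
# (horizontal, vertical) chroma subsampling factors for each scheme
FACTORS = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}

def subsample(plane, h_factor, v_factor):
    """Keep the top-left chroma sample of every h_factor x v_factor block."""
    return [row[::h_factor] for row in plane[::v_factor]]

def upsample(plane, h_factor, v_factor):
    """Rebuild a full-size plane by repeating each stored sample."""
    out = []
    for row in plane:
        wide = [value for value in row for _ in range(h_factor)]
        out.extend(list(wide) for _ in range(v_factor))
    return out

def stored_bytes(width, height, scheme):
    """Bytes per frame at 8 bits per sample: one Y' plane plus Cb and Cr."""
    h, v = FACTORS[scheme]
    chroma = (width // h) * (height // v)
    return width * height + 2 * chroma

cb = [[10, 20, 30, 40],
      [50, 60, 70, 80]]

half = subsample(cb, *FACTORS["4:2:2"])     # half horizontal resolution
quarter = subsample(cb, *FACTORS["4:2:0"])  # half horizontal and vertical
print(half, quarter, upsample(quarter, 2, 2))

for scheme in FACTORS:
    print(scheme, stored_bytes(2, 2, scheme), "bytes for a 2x2 block")
```

The byte counts for a 2×2 block come out as 12, 8, and 6, matching the figures quoted in the list above. The luma plane is never touched, which is why the loss is confined to color detail.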