Introduction

A visual scene is continuous in both space and time. Representing it in digital form involves spatial sampling and temporal sampling. Spatial sampling is usually done on a rectangular grid in the video image plane. Temporal sampling captures a series of still frames, or components of frames, at regular intervals in time.

Figure 1: Spatial Sampling and Temporal Sampling

Each spatio-temporal sample, a picture element or pixel, is represented as one or more numbers that describe the brightness (luminance) and the colour of the sample. The figure below shows the spatial sampling of a black and white image; its right-hand side shows the pixel value of each rectangular block.

Figure 2: Digital Image Representation
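As a rough illustration of this representation, the sketch below stores a tiny black and white image as a grid of numbers, one per pixel. The 4x4 size, the pixel values and the helper name pixel_value are all made up for the example; they are not taken from the figure.

```python
# A tiny 4x4 greyscale image: one number per pixel.
# Assumed convention: 0 = black, 255 = white (8 bits per sample).
image = [
    [0,   0,   255, 255],
    [0,   0,   255, 255],
    [255, 255, 0,   0],
    [255, 255, 0,   0],
]

height = len(image)           # number of rows of samples
width = len(image[0])         # number of samples per row
num_samples = width * height  # total spatial samples in the frame


def pixel_value(img, row, col):
    """Return the stored brightness of the sample at (row, col)."""
    return img[row][col]


print(width, height, num_samples)
print(pixel_value(image, 0, 2))
```

A colour image would typically store several numbers per pixel instead of one, for example one per colour component.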

Spatial Sampling

The output of a camera sensor is an analogue video signal, a varying electrical signal that represents a video image. Sampling the signal at a point in time produces a sampled image or frame that has defined values at a set of sampling points.

A common format for a sampled image is a rectangle with the sampling points positioned on a square or rectangular grid. Figure 2 shows a frame with two different sampling grids superimposed upon it. Sampling occurs at each of the intersection points on the grid. A sampled image may be reconstructed by representing each sample as a square picture element or pixel. The number of sampling points influences the visual quality of the image. Choosing a ‘coarse’ sampling grid produces a low-resolution sampled image, while increasing the number of sampling points increases the resolution of the sampled image.
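The effect of grid density can be sketched in a few lines. The example below samples a hypothetical continuous scene (the function scene is invented for illustration) on a coarse grid and on a finer one. As a simplifying assumption, it samples at the centre of each grid cell rather than at the grid intersections, and quantises each sample to 8 bits.

```python
import math


def scene(x, y):
    """Hypothetical continuous scene: brightness in [0, 1] for x, y in [0, 1]."""
    return 0.5 + 0.5 * math.sin(2 * math.pi * x) * math.cos(2 * math.pi * y)


def sample_image(width, height):
    """Sample the scene on a width x height grid and quantise to 0..255."""
    rows = []
    for j in range(height):
        y = (j + 0.5) / height      # vertical position of this row of samples
        row = []
        for i in range(width):
            x = (i + 0.5) / width   # horizontal position of this sample
            row.append(round(255 * scene(x, y)))
        rows.append(row)
    return rows


coarse = sample_image(4, 3)    # 12 sampling points: low resolution
fine = sample_image(16, 12)    # 192 sampling points: higher resolution

print(len(coarse) * len(coarse[0]), len(fine) * len(fine[0]))
```

Both grids cover the same scene; the finer grid simply records it with sixteen times as many samples, which is what a higher resolution means here.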

Temporal Sampling

A moving video image is formed by taking a rectangular ‘snapshot’ of the signal at periodic time intervals. Playing back the series of snapshots or frames produces the appearance of motion. A higher temporal sampling rate or frame rate gives apparently smoother motion in the video scene but requires more samples to be captured.

Frame rates below 10 frames per second may be used for very low bit-rate video communications because the amount of data is relatively small, although motion at such rates looks noticeably jerky. Temporal sampling at 25 or 30 frames per second is the norm for Standard Definition television pictures, with interlacing to improve the appearance of motion.
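The trade-off between frame rate and the number of samples can be made concrete with some simple arithmetic. The sketch below assumes a hypothetical frame of 352 x 288 samples with one number per sample, and a few illustrative frame rates; the function names are invented for the example.

```python
def frame_times(frame_rate, duration_s):
    """Sampling instants, in seconds, for frames taken at a constant rate."""
    n_frames = int(frame_rate * duration_s)
    return [n / frame_rate for n in range(n_frames)]


def samples_per_second(width, height, frame_rate):
    """Samples captured per second, assuming one number per pixel."""
    return width * height * frame_rate


# Hypothetical frame size, chosen only for illustration.
WIDTH, HEIGHT = 352, 288

for fps in (10, 25, 30):
    print(fps, "frames/s:", samples_per_second(WIDTH, HEIGHT, fps), "samples/s")

print(frame_times(10, 1)[:3])
```

The number of samples grows in direct proportion to the frame rate, so tripling the frame rate from 10 to 30 frames per second triples the amount of raw data to be captured.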