A single media signal is called an elementary stream, or simply a stream. Terms such as video stream, codec stream, media stream, or H.264 stream usually refer to the same thing. A media container format is a specification that describes how different multimedia data elements (streams) and metadata coexist in a file. Common container formats are MP4, MPEG-2 TS, and Matroska.

Media container (MP4) holding encoded frames (single stream)

A container format provides the following:

  • Stream Encapsulation: One or more media streams can exist in a single file.
  • Timing/Synchronization: The container stores the information needed to play the different streams in the file together, e.g. the timestamps that keep lip movement in the video stream in sync with speech in the audio stream.
  • Seeking: The container provides information about which points in time playback can jump to, e.g. when the viewer wants to watch only part of a movie.
  • Metadata: A container format can attach many kinds of metadata to a movie, e.g. the language of an audio stream.
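
The four features above can be illustrated with a toy in-memory model. This is only a sketch: the names `Packet`, `Stream`, `ToyContainer`, and `seek` are hypothetical and do not correspond to the layout of MP4, MPEG-2 TS, or Matroska. It assumes packets are stored in ascending timestamp order and that seeking lands on the last keyframe at or before the requested time.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Packet:
    stream_id: int   # which stream this packet belongs to
    pts: float       # presentation timestamp in seconds (timing/sync)
    keyframe: bool   # True if decoding can start at this packet (seeking)
    payload: bytes   # encoded data; the container treats it as opaque


@dataclass
class Stream:
    stream_id: int
    kind: str                                     # "video" or "audio"
    codec: str                                    # e.g. "h264", "aac"
    metadata: dict = field(default_factory=dict)  # e.g. {"language": "eng"}


@dataclass
class ToyContainer:
    streams: list   # stream encapsulation: several streams in one object
    packets: list   # assumed to be sorted by ascending pts

    def seek(self, stream_id: int, t: float) -> float:
        """Return the pts of the last keyframe at or before time t."""
        keyframes = [p.pts for p in self.packets
                     if p.stream_id == stream_id and p.keyframe]
        if not keyframes:
            raise ValueError("stream has no keyframes to seek to")
        i = bisect_right(keyframes, t)
        # If t lies before the first keyframe, fall back to the first one.
        return keyframes[max(i - 1, 0)]


container = ToyContainer(
    streams=[
        Stream(0, "video", "h264"),
        Stream(1, "audio", "aac", {"language": "eng"}),
    ],
    packets=[
        Packet(0, 0.0, True, b"I0"),
        Packet(1, 0.0, True, b"a0"),
        Packet(0, 1.0, False, b"P1"),
        Packet(1, 1.0, True, b"a1"),
        Packet(0, 2.0, True, b"I2"),
        Packet(0, 3.0, False, b"P3"),
    ],
)

print(container.seek(0, 1.5))                 # video keyframes are at 0.0 and 2.0
print(container.streams[1].metadata["language"])
```

Real containers store a comparable index on disk so that a player can jump to a keyframe without scanning the whole file.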

Commonly Used Terminology

  • Encoding: The process of converting a raw media signal into binary data in a codec's format, for example encoding a series of raw images into an H.264 video stream. This effectively compresses the video, reducing the file size.
  • Decoding: The opposite of encoding: the process of converting encoded binary data back into a raw media signal, e.g. turning an H.264 stream into viewable images.
  • Transcoding: The process of converting a stream from one codec to another (or to the same codec, for example at a different bitrate). A transcode requires both a decoding and an encoding step.
  • Muxing: The process of adding one or more codec streams into a container format.
  • Demuxing: Extracting a codec stream from a container format.
  • Transmuxing: Extracting streams from one container format and putting them in a different (or the same) container format.
  • Multiplexing: The process of interleaving audio and video into one data stream; "muxing" is the common short form of this term. Ex: Elementary streams (audio and video) from the encoder are turned into Packetized Elementary Streams (PES) and then converted into a Transport Stream (TS).
  • Demultiplexing: The reverse of multiplexing, i.e. extracting an elementary stream from a media container. E.g.: extracting the MP3 audio data from an MP4 music video.
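
The difference between muxing and encoding can be sketched in a few lines. The example below is a toy model, not a real container writer: `Packet`, `mux`, and `demux` are hypothetical names, and it assumes each input stream is already encoded and sorted by timestamp. Muxing only interleaves the packets by timestamp; the payload bytes are never decoded or re-encoded.

```python
import heapq
from typing import NamedTuple


class Packet(NamedTuple):
    pts: float      # presentation timestamp in seconds
    stream: str     # hypothetical stream label, e.g. "video" or "audio"
    payload: bytes  # already-encoded data; muxing never modifies it


def mux(*streams):
    """Interleave per-stream packet lists (each sorted by pts) into one list."""
    return list(heapq.merge(*streams, key=lambda p: p.pts))


def demux(packets, stream):
    """Pull the packets of one stream back out of the interleaved list."""
    return [p for p in packets if p.stream == stream]


video = [Packet(0.00, "video", b"V0"), Packet(0.04, "video", b"V1")]
audio = [Packet(0.00, "audio", b"A0"), Packet(0.02, "audio", b"A1")]

muxed = mux(video, audio)
print([p.stream for p in muxed])

# Demuxing returns each stream unchanged, byte for byte.
print(demux(muxed, "audio") == audio)
```

In this model, transmuxing amounts to calling `demux` on one container and `mux` into another, whereas transcoding would additionally change the payload bytes themselves.
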
Media Encoding and Decoding Process