
Basic principles of reducing redundant video data in the MPEG encoding process
Evolution of digital video compression standards
The MPEG encoding process removes redundant video data in a series of adjacent frames.
Two adjacent frames often contain many of the same picture elements; the information that differs between them is only a small part of all the information contained in a frame. Compression therefore does not use all the data from every video frame but only the changes from frame to frame, since in most consecutive frames of a video clip the background hardly changes and the clearly noticeable changes occur in the foreground.
For example, consider a small object moving smoothly against an unchanging background. In this case, complete image information is stored only for the reference images. For the remaining frames, only the difference information is encoded: the position of the object, the direction and magnitude of its displacement, and the new background elements that are uncovered behind the object as it moves. Moreover, this difference information is computed not only relative to earlier images but also relative to later ones, since it is in the later images that the previously hidden part of the background is revealed as the object moves.
The data reduction process is as follows. First, a keyframe (I, Intra frame) is created.
The reference I-frames are used to restore the remaining frames and are placed at regular intervals, every 10-15 frames. Only a few fragments of the frames between I-frames have time to change, and it is these changes that are recorded during compression.
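The idea of storing a full reference image and only the changes for the following frame can be sketched in a few lines. This is a toy illustration rather than MPEG itself: frames are small lists of pixel rows, and the helper names `frame_diff` and `apply_diff` are hypothetical.

```python
def frame_diff(reference, current):
    """Return {(row, col): new_value} for every pixel that differs from the reference."""
    changes = {}
    for r, (ref_row, cur_row) in enumerate(zip(reference, current)):
        for c, (ref_px, cur_px) in enumerate(zip(ref_row, cur_row)):
            if ref_px != cur_px:
                changes[(r, c)] = cur_px
    return changes


def apply_diff(reference, changes):
    """Rebuild the current frame from the reference frame plus the stored changes."""
    restored = [row[:] for row in reference]
    for (r, c), value in changes.items():
        restored[r][c] = value
    return restored


# A single bright pixel (value 9) moves one position to the right.
reference = [[0, 0, 0, 0],
             [0, 9, 0, 0],
             [0, 0, 0, 0]]
current = [[0, 0, 0, 0],
           [0, 0, 9, 0],
           [0, 0, 0, 0]]

changes = frame_diff(reference, current)
print(len(changes), "of", 12, "pixels stored")  # only the changed pixels are kept
print(apply_diff(reference, changes) == current)
```

Only two of the twelve pixels need to be stored for the second frame; the unchanged background is taken from the reference.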
In addition to I-frames, two other types of frames are distinguished in MPEG:
predicted frames (P, Predicted), containing the difference between the current image and the previous I- or P-frame, taking into account the displacements of individual fragments;
bidirectionally predicted frames (B, Bidirectionally predicted), containing only references to the preceding or following I- or P-frames, taking into account the displacements of individual fragments.
I-frames form the basis of an MPEG stream, and random access to any part of the video is performed through them. I-frames are only lightly compressed to ensure high visual quality.
P-frames are encoded relative to the preceding frame (I or P) and serve as a reference for subsequent P-frames. This achieves a high level of compression.
B-frames are highly compressed. Reconstructing a B-frame within the video sequence requires not only the preceding reference image but also the following one. B-frames are never used as a reference for other frames.
The I-, P- and B-frames are combined into groups (GOP, Group Of Pictures), each of which is the smallest repeating set of consecutive frames, for example:
(I0 B1 B2 P3 B4 B5 P6 B7 B8 P9 B10 B11) (I12 B13 B14 P15 B16 B17 P18 …)
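Because a B-frame needs both the preceding and the following reference frame, a decoder has to receive the later I- or P-frame before the B-frames that depend on it. The sketch below reorders the display-order sequence from the example into one possible transmission order. The function name `coding_order` is hypothetical, frames are plain strings whose first letter gives the type, and real encoders may treat GOP boundaries differently.

```python
def coding_order(display_order):
    """Move each reference frame (I or P) ahead of the B-frames that precede it on screen."""
    ordered = []
    pending_b = []  # B-frames waiting for their following reference frame
    for frame in display_order:
        if frame.startswith("B"):
            pending_b.append(frame)
        else:
            ordered.append(frame)      # send the reference frame first
            ordered.extend(pending_b)  # then the B-frames that needed it
            pending_b = []
    ordered.extend(pending_b)  # trailing B-frames with no later reference in the input
    return ordered


display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6",
           "B7", "B8", "P9", "B10", "B11", "I12"]
print(" ".join(coding_order(display)))
# I0 P3 B1 B2 P6 B4 B5 P9 B7 B8 I12 B10 B11
```

Note that B10 and B11 close the first group on screen but can only be decoded once I12 from the next group has arrived.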
Frames are made up of macroblocks, small image fragments of 16 × 16 pixels. The MPEG encoder analyzes the frames and looks for identical or very similar macroblocks by comparing the reference frame with subsequent frames. As a result, only the data describing the differences between frames, encoded as motion vectors, is saved. Macroblocks that contain no changes are skipped, which significantly reduces the amount of data to be transmitted. To limit the impact of transmission errors, consecutive macroblocks are combined into independent sections (slices). Each macroblock, in turn, consists of six blocks: four carry luminance information (Y) and the remaining two carry the color difference signals (U and V). Blocks are the basic units of the encoded picture.
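The search for a matching macroblock can be illustrated with an exhaustive block-matching sketch. It is a simplified illustration under several assumptions: the sum of absolute differences (SAD) is used as the similarity measure, the search window is a small square around the block, and the names `sad` and `find_motion_vector` are hypothetical. Real encoders use faster search strategies. The demo uses 4 × 4 blocks instead of 16 × 16 only to keep the frames small.

```python
def sad(frame_a, ax, ay, frame_b, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(frame_a[ay + dy][ax + dx] - frame_b[by + dy][bx + dx])
    return total


def find_motion_vector(reference, current, x, y, size=16, search=4):
    """Find where the block at (x, y) of `current` best matches `reference`.

    Returns (dx, dy, cost): the offset from the block's position to the
    best match in the reference frame, and the SAD of that match.
    """
    height, width = len(reference), len(reference[0])
    best = (0, 0, sad(current, x, y, reference, x, y, size))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = x + dx, y + dy
            # Skip candidate positions that fall outside the reference frame.
            if 0 <= rx <= width - size and 0 <= ry <= height - size:
                cost = sad(current, x, y, reference, rx, ry, size)
                if cost < best[2]:
                    best = (dx, dy, cost)
    return best


def make_frame(width, height, obj_x, obj_y, obj_size):
    """Black frame with one textured square object (test data only)."""
    frame = [[0] * width for _ in range(height)]
    for dy in range(obj_size):
        for dx in range(obj_size):
            frame[obj_y + dy][obj_x + dx] = 10 + dy * obj_size + dx
    return frame


reference = make_frame(12, 12, obj_x=2, obj_y=3, obj_size=4)
current = make_frame(12, 12, obj_x=5, obj_y=4, obj_size=4)  # object moved by (+3, +1)

vector = find_motion_vector(reference, current, x=5, y=4, size=4, search=4)
print(vector)  # (-3, -1, 0): the block is found 3 px left and 1 px up in the reference
```

A cost of 0 means the block is an exact copy, so only the vector needs to be transmitted for it.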
The 4:2:0 block scheme is used or, for studio (broadcast) quality, 4:2:2.
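The relationship between the sampling scheme and the number of blocks in a macroblock can be checked with a short calculation. The sketch assumes 8 × 8 blocks (a size the text does not state but which follows from four luminance blocks covering a 16 × 16 macroblock) and the usual meaning of the schemes: 4:2:2 halves the chrominance resolution horizontally, 4:2:0 halves it both horizontally and vertically. The function name `blocks_per_macroblock` is hypothetical.

```python
BLOCK = 8        # assumed block size in pixels
MACROBLOCK = 16  # macroblock size in pixels

# (horizontal, vertical) factor by which chrominance resolution is reduced
CHROMA_DIVISORS = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:2:0": (2, 2),
}


def blocks_per_macroblock(scheme):
    """Return (luminance blocks, chrominance blocks for U and V together)."""
    h_div, v_div = CHROMA_DIVISORS[scheme]
    luma_blocks = (MACROBLOCK // BLOCK) ** 2
    chroma_width = MACROBLOCK // h_div
    chroma_height = MACROBLOCK // v_div
    chroma_blocks_each = (chroma_width // BLOCK) * (chroma_height // BLOCK)
    return luma_blocks, 2 * chroma_blocks_each


for scheme in ("4:2:0", "4:2:2", "4:4:4"):
    luma, chroma = blocks_per_macroblock(scheme)
    print(scheme, "->", luma, "Y blocks +", chroma, "U/V blocks =", luma + chroma)
```

Under these assumptions 4:2:0 gives the six blocks per macroblock described above (four Y plus two U/V), while 4:2:2 gives eight.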
This is an important point that requires a more detailed explanation:
It is useful to know that the RGB color data received from a video camera can be represented equivalently as a luminance signal (Y) plus two color difference signals (U and V), called chrominance signals. The luminance signal Y determines the brightness of a point. The U and V color difference signals, together with the Y signal, make it possible to fully restore the original RGB data.
The Y signal is calculated from the RGB data using the formula: Y = 0.299R + 0.587G + 0.114B
The U and V signals are calculated as follows: U = B − Y and V = R − Y
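That the luminance signal and the two color differences R − Y and B − Y are enough to recover the original RGB values can be verified with a small round-trip sketch. The function names are hypothetical, and the differences are left unscaled; practical systems additionally multiply them by scaling factors.

```python
def rgb_to_luma_diff(r, g, b):
    """Return (Y, R - Y, B - Y) for one pixel."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, r - y, b - y


def luma_diff_to_rgb(y, r_minus_y, b_minus_y):
    """Invert rgb_to_luma_diff: recover (R, G, B) from Y and the two differences."""
    r = y + r_minus_y
    b = y + b_minus_y
    # G is not transmitted; it follows from the luminance formula.
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


pixel = (200, 50, 25)
y, r_minus_y, b_minus_y = rgb_to_luma_diff(*pixel)
restored = luma_diff_to_rgb(y, r_minus_y, b_minus_y)
print([round(v, 6) for v in restored])  # matches the original pixel up to rounding
```

For a gray pixel (R = G = B) both differences are zero, which is why a black-and-white picture needs only the Y signal.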












