
Lossy Compression Part 2

Compressing audio and video

The term “bit rate” refers to the number of bits of information transmitted per second. Russian-language sources translate this term in different ways. Recently the loanword “bitrate”, which is new to the Russian language, has often been used instead of a formal translation. Other renderings include “data stream width”, “bit stream complexity”, “stream rate” and “bit rate”. For sound files, the same parameter is sometimes called the compression rate; for example, a file is said to be compressed at 128 Kbps. This usage works because the bit rate directly determines how much data the file holds per second of sound.
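The link between bit rate and file size can be checked with simple arithmetic. The sketch below is illustrative only: it assumes a constant bit rate, takes 1 Kbit as 1000 bits, and ignores any header or container overhead. The function names are made up for this example.

```python
def bytes_per_second(bitrate_kbps: float) -> float:
    """Bytes of data per second of sound (assumes 1 Kbit = 1000 bits)."""
    return bitrate_kbps * 1000 / 8


def file_size_bytes(bitrate_kbps: float, duration_s: float) -> float:
    """Approximate payload size for a constant-bit-rate stream."""
    return bytes_per_second(bitrate_kbps) * duration_s


if __name__ == "__main__":
    print(bytes_per_second(128))      # 16000.0 bytes per second
    print(file_size_bytes(128, 60))   # 960000.0 bytes per minute
```

Under these assumptions, one minute of sound at 128 Kbps takes a little under one megabyte.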
All compression formats of the MPEG family exploit the high redundancy between images separated by a short time interval. Between two adjacent frames, usually only a small part of the scene changes; for example, a small object moves smoothly against a fixed background. In this case, complete information about the scene is stored only for selected reference images. For the remaining frames, it is enough to transmit difference information: the position of the object, the direction and magnitude of its displacement, and the new background elements that are uncovered behind the object as it moves. Furthermore, these differences can be computed not only against earlier images but also against later ones, since it is in the later images that the previously hidden part of the background is revealed as the object moves.
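The idea of storing only differences can be illustrated with a toy example. This is a deliberately simplified per-pixel sketch, not the method MPEG actually uses (real encoders work on blocks with motion vectors and transform-coded residuals); the frames and function names are invented for illustration.

```python
def frame_difference(reference, current):
    """Return {(row, col): new_value} for pixels that differ from the reference."""
    return {
        (r, c): value
        for r, row in enumerate(current)
        for c, value in enumerate(row)
        if reference[r][c] != value
    }


def apply_difference(reference, diff):
    """Rebuild a frame from the reference frame plus the stored differences."""
    frame = [row[:] for row in reference]
    for (r, c), value in diff.items():
        frame[r][c] = value
    return frame


# A 4x6 fixed background (value 0) with a one-pixel "object" (value 9).
background = [[0] * 6 for _ in range(4)]
frame1 = [row[:] for row in background]
frame1[1][1] = 9                      # object at column 1
frame2 = [row[:] for row in background]
frame2[1][2] = 9                      # object has moved to column 2

diff = frame_difference(frame1, frame2)
print(len(diff), "of", 4 * 6, "pixels changed")
assert apply_difference(frame1, diff) == frame2
```

Only two pixels differ here: the object's new position and the background uncovered at its old position. Everything else is taken from the reference frame.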
The MPEG family of compression formats reduces the amount of information as follows:
Temporal video redundancy is eliminated (only difference information is transmitted).
The spatial redundancy of the images is eliminated by suppressing the small details of the scene.
Some of the color information is removed.
The information density of the resulting digital stream is increased by choosing an efficient mathematical code to describe it.
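As an illustration of the third point, removing part of the color information, the sketch below averages each 2x2 block of a color (chroma) plane into a single sample while the brightness plane would be left at full resolution. This is a minimal sketch under assumptions: the sample values are invented, and the exact sampling positions and filters used by a real encoder are not modeled.

```python
def subsample_2x2(plane):
    """Average each 2x2 block of a plane; both dimensions must be even."""
    return [
        [
            (plane[r][c] + plane[r][c + 1]
             + plane[r + 1][c] + plane[r + 1][c + 1]) / 4
            for c in range(0, len(plane[0]), 2)
        ]
        for r in range(0, len(plane), 2)
    ]


# Hypothetical 4x4 chroma plane.
chroma = [
    [100, 102, 50, 52],
    [104, 106, 54, 56],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]

small = subsample_2x2(chroma)
print(small)  # [[103.0, 53.0], [10.0, 200.0]]
```

The subsampled plane holds a quarter of the original samples, which is acceptable because the eye is less sensitive to fine color detail than to fine brightness detail.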
MPEG compression formats encode in full only the anchor frames, called I-frames (intra frames), which are compressed on their own, without reference to other frames. The intervals between them contain P-frames (predicted frames), which hold only the changes relative to the preceding I- or P-frame. Between the I- and P-frames, so-called B-frames (bidirectional frames) are inserted; they are predicted from both the preceding and the following I- or P-frame. When encoding in MPEG compression formats, a chain of frames of different types is formed. A typical sequence of frames looks like this:
I B B P B B I B B P B B I B B …
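A B-frame cannot be decoded until the later frame it refers to is available. Consequently, the frames are transmitted and decoded in an order that differs from the order in which they are displayed. By frame number, this order is:
1 4 2 3 7 5 6 …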
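The reordering can be sketched in a few lines. The function below is a hypothetical helper written for this example: it takes the frame types in display order and moves each I- or P-frame ahead of the B-frames that depend on it. It assumes that B-frames waiting for the same anchor are emitted in their display order, and it does not model real stream syntax.

```python
def coding_order(frame_types):
    """Map display order (e.g. 'IBBPBBI') to 1-based frame numbers in coding order."""
    order = []
    pending_b = []                      # B-frames waiting for their next anchor
    for number, kind in enumerate(frame_types, start=1):
        if kind == "B":
            pending_b.append(number)
        else:                           # I or P: an anchor frame
            order.append(number)        # the anchor goes first...
            order.extend(pending_b)     # ...then the B-frames that needed it
            pending_b.clear()
    # B-frames at the end of the excerpt have no later anchor here;
    # in a real stream they would wait for the next I- or P-frame.
    order.extend(pending_b)
    return order


print(coding_order("IBBPBBI"))  # [1, 4, 2, 3, 7, 5, 6]
```

Each anchor frame is moved ahead of the B-frames that precede it in display order, so that both reference frames are already available when a B-frame is decoded.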



