
MPEG format: specifications and capabilities Part 4
MPEG audio compression algorithm
Audio compression relies on carefully designed psychoacoustic models, derived from experiments with the most demanding listeners, to discard sounds that the human ear cannot hear. This effect is called “masking”: for example, a strong component at a given frequency prevents the listener from hearing weaker components at nearby frequencies, and the relationship between the energy of the masking component and that of the masked components is described by an empirical curve. Similar masking effects exist in time, along with more complex interactions in which a temporal effect can reinforce a frequency effect, or vice versa.
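The masking idea can be illustrated with a toy model. This is only a sketch: the function name `masked_components`, the straight-line masking curve, and the `slope_db` and `offset_db` values are assumptions chosen for illustration, not the empirical curves used by any real MPEG encoder.

```python
def masked_components(components, slope_db=10.0, offset_db=6.0):
    """Return the indices of components hidden by a louder neighbour.

    components: list of (position, level_db) pairs, where position is a
    point on some frequency scale. The masking threshold cast by each
    component falls off linearly with distance (illustrative assumption).
    """
    masked = []
    for i, (pos_i, level_i) in enumerate(components):
        for j, (pos_j, level_j) in enumerate(components):
            if i == j:
                continue
            # Threshold that component j casts at the position of component i.
            threshold = level_j - offset_db - slope_db * abs(pos_i - pos_j)
            if level_i < threshold:
                masked.append(i)
                break
    return masked


# A loud tone at position 10.0, a quiet neighbour, and a quiet distant tone.
tones = [(10.0, 80.0), (10.5, 50.0), (20.0, 50.0)]
hidden = masked_components(tones)  # only the quiet neighbour is masked
```

In this toy setup the quiet tone next to the loud one is dropped, while an equally quiet tone far away in frequency is kept.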
The sound is split into spectral blocks using a hybrid scheme that combines sinusoidal transforms with subband filtering, and the psychoacoustic model is expressed in terms of these blocks. Anything that can be removed or reduced is removed or reduced, and the rest is sent to the output stream. In practice things are somewhat more complicated, because the available bits have to be distributed among the bands. And, of course, everything that is sent is coded with redundancy reduction.
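Distributing bits among the bands can be sketched as a greedy loop. This is a simplified illustration, not the allocation procedure of any particular MPEG layer: it assumes each band has a known signal-to-mask ratio (SMR) in dB and that each extra bit lowers the quantization noise by about 6.02 dB. The function name and parameters are hypothetical.

```python
def allocate_bits(smr_db, total_bits, max_bits_per_band=15, db_per_bit=6.02):
    """Greedily give bits to the band whose noise is most audible.

    smr_db: signal-to-mask ratio per band, in dB (assumed to come from
    a psychoacoustic model). Returns the number of bits given to each band.
    """
    bits = [0] * len(smr_db)
    for _ in range(total_bits):
        # Noise-to-mask ratio of every band that can still take a bit.
        candidates = [
            (smr_db[band] - db_per_bit * bits[band], band)
            for band in range(len(smr_db))
            if bits[band] < max_bits_per_band
        ]
        if not candidates:
            break
        # Highest ratio wins; ties go to the lowest band index.
        nmr, band = max(candidates, key=lambda c: (c[0], -c[1]))
        if nmr <= 0:
            break  # noise is already below the mask in every band
        bits[band] += 1
    return bits


allocation = allocate_bits([20.0, 5.0, -3.0], total_bits=10)
# allocation == [4, 1, 0]: the band already below its mask gets nothing
```

The loop stops early once the noise in every band has dropped below the masking threshold, so the unused bits remain available.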
With the advent of the MPEG-2 specification, the most popular parameter combinations were consolidated into levels and profiles. The most common are:
Source Input Format (SIF), 352 pixels x 240 lines x 30 fps, also known as Low Level (LL), and
“CCIR 601” (for example, 720 pixels/line x 480 lines x 30 fps), also known as Main Level (ML).
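To get a feel for the difference between the two formats above, the following sketch computes the luminance sample rate and the number of 16×16 macroblocks per frame. The helper names are hypothetical, and the figures are plain arithmetic on the sizes quoted above rather than limits taken from the specification.

```python
def luma_samples_per_second(width, height, fps):
    """Luminance samples that must be processed each second."""
    return width * height * fps


def macroblocks_per_frame(width, height, size=16):
    """Number of size x size macroblocks needed to cover one frame."""
    # Ceiling division, so partially covered edges still count.
    return -(-width // size) * -(-height // size)


sif_rate = luma_samples_per_second(352, 240, 30)    # 2_534_400
ccir_rate = luma_samples_per_second(720, 480, 30)   # 10_368_000
sif_blocks = macroblocks_per_frame(352, 240)        # 330
ccir_blocks = macroblocks_per_frame(720, 480)       # 1350
```

The CCIR 601 example therefore carries roughly four times as many samples per second as SIF.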
Motion compensation predicts macroblocks from macroblocks in previously decoded pictures.
Macroblock predictions are formed from arbitrary 16×16 pixel areas (or 16×8 in MPEG-2) of previously reconstructed frames. There is no restriction on where the prediction area lies within the previous picture, other than the edges of the picture.
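The prediction step described above can be sketched as copying a displaced block out of a reference frame. This is a minimal illustration under stated assumptions: the frame is a plain list of rows, the motion vector `(dx, dy)` has whole-pixel precision only, and the function name is hypothetical. Real decoders also handle fractional-pixel vectors and then add a decoded residual to the prediction.

```python
def predict_macroblock(reference, mb_x, mb_y, dx, dy, size=16):
    """Fetch the prediction for macroblock (mb_x, mb_y) from a reference frame.

    reference: reconstructed frame as a list of rows of samples.
    (dx, dy): motion vector in whole pixels (illustrative simplification).
    """
    height = len(reference)
    width = len(reference[0])
    src_x = mb_x * size + dx
    src_y = mb_y * size + dy
    # The only restriction: the area must stay inside the picture edges.
    if src_x < 0 or src_y < 0 or src_x + size > width or src_y + size > height:
        raise ValueError("prediction area crosses the picture edge")
    return [row[src_x:src_x + size] for row in reference[src_y:src_y + size]]


# A 32x32 test frame in which every sample encodes its own position.
frame = [[y * 32 + x for x in range(32)] for y in range(32)]
block = predict_macroblock(frame, mb_x=1, mb_y=0, dx=-3, dy=2)
# block[0][0] == 77, the sample at x=13, y=2 of the reference frame
```

Because the source position is not tied to the macroblock grid, the prediction area here starts at x=13, y=2 and overlaps several macroblocks of the reference frame.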
The reference frames (from which predictions are formed) are treated independently of their coded form. Once a frame has been decoded, it is no longer a set of blocks but an ordinary flat digital image made of pixels.
In MPEG, the displayed picture size and frame rate may differ from those encoded in the stream. For example, a subset of frames can be dropped from the source sequence before encoding, and each remaining frame can then be filtered and subsampled. On reconstruction, the frames are interpolated to restore the original size and frame rate. In fact, the three fundamental stages (source, coded, and displayed) can each differ in their parameters. The MPEG syntax describes the coded and display rates through headers, while the source frame rate and size are known only to the encoder. That is why MPEG-2 headers include elements that describe the size of the display area for the picture.
In an I-frame, macroblocks must be coded as intra, without reference to previous or following frames, unless scalable modes are used. Macroblocks in a P-frame can be intra or can refer to a previous frame. Macroblocks in a B-frame can be intra or can refer to the previous frame, the following frame, or both. Each macroblock has an element in its header that defines its type.
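The rules above amount to a small lookup table. The sketch below is only an illustration: the mode names and the `is_mode_allowed` helper are hypothetical labels rather than syntax elements from the MPEG bitstream, and scalable modes are ignored.

```python
# Prediction modes a macroblock may use in each frame type.
ALLOWED_MODES = {
    "I": {"intra"},
    "P": {"intra", "forward"},  # forward = predicted from a previous frame
    "B": {"intra", "forward", "backward", "bidirectional"},
}


def is_mode_allowed(frame_type, mode):
    """Check whether a macroblock mode is valid for the given frame type."""
    return mode in ALLOWED_MODES[frame_type]


ok_p = is_mode_allowed("P", "forward")    # True
bad_p = is_mode_allowed("P", "backward")  # False: P-frames never look ahead
```

A decoder would make the same decision from the type element in each macroblock header.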










