An overview of the principle of MP3 encoding


MP3 Encoding

Audio compression consists of two parts: encoding and decoding. Encoding converts the digital audio data in a WAV file into a highly compressed form (called a bitstream); decoding reconstructs a WAV file from that bitstream.

Audio compression can be divided into lossless compression and lossy compression. Lossless compression reduces file size by removing redundancy in the audio data; after encoding and decoding, the signal must be identical to the original. Its compression ratio is relatively limited: APE, currently among the best, shrinks a file to about 50% of its original size (I used Monkey’s Audio 3.97 to compress WAV files in extra high compression mode, and the smallest result was 52% of the original size). Lossy compression uses every available means, including the methods used in lossless compression, and discards all data that can be lost without being noticed in order to reduce size. The decoded signal is no longer identical to the original, but ideally it still sounds the same to the listener, and the compression ratio is much higher. MP3 is a lossy format; at 128 kbps its compression ratio is roughly 11:1 to 12:1 relative to CD-quality audio.
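The ratio quoted above can be checked with a little arithmetic. The sketch below assumes the source is CD-quality PCM (44.1 kHz, 16-bit, stereo), which the text does not state explicitly; the function names are only illustrative. Under that assumption the result is about 11:1, close to the commonly quoted 12:1.

```python
def pcm_bitrate_kbps(sample_rate_hz=44_100, bits_per_sample=16, channels=2):
    """Bit rate of uncompressed PCM audio in kbps.

    The defaults are an assumption (CD-quality audio), not from the article.
    """
    return sample_rate_hz * bits_per_sample * channels / 1000


def compression_ratio(mp3_kbps, pcm_kbps=None):
    """How many times smaller the MP3 stream is than the PCM source."""
    if pcm_kbps is None:
        pcm_kbps = pcm_bitrate_kbps()
    return pcm_kbps / mp3_kbps


if __name__ == "__main__":
    print(f"PCM bit rate: {pcm_bitrate_kbps():.1f} kbps")
    print(f"Ratio at 128 kbps: {compression_ratio(128):.1f}:1")
```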
MP3 files are made up of frames; the frame is the smallest unit of an MP3 file. What is a frame? Remember how early animation was made: a sequence of different still images is shown in quick succession to create the illusion of motion, and each image is a “frame”. The difference is that MP3 frames record audio data rather than image data. At a 44.1 kHz sample rate, an MP3 file has about 38 frames per second.
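The frame rate follows from the sample rate once the number of samples per frame is known. The sketch below assumes MPEG-1 Layer III, where each frame carries 1152 samples per channel; that constant is not stated in the text, and the function names are illustrative.

```python
SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III; an assumption, not from the article


def frames_per_second(sample_rate_hz):
    """Number of MP3 frames needed for one second of audio."""
    return sample_rate_hz / SAMPLES_PER_FRAME


def frame_duration_ms(sample_rate_hz):
    """Playback time covered by a single frame, in milliseconds."""
    return 1000 * SAMPLES_PER_FRAME / sample_rate_hz


if __name__ == "__main__":
    for rate in (32_000, 44_100, 48_000):
        print(f"{rate} Hz: {frames_per_second(rate):.2f} frames/s, "
              f"{frame_duration_ms(rate):.2f} ms per frame")
```

Under these assumptions the rate ranges from roughly 28 frames per second at 32 kHz to roughly 42 at 48 kHz.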
Each frame consists of a frame header and frame data. The frame header records basic information about the frame, including the bit rate index and the sample rate index (this is important for understanding the ABR and VBR encoding methods). The frame data, as its name suggests, holds the actual audio data.
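To make the two indices concrete, here is a minimal sketch of reading them from a 4-byte frame header. It handles MPEG-1 Layer III only and rejects everything else; the function name and the returned dictionary keys are illustrative choices, not part of any library.

```python
# Lookup tables for MPEG-1 Layer III only; other versions and layers use
# different tables. Index 0 ("free format") and index 15 are not handled here.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44_100, 48_000, 32_000, None]


def parse_frame_header(header: bytes) -> dict:
    """Decode the bit rate and sample rate indices of one frame header."""
    if len(header) < 4:
        raise ValueError("a frame header is 4 bytes long")
    word = int.from_bytes(header[:4], "big")

    if (word >> 21) & 0x7FF != 0x7FF:       # 11 sync bits, all set
        raise ValueError("frame sync not found")
    version_bits = (word >> 19) & 0x3       # 0b11 = MPEG-1
    layer_bits = (word >> 17) & 0x3         # 0b01 = Layer III
    if version_bits != 0b11 or layer_bits != 0b01:
        raise ValueError("this sketch handles MPEG-1 Layer III only")

    bitrate_index = (word >> 12) & 0xF      # 4-bit bit rate index
    sample_rate_index = (word >> 10) & 0x3  # 2-bit sample rate index
    padding = (word >> 9) & 0x1             # 1 extra byte when set

    bitrate_kbps = BITRATES_KBPS[bitrate_index]
    sample_rate_hz = SAMPLE_RATES_HZ[sample_rate_index]
    if bitrate_kbps is None or sample_rate_hz is None:
        raise ValueError("unsupported bit rate or sample rate index")

    # Frame length in bytes, header included (Layer III formula).
    frame_length = 144 * bitrate_kbps * 1000 // sample_rate_hz + padding
    return {
        "bitrate_kbps": bitrate_kbps,
        "sample_rate_hz": sample_rate_hz,
        "padding": padding,
        "frame_length_bytes": frame_length,
    }


if __name__ == "__main__":
    print(parse_frame_header(bytes.fromhex("FFFB9000")))
```

Because every frame carries its own bit rate index, an encoder is free to pick a different bit rate for each frame, which is what makes VBR and ABR possible.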
The above is the basis of MP3 encoding. In practice, early encoders were far from perfect: their compression algorithms were rudimentary and the sound quality was not ideal. MP3 sound quality reached its current level through two leaps: the introduction of the human auditory psychology model (perceptual model) and the application of VBR technology.
◆ Human auditory psychology model
Some important principles will be briefly introduced below:
1) The minimum hearing threshold
The human ear can hear frequencies from 20 Hz to 20 kHz, but its sensitivity varies with frequency: the intensity a sound needs before it becomes audible is different at each frequency. By calculating this threshold, the encoder can remove sounds that exist in the music file but cannot be heard. The same principle lets us build a model that allocates most of the data space to the 2 kHz to 5 kHz range, where the ear is most sensitive, and less space to the remaining frequencies.
2) The masking effect of the human ear
The masking effect occurs when a strong signal masks weaker signals at adjacent frequencies. Everyday experience bears this out: in a quiet room you can hear a needle drop to the floor, but on a busy street you may not hear your mobile phone ring even with its volume at maximum. The phone’s sound is still there; it is simply masked by the louder sounds around it. Drawing on research into the masking effect, the encoder uses an established mathematical model to calculate how much a strong signal masks nearby weak signals, and keeps only the sounds that a listener can actually perceive.
The human ear also exhibits pre-masking and post-masking: because the auditory system needs a certain amount of time to process a sound, a weak signal that occurs shortly before or after a strong signal is masked. Pre-masking lasts only 2-5 ms, while post-masking lasts much longer, about 100 ms. Using this, the encoder can reduce the resolution just before and after a strong signal.
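The principles above can be sketched in a few lines of code. This is a toy illustration, not what any particular MP3 encoder implements: the hearing threshold uses Terhardt's published approximation, the frequency axis uses Zwicker's approximation of the Bark scale, and the masking offset and slopes are illustrative values chosen for this example. The temporal windows reuse the 5 ms and 100 ms figures from the text. All function names are hypothetical.

```python
import math


def threshold_in_quiet_db(freq_hz):
    """Approximate minimum audible level in dB SPL (Terhardt's formula)."""
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


def hz_to_bark(freq_hz):
    """Zwicker's approximation of the Bark (critical band) scale."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))


def masked_threshold_db(masker_hz, masker_db, probe_hz,
                        offset_db=10.0, lower_slope=25.0, upper_slope=10.0):
    """Toy triangular masking curve around one strong tone.

    offset_db and both slopes (dB per Bark) are illustrative assumptions.
    """
    dz = hz_to_bark(probe_hz) - hz_to_bark(masker_hz)
    slope = upper_slope if dz >= 0 else lower_slope
    return masker_db - offset_db - slope * abs(dz)


def is_audible(probe_hz, probe_db, masker_hz, masker_db):
    """A weak tone is kept only if it clears both thresholds."""
    floor = max(threshold_in_quiet_db(probe_hz),
                masked_threshold_db(masker_hz, masker_db, probe_hz))
    return probe_db > floor


def is_temporally_masked(probe_time_s, masker_time_s,
                         pre_s=0.005, post_s=0.100):
    """True if a weak sound falls inside the pre- or post-masking window.

    Only timing is checked; relative levels are ignored in this sketch.
    """
    dt = probe_time_s - masker_time_s
    return -pre_s <= dt <= post_s


if __name__ == "__main__":
    for f in (100, 1000, 3300, 15000):
        print(f"{f} Hz: threshold {threshold_in_quiet_db(f):.1f} dB SPL")
    print(is_audible(1100, 30, masker_hz=1000, masker_db=80))
    print(is_audible(5000, 30, masker_hz=1000, masker_db=80))
```

In this model a 30 dB tone at 1100 Hz is hidden by an 80 dB tone at 1000 Hz, while the same 30 dB tone at 5000 Hz is far enough away in frequency to remain audible.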



Author: R. Arias

R. Arias, the author of this article, has more than 30 years of experience as a recording engineer and audio specialist and more than 20 years of experience creating algorithms related to audio and video.