
Structure of an MP3
Audio compression

Work on the MP3 format began in the mid-1980s at the Fraunhofer Institute in Erlangen, Germany, which was committed to high-quality, low-data-rate audio coding.
MP3 audio compression consists of two parts: encoding and decoding. Encoding converts the data in a WAV file into a highly compressed bitstream, and decoding takes that bitstream and reconstructs a WAV file from it.
MP3 uses a lossy algorithm based on perceptual audio coding. The human ear perceives sound in the frequency range from roughly 20 Hz to 20 kHz. MP3 removes a large amount of redundant and irrelevant signal content. The encoder transforms the original sound into the frequency domain through a hybrid filter bank. A psychoacoustic model estimates the noise level that is only just perceptible; the spectral values are quantized accordingly and then Huffman-coded to form the MP3 bitstream. The decoder is much simpler: its task is to rebuild the sound signal from the encoded spectral components through inverse quantization and an inverse transform.
When compressing audio data, the original sound data is first divided into fixed-size blocks, and a forward MDCT is applied to each block. The MDCT itself does not compress data; it only converts a set of time-domain samples into frequency-domain data, which shows how the time-domain signal varies. In MPEG-1 Layer III, each granule yields 576 frequency coefficients. Quantization is what compresses the data: when bits are allocated to the transformed samples, the goal is to make the quantized block as a whole as small as possible, and this is what makes the compression lossy. When decompressing, the coefficients are restored to sound data by the inverse MDCT. The sound data before and after compression are not identical, because redundant and irrelevant data were removed during compression.
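The block, transform, quantize, inverse-transform chain described above can be tried out with a small sketch. It is only an illustration under stated assumptions: the block size N = 8, the sine window, and the uniform quantizer with a fixed step are toy choices, not what a real MP3 encoder uses (MP3 drives a non-uniform quantizer from the psychoacoustic model), and the function names are made up for this example.

```python
import math

def mdct(block):
    """Forward MDCT: 2N windowed time samples -> N frequency coefficients."""
    n = len(block) // 2
    return [
        sum(block[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(2 * n))
        for k in range(n)
    ]

def imdct(coeffs):
    """Inverse MDCT: N coefficients -> 2N time samples that still contain aliasing."""
    n = len(coeffs)
    return [
        (2.0 / n) * sum(coeffs[k] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                        for k in range(n))
        for t in range(2 * n)
    ]

def sine_window(n):
    """Sine window of length 2N; satisfies w[t]^2 + w[t+N]^2 = 1."""
    return [math.sin(math.pi * (t + 0.5) / (2 * n)) for t in range(2 * n)]

def encode(signal, n, step=0.0):
    """Split into overlapping 2N blocks (hop N), window, MDCT, optionally quantize.

    Assumes len(signal) is a multiple of n. step=0 disables quantization.
    """
    w = sine_window(n)
    padded = [0.0] * n + list(signal) + [0.0] * n
    frames = []
    for start in range(0, len(padded) - n, n):
        block = [padded[start + t] * w[t] for t in range(2 * n)]
        coeffs = mdct(block)
        if step > 0:
            # Toy uniform quantizer: this is the only lossy step in the chain.
            coeffs = [round(c / step) * step for c in coeffs]
        frames.append(coeffs)
    return frames

def decode(frames, n):
    """Inverse MDCT each frame, window again, and overlap-add neighbouring blocks."""
    w = sine_window(n)
    out = [0.0] * (n * (len(frames) + 1))
    for i, coeffs in enumerate(frames):
        y = imdct(coeffs)
        for t in range(2 * n):
            out[i * n + t] += y[t] * w[t]
    return out[n:-n]  # drop the zero padding added by encode()

if __name__ == "__main__":
    signal = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
    for step in (0.0, 0.5):
        restored = decode(encode(signal, 8, step), 8)
        error = max(abs(a - b) for a, b in zip(signal, restored))
        print("step", step, "max error", error)
```

Without quantization the overlap-add of neighbouring blocks cancels the aliasing and the signal comes back up to floating-point rounding, which matches the statement that the MDCT alone does not compress anything. With a non-zero step the restored samples differ from the input, which is the lossy part.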
MP3 file structure
MP3 files are roughly divided into three parts: TAG_V2 (ID3V2), Frame, TAG_V1 (ID3V1).
ID3V2 contains information such as author, composer, and album. Its length is not fixed, and it extends the amount of information that ID3V1 can hold.
Frame
A series of frames; their number is determined by the file size and the frame length.
The length of each frame may be fixed or variable, depending on the bit rate.
Each frame is divided into two parts: the frame header and the data entity.
The frame header records the bit rate, sample rate, version, and other MP3 information, and the frames are independent of one another.
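As an illustration of what the frame header holds, the sketch below decodes the 4-byte header of an MPEG-1 Layer III frame and derives the frame length from the bit rate. It is a minimal example, not a full parser: it assumes MPEG-1 Layer III only, rejects free-format and reserved values, and the function name and the returned dictionary keys are invented for this example.

```python
# MPEG-1 Layer III lookup tables. Bitrate index 0 means "free format" and
# index 15 is invalid; sample-rate index 3 is reserved.
BITRATES_KBPS = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]
SAMPLE_RATES_HZ = [44100, 48000, 32000]
CHANNEL_MODES = ["stereo", "joint stereo", "dual channel", "mono"]

def parse_frame_header(header):
    """Decode a 4-byte MPEG-1 Layer III frame header; return a dict or None."""
    if len(header) < 4:
        return None
    word = int.from_bytes(header[:4], "big")
    if (word >> 21) & 0x7FF != 0x7FF:       # 11-bit frame sync, all ones
        return None
    version_bits = (word >> 19) & 0x3       # 0b11 = MPEG-1
    layer_bits = (word >> 17) & 0x3         # 0b01 = Layer III
    bitrate_index = (word >> 12) & 0xF
    rate_index = (word >> 10) & 0x3
    padding = (word >> 9) & 0x1
    channel_bits = (word >> 6) & 0x3
    if version_bits != 0b11 or layer_bits != 0b01:
        return None                         # outside the scope of this sketch
    if bitrate_index in (0, 15) or rate_index == 3:
        return None
    bitrate = BITRATES_KBPS[bitrate_index] * 1000
    sample_rate = SAMPLE_RATES_HZ[rate_index]
    # A Layer III frame carries 1152 samples; 1152 / 8 = 144 bytes per (bit/s / Hz).
    frame_length = 144 * bitrate // sample_rate + padding
    return {
        "bitrate": bitrate,
        "sample_rate": sample_rate,
        "padding": padding,
        "channel_mode": CHANNEL_MODES[channel_bits],
        "frame_length": frame_length,
    }

if __name__ == "__main__":
    print(parse_frame_header(bytes([0xFF, 0xFB, 0x90, 0x00])))
```

Because the frame length depends on the bit rate, a constant-bit-rate file has frames of (almost) equal length, while a variable-bit-rate file changes the bit rate index from frame to frame and so has frames of different lengths.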
ID3V1 contains author, composer, album, and other information; its length is 128 bytes.
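The two tags at either end of the file can be located with a few lines of code. The sketch below assumes the usual layouts: an ID3V2 tag starts with a 10-byte header ("ID3", two version bytes, one flags byte, and a 4-byte size stored with 7 useful bits per byte), and an ID3V1 tag is the last 128 bytes of the file, starting with "TAG" and followed by fixed-width fields. The function names are illustrative only, and the optional ID3V2 footer is ignored.

```python
def parse_id3v2_header(data):
    """Return (major_version, total_tag_size_in_bytes), or None if no ID3V2 tag."""
    if len(data) < 10 or data[:3] != b"ID3":
        return None
    size = 0
    for byte in data[6:10]:                 # "synchsafe" integer: 7 bits per byte
        size = (size << 7) | (byte & 0x7F)
    # The stored size excludes the 10-byte header itself.
    return data[3], size + 10

def parse_id3v1(data):
    """Return the ID3V1 fields from the last 128 bytes, or None if absent."""
    tag = data[-128:]
    if len(tag) < 128 or tag[:3] != b"TAG":
        return None

    def text(raw):
        # Fields are fixed width and padded with NUL bytes or spaces.
        return raw.split(b"\x00")[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "comment": text(tag[97:127]),
        "genre": tag[127],                  # index into the ID3V1 genre list
    }

if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as f:      # path to an MP3 file given by the caller
        content = f.read()
    print(parse_id3v2_header(content))
    print(parse_id3v1(content))
```

The total size returned for ID3V2 tells a reader where the first audio frame is expected to start, and removing the final 128 bytes when an ID3V1 tag is present gives the end of the frame data.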










