Principle of mp3 and file format analysis. Part 2


Free Download Mp4Gain
picture

Principle of mp3 and file format analysis. Part 2

mp3

MP3 uses perceptual audio coding (Perceptual Audio Coding) this distortion algorithm.

mp3

The frequency range of sound perceived by the human ear is 20 Hz to 20 kHz. MP3 cuts out a lot of redundant signals and irrelevant signals. The encoder transforms the original sound into the frequency domain through a mixed filter bank and uses a psychoacoustic model. to estimate that it may be only The perceived noise level is quantized and converted to Huffman coding to form an MP3 bit stream. The decoder is much simpler, its task is to extract the sound signal from the encoded spectral line components through inverse quantization and inverse transformation. The MP3 encoding and decoding process is shown in Figure 1.
2.4 Modified Discrete Cosine Transform The cosine transform
Modified Discrete CT (MDCT) refers to converting a time-domain data set to frequency-domain data in order to know the changes in the time domain. MDCT is an enhancement of the DCT algorithm. The first fast algorithm is fast Fourier transform (FFT), but FFT has complex operations, MDCT are real operations, easy to program.
When compressing audio data, first divide the original sound data into fixed blocks, and then perform direct MDCT (direct MDCT) to convert the value of each block into MDCT 512 coefficients. The 512 coefficients are restored to the original sound data, and The original before and after sound data is inconsistent because redundant and irrelevant data is removed during the compression process. The FMDCT transformation formula is:
k=0, 1,
.
n0=(N/2+1)/2, X(n) is the time domain value, X(k) is the frequency domain value. If N takes 1024 points, it becomes 512 frequency domain values.
The IMDCT transformation formula is:

n=0, 1, …, N-1
MDCT itself does not compress data, it simply maps the signal to another domain, and quantization compresses the data. When bit allocation is done on the quantized transformed samples, the entire quantized block must be considered the smallest, which is called lossy compression.
3 File Format Analysis
MP3 MP3 file data is made up of multiple frames, and the frame is the smallest unit of the MP3 file. Each frame, in turn, consists of a frame header, additional information, and sound data. The playback time of each frame is 0.026 seconds and its duration varies with the bit rate. Some MP3 files have extra bytes at the end that contain description information for non-audio data.


Free Download Mp4Gain
picture


Mp4Gain Main Window
picture


Mp4Gain Features
picture


Free Download Mp4Gain
picture

Author: R. Arias

R. Arias is the author of this article and has extensive experience for more than 30 years as a recording engineer and audio specialist, as well as more than 20 years of experience creating algorithms related to audio and video. Linkedin