Audio compression algorithms for streaming purposes


Free Download Mp4Gain
picture

Audio compression algorithms for streaming purposes.

Lossy, and Lossless compression

The problem of transmitting the necessary number of audio channels through a network of limited capacity forces us to resort to audio compression.

Lossy and Lossless ata compresion in digital audio

Despite the use of modern digital technologies, compression negatively affects sound quality and causes additional delay in signal transmission.

Currently, there are two fundamentally different approaches to compressing audio signals. This article will provide a general comparison between these two different compression principles. Also presented are graphs of the frequency response (amplitude frequency characteristic) of the sound sample in its original uncompressed form and after one cycle of encoding and decoding using MPEG Layer II and Enhanced apt-X.

Algorithms like MPEG and AAC use encoding using a psychoacoustic model of sound perception. Another approach is time encoding using Adaptive Differential PCM (ADPCM) in algorithms like Enhanced apt-X.

Linear PCM audio
Before compression, the audio is generally digitized in linear PCM format at 32 kHz, 44.1 or 48 kHz with a resolution of 16 or 24 bits.

The analog signal will be digitized in uncompressed digital PCM. The digital inputs of the codecs use oversampling to ensure conversion without timing issues. The uncompressed PCM signal is our benchmark for comparing compressed audio files.

MPEG Layer ll compression
MPEG 1 Layer ll is a widely used format. This is a typical example of a psychoacoustic perception coding algorithm that analyzes the incoming signal and compares it to a theoretical model to determine what frequency and what time domain information can be lost. The need to analyze the audio signal results in a mandatory delay, typically greater than 30 ms.

In theory, high compression ratios can be achieved, but even with relatively low compression, MPEG can seriously degrade audio quality. In Fig. 2 shows the frequency response after one pass of MPEG encoding of the source file.

Be aware of frequencies that are lost or distorted compared to original PCM audio.

Compression Enhanced Apt-X
Enhanced apt-X uses ADPCM audio processing technology. The signal is divided into four frequency bands that can be processed at a quarter of the original sample rate using a variable bit rate and variable quantization step. Since all processing is based on the time domain method, there is no delay other than the actual processing time required.

As a result, a 4: 1 compression ratio preserves the entire frequency content of the original signal with a coding delay of less than 3 ms. Frequency response graph in Fig. 3 shows the result of one pass encoding / decoding using Enhanced apt-X at 256 kbps and illustrates the high fidelity of Enhanced apt-X compared to the original uncompressed signal.

How Enhanced apt-X Works
The improved apt-X encoding algorithm passes the original PCM data through a specially designed two-stage Q-mirror filter to divide the signal into four subbands and reduce the clock frequency to 1/4 of the original clock frequency. The quantization procedure consists of processing four sub-signals to reduce each signal from 16 bits to 7 bits in subband 1, 4 bits in subband 2, 3 bits in subband 3 and 2 in subband 4.

The inverse quantizer and prediction scheme uses the above values ​​to predict the size of the next signal. This value is compared to the actual signal and the “difference” is measured. The encoder transmits this measured “difference” signal to the decoder. Each subband is processed in parallel and the output of the string quantizer and predictor is encoded with a predetermined resolution. The processing output of the four subbands is multiplexed into a single 16- or 24-bit enhanced apt-X signal. Then additional data and sync data is added to it for streaming.


Free Download Mp4Gain
picture


Mp4Gain Main Window
picture


Mp4Gain Features
picture


Free Download Mp4Gain
picture

Author: R. Arias

R. Arias is the author of this article and has extensive experience for more than 30 years as a recording engineer and audio specialist, as well as more than 20 years of experience creating algorithms related to audio and video. Linkedin