MP3 Compressor: A Technical Guide to Audio Compression

Audio compression is a vital technique in the music industry. The MP3 file format has been widely used for decades and is one of the most popular file formats for music files. In this article, we will delve into the technical aspects of MP3 compression, its algorithmic processes, and explore the potential drawbacks of this commonly used format.

Understanding Audio Compression

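In audio production, compression (more precisely, dynamic range compression) is the process of reducing the dynamic range of an audio signal. It works by analyzing the audio waveform and reducing the amplitude of any part of the signal that exceeds a set threshold. This is distinct from the data compression that formats such as MP3 apply to shrink files. Dynamic range compression can be done manually, but it is usually automated with specialized software.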

There are several types of audio compressors, including peak, RMS, and multiband compressors. Each type of compressor has its own set of uses and parameters that can be adjusted to achieve the desired result. Peak compressors, for example, reduce the volume of any signal that exceeds a certain threshold, whereas RMS compressors average the signal over time and reduce the volume of signals that are too loud.
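The threshold idea can be sketched in a few lines. This is a minimal illustration, not how any particular product works: the function name `compress_peak`, the threshold of 0.5, and the 4:1 ratio are all assumptions chosen for the example, and real compressors also apply attack and release smoothing.

```python
import math

def compress_peak(samples, threshold=0.5, ratio=4.0):
    """Hard-knee peak compressor (illustrative): shrink only the part of
    each sample's level that rises above the threshold."""
    out = []
    for x in samples:
        level = abs(x)
        if level > threshold:
            # The excess above the threshold is divided by the ratio.
            level = threshold + (level - threshold) / ratio
        # Restore the original sign of the sample.
        out.append(math.copysign(level, x))
    return out

quiet_and_loud = [0.1, -0.3, 0.9, -1.0]
compressed = compress_peak(quiet_and_loud)
print(compressed)
```

Samples below the threshold pass through unchanged, while the loud ones are pulled closer to it, which is what narrows the dynamic range.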

Understanding MP3 Compression

MP3 is a lossy compression format designed to reduce the file size of digital audio. It does this by analyzing the audio data and discarding information that the human ear is unlikely to perceive.

The MP3 Algorithm

The MP3 algorithm uses a process called perceptual coding to identify sounds that are less important to human perception and eliminate them from the audio signal. The algorithm then quantizes the remaining data, representing each remaining value with a limited number of bits. The resulting data is further compressed with Huffman encoding, a lossless compression algorithm that replaces frequently occurring values with shorter codes.
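The quantize-then-Huffman idea can be sketched as follows. This is only an illustration under simplifying assumptions: the helper names `quantize` and `huffman_code`, the step size, and the sample values are invented for the example, and a real MP3 encoder differs in detail (for instance, it selects from Huffman tables fixed by the standard rather than building one per signal).

```python
import heapq
from collections import Counter

def quantize(values, step):
    """Round each value to the nearest multiple of step (the lossy part)."""
    return [round(v / step) for v in values]

def huffman_code(symbols):
    """Build a prefix code in which frequent symbols get shorter bit strings."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (frequency, unique tie breaker, {symbol: code so far}).
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        # Merge the two rarest groups, prefixing their codes with 0 and 1.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

coefficients = [0.02, -0.01, 0.48, 0.03, 0.0, -0.52, 0.01, 0.04]
levels = quantize(coefficients, step=0.25)
codes = huffman_code(levels)
encoded = "".join(codes[v] for v in levels)
print(levels, codes, len(encoded), "bits")
```

Most of the small values quantize to zero, so zero becomes the most frequent symbol and receives the shortest code.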

At a typical bitrate of 128 kbps, the result is a file roughly 90% smaller than the CD-quality original, with relatively little loss in perceived sound quality.

MP3 Bitrate

MP3 encoders can also use a technique called variable bitrate encoding (VBR). This technique adjusts the bitrate over the course of the file, spending more bits on passages that need detailed encoding and fewer on passages that do not.

The quality of an MP3 file is largely determined by its bitrate. Higher bitrates result in higher sound quality and larger file sizes, while lower bitrates result in lower sound quality and smaller file sizes. Bitrates are typically measured in kilobits per second (kbps).

The Drawbacks of MP3 Compression

While MP3 is a popular format, there are potential drawbacks to using it. The main one is the loss of audio quality. The information that MP3 compression discards is chosen to be inessential to the human ear, but its removal can still be audible, particularly in complex and dynamic recordings.

Additionally, the MP3 algorithm can introduce audible artifacts, such as ringing or “smearing” of the audio signal. This can be particularly noticeable in high-frequency content and can be exacerbated by aggressive compression settings or lower bitrates.

MP3 Compressor Alternatives

While MP3 is a popular format, other tools offer related features. One is MP4Gain, whose normalizer works in a way similar to a compressor. MP4Gain is a tool that analyzes and adjusts the volume of audio files, providing a way to adjust audio levels without losing audio quality.

Unlike traditional audio compression, MP4Gain doesn’t remove audio data, and it doesn’t have a negative impact on sound quality. Instead, it adjusts the levels of the audio signal to provide a more consistent listening experience across different tracks.

Overall, MP3 compression remains one of the most widely used audio compression formats, and for good reason. It provides a high level of compression without sacrificing too much audio quality, making it an ideal format for sharing and distributing music online. However, it is important to understand the technical aspects of MP3 compression and to be aware of its potential drawbacks to make informed decisions when working with audio files.

The History of Audio Compressors

Early Days of Audio Compression

Audio compression has been used in various forms since the early days of audio recording. In the early 20th century, record producers used a technique called “overdubbing” to layer multiple tracks on top of each other to create a fuller, more dynamic sound. However, this technique also led to some tracks being too loud and others too quiet, which made the final mix sound unbalanced.

To solve this problem, audio engineers began using a technique called “gain reduction,” which involved reducing the volume of the louder tracks and boosting the volume of the quieter ones to achieve a more balanced sound. This technique laid the foundation for the modern audio compressor.

The Birth of the Audio Compressor

The first modern audio compressor was invented by the American electrical engineer, C.P. Boner, in 1936. Boner’s compressor used a photoelectric cell to detect changes in audio levels and adjust the gain accordingly. This invention was a game-changer for the music industry and paved the way for the development of more advanced compressors in the years to come.

The Rise of Digital Audio Compression

In the 1980s, digital audio became popular with the advent of the Compact Disc (CD) format. The CD stores audio as uncompressed PCM data, and the large size of that data created the need for digital audio compression once music began to move onto computers and networks.

One of the most popular audio compression formats to emerge in the 1990s was MPEG-1 Audio Layer III, or MP3 for short. This format revolutionized the music industry by allowing users to share and distribute music online, but it also sparked controversy over issues such as music piracy and loss of audio quality.

Today, audio compression remains a critical tool in music production, broadcasting, and other areas of the audio industry. Advanced compression techniques, such as multi-band compression and dynamic range compression, continue to evolve, providing musicians and engineers with new ways to shape and control the sound of their recordings.

How Much an MP3 Compresses

MP3 compression was an engineering response to the problem of digital storage and its large memory requirements. A conventional digital signal in PCM (Pulse Code Modulation) format can easily require about 10 megabytes per minute. This amounts to about 30 MB for a three-minute song.
A computer can handle that storage requirement for a few files, but with three thousand songs the numbers become worrying. On top of this, there is the problem of the Internet and its transmission speeds. Telephone lines have limited bandwidth, so very large files are a problem for conventional network traffic.
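The storage figures above can be checked with a short calculation. This sketch assumes CD-style PCM (44,100 Hz, 16 bits, two channels) and decimal megabytes; the function name `pcm_bytes` is just an illustrative choice.

```python
def pcm_bytes(seconds, sample_rate_hz=44_100, bits_per_sample=16, channels=2):
    """Size in bytes of uncompressed PCM audio of the given duration."""
    return seconds * sample_rate_hz * (bits_per_sample // 8) * channels

MB = 1_000_000  # decimal megabyte, assumed here for simplicity

per_minute = pcm_bytes(60)
per_song = pcm_bytes(3 * 60)      # a three-minute song
library = 3_000 * per_song        # three thousand such songs

print(f"{per_minute / MB:.1f} MB per minute")
print(f"{per_song / MB:.1f} MB per three-minute song")
print(f"{library / (1000 * MB):.1f} GB for 3,000 songs")
```

Under these assumptions a minute of audio takes a little over 10 MB, and a library of three thousand songs runs to tens of gigabytes.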

MP3 is the audio part (Audio Layer III) of the original MPEG-1 format, which was intended for moving pictures. The name MPEG, for Moving Picture Experts Group, comes from the committee created by ISO (International Organization for Standardization) and IEC (International Electrotechnical Commission) to develop this format. Its principle is based on the psychoacoustic model.

The human ear is known to discriminate between sounds within its limitations. According to subject matter expert Paul Sellars, “If you hear solitary applause in a room, it will surely sound loud, but if it is preceded by the sound of a gunshot, it will sound fainter. The same thing happens when you record a rock band: at a certain moment the guitar is the strongest sound in the mix, until the drummer hits a certain cymbal, at which point the guitar will seem to attenuate.” The MP3 algorithm uses this phenomenon to perform its compression. I once explained it in the article about the ATRAC compression of the MiniDisc.

The MP3 format divides the sound into 32 sub-bands, which allows it, following the psychoacoustic model on which it is based, to give priority to one element over another. At a given moment the material may contain a predominant low-frequency kick drum, a high-frequency cymbal, and the vocalist, all at the same time. The algorithm does not eliminate two of them; it dedicates less storage space to them.

The mathematics behind MP3 compression starts with the Nyquist-Shannon sampling theorem, which states that for a wave to be properly reproduced in PCM digital format, the sampling frequency must be at least twice the highest frequency to be reproduced. In this case, to reproduce frequencies up to 22.05 kHz (the audible range spans roughly 20 Hz to 20 kHz), the sampling frequency should be 44.1 kHz.
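The sampling rule can be illustrated with a small sketch. The function names are assumptions made for this example; `alias_frequency` shows where a tone above the limit would appear after sampling, which is why the limit matters.

```python
def min_sample_rate(highest_frequency_hz):
    """Lowest sampling rate that still captures the given frequency."""
    return 2 * highest_frequency_hz

def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a sampled tone appears after sampling."""
    folded = tone_hz % sample_rate_hz
    # Anything above half the sampling rate folds back below it.
    return min(folded, sample_rate_hz - folded)

print(min_sample_rate(22_050))          # rate needed for a 22,050 Hz tone
print(alias_frequency(20_000, 44_100))  # within the limit: unchanged
print(alias_frequency(30_000, 44_100))  # above the limit: folds back
```

A 20 kHz tone survives sampling at 44.1 kHz intact, while a 30 kHz tone would be misread as a lower frequency.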

The Fast Fourier Transform (FFT) is also used. It can decompose a complex wave (the PCM material) into a fundamental wave and its harmonics, each with its own amplitude. The Discrete Cosine Transform is used as well; it is closely related to the Fourier transform but works only with real numbers.
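To show what a cosine transform does, here is a naive sketch of the textbook DCT-II. It is an assumption chosen for clarity, not the exact transform or implementation an MP3 encoder uses, and the function name `dct_ii` is illustrative.

```python
import math

def dct_ii(samples):
    """Naive (unnormalized) DCT-II: project the signal onto cosine waves
    of increasing frequency, using real arithmetic only."""
    n = len(samples)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(samples))
        for k in range(n)
    ]

# A test signal that is itself one of the cosine shapes (index 2).
n = 8
signal = [math.cos(math.pi / n * (i + 0.5) * 2) for i in range(n)]
coefficients = dct_ii(signal)
print([round(c, 6) for c in coefficients])
```

All of the signal's energy lands in a single coefficient, which is the property that makes such transforms useful for compression: a few large values describe the sound, and the many near-zero ones cost very little to store.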

To What Extent It Is Recommended

These formats will continue to emerge and be perfected, but it should be understood that, however widespread they are, some detail in the audio is lost even when the loss is not perceived. In other words, this format should not be used for serious audio work.

Some improvement can be obtained by encoding at higher bitrates, such as 224, 256 and 320 Kbps. You can also consider VBR (Variable Bit Rate) encoding, where musical passages with greater dynamic complexity are given a higher storage rate than simpler ones. However, this brings other complications, because not all players can handle it.

Audio quality: Bitrate in MP3 files

The term bitrate is used often; it is the number of bits per second that a multimedia file (audio or video) uses. MP3 is currently one of the most widespread music formats (although there are newer formats such as Ogg Vorbis, AAC, FLAC and Monkey's Audio). However, its audio quality varies, depending on the settings with which the MP3 in question was compressed, including:

Mode: There are two main types:

Mono: A single channel (the right and left channels are combined rather than separated, which gives worse audio quality).

Stereo: Two channels (right and left, which improves audio quality).

Sampling frequency: Audio CDs use 44,100 Hz per channel, although there are higher frequencies, such as the 48,000 Hz used on DVDs, and lower ones. The higher the frequency, the higher the quality.

Bits: Audio CDs use 16 bits per sample (although an MP3 can be made from lower-quality audio, such as 8-bit).

Bitrate (bits per second): Audio CDs have a bitrate of about 1,411 Kbps (44,100 Hz * 16 bits * 2 channels). In MP3 format the maximum bitrate is 320 Kbps. It is often assumed that a 128 Kbps MP3 has quality similar to CD, although in many cases 192 Kbps is needed to come close, and 256 Kbps or 320 Kbps to obtain CD quality. Some of the most common bitrates are:
8 Kbps Mono: Telephone Sound.
16 Kbps Mono: Better quality than shortwave.
32 Kbps Mono: Better quality than AM.
64 Kbps Stereo: Better quality than FM.
112 – 128 Kbps: Quality close to CD.
160 Kbps: Quality closer to CD.
192 Kbps: Virtually CD quality.
256 Kbps: Practically indistinguishable from an original CD.
320 Kbps: CD quality.
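The relationship between these bitrates and the CD bitrate can be worked out directly. This is a rough sketch that ignores file headers and metadata; the function names are illustrative choices.

```python
def pcm_bitrate_kbps(sample_rate_hz=44_100, bits_per_sample=16, channels=2):
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000

def size_reduction_percent(mp3_kbps, source_kbps):
    """How much smaller the MP3 stream is than the source, in percent."""
    return 100 * (1 - mp3_kbps / source_kbps)

cd_kbps = pcm_bitrate_kbps()
print(f"CD audio: {cd_kbps:.1f} Kbps")
for mp3_kbps in (128, 192, 256, 320):
    saved = size_reduction_percent(mp3_kbps, cd_kbps)
    print(f"{mp3_kbps} Kbps MP3: about {saved:.0f}% smaller than CD audio")
```

Even at the maximum MP3 bitrate of 320 Kbps, the file remains several times smaller than the uncompressed CD audio.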

Encoding method: There are two types:

VBR (Variable Bit Rate): Encodes the MP3 file with a variable bitrate.

CBR (Constant Bit Rate): Encodes the MP3 file with a fixed bitrate.
Another factor that influences the encoding of the MP3 file is the codec (coder-decoder) used. One of the most common, and the one that gives the best results, is LAME (LAME Ain't an MP3 Encoder), which is also free.
One point to keep in mind: if we recompress an MP3 file that originally has a 128 Kbps bitrate to, say, 192 Kbps, no audio quality is gained. MP3 is a lossy algorithm, so the quality lost when the original (for example, an Audio CD or a 320 Kbps MP3) was converted to 128 Kbps cannot be recovered. As the saying goes, you cannot get something out of nothing. The only thing this recompression achieves is to increase the size of the file.
The opposite case (recompressing a 320 Kbps MP3 file to 192 Kbps, for example) does make some sense: although we lose some audio quality, we somewhat reduce the size (in kilobytes or megabytes) of each MP3 file.
In conclusion, if we need to encode an MP3 file with good quality, the “ideal” would be to:
Start from an Audio CD, although an MP3 at 320 or 256 Kbps could also be a valid source for recompression.
Use stereo mode (two channels, right and left).
Use a sampling rate of at least 44,100 Hz and 16 bits.
Use a bitrate of at least 192 Kbps and at most 256 Kbps (320 Kbps would give higher quality but also a considerably larger file).
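
As a rough sketch of how these recommendations might be applied with the LAME command-line encoder: the helper names and file names below are hypothetical, the sketch assumes the `lame` program is installed and that the source is a 44,100 Hz, 16-bit WAV file ripped from a CD, and it uses only the `-b` (bitrate) and `-m s` (stereo mode) options.

```python
import subprocess

def build_lame_command(wav_path, mp3_path, bitrate_kbps=192):
    """Argument list for encoding a WAV file to a stereo MP3 with `lame`."""
    return ["lame", "-b", str(bitrate_kbps), "-m", "s", wav_path, mp3_path]

def encode(wav_path, mp3_path, bitrate_kbps=192):
    """Run the encoder; raises CalledProcessError if `lame` reports failure."""
    subprocess.run(build_lame_command(wav_path, mp3_path, bitrate_kbps), check=True)

# Hypothetical file names, shown without running the encoder.
print(" ".join(build_lame_command("track01.wav", "track01.mp3", 256)))
```

Building the argument list separately from running it makes it easy to inspect the exact command before any file is written.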