Choose the best format to compress audio data



Choose the best format to compress audio data: MP3, AAC or WavPack?

Best Audio Format

If not lossless, then what? MP3, AAC, or something else? We have looked at music compression algorithms several times before; now it is time to compare the most notable ones.


Ideally, you would give up lossy codecs entirely, but it is always interesting to find the point where quantity turns into quality. Besides, even a lossy codec can surprise you, as you will see. For this review we decided not to experiment with different VBR modes and went straight to the maximum constant bitrate of 320 kbps. With today's storage capacities, why save an extra 10 MB per album at the risk of losing quality? In general, even with older codecs, a 320 kbps stream ensures the absence of the characteristic artifacts with their unpleasant ringing. The first part of the review compares the growth of artifacts using RMAA software; the second part presents the listener's subjective experience with real recordings.
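To put the size argument in numbers, here is a minimal sketch of the arithmetic for a constant-bitrate stream. The 45-minute album length and the 256 kbps comparison point are assumptions chosen only for illustration; container overhead and tags are ignored.

```python
def cbr_size_mb(bitrate_kbps: float, duration_s: float) -> float:
    """Approximate size in megabytes (10**6 bytes) of a constant-bitrate stream."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    return bytes_per_second * duration_s / 1_000_000


album_seconds = 45 * 60  # assumed album length, not a figure from the review

for kbps in (256, 320):
    print(f"{kbps} kbps: {cbr_size_mb(kbps, album_seconds):.1f} MB")
```

Under these assumptions the difference between the two bitrates is a couple of dozen megabytes per album, which is small next to the capacity of a modern drive.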

Comparative frequency response of three lossy formats relative to original WAV
Last time the iPad Mini was used as the sound source. This time, to improve accuracy, we take any hardware influence out of the equation: all distortion analysis is done exclusively in the digital domain, without conversion to analog, since RMAA allows this. To do so, we generate a test signal as a WAV file in RMAA and then encode it with each lossy codec in turn. Each encoded file is then decoded back to WAV so that the program can “recognize” it and evaluate the deviations from the original reference. Now let’s look at how the high frequencies are cut and how distortion grows, giving the sound an unpleasant coloration. By the way, there will not be much of it. In general, at a bitrate of 320 kbps it is not easy to detect anything harmful by ear. It is not so much about artifacts as about a slight “dulling” of the sound compared to the original. The recording seems to fade a little and loses its liveliness because transients are altered by psychoacoustic processing. This difference cannot always be pinned down clearly, though; it depends on the specific track.
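The all-digital round trip can be imitated outside RMAA. The sketch below is not RMAA's procedure: it only uses Python's standard library to generate a 1 kHz reference tone as a 16-bit WAV, read samples back, and report the RMS deviation between two sample lists. The function names and the tone parameters are assumptions for illustration, and the codec step itself is left out.

```python
import io
import math
import struct
import wave


def sine_wav_bytes(freq_hz=1000.0, rate=44100, seconds=1.0, amp=0.5):
    """Return a mono 16-bit PCM WAV file (as bytes) containing a sine tone."""
    n = int(rate * seconds)
    samples = [round(amp * 32767 * math.sin(2 * math.pi * freq_hz * i / rate))
               for i in range(n)]
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)      # 2 bytes = 16 bits per sample
        w.setframerate(rate)
        w.writeframes(struct.pack(f"<{n}h", *samples))
    return buf.getvalue()


def read_wav_samples(data):
    """Decode mono 16-bit PCM WAV bytes into a list of integers."""
    with wave.open(io.BytesIO(data), "rb") as w:
        n = w.getnframes()
        return list(struct.unpack(f"<{n}h", w.readframes(n)))


def rms_error_db(ref, test):
    """RMS difference between two sample lists, in dB relative to full scale."""
    n = min(len(ref), len(test))
    err = math.sqrt(sum((a - b) ** 2 for a, b in zip(ref[:n], test[:n])) / n)
    return float("-inf") if err == 0 else 20 * math.log10(err / 32768)


reference = read_wav_samples(sine_wav_bytes())
print(rms_error_db(reference, reference))  # identical signals, so no error
```

In a real comparison the second argument would be the samples decoded from the lossy file. Lossy codecs usually add a delay or padding, so the two signals would have to be time-aligned before such a sample-by-sample comparison means anything.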
MP3: Avalanche Distortion
Let’s start with the most popular format. MP3 is the monster from the Fraunhofer Institute that has taken over the Earth. Because of it, hardly anyone nowadays thinks of using plain WAV for sound recording. Even when people rip the already degraded audio from YouTube, they compress it again to MP3, and at an obscene 128 kbps at that. We will not do that: for the test we use the most current version of the LAME encoder, 3.100, with the insane preset at 320 kbps. The first figure showed that the MP3 spectrum, as expected, ripples in the high-frequency region and is eventually filtered off at the 20 kHz limit. Of course, that is the limit in a synthetic test; with a real music signal it will probably be even lower. The dynamic range of the MP3 file is unchanged compared to the original. That is, the LAME 3.100 encoder at 320 kbps does not add any noise to the recording.
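For readers who want to repeat the MP3 step, here is a hedged sketch of how the encode and decode commands could be assembled from Python. It relies on LAME's `--preset insane` and `--decode` options; the helper names and file names are hypothetical, and the review does not say exactly how the encoder was invoked.

```python
import shutil
import subprocess


def lame_encode_cmd(wav_in, mp3_out):
    """Command line for LAME's 'insane' preset (320 kbps constant bitrate)."""
    return ["lame", "--preset", "insane", wav_in, mp3_out]


def lame_decode_cmd(mp3_in, wav_out):
    """Command line that decodes an MP3 back to WAV for analysis."""
    return ["lame", "--decode", mp3_in, wav_out]


def round_trip(wav_in, mp3_out, wav_out):
    """Encode and decode if the lame binary is on PATH; return True if it ran."""
    if shutil.which("lame") is None:
        return False
    subprocess.run(lame_encode_cmd(wav_in, mp3_out), check=True)
    subprocess.run(lame_decode_cmd(mp3_out, wav_out), check=True)
    return True


print(" ".join(lame_encode_cmd("test.wav", "test.mp3")))
```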

1 kHz waveform distortion when encoded in MP3 compared to original WAV
Converting a single 1 kHz tone to MP3 revealed many small harmonic distortion products. Formally their share is small (0.0009%), one and a half to two times less than at the output of a good DAC, but in the dynamic spectrum of a real recording their number will grow like an avalanche and in an unpredictable order. Furthermore, the “thickening” at the base of the original 1 kHz peak points to a certain problem: contamination by spurious oscillations. This is clearly illustrated by the 100 Hz “square” wave after conversion to MP3. As you can see, its outline loses definition along the horizontal axis. All of this ultimately contributes to listening fatigue with MP3, unfortunately even at the highest bitrates.
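To give a feel for what a figure like 0.0009% means, the sketch below computes total harmonic distortion for a synthetic signal. It is not RMAA's measurement: the harmonic amplitudes are made up so that the result comes out near the quoted value, and the single-bin DFT assumes that the analysis window holds a whole number of cycles.

```python
import math


def tone_amplitude(x, freq_hz, rate):
    """Amplitude of the component at freq_hz, found with a single-bin DFT."""
    n = len(x)
    re = sum(v * math.cos(2 * math.pi * freq_hz * i / rate) for i, v in enumerate(x))
    im = sum(v * math.sin(2 * math.pi * freq_hz * i / rate) for i, v in enumerate(x))
    return 2 * math.hypot(re, im) / n


def thd_percent(x, fundamental_hz, rate, harmonics=5):
    """THD: RMS sum of harmonics 2..(harmonics + 1) relative to the fundamental."""
    fund = tone_amplitude(x, fundamental_hz, rate)
    harm = [tone_amplitude(x, k * fundamental_hz, rate)
            for k in range(2, harmonics + 2)]
    return 100 * math.sqrt(sum(h * h for h in harm)) / fund


RATE = 44100
N = 4410  # 0.1 s, exactly 100 cycles of 1 kHz

# Assumed test signal: a unit 1 kHz tone plus two tiny, invented harmonics.
signal = [math.sin(2 * math.pi * 1000 * i / RATE)
          + 7.2e-6 * math.sin(2 * math.pi * 2000 * i / RATE)
          + 5.4e-6 * math.sin(2 * math.pi * 3000 * i / RATE)
          for i in range(N)]

print(f"THD = {thd_percent(signal, 1000.0, RATE):.4f} %")
```

Harmonics at a few millionths of the fundamental are enough to produce a THD of this order, which is why the figure looks harmless on a single tone.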

100 Hz “square” wave after conversion to MP3 (top) and AAC (bottom)
AAC: Increase the noise, but keep it clean
The AAC algorithm, which is actively used by Apple and others, works more precisely. Digital TV broadcasters use this audio codec, and AAC is also part of the MPEG-4 standard. The square wave keeps its shape after conversion to AAC, although the broadening of the base and the harmonics around the 1 kHz peak appear here too, less noticeably than with MP3. At the same time, AAC shows a measured noise level that is 1 dB higher. What does that mean: an intermediate copy on cassette, or what? No, the AAC algorithm probably uses something like noise shaping, a great invention that reduces quantization errors by mixing in a pseudo-random noise signal. Again, this is not just about drowning the distortion below the noise floor, but about more sophisticated math. To illustrate, let’s look at the artifacts around the tone of the so-called jitter test at 11.025 kHz. Why this particular frequency? Because the second harmonic of this tone falls exactly on the upper limit of the spectrum of a 44.1 kHz digital stream, and all the higher ones fall outside it. Small spurious peaks, especially those symmetrical about the test tone (modulation products, or “sidebands”), are the traces of jitter.
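The choice of 11.025 kHz can be checked with a few lines of arithmetic. The sketch below assumes a 44.1 kHz sample rate and uses a hypothetical helper, `folded_frequency`, to show where each harmonic of the test tone would land if it folded back into the audible band.

```python
def folded_frequency(freq_hz, rate_hz):
    """Frequency at which a component appears after sampling (0 .. rate/2)."""
    f = freq_hz % rate_hz
    return rate_hz - f if f > rate_hz / 2 else f


RATE = 44100
TONE = 11025  # one quarter of the sample rate

for k in range(1, 6):
    print(f"harmonic {k}: {k * TONE} Hz -> {folded_frequency(k * TONE, RATE)} Hz")
```

Every harmonic lands on 0 Hz, on the tone itself, or on the 22.05 kHz band edge. The rest of the spectrum stays free of harmonic products, so any small peaks that do appear around the tone can be attributed to modulation.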

AAC (top) and MP3 (bottom) jitter test stability
As you can see, MP3 kept a low noise level but generated more high-frequency fluctuations (which are more noticeable to the ear), while AAC raised the noise a little but avoided clutter in the rest of the spectrum. The WavPack encoder, however, does even bigger tricks with noise shaping.
WavPack: Keep Frequency, Change Bit Width
To put it very briefly, the math in today’s WavPack encoder makes it one of the most flexible and coolest formats for audio enthusiasts, no kidding. Unlike FLAC, it supports 32-bit audio (I have recommended it for making lossless vinyl rips). Furthermore, WavPack can even pack a DSD file without converting it to PCM, and the resulting file is much smaller than the original DSF. We will talk about lossless WavPack some other time; for now, let’s look at the unique principle behind WavPack’s lossy mode. In one of my reviews I showed that in several cases of lossy compression it makes sense to reduce not the sample rate but the bit depth of the signal (that is, below 24 or 16 bits), carefully mixing in dither (a special noise profile that reduces quantization errors). WavPack went exactly this glorious way: it does not touch the sample rate at all but changes the bit depth, which becomes a dynamic value that tracks the loudness level of the signal. A bit like the DSD principle, right? Notably, when converting to lossy WavPack you can also save a parallel “correction” file, with which the original can be fully restored, down to the last bit. Admittedly, this does not save disk space, since the pair together is still about the size of the lossless original. Still, the format’s functionality is impressive. The bitrate of our test file was set to 320 kbps to match the maximum of MP3 and AAC, but in theory WavPack allows it to be set even higher.
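The idea of a lossy file plus a “correction” file can be shown with a toy example. This is not WavPack's actual algorithm, which adapts its quantization to the signal and applies noise shaping; the sketch simply truncates 16-bit samples to a fixed number of bits, without dither, and keeps the discarded part separately. The function names are hypothetical.

```python
def split_lossy_and_correction(samples, keep_bits, total_bits=16):
    """Split integer samples into a reduced-bit-depth part and the residual."""
    shift = total_bits - keep_bits
    lossy = [(s >> shift) << shift for s in samples]      # low bits zeroed
    correction = [s - l for s, l in zip(samples, lossy)]  # what was discarded
    return lossy, correction


def restore(lossy, correction):
    """Add the residual back to recover the original samples exactly."""
    return [l + c for l, c in zip(lossy, correction)]


original = [-32768, -1234, -1, 0, 1, 999, 32767]
lossy, correction = split_lossy_and_correction(original, keep_bits=8)

print(lossy)
print(correction)
print(restore(lossy, correction) == original)
```

The lossy part alone is a usable approximation of the signal, and the two parts together carry exactly the information of the original, which is why the pair cannot be smaller than a lossless file.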



Author: R. Arias

R. Arias is the author of this article and has more than 30 years of experience as a recording engineer and audio specialist, as well as more than 20 years of experience creating audio and video algorithms.