lossy compression in audio Archives

About Lossy

We all love good music. Until fairly recently, "good digital music" meant the audio CD: 44,100 Hz, stereo, 16 bits (linear) per channel, not compressed in any way, which according to Wikipedia comes to 1411.2 kbps.
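The 1411.2 kbps figure follows directly from the CD parameters. Here is a quick check in Python (the function name `pcm_bitrate_kbps` is made up for this illustration):

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Bitrate of uncompressed linear PCM in kilobits per second (1 kbit = 1000 bits)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


# Audio CD: 44,100 samples per second, 16 bits per sample, 2 channels.
cd_kbps = pcm_bitrate_kbps(44_100, 16, 2)
print(cd_kbps)  # 1411.2
```

The same function shows why early multimedia dropped to 22 kHz mono: `pcm_bitrate_kbps(22_050, 16, 1)` is a quarter of the CD rate.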

But at the end of the 20th century, at the dawn of the multimedia era, when music began to be played not only on dedicated players but also on computers, it turned out that audio CD data (that is, raw PCM) really ought to be compressed. There was, for example, Microsoft ADPCM, which shrank it somewhat inside WAV files, though as a lossy scheme it did so at some cost to quality. But generally speaking, the original 44 kHz stereo still required a lot of space that way, so quality was dropped to 22 kHz mono. One of the first multimedia albums of that time, “Immersion” by the group “Nautilus Pompilius”, is still around, and I had it.

So MP3 won as the way to store and distribute compressed music, with 128 kbps billed as “CD quality”.

MP3 has an odd origin. Technically, it is MPEG-1 Audio Layer 3: the audio compression layer of what was then a modern, progressive standard for storing video on Video CDs, just packaged in its own .mp3 file format. Video CD is no longer of interest to anyone. The MPEG-2 standard that followed is used in DVD and in digital television broadcasts (not HD), and the next standard, MPEG-4, is now used for HD video and continues to evolve.

MP3 was revolutionary. It was (almost) the first lossy compression format: instead of trying to preserve everything that was in the original signal, the encoder uses a psychoacoustic model to cut out what a person is not going to hear anyway, and then compresses the rest. Like JPEG.

Then I tried digitizing my accumulated audio collection. Compact cassettes (commonly just “cassettes”, but “compact cassettes” is more correct) turned out to be complete shit: their frequency range is such that it makes no sense to sample at more than 22 kHz. There were no reel-to-reel recorders in the house. But vinyl records stunned me with their sound quality. With good equipment, you can extract better quality than from a CD. You just need to get rid of the clicks.

And then I realized that MP3 is shit too. At that same 128 kbps, the sound quality suffers greatly. The scariest thing is that nasty metallic overtones appear where they shouldn’t be. My ears need at least 192 kbps, and the more the better.

Let’s take a track by a punk rock band that was famous in the past, in FLAC. FLAC is a modern lossless compression standard that has successfully replaced WAV, because it is free.

The original is CD quality, so frequencies up to 22 kHz are present as expected.

[Spectrogram: original FLAC]

We are going to compress it with FFmpeg, or rather with LAME, the MP3 encoder that FFmpeg uses.
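For anyone who wants to repeat the experiment, here is a rough sketch of how the encoding commands could be assembled. The exact settings used for this test are not stated, so the file names and the choice of constant bitrate are assumptions; `-i`, `-codec:a libmp3lame` and `-b:a` are standard FFmpeg options.

```python
def lame_mp3_command(src: str, dst: str, bitrate_kbps: int) -> list:
    """Build an ffmpeg argument list that encodes src to MP3 via libmp3lame."""
    return [
        "ffmpeg",
        "-i", src,                    # input file (hypothetical name below)
        "-codec:a", "libmp3lame",     # use the LAME encoder for the audio stream
        "-b:a", f"{bitrate_kbps}k",   # target audio bitrate
        dst,
    ]


# The bitrates compared in this article, highest to lowest.
bitrates = [320, 256, 192, 128, 64]
commands = [lame_mp3_command("original.flac", f"test_{b}.mp3", b) for b in bitrates]
for cmd in commands:
    print(" ".join(cmd))
```

The sketch only prints the commands. To actually run one, pass the list to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is built with libmp3lame.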

At 320 kbps and 256 kbps, the spectrogram looks almost like the original.

At 192 kbps, there are signs of a cutoff at 16 kHz. The spectrogram “darkens”: apparently the psychoacoustic model has cut something out. By ear, the high-frequency “splashes” really have disappeared.

[Spectrogram: MP3 192 kbps]

At the notorious 128 kbps, everything is bluntly cut off at 16 kHz. Background sounds become “fuzzy” and begin to gurgle. In terms of enjoying the musical details, it has nothing in common with the original.

[Spectrogram: MP3 128 kbps]

But MP3 can also do 64 kbps. The stereo is gone. Everything gurgles terribly and irritates with completely strange sounds.

In what format and with what quality is music heard on the radio?

Audio file formats most used on the radio

In fact, there are currently two main families of audio formats: lossy and lossless (the latter covers uncompressed formats as well as losslessly compressed ones). Each family includes many individual formats.

Lossy formats take up less disk space but degrade the quality of the audio track. When audio is compressed using the MPEG standards (hence the names: mp3 for audio, mp4 for files that can also contain video), the overtones and transitional tones that are barely noticeable to the ear are cut off. This makes the file lighter, but it also degrades it. The bitrate of the file plays a major role: it is the amount of data spent on each second of the audio track. The lower the bitrate, the less space the file occupies and the worse the quality. Thus, a three-minute track in mp3 at 320 kilobits per second occupies about 7 megabytes on disk; the same track at 96 kilobits per second occupies about 2 megabytes.
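File size follows from bitrate and duration by simple arithmetic. Here is a small sketch (the function name is invented for this example, and it ignores tags and other file overhead):

```python
def file_size_megabytes(bitrate_kbps: float, duration_s: float) -> float:
    """Approximate size of a constant-bitrate stream in megabytes (1 MB = 1,000,000 bytes)."""
    bits = bitrate_kbps * 1000 * duration_s
    return bits / 8 / 1_000_000


three_minutes = 3 * 60  # seconds
size_320 = file_size_megabytes(320, three_minutes)
size_96 = file_size_megabytes(96, three_minutes)
print(size_320)  # 7.2
print(size_96)   # 2.16
```

So a three-minute track is roughly 7 MB at 320 kbps and roughly 2 MB at 96 kbps.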

Lossless audio is as close to the original analog sound as possible, which is why sound engineers love it. Lossless formats take up much more disk space, even compared to mp3 at 320 kbps. The most common of these formats are WAV (the standard), FLAC (the space-saving one), and AIFF (Apple). The first is used most often.

Professional sound recording is done only in uncompressed formats; sound engineers work with nothing else.

On the radio, the situation is somewhat more complicated, owing to the way the medium works: it needs speed and commercial profitability. High-capacity servers are expensive, so most radio stations encode audio tracks as mp3 at a bitrate of 256 kilobits per second. However, this is typical mainly of domestic stations. Equipment purchased abroad comes with standard configurations that assume WAV encoding.

Why do software developers focus on WAV? Because a radio signal cannot propagate without interference, so the listener always receives a slightly, and sometimes significantly, distorted signal. Broadcasters therefore face a reasonable question: which will the listener perceive better, a distorted ideal or a distorted distortion? For this reason, Europe and the United States have adopted WAV (or AIFF, if the station runs Apple equipment), while Russia uses mp3 at a bitrate of 256 kilobits per second.

Analog data transmission is based on the physical properties of sound, and the record and playback mechanism follows the principles of human hearing. The sound wave vibrates a membrane (analogous to the eardrum), and a needle records it on the medium in the form in which it was received. It is therefore also reproduced without the deviations and changes associated with digital conversion.

The Audio Files category includes compressed and uncompressed audio formats that contain a data signal and can be played by audio programs. This category also includes MIDI files, music scores, and audio project files, which generally do not contain audio data.

The most common extensions are .WAV, .AIF, .MP3, and .MID.

Lossy audio compression

MP3: Lossy compression

I’ll start with the well-known and widely used (though not always loved) MP3 format.

This audio format is used absolutely everywhere, whether it is needed or not. But that does not mean it is unworthy of the place it occupies in its niche. It is very worthy. Although it has been “sitting” in that niche for about two decades, no one has “kicked” it out yet, and there were many who wanted to. The main contender was WMA (Windows Media Audio), which Microsoft conceived as an alternative to MP3. In the end an alternative is all it remained, despite the developers’ best efforts. The next contender is OGG. Despite offering broader possibilities than MP3, it never gained widespread acceptance, even though it is supported on many operating systems. It is also worth mentioning the AAC audio format, which was supposed to take the baton from MP3: encoding quality was improved and compression losses reduced. But alas.

The main advantage of these formats is their small size. The downside is the loss of quality.

Different formats

In today’s world, you can find a large number of different audio formats. Let’s recall a few at a glance:

MP3 (where would we be without it?)
WMA
OGG
AAC
And many others

Of course, each of these formats is good, especially MP3, which is probably the most popular. But today we are not talking about popularity. MP3 and similar formats, no matter how good they sound, are compressed versions of the original. Even at the maximum bitrate of 320 kbps, the result will not be of the highest quality: the audio was compressed and reduced, so there are bound to be certain losses.

Lossy compression: Compress audio and video

High-quality digitized audio requires a large amount of disk space. Attempts to reduce file size with standard archivers do not yield significant gains because of the specific nature of audio data. However, a fairly significant level of compression can be achieved with special methods that analyze the structure of the data and then compress it with some loss.
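The weak result from general-purpose compressors is easy to try for yourself. The sketch below is only an illustration under stated assumptions: instead of a real recording it generates one second of synthetic 16-bit mono audio (a 440 Hz tone plus a seeded random noise floor, both values chosen arbitrarily) and runs it through `zlib`, the same family of algorithm used by ZIP.

```python
import math
import random
import struct
import zlib


def synthetic_pcm(num_samples: int, seed: int = 0) -> bytes:
    """One channel of 16-bit little-endian PCM: a 440 Hz tone plus random noise.

    The tone amplitude and noise range are arbitrary stand-ins for real audio.
    """
    rng = random.Random(seed)
    samples = []
    for n in range(num_samples):
        tone = 10_000 * math.sin(2 * math.pi * 440 * n / 44_100)
        noise = rng.randint(-1000, 1000)
        samples.append(int(tone) + noise)
    return struct.pack(f"<{num_samples}h", *samples)


pcm = synthetic_pcm(44_100)            # one second at 44.1 kHz
packed = zlib.compress(pcm, 9)         # maximum compression level
ratio = len(packed) / len(pcm)
print(f"{len(pcm)} bytes -> {len(packed)} bytes (ratio {ratio:.2f})")
```

The low-order bits of each sample behave like noise, and a byte-oriented archiver cannot find repeating patterns in them, so most of the original size remains.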

The real possibility of sound processing comparable in quality to existing analog examples did not appear until the late 1980s. In 1988, the International Organization for Standardization (ISO) formed the MPEG (Moving Picture Experts Group) committee, whose main task is to develop standards for encoding moving images, sound, and their combination. During the ten years of its existence, the committee developed a series of standards on this subject. As a result, summarizing extensive research in this area, several specific formats were recommended for storing data, which give excellent results in both quality and data rate.

Currently, the three most common video storage standards are MPEG-1, MPEG-2, and MPEG-4. The first two also define formats for storing audio: Layer-1, Layer-2, and Layer-3. These three audio formats are defined in MPEG-1 and are used in MPEG-2 with minor extensions. The three are similar to each other but strike different compromises between compression and complexity. Layer-1 is the simplest: it does not require much computation, but it also provides only a modest compression ratio. Layer-3 is the most computationally demanding and provides the best compression. This format has gained immense popularity and is commonly called MP3, a name that comes from the extension of the audio files stored in it.

The basic idea underlying all lossy audio compression methods is to ignore the subtle details of the original sound that lie outside what the human ear perceives. Several points can be highlighted here.

Noise level. Sound compression is based on a simple fact: if a person is near a loud siren, they are unlikely to hear the conversation of people nearby. Moreover, this happens not so much because the person is paying attention to the loud sound, but because the human ear genuinely fails to register sounds that lie in the same frequency range as a louder sound. This effect is called masking, and it varies with the difference in volume and frequency of the sounds.

The second point is the division of the audio frequency band into subbands, each of which is then processed separately. The encoder finds the loudest sounds in each band and uses this information to determine an acceptable noise level for that band. The best encoders also take the influence of adjacent bands into account: a very loud sound in one band can mask sounds in nearby bands as well.
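The per-band logic described above can be sketched in a few lines. This is a toy model, not the psychoacoustic model of any real encoder: the function name, the 20 dB masking offset and the 15 dB-per-band falloff are all invented for illustration.

```python
def allowed_noise_db(band_peaks_db, masking_offset_db=20.0, spread_db_per_band=15.0):
    """For each subband, estimate how loud noise may be while staying masked.

    Each band's loudest sound masks noise up to (peak - masking_offset_db) in its
    own band, and its masking effect weakens by spread_db_per_band for every band
    of distance. The allowed noise level in a band is the strongest masking that
    reaches it. All constants are toy assumptions.
    """
    thresholds = []
    for i in range(len(band_peaks_db)):
        candidates = []
        for j, peak in enumerate(band_peaks_db):
            distance = abs(i - j)
            candidates.append(peak - masking_offset_db - spread_db_per_band * distance)
        thresholds.append(max(candidates))
    return thresholds


# Loudest sound found in each of four subbands, in dB (made-up values).
peaks = [60.0, 90.0, 40.0, 30.0]
noise = allowed_noise_db(peaks)
print(noise)  # [55.0, 70.0, 55.0, 40.0]
```

Note the third band: its own peak is 40 dB, but the very loud neighbour lets noise rise to 55 dB there, so in this toy model everything in that band is masked and could be dropped entirely.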

Another point in encoding is the use of a psychoacoustic model based on the peculiarities of human sound perception. Compression with such a model relies on removing frequencies that are clearly inaudible while more carefully preserving sounds that the human ear distinguishes well. Unfortunately, there can be no exact mathematical formulas here. Human perception of sound is a complex process that is not fully understood, so compression methods are chosen on the basis of listening tests in which teams of experts compare sounds compressed in different ways. On the other hand, this leaves practically limitless room for improving psychoacoustic models. Most existing algorithms for encoding the human voice rely on the high predictability of that signal; general-purpose MPEG compression algorithms have tried to apply this technique with varying success.

Another compression technique is so-called joint stereo. It is known that the human auditory system can determine direction only for mid frequencies; high and low frequencies sound, so to speak, detached from their source. This means that these frequencies can be encoded as a mono signal. In addition, compression exploits the difference in complexity between the streams in the two channels. For example, if the right channel is completely silent for some time, the space “saved” there is used to improve the quality of the left channel.
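One concrete form of joint stereo used by MP3 encoders is mid/side coding, which stores the sum and difference of the channels instead of left and right. The sketch below only illustrates that idea with made-up function names; it is not an encoder's actual bitstream logic. The frequency-dependent merging to mono described above (usually called intensity stereo) is a separate technique that it does not cover.

```python
def to_mid_side(left, right):
    """Convert left/right samples to mid (average) and side (half-difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Reconstruct left/right samples from mid and side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


# When both channels carry the same signal, the side channel is all zeros,
# so almost all of the bit budget can go to the mid channel.
left = [0.5, 0.25, -0.75, 1.0]
right = [0.5, 0.25, -0.75, 1.0]
mid, side = to_mid_side(left, right)
print(side)  # [0.0, 0.0, 0.0, 0.0]

# The transform itself loses nothing: the original channels come back exactly.
restored_left, restored_right = from_mid_side(*to_mid_side([0.5, 0.25], [0.25, -0.25]))
print(restored_left, restored_right)  # [0.5, 0.25] [0.25, -0.25]
```

The more similar the two channels are, the smaller the side signal and the fewer bits it needs, which is the same "spend the bits where the complexity is" idea as in the silent-channel example.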