mp3 compression technology Archives - Page 2 of 2

How much does an MP3 compress?

Free Download Mp4Gain

MP3 compression was an engineering response to the problem of digital audio storage and the large amount of memory it requires. A conventional digital signal in PCM (Pulse Code Modulation) format can easily require up to 10 megabytes of storage per minute. That comes to about 30 MB for a three-minute song.
Any computer can handle that storage requirement for a few files, but with three thousand songs the numbers become worrying. As if this were not enough, there is the problem of the Internet and its current transmission speeds. Telephone lines in particular have limited bandwidth, so very large files are a problem for conventional network traffic.
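The figures above can be checked with a little arithmetic. This is only a sketch: it assumes CD-style PCM (44.1 kHz sampling, 16 bits per sample, two channels), and the function name `pcm_bytes` is made up for the illustration.

```python
def pcm_bytes(seconds, sample_rate=44_100, bits_per_sample=16, channels=2):
    """Size in bytes of uncompressed PCM audio (assumed CD-style parameters)."""
    return seconds * sample_rate * (bits_per_sample // 8) * channels

per_minute = pcm_bytes(60)
three_minutes = pcm_bytes(180)

print(f"{per_minute / 1e6:.1f} MB per minute")        # 10.6 MB per minute
print(f"{three_minutes / 1e6:.1f} MB for three minutes")  # 31.8 MB for three minutes
```

With other sample rates or bit depths the totals change, but the order of magnitude stays the same.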

MP3 (MPEG-1 Audio Layer III) is the audio part of the original MPEG-1 format, which was intended for video. The name MPEG stands for Moving Picture Experts Group, the committee created by ISO (International Organization for Standardization) and IEC (International Electrotechnical Commission) to develop this format. Its principle is based on the psychoacoustic model.

The human ear is known to be selective about what it hears, within its own limits. According to subject matter expert Paul Sellars, “If you hear solitary applause in a room, it will surely sound loud, but if it is preceded by the sound of a gunshot, it will sound fainter. The same thing happens when you record a rock band in a room: at a certain moment the guitar is the strongest sound in the mix, until the drummer hits a certain cymbal, at which point the guitar will seem to attenuate.” This phenomenon is used by the MP3 algorithm to perform its compression. I once explained it in the article about the ATRAC compression of the MiniDisc.

The MP3 format divides the sound into 32 sub-bands, which allows it, according to the psychoacoustic model on which it is based, to give priority to one element over another. At a given moment in the material we can have the predominant low-frequency sound of the kick drum, the high frequencies of a cymbal and the vocalist, all at the same time. The algorithm does not eliminate the less prominent sounds; it dedicates less storage space to them.
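To get a feel for the 32 sub-bands, here is a minimal sketch that maps a frequency to its band. It assumes a 44.1 kHz sampling rate and 32 bands of equal width; the function name and the example frequencies for kick drum, voice and cymbal are hypothetical, chosen only for illustration.

```python
def subband_index(freq_hz, sample_rate=44_100, n_bands=32):
    """Index (0-31) of the equal-width sub-band that contains freq_hz."""
    nyquist = sample_rate / 2
    if not 0 <= freq_hz < nyquist:
        raise ValueError("frequency must lie between 0 and half the sample rate")
    band_width = nyquist / n_bands  # about 689 Hz at 44.1 kHz
    return int(freq_hz // band_width)

# Hypothetical example frequencies, not taken from any real recording.
for name, freq in [("kick drum", 60), ("voice", 1_000), ("cymbal", 10_000)]:
    print(name, "-> sub-band", subband_index(freq))
# kick drum -> sub-band 0
# voice -> sub-band 1
# cymbal -> sub-band 14
```

Because the three sounds land in different bands, an encoder can decide band by band how much storage each one deserves.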

The mathematics behind MP3 compression starts with the Nyquist-Shannon sampling theorem, which states that for a wave to be properly reproduced in PCM digital format, its sampling rate must be at least twice the highest frequency we want to reproduce. In this case, if we want to reproduce frequencies up to 22.05 kHz (the audible range spans roughly 20 Hz to 20 kHz), our sampling rate should be 44.1 kHz.
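The sampling rule can be written down in a few lines. This is a sketch with made-up function names; `alias_frequency` shows what happens to a tone that is too high for the chosen sampling rate, assuming ideal sampling with no anti-aliasing filter.

```python
def min_sample_rate(max_freq_hz):
    """Lowest sampling rate that can represent max_freq_hz."""
    return 2 * max_freq_hz

def max_representable_freq(sample_rate_hz):
    """Highest frequency a given sampling rate can represent."""
    return sample_rate_hz / 2

def alias_frequency(freq_hz, sample_rate_hz):
    """Apparent frequency of a tone after sampling (folds around half the rate)."""
    f = freq_hz % sample_rate_hz
    return min(f, sample_rate_hz - f)

print(min_sample_rate(20_000))          # 40000
print(max_representable_freq(44_100))   # 22050.0
print(alias_frequency(30_000, 44_100))  # 14100
```

The last line is why the rule matters: a 30 kHz tone sampled at 44.1 kHz does not disappear, it comes back as a false 14.1 kHz tone.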

The Fast Fourier Transform (FFT) is also used, which as we know can decompose a complex wave (the PCM material) into the sinusoidal components that make it up, a fundamental and its harmonics, each with its own amplitude. The Discrete Cosine Transform is also used; it is closely related to the Fourier transform but works only with real numbers. (MP3 uses a modified variant of it, the MDCT.)
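As an illustration of that decomposition, here is a naive discrete Fourier transform in plain Python. It computes the same result an FFT would, only more slowly, and it is not the transform an MP3 encoder actually runs. The test signal (a 440 Hz tone plus a quieter 1000 Hz tone, sampled at 8 kHz) is invented for the example.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Amplitude of each frequency bin from 0 up to half the sample rate."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        mags.append(abs(acc) * 2 / n)  # scale so a unit sine reads as 1.0
    return mags

sample_rate = 8_000
n = 200  # bin spacing is sample_rate / n = 40 Hz
signal = [math.sin(2 * math.pi * 440 * i / sample_rate)
          + 0.5 * math.sin(2 * math.pi * 1_000 * i / sample_rate)
          for i in range(n)]

mags = dft_magnitudes(signal)
peaks = [k * sample_rate / n for k, m in enumerate(mags) if m > 0.1]
print(peaks)  # [440.0, 1000.0]
```

The two tones that were mixed together in the time domain show up as two separate peaks, which is what lets an encoder treat each frequency region on its own.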

HOW FAR IT IS RECOMMENDED

These formats will continue to appear and be perfected, but it should be understood that, however widespread they are, some details of the sound are lost. In other words, this format should not be used for serious audio work.

Some improvement can be had by encoding at a higher bitrate, such as 224, 256 or 320 kbps. You can also consider VBR (Variable Bit Rate) encoding, in which musical passages with greater complexity are given a higher bitrate, and therefore more storage, than simpler ones. However, this brings other complications, because not all players can handle VBR files.


How an MP3 compresses music

We all know that MP3 quickly became a popular audio format. The main reason is that it took up much less space than the uncompressed WAV format, which was very difficult to transfer over the internet from one computer to another.

That is when MP3 made its appearance: it sounded very good and yet took up between 7 and 10 times less space than the original file.

This made it easy for people to exchange music files online, and it even changed the way the music industry has worked ever since.

But although we all know that MP3 takes up less space, very few people understand how it does so. MP3 compresses the music, but it also uses several other procedures to make the music take up less disk space. Today we will briefly explain how MP3 performs this compression.

Remove inaudible sounds

One of the first things an MP3 encoder does is analyze the music file and eliminate the frequencies that are not audible to the human ear but nevertheless occupy space in the original file. This saves a lot of space with no audible loss of quality.

Eliminate redundancy

Another mechanism MP3 uses to save space is eliminating redundant sounds. By that we mean sounds that are very similar and occupy essentially the same part of the audio, so the ear perceives only some of them. The MP3 encoder eliminates the redundant ones that the human ear will not hear.

Sound masking

Acoustics and audio specialists discovered long ago that when the human ear perceives more than one sound simultaneously, it is very likely that one of them masks the others.

Sound perception works in such a way that when a person hears two sounds of different intensity at the same time, the weaker one, with less volume, is inaudible to the listener. This is what is called sound masking, and MP3 relies heavily on it to decide which sounds can be eliminated.

In other words, the MP3 encoder decides which sound will mask the others and then eliminates those others.

It should be noted that when you choose whether the MP3 is encoded at 128 kilobits per second (kbps) or at 320 kbps, you are changing the amount of sound that will be eliminated by masking. At 320 kbps very few sounds are eliminated; as the bitrate is lowered, more are eliminated, to the point where the listener may be able to distinguish the encoded file from the original audio file.

How is an mp3 file compressed?

The MP3 file takes up less space but loses information from the original recording, so it is a lossy compression. The question is, what is the algorithm for discarding those details of the music? How are they removed from the recording? Do they really not matter, so that we do not perceive those losses?

MP3 and auditory masking

The MP3 compression algorithm eliminates details of the original music based on the sound masking of our sense of hearing, a psychoacoustic phenomenon so ordinary that many people will never have paid attention to it, and one that we need to know about in order to understand MP3.

Imagine that we are talking to someone on the street, a car passes by and suddenly we stop hearing our interlocutor. Why have we stopped hearing the other person? If we had recorded this situation with a microphone we would see that both sounds, the voice and the car, would have been perfectly recorded …

This phenomenon occurs because there are situations in which our sense of hearing gives prominence to one sound and ignores another when both are simultaneous. This is called sound masking, and it depends on well-defined causes that can be summarized as follows.

One sound can mask another when they reach the ear simultaneously, depending on their relative frequencies and volumes. As seen in the figure, the loudest sound makes our ear create a new hearing threshold, or masking threshold, around its frequency at that moment. If another simultaneous sound in that frequency neighborhood falls below the threshold, we will not perceive it.
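The idea of a masking threshold can be sketched with a deliberately crude toy model. Everything numeric here is an assumption made up for the illustration: the threshold is taken to sit 10 dB below the masker and to fall by 25 dB for every octave of distance from it. Real psychoacoustic models are considerably more elaborate.

```python
import math

def masked_components(components, slope_db_per_octave=25.0, offset_db=10.0):
    """Return the (freq_hz, level_db) components hidden by a louder neighbor.

    Toy model: a masker raises the hearing threshold to offset_db below its
    own level, falling by slope_db_per_octave with distance in octaves.
    """
    hidden = []
    for freq, level in components:
        for m_freq, m_level in components:
            if m_level <= level:
                continue  # only a louder sound can act as the masker here
            octaves = abs(math.log2(freq / m_freq))
            threshold = m_level - offset_db - slope_db_per_octave * octaves
            if level < threshold:
                hidden.append((freq, level))
                break
    return hidden

# A loud 1 kHz tone with two quiet tones, one nearby and one far away.
sounds = [(1_000, 80), (1_100, 40), (4_000, 40)]
hidden = masked_components(sounds)
print(hidden)  # [(1100, 40)]
```

The quiet tone right next to the loud one is masked, while an equally quiet tone two octaves away remains audible, which matches the dependence on relative frequency and volume described above.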

Temporal masking

When a sound is powerful enough to act as a masker, there are moments before and after it during which we will not perceive other sounds, depending on how close they are in time and on their relative volume, with the behavior represented in the figure. As you can see, a sound can be masked whether it occurs immediately after the masker or even just before it!

The MP3 compression algorithm

When we perform an MP3 compression, the coding algorithm divides the music into a multitude of very short fragments. Each of these fragments is analyzed individually in many frequency bands, to detect whether any band contains a masking sound that hides sounds in the other bands of the fragment, making them inaudible and therefore expendable. In that case, the encoder stores that fragment with fewer bits than the original, so the resolution of the subtlest details (the ones found to be expendable) is lost and the background noise of the fragment increases.
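The splitting step can be sketched as follows. The frame length of 1152 samples is the one used by MPEG-1 Layer III; the 44.1 kHz sampling rate is an assumption, and the helper name is made up. Real encoders also overlap their analysis windows and pad the last frame, which this sketch leaves out.

```python
FRAME_SAMPLES = 1152  # samples per channel in one MPEG-1 Layer III frame

def split_into_frames(samples, frame_size=FRAME_SAMPLES):
    """Cut a list of samples into consecutive fragments of frame_size."""
    return [samples[i:i + frame_size]
            for i in range(0, len(samples), frame_size)]

sample_rate = 44_100
one_second = [0] * sample_rate  # one second of silence as stand-in audio
frames = split_into_frames(one_second)
frame_ms = FRAME_SAMPLES / sample_rate * 1000

print(len(frames))         # 39 (38 full frames plus a partial one)
print(len(frames[-1]))     # 324
print(round(frame_ms, 2))  # 26.12
```

Each fragment therefore covers only about 26 milliseconds of music, short enough that the masking situation inside it can be treated as roughly constant.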

The amount of bit reduction for that fragment depends on the quality sought in the encoding. If we set it to high quality, the encoder reduces the resolution of the fragment only as far as it can while the new background noise is still hidden by the masking sound detected in that fragment.

Therefore, according to masking theory, no change will be perceived after the resolution is reduced: neither the loss of details that were already masked in the original, nor the new background noise, which remains imperceptible because it also stays below the masking sound that was detected.
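A rough way to picture this bit reduction uses the common rule of thumb that each bit of resolution lowers quantization noise by about 6 dB. The sketch below is an assumption-laden simplification, not the allocation loop of any real encoder: given a signal level and a masking level in one band, it estimates how many bits keep the noise under the mask. The function name and the example levels are hypothetical.

```python
import math

DB_PER_BIT = 6.02  # rule of thumb: one extra bit lowers noise by about 6 dB

def bits_needed(signal_db, mask_db, max_bits=16):
    """Fewest bits that keep quantization noise at or below the masking level."""
    headroom = signal_db - mask_db
    if headroom <= 0:
        return 0  # the band itself is masked, so it needs no bits at all
    return min(max_bits, math.ceil(headroom / DB_PER_BIT))

print(bits_needed(70, 40))  # 5
print(bits_needed(35, 40))  # 0
print(bits_needed(90, 0))   # 15
```

In the first case a band that would take 16 bits in the original PCM gets by with 5, because any noise more than 30 dB below the signal is hidden by the masker anyway.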

After this process, the fragment has been encoded with fewer bits and holds less information than the original. Once this bit reduction has been attempted on every one of the fragments into which the original file was divided, the song is reassembled and we obtain a compressed file that takes up less space.

In addition to this masking-based coding, a final Huffman coding pass is applied to the resulting bits. This is an entropy coding step, similar to what is done in “.zip” compression, and it entails no additional loss of quality.
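Huffman coding gives short bit patterns to frequent values and long ones to rare values. The sketch below builds a code from the data itself, which is the textbook version; real MP3 encoders choose among predefined tables instead. The list of "quantized" values is invented for the example.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie_breaker, partial code table for that subtree)
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, t1 = heapq.heappop(heap)
        c2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t1.items()}
        merged.update({s: "1" + code for s, code in t2.items()})
        heapq.heappush(heap, (c1 + c2, tie, merged))
        tie += 1
    return heap[0][2]

def encode(symbols, code):
    return "".join(code[s] for s in symbols)

def decode(bits, code):
    inverse = {v: k for k, v in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out

# Invented quantized values: mostly zeros, as is typical after bit reduction.
quantized = [0] * 12 + [1] * 5 + [-1] * 4 + [2] * 2 + [3]
code = huffman_code(quantized)
bits = encode(quantized, code)

print(len(bits), "bits instead of", 3 * len(quantized))  # 46 bits instead of 72
print(decode(bits, code) == quantized)                   # True
```

Decoding recovers the values exactly, which is why this last stage saves space without costing any quality.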

Sound quality in MP3 files

The sound quality of the compression depends on the size we want the compressed song to have, and therefore on the bitrate we choose when performing the compression. If we choose a high bitrate, the algorithm is not forced to eliminate much information, so it removes only details that are truly inaudible according to the masking curves. But if we want the file to take up less space and choose a lower bitrate, the algorithm has to be more drastic and go beyond what the masking curves allow, and the loss of information will inevitably be noticed.

For example, at 128 kbps, the most common MP3 bitrate a few years ago, most people find the quality noticeably lower than the original if a direct comparison is made. On the other hand, an MP3 file at the maximum bitrate of 320 kbps loses hardly any audible information and is practically indistinguishable from the original in most cases.
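The trade-off between bitrate and size is easy to put into numbers. This sketch assumes a three-minute song at a constant bitrate, ignores file headers and tags, and compares against CD-style PCM (44.1 kHz, 16 bits, stereo); the names are made up for the example.

```python
def mp3_bytes(seconds, bitrate_kbps):
    """Approximate size in bytes of a constant-bitrate MP3 (no headers or tags)."""
    return seconds * bitrate_kbps * 1000 // 8

PCM_KBPS = 44_100 * 16 * 2 / 1000  # 1411.2 kbps for the assumed PCM source

for kbps in (128, 192, 320):
    size_mb = mp3_bytes(180, kbps) / 1e6
    ratio = PCM_KBPS / kbps
    print(f"{kbps} kbps: {size_mb:.2f} MB, about {ratio:.1f}x smaller than PCM")
# 128 kbps: 2.88 MB, about 11.0x smaller than PCM
# 192 kbps: 4.32 MB, about 7.3x smaller than PCM
# 320 kbps: 7.20 MB, about 4.4x smaller than PCM
```

Going from 128 to 320 kbps makes the file two and a half times larger, which is the price paid for leaving more of the original detail untouched.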