Digital audio compression



Audio data compression addresses a pressing practical problem. There are two reasons to compress audio data: it saves memory when storing audio, and it copes with the low bandwidth of remote digital transmission channels. Data compression is an algorithmic transformation of data performed to reduce its volume.

Data Compression

It allows data storage and transmission devices to be used more efficiently. Compression works by eliminating the redundancy contained in the original data. To meet the requirements for transmitting voice or music signals over modern low-speed digital communication channels, and to guarantee the specified noise immunity, highly efficient data compression algorithms are needed. The transmission channel is characterized by its capacity, and the signal by its volume: …

Both of these quantities depend on the dynamic range D, the bandwidth (of the channel or of the signal spectrum), and the transmission time T. Digital audio compressors are used to reduce the dynamic range. To improve spectral efficiency, digital filters limit the spectrum of the encoder output signal (in accordance with the Nyquist criterion). In addition, encoders that eliminate statistical redundancy, such as Huffman coders, are used to guarantee a given information transmission rate. Their essence is as follows: more probable amplitude values are assigned shorter codewords than less probable ones.
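The Huffman principle described above can be sketched in a few lines. This is only an illustration, not the coder of any particular standard: the function name huffman_code and the toy list of amplitudes are assumptions made for the example.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code in which frequent symbols get shorter codewords."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # Degenerate case: a single distinct symbol still needs one bit.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: codeword_so_far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)  # two least probable groups
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


# Hypothetical amplitude values: small amplitudes are the most probable.
amplitudes = [0] * 8 + [1] * 4 + [-1] * 2 + [2] + [-2]
code = huffman_code(amplitudes)
encoded_bits = sum(len(code[a]) for a in amplitudes)
fixed_bits = 3 * len(amplitudes)  # 3 bits would be needed for 5 distinct values
print(code, encoded_bits, fixed_bits)
```

In this toy input the most frequent amplitude receives a one-bit codeword, and the total is smaller than with a fixed-length code.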

Let's consider how the types of redundancy described above are eliminated.

Structure of a lossy audio compression encoder

The original digital audio signal is divided into frequency subbands and segmented in time in the time-frequency segmentation block. The length of the encoded block depends on the shape of the audio signal's waveform. In the absence of sharp amplitude peaks, a long block is used, which provides high frequency resolution. When the signal amplitude changes abruptly, the block length is reduced sharply, giving higher time resolution. The decision to change the block length is made by the psychoacoustic analysis unit, which calculates the psychoacoustic entropy of the signal.

After segmentation, the frequency subband signals are normalized, quantized, and encoded. The most efficient compression algorithms encode not the samples of the audio signal but the corresponding MDCT coefficients (the differences between the coefficients are smaller). The patterns of auditory perception are taken into account in the psychoacoustic analysis unit. Here, a special procedure calculates, for each frequency subband, the maximum allowable level of quantization distortion (noise) that is still masked by the useful signal in that subband.

The dynamic bit allocation block, following the requirements of the psychoacoustic model, selects for each coding subband the smallest number of bits at which the quantization distortion does not exceed the audibility threshold calculated by the psychoacoustic model.

This article considers functional diagrams of audio data compression algorithms based on the µ-law and the A-law. The functional diagram of the compression algorithm based on the A-law is shown in Fig. 2.

Figure 2. Functional diagram of the compression algorithm based on the A-law

A signal (a discrete sine wave) is applied to the input of the compressor. After compression, the signal passes to an adder whose second input is fed with noise, simulating the additive noise of the transmission channel.

The noisy signal then enters the expander, whose output is the reconstructed signal. The reconstructed and original signals are then fed to an adder, after which the noise power is observed.

Simulation results (A = 87.6)
The following graphs are presented: 1-original signal, 2-signal passed through the compressor, 3-recovered signal, 4-noise power at the output of the noise generator, 5-noise power after the expander.
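The chain described above (compressor, additive channel noise, expander, comparison with the original) can be approximated with the standard continuous A-law formulas. This is a rough sketch, not the simulation that produced the graphs: the sine amplitude, the noise level, and the random seed are assumptions chosen for illustration, and only A = 87.6 comes from the text.

```python
import math
import random

A = 87.6  # compression parameter quoted for the simulation


def a_law_compress(x, a=A):
    """Continuous A-law compressor for x in [-1, 1]."""
    ax = abs(x)
    denom = 1.0 + math.log(a)
    if ax < 1.0 / a:
        y = a * ax / denom                      # linear segment near zero
    else:
        y = (1.0 + math.log(a * ax)) / denom    # logarithmic segment
    return math.copysign(y, x)


def a_law_expand(y, a=A):
    """Inverse of a_law_compress."""
    ay = abs(y)
    denom = 1.0 + math.log(a)
    if ay < 1.0 / denom:
        x = ay * denom / a
    else:
        x = math.exp(ay * denom - 1.0) / a
    return math.copysign(x, y)


rng = random.Random(0)   # fixed seed so the run is repeatable
n = 256
original = [0.5 * math.sin(2 * math.pi * 4 * k / n) for k in range(n)]
compressed = [a_law_compress(s) for s in original]
noisy = [c + rng.gauss(0.0, 0.01) for c in compressed]   # assumed channel noise
recovered = [a_law_expand(v) for v in noisy]
error_power = sum((r - s) ** 2 for r, s in zip(recovered, original)) / n
print(error_power)
```

Without the noise term the expander returns the input up to rounding error, so any residual error in the sketch comes from the simulated channel.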



Understand what audio compression is


A container format is a data format that “encapsulates” other encoded data. It often carries metadata about the encoded data, or provides a way to store several separate streams of encoded data.

The encoding produced by the codec is the real essence of the data stream.

The most common example I can think of is the Ogg / Vorbis format. Ogg is the container format and Vorbis is the encoding. So you have an Ogg file, and inside it there are small blocks that contain encoded data. Most blocks contain a stream of Vorbis-encoded data and nothing else. Others carry metadata: for example, a block might hold the name of an artist and the title of a song.

So, back to technology:

If you already have lossy music such as MP3 or Ogg / Vorbis, converting it to a lossless format will only take up (a lot) more disk space and will NOT improve the audio quality at all. You can’t create fidelity once it has been lost. Unless you’re writing a Visual Basic GUI on a popular TV show called CSI, but that’s fantasy, not reality.

If you have music in other lossless formats and want to convert it to FLAC, you can.

Be careful when using the term “WAV”. WAV doesn’t have to be lossless; in fact, WAV is just a container for various possible formats. In this sense, it is similar to AVI. You can have a lossless WAV if it holds just raw PCM data, but you can also embed MPEG-1 Layer III (lossy) data in a WAV file.

It is possible to lose data when converting from one lossless format to another if you reduce the precision of the data. For example, if you convert an unsigned 16-bit PCM data stream at 48000 Hz to 8-bit PCM data at 44100 Hz, you lose precision in two ways: the samples are reduced from 48000 to just 44100 per second (leading to data loss), and the data has to be squeezed into just 8 bits per sample instead of 16, which drastically degrades the quality.

Every digital audio stream, even one produced by a compressing encoder (lossy or lossless), has the following sample format properties, which are the key elements that describe the stream:

Sample bit width and depth, e.g. 8-bit, 16-bit, etc. Bit width and bit depth are slightly different things, and there is also big-endian / little-endian byte order (which does not affect quality) and signed or unsigned representation (which does not affect quality either, but does affect how the encoder / decoder handles the data). The key point to remember is “the more bits, the better”. So 32-bit is better than 16-bit, etc.

Frequency, also known as the sample rate. The more the better, because more “samples” of sound are played per second. Imagine riffling your finger across a deck of cards and seeing the cards blur; this is essentially how digital sound works. Each sample is a card, and the more cards fly by per second, the smoother the sound. For example, you would really notice the individual cards if you were flipping only 5 per second, but everything would blur together if you were flipping thousands per second. So more is better, because the result is more natural and closer to reality, which is analog and infinitely divisible (well, down to Planck units, but this is debatable and off-topic).

Lossless simply means that if you use the same or better sample format in the output that you used in the input, you won’t lose any data.

Therefore, if you change from 16-bit to 32-bit sample format, you will not lose data. But if you go from 32 bit to 16 bit, you will lose data.
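A minimal sketch of why widening is lossless and narrowing is not, using plain integer shifts on signed samples. The helper names are hypothetical, and real converters usually add dithering and rounding, which this example leaves out.

```python
def widen_16_to_32(sample16):
    """Place a signed 16-bit sample in the top half of a 32-bit sample."""
    return sample16 << 16


def narrow_32_to_16(sample32):
    """Keep only the 16 most significant bits; the low 16 bits are discarded."""
    return sample32 >> 16


# 16 -> 32 -> 16 returns every original value unchanged.
lossless = all(
    narrow_32_to_16(widen_16_to_32(s)) == s for s in range(-32768, 32768)
)

# 32 -> 16 -> 32 generally does not: the discarded low bits are gone.
original32 = 123_456_789
round_trip32 = widen_16_to_32(narrow_32_to_16(original32))
print(lossless, original32, round_trip32)
```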

So the answer to your question about whether it makes sense to use FLAC depends on the original data: if you have 64-bit WAV files that were originally recorded at 192,000 Hz (or 192 kHz), and you convert them to the “standard” 16-bit 44.1 kHz FLAC format, you’ll lose a ton of data. But if your WAV file is 8-bit with 22050 samples per second and you convert it to 16-bit FLAC with 44100 samples per second, you won’t lose data, and the file size may even increase, depending on whether the lossless compression gains more than the larger sample format costs.

The sample format will affect the amount of space the file takes up, so “bigger” bits and a “faster” sample rate will take up more space.
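For uncompressed PCM the relationship is simple arithmetic: size equals sample rate times bits per sample times channels times duration. A small sketch follows; the function name and the example formats are assumptions chosen for illustration.

```python
def pcm_size_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Size of raw PCM audio, ignoring any container header."""
    return sample_rate_hz * bits_per_sample * channels * seconds // 8


cd_minute = pcm_size_bytes(44_100, 16, 2, 60)       # CD-style stereo
hires_minute = pcm_size_bytes(192_000, 24, 2, 60)   # a hypothetical studio format
print(cd_minute, hires_minute)
```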

When it comes to practical considerations and human hearing, you won’t notice a difference if you convert very high-quality originals to 16-bit FLAC at 44.1 kHz. But you won’t notice any improvement if you convert MP3 to FLAC either. As such, you need to evaluate what format your source data is in before deciding what to do.

Improved efficiency of digital audio data compression algorithms.


The relevance of the work. Methods for encoding high-quality (HQ) audio signals have become widespread over the last decade in broadcasting, digital sound recording, and home audio and video equipment. A fast-growing new class of consumer electronics has even emerged: portable MP3 players.

Digital television and radio transmission networks are being developed, providing consumers with high-quality images and sound over a wide coverage area. The popularity of radio and television broadcasts over the Internet and mobile phone networks is increasing. All these innovations have become economically viable, and in some cases even technically possible, thanks to highly efficient digital video and audio data compression algorithms such as MPEG-1 (ISO/IEC 11172), MPEG-2 (ISO/IEC 13818), MPEG-4 (ISO/IEC FCD 14496), and ATSC Dolby AC-3. At the same time, the economic advantages of these algorithms, which reduce the bandwidth requirements of transmission channels or the capacity of storage media by an order of magnitude, are paid for with a certain decrease in sound quality. During the era of the audio CD's dominance, consumers came to expect high sound quality from any sound reproduction equipment. The efforts of developers of audio coding algorithms have always been aimed at ensuring that the quality of the decoded audio material is no worse than that of a CD. Sound quality is often the determining factor in the economic success of digital broadcasting services or digital sound distribution services (such as iTunes).

It is obvious that the problem of improving the quality of audio coding is today one of the key problems for the sound recording industry, the audio broadcasting industry and the manufacturers of various multimedia systems.

The basic principle of highly efficient audio coding systems is to exploit the properties of the human auditory system, mainly the phenomenon of masking. Psychoacoustic masking arises from the biophysical and neural processing of sound signals by the human auditory system [173]. Part of the sound information does not affect the perception of the signal because components of greater intensity are present in it. The strongest components of the audio signal therefore form so-called masking thresholds. Sound information whose energy level lies below the masking threshold is not perceived by the auditory system.

In the traditional digital representation of audio signals using pulse code modulation (PCM), time samples of the original signal are represented by code words with a fixed number of bits. The finite precision with which the instantaneous values of a continuous analog signal are represented introduces an error into the signal, the so-called quantization noise. The idea of encoding audio signals by eliminating psychoacoustic redundancy is to combine psychoacoustic analysis with the quantization of audio signals [112]. The digitally encoded signal is converted into a time-frequency representation that is as close as possible to the time-frequency resolution of the human auditory system. Psychoacoustic analysis determines the masking thresholds at each point of this time-frequency representation, and the quantizer re-quantizes the signal with the smallest possible number of bits per sample at which the increased quantization noise still stays below the masking thresholds. In this way, a compact representation of audio signals can be achieved without subjective degradation of sound quality. The efficiency and quality of such systems depend mainly on the precision of the psychoacoustic analysis.
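To make the notion of quantization noise concrete, the sketch below quantizes a full-scale sine wave with a uniform quantizer and measures the signal-to-noise ratio. It illustrates plain PCM quantization only, not a psychoacoustic coder. The test frequency and sample count are arbitrary assumptions, and the comparison value 6.02 * bits + 1.76 dB is the well-known rule of thumb for a full-scale sine.

```python
import math


def quantization_snr_db(bits, n=10_000):
    """Measure the SNR of a full-scale sine after uniform quantization."""
    step = 2.0 / (2 ** bits)          # quantizer step for the range [-1, 1]
    cycles_per_sample = math.sqrt(2) / 100   # arbitrary, avoids repeating patterns
    signal_power = 0.0
    noise_power = 0.0
    for k in range(n):
        x = math.sin(2 * math.pi * cycles_per_sample * k)
        q = round(x / step) * step    # nearest quantization level
        signal_power += x * x
        noise_power += (x - q) ** 2
    return 10 * math.log10(signal_power / noise_power)


for bits in (8, 16):
    rule_of_thumb = 6.02 * bits + 1.76
    print(bits, round(quantization_snr_db(bits), 1), round(rule_of_thumb, 1))
```

Each bit removed from the code word raises the noise floor by roughly 6 dB, which is why a coder that knows the masking threshold can decide how few bits a subband can get away with.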

Audio compression


Well-established data compression methods such as RLE, statistical methods, and dictionary methods can be used to compress audio files losslessly, but the result depends heavily on the specific audio data. Some sounds compress well with RLE but poorly with statistical algorithms. For other sounds statistical compression is more suitable, while a dictionary approach may even cause expansion. Here is a brief overview of the effectiveness of these three methods for compressing audio files.

RLE works well with sounds that contain long runs of repeating samples. With 8-bit sampling this can happen quite often. Recall that the voltage difference between two adjacent 8-bit sample values n and n – 1 is approximately 4 mV. A few seconds of uniform music, in which the sound wave changes by less than 4 mV, will generate a sequence of thousands of identical samples. With 16-bit sampling, long runs are obviously less common, so the RLE algorithm is less efficient.
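A minimal run-length encoder shows the effect on a stretch of nearly constant 8-bit samples. The function names and the sample values are made up for illustration, and a practical format would also need a compact binary layout for the (value, count) pairs.

```python
def rle_encode(samples):
    """Collapse consecutive equal samples into (value, run_length) pairs."""
    runs = []
    for s in samples:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1
        else:
            runs.append([s, 1])
    return [(value, count) for value, count in runs]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the sample sequence."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


# Hypothetical quiet passage: 2500 samples that take only two distinct values.
quiet = [128] * 1000 + [129] * 500 + [128] * 1000
runs = rle_encode(quiet)
print(runs, len(quiet), len(runs))
```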

Statistical methods assign variable-length codes to audio samples according to their frequency of occurrence. With 8-bit sampling there are only 256 different sample values, so in a large audio file the samples may be distributed almost evenly. Such a file cannot be compressed well with the Huffman method. With 16-bit sampling, more than 65,000 sample values are possible. In this case, some values may occur often and others rarely. When the probabilities are strongly skewed, good results can be achieved with arithmetic coding.

Dictionary-based methods assume that some phrases will appear frequently throughout the file. This happens in a text file, in which individual words or sequences of words are repeated many times. Sound, however, is an analog signal, and the values of the specific samples generated depend heavily on the operation of the ADC. For example, with 8-bit sampling, an 8 mV level becomes a numeric sample of 2, but a nearby level of, say, 7.6 mV or 8.5 mV may be converted to a different number. For this reason, speech fragments that contain identical phrases and sound the same to us may differ slightly when digitized. They will then enter the dictionary as different phrases, which will not give the expected compression. Therefore, dictionary methods are not very suitable for audio compression.

Better results can be achieved with lossy audio compression, using techniques that take the perception of sound into account. They remove the part of the data that is inaudible to the listener. This is like compressing images by discarding information invisible to the eye. In both cases, we assume that the original information (image or sound) is analog, that is, part of the information has already been lost during quantization and digitization. Carefully allowing a little more loss will not noticeably affect the quality of the decompressed sound, which will not differ much from the original. We will briefly describe two approaches called silence suppression and companding.

The idea behind silence suppression is to treat small samples as if they were not there (i.e. as zeros). Such zeroing generates runs of zeros, so silence suppression is, in fact, a variant of RLE adapted to audio compression. The method relies on a property of sound perception: the human ear tolerates the removal of barely audible sounds. Audio files containing long stretches of quiet sound compress better with silence suppression than files full of loud sounds. The method requires the participation of the user, who controls the parameter that sets the loudness threshold for the samples. Two more parameters are needed, which are not necessarily controlled by the user. One specifies the shortest run of silent samples, usually 2 or 3. The other sets the smallest number of consecutive strong samples that ends a silence or pause. For example, 15 silent samples can be followed by 2 strong and then 13 silent, …
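The zeroing step can be sketched as follows, assuming signed samples centered on zero. Only the loudness threshold and the shortest silent run are modeled here, and the parameter that sets how many strong samples end a pause is deliberately left out. The function name, parameter names, and sample values are all hypothetical.

```python
def suppress_silence(samples, threshold, min_run=3):
    """Zero every run of at least min_run samples whose magnitude <= threshold."""
    out = list(samples)
    i = 0
    n = len(samples)
    while i < n:
        if abs(samples[i]) <= threshold:
            j = i
            while j < n and abs(samples[j]) <= threshold:
                j += 1                # extend the run of quiet samples
            if j - i >= min_run:
                for k in range(i, j):
                    out[k] = 0        # long enough: treat as silence
            i = j
        else:
            i += 1
    return out


samples = [40, 2, -1, 1, 0, -2, 35, 1, -30]
cleaned = suppress_silence(samples, threshold=2, min_run=3)
print(cleaned)
```

The resulting runs of zeros are what a run-length coder can then store compactly; the isolated quiet sample between two loud ones is kept because its run is shorter than min_run.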

Companding is based on the property that the ear distinguishes changes in the amplitude of soft sounds better than changes in loud sounds. A typical ADC in a computer sound card uses a linear conversion to turn the voltage into a number: if the amplitude a is converted to n, then the amplitude 2a is converted to 2n.

LEARN HOW AUDIO DATA COMPRESSION WORKS


MP3s Around Us

Many years ago, the Internet was supposed to be the force that would democratize the music industry: physical distribution would become obsolete, and anyone could publish music on the Internet and be heard by an audience of millions.

In fact, enthusiasts and companies have created websites where fans can listen to new tunes, the MP3 format has made it easy to send songs to critics, and demo tracks now help to sell a physical CD or LP. It is not difficult to put your music on the Internet, but unless you are a star of the first magnitude, you will have to accept publishing the data in compressed form to save space on the server and to save download time for those who download your masterpiece. While MP3 has many critics, there are ways around some of the limitations of the format.

The MP3 format is based on data compression algorithms that reduce the amount of data required to play music. The compression algorithms in MP3 are lossy; they do not work like the Zip or Rar algorithms, which restore the original file without data loss. MP3 algorithms discard “unnecessary” data. For example, if there is a lot of high-level sound on a track, the algorithm may assume that you cannot hear low-level material and decide that only 24 dB of dynamic range is sufficient for that part of the audio. That requires only 4 bits of data, a quarter of the data needed for 16-bit resolution. It is difficult to preserve the sound quality of music when it is compressed, but it is possible. One way is to use lossless algorithms such as FLAC, or some of the algorithms offered by Microsoft and Apple for their audio formats. However, these algorithms do not reduce the file size dramatically; with complex music, the size reduction can be only 10-20%.
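The 24 dB and 4 bit figures follow from the fact that each bit of resolution adds about 6 dB of dynamic range (20 * log10(2)). A short check of that arithmetic; the helper name is an assumption made for the example.

```python
import math

DB_PER_BIT = 20 * math.log10(2)   # about 6.02 dB of dynamic range per bit


def bits_for_dynamic_range(dynamic_range_db):
    """Smallest whole number of bits that covers the given dynamic range."""
    return math.ceil(dynamic_range_db / DB_PER_BIT)


print(bits_for_dynamic_range(24))   # the passage's loud-section example
print(bits_for_dynamic_range(96))   # roughly the range of 16-bit audio
```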

Although there are many algorithms for compressing audio data, only a few are in common use:

MP3. This format allows multiple levels of encoding, so you can create audio files of almost any size, with smaller files losing more precision. There are many free and shareware MP3 players (such as iTunes and Windows Media Player). To encode MP3, you can use iTunes and most digital audio editors.

AAC. As the native iPod format, AAC is quite popular and, according to most users, sounds better than MP3 at the same file size. iTunes can convert files to AAC.

Windows Media Audio. The format is promoted by Microsoft, but is used less frequently than MP3 or AAC. WMA sound quality is generally better than MP3. While Microsoft does not offer WMA playback software for the Mac platform, the Flip4Mac utility (free version available) can play Windows Media formats on a Mac.

Ogg Vorbis. A great but rarely used format that sounds better than MP3 at the same bit rate, and unlike MP3, the encoding tools are free for developers. Ogg Vorbis files are not widely used yet, but they are popular with advanced technical users.

FLAC. This popular lossless format is not supported by many portable music players, but musicians often use FLAC to exchange files when working on collaborative projects. High sound quality is maintained.

Although MP3 does not offer the best quality, it is the format most often used for publishing audio files on the network, because all players can play MP3. It is important to choose the correct MP3 settings. When encoding files to MP3, it is always best to start from a high-quality, uncompressed source file. Then select the compression settings. When saving in MP3 format, you can generally choose from a range of bit rates (bits per second), from 320 kbps stereo (great quality, but also a fairly large file) down to 8 kbps mono (good enough for dictation). In addition to the fixed settings, there is variable bit rate (VBR) encoding, which adapts the bit stream to the material being encoded. VBR encoding is not supported by all players.
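For constant bit rate encoding, file size follows directly from bit rate and duration. A small sketch, assuming 1 kbps means 1000 bits per second and ignoring tags and container overhead; the function name and the four-minute song length are illustrative.

```python
def encoded_size_bytes(bitrate_kbps, seconds):
    """Approximate size of a constant-bit-rate stream, without metadata."""
    return bitrate_kbps * 1000 * seconds // 8


song_seconds = 4 * 60   # a hypothetical four-minute track
for kbps in (320, 128, 8):
    print(kbps, encoded_size_bytes(kbps, song_seconds))
```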

Audio compression: facts, myths, and a blind test

Compression, for example with MP3, involves loss. But can you hear it? Where does good hearing end and where does esotericism begin? We check the theory with a blind test, which you can do yourself.
Audio compression is a constant part of everyday life: almost whenever you listen to music, it has been compressed. However, audio signal processing is difficult to understand for people who do not work in this field and lack the necessary basic training. Consequently, my impression is that most people either do not care at all or demonize MP3 and everything that has to do with compression.

The question is: are we depriving ourselves of real enjoyment if we only listen to music on Spotify or YouTube? Or is there no noticeable difference from the best possible quality?

Numbers and what they say

Various measured parameters say something about sound quality, but what exactly? The following overview of the factors is as brief and clear as possible.

1. Bit rate

Bit rate tells you how many bits are processed per second. It is also called the data rate or bandwidth.

It makes intuitive sense: the more data that flows, the higher the sound quality. Bit rate is the most important measured variable in everyday use. However, the bit rate alone doesn’t say much about sound quality.

There are variable and constant bit rates. Today variable bit rates (abbreviated VBR) are mainly used. Passages in which little happens can be compressed more heavily without audible loss, whereas complex passages are stored with a relatively large amount of data. The result is higher sound quality at the same file size. For variable bit rates, the average is given as the value, and sometimes also the maximum allowed.

2. Compression method

AAC compresses more efficiently than MP3, so it delivers better quality than MP3 at the same bit rate. The same goes for Ogg Vorbis, which is used by Spotify.

The compression software, the encoder, also has an impact on quality. In the early days of MP3, 128 kbit/s songs often sounded terrible. Now they sound much better because bad encoders are no longer used.

3. Bit depth

Bit depth tells you how many bits a sample has. Therefore, it is also called the sampling depth. The more bits per sample, the more different volume levels can be stored.

This may remind you of photos and videos – there are bit depths too and they mean something similar.

A CD uses 16 bits per sample for each stereo channel. MP3 and other compressed audio files have no fixed bit depth. Bit depth hardly plays a role in everyday listening, only in studio recordings. There, 24-bit is sometimes used to leave more headroom for sound processing. In the end, however, the music is reduced to 16-bit because, according to acoustics experts, the difference is not audible anyway.

4. Sampling frequency

The sampling frequency (also called the sample rate) is likewise irrelevant for ordinary music listeners. But it is important for understanding how digital sound storage works in the first place. A CD has a sampling frequency of 44100 Hz or 44.1 kHz. Hertz is a unit of measurement that means “times per second”. In audio sampling, it means that the sound level is measured 44,100 times per second. The same applies here: when recording in the studio, higher values make sense, but not in the final format.

Nyquist’s theorem: Many people believe that digital music is fundamentally lossy compared to a “real” (analog) sound wave. These discussions began when the CD was introduced and was immediately ridiculed by audio snobs as inferior to the vinyl record. But that can be refuted. The Nyquist theorem states that a band-limited audio waveform can be reconstructed completely, without any loss, from individual sample points if the sample rate is high enough. It also says how high the rate must be: more than twice the bandwidth. Since human hearing reaches at most about 20,000 Hz, that is roughly the bandwidth chosen. Hence the sample rate of just over 40,000 Hz.
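The flip side of the theorem is aliasing: a tone above half the sample rate produces samples that are indistinguishable from those of a lower tone, which is why the signal must be band-limited before sampling. The sketch below demonstrates this at the CD rate; the 30 kHz tone and the sample count are arbitrary choices for illustration.

```python
import math


def sample_sine(freq_hz, sample_rate_hz, count):
    """Sample a unit-amplitude sine wave at the given rate."""
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(count)
    ]


fs = 44_100
too_high = sample_sine(30_000, fs, 200)        # above fs / 2 = 22,050 Hz
alias = sample_sine(fs - 30_000, fs, 200)      # 14,100 Hz, inside the audible band

# The 30 kHz samples equal the 14.1 kHz samples with inverted sign.
max_diff = max(abs(a + b) for a, b in zip(too_high, alias))
print(max_diff)
```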

5. Other factors

With all the technical parameters, it should not be forgotten that the best values are useless if the sound was badly recorded in the first place. For example, if the sound engineer set the recording level too low, dynamic range is lost, and the recording becomes noisy when it is made louder afterwards. If the level is too high, the result is even worse: the recording clips, rattles, and crackles. Or a dynamic range compressor distorts the result. Bad recordings are ubiquitous on YouTube and are also sold on CDs, for example very old studio recordings or live concert recordings.

The quality of your headphones or speakers also matters. With poor earbuds, you will barely hear a difference between 128 kbit/s MP3 and uncompressed music. With good speakers, you most likely will.

How is music encoded?

First of all, let’s understand why music should be compressed.

Uncompressed files like AIFF and WAV take up a lot of space. This makes it inconvenient to transfer them to phones or players, or even to store them on the hard drive of a computer.

Even sending them online would be very difficult because of their large size.

This has led to the creation of various audio file formats that take up less space. The important thing, of course, is that they sound practically the same as the original even though they are smaller.

This is where compression enters the picture.

On the one hand, ZIP or RAR compression is used, but it is not enough. So other techniques are used, namely:

– An uncompressed file contains a lot of information about sounds (even silence) that is inaudible to the human ear, and that information is discarded. That alone saves a lot of space, since there is little point in storing information about sounds that our hearing cannot perceive.

– On the other hand, there is a well-known phenomenon of human hearing: if two sounds occur more or less simultaneously, occupy similar or nearby frequencies, and one of them is louder, the ear will NOT hear the quieter sound.

This is further information that can also be discarded, since it is generally not audible or the brain does not process it.

Once both types of information have been discarded, the file is much smaller and no longer occupies the same space.

What remains is essentially to apply a compression algorithm, something similar to ZIP. Then you have a compressed file, for example an MP3.

This is called the lossy method.

There is another method, lossless, in which the file is only compressed with a method similar to ZIP, without discarding information.

Is there really a difference between the two? Practically none: the human ear can hardly distinguish between them.

A lossy file with a good sample rate (at least 44,100 Hz) and a good bit rate is almost impossible to distinguish from the original, and therefore from the lossless file.

Many experiments have been done in which people listened to both types of files (lossy and lossless), and more than 90% were unable to distinguish between them, as long as the lossy file had a good sample rate and a good bit rate.

Audio compression basics

Today we listen to music almost exclusively in digital form. It has also become quite normal for us to carry our music collections, often many thousands of titles, with us at all times, stored on a chip somewhere in a smartphone or MP3 player. It is audio compression that made this possible in the first place.

Initial situation

Noises and tones, such as birdsong or the ringing of church bells, are analog events with an extremely wide spectrum. A good example of this is a bell. If it is struck, we think we only hear one note. In fact, its ringing consists of around 200 individual tones. These contain soft and strong tones, as well as frequencies that are outside our hearing range.

It is no different with music. However, the human ear can only perceive tones above a certain basic volume, and the thresholds for low, medium and high tones are very different. The ear is most sensitive in the range of human speech, at around 3 kilohertz (kHz). Lower or higher tones have to be much louder for us to perceive them. The volume threshold at which we begin to perceive sounds is called the threshold of hearing in quiet. A loud sound masks a quieter one if its pitch is the same or similar.

For example, a 1 kHz tone from an organ pipe can be heard clearly, while one or more soft tones close to it in frequency are masked by the louder one. Although they are there, we cannot perceive them. The reason many hi-fi fans still trust the old vinyl record is that it stores all the tones and frequencies 1:1, just as they are emitted by the musical instruments. It also contains those tones that, strictly speaking, we cannot even perceive consciously.

The essentials

There are many standards for audio compression, such as MP3, AAC, or WMA. They are all based on the same fundamentals: they exploit the psychoacoustic effects of human auditory perception. Audio information that the human ear cannot perceive is filtered out of the data stream and therefore not saved. MP3 and similar formats use mathematical analysis methods to identify and remove this imperceptible sound information.

An example: if you try to talk to another person in a very noisy environment, you will hardly hear each other. In such cases, the energy level of the noise (or of the music at a disco, for example) is higher than that of your voices. This effect is known as frequency masking. Such masked tones are removed. In the same way, tones in frequency ranges outside our perception are filtered out.
Another criterion is the so-called silent hearing threshold. All tones that lie below it are also filtered out by the compression process; this is referred to as threshold masking. Time masking is particularly interesting. It also removes tones that are drowned out by other signals, but takes their timing into account. After a loud sound, our hearing is only partially receptive and needs a short recovery phase before it becomes fully receptive again.
This post-masking lasts about 200 milliseconds. There is also pre-masking, which arises because our brains take a little longer to process soft sounds than loud ones. The pre-masking time is approximately 20 milliseconds. Time masking alone yields a significant reduction in audio data. True to the motto: everything nobody needs is thrown out. This reduces the music to a fraction of its original data volume.
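The time-masking windows described above can be expressed as a very crude check. This is only a toy sketch: it uses the article's figures of roughly 20 ms (pre-masking) and 200 ms (post-masking), treats any quieter tone inside the window as masked, and ignores frequency distance and the gradual decay of masking that a real psychoacoustic model would include. The function and constant names are hypothetical.

```python
PRE_MASK_MS = 20    # approximate pre-masking window from the text
POST_MASK_MS = 200  # approximate post-masking window from the text


def is_time_masked(masker_time_ms: float, masker_level_db: float,
                   tone_time_ms: float, tone_level_db: float) -> bool:
    """Return True if the tone falls inside the masker's time window.

    Simplifying assumption: a tone is masked whenever it is quieter than
    the masker and occurs at most 20 ms before or 200 ms after it.
    """
    if tone_level_db >= masker_level_db:
        return False  # a tone at least as loud as the masker is not hidden by it
    offset = tone_time_ms - masker_time_ms
    return -PRE_MASK_MS <= offset <= POST_MASK_MS


# A soft tone 100 ms after a loud drum hit is treated as masked,
# the same tone 300 ms after the hit is not.
print(is_time_masked(1000, 90, 1100, 40))
print(is_time_masked(1000, 90, 1300, 40))
```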

Does MP3 affect the sound quality?

The compression of songs affects the quality, but the losses are not necessarily audible.

Is MP3 compression harmful to sound quality? Whether the music is HD or “normal” definition, the question of compression remains. The advantage is that the file size of the songs is reduced, so they take up less space in the memory of a phone or a portable music player. With standard MP3 compression, a music album shrinks from about 500 MB to about 45 MB.

But along the way, the music is damaged. The sound seems a little less natural, less precise, less dynamic. Some of the audio information is literally destroyed. This is not always audible, but for some songs the difference is clear enough that everyone will notice.

Fortunately, you can improve the quality of an MP3 song by compressing it less aggressively. The loss of sound quality becomes less noticeable, but in return the file is larger. MP3 isn’t the only compressed music format that degrades music. The best-known competitors are AAC, Ogg Vorbis and WMA. MP3 is not the most efficient compression format (that title goes to Ogg Vorbis), but it is still a good option. All music players can play MP3, and online music stores prefer this format.

Lossless compression

However, some music lovers are reluctant to use MP3. They swear by “non-destructive” (lossless) compression, which does not remove any sound information. The music is completely preserved: we hear absolutely no difference from the original. The best-known non-destructive formats are FLAC, APE and ALAC. Unfortunately, not all electronic devices can play music recorded in these formats, and few artists offer their music in “non-destructive” compression. Files compressed this way are also still very large: an album quickly reaches several hundred megabytes. Nevertheless, FLAC stands out as the reference format for the most demanding music lovers.

Is it reasonable to keep using MP3? It remains a smart choice for most music lovers, as long as they choose an appropriate bit rate. Which one to choose: 192 kbit/s, 256 kbit/s or 320 kbit/s? The stronger the compression (that is, the lower the bit rate), the smaller the file, but the lower the quality. At 128 kbit/s, the sound has clearly deteriorated, and most of us can hear it. At 192 kbit/s, the degradation is difficult for most of us to notice, except in a few rare tracks.

At 256 kbit/s, you need a musical ear and good sound equipment to hear the difference. At 320 kbit/s, you need a well-trained ear and highly accurate audio equipment. Even then, a difference in quality is audible only in certain titles and only in certain passages. Therefore, most of us can settle for 192 kbit/s. Music lovers should aim for a minimum of 256 kbit/s, and professionals will choose 320 kbit/s or ‘lossless’ formats.
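To make the trade-off between bit rate and file size concrete, the following sketch computes the size of a track at the bit rates discussed above. It assumes a constant bit rate, ignores container overhead and metadata, and counts 1 MB as 1,000,000 bytes. The uncompressed reference is standard CD audio (44.1 kHz, 16 bit, stereo), which is an assumption added here and not stated in the text; the helper name `size_mb` is illustrative.

```python
# Uncompressed CD audio: 44,100 samples/s * 16 bit * 2 channels = 1411.2 kbit/s
CD_BITRATE_KBPS = 44_100 * 16 * 2 / 1000


def size_mb(bitrate_kbps: float, duration_s: float) -> float:
    """File size in MB (10^6 bytes) for a constant-bit-rate stream."""
    bits = bitrate_kbps * 1000 * duration_s
    return bits / 8 / 1_000_000


duration = 4 * 60  # a four-minute song, in seconds
for rate in (128, 192, 256, 320, CD_BITRATE_KBPS):
    ratio = CD_BITRATE_KBPS / rate
    print(f"{rate:7.1f} kbit/s: {size_mb(rate, duration):6.2f} MB "
          f"(about {ratio:.1f}:1 versus CD audio)")
```

Under these assumptions, 128 kbit/s corresponds to a reduction of roughly 11:1 relative to CD audio, which is in line with an album shrinking from about 500 MB to about 45 MB.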

Data compression techniques

Encoding multimedia information evidently produces large amounts of data, which require memory space for storage and high transmission speeds for transfer to other digital systems.

These needs can be met by reducing the space occupied by the data with special compression techniques. Compressed data cannot be used directly for processing, viewing, or playback. Compression is applied by special programs immediately before data storage or transmission; during the read or receive phase, similar programs perform decompression. Compression is possible because conventional encoding techniques dedicate the same amount of memory to each information element (be it a character, a pixel or a sound sample), regardless of its statistical frequency and its significance.
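The redundancy of fixed-size encoding can be estimated with Shannon entropy, which gives the average number of bits per symbol that would be needed if each symbol were coded according to its statistical frequency. The sketch below is an illustration added here, not a method described in the article; it treats each character as an independent symbol, and the function name `entropy_bits_per_symbol` is hypothetical.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


sample = "these ******** are eight stars"
h = entropy_bits_per_symbol(sample)
print(f"fixed encoding: 8.00 bits per character")
print(f"entropy:        {h:.2f} bits per character")
```

The gap between the 8 bits actually stored per character and the entropy is the statistical redundancy that a compressor can try to remove.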

More than a hundred compression techniques have been developed so far, but they fall into two categories:

Compression without loss of information.

Lossless compression techniques are based on compactly coding runs of identical data, or on coding the statistically most frequent data with a smaller number of bits.

This compression is completely reversible: the decompression program returns exactly the original bit sequence. For this reason, lossless techniques are applicable to any type of data, including text and executable programs, although the achievable compression factor is not very high: values usually range from 2:1 to 4:1. Of course, these results vary depending on the type of input data.
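Coding the most frequent data with fewer bits, and getting the exact original back, can be sketched with a Huffman code. This is a minimal illustration under the assumption that each character is one symbol; the function names are hypothetical, and the result is kept as a string of "0" and "1" characters for readability rather than packed into bytes.

```python
import heapq
from collections import Counter


def huffman_codes(text: str) -> dict[str, str]:
    """Build a prefix-free code: frequent symbols get shorter codewords."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tiebreaker, {symbol: partial codeword})
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # two least frequent subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def huffman_encode(text: str, codes: dict[str, str]) -> str:
    return "".join(codes[ch] for ch in text)


def huffman_decode(bits: str, codes: dict[str, str]) -> str:
    lookup = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:  # prefix-free: the first match is the symbol
            out.append(lookup[current])
            current = ""
    return "".join(out)


message = "these ******** are eight stars"
codes = huffman_codes(message)
bits = huffman_encode(message, codes)
assert huffman_decode(bits, codes) == message  # exact, lossless round trip
print(f"{len(message) * 8} bits at 8 bits per character, {len(bits)} bits Huffman-coded")
```

The bit count printed here leaves out the code table that a real decoder would also need, so the practical gain on such a short message would be smaller.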

RLE encoding

The RLE (Run Length Encoding) compression technique targets sequences of identical bytes. In its original version, it introduces a special character that marks the beginning of such a sequence; instead of encoding the identical characters one by one, it encodes only the first one, followed by a number indicating how many times it is repeated. If Sc denotes the special character that marks the beginning of the sequence, the sentence

these ******** are eight stars

becomes

these Sc * 8 are eight stars

where 8 is not encoded as an ASCII character but as a binary number.

When the decompression program encounters Sc, it reads the character that follows, interprets the next byte as a counter, and rebuilds the original sequence.
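The scheme described above can be sketched as follows. Several details are assumptions made for this illustration and are not specified in the text: the marker Sc is represented by the byte 0x1B, only runs of at least four bytes are replaced (shorter runs would grow when encoded as three bytes), the counter is a single byte (so runs are split at 255), and a literal occurrence of the marker byte in the input is always written in marker form so that decoding stays unambiguous.

```python
ESC = 0x1B  # hypothetical byte value standing in for the article's "Sc"


def rle_encode(data: bytes, min_run: int = 4) -> bytes:
    """Replace runs of identical bytes by the triple (ESC, byte, count)."""
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        run = 1
        while i + run < len(data) and data[i + run] == byte and run < 255:
            run += 1
        if run >= min_run or byte == ESC:
            out += bytes([ESC, byte, run])   # count stored as a binary number
        else:
            out += bytes([byte]) * run       # short runs are copied unchanged
        i += run
    return bytes(out)


def rle_decode(data: bytes) -> bytes:
    """Rebuild the original bytes; ESC announces a (byte, count) pair."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == ESC:
            byte, count = data[i + 1], data[i + 2]
            out += bytes([byte]) * count
            i += 3
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


original = b"these ******** are eight stars"
encoded = rle_encode(original)
assert rle_decode(encoded) == original
print(len(original), "bytes before,", len(encoded), "bytes after")
```

For the example sentence, the eight stars collapse into three bytes (marker, `*`, and the binary value 8), while the rest of the text is copied as is.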

For image compression, RLE encoding works well only with images that contain large areas of uniform color; it is not very effective with complex images.

Compression with loss of information.

Lossless compression techniques, while allowing better use of memory space on disks or of data transmission lines, are not sufficient to solve the problem of the huge amount of data generated by encoding multimedia information such as high-resolution images, audio or video.

To address this problem, it helps to remember that multimedia information can remain understandable even after it has been altered. This allows compression factors that are orders of magnitude higher than those achieved by lossless techniques.

Such alterations can be designed around the behavior of our sensory systems (vision and hearing), so as to reduce the required memory without obvious changes in information content. Compression techniques that do this are called “lossy”, since the least significant information is irreversibly suppressed. As a result, the bitstream after decompression differs from the original, and these techniques therefore cannot be used for other types of information, such as text. Furthermore, information compressed in this way is not well suited to further processing, as the loss introduced with each subsequent step becomes more and more apparent.
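The irreversibility of lossy compression can be seen in a deliberately simple sketch that discards the low-order bits of 16-bit audio samples. This is only an analogy chosen for illustration: real lossy audio encoders decide what to discard with a psychoacoustic model rather than by dropping fixed bits, and the function names and the choice of 8 dropped bits are assumptions.

```python
def lossy_encode(samples: list[int], drop_bits: int = 8) -> list[int]:
    """Keep only the high-order bits of each sample (fewer bits to store)."""
    return [s >> drop_bits for s in samples]


def lossy_decode(codes: list[int], drop_bits: int = 8) -> list[int]:
    """Scale the stored values back; the dropped bits cannot be recovered."""
    return [c << drop_bits for c in codes]


original = [0, 1000, -1000, 32767, -32768]   # signed 16-bit sample values
restored = lossy_decode(lossy_encode(original))
errors = [o - r for o, r in zip(original, restored)]

print("original:", original)
print("restored:", restored)
print("errors:  ", errors)   # each error is smaller than 2**8
```

The restored samples stay close to the originals but are not identical, which is acceptable for sound yet would corrupt data such as text or program code.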