What are the pros and cons of digital audio?


Pros and Cons of Digital Audio

The digital representation of sound is valuable above all because it can be stored and reproduced indefinitely without loss of quality. However, conversion from analog to digital form and back inevitably causes some loss of quality.

The most unpleasant distortion introduced at the digitizing stage is granular noise, which arises when the signal is quantized by level and each amplitude is rounded to the nearest discrete value. Unlike the simple broadband noise that quantization errors otherwise introduce, granular noise takes the form of harmonic distortion of the signal, most noticeable in the upper part of the spectrum.

The power of granular noise is inversely proportional to the number of quantization steps. However, because hearing has a logarithmic characteristic while linear quantization uses a constant step size, quiet sounds are represented by fewer quantization steps than loud ones, so most of the non-linear distortion falls in the region of quiet sounds. This limits the dynamic range, which ideally (ignoring harmonic distortion) would equal the signal-to-noise ratio; the need to keep this distortion low reduces the usable dynamic range of 16-bit encoding to 50-60 dB. Logarithmic quantization could have saved the situation, but implementing it in real time is very difficult and expensive.
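
How few steps a quiet sound gets under linear quantization can be illustrated with a short calculation. This is only a sketch under stated assumptions: the function name `levels_spanned` is hypothetical, and the level is expressed in dB relative to full scale (0 dB being the loudest possible sine wave).

```python
def levels_spanned(level_db: float, bits: int = 16) -> int:
    """Number of quantization levels swept by a sine wave whose peak is
    `level_db` dB relative to full scale (0 dB = loudest possible)."""
    full_scale = 2 ** (bits - 1) - 1            # 32767 for 16-bit audio
    peak = full_scale * 10 ** (level_db / 20)   # peak amplitude, in steps
    return 2 * int(peak) + 1                    # negative peak to positive peak, plus zero

for level in (0, -20, -40, -60, -80):
    print(f"{level:4d} dB: {levels_spanned(level)} levels")
```

Every 20 dB drop in level leaves roughly a tenth of the steps, so a sound 60 dB below full scale is described by only a few dozen levels.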

The distortion introduced by granular noise can be reduced by adding ordinary white noise (a random or pseudo-random signal) to the signal, with an amplitude of half the least significant bit; this operation is called dithering. It slightly raises the noise level, but weakens the correlation between the quantization errors and the high-frequency components of the signal, and improves subjective perception. Dithering is also applied before samples are rounded when their bit depth is reduced. Essentially, dithering and noise shaping are special cases of the same technology, with the difference that the first uses white noise with a flat spectrum and the second uses noise whose spectrum has a special “shape”.
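
A minimal sketch of the idea follows. It assumes uniform noise of plus or minus half a step and a constant input sitting between two quantization levels; the function names are illustrative only.

```python
import math
import random

def quantize(x: float) -> int:
    """Round to the nearest integer step (1 step = 1 least significant bit)."""
    return math.floor(x + 0.5)

def quantize_with_dither(x: float, rng: random.Random) -> int:
    """Add uniform white noise of +/- half an LSB, then round."""
    return quantize(x + rng.uniform(-0.5, 0.5))

rng = random.Random(0)   # fixed seed so the run is repeatable
level = 0.3              # a constant signal sitting 0.3 LSB above zero
n = 100_000

plain = sum(quantize(level) for _ in range(n)) / n
dithered = sum(quantize_with_dither(level, rng) for _ in range(n)) / n

print(f"without dither: mean output = {plain:.3f}")
print(f"with dither:    mean output = {dithered:.3f}")
```

Without dither the detail below one step is simply lost (every sample rounds to zero). With dither each individual sample is noisier, but the average output follows the true level, which is the trade-off described above.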

When restoring audio from digital to analog, there is the problem of smoothing the stepped waveform and suppressing the harmonics introduced by the sample rate. Due to the imperfection of the frequency response of the filters, insufficient suppression of this interference or excessive attenuation of useful high-frequency components may occur. Poorly suppressed sample rate harmonics distort the shape of the analog signal (especially in the high frequency region), resulting in a “rough” and “dirty” sound.

What methods are used to effectively compress digital audio?

Currently, the best known are Audio MPEG, PASC and ATRAC. They all use so-called perceptual coding, in which information that is barely perceptible to the ear is removed from the audio signal. As a result, although the shape and spectrum of the signal change, the way it is perceived by ear remains practically unchanged, and the compression ratio justifies the slight decrease in quality. Such encoding is a lossy compression method: the original waveform can no longer be restored exactly from the compressed signal.

Techniques to remove some of the information are based on a characteristic of human hearing, called masking: if there are pronounced peaks (dominant harmonics) in the sound spectrum, the weakest frequency components in the immediate vicinity of them are practically not perceived (masked) by ear. During encoding, the entire audio stream is divided into small frames, each of which is converted into a spectral representation and divided into several frequency bands. Within bands, masked sounds are detected and removed, after which each frame undergoes adaptive coding directly in spectral form. All these operations make it possible to significantly reduce (several times) the amount of data while maintaining the quality acceptable to most listeners.
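
The masking step can be pictured with a deliberately simplified toy. This is not how any real codec decides what to drop: real encoders use a psychoacoustic model with frequency-dependent thresholds. The function name, the fixed neighborhood of two bins, and the 30 dB threshold are all assumptions made for illustration.

```python
def drop_masked(magnitudes, neighborhood=2, threshold_db=30.0):
    """Zero every spectral bin that lies within `neighborhood` bins of a
    bin at least `threshold_db` stronger than itself (toy masking rule)."""
    ratio = 10 ** (threshold_db / 20)   # dB difference as an amplitude ratio
    kept = []
    for i, mag in enumerate(magnitudes):
        lo = max(0, i - neighborhood)
        hi = min(len(magnitudes), i + neighborhood + 1)
        loudest_nearby = max(magnitudes[lo:hi])
        masked = mag > 0 and loudest_nearby >= mag * ratio
        kept.append(0.0 if masked else mag)
    return kept

# One frame of spectral magnitudes: strong peaks with weak neighbors.
frame = [0.01, 1.0, 0.02, 0.5, 0.001, 0.3]
print(drop_masked(frame))
```

The weak components next to the peaks are removed while the peaks themselves survive, so fewer values need to be coded for the frame.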

Each of the described encoding methods is characterized by the bit rate at which the compressed information must enter the decoder when the audio signal is recovered. The decoder converts a series of compressed instantaneous signal spectra into a conventional digital waveform.

Audio MPEG is a group of audio compression techniques standardized by MPEG (Moving Picture Experts Group).


Misconceptions about digital audio

The higher the bitrate, the better the track

This is not always the case. To start with, a reminder of what bitrate is (the word is “bitrate”, not “bitraid”). It is the data rate in kilobits per second during playback. That is, if we take the size of the track in kilobits and divide it by its duration in seconds, we get its so-called file-based bitrate (FBR), which usually differs little from the bitrate of the audio stream itself (the difference comes from metadata in the file: tags, embedded images, and so on).
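
The file-based bitrate is a one-line calculation. The sketch below assumes 1 kilobit = 1000 bits; the function name and the example file are hypothetical.

```python
def file_based_bitrate_kbps(size_bytes: int, duration_s: float) -> float:
    """File size in kilobits divided by duration in seconds."""
    kilobits = size_bytes * 8 / 1000
    return kilobits / duration_s

# Hypothetical example: a 9,600,000-byte file that plays for 4 minutes.
print(file_based_bitrate_kbps(9_600_000, 240))   # 320.0
```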

Now let’s take an example. The bit rate of the uncompressed PCM audio recorded on a normal audio CD is calculated as follows: 2 (channels) × 16 (bits per sample) × 44100 (samples per second) = 1411200 bps = 1411.2 kbps. Now let’s compress the track with any lossless codec (that is, one that does not lead to any data loss), for example FLAC. As a result, we get a lower bit rate than the original, but the quality remains unchanged. That is the first rebuttal.

Something else is worth adding here. The output bitrate of lossless compression can vary widely (though it is generally lower than that of uncompressed audio); it depends on the complexity of the signal being compressed, or more precisely on its data redundancy. Simpler signals compress better (a smaller file for the same duration means a lower bitrate), and more complex signals compress worse. That is why losslessly compressed classical music has a lower bitrate than, say, rock. It must be emphasized, though, that the bit rate here is in no way an indicator of the quality of the sound material.
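
The dependence on redundancy can be seen with any general-purpose lossless compressor. The sketch below uses Python's `zlib` purely as a stand-in (it is not FLAC and compresses audio far less cleverly), and compares a perfectly periodic tone with white noise; the helper names are illustrative.

```python
import math
import random
import struct
import zlib

def pcm16(samples):
    """Pack floats in [-1, 1] as 16-bit little-endian PCM bytes."""
    return b"".join(struct.pack("<h", int(s * 32767)) for s in samples)

def compressed_fraction(data: bytes) -> float:
    """Compressed size as a fraction of the original size."""
    return len(zlib.compress(data, 9)) / len(data)

n = 44100                    # one second of mono audio at 44.1 kHz
period = 100                 # samples per cycle, i.e. a 441 Hz tone
rng = random.Random(0)       # fixed seed for repeatability

tone = pcm16(math.sin(2 * math.pi * (i % period) / period) for i in range(n))
noise = pcm16(rng.uniform(-1, 1) for _ in range(n))

print(f"pure tone:   {compressed_fraction(tone):.3f} of original size")
print(f"white noise: {compressed_fraction(noise):.3f} of original size")
```

Both signals are reproduced bit for bit after decompression, yet the tone shrinks to a tiny fraction of its size while the noise barely shrinks at all. The resulting "bitrate" says nothing about quality.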

Now let’s talk about lossy compression. First of all, there are many different encoders and formats, and even within one format the encoding quality of different encoders can differ (for example, QuickTime AAC encodes much better than the outdated FAAC), not to mention the superiority of modern formats (Ogg Vorbis, AAC, Opus) over MP3. Simply put, of two copies of the same track encoded by different encoders at the same bit rate, one may sound better than the other.

Then there is upconversion. That is, you can take a track in MP3 format with a 96 kbps bit rate and convert it to 320 kbps MP3. Not only will the quality not improve (after all, data lost during the earlier 96 kbps encoding cannot be recovered), it will actually get worse. Every lossy encoding pass (at any bit rate, with any encoder) introduces a certain amount of distortion into the audio.

There is one more nuance. If, say, the bitrate of an audio stream is 320 kbps, this does not mean that all 320 kilobits were actually spent on encoding each second. This is typical of constant bit rate (CBR) encoding, and of cases where someone hoping for the highest quality forces a constant bit rate that is too high (for example, setting CBR to 512 kbps in Nero AAC). The number of bits assigned to a particular frame is governed by the psychoacoustic model. If the amount actually needed is much lower than the set bitrate, even the bit reservoir does not help (for the terms, see the article “What is CBR, ABR, VBR?”), and the result is useless “zero bits” that simply pad the frame out to the required size (that is, inflate the stream to the specified bitrate). This is easy to check: compress the resulting file with an archiver (preferably 7z) and look at the compression ratio. The higher it is, the more zero bits there are (since they add redundancy) and the more space is wasted.
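
The archiver check can be imitated in a few lines. This sketch makes several assumptions: random bytes stand in for real encoded audio (which is close to incompressible), the frame layout is invented, and Python's `lzma` module is used because 7z typically relies on LZMA compression.

```python
import lzma
import random

rng = random.Random(0)   # fixed seed for repeatability

def random_bytes(n: int) -> bytes:
    """Stand-in for real encoded audio data, which looks close to random."""
    return bytes(rng.getrandbits(8) for _ in range(n))

def compressed_fraction(data: bytes) -> float:
    """Compressed size as a fraction of the original size."""
    return len(lzma.compress(data)) / len(data)

frame_size = 2000   # hypothetical frame size in bytes
frames = 50

# Every frame completely filled with encoded data.
tight = b"".join(random_bytes(frame_size) for _ in range(frames))
# Same frame size, but half of each frame is zero padding.
padded = b"".join(
    random_bytes(frame_size // 2) + bytes(frame_size // 2) for _ in range(frames)
)

print(f"fully used frames:  {compressed_fraction(tight):.2f}")
print(f"half-padded frames: {compressed_fraction(padded):.2f}")
```

The padded stream compresses to roughly the size of its real payload, which is how an archiver exposes wasted bits.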

Lossy codecs (MP3 and others) can cope with modern electronic music, but cannot efficiently encode classical (academic), live and instrumental music

The irony here is that, in fact, the exact opposite is true. Academic music in the vast majority of cases follows the principles of melody and harmony, as does its instrumentation. From a mathematical point of view, this leads to a relatively simple harmonic composition of the music.

MP3, FLAC, WAV, ALAC: the differences between audio formats

Digital audio formats

Today, most people listen to music completely digitally. The differences between digital audio formats like WAV, FLAC, MP3, and ALAC are not clear to everyone. We put the facts together.

While vinyl is booming and CD sales are slowly but surely falling, music today is often played without any physical medium. Whether you use a smartphone or a digital audio player, digital audio formats let you take your music with you. After all, hardly anyone today wants to carry a Discman and a stack of CDs when they already have a powerful pocket computer, in the form of a smartphone, that can play digital music files. But what are the differences between the individual file formats, and what are their advantages and disadvantages?

WAV and AIFF: the uncompressed ones

The Wave container format (.wav) was developed by Microsoft. It stores uncompressed audio content, so files require a lot of storage space (2 minutes can take 20 MB). WAV is especially important when recording and editing audio content. The downside of .wav files is their limited support for metadata (such as title or artist). The equivalent developed by Apple is AIFF (.aif). Because Apple computers are very common in music production, this audio format is widespread there.

MP3, AAC, WMA, Ogg-Vorbis – compressed to save space, but not lossless

The MP3 file format (.mp3, named for the MPEG-1 Audio Layer 3 compression codec), developed by the Fraunhofer Institute in the 1980s, is probably the best-known digital audio format. It gave the MP3 player its name, and for a long time music was digitized almost exclusively as MP3, for example on the extremely popular, and now illegal, file-sharing networks around the turn of the millennium. The advantage of MP3 is the small amount of storage space required: on average, a file takes up one-tenth the size of the original. However, a disadvantage that should not be neglected is that MP3 is lossy: sound components that are inaudible to humans are removed to drastically reduce the memory required. To what extent this affects the sound can be judged by comparing FLAC with MP3.

AAC (Advanced Audio Coding) is a successor to the MP3 format, offering slightly better sound quality. Apple still offers songs mainly in this audio format on the iTunes Store.

WMA stands for Windows Media Audio (.wma) and is, as the name suggests, a Microsoft development. WMA is also a lossy compression format.

A somewhat rarer audio format is Ogg Vorbis (.ogg), where Vorbis is the music compression technology and Ogg is the container format. Like MP3, Ogg Vorbis is lossy, but it requires less storage space while offering better quality.

FLAC / ALAC / WMA Lossless – the lossless ones

Lossless formats were developed to preserve all sound information while keeping the amount of memory required small. With all of these formats, the required memory is reduced to about half that of the original file. With audio conversion software, a file can be converted to other lossless formats without any loss of quality, something unthinkable with lossy formats. This is why lossless file formats are popular for archiving music collections in a space-saving way.

FLAC, the Free Lossless Audio Codec (.flac), is a free audio format, so it is not owned by any major corporation. ALAC, the Apple Lossless Audio Codec (usually stored in .m4a files), is Apple’s lossless file format, while Microsoft has its own development on the market with WMA Lossless.

Basics of digital audio

Before a computer can record, manipulate, and reproduce sound, the sound must be transformed from its audible analog form into a digital form the computer can handle, using a process called analog-to-digital conversion (ADC). Once the sound data has been stored as bytes in the computer, the power of the CPU can be used to transform the sound in thousands of ways. Finally, when you are ready to listen to the result, digital-to-analog conversion (DAC) transforms the sound bytes back into an analog electrical signal for the speakers.

Sampling: Analog to Digital Conversion

Given an analog signal, discrete values of its amplitude are taken at small time intervals; the more samples are taken per second, the more faithful the reproduction. Each value obtained is assigned a digital code that the computer can understand and process as required. Words of 8 or 16 bits can be used, giving 256 or 65,536 different combinations respectively; more bits mean higher resolution.
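
Sampling and quantization can be sketched as follows. The example assumes a unit-amplitude sine wave as the "analog" input and unsigned integer codes; the function name and parameter choices are illustrative, not taken from any particular sound card.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, bits, n_samples):
    """Sample a unit-amplitude sine wave and map each value to an
    unsigned integer code with `bits` bits."""
    levels = 2 ** bits                       # 256 for 8 bits, 65536 for 16
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz               # time of the n-th sample, in seconds
        value = math.sin(2 * math.pi * freq_hz * t)    # analog value in [-1, 1]
        code = round((value + 1) / 2 * (levels - 1))   # scale to 0..levels-1
        codes.append(code)
    return codes

# One cycle of a 1 kHz tone sampled at 8 kHz with 8-bit words.
print(sample_and_quantize(1000, 8000, 8, 8))
```

Calling the same function with `bits=16` yields codes between 0 and 65535, so the same waveform is described in much finer steps.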

 

SAMPLE FREQUENCY: According to the Nyquist theorem, a waveform can be reproduced accurately if the sampling frequency is at least twice the frequency of its highest-frequency component. The highest frequency the human ear can perceive is close to 20 kHz, so the 44.1 kHz sampling rate of sound cards is more than enough. This is the value used by audio CD players.
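
What happens when the condition is violated can be shown with a small helper. It is a sketch for pure tones only, and the function name is hypothetical: a tone above half the sampling rate "folds back" and is recorded as a lower frequency.

```python
def alias_frequency(freq_hz: float, sample_rate_hz: float) -> float:
    """Frequency that a pure tone appears to have after sampling."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

for f in (1_000, 20_000, 30_000):
    print(f, "Hz is captured as", alias_frequency(f, 44_100), "Hz")
```

Tones up to 22.05 kHz come through unchanged at a 44.1 kHz sampling rate, while a 30 kHz tone would be recorded as 14.1 kHz.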

SAMPLE SIZE: The sample size controls the dynamic range that can be recorded. For example, 8-bit samples limit the dynamic range to 256 steps (a range of about 48 dB). In contrast, 16-bit samples provide 65,536 steps (a range of about 96 dB), a substantial improvement. The human ear perceives a whole world of difference between these two sample sizes. The ear is more sensitive to differences in pitch than to differences in intensity, but it is still sensitive to the strength of a sound.
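
The relation between sample size and dynamic range follows the usual rule of thumb of roughly 6 dB per bit. A minimal sketch of the theoretical figure, ignoring noise and distortion in real hardware (the function name is illustrative):

```python
import math

def dynamic_range_db(bits: int) -> float:
    """Ratio between the full range and one quantization step, in dB."""
    return 20 * math.log10(2 ** bits)

for bits in (8, 16, 24):
    print(f"{bits}-bit: {2 ** bits} steps, about {dynamic_range_db(bits):.1f} dB")
```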

These processes yield an audio file, for example a WAV file, the best-known kind. WAV is the native format of Windows. WAV files can be 8 or 16 bit, with sampling rates of 11.025 kHz, 22.05 kHz, or 44.1 kHz, and generally have good sound quality.
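
As a sketch of how such a file is put together, Python's standard `wave` module can write one directly. The tone, its length, and the helper name `make_wav` are arbitrary choices for illustration; the file is built in memory rather than on disk.

```python
import io
import math
import struct
import wave

def make_wav(freq_hz=440, seconds=1.0, sample_rate_hz=44100) -> bytes:
    """Return the bytes of a mono, 16-bit WAV file containing a sine tone."""
    n = int(seconds * sample_rate_hz)
    frames = b"".join(
        struct.pack("<h", int(0.5 * 32767 * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)))
        for i in range(n)
    )
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)               # mono
        wav.setsampwidth(2)               # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate_hz)  # samples per second
        wav.writeframes(frames)
    return buffer.getvalue()

data = make_wav()
with wave.open(io.BytesIO(data), "rb") as wav:
    print(wav.getnchannels(), "channel,", wav.getsampwidth() * 8, "bit,",
          wav.getframerate(), "Hz,", wav.getnframes(), "samples")
```

The file is the raw samples preceded by a small header that records the channel count, sample size, and sampling rate.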

Digital audio compression

It might seem that all you have to do to get good sound is record at 44.1 kHz with 16-bit (2-byte) samples. The only problem is that when recording in stereo, sampling the left and right channels simultaneously at 44.1 kHz, one minute of sound needs 10.58 MB of storage. Storing such sound files therefore takes a lot of disk space. Many compressed file formats (codecs) have been developed that allow high-quality recording without needing so much disk space.
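
The storage figure can be checked with a short calculation. The sketch assumes 1 MB = 1,000,000 bytes and ignores the small file header; the function name is illustrative.

```python
def pcm_size_bytes(seconds, sample_rate_hz=44100, bytes_per_sample=2, channels=2):
    """Size of uncompressed PCM audio, ignoring the small file header."""
    return int(seconds * sample_rate_hz * bytes_per_sample * channels)

one_minute = pcm_size_bytes(60)
print(one_minute, "bytes =", one_minute / 1_000_000, "MB")
print("bit rate:", one_minute * 8 / 60 / 1000, "kbps")
```

The same function shows that halving the sampling rate, the sample size, or the number of channels each halves the space needed.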

Most common audio formats:

The aim here is simply to list a number of codecs used by different operating systems for audio compression. The most widely used one, MP3, is described in more detail afterwards.

Some of the most used are:

Advanced Audio Coding (AAC): used by Apple computers. More efficient than MP3.

Audio for Unix (AU): audio standard for the Java programming language.

Windows Media Audio (WMA)

Ogg Vorbis: free, open and not patented.

ATRAC: compression and playback technology for MiniDisc.

The codec par excellence: the MP3

Its origin and current status

The abbreviation MP3 stands for MPEG (Moving Picture Experts Group) 1 Layer 3, which is a perceptual coding algorithm. This algorithm, among others, was developed by the Moving Picture Experts Group (MPEG) (http://www.cselt.it/mpeg/) together with the Fraunhofer Institute of Technology (http://www.ipa.fhg.de/english/).

The Moving Picture Experts Group is an ISO/IEC working group. MPEG is in charge of the international development of standards for the compression, decompression, processing and coded representation of moving pictures, audio and the combination of both. It is a non-profit body created in 1988 that brings together 300 experts from 20 countries three times a year.