
What are the pros and cons of digital audio?

The digital representation of sound is valuable, first of all, for the possibility of endless storage and reproduction without loss of quality, but the conversion from analog to digital form and vice versa inevitably leads to its partial loss.
The most unpleasant distortion introduced at the digitizing stage is granular noise, which arises when the signal is quantized by level and each amplitude is rounded to the nearest discrete value. Unlike simple broadband noise introduced by quantization errors, granular noise is harmonic distortion of the signal, most noticeable in the upper part of the spectrum.
The power of the granular noise is inversely proportional to the number of quantization steps. However, because hearing has a logarithmic characteristic while linear quantization uses a constant step size, quiet sounds span fewer quantization steps than loud ones, and as a result most of the non-linear distortion falls in the region of quiet sounds. This limits the dynamic range, which ideally (ignoring harmonic distortion) would equal the signal-to-noise ratio; the need to limit this distortion reduces the usable dynamic range for 16-bit encoding to 50-60 dB. Logarithmic quantization could have remedied the situation, but implementing it in real time is very difficult and expensive.
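To make the effect of a constant step size concrete, here is a minimal Python sketch that quantizes a sine tone with a uniform quantizer and measures the ratio of signal power to quantization-error power. The function names, the tone frequency, and the two amplitudes are illustrative assumptions, and the sketch measures only the total error power, not its harmonic structure.

```python
import math

def quantize(x, bits):
    """Round x (nominal range -1..1) to the nearest level of a uniform quantizer."""
    step = 2.0 / (2 ** bits)          # constant step size (linear quantization)
    return round(x / step) * step

def snr_db(amplitude, bits, n=10000, freq=0.01234):
    """Signal-to-quantization-error ratio, in dB, for a sine of the given amplitude.

    freq is in cycles per sample; the default is an arbitrary non-round value
    chosen so the samples do not repeat over a short period.
    """
    signal_power = 0.0
    error_power = 0.0
    for i in range(n):
        x = amplitude * math.sin(2 * math.pi * freq * i)
        e = quantize(x, bits) - x     # rounding error for this sample
        signal_power += x * x
        error_power += e * e
    return 10 * math.log10(signal_power / error_power)

loud = snr_db(1.0, 16)     # full-scale tone
quiet = snr_db(0.001, 16)  # tone 60 dB below full scale
print(f"loud: {loud:.1f} dB, quiet: {quiet:.1f} dB")
```

For the full-scale tone the result should land close to the textbook estimate of 6.02 * 16 + 1.76, about 98 dB, while the quiet tone, which spans far fewer steps, comes out roughly 60 dB worse.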
The distortion introduced by granular noise can be reduced by adding ordinary white noise (a random or pseudo-random signal) with an amplitude of half the least significant bit to the signal; this operation is called dithering. It slightly raises the noise level, but it weakens the correlation between the quantization errors and the high-frequency components of the signal and improves subjective perception. Dithering is also applied before rounding samples when their bit depth is reduced. Essentially, dithering and noise shaping are special cases of the same technique; the difference is that the first uses white noise with a flat spectrum, and the second uses noise whose spectrum has a specially chosen shape.
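The effect of dithering can be sketched in a few lines of Python. The example below works in units of one least significant bit (LSB) and rounds a tone whose amplitude is only 0.4 LSB, first without and then with uniformly distributed noise of half an LSB in amplitude. The helper names `requantize` and `projection`, the tone parameters, and the choice of a uniform noise distribution are assumptions made for illustration; practical systems may use other dither distributions.

```python
import math
import random

def requantize(samples, dither=False, rng=None):
    """Round samples (expressed in LSB units) to integers, optionally adding dither."""
    rng = rng or random.Random(0)
    out = []
    for x in samples:
        # White noise with an amplitude of half an LSB, added before rounding
        d = rng.uniform(-0.5, 0.5) if dither else 0.0
        out.append(round(x + d))
    return out

def projection(output, reference):
    """Least-squares gain of the reference tone found in the output (1.0 = fully preserved)."""
    return sum(o * r for o, r in zip(output, reference)) / sum(r * r for r in reference)

n = 20000
tone = [0.4 * math.sin(2 * math.pi * i / 100) for i in range(n)]  # below half an LSB

plain = requantize(tone)                                   # every sample rounds to 0
dithered = requantize(tone, dither=True, rng=random.Random(1))

print("tone preserved without dither:", projection(plain, tone))
print("tone preserved with dither:   ", round(projection(dithered, tone), 2))
```

Without dither the tone vanishes entirely, which is the extreme case of error that is fully correlated with the signal. With dither the individual samples are noisier, but on average the output still carries the tone.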
When audio is restored from digital to analog form, the stepped waveform has to be smoothed and the harmonics introduced by the sample rate suppressed. Because the frequency response of real filters is imperfect, this interference may be suppressed insufficiently, or useful high-frequency components may be attenuated excessively. Poorly suppressed sample rate harmonics distort the shape of the analog signal (especially in the high-frequency region), resulting in a “rough” and “dirty” sound.

What methods are used to effectively compress digital audio?

Currently, the best known are Audio MPEG, PASC and ATRAC. They all use so-called perceptual coding, in which information barely perceptible to the ear is removed from the audio signal. As a result, although the shape and spectrum of the signal change, its perception by ear remains practically unchanged, and the compression ratio justifies a slight decrease in quality. Such coding is a lossy compression method: the original waveform can no longer be restored exactly from the compressed signal.
The techniques for removing part of the information are based on a property of human hearing called masking: if the sound spectrum contains pronounced peaks (dominant harmonics), weaker frequency components in their immediate vicinity are practically not perceived by ear (they are masked). During encoding, the audio stream is divided into small frames, each of which is converted into a spectral representation and divided into several frequency bands. Within each band, masked sounds are detected and removed, after which each frame undergoes adaptive coding directly in spectral form. Together, these operations reduce the amount of data several times over while maintaining a quality acceptable to most listeners.
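The frame-by-frame procedure described above can be illustrated with a deliberately simplified Python sketch: one frame is converted to a spectrum, the spectrum is split into bands, and any component far below the strongest component of its band is discarded. This is a toy model, not the algorithm of any actual codec. The function names, the number of bands, and the 30 dB threshold are assumptions, and real encoders rely on far more detailed psychoacoustic models.

```python
import cmath
import math

def dft(frame):
    """Naive discrete Fourier transform; returns bins 0..N/2 of a real frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]

def mask_frame(frame, bands=4, threshold_db=30.0):
    """Zero every spectral component more than threshold_db below its band's peak."""
    spectrum = dft(frame)
    mags = [abs(c) for c in spectrum]
    size = math.ceil(len(spectrum) / bands)
    kept = list(spectrum)
    for start in range(0, len(spectrum), size):
        band = range(start, min(start + size, len(spectrum)))
        peak = max(mags[k] for k in band)
        floor = peak * 10 ** (-threshold_db / 20)   # hypothetical masking threshold
        for k in band:
            if mags[k] < floor:
                kept[k] = 0j                        # treated as masked and dropped
    return kept

# One 64-sample frame: a strong tone in bin 5 and a tone 60 dB weaker in bin 7
N = 64
frame = [
    math.sin(2 * math.pi * 5 * t / N) + 0.001 * math.sin(2 * math.pi * 7 * t / N)
    for t in range(N)
]
kept = mask_frame(frame)
print("bin 5 kept:", kept[5] != 0, "| bin 7 kept:", kept[7] != 0)
```

Here the weak tone shares a band with the strong one and falls below the threshold, so it is removed, while the dominant tone survives. A real encoder would then code the surviving components adaptively; that step is omitted.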
Each of the described encoding methods is characterized by the bit rate at which the compressed information must enter the decoder when the audio signal is recovered. The decoder converts a series of compressed instantaneous signal spectra into a conventional digital waveform.
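The relationship between bit rate and compression can be checked with simple arithmetic. The sketch below computes the bit rate of uncompressed linear PCM and the resulting compression ratio for a given compressed bit rate. The function names are illustrative, and the figure of 128 kbit/s is an assumed example value, not a rate taken from the text.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed linear PCM audio, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

def compression_ratio(pcm_rate, compressed_rate):
    """How many times smaller the compressed stream is than the PCM stream."""
    return pcm_rate / compressed_rate

# 16-bit stereo audio sampled at 44.1 kHz
pcm = pcm_bit_rate(44100, 16, 2)
# Assumed example: a compressed stream at 128 kbit/s
ratio = compression_ratio(pcm, 128000)
print(f"PCM: {pcm} bit/s, ratio at 128 kbit/s: {ratio:.1f}")
```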
Audio MPEG is a group of audio compression techniques standardized by MPEG (Moving Picture Experts Group).












