digital audio representation and processing in multimedia Archives

Free Download Mp4Gain

How is digital audio processing done?

Audio Processing

Digital audio is processed by mathematical operations applied to individual samples of a signal, or to groups of samples of different lengths.


The mathematical operations can imitate traditional analog processing: mixing two signals is a sum, amplifying or attenuating a signal is multiplication by a constant, modulation is multiplication by a function, and so on. They can also use methods with no direct analog counterpart, for example decomposing a signal into its spectrum (Fourier transform), correcting individual frequency components, and then reassembling the signal from the corrected spectrum.
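The operations listed above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: signals are plain lists of floats, the helper names (`mix`, `gain`, `modulate`, `remove_bin`) are invented for this example, and the spectrum is computed with a naive discrete Fourier transform that is far too slow for real audio.

```python
import cmath
import math

def mix(a, b):
    """Mixing two signals: sample-by-sample sum."""
    return [x + y for x, y in zip(a, b)]

def gain(signal, k):
    """Amplification or attenuation: multiplication by a constant."""
    return [k * x for x in signal]

def modulate(signal, carrier):
    """Modulation: multiplication by another function of time."""
    return [x * c for x, c in zip(signal, carrier)]

def dft(signal):
    """Naive O(n^2) discrete Fourier transform (illustration only)."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse transform: reassemble the signal from its spectrum."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]

def remove_bin(signal, k):
    """Zero one frequency component (and its mirror bin), then resynthesize."""
    spectrum = dft(signal)
    n = len(spectrum)
    spectrum[k] = 0
    spectrum[(n - k) % n] = 0
    return idft(spectrum)

n = 64
low = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]          # 2 cycles per block
high = [0.5 * math.sin(2 * math.pi * 10 * t / n) for t in range(n)]  # 10 cycles per block

mixed = mix(low, high)
quieter = gain(mixed, 0.5)
tremolo = modulate(low, [0.5 + 0.5 * math.cos(2 * math.pi * t / n) for t in range(n)])

# Removing bin 10 from the mix should leave only the low tone.
cleaned = remove_bin(mixed, 10)
max_error = max(abs(c - l) for c, l in zip(cleaned, low))
```

Because both test tones complete a whole number of cycles in the block, each occupies exactly one pair of spectrum bins, so `max_error` should be at the level of floating-point rounding. Real signals rarely line up this neatly, which is why practical spectral processing uses windowing and overlapping blocks.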

Digital signal processing is subdivided into linear processing (in real time, on a “live” signal) and non-linear processing (on a pre-recorded signal). The terms are used here in the editing sense and do not refer to mathematical linearity. Linear processing requires sufficient speed from the computer system (processor); when the required performance and quality cannot be combined, simplified processing with reduced quality is used. Non-linear processing has no time limit, so hardware of any power can be used, and the processing time, especially at high quality, can reach several minutes or even hours.

For processing, both general-purpose processors (Intel 8035, 8051, 80x86, Motorola 68xxx, SPARC) and specialized digital signal processors (DSPs) are used, such as Texas Instruments TMS xxx, Motorola 56xxx, Analog Devices ADSP-xxxx, etc.

The difference between a general-purpose processor and a DSP is that the former targets a wide class of tasks (scientific, economic, logical, gaming, etc.) and has a large general-purpose instruction set dominated by ordinary mathematical and logical operations. DSPs are designed specifically for signal processing and provide specialized operations: saturating addition, vector multiplication, evaluation of mathematical series, etc. Implementing even simple audio processing on a general-purpose processor requires significant performance and is far from always possible in real time, whereas even simple DSPs often cope with relatively complex real-time processing, and powerful DSPs can perform high-quality spectral processing of several signals at the same time.
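To make two of these DSP-style operations concrete, here is a minimal Python sketch of saturating addition and of a multiply-accumulate loop (the core of vector multiplication). On a DSP each of these is typically a single instruction; the function names and the 16-bit sample width are assumptions chosen for the example.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def wrapping_add16(a, b):
    """Ordinary 16-bit two's-complement addition: overflow wraps around."""
    s = (a + b) & 0xFFFF
    return s - 0x10000 if s >= 0x8000 else s

def saturating_add16(a, b):
    """Saturating addition: overflow is clamped to the nearest limit."""
    return max(INT16_MIN, min(INT16_MAX, a + b))

def dot(xs, ys):
    """Multiply-accumulate over two vectors (scalar product)."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc

wrapped = wrapping_add16(30000, 10000)      # -25536: a loud peak turns into a large negative value
saturated = saturating_add16(30000, 10000)  # 32767: the peak is clipped instead
weighted = dot([1, 2, 3], [4, 5, 6])        # 1*4 + 2*5 + 3*6 = 32
```

The wrapped result shows why saturation matters for audio: clipping at the limit distorts the peak, but wrapping replaces it with a value of the opposite sign, which is heard as a loud click.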

Because of their specialization, DSPs are rarely used on their own. Most often a processing device has a general-purpose processor of moderate power that controls the whole device, receives and transmits information, and interacts with the user, plus one or more DSPs that process the audio signal. For example, to implement reliable and fast signal processing in computer systems, specialized DSP boards are used: the processed signal passes through the board, while the computer’s central processor handles only control and data transfer.


How is sound represented digitally?

Digital representation of the sound

The original shape of an audio signal (a continuous change in amplitude over time) is represented digitally by sampling it in two dimensions: in time and in level.


According to Kotelnikov’s theorem (known in English-language literature as the Nyquist-Shannon sampling theorem), any continuous process with a limited spectrum can be fully described by a discrete sequence of its instantaneous values taken at a frequency at least twice that of the highest harmonic of the process. The frequency Fd at which the instantaneous values (samples) are taken is called the sampling frequency.

It follows from the theorem that a signal with frequency Fa can be successfully sampled in time at a frequency of 2Fa only if it is a pure sinusoid, because any deviation from the sinusoidal shape extends the spectrum beyond Fa. Therefore, time sampling of an arbitrary audio signal (whose spectrum, as is known, generally falls off smoothly) requires either choosing the sampling frequency with a margin or forcibly limiting the spectrum of the input signal to below half the sampling frequency.
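The need to limit the input spectrum can be seen in a small experiment. The sketch below, with an assumed sampling frequency of 8 kHz and an illustrative helper `sample_cosine`, samples a 3 kHz tone (below half the sampling frequency) and a 5 kHz tone (above it) and compares the resulting sample sequences.

```python
import math

def sample_cosine(freq_hz, rate_hz, count):
    """Take `count` instantaneous values of a cosine at the given sampling rate."""
    return [math.cos(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]

rate = 8000                                     # assumed sampling frequency, Hz
in_band = sample_cosine(3000, rate, 16)         # below rate / 2
above_nyquist = sample_cosine(5000, rate, 16)   # above rate / 2

# Largest difference between the two sample sequences.
max_diff = max(abs(a - b) for a, b in zip(in_band, above_nyquist))
```

The two sequences coincide up to rounding error, so after sampling the 5 kHz tone is indistinguishable from a 3 kHz one. This effect is called aliasing, and it is why components above half the sampling frequency have to be filtered out before sampling.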

Simultaneously with time sampling, amplitude sampling (quantization) is performed: the instantaneous amplitude values are measured and represented as numbers with some precision. The precision of the measurement (the bit width N of the resulting discrete value) determines the signal-to-noise ratio and the dynamic range of the signal (theoretically these are reciprocal quantities, but any real signal path also has its own level of noise and interference).
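The link between bit width and signal-to-noise ratio can be checked numerically. The following sketch assumes samples normalized to the range -1.0..1.0 and a simple rounding quantizer; the function names are illustrative and not taken from any particular library.

```python
import math

def quantize(x, bits):
    """Map x in [-1.0, 1.0] to a signed integer code of the given width."""
    full_scale = (1 << (bits - 1)) - 1
    clipped = max(-1.0, min(1.0, x))
    return round(clipped * full_scale)

def dequantize(code, bits):
    """Map an integer code back to the range [-1.0, 1.0]."""
    return code / ((1 << (bits - 1)) - 1)

def measured_snr_db(signal, bits):
    """Ratio of signal power to quantization-error power, in decibels."""
    signal_power = sum(x * x for x in signal)
    noise_power = sum((x - dequantize(quantize(x, bits), bits)) ** 2 for x in signal)
    return 10 * math.log10(signal_power / noise_power)

# Full-scale test tone whose period does not line up with the sample grid.
tone = [math.sin(0.1 * n) for n in range(10000)]

snr_8 = measured_snr_db(tone, 8)
snr_16 = measured_snr_db(tone, 16)
```

For a test tone like this one, each additional bit should improve the measured ratio by roughly 6 dB, so the 16-bit result should come out close to 48 dB better than the 8-bit one.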

The resulting stream of numbers (a series of binary digits) that describes an audio signal is called Pulse Code Modulation (PCM), since each pulse of the time-sampled signal is represented by its own digital code.

Linear quantization, in which the numerical value of a sample is proportional to the signal amplitude, is used most often. Because hearing is logarithmic in nature, logarithmic quantization, in which the numerical value is proportional to the signal level in decibels, would be more appropriate, but it involves purely technical difficulties.
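One practical approximation of logarithmic quantization is μ-law companding, used in telephony (ITU-T G.711). It is not mentioned in the text above and is only approximately logarithmic; it is brought in here as an illustration. The sketch uses the continuous μ-law formula with μ = 255 rather than the segmented encoding of the actual standard, and the function names are invented for the example.

```python
import math

MU = 255  # compression parameter of the mu-law curve

def compress(x):
    """Mu-law curve: expands small amplitudes, squeezes large ones. x in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def expand(y):
    """Inverse of compress()."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)

def quantize_log(x, bits=8):
    """Quantize the compressed value instead of the raw amplitude."""
    full_scale = (1 << (bits - 1)) - 1
    return round(compress(x) * full_scale)

def dequantize_log(code, bits=8):
    """Recover an amplitude from a logarithmic code."""
    full_scale = (1 << (bits - 1)) - 1
    return expand(code / full_scale)

quiet = 0.01  # a signal at 1% of full scale

linear_code = round(quiet * 127)   # linear 8-bit code of the quiet sample
log_code = quantize_log(quiet)     # mu-law 8-bit code of the same sample

linear_error = abs(linear_code / 127 - quiet) / quiet
log_error = abs(dequantize_log(log_code) - quiet) / quiet
```

With linear 8-bit quantization the quiet sample lands on the very first code above zero, so its relative error is large. The logarithmic code uses a much larger share of the available range for the same sample, and the relative error after decoding is far smaller.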

Time sampling and amplitude quantization inevitably introduce noise and distortion into the signal. The resulting signal-to-noise ratio is usually estimated with the formula 6N + 10 log10(Fd / 2Fmax) + C (dB), where N is the bit depth, Fd is the sampling frequency, Fmax is the upper limit of the operating band, and the constant C varies with the type of signal: for a pure sinusoid it is 1.7 dB, for sound signals it ranges from -15 to 2 dB. Thus noise in the operating frequency band 0..Fmax is reduced not only by increasing the bit depth of the sample but also by raising the sampling frequency relative to 2Fmax, because the quantization noise is “spread” across the whole band up to half the sampling frequency, while the audio information occupies only the lower part of that band.
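The estimate is easy to evaluate directly. The sketch below simply codes the formula as written in the text; the function and parameter names are chosen for this example, and C defaults to the 1.7 dB figure given for a pure sinusoid.

```python
import math

def estimated_snr_db(bits, sample_rate_hz, f_max_hz, c_db=1.7):
    """6N + 10*log10(Fd / (2*Fmax)) + C, in decibels."""
    return 6 * bits + 10 * math.log10(sample_rate_hz / (2 * f_max_hz)) + c_db

# 16 bits at 44.1 kHz with a 20 kHz operating band.
cd = estimated_snr_db(16, 44_100, 20_000)

# Same bit depth and band, sampling frequency raised four times.
oversampled = estimated_snr_db(16, 4 * 44_100, 20_000)
improvement = oversampled - cd
```

For the first case the estimate comes to about 98.1 dB. Quadrupling the sampling frequency adds about 6 dB, the same gain as one extra bit of depth, which is the trade-off that oversampling converters exploit.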

Most modern digital sound systems use the standard sample rates of 44.1 and 48 kHz, with the signal’s frequency range typically limited to around 20 kHz to keep it below the theoretical limit. The most common level quantization is 16-bit, which gives a limiting signal-to-noise ratio of approximately 98 dB. Studio equipment uses higher resolutions: 18-, 20- and 24-bit quantization at sample rates of 56, 96 and 192 kHz. This is done to preserve the higher harmonics of the sound signal, which are not directly perceived by the ear but affect the formation of the overall sound image.

To digitize lower-quality, narrow-band signals, the sample rate and bit depth can be lowered; for example, telephone lines use 7- or 8-bit digitization at sample rates of 8 to 12 kHz.

The representation of an analog signal in digital form is called Pulse Code Modulation (PCM) because the signal is represented as a series of pulses of constant frequency (time sampling) whose amplitude is digitally encoded (amplitude sampling). A PCM stream can be parallel, when all the bits of each sample are transmitted simultaneously over several lines at the sampling frequency, or serial, when the bits are transmitted one after another over a single line at a higher frequency.
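A serial PCM stream can be illustrated by flattening samples into individual bits and reading them back. The sketch assumes 16-bit two's-complement samples sent most significant bit first; the bit order and the function names are assumptions for the example, since real interfaces define their own framing.

```python
def to_serial(samples, bits):
    """Flatten signed samples into one list of bits, most significant bit first."""
    mask = (1 << bits) - 1
    stream = []
    for s in samples:
        code = s & mask  # two's-complement code of the signed value
        stream.extend((code >> i) & 1 for i in range(bits - 1, -1, -1))
    return stream

def from_serial(stream, bits):
    """Regroup a serial bit list into signed samples."""
    samples = []
    for start in range(0, len(stream), bits):
        code = 0
        for bit in stream[start:start + bits]:
            code = (code << 1) | bit
        if code >= 1 << (bits - 1):  # top bit set: negative value
            code -= 1 << bits
        samples.append(code)
    return samples

def bit_rate_hz(sample_rate_hz, bits, channels=1):
    """Bit frequency needed on a single serial line."""
    return sample_rate_hz * bits * channels

pcm = [0, 1000, -1000, 32767, -32768]
serial = to_serial(pcm, 16)
restored = from_serial(serial, 16)

stereo_16bit_44k = bit_rate_hz(44_100, 16, channels=2)  # 1_411_200 bits per second
```

The last line shows what "a higher frequency" means in practice: two channels of 16-bit samples at 44.1 kHz need a line carrying about 1.41 million bits per second, whereas a parallel bus would run at the sampling frequency itself.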

Digital sound itself and everything related to it are often denoted by the general term Digital Audio; the analog and digital portions of a sound system are called the Analog Domain and the Digital Domain.

What is ADC and DAC?

Analog-to-digital and digital-to-analog converters. The first converts an analog signal into a digital amplitude value; the second performs the inverse conversion.