Basic concepts of digital sound theory


Digital Sound

Sound is, in general, the vibration of an elastic medium. It is caused by the mechanical vibration of some object (a string, vocal cords, and so on) in contact with that medium. The frequency of vibration (measured in hertz) determines the pitch: the higher the frequency, the higher the pitch. The human ear can perceive sound vibrations in air with frequencies from 20 Hz to 20 kHz. The ear perceives the amplitude of the vibration as volume: the higher the amplitude, the louder the sound.

Electromagnetic waves are a direct analog of sound waves. Compared with sound waves, they are less susceptible to dissipation in the environment, and the information they carry is easier to store and process, which makes them the most important secondary carrier of sound. The conversion of acoustic waves into electromagnetic waves (and the reverse operation) relies on the ordinary effect of electromagnetic induction: a current appears in a conductor placed in an alternating magnetic field.

Simply put, when a membrane with an attached magnet vibrates near a coil, it induces an alternating current in the coil. If this current is fed to a loudspeaker, the magnet on its membrane moves and creates the corresponding sound.

This is how the telephone and the radio work.

Sound converted to an electromagnetic waveform can be easily stored. To do this, some parameter of the storage medium (the depth of the groove on a record or the degree of magnetization of a tape) must be made to correspond to the amplitude of the oscillations (that is, the strength of the current induced in the coil). Sound converted directly into electromagnetic waves is called analog sound. Its main characteristic is the direct correspondence between the transmitted or recorded electromagnetic waves and the acoustic ones.

Digital sound is relatively new. Its main difference from analog sound is that it is discrete. When digitizing, a special device, an analog-to-digital converter (ADC), measures the amplitude of the electromagnetic wave corresponding to the analog sound at regular intervals (roughly 0.0001 to 0.00002 seconds, that is, about 8,000 to 48,000 times per second) and writes each value to a file with a specified precision. Each such value is called a sample, and the digitization process itself is often called sampling.

When sound is converted back from digital to analog form (this is done by a device called a digital-to-analog converter, or DAC), the intermediate amplitude values are interpolated (approximated) from the known samples. Since the sampling frequency is usually high, this operation reconstructs the original analog signal fairly accurately.
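As a minimal sketch of the two steps just described, the Python snippet below samples a sine wave at regular intervals, rounds each value to a fixed bit depth, and then estimates an amplitude between two samples. The function names and the 440 Hz test tone are illustrative assumptions, and linear interpolation is used only because it is the simplest choice; real converters use more elaborate reconstruction filtering.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s, bits):
    """Sample a sine wave and quantize each sample to a signed integer."""
    max_level = 2 ** (bits - 1) - 1              # 32767 for 16 bits
    n_samples = round(sample_rate_hz * duration_s)
    return [
        round(max_level * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz))
        for n in range(n_samples)
    ]

def interpolate(samples, sample_rate_hz, t):
    """Estimate the amplitude at time t (seconds) from the two nearest samples."""
    pos = t * sample_rate_hz                     # position measured in samples
    i = int(pos)
    if i >= len(samples) - 1:
        return float(samples[-1])
    frac = pos - i
    return samples[i] * (1 - frac) + samples[i + 1] * frac

# 10 ms of a 440 Hz tone at CD-style settings (44.1 kHz, 16 bits).
samples = sample_sine(freq_hz=440, sample_rate_hz=44100, duration_s=0.01, bits=16)
estimate = interpolate(samples, 44100, 0.0012345)
```

The higher the sampling rate, the closer neighbouring samples are to each other and the smaller the error of such an estimate.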

The digital form of sound is characterized by five parameters.

1. Sampling rate.
2. Bit depth of the samples.
3. The number of channels or tracks.
4. Compression / decompression algorithm (codec).
5. Storage format.

Since each of these parameters is quite specific, we will consider them separately.

Sampling rate
The sample rate determines how many samples per second will be taken when digitizing. If we compare digital sound with digital images, then the sample rate will correspond to the resolution (a more “realistic” analogy is the frame rate in cinema). The higher the sampling frequency, the better it is possible to reconstruct the analog signal based on the digital form of the sound (more precisely, the higher the sampling frequency, the broader the spectrum of frequencies that can be recorded during digitization).
The famous Nyquist-Kotelnikov theorem states that for the correct reconstruction of an analog signal from its digital recording, it is necessary that the sampling frequency be at least twice the maximum sound frequency.
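To see why the factor of two matters, the following sketch (an illustration, not part of the theorem's formal statement) compares the samples of a 30 kHz tone taken at 44.1 kHz with those of a 14.1 kHz tone. The helper names `alias_frequency` and `sample_cosine` are hypothetical. Because 30 kHz is above half the sampling rate, its samples are indistinguishable from those of the lower tone, so the original frequency cannot be recovered.

```python
import math

def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency that a sampled tone appears to have after digitization."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

def sample_cosine(freq_hz, sample_rate_hz, count):
    """Return `count` samples of a cosine tone."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz) for n in range(count)]

# A 30 kHz tone exceeds half of 44.1 kHz, so it folds down to a lower frequency.
apparent = alias_frequency(30000, 44100)
tone = sample_cosine(30000, 44100, 50)
alias = sample_cosine(apparent, 44100, 50)
same_samples = all(abs(a - b) < 1e-9 for a, b in zip(tone, alias))
```

A tone below half the sampling rate, such as 1 kHz, is returned unchanged by `alias_frequency`.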

Since the upper limit of hearing is about 20 kHz, the sample rate should ideally be at least 40 kHz. This is why the standard sampling frequency used for CDs is 44.1 kHz (so-called CD quality). The sample rate can be higher, but such rates are mostly used by recording studios and especially demanding music lovers.

A sample rate of 44.1 kHz is not always appropriate. When transmitting data over a low-bandwidth network, sound quality must be sacrificed in favor of size. In practice, sampling frequencies two, four, and eight times lower than 44.1 kHz (22.05, 11.025, and about 5.5 kHz) are often used.


Sound information on the computer

Digital Audio

Sound is a continuous signal, a sound wave with variable amplitude and frequency.

The greater the amplitude of the signal, the louder it sounds to a person.

The higher the frequency of the signal, the higher the pitch.

The frequency of a sound wave is expressed as the number of vibrations per second and is measured in hertz (Hz).

The human ear can perceive sounds in the range of 20 Hz to 20 kHz, which is called the audible range.
The number of bits used to encode each measurement of the audio signal is called the audio coding depth.
Modern sound cards provide 16-, 32-, or 64-bit audio encoding depth.

When encoding audio information, a continuous signal is replaced by a discrete one, that is, it is converted into a sequence of electrical impulses (binary zeros and ones).
The process of converting audio signals from a continuous representation form to a discrete digital form is called digitization.
An important characteristic when encoding audio is the sample rate, the number of signal level measurements taken in 1 second:
– 1 (one) measurement per second corresponds to a frequency of 1 Hz;
– 1000 measurements per second correspond to a frequency of 1 kHz.
The sample rate can range from 8 kHz to 48 kHz (from the frequency used for radio transmission to the frequency corresponding to the sound quality of musical media).

The higher the sampling frequency and the encoding depth, the better the digitized sound. The lowest quality of digitized sound, corresponding to the quality of telephone communication, is obtained at a sampling rate of 8000 times per second, an encoding depth of 8 bits, and one audio track (mono mode). The highest quality, comparable to that of an audio CD, is achieved with a sampling rate of 48 000 times per second, an encoding depth of 16 bits, and two audio tracks (stereo mode).
It should be remembered that the higher the quality of the digital sound, the greater the volume of information in the audio file.
The volume of information V in a mono audio file can be estimated as V = N ⋅ f ⋅ k, where N is the total duration of the sound (seconds), f is the sampling frequency (Hz), and k is the encoding depth (bits).

For example, with a sound duration of 1 minute and medium sound quality (16 bits, 24 kHz):
V = 60 ⋅ 24000 ⋅ 16 bits = 23,040,000 bits = 2,880,000 bytes = 2812.5 KB ≈ 2.75 MB.

When encoding stereo sound, the sampling process is performed separately and independently for the left and right channels, consequently doubling the size of the audio file compared to mono sound.

For example, let’s estimate the information volume of a digital stereo sound file with a duration of 1 second and medium sound quality (16 bits, 24 000 measurements per second). To do this, the encoding depth must be multiplied by the number of measurements per second and by 2 (stereo):
V = 16 bits ⋅ 24000 ⋅ 2 = 768,000 bits = 96,000 bytes = 93.75 KB.
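The two estimates above can be reproduced with a few lines of Python. This is only a sketch of the formula V = N ⋅ f ⋅ k; the function names and the extra `channels` parameter (1 for mono, 2 for stereo) are assumptions added for illustration, and 1 KB is taken as 1024 bytes, as in the examples.

```python
def audio_size_bits(duration_s, sample_rate_hz, bit_depth, channels=1):
    """Uncompressed size in bits: duration x sampling rate x depth x channels."""
    return duration_s * sample_rate_hz * bit_depth * channels

def describe(bits):
    """Express a size given in bits in larger units (1 KB = 1024 bytes)."""
    size_bytes = bits / 8
    return {
        "bits": bits,
        "bytes": size_bytes,
        "KB": size_bytes / 1024,
        "MB": size_bytes / 1024 ** 2,
    }

mono_minute = describe(audio_size_bits(60, 24000, 16))               # first example
stereo_second = describe(audio_size_bits(1, 24000, 16, channels=2))  # second example
```

The same function shows how quickly size grows with quality: doubling the sampling rate, the bit depth, or the number of channels doubles the result.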

There are several methods for encoding audio information in binary code, among which two main approaches can be distinguished: the FM method and the Wave-Table method.

The FM (Frequency Modulation) method is based on the fact that, theoretically, any complex sound can be decomposed into a sequence of simple harmonic signals of different frequencies, each of which is a regular sinusoid and can therefore be described by a numeric code. The decomposition of audio signals into harmonic series and their representation as discrete digital signals is done by special devices: analog-to-digital converters (ADCs).

Conversion of an audio signal into a discrete signal: a – audio signal at the ADC input; b – discrete signal at the ADC output.

Digital-to-analog converters (DACs) perform the reverse conversion to reproduce sound encoded with a numeric code. The sound conversion process is shown in the figure below. This encoding method does not provide good sound quality, but it does produce compact code.

Conversion of a discrete signal into an audio signal: a – discrete signal at the DAC input; b – audio signal at the DAC output.

The wavetable method (Wave-Table) is based on previously prepared tables that store samples of sounds from the surrounding world, musical instruments, and so on.
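The idea of playing sound from a stored table can be sketched as follows. This is a simplified illustration under stated assumptions: the table here holds one computed cycle of a sine wave rather than a recording of a real instrument, and the names `SINE_TABLE` and `play_from_table` are hypothetical. The pitch is changed by reading the table faster or slower.

```python
import math

TABLE_SIZE = 256
# One stored cycle of a waveform; a real table would hold a recorded sample.
SINE_TABLE = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]

def play_from_table(table, freq_hz, sample_rate_hz, count):
    """Read the table at a speed that produces the requested pitch."""
    step = freq_hz * len(table) / sample_rate_hz   # table positions per output sample
    phase = 0.0
    out = []
    for _ in range(count):
        out.append(table[int(phase) % len(table)])  # nearest stored value, no interpolation
        phase += step
    return out

# At 100 Hz and 25 600 samples per second the step is exactly one table entry.
one_cycle = play_from_table(SINE_TABLE, 100, 25600, TABLE_SIZE)
octave_up = play_from_table(SINE_TABLE, 200, 25600, TABLE_SIZE)
```

Reading every second entry (`octave_up`) doubles the frequency, which is how one stored sample can serve many notes.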

How digital sound is reproduced

Have you ever wondered how sound is reproduced on digital devices?

How is a sound signal formed from a combination of ones and zeros? I’m sure you have, since you started reading! But often even professionals have only a general idea of the modern audio signal path. In this article, you will learn how the different formats appeared, what a digital-to-analog converter is, what types of DACs exist, and what determines the quality of sound reproduction.

PCM
As you know, in digital audio almost every format, with rare exceptions, is recorded as a PCM stream (pulse-code modulation). FLAC, MP3, WAV, Audio CD, DVD-Audio and other formats are just ways to pack, or “preserve”, the PCM stream.
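The "packing" idea is easiest to see with WAV. The sketch below builds a raw 16-bit PCM stream for a short tone and wraps it using Python's standard `wave` module, writing to an in-memory buffer instead of a file. The helper names `pcm_sine` and `wrap_in_wav` and the 440 Hz tone are assumptions made for the example.

```python
import io
import math
import struct
import wave

def pcm_sine(freq_hz, sample_rate_hz, duration_s):
    """Return raw 16-bit little-endian mono PCM bytes for a sine tone."""
    count = round(sample_rate_hz * duration_s)
    values = [
        round(32767 * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz))
        for n in range(count)
    ]
    return struct.pack("<%dh" % count, *values)

def wrap_in_wav(pcm_bytes, sample_rate_hz):
    """Put a WAV header in front of an existing mono 16-bit PCM stream."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)              # mono
        wav.setsampwidth(2)              # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate_hz)
        wav.writeframes(pcm_bytes)
    return buffer.getvalue()

pcm = pcm_sine(440, 44100, 0.1)
wav_bytes = wrap_in_wav(pcm, 44100)
```

For plain PCM written this way, the result is the header followed by the unchanged PCM bytes. Compressed formats such as FLAC or MP3 transform the stream instead of storing it as is.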

How it all began
The theoretical foundations of digital sound transmission were developed at the dawn of the 20th century, when scientists tried to transmit an audio signal over a long distance not by telephone, but in a way that was rather strange for that time.

By dividing the sound wave into small parts, it could be sent to the receiver in some kind of mathematical representation. The recipient, in turn, could restore the original waveform and listen to the recording. In addition, scientists were faced with the task of increasing the bandwidth of the “ether”.

In 1933, V. A. Kotelnikov’s theorem was published. In Western sources, it is called the Nyquist-Shannon theorem. Yes, Harry Nyquist was the first to raise this issue: in 1927 he calculated the minimum sampling frequency for transmitting a waveform, which was later named the “Nyquist frequency” after him. But Kotelnikov’s theorem was published 16 years before Shannon’s formulation.

The essence of the theorem is simple: a continuous signal can be represented as an interpolation series built from discrete samples, from which the signal can be reconstructed. To restore the original signal, the sampling frequency must be at least twice the upper cutoff frequency of that signal.
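The interpolation series can be tried out numerically. The sketch below uses the standard sinc-based form of the series and evaluates it at a moment halfway between two samples of a 1 kHz tone sampled at 8 kHz. The function names are assumptions, and because only a finite number of samples is summed, the result is an approximation rather than an exact reconstruction.

```python
import math

def sinc(x):
    """Normalized sinc function: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, sample_rate_hz, t):
    """Evaluate the interpolation series at time t (seconds)."""
    return sum(
        value * sinc(t * sample_rate_hz - n)
        for n, value in enumerate(samples)
    )

sample_rate = 8000
tone = 1000                                  # well below half the sampling rate
samples = [math.cos(2 * math.pi * tone * n / sample_rate) for n in range(400)]

t = 200.5 / sample_rate                      # halfway between samples 200 and 201
restored = reconstruct(samples, sample_rate, t)
expected = math.cos(2 * math.pi * tone * t)
```

With a few hundred samples on each side of `t`, `restored` agrees with `expected` to within a small fraction of the amplitude, even though no sample was taken at that moment.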

For many years, the theorem was not in demand, until the advent of the digital age, when it finally found a use. In particular, the theorem was useful in the development of the CDDA (Compact Disc Digital Audio) format, commonly known as Audio CD or Red Book. The format was released by engineers at Philips and Sony in 1980 and became the standard for audio CDs.

Format characteristics:

sampling frequency – 44.1 kHz;
quantization bit depth – 16 bits.

INFO
Sampling rate: the number of samples of the signal taken per second during digitization. Measured in hertz.
Quantization bit depth: the number of binary digits used to express the amplitude of the signal. Measured in bits.
The 44.1 kHz sampling rate was calculated from Kotelnikov’s theorem. It is believed that the hearing of the average person cannot pick up sound beyond 19-22 kHz. A frequency of 22 kHz was probably chosen as the upper limit.

22,000 Hz × 2 = 44,000 Hz; 44,000 Hz + 100 Hz = 44,100 Hz

Where do the extra 100 Hz come from? One explanation is that this is a small margin in case of errors or oversampling. In fact, Sony chose this frequency for its compatibility with the PAL television standard.

The bit depth of the CDDA format is 16 bits, or 65,536 quantization levels, which corresponds to a dynamic range of approximately 96 dB. Such a large number of levels was not chosen by chance: first, because of the strong influence of quantization noise, and second, to provide a formal dynamic range superior to that of the main competitors at the time, cassette tapes and vinyl records. I’ll cover this in more detail in the section on digital-to-analog converters.
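The 96 dB figure can be checked directly. Assuming the usual definition of formal dynamic range as the ratio between the largest and smallest representable amplitude steps, expressed in decibels as 20 ⋅ log10 of the number of levels, a short Python sketch (with hypothetical function names) gives:

```python
import math

def quantization_levels(bits):
    """Number of distinct amplitude values available at a given bit depth."""
    return 2 ** bits

def dynamic_range_db(bits):
    """Formal dynamic range in decibels: 20 * log10(number of levels)."""
    return 20 * math.log10(quantization_levels(bits))

levels_16 = quantization_levels(16)
range_16 = dynamic_range_db(16)
range_24 = dynamic_range_db(24)
```

Each additional bit doubles the number of levels and adds about 6 dB, so 16 bits gives roughly 96 dB and 24 bits roughly 144 dB.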

The development of PCM continued on the principle of multiplying by two. Other sample rates appeared: first the 48 kHz sample rate was added, followed by rates derived from it: 96, 192, and 384 kHz. The 44.1 kHz frequency was also doubled to 88.2, 176.4, and 352.8 kHz. Bit depth increased from 16 to 24 and then to 32 bits.

The next format after CDDA was DAT (Digital Audio Tape), which appeared in 1987. Its sample rate was 48 kHz, and the quantization bit depth did not change. Although the format failed, the 48 kHz sample rate caught on in recording studios, reportedly because of its convenience for digital processing.

In 1999, the DVD-Audio format was released. It made it possible to record on a disc six audio channels with a sampling frequency of 96 kHz and a bit depth of 24 bits, or two channels (stereo) with a sampling frequency of 192 kHz and a bit depth of 24 bits.