What is digital audio?



How does digital audio work?


What is the bit rate? Simply saying that “the rate is the bit rate” does not explain much. When you play sound files in most software, you will notice a small label such as “128 Kbps” or “1411 Kbps”. Many people know that, under normal circumstances, the larger the number in front of “Kbps”, the better the sound; a CD, for example, is “1411 Kbps”. So what do these numbers represent? In a nutshell, they state how much data is converted into sound per second. CDs sound better than MP3s because they carry more information per second: a 128 Kbps MP3 file delivers about 11 times less data per second than 1411 Kbps CD audio. For the same song, the CD reproduces finer detail (although some listeners, jokingly known as “mushrooms”, feel that the two sound the same). MP3 expresses the same content with less data, so its level of detail is not as good as that of a CD.
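The “1411 Kbps” figure can be checked with a little arithmetic. The sketch below assumes the standard CD parameters (44,100 samples per second, 16 bits per sample, two channels) and takes 1 kbit as 1,000 bits; the function name is only illustrative.

```python
def pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels):
    """Bit rate of uncompressed PCM audio in kilobits per second (1 kbit = 1000 bits)."""
    return sample_rate_hz * bit_depth * channels / 1000


cd_kbps = pcm_bitrate_kbps(44_100, 16, 2)  # assumed CD parameters
mp3_kbps = 128                             # the MP3 bit rate used in the text

print(cd_kbps)                       # 1411.2
print(round(cd_kbps / mp3_kbps, 1))  # 11.0
```

So the CD stream carries roughly 11 times as much data per second as a 128 Kbps MP3.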

 

2. Sampling rate.

 

Sampling rate is also a very common term. It is written as “XX Hz”, where “XX” is a specific number, for example “44100 Hz (44.1 kHz)” or “32000 Hz (32 kHz)”. Digital audio files are made up of many “points” (samples), and the sample rate is the number of points collected per unit of time. A sampling rate of 44100 Hz is higher than 32000 Hz, so more points are collected per second. The more points per unit of time, the more complete the sound information and the closer it is to reality. So, if the bit rate is the same, a 44100 Hz file is better than a 32000 Hz file (although this is not absolute).

 

————————————————–

 

lossy compression

 

In fact, most of us are already familiar with lossy compressed audio. The popular lossy formats at present mainly include MP3, WMA, OGG, MP3pro, AAC, VQF, and ASF.

 

2. WMA format

 

 

 

The full name of WMA is Windows Media Audio, an audio format promoted by Microsoft. WMA achieves a higher compression ratio by reducing the data stream while aiming to maintain sound quality. The compression ratio can reportedly reach 1:18, in which case the generated file is about half the size of the corresponding MP3 file.

 

3. MP3 format

 

 

 

The full name of MP3 is Moving Picture Experts Group Audio Layer III. In a nutshell, MP3 is an audio compression technology. Because the full name of the compression method is MPEG Audio Layer 3, people call it MP3 for short. It was born in 1993, and its “parents” are the German Fraunhofer IIS and the French Thomson.

 

MP3 uses MPEG Audio Layer 3 technology to compress music into smaller files with a compression ratio of 1:10 or even 1:12. In other words, files can be made much smaller with relatively little audible loss of sound quality. It is precisely this combination of small size and good sound quality that has made the MP3 format almost synonymous with online music. One minute of MP3 music takes up only about 1 MB, so a typical song is only 3-4 megabytes. An MP3 player decompresses (decodes) the file in real time during playback.
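The “about 1 MB per minute” figure follows directly from the bit rate. The sketch below assumes a constant bit rate of 128 Kbps (the value used earlier in the text) and counts 1 MB as 1,000,000 bytes; the function name is hypothetical.

```python
def compressed_size_mb(bitrate_kbps, seconds):
    """Approximate size in megabytes (10**6 bytes) of a constant-bit-rate stream."""
    bits = bitrate_kbps * 1000 * seconds
    return bits / 8 / 1_000_000


one_minute = compressed_size_mb(128, 60)        # one minute of 128 Kbps MP3
four_minutes = compressed_size_mb(128, 4 * 60)  # a four-minute song

print(one_minute)    # 0.96
print(four_minutes)  # 3.84
```

A four-minute song at this bit rate lands inside the 3-4 megabyte range quoted above.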



In our daily lives we listen to all kinds of music, and most of it reaches us in digital form, whether it is streamed, downloaded to a computer, or played on an MP3 or CD player. You will often see format names such as MP3, WMA, and APE, but do you know what they mean? Below I have compiled some background on them; I hope it helps.

 

1. Introduction to digital music

 

 

 

Digital audio sources, that is, digital audio formats, originally meant CDs. Compressing CD audio gave rise to a variety of formats suited to playback on portable players. These compressed formats fall into two categories: lossy and lossless. Compression here means converting an audio stream encoded as PCM (for example, in a WAV file) into another format by special processing in order to reduce the file size. Lossy or lossless refers to whether the sound signal retained in the new file is reduced compared with the original PCM/WAV signal.

 

PCM is short for Pulse Code Modulation, one of the encoding methods used in digital communication. Each sampled value is rounded (quantized) to the nearest of a fixed set of levels, and that level is then represented by a binary code that expresses the amplitude of the sampled pulse.
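As a rough illustration of PCM, the sketch below samples a sine wave and rounds each sample to a signed 16-bit integer. The tone frequency, sample rate, and function name are arbitrary choices for the example, not values from the text.

```python
import math


def pcm_encode(freq_hz, sample_rate_hz, n_samples, bit_depth=16):
    """Sample a unit-amplitude sine wave and quantize each sample to a signed integer."""
    max_code = 2 ** (bit_depth - 1) - 1  # 32767 for 16 bits
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz                           # time of the n-th sample
        amplitude = math.sin(2 * math.pi * freq_hz * t)  # continuous value in [-1, 1]
        codes.append(round(amplitude * max_code))        # rounding = quantization
    return codes


codes = pcm_encode(freq_hz=1000, sample_rate_hz=8000, n_samples=8)
print(codes)
# The same values as 16-bit two's-complement binary codes:
print([format(c & 0xFFFF, "016b") for c in codes])
```

Each printed integer is one “point” of the recording; the binary strings are what is actually stored.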

The final form of a digital audio signal is still made up of “0” and “1”. They can appear in any permutation and combination, such as “0001110101” or “11100001010”, and different combinations produce different results. Seeing this, some readers will have noticed a problem. If sound is recorded in a form like “00101010”, then the recording is a series of separate “dots” rather than a continuous change. Sound is continuous, so how can it be recorded with “dots”? Shouldn’t the sound we hear come out piece by piece? The reason is not difficult to understand. Go home and turn on a fluorescent light: can you see it flickering? No? In fact, fluorescent lights flicker constantly. Have you watched cartoons? They are sequences of still images shown one after another, and we can think of each image as one “dot”.

Human perception of the natural world has limits, both visual and auditory. Cartoons appear to move smoothly because human vision cannot respond quickly enough to separate the individual “dots”; the continuity is an illusion. Machines can distinguish these “dots”, but people cannot. The same is true of sound. If the “dots” follow one another quickly enough, people cannot tell them apart. In addition, during digital-to-analog conversion (D/A conversion), the decoder chip joins these “dots” into a continuous signal, so we hear a coherent sound.

Digital audio formats


A digital audio format is a way of representing audio data for digital sound recording and for subsequent storage of the recorded material on a computer or other electronic media (so-called audio media).


The audio file (a file containing a sound recording) is a computer file consisting of information about the amplitude and frequency of sound, saved for later playback on a computer or player.

Varieties of digital audio formats.

The term “audio format” is used in several senses.

The digital representation of audio data depends on how the analog-to-digital converter (ADC) quantizes the signal. In sound engineering, two types of quantization are currently the most common:

pulse code modulation

sigma delta modulation

Quantization bit depth and sample rate are often specified for various audio recording and playback devices as a digital audio rendering format (24-bit / 192 kHz; 16-bit / 48 kHz).

The file format determines the structure and presentation characteristics of the audio data when stored on a PC storage device. To eliminate the redundancy of the audio data, audio codecs are used, with the help of which the audio data is compressed. There are three groups of audio file formats:

uncompressed audio formats like WAV, AIFF

lossless compressed audio formats (APE, FLAC)

lossy compressed audio formats (mp3, ogg)

Module (tracker) music file formats such as MOD form a separate group. Created synthetically or from prerecorded samples of live instruments, they are primarily used to create modern electronic music. The MIDI format can also be placed here: it is not a sound recording, but, with a sequencer, it lets you record and play music as a set of commands.

Digital audio media formats are used for both mass distribution of sound recordings (CD, SACD) and professional sound recording (DAT, minidisc).

For surround sound systems, sound formats can also be distinguished, which are mainly multichannel sound accompaniments for movies. These systems have complete format families from two major competitors, Digital Theater Systems Inc. – DTS and Dolby Laboratories Inc. – Dolby Digital.

The channel layout of a multichannel sound system (5.1; 7.1) is also called a format. Such systems were originally developed for movie theaters and were later adapted for home theater systems.

What is digital audio and how does it work


Regardless of the path chosen, once the source is connected, its sound is sent to a chip called a digital audio converter (this section calls it the DAC for short, although strictly it combines an ADC and a DAC), where two stages take place:


1) Conversion from analog to digital (A/D);

2) Conversion from digital to analog (D/A).

This chip is sometimes called an AD/DA converter. Here the analog audio signal is converted to digital, then passed to the central processor and memory, and then to the storage medium. On playback, stored digital recordings (often in .WAV format) are sent back to memory and the CPU, and then converted back to analog by the converter.

A digital audio/MIDI sequencer lets you record the sound of synthesizers, guitars, and microphones to files with the .wav extension. No matter how sound is transferred to the computer, it still passes through the converter, computer memory, and hard drive. The resulting data is called digital audio data. If you record in “CD quality” (which, incidentally, is one of the lowest settings in use), every second of sound is divided into 44,100 pieces. What is this data? Only numbers. But unlike the MIDI format, which encodes the notes played, digital audio data is a digital representation of the actual sound wave: the sound itself, described in numbers. As you might guess, this format takes up hundreds or even thousands of times more space than MIDI data.

A waveform display is a graphical representation of digital audio data. For the computer, it is just a sequence of numbers. Various operations can be performed on this data to change and improve the sound. Outwardly, the signal appears to pass through a series of effects, but in reality what happens is a mathematical process.

How MIDI is converted to sound
You may be wondering how to convert MIDI to audio: is there a “convert” utility for that? Connect the output jacks of your synthesizer to your sound card (or audio interface, or mixer with FireWire, etc.) and start recording. The analog waves pass through the converter (its A/D stage), are turned into numbers, and voilà, you have digital audio data. The nice thing about a sequencer is that you can first record a MIDI track, refine it in the editors, and then render it to digital audio for a perfect recording (well, maybe not perfect; nothing in the world is perfect). If you are using a software synthesizer, the process has a slightly different name, but the gist is the same: the computer creates an audio track from the MIDI data and records it in an audio format.

Now it is time to process the resulting files, perfectly in sync, with plugins or effects. You can also keep the finished tracks in MIDI format (so that you can edit them at any time) and add vocals, guitars, or anything else you want. The sequencer can work with MIDI files and digital audio simultaneously.

Effects types
One of the main and most frequently used effects is vibrato.
Amplitude vibrato: the amplitude of the signal changes periodically. The rate of change should be low, from a fraction of a hertz to 10-12 Hz. Tremolo is a type of amplitude vibrato in which the modulation rate is at least 10-12 Hz, so that the signal seems to be output in short bursts.

Frequency vibrato: before electronics, this was done on electric guitars. By changing the tension of the strings with a special lever, the musician changes the pitch (that is, the frequency) and achieves the effect of frequency vibrato. The same can be done on synthesizers and MIDI keyboards with a special wheel or lever. In music editors you can also adjust the frequency of the sound and vary it within specified limits.

Timbre vibrato: the signal passes through a filter whose settings are changed periodically. The periodic changes in timbre coloration produce an interesting and beautiful sound.
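Amplitude vibrato is the simplest of the three to express in code: each sample is multiplied by a slowly varying gain. The sketch below is one minimal way to do that; the function name, the 5 Hz modulation rate, and the depth of 0.5 are illustrative assumptions rather than values from the text.

```python
import math


def amplitude_vibrato(samples, sample_rate_hz, lfo_hz=5.0, depth=0.5):
    """Scale each sample by a slow sine-wave gain that swings between 1-depth and 1+depth."""
    out = []
    for n, x in enumerate(samples):
        gain = 1.0 + depth * math.sin(2 * math.pi * lfo_hz * n / sample_rate_hz)
        out.append(x * gain)
    return out


# A constant-level input makes the periodic change in amplitude easy to see.
steady = [1.0] * 1000  # one second at an assumed 1000 samples per second
modulated = amplitude_vibrato(steady, sample_rate_hz=1000)
print(max(modulated), min(modulated))
```

Raising `lfo_hz` above roughly 10-12 Hz moves the result toward what the text describes as tremolo.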

Effects: Reverb, Chorus, Flanger, Phaser, Delay: effects based on the delay of the signal.

Reverberation: the effect is created by mixing the main signal with copies of it delayed by different amounts of time, as if reflected from various obstacles (walls, objects, etc.). The number of copies can be unlimited: a reflected signal can be reflected again from another obstacle (the delay naturally increases) and is again summed with the main signal. With a short delay, the effect produces a spacious, booming sound.
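The common building block of these delay-based effects is a delay line with feedback. The sketch below is a single feedback delay, which is much simpler than a real reverb (real reverbs combine many such delays); the function and parameter names are hypothetical.

```python
def feedback_delay(samples, delay_samples, feedback=0.5, mix=0.5):
    """Mix the input with delayed copies of itself; each repeat is scaled by `feedback`."""
    buffer = [0.0] * delay_samples  # circular delay line
    out = []
    for n, x in enumerate(samples):
        slot = n % delay_samples
        delayed = buffer[slot]                 # what went in `delay_samples` ago
        out.append(x + mix * delayed)          # dry signal plus the echo
        buffer[slot] = x + feedback * delayed  # feed the echo back into the line
    return out


# A single click followed by silence shows the decaying train of copies.
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
echoes = feedback_delay(impulse, delay_samples=2)
print(echoes)  # [1.0, 0.0, 0.5, 0.0, 0.25, 0.0, 0.125]
```

Each echo arrives two samples after the previous one and at half its level, which is the “copies summed with the main signal” behavior described above.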

MIDI and digital sound: pros and cons


WAVE is one of many formats for recording digital audio, and far from the only one.


Unlike MIDI data, digital audio data is actual sound, recorded as thousands of units called samples. The data represents the amplitude (or volume) of the sound at discrete moments. Digital audio does not depend on the playback device and therefore always sounds the same, but the price for this is large sound files.

MIDI data is to digital audio what vector graphics are to bitmaps. In other words, MIDI data depends on the playback device, and digital audio does not. Just as the appearance of vector graphics depends on the printer or monitor, the sound of a MIDI file depends on the MIDI device that plays it. Similarly, a melody played on a concert grand will sound different from the same melody on an ordinary piano. Digital audio, on the other hand, is identical on any reproduction system. In this respect the MIDI standard is similar to the PostScript standard: it lets you control instruments in a language they understand.

Compared to digital sound, MIDI has the following advantages:

MIDI files take up less memory and the size of these files does not affect sound quality. On average, MIDI files are 200 to 1000 times smaller than digital files and therefore take up a small amount of RAM, disk space and do not require large CPU resources.

In some cases, MIDI files sound better than digital audio files. In this case, the sound source of the MIDI files must be of high quality.

You can change the length of a MIDI file by changing the tempo, while maintaining the quality and volume of the sound. MIDI data can be easily edited, even at the level of a single note. You can manipulate small segments of a MIDI song (with millisecond precision), which is much harder to do with digital audio.

The main disadvantage of a MIDI file comes from its merits. Since MIDI data is not sound itself, playback is accurate only to the extent that the playback device matches the device used to create the original file. Even the sound of an instrument defined by the General MIDI standard depends on the electronic playback device and the synthesis method it uses. MIDI is not used for voice playback.

The main advantage of digital audio over MIDI sound is that the reproduction quality of digital sound is always constant, and here MIDI sound is inferior to digital sound. There are two reasons why you should work with digital audio:

A wider selection of programs and systems that support working with digital sound.

The preparation and creation of digital sound elements does not require knowledge of music theory, which cannot be said for MIDI data.

Sound tips
Voice recording from microphone
Any book devoted to multimedia necessarily contains a section on recording sound from a microphone. The Sound Recorder program, which is included in the standard Windows distribution, is usually used for this. Working with it is described in detail in its help file. It is easy to use, so we will not dwell on it.

Microphones are either condenser or dynamic. Condenser microphones are more expensive and give better sound, but their connection must be compatible with the sound card, and the vast majority of sound cards are designed for dynamic microphones.

Another important characteristic of a microphone is its directivity. Microphones are omni-directional (equally sensitive to sound from all directions), unidirectional (most sensitive to sound coming from the front), or bi-directional (most sensitive to sound coming from the front and rear). A unidirectional microphone is usually the best option, as it rejects background noise, but it is more expensive than an omni-directional microphone and more sensitive to breath noise.

Be sure to pay attention to the impedance of the microphone. The optimal value is around 600 ohms.

Therefore, we recommend a 600 ohm dynamic omni-directional microphone.

Differences between analog and digital audio


Sound information. Sound is a wave that travels through air, water, or other medium with a continuously changing intensity and frequency.


A person perceives sound waves (air vibrations) with the help of hearing in the form of sound of varying volume and pitch. The greater the intensity of the sound wave, the louder the sound, the higher the frequency of the wave, the higher the pitch of the sound (Fig. 1.1).

Dependence of the volume and pitch of the sound on the intensity and frequency of the sound wave.

The human ear perceives sound at a frequency of 20 vibrations per second (low sound) to 20,000 vibrations per second (high sound).

A person can perceive sound over a wide range of intensities, in which the maximum intensity is $10^{14}$ times greater than the minimum (one hundred thousand billion times). A special unit, the decibel (dB), is used to measure the loudness of sound (Table 5.1). Decreasing or increasing the sound level by 10 dB corresponds to a decrease or increase in sound intensity by a factor of 10.
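The relationship between intensity ratios and decibels can be sketched with the standard definition, level = 10 · log10(I / I0). The function names below are illustrative.

```python
import math


def intensity_ratio_to_db(ratio):
    """Convert a ratio of sound intensities to a level difference in decibels."""
    return 10 * math.log10(ratio)


def db_to_intensity_ratio(db):
    """Convert a level difference in decibels back to a ratio of intensities."""
    return 10 ** (db / 10)


full_range_db = intensity_ratio_to_db(1e14)  # the range of human hearing from the text
ten_db_ratio = db_to_intensity_ratio(10)     # what a 10 dB step means for intensity

print(round(full_range_db), round(ten_db_ratio))  # 140 10
```

So the full range of intensities quoted above spans about 140 dB, and each 10 dB step is a tenfold change in intensity.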

Time sampling of sound. For a computer to process sound, the continuous audio signal must be converted to a discrete digital form by time sampling. The continuous sound wave is divided into separate short time sections, and a certain value of sound intensity is set for each section.

The continuous dependence of loudness on time, A(t), is therefore replaced by a discrete sequence of loudness levels. On a graph, this looks like replacing a smooth curve with a sequence of “steps” (Fig. 1.2).

Time sampling of sound

Sampling frequency. A microphone connected to the sound card is used to record analog sound and convert it to digital form. The quality of the digital sound obtained depends on the number of measurements of the sound level per unit of time, that is, on the sampling frequency. The more measurements are made in 1 second (the higher the sampling frequency), the more accurately the “ladder” of the digital audio signal follows the curve of the analog signal.

The audio sample rate is the number of sound volume measurements in one second.

The audio sample rate typically ranges from 8,000 to 48,000 measurements of the sound level per second; studio and high-resolution formats use rates of up to 192,000.

Audio encoding depth. Each “step” is assigned a specific value for the sound volume level. Loudness levels of sound can be viewed as a set of possible states N, for which a certain amount of information I is required, which is called audio coding depth.

Audio encoding depth is the amount of information required to encode the discrete volume levels of digital audio.

If the encoding depth is known, the number of digital audio loudness levels can be calculated using the formula $N = 2^I$. Let the audio encoding depth be 16 bits; then the number of sound volume levels is:

$N = 2^I = 2^{16} = 65536$.

During the encoding process, each sound volume level is assigned its own 16-bit binary code, the lowest sound level will correspond to the code 0000000000000000 and the highest – 1111111111111111.
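The level count and the binary codes mentioned above can be reproduced in a couple of lines. This is only a sketch of the arithmetic; the function names are made up for the example.

```python
def quantization_levels(bit_depth):
    """Number of distinct loudness levels for a given encoding depth: N = 2**I."""
    return 2 ** bit_depth


def level_code(level, bit_depth):
    """Binary code of a level, padded with leading zeros to the full encoding depth."""
    return format(level, f"0{bit_depth}b")


levels = quantization_levels(16)
print(levels)                      # 65536
print(level_code(0, 16))           # 0000000000000000
print(level_code(levels - 1, 16))  # 1111111111111111
```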

The quality of digitized sound. The higher the sampling frequency and the encoding depth, the better the digitized sound. The lowest quality, corresponding to telephone communication, is obtained with a sampling rate of 8,000 times per second, an encoding depth of 8 bits, and a single audio track (“mono” mode). High quality, corresponding to an audio CD, is achieved with a sampling rate of 44,100 times per second, an encoding depth of 16 bits, and two audio tracks (stereo mode).

It should be remembered that the higher the quality of the digital sound, the greater the volume of information in the audio file. You can estimate the volume of information in a digital stereo sound file with a duration of 1 second with an average sound quality (16 bits, 24,000 measurements per second). To do this, the encoding depth must be multiplied by the number of measurements in 1 second and multiplied by 2 (stereo sound):

16 bits × 24,000 × 2 = 768,000 bits = 96,000 bytes = 93.75 KB.
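The same estimate can be written as a small function so that other settings are easy to try. The sketch assumes uncompressed audio and 1 KB = 1,024 bytes, as in the calculation above; the function name is illustrative.

```python
def pcm_size_bytes(bit_depth, sample_rate_hz, channels, seconds):
    """Size in bytes of uncompressed audio: depth x rate x channels x duration / 8."""
    bits = bit_depth * sample_rate_hz * channels * seconds
    return bits / 8


one_second = pcm_size_bytes(16, 24_000, 2, 1)  # the example from the text
print(one_second)         # 96000.0
print(one_second / 1024)  # 93.75

# One minute at the CD settings (16 bits, 44,100 Hz, stereo):
print(pcm_size_bytes(16, 44_100, 2, 60))  # 10584000.0
```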

Sound editors. Sound editors allow you not only to record and play sound, but also to edit it. Digitized sound is presented in sound editors visually, so copying, moving, and deleting parts of the audio track can be easily performed with the mouse. Also, you can layer tracks

Analog or digital audio?


Mechanical, electromechanical, optical, and magnetic recording were originally analog recording methods: recording and reproducing sound vibrations in their natural form (waves).


Many people believe that there is no better sound recording than analog. The warm analog sound of the magnetic tape is the standard of the best audio recordings for all mankind. Everyone from Elvis Presley and the Beatles to the latest electronic musicians have used and are using analog tape recording or emulation to create their music.

But analog recording is not the most accurate way to record sound. Rather the most beautiful. Analog sound is pleasant to the human ear due to the presence of “warm” harmonics, which are, in fact, distortions of sound. The most accurate sound recording principle today is digital recording.

The father of digital sound was Vladimir (Volodya) Kotelnikov, who in 1933, at the age of 25, formulated the famous “sampling theorem” (also known as “Kotelnikov’s theorem” or the “Nyquist-Shannon theorem”). This theorem laid the foundation for digitizing sound: encoding an audio signal into bits, that is, converting an analog signal into a digital one. It took another 49 years to create the CD as we know it, which reached the world only in 1982.

The types of digital sound recording in use today are: digital magnetic recording (DAT cassette format), magneto-optical recording (MiniDisc format), laser recording (CD and SACD formats), and optical digital recording (Dolby Digital).

The development of computers and digital technology has opened up enormous possibilities for processing and recording sound. Huge analog studios with countless multi-kilogram recording equipment, consoles, and sound processors are being replaced by virtual studios that fit into the computer’s system unit.

To process sound on a computer, it must first be recorded in a digital, encoded format. The analog signal is encoded by an analog-to-digital converter (ADC). To play back the recording, the conversion must be reversed by a digital-to-analog converter (DAC). The DAC and ADC are part of the computer sound card and of other digital audio equipment. The quality of sound recording and playback depends heavily on the quality of the DAC and ADC.

DAC and ADC

The main parameters of digital sound are sample rate and bit depth. Both the quality of the digitized sound and the size of the recorded file depend directly on them.

Sampling rate (sampling)

Analog recording begins when the “record” button is pressed and ends when “stop” is pressed. Digital recording is discrete: it consists of many recorded fragments (samples) that follow one another. The number of samples recorded per second is the sample rate, measured in hertz. A sampling rate of 44,100 Hz (the CD standard) means that the audio signal is measured 44,100 times per second.

The lower the sampling frequency, the narrower the frequency spectrum that is recorded, because a sampled signal can only carry frequencies up to half its sampling rate. The higher the sampling frequency of the source material, the higher the quality and the larger the file. When you talk on the phone, you hear only a narrow mid-range band, because the sample rate for phone calls is only 8,000 Hz. To convey the range of frequencies that the average person can hear and that home stereos reproduce, about 40,000 Hz is sufficient.

The difference in sound quality between 32 and 44.1 kHz is obvious, but the higher the sampling frequency goes, the less perceptible the difference between two rates becomes, until it cannot be heard at all. A higher sample rate describes sound more precisely, but it also describes frequencies that the human ear can no longer hear. Studio recording is nevertheless performed at higher sample rates, on the view that changes in the inaudible range can still affect audible frequencies. Since consumer equipment is primarily designed to reproduce sound at 44.1 kHz, the finished recording is re-encoded to that generally accepted standard.
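The link between sample rate and recorded spectrum is the half-rate limit from the sampling theorem. A minimal sketch, with an illustrative function name and the sample rates mentioned in this article:

```python
def highest_recordable_frequency_hz(sample_rate_hz):
    """Upper limit of the recorded spectrum: half the sampling rate."""
    return sample_rate_hz / 2


for rate in (8_000, 32_000, 44_100, 48_000):
    print(rate, "Hz sampling ->", highest_recordable_frequency_hz(rate), "Hz maximum")
```

At 8,000 Hz nothing above 4,000 Hz survives, which is why telephone speech sounds narrow, while 44,100 Hz reaches 22,050 Hz, just above the 20,000 Hz limit of human hearing.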

Benefits of “digital audio”


The digitized audio signal has the following advantages:


-the possibility of storage for an unlimited time without loss of the original quality,

-the ability to be played back repeatedly without loss of the original quality,

-the possibility of unlimited copying without loss of the original quality,

-simplicity and wide possibilities of processing by modern means,

-resistance to interference in signal transmission lines.

From CD to Super Audio CD and DVD Audio

A CD (Compact Disc) is a removable plastic disc from which information is read optically.

In 1979, Sony and Philips began joint development of the standard for digital audio recording on disc that became known as the Red Book.

Analog sound is digitized and recorded as a spiral track of alternating zeros and ones (microscopic pits and smooth surface) on a 12 cm polycarbonate disc, slightly thicker than a millimeter, coated with a very thin layer of gold (later aluminum).

The player’s laser illuminates the disc and detects binary “zeros” and “ones”, which, after processing, are converted back to sound. It is almost impossible to mistake zero for one. Possible problems associated with read errors and scratches on the disc surface were compensated for using digital error correction.

As a result, not only were the physical dimensions of the medium smaller than those of a vinyl record, but the playing time also increased significantly, up to 74 minutes (reportedly because the then head of Sony wanted his favorite work, Beethoven’s Ninth Symphony, to fit on one disc).

In 1982, mass production of compact discs (CDs) began in Langenhagen (Germany) with “An Alpine Symphony” by Richard Strauss.

Today

High-quality audio is now also released in the Super Audio CD and DVD-Audio formats, which offer:

DVD-type media,

multichannel recording (up to 5.1),

a sampling rate of up to 192 kHz,

a quantization depth of up to 24 bits (each additional bit doubles the precision with which the signal is represented; at this depth the dynamic range of the reproduced sound can exceed 130 dB).
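The dynamic-range figure can be estimated from the bit depth. The sketch below uses the common idealized formula, 20 · log10(2^bits), which gives the ratio between the largest value and one quantization step; real converters achieve somewhat less. The function name is illustrative.

```python
import math


def dynamic_range_db(bit_depth):
    """Theoretical dynamic range of linear PCM: full scale versus one quantization step."""
    return 20 * math.log10(2 ** bit_depth)


print(round(dynamic_range_db(16), 1))  # 96.3
print(round(dynamic_range_db(24), 1))  # 144.5
```

Each additional bit adds about 6 dB, so 24 bits comfortably clears the 130 dB mentioned above, while the 16 bits of a CD give roughly 96 dB.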

The new recording formats offer the highest quality, are expensive ($ 15 per disc), and are not popular because most listeners, sadly, don’t care too much about sound quality.

Digital audio options

The important parameters of the digital representation of sound are the sampling rate of the audio signal and the quantization bit depth.

The sampling rate indicates how many times per second the amplitude of the signal is measured for conversion to digital code.
For the CD standard it is 44.1 kHz (about 44 thousand times per second); the high-resolution formats go up to 192 kHz.

The quantization bit depth determines the number of signal levels, which is a power of 2.

The CD standard uses 16-bit quantization, which gives 65,536 levels (2 to the power of 16); 16-bit audio adapters work the same way. The high-resolution formats use up to 24 bits.

Digital audio storage

Digitized sound is a set of signal amplitude values taken at regular intervals, so it can be written to a file as a sequence of numbers (the amplitude values).

Two methods are widely used to encode audio information:

PCM (pulse code modulation)

ADPCM (adaptive differential pulse code modulation)

PCM (Pulse Code Modulation) is a method of digitally encoding a signal by recording the absolute values of the amplitudes. This is how data is recorded on all audio CDs.

ADPCM (Adaptive Differential PCM) records the signal as relative changes in amplitude (increments), which lets the data take up less memory.
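The idea of storing increments instead of absolute values can be shown with plain delta encoding. Note that this is a deliberately simplified stand-in: real ADPCM also adapts its step size and quantizes the differences, which is not done here. The sample values and function names are made up for the example.

```python
def delta_encode(samples):
    """Replace each sample with its difference from the previous one (first is kept as-is)."""
    previous = 0
    deltas = []
    for s in samples:
        deltas.append(s - previous)
        previous = s
    return deltas


def delta_decode(deltas):
    """Rebuild the original samples by accumulating the differences."""
    value = 0
    samples = []
    for d in deltas:
        value += d
        samples.append(value)
    return samples


pcm = [1000, 1012, 1030, 1041, 1035, 1020]  # hypothetical absolute amplitudes
deltas = delta_encode(pcm)
print(deltas)                       # [1000, 12, 18, 11, -6, -15]
print(delta_decode(deltas) == pcm)  # True
```

Because neighboring samples are usually close together, the differences are small numbers that need fewer bits than the absolute values.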

Lossless encoding compresses the data stream (by 20-50%) in such a way that the original data can be recovered completely.

Popular file formats that store audio without loss (normally as uncompressed PCM):

Windows Wave (WAV) is the primary audio file format for Windows.
The Audio Interchange File Format (AIFF) is the primary audio format for the Macintosh.

Lossy encoding makes the reconstructed signal sound similar to the original while achieving the highest possible data compression (10-15 times).

Lossy encoders are based on psychoacoustic models: the parts of the signal that the human ear cannot hear (masked components or inaudible frequencies) are identified and removed from the original signal.

Analog Audio and Digital Audio


A sound wave is a kind of complex function, the dependence of the amplitude of a sound wave on time.


The information carried by an acoustic wave is determined not by the parameters of the medium in which the elastic wave propagates but by the parameters of the oscillation (amplitude and frequency, tone and harmonics).

Any form of recording (mechanical, magnetic, optical, laser) is based on first converting the sound wave, by means of a microphone, into an alternating electrical current with the same oscillation parameters.

Analog sound is represented on the device as a continuous electrical signal.

Sound quality depends on the fidelity of the waveform, which is very difficult to maintain.

Until 1982, the world was consuming “canned music” only from analog media: vinyl records and magnetic tapes.

Good vinyl records, played on good equipment, offered excellent sound quality, which unfortunately deteriorated a little with each listening because of mechanical wear as the stylus moved along the groove and because of the dust that got everywhere.

Tape recorders required precision read heads and high tape speeds to reproduce sound smoothly. Over time, the tape demagnetized and its magnetic layer crumbled.

But the main disadvantage of analog audio recording is the inevitable loss of quality when copying.

The mystery of trigonometry

According to the theory of the mathematician Jean Baptiste Fourier, a sound wave can be represented as a spectrum of frequencies included in it.

The frequency components of the spectrum are sinusoidal oscillations (pure tones), each of which has its own amplitude and frequency.

According to Kotelnikov’s theorem (known in Western literature as the Nyquist-Shannon sampling theorem), any oscillation with a limited spectrum, even one of the most complex shape (for example, a human voice), can be recovered unambiguously and without loss from its discrete samples, provided they are taken at a frequency more than twice its maximum frequency.
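Why the sampling frequency has to exceed twice the maximum frequency can be seen from a small experiment. The sketch below is only an illustration under assumed numbers (an 8000 Hz sample rate and two test tones picked to keep the arithmetic simple); `sample_tone` is a hypothetical helper, not part of any library.

```python
import math


def sample_tone(freq_hz, sample_rate_hz, count):
    """Return `count` samples of a cosine tone taken at the given sample rate."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(count)]


sample_rate = 8000                               # Hz, assumed for the example
high_tone = sample_tone(5000, sample_rate, 16)   # above sample_rate / 2
low_tone = sample_tone(3000, sample_rate, 16)    # below sample_rate / 2

# The 5000 Hz tone is sampled too slowly, so its samples coincide
# with those of a 3000 Hz tone and the two cannot be told apart.
indistinguishable = all(
    math.isclose(a, b, abs_tol=1e-9) for a, b in zip(high_tone, low_tone)
)
print(indistinguishable)   # True
```

Once the samples are identical, no playback device can tell which of the two tones was recorded, so the higher one is lost.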

Vladimir Aleksandrovich Kotelnikov (1908-2005) – a prominent Soviet and Russian scientist in the field of radio engineering, radiocommunication and radio astronomy.

Observation. A signal of finite duration has an infinitely wide spectrum. Therefore, when a finite-duration signal is sampled, it cannot be recovered from the samples entirely without loss.

Digitization of audio information

Digitizing sound means measuring the amplitude of the signal at regular intervals and recording the measured amplitudes as rounded digital values.
In a computer this is done by the audio adapter (sound card), which is either built into the motherboard or installed separately.

Sound cards include an ADC (analog-to-digital converter), a synthesizer, a mixer, a DAC (digital-to-analog converter), amplifiers, a MIDI interface and a port for gaming devices.

To record digital sound, the ADC performs:

temporal sampling of the continuous signal (it measures the amplitude of the signal at the frequency needed to recreate its original shape, that is, at least twice the maximum frequency of the sound wave);

quantization of the measured signal values by level (this sets the number of fixed values (levels, gradations) that the amplitude of the signal can take);

signal coding (writing the values in the binary number system).

The reverse operation is performed by the DAC (digital to analog converter).
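The three ADC steps can be sketched in a few lines. This is a simplified model under stated assumptions, not a description of any real converter: the input is a Python function returning values between -1 and 1, and the 8000 Hz rate and 3-bit depth are chosen only to keep the output short. The name `digitize` is hypothetical.

```python
import math


def digitize(signal, duration_s, sample_rate_hz, bits):
    """Sample `signal(t)` (values in [-1, 1]) and return binary code words."""
    levels = 2 ** bits
    codes = []
    for n in range(round(duration_s * sample_rate_hz)):
        t = n / sample_rate_hz                              # 1. temporal sampling
        amplitude = signal(t)
        level = round((amplitude + 1) / 2 * (levels - 1))   # 2. quantization
        codes.append(format(level, f"0{bits}b"))            # 3. binary coding
    return codes


def tone(t):
    """A 1000 Hz sine wave used as the 'analog' input."""
    return math.sin(2 * math.pi * 1000 * t)


codes = digitize(tone, duration_s=0.001, sample_rate_hz=8000, bits=3)
print(codes)   # eight 3-bit code words, one per sample
```

With 3 bits there are only 8 possible levels, so the peak of the wave maps to `111` and the trough to `000`; a real 16-bit recording has 65,536 levels.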

Bitrate

Bit rate: literally, the rate at which bits of information are transmitted.

The bit rate is the effective rate at which information is transmitted through a channel (the rate of “useful information”, excluding service information), expressed in kilobits per second (kbps).

In lossy video and audio formats, the bit rate parameter expresses the degree of compression of the stream and thus determines the channel capacity for which the data stream is compressed.

Modes of data stream compression:

with constant bit rate (CBR): the required bit rate is set at the start and does not change throughout the file. This makes it possible to predict the final file size quite accurately, but it does not give an optimal size/quality ratio for music whose sound changes dynamically over time.

with variable bit rate (VBR): the codec changes the bit rate according to the desired quality level and the psychoacoustic model. This gives the best quality of the output file, but its size is unpredictable (it can vary by a factor of several).

with average bit rate (ABR): a hybrid of constant and variable bit rates. The user sets the bit rate in kbps and the program varies it within certain limits.
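The arithmetic behind these figures is simple enough to check by hand. The sketch below assumes uncompressed stereo PCM for the CD case and ignores headers and metadata when estimating file size; the function names are hypothetical.

```python
def pcm_bit_rate_bps(sample_rate_hz, bit_depth, channels):
    """Bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bit_depth * channels


def file_size_bytes(bit_rate_bps, duration_s):
    """Approximate size of a constant-bit-rate stream, without headers."""
    return bit_rate_bps * duration_s // 8


cd_bps = pcm_bit_rate_bps(44100, 16, 2)
print(cd_bps / 1000)                      # 1411.2 (kbps, the familiar CD figure)

# A 4-minute (240 s) track at 128 kbps CBR versus CD-quality PCM:
print(file_size_bytes(128_000, 240))      # 3840000 bytes, about 3.8 MB
print(file_size_bytes(cd_bps, 240))       # 42336000 bytes, about 42 MB
```

This predictability is exactly what CBR offers; with VBR the bit rate changes from moment to moment, so the size cannot be computed in advance this way.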

Digital audio

Digital audio is a representation of analog sound used by computers and various digital devices to record and reproduce audio information. Like the frames of a movie, a digital audio signal is made up of a series of sound fragments that are played back when we press the play button. There are many different digital audio formats, which differ in how accurately they convey the audio information.

About Pulse Code Modulation – PCM

When we talk about acoustic sound or an analog signal, we are always talking about the propagation of sound waves in space, whereas digital audio is only an approximate description of what happens to the sound, or should happen to it, within computer programs or digital devices.

This article will discuss pulse code modulation (PCM), the most common digital audio encoding system. Besides PCM, there are also the DTS and Dolby Digital systems, but these are mainly used in film and video production, and we will not talk about them today.

In pulse code modulation, the signal is read (sampled) many times per second. At each reading, the amplitude of the sound wave is recorded. As mentioned above, a digital signal is only an approximate copy of an analog signal: the value of each fragment is rounded to the nearest level the format can represent, so the analog wave is not stored with perfect precision. On playback, all the fragments are played in sequence and we hear a copy of the original analog sound.
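As a rough illustration of PCM in practice, the sketch below generates a sine tone, rounds each reading to a 16-bit integer and stores the result as a WAV stream using Python’s standard `wave` and `struct` modules. The tone frequency, duration and the use of an in-memory buffer instead of a file on disk are assumptions made for the example.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44100    # readings per second
BIT_DEPTH = 16         # bits per reading
FREQ_HZ = 440.0        # test tone, assumed for the example
DURATION_S = 0.5

max_amp = 2 ** (BIT_DEPTH - 1) - 1     # largest 16-bit signed value, 32767

frames = bytearray()
for n in range(int(SAMPLE_RATE * DURATION_S)):
    value = math.sin(2 * math.pi * FREQ_HZ * n / SAMPLE_RATE)
    # Round the amplitude to the nearest integer level and
    # pack it as a little-endian signed 16-bit number.
    frames += struct.pack("<h", round(value * max_amp))

buffer = io.BytesIO()                  # stands in for a .wav file on disk
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)                # mono
    wav.setsampwidth(BIT_DEPTH // 8)   # bytes per sample
    wav.setframerate(SAMPLE_RATE)
    wav.writeframes(bytes(frames))

print(len(frames))   # 44100 bytes: 22050 samples of 2 bytes each
```

Replacing the buffer with a file name such as `"tone.wav"` in `wave.open` should produce a file that ordinary players can open.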

“What values are we talking about?” – you ask. Just as analog audio is defined by frequency and amplitude, digital audio is determined by two important values: the sample rate and the bit depth. The sample rate is how many times per second the audio signal is read, and the bit depth determines how precisely the amplitude of each fragment is recorded, and with it the dynamic range.

Sampling rate

The standard 44.1 kHz sample rate used for recording audio to CDs (remember those?) might seem like a random number, but it is not. It follows from Kotelnikov’s theorem, which essentially states that the sampling frequency must be more than twice the highest frequency in the signal being sampled. The upper limit of human hearing is about 20 kHz, so the sampling frequency must be higher than 40 kHz. The additional 4.1 kHz leaves headroom for the filter that removes frequencies above half the sampling rate, which would otherwise cause the distortion known as aliasing. In theory, 44.1 kHz is sufficient to reproduce an audible signal accurately; however, higher sample rates are also in use.

For example, 48 kHz is the dominant standard in film and video production, where sound is synchronized to a frame rate of 24 frames per second. We won’t go into the details of why exactly 24 frames per second was chosen; in short, it is roughly the minimum rate at which motion looks smooth and pleasant to the eye. The sample rate has to fit this frame rate: at 48 kHz each frame corresponds to exactly 2,000 samples, whereas at 44.1 kHz it corresponds to 1,837.5, which makes it harder to keep picture and sound in sync.

Even higher sample rates are derived from these two base frequencies of 44.1 and 48 kHz by multiplying them by powers of 2. Thus 88.2, 96 and 192 kHz are standard sample rates for modern audio equipment.

Bit depth

The bit depth of an audio file tells us about its dynamic resolution or, more simply, its clarity. You can draw an analogy with digital photography: the higher the resolution of the photo, the clearer and better the image will be.

It is important to note here that we are not talking about the loudness of the signal, but about a more realistic, clean and clear sound, that is, a more accurate rendering of the audio signal.

Bit depth can be compared to the text in a book. The lower the bit depth, the less sense the text makes: letters begin to disappear from words and punctuation marks from sentences. At first we can still grasp the meaning of the text, but if the bit depth keeps decreasing, the information becomes so distorted that we simply stop understanding what it is about. The same goes for sound: the lower the bit depth, the more distorted the sound we hear.
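The effect of bit depth can also be put into numbers. The sketch below is an illustration under simple assumptions: amplitudes lie between -1 and 1, levels are evenly spaced, and the dynamic range is estimated with the common theoretical figure of 20·log10(2^bits), roughly 6 dB per bit. The helper names are hypothetical.

```python
import math


def dynamic_range_db(bits):
    """Theoretical ratio between the largest and smallest level, in decibels."""
    return 20 * math.log10(2 ** bits)


def quantize(value, bits):
    """Round a value in [-1, 1] to the nearest of 2**bits evenly spaced levels."""
    steps = 2 ** bits - 1
    level = round((value + 1) / 2 * steps)
    return level / steps * 2 - 1


def max_error(bits, points=1000):
    """Largest rounding error over one cycle of a sine wave."""
    worst = 0.0
    for n in range(points):
        v = math.sin(2 * math.pi * n / points)
        worst = max(worst, abs(v - quantize(v, bits)))
    return worst


for bits in (4, 8, 16):
    print(bits, round(dynamic_range_db(bits), 1), max_error(bits))
```

Each extra bit halves the rounding step, so the worst-case error shrinks quickly: it is the missing “letters” from the book analogy, measured as a number.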