How does encoding work in digital audio? Part 5



DSD offers significant advantages over PCM:


a more precise rendering of the waveform;
increased immunity to noise;
a simpler way to convert and transmit a digital stream.
In theory, DAC circuits could be simplified to reduce cost, but manufacturers are unlikely to do so because of backward compatibility.
Originally, SACDs used the DSD x64 format with a sample rate of 2822.4 kHz. The 44.1 kHz audio CD sample rate was taken as the basis and multiplied by 64, hence the name x64. The following DSD rates are currently in use:

x64 = 2822.4 kHz;
x128 = 5644.8 kHz;
x256 = 11,289.6 kHz;
x512 = 22,579.2 kHz;
x1024 = 45,158.4 kHz (announced).
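The rates in this list all follow from the same rule: multiply the 44.1 kHz CD sample rate by the DSD multiplier. A minimal sketch of that arithmetic (the function name `dsd_rate_khz` is made up for this illustration):

```python
CD_RATE_HZ = 44_100  # audio CD sample rate taken as the base

def dsd_rate_khz(multiplier):
    """Return the DSD bit rate in kHz for a given multiple of 44.1 kHz."""
    return CD_RATE_HZ * multiplier / 1000

for m in (64, 128, 256, 512, 1024):
    print(f"x{m} = {dsd_rate_khz(m):,.1f} kHz")
```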

DXD
There is an intermediate format between PCM and DSD called DXD (Digital eXtreme Definition). It is, in fact, high-resolution PCM: 352.8 kHz or 384 kHz with 24- or 32-bit quantization. It is used in studios for processing and subsequent mixing of material.

But this approach has drawbacks: first, it does not make use of all the advantages of DSD, and second, the file size is larger than with DSD. At the moment, flagship DACs accept a PCM data stream on the I2S input with a sample rate of up to 768 kHz and a bit depth of up to 32 bits. It’s scary to even consider how much hard drive space an album would take up at this resolution.
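To put rough numbers on the file-size comparison, here is a small sketch that computes raw, uncompressed stereo data rates. The 60-minute album length and the function names are assumptions chosen for illustration, and container overhead is ignored.

```python
def pcm_bytes_per_second(sample_rate_hz, bit_depth, channels=2):
    """Raw PCM data rate in bytes per second."""
    return sample_rate_hz * bit_depth * channels // 8

def dsd_bytes_per_second(multiplier, channels=2, base_rate_hz=44_100):
    """Raw one-bit DSD data rate in bytes per second."""
    return base_rate_hz * multiplier * channels // 8

ALBUM_SECONDS = 60 * 60  # assumption: a 60-minute album

for name, rate in [
    ("DSD x64", dsd_bytes_per_second(64)),
    ("DXD 352.8 kHz / 24 bit", pcm_bytes_per_second(352_800, 24)),
    ("PCM 768 kHz / 32 bit", pcm_bytes_per_second(768_000, 32)),
]:
    print(f"{name}: {rate * ALBUM_SECONDS / 1e9:.2f} GB per album")
```

Under these assumptions, DXD at 352.8 kHz and 24 bits needs three times the raw data rate of DSD x64.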

DSD has practically separated from SACD. The format is now often found packaged in files with the DSF and DFF extensions. Many turntables have been released with the ability to record in DSF and DFF, and lovers of good sound are increasingly digitizing vinyl records in DSD. But recording studios are reluctant to invest in unpopular formats, so they continue to churn out audio at the bare minimum: 44.1 kHz × 16 bits.

DSD connection and data transmission
To transmit a DSD stream, a three-pin connection is used:

DSD clock pin (DCLK): bit sync clock;
DSD Lch data input pin (DSDL): left channel data;
DSD Rch data input pin (DSDR): right channel data.

Compared with I2S, DSD data transmission is extremely simple. DCLK sets the bit clock rate, and the left and right channel data are transmitted serially over the DSDL and DSDR pins, respectively. There are no extra tricks here: recording and playback in DSD proceed bit by bit. This approach provides the closest approximation to the analog signal, and thanks to the high sample rate, the quantization noise is reduced and the reproduction precision is increased by an order of magnitude.

DoP
DoP is often used to carry DSD data streams, so it’s worth mentioning. DoP (DSD over PCM) is an open standard for transferring DSD data inside PCM frames. The standard was created to pass a stream through controllers and devices that do not support direct (native) DSD streaming.

The principle of operation is as follows: in a 24-bit PCM frame, the upper 8 bits carry a marker byte (alternating between 0x05 and 0xFA from frame to frame), which signals that DSD data is being transmitted. The remaining 16 bits are filled sequentially with DSD data bits.

To transmit DSD x64, a one-bit stream at 2822.4 kHz, a PCM sample rate of 176.4 kHz is required (176.4 × 16 = 2822.4 kHz). DSD x128 at 5644.8 kHz already requires a PCM sample rate of 352.8 kHz.
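The packing and the rate arithmetic can be sketched in a few lines. This is an illustration, not a reference implementation: the function names are hypothetical, a single channel is handled, and the marker bytes 0x05 and 0xFA (alternating from frame to frame) reflect my reading of the DoP open standard and should be treated as an assumption here.

```python
DOP_MARKERS = (0x05, 0xFA)  # assumed marker bytes, alternating per frame

def dop_pcm_rate_khz(dsd_rate_khz):
    """PCM carrier rate needed when each frame holds 16 DSD bits."""
    return dsd_rate_khz / 16

def pack_dop(dsd_bits):
    """Pack a one-channel list of DSD bits (0/1) into 24-bit DoP words."""
    if len(dsd_bits) % 16:
        raise ValueError("bit count must be a multiple of 16")
    words = []
    for i in range(0, len(dsd_bits), 16):
        payload = 0
        for bit in dsd_bits[i:i + 16]:
            payload = (payload << 1) | (bit & 1)  # oldest bit ends up highest
        marker = DOP_MARKERS[(i // 16) % 2]
        words.append((marker << 16) | payload)    # marker in the top 8 bits
    return words

print([hex(w) for w in pack_dop([1] * 16 + [0] * 16)])
print(dop_pcm_rate_khz(2822.4), dop_pcm_rate_khz(5644.8))
```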



How does encoding work in digital audio? Part 4


When playing PCM 44.1 kHz × 16 bits in a longer word, the unused low-order bits are filled with zeros and simply ignored, or, in the case of older multi-bit DACs, they can spill over into the next frame. The length of the “word” (WS) may also depend on the player through which the music is played, as well as on the driver for the playback device.


An alternative to PCM and I2S is to record the audio signal in DSD. This format was developed in parallel with PCM, although Kotelnikov’s theorem had some influence here too. To improve sound quality compared to CDDA, the emphasis was placed not on increasing the bit depth, as in the DVD-Audio format, but on increasing the sample rate.

DSD
DSD stands for Direct Stream Digital. It originated in the Sony and Philips labs, just like the other formats discussed in this article.

SACD
DSD first saw the light of day on Super Audio CDs in 2002.

At the time, SACD looked like a masterpiece of engineering: it applied a completely new method of recording and playback that was very close to analog. The implementation was simple and elegant.

The media was even equipped with copy protection, although piracy posed little threat even without it. Sony and Philips began producing “closed” devices intended exclusively for playback, with no possibility of copying discs. The manufacturers sold recording equipment to studios, but kept control over SACD production.

Who knows, perhaps SACD could have become as popular as the Audio CD if it weren’t for the cost of the playback devices. By unreasonably inflating player prices, Sony and Philips’ own management stifled the popularity of their format. The next mistake put an end to the sale of specialized devices: to promote the Sony PlayStation game console, Sony engineers added the ability to play SACDs on it. Hackers immediately cracked the console and began copying SACD discs into ISO images, which could be burned to a regular DVD and played on any competing player; others simply ripped the tracks to play on a computer.

The record labels share the blame: contrary to what music lovers expected, they did not take full advantage of the new high-resolution format. Instead of recording from the master tape in DSD, the studios took a digital PCM recording, remixed it, and processed it with everything at once: limiters, compressors, noise-shaping dithering, and various digital filters. The result was a sound so sterile and dry that even CD Audio could have sounded much better. Listeners’ trust in SACD was undermined, and with it their trust in new formats in general.

INFO
Unfortunately, this vicious practice continues with vinyl records to this day: studios press vinyl from a digital recording, even if they have the recording on master tape. So a modern vinyl record can easily contain 44.1 kHz × 16-bit audio.

DSD
What is DSD? It is a one-bit stream with a very high sample rate compared to PCM. DSD also uses a different type of modulation: PDM (pulse density modulation). Recording in this format is done by a one-bit analog-to-digital converter; such ADCs, based on sigma-delta modulation, are now used everywhere. The recording process looks like this: each new value of the wave amplitude is compared with the previous one. While the amplitude is rising, the ADC outputs a logical one; when it is falling, the output is a logical zero. There are no intermediate values.
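A one-bit converter can be sketched in software. The description above is closest to simple delta modulation; the sketch below instead uses a textbook first-order sigma-delta loop, in which the density of ones follows the input level (pulse density modulation). It is a simplified illustration with hypothetical function names, not the modulator used in real DSD hardware.

```python
def sigma_delta_1bit(samples):
    """First-order sigma-delta modulator: floats in [-1, 1] -> list of 0/1."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback          # accumulate the error
        bit = 1 if integrator >= 0 else 0   # one-bit quantizer
        feedback = 1.0 if bit else -1.0     # feed the decision back
        bits.append(bit)
    return bits

def pulse_density(bits):
    """Fraction of ones in the bit stream."""
    return sum(bits) / len(bits)

# A constant input of 0.5 should give roughly 75% ones, zero input about 50%.
print(pulse_density(sigma_delta_1bit([0.5] * 1000)))
print(pulse_density(sigma_delta_1bit([0.0] * 1000)))
```

Averaging (low-pass filtering) the bit stream recovers the input level, which is why a DSD decoder can be very simple in principle.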

How does encoding work in digital audio? Part 3


The structure of the digital audio path.


When music is played, something like the following happens: the player, using a codec implemented as a device or a program, decompresses a file in a specific format (FLAC, MP3, and others) or reads data from a CD, DVD-Audio, or SACD disc, producing a standard PCM data stream. This stream is then sent via USB, LAN, S/PDIF, PCI, etc. to an I2S converter. In turn, the converter turns the received data into so-called I2S frames (not to be confused with I2C!).

I2S
I2S is a serial bus for digital audio transmission. Today I2S is the standard for connecting a signal source (computer, player) to a digital-to-analog converter. The vast majority of DACs are connected through it, directly or indirectly. There are other digital audio transmission standards, but they are much less common.

The I2S bus can consist of three, four, or even five pins:

continuous serial clock (SCK): the bit sync clock (may be called BCK or BCLK);
word select (WS): the frame sync clock (may be called LRCK or FSYNC);
serial data (SD): the transmitted data signal (may be called DATA, SDOUT, or SDATA). As a general rule, data is transmitted from a transmitter to a receiver, but there are devices that can act as receiver and transmitter at the same time. In this case, another pin may be present;
serial data in (SDIN): on this pin, data moves in the receive direction rather than the transmit direction.
SD or SDOUT is used to connect a D/A converter to the I2S bus, and SDIN is used to connect an A/D converter.

In most cases there is another pin, the master clock (MCLK or MCK), which synchronizes the transmitter and receiver from the same clock to reduce the transmission error rate. For external MCLK synchronization, two clock generators are used, with frequencies of 22,579.2 kHz and 24,576 kHz. The first, 22,579.2 kHz, serves sample rates that are multiples of 44.1 kHz (88.2, 176.4, 352.8 kHz), and the second, 24,576 kHz, serves sample rates that are multiples of 48 kHz (96, 192, 384 kHz). There may also be generators at 45,158.4 kHz and 49,152 kHz; you’ve probably already noticed how much the world of digital sound likes to double everything.
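The choice between the two clock families can be expressed as a simple divisibility check. The sketch below is illustrative only: the function names are hypothetical, and the clock values are taken as exactly 512 × 44.1 kHz = 22,579.2 kHz and 512 × 48 kHz = 24,576 kHz.

```python
MCLK_44K1_FAMILY_HZ = 22_579_200  # 512 x 44.1 kHz
MCLK_48K_FAMILY_HZ = 24_576_000   # 512 x 48 kHz

def pick_mclk_hz(sample_rate_hz):
    """Choose the master clock whose family contains the sample rate."""
    if sample_rate_hz % 44_100 == 0:
        return MCLK_44K1_FAMILY_HZ
    if sample_rate_hz % 48_000 == 0:
        return MCLK_48K_FAMILY_HZ
    raise ValueError(f"unsupported sample rate: {sample_rate_hz} Hz")

def mclk_ratio(sample_rate_hz):
    """How many master clock cycles fit into one sample period."""
    return pick_mclk_hz(sample_rate_hz) // sample_rate_hz

for fs in (44_100, 96_000, 176_400, 384_000):
    print(fs, pick_mclk_hz(fs), mclk_ratio(fs))
```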

Frame or I2S frame
In I2S, three pins are mandatory: SCK, WS, and SD; all other pins are optional.

The SCK line carries the clock pulses to which the frames are synchronized.

The WS line marks the length of the “word” using logic levels. If WS is a logical one, right channel data is being transmitted; if it is zero, left channel data.

The data bits are transmitted over SD: the quantized amplitude values of the audio signal, the same 16, 24, or 32 bits. No checksums or service channels are provided on the I2S bus. If data is lost in transit, there is no way to get it back.

Expensive DACs often have external I2S connectors. Using such connectors and cables can degrade the sound, up to audible “artifacts” and stuttering, depending on the quality and length of the cable. After all, I2S is an in-circuit interface, and the length of the conductors from transmitter to receiver should tend to zero.

Let’s take a look at how a PCM data stream is transmitted over the I2S bus. For example, when transmitting PCM 44.1 kHz at 16 bits, the word length on the SD line is those sixteen bits, and the frame length is 32 bits (right + left). But most of the time, transmitters use a 24-bit word length.
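The frame layout just described can be modeled as a list of (WS, SD) pairs, one per SCK pulse. This is a simplified sketch with hypothetical function names: it sends each word most significant bit first and ignores the one-clock delay that real I2S places between a WS change and the first data bit.

```python
def i2s_frame(left, right, word_bits=16):
    """Return (WS, SD) pairs for one stereo frame; WS=0 left, WS=1 right."""
    frame = []
    mask = (1 << word_bits) - 1
    for ws, sample in ((0, left), (1, right)):
        word = sample & mask                       # two's complement word
        for i in range(word_bits - 1, -1, -1):     # MSB first
            frame.append((ws, (word >> i) & 1))
    return frame

def bit_clock_hz(sample_rate_hz, word_bits=16, channels=2):
    """SCK frequency needed to carry every bit of every frame."""
    return sample_rate_hz * word_bits * channels

print(len(i2s_frame(1, -1)))   # bits per frame at 16-bit words
print(bit_clock_hz(44_100))    # SCK for PCM 44.1 kHz x 16 bits
```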

How does encoding work in digital audio? Part 2


The 44.1 kHz sampling rate was derived from Kotelnikov’s theorem. It is believed that the average person cannot hear sound above 19-22 kHz. Most likely, 22 kHz was chosen as the upper limit.


22,000 × 2 + 100 = 44,100 Hz

Where do the extra 100 Hz come from? One version is that this is a small margin for errors or resampling. In fact, Sony chose this frequency for its compatibility with the PAL television standard.

The bit depth of the CDDA format is 16 bits, or 65,536 quantization levels, which corresponds to a dynamic range of approximately 96 dB. Such a large number of levels was not chosen by chance: first, because of the strong influence of quantization noise, and second, to provide a formal dynamic range superior to that of the main competitors at the time, cassette tapes and vinyl records. I’ll cover this in more detail in the section on digital-to-analog converters.
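The 96 dB figure follows from the number of quantization levels: the ratio between the largest and smallest representable amplitude is 2^bits, and in decibels that is 20·log10(2^bits). A quick check (the function name is chosen for this illustration):

```python
import math

def dynamic_range_db(bit_depth):
    """Formal dynamic range of linear PCM with the given bit depth."""
    return 20 * math.log10(2 ** bit_depth)

for bits in (16, 24, 32):
    print(f"{bits} bits: {2 ** bits:,} levels, {dynamic_range_db(bits):.1f} dB")
```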

Development of PCM continued on the principle of multiplying by two. Other sample rates appeared: first the 48 kHz sample rate was added, and then the rates derived from it: 96, 192, and 384 kHz. The 44.1 kHz rate was also doubled to 88.2, 176.4, and 352.8 kHz. Bit depth increased from 16 to 24 and then to 32 bits.

The next format after CDDA was DAT (Digital Audio Tape), which appeared in 1987. Its sample rate was 48 kHz; the bit depth did not change. And although the format failed, the 48 kHz sample rate took hold in recording studios, reportedly because it is convenient for digital processing.

In 1999, the DVD-Audio format was released, which made it possible to record on a disc six channels with a sampling frequency of 96 kHz and a 24-bit bit depth, or two channels (stereo) at 192 kHz and 24 bits.

That same year, the SACD (Super Audio CD) format was introduced, but the discs went into production only three years later. I’ll tell you more about this format in the DSD section.

These are the main formats that are considered the standard for digital audio recordings on media. Now let’s see how the data is transmitted on a digital audio path.

How does encoding work in digital audio?


Have you ever wondered how sound is reproduced on digital devices? How is a sound signal formed from a combination of ones and zeros?


I’m sure you have, since you started reading! But often even professionals have only a general idea of the modern audio path. In this article, you will learn how the different formats appeared, what a digital-to-analog converter is, what types of DACs exist, and what determines the quality of sound reproduction.

PCM
As you know, in digital audio almost every format, with rare exceptions, is recorded as a PCM (pulse code modulation) stream. FLAC, MP3, WAV, Audio CD, DVD-Audio, and other formats are just ways to pack, or “preserve”, the PCM stream.

How it all began
The theoretical foundations of digital sound transmission were developed at the dawn of the 20th century, when scientists tried to transmit an audio signal over a long distance, not by telephone but in a way that was rather strange for the time.

By dividing the sound wave into small parts, it could be sent to the receiver in some kind of mathematical representation. The receiver, in turn, could restore the original waveform and play back the recording. In addition, scientists were faced with the task of increasing the bandwidth of the “ether”.

In 1933, V. A. Kotelnikov published his theorem. In Western sources, it is called the Nyquist-Shannon theorem. Yes, Harry Nyquist was the first to raise this issue: in 1927 he calculated the minimum sampling frequency for transmitting a waveform, which was later named the “Nyquist frequency” after him, but Kotelnikov’s theorem was published 16 years before Claude Shannon’s 1949 formulation.

The essence of the theorem is simple: a continuous signal can be represented as an interpolation series of discrete samples, from which the signal can be reconstructed. To reconstruct the original signal, the sample rate must be at least twice the upper cutoff frequency of that signal.
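What happens when the condition is violated can be shown with a little arithmetic: a tone above half the sample rate "folds" back and appears as a lower frequency (aliasing). The sketch below assumes an ideal sampler with no anti-aliasing filter; the function names are made up for this illustration.

```python
def min_sample_rate_hz(max_signal_hz):
    """Lowest sample rate that satisfies the theorem for a given bandwidth."""
    return 2 * max_signal_hz

def alias_frequency_hz(signal_hz, sample_rate_hz):
    """Apparent frequency of a sampled tone after folding."""
    folded = signal_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

print(min_sample_rate_hz(22_000))          # bandwidth of 22 kHz
print(alias_frequency_hz(1_000, 44_100))   # below the limit: unchanged
print(alias_frequency_hz(30_000, 44_100))  # above the limit: folds down
```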

For many years the theorem was not in demand, until the advent of the digital age. It was then that it found a use. In particular, the theorem was useful in the development of the CDDA (Compact Disc Digital Audio) format, commonly known as Audio CD or Red Book. The format was released by engineers at Philips and Sony in 1980 and became the standard for audio CDs.

Format characteristics:

sampling frequency: 44.1 kHz;
bit depth: 16 bits.

Digital audio encoding


In fact, one or another digital form of representation of analog audio signals is already a coding method – a sequence of numbers that describes an analog audio signal is itself a digital code.


However, the encoding that we are going to talk about now is something else. Now let’s look at the methods of encoding digital audio signals.

A digitized audio signal “in its pure form” is a fairly accurate, but not the most compact, way of recording the original analog signal.

Judge for yourself. To capture complete information about the original analog signal in the 0-20 kHz range (the audible range), the analog signal must be sampled at a frequency of at least 40 kHz. Therefore, the CD-DA standard (the familiar standard for recording data on audio CDs) establishes the following encoding parameters: two channels or one channel in PCM format with a sampling frequency of 44.1 kHz and a 16-bit quantization depth. One hour of stereo music in this format takes up approximately 600 MB (60 minutes × 60 seconds × 2 channels × 44,100 samples per second × 2 bytes per sample ≈ 605 MB). Considering that an ordinary music lover’s collection may contain, say, 5,000 tracks with an average length of about 3 minutes each, the amount of storage required to keep it in its original digital form is very significant. Therefore, storing relatively large amounts of audio data while ensuring fairly good sound quality requires various “tricks” to compress the data.
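The arithmetic in this paragraph is easy to reproduce. The sketch below uses the same CD-DA parameters; the function name is chosen for this illustration, and "MB" is computed as 2^20 bytes, which is what gives the figure of about 605.

```python
def pcm_size_bytes(seconds, sample_rate_hz=44_100, bytes_per_sample=2, channels=2):
    """Size of uncompressed PCM audio of the given duration."""
    return seconds * channels * sample_rate_hz * bytes_per_sample

one_hour = pcm_size_bytes(60 * 60)
collection = pcm_size_bytes(5_000 * 3 * 60)  # 5,000 tracks of 3 minutes each

print(f"one hour:   {one_hour:,} bytes = {one_hour / 2**20:.1f} MB")
print(f"collection: {collection:,} bytes = {collection / 2**30:.1f} GB")
```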

In general, all existing methods for encoding audio information can be conditionally divided into only two types.

1. Lossless data compression (“lossless encoding”) is a method of encoding (compacting) digital audio information that allows one hundred percent recovery of the original data from the compressed stream (the term “original data” here means the original form of the digitized audio data). This method is used where the quality of the original audio data must be preserved completely. The lossless compression algorithms that exist today can reduce the volume of data by 20-50% while guaranteeing 100% recovery of the original digital material from the compressed data. The operating mechanisms of such encoders are similar to those of general-purpose archivers such as ZIP or RAR, but they are specially adapted to compress audio data. While lossless encoding is ideal in terms of preserving the quality of audio material, it cannot provide a high level of compression.

2. There is another, more modern way to compact data: so-called lossy data compression (“lossy encoding”). The purpose of this encoding is to achieve the highest possible compression ratio while keeping sound quality at an acceptable level. The idea behind lossy encoding rests on two simple considerations:

the original digital audio data is redundant: it contains a lot of information that is useless to the ear and can be removed, thereby increasing the compression ratio;
the requirements for the sound quality of audio material vary and depend on the specific purpose and area of use.
Lossy encoding is called “lossy” because it results in the loss of some of the audio information. The decoded signal, when reproduced, sounds similar to the original but is no longer identical to it. Most lossy coding methods rely on the psychoacoustic properties of the human auditory system, as well as on various tricks associated with resampling the signal. Typically, during compression the encoder analyzes the audio data to identify details of the sound that can be ignored. Masked frequencies and inaudible or barely audible details can be sacrificed for a higher compression ratio. Where only intelligibility matters (for example, in telephony, where frequencies above 4 kHz are not necessary), the audio information undergoes serious “simplification” during encoding, which, together with well-chosen “smart” quantizers and “greedy” data compression algorithms, makes a high compression ratio possible.

How sound is encoded


Sound is a wave that propagates, most often in air, water, or another medium, with continuously changing intensity and frequency.


A person can perceive sound waves (air vibrations) with the help of hearing in the form of sound, while distinguishing between volume and pitch.

The higher the intensity of the sound wave, the louder the sound; the higher the frequency of the wave, the higher the pitch.

We previously wrote in more detail about the human perception of sound, you can read it here.

Figure: dependence of the loudness and pitch of a sound on the intensity and frequency of the sound wave.

Hertz (denoted Hz) is the unit of measurement for the frequency of periodic processes (e.g., oscillations).
1 Hz means one cycle of such a process per second: 1 Hz = 1/s.

If we have 10 Hz, this means ten cycles of the process in one second.

The human ear can perceive sound at frequencies ranging from 20 vibrations per second (20 Hz, a low sound) to 20,000 vibrations per second (20 kHz, a high sound).

In addition, a person can perceive sound over a wide range of intensities, in which the maximum intensity is 10^14 times greater than the minimum (one hundred thousand billion times).

To measure the volume of sound, a special unit, the decibel (dB), is used.

A decrease or increase in sound volume by 10 dB corresponds to a decrease or increase in sound intensity by a factor of 10.

Sound volume in decibels

Characteristic sound                          Loudness, dB
Lower limit of human ear sensitivity          0
Rustling leaves                               10
Conversation                                  60
Horn                                          90
Jet engine                                    120
Pain threshold                                140
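The rule "10 dB is a factor of 10 in intensity" means that the intensity ratio is 10 raised to the power dB/10. A short sketch of the conversion in both directions (function names are chosen for this illustration):

```python
import math

def intensity_ratio(db):
    """Intensity ratio that corresponds to a level difference in dB."""
    return 10 ** (db / 10)

def level_db(ratio):
    """Level difference in dB that corresponds to an intensity ratio."""
    return 10 * math.log10(ratio)

# The pain threshold at 140 dB is 10^14 times the intensity at 0 dB.
for name, db in [("conversation", 60), ("jet engine", 120), ("pain threshold", 140)]:
    print(f"{name}: {db} dB -> intensity ratio {intensity_ratio(db):.0e}")
```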

Time sampling of audio

In order for computer systems to process sound, a continuous audio signal must be converted into discrete digital form by sampling it in time.

To do this, the continuous sound wave is divided into separate small time sections, and a specific sound intensity value is set for each section.

As a result, the continuous dependence of loudness on time, A(t), is replaced by a discrete sequence of loudness levels. On a graph, this looks like replacing a smooth curve with a sequence of “steps.”

Figure: time sampling of audio.

A microphone connected to the sound card is used to record analog audio and convert it to digital format.

The more densely the discrete strips are packed on the graph, the more faithfully the original sound can ultimately be recreated.

The resulting digital sound quality depends on the number of sound volume level measurements per unit time, that is, the sampling frequency.

Audio sample rate is the number of audio volume measurements in one second.

The more measurements that are made in one second (the higher the sampling frequency), the more accurately the “ladder” of the digital audio signal repeats the curve of the analog signal.

Each “step” of the graph is assigned a specific sound volume level. Loudness levels can be thought of as a set of N possible states (gradations), and encoding them requires a certain amount of information I, which is called the audio encoding depth.

Audio encoding depth is the amount of information required to encode the discrete volume levels of digital audio.

If the encoding depth is known, the number of digital audio volume levels can be calculated by the general formula N = 2^I.

For example, let the audio encoding depth be 16 bits; in this case the number of audio volume levels is:

N = 2^I = 2^16 = 65,536.

During encoding, each sound volume level is assigned its own 16-bit binary code: the lowest level corresponds to the code 0000000000000000, and the highest to 1111111111111111.
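Sampling in time and assigning each sample one of N = 2^I codes can be sketched as follows. This is an illustration under simple assumptions: the signal is a function of time that returns values between -1.0 and 1.0, the codes are unsigned (0 for the lowest level, as in the text), and the 440 Hz test tone and all names are made up for the example.

```python
import math

def sample_and_quantize(signal, duration_s, sample_rate_hz, bit_depth):
    """Sample signal(t) and map each value in [-1, 1] to a code 0..2^I - 1."""
    levels = 2 ** bit_depth
    count = round(duration_s * sample_rate_hz)
    codes = []
    for k in range(count):
        t = k / sample_rate_hz                  # time of the k-th measurement
        amplitude = max(-1.0, min(1.0, signal(t)))
        codes.append(round((amplitude + 1) / 2 * (levels - 1)))
    return codes

def tone(t):
    """Hypothetical 440 Hz test tone."""
    return math.sin(2 * math.pi * 440 * t)

codes = sample_and_quantize(tone, 0.01, 8_000, 16)
print(len(codes), [format(c, "016b") for c in codes[:3]])
```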

Digitized audio quality

Therefore, the higher the sample rate and the audio encoding depth, the better the digitized sound will be and the closer it can be brought to the original.

The lowest quality of digitized sound, corresponding to the quality of telephone communication, is obtained at a sampling rate of 8,000 times per second, an encoding depth of 8 bits, and a single recorded audio track (“mono” mode).
But it should be remembered that telephony uses devices such as speech synthesizers and speech coders to improve this sound. For more about speech coders, see also the related article.

Digital audio encoding


To represent sound vibrations in digital form, the amplitude of the sound signal is measured at specific moments in time.


Since a sound waveform is inherently continuous, representing it exactly in digital form would require measuring the amplitude an infinite number of times per second and dividing the amplitude scale into an infinite number of gradations. In reality, the number of measurements per second (the sample rate) typically ranges from 10,000 to 96,000. Currently, the most common sample rates are 44,100 Hz (the standard for CD audio) and 48,000 Hz (the main standard for DAT). The number of amplitude gradations (the resolution) is generally 2^8, 2^16, or 2^24, depending on the number of bits allocated for this information.

Of course, distortion is unavoidable when sampling a continuous signal. The lower the sample rate and/or resolution, the closer the output waveform will be to rectangular. This gives rise to high-frequency distortion, which is partially suppressed by filters installed at the DAC output.

Digitized audio requires a large amount of memory. At the standard 44,100 Hz sample rate and 16-bit resolution, one minute of stereo audio material occupies 10,584,000 bytes (approximately 10.09 MB). Moreover, sound files compress very poorly with standard archive programs (zip, arj, etc.), so there are special compression algorithms for them. For example, a WAV file compressed with ADPCM takes about four times less space. However, distortion may occur, so it is better not to use such audio compression algorithms in professional work.

What is digital audio?


In fact, there can be several types of “digital sound”, more precisely, the types of its representation on a computer.


The now familiar “digitized sound” is analogous to a photograph: an exact digital copy of sounds captured from outside. It can be a microphone recording of your voice, a copy of audio tracks from a CD, or other sources. Like a photograph, this sound takes up a lot of space … although a photograph’s appetite is negligible compared to that of sound! One minute of digital audio recorded at the highest quality requires approximately 10 megabytes. It is true that there are special compression methods that reduce the volume of computer sound tenfold. But more on that later.

Besides “digital” sound, there is also “synthesized” sound, or more precisely, music in MIDI format. You are probably familiar with synthesizers. Briefly, the essence of MIDI technology is this: the computer does not simply play back the melody you need, it synthesizes it using the sound card. MIDI melodies are just sets of commands that control the sound card: codes of the notes it should play (indicating the instrument, the duration, and some other parameters of each note). This technology is ideal for computer composers, as it lets you easily change any parameter of a melody created on the computer: replace instruments, add or remove them, change the tempo and even the style of the song. And MIDI files are small, only a few tens of kilobytes. But MIDI has drawbacks too: you can’t record a voice in a MIDI file, and the music sounds good only on a very high-quality sound card. Move the file you created to a neighbor’s computer equipped with a $10 card, and you will wonder for a long time where all the charm and beauty of the melody went. It is true that MIDI can be converted to digital audio relatively easily; the reverse conversion, unfortunately, is impossible at the current level of computer technology.

Finally, there is a third type of sound you can work with at home: “tracker” or “sampler” technology, a kind of love child of digital and synthesized sound. When you work with programs of this type, you “build” a musical composition from small, periodically repeated “pieces” of digital or synthesized sound: loops or samples. It is on this principle that compositions are created in the currently popular styles of “house”, “trance”, “techno” …

In short, all simple (not to say primitive) rhythmic dance music. This type of music, a cross between digital and synthesized, is called “tracker” music and has a limited but loyal audience of fans.

What is digital audio?


Today we hear everywhere: high-quality digital sound, digital photography, digital video.


What does this buzzword, “digital”, mean? The key lies in modern methods of recording, processing, and storing a wide variety of information, which appeared together with personal computers. The first PCs were designed only for calculations, but it later turned out that they can work with text, images, sound, and video. You just need to translate everything into the computer’s language.

Let’s take a look at how you can record and play sound with a PC. First, the sound vibrations are converted to an alternating voltage by a microphone. This voltage is fed into the input of a special device, the sound card. The computer cannot register an arbitrary voltage directly. Like any digital device, it can only record one of two levels: “there is voltage” (a logical one) or “there is no voltage” (a logical zero).

It is in the form of combinations of logical zeros and ones that the PC records numbers, letters, words, or formulas. Clearly, recording a large amount of information requires many memory cells, because only one binary digit can be written in a cell: 1 or 0. To write a digit or letter, 8 memory cells are needed. The number 3 is written as 00000011, the number 5 as 00000101, the letter k as 01101011, and so on.
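These eight-cell patterns are easy to check. The sketch below assumes the letter is stored using its ASCII code, in which the lowercase letter k has the code 107; the helper name `to_byte` is made up for this illustration.

```python
def to_byte(value):
    """Format an integer from 0 to 255 as eight binary digits."""
    if not 0 <= value <= 255:
        raise ValueError("value does not fit into 8 bits")
    return format(value, "08b")

print(to_byte(3))         # the number 3
print(to_byte(5))         # the number 5
print(to_byte(ord("k")))  # the letter k, assuming ASCII
```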

How to record sound?
Very simple! The alternating voltage that reaches the sound card is measured many times, and the results are carefully recorded by the PC in memory. The computer measures the voltage approximately 44,000 times per second and stores each value in memory. This is similar to how students keep a weather calendar: every day, at the same time, they record the readings of a thermometer and a barometer. The PC also records voltage values, but it does so much more frequently. How does it manage? Easily! Modern computers can perform more than a billion simple operations per second, so the 44,000 or even 96,000 measurements per second required to record high-quality audio are not a problem. At the same time, the PC has a lot of other work to do: drawing on the screen, writing the measurement results to disk, keeping track of which key you pressed and where the mouse moved, measuring new voltage values, etc. Even though one voltage measurement consists of several dozen simple operations, the speed of modern processors is sufficient for it.

Large amounts of memory are required to store digital audio. One second of sound takes up the same space as 88,000 letters! This is how sound is recorded on a CD: as a long series of voltage measurements. Compare: in text format, a CD can hold a small library of 4-5 thousand books of several hundred pages each, or … 76 minutes of quality music.

Modern computers have learned to “cheat.” They record very quiet sounds with less precision, since the ear will not hear them clearly anyway. Sounds that are masked by loud sounds are also digitized less precisely. Why record in detail how softly the violin plays when the drum is struck hard? In this way, the amount of memory occupied by sound can be reduced tenfold. This (and not only this) is what is done in the popular MP3 audio format, which is common on the Internet and in portable MP3 players, and in ATRAC, which is used in MiniDisc players.

How do I play the sound?
How is digital sound recreated? Even more easily than it is recorded! In math lessons you probably had to plot the graph of a function point by point, and in physics lab work you had to draw a graph based on measurements. During playback, the PC reads the voltage values from memory at regular intervals and, using the sound card, restores almost the same alternating voltage that was digitized.

These methods of recording and reproducing sound are used not only by computers, but also by various CD, MD and MP3 players, which, in fact, are also microcomputers, albeit without the usual keyboards, mice and monitors.

It is convenient not only to record and store digital sound, but also to transmit it remotely. The convenience lies in conserving airtime and battery life. During a conversation on a mobile phone, the voice is converted into digital form and memorized. When, say, 1/5 of a second of sound has accumulated, the phone’s transmitter turns on and the sound is transmitted for 1/100 of a second.