The benefits of digital audio




The basics of “numbers”


Every multimedia device on sale today, be it a CD player, a voice recorder or a flash memory player, uses one of many different ways of representing the data streams that are then converted into sound. Even more sound formats have been invented for professional use. An inexperienced buyer is forced to piece together the meaning of the designations on boxes and devices from a variety of sources, and often ends up with incorrect information or even more confusion.

Almost all devices in the “Portable Audio” section of the ZOOM.CNews.ru catalog support multiple sound formats at the same time, and many devices that do not belong in this category are also tagged with support for playing sound files. To help our reader, we decided to create a short glossary of abbreviations and talk about the most common formats. We plan to leave it open for updates and modifications, adding new formats and describing in more detail the advantages and disadvantages of the already common or forgotten ones.

A little theory

To begin with, remember that digital sound is nothing more than a collection of numbers. What matters is the system by which sound, a variation in air pressure, is converted into data streams and encoded for further processing and reproduction. Digital sound is usually stored in computer files with various extensions, which usually (but not always) indicate the format. The term “format” itself can, paradoxically, have two meanings. First, it can be a general characteristic that covers the type and physical properties of the medium (disc or cassette), the recording method, the encoding principles and the error protection. Second, it can mean only the method of encoding and compressing the sound, when standard general-purpose means, for example a computer, are used to store and transfer it.

Analog sound, unlike digital, is reproduced on analog devices and differs from it in several important ways. It is not a data stream but a continuous electrical signal that follows the changes in the sound wave. To translate it into digital form, the sound is “digitized”: it is divided into short segments, and the numerical value of the amplitude at each moment is recorded. We will not delve into the principles of digital sound creation, but it must be noted that the more often the sound is sampled and its characteristics described, the clearer and more complete the resulting sound image.

This process generates an enormous flow of data describing the sound, and every digital audio format is therefore a compromise between the need to represent the sound as accurately as possible and the limited memory of the computer or playback device.

A little more theory. In most cases, the human ear perceives sound with a frequency no higher than about 22,000 Hz, and describing it fully in digital form requires a sampling frequency of at least twice that, 44.1 kHz. Since the value of the signal cannot be recorded with unlimited precision, quantization takes place during digitization: the actual values of the signal are replaced by approximate ones. The more quantization levels there are, the more accurately the signal level is described. As a result, every standard CD carries an audio signal with a sampling frequency of 44.1 kHz and 16-bit quantization.
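To get a feel for the data flow these parameters produce, the raw bit rate can be worked out directly. The sketch below uses the 44.1 kHz and 16-bit figures from the text; the two-channel (stereo) assumption and the function name are illustrative choices, not part of the original article.

```python
# Parameters of CD audio as stated in the text above.
SAMPLE_RATE_HZ = 44_100   # samples per second, per channel
BITS_PER_SAMPLE = 16      # quantization depth
CHANNELS = 2              # stereo (assumed here)


def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Return the raw (uncompressed) PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


cd_bit_rate = pcm_bit_rate(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
cd_bytes_per_minute = cd_bit_rate // 8 * 60

print(cd_bit_rate)          # 1411200 bits per second
print(cd_bytes_per_minute)  # 10584000 bytes per minute
```

Roughly ten megabytes per minute of stereo sound is the figure that every compressed format tries to reduce.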



Is the digital signal distorted during transmission and storage?


Since any digital signal is carried by a real electrical voltage or current curve, its shape is distorted in one way or another during any transmission, and a signal “frozen” for storage (a signalogram) degrades over time for ordinary physical reasons.


All of these influences on the shape of the carrier signal are interference that, up to a certain level, does not change the information content of the signal, just as individual distortions and lost letters in words generally do not prevent the words from being understood correctly, and information redundancy, such as longer words, increases the probability of successful recognition. … In other words, the carrier signal itself can be distorted, but the information it carries, the encoded audio signal, remains unchanged in the vast majority of cases.

So that the quality of the carrier signal does not deteriorate, any transmission of useful audio information (copying, writing to a medium and reading it back) must include restoring the shape of the carrier signal and, ideally, the original digital form of the information signal; only then can the newly generated carrier signal be passed on to the next consumer. In the case of direct copying without restoration (for example, simply dubbing a video cassette that holds a digital signal produced by a PCM decoder on ordinary VCRs), the quality of the digital signal deteriorates, although it still contains all the information it carries. After repeated sequential copies or long-term storage, however, the quality deteriorates so much that unrecoverable errors begin to appear and irreversibly distort the information carried by the signal. Therefore, digital signals should be copied and transmitted only on digital devices and, when stored on media, should be “refreshed” in good time, without waiting for irreversible degradation (for magnetic media, this period is estimated at several years). A correctly transmitted or refreshed digital signalogram does not lose quality and can be copied and kept indefinitely in completely unaltered form.

However, it should not be forgotten that the correcting capacity of any code is finite and real media are far from ideal, so unrecoverable errors are not such a rare thing, especially when the medium is handled carelessly. When new and correctly stored DAT cassettes or CDs are read on high-quality, reliable devices, these errors practically do not occur, but as media and reading systems age, get dirty and suffer damage, they become more frequent. A single uncorrected error is almost always inaudible thanks to interpolation, but it still distorts the original sound signal, and the accumulation of such errors eventually becomes audible.

A separate problem is the difficulty of registering uncorrected errors and of verifying that a copy is identical to the original. Very often, designers of real-time digital audio devices do not concern themselves with accurately verifying the reliability of the transmission and consider the error-correction measures sufficient. In the general case, an erroneous sample or block cannot be retransmitted, so interpolation happens silently, and after copying it is impossible to say with certainty whether the original signal was copied exactly. Error indicators, found on some devices, usually light up only at the moment an error appears, and with single errors their operation can easily go unnoticed. Even in personal computer-based systems, it is often impossible to check the accuracy of reception through a digital interface or of direct reading from a CD; the only way out is to repeat the operation and compare the results.
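On a computer, "repeat the operation and compare the results" can be approximated by reading the same material twice into two files and comparing their checksums. The following is a minimal sketch under that assumption; the function names and the choice of SHA-256 are illustrative and are not prescribed by the text.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(first: Path, second: Path) -> bool:
    """True if two reads of the same material are bit-identical."""
    return file_digest(first) == file_digest(second)
```

Two matching reads do not prove that both are free of errors, only that the same data was obtained both times, which is the most this kind of check can offer.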

What formats are used to represent digital audio?


The term “format” is used in two different senses.


When a specialized medium or recording method and special read/write devices are used, the concept of format covers the physical characteristics of the sound carrier (the dimensions of the cassette with magnetic tape or of the disc, the tape or disc itself) as well as the recording method, signal parameters, encoding and error protection principles, and so on. When a general-purpose information medium is used, for example a computer floppy disk or hard disk, the format means only the method of encoding the digital signal, the arrangement of bits and words, and the structure of the service information; the entire “low-level” part directly related to working with the medium remains under the control of the computer and its operating system.

Of the specialized digital audio formats and media, the following are the best known today:

CD (Compact Disc) is a single-sided optical disc, 120 mm or 80 mm in diameter, read by a laser and holding a maximum of 74 minutes of stereo sound at a 44.1 kHz sampling rate with 16-bit linear quantization. The system was proposed by Sony and Philips and is called CD-DA (Compact Disc – Digital Audio). For error protection, it uses the Cross-Interleaved Reed-Solomon Code (CIRC) together with 8-14 channel modulation (Eight-to-Fourteen Modulation, EFM). A distinction is made between stamped compact discs (CD), write-once discs (CD-R) and rewritable discs (CD-RW).
PCM decoder (PCM deck) is a system for converting a digital audio signal into a pseudo-video signal compatible with popular video formats (NTSC, PAL/SECAM) and back. PCM decoders are used in combination with home (VHS) or studio (S-VHS, Beta, U-Matic) VCRs, which serve as the read/write devices. They operate with 16-bit linear quantization at sample rates of 44.056 kHz (NTSC) and 44.1 kHz (PAL/SECAM) and can record a two- or four-channel digital signal. In effect, such a decoder is a modem (modulator-demodulator) for a video signal.
S-DAT (Stationary Head Digital Audio Tape) is a system similar to a conventional cassette recorder, in which recording and reading are performed by a block of fixed thin-film heads on a 3.81 mm wide tape in a double-sided cassette measuring 86 x 55.5 x 9.5 mm. It implements two- or four-channel 16-bit recording at 32, 44.1 and 48 kHz.
R-DAT (Rotating Head Digital Audio Tape) is a VCR-like system that records slanted tracks with a rotating head. It is the most popular tape-based digital recording format, and R-DAT systems are often referred to simply as DAT. R-DAT uses a 73 x 54 x 10.5 mm cassette with 3.81 mm wide tape, and the cassette and tape transport closely resemble those of a typical VCR. The basic tape speed is 8.15 mm/s, and the head drum rotates at 2000 rpm. R-DAT operates with a two-channel signal (four channels on some models) at sample rates of 44.1 and 48 kHz with 16-bit linear quantization and 32 kHz with 12-bit non-linear quantization. To guard against errors, a double Reed-Solomon code and 8-10 modulation are used. Cassette capacity is 80 to 240 minutes, depending on the speed and tape length. Consumer DAT recorders are usually equipped with a protection system against illegal copying of recordings, which does not allow recording from the analog input at 44.1 kHz or direct digital copying when SCMS (Serial Copy Management System) prohibition codes are present. Studio tape recorders have no such restrictions.
DASH (Digital Audio Stationary Head) is a system for recording on 6.3 and 12.7 mm wide magnetic tape with fixed heads. The tape speed is 19.05, 38.1 or 76.2 cm/s. It implements 16-bit recording of 2 to 48 channels at sample rates of 44.056, 44.1 and 48 kHz.
ADAT (Alesis DAT) is a proprietary system for recording eight-channel audio on S-VHS videotape, developed by Alesis. It uses 16-bit linear quantization at 48 kHz, and the cassette capacity is up to 60 minutes per channel. ADAT recorders can be cascaded to assemble a synchronous recording system with up to 128 channels.

What are the pros and cons of digital audio?


The digital representation of sound is valuable, first of all, because it can be stored and reproduced indefinitely without loss of quality; however, conversion from analog to digital and back inevitably loses some of that quality.


The most unpleasant distortion introduced at the digitizing stage is granular noise, which arises when the signal is quantized by level and the amplitude is rounded to the nearest discrete value. Unlike the simple broadband noise introduced by quantization errors, granular noise is a harmonic distortion of the signal and is most noticeable in the upper part of the spectrum.

The power of the granular noise is inversely proportional to the number of quantization steps. However, because hearing has a logarithmic characteristic while linear quantization uses a constant step size, quiet sounds are described by fewer quantization steps than loud ones, and as a result most of the non-linear distortion falls in the region of quiet sounds. This limits the dynamic range, which ideally (without taking harmonic distortion into account) would equal the signal-to-noise ratio; the need to limit this distortion reduces the dynamic range for 16-bit encoding to 50-60 dB. Logarithmic quantization could have saved the situation, but implementing it in real time is very difficult and expensive.
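For comparison with the 50-60 dB practical figure, the ideal signal-to-noise ratio of a linear quantizer can be estimated with the common textbook approximation of about 6.02 dB per bit plus 1.76 dB. That formula is not given in the article; it assumes a full-scale sine wave and an ideal quantizer, so the sketch below is only a reference point.

```python
import math


def ideal_quantization_snr_db(bits: int) -> float:
    """Theoretical SNR of an ideal linear quantizer for a full-scale sine wave."""
    levels = 2 ** bits
    # 20*log10(levels) is about 6.02 dB per bit; 10*log10(1.5) is about 1.76 dB.
    return 20 * math.log10(levels) + 10 * math.log10(1.5)


for bits in (8, 12, 16):
    print(bits, "bits:", round(ideal_quantization_snr_db(bits), 1), "dB")
```

The gap between this ideal value for 16 bits and the usable range quoted above reflects the distortion of quiet sounds that the paragraph describes.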

The distortion introduced by granular noise can be reduced by adding ordinary white noise (a random or pseudo-random signal) with an amplitude of half the least significant bit to the signal; this operation is called dithering. It slightly raises the noise level but weakens the correlation between the quantization errors and the high-frequency components of the signal and improves subjective perception. Dithering is also applied before samples are rounded when their bit depth is reduced. Essentially, dithering and noise shaping are special cases of the same technique, the difference being that the first uses white noise with a flat spectrum and the second uses noise with a specially “shaped” spectrum.
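The effect of dithering is easiest to see on a signal quieter than one quantization step. The sketch below is a toy model, not a production dither: it assumes uniformly distributed noise of plus or minus half an LSB and a constant input level, and all names are illustrative.

```python
import math
import random


def quantize(value_in_lsb: float) -> int:
    """Round to the nearest integer step (half-up), i.e. plain quantization."""
    return math.floor(value_in_lsb + 0.5)


def quantize_with_dither(value_in_lsb: float, rng: random.Random) -> int:
    """Add uniform noise of +/- half an LSB before rounding (assumed dither shape)."""
    return quantize(value_in_lsb + rng.uniform(-0.5, 0.5))


rng = random.Random(1)   # fixed seed so the run is repeatable
quiet_level = 0.3        # a constant signal smaller than half of one step
count = 100_000

plain = [quantize(quiet_level) for _ in range(count)]
dithered = [quantize_with_dither(quiet_level, rng) for _ in range(count)]

plain_mean = sum(plain) / count        # every sample rounds to 0: the signal is lost
dithered_mean = sum(dithered) / count  # the level survives as an average near 0.3
print(plain_mean, round(dithered_mean, 2))
```

Without dither the quiet signal disappears entirely; with dither each individual sample is noisier, but the average still carries the original level.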

When restoring audio from digital to analog, there is the problem of smoothing the stepped waveform and suppressing the harmonics introduced by the sample rate. Due to the imperfection of the frequency response of the filters, insufficient suppression of this interference or excessive attenuation of useful high-frequency components may occur. Poorly suppressed sample rate harmonics distort the shape of the analog signal (especially in the high frequency region), resulting in a “rough” and “dirty” sound.

Digital audio formats: how to choose the best one (Part 2)


The higher the bit rate, the better the sound quality. For example, at a bit rate of 128 kilobits per second, five minutes of music will require only about five megabytes on a hard drive or flash drive. The optimal bit rate for storing MP3 music files is believed to be 256 or 320 kilobits per second.
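The five-megabyte figure follows directly from the bit rate. A minimal sketch of the arithmetic, assuming a constant bit rate and ignoring tags and container overhead (the function name is illustrative):

```python
def compressed_size_bytes(bit_rate_kbps: int, duration_seconds: int) -> int:
    """Approximate size of a constant-bit-rate stream, ignoring tags and headers."""
    return bit_rate_kbps * 1000 * duration_seconds // 8


five_minutes = 5 * 60
for rate in (128, 256, 320):
    size_mb = compressed_size_bytes(rate, five_minutes) / 1_000_000
    print(rate, "kbit/s ->", size_mb, "MB")
```

At 128 kbit/s five minutes come to 4.8 MB, and at 320 kbit/s to 12 MB, so the higher-quality settings cost two to two and a half times the space.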


Another popular lossy compression format is OGG Vorbis. Unlike MP3, it was originally free and open source, so it quickly gained popularity among independent developers. In terms of quality, it is in no way inferior to MP3, although it does use its own psychoacoustic model for file compression.

WMA is a lossy audio compression format developed by Microsoft Corporation. It can be found on any Windows operating system, but it is not very popular with users. Another relatively common lossy audio compression codec is AAC, which differs from MP3 in slightly less quality loss at the same bit rate.

Audio codecs for music lovers
Newer formats provide lossless audio compression. The most popular among users is the free FLAC format, introduced in 2001. FLAC is perfect for archiving your audio collection, as well as for listening to music on high-quality sound reproduction equipment.

In so-called lossless codecs, encoded data can always be retrieved with bit precision. The encoding is carried out using a mathematical scheme: a certain regularity is found in the initial data and, taking this regularity into account, a second sequence is generated, which fully describes the original.
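The idea of exploiting a regularity and still recovering every bit can be shown with the simplest possible scheme: storing the difference between neighbouring samples instead of the samples themselves. This is a deliberately simplified illustration and not the actual algorithm of FLAC or any other codec named here; the sample values are made up.

```python
def encode_deltas(samples: list[int]) -> list[int]:
    """Store the first sample, then only the change from each sample to the next."""
    if not samples:
        return []
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]


def decode_deltas(deltas: list[int]) -> list[int]:
    """Rebuild the original samples by accumulating the stored changes."""
    samples: list[int] = []
    current = 0
    for delta in deltas:
        current += delta
        samples.append(current)
    return samples


original = [1000, 1004, 1009, 1013, 1016, 1018, 1019, 1019, 1018]
deltas = encode_deltas(original)
print(deltas)
assert decode_deltas(deltas) == original  # bit-exact round trip
```

Because neighbouring audio samples are usually close in value, the differences are small numbers that need fewer bits to store, while decoding restores the original sequence exactly.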

The second most popular lossless compression format is Monkey’s Audio, which is distributed as free software for Microsoft Windows. The WavPack format has support for multi-channel streaming and a slightly better compression ratio. Apple introduced its own lossless ALAC codec in 2004, which resembles FLAC.

Digital audio has huge advantages over analog recordings. The user can store and replicate material for an unlimited time without losing the original quality. Storing audio digitally is also more cost-effective, because it takes up much less physical space than a collection of records or cassettes.
Even a powerful ZIP archiver can compress a WAV file by only 10-20%, while FLAC achieves compression of 30-50% for most audio files. In addition, the audio codec allows partially corrupted data to be recovered, and decoding places very little load on the processor.

To archive a music collection, it is now best to use a lossless compression format such as FLAC, which is supported by most players. For audiobooks, however, where high fidelity to the original sound is not required, the more compact MP3 or OGG will do.

Digital audio formats: how to choose the best one


Most users store music and other audio files in various digital formats. There are about a hundred digital audio encoding algorithms, and each has its own characteristics. Which format should you choose for a home audio collection, and why is the well-known MP3 losing popularity?


Analog audio is a wave. Almost every process in our world can be described using mathematics, and digital audio is the description of an analog waveform by a sequence of numbers. For example, more than 44,000 digital values are used to digitize one second of music on a CD.
How digital sound was born
The theoretical foundations of digital sound were laid in 1928 by Harry Nyquist in his paper “Certain Topics in Telegraph Transmission Theory”, which for the first time determined the “width” of a communication line needed to transmit signal pulses without distortion. Independently of the American, the Soviet scientist Vladimir Kotelnikov published similar results in 1933.

Kotelnikov and Nyquist independently found that an analog signal of limited bandwidth can be reliably restored by a mathematical algorithm from discrete samples, that is, from fragmentary data, provided the samples are taken often enough. So, for the sake of economy, only a small part of the data needs to be encoded, and the original can then be restored from it.

Analog sound was first digitized using pulse code modulation, and this technology is still the most widespread today. The sound wave is converted into numbers by three sequential operations: time sampling, amplitude quantization and final coding.

What is sampling? It is the taking of values at regular time intervals. The algorithm reads the level of the analog waveform at an incredible speed: 44,100 readings per second for the CD standard. This figure is called the sample rate. Audio in movies, for example, is standardized at a sample rate of 48,000 Hz.

To keep up with this speed, each value is rounded slightly to the nearest of a set of predefined levels. This process is called quantization. The more often the algorithm takes readings, the better the digital recording will sound. However, a microscopic quantization error is unavoidable.

Computers store information in memory: billions of tiny electrical switches, each of which can be in only two positions, on or off. The position of one such switch is one bit of information. The CD standard allocates 16 bits to each audio sample, which gives 65,536 different values for encoding.
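The three operations of pulse code modulation (time sampling, amplitude quantization and coding) can be sketched in a few lines. The example below generates a test tone instead of reading a real analog input, and the 440 Hz frequency, the signed little-endian byte layout and the function name are all assumptions made for illustration.

```python
import math
import struct

SAMPLE_RATE_HZ = 44_100          # CD sample rate from the text
BITS = 16                        # CD bit depth from the text
MAX_CODE = 2 ** (BITS - 1) - 1   # 32767, largest signed 16-bit value


def pcm_encode_sine(frequency_hz: float, duration_s: float) -> bytes:
    """Sample, quantize and code a sine tone as signed 16-bit little-endian PCM."""
    sample_count = round(SAMPLE_RATE_HZ * duration_s)
    codes = []
    for n in range(sample_count):
        t = n / SAMPLE_RATE_HZ                              # 1. time sampling
        amplitude = math.sin(2 * math.pi * frequency_hz * t)
        codes.append(round(amplitude * MAX_CODE))           # 2. amplitude quantization
    return struct.pack(f"<{sample_count}h", *codes)         # 3. coding as bytes


data = pcm_encode_sine(440.0, 0.01)
print(len(data))  # 882 bytes: 441 samples of 2 bytes each
```

One hundredth of a second of a single channel already takes 882 bytes, which is the scale of data the later sections on compression deal with.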

How are digital audio formats different?
Digital sound is a very long sequence of numbers. However, these numbers can be encoded in different ways. For example, music on a CD is stored as uncompressed data, the same kind of data that a WAV file typically holds. Its main drawback is that it takes up too much space, since all the information is digitized without any compression algorithm.

To reduce the space taken up, mathematical algorithms called audio codecs were invented to compress digital audio data, in some cases using psychoacoustic models. There are two main types of compression: lossless and lossy.

The most famous lossy compression format is MP3. Its developers have relied on the fact that the human ear is imperfect and a lot of redundant information is transmitted in uncompressed sound. The algorithm divides the entire frequency spectrum into small parts and then eliminates sounds that are practically not perceived by humans.

The quality of MP3 files is irretrievably degraded compared with the original, but the file itself can be 10 times “lighter”. The user can choose the degree of compression by setting the bit rate, which is essentially the amount of data used to store one second of music.

Files with digitized audio


Files with digitized audio are sound files in which the original continuous (“analog”) waveform is recorded as a sequence of discrete values of the signal amplitude, measured (“sampled”) at equal and very short time intervals.


The process of replacing a continuous signal with a sequence of its values is called sampling, and this form of recording is called pulse-code recording. In hardware, digital audio processing works as follows: an analog-to-digital converter (ADC) turns the analog signal into a set of digital measurements, and during playback a digital-to-analog converter (DAC) performs the reverse process, turning the digital signal back into an analog one. There are two types of files with digitized audio: those with a header and those without.
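The difference between the two file types can be seen with Python's standard `wave` module: a WAV file is headerless sample data plus a header that says how to interpret it. The sketch below writes to an in-memory buffer; the four sample values are made up for illustration.

```python
import io
import struct
import wave

# Four 16-bit mono samples standing in for real audio (illustrative values).
raw_pcm = struct.pack("<4h", 0, 1000, 0, -1000)

# A file "with a header": the WAV container records how to interpret the samples.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(1)       # mono
    writer.setsampwidth(2)       # 2 bytes = 16 bits per sample
    writer.setframerate(44_100)  # samples per second
    writer.writeframes(raw_pcm)

buffer.seek(0)
with wave.open(buffer, "rb") as reader:
    params = reader.getparams()
    frames = reader.readframes(params.nframes)

print(params.nchannels, params.sampwidth, params.framerate, params.nframes)
print(frames == raw_pcm)  # the payload is the same headerless data
```

A headerless file would contain only the bytes in `raw_pcm`, so the channel count, sample width and sample rate would have to be known in advance by whoever plays it.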

Files with music notation (song files, music files) are sound files that contain a sequence of commands indicating which note is to be played, on which instrument and for how long at any given time. The format can provide for several musical instruments playing simultaneously, in which case one speaks of the corresponding number of voices.
Basic standards for multichannel audio

Dolby Stereo is a standard for digital movie sound recording / playback technology for cinemas that allows four channels to be encoded into two movie soundtracks: left, center, right, and rear. The signal read from the film is converted by the decoder into four channels, which gives a surround sound effect. Without a decoder, the sound is played as normal two-channel stereo. The standard was proposed by Dolby Laboratories in 1976.

DDS (Dolby Surround Sound) is a standard for digital recording / playback of movie soundtracks in the frequency range 100-7000 Hz for home theater systems. The standard allows encoding three channels in two soundtracks of a movie: left, right and rear. The signal read from the film is decoded into three channels. Without a decoder, the sound is played as normal two-channel stereo. The standard was proposed by Dolby Laboratories in 1982.
DPL (Dolby Surround Pro Logic) is an evolution of the DDS standard for home theater systems with three to four sound channels: left, center, right and surround. The standard was proposed by Dolby Laboratories in 1987.
Dolby Digital is a standard for encoding / decoding six-channel (5 + 1) audio in the 20 Hz to 20 kHz range: 5 surround channels and one low-frequency channel (subwoofer). The standard was proposed by Dolby Laboratories in 1992. The frequency range of the five channels is 3 Hz to 20 kHz, and that of the subwoofer channel is 3 Hz to 120 Hz.
Dolby Digital AC3 is an addition to the Dolby Digital standard with a scheme that provides an audio recording compression density of 12: 1 or more at a 64 to 640 Kbps bit rate with high quality playback.
Dolby Surround AC3 is a simplified version of the Dolby Digital home theater standard with reduced bit rates.
DTS (Digital Theater System) is a standard for six-channel (5 + 1) sound recording on music DVDs, close to Dolby Digital but with a lower compression ratio (4:1) and a higher data rate (bit rate of 882 Kbps). Thanks to this, and to a more advanced compression algorithm, it offers high-quality sound recording and reproduction. The recording uses a 48 kHz sample rate, which makes it the highest-quality audio standard used on DVD.
Dolby Pro Logic II is an evolution of the Dolby Surround Pro Logic standard that decomposes ordinary stereo sound into six channels (5 + 1).
Dolby Pro Logic IIx is an evolution of the Dolby Surround Pro Logic standard that decomposes stereo sound into 7 (6 + 1) or 8 (7 + 1) channels. Possible decoding modes are Movie (mirroring the center channel or the rear channels), Game (the signal is also sent to the “new channels”) and Music.
Dolby Digital EX is a home theater variant of the Dolby Pro Logic IIx standard.
Dolby Digital Surround EX is an expanded version of up to 7 channels (6 + 1) of the Dolby Digital Surround standard, in which there is an additional rear channel (rear) that doubles the center channel if the sound is recorded in 5 + 1 format. If the sound is recorded in 6 + 1 format, the additional channel becomes a full surround channel.
DTS-ES is an analog of the Dolby Digital EX standard developed by DTS; allows you to encode audio in 6 + 1 and 7 + 1 formats and decompose audio encoded in DTS (5 + 1) format into 7 (6 + 1) or 8 (7 + 1) channels.

Digital audio information (Part 3)


Codec sample rate and bit depth


Sampling is the acquisition of instantaneous values (samples) of an analog signal at a fixed time step during digitization. The frequency of this step is called the sample rate (also the sampling frequency). The higher it is, the better the recorded and reproduced sound. In studio equipment the sample rate is 48 kHz, in home systems 44.1 kHz.

Bit depth determines the quality of the recorded audio. Higher is better. The bit value, for example 32, denotes the number of bits that are allocated to record the amplitude of the signal at the time of its measurement.

Consequently, the more often (sample rate) and more accurately (bit depth) the audio signal is measured, the higher quality audio file is obtained.

Bitrate

The bit rate (literally, the rate of the bit stream) determines the maximum amount of information that can be transmitted through the audio channel per unit of time. A high bit rate is needed to convey a rich sound image and is not required for encoding speech. Audio recordings at 128 Kbps are suitable for inexpensive speakers, but with expensive equipment it makes sense to get music at 192-256 Kbps.

A convenient solution is variable bit rate encoding, which changes the bandwidth of the audio channel according to the complexity and richness of the musical fragment.

Audio formats

MP3 is the most popular digital audio format right now. It is widely used in file-sharing networks because the resulting files are small (approximately 1/10 of the original audio CD track) and because its special data compression algorithm provides playback quality very close to that of the original. The MP3 format is compatible with absolutely all RoverMedia players, as well as with all modern stereos and DVD players.

WMA is a file format developed by Microsoft to store and transmit audio information. The main advantage of WMA over MP3 is its greater compression capacity, which results in a smaller file size. The latest versions of the format, starting with Windows Media Audio 9.1, provide lossless encoding, multi-channel surround sound encoding, and speech encoding.

WAV is an audio container file format for storing a recording of a digitized audio stream. This format is mainly used to record sound from the voice recorder built into RoverMedia players and most modern devices.

FLAC (Free Lossless Audio Codec) is one of the most popular formats for lossless audio compression. Unlike MP3 and WMA formats, it does not remove any information from the audio stream when encoding the audio. Thanks to this, FLAC files are suitable not only for listening to high-quality music on RoverMedia portable media players, but even on high-quality audio equipment.

Number of audio channels

Mono
Mono (from the Greek monos, “one”) is a prefix that denotes a relationship to the singular.
Mono (short for monophony) is most often used as a term related to the recording and reproduction of sound.
Mono means monophonic, single-channel.

Stereo
Stereo (from the Greek for solid, spatial)
Stereophony or stereo sound (from the ancient Greek words “stereos”, solid or spatial, and “phone”, sound) is the recording, transmission or reproduction of sound in which auditory information about the location of its source is preserved by distributing the sound over two (or more) independent audio channels. …
In stereo recording, the sound is captured by 2 microphones placed a certain distance apart, each feeding a separate channel (right or left).
The result is what is called “panoramic sound”.

Digital audio information (Part 2)


Sampling frequency. A microphone connected to the sound card is used to record analog sound and convert it to digital form. The quality of the resulting digital sound depends on the number of measurements of the sound volume level per unit of time, that is, on the sampling frequency. The more measurements are made in 1 second (the higher the sampling frequency), the more accurately the “ladder” of the digital audio signal follows the curve of the analog signal.


The audio sample rate is the number of sound volume measurements in one second.

The audio sample rate can vary between 8000 and 48000 sound volume measurements per second.

Audio encoding depth. Each “step” is assigned a specific value for the sound volume level. Loudness levels of sound can be viewed as a set of possible states N, for which a certain amount of information I is required, which is called audio coding depth.

Audio encoding depth is the amount of information required to encode the discrete volume levels of digital audio.

If the encoding depth is known, the number of digital audio loudness levels can be calculated using the formula N = 2^I. Let the sound encoding depth be 16 bits; then the number of sound volume levels is:

N = 2^I = 2^16 = 65,536.

During the encoding process, each sound volume level is assigned its own 16-bit binary code, the lowest sound level will correspond to the code 0000000000000000 and the highest – 1111111111111111.
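The relationship between encoding depth, the number of levels and the binary codes can be checked with a short script. The helper names below are illustrative and not part of the original text.

```python
def level_count(depth_bits: int) -> int:
    """Number of distinct loudness levels for a given encoding depth: N = 2^I."""
    return 2 ** depth_bits


def level_code(level: int, depth_bits: int) -> str:
    """Binary code of a level, padded to the full encoding depth."""
    return format(level, f"0{depth_bits}b")


depth = 16
levels = level_count(depth)
print(levels)                         # 65536
print(level_code(0, depth))           # 0000000000000000
print(level_code(levels - 1, depth))  # 1111111111111111
```

With an 8-bit depth the same functions give 256 levels and codes from 00000000 to 11111111.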

The quality of digitized sound. The higher the sampling frequency and the encoding depth, the better the digitized sound. The lowest quality of digitized sound, corresponding to the quality of telephone communication, is obtained at a sampling rate of 8,000 times per second, an encoding depth of 8 bits and the recording of one audio track (“mono” mode). The highest quality considered here, comparable to that of an audio CD (which itself uses 44,100 samples per second), is achieved with a sampling rate of 48,000 times per second, an encoding depth of 16 bits and the recording of two audio tracks (“stereo” mode).

It should be remembered that the higher the quality of the digital sound, the larger the audio file. As an example, we can estimate the size of a digital stereo sound file with a duration of 1 second and average sound quality (16 bits, 24,000 measurements per second). To do this, the encoding depth must be multiplied by the number of measurements in 1 second and by 2 (stereo sound):

16 bits × 24,000 × 2 = 768,000 bits = 96,000 bytes = 93.75 KB.
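
The same estimate can be reproduced with a small helper, shown here as a minimal sketch. The function name `audio_size_bits` is hypothetical, and 1 KB is taken as 1024 bytes, as in the calculation above.

```python
def audio_size_bits(depth_bits, sample_rate, channels, seconds=1):
    """Size of uncompressed audio in bits: depth x rate x channels x duration."""
    return depth_bits * sample_rate * channels * seconds

bits = audio_size_bits(depth_bits=16, sample_rate=24_000, channels=2)
print(bits, "bits")             # 768000 bits
print(bits // 8, "bytes")       # 96000 bytes
print(bits / 8 / 1024, "KB")    # 93.75 KB
```

Passing a different `seconds` value, or `channels=1` for mono, gives the corresponding estimate for other recordings.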

Sound editors. Sound editors allow you not only to record and play sound, but also to edit it. Digitized sound is presented in sound editors visually, so copying, moving, and deleting parts of the audio track can be easily performed with the mouse. Furthermore, you can layer audio tracks on top of each other (mix sounds) and apply various acoustic effects (echo, reverse playback, etc.).

Sound editors allow you to change the quality of digital sound and the size of an audio file by changing the sample rate and encoding depth. Digitized audio can be saved uncompressed as universal WAV files or compressed as MP3 files.

When audio is stored in a compressed format, low-intensity frequency components that coincide in time with high-intensity ones, and are therefore “redundant” for human perception, are discarded. Using such a format makes audio files tens of times smaller, but it leads to an irreversible loss of information (the files cannot be restored to their original form).
Test questions

1. How do sample rate and encoding depth affect digital audio quality?
Self-help assignments

1.22. Multiple-choice task. The sound card performs binary encoding of the analog audio signal. How much information is needed to encode each of the 65,536 possible levels of signal intensity?
16 bits;
256 bits;
1 bit;
8 bits.

1.23. A task with a detailed answer. Estimate the size of digital audio files with a duration of 10 seconds at the encoding depth and sample rate that provide the minimum and the maximum sound quality:

a) mono, 8 bits, 8000 measurements per second;

b) stereo, 16 bits, 48,000 measurements per second.

Digital audio information (Part 1)

The history of recording technology

The creation of sound by computer is the modern stage in the development of sound technology. Let’s take a brief look at this history.

Since the late 19th century, the technical means of storing and transmitting information have developed rapidly. In the late 1800s, the famous American inventor Thomas Edison built the phonograph.

The principle of operation of the phonograph is as follows. Speech, music, or singing creates sound vibrations that are transmitted to the recording needle of the phonograph. The needle, acting on the surface of a rotating wax cylinder, cuts a groove of variable depth in it: a sound track. When the sound is reproduced, the opposite process occurs: as the reading needle moves along the sound track, it oscillates with the same frequency. These vibrations are converted by the phonograph into audible sound. The Edison phonograph was the first sound recording device.

The same idea served as the basis for the production of celluloid gramophone records and of the mechanisms that reproduce the sound recorded on them: the gramophone and the portable gramophone.

In the middle of the 20th century, the electrophone appeared, an electrical counterpart of the gramophone.
Analog sound representation

The soundtrack of a phonograph record is an example of a continuous form of sound recording.

In an electrophone, the electrical signal is transmitted to the loudspeaker and converted into sound.

In the 20th century, the tape recorder was invented, a device for recording sound on magnetic tape. It also uses an analog form of audio storage. Only now the sound track is not a mechanical groove, as shown in fig. 1.1, but a line with continuously changing magnetization. A magnetic reading head generates an alternating electrical signal, which is reproduced by a loudspeaker system.

Until recently, all sound transmission technology was analog. This applies to both telephone and radio communication. During a telephone conversation, the sound vibrations of the microphone membrane are converted into an alternating electrical signal that is transmitted through electrical cables. In the receiving phone, the signal is converted back into sound.
Audio encoding and processing

Sound information. Sound is a wave that travels through air, water, or another medium with continuously varying intensity and frequency.

A person perceives sound waves (air vibrations) through hearing as sound of different volume and pitch. The greater the intensity of the sound wave, the louder the sound; the higher the frequency of the wave, the higher the pitch of the sound.
Dependence of the volume and pitch of the sound on the intensity and frequency of the sound wave.

The human ear perceives sound at frequencies from 20 vibrations per second (low sound) to 20,000 vibrations per second (high sound).

A person can perceive sound in a wide range of intensities, in which the maximum intensity is 10^{14} times greater than the minimum (one hundred thousand billion times). To measure the volume of sound, a special unit, the decibel (dB), is used (Table 5.1). A decrease or increase in sound volume by 10 dB corresponds to a decrease or increase in sound intensity by 10 times.

Sound volume in decibels:
-Lower limit of human ear sensitivity: 0 dB
-Rustling leaves: 10 dB
-Conversation: 60 dB
-Car horn: 90 dB
-Jet engine: 120 dB
-Pain threshold: 140 dB
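
The 10 dB rule stated above means that a difference of d decibels corresponds to an intensity ratio of 10^(d/10). The sketch below simply applies that rule; the function names `intensity_ratio` and `decibels_from_ratio` are hypothetical and chosen only for this illustration.

```python
import math

def intensity_ratio(decibels):
    """How many times the intensity changes for a given difference in dB."""
    return 10 ** (decibels / 10)

def decibels_from_ratio(ratio):
    """Difference in dB that corresponds to a given intensity ratio."""
    return 10 * math.log10(ratio)

print(intensity_ratio(10))           # one 10 dB step
print(intensity_ratio(140))          # pain threshold relative to the hearing limit
print(decibels_from_ratio(1e14))     # the full range of intensities, in dB
```

With the values from the list, the 140 dB span between the lower limit of hearing and the pain threshold matches the factor of 10^14 between the minimum and maximum intensities mentioned earlier.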

Sound time sampling. (Part 1)

In order for a computer to process sound, the continuous audio signal must be converted into a discrete digital form using time sampling. The continuous sound wave is divided into separate small time sections, and a certain value of sound intensity is set for each section.

Therefore, the continuous dependence of sound loudness on time, A(t), is replaced by a discrete sequence of loudness levels. On a graph, this looks like replacing a smooth curve with a sequence of “steps”.
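
Time sampling can be sketched as taking the value of A(t) at evenly spaced moments. This is a minimal illustration only: the 440 Hz tone standing in for A(t) and the names `sample_signal` and `tone` are assumptions made for the example.

```python
import math

def sample_signal(signal, duration_s, sample_rate):
    """Return the loudness values measured at evenly spaced moments in time."""
    count = round(duration_s * sample_rate)      # number of time sections
    return [signal(n / sample_rate) for n in range(count)]

def tone(t):
    """Hypothetical continuous signal A(t): a 440 Hz sine tone."""
    return math.sin(2 * math.pi * 440 * t)

# One millisecond of sound at 8000 measurements per second.
steps = sample_signal(tone, duration_s=0.001, sample_rate=8000)
print(len(steps))
print([round(value, 3) for value in steps])
```

Each number in the resulting list is the height of one “step”; a higher `sample_rate` produces more, narrower steps for the same stretch of sound.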