Digital audio encoding




In fact, one or another digital form of representation of analog audio signals is already a coding method – a sequence of numbers that describes an analog audio signal is itself a digital code.


However, the encoding that we are going to talk about now is something else. Now let’s look at the methods of encoding digital audio signals.

A digitized audio signal “in its pure form” is a fairly accurate, but not the most compact, way of recording the original analog signal.

Judge for yourself. To capture complete information about the original analog signal in the audible frequency range (0-20 kHz), the signal must be sampled at a frequency of at least 40 kHz. Accordingly, the CD-DA standard (the familiar standard for recording data on audio CDs) establishes the following encoding parameters: two-channel recording in PCM format with a sampling frequency of 44.1 kHz and a 16-bit quantization depth. One hour of music in this format takes up roughly 600 MB (60 minutes * 60 seconds * 2 channels * 44100 samples per second * 2 bytes per sample = 635,040,000 bytes, or approximately 605 MiB). Given that the music collection of an ordinary music lover may contain, say, 5,000 tracks averaging about 3 minutes each (around 250 hours of audio), the amount of memory required to store it in its original digital form is quite significant. Therefore, storing relatively large amounts of audio data while ensuring fairly good sound quality requires various “tricks” to compress the data.
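The arithmetic above can be checked with a short Python sketch. The function name and its default parameters are illustrative assumptions that simply restate the CD-DA figures quoted in the text.

```python
def pcm_size_bytes(seconds, sample_rate=44_100, channels=2, bytes_per_sample=2):
    """Size in bytes of uncompressed PCM audio of the given duration."""
    return seconds * sample_rate * channels * bytes_per_sample

one_hour = pcm_size_bytes(60 * 60)
print(one_hour)                      # bytes for one hour of CD-quality audio
print(round(one_hour / 2**20, 1))    # the same size in MiB

# Hypothetical collection from the text: 5,000 tracks of about 3 minutes each
collection = pcm_size_bytes(5_000 * 3 * 60)
print(collection / 10**9)            # size in GB (decimal)
```

For the collection described above this comes to well over a hundred gigabytes of uncompressed audio, which is why compression matters.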

In general, all existing methods for encoding audio information can be conditionally divided into only two types.

1. Lossless data compression (“lossless encoding”) is a method of encoding (compacting) digital audio information that allows one hundred percent recovery of the original data from the compressed stream (the term “original data” here means the original form of the digitized audio data). This method is used where the quality of the original audio data must be preserved completely. Lossless compression algorithms that exist today can reduce the volume of data by 20-50% while guaranteeing full recovery of the original digital material from the compressed data. Such encoders work much like general-purpose data archivers such as ZIP or RAR, but they are specially adapted to compress audio data. While lossless encoding is ideal in terms of preserving the quality of audio material, it cannot provide a high level of compression.

2. Lossy data compression (“lossy encoding”) is the other, more modern way to compact data. Its purpose is to achieve the highest possible compression ratio while keeping sound quality at an acceptable level. The idea behind lossy encoding rests on two simple considerations:

the original digital audio data is redundant: it contains a lot of information that is useless to the ear and can be removed, thereby increasing the compression ratio;
requirements for the sound quality of audio material vary and depend on the specific purpose and area of use.
Lossy encoding is called “lossy” because it results in the loss of some of the audio information. The decoded signal, when reproduced, sounds similar to the original, but it is no longer identical to it. Most lossy coding methods rely on the psychoacoustic properties of the human auditory system, as well as on various tricks associated with resampling the signal. During compression, the encoder analyzes the audio data in the frequency domain to identify details of the sound that can be ignored. Masked frequencies and inaudible or barely audible details can be sacrificed for a higher compression ratio. Where only intelligibility matters (for example, in telephony, where frequencies above 4 kHz are not necessary), the audio information undergoes a serious “simplification” during encoding, which, together with well-chosen “smart” quantizers and “greedy” data compression algorithms, makes very high compression ratios possible.



Digital audio formats: how to choose the best one (Part 2)


The higher the bit rate, the better the sound quality. For example, at a bit rate of 128 kilobits per second, five minutes of music will require only about five megabytes on a hard drive or flash drive. The optimal bit rate for storing MP3 music files is believed to be 256 or 320 kilobits per second.
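The relationship between bit rate and file size can be sketched in a few lines of Python. The helper name is an assumption made for illustration, and the calculation ignores container overhead and metadata.

```python
def compressed_size_mb(bitrate_kbps, seconds):
    """Approximate size in megabytes of audio encoded at a constant bit rate."""
    bits = bitrate_kbps * 1000 * seconds
    return bits / 8 / 1_000_000

# Five minutes of music at the bit rates mentioned in the text
for kbps in (128, 256, 320):
    print(kbps, "kbps ->", compressed_size_mb(kbps, 5 * 60), "MB")
```

At 128 kbps the result is just under five megabytes, in line with the estimate above.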


Another popular lossy compression format is OGG Vorbis. Unlike MP3, it was originally free and open source, so it quickly gained popularity among independent developers. In terms of quality, it is in no way inferior to MP3, although it does use its own psychoacoustic model for file compression.

WMA is a lossy audio compression format developed by Microsoft Corporation. It can be found on any Windows operating system, but it is not very popular with users. Another relatively common lossy audio compression codec is AAC, which differs from MP3 in slightly less quality loss at the same bit rate.

Audio codecs for music lovers
Newer formats provide lossless audio compression. The most popular among users is the free FLAC format, introduced in 2001. FLAC is perfect for archiving your audio collection, as well as for listening to music on high-quality sound reproduction equipment.

In so-called lossless codecs, encoded data can always be retrieved with bit precision. The encoding is carried out using a mathematical scheme: a certain regularity is found in the initial data and, taking this regularity into account, a second sequence is generated, which fully describes the original.
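The idea of exploiting a regularity and still recovering the data bit for bit can be illustrated with a deliberately simple scheme. This is not the algorithm used by FLAC or any other real codec; it is a minimal sketch that assumes neighbouring samples are similar and therefore stores only the differences between them. The test tone and all names are hypothetical.

```python
import math

def delta_encode(samples):
    """Replace each sample with its difference from the previous sample."""
    previous = 0
    deltas = []
    for sample in samples:
        deltas.append(sample - previous)
        previous = sample
    return deltas

def delta_decode(deltas):
    """Rebuild the original samples by summing the differences."""
    total = 0
    samples = []
    for delta in deltas:
        total += delta
        samples.append(total)
    return samples

# Hypothetical test signal: one second of a 440 Hz tone sampled at 8 kHz
samples = [round(10_000 * math.sin(2 * math.pi * 440 * n / 8000)) for n in range(8000)]
deltas = delta_encode(samples)

print(delta_decode(deltas) == samples)          # exact, bit-for-bit recovery
print(max(abs(s) for s in samples), max(abs(d) for d in deltas))
```

Because the differences are smaller numbers than the samples themselves, they can in principle be stored in fewer bits, while decoding still reproduces the original sequence exactly.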

The second most popular lossless compression format is Monkey’s Audio, which is distributed as free software for Microsoft Windows. The WavPack format has support for multi-channel streaming and a slightly better compression ratio. Apple introduced its own lossless ALAC codec in 2004, which resembles FLAC.

Digital audio has huge advantages over analog recordings. The user can store and copy their material indefinitely without losing the original quality. At the same time, storing digital audio is more cost-effective, because it takes up much less physical space than a collection of records or cassettes.
Thus, a powerful ZIP archiver can compress a WAV file by only 10-20%, while FLAC achieves compression rates of 30-50% for most audio files. At the same time, the audio codec allows the recovery of partially corrupted data and the decoding process itself is very undemanding on processor resources.

To archive your music collection, it is now optimal to use lossless compression formats, for example FLAC, which is supported by most players. However, to store audiobooks, where high fidelity of the original sound is not required, you can use cheaper MP3 or OGG.

Digital audio formats: how to choose the best one


Most users store music and other audio files in various digital formats. There are about a hundred digital audio encoding algorithms, but they all have their own characteristics. What format to choose to store your home audio collection and why is the well-known MP3 losing popularity?


Analog audio is a wave. Almost every process in our world can be described using mathematics. Digital audio is the description of an analog waveform using a sequence of numbers. For example, more than 44,000 digital values ​​are used to digitize one second of music on a CD.
How digital sound was born
The theoretical foundations of digital sound were laid in 1928 by Harry Nyquist in his paper “Certain Topics in Telegraph Transmission Theory”, which for the first time determined the bandwidth a communication line needs to transmit signal pulses without distortion. Independently of the American, the Soviet scientist Vladimir Kotelnikov published similar results in 1933.

Kotelnikov and Nyquist independently showed that an analog signal with a limited frequency band can be reliably reconstructed from discrete samples, provided the samples are taken often enough (at a rate of at least twice the highest frequency present in the signal). So instead of the full continuous signal, it is enough to record only a sequence of samples and then restore the original from them.

Analog sound was first digitized using pulse code modulation (PCM), and this technology remains the most widespread today. The sound wave is converted into numbers by three sequential operations: time sampling, amplitude quantization, and final coding.

What is sampling? This is a sample of values ​​at regular time intervals. The algorithm reads the levels of the analog waveform at an incredible speed: 44,100 readings per second for the CD standard. This indicator is called the sample rate. For example, audio in movies is standardized to a sample rate of 48,000 Hertz.

To store each reading as a number, its value is rounded slightly to the nearest of a fixed set of predefined levels. This process is called quantization. The more often the algorithm takes readings, the better the digital recording will sound. However, a microscopic quantization error is unavoidable.
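Sampling followed by quantization can be sketched as follows. This is a minimal illustration, assuming a signal normalized to the range from -1.0 to 1.0 and a uniform quantizer; the function names and the 440 Hz test tone are not taken from any standard.

```python
import math

def sample_and_quantize(signal, num_samples, sample_rate, bits):
    """Read the signal at regular intervals and round each reading to an integer code."""
    step = 2.0 / (2 ** bits - 1)          # distance between adjacent levels
    codes = []
    for n in range(num_samples):
        t = n / sample_rate               # time of the n-th reading
        value = max(-1.0, min(1.0, signal(t)))
        codes.append(round((value + 1.0) / step))
    return codes, step

def tone(t):
    """Hypothetical analog signal: a 440 Hz sine wave."""
    return math.sin(2 * math.pi * 440 * t)

codes, step = sample_and_quantize(tone, 441, 44_100, 16)

# Largest difference between a reading and its rounded value
worst_error = max(abs((c * step - 1.0) - tone(n / 44_100)) for n, c in enumerate(codes))
print(len(codes), min(codes), max(codes))
print(worst_error <= step / 2 + 1e-12)
```

The rounding error never exceeds half the distance between two adjacent levels, which is the unavoidable quantization error mentioned above.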

Computers use memory to store information: billions of tiny electrical switches that can only be in one of two positions, on or off. The position of one such switch is one bit of information. The CD standard allocates 16 bits to each audio sample, which provides 65,536 different values for encoding.

How are digital audio formats different?
Digital sound is a very long sequence of numbers. However, these numbers can be encoded in different ways. For example, music on a CD is stored as uncompressed PCM data, the same kind of data that a WAV file usually holds. The main problem with this representation is that it takes up too much space, since all the information is digitized without using compression algorithms.

To reduce the amount of space taken up, mathematical algorithms have been invented – audio codecs that compress digital audio data according to certain psychoacoustic models. However, there are two main types of compression: lossless compression and lossy compression.

The most famous lossy compression format is MP3. Its developers have relied on the fact that the human ear is imperfect and a lot of redundant information is transmitted in uncompressed sound. The algorithm divides the entire frequency spectrum into small parts and then eliminates sounds that are practically not perceived by humans.

The quality of MP3 files is irretrievably degraded compared to the original, but the file itself can be 10 times “lighter” than the original. In this case, the user can choose the degree of compression of the file. For this, there is a bit rate; in fact, this is the space needed to store one second of music.

Files with digitized audio


These are sound files in which the original continuous (“analog”) waveform is recorded as a sequence of discrete amplitude values of the sound signal, measured (“sampled”) at equal and very short time intervals.


The process of replacing a continuous signal with a sequence of its values is called sampling, and this form of recording is called pulse code modulation. In hardware, an analog-to-digital converter (ADC) converts the analog signal into a set of digital measurements, and during playback a digital-to-analog converter (DAC) performs the reverse process, converting the digital signal back into an analog one. There are two types of files with digitized audio: those with a header and those without one.
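The difference between headerless and headed files can be shown with Python's standard `wave` module. In this sketch the raw sample bytes play the role of a headerless file, and the WAV container adds a header that records the channel count, sample width, and sample rate. The test tone and variable names are illustrative assumptions.

```python
import io
import math
import struct
import wave

sample_rate = 8000
samples = [round(12_000 * math.sin(2 * math.pi * 440 * n / sample_rate))
           for n in range(sample_rate)]

# Headerless form: nothing but 16-bit little-endian samples
raw_pcm = struct.pack(f"<{len(samples)}h", *samples)

# Headed form: the same samples wrapped in a WAV container (written to memory)
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)          # mono
    wav.setsampwidth(2)          # 2 bytes = 16 bits per sample
    wav.setframerate(sample_rate)
    wav.writeframes(raw_pcm)
wav_bytes = buffer.getvalue()

# A reader can recover the parameters from the header alone
with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
    params = (wav.getnchannels(), wav.getsampwidth(), wav.getframerate(), wav.getnframes())

print(len(raw_pcm), len(wav_bytes), params)
```

A player given only the raw bytes has to be told these parameters separately, whereas the headed file describes itself.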

Files with music notation (song file, music file) are sound files that contain a sequence of commands indicating which note to play, on which instrument, and for how long at any given time. The format can provide for several musical instruments playing simultaneously; in this case one speaks of the corresponding number of voices.
Basic standards for multichannel audio

Dolby Stereo is a standard for digital movie sound recording / playback technology for cinemas that allows four channels to be encoded into two movie soundtracks: left, center, right, and rear. The signal read from the film is converted by the decoder into four channels, which gives a surround sound effect. Without a decoder, the sound is played as normal two-channel stereo. The standard was proposed by Dolby Laboratories in 1976.

DDS (Dolby Surround Sound) is a standard for digital recording / playback of movie soundtracks in the frequency range 100-7000 Hz for home theater systems. The standard allows encoding three channels in two soundtracks of a movie: left, right and rear. The signal read from the film is decoded into three channels. Without a decoder, the sound is played as normal two-channel stereo. The standard was proposed by Dolby Laboratories in 1982.
DPL (Dolby Surround Pro Logic) is an evolution of the DDS standard for home theater systems with three to four sound channels: left, center, right and surround. The standard was proposed by Dolby Laboratories in 1987.
Dolby Digital is a standard for encoding / decoding six-channel (5 + 1) audio recording in the 20 Hz to 20 kHz range: 5 surround channels and one low-frequency channel (subwoofer). The standard was proposed by Dolby Laboratories in 1992. The frequency range of the five main channels is 3 Hz to 20 kHz; that of the subwoofer channel is 3 Hz to 120 Hz.
Dolby Digital AC3 is an addition to the Dolby Digital standard with a scheme that provides an audio recording compression density of 12: 1 or more at a 64 to 640 Kbps bit rate with high quality playback.
Dolby Surround AC3 is a simplified version of the Dolby Digital home theater standard with reduced bit rates.
DTS (Digital Theater System) is a standard for six-channel (5 + 1) sound recording on music DVDs, close to Dolby Digital but with a lower compression ratio (4: 1) and a higher data rate (bit rate: 882 Kbps). Because of this, and because of its refined compression algorithm, it is characterized by high-quality sound recording and reproduction. The recording uses a 48 kHz sample rate, which makes it one of the highest-quality DVD audio standards.
Dolby Pro Logic II is an evolution of the Dolby Surround Pro Logic standard, which breaks down normal stereo sound into six channels: 5 + 1.
Dolby Pro Logic IIx is an evolution of the Dolby Surround Pro Logic standard, which provides stereo sound decomposition into 7 (6 + 1) or 8 channels (7 + 1). Possible decoding modes: Movie (mirroring the center channel or rear channels), Game (the signal is also sent to the “new channels”), and Music.
Dolby Digital EX is a home theater variant of the Dolby Pro Logic IIx standard.
Dolby Digital Surround EX is an expanded version of up to 7 channels (6 + 1) of the Dolby Digital Surround standard, in which there is an additional rear channel (rear) that doubles the center channel if the sound is recorded in 5 + 1 format. If the sound is recorded in 6 + 1 format, the additional channel becomes a full surround channel.
DTS-ES is an analog of the Dolby Digital EX standard developed by DTS; allows you to encode audio in 6 + 1 and 7 + 1 formats and decompose audio encoded in DTS (5 + 1) format into 7 (6 + 1) or 8 (7 + 1) channels.

Digital audio information (Part 3)


Codec sample rate and bit depth


Sampling is the acquisition of instantaneous values (samples) of an analog signal at a fixed time step during digitization. The frequency of this step is called the sample rate (also known as the sampling rate or sampling frequency). The higher it is, the more accurately sound is recorded and reproduced. Studio equipment typically uses 48 kHz, home systems 44.1 kHz.

Bit depth determines the quality of the recorded audio. Higher is better. The bit value, for example 32, denotes the number of bits that are allocated to record the amplitude of the signal at the time of its measurement.

Consequently, the more often (sample rate) and more accurately (bit depth) the audio signal is measured, the higher quality audio file is obtained.

Bitrate

The bit rate (literally, the information bit rate) determines the maximum amount of information that can be transmitted through the audio channel per unit of time. A high bit rate is needed to transmit a rich sound image and is not required when encoding speech. Audio recordings with a 128 Kbps bit rate are suitable for inexpensive speakers, but when accessing expensive equipment, it makes sense to get music at a 192-256 Kbps bit rate.

A convenient solution is variable bit rate (VBR) encoding, which changes the bit rate of the audio stream according to the complexity and richness of each musical fragment.

Audio formats

MP3 is the most popular digital audio format right now. It is widely used in file-sharing networks because of the small size of the resulting files (approximately 1/10 of the original audio CD data), and thanks to its data compression algorithm it provides playback quality very close to that of the original. The MP3 format is compatible with all RoverMedia players, as well as with modern stereos and DVD players.

WMA is a file format developed by Microsoft to store and transmit audio information. The main advantage of WMA over MP3 is its greater compression capacity, which results in a smaller file size. The latest versions of the format, starting with Windows Media Audio 9.1, provide lossless encoding, multi-channel surround sound encoding, and speech encoding.

WAV is an audio container file format for storing a recording of a digitized audio stream. This format is mainly used to record sound from the voice recorder built into RoverMedia players and most modern devices.

FLAC (Free Lossless Audio Codec) is one of the most popular formats for lossless audio compression. Unlike MP3 and WMA formats, it does not remove any information from the audio stream when encoding the audio. Thanks to this, FLAC files are suitable not only for listening to high-quality music on RoverMedia portable media players, but even on high-quality audio equipment.

Number of audio channels

Mono
Mono (from the Greek “monos”, one) is a prefix that indicates a relation to the singular.
Mono (short for monophony) is most often used as a term related to the recording and reproduction of sound.
Mono means monophonic, single-channel.

Stereo
Stereo (from the Greek “stereos”: solid, spatial)
Stereophony, or stereo sound (from the ancient Greek words “stereos”, solid or spatial, and “phone”, sound), is the recording, transmission, or reproduction of sound in which auditory information about the location of the source is preserved by distributing the sound over two (or more) independent audio channels. …
In stereo recording, the recording is made from 2 microphones spaced a certain distance, each with a separate channel (right or left).
The result is what is called “panoramic sound”.

Digital audio information (Part 2)


Sampling frequency. A microphone connected to the sound card is used to record analog sound and convert it to digital format. The quality of the digital sound obtained depends on the number of measurements of the sound volume level per unit of time, that is, on the sampling frequency. The more measurements are made in 1 second (the higher the sampling frequency), the more accurately the “ladder” of the digital audio signal follows the curve of the analog signal.


The audio sample rate is the number of sound volume measurements in one second.

The audio sample rate can vary between 8000 and 48000 sound volume measurements per second.

Audio encoding depth. Each “step” is assigned a specific value for the sound volume level. Loudness levels of sound can be viewed as a set of possible states N, for which a certain amount of information I is required, which is called audio coding depth.

Audio encoding depth is the amount of information required to encode the discrete volume levels of digital audio.

If the encoding depth is known, then the number of digital audio loudness levels can be calculated using the formula $N = 2^{I}$. Let the sound encoding depth be 16 bits; then the number of sound volume levels is:

$N = 2^{I} = 2^{16} = 65536$.

During the encoding process, each sound volume level is assigned its own 16-bit binary code, the lowest sound level will correspond to the code 0000000000000000 and the highest – 1111111111111111.
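The relationship between encoding depth and the number of levels, in both directions, can be checked with a small Python sketch. The function names are assumptions chosen for illustration.

```python
def levels_for_depth(depth_bits):
    """Number of distinct volume levels N for an encoding depth of I bits (N = 2^I)."""
    return 2 ** depth_bits

def depth_for_levels(levels):
    """Smallest number of bits I able to encode the given number of levels."""
    return (levels - 1).bit_length()

print(levels_for_depth(16))      # number of levels at 16 bits
print(depth_for_levels(65_536))  # bits needed for 65,536 levels
print(format(0, "016b"))         # code of the lowest level
print(format(65_535, "016b"))    # code of the highest level
```

The last two lines print the 16-bit binary codes of the lowest and highest levels described above.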

The quality of digitized sound. The higher the sampling frequency and the encoding depth, the better the digitized sound. The lowest quality of digitized sound, corresponding to the quality of telephone communication, is obtained at a sampling rate of 8,000 times per second, an encoding depth of 8 bits, and the recording of a single audio track (“mono” mode). The highest quality considered here, comparable to that of an audio CD (which itself uses 44,100 samples per second), is achieved with a sampling rate of 48,000 times per second, an encoding depth of 16 bits, and the recording of two audio tracks (stereo mode).

It should be remembered that the higher the quality of the digital sound, the greater the volume of information in the audio file. It is possible to estimate the volume of information of a digital stereo sound file with a duration of 1 second with an average sound quality (16 bits, 24,000 measurements per second). To do this, the encoding depth must be multiplied by the number of measurements in 1 second and multiplied by 2 (stereo sound):

16 bits × 24,000 × 2 = 768,000 bits = 96,000 bytes = 93.75 KB.
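The same estimate can be reproduced in Python. The function below is an illustrative sketch; it treats 1 KB as 1,024 bytes, which is the convention the figure above implies.

```python
def audio_size_bits(depth_bits, sample_rate, channels, seconds=1):
    """Information volume in bits of uncompressed digital audio."""
    return depth_bits * sample_rate * channels * seconds

bits = audio_size_bits(16, 24_000, 2)
size_bytes = bits // 8
size_kb = size_bytes / 1024

print(bits, size_bytes, size_kb)
```

Changing the arguments gives the size for other depths, sample rates, channel counts, and durations.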

Sound editors. Sound editors allow you not only to record and play sound, but also to edit it. Digitized sound is presented in sound editors visually, so copying, moving, and deleting parts of the audio track can be easily performed with the mouse. Furthermore, you can layer audio tracks on top of each other (mix sounds) and apply various acoustic effects (echo, reverse playback, etc.).

Sound editors allow you to change the digital sound quality and volume of an audio file by changing the sample rate and encoding depth. Digitized audio can be saved uncompressed as universal WAV or compressed MP3 audio files.

When audio is stored in compressed formats such as MP3, low-intensity audio frequencies that are “excessive” for human perception, because they coincide in time with high-intensity frequencies, are discarded. Using such a format allows you to compress audio files dozens of times, but it leads to irreversible loss of information (files cannot be restored to their original form).
Test questions

1. How do sample rate and encoding depth affect digital audio quality?
Self-help assignments

1.22. Multiple-choice task. The sound card performs binary encoding of the analog audio signal. How much information is needed to encode each of the 65,536 possible levels of signal intensity?
16 bits;
256 bits;
1 bit;
8 bits.

1.23. A task with a detailed answer. Estimate the volume of information in digital audio files with a duration of 10 seconds at an encoding depth and a sample rate of an audio signal that provides the minimum and maximum sound quality:

a) mono, 8 bits, 8000 measurements per second;

b) stereo, 16 bits, 48,000 measurements per second.

Digital audio information (Part 1)


The history of recording technology


The creation of sound by computer is a modern stage in the history of the development of sound technology. Let’s take a brief look at this story.

Since the late 19th century, the technical means of storing and transmitting information have developed rapidly. So in the late 1800s, the famous American inventor Thomas Edison made a phonograph.

The principle of operation of the phonograph is as follows. Speech, music, or song creates sound vibrations that are transmitted to the recording needle of the phonograph. The needle, acting on the surface of a rotating wax roller, leaves in it a groove of variable depth: a sound track. When the sound is reproduced, the opposite process occurs: as the reading needle moves along the sound track, it oscillates at the same frequency. These vibrations are converted by the phonograph into audible sound. The Edison phonograph was the first sound recording device.

The same idea served as the basis for the production of celluloid gramophone records and the mechanisms that reproduce the sound recorded on them: the gramophone and its portable variants.

In the middle of the 20th century, an electrophone appeared, an electrical analog of a gramophone.
Analog sound representation

The soundtrack of a phonograph record is an example of a continuous form of sound recording.

In an electrophone, the electrical signal is transmitted to a loudspeaker and converted into sound.

In the 20th century, the tape recorder was invented, a device for recording sound on magnetic tape. It also uses an analog form of audio storage. Only now the sound track is not a mechanical groove, as shown in fig. 1.1, but a track with continuously changing magnetization. A magnetic reading head generates an alternating electrical signal, which is then reproduced by a speaker system.

Until recently, all sound transmission technology was analog. This is both telephone communication and radio communication. During a telephone conversation, the sound vibrations from the microphone membrane are converted into an alternating electrical signal that is transmitted through electrical cables. On the receiving phone, they become sound.
Audio encoding and processing

Sound information. Sound is a wave that travels through air, water, or other medium with a continuously varying intensity and frequency.

A person perceives sound waves (air vibrations) with the help of hearing in the form of sound of different volume and pitch. The greater the intensity of the sound wave, the louder the sound, the higher the frequency of the wave, the higher the pitch of the sound.
Dependence of the volume and pitch of the sound on the intensity and frequency of the sound wave.

The human ear perceives sound at a frequency of 20 vibrations per second (low sound) to 20,000 vibrations per second (high sound).

A person can perceive sound in a wide range of intensities, in which the maximum intensity is $10^{14}$ times greater than the minimum (one hundred thousand billion times). To measure the volume of sound, a special unit, the decibel (dB), is used (Table 5.1). A decrease or increase in sound volume by 10 dB corresponds to a decrease or increase in sound intensity by a factor of 10.

Sound volume

Sound volume in decibels:
-Lower limit of human ear sensitivity 0
-Rustling leaves 10
-Conversation 60
-Car horn 90
-Jet engine 120
-Pain threshold 140
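The link between decibels and intensity ratios stated above (every 10 dB is a factor of 10 in intensity) can be written as two small Python functions. This is a sketch with assumed function names; the ratio is taken relative to the threshold of hearing, which corresponds to 0 dB in the list above.

```python
import math

def intensity_ratio_to_db(ratio):
    """Convert an intensity ratio into a level difference in decibels."""
    return 10 * math.log10(ratio)

def db_to_intensity_ratio(db):
    """Convert a level difference in decibels into an intensity ratio."""
    return 10 ** (db / 10)

print(intensity_ratio_to_db(10 ** 14))   # full range of human hearing, in dB
print(db_to_intensity_ratio(10))         # intensity factor for a 10 dB step
print(db_to_intensity_ratio(60 - 10))    # conversation compared with rustling leaves
```

The full range of $10^{14}$ corresponds to the 140 dB span between the threshold of hearing and the pain threshold.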

Sound time sampling. (Part 1)

In order for a computer to process sound, a continuous audio signal must be converted to a discrete digital form using time sampling. A continuous sound wave is divided into separate small time sections, for each section a certain value of sound intensity is set.

Therefore, the continuous dependence of the loudness of the sound on time, A(t), is replaced by a discrete sequence of loudness levels. On a graph, this looks like replacing a smooth curve with a sequence of “steps”.

Parameters that affect the quality of digital audio. (Part 3)


The quality of the digital sound obtained depends on the number of measurements of the sound volume level per unit of time, that is, the sampling frequency.


Audio sample rate is the number of audio volume measurements in one second.

The more measurements that are made in one second (the higher the sampling frequency), the more accurately the “ladder” of the digital audio signal repeats the curve of the analog signal.

Each “step” of the graph is assigned a certain value for the sound volume level. Loudness levels can be thought of as a set of possible N states (gradations), which require a certain amount of I information to encode, which is called audio encoding depth.

Audio encoding depth is the amount of information required to encode the discrete volume levels of digital audio.

If the encoding depth is known, the number of digital audio volume levels can be calculated by the general formula $N = 2^{I}$.

For example, if the audio encoding depth is 16 bits, then the number of audio volume levels is:

$N = 2^{I} = 2^{16} = 65536$.

During the encoding process, each sound volume level is assigned its own 16-bit binary code, the lowest sound level will correspond to the code 0000000000000000 and the highest – 1111111111111111.

Digitized audio quality

Therefore, the higher the sample rate and depth of audio encoding, the better the digitized sound will sound and the better you can bring the digitized sound closer to the original sound.

The highest quality of digitized sound considered here, comparable to that of an audio CD (which itself uses 44,100 samples per second), is achieved with a sampling rate of 48,000 times per second, an encoding depth of 16 bits, and the recording of two audio tracks (stereo mode).

It should be remembered that the higher the quality of the digital sound, the greater the volume of information in the audio file.

You can easily estimate the volume of information in a digital stereo sound file with a duration of 1 second with an average sound quality (16 bits, 24,000 measurements per second). To do this, the encoding depth must be multiplied by the number of measurements per second and multiplied by 2 channels (stereo sound):

16 bits × 24,000 × 2 = 768,000 bits = 96,000 bytes = 93.75 KB.

There are three main types of digital audio formats:

uncompressed formats: no compression;
lossy formats: lossy compression;
lossless formats: lossless compression.
Lossy compression: technology in which there is a significant reduction of the encoded file compared to the original, due to the removal of information that is not perceived by the human ear.

The downside of this technology is the fact that the compressed file will never be identical to the original.

List of the most common lossy formats:

AAC (.m4a, .mp4, .m4p, .aac): advanced audio encoding (often in MPEG-4 container)
MP2 (MPEG Layer 2)
MP3 (MPEG Layer 3)
MPC (known as Musepack, previously called MPEGplus or MP +)
Ogg Vorbis
WMA (Windows Media Audio)

Lossless compressed audio formats include:

FLAC (Free Lossless Audio Codec)
APE (Monkey’s Audio)
WV (WavPack)
These formats are capable of converting CD to digital format while maintaining quality. As an example, you can take a CD, convert it to WAV, then WAV to FLAC, then go back from FLAC to WAV, and then burn it to a blank CD and you have an absolutely identical copy of your source.

What format does the music sound with the best quality?
The most popular is the lossless FLAC format, and one of the most widely used CD to FLAC conversion programs is EAC (Exact Audio Copy).

Of all the parameters of digital audio, it is necessary to pay attention primarily to the following indicators:

sampling rate (precision of digitizing an analog signal in time),
bit rate (amount of information contained in a file in terms of one second).

The sample rate is the frequency at which digital audio is processed. The most common sample rate for quality audio formats is 44.1 kHz.

It is generally accepted that a high bit rate guarantees the best quality; this is true, but only if the source file is of good quality. A high quality MP3 should have a bit rate of 320 kbps, but a high quality FLAC format generally has a bit rate of 900 kbps and higher.

What is the best quality music format?
In addition to the audio format itself, high-quality sound also requires high-quality playback equipment: speakers, amplifiers, headphones.

Parameters that affect the quality of digital audio. (Part 2)

The term “format” is also used for the number of channels in multichannel sound systems (5.1; 7.1). Such systems were initially developed for cinemas but later spread beyond them.

DIGITAL SOUND

Software-level audio codec

§ G.723.1 – one of the basic codecs for IP telephony applications

§ G.729 – proprietary narrowband codec used for digital representation of speech

§ Internet Low Bit Rate Codec (iLBC) – a popular free codec for IP telephony (in particular for Skype and Google Talk)

Audio codec (audio encoder/decoder) – a computer program or hardware device designed to encode or decode audio data.

Software codec

A software-level audio codec is a specialized computer program that compresses or decompresses digital audio data according to an audio file format or streaming audio format. As a compressor, the task of an audio codec is to deliver an audio signal of a given quality and precision in the smallest possible size. Compression reduces the space required to store audio data and can also reduce the bandwidth of the channel needed to transmit it. Most audio codecs are implemented as software libraries that interact with one or more audio players such as QuickTime Player, XMMS, Winamp, VLC media player, MPlayer, or Windows Media Player.

Popular software audio codecs by application:

§ MPEG-1 Layer III (MP3) – a proprietary codec for audio recordings (music, audiobooks, etc.), used on computers and digital players

§ Ogg Vorbis (OGG) – the second most popular format, widely used in computer games and in file-sharing networks to transfer music

§ GSM-FR – the first digital voice coding standard used in GSM phones

§ Adaptive Multi-Rate (AMR) – a codec for recording the human voice on mobile phones and other mobile devices

The loudness and the pitch of a sound depend on the intensity and the frequency of the sound wave.

Hertz (Hz) is the unit of frequency for periodic processes (for example, oscillations).
1 Hz means one cycle of the process per second: 1 Hz = 1/s.

10 Hz therefore means ten cycles of the process per second.

The human ear can perceive sound at frequencies from 20 vibrations per second (20 Hz, a low-pitched sound) to 20,000 vibrations per second (20 kHz, a high-pitched sound).

In addition, a person can perceive sound over a wide range of intensities: the maximum intensity is 10^14 times greater than the minimum (one hundred thousand billion times).

A special unit, the decibel (dB), is used to measure the volume of sound.

A decrease or increase in sound volume by 10 dB corresponds to a decrease or increase in sound intensity by 10 times.
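
This rule follows from the logarithmic definition of the decibel: the level difference in dB is ten times the base-10 logarithm of the intensity ratio. A minimal sketch with illustrative function names shows that the full range of audible intensities mentioned above, one hundred thousand billion to one, spans about 140 dB.

```python
import math


def intensity_ratio_to_db(ratio):
    """Level difference in decibels for a given ratio of sound intensities."""
    return 10 * math.log10(ratio)


def db_to_intensity_ratio(db):
    """Ratio of sound intensities for a given level difference in decibels."""
    return 10 ** (db / 10)


print(db_to_intensity_ratio(10))    # a 10 dB step is a tenfold change in intensity
print(intensity_ratio_to_db(1e14))  # range of audible intensities, about 140 dB
```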

Sound volume in decibels

In order for computer systems to process sound, a continuous audio signal must be converted to a discrete digital form by time sampling.

To do this, the continuous sound wave is divided into short, separate time intervals, and a specific sound intensity value is assigned to each interval.

As a result, the continuous dependence of loudness on time, A(t), is replaced by a discrete sequence of loudness levels. On a graph, this looks like replacing a smooth curve with a sequence of “steps.”
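
The process just described can be sketched in Python. This is an illustration under stated assumptions, not how any particular sound card works: the function sample_and_quantize, the 1 kHz test tone, and the chosen rates and bit depths are all invented for the example, and the signal is assumed to stay within the range -1.0 to 1.0.

```python
import math


def sample_and_quantize(signal, duration_s, sample_rate_hz, bit_depth):
    """Sample signal(t) at regular intervals and round each value to an integer level.

    signal must return values in the range [-1.0, 1.0].
    """
    max_level = 2 ** (bit_depth - 1) - 1           # largest positive level
    count = round(duration_s * sample_rate_hz)     # number of samples to take
    return [round(signal(n / sample_rate_hz) * max_level) for n in range(count)]


def tone(t):
    """The continuous signal A(t): a 1 kHz sine wave."""
    return math.sin(2 * math.pi * 1000 * t)


coarse = sample_and_quantize(tone, 0.001, 8_000, 4)    # 8 samples, levels -7..7
fine = sample_and_quantize(tone, 0.001, 48_000, 16)    # 48 samples, levels -32767..32767
print(coarse)
print(len(fine), "samples in the finer version")
```

The coarse list is the sequence of “steps” for one period of the wave; the fine list describes the same millisecond of sound with six times as many steps and far smaller rounding errors.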

Time sampling of sound

A microphone connected to a sound card is used to record analog audio, which the sound card converts to digital form.

The more densely the discrete steps are spaced on the graph (that is, the higher the sampling rate), the more accurately the original sound can ultimately be recreated.

Parameters that affect the quality of digital audio. (Part 1)

DIGITAL AUDIO

The main parameters that affect the quality of digital audio recording are:

§ Bit depth (resolution) of the ADC and DAC

§ Sampling frequency of the ADC and DAC

§ Jitter of the ADC and DAC

§ Resampling

In addition, the parameters of the analog path of digital sound recording and playback devices remain important:

§ Signal-to-noise ratio

§ Total harmonic distortion

§ Intermodulation distortion

§ Unevenness of the amplitude-frequency response

§ Channel crosstalk

§ Dynamic range
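
Two items in these lists are directly related: the bit depth of the converter sets an upper bound on dynamic range. For an idealized converter, the ratio between full scale and a single quantization step is 2 to the power of the bit depth, or roughly 6 dB per bit. The sketch below (the function name is illustrative) computes that bound; real devices achieve less, because the analog path adds noise and distortion.

```python
import math


def theoretical_dynamic_range_db(bit_depth):
    """Ratio of full scale to one quantization step, expressed in dB.

    The factor is 20 rather than 10 because the levels are amplitudes, not intensities.
    """
    return 20 * math.log10(2 ** bit_depth)


for bits in (8, 16, 24):
    print(bits, "bits:", round(theoretical_dynamic_range_db(bits), 1), "dB")
```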

Digital sound recording techniques

Digital sound recording is currently done in recording studios, under the control of personal computers and other expensive, high-quality equipment. The concept of the “home studio” is also well developed: it uses professional and semi-professional recording equipment that lets you create high-quality recordings at home.

The computers use sound cards, whose ADCs and DACs perform the conversion, most often at 24 bits and 96 kHz; a further increase in bit depth and sampling rate brings practically no improvement in recording quality.

There is a whole class of computer programs, sound editors, that let you work with sound:

§ record an incoming audio stream

§ create (generate) sound

§ modify an existing recording (add samples, change timbre, change playback speed, cut parts, etc.)

§ rewrite from one format to another

§ convert between different audio codecs

Some simple programs only allow converting formats and codecs.

Varieties of digital audio formats.

The term “audio format” has several meanings.

The digital representation of audio data depends on how the analog-to-digital converter (ADC) quantizes the signal. In sound engineering, two types of quantization are currently the most common:

§ Pulse-code modulation

§ Sigma-delta modulation

Quantization bit depth and sampling rate are often specified for audio recording and playback devices as the digital audio representation format (for example, 24-bit / 192 kHz or 16-bit / 48 kHz).

The file format determines the structure and presentation of the audio data when it is stored on a computer storage device. To eliminate redundancy in the audio data, audio codecs are used to compress it. There are three groups of audio file formats:

§ uncompressed audio formats (WAV, AIFF)

§ lossless compressed audio formats (APE, FLAC)

§ lossy compressed audio formats (MP3, Ogg)
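
Because a file format fixes the structure of the stored data, many of these formats can be recognized from the first few bytes of the file. The following is a simplified sketch, not a complete detector: the function name classify_header and the table layout are illustrative, and several shortcuts are assumed. An Ogg container is treated as lossy although it can also carry lossless streams, a WAV file is treated as uncompressed although it can hold compressed data, and an MP3 file is recognized only when it starts with an ID3v2 tag.

```python
# (format name, group, [(byte offset, expected bytes), ...])
SIGNATURES = [
    ("WAV", "uncompressed", [(0, b"RIFF"), (8, b"WAVE")]),
    ("AIFF", "uncompressed", [(0, b"FORM"), (8, b"AIFF")]),
    ("FLAC", "lossless", [(0, b"fLaC")]),
    ("APE", "lossless", [(0, b"MAC ")]),
    ("MP3", "lossy", [(0, b"ID3")]),    # only files that begin with an ID3v2 tag
    ("Ogg", "lossy", [(0, b"OggS")]),   # container; assumed to hold Vorbis
]


def classify_header(header):
    """Return (format name, group) for the leading bytes of a file, or None."""
    for name, group, checks in SIGNATURES:
        if all(header[off:off + len(magic)] == magic for off, magic in checks):
            return name, group
    return None


print(classify_header(b"RIFF\x24\x08\x00\x00WAVEfmt "))  # ('WAV', 'uncompressed')
print(classify_header(b"fLaC\x00\x00\x00\x22"))          # ('FLAC', 'lossless')
```

In practice the header would come from reading the first dozen bytes of a file opened in binary mode.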

Module music file formats form a separate category. Created synthetically or from prerecorded samples of live instruments, they are used primarily to create modern electronic music (MOD). The MIDI format can also be placed here: it is not a sound recording, but with the help of a sequencer it allows music to be recorded and played back using a defined set of commands.

Digital audio media formats are used both for mass distribution of sound recordings (CD, SACD) and for professional sound recording (DAT, MiniDisc).

For surround sound systems, one can also distinguish sound formats that are mainly multichannel soundtracks for films. Such systems come in whole families of formats from two major competitors: Digital Theater Systems Inc. (DTS) and Dolby Laboratories Inc. (Dolby Digital).