How to digitize sound quality



Many books and articles have been written on how to use a sound card, including on our website.

This time, however, we will not talk about what every regular reader of the Multimedia section already knows, but about the practice of digital sound recording. Surely any owner of a multimedia computer takes up this exciting activity sooner or later; in fact, it is one of the reasons people buy a computer in the first place. The process is not that simple, though, and achieving the highest quality requires some skill. The purpose of this article is to give the readers of the site (and the owners of the SB Live! among them in particular) some useful recommendations in this area which, for one reason or another, are not adequately covered in the press or on the Web.

To begin with, at one point I faced the task of converting my cassette music library to MP3 files, and I had to spend more than one night making the process of transferring audio to the computer as high in quality and as universal as possible for most recordings. I will say right away that, despite my solid experience in recording (both analog and digital), this seemingly innocent occupation demanded a serious mobilization of my effort and knowledge.

However, the user of a decent sound card is by no means obliged to have a degree in radio engineering (as I do), and yet has every right to demand decent quality from the resulting recording. I consider it my duty to give the iXBT audience the minimum of information that, I hope, will help avoid many of the problems associated with digitizing audio (such as hum, interference, etc.). I think some of this material will be useful for advanced users too. For the sake of modesty, I will add that everything written below generalizes the experience of many people but, of course, does not claim to be the ultimate truth. Reasoned feedback from readers is always welcome! (You can also leave your comments in our conference, in the section About Site Materials.)

General remarks
Most of the time, multimedia users have to digitize the following sources:

Vinyl records. The main thing here is a good turntable and a phono preamplifier with equalization (the kind built into expensive amplifiers). Of the home turntables, I recommend the Phoenix EP 009S (elliptical diamond stylus, automatic tonearm). We then record the disc to the computer, clean it of clicks (Click Elimination), filter out the infrasound below 16 Hz (to eliminate low-frequency noise), and cut the recording into songs. It is better not to apply noise reduction, since a noise level of 65-70 dB below the signal at the output of the turntable (or the equalizer) is not that bad. For comparison, the analog output of most CD-ROM drives gives the same 65-70 dB, and nobody minds. But hum (an unpleasant low-frequency tone at 50, 100, 150 Hz, etc.) is better dealt with before digitizing: either a ground connection is floating somewhere or the polarity inside the turntable is reversed.

Microphone. I mean a good microphone and a good microphone amplifier. A lot of information about both can be found in print and on the Web, so I will give advice on one point only.

The point is that studio practice has a very clever principle for wiring patch cords. Everyone already knows about the twisted pair of signal lines, but how to wire up the ends of the cables is known only to the initiated, and even then not to all of them.

The following figure shows how to properly make a cable that will not degrade the recording quality, provided it is built from quality wire. A copper braid is used as the shield (copper is desirable everywhere!). The signal wires inside the shield are a twisted pair of stranded copper conductors. It is better to buy such cable from a store that sells professional microphones, guitars, etc. (the cable will cost you less than the interference would). Note that such scrupulousness about the cable is needed only with a microphone; without it, you will be swapping microphone amplifiers and microphones until the Greek calends.


How digital sound works (Part 3)


Frequency

Having finished with bit depth, it is time to move on to frequency. The sampling frequency sets the entire range of sounds that can be recorded, while the bit depth affects only the volume and dynamic range. It determines how many of the 16-bit numbers we talked about earlier are recorded for each second of audio (per channel).

Here everything is relatively simple. Humans hear sounds ranging from 20 hertz to 20 kilohertz (20,000 hertz). 1 hertz means that the wave completes one full oscillation (from maximum to minimum and back) per second; 20 hertz means 20 oscillations per second.

Sound with a frequency below 20 hertz is infrasound, which can be harmful to health at high intensity. People do not hear sound above 20 kilohertz; these waves oscillate too fast for the ears to pick up. Of course, many people imagine that they hear all frequencies perfectly, even above 20 kilohertz, but in fact most of the people reading this text can hardly hear anything above 17-19 kilohertz, especially if they overuse MP3 players.

Most musical content lies between 25 hertz and 10 kilohertz. The .WAV format, as used on audio discs, records sound up to 22.05 kilohertz per channel. The reason is that recording equipment does not have ideal sensitivity: its response falls off as it approaches the upper end of the range. The upper limit is therefore set at 22.05 kilohertz, so that sensitivity stays at its maximum up to 20 kilohertz.

A typical piece of nonsense that audiophiles spread about frequency is the claim that the higher the sampling frequency, the more accurately a sinusoid can be reconstructed. The more accurate the sinusoid, the better the sound, so supposedly it is better to listen to music sampled at up to 192 kilohertz. Does this make sense?

To be honest, here we are dealing with plain ignorance of mathematics. If we know the maximum frequency in a signal, we can reproduce its shape ideally using the Nyquist-Shannon theorem, also known as Kotelnikov's theorem, which states that the sampling frequency must be at least twice the highest frequency present in the signal. That is, for 20 kilohertz a sample rate of 40 kilohertz is enough, and from those samples the ideal waveform can be reproduced.
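As a rough illustration of the theorem, the sketch below samples a 15 kHz tone at 44.1 kHz and then rebuilds the value halfway between two samples using Whittaker-Shannon (sinc) interpolation. The tone frequency, the number of samples and the function names are assumptions chosen for the example; the small remaining error comes from using a finite number of samples, not from the sampling itself.

```python
import math


def sinc(x: float) -> float:
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def reconstruct(samples: list[float], sample_rate: float, t: float) -> float:
    """Whittaker-Shannon interpolation: estimate the signal at time t (seconds)."""
    return sum(s * sinc(t * sample_rate - n) for n, s in enumerate(samples))


SAMPLE_RATE = 44_100.0  # Hz, the CD rate discussed in the text
TONE = 15_000.0         # Hz, assumed test tone below half the sample rate

# Record the tone at the sample instants only.
samples = [math.sin(2 * math.pi * TONE * n / SAMPLE_RATE) for n in range(4000)]

# Ask for the value halfway between samples 2000 and 2001.
t_mid = 2000.5 / SAMPLE_RATE
estimate = reconstruct(samples, SAMPLE_RATE, t_mid)
exact = math.sin(2 * math.pi * TONE * t_mid)
error = abs(estimate - exact)

print(f"exact={exact:.6f} estimate={estimate:.6f} error={error:.2e}")
```

The value between the samples is recovered from the samples alone, which is the point of the theorem: nothing "hides" between sample instants as long as the signal contains no frequencies above half the sample rate.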

You can look up the proof of this theorem yourself if you need it. I will just say that it has been proven and that, in itself, it has nothing to do with sound or with any technical aspect of sound recording. It is just a fundamental law of the universe.

For whatever reason, audiophiles do not accept this. In their understanding, a sound wave manages to make incomprehensible wiggles back and forth or up and down in the tiny interval between samples, and so it must be captured continuously so as not to lose information. In fact, a wave limited to this frequency range is physically incapable of this.

Since actual audio recordings go up to 22.05 kHz, .WAV files use a sample rate of 44.1 kHz per channel. This is done so that the listener's equipment can reconstruct exactly the waveform that was captured during recording. It has nothing to do with sampling errors; the rate is needed to recreate the sinusoid, and only for that.

The question may arise, what to do if the ADC gave an error during recording and showed the wrong number that corresponds to the actual pressure value at that time. We will talk about this in the next section.

6. ADC, DAC and amplifiers

In general, reading thematic forums and sites, I get the impression that ADCs and DACs are some kind of mystical devices for audiophiles. In fact, this is just a chain of resistors connected in a particular order. As in any electrical device, the voltage in ADCs and DACs constantly fluctuates back and forth, thanks to quantum mechanics, and nothing can be done about this process. The main question is whether these measurement errors matter at all.

As we remember, the value produced by the ADC represents pressure. A person's sensitivity to pressure is a difficult subject in its own right, especially considering that it changes with conditions. But overall, it is pretty obvious that humans do not have the sensitivity to distinguish all 65,536 possible steps of the dynamic range. In terms of decibels, people do not consciously notice a difference of 0.2 decibels, although they may perceive it unconsciously. A difference of 0.1 decibels is considered indistinguishable, both consciously and unconsciously.

How digital sound works (Part 2)


What is sound?

Sound is actually a wave transmitted through a physical medium, in our case air. This wave is almost impossible to visualize, since it is three-dimensional and propagates in all directions with a fairly complex geometry. To display a wave graphically, a sine curve is usually drawn. It is important to understand that the sine curve is NOT the wave itself; it is just a graph. It shows the state of the wave at one point in space as time passes, and nothing else: we see only the part of the wave that passes through that point. However, this is more than enough to capture the properties of the wave, such as its frequency.

The value plotted on the sine curve is, in the physical sense, the pressure that the sound wave exerts on a microphone or on a person's ears. This pressure is measured in micropascals, and it is very important to understand that any sound, music included, is an oscillation with a certain frequency (in the case of music, a changing frequency), not a single pressure value taken at a given moment. Air pressure by itself is not sound and carries no sound information to the human brain. When the pressure fluctuates from one value to another, say with a frequency of 15 kilohertz, it creates a high-pitched, “screeching” sound. The size of the pressure swing during such fluctuations determines the volume: the larger the swing, the greater the volume. When the pressure is too high

Therefore, I repeat, the pressure value at a given moment does not contain any information about the sound, and if there is no oscillation, any value corresponds to silence.

3. What are decibels?

Now that we have covered the physical nature of sound (I hope), it is time to talk about something as mystical as decibels. Put simply, decibels are “just” a unit of measurement, the same as megabytes and the rest.

The problem for many people is that the decibel is not a linear unit: it is a unit in which each step grows exponentially compared to the previous one. Suppose we have 1 decibel of something, and then we get 2 decibels. If you lay these two decibels out along a ruler marked in centimeters, the first decibel occupies only one centimeter, while the second occupies two whole centimeters, so the total is 3 centimeters. This is because the second decibel has grown exponentially compared to the first. Add a third decibel, and it will take up 4 centimeters on this ruler, for a total of 7 centimeters. (This is just an example to show exponential growth.)

If you are far from engineering, then you may be wondering why such a unit of measure is needed. The answer to this question is beyond the scope of this post, and if anyone is interested, I suggest they watch this video:

I will keep talking about sound. In our case, we use decibels for volume and nothing else. That is, 0 decibels for us corresponds to absolute silence, while, say, 140 decibels can literally kill; that is how loud it is. The main thing to remember is that even though we measure volume in decibels, this unit still grows exponentially. A sound with a volume of 140 decibels is not 140 times louder than a 1 decibel sound but millions of times (roughly 8.9 million times in terms of sound pressure).

Some may also wonder what negative decibels are, such as -40 decibels. They are the same thing: in many audio devices, engineers take a certain value, say 80 decibels, as the “standard” volume and measure lower and higher volumes from it. That reference value is 0 decibels in the local scale of the device. In some cases, 0 decibels is the maximum volume, and on such equipment the level is measured exclusively downwards.
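The arithmetic behind such comparisons can be sketched as follows. This assumes the usual sound-pressure convention, in which a difference in decibels equals 20·log10 of the pressure ratio (and 10·log10 of the power ratio); the function names are made up for the example.

```python
def pressure_ratio(db_difference: float) -> float:
    """Ratio of sound pressures for a given level difference in decibels."""
    return 10 ** (db_difference / 20)


def power_ratio(db_difference: float) -> float:
    """Ratio of sound powers (intensities) for the same level difference."""
    return 10 ** (db_difference / 10)


# 140 dB compared with 1 dB: a 139 dB difference.
ratio_139 = pressure_ratio(140 - 1)
print(f"pressure ratio for 139 dB: about {ratio_139:,.0f}")

# A few familiar reference points.
print(f"+6 dB  -> pressure x{pressure_ratio(6):.2f}")
print(f"+20 dB -> pressure x{pressure_ratio(20):.0f}")
print(f"+10 dB -> power x{power_ratio(10):.0f}")
```

Each additional 20 dB multiplies the pressure by ten, which is why a difference of 139 dB corresponds to a ratio in the millions rather than to a factor of 139.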

We will not use these negative decibels, and for us, absolute silence will always be 0 decibels.

4. Bit depth

Now that we have cleared up or recalled the basics, it is time to move on to how digital audio is recorded. Sound is picked up by a microphone, a device that captures the vibrations of a sound wave and converts them into an electrical signal whose voltage fluctuates in proportion to those vibrations, so that its waveform is the same.

How digital sound works. (Part 1)


In this post, I would like to talk about digital sound and, along the way, debunk a popular form of crankery: audiophilia.

Unfortunately, lately I see more and more manifestations of this phenomenon, which gets into the minds of even quite reasonable people and makes them spend money on the technological equivalent of homeopathic pills. I say “unfortunately” because everything I write in this article should, in principle, be known to anyone who finished school. But for some reason I do not understand, people forget the knowledge they once acquired or do not want to apply it in practice. Belief in audiophilia has by now spread widely even among engineers, although they, of all people, should understand these things thoroughly.

I originally wanted to write this article in a more aggressive style, but in the end I decided it would be better to do without insults and provocations. On the contrary, I really hope that audiophiles will read this article and reflect on what they believe and whether they have enough reason to believe it. So I will avoid provocation and focus exclusively on the facts.

And the most important thing I want to say right away: the problem with audiophile arguments is not merely technical or engineering. Audiophiles' arguments contradict science, specifically physics and mathematics. They contradict technical and engineering facts too, and audiophiles do not know how their audio systems work, but that is a small problem compared to contradicting physical and mathematical laws, which shows complete ignorance of the basics. It is the scientific aspects I will focus on, rather than explaining what the different types of DAC are and other details that are not of fundamental importance.

1. Basics: how sound is reproduced on a computer and any other electronic device

To begin with, an audio file sits on a digital medium, such as a hard drive. Audio files come in various internal formats, but each is ultimately a set of zeros and ones (0110010101 …); that is, any file can be represented as one very large number. This number can easily be converted to the usual decimal system (189208 …).
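A minimal sketch of this idea, using four stand-in bytes instead of a real audio file (the byte values are an assumption chosen for the example):

```python
# Four bytes standing in for the contents of a file (they spell "RIFF" in ASCII).
data = bytes([0x52, 0x49, 0x46, 0x46])

# The same content written out as zeros and ones.
as_binary = "".join(f"{byte:08b}" for byte in data)

# The same content read as one (potentially very large) integer.
as_number = int.from_bytes(data, byteorder="big")

# Converting the number back yields the original bytes, so nothing is lost.
restored = as_number.to_bytes(len(data), byteorder="big")

print(as_binary)
print(as_number)
print(restored == data)
```

A real audio file works the same way; the number is simply millions of digits long.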

The direct consequence of this is that the copies of the same file are all exactly the same. It doesn’t matter what medium they are in or how they were transferred or created: if the copies are correct, then they are exactly the same. The difference in playing the same file can only be caused by some other element in this play chain.
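One common way to check that two copies really are identical is to compare cryptographic hashes of their contents. The sketch below does this with a temporary stand-in file; the file names and contents are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file's contents as a hex string."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


with tempfile.TemporaryDirectory() as folder:
    original = Path(folder) / "track.wav"        # hypothetical file name
    copy = Path(folder) / "track_copy.wav"       # hypothetical file name
    original.write_bytes(bytes(range(256)) * 100)  # stand-in content, not real audio
    shutil.copyfile(original, copy)

    # Equal digests mean the two files hold the same bytes.
    copies_match = sha256_of(original) == sha256_of(copy)

print("copies identical:", copies_match)
```

If the digests match, any audible difference between playing the two copies has to come from somewhere else in the chain.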

And this chain looks like this:

File -> audio player program -> digital to analog converter (DAC) -> amplifier -> speakers or headphones.

It works like this:

First, the player program loads (or receives from outside) an audio file into memory.

The software then decodes it, if necessary, into an uncompressed digital stream, which is digital audio. We will simply call this uncompressed digital audio .WAV and assume that this is the format in which music is delivered on conventional audio discs (two-channel stereo, 16-bit, 44.1 kilohertz per channel).

After that, this sound enters a digital-to-analog converter, which takes each number and converts it to the corresponding analog value, most often a voltage measured in volts (from a certain minimum that corresponds to the digital number 0 up to a maximum that corresponds to the number 65,535, the largest number that can be written in 16 bits).
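As a simplified sketch of this mapping, the function below converts a 16-bit code to a voltage on a linear scale. The 0 to 2 volt range and the names are assumptions for illustration; a 16-bit number can take 65,536 distinct values, 0 through 65,535, and real audio samples are normally stored as signed values, which changes the offset but not the idea.

```python
V_MIN = 0.0              # volts, assumed output for code 0
V_MAX = 2.0              # volts, assumed output for the largest code
MAX_CODE = 2 ** 16 - 1   # 65,535, the largest unsigned 16-bit number


def dac_output(code: int) -> float:
    """Map a 16-bit code linearly onto the range V_MIN..V_MAX."""
    if not 0 <= code <= MAX_CODE:
        raise ValueError("code does not fit in 16 bits")
    return V_MIN + (V_MAX - V_MIN) * code / MAX_CODE


step = dac_output(1) - dac_output(0)  # voltage difference between adjacent codes

print(dac_output(0), dac_output(MAX_CODE))
print(f"one step is about {step * 1e6:.1f} microvolts")
```

With these assumed limits a single step is roughly 30 microvolts, which gives a feel for how fine the 16-bit scale is.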

After that, the sound, already in the form of electric current, enters the amplifier, the task of which is to raise the voltage to a value that suits the speakers. The amplifier must amplify the signal linearly, that is, each value that reaches it at the input must increase in the same proportion at the output.

In the speakers, the electric current is converted into physical vibrations, which are transmitted to the air and thus the sound we hear is obtained.

This chain, which from now on we will call the audio path, is present in one form or another in any digital audio system. The elements themselves may look very different on different systems (MP3 players, smartphones, computers, etc.), but they are necessarily present. When it comes to a computer, the DAC and amplifier are on the sound card (which is often built into the motherboard). Speakers often have their own built-in amplifier, and some of them may have their own DAC (and connecting to them bypasses the sound card).

Basic concepts of digital sound theory


Sound is, in general, the vibration of an elastic medium. It is caused by mechanical vibrations of some object (a string, vocal cords, etc.) in contact with the medium. The frequency of vibration (measured in hertz) determines the pitch: the higher the frequency, the higher the pitch. The human ear can perceive sound vibrations in air with a frequency of 20 Hz to 20 kHz. The ear perceives the amplitude of the vibration as volume: the higher the amplitude, the louder the sound.

Electromagnetic oscillations are a direct analog of sound waves. They are less susceptible to dissipation in the environment, and the information they carry is easier to store and process. This makes them the most important secondary carrier of sound. The transformation of acoustic waves into electromagnetic ones (and the reverse) relies on ordinary induction: a current appears in a conductor placed in an alternating magnetic field.

Simply put, when the magnet on a loudspeaker membrane oscillates near the coil, it induces an alternating current in the coil. If this current is fed to another speaker, the magnet on its membrane will move, creating the corresponding sound.

This is how the telephone and the radio work.

Sound converted to electromagnetic form can be easily stored. To do this, some parameter of the medium (the depth of the record groove or the degree of magnetization of the tape) is made to follow the amplitude of the oscillations (that is, the strength of the current induced in the speaker coil). Sound converted directly to electromagnetic oscillations is called analog sound. Its main characteristic is the direct correspondence between the transmitted or recorded electromagnetic oscillations and the acoustic ones.

Digital sound is relatively new. Its main difference from analog is that it is discrete. When digitizing, a special device, an analog-to-digital converter (ADC), measures at regular intervals (approximately 0.001-0.0001 seconds) the amplitude of the electromagnetic wave corresponding to the analog sound and writes the value to a file with a specified precision. Each such value is called a sample. The digitization process itself is often called sampling.
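The digitization step can be imitated in a few lines. In this sketch the "analog" input is a mathematical function of time, and the sample rate, bit depth and the name `digitize` are assumptions chosen to match CD audio rather than anything prescribed by the text.

```python
import math


def digitize(signal, duration_s: float, sample_rate: int = 44_100, bits: int = 16) -> list[int]:
    """Measure signal(t) at regular intervals and round each value to a signed integer."""
    full_scale = 2 ** (bits - 1) - 1             # 32,767 for 16 bits
    count = round(duration_s * sample_rate)      # number of samples to take
    samples = []
    for n in range(count):
        value = signal(n / sample_rate)          # expected to lie in -1.0..1.0
        value = max(-1.0, min(1.0, value))       # clip anything outside that range
        samples.append(round(value * full_scale))
    return samples


def tone(t: float) -> float:
    """A 440 Hz sine wave standing in for the analog input."""
    return math.sin(2 * math.pi * 440.0 * t)


pcm = digitize(tone, duration_s=0.01)
print(len(pcm), "samples; first five:", pcm[:5])
```

At 44.1 kHz the interval between measurements is about 0.000023 seconds, so even a hundredth of a second of sound yields several hundred samples.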

When sound is converted from digital back to analog (by a device called a digital-to-analog converter, DAC), the intermediate amplitude values are interpolated (approximated) from the known ones. Since the sampling frequency is usually high, this operation reconstructs the original analog signal fairly accurately.

The digital form of sound is characterized by five parameters:

1. Sampling rate.
2. Bit size of the samples.
3. Number of channels or tracks.
4. Compression/decompression algorithm (codec).
5. Storage format.

Since each of these parameters is quite specific, we will consider them separately.
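To see how these parameters appear in practice, here is a small sketch that writes a short tone with Python's standard `wave` module and reads the header back. The tone, its length and the in-memory buffer are assumptions for the example; the `wave` module handles only uncompressed PCM, so the codec is effectively "none" and the storage format is WAV.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44_100       # 1. sampling rate, in hertz
SAMPLE_WIDTH_BYTES = 2     # 2. bit size of the samples: 2 bytes = 16 bits
CHANNELS = 2               # 3. number of channels (stereo)

# Build 0.1 s of a 440 Hz tone, the same value in the left and right channels.
frames = bytearray()
for n in range(SAMPLE_RATE // 10):
    value = round(32767 * 0.5 * math.sin(2 * math.pi * 440.0 * n / SAMPLE_RATE))
    frames += struct.pack("<hh", value, value)  # little-endian signed 16-bit pair

# Write the samples into an in-memory WAV container.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav_file:
    wav_file.setnchannels(CHANNELS)
    wav_file.setsampwidth(SAMPLE_WIDTH_BYTES)
    wav_file.setframerate(SAMPLE_RATE)
    wav_file.writeframes(bytes(frames))

# Read the header back to confirm what was stored.
buffer.seek(0)
with wave.open(buffer, "rb") as wav_file:
    params = wav_file.getparams()

print(params)
```

The printed parameters include the channel count, sample width, frame rate, number of frames and compression type, which map directly onto the list above.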

Sampling rate
The sample rate determines how many samples per second are taken when digitizing. If we compare digital sound with digital images, the sample rate corresponds to resolution (a more “realistic” analogy is the frame rate in cinema). The higher the sampling frequency, the better the analog signal can be reconstructed from the digital form of the sound (more precisely, the higher the sampling frequency, the broader the spectrum of frequencies that can be recorded during digitization).
The famous Nyquist-Kotelnikov theorem states that for the correct reconstruction of an analog signal from its digital recording, the sampling frequency must be at least twice the maximum frequency in the sound.

Since the upper limit of hearing is 20 kHz, the sample rate should be at least 40 kHz. This is why the standard sampling frequency used for recording CDs is 44.1 kHz (so-called CD quality). The sample rate can be higher, but such quality is used only by recording studios and especially demanding music lovers.
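What happens when the rule is broken can be shown numerically. In the sketch below, a tone above half the sample rate produces exactly the same samples (up to sign) as a lower "alias" tone, so the two cannot be told apart after digitization. The 30 kHz test tone and the function name are assumptions for the example.

```python
import math


def alias_frequency(tone_hz: float, sample_rate_hz: float) -> float:
    """Frequency at which a sampled pure tone appears after sampling."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


SAMPLE_RATE = 44_100.0
too_high = 30_000.0                               # above SAMPLE_RATE / 2
alias = alias_frequency(too_high, SAMPLE_RATE)    # 14,100 Hz

# Compare the samples of the 30 kHz tone with those of its alias.
largest_difference = max(
    abs(
        math.sin(2 * math.pi * too_high * n / SAMPLE_RATE)
        + math.sin(2 * math.pi * alias * n / SAMPLE_RATE)  # alias appears with inverted sign
    )
    for n in range(100)
)

print(f"a {too_high:.0f} Hz tone is recorded as {alias:.0f} Hz")
print(f"largest sample difference: {largest_difference:.1e}")
```

This is why recording equipment filters out everything above half the sample rate before the signal reaches the converter.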

A sample rate of 44.1 kHz is not always ideal. When data is transmitted over a low-bandwidth network, sound quality has to be sacrificed in favor of size; in practice, sampling frequencies two, four and eight times lower than 44.1 kHz are often used.
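The trade-off is easy to put into numbers. This sketch assumes uncompressed 16-bit stereo and computes the data rate for 44.1 kHz and for the rates two, four and eight times lower; the function name is made up for the example.

```python
def pcm_bytes_per_second(sample_rate_hz: float, bits_per_sample: int = 16, channels: int = 2) -> float:
    """Data rate of uncompressed PCM audio in bytes per second."""
    return sample_rate_hz * bits_per_sample * channels / 8


CD_RATE = 44_100.0

# Map each divisor to (sample rate, bytes per second).
rates = {
    divisor: (CD_RATE / divisor, pcm_bytes_per_second(CD_RATE / divisor))
    for divisor in (1, 2, 4, 8)
}

for divisor, (rate, bytes_per_second) in rates.items():
    highest = rate / 2  # highest recordable frequency at this rate
    print(f"1/{divisor}: {rate:8.1f} Hz, up to {highest:7.1f} Hz, {bytes_per_second:8.0f} bytes/s")
```

Halving the sample rate halves the size, but it also halves the highest frequency that can be recorded, so at one eighth of the CD rate nothing above roughly 2.8 kHz survives.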

How digital sound is reproduced


Have you ever wondered how sound is reproduced on digital devices?

How is a sound signal formed from a combination of ones and zeros? I am sure you have wondered, since you started reading! But often even professionals have only a general idea of the modern audio path. In this article, you will learn how the different formats appeared, what a digital-to-analog converter is, what types of DACs exist, and what determines the quality of sound reproduction.

PCM
As you know, in digital audio almost every format, with rare exceptions, is recorded as a pulse-code modulation (PCM) stream. FLAC, MP3, WAV, Audio CD, DVD-Audio and other formats are just ways to pack, to “preserve”, the PCM stream.

How it all began
The theoretical foundations of digital sound transmission were developed at the dawn of the 20th century, when scientists tried to transmit an audio signal over a long distance, not by telephone but in a way that was rather strange for the time.

By dividing the sound wave into small parts, it could be sent to the receiver in some kind of mathematical representation. The recipient, in turn, could restore the original waveform and listen to the recording. In addition, scientists were faced with the task of increasing the bandwidth of the “ether”.

In 1933, V. A. Kotelnikov published his theorem. In Western sources, it is called the Nyquist-Shannon theorem. Yes, Harry Nyquist was the first to raise the issue: in 1927 he calculated the minimum sampling frequency for transmitting a waveform, which was later named the “Nyquist frequency” after him, but Kotelnikov's theorem was published 16 years before Shannon's 1949 paper.

The essence of the theorem is simple: a continuous signal can be represented as an interpolation series built from discrete samples, from which the signal can be reconstructed. To restore the original signal, the sampling frequency must be at least twice the upper cutoff frequency of that signal.

For many years the theorem was not in demand, until the advent of the digital age, when it finally found a use. In particular, it was useful in the development of the CDDA (Compact Disc Digital Audio) format, popularly known as Audio CD or Red Book. The format was released by engineers at Philips and Sony in 1980 and became the standard for audio CDs.

Format characteristics:

sampling frequency – 44.1 kHz;
bit depth (quantization) – 16 bits.

INFO
Sampling rate: the number of samples of the signal taken per second during digitization. Measured in hertz.
Bit depth: the number of binary digits used to express the amplitude of the signal. Measured in bits.
The 44.1 kHz sampling rate was derived from Kotelnikov's theorem. It is believed that the hearing of the average person cannot pick up sound beyond 19-22 kHz, so 22 kHz was probably chosen as the upper limit.

22,000 × 2 + 100 = 44,100 Hertz

Where does the 100 Hertz come from? One version holds that it is a small margin in case of errors or oversampling. In fact, Sony chose this frequency for its compatibility with the PAL television standard.
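The video connection can be checked with simple arithmetic. Early digital audio was stored on video recorders through PCM adaptors, and the figures usually quoted are three audio samples per usable video line, with 294 usable lines per field at 50 fields per second for PAL and 245 lines at 60 fields per second for NTSC. These line counts are not taken from this article and should be treated as assumptions of the sketch.

```python
SAMPLES_PER_LINE = 3  # assumed: audio samples stored on each usable video line


def adaptor_sample_rate(lines_per_field: int, fields_per_second: int) -> int:
    """Audio samples per second that fit into a video signal used as a data carrier."""
    return SAMPLES_PER_LINE * lines_per_field * fields_per_second


pal_rate = adaptor_sample_rate(lines_per_field=294, fields_per_second=50)
ntsc_rate = adaptor_sample_rate(lines_per_field=245, fields_per_second=60)

print("PAL: ", pal_rate)
print("NTSC:", ntsc_rate)
```

Under these assumptions both television systems give the same figure, which would explain why a rate slightly above the 44,000 Hz minimum was convenient.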

The bit depth of the CDDA format is 16 bits, or 65,536 quantization levels, which equates to a dynamic range of approximately 96 dB. Such a large number of levels was not chosen by chance: firstly, because of the strong influence of quantization noise, and secondly, to provide a formal dynamic range superior to that of the main competitors at the time, cassette tapes and vinyl records. I will cover this in more detail in the section on digital-to-analog converters.
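The 96 dB figure follows from the number of levels. A minimal sketch, assuming the usual definition of formal dynamic range as 20·log10 of the ratio between the largest and smallest representable amplitude steps (the function name is made up for the example):

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Formal dynamic range of linear PCM with the given bit depth."""
    levels = 2 ** bits
    return 20 * math.log10(levels)


ranges = {bits: dynamic_range_db(bits) for bits in (16, 24, 32)}

for bits, value in ranges.items():
    print(f"{bits} bits: {2 ** bits:,} levels, about {value:.1f} dB")
```

Each extra bit doubles the number of levels and adds about 6 dB, so 24-bit audio comes out at roughly 144 dB by the same formula.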

Development of PCM continued on the principle of multiplying by two. Other sample rates appeared: first 48 kHz was added, and then the rates based on it: 96, 192 and 384 kHz. The 44.1 kHz rate was also doubled, to 88.2, 176.4 and 352.8 kHz. Bit depth increased from 16 to 24 and then to 32 bits.

The next format after CDDA was DAT (Digital Audio Tape), which appeared in 1987. Its sample rate was 48 kHz, and the bit depth did not change. Although the format failed, the 48 kHz sample rate caught on in recording studios, reportedly because of its convenience for digital processing.

In 1999, the DVD-Audio format was released, which made it possible to record on a disc six channels of audio with a sampling frequency of 96 kHz and a bit depth of 24 bits, or two channels (stereo) at 192 kHz and 24 bits.