
Digital audio

what happens to sound within computer programs

Digital audio is a representation of analog sound used by computers and other digital devices to record and reproduce audio information. Like the frames of a movie, a digital audio signal is built from a series of short sound fragments (samples) that are played back in order when we press the play button. There are many different digital audio formats, and they differ from each other in how faithfully they convey the audio information.
About Pulse Code Modulation – PCM
When we talk about acoustic sound or an analog signal, we are always talking about the propagation of sound waves in space. Digital audio, by contrast, is only an approximate description of what happens, or should happen, to sound within computer programs or digital devices.
This article discusses pulse code modulation (PCM), the most common digital audio encoding system. Besides PCM, there are also the DTS and Dolby Digital systems, but these are mainly used in film and video production, and we will not cover them here.
In pulse code modulation, the signal is read (sampled) many times per second. At each reading moment the amplitude of the sound wave is measured and stored. As mentioned above, a digital signal is only an approximate copy of an analog signal, since an analog wave cannot be recreated with perfect precision. The value of each fragment is rounded to the nearest value the format can represent; on playback, all the fragments are played in sequence and we hear a copy of the original analog sound.
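The reading and rounding described above can be sketched in a few lines of Python. This is only an illustration, not how any particular recorder is implemented: the function name `sample_and_quantize`, the 1 kHz test tone, the 8 kHz sample rate and the deliberately coarse 4-bit depth are all assumptions chosen to keep the numbers small.

```python
import math

def sample_and_quantize(tone_hz, sample_rate_hz, bit_depth, n_samples):
    """Read a unit-amplitude sine wave n_samples times and round each
    reading to the nearest signed integer code of the given bit depth."""
    max_code = 2 ** (bit_depth - 1) - 1  # largest positive code, 7 for 4 bits
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz                           # moment of the n-th reading
        amplitude = math.sin(2 * math.pi * tone_hz * t)  # "analog" value in [-1, 1]
        codes.append(round(amplitude * max_code))        # rounding = quantization
    return codes

# One full cycle of a 1 kHz tone read 8 times at 8 kHz, stored with 4 bits.
codes = sample_and_quantize(tone_hz=1000, sample_rate_hz=8000, bit_depth=4, n_samples=8)
print(codes)  # [0, 5, 7, 5, 0, -5, -7, -5]
```

The true value at the second reading is about 4.95 steps, but only whole steps can be stored, so it becomes 5. That small difference is the rounding error every PCM sample carries.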
“What values are we talking about?” you may ask. Just as analog audio is defined by frequency and amplitude, digital audio is determined by two important values: the sample rate and the bit depth. The sample rate is how many times per second the audio signal is read, and the bit depth is the number of bits used to store each reading, which determines the dynamic range each fragment of the signal can capture.
Sampling rate
The standard 44.1 kHz sample rate used for recording audio to CDs (remember those?) might seem like a random number, but it is not. The value follows from Kotelnikov’s theorem (also known as the Nyquist-Shannon sampling theorem), which essentially states that the sampling frequency must be more than twice the highest frequency present in the signal being sampled. The upper limit of human hearing is about 20 kHz, so the sampling frequency must be higher than 40 kHz. The additional 4.1 kHz leaves room for the filter that removes content above 20 kHz before sampling; without it, those higher frequencies would fold back into the audible range as distortion, the so-called aliasing effect. In theory, 44.1 kHz is sufficient to accurately reproduce the audible part of an audio signal, but higher sample rates are also in use.
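To see what aliasing does in numbers, here is a minimal sketch that computes the frequency a pure tone appears to have after sampling. The helper name `apparent_frequency` is hypothetical, and the model assumes an ideal sampler with no filter in front of it.

```python
def apparent_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a pure tone shows up after ideal sampling.

    Tones below half the sample rate are kept as they are; anything
    above that limit folds back ("aliases") into the range below it.
    """
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

print(apparent_frequency(20_000, 44_100))  # 20000: below 22.05 kHz, preserved
print(apparent_frequency(25_000, 44_100))  # 19100: folds back into the audible range
```

A 25 kHz tone is inaudible on its own, yet sampled at 44.1 kHz without filtering it would be stored as an audible 19.1 kHz tone.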
For example, 48 kHz is the dominant standard in film and video production, where sound has to be synchronized with a picture running at 24 frames per second. We won’t go into the details of why exactly 24 frames per second was chosen; in short, it is roughly the minimum frame rate at which motion looks smooth and pleasing to the eye. The sample rate should fit this frame rate: 48,000 samples per second divide into exactly 2,000 samples per frame, while 44,100 samples per second give 1,837.5, which is not a whole number. Bringing 44.1 kHz audio into a 48 kHz video project therefore requires sample rate conversion, and if the mismatch is handled carelessly, picture and sound can drift noticeably out of sync.
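One way to see how each sample rate lines up with a 24 frames per second picture is to count the audio samples that fall within a single frame. The sketch below does just that; `samples_per_frame` is an illustrative helper, and exact fractions are used so that no rounding hides the result.

```python
from fractions import Fraction

def samples_per_frame(sample_rate_hz, frames_per_second):
    """Exact number of audio samples that span one video frame."""
    return Fraction(sample_rate_hz, frames_per_second)

for rate in (44_100, 48_000):
    per_frame = samples_per_frame(rate, 24)
    whole = per_frame.denominator == 1  # True when a frame holds a whole number of samples
    print(rate, float(per_frame), whole)
# 44100 1837.5 False
# 48000 2000.0 True
```

At 48 kHz every frame boundary coincides with a sample boundary, which makes cutting and aligning sound to picture straightforward.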
Even higher sample rates are derived from these two base frequencies, 44.1 and 48 kHz, by multiplying them by 2 or 4. That gives 88.2, 96, 176.4 and 192 kHz, the standard higher sample rates supported by modern audio equipment.
Bit depth
The bit depth of an audio file tells us about its dynamic resolution or, more simply, its clarity. You can draw an analogy with digital photography: the higher the resolution of the photo, the clearer and better the image will be.
It is important to note here that we are not talking about the loudness of the signal, but about a cleaner, clearer and more realistic sound, that is, a more accurate transmission of the audio signal. Each additional bit doubles the number of amplitude values a fragment can take, so the rounding applied to each fragment becomes finer.
Bit depth can be compared to the text in a book. The lower the bit depth, the less sense the text makes. Lowering the bit depth is like letters starting to disappear from words and punctuation marks from sentences. At first we can still grasp the meaning of the text, but if the bit depth continues to decrease, the information becomes so distorted that we simply stop understanding what it is about. The same goes for sound: the lower the bit depth, the more distorted the sound we hear.
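The effect of bit depth can also be put into numbers. The sketch below counts the amplitude values available at a given bit depth, estimates the theoretical dynamic range with the standard rule of thumb of about 6 dB per bit (a figure not stated in this article), and measures how far a single value drifts when rounded. All function names are hypothetical, and real converters behave less ideally than this model.

```python
import math

def quantization_levels(bit_depth):
    """Number of distinct amplitude values a sample can take."""
    return 2 ** bit_depth

def dynamic_range_db(bit_depth):
    """Theoretical dynamic range of ideal PCM, about 6.02 dB per bit."""
    return 20 * math.log10(quantization_levels(bit_depth))

def quantize(value, bit_depth):
    """Round a value in [-1, 1] to the nearest step and scale it back."""
    max_code = 2 ** (bit_depth - 1) - 1
    return round(value * max_code) / max_code

for bits in (4, 8, 16, 24):
    error = abs(quantize(0.3, bits) - 0.3)
    print(bits, quantization_levels(bits), round(dynamic_range_db(bits), 1), error)
```

Going from 4 to 16 bits shrinks the rounding error on the same input value by several orders of magnitude, which is the numerical counterpart of the missing letters in the book analogy.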





