Audio Bit Depth Explained


Bit Depth

When it comes to producing or enjoying high-quality audio, understanding bit depth is essential. This technical aspect of digital audio determines the level of precision and accuracy with which sound is captured and reproduced. For sound engineers and audiophiles alike, a deep understanding of bit depth is a must-have skill for creating and experiencing truly exceptional sound.

What is Bit Depth?

Bit depth refers to the number of bits used to represent each sample in a digital audio file. Each sample represents the amplitude of the audio signal at a specific point in time. The bit depth determines the range of values that can be used to represent the amplitude of each sample. A higher bit depth provides a larger range of possible values, resulting in a more accurate representation of the audio signal. This, in turn, leads to a higher-quality audio recording.

Common bit depths used in audio recording and production include 16-bit, 24-bit, and 32-bit. The most common bit depth used in consumer audio devices is 16-bit, while 24-bit and 32-bit are more commonly used in professional audio production.
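To make the "range of values" concrete, here is a minimal sketch that counts the distinct amplitude values available at each of these bit depths. It assumes plain integer (linear PCM) samples. In practice 32-bit audio is often stored as floating point, which this sketch does not model. The function name is purely illustrative.

```python
def quantization_levels(bit_depth: int) -> int:
    """Number of distinct amplitude values an integer PCM sample can take."""
    return 2 ** bit_depth


for bits in (16, 24, 32):
    print(f"{bits}-bit: {quantization_levels(bits):,} levels")
```

Each additional bit doubles the number of available levels, so moving from 16-bit to 24-bit multiplies it by 256.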

How Does Bit Depth Affect Audio Quality?

The bit depth of an audio recording has a significant impact on its overall quality. A higher bit depth provides a more accurate representation of the audio signal, resulting in a more natural and lifelike sound. With a higher bit depth, the audio signal can be recorded and processed with greater precision and accuracy, allowing for a wider dynamic range and more nuanced expression.

On the other hand, a lower bit depth can result in quantization errors, which can introduce distortion and noise into the audio signal. This can result in a loss of detail and clarity, particularly in quiet or complex passages of music.
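The effect on quiet passages can be illustrated with a rough sketch that rounds a quiet sine wave to an 8-bit and a 16-bit grid and measures the resulting error. The signal level (1% of full scale), the 441 Hz tone, the 48 kHz sample rate, and the function names are all assumptions chosen for illustration. No dither is applied, although real converters often use it.

```python
import math


def quantize(x: float, bits: int) -> float:
    """Round x (in the range -1.0..1.0) to the nearest step of a signed integer grid."""
    scale = 2 ** (bits - 1) - 1  # e.g. 32767 for 16-bit
    return round(x * scale) / scale


def rms_error(bits: int, amplitude: float = 0.01, n: int = 4800) -> float:
    """RMS difference between a quiet 441 Hz sine and its quantized version."""
    total = 0.0
    for i in range(n):
        x = amplitude * math.sin(2 * math.pi * 441.0 * i / 48000.0)
        total += (quantize(x, bits) - x) ** 2
    return math.sqrt(total / n)


err8 = rms_error(8)
err16 = rms_error(16)
print(f"RMS error at 8-bit:  {err8:.2e}")
print(f"RMS error at 16-bit: {err16:.2e}")
```

At 8 bits the quiet sine only ever lands on a handful of levels, so the error is a large fraction of the signal itself. At 16 bits the same signal spans hundreds of levels and the error is far smaller.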

Bit Depth and Dynamic Range

The dynamic range of an audio recording refers to the difference between the loudest and softest parts of the recording. A higher bit depth allows for a wider dynamic range, as the signal can be recorded with greater accuracy and precision. This means that even the softest parts of the recording can be captured with a higher level of detail and clarity, resulting in a more natural and lifelike sound.

For example, a recording of a classical music performance with a wide dynamic range may require a higher bit depth to capture the full range of dynamics and expression. Without a sufficient bit depth, the softer parts of the performance may be lost, resulting in a less engaging and less satisfying listening experience.
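As a rough guide to how much dynamic range each bit depth offers, the sketch below uses the standard textbook formula for an ideal quantizer driven by a full-scale sine wave, about 6.02 dB per bit plus 1.76 dB. This formula is not taken from the article. It describes an idealized, undithered converter, and real hardware falls short of it.

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Theoretical signal-to-quantization-noise ratio for a full-scale sine, in dB."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


for bits in (16, 24):
    print(f"{bits}-bit: about {dynamic_range_db(bits):.1f} dB")
```

Under these assumptions, 16-bit audio offers roughly 98 dB and 24-bit audio roughly 146 dB.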

Conclusion

Understanding bit depth is crucial for anyone involved in the production or enjoyment of high-quality audio. By providing a more accurate representation of the audio signal, a higher bit depth can result in a more natural and lifelike sound, with a wider dynamic range and more nuanced expression. A lower bit depth, by contrast, can result in quantization errors and a loss of detail and clarity, particularly in quiet or complex passages of music.

Overall, it is important to choose the appropriate bit depth for each recording or production, based on the dynamic range and complexity of the audio signal. By doing so, sound engineers and audiophiles can ensure that the audio they create or enjoy is of the highest quality possible.

As a final recommendation, we suggest using MP4Gain to adjust the volume and equalization of your digital audio files, ensuring that they are optimized for playback on a variety of devices and systems.


24/192 digital audio format and why it doesn’t make sense. Part 3


192 kHz is considered harmful

192 kHz digital music files offer no benefit, and they are not neutral either. In practice, playback fidelity is slightly worse, because ultrasonic content is reproduced along with the music.

Neither audio transducers (such as speakers) nor power amplifiers are free of distortion, and distortion tends to rise quickly at the lowest and highest frequencies. If the same speaker reproduces ultrasound along with audible frequencies, any non-linearity will shift part of the ultrasonic content into the audible range as uncontrolled intermodulation distortion spread across the entire audible spectrum. Non-linearity in a power amplifier has the same effect. These effects are subtle, but listening tests have confirmed that both types of distortion can be heard.

The graph above shows the distortion resulting from intermodulation of a 30 kHz tone and a 33 kHz tone in a theoretical amplifier with a constant total harmonic distortion (THD) of approximately 0.09%. Distortion products are visible across the spectrum, even at the lowest frequencies.

Inaudible ultrasonic content contributes to intermodulation distortion in the audible range (light blue area). Systems that are not designed to reproduce ultrasound often have much higher distortion above about 20 kHz, which further contributes to intermodulation. Extending the frequency range to include ultrasound requires compromises that worsen noise and distortion performance within the audible spectrum. Either way, unnecessary reproduction of ultrasonic content degrades reproduction quality.
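The mechanism can be sketched numerically. The example below mixes the same two tones, 30 kHz and 33 kHz, and passes them through a hypothetical device with a small second-order non-linearity (y = x + 0.01·x²). It then measures the 3 kHz difference tone, which lies well inside the audible band. The non-linearity, the tone amplitudes, and the function names are assumptions for illustration only. This is not the amplifier model behind the graph.

```python
import math

SAMPLE_RATE = 96_000   # Hz, as in the 24/96 test files
N = 9_600              # 0.1 s of audio, so each tone spans whole cycles


def tone_amplitude(signal, freq_hz):
    """Amplitude of a single frequency component (one-bin DFT)."""
    re = im = 0.0
    for i, s in enumerate(signal):
        phase = 2 * math.pi * freq_hz * i / SAMPLE_RATE
        re += s * math.cos(phase)
        im += s * math.sin(phase)
    return 2 * math.hypot(re, im) / len(signal)


def two_tones(i):
    """Sum of a 30 kHz and a 33 kHz sine, each at half of full scale."""
    t = i / SAMPLE_RATE
    return 0.5 * math.sin(2 * math.pi * 30_000 * t) + 0.5 * math.sin(2 * math.pi * 33_000 * t)


clean = [two_tones(i) for i in range(N)]
# Hypothetical device: mostly linear, plus a small second-order term.
distorted = [x + 0.01 * x * x for x in clean]

before = tone_amplitude(clean, 3_000)
after = tone_amplitude(distorted, 3_000)
print(f"3 kHz component before: {before:.6f}, after: {after:.6f}")
```

The clean signal has no energy at 3 kHz. After the non-linearity, a difference tone appears at 33 kHz − 30 kHz = 3 kHz with amplitude 0.01 × 0.5 × 0.5 = 0.0025, even though both input tones are inaudible.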

There are several ways to avoid this additional distortion:

1. A dedicated ultrasound-only speaker, amplifier, and crossover to separate and independently reproduce the ultrasound you can’t hear, so that it doesn’t affect the sounds you can.
2. Amplifiers and transducers designed to reproduce a wider range of frequencies, so that ultrasound does not cause audible intermodulation distortion. At equal cost and complexity, the additional frequency range comes at the expense of reproduction quality in the audible spectrum.
3. Well-designed speakers and amplifiers that do not reproduce any ultrasound.
4. Not encoding such a wide frequency range in the first place. You cannot (and will not) get ultrasonic intermodulation distortion in the audible band if there is no ultrasonic content.

All of these methods solve the same problem, but only option 4 makes sense.

If you are curious about your own system, the following samples contain 30 kHz and 33 kHz tones in a 24/96 WAV file, a longer FLAC version, some tunes, and a clip of an ordinary song shifted up by 24 kHz so that it falls entirely in the ultrasonic range of 24 kHz to 46 kHz.

Test files for measuring intermodulation distortion:

30 kHz tone + 33 kHz tone (24-bit / 96 kHz) [5-second WAV] [30-second FLAC]
Tunes, 26 kHz – 48 kHz (24-bit / 96 kHz) [10-second WAV]
Tunes, 26 kHz – 96 kHz (24-bit / 192 kHz) [10-second WAV]
Song clip shifted up by 24 kHz (24-bit / 96 kHz WAV) [10-second WAV] (original version of the clip) (16-bit / 44.1 kHz WAV)

Assuming your system can actually play back 96 kHz sample rates [6], you should hear nothing at all when playing the files above: no noise, hiss, clicks, or other sounds. If you do hear something, your system has a non-linear response and is producing audible intermodulation from the ultrasound. Be careful when turning up the volume: if you run into digital or analog clipping, even soft clipping, it can cause strong intermodulation noise.

Whether intermodulation from ultrasound is audible in any particular system is not a given. The added distortion may be negligible or quite noticeable. Either way, ultrasonic content is never a benefit, and in many audio systems it leads to a marked decrease in reproduction quality. In systems where it does no harm, the cost and effort of handling ultrasound could have been saved, or spent instead on improving sound quality in the audible range.

Misunderstanding the sampling process

Sampling theory is often hard to grasp without a signal-processing background. It’s no wonder that most people, even brilliant PhDs in other fields, don’t get it. It’s also not surprising that many people don’t even realize they have it wrong.

24/192 digital audio format and why it doesn’t make sense. Part 2


Perfect hearing or hereditary gift

From the many letters I receive, I see that many people believe in the existence of unique people with exceptional hearing. Are there really such people with “golden ears”?

It depends on what you call exceptional hearing.

The healthy ears of young people hear better than the ears of the elderly or damaged ears. Some people are exceptionally well trained to hear nuances of sound and music that most people don’t even know exist. In the 90s, I could recognize every MP3 encoder by its sound (they were all pretty bad at the time), and I could prove it in a double-blind test [2].

If a person has healthy ears and is well trained to recognize sounds, I would say that their hearing is exceptional. Even people with below-average hearing, however, can be trained to notice details that elude inexperienced listeners. Exceptional hearing is largely a matter of training, not the ability to hear beyond the hearing range of ordinary mortals.

Hearing researchers would love to find someone with exceptional hearing, able to hear outside the normal auditory range, and to test and document the results. I have nothing against ordinary people, but every scientist wants to find a genetic outlier to write a first-class paper about. We haven’t found such people in 100 years of testing, so they probably don’t exist. Sorry. But we will keep looking.

Love for the color spectrum

You may be skeptical about everything I just wrote because it goes against so much marketing. So let’s step away from sound for a moment and imagine a similar craze around color.

The figure above shows a rough scale of the sensitivity of rods and cones in the human eye, compared to the visible spectrum. These receptors respond to light in overlapping spectral bands, just as the hair cells in the ears are tuned to perceive overlapping sound frequency bands.

The human eye sees a limited range of light waves called visible radiation. Here is a direct analogy with the audibility range of sound waves. Like the ear, the eye has sensitive cells (rods and cones) that capture light in different but overlapping frequency bands.

Visible light begins at a frequency of approximately 400 THz (deep red) and extends to about 850 THz (deep violet) [3], and visual sensitivity declines with age. Outside this approximate range, light intense enough to be perceived at all can burn your retina. So this range is generous even for young, healthy, and genetically gifted individuals, just as the audible range is generous for hearing.

Suppose that in our hypothetical world, where there is a craze for expanding the spectrum of video recordings, a group of people believes these limits are not generous enough. They believe that video should cover not only the visible spectrum, but also infrared and ultraviolet radiation. Continuing the comparison, let’s assume that the most hardcore part of the group (and proud of it!) also claims that even this extended spectrum is not enough, and that video will look more natural if microwaves and X-rays are included. For those with a “golden eye”, the difference will be enormous, like night and day!

Of course, this is ridiculous.

No one can see X-rays (or infrared, or ultraviolet, or microwaves). No matter how strongly a person believes they can, the retina simply does not have the means to perceive them.

Here’s an experiment anyone can do: go and grab an Apple TV IR remote. Its LED emits at a wavelength of 980 nm, roughly equal to a frequency of 306 THz, which is in the near-infrared spectrum. Light of this wavelength is not far outside the visible range. Take the remote to the basement, or to the darkest room in your house in the middle of the night with the lights off, and let your eyes adjust to the dark.

The image above is an Apple TV infrared remote control, captured with a digital camera. Although the emitter is bright enough and the frequency of the radiation is close to the frequency of the red part of the visible spectrum, infrared radiation is completely invisible to the human eye.

Can you see the remote control’s LED light up when you press a button [4]? No? Not even a little? Try some other remotes; many of them use infrared in the 310-350 THz range.
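The wavelength and frequency figures quoted in this experiment are related by f = c / λ. The following small sketch checks them and converts the visible-light limits into wavelengths. The function names are illustrative, and the conversion uses the speed of light in a vacuum.

```python
SPEED_OF_LIGHT = 299_792_458  # metres per second, exact by definition


def wavelength_nm_to_thz(wavelength_nm: float) -> float:
    """Convert a wavelength in nanometres to a frequency in terahertz."""
    return SPEED_OF_LIGHT / (wavelength_nm * 1e-9) / 1e12


def thz_to_wavelength_nm(freq_thz: float) -> float:
    """Convert a frequency in terahertz to a wavelength in nanometres."""
    return SPEED_OF_LIGHT / (freq_thz * 1e12) * 1e9


print(f"980 nm  = {wavelength_nm_to_thz(980):.1f} THz")
print(f"400 THz = {thz_to_wavelength_nm(400):.1f} nm")
print(f"850 THz = {thz_to_wavelength_nm(850):.1f} nm")
```

The 980 nm LED works out to about 306 THz, below the roughly 400 THz (about 750 nm) lower edge of visible light.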

24/192 digital audio format and why it doesn’t make sense. Part 1



Earlier headlines reported that musician Neil Young and Apple founder Steve Jobs were discussing a possible launch of a service to download “uncompromising studio quality” music formats. Most of the newspapers, magazines and users were quite optimistic about the prospects of a digital music format with signal quantization in 24 bits, at a sampling frequency of 192 kHz.

Unfortunately, there is no point in recording music 24/192. Its fidelity does not dramatically exceed 16/44 or 16/48 formats, but it takes up 6 times more space.
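The "6 times" figure follows directly from the uncompressed data rates. Here is a minimal sketch, assuming stereo linear PCM with no compression (the function name is illustrative):

```python
def pcm_bits_per_second(bit_depth: int, sample_rate_hz: int, channels: int = 2) -> int:
    """Raw data rate of uncompressed PCM audio."""
    return bit_depth * sample_rate_hz * channels


hi_res = pcm_bits_per_second(24, 192_000)
ratio_vs_48 = hi_res / pcm_bits_per_second(16, 48_000)
ratio_vs_44 = hi_res / pcm_bits_per_second(16, 44_100)

print(f"24/192 vs 16/48:   {ratio_vs_48:.2f}x")
print(f"24/192 vs 16/44.1: {ratio_vs_44:.2f}x")
```

Compared with 16/48 the ratio is exactly 6, and compared with 16/44.1 it is about 6.5. Lossless compression such as FLAC changes the absolute file sizes, and the ratio then depends on the material.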

Today, there are several real problems with the audio quality and the “experience” of digitally distributed music. The 24/192 format does not solve any of them. As long as everyone regards this format as a panacea, we will not see any real improvement in the field of music.

Let’s start with the bad news

Over the past few weeks, I have talked to smart, scientific people who believe in the 24/192 music format and don’t understand how anyone can disagree with it. They asked good questions that are worth answering in detail.

I also wondered what could be driving such active support for high sample rate digital audio. The responses showed that few people understand the basics of signal theory or the sampling theorem (the Kotelnikov or Nyquist-Shannon theorem), which is not surprising. Misunderstandings about mathematics, technology and physiology were evident in the statements of many professionals with extensive experience in audio technology. Some have even argued that the sampling theorem does not explain how digital audio works [1].

Disinformation and prejudice only play into the hands of charlatans. Let’s go through the basics of why the 24/192 format doesn’t make sense before presenting other, more useful ideas.

Gentlemen, meet your ears!

The ear hears by means of hair cells, which sit on the resonant basilar membrane in the cochlea of the inner ear. Each hair cell is tuned to a specific narrow frequency range, determined by the cell’s position on the membrane. Sensitivity peaks in the middle of that range and falls off on both sides in an asymmetrical cone shape, overlapping the frequency ranges of neighboring cells. We do not hear a sound if there are no hair cells tuned to its frequency.

The left side of the figure shows a cross section of a human cochlea with the basilar membrane (in beige). The membrane resonates at different places along its length depending on the incoming frequency: high frequencies resonate closer to the base and low frequencies at the opposite end. The figure shows the approximate locations of various frequencies.

The right side is a schematic diagram of the response of hair cells along the basilar membrane, as a group of overlapping signals.

The process is similar to an analog radio receiver picking up a nearby station close to the frequency it is tuned to. The further the station’s frequency is from the receiver’s tuning, the weaker and more distorted the signal becomes, regardless of its strength. There are upper (and lower) limits to the frequency range beyond which hair cells cannot receive signals and we cannot hear anything.

Sample rate and audible frequency spectrum

I’m sure you’ve heard many times that the audible range of the human ear runs from 20 Hz to 20 kHz. It is important to understand how scientists obtained these numbers.

First, we measure the “hearing threshold” across the entire audio range for a group of listeners. This allows us to construct a curve that represents the quietest sound the human ear can hear at any given frequency, measured under ideal conditions in healthy ears. An anechoic environment, accurate calibration of playback equipment, and rigorous statistical analysis are the easy part of the experiment. Auditory concentration is lost very quickly, so the test must be performed while the subject is not tired. As a result, there are many breaks and pauses, and testing can take from several hours to several days, depending on the methodology.

24 bit depth?

How come people start hearing higher quality with some kind of 24 bit DAC instead of the usual 16 bit?

The answer to this question, like the answer to many other questions, lies in the workings of the human brain. You can easily realize that, in fact, music exists only in our head and consciousness already receives it in a processed form from the subconscious. The subconscious mind, in turn, has an incredible effect on how we see things (literally). And everything that passes through the senses passes through the subconscious without fail.

So a $10 wine seems tastier than a $1 wine, even when the same wine has been poured into both glasses. We fully understand that a high price does not mean high quality, but when we are not thinking about it, the brain very easily fills in the picture in whatever way it thinks best. And the subconscious mind can work with very complex structures, far more complex than the price of a product. Marketers know this very well. An old way to sell a pig in a poke, such as an expensive DAC, is to compare it with a conventional audio system while also raising the playback volume of the expensive DAC by 0.2 decibels. People do not consciously notice the difference, but the subconscious senses it. And it has long been known that people prefer louder music. This is how an expensive DAC starts to sound “better” than an ordinary one.
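To get a feel for how small a 0.2 dB boost is, the sketch below converts decibels into linear ratios using the standard definitions (20·log10 for amplitude, 10·log10 for power). The function names are illustrative.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Linear amplitude (voltage) ratio corresponding to a level change in dB."""
    return 10 ** (db / 20)


def db_to_power_ratio(db: float) -> float:
    """Linear power ratio corresponding to a level change in dB."""
    return 10 ** (db / 10)


boost_db = 0.2
print(f"Amplitude ratio: {db_to_amplitude_ratio(boost_db):.4f}")
print(f"Power ratio:     {db_to_power_ratio(boost_db):.4f}")
```

A 0.2 dB boost raises the signal amplitude by only about 2.3% (about 4.7% in power).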

The same goes for other components. People believe that the sound improved after they replaced a USB cable. Or they think that tube sound is better than transistor sound. Tube amps do sound different from transistor ones, but that doesn’t mean that one is better or worse than the other. Yet many people, on recognizing the “warm tube sound”, automatically prefer it to any other, although it can be emulated just as well with solid-state components.

In principle, I don’t care, but they shouldn’t pollute other people’s minds with these opinions. Better that they honestly say they like this type of sound and stop saying it’s better.

The most surprising thing to me is that there are even audiophiles who pathologically hate digital sound. When digital sound first appeared, everyone from audio engineers to musicians was delighted with its quality. Before its introduction, all analog media were noisy and wore out over time. It was impossible to listen to your favorite piece without the crackle and background noise typical of vinyl records of that time that had been played many times.

Digital sound on audio discs was perceived as something from another world: for the first time, music could be heard in perfect quality, without any external noise. And this recording could never deteriorate over time and could be transferred to other people via electronic means of communication without loss of quality.

But a very small percentage of people greeted this new digital sound with outright horror. Digital sound was so unfamiliar to them, accustomed as they were to analog recordings, that the melodies they knew seemed to have lost their depth and familiar atmosphere. Just as some people long ago believed that photographs took people’s souls away, early audiophiles believed that digital recording took the soul out of music.

This trend continues to this day, although rarely in such an extreme form. But its core idea remains: the soul of the music needs to be returned. It doesn’t fit in 16 bits and 44.1 kilohertz; it needs 24 bits and 192 kilohertz. Some also need ritual items like gold wires or server-sized DACs. Some people use ultra-precise clocks (oscillators) worth several thousand dollars, which would not be needed in any professional studio. Others diligently monitor the processor load during music playback, believing that it affects the quality. (In fact, all the processor needs to do is decode the music stream into an uncompressed format and deliver it to the DAC before the DAC runs out of data to convert, and all modern processors cope with this without a problem.) The list goes on and on; it would be enough for a series of articles.

Naturally, dozens, if not hundreds, of companies whose activities border on outright fraud benefit from all this. Ordinary people suffer from it too, sometimes spending several thousand dollars on a DAC that makes no difference for them instead of buying high-quality speakers for the same money.