bit depth and sample rate for cd Archives

Bit Depth and Sample Rate PART 2

Dither processing

Bit Depth and Sample Rate

We now know that digital signal processing inevitably introduces rounding and truncation errors, so the result of a long processing chain accumulates a good deal of error. These errors not only make the original audio unrecoverable, but also introduce an unnatural sound.

To mask these artifacts, we add calculated low-amplitude noise to the signal, a process called dithering. The amplitude of the dither noise is very low, and although some of it can still be heard, the result is better than adding none.

Note that dither noise accumulates. When you add noise to a signal, the signal-to-noise ratio decreases. If the operation is repeated, the ratio keeps decreasing, adding uncertainty to the signal. This is why dithering is usually applied only once, as the last step in mastering.
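As a rough illustration of why the added noise helps, here is a minimal sketch that quantizes a sine wave too quiet to cross a rounding threshold, with and without dither. The triangular (TPDF) noise shape, the 0.4 LSB amplitude, and all function names are assumptions chosen for the example, not details taken from this article.

```python
import math
import random

def quantize(value_lsb):
    """Round a value expressed in LSB units to the nearest integer level."""
    return math.floor(value_lsb + 0.5)

def tpdf_noise(rng):
    """Triangular-PDF dither spanning +/-1 LSB: the sum of two uniform values."""
    return rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)

def quantize_sine(num_samples, amplitude_lsb, period, dither, seed=0):
    """Quantize a sine wave, optionally adding dither before rounding."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    out = []
    for i in range(num_samples):
        value = amplitude_lsb * math.sin(2 * math.pi * i / period)
        if dither:
            value += tpdf_noise(rng)
        out.append(quantize(value))
    return out

def correlation_with_sine(samples, period):
    """Sum of sample * sine: large when the sine is still present in the data."""
    return sum(s * math.sin(2 * math.pi * i / period) for i, s in enumerate(samples))

# A sine whose peak is only 0.4 LSB (assumed value for the demonstration).
plain = quantize_sine(10000, 0.4, 100, dither=False)
dithered = quantize_sine(10000, 0.4, 100, dither=True)

print(set(plain))                            # {0}: every sample rounds to zero
print(correlation_with_sine(plain, 100))     # 0.0: the sine is gone
print(correlation_with_sine(dithered, 100))  # clearly positive: the sine survives under the noise
```

Without dither the quiet sine disappears entirely. With dither it is preserved on average, at the cost of a noise floor, which is the trade-off described above.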

Dithering has quite an interesting history:

The first dither processing appeared during World War II. Bombers used mechanical computers for navigation and ballistic calculations. Interestingly, these computers performed more accurately in the air. Engineers realized that vibration from the aircraft reduced the error from sticky moving parts: their movement became more continuous instead of jerky. Small vibrating motors were built into the computers, and their vibration was called dither, from the Middle English word “didderen,” meaning “to tremble.” Modern dictionaries define dither as a highly nervous, confused, or agitated state. In small amounts, dither makes a digital system behave a little more like an analog one.

– Ken Pohlmann, Principles of Digital Audio

Sampling rate
In theory, a sampling rate of 44.1 kHz (44,100 samples per second) is sufficient to cover the hearing range of the human ear. You may already have come across the Nyquist theorem, which states how to avoid aliasing (a type of distortion) and reconstruct every frequency in a sampled signal: the sampling rate must be at least twice the highest frequency in the signal. (The theorem also applies to non-audio media, which we won’t go into here.)

The human ear has a hearing range of up to 20 kHz (some studies put the practical figure closer to 17 kHz), so in theory a sample rate of 40 kHz is enough to capture every audible frequency. The industry standard of 44.1 kHz was set by Sony, one of the dominant companies at the time, for a few reasons.

In a nutshell, the sample rate must be somewhat higher than twice the highest audio frequency because, in practice, the signal is low-pass filtered during conversion to prevent aliasing, and no real filter cuts off instantly. The gentler the slope of the low-pass filter, the lower the manufacturing cost, so a typical filter is given a transition band about 2 kHz wide. For example, to keep the full spectrum below 20 kHz, the sample rate should be about 44 kHz: (20 kHz [highest frequency] + 2 kHz [low-pass filter transition band]) x 2 [Nyquist theorem] = 44 kHz.
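The arithmetic in the paragraph above can be written as a one-line helper. This is only a sketch; the function name and parameter names are made up for the example.

```python
def minimum_sample_rate(highest_frequency_hz, filter_band_hz):
    """Sample rate that leaves room for the low-pass filter slope below Nyquist."""
    return 2 * (highest_frequency_hz + filter_band_hz)

# 20 kHz of audio plus a 2 kHz filter slope, doubled per the Nyquist theorem.
print(minimum_sample_rate(20_000, 2_000))  # 44000
```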

Ultimately, the 44.1 kHz standard was settled between Sony and Philips (both had similar end goals). The figure also follows from the arithmetic of fitting audio samples into the video tape formats of the time, so that digital audio could be recorded on video tape equipment, which was more cost-effective. However, 48 kHz is the standard for audio that accompanies video, while CD audio remains at 44.1 kHz.
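The video tape arithmetic is not spelled out in this article. The figures below (3 audio samples per video line, 245 usable lines per field at 60 fields per second, or 294 usable lines per field at 50 fields per second) come from the commonly cited account of early video-based digital audio recorders and should be read as assumptions here.

```python
def samples_per_second(samples_per_line, usable_lines_per_field, fields_per_second):
    """Audio samples stored per second when samples are packed into video lines."""
    return samples_per_line * usable_lines_per_field * fields_per_second

print(samples_per_second(3, 245, 60))  # 44100 (60 fields per second)
print(samples_per_second(3, 294, 50))  # 44100 (50 fields per second)
```

Both video systems arrive at the same rate, which is one reason the number is usually explained this way.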

Bit depth and sample rate

The first thing to understand is that bit depth and sample rate only exist in digital audio.

In digital audio, bit depth describes amplitude (vertical axis) and sample rate describes time (horizontal axis). So increasing the number of bits increases the resolution of the sound’s amplitude, and increasing the number of samples per second increases its resolution in time, which raises the highest frequency that can be represented.

In an analog system (the natural world), audio is continuous and smooth. In a digital system, the smooth analog waveform can only be approximated by samples limited to a fixed set of amplitude levels. When sampling a sound, the audio is divided into small segments (samples), each fixed at one amplitude level. The process of mapping the signal to the nearest available amplitude level is called quantization, and the process of creating the sample segments is called sampling.

In the graph below, a natural sine wave is displayed from 0 s to 1 s. The blue bars represent the digital approximation of the sine wave: each bar is a sample, snapped to the nearest available amplitude level. (Of course, the graph is far coarser than a real recording.)

Depending on the choice made during recording, 1 s of audio can contain 44,100 or 48,000 samples, and at 24 bits each sample can take an amplitude level between -144 dB and 0 dB (-96 dB to 0 dB at 16 bits). The amplitude resolution (the number of amplitude levels available to a sample, i.e. the number of steps available to the bars in the graph) is 65,536 at 16 bits and 16,777,216 at 24 bits.
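The level counts and decibel figures above follow from the bit depth alone. A small sketch, using the standard relation of 20 * log10(number of levels) for the dynamic range (the function names are made up for the example):

```python
import math

def amplitude_levels(bits):
    """Number of distinct amplitude levels available at a given bit depth."""
    return 2 ** bits

def dynamic_range_db(bits):
    """Ratio between the largest and smallest representable step, in decibels."""
    return 20 * math.log10(amplitude_levels(bits))

for bits in (16, 24):
    print(bits, amplitude_levels(bits), round(dynamic_range_db(bits), 1))
# 16 65536 96.3
# 24 16777216 144.5
```

Each extra bit adds roughly 6 dB, which is where the rounded -96 dB and -144 dB figures come from.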

Therefore, increasing the bit depth greatly improves amplitude resolution and dynamic range. So where does the extra dynamic range appear? Since the amplitude cannot exceed 0 dB, the added range extends downward to quieter samples. So one can hear quieter sounds (such as a reverb tail decaying to -130 dB) that would be cut off at -96 dB at 16 bits.

Rounding and truncation

In digital audio, each sample is analyzed, processed, converted back to analog, and then played through speakers. When samples are processed in your DAW (gain, distortion, etc.), they go through basic multiply and divide operations that change their digital representation. Put simply, without rounding, the result quickly outgrows the available word length: a 1 dB gain means multiplying by 1.122018454, so even a sample with only 4 or 8 bits of precision produces a product that will not fit in 24 bits.

Since we only have 24 bits, these long numbers need to be fitted into that space. To do this, the DSP rounds or truncates at the least significant bit (LSB, the last binary digit of the sample, for example the 16th bit of a 16-bit sample). Rounding is fairly straightforward and uses the algorithm you are already familiar with. Truncation simply discards everything after the least significant bit without examining it.

Both processes introduce errors, which accumulate through the signal chain and eventually become audible. On the plus side, the LSB is the bit with the smallest amplitude, so the error occurs at around -96 dB for 16-bit samples and -144 dB for 24-bit samples. The different architectures and methods of digital signal processors also lead to different results.
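To make the difference between rounding and discarding (truncation) concrete, here is a minimal sketch that applies the 1 dB gain mentioned above to one integer sample. The sample value 1005 is arbitrary, and modelling truncation as rounding down is an assumption about how a fixed-point processor drops the extra bits.

```python
import math

GAIN_1DB = 10 ** (1 / 20)  # linear factor for +1 dB, about 1.122018454

def gain_rounded(sample, gain=GAIN_1DB):
    """Apply gain, then round to the nearest integer level."""
    return math.floor(sample * gain + 0.5)

def gain_truncated(sample, gain=GAIN_1DB):
    """Apply gain, then drop the fractional part (assumed to round down)."""
    return math.floor(sample * gain)

sample = 1005                  # arbitrary example value
exact = sample * GAIN_1DB      # about 1127.63, which is not an integer level
print(gain_rounded(sample))    # 1128, about 0.37 LSB away from the exact product
print(gain_truncated(sample))  # 1127, about 0.63 LSB away from the exact product
```

Rounding keeps the error within half an LSB, while truncation can be off by almost a full LSB and always in the same direction.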

Bit Depth explanation

Definition

In digital audio, the bit depth is the number of bits of information in each sample, and it is closely linked to the resolution of the audio. Unlike an analog signal, which is continuous and composed of infinitely many points, digital audio is a discrete signal composed of a finite number of points. Binary numbers (bits) determine the number of states available to represent the amplitude of each audio sample, and thus to represent the signal. “The quality of the representation increases, in general, when this number of states is increased. For example, […] high-fidelity music recording is obtained on a CD with 65,536 amplitude levels. The number of possible states of a binary system of n digits (n bits) is E = 2^n.” [1] In summary, bit depth is the amplitude resolution of a digitized signal, and it determines the dynamic range of that signal. In the following image we can see how a signal is represented at a depth of 4 bits: 4 bits generate 16 possible values on the vertical axis.

Aspects to consider

The accuracy of each sample is determined by its bit depth: the higher the bit depth, the higher the resolution of the digitized signal. In addition, the greater the bit depth, the greater the dynamic range of the signal, because there are more points available to represent the amplitude of each audio sample. It follows that a low bit depth distorts the shape of the wave and fails to represent the original well, because there are fewer possible points to represent it. For example, in the following graph we can see a sinusoid represented at different bit depths. A depth of 1 bit generates something close to a square wave (depending on the quantization), because there are only two possible points on the vertical axis.
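The effect of bit depth on a sinusoid can also be checked numerically. The sketch below quantizes one cycle of a sine at several bit depths and reports how many distinct levels are used and the worst error. The quantizer design (equal steps with outputs at step centres) and the function names are assumptions for illustration, since real converters differ in detail.

```python
import math

def quantize(x, bits):
    """Map x in [-1.0, 1.0] to the centre of one of 2**bits equal steps."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = min(int((x + 1.0) / step), levels - 1)  # clamp x == 1.0 into the top step
    return (index + 0.5) * step - 1.0

# One cycle of a sine wave, 1000 samples long.
sine = [math.sin(2 * math.pi * i / 1000) for i in range(1000)]

def summarize(bits):
    """Return (distinct levels used, worst absolute error) for one bit depth."""
    quantized = [quantize(x, bits) for x in sine]
    worst_error = max(abs(q - x) for q, x in zip(quantized, sine))
    return len(set(quantized)), worst_error

for bits in (1, 4, 16):
    print(bits, summarize(bits))
```

At 1 bit only two output values remain, so the sine becomes a square wave. At 4 bits all 16 levels are used, and the worst error shrinks by half with every added bit.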

Requirements

A very important aspect to keep in mind is that a greater bit depth requires more resources to process the audio and more memory to store it, because there is more information. The size of an audio file is given by the following calculation:

Size in bits = bit depth * sample rate * duration in seconds [* 2 if the signal is stereo]

Then, for example, the size of one second of audio on a CD, which uses a depth of 16 bits and a sampling frequency of 44,100 Hz, is given by:

1 second = 16 * 44100 * 2 (since it is stereo)

1 second = 1,411,200 bits = 176,400 bytes (about 0.1764 MB)
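The same calculation as a short function, in case you want to try other bit depths, sample rates or durations. The function name and the default of two channels are choices made for this example.

```python
def pcm_size_bits(bit_depth, sample_rate_hz, seconds, channels=2):
    """Size of uncompressed PCM audio in bits (stereo by default)."""
    return bit_depth * sample_rate_hz * seconds * channels

one_second_cd = pcm_size_bits(16, 44_100, 1)
print(one_second_cd)       # 1411200 bits
print(one_second_cd // 8)  # 176400 bytes

# One minute at 24 bits, 96 kHz, for comparison.
print(pcm_size_bits(24, 96_000, 60) // 8)  # 34560000 bytes
```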

Sample Rate and Bit Depth

In the specifications of audio software and hardware we are often told about processing at up to 96 kHz and 64-bit operation, but what do these figures really mean? And how do they affect the quality of our sound?

Sample Rate and Frequency Range

The sampling rate is the frequency with which the A/D (analog-to-digital) converter measures the level of a signal; the samples are broadly analogous to a series of snapshots. If the converter takes ten samples of the signal every second, it has a sampling rate of 10 Hz.
The frequency range that an A/D converter (on a sound card, for example) can capture is determined by the sampling rate. However, there is a strict law here that may seem unintuitive: the maximum frequency that can be captured is only half of the sampling rate. A sampling rate of 10 Hz can capture a maximum frequency of 5 Hz, not 10 Hz. The reason is that, with fewer than two samples per cycle of the sound, some of the oscillations of the signal are lost.
But what happens if the captured analog signal contains frequencies higher than the sampling rate can handle? Aliasing occurs: a phenomenon that arises when the highest frequency in the sampled signal is above the highest frequency the A/D converter can capture accurately. Aliasing adds artificial distortion to the audio signal by folding the higher partials down to lower frequencies. Aliasing can occur in a digital audio system as a result of a poorly designed A/D converter, but you are much more likely to hear it when you play high notes on a software synthesizer. If the synthesizer does not use antialiasing, high notes can turn into seemingly random groups of tones that bear no relation to the key you are playing.
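Where an aliased tone lands can be predicted with a little arithmetic. The sketch below assumes an ideal sampler with no anti-aliasing filter in front of it, and the function name is made up for the example.

```python
def alias_frequency(signal_hz, sample_rate_hz):
    """Frequency at which a pure tone appears after sampling without a filter."""
    remainder = signal_hz % sample_rate_hz
    # Anything above half the sample rate folds back down below it.
    return min(remainder, sample_rate_hz - remainder)

print(alias_frequency(4, 10))           # 4: below the 5 Hz limit, captured as is
print(alias_frequency(7, 10))           # 3: a 7 Hz tone shows up as 3 Hz
print(alias_frequency(30_000, 44_100))  # 14100: an inaudible tone lands in the audible range
```

This folding is why a high synthesizer note can come out as an unrelated lower tone.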

Researchers at Bell Laboratories have been familiar with this problem since the 1920s, and the principle was formalized as the Nyquist-Shannon sampling theorem. The theorem is simple: to sample a frequency x correctly, you need a sampling frequency of at least twice x. (The maximum frequency that can be sampled without aliasing at a given sampling rate is the so-called Nyquist frequency.) So why does the sampling rate need to be twice the highest frequency to be recorded? Because each period of a waveform includes an upward and a downward swing. If the A/D converter takes fewer than two samples per period, it cannot capture the entire oscillation. To capture each “up” and “down” state, you need at least two samples per period. Thus, the sampling rate has to be twice the highest frequency to be recorded.

According to the Nyquist-Shannon theorem, to sample frequencies at the upper limit of the human ear (around 22,000 Hz), you need a sampling frequency of around 44,000 Hz, which is, not by chance, close to the standard sampling rate for commercial audio CDs, 44,100 Hz.

This allows you to sample frequencies up to the top of our hearing range, but what happens when the signal reaching the A/D converter contains frequencies above the 22 kHz limit? They fold back into the audible spectrum as distortion, so A/D converters incorporate an anti-aliasing filter that removes these high partials before the audio is converted to digital form.

WHY SEND MY WAV FILES AT 16 BITS, 44,100 HZ?

Many will ask what we mean by the technical term 44,100 Hz at 16 bits. It refers to the coding standard with which the compact disc was marketed in the 1980s.

Compact disc quality has a bit depth of 16 bits and a sampling rate of 44.1 kHz, which means it is the standard quality at which your music will be played from the physical format. But what are bit depth and sampling rate? And why not use a higher-quality coding such as 24-bit at 96 kHz?

Bit depth:

In digital audio using pulse code modulation (PCM), bit depth is the number of bits of information in each sample and corresponds directly to the resolution of each sample. Examples: the compact disc uses 16 bits per sample, while DVD-Audio and Blu-ray support 24 bits per sample. Bit depth applies only to PCM audio, such as uncompressed or lossless files, and not to lossy compressed files such as MP3 or WMA. With 16-bit audio, there are 65,536 possible levels. With each additional bit of resolution, the number of levels doubles. By the time we reach 24 bits, we have 16,777,216 levels. Remember that we are talking about a single audio sample, frozen at one instant of time.

Sample rate:

In pulse code modulation (PCM), the procedure used to transform an analog signal into a bit sequence, the sample rate is the number of samples taken per second. The unit of measure commonly used is hertz (Hz).

When it is necessary to capture the entire range of human hearing (20-20,000 Hz), as in studio music recording or various kinds of acoustic events, audio is usually recorded at 44,100 Hz, 48,000 Hz, 88,200 Hz or 96,000 Hz. Sampling frequencies above 50,000 Hz or 60,000 Hz do not provide additional useful information to human ears; the audible difference is small, although sampling at 96,000 Hz can help reduce distortion.

Why send my WAV files at 16 bits, 44,100Hz?

To hear the difference between your music at 16 bits, 44,100 Hz and at 24 bits, 96,000 Hz, you need a decent professional audio system or professional headphones and a well-trained ear, and that is before counting the ambient noise around you. If you do compare both formats, the difference is imperceptible on low-end headphones, the speakers of a Coppel stereo, or the speakers of your Macintosh.

The mixing and production work done by the audio engineer when capturing the instruments in their raw state also has a great influence. It largely determines whether your WAV files sound good in their final mix, whether at 44.1 kHz, 16 bits or at 96 kHz, 24 bits.

The Audio Engineering Society recommends 48,000 Hz for most applications, but it recognizes 44,100 Hz for the compact disc and its related applications. In any case, for general consumption we recommend encoding at 44,100 Hz, 16 bits, both for putting your music on compact disc and for digital distribution, although Spotify, iTunes and similar services then re-encode your music in lossy formats at much lower bitrates, with a corresponding loss of quality.

WAV is a lossless digital audio format: WAV files are raw audio files, which you can request from your audio engineer at no cost when your tracks have been mixed.