Related Audio Attribute Part 2

Free Download Mp4Gain

The higher the sampling rate, the more realistic and natural the reproduced sound will be.

The range of frequencies people can hear is roughly 20 Hz to 20,000 Hz. A signal must be sampled at more than twice its highest frequency to be reconstructed, so a rate a little above 40,000 samples per second is enough to satisfy the human ear during playback. A sample rate of 22,050 Hz is commonly used for lower-quality audio, 44,100 Hz is CD quality, and sampling above 48,000 Hz brings little audible benefit. This is similar to the way 24 frames per second is enough for a movie to appear as continuous motion.
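As a rough illustration of the relationship between sample rate and the highest frequency it can represent, here is a minimal sketch. A sampled signal can only represent frequencies below half its sample rate (the Nyquist limit). The function names and the 20,000 Hz default are assumptions made for this example.

```python
def nyquist_frequency(sample_rate_hz: float) -> float:
    """Return the upper frequency limit for a given sample rate (half the rate)."""
    return sample_rate_hz / 2


def covers_hearing_range(sample_rate_hz: float, max_audible_hz: float = 20_000) -> bool:
    """True if the Nyquist limit lies above the highest audible frequency.

    The 20,000 Hz default is the approximate upper limit of human hearing.
    """
    return nyquist_frequency(sample_rate_hz) > max_audible_hz


for rate in (22_050, 44_100, 48_000):
    print(rate, nyquist_frequency(rate), covers_hearing_range(rate))
```

Under these assumptions, 22,050 Hz falls short of the full hearing range, while 44,100 Hz and 48,000 Hz both cover it.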

Sampling bits
After the audio is sampled, two steps must be performed on each sample:

1. Quantization. The bit depths commonly used for audio quantization are:

8 bits (that is, 1 byte) can represent only 256 values, so the amplitude can be divided into only 256 levels;

16 bits (that is, 2 bytes) can represent 65,536 values, which is the CD standard;

32 bits (that is, 4 bytes) can divide the amplitude into 4,294,967,296 levels, which is far more than playback needs.

The number of quantization bits is also called the number of sampling bits, bit depth, or resolution. It refers to how many levels the continuous intensity of the sound is divided into once it is represented digitally. N bits means that the intensity range is divided equally into 2^N levels; with 16 bits, that is 65,536 levels. This is a very large number, and people may not be able to hear a difference in sound intensity of 1/65,536. It can also be described as the resolution of the sound card: the higher the value, the finer the resolution and the more accurately sound can be reproduced. Bit depth addresses the amplitude (strength) characteristics of the signal, while the sampling rate addresses its time (frequency) characteristics; they are two different concepts.
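The level counts above, and the quantization step itself, can be sketched as follows. This is only an illustration: the function names are hypothetical, and the choice to scale a sample in the range -1.0 to 1.0 by the largest positive code is one common convention assumed here, not the only one.

```python
def quantization_levels(bits: int) -> int:
    """Number of distinct amplitude levels available at a given bit depth."""
    return 2 ** bits


def quantize(sample: float, bits: int) -> int:
    """Map a sample in [-1.0, 1.0] to a signed integer code of `bits` bits.

    Assumption: scale by the largest positive code (for example 32767 at
    16 bits) and clip anything outside the valid range.
    """
    max_code = 2 ** (bits - 1) - 1
    min_code = -(2 ** (bits - 1))
    clipped = max(-1.0, min(1.0, sample))
    code = round(clipped * max_code)
    return max(min_code, min(max_code, code))


for bits in (8, 16, 32):
    print(bits, quantization_levels(bits))
```

The same input sample lands on a coarser grid at 8 bits than at 16 bits, which is the difference in resolution described above.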

2. Binary encoding. The result of quantization, that is, a single-channel sample, is stored as a binary codeword. There are two storage methods:

Store the result of quantization directly as an integer, that is, in two's complement code;

Store the result of quantization as a floating-point number, that is, in floating-point encoding.

Most PCM sample formats store samples as integers; some applications that require high precision represent PCM samples as floating-point numbers instead.
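The two storage methods can be compared with Python's standard `struct` module. This is a minimal sketch, assuming little-endian byte order, a 16-bit integer format, and a 32-bit float format; the helper names are hypothetical.

```python
import struct


def encode_int16(sample: float) -> bytes:
    """Encode a sample in [-1.0, 1.0] as a 16-bit two's complement integer.

    Assumption: little-endian byte order and scaling by 32767.
    """
    clipped = max(-1.0, min(1.0, sample))
    return struct.pack("<h", round(clipped * 32767))


def encode_float32(sample: float) -> bytes:
    """Encode a sample as a 32-bit IEEE 754 float, little-endian."""
    return struct.pack("<f", sample)


print(encode_int16(-1.0).hex())   # 2 bytes per sample
print(encode_float32(0.5).hex())  # 4 bytes per sample
```

The integer form takes 2 bytes per sample here and the floating-point form takes 4.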

Frame
After the audio is quantized to binary codewords, a codec that compresses it applies a transform (MDCT) in block units, where a block is made up of multiple (120 or 128) samples. A frame contains one or more blocks. Common frame sizes are 960, 1024, 2048, and 4096 samples. A frame records a unit of sound whose duration is the number of samples per channel multiplied by the duration of one sample, that is, the frame size divided by the sample rate. The nb_samples field of the AVFrame structure in FFmpeg gives the number of audio samples per channel in a frame.
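The duration of a frame follows from the number of samples per channel and the sample rate, not from the number of channels. The sketch below works this out. The function names are hypothetical, the parameter `nb_samples` only mirrors the FFmpeg field name, and no FFmpeg API is called. The byte count assumes uncompressed PCM.

```python
def frame_duration_seconds(nb_samples: int, sample_rate_hz: int) -> float:
    """Duration of one frame: samples per channel divided by the sample rate."""
    return nb_samples / sample_rate_hz


def frame_size_bytes(nb_samples: int, channels: int, bytes_per_sample: int) -> int:
    """Size of one uncompressed PCM frame across all channels."""
    return nb_samples * channels * bytes_per_sample


# A 1024-sample frame at 44,100 Hz lasts about 23.2 ms.
print(frame_duration_seconds(1024, 44_100) * 1000)
# The same frame in 16-bit stereo occupies 1024 * 2 * 2 bytes.
print(frame_size_bytes(1024, channels=2, bytes_per_sample=2))
```

Adding channels increases the size of the frame in bytes but leaves its duration unchanged.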

Author: R. Arias

R. Arias is the author of this article and has more than 30 years of experience as a recording engineer and audio specialist, as well as more than 20 years of experience creating algorithms related to audio and video.