Compression encoding method Part 2

Free Download Mp4Gain

Other divisions of compression methods

In the field of audio compression, there are two kinds of methods: lossy compression and lossless compression. The commonly seen MP3, WMA, and OGG formats use lossy compression. As the name suggests, lossy compression permanently discards part of the audio information (typically the components that are hardest to hear) to lower the bit rate, so the output file is much smaller than the original but cannot be restored exactly. The other kind is lossless compression, which is what we are talking about here. Lossless compression reduces the size of the audio file while preserving 100% of the data in the original file, so decompressing the file yields audio with the same size and bit rate as the source. Lossless compression formats include APE, FLAC, WavPack, LPAC, WMA Lossless, Apple Lossless, La, OptimFROG, and Shorten, of which APE and FLAC are the most common. [1]
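The difference between the two families can be illustrated with a small sketch. It is not how FLAC, APE, or MP3 work: it assumes a synthetic 440 Hz tone, uses the general-purpose `zlib` compressor as a stand-in for a lossless codec, and uses a crude "keep the top 8 bits" step as a stand-in for a lossy one.

```python
import math
import struct
import zlib

# Synthetic 16-bit PCM: one second of a 440 Hz tone at 8 kHz (illustrative values).
SAMPLE_RATE = 8000
samples = [round(12000 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE))
           for n in range(SAMPLE_RATE)]
pcm = struct.pack(f"<{len(samples)}h", *samples)

# Lossless stand-in: the decoder returns exactly the bytes that went in.
packed = zlib.compress(pcm, 9)
restored = zlib.decompress(packed)

# Lossy stand-in: keep only the top 8 bits of each sample, then expand again.
coarse = [s >> 8 for s in samples]   # what would be stored
approx = [c << 8 for c in coarse]    # what the decoder can rebuild

lossless_ok = restored == pcm
lossy_ok = approx == samples
max_error = max(abs(a - b) for a, b in zip(samples, approx))

print("lossless round trip exact:", lossless_ok)
print("lossy round trip exact:   ", lossy_ok, "(max sample error", max_error, ")")
```

The lossless path reproduces every byte, while the lossy path returns samples that are close to, but not identical with, the originals.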
Main classifications and typical representatives of audio compression algorithms
Generally speaking, audio compression techniques can be divided into two categories: lossless compression and lossy compression. According to the compression scheme used, they can also be divided into time-domain compression, transform compression, subband compression, and hybrid compression, in which several of these techniques are combined. The various techniques differ widely in algorithm complexity (both time and space complexity), audio quality, coding efficiency (i.e. compression ratio), and codec delay, and so do their applications.
Time domain compression technology (or waveform coding)
It processes the sample values of the audio PCM code stream directly and compresses the stream through silence detection, nonlinear quantization, and differential coding. Common features of this type of compression are low algorithm complexity, average sound quality, a small compression ratio (more than 400 kbps for CD quality), and the shortest codec delay of all the techniques. It is generally used for voice compression and other low bit rate applications where the source signal bandwidth is small. Time-domain compression mainly includes G.711, ADPCM, LPC, and CELP, as well as block compression techniques developed from them, such as NICAM and subband ADPCM (SB-ADPCM).
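Nonlinear quantization, as used in the μ-law variant of G.711, can be sketched as follows. This is only an illustration under stated assumptions: it uses the continuous μ-law curve with μ = 255 rather than the piecewise-linear segment tables of the actual standard, and the function names `mu_compress`, `mu_expand`, `quantize`, and `codec` are made up for this example.

```python
import math

MU = 255  # value used by mu-law G.711

def mu_compress(x, mu=MU):
    """Map x in [-1, 1] onto [-1, 1] along a logarithmic curve."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_expand(y, mu=MU):
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def quantize(v, bits=8):
    """Uniform quantizer on [-1, 1] (sign plus bits-1 magnitude bits)."""
    levels = 2 ** (bits - 1) - 1
    return round(v * levels) / levels

def codec(x):
    """Compress, quantize to 8 bits, then expand again."""
    return mu_expand(quantize(mu_compress(x)))

quiet = 0.001  # a very quiet sample
print("uniform 8-bit error:", abs(quantize(quiet) - quiet))
print("mu-law 8-bit error: ", abs(codec(quiet) - quiet))
```

With the same 8 bits per sample, the companded path keeps far more detail in quiet passages, which is why the approach suits speech.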
Subband compression technology
Subband coding theory was first proposed by Crochiere et al. in 1976. The basic idea is to decompose the signal into components in several subbands and then apply a different compression strategy to each subband component according to its distribution characteristics, in order to reduce the code rate. Both the usual subband compression technology and the transform compression technology described below rely on a model of human sound perception (a psychoacoustic model): the quantization step size and other parameters for the subband samples or frequency-domain samples are chosen by analyzing the spectrum of the signal, so these methods are also called perceptual compression coding. Compared with time-domain compression, these two methods are much more complicated. In return, coding efficiency and sound quality are greatly improved, while the coding delay increases correspondingly. Generally speaking, subband coding is slightly less complex than transform coding and has a relatively short coding delay.
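The idea of splitting a signal into bands and treating each band differently can be sketched with the simplest possible two-band split. This is an assumption-laden toy: it uses pairwise averages and differences (a Haar-style split) instead of the filter banks real codecs use, and the quantization steps are picked by hand rather than derived from a psychoacoustic model.

```python
def analyze(x):
    """Split an even-length signal into a low band (pairwise means)
    and a high band (pairwise half-differences)."""
    low = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return low, high

def synthesize(low, high):
    """Rebuild the signal from the two bands."""
    out = []
    for l, h in zip(low, high):
        out += [l + h, l - h]
    return out

def quantize(band, step):
    """Round every value to the nearest multiple of step."""
    return [round(v / step) * step for v in band]

signal = [100, 104, 110, 112, 108, 100, 90, 84]
low, high = analyze(signal)

# Most of the energy sits in the low band, so it gets a fine step;
# the small high-band values get a coarse step (fewer bits).
decoded = synthesize(quantize(low, 1), quantize(high, 4))
max_error = max(abs(a - b) for a, b in zip(signal, decoded))
print(low, high, decoded, max_error)
```

Without quantization the two bands reconstruct the input exactly; the loss comes only from the per-band step sizes.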

Compression encoding method

Transmission

According to the compression principle used, audio signal coding is divided into waveform coding, parametric coding, and hybrid coding, which combines several techniques.
(1) Waveform coding samples the time-domain or frequency-domain waveform of the audio signal directly at a certain rate, quantizes the amplitude samples into discrete levels, and converts them into digital codes. The signal reconstructed from the waveform data is kept as consistent as possible with the original sound waveform, preserving detailed signal changes and transition characteristics.
(2) Parametric coding first establishes a model of the signal source (such as speech or natural sounds), then extracts and encodes the model's characteristic parameters. The goal is for the reconstructed sound signal to keep the semantics of the original sound as far as possible, although its waveform may be quite different from that of the original. Commonly used characteristic parameters include formants, linear prediction coefficients, and band-splitting filter parameters. Parametric coding allows sound signals to be coded at low rates, with the bit rate compressed to 2 Kbit/s – 4.8 Kbit/s, but the sound quality is only moderate and the naturalness is particularly low, so it is suitable only for speech transmission.
(3) Hybrid coding combines waveform coding and parametric coding to overcome the weaknesses of each, striving to keep the high quality of waveform coding together with the low rate of parametric coding. At a rate of 4 – 16 Kbit/s, a high-quality synthesized sound signal can be obtained. The basis of hybrid coding is linear predictive coding (LPC); commonly used methods include multi-pulse excited linear prediction coding (MPLPC), regular pulse excited linear prediction coding (RPELPC), and code-excited linear prediction (CELP).
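Linear prediction, the basis of the parametric and hybrid coders listed above, can be sketched as a small least-squares fit. The example is a simplification under stated assumptions: it uses a predictor of order 2 on a pure sine wave, solved directly from the normal equations, whereas practical speech coders use higher orders and more elaborate solvers. The function name `lpc2` is made up for this example.

```python
import math

def lpc2(x):
    """Least-squares fit of x[n] ~ a1*x[n-1] + a2*x[n-2]."""
    r11 = r12 = r22 = p1 = p2 = 0.0
    for n in range(2, len(x)):
        r11 += x[n - 1] * x[n - 1]
        r12 += x[n - 1] * x[n - 2]
        r22 += x[n - 2] * x[n - 2]
        p1 += x[n] * x[n - 1]
        p2 += x[n] * x[n - 2]
    # Solve the 2x2 normal equations with Cramer's rule.
    det = r11 * r22 - r12 * r12
    a1 = (p1 * r22 - p2 * r12) / det
    a2 = (p2 * r11 - p1 * r12) / det
    return a1, a2

OMEGA = 0.3  # radians per sample, arbitrary test frequency
x = [math.sin(OMEGA * n) for n in range(200)]
a1, a2 = lpc2(x)

# Prediction residual: what is left after the predictor has done its work.
residual = [x[n] - a1 * x[n - 1] - a2 * x[n - 2] for n in range(2, len(x))]
print(a1, a2, max(abs(e) for e in residual))
```

For a sine wave the two coefficients describe the signal almost completely, so the residual is close to zero. Sending a few parameters instead of every sample is the source of the low bit rates quoted above.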

Audio compression, how it works Part 2

Redundant information for transmission signals

Digital audio compression coding compresses the audio data as much as possible while ensuring that the signal is not audibly distorted. It does so by removing redundant components from the sound signal. Redundant components are parts of the audio that cannot be perceived by the human ear and do not help determine the timbre, pitch, or other properties of the sound. They include audio signals outside the range of human hearing and audio signals that are masked. For example, the human ear can perceive sound between 20 Hz and 20 kHz, so frequencies outside this range can be treated as redundant. In addition, according to the physiological and psychoacoustic properties of the human ear, when a strong signal and a weak signal are present together, the weak signal is masked by the strong one and cannot be heard, so the weak signal can be regarded as redundant and need not be transmitted. This is the masking effect of human hearing, which appears mainly as spectral masking and time-domain masking, described below:
Spectral masking effects.
When the sound energy at a given frequency falls below a certain threshold, the human ear cannot hear it; this threshold is called the minimum audible threshold. When another sound with higher energy is present, the threshold at frequencies close to that sound rises considerably, which is known as the masking effect.
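The minimum audible threshold can be approximated numerically. The sketch below assumes one widely quoted closed-form approximation of the threshold in quiet (usually attributed to Terhardt), which is not taken from this article. It models only the threshold in quiet, not the rise caused by a nearby masking sound. The function names are made up for this example.

```python
import math

def threshold_in_quiet_db(freq_hz):
    """Approximate minimum audible level in dB SPL at freq_hz."""
    f = freq_hz / 1000.0  # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

def audible(freq_hz, level_db_spl):
    """True if a tone at this level lies above the threshold in quiet."""
    return level_db_spl > threshold_in_quiet_db(freq_hz)

for freq in (100, 1000, 3300, 15000):
    print(freq, "Hz:", round(threshold_in_quiet_db(freq), 1), "dB SPL")

print(audible(100, 10), audible(3300, 10))
```

Under this approximation a 10 dB SPL tone is inaudible at 100 Hz but audible near 3.3 kHz, where the ear is most sensitive, so an encoder could drop the former as redundant.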

Masking effects in the time domain.
Masking also occurs in the time domain: when a strong signal and a weak signal occur very close together in time, the weak one can be masked. Time-domain masking is divided into three parts: pre-masking, simultaneous masking, and post-masking. Pre-masking means that a weak signal already present shortly before a strong signal is heard will be masked and cannot be heard. Simultaneous masking means that when a strong signal and a weak signal exist at the same time, the weak signal is masked by the strong signal and cannot be heard. Post-masking means that after the strong signal disappears, it takes a comparatively long time before the weak signal can be heard again. These masked weak signals can be considered redundant signals.

Audio compression, how it works

audio compression

Audio compression technology refers to applying suitable digital signal processing to the original digital audio stream (PCM encoding) to reduce (compress) its code rate, either without losing any useful information or with only an insignificant loss; it is also called compression encoding. It must have a corresponding inverse transform, called decompression or decoding. A codec system may introduce considerable noise and some distortion into the audio signal that passes through it. The advantages of digital signals are obvious, but they also have disadvantages, namely increased storage requirements and increased channel capacity requirements during transmission. Taking a CD as an example, the sampling frequency is 44.1 kHz and the quantization precision is 16 bits, so 1 minute of stereo audio occupies about 10 MB of storage, which means a CD holds only about 1 hour of audio. The problem is even more pronounced for digital video, which needs much higher bandwidth. Are all these bits necessary? Studies have found that storing and transmitting the PCM code stream directly involves a large amount of redundancy. In fact, sound can be compressed by at least 4:1 without audible loss, that is, only 25% of the data is needed to retain all the perceptually relevant information, and the compression ratio in the video field can even reach several hundred to one. Therefore, in order to make the most of limited resources, compression technology has received much attention since its inception. The research and application of audio compression technology has a long history. For example, A-law and μ-law coding are simple, nearly instantaneous compression techniques that have been applied in ISDN voice transmission.
Research on speech signal coding began earlier, has matured, and has been widely applied in technologies such as adaptive differential PCM (ADPCM) and linear predictive coding (LPC).
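The CD storage figures quoted above can be checked with a few lines of arithmetic. The constants come from the text; the 4:1 ratio is the one the text cites, applied here only as an illustration.

```python
SAMPLE_RATE_HZ = 44_100   # CD sampling frequency
BITS_PER_SAMPLE = 16      # quantization precision
CHANNELS = 2              # stereo

bits_per_second = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS
bytes_per_minute = bits_per_second // 8 * 60
bytes_per_hour = bytes_per_minute * 60
compressed_bytes_per_minute = bytes_per_minute // 4  # 4:1 ratio from the text

print("bit rate:       ", bits_per_second, "bit/s")
print("one minute:     ", bytes_per_minute, "bytes")
print("one hour:       ", bytes_per_hour, "bytes")
print("one minute @4:1:", compressed_bytes_per_minute, "bytes")
```

One minute of uncompressed stereo audio comes to roughly 10.6 million bytes, which matches the figure of about 10 MB per minute.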