audio compression on audacity Archives - Page 2 of 2

Compression encoding method Part 2

Free Download Mp4Gain

Other divisions of compression methods

In the field of audio compression, there are two compression methods: lossy compression and lossless compression. The commonly seen MP3, WMA, and OGG formats use lossy compression. As the name suggests, lossy compression discards part of the audio information (typically the components that are hardest to hear) to lower the bit rate, so the output file is smaller than the original but cannot be restored exactly. The other kind of audio compression is lossless compression, which is what we're talking about here. Lossless compression reduces the size of the audio file while preserving 100% of the data in the original file; after the compressed file is decoded, the result is identical to the source, with the same size and the same bit rate. Lossless compression formats include APE, FLAC, WavPack, LPAC, WMA Lossless, Apple Lossless, La, OptimFROG, and Shorten, of which APE and FLAC are the most common. [1]
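The defining property of lossless compression, that decoding returns exactly the original data, can be checked in a few lines. The sketch below is only an illustration: it uses Python's general-purpose zlib module as a stand-in for a real lossless audio codec such as FLAC, and the tone, sample rate, and amplitude are made-up values.

```python
import math
import struct
import zlib

# Synthetic input: one second of a 440 Hz tone, 16-bit mono PCM at 8 kHz.
# These values are illustrative; zlib stands in for a real audio codec.
SAMPLE_RATE = 8000
samples = [int(12000 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE))
           for n in range(SAMPLE_RATE)]
pcm = struct.pack("<%dh" % len(samples), *samples)

compressed = zlib.compress(pcm, 9)
restored = zlib.decompress(compressed)

print("original bytes:   ", len(pcm))
print("compressed bytes: ", len(compressed))
print("bit-exact restore:", restored == pcm)
```

A dedicated audio codec would normally do better than a general-purpose compressor on real music, but the round-trip guarantee is the same.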
Main classifications and typical representatives of audio compression algorithms
Generally speaking, audio compression techniques can be divided into two categories: lossless compression and lossy compression. According to the compression scheme used, they can also be divided into time-domain compression, transform compression, subband compression, and hybrid compression, in which several techniques are combined. The various techniques differ widely in algorithm complexity (both time complexity and space complexity), audio quality, algorithm efficiency (i.e. compression ratio), and codec delay. Their applications differ as well.
Time domain compression technology (or waveform coding)
It directly processes the sample values of the audio PCM code stream and compresses the stream through silence detection, nonlinear quantization, and differential coding. Common features of this type of technology are low algorithm complexity, average sound quality, a small compression ratio (CD quality requires more than 400 kbps), and the shortest codec delay of all the techniques described here. It is generally used for voice compression and other low-bit-rate applications in which the source signal bandwidth is small. Time-domain compression mainly includes G.711, ADPCM, LPC, and CELP, as well as block companding techniques built on them, such as NICAM and subband ADPCM (SB-ADPCM).
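Nonlinear quantization, one of the time-domain tools listed above, can be illustrated with μ-law companding, the idea behind one variant of G.711. This is a minimal sketch that uses the continuous μ-law formula and plain rounding; the real G.711 standard uses a segmented, bit-level approximation of this curve, and the function names here are hypothetical.

```python
import math

MU = 255  # compression parameter of the mu-law curve

def mu_law_encode(x, bits=8):
    """Compress x in [-1, 1] with the mu-law curve, then quantize uniformly."""
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    levels = 2 ** (bits - 1) - 1  # 127 for 8 bits
    return round(y * levels)

def mu_law_decode(code, bits=8):
    """Invert the quantization and expand back to a value in [-1, 1]."""
    levels = 2 ** (bits - 1) - 1
    y = code / levels
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)

for x in (0.001, 0.01, 0.1, 0.5, 1.0):
    code = mu_law_encode(x)
    print(f"{x:6.3f} -> code {code:4d} -> {mu_law_decode(code):.4f}")
```

Because the curve is logarithmic, quiet samples get finer quantization steps than loud ones, which suits speech far better than spending the same 8 bits uniformly.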
Subband compression technology
Subband coding theory was first proposed by Crochiere et al. in 1976. The basic idea is to decompose the signal into components in several subbands and then apply a different compression strategy to each subband component according to its characteristics, in order to reduce the bit rate. Both the usual subband compression techniques and the transform compression techniques described below are based on a model of human sound perception (a psychoacoustic model): the quantization step size and other parameters for the subband samples or frequency-domain samples are chosen by analyzing the spectrum of the signal, which is why these methods are also called perceptual coding. Compared with time-domain compression, these two methods are much more complicated; coding efficiency and sound quality improve greatly, and the coding delay increases correspondingly. Generally speaking, subband coding is slightly less complex than transform coding, and its coding delay is relatively short.
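The subband idea can be shown with a toy two-band split. The sketch below is an assumption-laden illustration, not a real codec: it uses the simplest possible filter pair (Haar averaging and differencing) and hand-picked quantization step sizes, whereas a perceptual coder would use many bands and derive the step sizes from a psychoacoustic model.

```python
def analyze(samples):
    """Split an even-length signal into a low band and a high band."""
    pairs = list(zip(samples[0::2], samples[1::2]))
    low = [(a + b) / 2 for a, b in pairs]
    high = [(a - b) / 2 for a, b in pairs]
    return low, high

def synthesize(low, high):
    """Rebuild the signal from its two subbands."""
    out = []
    for l, h in zip(low, high):
        out += [l + h, l - h]
    return out

def quantize(band, step):
    """Uniform quantizer: a larger step means fewer distinct values to code."""
    return [round(v / step) * step for v in band]

signal = [10, 12, 11, 13, 40, 42, 41, 39]
low, high = analyze(signal)

# Hypothetical strategy: keep the low band fine, quantize the high band coarsely.
decoded = synthesize(quantize(low, 1), quantize(high, 4))
print("original:", signal)
print("decoded: ", decoded)
```

Without quantization the split is perfectly reversible; the savings come from coding each band with only as much precision as it needs.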

Compression encoding method

Transmission

According to different compression principles, audio signal coding is divided into waveform coding, parameter coding, and coding forms that integrate various technologies.
(1) Waveform coding: samples the time-domain or frequency-domain waveform of the audio signal directly at a certain rate, quantizes the amplitude samples into discrete levels, and converts them into digital codes. The decoder reconstructs the signal from this waveform data, so the reconstructed waveform is as consistent as possible with the original sound waveform, preserving detailed signal changes and transient characteristics.
(2) Parametric coding: first builds a feature model of the signal source (speech, natural sounds, and so on), then extracts and encodes the parameters of that model. The aim is for the reconstructed sound to keep the meaning of the original as far as possible, although the reconstructed waveform may be quite different from the original waveform. Commonly used feature parameters include formants, linear prediction coefficients, and band-splitting filter parameters. Parametric coding allows sound to be coded at very low rates, with the bit rate compressed to 2 kbit/s to 4.8 kbit/s or even lower, but the sound quality is only moderate and the naturalness is low, so it is suitable only for speech.
(3) Hybrid coding: combines waveform coding and parametric coding to overcome the weaknesses of each, aiming to keep the high quality of waveform coding together with the low rate of parametric coding. At rates of 4 to 16 kbit/s it can produce high-quality synthesized sound. Hybrid coding is based on linear predictive coding (LPC); commonly used methods include multi-pulse excited linear predictive coding (MPLPC), regular-pulse excited linear predictive coding (RPELPC), and code-excited linear prediction (CELP).
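Linear prediction coefficients, the feature parameters that parametric and hybrid coders rely on, can be estimated from a signal's autocorrelation. The following is a minimal sketch using the textbook Levinson-Durbin recursion on a made-up test signal; the function names are hypothetical, and real speech coders add framing, windowing, and quantization of the coefficients.

```python
def autocorrelation(x, max_lag):
    """r[k] = sum over n of x[n] * x[n - k], for k = 0..max_lag."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Fit A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order to autocorrelation r.

    Returns (a, err), where err is the remaining prediction-error energy.
    The prediction itself is x_hat[n] = -sum(a[j] * x[n - j]).
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this stage
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

# Toy input: a decaying sequence in which x[n] = 0.9 * x[n - 1].
signal = [0.9 ** n for n in range(200)]
coeffs, residual_energy = levinson_durbin(autocorrelation(signal, 2), order=2)
print("predictor coefficients:", [round(c, 4) for c in coeffs])
print("residual energy:", round(residual_energy, 4))
```

The coder then transmits a handful of coefficients plus a compact description of the prediction error instead of every sample, which is where the very low bit rates come from.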

Audio compression, how it works Part 2

Redundant information for transmission signals

Digital audio compression coding compresses the audio data as much as possible while ensuring that the signal is not audibly distorted. It does this by removing redundant components from the sound signal. Redundant components are signals in the audio that cannot be perceived by the human ear and do not help determine the timbre, pitch, or other properties of the sound. They include audio signals outside the range of human hearing and masked audio signals. For example, the human ear can perceive sound in the frequency range from 20 Hz to 20 kHz, so frequencies outside this range, which the ear cannot detect, can be treated as redundant. In addition, according to the physiology of the ear and psychoacoustics, when a strong signal and a weak signal are present at the same time, the weak signal is masked by the strong one and cannot be heard, so the weak signal can be regarded as redundant and need not be transmitted. This is the masking effect of human hearing, which takes two main forms, spectral masking and time-domain masking, described below:
Spectral masking effects.
When the sound energy at a given frequency falls below a certain threshold, the human ear cannot hear it; this threshold is called the minimum audible threshold. When another sound with higher energy is present, the threshold at frequencies close to that sound rises considerably. This is known as the masking effect.
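The minimum audible threshold can be approximated with a simple formula. The sketch below uses an analytic approximation of the threshold in quiet that is commonly attributed to Terhardt and often quoted in descriptions of perceptual codecs; it is not taken from this article, and the helper names are hypothetical. It covers only the threshold in silence, not the additional rise caused by a nearby strong sound.

```python
import math

def threshold_in_quiet_db(freq_hz):
    """Approximate minimum audible level (dB SPL) of a pure tone in silence."""
    f = freq_hz / 1000.0  # the formula works in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

def is_redundant(freq_hz, level_db):
    """True when a component lies below the threshold and could be dropped."""
    return level_db < threshold_in_quiet_db(freq_hz)

for freq in (100, 1000, 3300, 16000):
    print(f"{freq:6d} Hz: {threshold_in_quiet_db(freq):6.1f} dB SPL")
```

The curve dips around 3 to 4 kHz, where hearing is most sensitive, and climbs steeply toward both ends of the audible range, so a 10 dB component counts as redundant at 100 Hz but not at 1 kHz.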

Masking effects in the time domain.
Masking also occurs in the time domain: a strong signal can mask a weak one not only when the two are present at the same moment but also when they occur very close together in time. Time-domain masking is divided into three parts: pre-masking, simultaneous masking, and post-masking. Pre-masking means that, for a short time before the ear hears a strong signal, a weak signal that is already present is masked and cannot be heard. Simultaneous masking means that when a strong signal and a weak signal exist at the same time, the weak signal is masked by the strong one and cannot be heard. Post-masking means that after the strong signal disappears, it takes some time before the weak signal can be heard again. These masked weak signals can all be considered redundant.

Audio compression, how it works

audio compression

Audio compression technology refers to applying suitable digital signal processing to the original digital audio stream (PCM) in order to reduce (compress) its bit rate, either without losing any useful information or with a loss so small as to be negligible. It is also called compression encoding. It must have a corresponding inverse transform, called decompression or decoding. After passing through a codec system, an audio signal may pick up noise and some distortion. The advantages of digital signals are obvious, but they have corresponding disadvantages too, namely increased storage requirements and increased channel capacity requirements during transmission. Taking the CD as an example, the sampling frequency is 44.1 kHz and the quantization precision is 16 bits, so 1 minute of stereo audio occupies about 10 MB of storage, which means that a CD holds only about 1 hour of music. Of course, the problem is even more pronounced for digital video, which needs much higher bandwidth. Are all these bits necessary? Research has found that storing and transmitting the raw PCM stream involves a great deal of redundancy. In fact, sound can be compressed by at least 4:1 without perceptible loss, so that only 25% of the data is needed to retain everything the listener can hear, and in video the compression ratio can reach several hundred to one. Because resources are limited, compression technology has received much attention since its inception. Research on and application of audio compression have a long history: A-law and μ-law coding, for example, are simple, nearly instantaneous companding techniques that have been used in ISDN voice transmission.
Research on speech signals began earlier and is more mature, and techniques such as adaptive differential PCM (ADPCM) and linear predictive coding (LPC) are widely used.
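The CD figures quoted above follow from simple arithmetic, which the short sketch below reproduces. The constant names are illustrative, and the last line simply applies the 4:1 ratio mentioned in the text.

```python
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2  # stereo

bits_per_second = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS
bytes_per_minute = bits_per_second // 8 * 60

print(bits_per_second)       # 1411200 bits per second
print(bytes_per_minute)      # 10584000 bytes, i.e. about 10 MB per minute
print(bits_per_second // 4)  # 352800 bits per second after 4:1 compression
```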

Audio compression for music lovers

the truth about high bitrate lossy compression

In most people's minds, the term music lover is associated with a person who not only loves and collects music but also appreciates high-quality music, and not only in artistic and aesthetic terms but also in the quality of the recording itself. Just think: a few years ago, the audio CD was considered the standard for music quality, and a computer could not even dream of competing with the quality of a CD. However, time is a great joker and often likes to turn things upside down. Only a short while passed, a year or two, and that was it: on the PC, the CD faded into the background. Don't ask “why?”; you know the answer yourself. The blame lies with a revolution in the world of computer sound: audio compression (here meaning lossy compression that reduces the size of an audio file), which made it possible to store music on a hard disk, and lots of it! It also became possible to exchange music over the Internet. New sound cards were released that can squeeze almost studio quality out of a piece of hardware that once seemed useless for music. Today, even with a computer that is not especially powerful, if you buy a Creative SoundBlaster Live! and remember the good amplifier and good speakers you have had since Soviet times, you get nothing less than a high-quality music center whose sound is inferior only to very expensive audio equipment (the middle or even the highest Hi-Fi category). Add to this the general availability of music files and you realize what power you hold in your hands. And then the revolution happens: you realize that the compact disc is no longer so convenient, and you are fascinated by something completely different, the magic letters “MP3”.
You cannot eat or sleep; you are faced with the seemingly insoluble “chicken and egg” question: what to “squeeze” with and, most importantly, how to “squeeze” …

This is where I will help you. This article is the first in my new series of informational materials on music on the computer. In more than a year of developing OrlSoft MPeg eXtension and maintaining an extensive database of MP3 files, I have accumulated a great deal of research on audio compression, and it is this research that I will try to share with you. Many articles on audio compression have been written by respected authors, so I will try not to repeat what you can easily find in other sources. I would like to state my position on the subject simply and clearly. We will not treat audio compression as a tool for packing audio onto your hard drive as compactly as possible (so that you can fit as many hours of music there as you can). Yes, compression lets you store music more compactly, but my goal is to minimize the loss of quality when converting “pure” audio to compressed audio. This is why only high bit rates, and encoders that compress well at those bit rates, are considered. Compressed audio is simply much more convenient to work with: instant access to any track from any album, and convenient software for playback. And, of course, the financial side has not been forgotten either.

Of the audio compression formats that exist today, three in my opinion deserve attention: MP3 (MPEG-1 Audio Layer III), LQT (as a representative of the MPEG-2 AAC / MPEG-4 family), and the completely new OGG format (Ogg Vorbis), developed by a group of enthusiasts:

MP3 is by far the most widely used of these (mainly because it is free). Let me remind you that it was the MP3 format that started the triumphal march of compressed audio. However, as often happens with pioneers, it is gradually losing ground and giving way to newer and better formats.
The second format, LQT, belongs to the AAC family and represents a new direction in audio coding algorithms. It is a fairly high-quality format, but it is commercial and its details are kept closed.
OGG became widely known to the public this summer and is developing rapidly; soon (with the release of its encoder and decoder) it should beat MP3, offering better sound quality at a smaller file size.

What is video encoding and how does it work?

The technique of compressing videos

What do we mean when we talk about video encoding or, as industry experts generally call it, video coding?

Simply put, video encoding is the process of compressing and converting video content. The ultimate goal is to use less storage space, use less bandwidth, and make the user experience smoother. The compression process inevitably causes some loss of information: the more compression that is applied, the more data is removed from the video. The result is a version that differs from the original because of the missing data.

Why is video coding so important?

Video encoding is essential for streaming because compression makes it practical to transmit video over the Internet. Compression reduces the bandwidth required while still providing a high-quality experience. Without it, many users could not view raw video content over the Internet because their connections are too slow. The key figure in this process is the bit rate, that is, the amount of digital data that can be transmitted over a communication channel in a given time interval. When streaming, the bit rate determines whether users can view the content smoothly or are subjected to buffering.
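The relationship between bit rate, file size, and buffering can be made concrete with a little arithmetic. This is a simplified sketch: the function names and the example numbers are made up, and it ignores protocol overhead and fluctuations in connection speed.

```python
def file_size_megabytes(bitrate_bps, duration_s):
    """Size of a stream encoded at a constant bit rate (1 MB = 1,000,000 bytes)."""
    return bitrate_bps * duration_s / 8 / 1_000_000

def will_buffer(video_bitrate_bps, connection_bps):
    """A stream stalls when it needs more bits per second than the link delivers."""
    return video_bitrate_bps > connection_bps

# Hypothetical example: a 5 Mbit/s stream watched over a 3 Mbit/s connection.
print(file_size_megabytes(5_000_000, 60))  # 37.5 MB for one minute of video
print(will_buffer(5_000_000, 3_000_000))   # True
print(will_buffer(2_000_000, 3_000_000))   # False
```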

Another fundamental aspect of video coding is compatibility. Indeed, sometimes the content is already compressed to an appropriate size, but it still needs to be encoded to be compatible with different devices and applications, although this is often referred to as transcoding.

The video encoding process is governed by video codecs, which implement compression standards in software or hardware. Each codec consists of an encoder for compressing the video and a decoder for restoring an approximation of the video for playback. The name codec is in fact a blend of the words “encoder” and “decoder”.

But what is the best codec?

It depends on the type of video. Here we describe the most commonly used one.

For streaming high-quality video over the Internet, H.264 is arguably the most widely used codec and accounts for most multimedia traffic. It is considered to offer excellent quality, encoding speed, and compression efficiency, although it is not as efficient as the later HEVC (High Efficiency Video Coding) standard, also known as H.265. H.264 also supports 4K video streaming, a real achievement for a codec created in 2003.

Now that we have an overview of codecs, let’s look at some compression techniques.

Compression techniques

The most common compression technique is scaling the resolution. The higher the resolution of a video, the more information each picture contains. One way to reduce the amount of data is to reduce the size of the image and resample it. This produces fewer pixels, which lowers the level of detail in the image but also reduces the amount of information required. The process makes it possible to offer several quality levels for a video, each corresponding to a different resolution. A practical example: when you stream a movie, you can choose the resolution at which you want to watch it before playing it, provided your device supports it.
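To see how much data resolution scaling saves, the sketch below halves the width and height of a tiny grayscale frame by averaging each 2x2 block of pixels. It is only an illustration under simple assumptions: the frame is a list of rows with even dimensions, and real scalers use better resampling filters.

```python
def downscale_2x(frame):
    """Halve width and height by averaging each 2x2 block of pixels."""
    height, width = len(frame), len(frame[0])
    return [[(frame[y][x] + frame[y][x + 1]
              + frame[y + 1][x] + frame[y + 1][x + 1]) // 4
             for x in range(0, width, 2)]
            for y in range(0, height, 2)]

frame = [[10, 20, 30, 40],
         [10, 20, 30, 40],
         [50, 60, 70, 80],
         [50, 60, 70, 80]]
print(downscale_2x(frame))  # [[15, 35], [55, 75]]

# Pixel counts for a few common resolutions.
for width, height in ((1920, 1080), (1280, 720), (640, 360)):
    print(f"{width}x{height}: {width * height} pixels")
```

Each halving of both dimensions cuts the pixel count to a quarter, which is why lower rungs of a quality ladder need so much less bandwidth.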

One video compression technique that may not be widely known is interframe compression. This process removes information that is redundant from one frame to the next.

One building block is the P-frame, short for predicted frame, which can look back at an I-frame or another P-frame and check whether the same image content is already present. If it is, that part is not stored again, which saves space.

The B-frame, on the other hand, is a bidirectionally predicted frame: it can refer to both earlier and later frames, which offers good compression without affecting the viewing experience. However, this technique requires a higher coding profile.
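The idea of storing only what changed since a reference frame can be sketched as follows. This is a toy model with hypothetical function names: frames are flat lists of pixel values and the "P-frame" is a list of changed positions, whereas real codecs predict whole blocks with motion compensation and then code the residual.

```python
def encode_delta(reference, current):
    """Keep only the pixels that differ from the reference frame."""
    return [(i, new) for i, (old, new) in enumerate(zip(reference, current))
            if old != new]

def apply_delta(reference, delta):
    """Rebuild the current frame from the reference frame plus the changes."""
    frame = list(reference)
    for i, value in delta:
        frame[i] = value
    return frame

i_frame = [5, 5, 5, 5, 9, 9]     # stored in full
next_frame = [5, 5, 7, 5, 9, 8]  # mostly identical to the I-frame

p_frame = encode_delta(i_frame, next_frame)
print(p_frame)  # [(2, 7), (5, 8)]
print(apply_delta(i_frame, p_frame) == next_frame)  # True
```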

Another technique works on the color information. This process, called “chroma subsampling”, keeps the brightness (luma) of the image at full resolution while storing the color (chroma) at a lower resolution, which reduces the color detail. Finally, another method of compressing videos is to reduce the number of frames per second.
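The savings from chroma subsampling and from a lower frame rate can be estimated for uncompressed video. The sketch below assumes 8-bit samples and the usual meaning of the 4:4:4, 4:2:2, and 4:2:0 labels (three, two, and one and a half samples per pixel on average); the function names are illustrative.

```python
# Average number of samples stored per pixel (one luma plus shared chroma).
SAMPLES_PER_PIXEL = {"4:4:4": 3.0, "4:2:2": 2.0, "4:2:0": 1.5}

def raw_frame_bytes(width, height, scheme, bytes_per_sample=1):
    """Size of one uncompressed frame for a given chroma subsampling scheme."""
    return int(width * height * SAMPLES_PER_PIXEL[scheme] * bytes_per_sample)

def raw_bitrate_bps(width, height, scheme, fps):
    """Uncompressed bit rate at a given frame rate."""
    return raw_frame_bytes(width, height, scheme) * 8 * fps

print(raw_frame_bytes(1920, 1080, "4:4:4"))  # 6220800
print(raw_frame_bytes(1920, 1080, "4:2:0"))  # 3110400
print(raw_bitrate_bps(1920, 1080, "4:2:0", 30))
print(raw_bitrate_bps(1920, 1080, "4:2:0", 15))
```

Going from 4:4:4 to 4:2:0 halves the raw data before any other compression is applied, and halving the frame rate halves it again.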