Introduction to various conventional audio encodings (or formats) Part 3



Introducing a hearing model: through long-term acoustic research, experts have found that the human ear exhibits a masking effect.

A sound signal is an energy wave that propagates through air or another medium. The ear's most direct response to the energy of a sound, that is, to its sound pressure, is the sensation of how loud it is. We call this loudness, and its level is expressed in decibels (dB). Even sounds at the same level can be perceived as differently loud because their frequencies differ. The ear is most sensitive at around 4000 Hz; as the frequency moves above or below that, a sound at the same level seems quieter. Below a certain level the ear cannot hear a sound at all, and this threshold is different at each frequency.

Plotted against frequency, this threshold forms a curve that is roughly V-shaped. Above 15000 Hz the ear perceives sounds as very quiet, and many people with less acute hearing cannot hear 20000 Hz at all, no matter how loud it is.

When the ear hears two sounds of different frequency and different volume at the same time, the quieter one tends to be ignored. For example, the sound of a computer's cooling fan is hard to notice during the day but becomes a source of noise at night. Using this principle, an encoder can filter out many inaudible sounds, which reduces the amount of information and increases the compression ratio without significantly reducing sound quality. This is called the simultaneous masking effect. When sound A is masked by sound B, the masking is more pronounced if A lies within a certain range of frequencies centered on B. This range is called the critical bandwidth. Each frequency has a different critical bandwidth, and the higher the frequency, the wider the critical bandwidth.
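The hearing threshold described above is often approximated in the psychoacoustics literature by Terhardt's formula for the threshold in quiet. The sketch below is only an illustration: the formula does not come from this article, and the function name `threshold_in_quiet_db` is made up for the example.

```python
import math

def threshold_in_quiet_db(freq_hz: float) -> float:
    """Approximate hearing threshold in quiet (dB SPL), Terhardt's formula."""
    f = freq_hz / 1000.0  # the formula is written in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

# Print the threshold at a few frequencies to see the dip in the middle.
for hz in (100, 1000, 4000, 15000):
    print(f"{hz:>6} Hz: {threshold_in_quiet_db(hz):6.1f} dB SPL")
```

With this approximation the threshold is lowest in the 3 to 4 kHz region and rises toward both ends of the audible range, which is the shape an encoder's hearing model relies on.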



Introduction to various conventional audio encodings (or formats) Part 2

2. MP3 encoding

As the most popular audio compression format, MP3 is widely accepted. MP3-related software appears in a never-ending stream, and more and more hardware supports the format as well: many of the VCD/DVD players on sale can play MP3, and there are ever more portable MP3 players. Although several of the major music companies are extremely displeased with this open format, they have not been able to stop it from surviving and spreading.

MP3 has been in development for 10 years. The name is short for MPEG Audio Layer-3 (MPEG: Moving Picture Experts Group), an audio coding scheme defined as part of MPEG-1. MP3 can reach a compression ratio of about 12:1 while keeping the sound quality acceptable. In the days when hard drives were expensive, users quickly took to MP3, and with the spread of the Internet it was adopted by hundreds of millions of users.

When MP3 encoding was first released it was actually very imperfect. Because of a lack of research on sound and human hearing, almost all early MP3 encoders were crude, and the sound quality suffered badly. With the continuous introduction of new technology, MP3 encoding has been improved again and again, including two major technical improvements.

VBR: An interesting feature of MP3 files is that they can be played while they are being read, which matches the most basic property of streaming media. That is, the player does not have to read the whole file first; it plays whatever it has read so far, even if part of the file is damaged. An MP3 file can have a file header, but the header is not very important to the format. Because of this, each frame of an MP3 file can carry its own bitrate without needing a special decoding scheme. This is what makes VBR (variable bitrate) possible: each segment, or even each frame, of an MP3 file can have a separate bitrate. The advantage is that sound quality is preserved as far as possible while the file size is kept in check. The benefit is obvious, but the technique is hard to use, because the encoder has to know how to assign a bitrate to each segment, and an encoder that does no waveform analysis has no basis for doing so. As a result, VBR was not well received when it first appeared.
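To make the per-frame bitrate concrete, here is a minimal sketch that reads the bitrate out of a single 4-byte MPEG-1 Layer III frame header. It assumes a well-formed MPEG-1 Layer III header and does not handle other MPEG versions or layers; the function name `parse_mpeg1_layer3_header` and the two sample headers are made up for the example.

```python
# Bitrates in kbit/s for MPEG-1 Layer III, indexed by the 4-bit bitrate field.
# Index 0 ("free format") and index 15 (invalid) are not supported in this sketch.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]  # index 3 is reserved

def parse_mpeg1_layer3_header(header: bytes):
    """Return (bitrate_kbps, sample_rate_hz, frame_bytes) for one frame header."""
    if len(header) < 4:
        raise ValueError("a frame header is 4 bytes long")
    b0, b1, b2 = header[0], header[1], header[2]
    if b0 != 0xFF or (b1 & 0xE0) != 0xE0:
        raise ValueError("frame sync not found")
    version = (b1 >> 3) & 0x03   # 3 means MPEG-1
    layer = (b1 >> 1) & 0x03     # 1 means Layer III
    if version != 3 or layer != 1:
        raise ValueError("not an MPEG-1 Layer III frame")
    bitrate = BITRATES_KBPS[b2 >> 4]
    sample_rate = SAMPLE_RATES_HZ[(b2 >> 2) & 0x03]
    padding = (b2 >> 1) & 0x01
    if bitrate is None or sample_rate is None:
        raise ValueError("unsupported bitrate or sample rate index")
    # Frame length in bytes for MPEG-1 Layer III.
    frame_bytes = 144 * bitrate * 1000 // sample_rate + padding
    return bitrate, sample_rate, frame_bytes

# Two consecutive frames of a VBR file may simply carry different bitrate fields.
print(parse_mpeg1_layer3_header(bytes([0xFF, 0xFB, 0x90, 0x00])))
print(parse_mpeg1_layer3_header(bytes([0xFF, 0xFB, 0xB0, 0x00])))
```

Because every frame announces its own bitrate, a player can step from frame to frame without a global header, which is what lets each frame use a different rate.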

Introduction to various conventional audio encodings (or formats)

1. PCM encoding

PCM is short for Pulse Code Modulation. The general PCM workflow was described in the text above. We do not need to care which calculation method is used in the final PCM encoding; we only need to know the advantages and disadvantages of a PCM-encoded audio stream. The biggest advantage of PCM encoding is its good sound quality, and the biggest disadvantage is its large size. The familiar audio CD uses PCM encoding, and one CD can hold only about 74 minutes of music.
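To put a number on "large size", the following sketch works out how much space uncompressed PCM takes, assuming the CD parameters of 44.1 kHz, 16 bits and two channels. The function names are made up for the example.

```python
def pcm_bytes_per_second(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Bytes of raw PCM produced by one second of audio."""
    return sample_rate_hz * channels * bits_per_sample // 8

def pcm_size_bytes(seconds: int, sample_rate_hz: int = 44100,
                   bits_per_sample: int = 16, channels: int = 2) -> int:
    """Total size of a raw PCM stream of the given duration."""
    return seconds * pcm_bytes_per_second(sample_rate_hz, bits_per_sample, channels)

per_second = pcm_bytes_per_second(44100, 16, 2)
print("bytes per second:", per_second)
print("bit rate in bit/s:", per_second * 8)
print("bytes per minute:", pcm_size_bytes(60))
```

At roughly 10 MB per minute, an hour of CD-quality PCM already needs several hundred megabytes, which is why compressed formats became attractive.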

2. WAVE

This is an old audio file format developed by Microsoft. WAV is a file format that follows the RIFF (Resource Interchange File Format) specification. Every WAV file has a file header that records the encoding parameters of its audio stream. WAV has no strict rules about how the audio stream is encoded: besides PCM, almost any encoding that supports the ACM specification can be used for a WAV audio stream.

Many people are not familiar with this idea, so let's take AVI as an example, because AVI and WAV have very similar file structures, except that AVI also has a video stream. The AVI files we come across are of many kinds, so we often need to install a decoder to watch some of them. DivX, which we meet a lot, is one kind of video encoding. An AVI file can use DivX to compress its video stream, and it can of course use other encodings too. In the same way, WAV can use a variety of audio codecs to compress its audio stream. The WAV files we normally use have a PCM audio stream, but that does not mean WAV can only use PCM; an MP3 stream can also be stored in a WAV file. Just as with AVI, as long as the corresponding decoder is installed, you can play these WAV files.

On the Windows platform, PCM-encoded WAV is the best supported audio format, and virtually all audio software handles it well. Because it can meet high sound quality requirements, WAV is also the preferred format for music creation and editing, and it is well suited to storing source material. PCM-encoded WAV is therefore used as an intermediate format when converting between other encodings, for example from MP3 to WMA.
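The file header mentioned above can be inspected with Python's standard `wave` module. This is a small sketch that writes a tiny PCM WAV into memory and then reads the encoding parameters back out of the header; the sample values are arbitrary.

```python
import io
import struct
import wave

# Write a very short 16-bit stereo PCM WAV file into an in-memory buffer.
buf = io.BytesIO()
with wave.open(buf, "wb") as writer:
    writer.setnchannels(2)        # stereo
    writer.setsampwidth(2)        # 2 bytes = 16 bits per sample
    writer.setframerate(44100)    # sample rate in Hz
    writer.writeframes(struct.pack("<4h", 0, 0, 1000, -1000))  # two stereo frames

data = buf.getvalue()

# The RIFF container identifies itself in the first 12 bytes.
print(data[0:4], data[8:12])

# Fields of the "fmt " chunk: format tag (1 = PCM), channels, sample rate,
# bytes per second, bytes per frame, bits per sample.
fmt_fields = struct.unpack_from("<HHIIHH", data, 20)
print(fmt_fields)
```

A player reads exactly these fields to learn how to interpret the audio stream that follows, which is why the same container can carry PCM or a compressed encoding.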

PCM Audio Coding Part 4

Bit rate

Bitrate is the number of bits needed to represent one second of encoded audio data.

Lossy and lossless
By “lossless audio” we usually mean audio stored at the 16-bit/44.1 kHz of the traditional CD format. It is called lossless because its frequency response of 20 Hz to 22.05 kHz completely covers the audible range of the human ear.

One thing that confused me here was the relationship between channels and the sample rate. At first I assumed that if the sample rate was 44100 and two channels were used, the sample rate of each channel would be 22050. This is incorrect: the sample rate applies to each channel, not to all channels together.
Therefore, if the sample rate is 44100, two channels produce 88200 samples per second in total.
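The per-channel nature of the sample rate can be sketched as follows. The example assumes the common interleaved layout (left, right, left, right, ...); the function names are made up for the illustration.

```python
def total_samples_per_second(sample_rate_hz: int, channels: int) -> int:
    """The sample rate is per channel, so the total scales with the channel count."""
    return sample_rate_hz * channels

def deinterleave(samples, channels: int):
    """Split an interleaved sample list into one list per channel."""
    return [samples[c::channels] for c in range(channels)]

print(total_samples_per_second(44100, 2))

# Three stereo frames: each frame holds one sample for every channel.
left, right = deinterleave([1, -1, 2, -2, 3, -3], 2)
print(left, right)
```

Each channel still receives the full 44100 samples per second; the stream as a whole simply carries twice as many values.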

PCM Audio Coding Part 3

Sampling frequency

The range of frequencies humans can hear is 20 Hz to 20,000 Hz. To reproduce a frequency, the signal has to be sampled at least twice per cycle, so at least 40,000 samples per second are needed to meet the needs of the human ear during playback.
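A short sketch of why the sample rate has to be at least twice the highest frequency: a tone above half the sample rate does not disappear, it shows up at a lower "alias" frequency. The function name `alias_frequency_hz` is made up for the example.

```python
def alias_frequency_hz(tone_hz: float, sample_rate_hz: float) -> float:
    """Frequency at which a sampled tone appears after sampling."""
    # Fold the tone back into the range 0 .. sample_rate / 2.
    nearest_multiple = round(tone_hz / sample_rate_hz) * sample_rate_hz
    return abs(tone_hz - nearest_multiple)

# A 15 kHz tone sampled at only 20 kHz is heard at the wrong frequency ...
print(alias_frequency_hz(15000, 20000))
# ... while at 44.1 kHz it is captured at its true frequency.
print(alias_frequency_hz(15000, 44100))
```

This is why CD audio uses 44100 Hz rather than a rate close to 20000 Hz.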

8000 Hz is used for telephone sampling.
22050 Hz is a commonly used sample rate.
44100 Hz is already CD quality, and sample rates above 48000 Hz bring no benefit to the human ear.
An AAC (Advanced Audio Coding) frame usually holds 1024 samples. At a sampling rate of 44.1 kHz one frame therefore lasts 1024 / 44100 s, about 23.22 milliseconds, and decoding one frame has to take less time than that.

Why do we have to talk about audio frames here?
The concept of an audio frame is not as clear-cut as that of a video frame. In almost every video encoding format a frame can simply be thought of as one encoded image. An audio frame, however, depends on the encoding format and is defined by each encoding standard. With PCM (unencoded audio data) you do not need the concept of frames at all; the data can be played back using only the sample rate and the sample precision. For example, for mono audio with a sample rate of 44.1 kHz and a sample precision of 16 bits, the bit rate is 44100 * 16 bit/s (705.6 kbps) and the audio data per second is fixed at 44100 * 16 / 8 bytes.
But we do not want samples handed to us for processing one at a time; what we want is all the data sampled over a period of time. The audio frame here is the amount of sample data returned to us each time, and in general 2048 samples are returned.
So how large are 2048 samples of 16-bit mono audio? 2048 * 16 / 8 = 4096 bytes.
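The frame arithmetic above can be written as two small helper functions. This is only a sketch of the calculation; the function names are made up for the example.

```python
def frame_duration_ms(samples_per_frame: int, sample_rate_hz: int) -> float:
    """How long one frame of audio lasts, in milliseconds."""
    return 1000.0 * samples_per_frame / sample_rate_hz

def frame_size_bytes(samples_per_frame: int, bits_per_sample: int, channels: int = 1) -> int:
    """Size of one frame of raw PCM data, in bytes."""
    return samples_per_frame * channels * bits_per_sample // 8

# An AAC frame of 1024 samples at 44.1 kHz.
print(round(frame_duration_ms(1024, 44100), 2), "ms")
# A buffer of 2048 samples of 16-bit mono PCM.
print(frame_size_bytes(2048, 16), "bytes")
```

The frame duration is also the time budget for decoding: if decoding one frame takes longer than the frame lasts, playback cannot keep up.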

Sampling bits
Each sample records an amplitude, and the sampling precision depends on how much storage is used per sample (the sampling bits):

1 byte (8 bits) can record only 256 values, so the amplitude can be divided into only 256 levels.
2 bytes (16 bits) can record 65536 values, which is the CD standard.
4 bytes (32 bits) can divide the amplitude into 4294967296 levels, which is really unnecessary.

If the audio is stereo, the number of samples doubles and the file is almost twice the size.

PCM Audio Coding Part 2

Coding

The quantized samples are first expressed as a stream of decimal codes arranged in sampling order, that is, a decimal digital signal. The simplest and most efficient system for data, however, is binary, so the decimal codes must be converted to binary codes. The total number of quantization levels determines how many binary digits are needed, that is, the word length (sampling bits). The process of turning the quantized samples into a binary code stream of a given word length is called encoding.

Example
Continuing the example above, 1.65 V corresponds to quantization level 128, which is 10000000 in binary. The encoded result for this sample point is therefore 10000000. This encoding does not take positive and negative values into account; there are many encoding methods, and which one applies has to be analyzed case by case. (PCM audio encoding uses the 13-segment A-law method.)
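The 1.65 V example can be reproduced with a few lines of code. The sketch assumes an input range of 0 to 3.3 V and an 8-bit word length, which is what makes 1.65 V land on level 128; neither value is stated in this part of the article, and the function name `quantize` is made up for the example.

```python
def quantize(voltage: float, full_scale_volts: float = 3.3, bits: int = 8) -> int:
    """Map a voltage in 0..full_scale_volts to an unsigned quantization level."""
    levels = 1 << bits                       # 256 levels for 8 bits
    level = int(voltage / full_scale_volts * levels)
    return max(0, min(level, levels - 1))    # clamp to the valid range

level = quantize(1.65)
code = format(level, "08b")                  # binary code with a word length of 8
print(level, code)
```

Changing `bits` changes both the number of levels and the length of the binary code, which is the word length described above.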

PCM audio encoding
PCM signals are not compressed in any way, so they are lossless. Compared with analog signals, they are less susceptible to noise and distortion in the transmission system. The dynamic range is wide and the sound quality is quite good. The encoding uses the 13-segment A-law method.

The A-law 13-segment line
A-law is a form of logarithmic companding used for non-uniform quantization in PCM. Pulse code modulation (PCM) is the basic method for digitizing analog signals today. It consists of three steps: sampling, quantization, and encoding. Quantization makes the sample values discrete in amplitude. It can be uniform or non-uniform, and non-uniform quantization can effectively improve the quantization signal-to-noise ratio of small signals. For speech signals, two logarithmic forms of non-uniform quantization recommended by the ITU are commonly used: A-law and mu-law. A-law coding is mainly used in 30/32-channel PCM systems and is the one adopted in Europe and China.
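As a rough illustration of logarithmic companding, the sketch below implements the continuous A-law curve with the usual parameter A = 87.6 on samples normalized to the range -1 to 1. The 13-segment line is a piecewise-linear approximation of this curve and is not implemented here; the function names are made up for the example.

```python
import math

A = 87.6  # standard A-law compression parameter

def alaw_compress(x: float) -> float:
    """Compress a normalized sample (-1..1) with the continuous A-law curve."""
    ax = abs(x)
    if ax < 1.0 / A:
        y = A * ax / (1.0 + math.log(A))                    # linear part for small signals
    else:
        y = (1.0 + math.log(A * ax)) / (1.0 + math.log(A))  # logarithmic part
    return math.copysign(y, x)

def alaw_expand(y: float) -> float:
    """Inverse of alaw_compress."""
    ay = abs(y)
    if ay < 1.0 / (1.0 + math.log(A)):
        x = ay * (1.0 + math.log(A)) / A
    else:
        x = math.exp(ay * (1.0 + math.log(A)) - 1.0) / A
    return math.copysign(x, y)

for sample in (0.01, 0.1, 0.5, 1.0):
    print(sample, round(alaw_compress(sample), 4))
```

Small amplitudes are stretched before uniform quantization and squeezed back afterwards, so quiet passages get more of the available levels than they would with uniform quantization.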

Channel
PCM audio can be mono or stereo (two channels).
Each sample value is stored in an integer i whose length is the minimum number of bytes needed to hold the specified sample length.
The least significant byte is stored first. The bits representing the sample amplitude are placed in the most significant bits of i and the remaining bits are set to 0. This determines the data format of 8-bit and 16-bit PCM waveform samples.
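The byte layout described above can be sketched with Python's `struct` module. The example assumes the WAV conventions of unsigned 8-bit samples and signed little-endian 16-bit samples with the left channel first; the function names and sample values are made up for the illustration.

```python
import struct

def pack_8bit_mono(sample: int) -> bytes:
    """8-bit samples are stored as one unsigned byte (128 is silence)."""
    return struct.pack("B", sample)

def pack_16bit_stereo(left: int, right: int) -> bytes:
    """16-bit samples are signed, low byte first, left channel before right."""
    return struct.pack("<hh", left, right)

def pack_12bit_in_16(sample: int) -> bytes:
    """A 12-bit sample sits in the high bits of a 16-bit integer; low bits are 0."""
    return struct.pack("<h", sample << 4)

print(pack_8bit_mono(128).hex())
print(pack_16bit_stereo(0x1234, -2).hex())
print(pack_12bit_in_16(0x123).hex())
```

In the 16-bit stereo case the four bytes appear in the order left low, left high, right low, right high.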

DSD or PCM, which format is really better?

High definition music exists as digital files in two main formats, PCM and DSD. How are they alike, how are they different, and which one should I prefer?

What is PCM
Let's start with the fact that PCM (Pulse Code Modulation) is the older of the two. The first mentions of its successful use date back to the middle of the last century and, like many technological advances, are tied to the defense industry, namely naval radar. For home use, its best-known form is the CD, with a sampling frequency of 44.1 kHz and 16-bit quantization.

What is DSD
DSD (Direct Stream Digital) is a format based on pulse-density modulation. It was developed by Sony and Philips at the end of the last century and was intended for the digital archiving of analog recordings. The physical medium for this format is the SACD. In fact the two formats have only one thing in common: both are digital, which for the user means that unlimited copies can be made without loss. As for the difference, in graphic design terms it is much like the difference between raster and vector graphics, or, to put it more artistically, between cross-stitch and watercolor. In both cases you get a picture, but the way it is made, and as a result the way it is perceived, are completely different.

What is the difference?
PCM, if only because of its age, is much better studied. It is compatible with a far wider range of devices and it can be edited (equalization, splitting into frequency bands, transformations). DSD is in effect a closed format: you can record to it and you can play it back, and that is all. However, it is inherently much closer to the original analog signal.

Which is better?
The first and most important conclusion is that, technically, the two formats are implemented in very different ways, yet in practice, that is, in the sound of the final file, they are often indistinguishable. The differences are only small nuances in how the music is presented. So, all else being equal, when choosing the next file to download and play, it is best to look at the source material. If an analog recording is being digitized, DSD is probably preferable and will retain more of the nuances of the original. If it is a remaster of a digital recording originally made in PCM, it makes more sense for it to stay in that domain.

Audio pcm what. Digital sound: DSD vs PCM Part 3

Recovering an analog signal from the “digits”

But digitizing an analog signal is only half the battle. To listen to digital music, you have to convert it back. First, let's see how a digital DSD stream is converted to sound. As we already know, this stream is a high-frequency two-level signal (2.8 MHz or more) whose average value varies at audio frequencies. So, taking the simplest possible approach, you need to filter out all the high-frequency components of the DSD stream, leaving only the useful audio signal (frequencies up to 20 … 22 kHz). This is done with an analog low-pass filter (LPF). The simplest LPF is an RC circuit.

With such a filter, the result only vaguely resembles the original sinusoid. But let's not forget that we “applied” the simplest possible filter; with a better filter circuit the high-frequency noise can be removed almost completely, giving an analog signal of good quality.
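The idea of recovering audio from a two-level stream by low-pass filtering can be tried out numerically. The sketch below is a simplified model under stated assumptions: it uses a first-order delta-sigma modulator to produce the 1-bit stream (real DSD encoders use higher-order modulators) and a one-pole digital filter as a stand-in for the analog RC circuit. All function and variable names are made up for the example.

```python
import math

def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator: values in (-1, 1) become a +1/-1 stream."""
    integrator = 0.0
    previous_bit = 0.0
    bits = []
    for x in samples:
        integrator += x - previous_bit            # accumulate the error made so far
        previous_bit = 1.0 if integrator >= 0 else -1.0
        bits.append(previous_bit)
    return bits

def rc_low_pass(samples, sample_rate_hz, cutoff_hz):
    """One-pole low-pass filter, a discrete-time model of a single RC stage."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)                      # move part of the way toward the input
        out.append(y)
    return out

DSD64_RATE_HZ = 2_822_400                         # 64 * 44100, about 2.8 MHz

# 10 ms of a 1 kHz tone, modulated to one bit and filtered back.
tone = [0.5 * math.sin(2 * math.pi * 1000 * n / DSD64_RATE_HZ)
        for n in range(DSD64_RATE_HZ // 100)]
bits = delta_sigma_1bit(tone)
analog = rc_low_pass(bits, DSD64_RATE_HZ, 20_000)
print(len(bits), min(analog), max(analog))
```

The filtered output follows the original tone but keeps some residual high-frequency ripple, which is the reason a single RC stage gives only a rough result and better filters are used in practice.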

To restore an analog signal from digital PCM, an analog low-pass filter alone is not enough. The digital data must first be decoded, and this is done with digital-to-analog converters (DACs). There are different types, and describing them all is beyond the scope of this article, so let's look at the 2 most common in audio equipment. The first is the so-called ladder DAC (also called multibit). Such a DAC converts the PCM data stream into a sequence of signal values that looks like a staircase on a graph (Figure 6). As with DSD, an analog filter is needed to smooth out the steps.

Often these converters first oversample the digital PCM signal to a higher frequency (e.g. 192 kHz): this makes the “steps” smaller and allows a simpler analog filter circuit.

The second type of DAC, delta-sigma, oversamples to even higher frequencies while reducing the bit depth to one bit. Does that look familiar? It is the DSD signal we already know! We have already discussed how such a signal is then processed and converted to analog.

PCM and DSD application, advantages / disadvantages
Where do we meet each of the encoding methods? PCM is very common: CDDA discs, DVD-Audio, MP3, FLAC, ALAC and AAC files, film soundtracks and so on; it is easier to list what is not PCM. Super Audio CDs and DSF and DFF files are in DSD format. Which is better? Which format gives better sound?
Articles about the DSD format describe many advantages over PCM. But are all of those advantages real, or are they myths invented for laymen who do not understand the technical side, in order to win back a market densely occupied by PCM? Let's briefly go through the list.

Conclusions
So should you choose DSD or PCM? There is no single answer, and there cannot be one. PCM 24 bit 92 kHz and DSD128, for example, are very similar in their quality characteristics, and those characteristics already exceed those of the equipment the formats will be played on, so raising the quality of digital playback formats any further is not practical at this stage. When the sound of different high-definition formats is compared, subjective impressions come to the fore, because listeners do not live by quality alone: the design of the equipment, its cost and, most importantly, the listener's well-being and mood have a much greater effect on how the music feels. So choose what you personally like, and do not impose your opinion on others. Happy listening, everyone!

Audio pcm what. Digital sound: DSD vs PCM Part 2


Digital sound. How many myths revolve around this phrase! How many disputes have arisen between lovers of digital comfort and quality and supporters of the “live, airy” sound of vinyl multiplied by the “warm tube” sound. There is also plenty of controversy among the lovers of “numbers”: is 16×44.1 enough, or is 24×192 necessary? Which is better: multibit or delta-sigma? CDDA or SACD? PCM or DSD? In this article, I will try to explain the basics of digital sound in simple language and will go into more detail comparing two ways of encoding an analog signal digitally: DSD and PCM.

First, let’s answer the question, what is digital sound? How is it different from analog? In short, in mathematical terms, an analog audio signal is a continuous function, a digital audio signal is a discrete function. What does that mean?

Analog signal
If we picture a graph of a sinusoid (this is how a sound wave is most often represented), then no matter how much we magnify it to see all the details, we will always see a smooth, continuous line. This is an analog audio signal.

Analog audio (recording) has many parameters that can be used to evaluate its quality. Consider the three most important: frequency range, dynamic range, distortion.

The frequency range is the set of frequencies contained in a sound. It is generally accepted that the range of human hearing is 20 … 20,000 Hz (sometimes 16 to 22,000 Hz is given). The frequency range of the music itself is of no interest for judging quality (for example, the frequency range of an airplane taking off is very wide, while that of a tenor's vocal part is much narrower). The quality parameter of, say, a pair of earphones is the range of frequencies it can reproduce, and this is assessed from its amplitude-frequency characteristic (AFC), also called its frequency response. The ideal frequency response is a straight line across the whole range of audible frequencies: the sound source neither amplifies nor attenuates any individual frequency, so the reproduced sound matches the original.

Dynamic range (DD) is the difference between the quietest and the loudest sound. Loudness is measured in decibels (dB). It is generally accepted that the maximum volume that does not cause injury is 130 dB, the sound of an airplane taking off, and that the minimum audible volume, 5 … 10 dB, is about the level of leaves rustling in a light wind. Naturally, it is impossible to make out rustling leaves against an airplane taking off, and listening to music at 130 dB is extremely unpleasant. It is therefore generally accepted that a comfortable DD for listening to music is 80 … 100 dB.

Distortion is simply any deviation of the signal from the original.

Principles of digital sound presentation
What happens when analog audio is digitized? We will not go into the technical details and will work everything out, as they say, on paper. Let's draw our imaginary “ideal” sinusoid and measure the value of the signal at regular intervals (this is called sampling; rounding each measured value to one of a fixed set of levels is called quantization). We obtain a sequence of values, and this is our digital signal, produced by the pulse code modulation (PCM) method.

The two main quality parameters of a PCM signal are sampling frequency and bit depth. Sampling frequency is the number of measurements per second: the more there are, the more accurately the signal is conveyed. It is measured in hertz: 44100 Hz, 192000 Hz, etc. Bit depth determines the number of possible values each measurement can take (the precision of each value): the more values, the more accurate the signal. Bit depth is measured in bits: 16 bits (65,536 possible values, DD 96 dB), 24 bits (16,777,216 values, DD 144 dB), etc.
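The figures quoted for each bit depth follow from simple arithmetic: n bits give 2^n possible values, and the ratio between the largest and smallest step, expressed in decibels, is 20·log10(2^n), or about 6 dB per bit. The sketch below just reproduces that calculation; the function names are made up for the example.

```python
import math

def quantization_levels(bits: int) -> int:
    """Number of distinct values a sample of the given bit depth can take."""
    return 2 ** bits

def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of linear PCM: 20 * log10(2 ** bits)."""
    return 20.0 * math.log10(2 ** bits)

for bits in (16, 24):
    print(bits, "bits:", quantization_levels(bits), "levels,",
          round(dynamic_range_db(bits)), "dB")
```

Both values comfortably exceed the 80 … 100 dB range that is considered comfortable for listening, which is part of why 16-bit audio is sufficient for playback.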
