what is audio compression in multimedia Archives - Page 2 of 5

Relationship between human audible range and sample rate

The two main factors that indicate the performance of an audio interface are the number of sample bits and the sample rate.

Of these, the number of sample bits is expressed as a numeric value such as 16 bits or 24 bits. Last time I explained that the dynamic range depends on the number of sample bits, and used graphs to show that the number of bits determines the precision with which very quiet sounds can be expressed.
So what about the other one, the sample rate? The sample rate is also called the sampling frequency, and its unit is usually kHz. The most commonly used values are 32 kHz, 44.1 kHz, 48 kHz, and 96 kHz.
The Roland audio interfaces introduced last time, such as the UA-1X and UA-3FX, as well as the UA-1D and UA-20, all support 44.1 kHz and 48 kHz.

[Photos: UA-1X, UA-1D, UA-3FX, UA-20]
As many of you will know, CDs, which can be called the representative of digital audio, use 44.1 kHz, and 44.1 kHz is enough to express that clear sound. But why 44.1 kHz? There is a clear physiological basis for it: the relationship with the human audible range, that is, the audible frequency band.
Generally, the highest frequency that can be expressed is said to be half the sample rate. In other words, 44.1 kHz covers up to 22.05 kHz and 48 kHz covers up to 24 kHz. On the other hand, the range that healthy people can hear is said to be 20 Hz to 20 kHz. Therefore, according to the theory, recording above 20 kHz does not make sense because humans cannot perceive it. Allowing a small margin, the CD standard can express up to 22.05 kHz. The reason it ended up as an odd number like 44.1 kHz is said to be that, when the CD was standardized, VTRs were used for digital recording, and 44.1 kHz was a rate that fit the horizontal and vertical sync timing of the TV signal.
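The "half the sample rate" rule above can be checked with a few lines of Python. This is only a minimal sketch; the function name `nyquist_limit_hz` is made up for this illustration and does not come from any audio library.

```python
def nyquist_limit_hz(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent in theory (half the rate)."""
    return sample_rate_hz / 2


# The common sample rates listed in the text.
for rate in (32_000, 44_100, 48_000, 96_000):
    limit = nyquist_limit_hz(rate)
    print(f"{rate / 1000:g} kHz sampling -> up to {limit / 1000:g} kHz")
```

For 44.1 kHz this gives 22.05 kHz, which leaves a small margin above the 20 kHz upper limit of hearing mentioned in the text.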

■ Can humans really detect sounds above 20 kHz?

If you really cannot hear anything above 20 kHz, there is no point in picking up frequencies above that. But is that true?
The appearance of DVD-Audio, which offers higher sound quality than CD, suggests the answer. A signal at 20 kHz or above is certainly hard to perceive on its own, but when signals of many frequencies overlap, as in music, the atmosphere of the sound is said to differ depending on whether content above 20 kHz is being output. When I listen to a CD and an analog record, I sometimes feel that the record sounds better, and it can be argued that this is because analog does not set an upper limit on the frequency.
Here, let’s experiment a bit to see whether it is true that “the highest frequency that can be expressed is half the sample rate.”

[Figure: white noise spectra at sampling frequencies of 48 kHz and 96 kHz]
White noise expressed at a sampling frequency of 48 kHz (left) and at 96 kHz (right). At 48 kHz, sound is output only up to about 24 kHz, while at 96 kHz the output is flat across the whole graph. In the first two graphs the horizontal axis only goes up to 48 kHz, so 96 kHz looks completely flat, but when the horizontal axis is extended to 96 kHz on a logarithmic scale, you can see that the output reaches 48 kHz, almost exactly the theoretical value.
The graphs were made by creating white noise, which mixes sound evenly from low to high frequencies, at 48 kHz and 96 kHz, to show how far up in frequency each can express. Looking at them, the 48 kHz sample rate reaches about 24 kHz and the 96 kHz sample rate reaches about 48 kHz. Because the two graphs on the right use a logarithmic horizontal axis, it might not look like much of a difference, but the range is in fact doubled.
You can say that this is the difference between 48 kHz and 96 kHz.

■ If the final goal is a CD, do you need 24-bit / 96 kHz specifications?

By the way, some of you may have a doubt about the story so far. “I want to digitally record analog records and tapes and eventually put them on a CD, but if the CD itself is 16-bit / 44.1 kHz, isn’t something like 24-bit / 96 kHz over-spec and unnecessary?”
It certainly may not be necessary if you burn the recording to CD as is, without any processing.

What Are Sample Rate and Bit Depth?

Both image and video data have some numerical values related to image quality, such as the number of pixels, the number of colors that can be expressed, and the number of frames per second in the case of video.

Similarly, audio data has two numerical values related to sound quality: the sample rate and the bit depth. I do not understand the difficult details of either, but I am fairly sure I am not wrong about the basics, so I will write about these two today.

Sampling rate
Let’s start with the sample rate.

Simply put, the sample rate is a numerical value that indicates “how high a sound can be recorded.” When the sampling frequency is 44.1 kHz, you cannot record up to 44.1 kHz; you can record only up to about 22 kHz. Just remember that you can record up to half the sampling frequency. If you are wondering why that happens, google it (laughs).

It seems to affect the sound of instruments with a crisp, bright sound such as cymbals, but I have never changed only the sample rate under otherwise identical conditions and compared the results, so I do not know how much the sound actually changes with the sampling frequency. In professional environments, recordings are often made at 48 kHz. Occasionally you meet experts who boast that the sample rate changes the sound quality and that they can tell the difference. Apparently they can hear something. I would love to give them a blind test, but nobody has the free time to go along with me.

Bit depth
This is a numerical representation of “how quiet a sound (how small a change in volume) can be picked up.” This can be a bit difficult to imagine.

The higher the bit depth, the smoother the waveform becomes as the sound rises and falls, and the lower the bit depth, the rougher it becomes.

The usual options are 16-bit and 24-bit. Nowadays there is also 32-bit.

Bit depth is likely to make a difference when recording percussion instruments such as drums (instruments with extremely loud volume). Some engineers record in 16-bit from the start because the impression of the sound changes when a 24-bit drum recording is converted to 16-bit for burning to CD. Unlike the sample rate, this makes quite a difference.

Personal feelings about sample rate and bit depth
First of all, the sound quality of commonly sold CDs is 16-bit / 44.1 kHz. In the professional field, recordings are often made at 24-bit / 48 kHz (called “2448”, read “ni-yon-yon-pachi” in Japanese). And the human audible range is said to extend up to 20 kHz.

With that in mind, it is honestly ridiculous to see or hear claims like “This audio interface supports up to XX kHz, so the sound is good”. Just record at 2448. There should be hardly any current audio interface that does not support 2448.

There are audio interfaces that support 192 kHz, but I honestly doubt the idea that the higher the sample rate, the better the sound quality. The basis of recording is to record the desired sound as large as possible. Can recording sounds far outside the human audible range, and thereby reducing the proportion of the sounds we really want (the ones the human ear can hear, of course), really be called high-quality sound? To begin with, I think such high-frequency sound is nothing more than noise, like white noise. If you think those high frequencies are generated by playing musical instruments, then equally loud or louder ones are also generated by fluorescent lamps and all kinds of machines, and those are recorded too.

Data lost due to compression is irreversible Part 2

[Quantization bit number (bit depth)]

◉ Unit: bit
◉ Audio: Resolution related to volume. The higher the value, the more faithfully the quiet sound can be reproduced and the wider the theoretical dynamic range (ratio of the maximum and minimum volume values). 16-bit, 24-bit, and 32-bit floats are used primarily in production.
◉ If you compare it with the video …: Conceptually, it corresponds to the number of gradation bits. In terms of feel, it is almost the same as the dynamic range of the video. The wider the range, the greater the gradation possible without overexposure and underexposure.
◉ Remarks: There is no concept of the amount of quantization bits in compression formats such as MP3.
◉ Image of the number of quantization bits

When the vertical (volume) axis is divided into squares, a volume change smaller than one step cannot be reproduced and turns into noise. In other words, the finer the squares, the more accurately quiet sounds can be reproduced. The actual number of steps for the bit depths in common use is as follows.

・ 16 bit → 65,536 steps

・ 24 bit → 16,777,216 steps

You can see that 24-bit, which is regarded as high-resolution, can reproduce changes in volume far more finely than CD-quality 16-bit. In other words, 24-bit has a “wider dynamic range” than 16-bit.
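The step counts above are simply 2 raised to the number of bits. As a small check (the helper name `quantization_steps` is invented for this sketch):

```python
def quantization_steps(bits: int) -> int:
    """Number of distinct volume levels that an integer bit depth can express."""
    return 2 ** bits


for bits in (16, 24):
    print(f"{bits} bit -> {quantization_steps(bits):,} steps")
```

This only covers integer formats; 32-bit float, mentioned above, stores values differently and is not described by this count.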

[Sampling frequency]
◉ Unit: Hz
◉ Audio: Temporal resolution. It determines the reproducible frequency range. If the sampling frequency is low, the treble range will not be reproduced correctly. As the sampling frequency increases, it becomes possible to reproduce frequencies above the audible range. Those used primarily in production are 44.1 kHz, 48 kHz, 96 kHz, and 192 kHz.
◉ If you compare it to video …: In terms of temporal resolution, it is equivalent to frame rate. The higher the frame rate, the smoother the video will be (in the case of sound, it is perceived as treble reproducibility rather than smoothness).
◉ Remarks: The upper limit of the frequency that can actually be reproduced is half the sampling frequency. For example, at 96 kHz, frequencies up to 48 kHz can be reproduced.
◉ Image of the sampling frequency

If you compare it to video, you may get a feel for it. As of 2018, I think the minimum quality that can be used routinely is the “16 bit / 44.1 kHz” used by CDs. If either value drops below this, the sound degrades more and more audibly. If the number of bits is small, quiet sounds turn into noise, and if the sampling frequency is low, aliasing noise (noise inevitably generated by digitization; the audio equivalent of moiré) falls into the audible range and becomes jarring. Also note that half the sample rate is the upper limit of the frequencies actually recorded and played back. In other words, at “44.1 kHz” the actual recording / playback range is up to about 22 kHz. The human audible range is said to be 20 Hz to 20 kHz, so that is sufficient in terms of specs. By setting the sample rate to slightly more than twice the upper limit of the audible range, the aliasing noise is pushed out of the audible range, and by cutting it with a digital filter, the jarring noise is removed; that is CD quality. From this, you can see why “16 bit / 44.1 kHz” is the minimum line.
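To make the aliasing point concrete, here is a rough sketch of where a pure tone ends up if it is sampled without any filtering. The function name `alias_frequency_hz` is an assumption for this illustration, and real converters filter the signal first, so treat this as the idealized case only.

```python
def alias_frequency_hz(signal_hz: float, sample_rate_hz: float) -> float:
    """Apparent frequency of a pure tone after sampling with no anti-alias filter.

    Tones above half the sample rate "fold back" into the range below it.
    """
    folded = signal_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


# A 30 kHz tone sampled at 44.1 kHz lands inside the audible range.
print(alias_frequency_hz(30_000, 44_100))  # 14100
# A 10 kHz tone is below half the sample rate and is left where it is.
print(alias_frequency_hz(10_000, 44_100))  # 10000
```

This is why content above half the sample rate has to be removed by a filter: otherwise it reappears as unrelated tones in the audible range.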

The master file must be of high quality

That said, it’s hard to understand how sound quality changes at low bits and low sample rates without actually experiencing it.

Data lost due to compression is irreversible

In this series, we will focus on the basic knowledge about “sound” that is necessary for video production, from a little general knowledge to sound in general, including music, and explain it in an easy-to-understand way, leaving out the fine and difficult details as much as possible. I look forward to your company!

Now, the memorable first installment is titled [Digital Audio Basics]. There are several types of digital audio formats, and I have summarized the main ones here.

[Format types and functions]
◉ Uncompressed format: linear PCM (WAV, BWF, AIFF)
→ The most basic format for digital audio. BWF is a WAV variant for professional use that can contain metadata.

◉ Lossy compression format: MP3, AAC (MP4), MQA, etc.
→ Formats used mainly for general consumer purposes. In most cases, information is thinned out of the uncompressed data to compress it. The data size is reduced, but the sound quality deteriorates accordingly. MQA is a newer format that is said to be irreversible in terms of data but reversible in terms of sound quality.

◉ Lossless compression format: FLAC, ALAC, etc.
→ Formats used mainly for high-quality listening. They are reversible, meaning exactly the same sound as before compression can be restored, but the data size does not shrink that much.

◉ Others: DSD (DSF, DSDIFF, etc.)
→ Also called 1-bit audio. Because the concept is fundamentally different from multi-bit audio such as linear PCM, it cannot be compared on the same terms with “24bit” WAV and the like. It is currently regarded as one of the highest-quality formats, but it has the weakness that it cannot be edited as is.

How was that? I think the list ranges from familiar formats to ones you are seeing for the first time, but the one most suitable for today’s video production is “linear PCM”! The reasons are as follows.

1. Since it is an uncompressed format, it has excellent sound quality.

2. It can be edited freely, for example by cutting and pasting.

3. It is the most widespread digital audio format in the world, so almost any device or software can handle it.

Since MP3 and AAC (MP4) are lossy compressed formats, there is a considerable loss in sound quality. Depending on the compression ratio it may not be obvious at first listen, but they are not suitable as source material for processing, such as in video production and music production. FLAC and ALAC are lossless compression formats that do not degrade sound quality, but they do not reduce the size very much, and there is hardly any software that can edit them natively (without conversion to another format), so they are still unsuitable for production.

DSD was adopted for SACD, which appeared in 1999. It is said to be the digital audio closest to analog today, and its smooth texture differs from linear PCM in terms of sound quality. The format has finally attracted attention in recent years, but because of how it works it cannot be edited as is, so on the production site it is used mainly for one-shot music recording (recording without editing) and as a master recorder for mixing (combining multiple sounds into one stereo or surround track, also called track down).

I think you can now see that, for video production, linear PCM is almost the only choice. Of course, if the compressed format does not bother you, you can use it, but treat it as an emergency measure. If you care about quality, you must use linear PCM. The data lost by compression is irreversible. The file that will be the master of the work must be of the highest possible quality. By the way, whether you use WAV or AIFF, the sound quality is practically the same. However, considering compatibility, even Mac users can feel safer using WAV for data transfer.

“16 bit / 44.1 kHz”, CD quality, is the minimum line

Now let’s dive a little deeper into linear PCM. The specifications of linear PCM are expressed by the “number of quantization bits” (bit depth) and the “sampling frequency” (sample rate). Have you ever seen the notation “16 bit / 44.1 kHz”? It means that the original (analog) audio is sampled (digitized) 44,100 times per second, with the volume divided into 16-bit steps (2 to the power of 16 = 65,536)! Even so, I expect the reaction is still “what is this?”, so I have tried to sum up the points by comparing it to video!
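One practical consequence of “16 bit / 44.1 kHz” is the amount of data linear PCM produces. The sketch below multiplies the two numbers together; the function name `pcm_bit_rate_bps` is invented here, and the default of two channels (stereo) is an assumption added for the example, since the notation itself does not mention channels.

```python
def pcm_bit_rate_bps(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> int:
    """Raw data rate of uncompressed linear PCM in bits per second.

    channels=2 (stereo) is an assumption for this example.
    """
    return sample_rate_hz * bit_depth * channels


cd_bps = pcm_bit_rate_bps(44_100, 16)
print(cd_bps)  # 1411200 bits per second for stereo 16 bit / 44.1 kHz

# Bytes per minute of audio, ignoring file headers.
print(cd_bps // 8 * 60)  # 10584000, i.e. roughly 10.6 MB per minute
```

The same function gives 2,304,000 bits per second for stereo 24 bit / 48 kHz, which shows why higher specifications cost noticeably more storage.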

What is a bit rate?

The title asks “What is a bit rate?”, but I suspect most people would say “Bit rate? Never heard of it”.

Have you ever seen “MP3 128kbps” at a home electronics store?

MP3 is a compression format, and the “128 kbps” that follows is the bit rate. It indicates how much data is used per second of audio. The higher the number, the better the sound quality.

Conversely, the lower the number, the higher the compression ratio, but the worse the sound quality.

In other words, the big difference from CD and MD players is that you can decide the sound quality yourself. (For indecisive people, it means “you have to decide the sound quality yourself” (^ o ^) /)

I’m the latter type, so it took me two weeks to decide …

If you don’t know what to choose, that’s fine. Most media players and the software recommended by the manufacturers are set to MP3 128 kbps by default (the initial state). For Windows Media Player, the format is WMA (extension .wma).

So if you are not too particular, just buy a player, import your music on your computer, and transfer it! That’s all you need. Actually, I can’t tell the difference once it is set to 128 kbps or higher. Maybe the difference between 128 kbps and 192 kbps is something you can hear if you concentrate? (If you have good ears, you might notice it in everyday listening …)

So when should you change the bit rate?
～～ Increasing the bit rate ～～
For example, if you buy a large-capacity player, it will hold about 10,000 songs. You don’t listen to that many, and you couldn’t even if you tried, right? So if you have capacity to spare, I think it is fine to import at 192 kbps or the maximum of 320 kbps. (To be honest, I can’t tell the difference between 192 kbps and 320 kbps.)

If it is classical music, you could raise the bit rate by one step, since that is music where you enjoy the sound itself. I think pop music can be left as is, since there you enjoy the singing.

It’s fine to change it depending on the song, but that might get tedious quickly …

～～ Reducing the bit rate ～～
Since the capacity of flash memory players is limited, some people use a lower bit rate. However, at 64 kbps you can tell that the sound quality is clearly worse. Is it music you only have on in the background? If so, one option is to lower the bit rate to save capacity.

If you don’t have enough capacity, you can also lower it for language-learning material. “So should we study with poor sound quality?” It is true that the sound is a little worse, but since the frequency range is limited to that of the human voice, the degradation is less noticeable than with music. If you don’t need to save space, 128 kbps is enough.
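The trade-off between bit rate and capacity can be estimated with simple arithmetic. The sketch below is only a rough guide: the function name `file_size_mb` and the 4-minute song length are assumptions for illustration, and real files also contain tags and other overhead.

```python
def file_size_mb(bit_rate_kbps: int, duration_s: int) -> float:
    """Approximate file size in megabytes (10**6 bytes) at a constant bit rate."""
    bits = bit_rate_kbps * 1000 * duration_s
    return bits / 8 / 1_000_000


SONG_SECONDS = 4 * 60  # hypothetical 4-minute song

# The bit rates mentioned in the text.
for kbps in (64, 128, 192, 320):
    print(f"{kbps} kbps: about {file_size_mb(kbps, SONG_SECONDS):.2f} MB per song")
```

Under these assumptions a song at 320 kbps takes five times the space of the same song at 64 kbps, which is why the choice matters more on a small flash memory player than on a large-capacity one.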

By the way, I have illustrated the difference in bit rate below. The file size actually shrinks according to each setting. The standard setting is 128 kbps. Can you tell the difference compared to the original logo?

What is the “clock” on a CD?

A CD player contains an internal clock. You may think “really?”, but it is a fact.

This timepiece is called a “clock”, and the player actually carries a crystal oscillator (crystal clock) that keeps exact time. It is not there for a timer function. Timing is essential for reading the information recorded on the CD, and the crystal clock, the player’s internal clock, plays an important role. Since it produces a very high-frequency pulse (clock pulse), the pulse is divided down (the count is slowed) and used to issue the necessary commands to the various blocks in the player.

Here is a bit of trivia you can show off: “The clock is related to the pit length of the CD.” For the player to read the 0 and 1 information from the pits, the pit length and the timing of the internal clock must match exactly, and the format is cleverly designed for that purpose. The length of a pit is set to an integer multiple of the clock period. There are actually only 9 pit lengths on the disc, from the shortest (3T) to the longest (11T). T is one clock pulse, and you can see that it is a well-thought-out format.
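The idea that every pit length is a whole number of clock periods can be illustrated as follows. Note the assumptions: the channel clock of 4.3218 MHz is the commonly quoted figure for CD and is not given in the text, and the nine lengths are taken to run from 3T to 11T.

```python
# Assumption: CD channel bit clock of 4.3218 MHz (not stated in the article).
CHANNEL_CLOCK_HZ = 4_321_800

# Duration of one clock period T, in nanoseconds.
T_NS = 1e9 / CHANNEL_CLOCK_HZ

# Allowed pit lengths as multiples of T: 3T, 4T, ..., 11T.
pit_lengths = list(range(3, 12))

print(len(pit_lengths))  # 9 different lengths
for n in pit_lengths:
    print(f"{n}T = {n * T_NS:.1f} ns")
```

Because every length is an integer multiple of T, the player only has to decide how many clock ticks a pit lasted, which is why an unstable clock directly disturbs the reading.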

If the clock is off, the sound becomes muddy. This is because the time axis of the pulses fluctuates and jitter occurs. That is why external clocks are a popular topic among enthusiasts. If the internal clock is not good enough, there are far more accurate cesium and rubidium clocks, and you can drive the player with their pulses! This is why some high-end CD players have an external clock input.

Next time, let’s go through the glossary so far and how to read the specifications of optical disc players.

CD Player Sound Quality Enhancement Technology: What are High Bits and High Sampling?

Incidentally, CD players have various manufacturer-specific technologies to improve the sound.

Examples are Denon’s AL24 Processing and Pioneer’s Legato Link Conversion. Although the name differs from manufacturer to manufacturer, the basic idea is to use extension technologies such as high bit and high sampling to reproduce the subtle nuances and sense of atmosphere of the original analog audio that were cut off by the CD format. It is only a refinement within the CD format, but when you listen to it, the sound certainly feels clearer and seems to carry more information.

So what kind of processing is being done?

[Figure: sampling in the normal CD format (left) and with bit expansion and high sampling (right)]

The left side of the figure is the normal CD format. The horizontal axis is divided at fs = 44.1 kHz and the sample data is read with 16-bit precision. This is as explained earlier, and unless the player applies special processing, CD audio is played back as is.

The figure on the right is different. This is an image of the AL24 example, in which the bits are expanded from the usual 16-bit to 24-bit using a dedicated chip. By simple calculation, that allows 2 to the eighth power, that is, 256 times finer levels to be expressed. It seems that sophisticated things are done, such as shifting the upper and lower bits, but thanks to this bit expansion, together with high sampling such as 4fs and 8fs along the horizontal axis (extending the high-frequency range), the squares become much smaller. Even with a CD, you can enjoy high-quality sound that surpasses the CD format.
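The “256 times” figure can be seen by placing a 16-bit sample on a 24-bit grid. The sketch below is only the plain arithmetic of that step, not Denon’s actual AL24 algorithm, whose details are not described here; the function name `expand_16_to_24` is made up for the illustration.

```python
def expand_16_to_24(sample_16: int) -> int:
    """Place a signed 16-bit sample on the 24-bit grid by shifting it up 8 bits."""
    return sample_16 << 8


# Two neighbouring 16-bit levels...
low, high = expand_16_to_24(1000), expand_16_to_24(1001)

# ...are 256 levels apart on the 24-bit grid, so a processor has 255
# intermediate values available between them for estimated in-between levels.
print(high - low)  # 256
```

The shift alone adds no information; any improvement has to come from how the player fills in those intermediate values, which is the manufacturer-specific part.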

PCM conversion flow

Let’s summarize how an analog music signal is digitized with PCM and recorded on a CD. PCM stands for pulse code modulation.

The music signal is originally a continuous analog signal. A continuous waveform that ripples like a wave will not fit into the pits of a CD as is, so it is sampled first. Which parts of the rippling wave should be taken as samples? Naturally the intervals must be regular, and for CD it was decided to sample at 44.1 kHz. kHz is a unit of frequency: Hz is the number of repetitions per second, and k means a thousand. So samples are taken at the tremendous rate of 44,100 times per second. Sampling simply means taking samples; it does not mean that the wave is chopped into pieces.

After sampling along the time axis in this way, the next question is with what precision to read the discrete data (points). This is quantization. The word is not used often, but in English it is called quantizing. Since the vertical axis of the graph is the signal level, that is, the magnitude, the precision is a matter of how many steps are used to read up to the highest point of the wave. The unit is the number of bits.

A bit is a binary digit used in digital counting. Binary numbers are deceptive: as the number of bits increases, the number of values that can be expressed (number of steps = quantization precision) grows at an accelerating rate. The calculation is “2 raised to the power of the number of bits.” For example, 3 bits gives 2 x 2 x 2 = 8 steps, but 5 bits gives 2 x 2 x 2 x 2 x 2 = 32 steps. You can see it will become enormous if we continue like this. Indeed, 16 bits is 2 to the power of 16, so multiplying 2 by itself 16 times gives 65,536 steps. Remember it as “about 65,000 steps”.

Even so, the data itself is not analog, yet when you play it on a CD player the original continuous analog wave comes out, which is what makes digital so impressive. In practice, after quantization an encoding step is performed, and a 16-bit PCM digital signal such as “010011 … 10” is obtained.

Digital is strict, and there are in fact some rules. It is often said that “CD has a frequency range up to 20 kHz and a dynamic range of 96 dB”. Both are determined solely by the format. Put simply, the 20 kHz high-frequency limit comes from the sample rate, while the 16-bit quantization sets the dynamic range at 96 dB.

It is a bit theoretical, but according to “Shannon’s sampling theorem” (Shannon was a great scholar), frequencies up to almost half the sampling frequency (fs) can be recorded. For quantization, there is a rule of thumb of about 6 dB per bit, which gives 6 x 16 = 96 dB.
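The “6 dB per bit” rule of thumb comes from expressing the ratio between the largest and smallest quantization step in decibels. A minimal sketch, with a function name (`dynamic_range_db`) invented for this example:

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Ratio of full scale to one quantization step, expressed in decibels."""
    return 20 * math.log10(2 ** bits)


print(round(dynamic_range_db(1), 2))   # 6.02 dB for each added bit
print(round(dynamic_range_db(16), 1))  # 96.3 dB for 16-bit
print(round(dynamic_range_db(24), 1))  # 144.5 dB for 24-bit
```

So the familiar 96 dB is the rounded result for 16 bits. This is the theoretical figure only; what real equipment achieves depends on its analog circuitry.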

What are the sample rate, the number of quantization bits, and the clock?

There is some format jargon that you really need to know about CDs: the “sampling frequency” and the “number of quantization bits”.

In connection with that, learning about the CD’s “clock” will deepen your understanding. Then the upcoming “How to Read Specifications / Optical Discs” will sink in easily.

■ What is the sampling frequency and the number of bits?

Digital audio recorded on a CD has a 44.1 kHz sampling frequency and 16-bit quantization, right? Yes, that is correct. These terms have appeared several times so far, but this is the first time we explain them in detail from the basics.

First, let’s start with an image. Set aside the esoteric feel of the words sampling and quantizing, and first think of “slicing vertically” and “slicing horizontally” through the signal. Think of it like cutting a radish. First, cut it vertically with a kitchen knife. You make many cuts, but the radish was originally continuous. The solid curve is the analog audio, and the first thing to do when digitizing it is this “vertical slice” = “sampling”.

Next is the quantization work. With the cuts in place, quantization means laying the kitchen knife on its side and slicing crosswise. The radish is then divided into small squares. Can you picture that the finer the squares, the closer the result is to the original analog signal?

The CD format is the rule for how finely the radish (the analog signal) is cut. “The sampling frequency is 44.1 kHz and the number of quantization bits is 16” means that sampling is first done at a rate of 44,100 times per second, and then the level is read with 16-bit precision (2 to the power of 16 steps). Sampling always comes first; without it, the quantization work cannot be done.
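The two “slices” can be imitated in a few lines of Python. This is a simplified sketch under stated assumptions: the input is a pure full-scale sine wave instead of real audio, rounding to the nearest level stands in for a real converter, and the function name `sample_and_quantize` is invented for the example.

```python
import math


def sample_and_quantize(freq_hz, sample_rate_hz, bits, n_samples):
    """Sample a full-scale sine wave and round each sample to a signed integer level."""
    max_level = 2 ** (bits - 1) - 1  # largest positive level, e.g. 32767 for 16 bits
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                       # vertical slice: time of sample n
        value = math.sin(2 * math.pi * freq_hz * t)  # "analog" value between -1 and 1
        samples.append(round(value * max_level))     # horizontal slice: nearest level
    return samples


# The same 1 kHz tone cut finely (16 bits) and coarsely (3 bits).
print(sample_and_quantize(1_000, 44_100, 16, 8))
print(sample_and_quantize(1_000, 44_100, 3, 8))
```

With 3 bits several neighbouring samples collapse onto the same level, which corresponds to the coarse squares in the radish picture; with 16 bits each sample gets its own finely spaced value.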

What is the so-called bit rate?

A value indicating how many bits of information are processed or sent / received per unit of time.

Also called transfer rate. The amount of information in one second of audio or video data is expressed in “bits per second” (bps). Because the numbers have many digits, it is usually combined with “k (kilo)”, representing a thousand, or “M (mega)”, representing a million, and written as “kbps” or “Mbps” (1 kbps is 1,000 bps, 1 Mbps is 1 million bps). It is often used in the audiovisual (AV) field, and for audio and image data, the higher the value, the more detailed the information and the better the sound and picture quality. The standard bit rate for MP3, one of the audio compression formats, is 128 kilobits per second (kbps), which compresses an uncompressed WAV file with CD sound quality (approximately 1,400 kbps) to roughly one-tenth of the amount of information. Video bit rates are higher because of the larger amount of information: high-definition terrestrial digital broadcasting is about 18 megabits per second (Mbps), and BS high-definition digital broadcasting is about 24 Mbps. There is also a unit of transfer speed called “bytes per second” (Bps or B/s), which expresses the number of bytes per second. Since 1 byte is 8 bits, Bps can be calculated by dividing bps by 8.
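The unit conversions in this entry can be written out as a short sketch. The helper name `bps_to_bytes_per_second` is made up for the example, and the 1,400 kbps and 128 kbps figures are the approximate values quoted above.

```python
def bps_to_bytes_per_second(bps: float) -> float:
    """Convert bits per second to bytes per second (1 byte = 8 bits)."""
    return bps / 8


WAV_KBPS = 1_400  # approximate CD-quality WAV, as quoted above
MP3_KBPS = 128    # standard MP3 bit rate, as quoted above

# Fraction of the original data that remains after compression.
print(f"{MP3_KBPS / WAV_KBPS:.3f}")

# 128 kbps expressed in bytes per second.
print(bps_to_bytes_per_second(MP3_KBPS * 1000))  # 16000.0
```

The fraction comes out a little under 0.1, which matches the “roughly one-tenth” in the description.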

Bit rate

It is the data communication speed, which is the amount of data that can be sent and received in a certain period of time. The unit is “bps”, which is short for “bits per second”. It is also used to refer to the amount of data used to express one second of video or audio when compressing video or audio. The greater the amount of data (= lower the compression rate), the more faithful it will be to the original, but a high-speed communication line is required.
On the other hand, as the amount of data is reduced (= the compression rate is higher), the image quality and sound quality deteriorate, but transmission is possible even in an environment where the communication speed is slow .
⇨  bps, transmission.

The number of bits processed or transferred per unit of time. It is generally expressed as a number per second, with bps as the unit. In a computer network it is used as the physical quantity for communication speed, and for data transfer with peripheral circuits or devices within a computer it is used as the physical quantity for transfer speed. It is also used as a unit to express the amount of information per second when compressing audio and video data; for the same compression method, the higher the value, the higher the sound and picture quality. ◇ Also called “bit efficiency”.