Psychoacoustic Models in MP3 and AAC Encoding


Let’s talk about Psychoacoustic Models in MP3 and AAC Encoding

When it comes to digital audio compression, especially in MP3 and AAC formats, psychoacoustic models are the secret sauce that makes it all work. These models allow us to shrink large audio files into much smaller sizes without a noticeable loss in sound quality. In my years of working with audio encoding, I’ve seen how these models have revolutionized the way we perceive sound after compression. The core idea is simple: we don’t hear all sounds equally. Some frequencies and nuances are more noticeable than others, and psychoacoustic models exploit this fact to make compression more efficient.

Think of it like this: imagine you’re at a concert, and a loud bass guitar is playing alongside a softer violin. Your attention is drawn to the bass because it’s much louder, and the violin’s subtle details get masked. This is exactly what psychoacoustic models do—they remove or reduce sounds that are unlikely to be heard due to masking effects. In this article, I’ll walk you through how psychoacoustic models in MP3 and AAC encoding work and why they matter for audio quality and file size.

Understanding the Basics of Psychoacoustic Models

Psychoacoustic models are based on the science of how our ears and brain perceive sound. They take into account how different sounds mask each other, which frequencies we are most sensitive to, and how we interpret sound in different contexts. MP3 and AAC encoding use these models to compress audio by identifying and removing information that won’t be noticeable to the listener.

A simple analogy would be taking a photograph with a high-resolution camera and then reducing its size by removing some pixels. You won’t notice much difference in the quality of the image because you can’t see all the pixels. Similarly, these audio encoders remove frequencies or audio details that the human ear won’t detect, making the audio file smaller without compromising its perceived quality.
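To make the point about frequency sensitivity concrete, here is a minimal sketch of the threshold of hearing in quiet, using a widely quoted approximation usually attributed to Terhardt. The function name is made up for this illustration, and the formula is only an average-listener approximation, not the table any particular encoder uses.

```python
import math

def threshold_in_quiet_db(freq_hz: float) -> float:
    """Approximate threshold of hearing in quiet, in dB SPL (Terhardt-style fit)."""
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)

# The ear is most sensitive around 3-4 kHz and much less so at the extremes.
for f in (100, 1000, 3300, 10000, 16000):
    print(f"{f:>6} Hz: {threshold_in_quiet_db(f):6.1f} dB SPL")
```

Anything quieter than this curve at a given frequency is inaudible even with no other sound present, which is the simplest case an encoder can exploit.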

Frequency Masking

  • Frequency masking happens when a louder sound in one frequency range makes a softer sound in a nearby frequency range inaudible.
  • Psychoacoustic models use this to discard or reduce the quieter, masked sounds, optimizing compression.
  • For example, if a heavy guitar is playing at a loud volume, the model might remove the higher-pitched background notes that are masked by the louder guitar.
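The idea in the list above can be sketched as a toy model: a single loud tone raises a triangular "masked threshold" around itself on the Bark (critical band) scale, and a quieter tone below that threshold is treated as inaudible. The Bark conversion is Zwicker's well-known approximation; the offset and slope constants are assumptions chosen for illustration and are not taken from the MP3 or AAC standards.

```python
import math

def hz_to_bark(freq_hz: float) -> float:
    """Zwicker's approximation of the Bark (critical band) scale."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))

# Illustrative constants only (assumptions, not values from any standard).
OFFSET_DB = 10.0       # gap between the masker level and the peak of the threshold
SLOPE_BELOW_DB = 27.0  # threshold falloff per Bark toward lower frequencies
SLOPE_ABOVE_DB = 10.0  # threshold falloff per Bark toward higher frequencies

def masked_threshold_db(masker_hz: float, masker_db: float, probe_hz: float) -> float:
    """Level below which a probe tone is assumed to be hidden by the masker."""
    dz = hz_to_bark(probe_hz) - hz_to_bark(masker_hz)
    slope = SLOPE_ABOVE_DB if dz >= 0 else SLOPE_BELOW_DB
    return masker_db - OFFSET_DB - slope * abs(dz)

def is_masked(masker_hz: float, masker_db: float, probe_hz: float, probe_db: float) -> bool:
    return probe_db < masked_threshold_db(masker_hz, masker_db, probe_hz)

# A loud 1 kHz tone hides a quiet neighbor but not a distant tone at the same level.
print(is_masked(1000, 80, 1100, 40))
print(is_masked(1000, 80, 8000, 40))
```

The asymmetric slopes reflect the common observation that masking spreads further toward higher frequencies than toward lower ones.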

Temporal Masking

  • Temporal masking occurs when one sound, like a sharp drum hit, masks a quieter sound that occurs immediately after it or, for a few milliseconds, just before it.
  • This type of masking is crucial for determining which transient sounds can be removed in compression.
  • For instance, a loud snare hit can mask a subtle violin note that comes milliseconds after, making it unnecessary to keep all the data for that note.
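A rough way to picture the points above is a threshold that is highest at the moment of the loud sound and fades with time. The sketch below uses a simple linear fade; the window lengths and the offset are assumed values for illustration only, since real masking depends on level, frequency and listener. It also allows a short window before the loud sound, because masking can act slightly backward in time as well.

```python
# Assumed, illustrative constants (not taken from any codec specification).
PRE_MASK_MS = 5.0      # short window before the loud sound
POST_MASK_MS = 150.0   # longer window after the loud sound
PEAK_OFFSET_DB = 10.0  # gap between masker level and peak threshold

def temporal_threshold_db(masker_db: float, delta_ms: float) -> float:
    """Toy threshold at delta_ms relative to the masker (negative means before it)."""
    window = POST_MASK_MS if delta_ms >= 0 else PRE_MASK_MS
    remaining = 1.0 - abs(delta_ms) / window
    if remaining <= 0:
        return float("-inf")  # outside the window: no masking at all
    return (masker_db - PEAK_OFFSET_DB) * remaining

def is_temporally_masked(masker_db: float, probe_db: float, delta_ms: float) -> bool:
    return probe_db < temporal_threshold_db(masker_db, delta_ms)

# A 30 dB note 20 ms after a 90 dB snare hit is hidden; 300 ms later it is not.
print(is_temporally_masked(90, 30, 20))
print(is_temporally_masked(90, 30, 300))
```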

The Role of Psychoacoustic Models in MP3 Encoding

In MP3 encoding, psychoacoustic models play a critical role in reducing the file size while maintaining an acceptable level of sound quality. The MP3 codec was one of the first to use psychoacoustic models to exploit human hearing limitations, and it was revolutionary when it was introduced in the 1990s. The encoder divides audio into different frequency bands and applies masking principles to decide which data can be discarded.

What’s fascinating is that MP3 uses a hybrid filter bank. The audio is processed in short frames, split into 32 subbands by a polyphase filter, and each subband is then refined with a modified discrete cosine transform (MDCT). In parallel, the psychoacoustic model runs its own frequency analysis to estimate what will be masked. Using this information, the encoder decides which frequency regions can be quantized more coarsely or dropped entirely. This is how the MP3 format achieves relatively small file sizes while preserving the overall listening experience.
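The "split into segments, then analyze frequencies" step can be sketched in a few lines. This is a deliberately simplified illustration: it uses non-overlapping frames and a naive discrete Fourier transform, whereas a real MP3 encoder uses overlapping windows and its own filter bank. The frame size and helper names are assumptions made for this example.

```python
import cmath
import math

def split_into_frames(samples, frame_size):
    """Cut the signal into consecutive, non-overlapping frames (any tail is dropped)."""
    return [samples[i:i + frame_size]
            for i in range(0, len(samples) - frame_size + 1, frame_size)]

def dft_magnitudes(frame):
    """Naive DFT; returns the magnitude of bins 0 .. N/2."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        mags.append(abs(acc))
    return mags

def loudest_bin(frame):
    """Index of the frequency bin carrying the most energy in this frame."""
    mags = dft_magnitudes(frame)
    return max(range(len(mags)), key=mags.__getitem__)

FRAME = 64
# Test tone with exactly 8 cycles per frame, so its energy lands in bin 8.
signal = [math.sin(2 * math.pi * 8 * t / FRAME) for t in range(4 * FRAME)]
frames = split_into_frames(signal, FRAME)
print(len(frames), [loudest_bin(f) for f in frames])
```

Once each frame has a spectrum like this, a masking model can be applied bin by bin to decide where precision can be reduced.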

MP3 and the Trade-off Between Compression and Quality

  • MP3 encoding sacrifices some of the finer audio details to reduce file size.
  • The trade-off is more noticeable at lower bitrates, where artifacts like compression noise or a “tinny” sound may become audible.
  • Higher bitrates, like 192 kbps or 256 kbps, provide better sound quality, though the file size increases.

AAC: The Next Generation of Psychoacoustic Modeling

While MP3 revolutionized audio compression, AAC (Advanced Audio Coding) takes things a step further. As a more advanced codec, AAC uses a refined psychoacoustic model that performs better at lower bitrates, providing higher-quality audio with less data. This is especially important for modern audio streaming services, which need to balance high-quality sound with efficient bandwidth usage.

The AAC psychoacoustic model is more sophisticated, taking into account additional factors like stereo imaging and spatial effects. It’s also more adept at handling complex audio, such as orchestral music or tracks with a wide range of dynamics. From my experience, AAC does a better job than MP3 in preserving the subtleties of sound, especially at lower bitrates, which is why I recommend it over MP3 when available.

Why AAC Outperforms MP3

  • AAC uses more advanced psychoacoustic techniques, making it more efficient at lower bitrates.
  • It better preserves transient sounds and complex audio elements, like the reverberations of a piano or the nuances of a singer’s voice.
  • With AAC, you can get excellent sound quality at 128 kbps, whereas MP3 may require 192 kbps or higher for a similar result.

How Psychoacoustic Models Help with Audio Quality at Low Bitrates

One of the most remarkable aspects of psychoacoustic models is how they enable high-quality audio at low bitrates. At lower bitrates, many codecs, including MP3 and AAC, might introduce artifacts such as distortion or loss of clarity. However, psychoacoustic models allow the encoder to focus on the most important elements of the sound—those that we are most likely to notice—while discarding the less important parts.

This is especially noticeable in AAC, where the advanced psychoacoustic model ensures that even at low bitrates, the encoding still captures essential auditory information, such as pitch, rhythm, and timbre. I’ve personally found that with AAC, even at 128 kbps, I can enjoy clear vocals and instruments without the harsh artifacts that often accompany MP3 at the same bitrate.

Final Words on Psychoacoustic Models in MP3 and AAC Encoding

Psychoacoustic models are an integral part of both MP3 and AAC encoding, helping us achieve smaller file sizes while preserving audio quality. These models allow the encoder to reduce the file size by removing sounds that are less perceptible to the human ear, making the audio more efficient without sacrificing what matters most to the listener. While MP3 was groundbreaking in its time, AAC offers superior compression and better handling of complex audio, making it the better choice for modern audio applications.

As I’ve discussed throughout this article, these psychoacoustic models are crucial in ensuring that we can enjoy high-quality audio, even with file sizes that fit comfortably on our devices and bandwidth constraints. Whether you’re listening to your favorite album or streaming a podcast, psychoacoustic models are working behind the scenes to make your audio experience better. As the technology continues to improve, we can only expect even better performance in the future.

Frequently Asked Questions

What are psychoacoustic models in MP3 and AAC encoding?

Psychoacoustic models in MP3 and AAC encoding are based on the way humans perceive sound. These models analyze how different frequencies mask each other, allowing the codecs to remove or reduce the data for sounds that are less noticeable to the human ear. This process helps reduce file size without sacrificing audio quality. Essentially, psychoacoustic models optimize compression by focusing on the most important sounds in an audio file.

How do psychoacoustic models improve audio compression?

Psychoacoustic models improve audio compression by eliminating or reducing sounds that the human ear is less sensitive to. For example, louder sounds can mask softer ones, so the encoder can discard those quieter sounds, saving space without impacting the perceived quality of the audio. This makes it possible to compress audio files into smaller sizes while still delivering high-quality sound, especially in formats like MP3 and AAC.

What is the difference between MP3 and AAC in terms of psychoacoustic models?

The main difference between MP3 and AAC lies in the sophistication of their psychoacoustic models. AAC has a more advanced model that better handles complex audio, such as classical music or tracks with subtle dynamic changes. It also performs better at lower bitrates compared to MP3, providing higher sound quality at the same compression level. In short, AAC offers superior compression efficiency, especially when dealing with modern audio formats and streaming.

Why does AAC sound better than MP3 at lower bitrates?

AAC sounds better than MP3 at lower bitrates because it uses a more efficient psychoacoustic model. The AAC codec is designed to optimize the way it removes or reduces sounds, prioritizing the frequencies that are most important for human perception. This allows it to achieve a better balance between file size and audio quality, especially at bitrates like 128 kbps, where MP3 might begin to show noticeable artifacts.

How does temporal masking affect audio compression?

Temporal masking occurs when a loud sound at one moment in time masks a softer sound that follows it almost immediately. This effect is important for audio compression because it allows the encoder to discard these masked sounds without the listener noticing. This type of masking helps improve compression efficiency, especially in formats like MP3 and AAC, where transient sounds, like a snare hit or cymbal crash, may cover quieter background elements.

Can psychoacoustic models cause distortion in compressed audio?

While psychoacoustic models aim to reduce file size without degrading sound quality, they can sometimes introduce distortion, particularly at lower bitrates. This happens when the codec removes too much data, resulting in noticeable artifacts such as a “tinny” or metallic sound. However, with modern codecs like AAC, these artifacts are much less common, even at lower bitrates, thanks to more advanced psychoacoustic modeling.

Comments:

Wow, I had no idea how much science goes into these audio codecs. Your explanation about frequency and temporal masking really helped me understand why AAC sounds better at lower bitrates. Great article! – AudioFan77

I’ve always been a fan of MP3, but now I’m definitely considering switching to AAC for my music collection. The way you described the differences in psychoacoustic models makes it so much clearer! Thanks! – MusicJunkie88

This article is awesome! The real-life examples helped me visualize how psychoacoustic models work. I never understood how my music could sound so good at a low bitrate, but now I get it. Thanks for the great info! – SoundLover42

Can you talk more about how AAC handles high-frequency sounds compared to MP3? I’d love to know more about that! Great article though, very informative. – HighFreqFan

I didn’t realize how important these psychoacoustic models were in compressing audio. I always wondered how audio streaming services maintain such high-quality sound at lower bitrates. Now I know! – DeeJayDave

This is one of the most detailed articles on this topic I’ve found! I’ve been using AAC for a while now, but this article really made me appreciate how much better it is than MP3, especially for complex audio. – SoundEngineerX

Excellent breakdown of the differences between MP3 and AAC. I always assumed MP3 was “good enough” but now I realize AAC is the better choice, especially for lower bitrates. Thanks for clearing that up! – TechieTom

Great read, but I wish you would’ve gone deeper into how these psychoacoustic models impact the experience for listeners with hearing impairments. Any chance you can dive into that next? – ClearSound76

As a musician, I’ve always been picky about sound quality. After reading this, I’m convinced that AAC is worth the switch for my music files. Thanks for sharing your expertise! – MusicMaker24

I had no idea that psychoacoustic models were so important for compression. I always assumed audio codecs just “squished” the data and that was it! – CuriousGeorge

Very well-written article! I didn’t know much about psychoacoustics before, but now I understand why AAC sounds better at lower bitrates. Thanks for breaking it down so clearly! – TuneInExpert



The higher the bitrate, the higher the sound quality and the larger the file size, but the quality of the source file determines the final quality.

Converting from a higher bitrate to a lower one makes the sound worse. Converting from a lower bitrate to a higher one cannot improve it: at best the sound quality stays the same, and the file simply gets larger.

Many ordinary MP3 files are acceptable at a bit rate of around 128 kbps, which gives a file of about 3-4 MB per song.

The bitrate you choose directly affects both the size of your MP3 file and the listening experience. A high compression ratio means high distortion and a low compression ratio means low distortion, so how do we find a balance that is acceptable on both counts? That takes careful experimentation. Because low-bitrate files are not suitable for music, the minimum tested here is 128 kbps, and four constant-bitrate files at 128, 192, 256 and 320 kbps are compared.

Compression at 128 kbps is still fairly coarse, and the high frequencies are heavily distorted. The sound is hollow, wrinkled and rough, with frequent flickering artifacts. The compressed size of a piece of music lasting 3 minutes 39 seconds is 3414 KB. The file is small, but the sound is unsatisfactory and has relatively serious defects.

Compression at 192 kbps is much better than at 128. First of all, the sound is solid, with no hollow feeling; there is far less high-frequency distortion; and the sound is compact and clean with little noise, giving a fairly good listening result. However, because the compression is still relatively strong, detail is still not very good: the texture of instruments, especially wind instruments, is still hard, unreal and lacking in musicality. The compressed size is 5123 KB. I think this bitrate is best suited to an MP3 player with a capacity of 128 to 256 MB, where it satisfies basic listening needs at a suitable file size. 128 MB can hold about 95 minutes of music, and 256 MB doubles that to about 190 minutes.

Compression at 256 kbps is naturally a step up from 192 in sound quality. In the first 10 seconds of the track, the low frequencies of the cello are clearly less grainy, and the sound is smoother and more natural, with real texture. It is also clearer, with much more detail; the atmosphere comes across more strongly, the interplay of parts later in the piece is more expressive, loud and quiet signals are both clearer, and the sound is more detailed and sustained. At the same time, the file size increases to 6831 KB, which is still affordable for a 256 MB MP3 player. A simple calculation shows that at 256 kbps such a player can hold about 135 minutes of music, which is generally enough. 128 MB is a little tight and holds only a little over an hour, so 192 kbps is recommended for a 128 MB player.

320 kbps is the maximum bitrate that LAME can provide. The resulting file is 8592 KB, which is about 8.4 MB. Compared with the 37 MB WAV file, the compression ratio is roughly 4.5:1, yet the resulting MP3 has very little audible distortion. Compared with the other bitrates, the advantage of 320 kbps is obvious: the tone and details are very delicate, and it basically achieves the sound quality of the original CD, especially on a CD player with MP3 playback, where there is basically no difference. However, using relatively high-end, high-resolution earphones, and drawing on my experience with music and equipment, I can still hear a number of differences compared with the WAV file. First, the compressed MP3 sounds slightly harsh and relatively dry overall, without the freshness and dynamics of the WAV file. In fine detail, nuance and sense of space, the separation does not reach the quality of the WAV file. It is quite close in timbre, but it is less expressive and has a relatively strong digital character. So if you are using a hard-drive player such as an iPod, I recommend 320 kbps, which gives the best listening experience among the MP3 settings. Of course, listening to the WAV directly is best.
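The file sizes quoted in this comparison can be sanity-checked with simple arithmetic. The sketch below assumes constant-bitrate files, 1 kbps = 1000 bits per second, and that the WAV source is CD-format PCM (44.1 kHz, 16-bit, stereo), which the text does not state explicitly. The function names are made up for this example.

```python
def cbr_size_bytes(bitrate_kbps: float, seconds: float) -> float:
    """Size of a constant-bitrate stream, ignoring headers and tags."""
    return bitrate_kbps * 1000 * seconds / 8

def pcm_size_bytes(seconds: float, sample_rate: int = 44100,
                   channels: int = 2, bits: int = 16) -> float:
    """Size of uncompressed PCM audio (assumed CD format by default)."""
    return seconds * sample_rate * channels * bits / 8

DURATION_S = 3 * 60 + 39  # the 3 min 39 s test track

for kbps in (128, 192, 256, 320):
    size_kib = cbr_size_bytes(kbps, DURATION_S) / 1024
    print(f"{kbps} kbps: {size_kib:.0f} KiB")

wav_bytes = pcm_size_bytes(DURATION_S)
ratio = wav_bytes / cbr_size_bytes(320, DURATION_S)
print(f"WAV: {wav_bytes / 1024 ** 2:.1f} MiB, ratio vs 320 kbps: {ratio:.2f}:1")
```

Under these assumptions the computed sizes land within about one percent of the figures reported above, and the WAV-to-320 kbps ratio comes out close to the quoted 4.5:1.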

The bit rate directly affects the sound quality.

For a given format, a high bitrate sounds better and a low bitrate sounds worse.

The bit rate is the number of data bits transmitted per unit of time during data transmission. The unit we generally use is kbps, that is, kilobits per second.

Loosely speaking, it is the amount of data used to describe each unit of time. The more data per unit of time, the higher the precision and the closer the processed file is to the original, but the file size grows in proportion to the bitrate, so almost every encoding format focuses on how to achieve the least distortion at the lowest bitrate. CBR (constant bitrate) and VBR (variable bitrate) both grew out of this core problem. As a rule for audio, the higher the bitrate, the lower the compression ratio, the smaller the loss in sound quality, and the closer the result is to the audio source.
The information in a computer is represented by binary 0s and 1s. Each 0 or 1 is called a bit, written with a lowercase b; an uppercase B stands for byte, and one byte is eight bits, that is, 1 B = 8 b. A capital K in front stands for thousand, giving kilobits (Kb) or kilobytes (KB). File sizes are usually given in kilobytes (KB).

Kbps: the "ps" stands for "per second". Kbps is commonly used for network speed, that is, how many thousands of bits are transmitted per second (Kb means kilobits), while KBps means how many kilobytes are transferred per second. 1 KBps = 8 Kbps. For example, an ADSL line rated at 512 Kbps delivers 512/8 = 64 KBps (64 kilobytes per second).

A frame is a still image, and continuous frames form an animation, like a television picture.
The frame rate is simply the number of image frames transmitted in 1 second. It can also be understood as how many times per second the graphics processor can refresh, and it is usually expressed in fps (frames per second). Showing frames in rapid succession creates the illusion of movement, and the more frames per second, the smoother and more realistic the motion appears.

What is the bitrate of the music?
The bitrate is the amount of data played back per second for a piece of music, measured in bits (binary digits). In "bps", b is bit, p is per and s is second, and one byte equals 8 bits. So the file size of a 4-minute song at 128 kbps is (128/8)*4*60 = 3840 KB, or about 3.8 MB. This means that the same song at the same bitrate takes up basically the same space no matter what the format (such as MP3 or WMA). Across formats, the bitrate only indicates a data rate, not the sound quality: because the compression engines differ, the sound quality of different formats at the same bitrate varies a lot. Within the same format, however, the higher the bitrate, the larger the file and the better the sound quality.

What is the sample rate of the music?
The sample rate is the number of samples taken per unit of time. A sample rate of 44.1 kHz means 44,100 samples per second, that is, 44,100 data points are used to describe 1 second of the sound waveform. In general, the higher the sample rate, the better the sound quality. But sample rate and bitrate are two completely different concepts.

Bitrate Part 2

The amount of information transmitted through a channel per unit of time is called the bit rate. Its unit is bits per second (bit/s).

Bitrate is often used in communications as a synonym for connection speed, transmission speed, channel capacity, peak throughput, and digital bandwidth capacity. The higher the bit rate, the more data is transferred per unit of time. In video, bit rate refers to the data rate at which an analog signal is converted to a digital signal [4]. Video file quality is often measured in terms of bitrate [4].
Distinction of concepts
Baud rate is also known as waveform rate or modulation rate. The code for a data unit is represented by a finite combination of numbers, each of which is a symbol (or code point). In electrical communication, an electrical waveform is often used to represent one or more symbols. Waveforms with different characteristics may represent different symbol values or combinations of symbol values, and the duration of the waveform corresponds to the duration of the symbol or symbol combination it represents. Obviously, the shorter the duration of each waveform, the more waveforms are transmitted per unit of time, and so the more data is transmitted, that is, the higher the data rate. We can therefore define the baud rate as follows: in data transmission, the number of waveforms transmitted on the line per unit of time is the baud rate, and its unit is "baud" [5].
"Bit rate" and "baud rate" are rates defined by two different concepts, and they are easy to confuse. When a binary waveform is used, the baud rate and the bit rate have the same value, but their meanings are different [5].
Difference: both bit rate and baud rate measure the transmission rate of a modem. In data transmission, information is represented by the binary digits "0" and "1", and each binary digit is called 1 bit. The number of bits transmitted through the channel per unit of time is called the bit rate, expressed in bits per second and usually abbreviated bit/s. The number of symbols transmitted through the channel per unit of time is called the baud rate, also called the modulation rate. The two are equal only when the modulation has two values. For example, in quadrature modulation every two bits of the data signal form one symbol with 4 possible values, 00, 01, 10 and 11, which correspond to 4 phase changes of the carrier signal. Sending one such symbol is therefore equivalent to transmitting two bits of data, and the baud rate is half the bit rate. The commonly quoted transmission rates of 300, 600, 1200, 9600 and so on are loosely called baud rates, but they actually give the number of binary digits transmitted per unit of time, that is, the bit rate [6].
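The relationship described above can be written as bit rate = baud rate × log2(number of symbol values). A minimal sketch, with a made-up function name and illustrative example rates:

```python
import math

def bit_rate_bps(baud: float, symbol_values: int) -> float:
    """Bits per second carried by `baud` symbols per second,
    when each symbol can take one of `symbol_values` distinct values."""
    bits_per_symbol = math.log2(symbol_values)
    return baud * bits_per_symbol

# Two-valued modulation: bit rate equals baud rate.
print(bit_rate_bps(1200, 2))
# Four-valued (two bits per symbol): bit rate is twice the baud rate.
print(bit_rate_bps(1200, 4))
```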

Bit rate

Bit rate refers to the number of bits transmitted per unit of time, in bps (bits per second).

Bit rate is also known as "binary bit rate", commonly called "code rate". It indicates the number of bits transmitted per unit of time and is used to measure the transmission speed of digital information, often written as bit/sec. From the number of bits occupied by each stored image frame and the transmission bit rate, the transmission speed of digital image information can be calculated [1].
In modern digital communication, the volume of digitized video and similar information is large, so it is often measured in kilobits per second or megabits per second, written as kbit/sec (or kbps) and Mbit/sec (or Mbps) respectively. For example, the amount of information digitized from an ordinary color TV signal can reach 216 Mbit/sec. A good digital broadcast channel can carry dozens of color TV programs, and its capacity can reach several gigabits per second (written as Gbit/sec or Gbps) [1].
Bitrate is often used to measure the quality of video files.
Flexibility
Because each network is unique and each access line has different conditions (such as length, attenuation and crosstalk environment), access lines from different telephone companies must support different data rates. For ADSL and VDSL modems, it is best to be able to set the data rate to one of many possible values. For example, DMT-based ADSL and VDSL can theoretically change the rate in fine steps, and CAP-based RADSL (Rate Adaptive ADSL) also provides some flexibility in rate configuration [2].
However, telephone companies may want to limit xDSL service to a small set of rates sufficient to provide a variety of services. If a limited set of rates can cover a wide range of services, managing those services is simpler than with freely variable rates. Telephone companies want the choice of modem speed to be under the control of the network, not the user [2].
In this mode, the set of transmission rates for the xDSL network must be chosen carefully. Two adjacent systems may receive traffic at very different rates, and the system must be able to handle that situation. The other model, the "best match" approach using rate-adaptive ADSL (similar to a voiceband modem), is more beneficial to new network operators and Internet Service Providers (ISPs) [2].
Transmission control method
Most bit rate control schemes consist of two parts. In the first part, the encoded bit stream output by the encoder is fed into a buffer. For a constant bitrate channel, data is read out of the buffer at a constant rate, and if the buffer is large enough, the bitrate variation caused by the MPEG picture type and similar factors can be smoothed out. This is generally necessary for both constant bit rate and variable bit rate transmission. In practice, however, the buffer size is always limited. Buffering adds delay to the system, and this delay is proportional to the size of the buffer. Latency is often a serious issue for real-time image communication, so buffers should be kept as small as possible. As a result, long-term fluctuations in bitrate caused by changes in scene content cannot be smoothed out this way, so a second part is needed: some measure of the output bitrate is fed back to the encoder to control the encoding process and thus change the output bitrate [3].
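The buffering half of this scheme can be sketched as a simple "leaky bucket": each encoded frame adds bits to the buffer, and the channel drains a fixed number of bits per frame period. The frame sizes, drain rate and function names below are invented for illustration, and the feedback loop to the encoder is left out.

```python
def simulate_buffer(frame_bits, drain_bits_per_frame):
    """Return the buffer occupancy (in bits) after each frame period."""
    occupancy = 0
    history = []
    for bits in frame_bits:
        # Add the new frame, remove what the channel carried away this period.
        occupancy = max(0, occupancy + bits - drain_bits_per_frame)
        history.append(occupancy)
    return history

def worst_case_delay_s(history, channel_bps):
    """Delay added by the fullest buffer state: bits waiting / channel rate."""
    return max(history) / channel_bps

# Large and small frames alternate; the buffer absorbs the short-term variation.
history = simulate_buffer([300, 100, 100, 300], drain_bits_per_frame=200)
print(history, worst_case_delay_s(history, channel_bps=1000))
```

The second function shows why the text says delay grows with buffer size: the more bits are allowed to queue, the longer the last of them waits for the channel.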

Sample rate and bit rate of MP3 Part 2

The bit depth of a sound is comparable to the number of colors on a screen: it indicates the amount of data per sample.

Of course, the larger the amount of data, the more accurately the sound is reproduced, so that the whistle of a kettle is not confused with a train whistle. In the same way, more data makes an image clearer and more precise, so that blood is not confused with ketchup. However, because of the limits of human senses, 16-bit sound and 24-bit images are basically the limit for ordinary people, and greater bit depths can only be distinguished by instruments. For example, digital telephony uses 8-bit sound sampled at 8 kHz and the CD uses 16-bit sound sampled at 44.1 kHz, so the CD is clearer than the phone.

Once you understand these two concepts, bitrate is easy to understand. Take the phone as an example: at 8000 samples per second with 8 bits per sample, the phone's bit rate is 64,000 bits per second. The CD has 44,100 samples per second, two channels, and 16-bit PCM encoding for each sample, so the CD bit rate is 44100*2*16 = 1,411,200 bits per second, which means the CD data volume is about 176 KB per second. A 74-minute CD lasts 4440 seconds, which comes to about 783,000 KB, or roughly 783 MB of audio data.
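The arithmetic in this paragraph follows one formula: uncompressed bit rate = sample rate × bits per sample × channels. The sketch below uses the standard CD parameters (44.1 kHz, 16 bits, 2 channels); the function name is an assumption made for this example.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Bits per second of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels

cd_bps = pcm_bit_rate(44100, 16, channels=2)
cd_bytes_per_s = cd_bps // 8
cd_bytes_74min = cd_bytes_per_s * 74 * 60

print(cd_bps, "bit/s")
print(cd_bytes_per_s, "bytes/s")
print(cd_bytes_74min, "bytes for 74 minutes")
```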

Sound is a wave of energy, so it has the characteristics of frequency and amplitude, with frequency corresponding to the time axis and amplitude corresponding to the level axis. The wave is infinitely smooth, and its curve can be thought of as made up of innumerable points. Because storage space is limited, the points of the curve must be sampled during digital encoding. Sampling means reading the value of the wave at certain points in time. Obviously, the more points taken per second, the richer the frequency information captured. To reconstruct a waveform, there must be at least two sampling points per cycle of vibration. The highest frequency the human ear can perceive is about 20 kHz, so to meet the needs of human hearing at least 40,000 samples per second are required, expressed as 40 kHz; this figure is the sample rate. The common CD uses a sample rate of 44.1 kHz.

Timing information alone is not enough: the energy value at each sample point must also be measured and quantized to represent the strength of the signal. The number of quantization levels is an integer power of 2; the common CD uses a sample size of 16 bits, that is, 2 to the power of 16 levels. Sample size is harder to grasp than sample rate because it seems more abstract. A simple example: suppose a wave is sampled 8 times, and the energy values at the sampling points, A1 to A8, are all different. With a 2-bit sample size there are only 4 levels available, so the 8 values must be rounded onto 4 levels and half of the distinctions are lost. With a 3-bit sample size there are 8 levels, so all 8 values can be recorded. The higher the sample rate and sample size, the closer the recorded waveform is to the original signal.
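The eight-sample example can be tried directly. The sketch below assumes the eight energy values are evenly spaced between 0 and 1 (an assumption made so the result is easy to see) and rounds each value to the nearest of 2^bits levels.

```python
def quantize(value: float, bits: int) -> float:
    """Round a value in [0, 1] to the nearest of 2**bits evenly spaced levels."""
    steps = 2 ** bits - 1
    return round(value * steps) / steps

# Eight distinct sample values, A1..A8, assumed evenly spaced for illustration.
samples = [i / 7 for i in range(8)]

two_bit = [quantize(s, 2) for s in samples]
three_bit = [quantize(s, 3) for s in samples]

# With 2 bits, several samples collapse onto the same level; with 3 bits none do.
print(len(set(two_bit)), "distinct values at 2 bits")
print(len(set(three_bit)), "distinct values at 3 bits")
```

The same rounding happens at 16 bits on a CD; there are simply 65,536 levels instead of 4 or 8, so the rounding error is far smaller.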

MP3 sample rate and bit rate

When we listen to MP3s or watch movies, we notice two parameters. The most common values are a 44.1 kHz sample rate and a 192 kbps bit rate. So what is the sample rate, what is the bit rate, and how are they related?

The process of converting an analog audio signal to a digital audio signal is called sampling. In a nutshell, the sample rate is how many data points it takes to record 1 second of sound by sampling the waveform. For example, a sample rate of 44.1 kHz means using 44,100 data points to describe 1 second of the sound waveform. In principle, the higher the sample rate, the better the sound quality. Sampling frequencies generally fall into three levels: 22.05 kHz, 44.1 kHz and 48 kHz. 22.05 kHz only reaches FM radio quality, 44.1 kHz is the CD standard, and 48 kHz reaches DVD quality.

Sampling rate refers to the sampling frequency used when converting sound (an analog signal) to MP3 (a digital signal), that is, how many data points are sampled per unit of time. (The data for one sample point is 8 or more bits long.)

Bit rate refers to the number of bits transmitted per second. The unit is bps (bits per second). The higher the bitrate, the more data is transmitted and the better the sound quality.

It can be said that sample rate and sample size are like the horizontal and vertical coordinates of a graph. The sample rate, on the horizontal axis, is the number of data points taken per second. The sample size (bit depth), on the vertical axis, is the precision with which each analog value is quantized into a digital one. The bit rate of uncompressed audio is the product of the two, multiplied by the number of channels.
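For uncompressed audio, the bit rate follows directly from the sample rate, the sample size and the number of channels. A minimal sketch, assuming plain stereo CD audio (44.1 kHz, 16 bits, 2 channels); the function name is just for illustration:

```python
def pcm_bit_rate_bps(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed (PCM) audio in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


cd_bps = pcm_bit_rate_bps(44_100, 16, 2)
print(cd_bps)          # 1411200
print(cd_bps / 1000)   # 1411.2 (Kbps)
```

A compressed format such as MP3 does not follow this formula: its bit rate is chosen by the encoder, which is why a 44.1 kHz MP3 can run at 128 or 192 Kbps.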

The sample rate is similar to the frame rate of moving images. For example, film runs at 24 frames per second, PAL video at 25 and NTSC video at about 30. When still images are played back at the same rate at which they were captured, we see continuous motion. In the same way, when a CD recorded at a sampling rate of 44.1 kHz is played back at that rate, we hear continuous sound. The higher the rate, the smoother the sound or picture. Of course, there is a limit to what human hearing and vision can resolve: for sound sampled above 44.1 kHz, most people do not notice a difference.

Quality (bit rate)

In multimedia technology, “quality” is often used to describe how good audio sounds, and quality in this sense really means bit rate.

The term quality is widely used.
Windows shows this value as “bit rate”, and some players label it the same way.
Quality here refers to the bit rate of the digital audio produced when sound is converted from analog to digital form. The higher the bit rate, the better the quality of the restored sound.
Sound quality reference
16 Kbps = phone quality
24 Kbps = improved phone quality, shortwave broadcast, longwave broadcast, European-standard medium-wave broadcast
40 Kbps = American-standard medium-wave broadcast
56 Kbps = voice
64 Kbps = enhanced voice (best bit rate setting for cell phone ringtones, best setting for cell phone mono MP3 players)
112 Kbps = FM stereo broadcast
128 Kbps = tape (best setting for mobile phone stereo MP3 players, best setting for low-end MP3 players)
160 Kbps = hi-fi (best setting for mid-range to high-end MP3 players)
192 Kbps = CD (best setting for high-end MP3 players)
256 Kbps = studio quality (for music enthusiasts)
As technology advances, music quality keeps rising. The highest bit rate MP3 supports is 320 Kbps, but some other formats can achieve higher sound quality.
For example, the APE audio format provides audiophile-level lossless sound quality in smaller files than WAV, and its bit rate is usually 550 to 950 Kbps.
Encoding modes
VBR (variable bit rate) means there is no fixed bit rate. The compression software decides on the fly which bit rate to use, based on the audio data being compressed. It puts quality first while still keeping file size in check, and it is the recommended encoding mode.
ABR (average bit rate) is a variant of VBR. LAME created this mode in response to the poor size-to-quality ratio of CBR and the unpredictable file sizes produced by VBR. To hit the target file size, ABR works in segments of about 50 frames (an MP3 frame at 44.1 kHz lasts about 26 ms, so roughly 1.3 seconds). Passages dominated by high or less audible frequencies get relatively few bits, and passages with low frequencies and wide dynamics get more, which makes ABR a compromise between VBR and CBR.
CBR (constant bit rate) means the file uses one bit rate from start to finish. Compared to VBR and ABR, the compressed file is considerably larger, and the sound quality is not noticeably better.
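To see how the modes differ in practice, the sketch below compares a CBR stream with a made-up VBR stream. It assumes MPEG-1 Layer III frames of 1152 samples at 44.1 kHz; the per-frame bit rates in `vbr` are invented for illustration, not taken from a real encoder, and header overhead is ignored.

```python
SAMPLES_PER_FRAME = 1152   # MPEG-1 Layer III
SAMPLE_RATE_HZ = 44_100


def stream_stats(frame_bitrates_kbps):
    """Return (average bit rate in Kbps, size in bytes) for a list of frames."""
    frame_seconds = SAMPLES_PER_FRAME / SAMPLE_RATE_HZ   # about 0.026 s
    duration = len(frame_bitrates_kbps) * frame_seconds
    total_bits = sum(kbps * 1000 * frame_seconds for kbps in frame_bitrates_kbps)
    return total_bits / duration / 1000, total_bits / 8


cbr = [192] * 100                             # same bit rate in every frame
vbr = [128] * 50 + [256] * 30 + [320] * 20    # hypothetical: quiet, busy, very busy

for name, frames in (("CBR", cbr), ("VBR", vbr)):
    average_kbps, size_bytes = stream_stats(frames)
    print(f"{name}: average {average_kbps:.1f} Kbps, about {size_bytes:.0f} bytes")
```

The VBR stream spends fewer bits on the simple frames and more on the demanding ones, so its size depends on the material. ABR would steer the same kind of allocation toward a chosen average.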

How does the bit rate affect the quality of the music?

There is a lot of talk these days that we have lost real music with the advent of compressed audio formats like MP3, AAC and the like. Is it really so? Will lossless music save music? Can an inexperienced listener tell the difference between MP3 and FLAC music? Let’s take a look at this problem.

What is Bitrate?

You’ve probably heard the term “bitrate” before and you probably have a basic idea of what it means, but it might be a good idea to familiarize yourself with its official definition so you know how it all works.

Bit rate is the number of bits, or the amount of data, processed over a period of time. In audio, it is generally measured in kilobits per second. For example, the music you buy from iTunes is 256 kilobits per second, which means that every second of the song contains 256 kilobits of data.

The higher the bit rate of the track, the more space it will take up on your computer. Audio CDs typically take up quite a bit of space, so it has become common practice to compress these files so that you can fit more music on your hard drive (or iPod, Dropbox or whatever). This is where the “lossless” and “lossy” formats come into conflict.
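The relationship between bit rate and disk space is simple arithmetic: bits per second times seconds, divided by 8 to get bytes. A quick sketch, assuming a 4-minute track and ignoring tags and container overhead; the function name is mine:

```python
def file_size_mb(bit_rate_kbps, duration_seconds):
    """Approximate audio file size in megabytes (1 MB = 1,000,000 bytes)."""
    return bit_rate_kbps * 1000 / 8 * duration_seconds / 1_000_000


four_minutes = 4 * 60
for kbps in (128, 256, 320):
    print(f"{kbps} Kbps: about {file_size_mb(kbps, four_minutes):.2f} MB")
```

At 256 Kbps, for instance, a 4-minute song comes to about 7.68 MB.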

Lossless and Lossy formats: what’s the difference?

When we say lossless, we mean that we haven’t really changed the original file. That is, we copy a track from the CD to our hard drive, but we do not compress it to the point of losing data. The result is essentially the same as the original CD track.

However, most of the time you will probably rip your music in a lossy format. That is, you take a CD, copy it to your hard drive, and compress the tracks so they don’t take up a lot of space. A typical MP3 or AAC album is probably about 100 MB. The same album in a lossless format like FLAC or ALAC (aka Apple Lossless) will be around 300 MB, so it has become common practice to use lossy formats for faster downloads and to save hard drive space.

The problem is that when you compress a file to save space, you are removing chunks of data. It is just like compressing a high-quality image to JPEG: your computer takes the original data and approximates certain parts of the image so that it looks basically the same, but with some loss of clarity and quality.

An example of how the JPEG graphics compression algorithm works
Remember that compressing music in lossy formats saves storage space, which can make a big difference on an iPhone with 32 GB of storage, but it is always a trade-off between size and quality.

There are different levels of compression: a 128 Kbps file, for example, takes up very little space, but it will also have lower playback quality than a larger 320 Kbps file, which in turn is of lower quality than the 1,411 Kbps reference file. 1,411 Kbps is audio CD quality, which is more than sufficient in most cases.
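To put these figures side by side, the sketch below works out how much smaller each lossy bit rate is than the CD reference. It assumes the reference is stereo 16-bit audio at 44.1 kHz, which is where the 1,411 Kbps figure comes from.

```python
CD_KBPS = 44_100 * 16 * 2 / 1000   # 1411.2 Kbps, uncompressed CD audio


def compression_ratio(bit_rate_kbps):
    """How many times smaller a stream is than uncompressed CD audio."""
    return CD_KBPS / bit_rate_kbps


for kbps in (128, 192, 320):
    print(f"{kbps} Kbps is about {compression_ratio(kbps):.1f} times smaller than CD audio")
```

So a 128 Kbps file keeps roughly one eleventh of the original data, and even a 320 Kbps file keeps less than a quarter.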

The problem is not how much the music is compressed, but what equipment you listen to it on.

Does bit rate really matter?

As memory gets cheaper every year, listening to sound at a higher bit rate, or even lossless formats, is starting to become more and more popular. But is it worth the time, effort, and storage space on your phone or computer?

I don’t like answering questions this way, but sadly the answer is: it depends.

Part of the equation is the hardware you use. If you are using a good pair of headphones or speakers, you are used to a wide frequency and dynamic range. As such, you are more likely to notice the downsides of compressing music into lower bit rate files. You may notice that low-quality MP3 files lack a certain level of detail: subtle backing tracks may be harder to hear, the highs and lows won’t be as dynamic, or you may hear distortion in the lead vocal. In these cases, you may want a higher bit rate track.

However, if you’re listening to your music with a cheap pair of headphones on your iPod, you probably won’t notice the difference between a 128 Kbps file and a 320 Kbps file, let alone 1,411 Kbps lossless music. Remember the image I showed a few paragraphs above, and how you probably had to look closely to see the flaws? Your headphones are like a scaled-down version of that image: they make these imperfections difficult to perceive, because they are not physically capable of reproducing the music the way you would want them to.

The other part of the equation is, of course, your own ears. Some people find it very difficult to distinguish between two bit rates for a simple reason: they don’t listen to much music. Listening skills, like any other, develop with practice. If you listen to your favorite music often, your hearing becomes more accurate and begins to pick up small details and nuances. Until then, does it really matter what bit rate you use?

So what format and bit rate should you choose yourself? Is 320 Kbps enough for you or do you definitely need Lossless format?

The point is that it is difficult to hear the difference between a lossless file and a 320 Kbps MP3 file. To hear the difference, you need serious high-quality equipment, good hearing, and the right kind of music (for example, classical or jazz).

For the vast majority of people, 320 Kbps is more than enough for listening.

What else should you consider?

Music stored in a lossless format can be useful. Lossless files are more future-proof, in the sense that you can always compress them to a lossy format when you need to, but you cannot go the other way and restore original CD quality from an MP3 file. This, again, is one of the fundamental problems of online music stores: if you have built a huge music library on iTunes and one day decide that you need a higher bit rate, you will have to buy it again, but this time only in CD format…

Whenever I can, I always buy or copy music in Lossless format for backup.

I understand that audiophiles can get under your skin. Like I said, it all depends on you, your hearing and the equipment you have.

Compare two tracks recorded in Lossless and Lossy formats. Try a few different audio formats, listen to them for a while and see if it makes a difference for you or not.