q to q adalah Archives

What is the difference between 128k and 320k music? Part 3

Typical sampling frequencies depend on the use (k stands for one thousand, so 1 kHz = 1000 Hz):

8 kHz: used for telephones; enough to record the human voice.

22.05 kHz: used for radio broadcasting.

44.1 kHz: audio CD.

48 kHz: used in DVD and digital TV.

96 kHz to 192 kHz: used for DVD-Audio, Blu-ray HD audio, etc.

The common range of sample precision is 8 bits to 32 bits, with 16 bits generally used on CD.

Having said that, some of you may be getting confused: if it is the sample rate and sample precision that determine sound quality rather than the bit rate, why does everyone say that 320 kbps sounds better than 128 kbps?

【Audio Compression】

In fact, bit rate belongs to a different dimension: it describes how heavily an audio file has been compressed.

Nowadays, most of the audio formats we use regularly are derived from the original “WAV” file of an audio CD (44.1 kHz sample rate, 16-bit sample precision, 2 channels). The originally recorded sound data is stored as a sequence of samples in PCM format, while WAV is a container format developed by Microsoft and IBM whose job is to store that PCM data, normally without any compression.

Since the data in a WAV file is essentially the PCM data itself, MP3, AAC and other lossy formats, as well as lossless formats such as FLAC and APE, are produced by compressing that WAV data. Therefore, we can simply think of WAV as the original audio format and the other audio formats as compressed formats.

When it comes to compression, storage and transmission are inseparable. The purpose of compression is to improve storage and transmission. Therefore, before we talk about compression, we need to understand the basic units of computers.

We all know that computers use the binary number system: every file a computer stores is made up of the two digits 0 and 1. Data is therefore transmitted digit by digit, and each digit is called one “bit”. For example, the basic data of an audio clip might be “0,1,1,1,0,1,1,0”, and during transmission these digits are sent one by one. The sampling precision mentioned above is measured in this unit.

The storage unit of the computer is the byte. One byte consists of 8 bits, that is, 8 b (bit) = 1 B (byte). By convention, data transmission rates use decimal prefixes (1 kb = 1000 b), while storage sizes have traditionally used binary prefixes, so 1 KB = 1024 B = 1024 × 8 b. This is also part of the reason why the capacity printed on a hard drive does not match the capacity the computer reports: the manufacturer counts in decimal units, while many operating systems count in binary units.
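The unit conversions above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the 500 GB drive is an arbitrary figure, not one taken from the text.

```python
BITS_PER_BYTE = 8


def bytes_to_bits(n_bytes: int) -> int:
    """Number of bits in the given number of bytes."""
    return n_bytes * BITS_PER_BYTE


def decimal_gb_to_binary_gib(gigabytes: float) -> float:
    """Convert a capacity in decimal GB (10**9 bytes) to binary GiB (2**30 bytes)."""
    return gigabytes * 10**9 / 2**30


print(bytes_to_bits(1024))                       # 8192 bits in 1 KB (binary)
print(round(decimal_gb_to_binary_gib(500), 1))   # 465.7
```

So a drive sold as 500 GB shows up as roughly 465.7 when the capacity is counted in binary units.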

Returning to audio compression: the bit rate of an audio file tells us how far it has been compressed. Strictly speaking, the bit rate only determines the size of the file, but under normal conditions a larger file has discarded less data, so its sound quality is relatively higher. The bit rate itself, however, does not guarantee quality. For example, if we take a 128 kbps file as the source and convert it to a 320 kbps file, the sound quality will not be any better than the 128 kbps original, because the data discarded earlier cannot be recovered.
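To see how directly the bit rate determines file size, here is a rough sketch. It assumes a constant bit rate and ignores tags and container overhead; the function name and the four-minute duration are illustrative choices.

```python
def encoded_size_mb(bitrate_kbps: float, seconds: float) -> float:
    """Approximate size in MB (10**6 bytes) of a constant-bit-rate stream."""
    total_bits = bitrate_kbps * 1000 * seconds
    return total_bits / 8 / 1_000_000


four_minutes = 4 * 60
print(encoded_size_mb(128, four_minutes))  # about 3.84 MB
print(encoded_size_mb(320, four_minutes))  # about 9.6 MB
```

Re-encoding the 128 kbps file at 320 kbps would produce the larger file, but only the size changes: the extra bits cannot restore what the first encoding threw away.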

What is the difference between 128k and 320k music? Part 2

Bit rate, sample rate, lossless, MP3, FLAC, APE, 320 kbps, 192 kbps, 128 kbps, 44.1 kHz, CBR, VBR: do all these terms feel familiar and yet somehow unknown?

The higher the bit rate, the better the sound quality, and lossless music has the highest sound quality, right? To answer that, let’s start with how sound is captured.

【Audio composition】

Nowadays, when we talk about audio, everything is digital audio. Digital audio consists of three parts: sample rate, sample precision, and number of sound channels.

Sample rate: the number of samples taken per second when recording the sound, expressed in hertz (Hz).

Sample precision: the dynamic range of the recorded sound, measured in bits.

Sound channels: the number of channels (1-8).

In simple terms, we can think of a sound wave as a curve made up of points. The sample rate is the number of points per second along the time axis (the horizontal axis), and the sample precision is the number of distinct values available along the amplitude axis (the vertical axis). The finer these two dimensions are, the more faithfully the sound is reproduced and the better the sound quality; of course, the audio file also becomes larger. The client mentioned in the preface was referring to the Hi-Res Audio format released by SONY, recorded as 6-channel 192 kHz/24-bit audio. In a lossless format, such a file will of course be more than 200 megabytes.
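The size claim can be checked with a small calculation. The sketch below assumes plain uncompressed PCM; lossless compression would shrink the result, but not by an order of magnitude. The function name is illustrative.

```python
def pcm_bytes_per_second(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Data rate of uncompressed PCM audio in bytes per second."""
    return sample_rate_hz * bits_per_sample * channels // 8


cd = pcm_bytes_per_second(44_100, 16, 2)
hi_res = pcm_bytes_per_second(192_000, 24, 6)

print(cd * 60 / 1_000_000)      # 10.584 MB per minute for CD audio
print(hi_res * 60 / 1_000_000)  # 207.36 MB per minute for 6-channel 192 kHz/24-bit
```

A single minute of 6-channel 192 kHz/24-bit PCM already passes 200 MB, which is why files of that size are unremarkable for this kind of recording.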

What is the difference between 128k and 320k music?

I can’t fully understand music in words.

【Preface】

Some time ago, a colleague ran into a very troublesome client. The trouble started when the client asked him to provide song files of 100 MB to 200 MB or more each. My colleague did not know much about audio formats, so he went back and forth endlessly about FLAC, WAV and file sizes, and in the end could not clearly explain to the client what was going on.

After that, a few other things happened that made me feel that too many practitioners around me in the music industry have an extremely poor understanding of music and lack even basic music-related knowledge. They do not even want to understand it, which makes me very sad. It seems that music has only one attribute, that of merchandise: practitioners only need to stock the shelves, label the goods, and use the big data of users’ purchase records to recommend merchandise, without caring why users like it or what qualities these products have, serving users with nothing but cold data.

Therefore, I think it is necessary to write something. I don’t expect practitioners to become people who really love music. I just hope that even if you still think of “her” as a commodity, you can first figure out what it is that you are selling.

PS: The first lesson is about media files. Because it involves many technical issues, it may seem a bit boring, but if you read it carefully you will find that it is actually easy to understand, and this basic knowledge can be very helpful. More interesting content about records, musical styles and so on will follow soon.

Related Audio Attribute Part 3

How samples are combined

This mainly applies to two-channel or multi-channel audio. Two-channel audio can be combined in one of two ways:

Interleaved (packed): taking stereo as an example, the samples of the two channels alternate in a single buffer (L R L R ...).
Planar: the samples of each channel are stored separately.

The data produced by FFmpeg audio decoding is stored in the AVFrame structure.

In packed format, frame.data[0] or frame.extended_data[0] contains all the audio data.
In planar format, frame.data[i] or frame.extended_data[i] holds the data of the i-th channel (counting from channel 0). The AVFrame.data array has 8 entries, so if there are more than 8 channels, the channel data must be read from frame.extended_data.
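The difference between the two layouts is easy to see with plain Python lists. This sketch does not call FFmpeg; the helper names are hypothetical and only illustrate how samples are ordered in memory.

```python
def interleave(channels):
    """Planar -> packed: [[L0, L1], [R0, R1]] becomes [L0, R0, L1, R1]."""
    return [sample for frame in zip(*channels) for sample in frame]


def deinterleave(packed, n_channels):
    """Packed -> planar: one list per channel."""
    return [packed[ch::n_channels] for ch in range(n_channels)]


left = [1, 2, 3]
right = [10, 20, 30]

packed = interleave([left, right])
print(packed)                   # [1, 10, 2, 20, 3, 30]
print(deinterleave(packed, 2))  # [[1, 2, 3], [10, 20, 30]]
```

In packed layout a player can read one buffer straight through, while planar layout makes it easy to process each channel on its own.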

sample format
The sample formats in FFmpeg are mainly:

enum AVSampleFormat {
    AV_SAMPLE_FMT_NONE = -1,
    AV_SAMPLE_FMT_U8,   ///< unsigned 8 bits
    AV_SAMPLE_FMT_S16,  ///< signed 16 bits
    AV_SAMPLE_FMT_S32,  ///< signed 32 bits
    AV_SAMPLE_FMT_FLT,  ///< float
    AV_SAMPLE_FMT_DBL,  ///< double

    AV_SAMPLE_FMT_U8P,  ///< unsigned 8 bits, planar
    AV_SAMPLE_FMT_S16P, ///< signed 16 bits, planar
    AV_SAMPLE_FMT_S32P, ///< signed 32 bits, planar
    AV_SAMPLE_FMT_FLTP, ///< float, planar
    AV_SAMPLE_FMT_DBLP, ///< double, planar
    AV_SAMPLE_FMT_S64,  ///< signed 64 bits
    AV_SAMPLE_FMT_S64P, ///< signed 64 bits, planar

    AV_SAMPLE_FMT_NB    ///< Number of sample formats. DO NOT USE if linking dynamically
};
Notes:

1. U8 (8-bit unsigned integer), S16 (16-bit signed integer), S32 (32-bit signed integer), FLT (single-precision floating point), DBL (double-precision floating point), S64 (64-bit signed integer). Formats without a trailing P are interleaved (packed); those ending in P are planar.
2. Planar mode is FFmpeg’s internal storage mode, while the audio files we normally use store samples in packed mode.
3. Different decoders output different sample formats. In testing, AAC decoding produced floating-point AV_SAMPLE_FMT_FLTP data and MP3 decoding produced AV_SAMPLE_FMT_S16P data (the MP3 file used had 16-bit depth). The actual sample format can be read from the format member of the decoded AVFrame or the sample_fmt member of the decoder’s AVCodecContext.

Bit rate
The bit rate is the amount of data transferred per second, for example 705.6 kbps or 705,600 bps, where b stands for bit and ps for per second, meaning 705,600 bits every second. Compressed audio files are often described by their bit rate; for example, a so-called CD-quality MP3 is 128 kbps at 44,100 Hz. Note that the unit here is the bit, not the byte. One byte equals 8 bits. The bit is the smallest unit and is generally used to describe network speed and other communication speeds, while the byte is used to measure the size of hard drives and memory.

Mbps: megabits per second (millions of bits per second);
Kbps: kilobits per second (thousands of bits per second);
bps: bits per second. The conversion ratios are:

1 megabit = 1,000 kilobits = 1,000,000 bits, so 1 Mbps = 1,000,000 bps. Again, this is a unit of speed: the number of bits transmitted per second. For data transmission speed, K has its decimal meaning (1,000), whereas for data storage K has its binary meaning (1,024). For example:

The bandwidth commonly described as 1M is 1 Mbps = 1,000,000 bps, and 1,000,000 / 8 / 1,000 = 125, so the download speed on a 1M line generally does not exceed 125 KB/s.
Likewise, 100 Mbps = 100,000,000 bps, and 100,000,000 / 8 / 1,000 / 1,000 = 12.5, so the maximum download rate on a 100M line is 12.5 MB/s.
Of course, these are only theoretical rates. In practice the maximum may not be reached because of various losses; on a 100M broadband line, a download rate of 10 MB/s is already good.

Related Audio Attribute Part 2

The higher the sample rate, the more realistic and natural the sound will be.

The range of frequencies people can hear is 20 Hz to 20,000 Hz. By the sampling theorem, the sample rate must be at least twice the highest frequency to be captured, so about 40,000 samples per second are enough to satisfy the human ear during playback. A sample rate of 22,050 Hz is commonly used for less demanding material, 44,100 Hz is already CD quality, and sampling above 48,000 Hz adds little that the human ear can perceive. This is similar to the 24 frames per second used for movie images.
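The link between sample rate and audible frequency comes from the sampling theorem: a sample rate can only represent frequencies up to half its value, and anything above that limit folds back to a lower frequency (aliasing). The sketch below assumes an ideal sampler with no anti-aliasing filter; the function names and the 15 kHz tone are illustrative.

```python
def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def alias_hz(tone_hz: float, sample_rate_hz: float) -> float:
    """Frequency at which a pure tone appears after sampling (folded into 0..fs/2)."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


print(nyquist_hz(44_100))        # 22050.0, above the 20 kHz hearing limit
print(alias_hz(15_000, 44_100))  # 15000: the tone is captured correctly
print(alias_hz(15_000, 22_050))  # 7050: the tone is misrepresented as a lower one
```

This is why 44,100 Hz covers the whole audible range, while 22,050 Hz can only carry content up to about 11 kHz.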

Sampling bits
After the audio has been sampled, two steps must be performed on each sample:

1. Quantization. The bit depths commonly used for audio quantization are:

8 bits (1 byte) can represent only 256 values, so the amplitude can be divided into only 256 levels;

16 bits (2 bytes) can represent 65,536 values, which is the CD standard;

32 bits (4 bytes) can subdivide the amplitude into 4,294,967,296 levels, which is really unnecessary.

The number of quantization bits is also called the number of sampling bits, bit depth, or resolution, and refers to how many levels the continuous intensity of the sound is divided into once it is represented digitally. N bits means that the intensity of the sound is divided evenly into 2^N levels; with 16 bits there are 65,536 levels. This is a very large number, and people may not be able to hear a difference in sound intensity of 1/65,536. It can also be regarded as the resolution of the sound card: the higher the value, the higher the resolution and the more accurately sound can be reproduced. Bit depth describes the intensity (amplitude) characteristics of the signal, while the sample rate describes its time (frequency) characteristics; these are two different concepts.
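The number of levels, and the dynamic range it corresponds to, can be computed directly. The decibel figure below uses the common approximation 20·log10(2^N), roughly 6 dB per bit; that formula is an addition for illustration and is not stated in the text above.

```python
import math


def quantization_levels(bits: int) -> int:
    """Number of amplitude levels an N-bit sample can distinguish."""
    return 2 ** bits


def dynamic_range_db(bits: int) -> float:
    """Approximate ratio between the largest and smallest step, in decibels."""
    return 20 * math.log10(2 ** bits)


for bits in (8, 16, 24):
    print(bits, quantization_levels(bits), round(dynamic_range_db(bits), 1))
# 8 256 48.2
# 16 65536 96.3
# 24 16777216 144.5
```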

2. Binary encoding. The result of quantization, that is, a single-channel sample, is stored as a binary codeword. There are two storage methods:

Store the quantized result directly as an integer, using two’s complement encoding;

Store the quantized result as a floating-point number, using floating-point encoding.

Most PCM sample data formats store samples as integers; applications that require high precision represent PCM sample data in floating point.
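The two storage methods can be illustrated with 16-bit samples. This sketch assumes the common convention of scaling by 32768 so that floating-point samples lie in the range -1.0 to 1.0; other scaling conventions exist, and the function names are illustrative.

```python
import struct


def int16_to_float(sample: int) -> float:
    """Map a signed 16-bit sample to the range [-1.0, 1.0)."""
    return sample / 32768.0


def float_to_int16(value: float) -> int:
    """Map a float sample back to signed 16-bit, clipping out-of-range values."""
    clipped = max(-1.0, min(1.0, value))
    return max(-32768, min(32767, round(clipped * 32768)))


print(int16_to_float(16384))         # 0.5
print(float_to_int16(0.5))           # 16384
print(float_to_int16(1.0))           # 32767 (clipped to the largest 16-bit value)
print(struct.pack("<h", -2).hex())   # feff: -2 as little-endian two's complement
```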

frame
After the audio has been quantized into binary codewords, encoders transform it (with the MDCT), and the transform is carried out in units of blocks, each made up of multiple (120 or 128) samples. A frame contains one or more blocks. Common frame sizes are 960, 1024, 2048, 4096 samples, etc. A frame records one unit of sound: its duration is the number of samples per channel multiplied by the duration of one sample (that is, divided by the sample rate), and its uncompressed size is the per-sample size multiplied by the number of samples and the number of channels. The nb_samples member of the AVFrame structure in FFmpeg is the number of audio samples per channel in a frame.
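As a rough sketch of what those frame sizes mean in practice, the duration and uncompressed size of a frame follow from the sample count. The function names are illustrative; the values passed in stand for what nb_samples, the sample rate and the sample size would be for a decoded frame.

```python
def frame_duration_ms(nb_samples: int, sample_rate_hz: int) -> float:
    """Playback time covered by one frame, in milliseconds."""
    return nb_samples * 1000 / sample_rate_hz


def frame_size_bytes(nb_samples: int, bytes_per_sample: int, channels: int) -> int:
    """Size of one decoded (uncompressed) frame across all channels."""
    return nb_samples * bytes_per_sample * channels


print(round(frame_duration_ms(1024, 44_100), 2))  # 23.22 ms
print(frame_duration_ms(960, 48_000))             # 20.0 ms
print(frame_size_bytes(1024, 2, 2))               # 4096 bytes for 16-bit stereo
```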

Related Audio Attribute

channel, sample rate, sample bits, sample format, bit rate

Sample Rate

The PCM data obtained from audio sampling has three elements: channels, sample rate, and sample bits.

channel
When people hear a sound, they can locate its source. Placing sound sources at different positions creates a better listening experience, and if the position of the audio follows the picture, the audio-visual effect is better still. Common channel layouts are:

Mono, a single channel
Two channels (stereo), the most common type, with left and right channels
2.1 channels, which add a bass channel to the two channels
5.1 channels, with one front center channel, one front left channel, one front right channel, one surround left channel, one surround right channel, and one bass channel, first used in early theaters
7.1 channels, which extend 5.1 by splitting the surround left and right channels into surround left and right plus rear left and right, mainly used in BD and modern theaters
Next is a two-channel audio system.

Sampling rate
Audio sampling is the conversion of sound from an analog signal to a digital signal. The sample rate is the number of times the sound is collected per second and is also the number of samples per second of the resulting digital signal. When sampling sound, common sample rates are:

8,000 Hz – telephone sampling rate, sufficient for human speech
11,025 Hz – sample rate for AM radio
22,050 Hz and 24,000 Hz – sample rates for FM radio
32,000 Hz – sample rate for miniDV digital camcorders, DAT (LP mode)
44,100 Hz – audio CD, also commonly used in MPEG-1 audio (VCD, SVCD, MP3)
47,250 Hz – sample rate used by commercial PCM recorders
48,000 Hz – sample rate for miniDV, digital TV, DVD, DAT, movies, and pro audio
50,000 Hz – sample rate used by commercial digital sound recorders
96,000 or 192,000 Hz – sample rates used by DVD-Audio, some LPCM DVD soundtracks, BD-ROM (Blu-ray Disc) soundtracks and HD-DVD (High Definition DVD) soundtracks
2.8224 MHz – the sample rate used by Direct Stream Digital’s 1-bit sigma-delta modulation process

What is the use of playback gain in the player? What is the difference between an album and a track?

Replay Gain

Generally speaking, Replay Gain is a method by which software automatically adjusts each song and album to an appropriate volume during playback.

Because every recording is mastered at a different volume, some songs may be too loud and others too quiet even if your playback device stays at the same volume setting. Replay gain solves this problem.

Specifically, replay gain is a tag that can be stored in an audio file, indicating a gain value (+3.1 dB, -2.0 dB, etc.). Players that support replay gain read these tags and adjust their internal volume during playback, so that different songs are heard at the same level without changing the system volume. Replay Gain is also a standard, which specifies an algorithm for automatically measuring the loudness of audio files, so that software can even out the volume differences between songs automatically. The newest current standard is ReplayGain 2.0, which is more accurate than previous versions.

Replay gain tags fall into two categories: album gain and track gain. Album gain treats an album as a whole and brings different albums to the same volume; track gain evens out the volume differences between individual songs, including songs within the same album. The two are distinguished because volume differences within an album are often intentional on the part of the producer. With two types of gain tag, users can choose to apply replay gain at the album level or the individual track level.
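How a player might apply such a tag can be sketched as follows. This is not the ReplayGain loudness-measurement algorithm, only the last step of turning a stored dB value into a volume change, using the standard relation factor = 10^(dB/20). The function names are illustrative, and samples are assumed to be floats in the range -1.0 to 1.0.

```python
def db_to_linear(gain_db: float) -> float:
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10 ** (gain_db / 20)


def apply_gain(samples, gain_db):
    """Scale float samples by the tagged gain, clipping to [-1.0, 1.0]."""
    factor = db_to_linear(gain_db)
    return [max(-1.0, min(1.0, s * factor)) for s in samples]


print(round(db_to_linear(3.1), 3))   # 1.429: a +3.1 dB tag makes the track louder
print(round(db_to_linear(-2.0), 3))  # 0.794: a -2.0 dB tag makes it quieter
print(apply_gain([0.5, -0.25], 20))  # [1.0, -1.0]: large boosts clip
```

A player would pass the album gain value when the listener wants to keep the intended loudness differences within an album, and the track gain value otherwise.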

Mp4Gain does not use Replay Gain as its main normalization method, since it has already developed its own much more advanced methods. However, if you wish, you have the option of selecting Replay Gain to be used as a method in Mp4Gain.

But for the common user, this is not necessary, since Mp4Gain has, as we said, more modern, efficient and sophisticated methods.

Mp3 (an audio encoding method) Part 3

MP3 ENCODING

To generate bit-compliant MPEG Audio files (Layer 1, Layer 2, Layer 3), members of the ISO MPEG Audio committee developed reference simulation software in C, known as ISO 11172-5.

Implemented as a non-real-time program on several operating systems, it was able to demonstrate the first real-time, DSP-based hardware decoding of compressed audio. Other real-time MPEG Audio implementations were developed for digital broadcasting (DAB radio and DVB TV), targeting consumer receivers and set-top boxes.
Later on July 7, 1994, Fraunhofer-Gesellschaft released the first MP3 encoder called l3enc.
The Fraunhofer development team selected the .mp3 extension on July 14, 1995 (previously the extension was .bit). Using Winplay3 (released September 9, 1995), the first real-time software MP3 player, many people were able to encode and play MP3 files on their own personal computers. Since hard drives at the time were relatively small (such as 500MB), this technology was essential for storing entertainment music on computers.
MP2, MP3 and Internet
In October 1993, MP2 (MPEG-1 Audio Layer 2) files appeared on the Internet and were often played with Xing MPEG Audio Player and, later, MAPlay, developed by Tobias Bading for Unix. MAPlay was first released on February 22, 1994 and later ported to the Microsoft Windows platform.
The only MP2 encoder products at first were Xing Encoder and CDDA2WAV, a CD ripper that converts audio tracks from CDs to WAV format.
Often considered the father of the online music revolution, the Internet Underground Music Archive (IUMA) was the first hi-fi music site on the Internet, with thousands of licensed MP2 recordings before MP3 and the web became popular.
From the first half of 1995 to the end of the 1990s, MP3 began to flourish on the Internet. MP3’s popularity is largely due to the success of companies and software packages such as Winamp, released by Nullsoft in 1997, and Napster, released in 1999, which reinforced one another. These programs made it easy for ordinary users to play, create, share and collect MP3 files.
The debate about peer-to-peer sharing of MP3 files spread rapidly in the following years, mainly because compression is what made file sharing practical; uncompressed files were too large to share. Because MP3 files spread so widely over the Internet, some of the major record labels sued Napster to protect their copyrights (see Copyright).
Commercial online music distribution services, such as the iTunes Music Store, often choose other proprietary or DRM-enabled music file formats to control and limit the use of digital music. Formats that support DRM are used to protect copyrighted material from copyright infringement, but most protection mechanisms can be broken in some way. Computer experts can use these methods to generate unlocked files that can be freely copied. One notable exception is Microsoft’s Windows Media Audio 10 format, which has yet to be cracked. If a compressed audio file is desired, the recorded audio stream must be compressed and the sound quality will be degraded.
Audio quality
Because MP3 is a lossy compression format, it offers a choice of “bit rates”, that is, the number of encoded data bits used to represent each second of audio. Typical bit rates are between 128 kbps and 320 kbps (kbit/s). In contrast, the bit rate of uncompressed audio on a CD is 1411.2 kbps (16 bits/sample × 44,100 samples/sec × 2 channels).
MP3 files encoded at lower bit rates generally play back at lower quality. If the bit rate is too low, “compression artifacts” (sounds not present in the original recording) appear during playback. A good example is the sound of cheering or applause: because it is random and changes sharply, encoder errors are more noticeable and sound like echoes.
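The CD figure and the resulting compression ratios can be reproduced with a short calculation. The ratios are taken relative to CD audio, which is only one possible reference; the function name is illustrative.

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Bit rate of uncompressed PCM audio in kbit/s (1 kbit = 1000 bits)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


cd_kbps = pcm_bitrate_kbps(44_100, 16, 2)
print(cd_kbps)  # 1411.2

for mp3_kbps in (128, 192, 320):
    # How many times smaller the MP3 stream is than the CD stream
    print(mp3_kbps, round(cd_kbps / mp3_kbps, 2))
# 128 11.03
# 192 7.35
# 320 4.41
```

At 128 kbps the stream is about 11 times smaller than CD audio; at 320 kbps it is only about 4.4 times smaller, which leaves the encoder far more room to preserve detail.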

Mp3 (an audio encoding method) Part 2

MP3 encoding

1. MPEG-1 Audio Layer 2 encoding began as the Digital Audio Broadcasting (DAB) project managed by Egon Meier-Engelen at the German Deutsche Forschungs- und Versuchsanstalt für Luft- und Raumfahrt (later known as the Deutsches Zentrum für Luft- und Raumfahrt, the German Aerospace Center). The project was funded by the European Union as a EUREKA research project and is commonly known as EU-147. EU-147 ran from 1987 to 1994.
2. By 1991, two proposals had emerged: Musicam (called Layer 2) and ASPEC (Adaptive Spectral Perceptual Entropy Coding). The Musicam method, proposed by Philips of the Netherlands, CCETT of France, and the Institut für Rundfunktechnik of Germany, was chosen for its simplicity, error robustness, and lower computational cost for high-quality compression. The Musicam format, based on subband coding, was a key factor in determining the MPEG audio compression format (sample rate, frame structure, header, samples per frame). This technology and its design philosophy were fully integrated into the definition of the ISO MPEG Audio Layer I, Layer II and, later, Layer III (MP3) formats. The standard was developed by Leon van de Kerkhof (Layer I) and Gerhard Stoll (Layer II) under the chairmanship of Prof. Musmann (University of Hannover).
3. A working group consisting of Leon van de Kerkhof from the Netherlands, Gerhard Stoll from Germany, Yves-François Dehery from France and Karlheinz Brandenburg from Germany took design ideas from Musicam and ASPEC, added their own, and developed MP3, which was designed to achieve at 128 kbit/s the sound quality of MP2 at 192 kbit/s.
4. All of these algorithms became part of the first group of MPEG standards, MPEG-1, in 1992, resulting in the international standard ISO/IEC 11172-3, published in 1993. Further work on MPEG audio became part of MPEG-2, the second group of MPEG standards, completed in 1994 and officially known as ISO/IEC 13818-3, first published in 1995.
5. The compression efficiency of an encoder is generally defined by the bit rate, because the compression ratio depends on the bit depth and sampling rate of the input signal. Nevertheless, products often quote a compression ratio, using CD parameters (44.1 kHz, two channels, 16 bits per channel, or 2×16 bits) as the reference. A ratio quoted this way depends on the reference chosen, which shows why compression ratio is a problematic measure for lossy compression.
6. Karlheinz Brandenburg used the CD recording of Suzanne Vega’s song Tom’s Diner to test MP3 compression algorithms. The song was chosen because its smooth, simple melody makes flaws in the compressed format easier to hear during playback. Some jokingly refer to Suzanne Vega as “the mother of MP3”. More demanding and critical audio excerpts (glockenspiel, triangle, accordion…) from the EBU V3/SQAM reference CD are used by professional audio engineers to assess the subjective perceived quality of MPEG audio formats.

Mp3 (an audio encoding method)

MP3 encoding

MP3 is an audio compression technology. Its full name is Moving Picture Experts Group Audio Layer III, MP3 for short.

It is designed to drastically reduce the amount of audio data. Using MPEG Audio Layer 3 technology, music is compressed into a much smaller file at a compression ratio of 1:10 or even 1:12, and for most users there is no significant decrease in playback quality compared with the original uncompressed audio. It was invented and standardized in 1991 by a group of engineers at the Fraunhofer-Gesellschaft research organization in Erlangen, Germany. Music stored in MP3 form is called MP3 music, and a device that can play MP3 music is called an MP3 player.

Name: Motion Picture Expert Compression Standard Audio Layer 3
Foreign name: Moving Picture Expert Group Audio Layer III
Research organization: Fraunhofer-Gesellschaft
Type: audio coding
Advantage: drastically reduces the amount of audio data
Drawback: loss of sound quality
Features
MP3 converts the time-domain waveform into a frequency-domain signal and splits it into multiple frequency bands, then exploits the human ear’s insensitivity to high-frequency sound by applying different compression rates to different bands: high frequencies are compressed heavily (or even discarded), while low frequencies are compressed lightly so that the signal is not distorted. This is equivalent to discarding the high-frequency sound that is basically inaudible to the human ear [1] and keeping only the audible low-frequency part, which compresses the sound at a ratio of 1:10 or even 1:12. Because the full name of this compression method is MPEG Audio Layer 3, people call it MP3 for short.
According to the MPEG specification, AAC (Advanced Audio Coding) in MPEG-4 will be the next generation of the MP3 format.
Compared to CD, FLAC and APE lossless compression formats, the sound quality of the highest parameter MP3 (320 Kbps) is not much different.
MP3 players are dying
When they first came out, MP3 players were at the forefront of the digital revolution. However, sales of iPods and other MP3 players in the UK fell sharply in 2012 as consumers turned to other digital products such as smartphones.
In 2012, sales of MP3 players in the UK market were £110m ($178m), just 29% of the £381m in 2011, according to market research firm Mintel. Mintel expects total MP3 player sales in the UK market to halve by 2017. In the worst case scenario, total MP3 player sales in the UK market will be just 25 million dollars five years later. [23]
1. MP3 is a data compression format;
2. It discards the parts of pulse code modulation (PCM) audio data that are not important to the human ear (much as JPEG does for lossy image compression), resulting in a much smaller file size;
3. MP3 audio can be compressed at different bit rates, providing a range of trade-offs between data size and sound quality. The MP3 format uses a hybrid transform to convert time-domain signals into frequency-domain signals;
4. 32-band polyphase quadrature filter (PQF);
5. 36- or 12-tap modified discrete cosine transform (MDCT); the size can be selected independently for subbands 0…1 and 2…31;
6. MP3 not only has extensive client software support, but also has a lot of hardware support, such as portable media players (referring to MP3 players), DVD and CD players, outgoing calls