MP3 file format

1. Overview:
MP3 files are made up of frames; the frame is the smallest unit of an MP3 file. The full name of MP3 is MPEG-1 Audio Layer III. MPEG (Moving Picture Experts Group) refers specifically to a family of standards for compressing moving video and audio. The audio part of the MPEG-1 standard is divided into three layers according to compression quality and encoding complexity, namely Layer 1, Layer 2 and Layer 3, which correspond to the three file types MP1, MP2 and MP3; the layer is chosen according to the intended use. The higher the layer, the more complex the encoder and the higher the compression ratio. The compression ratios of MP1 and MP2 are about 4:1 and 6:1 to 8:1 respectively, while MP3 reaches 10:1 to 12:1, meaning that one minute of CD-quality music requires about 10 MB of storage without compression but only about 1 MB after MP3 encoding. MP3, however, is a lossy compression method. To reduce audible distortion, MP3 uses perceptual coding: the encoder first analyzes the frequency spectrum of the audio and then uses filters to determine the noise level that can be tolerated. The available bits are then distributed by means of quantization, producing an MP3 file with a high compression ratio that still sounds close to the original source during playback.
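The storage figures above can be checked with a little arithmetic. The sketch below assumes standard CD audio parameters (44.1 kHz, 16 bits, 2 channels), which the text does not spell out, and a constant bit rate with 1 kbps taken as 1000 bits per second; the function names are illustrative only.

```python
# CD audio parameters (an assumption; the text only says "CD-quality").
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2


def pcm_bytes(seconds: float) -> float:
    """Size in bytes of uncompressed CD-quality PCM audio."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS * seconds / 8


def mp3_bytes(seconds: float, bitrate_kbps: float) -> float:
    """Approximate size in bytes of MP3 audio data at a constant bit rate."""
    return bitrate_kbps * 1000 * seconds / 8


def compression_ratio(bitrate_kbps: float) -> float:
    """Ratio of uncompressed size to MP3 size for the same duration."""
    return pcm_bytes(1) / mp3_bytes(1, bitrate_kbps)


if __name__ == "__main__":
    print(f"1 min PCM:          {pcm_bytes(60) / 1e6:.2f} MB")
    print(f"1 min MP3 @128kbps: {mp3_bytes(60, 128) / 1e6:.2f} MB")
    print(f"ratio:              {compression_ratio(128):.1f}:1")
```

Under these assumptions one minute of PCM audio comes to roughly 10.6 MB and one minute of 128 kbps MP3 to just under 1 MB, a ratio of about 11:1, which sits inside the 10:1 to 12:1 range quoted above.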
2. Overall structure of MP3 files:
An MP3 file is roughly divided into three parts: TAG_V2 (ID3V2), a series of frames, and TAG_V1 (ID3V1).
ID3V2: contains information such as author, composer and album. Its length is not fixed, and it extends the amount of information that ID3V1 can hold.
Frames: a series of frames whose number is determined by the size of the file and the length of the frames. The frame length may or may not be constant, depending on the bit rate. Each frame is divided into two parts: the frame header and the data body. The frame header records the bit rate, sample rate, version and other information about the MP3 stream, and each frame can be parsed independently of the others.
ID3V1: contains information such as author, composer and album; its length is 128 bytes.
3. MP3 frame format:
Each frame has a frame header (FRAMEHEADER) that is 4 bytes (32 bits) long. Two CRC bytes may follow the frame header; whether they are present depends on bit 16 of the FRAMEHEADER (the protection bit). If bit 16 is 1, there is no checksum after the frame header; if it is 0, there is a checksum. The checksum is 2 bytes long and immediately follows the frame header; the frame data come after it. The format is as follows:
FRAMEHEADER    CRC (optional)    MAIN_DATA
4 bytes        0 or 2 bytes      length calculated from the frame header
1. The format of the FRAMEHEADER is as follows:
AAAAAAAA AAABBCCD EEEEFFGH IIJJKLMM
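As a rough illustration of the 32-bit header layout above, the following Python sketch decodes a frame header and computes the frame length. The field meanings and lookup tables (11 sync bits, version, layer, protection bit, bit-rate index, sample-rate index, padding bit, channel mode) come from the commonly documented MPEG audio header layout rather than from this text; the function name parse_frame_header is hypothetical, and only MPEG-1 Layer III is handled.

```python
import struct

# Lookup tables for MPEG-1 Layer III only; other versions and layers differ.
# Index 0 ("free format") and index 15 (invalid) are not handled here.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]
CHANNEL_MODES = ["stereo", "joint stereo", "dual channel", "mono"]


def parse_frame_header(data: bytes) -> dict:
    """Decode a 4-byte MPEG-1 Layer III frame header (illustrative sketch)."""
    if len(data) < 4:
        raise ValueError("need at least 4 bytes")
    (word,) = struct.unpack(">I", data[:4])  # big-endian 32-bit word

    if word >> 21 != 0x7FF:                  # A: 11 sync bits, all set
        raise ValueError("frame sync not found")
    version = (word >> 19) & 0x3             # B: 0b11 means MPEG-1
    layer = (word >> 17) & 0x3               # C: 0b01 means Layer III
    if version != 0b11 or layer != 0b01:
        raise ValueError("this sketch handles MPEG-1 Layer III only")

    has_crc = ((word >> 16) & 0x1) == 0      # D: protection bit, 0 = CRC present
    bitrate = BITRATES_KBPS[(word >> 12) & 0xF]      # E: bit-rate index
    sample_rate = SAMPLE_RATES_HZ[(word >> 10) & 0x3]  # F: sample-rate index
    if bitrate is None or sample_rate is None:
        raise ValueError("free-format or reserved value, not handled here")
    padding = (word >> 9) & 0x1              # G: padding bit
    channel_mode = CHANNEL_MODES[(word >> 6) & 0x3]  # I: channel mode

    # Frame length in bytes for MPEG-1 Layer III, header included.
    frame_length = 144 * bitrate * 1000 // sample_rate + padding
    return {
        "has_crc": has_crc,
        "bitrate_kbps": bitrate,
        "sample_rate_hz": sample_rate,
        "padding": padding,
        "channel_mode": channel_mode,
        "frame_length": frame_length,
    }


if __name__ == "__main__":
    print(parse_frame_header(bytes.fromhex("fffb9064")))
```

The example header FF FB 90 64 describes a 128 kbps, 44.1 kHz joint-stereo frame without CRC. A real parser would also need to skip any ID3V2 tag at the start of the file and resynchronize on invalid headers.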

The mp3 phenomenon

The MP3 music format (MPEG-1 Layer 3) is one of the most widely used digital audio formats in the world. It is supported by virtually all portable and stationary audio devices. In May 2017, the developers of the format announced its “death”.

On April 23, 2017, the MP3 licensing program run by Technicolor and Fraunhofer IIS was terminated: the last patent included in the program had expired, so the format can now be used without a patent license. Can we say that the days of the most popular format are numbered? MP3 development began in the late 1980s at the Fraunhofer Institute for Integrated Circuits (IIS).

In 1987, the University of Erlangen-Nuremberg and Fraunhofer IIS teamed up to work on the EUREKA project EU147, Digital Audio Broadcasting (DAB). The first result of the alliance’s work was the LC-ATC codec, which made it possible to encode stereo music in real time. The next step was the development of the Optimum Coding in the Frequency Domain (OCF) algorithm, which already had some of the characteristics of the future MP3 codec. For the first time, it became possible to encode music in good quality at 64 kbps for a mono signal. OCF marked the start of the path towards standardization within MPEG (Moving Picture Experts Group), the organization responsible for developing international standards for the compression and transmission of digital video and audio content.

In 1989, MPEG received 14 proposals for an audio coding standard, so participants were invited to combine their developments. This led to four candidate proposals, including MUSICAM, from the Institute for Broadcasting Technology (IRT) and Philips, and ASPEC (Adaptive Spectral Perceptual Entropy Coding), which resulted from further enhancements to Fraunhofer IIS’s OCF as well as contributions from the University of Hannover, AT&T and Thomson. After extensive testing, MPEG proposed combining MUSICAM and ASPEC to create a family of three encoding methods: Layer 1, a low-complexity version of MUSICAM; Layer 2, the MUSICAM codec; and Layer 3 (later called MP3), based on ASPEC.

Technical development of the MPEG-1 standard was completed in December 1991. In 1994, Fraunhofer IIS introduced the world’s first MP3 encoder, L3enc, and in 1995 the Fraunhofer researchers unanimously chose “.mp3” as the file extension for MPEG Layer 3 [1].

Thanks to the compression algorithm used in the MP3 audio format, the amount of data needed to reproduce a recording at acceptable quality is reduced by a factor of 10 to 12 relative to the original, depending on the bit rate. Bit rate is the number of bits used per second of the digital audio stream; sound quality improves as the bit rate increases. Commonly used MP3 bit rates include 32 kbps (very low quality, acceptable only for voice), 96 kbps, 128 kbps (medium quality), 160 kbps, 192 kbps, 256 kbps and 320 kbps (highest quality). The principle of the compression algorithm is as follows: during compression, the codec analyzes the signal and concentrates on the audible components, which are kept for later playback or transmission.

This process discards sounds beyond the range of human hearing (20 to 20,000 Hz), which is why MP3 is called lossy. There are three ways to encode MP3 files: constant bit rate (CBR), variable bit rate (VBR), and average bit rate (ABR). CBR is the default encoding mode. In this mode, the bit rate is constant for the entire file, so each second of audio uses the same number of bits. Because the encoder uses the same bit rate regardless of the complexity of the music, the quality of the final file varies: complex passages come out at lower quality than simpler ones. The main advantage of this mode is that the size of the final file can be accurately predicted.

When encoding in VBR mode, the user selects the desired quality on a scale from 9 (lowest quality, highest distortion) to 0 (highest quality, lowest distortion). The codec then tries to maintain that quality throughout the file by choosing a suitable number of bits for each part of the recording. The main advantage is the ability to specify the level of quality to be achieved; a significant disadvantage is that the final file size is unpredictable. In ABR mode, the user sets a target bit rate and the encoder tries to keep the average bit rate at that target while using higher bit rates for the parts of the music that require more bits.
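The difference in size predictability between the modes can be sketched in a few lines of Python. The sketch assumes MPEG-1 Layer III framing (1152 samples per frame, frame length of 144 × bit rate / sample rate bytes, padding ignored), which this text does not state, and the per-frame bit rates in the VBR example are invented purely for illustration.

```python
SAMPLES_PER_FRAME = 1152  # MPEG-1 Layer III (assumed, not stated in the text)


def frame_bytes(bitrate_kbps: int, sample_rate_hz: int = 44100) -> int:
    """Length of one frame in bytes, ignoring the optional padding byte."""
    return 144 * bitrate_kbps * 1000 // sample_rate_hz


def stream_stats(frame_bitrates_kbps, sample_rate_hz: int = 44100):
    """Return (total bytes, duration in seconds, average kbps) for a stream."""
    total = sum(frame_bytes(b, sample_rate_hz) for b in frame_bitrates_kbps)
    duration = len(frame_bitrates_kbps) * SAMPLES_PER_FRAME / sample_rate_hz
    average_kbps = total * 8 / duration / 1000
    return total, duration, average_kbps


if __name__ == "__main__":
    # CBR: every frame uses the same bit rate, so the size follows
    # directly from the frame count.
    cbr = [128] * 1000
    # VBR: hypothetical per-frame choices made by an encoder; the size
    # depends on how many frames happened to need the higher rate.
    vbr = [64] * 700 + [192] * 300
    for name, frames in (("CBR", cbr), ("VBR", vbr)):
        total, duration, avg = stream_stats(frames)
        print(f"{name}: {total} bytes, {duration:.1f} s, {avg:.1f} kbps average")
```

For CBR the total is simply the frame count times a fixed frame length; for VBR it can only be computed once the encoder has decided the bit rate of every frame, which is why the file size is not known in advance.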

Size and quality of MP3 files

The MP3 file format is an “open format” supported by most manufacturers.

The MP3 format is one of the most common digital audio encoding formats. MP3 encoding is lossy, but it is based on a model that takes the characteristics of human hearing into account, so the losses do not lead to catastrophic degradation of the sound.

MP3 files have become a de facto standard and are compatible with the most popular operating systems, many CD and DVD players, and other devices.

Interestingly, the standard describes the stored format rather than the way the sound must be encoded. As a result, many different tools are available to create and play MP3 audio.

Codecs are used to encode audio in MP3 format.
An audio codec can be of two types: hardware or software.

Hardware encoding is done by dedicated chips.
Software encoding is done by computer programs.

Audio quality in MP3 format depends, all other things being equal, on the compression ratio (that is, the amount of loss) and on the encoding program. The playback chain matters too: players from established brands that use well-regarded codecs and audio signal processing tend to sound better than generic devices assembled from off-the-shelf components.

The quality of actual playback depends on the size of the media data stream, sometimes called the stream width. The usual term is bit rate: the data rate in kilobits per second, written kbps or kb/s. A recording can be encoded at a constant bit rate or at a variable bit rate. A variable bit rate helps preserve detail by spending more data on the passages that need it.

Not all bit rates are suitable for high-quality music playback

MP3 digital audio format

High-quality digitized audio requires a large amount of disk space.

Attempts to reduce file size with general-purpose archivers (RAR, GZIP, etc.) yield little gain because of the nature of audio data. However, a fairly high level of compression can be achieved with special methods that analyze the structure of the data and then compress it with some loss.

Digital sound processing comparable in quality to existing analog equipment did not become practical until the late 1980s.

In 1988, the International Organization for Standardization (ISO) formed the MPEG (Moving Picture Experts Group) committee, whose main task is to develop standards for the encoding of moving pictures, sound and their combination. During the first ten years of its existence, the committee developed a series of standards on this subject. Summarizing extensive research in the area, these standards recommend several specific formats for storing data that perform very well in both quality and data rate.

There are currently three video storage standards: MPEG-1, MPEG-2, and MPEG-4.

The first two standards also define formats for storing audio information: Layer-1, Layer-2 and Layer-3. These three audio formats are defined in MPEG-1, and MPEG-2 uses them with minor extensions. The three formats are similar to each other but strike different trade-offs between compression and complexity.

Layer-1 is the simplest: it requires little computation but also provides only a modest compression ratio.

Layer-3 is the most computationally demanding and provides the best compression. This format has gained immense popularity and is usually called MP3, after the extension of the audio files stored in this format.

The underlying idea behind all lossy audio compression techniques is to neglect the subtle details of the original sound that are beyond the reach of the human ear. Here several points can be highlighted.

Noise level. Sound compression is based on a simple fact: if a person is near a loud siren, they are unlikely to hear the conversation of people nearby. This happens not so much because the person is paying attention to the loud sound as because the human ear actually fails to register sounds that lie in the same frequency range as a louder sound. This effect is called masking; it varies with the difference in volume and frequency between the sounds.

The second point is the division of the audio frequency band into subbands, each of which is then processed separately. The encoding program finds the loudest sounds in each band and uses this information to determine an acceptable noise level for that band. The best encoding programs also take the influence of adjacent bands into account: a very loud sound in one band can mask sounds in nearby bands as well.

Another aspect of the coding is the use of a psychoacoustic model based on the characteristics of human sound perception. Compression with this model removes frequencies known to be inaudible while more carefully preserving sounds that the human ear hears easily. Unfortunately, there can be no exact mathematical formulas here.

Human perception of sound is a complex process that is not fully understood, so compression methods are chosen on the basis of listening tests in which panels of experts compare sounds compressed in different ways. This leaves practically limitless room for improving psychoacoustic models. Most existing algorithms for encoding the human voice rely on the high predictability of the speech signal; the general-purpose MPEG compression algorithms have tried to apply this technique with varying success.

Another compression technique is so-called joint stereo. The human auditory system can localize only mid-range frequencies; very high and very low sounds are perceived, so to speak, independently of where their source is. This means that these frequencies can be encoded as a mono signal. In addition, compression takes advantage of differences in the complexity of the signals in the two channels.
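One widely used form of joint stereo is mid/side coding, in which the sum and difference of the two channels are stored instead of left and right. The text does not say which variant it has in mind, so the sketch below is an assumption offered for illustration; it works on plain lists of sample values, whereas a real MP3 encoder applies the transform to frequency-domain data, and the function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2)


def ms_encode(left, right):
    """Convert left/right samples to mid (sum) and side (difference)."""
    mid = [(l + r) / SQRT2 for l, r in zip(left, right)]
    side = [(l - r) / SQRT2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Recover left/right samples from mid/side."""
    left = [(m + s) / SQRT2 for m, s in zip(mid, side)]
    right = [(m - s) / SQRT2 for m, s in zip(mid, side)]
    return left, right


def energy(samples):
    """Sum of squared sample values."""
    return sum(x * x for x in samples)


if __name__ == "__main__":
    # Two nearly identical channels, as in a typical centered recording.
    left = [0.50, -0.25, 0.80, 0.10]
    right = [0.50, -0.20, 0.75, 0.10]
    mid, side = ms_encode(left, right)
    print("right-channel energy:", energy(right))
    print("side-channel energy: ", energy(side))
```

When the two channels are similar, the side signal carries very little energy and therefore needs far fewer bits than a full second channel, while the transform itself loses nothing: decoding mid and side returns the original left and right values up to rounding.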

Why mp3 is enough for you, but Lossless is not necessary

Didn’t graduate from a conservatory? Then you don’t need lossless: listen to high-quality MP3.

Very often you meet people who despise compressed formats on principle. You should not be guided by their opinion. Even in a studio, these purists will, with 90% probability, fail to hear the difference between compressed and uncompressed audio.

MP3 wasn’t invented just to reduce quality. It was developed by the Fraunhofer Society, an association of applied research institutes in Germany. Later they came up with AAC, which could have become the main compressed audio format … but that did not happen.

Did you know that MP3 comes with variable (VBR) and constant (CBR) bit rates? With a constant bit rate, the algorithm spends the same number of bits on every fragment, whatever its content. It can therefore produce uneven quality, which means that not all sounds will be recorded in high quality.

Since MP3 has been around for a long time, it has many limitations. Bit depth is 16 to 24 bits. The supported sample rates are 8, 11.025, 12, 16, 22.05, 24, 32, 44.1 and 48 kHz. The maximum bit rate is 320 kbps. The maximum number of channels is 2. But since we are talking about music, multichannel recordings are hard to find anyway.

Now let’s see how MP3 is encoded. The illustration shows the time-frequency distribution of the same recording in three forms: an audio CD, an OGG file and a well-encoded MP3. The panels on the right and left almost completely coincide, which means that the MP3 file sounds almost the same as the original CD recording.

Human hearing and its limits – psychoacoustics

The fact is that the main task of the Fraunhofer Society here is the development of psychoacoustic models of human sound perception, and there are many subtleties. First and foremost, we are not dolphins.

Second, there are limits on the number of sounds that can be perceived simultaneously. A person cannot hear more than 250 sounds at once across 24 bands (and the number of simultaneous sounds within a single band is also quite small).

Third, the audible range is 16 Hz to 20 kHz, and that is under ideal conditions and with training (yes, hearing has to be trained!). By the age of 60 it shrinks by almost half.

All frequencies below 100 Hz are perceived not by the hearing cells, but … by the skin. The low-frequency waves are then reflected in the ear canal and perceived as infrabass. (This belongs to the area of bone conduction.)

Also, the number of cells that register acoustic waves differs from person to person. What is more, for each individual it differs between the right and the left ear.

By the way, each ear perceives sound differently. Swap the channels of your favorite song and you get a new sound.

If you dig deeper, it turns out that each sound frequency is perceived only above a certain volume. Once that volume is reached, silence gives way to a sharp, distinct sound. After that, a person can hear this frequency even at a lower volume.
