Differences in audio waveform representation in PCM and FLAC

Let’s talk about differences in audio waveform representation in PCM and FLAC

When it comes to digital audio, two formats often come up: PCM (Pulse Code Modulation) and FLAC (Free Lossless Audio Codec). Both are widely used, but they represent audio waveforms in significantly different ways. As an expert with years of experience in digital audio, I can tell you that understanding these differences is essential for choosing the right format for your needs. In this article, I’ll dive deep into how PCM and FLAC represent audio waveforms and why those differences matter for sound quality, file size, and usability.

PCM is the standard method for representing audio waveforms in a raw, uncompressed form. It’s the representation used on audio CDs. The sound is captured as a sequence of discrete amplitude values sampled at a fixed rate. In contrast, FLAC is a compressed format, meaning it stores the same audio data but does so more efficiently, without losing any of the original sound quality. Let’s break down how each format works and where the differences lie, especially in their waveform representation.

How PCM Represents Audio Waveforms

PCM audio is all about simplicity and accuracy. It represents sound by recording amplitude values at regular intervals, which we call samples. These samples are then stored as a sequence of binary numbers. Imagine listening to a radio station—you hear a continuous flow of sound waves. Now, if you were to capture that sound digitally using PCM, it would look like a series of steps, where each step corresponds to a snapshot of the audio at a specific moment.

The resolution of PCM’s waveform representation depends on two key factors: sample rate and bit depth. The sample rate is how often the audio is sampled per second, and the bit depth defines how precise each sample is. For instance, a standard CD uses a sample rate of 44.1 kHz and a bit depth of 16 bits. The higher these values, the more accurately PCM can represent the original waveform.
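To make sample rate and bit depth concrete, here is a minimal Python sketch that samples a sine tone and quantizes it to signed integers, the way 16-bit PCM stores it. The function name `pcm_samples` and the 441 Hz test tone are illustrative assumptions, not part of any standard API.

```python
import math

def pcm_samples(freq_hz, duration_s, sample_rate=44100, bit_depth=16):
    """Sample a sine wave and quantize each sample to a signed integer."""
    max_amp = 2 ** (bit_depth - 1) - 1          # 32767 for 16-bit audio
    count = round(duration_s * sample_rate)     # number of samples to take
    return [
        round(max_amp * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        for n in range(count)
    ]

# 10 ms of a 441 Hz tone at CD settings (hypothetical test signal).
samples = pcm_samples(freq_hz=441, duration_s=0.01)
print(len(samples))    # how many amplitude values were stored
print(samples[:5])     # the first few quantized amplitudes
```

Raising `sample_rate` produces more values per second, and raising `bit_depth` widens the integer range available to each value.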

Key Features of PCM Audio Representation

Raw, uncompressed format
Each sample corresponds to an amplitude value at a specific point in time
Higher sample rates and bit depths provide more accurate representation
Typically large file sizes due to the uncompressed nature
Widely used in professional audio applications

For example, if you were to look at the waveform of a song in PCM, you’d see a stepped line that closely follows the original audio signal. Each point on the line represents a sample. A higher sample rate places the points closer together in time, and a greater bit depth gives each point finer amplitude resolution, so the waveform appears smoother. This representation is precise but also creates large files since every sample needs to be stored.

How FLAC Represents Audio Waveforms

On the other hand, FLAC compresses audio data without losing any quality. This compression is what makes it different from PCM. FLAC uses lossless compression, which means that it reduces file size while maintaining the integrity of the original waveform. It’s like folding a piece of paper into a smaller, more compact shape without tearing or cutting it—when you unfold it, it’s still the same shape.

In FLAC, the waveform is stored in a way that keeps all of the information but removes redundancy. The encoder splits the audio into blocks and, within each block, predicts every sample from the samples before it. Only the usually small difference between the prediction and the actual value (the residual) is stored, using a compact variable-length code. Smooth, predictable passages therefore take very little space, and extra data is spent only where it’s truly needed. When you decode the FLAC file, it reconstructs exactly the same sample values that the PCM original contained.
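One way a lossless codec can remove redundancy is to predict each sample from the previous ones and keep only the prediction error. The sketch below illustrates that idea with a simple second-order predictor. It is a toy under stated assumptions, not the FLAC bitstream format, and the function names are hypothetical.

```python
import math

def encode_residuals(samples):
    """Keep two warm-up samples, then store only the prediction errors."""
    warmup = list(samples[:2])
    residuals = [
        # Predict the next value by extending the line through the last two.
        samples[n] - (2 * samples[n - 1] - samples[n - 2])
        for n in range(2, len(samples))
    ]
    return warmup, residuals

def decode_residuals(warmup, residuals):
    """Rebuild the exact original samples from warm-up values and errors."""
    samples = list(warmup)
    for error in residuals:
        samples.append(error + 2 * samples[-1] - samples[-2])
    return samples

# Hypothetical test signal: 10 ms of a 441 Hz tone as 16-bit PCM values.
pcm = [round(32767 * math.sin(2 * math.pi * 441 * n / 44100)) for n in range(441)]
warmup, residuals = encode_residuals(pcm)
restored = decode_residuals(warmup, residuals)

print(restored == pcm)                      # lossless round trip
print(max(abs(s) for s in pcm))             # largest sample magnitude
print(max(abs(r) for r in residuals))       # largest residual magnitude
```

Because the residuals are much smaller than the samples, they need fewer bits each, and that saving is where the compression comes from.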

Key Features of FLAC Audio Representation

Lossless compression that retains full audio quality
Stores audio in a more compact form, reducing file sizes
Uses advanced algorithms to find and eliminate redundancy in the waveform
Ideal for audiophiles and archival purposes
Less storage space required compared to PCM

The FLAC waveform representation might appear similar to the PCM waveform in terms of its overall shape, but the difference lies in the file size. A FLAC file will be much smaller than an uncompressed PCM file, even though both formats contain identical audio data. This is due to FLAC’s ability to remove redundant information in the waveform without affecting the sound quality.

Comparison of File Sizes: PCM vs FLAC

One of the most noticeable differences between PCM and FLAC is the file size. Since PCM stores every sample of the waveform in its original form, it tends to produce very large files. For example, a typical uncompressed PCM file (like a WAV or AIFF) for a single song can range from 40 MB to 100 MB or more, depending on the length and sample rate.

FLAC, on the other hand, compresses the same audio without losing any quality. Typically, you can expect FLAC files to be about 30-60% smaller than their PCM counterparts. This makes FLAC an attractive choice for people who want to store high-quality audio without taking up as much disk space. A FLAC file might be only 40 MB to 70 MB for the same song that would be 100 MB in PCM.
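The PCM figures follow directly from sample rate, bit depth, channel count and duration. The short sketch below works through that arithmetic for a four-minute song and applies the 30-60% reduction quoted above as a rough estimate. The helper name is an assumption, and real FLAC savings depend on the music.

```python
def pcm_size_bytes(duration_s, sample_rate=44100, bit_depth=16, channels=2):
    """Size of the raw sample data: one value per channel per sample."""
    return duration_s * sample_rate * (bit_depth // 8) * channels

song = pcm_size_bytes(4 * 60)        # a four-minute, CD-quality stereo song
flac_min = song * (1 - 0.60)         # best case of the quoted range
flac_max = song * (1 - 0.30)         # worst case of the quoted range

print(f"PCM:  {song / 1e6:.1f} MB")
print(f"FLAC: {flac_min / 1e6:.1f} to {flac_max / 1e6:.1f} MB (estimate)")
```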

Comparison of File Sizes

PCM files are large due to uncompressed data (e.g., WAV, AIFF)
FLAC files are compressed, typically 30-60% smaller than PCM files
FLAC provides the same sound quality as PCM but with reduced storage needs
FLAC is ideal for audiophiles who want to save space while preserving audio integrity

If you’ve ever had to manage a large music library or archive audio files, you’ll quickly realize how much space you can save by converting your PCM files to FLAC. It’s like switching from storing a stack of paper in a huge box to a compact, neatly folded bundle. Not only is FLAC more space-efficient, but it’s also more manageable for devices with limited storage capacity, like smartphones and portable music players.

Impact on Audio Quality: PCM vs FLAC

In terms of sound quality, both PCM and FLAC deliver the exact same result when it comes to playing back audio. Since FLAC is a lossless format, it preserves the full audio information from the original recording, just like PCM does. However, the key distinction is that PCM provides that audio in its raw, uncompressed form, while FLAC compresses the data without any loss of quality.

In real-world usage, this means that you’ll hear no difference between PCM and FLAC when listening to music, even on a very high-end audio system, because the decoded FLAC samples are identical to the PCM original. Both formats are considered to be “bit-perfect,” meaning they deliver the exact same sound. FLAC’s advantage comes when you need to manage large collections of music or require a more efficient way to store audio without sacrificing quality.

Let’s talk about the benefits of PCM and FLAC for different uses

When deciding between PCM and FLAC, it’s important to think about your specific use case. PCM is often favored in professional audio applications, where raw, uncompressed sound is required for tasks like recording, mixing, and mastering. Since PCM retains every sample without compression, it gives audio engineers the maximum flexibility and accuracy in their work.

FLAC, on the other hand, is perfect for audiophiles and anyone looking to store or share high-quality music files without taking up as much space. If you’re archiving your music collection or want to listen to lossless sound without using a ton of storage, FLAC is the better choice. It offers the best of both worlds: lossless quality with manageable file sizes.

Final words on differences in audio waveform representation in PCM and FLAC

To sum up, the differences between PCM and FLAC primarily come down to how the audio data is represented and stored. PCM is uncompressed and accurate, providing a true representation of the waveform, but at the cost of large file sizes. FLAC, on the other hand, compresses audio without losing any quality, making it a more space-efficient choice without sacrificing sound fidelity. Whether you choose PCM or FLAC depends on your needs—if you want raw, uncompressed audio for professional work, PCM is the way to go. If you’re looking to save space while keeping the same audio quality, FLAC is an excellent choice.

FAQ

What is the main difference between PCM and FLAC audio formats?

PCM is an uncompressed audio format that provides a raw waveform representation of sound, while FLAC is a lossless compressed format that reduces file size without affecting audio quality.

Does FLAC compress audio without losing quality?

Yes, FLAC is a lossless compression format, meaning it reduces file size while preserving the original audio data perfectly, without any loss in quality.

Which audio format is better for storage space, PCM or FLAC?

FLAC is better for storage space because it compresses audio files without losing any quality. PCM files tend to be much larger due to their uncompressed nature.

Is the sound quality different between PCM and FLAC?

No, the sound quality is identical between PCM and FLAC because FLAC is a lossless format, meaning it retains all the audio information of the original PCM file.

Can I convert FLAC to PCM?

Yes. FLAC can be decoded back to PCM, and because FLAC is lossless, the conversion does not lose any quality.

Why would I use PCM over FLAC?

You would use PCM if you require the raw, uncompressed audio for professional applications like recording, mixing, or mastering, where accuracy is crucial.

Does FLAC reduce audio quality during playback?

No, FLAC does not reduce audio quality during playback. It provides the same quality as the original PCM file but in a smaller size.

What is the ideal use case for FLAC?

FLAC is ideal for audiophiles, music collectors, or anyone who wants high-quality audio without taking up as much storage space as uncompressed PCM files.

Comments:

Great article! I never knew PCM and FLAC were so different in how they store audio. I always thought FLAC was just another MP3 type file, but now I understand it’s lossless. Thanks for breaking it down!

Wow, I didn’t realize the size difference between PCM and FLAC was so significant. It’s nice to know FLAC keeps the same sound quality but uses less space. I’ll definitely start using FLAC for my music collection.

This was really helpful, but I’d love to know more about when to choose PCM over FLAC for specific audio projects. Would love some more real-world examples of where PCM really shines.

After reading this, I feel a lot more confident in using FLAC for my home recordings. I was always worried about file sizes, but now I see it’s not a problem!

I’ve always used MP3s but now I see why audiophiles swear by FLAC. I’m going to try converting my music to FLAC, especially since it’s lossless. Great info!

M4A Audio: Lossless vs. Hybrid Formats

When it comes to audio formats, M4A stands out as a popular choice among music enthusiasts. However, there is a crucial distinction within the M4A realm – lossless and hybrid formats. Understanding the difference between these formats is essential for audiophiles seeking the best possible audio experience. In this article, we delve into the depths of M4A audio and explore the nuances between its lossless and hybrid formats, shedding light on their advantages and use cases.

Lossless M4A Audio: Uncompromised Audio Fidelity

Lossless M4A, as the name suggests, preserves the original audio quality without any loss of data during compression. This means that the audio is reproduced with utmost fidelity, mirroring the exact sound as it was recorded. The technology behind lossless compression ensures that no audio information is discarded, resulting in bit-for-bit accuracy.

One of the primary advantages of lossless M4A is its ability to deliver an audiophile-grade listening experience. Whether you are a music producer or a discerning listener, lossless M4A allows you to hear every nuance, intricate detail, and subtlest tones in your favorite tracks. The files, however, tend to be larger than those of lossy audio formats, as they retain all the data from the original source.

“Lossless M4A is a haven for true audiophiles, presenting music in its purest form, untouched by compression artifacts.” – The Audiophile’s Guide to High-Resolution Audio

Hybrid M4A Audio: Striking a Balance Between Quality and Size

Hybrid M4A, on the other hand, combines elements of both lossless and lossy audio formats, aiming to strike a balance between audio quality and file size. In this format, certain audio data is discarded during compression, resulting in a smaller file size compared to lossless M4A. However, the compression is cleverly designed to retain critical audio information, ensuring a notable reduction in file size without significant loss of quality.

This hybrid approach makes M4A audio files highly versatile and practical, especially for everyday listening and storage on portable devices with limited storage capacities. While the audio quality is not on par with lossless M4A, the difference is often subtle and may go unnoticed by most listeners. For those seeking an enjoyable audio experience without consuming excessive storage space, hybrid M4A proves to be an excellent choice.

“Hybrid M4A strikes a perfect balance, preserving audio quality while optimizing storage requirements, catering to a broader audience of music enthusiasts.” – The Art of Digital Audio Compression

Use Cases and Applications

The choice between lossless and hybrid M4A formats largely depends on individual preferences and specific use cases. Let’s explore some common scenarios where each format shines:

Lossless M4A:

– Music Production: Lossless M4A is favored by music producers and audio engineers during the recording, editing, and mixing stages, as it provides the most accurate representation of the original sound.

– Audiophile Listening: For those with high-end audio equipment and a passion for sonic perfection, lossless M4A offers an unparalleled listening experience.

– Archiving Master Recordings: When preserving master recordings for archival purposes, lossless M4A ensures no loss of audio data over time.

Hybrid M4A:

– Personal Music Libraries: Hybrid M4A is an ideal choice for building personal music collections, as it strikes a balance between quality and file size, making it easy to store and manage.

– Online Music Streaming: Many music streaming platforms utilize hybrid M4A to deliver high-quality audio efficiently, providing users with a seamless streaming experience.

– Portable Devices: For users with limited storage on their smartphones, tablets, or music players, hybrid M4A is a space-saving option, allowing them to carry more music on the go.

“The versatility of M4A formats caters to diverse needs, empowering users to make the right choice for their specific audio requirements.” – Audio Formats for the Modern Listener

Final Words

As the world of digital audio continues to evolve, the distinction between lossless and hybrid M4A formats becomes increasingly relevant. Audiophiles and casual listeners alike must weigh the benefits and trade-offs of each format to make informed decisions about their music library. Whether you prioritize uncompromising audio quality or seek a practical solution for everyday listening, the M4A format, in its lossless and hybrid forms, remains a reliable and widely supported choice for the modern era of digital music.

Understanding the Differences between FLAC, MP3, M4A, OGG, and WAV Audio Formats

When it comes to digital audio, there are a plethora of different file formats to choose from. Each format has its own set of advantages and disadvantages, making it important to understand the differences between them in order to choose the best option for your needs. In this article, we will take a closer look at five popular audio formats: FLAC, MP3, M4A, OGG, and WAV.

FLAC

FLAC, or Free Lossless Audio Codec, is a popular open-source format that is known for its lossless compression. This means that, unlike some other formats, FLAC does not lose any audio quality during the compression process. This makes FLAC a great option for audiophiles who want the highest quality audio possible. However, FLAC files are typically larger than files in lossy formats such as MP3, which can be an issue for those with limited storage space.

MP3

MP3, or MPEG Audio Layer III, is one of the most widely used audio formats. It uses a lossy compression method, which means that some audio quality is lost during the compression process. However, MP3 files are significantly smaller than FLAC files, making them a great option for those who want to store a large amount of music on their device. Additionally, the MP3 format is supported by a wide range of devices and software, making it a very convenient option.

M4A

M4A, or MPEG-4 Audio, is a file format that is commonly used for music and other audio files. It most often holds AAC audio, which, like MP3, uses a lossy compression method; at comparable sound quality, M4A files are typically smaller than MP3 files. Additionally, M4A files can contain advanced features such as chapters and artwork, making them a great option for audiobooks and other spoken-word content. However, it is important to note that not all devices and software support M4A files.

OGG

OGG, or Ogg Vorbis, is a free and open-source format that is similar to MP3 and M4A. It uses a lossy compression method and is known for providing a good balance of audio quality and file size. OGG files are typically much smaller than FLAC files; their size relative to MP3 and M4A files depends on the bit rate chosen. Additionally, OGG files can contain advanced features such as tags and chapters, making them a great option for audiobooks and other spoken-word content. However, it is important to note that not all devices and software support OGG files.

WAV

WAV, or Waveform Audio File Format, is a popular format that is known for its high audio quality. It usually stores uncompressed PCM audio, so no audio quality is lost to compression. However, WAV files are typically larger than other formats, making them an option for those who want the highest quality audio possible and are not constrained by storage space. Additionally, WAV files are supported by a wide range of devices and software, making them a convenient option.

Why are there so many video and audio formats, and is there a difference?

7. VQF format

The compression ratio of the VQF format can reach 1:18, so for the same source the compressed VQF file is 30-50% smaller than an MP3, which makes it more convenient for online streaming. The sound quality is excellent, close to CD quality (16-bit, 44.1 kHz stereo). However, the technical standard for VQF has not been published, and the format has not become popular.
Supplement: rare format

8. FLAC format

FLAC is a lossless audio compression codec. It is a well-known, free set of audio compression tools characterized by lossless compression. Unlike lossy codecs such as MP3 and AAC, it does not destroy any of the original audio information, so the sound quality of a music CD can be restored exactly. It is now supported by many software and hardware audio products. In short, FLAC is similar to MP3 in use, but its compression is lossless, which means the audio loses no information when compressed with FLAC. This kind of compression is similar to Zip, but FLAC gives a higher compression ratio because it is designed specifically around the characteristics of audio, and you can play FLAC files in a player just as you normally do with MP3 files.
Supplement: lossless format; compared to APE, files are larger, but compatibility is good, encoding is fast, and player support is broader

9. APE format

APE is one of the most popular digital music file formats. Unlike lossy compression methods such as MP3, APE is a lossless audio compression technology: audio data read from a CD and compressed into APE format can later be restored from the APE file, and the restored audio is exactly the same as before compression, without any loss. An APE file is about half the size of the CD audio, and with the spread of broadband many music lovers have taken to the APE format, especially those who want to share CD audio over the network, since APE can save them a lot of resources.
Supplement: lossless compression format; compared to FLAC, files are smaller, but encoding is slow.

10. MID format

MID is the file extension for MIDI, which stands for Musical Instrument Digital Interface; that is, its real meaning is the name of an interface through which different devices transmit signals. All current MIDI music production depends on this interface, and the information transmitted over it is also called MIDI information. MIDI was first applied to electronic synthesizers (electronic musical instruments played on keyboards). Because the technical specifications of early electronic synthesizers were inconsistent, it was difficult to link different synthesizers. In August 1983, YAMAHA, ROLAND, KAWAI, and other well-known electronic musical instrument manufacturers jointly defined a unified digital musical instrument interface specification, the MIDI 1.0 Technical Specification. Since then, electronic synthesizers and other electronic musical instruments, such as electronic pianos, have adopted this unified specification, so that various electronic musical instruments can be linked together to transmit MIDI information and form a true synthesized music performance system.

Why are there so many video and audio formats, and is there a difference?

3. WAV format

The WAV format is one of the oldest digital audio formats and is widely supported by the Windows platform and its applications. WAV supports several compression algorithms and a variety of bit depths, sampling rates and channel counts. With a 44.1 kHz sampling rate and 16-bit quantization, the sound quality of WAV is practically the same as CD, but the format requires a great deal of storage space, which makes it inconvenient to share and stream.
Supplement: lossless, but large in size
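As a small illustration of how sampling rate, bit depth and channel count end up in a WAV file, the sketch below writes one second of 16-bit mono PCM with Python's standard `wave` module and reads the parameters back. The helper name `write_wav` and the 441 Hz tone are illustrative assumptions.

```python
import io
import math
import struct
import wave

def write_wav(target, samples, sample_rate=44100):
    """Write 16-bit mono PCM samples to a file name or file-like object."""
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)             # mono
        wav.setsampwidth(2)             # 2 bytes per sample = 16 bits
        wav.setframerate(sample_rate)   # samples per second
        # Pack the integers as little-endian signed 16-bit values.
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))

# One second of a 441 Hz tone (hypothetical test signal), kept in memory.
tone = [round(32767 * math.sin(2 * math.pi * 441 * n / 44100)) for n in range(44100)]
buffer = io.BytesIO()
write_wav(buffer, tone)
data = buffer.getvalue()

with wave.open(io.BytesIO(data), "rb") as wav:
    print(wav.getframerate(), wav.getsampwidth() * 8, wav.getnchannels())
print(len(data))   # header plus two bytes for every sample
```

Passing a file name instead of the `BytesIO` buffer would write the same data to disk.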

4. ASF format

ASF is a multimedia playback format developed by Microsoft, suitable for playback over the Internet.
Supplement: rare format

5. AAC format

AAC is short for Advanced Audio Coding. AAC is part of the MPEG-2 specification. The algorithm used by AAC is different from that of MP3; AAC improves encoding efficiency by combining additional coding tools. AAC’s audio algorithm far exceeds older compression algorithms (such as MP3) in compression capability. It also supports up to 48 audio channels and 15 low-frequency channels, higher sample rates and bit rates, multiple languages, and more efficient decoding. In short, AAC can provide better sound quality than MP3 with files about 30% smaller.
Supplement: one of the best lossy formats available. There are many encoders, of which faac and nero are common, and the bit rate goes up to 448 kbps. As for hardware support, higher-end MP3 players and mobile phones generally support it.

6. Mp3Pro format

Mp3Pro is an improved version of the MP3 encoding format, developed by the Swedish company Coding Technologies. It can compress sound files to half the size of the original MP3 format while maintaining the same sound quality; alternatively, it can improve the sound quality of MP3 music with essentially no change in file size. It lets you compress audio files to a lower bit rate while keeping as much of the pre-compression sound quality as possible. Mp3Pro is fully compatible with MP3: files compressed with Mp3Pro keep the .mp3 extension and can be played on old MP3 players, and old MP3 files can be played on new Mp3Pro players.

Why are there so many video and audio formats, and is there a difference?

I found that there are many video and audio formats, what is the difference between them? Is there a player that supports most audio and video playback formats?

The difference lies in the encoding method. Raw video and audio require a lot of storage space. In the era when a storage device measured in megabytes still counted as a large drive, various lossy compression encoding formats began to appear. The formats differ in how high their compression ratio is and in how faithfully they restore the original.

Basically, more advanced encodings can provide high-quality audio and video at a higher compression ratio.

1. MP3 format

MP3 uses MPEG Audio Layer 3 technology to compress music at a ratio of 1:10 or even 1:12 into a much smaller file while still preserving the original sound quality very well. It is precisely because of its small size and high sound quality that the MP3 format has become almost synonymous with online music. One minute of music in MP3 format is only about 1 MB in size, so each song is only 3-4 MB.

Supplement: the highest bit rate is 320 kbps, and by default the high-frequency part is cut off. The sound quality is not high!
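The "about 1 MB per minute" figure can be checked from the bit rate. The sketch below assumes a constant 128 kbps stream, a common setting that this article does not itself name, and compares it with the CD data rate; the function name is hypothetical.

```python
def mp3_size_bytes(bitrate_kbps, duration_s):
    """Approximate size of a constant-bit-rate MP3 stream, ignoring tags."""
    return bitrate_kbps * 1000 * duration_s // 8

cd_bitrate_kbps = 44100 * 16 * 2 / 1000      # CD audio: 1411.2 kbps
ratio = cd_bitrate_kbps / 128                # compression ratio at 128 kbps

print(mp3_size_bytes(128, 60))               # bytes in one minute
print(mp3_size_bytes(128, 4 * 60))           # bytes in a four-minute song
print(f"about 1:{ratio:.0f}")
```

At this bit rate, a four-minute song comes to a little under 4 MB, in line with the range quoted above.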

2. WMA format

WMA achieves a higher compression ratio by reducing the data rate while maintaining sound quality. The compression ratio can generally reach 1:18, and the resulting file is only about half the size of the corresponding MP3 file. This is very important for players fitted with only 32 MB of memory: a player that supports formats such as WMA and RA in effect doubles its 32 MB of space. In addition, WMA can add copy protection through a DRM scheme, or restrict the playback time, the number of playbacks, or even the playback device, which can effectively prevent piracy.
Supplement: 128 kbps is the optimal bit rate for WMA; 128 kbps WMA = 192 kbps MP3

Principle of mp3 and file format analysis. Part4

The three bytes starting at 1397H are 54 41 47, which store the characters “TAG”, indicating that this file has ID3 V1.0 information.

The 30 bytes starting at 139AH store the name of the song; the first 4 bytes, the only ones that are not 00, are 54 45 53 54, which spell “TEST”.
The 4 bytes starting at 13F4H are 04 19 14 03, and the stored date is “04/25/2003”.
The last byte is 4E, which represents the music category; its decimal code is 78, that is, “Rock&Roll”.
The other bytes are all 00, and no information is stored in them.

4 Conclusions
Because sound is an important multimedia data type, people are always looking for more efficient compression methods and new sound file formats. The MP3 file uses the MDCT, a quasi-optimal transform with a simple structure that is easy to program; it avoids the problem that the optimal transform (KL) requires solving for the eigenvalues and eigenvectors of the covariance matrix, which is difficult.

Through the analysis of the MP3 file format, it is not difficult to find its shortcomings. Every frame of an MP3 file carries its own 4-byte frame header, which adds up to a noticeable space overhead for a file with a large number of frames. ID3 stores the music description information, but the private, copyright, and other flags in the frame header are also description information, so the music description information is somewhat scattered.

In any case, the development of MP3 is unstoppable. MP3 has become a recognized sound data format. MP3 is becoming a hot spot in the field of multimedia information processing along with JPEG images and PDF documents.

Principle of mp3 and file format analysis. Part 3

ID3 standard
The MP3 frame header carries only simple music description flags such as private, copyright and original. It makes no provision for storing more complex information such as song title, author, album name, and year, all of which are very necessary in MP3 applications.

In 1996, in the “Studio 3” project, Eric Kemp proposed adding descriptive information about the song at the end of the MP3 file, which formed the ID3 standard. So far, the ID3 V1.0, V1.1, V2.0, V2.3 and V2.4 standards have been formulated. The higher the version, the richer and more detailed the recorded information.
The ID3 V1.0 standard is not complete, and it stores too little information: it cannot hold lyrics, album covers, images, etc. V2.0 is a fairly complete standard, but it makes software harder to write; although many people favor this format, very few programs actually implement it. The vast majority of MP3s still use the ID3 V1.0 standard. This standard uses the last 128 bytes of the MP3 file to store ID3 information. See Table 3 for the use of these 128 bytes.
Table 3 ID3 V1.0 description at the end of the file
Byte      Length (bytes)   Description
1-3       3                Stores the characters “TAG”, which indicate the ID3 V1.0 standard; the song information follows.
4-33      30               Song name
34-63     30               Author
64-93     30               Album name
94-97     4                Year
98-127    30               Notes
128       1                MP3 music category, a total of 147 types.
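The 128-byte layout above maps directly onto a fixed-size record. The following Python sketch unpacks it with the standard `struct` module. The function name `parse_id3v1` and the sample tag are illustrative assumptions, and the year field is returned as raw bytes because files differ in how they fill it.

```python
import struct

def parse_id3v1(data):
    """Return the ID3 V1.0 fields from the last 128 bytes, or None if absent."""
    if len(data) < 128:
        return None
    # 3 + 30 + 30 + 30 + 4 + 30 + 1 = 128 bytes, in the order of the table.
    marker, title, author, album, year, notes, category = struct.unpack(
        "<3s30s30s30s4s30sB", data[-128:]
    )
    if marker != b"TAG":
        return None

    def text(raw):
        # Fields are padded with 00 bytes; keep what comes before the padding.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(title),
        "author": text(author),
        "album": text(album),
        "year": year,            # raw 4 bytes, left undecoded on purpose
        "notes": text(notes),
        "category": category,
    }

# Hypothetical tag: title "TEST", category byte 4E (78), everything else 00.
sample_tag = (b"TAG" + b"TEST".ljust(30, b"\x00") + bytes(60)
              + bytes([0x04, 0x19, 0x14, 0x03]) + bytes(30) + bytes([0x4E]))
print(parse_id3v1(b"\xff\xfb" + sample_tag))
```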

3.3 File example
Open a file called test.mp3 in VC++ with the following content:
000000 FF FB 52 8C 00 00 01 49 09 C5 05 24 60 00 2A C1
000010 19 40 A6 00 00 05 96 41 34 18 20 80 08 26 48 29
000020 83 04 00 01 61 41 40 50 04 00 C1 2 41 50 64
…
0000d0 Fe FF FB 52 80 01 EE 90 65 6E 02 30
0000E0 32 0C CD CD CD CD 46 16 41 89 B8 408 89 300 408
0000F0 33 B7 00 00 01 02 FF FF FF F4 E1 2F FF FF FF FF
……
0001A0 DF FF FF FF FB 52 8C 12 00 E 01 FE 90 58 6E 09 A0 02
000150 8513 B0 AC 45 F6 19 61 26 26
0001C0 05 AC B4 20 28 94 FF FF FF FF FF FF FF FF FF FF
…
001390 7F FF FF FF FD 4E 00 54 41 47 54 45 53 54 00 00
0013A0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
001400
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00
001410 00 00 00 00 00 00 4E
File length is 1416H (5.142K), frame header is: FF FB 52 8C, converted to binary:
11111111 11111011 01010010 10001100
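The four header bytes can be decoded field by field. The sketch below follows the standard MPEG-1 Layer III header layout, which this part of the article does not spell out, so the bit positions and lookup tables are supplied here as background knowledge. The function name is hypothetical, and only MPEG-1 Layer III headers are handled.

```python
# Lookup tables for MPEG-1 Layer III only (other versions and layers differ).
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112, 128,
                 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]
CHANNEL_MODES = ["stereo", "joint stereo", "dual channel", "mono"]

def parse_frame_header(header):
    """Decode a 4-byte MPEG-1 Layer III frame header into its main fields."""
    word = int.from_bytes(header[:4], "big")
    if word >> 21 != 0x7FF:                       # 11 sync bits, all set
        raise ValueError("no frame sync found")
    if (word >> 19) & 0b11 != 0b11 or (word >> 17) & 0b11 != 0b01:
        raise ValueError("this sketch only handles MPEG-1 Layer III")
    bitrate = BITRATES_KBPS[(word >> 12) & 0xF]
    sample_rate = SAMPLE_RATES_HZ[(word >> 10) & 0b11]
    if bitrate is None or sample_rate is None:
        raise ValueError("free-format or reserved value")
    padding = (word >> 9) & 1
    return {
        "bitrate_kbps": bitrate,
        "sample_rate_hz": sample_rate,
        "padding": padding,
        "channel_mode": CHANNEL_MODES[(word >> 6) & 0b11],
        "copyright": bool((word >> 3) & 1),
        "original": bool((word >> 2) & 1),
        # Frame length in bytes for Layer III, including the header itself.
        "frame_length": 144 * bitrate * 1000 // sample_rate + padding,
    }

print(parse_frame_header(bytes.fromhex("FFFB528C")))
```

For this header the computed frame length places the next frame at offset D1H, which agrees with where the second FF FB appears in the dump above.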

Principle of mp3 and file format analysis. Part 2

MP3 uses perceptual audio coding, which is a lossy algorithm.

The frequency range of sound perceived by the human ear is 20 Hz to 20 kHz. MP3 removes a large amount of redundant and irrelevant signal. The encoder transforms the original sound into the frequency domain through a hybrid filter bank and uses a psychoacoustic model to estimate the noise level that is only just perceptible; the spectral values are then quantized and Huffman coded to form the MP3 bit stream. The decoder is much simpler: its task is to recover the sound signal from the encoded spectral components through inverse quantization and inverse transformation. The MP3 encoding and decoding process is shown in Figure 1.
2.4 Modified Discrete Cosine Transform
The Modified Discrete Cosine Transform (MDCT) converts a block of time-domain data into frequency-domain data, so that changes in the time-domain signal can be analyzed by frequency. MDCT is an enhancement of the DCT algorithm. The best-known fast transform is the fast Fourier transform (FFT), but the FFT requires complex arithmetic, whereas the MDCT uses only real arithmetic and is easy to program.
When compressing audio data, first divide the original sound data into fixed-size blocks, and then perform the forward MDCT to convert the values of each block into 512 MDCT coefficients. The inverse MDCT (IMDCT) restores sound data from the 512 coefficients. The sound data before and after are not identical, because redundant and irrelevant data is removed during the compression process. The forward MDCT formula is:
$$X(k) = \sum_{n=0}^{N-1} x(n)\,\cos\!\left[\frac{2\pi}{N}\,(n + n_0)\left(k + \frac{1}{2}\right)\right], \qquad k = 0, 1, \ldots, \frac{N}{2} - 1$$
where n0=(N/2+1)/2, x(n) is the time domain value, and X(k) is the frequency domain value. If N is 1024 points, the transform yields 512 frequency domain values.
The IMDCT transformation formula is:

n=0, 1, …, N-1
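The two transforms can be sketched directly from their definitions. The code below assumes the common form X(k) = Σ x(n) cos[(2π/N)(n + n0)(k + 1/2)] with n0 = (N/2 + 1)/2 and an inverse without a normalization factor. The function names `mdct` and `imdct` are illustrative, and the direct O(N²) sums are used for clarity rather than the fast algorithms a real codec would use.

```python
import math


def mdct(x):
    """Forward MDCT: N time-domain samples -> N/2 frequency-domain values."""
    N = len(x)
    n0 = (N / 2 + 1) / 2
    return [
        sum(x[n] * math.cos(2 * math.pi / N * (n + n0) * (k + 0.5))
            for n in range(N))
        for k in range(N // 2)
    ]


def imdct(X):
    """Inverse MDCT (unnormalized): N/2 values -> N time-aliased samples."""
    N = 2 * len(X)
    n0 = (N / 2 + 1) / 2
    return [
        sum(X[k] * math.cos(2 * math.pi / N * (n + n0) * (k + 0.5))
            for k in range(N // 2))
        for n in range(N)
    ]


block = [float(i % 7) for i in range(16)]   # arbitrary test block, N = 16
coeffs = mdct(block)
aliased = imdct(coeffs)
print(len(block), "samples ->", len(coeffs), "coefficients ->",
      len(aliased), "samples")
```

A single block does not invert exactly: the output of `imdct` contains time-domain aliasing, which codecs cancel by windowing consecutive blocks and overlapping them by 50%.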
The MDCT itself does not compress data; it simply maps the signal into another domain. The compression comes from quantization. When bits are allocated to the transformed samples, the quantization error over the whole block should be kept as small as possible. Because that error cannot be undone, this is called lossy compression.
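The loss introduced by quantization can be seen with a deliberately simplified uniform quantizer. This is only a sketch: a real MP3 encoder uses a non-uniform quantizer whose step sizes are controlled by the psychoacoustic model, and the names `quantize` and `dequantize` and the sample values are made up for illustration.

```python
def quantize(coeffs, step):
    """Map each coefficient to the index of the nearest multiple of step."""
    return [round(c / step) for c in coeffs]


def dequantize(indices, step):
    """Rebuild approximate coefficients from the quantizer indices."""
    return [q * step for q in indices]


coeffs = [12.7, -3.2, 0.4, 0.05, -0.9]    # made-up transform coefficients
step = 1.0
indices = quantize(coeffs, step)
restored = dequantize(indices, step)
errors = [abs(c - r) for c, r in zip(coeffs, restored)]

print("indices :", indices)
print("restored:", restored)
print("max error:", max(errors))
```

The small integers in `indices` need fewer bits than the original values, but the restored coefficients differ from the originals by up to half a step, and that difference cannot be recovered.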
3 File Format Analysis
MP3 file data is made up of multiple frames, and the frame is the smallest unit of an MP3 file. Each frame, in turn, consists of a frame header, side information, and sound data. The playback time of each frame is about 0.026 seconds, while its length in bytes varies with the bit rate. Some MP3 files also carry extra bytes at the end that hold non-audio descriptive information.
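The frame figures can be checked with a short calculation. The sketch below assumes MPEG-1 Layer III, where each frame carries 1152 samples and the frame length is 144 × bit rate / sample rate plus one optional padding byte. The function names are illustrative.

```python
SAMPLES_PER_FRAME = 1152   # MPEG-1 Layer III (assumption of this sketch)


def frame_duration_seconds(sample_rate_hz):
    """Playback time of one frame; depends only on the sample rate."""
    return SAMPLES_PER_FRAME / sample_rate_hz


def frame_length_bytes(bitrate_bps, sample_rate_hz, padding=0):
    """Size of one frame in bytes, including its 4-byte header."""
    return 144 * bitrate_bps // sample_rate_hz + padding


print(f"duration at 44.1 kHz: {frame_duration_seconds(44100):.4f} s")
for kbps in (64, 128, 320):
    size = frame_length_bytes(kbps * 1000, 44100)
    print(f"{kbps} kbps -> {size} bytes per frame")
```

The duration stays near 0.026 seconds at 44.1 kHz regardless of bit rate, while the byte length grows with the bit rate.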

Principle of mp3 and file format analysis.

1 Introduction
With the rapid development of file compression technology, MP3 has become the most popular music format today. High-quality music, encoded as a sequence of 0s and 1s, spreads rapidly around the world and moves its listeners. What is MP3? Its full name is MPEG Audio Layer 3, an efficient audio coding scheme. It converts audio files into smaller files with the .MP3 extension at a high compression ratio while largely preserving the sound quality of the original. MP3 is part of the ISO/MPEG standard, which describes audio compression using a high-performance perceptual coding scheme. The standard has been updated continuously in pursuit of high quality at low bit rates and now defines three audio coding schemes: MPEG Layer 1, Layer 2, and Layer 3. MPEG Layer 3 reaches compression ratios of 1:10 to 1:12. About 1 MB of MP3 data plays for 1 minute, whereas 1 minute of CD-quality WAV audio (44,100 Hz, 16 bit, two channels, 60 seconds) occupies about 10 MB. By this calculation, a 650 MB disc of MP3 files plays for more than 10 hours, while an audio CD of the same capacity plays for about 70 minutes, an advantage the CD cannot match.
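These figures can be reproduced with a few lines of arithmetic. The sketch below assumes an MP3 bit rate of 128 kbps, which the paragraph above does not state explicitly, and the function names are illustrative.

```python
def pcm_bytes_per_minute(sample_rate_hz=44100, channels=2, bytes_per_sample=2):
    """Size of one minute of uncompressed PCM audio."""
    return sample_rate_hz * channels * bytes_per_sample * 60


def mp3_bytes_per_minute(bitrate_bps):
    """Size of one minute of constant-bit-rate MP3 audio."""
    return bitrate_bps * 60 // 8


wav = pcm_bytes_per_minute()
mp3 = mp3_bytes_per_minute(128_000)        # assumed bit rate
ratio = wav / mp3
disc_hours = 650 * 2**20 / mp3 / 60        # 650 MB disc filled with MP3

print(f"WAV: {wav / 2**20:.2f} MB per minute")
print(f"MP3: {mp3 / 2**20:.2f} MB per minute")
print(f"compression ratio about 1:{ratio:.1f}")
print(f"650 MB disc: about {disc_hours:.1f} hours of MP3")
```

At the assumed bit rate the ratio lands inside the quoted 1:10 to 1:12 range, and the disc holds well over 10 hours.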
2 Analysis of the principle of MP3
2.1 MPEG audio standard
MPEG (Moving Picture Experts Group) is an expert group on moving pictures under ISO. The MPEG standards it develops are widely used in multimedia applications. They include audio and video standards; the audio standards established so far are MPEG-1, MPEG-2, MPEG-2 AAC, and MPEG-4.
The MPEG-1 and MPEG-2 standards use the same family of audio codecs: Layers 1, 2, and 3. One new feature of MPEG-2 is a low-sample-rate extension that reduces the data rate; another is a multichannel extension that increases the number of main channels to 5. The MPEG-2 AAC (MPEG-2 Advanced Audio Coding) standard was released by Fraunhofer IIS and AT&T in 1997 to reduce the data rate significantly. MPEG-2 AAC adopts the MDCT (Modified Discrete Cosine Transform) algorithm, supports sampling frequencies between 8 kHz and 96 kHz, and supports 1 to 48 channels.
The three layers of MPEG Audio (Layers 1, 2, and 3) share the same filter bank, bitstream structure, and header information, and use a sampling frequency of 32 kHz, 44.1 kHz, or 48 kHz. Layer 1 was designed for DCC (Digital Compact Cassette) tape at a data rate of 384 kbps. Layer 2 is a compromise between complexity and performance and lowers the data rate to between 256 kbps and 192 kbps. Layer 3 was designed for low data rates from the start, at 128 kbps to 112 kbps. Layer 3 adds an MDCT stage that makes its frequency resolution 18 times that of Layer 2, and it also uses entropy coding, similar to that in MPEG video, to reduce redundant information. The vast majority of MP3 files use the MPEG-1 standard.
2.2 Purpose of audio compression
Work on the MP3 format began in the mid-1980s, when the Fraunhofer Institute in Erlangen, Germany, set out to encode high-quality sound at low data rates. Consider an example: you want to store a favorite song of about 4 minutes on disk in CD-quality WAV format. The sample rate is 44.1 kHz (44,100 samples per second), the audio is stereo, and each sample is 16 bits (2 bytes), so the song occupies:
44100 × 2 channels × 2 bytes × 60 seconds × 4 minutes = 42,336,000 bytes ≈ 40.4 MB
If you download this song from the Internet at a transmission speed of 56 kbps, the download time is:
(40.4 × 10^6 × 8) / (56 × 10^3 × 60) ≈ 96 minutes
Even a 1 Mbps broadband connection needs more than 5 minutes. Audio compression is therefore particularly important for reducing both the storage space and the transmission time of audio data.
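The storage and download arithmetic above can be sketched as follows; the function names are illustrative. Working from the exact byte count gives roughly 101 minutes at 56 kbps rather than 96, apparently because the text rounds the size to 40.4 MB (counted in units of 2^20 bytes) and then treats it as 40.4 × 10^6 bytes.

```python
def pcm_size_bytes(sample_rate_hz, channels, bytes_per_sample, seconds):
    """Uncompressed PCM size for a recording of the given length."""
    return sample_rate_hz * channels * bytes_per_sample * seconds


def transfer_minutes(size_bytes, link_bps):
    """Ideal transfer time in minutes, ignoring protocol overhead."""
    return size_bytes * 8 / link_bps / 60


song = pcm_size_bytes(44100, 2, 2, 4 * 60)   # 4-minute CD-quality song

print(f"size: {song} bytes = {song / 2**20:.1f} MB")
print(f"56 kbps modem: {transfer_minutes(song, 56_000):.1f} minutes")
print(f"1 Mbps link:   {transfer_minutes(song, 1_000_000):.1f} minutes")
```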
2.3 Encoding and decoding
MP3 audio compression consists of two parts: encoding and decoding. Encoding converts the data in a WAV file into a highly compressed bitstream, and decoding takes the bitstream and reconstructs it into a WAV file.