Digital music recording Part 4

Let’s take a look at the main audio file formats.

Mp3 appeared in 1992. With its high compression ratio and acceptable sound quality, it became extremely popular and is now the de facto standard for storing music files. It is the format used on the portable players so popular with young people. However, in the summer of 2002 mp3 became a paid format for programmers: the right to include support for it in a program cost a license fee of 75 cents per copy, and support for the newer and more advanced mp3 Pro cost $1.25 per copy. Naturally, developers and users were extremely unhappy with this idea. In particular, it made mp3 support impossible on open source operating systems such as the various Linux distributions. Sensing that they had gone too far, the patent owners, the Fraunhofer Institute and Thomson Multimedia, were quick to declare that they had been “misunderstood”, but, as in the old joke, “the spoons were found, but the bad feeling remained.”

The rigid and unsuccessful policy of the patent holders led to a sharp rise of interest in other audio encoding formats. The first of these, of course, is WMA (Windows Media Audio), created by Microsoft. It is based on the successful Voxware Audio Codec 4 technology, originally designed for speech encoding: Voxware 4 files retained 90 percent intelligibility at 64 Kbps, twice that of competing codecs.

The modified Voxware codec became the WMA brand and can record music at 64 Kbps with quality similar to mp3 at 128 Kbps. This means that for the same sound quality, a WMA file is half the size of an mp3 file. Experts believe that music recorded in WMA sounds “cleaner and more alive” than in mp3.

The most interesting and serious rival of mp3 and WMA is the OGG (Ogg Vorbis Audio) format. The project started in 1993 under the name “Squish”. In English, this word has many meanings: jam, nonsense, and whining. It is hard to say exactly what the authors had in mind, but a candy company claimed that Squish was its trademark, and the name had to be changed urgently. No doubt to avoid further coincidences, the new name was chosen to be deliberately unusual: the word “Vorbis” was taken from a Terry Pratchett fantasy novel, and “Ogg” is gamers’ slang meaning roughly “if you have brute force, you don’t need brains!”

OGG is a free and open format. Its codec supports sample rates up to 48 kHz, bit rates up to 512 Kbps, and up to 255 channels, allows text and graphic information to be stored in the file along with the music, and encodes sound at a variable bit rate. Because the stereo channels are encoded together rather than separately, music that sounds the same on both channels is recorded once instead of twice, which makes the file very compact: its compression is 20-50% better than mp3, and its subjective sound quality is higher. The problem with Ogg Vorbis is that the giants of the computer business do not need a strong competitor and do not include support for it in popular operating systems.
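The benefit of encoding stereo channels together can be illustrated with a small sketch. It uses the simple mid/side transform as a stand-in for the idea; this is an assumption made for illustration and is not the channel coupling scheme that Ogg Vorbis itself uses. The function names and sample values are hypothetical.

```python
def to_mid_side(left, right):
    """Convert integer left/right samples into mid (average) and side (difference)."""
    mid = [(l + r) >> 1 for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Exactly restore left/right from mid/side."""
    left, right = [], []
    for m, s in zip(mid, side):
        # (l + r) and (l - r) share the same parity, so the low bit of the
        # side value restores the bit that ">> 1" dropped from the sum.
        total = (m << 1) | (s & 1)
        l = (total + s) >> 1
        left.append(l)
        right.append(l - s)
    return left, right


# Hypothetical samples: both channels carry almost the same music.
left = [100, -200, 300, 50]
right = [100, -201, 300, 53]

mid, side = to_mid_side(left, right)
print(mid, side)  # the side channel holds only small numbers
```

When the two channels are nearly identical, the side values stay close to zero and need far fewer bits than a second full copy of the music, which is the effect the paragraph above describes.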

AAC. The full name is MPEG-2 AAC (Advanced Audio Coding). It was developed by the Fraunhofer Institute and various commercial firms and builds on the same ideas as mp3. AAC was designed from the start to support sample rates up to 96 kHz, and the maximum number of channels was increased from 2 to 48 with future multi-channel formats in mind, such as today’s Dolby Digital. Because it uses more complex algorithms, its encoders are significantly slower than mp3 encoders, and its players require more processor power. The best AAC encoders deliver quality at 96 Kbps that is no worse, and sometimes even better, than mp3 at 128 Kbps.

The AAC format allows the use of steganography techniques to embed so-called watermarks in the recorded sequence: author / artist names, copyright information, etc. Subsequently, the co-authors of the format independently created several versions of it, the most famous of which is Liquid Audio.

Until recently, Liquid Audio was considered the best in terms of playback quality and could claim to be the successor to mp3, but its creator, the Liquid Audio company, pursued an unsuccessful policy in promoting it.

VQF is a method and format developed by the Japanese company NTT and promoted mainly by the Japanese company Yamaha under the name SoundVQ.

Digital music recording Part 3

Time masking is based on the fact that if a quiet sound immediately follows a loud one, the quiet sound can be discarded, because the hearing threshold of the human ear does not recover instantly.

All lossy audio encoding methods work according to the same scheme. First, the sound is divided into frames, from which the masked components are removed. The frames are then encoded using the Huffman method, in which the most frequent values are given the shortest code words and the least frequent, on the contrary, the longest. The difference between the methods lies in the way the sound is analyzed and the masked components are removed.
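The final step of that scheme can be sketched in a few lines. This is a minimal, generic Huffman coder applied to a made-up list of quantized values. It is not the code table of any particular format, and the name `huffman_code` and the sample data are assumptions for illustration.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Return {symbol: bit string}; frequent symbols get shorter codes."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code built so far}).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)  # two least frequent groups
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


# Hypothetical quantized frame: value 0 is common, values 2 and 3 are rare.
quantized = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 2
codes = huffman_code(quantized)
encoded = "".join(codes[v] for v in quantized)
print(codes, len(encoded), "bits")
```

With a fixed two bits per value these 16 values would take 32 bits; the variable-length codes take 28, and the saving grows as the distribution becomes more uneven.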

Lossless compression algorithms are relatively rare, although they have indisputable advantages. The point is that any loss spoils the sound. It is one thing to listen to the pop song “And you kiss me everywhere …” through cheap plastic “Cheburashka” speakers while working at a computer, and quite another to play symphonic music on serious equipment. Moreover, even a professional can hardly say exactly what the sound lost during encoding, and has to fall back on vague terms such as “color”, “transparency”, and “juiciness”.

There are many algorithms for compressing audio files, and consequently many file formats. For example, the audio formats used for PC games, audio players, and Internet downloads are all different. The general rule of thumb is that high bit rate files have relatively high audio quality and large size, while low bit rate files are compact but can be called music only as a courtesy.

Additionally, various audio file formats have been created for various computing platforms such as PC, Macintosh, Amiga, and others.

Digital music recording

In 1900, at the World’s Fair in Paris, the Danish engineer V. Poulsen demonstrated a working model of a magnetic recording apparatus, created as an alternative to Edison’s invention.

For the first time in history, a human voice sounded from a magnetic recording: the astonished Parisians heard the voice of the Austro-Hungarian Emperor Franz Joseph breaking through the whistling noise. This moment was perhaps the true beginning of the history of sound recording, the theory of which was created in the 1930s.

Sound is a complex analog signal. Such signals are analyzed with a technique widely used in radio electronics: using the Fourier transform, a complex signal is decomposed into a harmonic series of sinusoids with different frequencies and amplitudes. In practice, of course, the signal we are dealing with is very different from a single sinusoid.

Musicians call the first harmonic in this spectrum the fundamental tone, and the harmonics with higher frequencies overtones. The fundamental tone determines the pitch, and the overtones give it a certain color, creating the timbre of a voice or musical instrument.
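To make the decomposition concrete, here is a minimal sketch that builds a test tone from a fundamental and two weaker higher harmonics, then recovers them with a naive discrete Fourier transform. The sample rate, frequencies, and amplitudes are illustrative assumptions, and a real analyzer would use a fast Fourier transform rather than this slow direct sum.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Naive DFT: amplitude of each frequency bin below half the sample rate."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        # Scaled so a sine of amplitude 1.0 reads 1.0 (the k = 0 bin would
        # need half this scale; it is zero for this signal).
        magnitudes.append(2 * abs(acc) / n)
    return magnitudes


SAMPLE_RATE = 4000  # Hz, illustrative
N = 400             # 0.1 s of signal, so bins are 10 Hz apart

# Hypothetical "instrument": 220 Hz fundamental plus two weaker harmonics.
partials = [(220, 1.0), (440, 0.5), (660, 0.25)]
signal = [sum(a * math.sin(2 * math.pi * f * t / SAMPLE_RATE) for f, a in partials)
          for t in range(N)]

spectrum = dft_magnitudes(signal)
peaks = [(k * SAMPLE_RATE / N, round(m, 3))
         for k, m in enumerate(spectrum) if m > 0.1]
print(peaks)  # (frequency in Hz, amplitude) for each detected harmonic
```

The lowest peak is the fundamental tone that sets the pitch; changing the relative amplitudes of the higher peaks changes the timbre without changing the note.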

To study the spectra of audio signals, complex and expensive instruments are used – spectrum analyzers.

With the help of such devices, it can be established that some musical instruments, for example the violin, have a relatively uniform spectrum, while some wind instruments have spectra with pronounced maxima and minima; the maxima are called formants.

There are no terms that directly describe the coloring of the timbre of a human voice or of musical instruments, so it is necessary to resort to various metaphors such as “deep timbre”, “hard timbre”, “metallic” sound or even “transistor”.

Attempts to apply digital information processing to sound recording were made many times, but the first serious results came in the early 1980s, coinciding with the rapid development of computers and the successful microminiaturization of radio components. Digital sound processing opened up exciting new possibilities.

To process sound on a computer, it must first be converted to a digital, encoded format. An analog signal is encoded by devices called analog-to-digital converters (ADCs). The main method of encoding an analog signal is pulse code modulation, which consists of three operations: sampling, quantizing, and encoding.
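The three operations can be sketched in software, with the caveat that a real ADC performs them in hardware. The function name `pcm_encode_16bit`, the 440 Hz test tone, and the 8000 Hz sample rate are assumptions chosen for illustration.

```python
import math
import struct


def pcm_encode_16bit(signal, num_samples, sample_rate):
    """Encode signal(t), with values in [-1.0, 1.0], as 16-bit PCM bytes."""
    max_level = 2 ** 15 - 1  # 32767, largest signed 16-bit value

    # 1. Sampling: measure the signal at evenly spaced moments in time.
    samples = [signal(i / sample_rate) for i in range(num_samples)]

    # 2. Quantizing: round each measurement to the nearest integer level.
    levels = [max(-max_level, min(max_level, round(s * max_level)))
              for s in samples]

    # 3. Encoding: write each level as a little-endian signed 16-bit number.
    return struct.pack(f"<{num_samples}h", *levels)


def tone(t):
    """Hypothetical analog input: a 440 Hz sine wave."""
    return math.sin(2 * math.pi * 440 * t)


pcm = pcm_encode_16bit(tone, num_samples=80, sample_rate=8000)
print(len(pcm), "bytes for", 80, "samples")
```

Each sample occupies two bytes here, which is why the sample rate and bit depth together determine the size of the result.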

We won’t go into coding theory now, especially since it’s quite complicated and requires higher math skills. It is important for us to understand that the quality of the digitized sound and the resulting file size depend on the sample rate and bit depth.

The sample rate is the frequency at which the audio signal is measured. It follows from Kotelnikov’s sampling theorem (known in English-language literature as the Nyquist-Shannon theorem) that to obtain an undistorted digital signal, the sampling frequency must be at least twice the highest frequency of the encoded signal. Since human hearing extends to about 20 kHz, the sample rate for audio must be at least 40 kHz. In digital communication systems, the sampling frequency is 32 kHz; in laser CD players and consumer digital tape recorders, 44.1 kHz. In digital studio equipment, the sample rate is even higher: 48 kHz.
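What goes wrong below the limit set by the sampling theorem can be shown with a short sketch. A tone above half the sample rate produces exactly the same samples as a lower "alias" tone, so the two cannot be told apart after digitizing. The helper names and the 25 kHz example are illustrative assumptions.

```python
import math


def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency that a sampled pure tone appears to have after sampling."""
    return abs(freq_hz - sample_rate_hz * round(freq_hz / sample_rate_hz))


def sample_cosine(freq_hz, sample_rate_hz, count):
    """Sample a unit cosine of the given frequency."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(count)]


RATE = 40_000  # Hz: tones up to 20 kHz survive intact

print(alias_frequency(15_000, RATE))  # 15000: below the limit, unchanged
print(alias_frequency(25_000, RATE))  # 15000: above the limit, folded down

# The 25 kHz tone yields the same samples as a 15 kHz tone.
high = sample_cosine(25_000, RATE, 16)
low = sample_cosine(15_000, RATE, 16)
print(max(abs(a - b) for a, b in zip(high, low)))  # close to zero
```

This is why frequencies above half the sample rate have to be filtered out before sampling rather than afterwards.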

The bit depth of the recorded sound is the number of bits allocated to store each measured value of the signal amplitude. Modern sound cards use 8 or 16 bits per sample, and higher quality 32-bit cards are available. The higher the bit depth, the higher the quality of the digitized sound.

As already mentioned, the size of an audio file depends on the sample rate and bit depth of the sound. With a sample rate of 44.1 kHz and a bit depth of 16 bits, one minute of single-channel sound requires about 5.3 MB, and with a sample rate of 11 kHz and 8 bits, about 660 KB.
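These figures can be checked with a few lines of arithmetic. The sketch below assumes a single channel, the standard rates of 44,100 Hz and 11,025 Hz for the rounded "44 kHz" and "11 kHz", and no file header; the function name is hypothetical.

```python
def pcm_size_bytes(sample_rate_hz, bits_per_sample, seconds, channels=1):
    """Size of uncompressed PCM audio in bytes, ignoring any file header."""
    return sample_rate_hz * bits_per_sample * channels * seconds // 8


one_minute_cd_mono = pcm_size_bytes(44_100, 16, 60)     # 5,292,000 bytes
one_minute_low_quality = pcm_size_bytes(11_025, 8, 60)  # 661,500 bytes
one_minute_cd_stereo = pcm_size_bytes(44_100, 16, 60, channels=2)

print(one_minute_cd_mono / 1_000_000, "MB")    # about 5.3
print(one_minute_low_quality / 1_000, "KB")    # about 660
print(one_minute_cd_stereo / 1_000_000, "MB")  # about 10.6
```

A stereo recording doubles the result, so a minute of CD-quality stereo takes roughly 10.6 MB.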

It is clear that such a waste of disk space turned out to be unacceptable, and special algorithms and formats were created for cheaper storage of audio files.

When comparing different compression formats, the parameter “sound quality at a certain bit rate” is often used.

Bit rate is a parameter that indicates how much disk space is used to store 1 second of music. For example, a bit rate of 128 Kbps means that a three-minute song will occupy about 2.8 MB.
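The same kind of check works for compressed files. This sketch assumes that "Kbps" means 1000 bits per second and ignores tags and container overhead; the function name is hypothetical.

```python
def compressed_size_bytes(bit_rate_kbps, seconds):
    """Approximate size in bytes of audio stored at a constant bit rate."""
    return bit_rate_kbps * 1000 * seconds // 8


three_minutes = 3 * 60
size_128 = compressed_size_bytes(128, three_minutes)  # 2,880,000 bytes
size_64 = compressed_size_bytes(64, three_minutes)    # 1,440,000 bytes

print(size_128 / 1_000_000, "MB at 128 Kbps")
print(size_64 / 1_000_000, "MB at 64 Kbps")
```

Halving the bit rate halves the file, which is the basis for comparing formats by sound quality at a given bit rate.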