Digital audio


Digital audio is a representation of analog sound that computers and other digital devices use to record and reproduce audio information. Like the frames of a movie, a digital audio signal is built from a series of sound fragments that are played back in sequence when we press the play button. There are many digital audio formats, and they differ in how faithfully they convey the audio information.

About Pulse Code Modulation – PCM

When we talk about acoustic sound or an analog signal, we are talking about sound waves propagating through space. Digital audio, by contrast, is only an approximate description of what happens, or should happen, to sound inside computer programs and digital devices.

This article discusses pulse code modulation (PCM), the most common digital audio encoding system. Besides PCM, there are also the DTS and Dolby Digital systems, but these are used mainly in film and video production, so we will not cover them here.

In pulse code modulation, the signal is read (sampled) many times per second. At each reading, the amplitude of the sound wave at that instant is recorded. As mentioned above, a digital signal is only an approximate copy of an analog signal, since an analog wave cannot be recreated with perfect precision. The value of each fragment is rounded to the nearest level that can be stored; on playback, all the fragments are played in order and we hear a copy of the original analog sound.
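
The reading and rounding steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pcm_encode`, the 1 kHz test tone and the deliberately small 8 kHz / 8-bit settings are hypothetical choices made so the output stays short, not values taken from any particular device.

```python
import math

def pcm_encode(signal, n_samples, sample_rate, bit_depth):
    """Read `signal` n_samples times and round each reading to an integer level."""
    max_level = 2 ** (bit_depth - 1) - 1       # largest signed value, e.g. 127 for 8 bits
    samples = []
    for n in range(n_samples):
        t = n / sample_rate                    # moment of this reading, in seconds
        amplitude = signal(t)                  # "analog" value between -1.0 and 1.0
        samples.append(round(amplitude * max_level))  # rounding = quantization
    return samples

def tone(t):
    # Hypothetical analog input: a pure 1 kHz sine wave
    return math.sin(2 * math.pi * 1000 * t)

samples = pcm_encode(tone, n_samples=8, sample_rate=8000, bit_depth=8)
print(samples)  # [0, 90, 127, 90, 0, -90, -127, -90]
```

Each number is one fragment of the signal. The true value at the second reading is about 89.8, which is stored as 90; that small rounding error is the price of turning a continuous wave into numbers.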

“What values are we talking about?” you may ask. Just as analog audio is defined by frequency and amplitude, digital audio is determined by two important values: the sample rate and the bit depth. The sample rate is the number of times per second that the audio signal is read, and the bit depth is the number of bits used to store each reading, which sets the dynamic range each fragment can represent.

Sampling rate

The standard 44.1 kHz sample rate used for recording audio to CDs (remember those?) might seem like a random number, but it is not. The value was chosen based on Kotelnikov’s theorem (also known as the Nyquist-Shannon sampling theorem), which essentially states that the sampling frequency must be more than twice the highest frequency in the signal being recorded. The upper limit of human hearing is about 20 kHz, so the sampling frequency must be higher than 40 kHz. The additional 4.1 kHz provides a safety margin against the distortion known as aliasing. In theory, 44.1 kHz is sufficient to reproduce an audio signal accurately, but higher values are also in use.
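
To make the aliasing effect concrete, the sketch below computes the frequency that a pure tone appears to have once it has been sampled. The function name `alias_frequency` is a hypothetical helper, and the model assumes an ideal sampler with no anti-aliasing filter in front of it.

```python
def alias_frequency(f_signal, sample_rate):
    """Apparent frequency (Hz) of a pure tone after ideal sampling."""
    f = f_signal % sample_rate        # sampling cannot tell f from f + k * sample_rate
    return min(f, sample_rate - f)    # fold into the range 0 .. sample_rate / 2

print(alias_frequency(20_000, 44_100))  # 20000: below half the sample rate, preserved
print(alias_frequency(25_000, 44_100))  # 19100: folds back into the audible range
```

A 25 kHz tone is inaudible on its own, but sampled at 44.1 kHz it would show up as a 19.1 kHz tone that was never in the original sound. This is why content above half the sample rate has to be removed before sampling.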

For example, 48 kHz is the dominant standard in film and video production, where sound must be synchronized with a picture running at 24 frames per second. We won’t go into the details of why exactly 24 frames per second was chosen; in short, it is roughly the minimum rate at which we see a smooth, eye-pleasing image. The sample rate has to line up with this frame rate: 48,000 samples per second divide into exactly 2,000 samples per frame, whereas 44,100 samples per second give 1,837.5. Mixing the two rates without conversion can put picture and sound noticeably out of sync.
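
One way to compare the two sample rates against a 24 fps picture is to count how many audio samples fall within a single frame. The helper `samples_per_frame` below is a hypothetical name used only for this sketch.

```python
from fractions import Fraction

def samples_per_frame(sample_rate, fps):
    """Exact number of audio samples that span one picture frame."""
    return Fraction(sample_rate, fps)

for rate in (44_100, 48_000):
    spf = samples_per_frame(rate, 24)
    kind = "whole number" if spf.denominator == 1 else "fractional"
    print(rate, float(spf), kind)
# 44100 1837.5 fractional
# 48000 2000.0 whole number
```

At 48 kHz every frame boundary coincides with a sample boundary, which is likely one practical reason the rate is convenient for picture work.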

Even higher sample rates are derived from these two base frequencies of 44.1 and 48 kHz by multiplying them by powers of 2. That is why 88.2, 96 and 192 kHz are standard sample rates on modern audio equipment.

Bit depth

The bit depth of an audio file tells us about its dynamic resolution or, more simply, its clarity. You can draw an analogy with digital photography: the higher the resolution of the photo, the clearer and better the image will be.

It is important to note that we are not talking about the loudness of the signal, but about a more realistic, clean and clear sound, that is, a more accurate rendering of the audio signal.

Bit depth can also be compared to the text in a book. The lower the bit depth, the less sense the text makes: letters begin to disappear from words and punctuation marks from sentences. At first we can still grasp the meaning, but if the bit depth keeps decreasing, the information becomes so distorted that we simply stop understanding what the text is about. The same goes for sound: the lower the bit depth, the more distorted the sound we hear.
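
The effect of bit depth can be put into numbers. The sketch below counts how many distinct levels each sample can take and estimates the resulting dynamic range with the common rule of thumb of roughly 6 dB per bit. The function names are hypothetical, and the formula is the simple ratio between full scale and one quantization step, not a full signal-to-noise analysis.

```python
import math

def quantization_levels(bit_depth):
    """Number of distinct values a single sample can take."""
    return 2 ** bit_depth

def dynamic_range_db(bit_depth):
    """Ratio between full scale and one quantization step, in decibels."""
    return 20 * math.log10(2 ** bit_depth)

for bits in (8, 16, 24):
    print(bits, quantization_levels(bits), round(dynamic_range_db(bits), 1))
# 8 256 48.2
# 16 65536 96.3
# 24 16777216 144.5
```

Going from 8 to 16 bits does not make the sound louder; it makes the steps between neighbouring levels 256 times finer.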


MP3: the digital audio revolution

Perhaps not many people know that 1992 marked the start of a silent and unstoppable revolution in digital audio for the masses, which until then had essentially meant CD-Audio. That was the year in which the algorithm underlying the MP3 format was born at the Fraunhofer-Institut für Integrierte Schaltungen (IIS).

Part of a European research project called EUREKA, which started in 1987 and ended in 1994, MPEG-1 Layer 3, as it was then known, was one of the most important and mature fruits in the field of psychoacoustic compression algorithms. This family of algorithms, whose first studies were carried out in 1979 by Manfred R. Schroeder, a German physicist at AT&T Bell Labs, aims to reduce the amount of information needed to describe an audio sequence, starting from the assumption that the human ear, fortunately for us, is not perfect. The basic idea is to exploit the inability of the human auditory system to recognize certain sounds and frequencies when they are masked by others.

Auditory masking takes two forms: frequency masking and temporal masking. To explain the principle quickly, let’s take an example: when two tones are present, our ears will recognize both or only one of them, depending on their frequency and intensity.

In the latter case we have frequency masking, and the information related to the less audible tone can be discarded. What happens, however, if the more intense tone stops? The tone that was not noticed before returns to the foreground. For the auditory system to notice it, though, some time must pass, because the membrane needs to stop vibrating and readjust.

We are speaking, of course, of times on the order of milliseconds, but they are precious: any sound that falls within this interval is cut by the compression algorithm and so helps reduce the amount of information needed to describe what is audible.

The first MP3 encoder, called l3enc, was released by the Fraunhofer Society on July 7, 1994, while the .mp3 file extension was officially chosen on July 14 of the following year.

Those who lived through this time know that we are talking about years in which ADSL did not exist and hard drives were a few hundred MB in size; in general, for both communications and data storage, the figures were far from being as generous as they are today. With these limitations in mind, recall that an uncompressed audio file in PCM WAV format at 44.1 kHz, 16 bits, stereo, as required by the CD-Audio standard, has a bit rate of 1411.2 kbit/s. This means that if you rip a song from an audio CD to your hard drive, uncompressed WAV takes up approximately 10 MB per minute. Today that space would hardly be a problem, but in the mid-nineties it was a notable limitation.
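
The figures above follow directly from the three CD-Audio parameters. A minimal sketch of the arithmetic, with `pcm_bit_rate` as a hypothetical helper name and 1 MB taken as 1,000,000 bytes:

```python
def pcm_bit_rate(sample_rate, bit_depth, channels):
    """Bits per second of uncompressed PCM audio."""
    return sample_rate * bit_depth * channels

cd_bit_rate = pcm_bit_rate(44_100, 16, 2)
print(cd_bit_rate / 1000)                 # 1411.2 (kbit/s)

# One minute of audio, converted from bits to megabytes
mb_per_minute = cd_bit_rate * 60 / 8 / 1_000_000
print(round(mb_per_minute, 1))            # 10.6
```

The exact result is about 10.6 MB per minute, which the text rounds to 10 MB.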

The compactness of the MP3 format combined with its more than acceptable quality (a very optimistic estimate is that a bit rate of 128 kbit/s gives quality comparable to CD-Audio) made it, within a few years, the vehicle par excellence for distributing music. The milestones that contributed to this unstoppable technological success were the launch of the Winamp player by Nullsoft in 1997 and, just one year later, the arrival on the market of the first portable media players: the MPMan F10 from Eiger Labs and the Rio PMP300 from Diamond Multimedia.

Finally, it is impossible not to mention the birth of peer-to-peer networks for exchanging MP3 files, starting with Napster, one of the most famous applications in history, both for the innovative service it made accessible and for the inevitable court cases that followed and led to its closure in 2001.

In the same year, another symbol of the multimedia revolution, and a product of the same technological horizon opened up by the MP3 format, appeared on the market: the Apple iPod.

Since then, alongside the birth of new and more efficient compression formats, we have seen increasingly evident examples of the revolution, social and commercial as well as technical, that the arrival of the MP3 format set in motion.

There was a time when playlists were decided exclusively by record companies, which mixed greatest hits with mediocre songs to fill out albums. Today you can create your own playlist, choosing the songs and their playing order without any difficulty.

DIGITAL AUDIO explained

Audio is the electronic information that represents sound; more precisely, since sound unfolds over time, audio is the flow of information that represents it.

Sound is made up of pressure waves traveling through space, so it can be represented as a waveform, the simplest of which is a sinusoid.

The characteristics of a sound are:

Pitch: measured in hertz (Hz) and determined by the frequency of a sound. The higher the frequency, the higher the pitch; the lower the frequency, the lower the pitch.

Intensity: measured in decibels (dB) and determined by the power of a sound. The more intense a sound is, the greater its volume.

Duration: measured in seconds (s); it determines how long a sound lasts over time.

Timbre: not directly measurable, but it is the parameter that allows us to distinguish a trumpet from a drum. It is the fingerprint of a sound and is characterized by its harmonics.

ANALOGUE AND DIGITAL

There are two different ways of representing sound as electronic information: analog and digital.

Analog audio was the first, in chronological order, to be developed.

An analog signal varies in a way analogous to the sound it represents and can (in theory) assume any value.

If we zoomed far into the waveform that describes an analog sound, we would see a continuous line without interruptions.

Digital audio, instead, is encoded with a number system. This requires discretization (the transition from analog to digital), and some information is lost during this step. Once the sound is written as a series of numbers (digital information), however, it can be reproduced, transmitted and modified without losing anything in terms of quality, which is impossible with analog information.

If we zoomed far into the waveform that represents a digital sound, we would see that it is not a continuous line as in the previous case, but a series of points very close to each other.

The number of these points in one second of audio defines the “sampling frequency”.

The amount of information that each point can contain is called the “bit depth”.

THE CHARACTERISTICS OF DIGITAL SOUND

Sampling rate

Determines the number of samples contained in one second of audio.

It is expressed in hertz (Hz) and, in the musical field, generally takes one of the following values: 22,050 Hz, 44,100 Hz or 96,000 Hz.

According to Nyquist’s theorem, a given sampling frequency can record and reproduce sounds only up to a maximum frequency equal to half the sampling frequency. This means that a piece sampled at 44.1 kHz can contain frequencies only up to 22.05 kHz.

Bit depth

Determines the amount of information contained in each sample.

It is expressed in bits and, in the musical field, generally takes one of the following values: 8, 16 or 24 bits.

More than any other, this is the parameter on which the quality of a sound depends.

Transmission rate (bit rate)

It is a characteristic of codecs, that is, of the “machine language” used to describe a sound.

It specifies the total amount of information needed to play one second of sound.

It is expressed in bit/s.

AUDIO PROCESSING

Whether you’re talking about studio recording or live performance, the audio signal is never sent directly from the microphone to the speakers or recording medium. It is always processed first, using tools that let you shape the sound in different ways.

These tools can be analog, in which case the physical unit sits in the studio (usually mounted in a rack) and must be connected between the microphone and the mixer, or between the mixer and the speakers or recording medium.

Alternatively, you can simulate them with plugins on your computer.

For this you need a DAW (Digital Audio Workstation), the workspace in which all editing operations are performed (Ableton, Cubase, FL Studio (formerly FruityLoops), Logic, Reaper).

Within this software you can install smaller programs, called VST (Virtual Studio Technology) plugins, that simulate the circuits of studio equipment and emulate their effect.

(There are also other proprietary plugin formats with extensions other than the classic VST, such as .component or .au.)

Some tools are essential and are used in all audio recordings, others are used only in particular situations or to obtain / avoid certain effects.

The main ones are:

Equalizer: used to emphasize or attenuate certain frequencies. This gives a cleaner sound and a less muddy mix in which each instrument occupies only its proper frequencies, without overlapping.

Compressor: as the name suggests, it compresses the dynamic range, so that the sound is more consistent and less uneven.

Amplifier: available in different kinds, it is used to increase the intensity of a sound.

Limiter: works in a similar way to the compressor, but instead of compressing the whole signal it attenuates only the parts that exceed a predetermined threshold, which prevents distortion (clipping).

Reverb: adds a slight reverberation that makes a sound recorded in a soundproofed studio, which would otherwise be too “dry”, much more natural.

Filters (high / low cut): let you cut superfluous frequencies that are too low or too high. (They are essentially 1-band parametric equalizers.)
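
The difference between the compressor and the limiter in the list above is easiest to see on signal levels expressed in decibels. The sketch below shows only the static input/output curve of each tool. The function names, the -20 dB threshold, the 4:1 ratio and the -1 dB ceiling are all illustrative assumptions; real units also apply attack and release times, which are left out here.

```python
def compress_db(level_db, threshold_db=-20.0, ratio=4.0):
    """Compressor curve: above the threshold, every `ratio` dB in gives 1 dB out."""
    if level_db <= threshold_db:
        return level_db                              # quiet parts pass unchanged
    return threshold_db + (level_db - threshold_db) / ratio

def limit_db(level_db, ceiling_db=-1.0):
    """Limiter curve: nothing is allowed above the ceiling."""
    return min(level_db, ceiling_db)

print(compress_db(-30.0))  # -30.0: below the threshold, untouched
print(compress_db(-8.0))   # -17.0: 12 dB over the threshold becomes 3 dB over
print(limit_db(3.0))       # -1.0: held at the ceiling
```

The compressor narrows the gap between loud and quiet passages gradually, while the limiter acts as a hard stop for peaks.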

Digital audio formats on the network

WAV: Waveform files (or simply wave files) are the most common sound format on Windows platforms. WAV files can also be played on Mac and other systems with player software.

MPEG (MP3): The Moving Picture Experts Group (MPEG) format is a standard format with significant compression capability. MPEG Layer 3, or MP3, files are frequently used for web music distribution. However, due to their size, MPEG files must be downloaded completely before playing them.

RealAudio (.rm): Real Audio is the technology that currently predominates on the Web. You need a proprietary player, but the basic versions of the player are available for free.

MIDI: The Musical Instrument Digital Interface format is not a digital audio format. It represents notes and other information so that music can be synthesized. MIDI has good support and its files are very small, but it is only useful for certain applications because of the quality of its sound when played on PC hardware.

AU: The µ-law format is one of the oldest sound formats on the Internet. Players are available for almost all platforms.

RMF: The Rich Music Format supported by Beatnik (www.beatnik.com) is a high quality audio format, primarily for “download-and-play”, which is becoming increasingly popular.

AIFF: The Audio Interchange File Format is very common on Macs. It is widely used in multimedia applications, but it is not very common on the Web.

FLAC: The Free Lossless Audio Codec (FLAC) is a lossless audio compression format associated with the Ogg project. The original file can be reconstructed completely, with the disadvantage that the file occupies much more space than it would with lossy compression.
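
As a small practical illustration of the first format in the list, uncompressed PCM audio can be written as WAV data with the `wave` module from Python's standard library. This is a sketch under stated assumptions: the helper name `sine_wav_bytes`, the 440 Hz tone and the choice of 16-bit mono are arbitrary, and the data is kept in memory rather than saved to a file.

```python
import io
import math
import struct
import wave

def sine_wav_bytes(freq_hz=440.0, n_frames=22_050, sample_rate=44_100):
    """Build an in-memory 16-bit mono WAV file containing a sine tone."""
    frames = bytearray()
    for n in range(n_frames):
        value = int(32767 * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        frames += struct.pack("<h", value)   # one little-endian 16-bit sample
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)                  # mono
        wav.setsampwidth(2)                  # 2 bytes per sample = 16 bits
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return buffer.getvalue()

data = sine_wav_bytes()
print(data[:4], len(data))  # the data starts with the b'RIFF' marker
```

Writing `data` to a file with a .wav extension should give half a second of tone that ordinary player software can open.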

Digital audio on the network:

Digital sound is characterized by its sampling frequency, that is, how many times the sound is digitized over a given period of time. Sampling frequencies are stated in kilohertz (kHz), which indicate how many thousands of times the sound is sampled per second. CD sound quality is obtained at 44.1 kHz, or 44,100 samples per second. For stereo sound, two channels are required. At 16 bits per sample, each channel amounts to 705,600 bits of data per second, which produces the high-quality sound that end users expect.

In practice, transmitting this amount of data would occupy almost half the bandwidth of a T1 line. As the average Web user does not have this bandwidth, another solution is necessary. One possible solution is to lower the sampling rate when digital sound is created for delivery over the Web. A sampling frequency of 8 kHz, in mono, produces acceptable results for simple applications such as speech, especially considering that the playback hardware generally consists of a simple sound card and a small speaker. Low-quality audio of this kind requires no more than 64,000 bits of data per second, but the end user still has to wait for the sound to download. Modem users need several seconds to receive, even in the best conditions, a single second of low-quality sound, which makes continuous playback impossible.
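
The bandwidth figures in this section can be checked with a short calculation. In the sketch below, the T1 rate of 1.544 Mbit/s is the standard figure for that line type, while the 28.8 kbit/s modem speed and the 8-bit sample size for the low-quality case are assumptions chosen for illustration; the helper names are hypothetical.

```python
def pcm_bit_rate(sample_rate, bit_depth, channels):
    """Bits per second of uncompressed PCM audio."""
    return sample_rate * bit_depth * channels

def transfer_seconds(audio_seconds, bit_rate, link_bits_per_s):
    """Time needed to send `audio_seconds` of audio over a link."""
    return audio_seconds * bit_rate / link_bits_per_s

T1_BITS_PER_S = 1_544_000      # T1 line
MODEM_BITS_PER_S = 28_800      # assumed dial-up modem speed

cd_channel = pcm_bit_rate(44_100, 16, 1)     # one CD-quality channel
speech = pcm_bit_rate(8_000, 8, 1)           # 8 kHz mono, assumed 8-bit samples

print(cd_channel, round(cd_channel / T1_BITS_PER_S, 2))          # 705600 0.46
print(speech)                                                    # 64000
print(round(transfer_seconds(1, speech, MODEM_BITS_PER_S), 2))   # 2.22
```

Under these assumptions, one second of low-quality speech takes more than two seconds to arrive, so it cannot be played continuously while it downloads.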