audio normalization for videos Archives

What is an audio normalizer?

Free Download Mp4Gain

How can an audio normalizer help?

Mp4Gain is a tool that will adjust the volume of an mp3 file so that the loudest and softest parts of the sound are more balanced.

The main advantage of an audio normalizer is that it can make a song louder without clipping or distorting it. It achieves this by raising the level of the softer sounds while keeping the loudest ones in check, so the gap between them becomes smaller.

An audio enhancer is a similar tool, but instead of balancing out the volume, it increases certain frequencies to make a song clearer and more pleasant to listen to.

Normalizing the volume of audio files is crucial for many reasons: it makes listening to music more pleasant, it increases the clarity of speech, and it can even help you sleep better.

Mp4Gain is an audio normalizer and volume booster. It can be used to automatically adjust the volume of all your music files so that they are at the same level.

Mp4Gain is an easy-to-use tool for adjusting the volume of all your music files to a uniform level. It requires no technical knowledge: just drag and drop your music files into the program window and click “Normalize”.

Mp4Gain is an audio normalizer that can help you increase the volume of mp3 files. It can also be used as a volume booster or audio enhancer.

Let’s talk about “musical dynamics” and “musical loudness” Part 2

The two brief examples in Part 1 show that frequency content, sound pressure, and sound duration all affect how people perceive loudness.

That is why it is said that “loudness is a matter of subjective human perception”.

Since loudness is a subjective perception, how can it be quantified?

To quantify “loudness”, the first thing to look at is the relationship between frequency and loudness as perceived by the human ear. There are two pictures below; you can study them carefully for reference:

Looking at the two images above, you will clearly see that the human ear and brain do not respond equally to all frequencies. We will not go into that here. As the basis for quantifying loudness, see the second image, which uses a unit called the “phon”. The phon is one attempt to quantify loudness. Take a 1 kHz signal as an example: a 1 kHz tone at a sound pressure level of 40 dB has a loudness level of 40 phon. Building on this, there is another unit called the sone, where 1 sone corresponds to 40 phon. Both are units that attempt to quantify loudness.
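As a rough illustration of how the two units relate, the following Python sketch uses the commonly quoted rule that loudness in sones doubles for every 10 phon above 40 phon. That rule is not stated in this article and is only an approximation for levels of about 40 phon and up; the function names are made up for this example.

```python
import math

def phon_to_sone(phon: float) -> float:
    """Approximate loudness in sones for a loudness level in phon.

    Assumption: the usual rule of thumb (doubling every 10 phon),
    which is only meant for levels of roughly 40 phon and above.
    """
    return 2.0 ** ((phon - 40.0) / 10.0)

def sone_to_phon(sone: float) -> float:
    """Inverse of phon_to_sone under the same assumption."""
    return 40.0 + 10.0 * math.log2(sone)

for level in (40, 50, 60, 70):
    # 40 phon -> 1.0 sone, and each extra 10 phon doubles the value
    print(level, "phon ->", phon_to_sone(level), "sone")
```

Read this way, a sound at 60 phon is heard as about four times as loud as one at 40 phon, even though the number only went up by half.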

International organizations such as the ITU and the EBU take all the factors that affect loudness perception into account together: the characteristics of the human ear, the psychoacoustics of the human brain, and so on. These factors are worked through complicated mathematical calculations, and a reasonable loudness range for “sound reproduction” is defined and standardized only after statistically significant results have been obtained. Those interested can search for “ITU-R BS.1770” and “ITU-R BS.1771”.

Should the rules be followed?

Of course they should! In fact, this is a real problem in Taiwan. Leaving music aside, just on cable TV (the “fourth channel”) and MOD, the sound level differs from channel to channel! The scariest thing is switching from a movie channel to a shopping channel and being startled, yet again, by the sudden volume of the shopping channel. Even radio shows have this problem.

Here, you can search Google for “Loudness War” (also called the “Volume War”). All of this is commonplace. The main purpose of this article is to introduce the definition and specification of loudness.

Effects of loudness specifications

Although the ITU, EBU, ISO, ANSI and other organizations have introduced loudness specifications, the major music and video streaming platforms still have their own standards. However, the standards of the main platforms stay close to those specifications and do not deviate far from them. In the audiovisual industry, this generally affects the following:

Music streaming platforms: Records must meet loudness specifications at time of release

Video streaming platforms: Loudness specifications must be met when movies are released

Let’s talk about “musical dynamics” and “musical loudness”

Where does the music we listen to come from?

Before we get to that, it is worth quickly going over how a record is made. In principle, the process can be divided into three stages: early, middle and late.

Early stage: composing, arranging

Middle stage: recording, mixing

Late stage: mastering, distribution, marketing

Whether a piece of music is good or not can already be determined at the arrangement stage. Then comes the recording. This can mean hiring real musicians to record real instruments, or producing the parts required by the arrangement with software instruments. Then a singer or singers are brought in to sing… and so on. This process is called recording.

The “balance” of a song is not only a matter of balancing the melody in the arrangement. It also relies on the mix to make the recorded elements blend harmoniously, both to the ear and across the frequency spectrum. How this is coordinated usually depends on where the track is headed or what the producer wants. After all, the purpose of a song or album is to become a commercial release, so the necessary post-production and polishing have to be done.

Mastering is usually done last. After the overall timbre of the album has been set, the volume adjusted and minor flaws fixed, the final master is delivered according to the loudness specifications required by each streaming platform.

Quantify the volume and intensity of what we hear

Sometimes people equate volume with loudness. Actually the two are different, although they influence each other.

Volume can be quantified: in simple terms, it is what we usually express in decibels (dB). Loudness, on the other hand, is subjectively perceived. How so? Play different musical signals at the same 75 decibels, and each listener will have a different impression of how loud they are.

This is because loudness is related to three things: frequency content, duration, and sound pressure.

Suppose we play a 1000 Hz test signal for three minutes at a sound pressure of 80 decibels. Your perception of how loud this signal is will be very different from hearing it for only 10 or 30 seconds.

Take two singers as another example. One has a timbre that is more prominent in the mid-high frequency band, and the other is more prominent in the mid-low band. If they sing the same song in the same key at a similar sound pressure, the voice that is stronger in the mid-high band will generally be perceived as louder.

Loudness normalization

When you have a lot of mp3 files, sooner or later you find yourself looking for loudness normalization.

What usually happens is that we have mp3s (although Mp4Gain can also normalize the loudness of many other audio and video formats!) that were created from different sources with different settings, so their loudness differs from file to file, and that is annoying to the ear.

Often we have been collecting mp3s from different sources, finding one here and another there, and over time we have built up a good collection worth taking care of. But we have a problem: the loudness differs between the different music or video files.

As a result, we badly need a solution.

Mp4Gain is the result of many years of experience and is definitely the best normalizer out there; I have no doubt about it.

Even for very advanced users, it offers different settings to adjust exactly what you are looking for. But if you are an ordinary user, you will not need any of them: just load the song or video (you can normalize one or hundreds at the same time) and click a button. It’s that simple.

Audio normalization for beginners

One of the most annoying things when listening to music is having to adjust the volume control for every song that plays. If you have a computer, there are tools that even out the volume from track to track while the songs are playing. This is called normalization. Three main methods are used to achieve this result, each more or less effectively.

Normalization through detection of maximum volume

The player or audio processing software analyzes the track and detects its highest amplitude. If that peak is below the imposed maximum level, the whole track is automatically boosted by the number of decibels required for the peak to reach this level. If the highest amplitude is equal to or greater than the maximum level, nothing is done.

This method has only one advantage: it avoids saturation. However, the drawbacks are many.

This form of normalization cannot be applied in real time, because it assumes that the maximum signal value is known in advance, which is hardly the case with live audio sources (playback or recording). This type of normalization also turns out to be totally ineffective when the overall sound of the song is low but interrupted by short peaks, which may be spurious. When these peaks reach or exceed the maximum level, nothing happens and the overall sound stays low, even if the peaks last only a fraction of a second.
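The peak-detection method, and its weakness with isolated spikes, can be sketched in a few lines of Python. This is only an illustration under simple assumptions: samples are floats where 1.0 is full scale, and the function names and the example target levels are invented for the sketch.

```python
import math

def peak_dbfs(samples):
    """Highest absolute sample value in dBFS (full scale = 1.0)."""
    peak = max(abs(s) for s in samples)
    return -math.inf if peak == 0 else 20.0 * math.log10(peak)

def peak_normalize(samples, target_dbfs=-1.0):
    """Boost the whole track so that its peak reaches target_dbfs.

    As in the description above, a track whose peak already meets or
    exceeds the target is returned unchanged.
    """
    current = peak_dbfs(samples)
    if current == -math.inf or current >= target_dbfs:
        return list(samples)
    gain = 10.0 ** ((target_dbfs - current) / 20.0)
    return [s * gain for s in samples]

# A uniformly quiet track is boosted until its peak sits at the target.
quiet = [0.0, 0.1, -0.25, 0.2]
louder = peak_normalize(quiet, target_dbfs=-6.0)

# A quiet track with one short spike is left alone: the spike already
# exceeds the target, so the rest of the track stays quiet.
spiky = [0.01, -0.02, 0.9, 0.01]
untouched = peak_normalize(spiky, target_dbfs=-1.0)
```

Note that the function needs the whole track before it can compute the gain, which is why this method does not suit live sources.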

Peak-detection normalization is almost never used by playback software. However, many audio processing programs, and even audio CD burning programs, offer this option, for example Audacity and Nero.

Normalization by average volume detection

Here, the player or audio processing software analyzes the track and detects not the highest amplitude but the average amplitude of the signal. The volume of the song is then automatically increased or decreased, as appropriate, by the number of decibels required to reach the imposed value.

Also known as RMS normalization, this method has the advantage that the sound is fairly accurately balanced from one song to another, even if there are sharp peaks in the volume.

However, average-volume normalization, like the previous method, cannot be applied in real time and is therefore unsuitable for live audio sources. In addition, saturation can occur if the imposed target value is too high. It is recommended to use target values low enough to avoid this problem as far as possible.
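To make the saturation risk concrete, here is a small Python sketch of RMS normalization that also reports whether the result would clip. It assumes float samples with 1.0 as full scale; the function names, the sample track and the two target values are only examples.

```python
import math

def rms_dbfs(samples):
    """Average (RMS) level of the samples in dBFS."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return -math.inf if mean_square == 0 else 10.0 * math.log10(mean_square)

def rms_normalize(samples, target_dbfs=-16.0):
    """Scale the track so its RMS level reaches target_dbfs.

    Returns the scaled samples and a flag that is True when at least
    one sample would exceed full scale, i.e. would saturate.
    """
    current = rms_dbfs(samples)
    if current == -math.inf:
        return list(samples), False
    gain = 10.0 ** ((target_dbfs - current) / 20.0)
    scaled = [s * gain for s in samples]
    clipped = any(abs(s) > 1.0 for s in scaled)
    return scaled, clipped

# Mostly quiet material with one strong peak.
track = [0.02, -0.02] * 9 + [0.02, 0.5]

safe, safe_clipped = rms_normalize(track, target_dbfs=-16.0)
hot, hot_clipped = rms_normalize(track, target_dbfs=-10.0)
print(safe_clipped, hot_clipped)
```

With the modest target the peak still fits below full scale, while the more ambitious target pushes that one peak past it, which is the case the paragraph above warns about.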

Many playback programs use this normalization mode, with varying degrees of success.

Sound compression / modern normalization

The Mp4Gain audio processing software analyzes the audio signal, and that analysis leads it to raise or lower the volume of certain parts of the signal according to a fairly complex set of parameters derived from the signal itself. Ultimately, loud sounds are attenuated and weak sounds are boosted until the preset values are reached.

This is the best normalization method provided the processing values are well chosen, in which case the volume becomes very constant and free of saturation, regardless of the source and signal type, whether in real time or not.

However, this type of normalization requires some processing power. In return, the results are much more professional and are the only ones that really achieve what today’s listener is looking for. Mp4Gain normalizes audio very efficiently, whether from audio files in the most popular formats or from video files in the most commonly used formats.

Audio Normalization, understand what it is about

Difference between Peak level and RMS in Audio

Something that comes up a lot, for example when audio recordings are produced, is the so-called peak level and RMS (Root Mean Square) level, which are shown by meters (software or hardware). But what exactly are these values?

It is important that someone who does not record audio but simply listens to it understands these differences too.
This will make you a true expert: even if you are just someone with a good music collection, you will be able to tell what a normalizer is doing and understand the subject.

DIFFERENCES

The peak value tells us about the maximum values that occur in our music in real time. To put it simply: if we have, for example, a recorded song in which the drummer hits the snare drum or a cymbal hard, we will see our peak meter show a higher value for a moment, because that is what sounds loudest at that instant. This meter works with fast attack times so that it can measure these peaks immediately, and a limiter can then be used to tame them.

What is RMS?

The RMS value, however, indicates the average loudness or volume of our music. How does it do that? It uses much longer attack times. To be clearer: this value gives a reference for the energy level or volume (how high or low the playing volume is) and is barely affected by the peaks.

When we say that it has a slower attack value, this means that it does not track variations so quickly, but rather is “slow” to react and therefore shows us something that could be called an “average” volume level.
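The effect of the attack time can be seen with a very simple level follower in Python. This is a sketch, not how any particular meter is built: it smooths the absolute sample value with a one-pole filter (a true RMS meter would average squared samples), and the sample rate, time constants and test signal are all invented for the example.

```python
import math

def envelope(samples, sample_rate, attack_s, release_s):
    """Follow the signal level with separate attack and release times."""
    attack = math.exp(-1.0 / (attack_s * sample_rate))
    release = math.exp(-1.0 / (release_s * sample_rate))
    level, out = 0.0, []
    for s in samples:
        x = abs(s)
        # Rising input uses the attack time, falling input the release time.
        coeff = attack if x > level else release
        level = coeff * level + (1.0 - coeff) * x
        out.append(level)
    return out

sample_rate = 1000  # samples per second, kept low so the example stays small
# Steady material at 0.1 with a 5 ms "drum hit" at 0.9 in the middle.
signal = [0.1] * 500 + [0.9] * 5 + [0.1] * 500

peak_style = envelope(signal, sample_rate, attack_s=0.001, release_s=0.05)
rms_style = envelope(signal, sample_rate, attack_s=0.3, release_s=0.3)

print(round(max(peak_style), 2), round(max(rms_style), 2))
```

The fast meter jumps almost all the way up to the drum hit, while the slow one barely moves and keeps showing something close to the steady level.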

In any case, a suitable normalizer must be a combination of a limiter (the device that prevents the music from distorting when it exceeds the maximum possible level) and a compressor, which prevents the peaks from exceeding a given level and also prevents the quiet passages from dropping by more than a preset amount.

In this way the music always remains within a medium range, without exceeding a limit either up or down.
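One way to picture this combination is as a simple level map: loud levels are pulled down, quiet levels are pulled up, and a hard ceiling acts as the limiter. The Python sketch below works on levels in dBFS only; the thresholds, ratio and ceiling are arbitrary example values, and it makes no claim to be the curve used by Mp4Gain or any other product.

```python
def shape_level(level_db, low_db=-40.0, high_db=-12.0, ratio=2.0, ceiling_db=-8.0):
    """Map an input level (dBFS) to an output level (dBFS).

    Hypothetical static curve: compression above high_db, lifting of
    quiet passages below low_db, then a hard limit at ceiling_db.
    """
    if level_db > high_db:
        out = high_db + (level_db - high_db) / ratio   # tame loud passages
    elif level_db < low_db:
        out = low_db - (low_db - level_db) / ratio     # lift quiet passages
    else:
        out = level_db                                 # middle range untouched
    return min(out, ceiling_db)                        # limiter

for level in (-60.0, -20.0, -6.0, 0.0):
    print(level, "->", shape_level(level))
# -60.0 -> -50.0
# -20.0 -> -20.0
# -6.0 -> -9.0
# 0.0 -> -8.0
```

With these example settings, material that ranged over 60 dB on the way in ends up within about 42 dB on the way out, which is the "medium range" idea described above.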

Professionally recorded or broadcast music is always limited and compressed to keep it playing its best within a suitable range.

The only software that does exactly this is Mp4Gain. That is why it has been accepted not only by amateurs, but also by professionals.

Audio Level normalization

The audio levels of the material produced in a radio station
In general, radio stations do not tend to keep their audio edits (spots) within standardized levels. It is not considered necessary to know much about levels, since an audio processor compresses and limits everything on air.

The console operator usually knows little about dynamic range, since it has no practical use on air. This is how many stations operate: with settings that “work” on air, arrived at by trial and error and not always by the most rigorous criteria.

Level normalization

In radio, an editor often does not know or follow any level convention, so it could be said that level normalization is not widely used. However, good professional practice would be for all the material generated by a station to “sound” at the same level. Not on air, since everything that goes on air is already compressed and limited, but inside the station. There are two ways to do this:

The material is processed “by ear”, by comparison.
An RMS value is defined and all editors normalize their mixes to that average level.

Regarding the first option, differences of up to +/- 2 dB are perfectly acceptable. But a very common vice is to overcompress the edits, or sometimes the voices, in an attempt to hear the compact, aggressive FM sound on the studio monitors. That sound should be created on air by the broadcast processor, not by the editor. Editors generally abuse processes such as Normalize RMS (Sound Forge) and “maximizers” such as Wave Hammer (Sound Forge / Vegas) and the L1 Ultramaximizer (Waves). Ideally, how much to “squeeze” the dynamics of the edited material should depend on the type of processor the station has. This is a good point to clear up a fairly common confusion: NORMALIZATION has nothing to do with making audio sound “strong” or “powerful”. Using normalization for that purpose is a beginner’s mistake.

The second option is the most accurate way of working (although this precision is not strictly necessary): normalizing all the edits to a given RMS value. This does not affect the sound on air, but it does improve the internal tidiness of the station. RMS is not an accurate measurement of loudness or “volume”, but for what radio needs it is enough.

The broadcast audio processor knows nothing about the level of the audio file. It receives an audio level from the console and works accordingly. What affects the behavior of the processor is the dynamics of the material: whether it still has dynamics or is super-compressed / limited.

Normal working values

The level at which operator-editors generate material has two well-defined extremes to avoid: very high levels of compression / clipping, and excessively low material (below -24 dB RMS). When we talk about level, we must be clear about the differences between peak level and average level.

PEAK level

Regarding the peak level, the logical upper limit is digital clipping. Needless to say, a clipped mix is unacceptable.
It is advisable that the maximum peak level is not 0 dBFS, as this can cause overshoot clipping in the D/A converters, especially if the material is exported in a compressed format (MP3).
An appropriate value for radio material is a maximum peak of -1 dBFS (the recommendation when using mp3 compression is -3 dBFS). This does not mean that the peaks must be at -1 dB. If no peak reaches the established maximum, that is not a problem as long as the material complies with the appropriate working level. The exact peak level is not critical, although in practice the signal will usually reach the maximum peak level.

Listening level (RMS)

The “listening level” or mix level is determined by the RMS or “average” value of the material. This is true even if the editor has never measured the RMS value of their audio. In general, the radio editor “compresses”, “maximizes” or, through a conceptual error, “normalizes” the edits “so that they sound”. And in that “so that they sound”, the editor is bringing the cuts to a certain value.

The question that arises is: what should that value be? How much should the final mix be “squeezed”? The final value should not be one that involves excessive compression, as that is the task of the transmission processor. How to compress is a topic for another article, since it is a matter of fine detail and stations in general do not take these aspects into account. In general terms we can say:

If the station has a simple analog processor, such as the M31 or Solidyne 362, it will perform better with material that has a more compact sound (more compression).
If the station has a high-end digital processor, and especially if it works with a highly processed sound on air, it is neither recommended nor necessary to excessively maximize the material generated by the station, because this equipment responds better when the source material is not over-compressed.

But what if the file level is very low? It depends. Depending on the PC-Console connection, the operator typically has at least 15 dB of gain range for level correction from the PC. In turn, if the level is low with the fader on, the AGC of the processor has between 10 and 20 dB more correction to compensate the level in the air. But if the file were generated too low, it could fall outside the operator / processor correction range and go low on air.

GENERAL AND ELEMENTARY CONCLUSIONS:

Different materials generated in the station must sound at the same level, whether judged by ear or measured in RMS.
The material should not be overcompressed, much less clipped.
The peak level should not exceed -1 dB.
It should not be too low as it may fall outside the processor’s AGC / operator correction ranges.

Put in values:

RMS values between -16 and -13 dB RMS are acceptable.
Values between -13 and -10 dB RMS generally indicate strong compression.
Values above -10 dB RMS (that is, closer to 0) indicate excessive compression, which is not recommended as it generates a very loud but “muffled” sound that cannot be “improved” by the air processor.
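These working ranges can be turned into a small check. The Python sketch below uses the figures from the text (-24, -16, -13 and -10 dB RMS); how the exact boundary values are treated, and the label for the gap between -24 and -16 dB RMS, are assumptions made for the example.

```python
def classify_rms(rms_db: float) -> str:
    """Label a measured RMS level (dB RMS) using the ranges given above."""
    if rms_db < -24.0:
        return "too low"
    if rms_db > -10.0:
        return "excessive compression"
    if rms_db > -13.0:
        return "strong compression"
    if rms_db >= -16.0:
        return "acceptable"
    # Between -24 and -16 dB RMS the text gives no verdict; this label is a guess.
    return "quiet, not covered by the figures above"

for level in (-30.0, -14.5, -11.0, -8.0):
    print(level, "->", classify_rms(level))
```

A station could run a check like this on every edit before it goes into the playout system, so that overcompressed or very low files are caught early.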

Normalization of an audio file.

Normalization is used to increase or decrease the level of the song as a whole, so that its maximum volume peaks assume the indicated level.

For example, if the maximum intensity points of the song are -3 dB (therefore well below 0, which should represent the maximum before distortion), normalizing to 0 dB means increasing the level of the entire song so that these peaks reach 0 dB.

This is typical peak normalization.

There is also RMS normalization (which takes into account not the peaks but the actual average level of the song).

Audio CDs, which offer a wide dynamic range (many levels of intensity, from pianissimo to fortissimo), are generally recorded so that the maximum volume points are at 0 dB.

Normalizing your WAV recordings can be helpful for bringing them up to the average level of a CD in case they are too low (because you were cautious with the level during recording), but there is one important thing to note:

Normalization of this type can alter the original dynamics, that is, the reciprocal relationships between weak and strong sounds, when it goes beyond a plain gain change.

A plain gain change multiplies every level by the same factor, so the ratio between two levels stays the same. The relationship only changes when a fixed amount is added to all of them (small mathematical example:
2/5 = 0.4, but (2 + 1) / (5 + 1) = 0.5 …), which is in effect what happens when the normalizer also compresses or limits the signal.

The result is that the weaker sounds, after such heavy-handed normalization, sound much louder while those that were already loud only sound a little louder, altering the dynamic relationships that had been envisioned by those who originally recorded the music and making the sound lose depth.

Some types of music that already have limited dynamics (rock, metal, etc.), where the swings between minimum and maximum volume are rarely very large, can be normalized without problems, while genres with large dynamic swings (classical music, or music with passages ranging from pianissimo to fortissimo) are more problematic.

In addition, bear in mind that if you normalize one large WAV file that contains many songs (not yet split), there can still be substantial differences, even in genres with little dynamics; in this case between one song and another rather than between different points of the same song.

So a light normalization is acceptable and is actually used (to raise the overall level), but it is better to make sure you do not need it (by recording at a good level from the beginning), or at least do not need much of it. Remember, however, that the dynamics may be somewhat flattened.

Normalize with Mp4Gain

This software is capable of normalizing the main audio and video formats (it is the only one that can do this), and its normalization algorithm is by far the most efficient and the one that produces the best results.
For this reason it is used by musicians, radio broadcasters, universities, television stations, producers, etc.

Audio normalization explained

Audio normalization is the application of a constant amount of gain to an audio recording to bring its amplitude to a target level (the norm). Because the same amount of gain is applied across the entire recording, the signal-to-noise ratio and relative dynamics are unchanged.

Two basic types of audio normalization exist. Peak normalization adjusts the recording based on the highest signal level present in the recording. Loudness normalization adjusts the recording based on perceived loudness.

Normalization differs from dynamics compression, which applies varying levels of gain across a recording to fit the level within a minimum and maximum range. Normalization adjusts the gain with a constant value over the entire recording.

Normalization is one of the functions usually provided by a digital audio workstation.

Peak normalization

One type of normalization is peak normalization, where the gain is changed to bring the highest PCM sample value or analog signal peak to a certain level, usually 0 dBFS, the loudest level allowed in a digital system.

Since it only looks at the highest level, peak normalization alone does not take into account the apparent loudness of the content. As such, peak normalization is generally used to change the volume so as to ensure optimal use of the available dynamic range during the mastering stage of a digital recording. In combination with compression / limiting, however, peak normalization becomes a feature that can provide a loudness advantage over material that is not peak-normalized. This feature of digital recording systems, compression and limiting followed by peak normalization, enables contemporary trends in program loudness.

Loudness normalization

Another type of normalization is based on a measurement of loudness, where the gain is changed to bring the average amplitude to a target level. This average can be a simple measurement of average power, such as the RMS value, or it can be a measure of human perceived loudness, such as that offered by ReplayGain, Soundcheck and EBU R128.

For example, YouTube uses a reference level of -14 LUFS, so if a program is measured at -10 LUFS, YouTube will lower its level by 4 dB to bring it to the -14 LUFS reference.

Loudness normalization was created to combat varying loudness when listening to several songs in a row. Before loudness normalization, one song in a playlist might be quieter than the rest, so the listener would have to reach for the volume knob to adjust the playback volume.

Depending on the dynamic range of the content and the target level, loudness normalization may result in peaks that exceed the limits of the recording medium. Software offering such normalization usually provides the option of using dynamic range compression to prevent clipping when this happens. In this situation, the signal-to-noise ratio and relative dynamics are altered.
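The arithmetic behind both points is short enough to sketch in Python. The loudness figures are assumed to come from a separate measurement step that is not shown here; the function names and the second, quieter program are hypothetical.

```python
def loudness_gain_db(measured_lufs: float, target_lufs: float = -14.0) -> float:
    """Constant gain (dB) that moves the measured loudness to the target."""
    return target_lufs - measured_lufs

def db_to_linear(gain_db: float) -> float:
    """Factor to multiply every sample by for a given gain in dB."""
    return 10.0 ** (gain_db / 20.0)

def would_clip(peak_dbfs: float, gain_db: float) -> bool:
    """True if the peak would end up above 0 dBFS after the gain."""
    return peak_dbfs + gain_db > 0.0

# The example from the text: a program at -10 LUFS, reference -14 LUFS.
loud_gain = loudness_gain_db(-10.0)           # -4.0 dB, i.e. turned down

# A hypothetical dynamic program: quiet on average, but with strong peaks.
quiet_gain = loudness_gain_db(-23.0)          # +9.0 dB, i.e. turned up
needs_limiting = would_clip(-3.0, quiet_gain)

print(loud_gain, quiet_gain, needs_limiting)
```

The second case is the one where a normalizer has to either settle for a smaller gain or bring in compression or limiting.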

Volume normalization, an explanation

Audio Normalization: Make Your Audio & Video Consistently Loud

Audio normalization is a process in which the amplitude (volume) of an audio recording is increased or decreased by a constant factor over its whole duration, so that the peak amplitude, the RMS value or the perceived loudness reaches a predetermined level, the norm. If the signal has multiple tracks, they all undergo the same correction.

Example: peak normalization to -3 dB:
A collection of digital recordings is built with a peak-level standard of -3 dBFS.
A new stereo recording is measured. The highest peak is -5.5 dBFS on the left track and -5.7 dBFS on the right track.
Normalization consists of applying a constant gain of 5.5 - 3 = 2.5 dB.
Normalization requires two passes. The first determines the maximum level; the second applies the correction to the entire recording.
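The same worked example can be written out in Python, working directly with the measured peak levels in dBFS. The helper name is made up for this sketch; the point is that one common gain, taken from the loudest track, is applied to both tracks.

```python
def common_peak_gain_db(track_peaks_dbfs, target_dbfs):
    """Single gain (dB) that brings the loudest track peak to the target."""
    return target_dbfs - max(track_peaks_dbfs)

# First pass: measured peaks of the left and right tracks.
peaks = [-5.5, -5.7]

# Gain derived from the louder (left) track.
gain = common_peak_gain_db(peaks, target_dbfs=-3.0)

# Second pass: the same gain is applied to every track.
new_peaks = [round(p + gain, 1) for p in peaks]

print(gain, new_peaks)  # 2.5 [-3.0, -3.2]
```

The left track lands exactly on the target and the right track stays 0.2 dB below it, so the balance between the two tracks is untouched.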

Peak normalization changes the level, but not the dynamics, of the sound.
Volume normalization, based on perceived loudness, often includes compression, which changes the dynamics of the sound.

Peak normalization

Peak normalization applies a constant gain to a recording to bring the highest peak to a target level, which in professional audio is 89% of full scale (-1 dBFS true peak).

The sound dynamics of the recording are more or less preserved, except that keeping distortion low after all the samples have been multiplied may involve adding noise that decorrelates the quantization error, a process known as redithering (jittering the least significant bit), which slightly increases the background noise level.

Volume normalization

The purpose of volume normalization is to bring all sound elements in a collection to the same sound volume level, so you can hear them without having to adjust the volume. In fact, the normalization of the maximum level in no way guarantees a homogeneity of the perceived sound volume (Loudness).

A simple approach to volume normalization, offered by various software programs, is to normalize the RMS value of the signal integrated over a few tenths of a second. More advanced tools use more elaborate algorithms to evaluate the perceived loudness more accurately. The European Broadcasting Union published a recommendation in 2011 that provides a relatively simple method for this evaluation.

If the target is not low enough, volume normalization requires compression for recordings whose dynamics are wider than the target leaves room for below the maximum level. Otherwise, the signal peaks would exceed the quantization limits.

In the simplest implementation, volume normalization collects volume data during a first pass, determines the gain or attenuation needed for the maximum volume to reach the target, and applies this correction in a second pass. If the items in the collection have similar characteristics, in terms of form factor, crest factor and dynamics, as is the case with popular music collections or recorded speech, this approach produces satisfactory results.

More elaborate implementations use a target that covers not only the volume of the sound, but also its peak values and dynamics. They collect loudness levels and maximum values