Audio Masking Effect – Part 7



For each type of sound (speech, music, and so on) there is an optimal range of reverberation time, which depends on the room's volume and on frequency.

Within the reverberation process, the “early sound” has a special influence. Early sound is the direct sound together with the reflections that reach the listener within the first 80 ms. It strongly affects the perceived width of the sound source, and with it the perceived quality of the sound. The final assessment of a listening room or hall depends directly on its reverberation time. Even a slight change in these characteristics can alter the sound beyond recognition: make it boomy or muffled, color part of the frequency range, distort the reproduction of the bass register, and so on. This shows how important it is to pay attention to tuning the listening room if the goal is to recreate something faithful and close to the original sound.
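The 80 ms boundary can be made concrete by splitting a room impulse response at that point and comparing the energy before and after it. The sketch below is only an illustration: the function name `early_to_late_db`, the helper `synthetic_response`, and the synthetic responses (decaying noise with assumed reverberation times of 0.3 s and 1.5 s) are assumptions and do not come from the text. The ratio it computes is commonly known as clarity (C80).

```python
import numpy as np

def early_to_late_db(impulse_response, sample_rate, split_ms=80.0):
    """Ratio of early to late energy in dB, split at split_ms after the start."""
    h = np.asarray(impulse_response, dtype=float)
    split = int(round(sample_rate * split_ms / 1000.0))
    early_energy = np.sum(h[:split] ** 2)   # direct sound + early reflections
    late_energy = np.sum(h[split:] ** 2)    # reverberant tail
    return 10.0 * np.log10(early_energy / late_energy)

# Hypothetical stand-ins for measured responses: noise whose amplitude
# falls by 60 dB over the assumed reverberation time.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate    # one second of samples
rng = np.random.default_rng(0)

def synthetic_response(rt60):
    return rng.standard_normal(t.size) * 10.0 ** (-3.0 * t / rt60)

dry_room_db = early_to_late_db(synthetic_response(0.3), sample_rate)
live_room_db = early_to_late_db(synthetic_response(1.5), sample_rate)
print(dry_room_db, live_room_db)
```

A room with a short decay concentrates most of its energy in the early part, so its ratio comes out higher than that of a more reverberant room.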

There is, however, another side of the coin: the freedom for a creative, individual approach. It lets you not only achieve “your” desired sound but also combine it with the design and appearance of the room, corridor, or any other space chosen for high-fidelity sound reproduction. Returning to how a sound image forms in a room, the role of reflected waves in the final image should not be underestimated. It is these waves that shape such important characteristics as: the unique timbre of each instrument and of the sound as a whole; the width and depth of the sound stage; the “virtual size” of an instrument (an organ, for example, can be made to sound “small and insignificant”, killing its unique sound at the root); and intimacy, warmth, hardness and other qualities that determine how accurately the original musical image is conveyed.

Waves in a room are not only reflected, changing direction as they go. They can also be absorbed, which matters no less (if not more) for the final sound. A simple, familiar example shows how a room affects sound. Many people who have renovated or moved house have noticed how unusually boomy a room is without furniture or wall coverings. As soon as the furnishings are in place, the boominess disappears because of the complex pattern of absorption and reflection from many different surfaces.

Put in idealized terms, the role of the listening room is this: every style of music requires its own room with specific acoustics that bring a certain color and resolution to the sound. In the past such rules were routinely observed: halls and amphitheaters were built with the peculiarities of the orchestras that would play there in mind. Today this approach is extremely rare in practice, and implementing it is questionable, if not absurd. Hardly anyone can build one room for classical music and another for heavy music, for example. So something has to be sacrificed in favor of an averaged, balanced sound for a particular room.



Audio Masking Effect – Part 6


When it comes to correcting a room in practice, whether to optimize the final sound, give it a certain coloration, or make it neutral, it is essential to determine the impulse response of that room.


To do this, measurements are made with a microphone at different points in the space; alternatively, the response can be calculated from the room's geometric dimensions (height, depth, width). The impulse response obtained by calculation or measurement makes it possible to correct the room itself (with sound-absorbing or reflective materials placed at certain points) and to act on the signal directly using filters or computer programs. Implemented competently, this method can achieve a full three-dimensional surround effect at a given point in space. However, this assumes that the listener stays at that specific point without turning their head. Otherwise the whole calculation and procedure must be repeated.

Fortunately, listening to music at home or in a car generally involves a more or less fixed listener position, so the necessary calculations and a one-time correction can be done early, while the system is being set up. For the reasons above, it is much harder to get the sound right for two or more listeners, but an acceptable result is still possible with the approach described. Finally, the signal transformations that occur in the outer ear and ear canal must also be taken into account, correcting not only for a general model of auditory perception but, above all, for the individual structure of each person's auditory system.

Sound depending on the characteristics of the room
From the point of view of psychoacoustics, it is important to understand how large a role the room (or whatever space you listen in) plays in the final character of the sound. As mentioned earlier, almost every feature of a listening room or car interior has a significant impact on the original signal, acting as a kind of filter. This filtering changes the temporal structure and the spectrum of the audio signal, so the character of the sound intended in the recording is distorted. These distortions can be analyzed and corrected to recreate the original sound image. In many ways this is the foundation of modern “hi-fi” sound. The characteristics of the listening space have a fundamental effect on the music a person ultimately hears. Understanding this helps avoid a host of mistakes often made by hi-fi enthusiasts who chase a “reference” sound while ignoring this simple but crucial principle.

The sound image created in a room by a conventional stereo pair results from a series of processes and transformations that the sound wave undergoes between the moment it is produced by the vibrating loudspeaker and the moment it enters the ear canal. What happens to the sound along the way? The sound wave generated by a loudspeaker is roughly spherical, so it propagates in all directions from the radiating surface of the driver. The listener usually perceives the direct wave first; it can be pictured as a “virtual” straight line from the speaker to the listener. The remaining waves then arrive with some delay. They are often already heavily altered by multiple reflections from the boundaries of the listening room. How late and how strongly the reflected waves arrive depends on the material of the walls, floor and ceiling, as well as on the shape and size of the room. As the number of repeatedly reflected waves grows, the sound field gradually evens out and the sound energy becomes uniformly distributed across the space, after which it begins to decay.
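To get a feel for the delays involved, the extra travel time of a single wall reflection can be estimated by mirroring the speaker in the wall and comparing path lengths. This is a minimal two-dimensional sketch under stated assumptions: one flat wall at `x = wall_x`, a speed of sound of 343 m/s, and made-up positions in metres. The name `reflection_delay_ms` is hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature

def reflection_delay_ms(source, listener, wall_x=0.0):
    """Delay (ms) of one reflection off the wall x = wall_x relative to the direct wave."""
    direct_path = math.dist(source, listener)
    # Mirror the source in the wall: the reflected path equals the
    # straight line from this mirror image to the listener.
    mirrored_source = (2.0 * wall_x - source[0], source[1])
    reflected_path = math.dist(mirrored_source, listener)
    return (reflected_path - direct_path) / SPEED_OF_SOUND * 1000.0

speaker = (1.0, 2.0)    # hypothetical positions, in metres
listener = (1.0, 5.0)
near_wall_delay = reflection_delay_ms(speaker, listener, wall_x=0.0)
far_wall_delay = reflection_delay_ms(speaker, listener, wall_x=-2.0)
print(near_wall_delay, far_wall_delay)
```

Moving the wall further away lengthens the reflected path and therefore the delay, which is one way the size of the room enters the picture.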

Audio Masking Effect – Part 5


Auralization


Reconstructing the signal in its original form is a complicated task with many aspects to consider. A term that describes the modern approach to forming a high-precision sound field fits here: auralization (named by analogy with “visualization”). Auralization is a way of recreating a three-dimensional sound field by shaping the final signal so as to recreate a feeling of spaciousness and model the binaural auditory sensation at a given position in space. The human auditory system responds to only two basic parameters: 1) the intensity, or pressure, of the sound wave; 2) the temporal behavior of the signal: its onset and decay, and the change of its periodicity and frequency over time. All other aspects of perception result from signal processing by the auditory system and the brain. These include the timbre of the sound, loudness, pitch, the width and depth of the sound stage, and so on.

Imagine an ideal audio signal recorded in an anechoic chamber or a professional recording studio; it is then not difficult to trace which factors affect the final signal under real conditions. The main obstacle to recreating a high-fidelity sound image is the listening space. When such a signal is played in an ordinary room or hall, the room acts as a linear filter because of reflected waves, attenuation, diffraction and other processes. In addition, the head and ears of each listener perform their own processing of the already “filtered” signal. If all of these transformations are taken into account and compensated by appropriate filtering (for example, with an equalizer), it is quite possible to restore the sense of a three-dimensional sound field that existed at the time of recording. In the simplest case this can be done with a conventional equalizer, either a multiband one (if configured correctly) or one with “presets” for a specific kind of room (cathedral, concert hall, opera hall, amphitheater). However, the task is much harder than it seems at first glance, and no final solution has been found to this day.
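Treating the room as a linear filter means that what reaches the listener can be modelled as the dry signal convolved with the room's impulse response. The following is a toy sketch, not a room simulator: the function name `apply_room` and the tiny hand-written arrays are assumptions chosen so the result is easy to follow.

```python
import numpy as np

def apply_room(dry_signal, room_impulse_response):
    """Model the room as a linear filter: convolve the dry signal with its impulse response."""
    return np.convolve(dry_signal, room_impulse_response)

# Hypothetical toy data: a short dry signal, and a "room" made of the
# direct sound plus one echo two samples later at 30% amplitude.
dry = np.array([1.0, 0.0, 0.0, 0.5])
room = np.array([1.0, 0.0, 0.3])
wet = apply_room(dry, room)
print(wet)
```

Each sample of the dry signal reappears two samples later, scaled by 0.3, on top of the direct sound. Because the model is linear, doubling the input simply doubles the output.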

It should also be noted that “correcting” the original signal with an equalizer is highly undesirable: it degrades signal quality and introduces a large amount of distortion into the signal path. As an intermediate conclusion, it is far preferable to fine-tune the sound by changing the parameters of the listening room itself, achieving the desired depth and width of the soundstage that way rather than with an equalizer. Reverberation, the gradual decay of sound waves through multiple reflections, has the greatest impact on the sound in a room. The key quantity is the reverberation time: the time it takes the signal to decay by 60 dB. The structure of early reflections, the nature of the late decay, and a host of other characteristics play an equally important role in the subjective sense of spaciousness and fullness. It is thanks to these that a person can tell an acoustically good room (hall) from a bad one.
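The 60 dB decay time can be estimated from an impulse response. The sketch below uses Schroeder backward integration followed by a straight-line fit, a common technique that the text itself does not name, so treat it as a swapped-in illustration. The function name `estimate_rt60`, the fit range of -5 to -25 dB, and the idealized noise-free test response with an assumed decay time of 0.5 s are all assumptions.

```python
import numpy as np

def estimate_rt60(impulse_response, sample_rate, fit_range_db=(-5.0, -25.0)):
    """Estimate the time for the energy to decay by 60 dB from an impulse response."""
    h = np.asarray(impulse_response, dtype=float)
    # Schroeder backward integration: energy still to come after each instant.
    remaining = np.cumsum(h[::-1] ** 2)[::-1]
    decay_db = 10.0 * np.log10(remaining / remaining[0])
    t = np.arange(h.size) / sample_rate
    upper, lower = fit_range_db
    in_range = (decay_db <= upper) & (decay_db >= lower)
    # Fit a straight line to the decay (slope in dB per second) and
    # extrapolate it to a 60 dB drop.
    slope, _intercept = np.polyfit(t[in_range], decay_db[in_range], 1)
    return -60.0 / slope

# Idealized test response: amplitude falls by 60 dB in exactly 0.5 s.
sample_rate = 8000
t = np.arange(2 * sample_rate) / sample_rate
true_rt60 = 0.5
ideal_response = 10.0 ** (-3.0 * t / true_rt60)
estimated_rt60 = estimate_rt60(ideal_response, sample_rate)
print(estimated_rt60)
```

Fitting only part of the decay curve and extrapolating is a practical choice: in real measurements the last part of the decay usually disappears into background noise.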

Audio Masking Effect – Part 4


For a creative musical performance, it is very important (from the point of view of the performer and the sound engineer) to use these techniques in practice.

 

It is reasonable to assume that the activation of “novelty” neurons is itself subject to adaptation, which reduces the perceived vividness of even an impeccably recorded sound on repeated listening. For example, if you play the same piece of music for several days in a row, your perception of it is noticeably dulled. The essence of this phenomenon is not fully understood. Along with its sensitivity to the amplitude-frequency characteristics of timbre, the auditory system is also phase-sensitive: human hearing can detect phase changes of 10 to 15 degrees between components of a signal. This phase-dependent effect is easier to hear in the low-frequency range (below 100 Hz), since temporal processes dominate there. Timbre, in addition to all of the above, makes it possible to single out a sound source and determine its physical nature.

In recognizing and isolating individual sounds from everything that enters the ear canals, the auditory system follows certain principles closely related to the study of timbre:

separation into sound streams, that is, the subjective separation of a fixed group of sound sources;
sounds similar in timbre and pitch are grouped together and identified as belonging to the same source;
the auditory system can “skip” and ignore a short beep in a continuous stream of noise or music;
sounds whose phase and frequency develop and change more or less synchronously are attributed to a single source.

More interesting still is the brain's ability not only to group and process incoming sound information but also to compare the sound combinations it receives with stored images. If a sound differs from the “reference” it is being compared with, certain “virtual” properties and characteristics are attributed to that signal.

Audio Masking Effect – Part 3


Temporal difference in sound, peculiarity of the perception of timbre.


The most important characteristic of the human auditory system (especially in the context of high-end “hi-fi” acoustics) is its ability to pick up subtle temporal differences in the structure of a sound signal. The mechanism is not fully understood, but it is directly related to the fact that the auditory system is nonlinear. Because of this, it is practically impossible to “fool” perception with a precise recreation of a live sound image and its musical nuances. Even the most modern high-fidelity (hi-fi) reproduction systems still do not fully account for this critical human ability to respond to temporal differences in sound. Some trained listeners can detect timing differences between signals in the range of 2 to 7 ms.

The most significant feature of this aspect of perception is that the attack and decay characteristics determine the timbre of a particular instrument; for most instruments these phases last between 5 and 360 ms. Preserving the order of the three phases (attack, stationary part, decay) is extremely important, because it determines, down to the smallest detail, how accurately the timbre of each instrument is perceived or conveyed. If the phases are swapped, for example decay with attack, a person will not be able to recognize the timbre. This is also what makes it possible to listen to music in rooms and other enclosed spaces: the reflected sounds arrive too late to mask the attack phase, leaving it practically “clean” and intact, so the timbre does not suffer much. A group of instruments with similar harmonics in their spectra gives the sound as a whole a certain timbral character, as if “taking the lead”. This determines the spectral energy distribution and the region of the frequency range where the energy is concentrated. This region of concentration tends to shift in one direction or another under the influence of certain factors, such as loudness.

Very often, as loudness increases, the sound acquires a ringing brightness because the region of concentration shifts and the high-frequency part of the spectrum gains influence. During the attack, stationary and decay phases, the saturation zone changes dynamically in complex ways, depending on which harmonics “saturate” a segment of the range at a given moment. The timbre of a sound depends on the spectral and temporal components of the signal, but one more circumstance shapes its perception. In the higher parts of the human auditory system there are special neurons called “novelty” neurons. These neurons fire only when a stationary signal changes. In music, they are “switched on” by “bright”, dynamic changes in tone, loudness, sound balance and so on. Without such changes, the listener stops taking in sound information of any kind, whether emotional, informational or aesthetic.

Audio Masking Effect – Part 2


The situation is somewhat different with non-simultaneous (temporal) masking ….


This phenomenon occasionally occurs in practice when sounds that precede or follow a sufficiently loud sound are masked by it. In this case masking takes place with a shift in time. Its degree depends on: the time interval between the arrival of the masking and masked sounds; the intensity level of the masking sound; and the duration of the masking signal. The most pronounced form is backward masking, in which the masking sound follows the masked one. The effect is also stronger if both signals are delivered to the same ear (monaurally). Temporal masking weakens as the time interval between the masked and masking signals increases. Contrary to what might be expected, the amount of temporal masking does not grow linearly with the intensity of the masking sound. It does, however, keep its frequency dependence: the closer the masking and masked sounds are in frequency, the more pronounced the effect.

The masking examples considered above involved both signals (masking and masked) arriving monaurally, that is, in one ear. However, if the masking sound enters one ear and the masked sound the other, masking is still observed. This is called central (binaural) masking. It is similar to monaural masking in many ways, although there are significant differences. The threshold shift caused by central masking is much smaller than with monaural masking, and it is more noticeable for high-frequency sounds than for others. As with monaural masking, tones of similar frequency produce the greatest effect. With binaural masking, increasing the intensity of the masking sound matters mainly when the signal is pulsed.

Human sound perception has a number of interesting properties, one of which is binaural unmasking. Thanks to having two sound receivers, a person can “separate” sounds of a certain frequency (for example, a conversation) from general noise. This effect is often observed in a noisy environment, for example at a party, where despite the surrounding chatter it turns out to be possible to “hear” what one's conversation partner is saying. The ability to isolate (and in this context effectively amplify) a certain frequency range from a variety of unrelated sounds (noise) is a unique natural gift of the human perceptual system.

Audio Masking effect


When multiple audio signals interact simultaneously, a masking effect occurs, in which one signal acts as the masker and the other is masked.


This interaction affects the sound a person ultimately hears in different ways. One tone may not be heard at all alongside another (the masking tone completely hides the masked one). This usually happens when there is a large difference in intensity, that is, in sound pressure, between the tones. For example, a train passing loudly can almost completely mask a human voice of medium intensity. Another form of masking shows up as signal distortion or a change in the timbre of the sound. As in many other aspects of human sound perception, the middle frequencies are favored in masking; they correspond to the range of speech and the human voice, a preference that is biological in origin. Masking takes place in the higher parts of the brain and looks like this: if two tones or signals of different intensity are present, both are received simultaneously by the peripheral auditory system and sent to the brain for further processing and analysis. At that stage, however, the person hears only one of them, the more intense. This does not change the fact that two different tones are being emitted; the person simply ends up hearing only one. This is roughly how the masking effect works. It comes in four types:

simultaneous masking (monaural);
non-simultaneous (temporal) masking;
central masking (binaural);
binaural unmasking.

Simultaneous (monaural) masking is the most common case, encountered constantly in practice and in everyday life. It arises when sound waves occur at the same time, and it is stronger the more intense the masking sound is relative to the masked one, in direct proportion. The degree of masking is conveniently expressed in decibels (dB), as the difference between the hearing threshold of a given tone in the presence of a masking tone and its hearing threshold in silence. That is not all there is to simultaneous masking, though. The degree of masking depends not only on the intensity of one sound relative to the other. Masking is most pronounced when the frequency of the masked sound is closest to that of the masking sound; consequently, the greater the difference in frequency, the weaker the effect. Furthermore, as the intensity of the masking sound increases, masking becomes increasingly asymmetrical, extending toward higher frequencies. The way human hearing processes sound also produces an interesting effect: high-frequency maskers are effective only within a fairly narrow frequency range, while low-frequency tones mask a much wider range.
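The decibel definition of the degree of masking given above is a simple difference of two thresholds, which can be written out directly. In this small sketch the function names `masking_amount_db` and `is_audible` and all threshold values are hypothetical, chosen only to show the arithmetic.

```python
def masking_amount_db(threshold_with_masker_db, threshold_in_quiet_db):
    """Degree of masking: how far the masker raises the hearing threshold of a tone."""
    return threshold_with_masker_db - threshold_in_quiet_db

def is_audible(tone_level_db, threshold_in_quiet_db, masking_db):
    """A tone is heard only if it exceeds the threshold raised by the masker."""
    return tone_level_db > threshold_in_quiet_db + masking_db

# Hypothetical values: a tone audible from 5 dB in silence needs
# 35 dB to be heard while the masker is playing.
quiet_threshold = 5.0
masked_threshold = 35.0
amount = masking_amount_db(masked_threshold, quiet_threshold)

print(amount)                                     # 30.0
print(is_audible(40.0, quiet_threshold, amount))  # True
print(is_audible(20.0, quiet_threshold, amount))  # False
```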

Sound masking


Sound masking is a phenomenon that affects the audibility of a sound (signal) in the presence of other sounds (interference).

 

The deterioration in audibility usually shows up as an increase in the detection threshold of the signal, and sound masking can be quantified by the number of dB by which the hearing threshold rises in the presence of interference (the masking threshold). A distinction is made between simultaneous, forward and backward masking. In the first case, the test signal and the interference (masker) sound simultaneously; in the second, the signal follows the masker; in the third, the signal precedes the masker. Backward masking occurs only for short signals.
If the signal and the interference are broadband, the amount of simultaneous masking is proportional to the interference level over a wide dynamic range. If the signal and the masker are tones of the same frequency, masking grows more slowly than the masker level. When the signal and the interference differ in spectral composition, masking is determined mainly by the interference components closest in spectrum to the signal. To reveal the frequency selectivity of hearing, pure tones or very narrow-band noise are used as the signal and masker. The frequency dependence of the masker level required to mask a weak signal of fixed frequency and level characterizes the frequency tuning of the auditory system around the signal frequency (Fig.). With forward masking, the frequency selectivity of masking increases, which is explained by the nonlinear properties of the cochlea.

Fig. Frequency dependence of the masker level L_M required to mask a tone signal with a frequency of 1 kHz and a level of 20 dB: (1) with simultaneous masking; (2) with forward masking.

With simultaneous masking of a tone by noise whose spectrum is limited to a band centered on the signal frequency, widening the masker's spectrum at constant total energy does not affect the amount of masking up to a certain bandwidth. Widening it beyond this band, called the critical band, reduces the masking.
Masking has important features in binaural sound perception. When the signal frequency is below 2 kHz, or when a higher-frequency signal changes rapidly in amplitude, masking depends on the interaural (between-ear) relationship of the phases of the carrier (or, respectively, the envelope) of the signal and the masker. When the signal and the masker have the same interaural phase shift, masking is at its maximum; when their interaural phase shifts differ by 180°, it is generally at its minimum. This effect apparently underlies the “cocktail party” phenomenon: a person's ability to follow a signal coming from one source (a conversation partner) while ignoring interference with similar spectral and temporal characteristics (other voices, etc.).

SOUND MASKING


A phenomenon consisting in a change in the audibility of a sound (the signal) in the presence of another sound (the so-called masker).


In general, the deterioration in audibility shows up as an increase in the threshold for detecting or discriminating a signal, and sound masking can be quantified by the amount (in decibels) by which the threshold rises in the presence of a masker. A distinction is made between simultaneous, forward and backward masking. In the first case, the test signal and the masker sound simultaneously; in the second, the signal follows the masker; in the third, the signal precedes the masker. Backward masking appears only for short signals.

With simultaneous masking of a tone by noise of fixed energy and a center frequency equal to the signal frequency, widening the masker's spectrum up to a certain value does not affect the amount of masking. Widening it beyond this frequency band, called the critical band, reduces the masking. If the masker is amplitude-modulated, adding a sound that varies synchronously with the main masker can reduce the masking (the phenomenon of comodulation masking release).

Masking depends on the interaural (between-ear) relationship of the phases of the signal and the masker. With zero phase shift, masking is at its maximum; with an interaural shift of 180°, it is generally at its minimum. This effect, called binaural unmasking, underlies the cocktail party effect: the ability to follow a signal from one source (a conversation partner) while ignoring interference with similar spectral and temporal characteristics (other voices, etc.).