Binaural hearing
Introduction

The binaural hearing system facilitates our ability to detect, localize, separate, and identify sound sources. Sounds are perceived not only within the visual field but also above, below, behind, and to the left and right of the listener. The process of detecting and localizing a sound source is accurate and happens almost automatically. This is impressive given the complexity of the information the auditory system has to use.

In the visual system, for example, there is a close relationship between the direction of a visual object and its projection on the retina. Such a place-localization map rather directly provides information for determining the absolute and relative positions of visual objects. In the peripheral auditory system, however, there is no such place-localization relation. Sound sources in a three-dimensional world give rise to a complex vibrational pattern in the surrounding air, which is observed at only two points in space: the entrances to the ear canals. Despite this complex and indirect coding of source position, the auditory system is able to reconstruct a three-dimensional aural world by clever analysis of specific properties of the waveforms arriving at both ears.

Sound source localization

In the horizontal plane, localization is mainly facilitated by two stimulus properties. For a sound source located to one side of the listener, the waveform arrives earlier at the ear oriented towards the source, because sound travels through air at a finite velocity. Hence, depending on the azimuth of the sound source, an interaural time delay (ITD) exists between the waveforms arriving at the two ears. Furthermore, the earlier-arriving signal will generally be more intense than the opposite-ear signal due to shadowing by the head.
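The dependence of the ITD on azimuth can be made concrete with a rough estimate. The sketch below uses Woodworth's spherical-head approximation, ITD = (a/c)(θ + sin θ), which does not appear in the text above and is brought in here only for illustration. The head radius of 8.75 cm and the speed of sound of 343 m/s are assumed typical values, and the function name `woodworth_itd` is hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at roughly 20 degrees C
HEAD_RADIUS = 0.0875    # m, assumed typical adult head radius


def woodworth_itd(azimuth_deg, head_radius=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """Approximate ITD in seconds for a distant source.

    Uses the rigid-sphere (Woodworth) approximation, intended for
    azimuths between -90 and +90 degrees, where 0 is straight ahead.
    """
    theta = math.radians(azimuth_deg)
    return (head_radius / c) * (theta + math.sin(theta))


# Print the estimated ITD in microseconds for a few source directions.
for az in (0, 30, 60, 90):
    print(f"azimuth {az:2d} deg -> ITD {woodworth_itd(az) * 1e6:6.1f} us")
```

With these assumed values the estimate grows from zero for a source straight ahead to roughly 0.66 ms for a source directly to one side, which gives a sense of the time scale the auditory system has to resolve.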
This head-shadow effect is especially strong for sounds with a wavelength that is short compared with the size of the head. For nearby sources, additional intensity differences arise because the path to the far ear is relatively longer than the path to the ear facing the source. The resulting level difference between the ears is generally referred to as the interaural intensity difference (IID). The combined effect of these cues allows human listeners to discriminate between different positions in the horizontal plane with an accuracy of 1 to 10 degrees. Absolute localization tasks usually yield a lower accuracy, between 2 and 30 degrees.

In the vertical plane, on the other hand, sound localization is facilitated by specific properties of the magnitude spectra of the waveforms arriving at the eardrum. Due to reflections in the pinna and other body parts, spectral peaks and dips are superimposed on the original sound source spectrum. The frequencies at which these features occur depend on the elevation of the sound source. These cues facilitate a vertical absolute localization accuracy of about 4 to 20 degrees. It has also been shown that listeners can follow changes in the localization cues, as long as the movement of the sound source is relatively slow.

Masking

In some conditions, the auditory system fails to detect the presence of a sound source. This can be due to a very low sound level, but it may also result from the presence of other sound sources, i.e., the sound source is masked by other sound sources. It has been shown that the amount of masking strongly depends on the positions of both sound sources. If both sounds come from the same direction, more masking occurs than if they come from different directions. A well-known example of the binaural properties of masking in daily life is the so-called ’cocktail party effect’. If, in a room where several people are engaged in conversation, a listener plugs one ear, it becomes much more difficult to follow a single conversation than when listening with both ears.
A systematic study of the binaural phenomena of masking started with experiments that investigated the masking of signals by broadband noise as a function of the exact interaural phase relationship of signal and masker. Since that time, many of the binaural variables affecting masking have been investigated. For example, in various experiments subjects had to detect a pure tone in the presence of white noise. If the noise is presented in phase to both ears via headphones and the tone is presented out of phase between the ears, the masked threshold is lower than when both the noise and the tone are presented in phase. This release from masking is generally referred to as the binaural masking level difference (BMLD). It is generally accepted that BMLDs arise because the binaural cues (i.e., the ITD and IID) change when the signal is added to the masker. Owing to its high sensitivity to binaural cues, the auditory system is able to detect the signal at much lower intensities than in conditions in which no binaural cues can be used in the detection task.

Binaural models

Over the past decades, several models of binaural processing have been developed that address various aspects of binaural hearing. The general setup of the majority of these models is very similar. This bottom-up setup is shown in Fig. 1.
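The masking-release conditions described earlier can be checked numerically: adding an in-phase tone to in-phase noise leaves the two ear signals identical, whereas adding an out-of-phase tone makes them differ. The sketch below is an illustration under stated assumptions, not a model from the text. It uses unfiltered Gaussian noise, a 500 Hz tone, a 16 kHz sampling rate, and the zero-lag normalized interaural correlation as a simple stand-in for "binaural cues". All function names and parameter values are hypothetical.

```python
import math
import random


def interaural_correlation(left, right):
    """Normalized cross-correlation of the two ear signals at zero lag."""
    num = sum(l * r for l, r in zip(left, right))
    den = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return num / den


def make_stimulus(tone_inverted, n_samples=8000, fs=16000.0,
                  tone_hz=500.0, tone_amp=0.3, seed=1):
    """Return (left, right): identical noise in both ears plus a tone.

    If tone_inverted is True, the tone is sign-inverted in the right ear
    (out of phase between the ears); otherwise it is identical in both.
    All default values are assumptions chosen for illustration.
    """
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    tone = [tone_amp * math.sin(2.0 * math.pi * tone_hz * i / fs)
            for i in range(n_samples)]
    sign = -1.0 if tone_inverted else 1.0
    left = [n + t for n, t in zip(noise, tone)]
    right = [n + sign * t for n, t in zip(noise, tone)]
    return left, right


# Noise in phase, tone in phase: the ear signals are identical.
rho_in_phase = interaural_correlation(*make_stimulus(tone_inverted=False))
# Noise in phase, tone out of phase: the tone decorrelates the ear signals.
rho_out_of_phase = interaural_correlation(*make_stimulus(tone_inverted=True))

print("tone in phase:     correlation =", rho_in_phase)
print("tone out of phase: correlation =", rho_out_of_phase)
```

For uncorrelated noise and tone with powers N and S, the out-of-phase correlation is expected to be close to (N − S)/(N + S), so even a weak tone lowers it below 1, while the in-phase condition stays at 1 regardless of the tone level. This is one simple way to see why a binaural comparison offers a detection cue that is absent when both noise and tone are in phase.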
The signals arriving at the eardrums are first processed by a peripheral preprocessing stage. This stage usually consists of phenomenological or physiological models of the transduction from pressure variations to spike rates in the auditory nerve. Subsequently, binaural interaction occurs in a binaural processor, where the signals from the left and right sides are compared. Two types of binaural interaction have been used extensively: one is based on the similarity of the incoming waveforms, the other on their differences. These classes are often referred to as cross-correlation-based models and EC (Equalization-Cancellation) models (based on the EC theory of Durlach), respectively. A common feature of the cross-correlation models is that the binaural interaction is computed for a range of internal delays in parallel after the peripheral preprocessing stage. More sophisticated models compute the cross-correlation for several peripheral filters in parallel and supply methods for combining information across frequency bands.

More information

PhD thesis, Technische Universiteit Eindhoven
J. Acoust. Soc. Am. 110, p. 1074-1088 (2001)

(c) 2007 www.jeroenbreebaart.com