Parametric Stereo
Introduction

Compression of audio (speech and music) has attracted great interest over the last decades. With the increasing popularity of mobile applications, the internet, and wireless communication protocols, the demand for more efficient compression methods persists. Downloading music from the internet or over a mobile phone network is a good example: compression makes these transmissions faster and cheaper by reducing the amount of information that has to be sent without sacrificing audio quality. One of the most recent breakthroughs in high-quality stereo audio compression, developed by Philips and Coding Technologies, is 'Parametric Stereo'.

Mono versus stereo transmission

Normally, digital transmission of stereo sound (two channels) requires about twice as many bits as transmission of mono sound (one channel). Parametric Stereo technology makes it possible to encode a stereo audio signal as if it were a mono signal. This reduced mono bit stream is accompanied by a small amount of extra information that allows the receiver's decoder to convert the mono signal back to a stereo signal, with hardly any loss of sound quality. The extra information layer contains all the perceptually relevant spatial properties of the stereo input signal, captured in 'parameters'. These parameters include information about the perceived position of sound sources as well as perceptual 'quality' cues such as 'perceptual diffuseness'. This method of describing two or more audio channels is often referred to as 'Spatial Audio Coding' (SAC) or 'Binaural Cue Coding' (BCC). Parametric Stereo achieves up to 40% higher compression rates than conventional stereo coding techniques.

Parametric Stereo / Spatial Audio Coding

A schematic overview of a Parametric Stereo encoder is shown in Figure 1.
A stereo input signal is processed by a down-mix and parameter extraction stage. This stage computes a mono down mix of the stereo input and extracts all relevant spatial parameters. The spatial parameters are subsequently encoded by a 'parameter encoder'. The parameter encoder is based on spatial psycho-acoustic knowledge, ensuring that the effect of parameter quantization and encoding is inaudible. The resulting bit rate for the parameters is scalable between roughly 1 and 8 kbps. These bit rates are approximately ten times smaller than the bit rate required to encode a mono audio channel. The mono down mix can be encoded by any conventional mono compression technique, such as AAC. Finally, the resulting mono bit stream is combined with the encoded parameters to form the final output bit stream.

The corresponding decoder is shown in Figure 2. The incoming bit stream is split into a spatial parameter bit stream and a mono audio bit stream by a de-multiplexer. The mono bit stream is decoded by a conventional mono audio decoder. The resulting mono down mix is converted to stereo by a spatial synthesis stage, which is controlled by the decoded spatial parameters.
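The encoder and decoder stages described above can be illustrated with a deliberately simplified sketch. It is not the standardized Parametric Stereo algorithm: it assumes a single broadband frame rather than separate time/frequency bands, it uses an inter-channel level difference and a normalized correlation as stand-ins for the spatial parameters (the text does not name the actual parameters), and it omits parameter quantization, the mono codec, and any decorrelation in the synthesis stage. The function names `extract_parameters` and `synthesize_stereo` are hypothetical.

```python
import math


def extract_parameters(left, right, eps=1e-12):
    """Down-mix one frame to mono and estimate two illustrative spatial parameters.

    Returns (mono, level_diff_db, correlation). The parameter choice is an
    assumption made for this sketch, not taken from the text.
    """
    # Mono down mix: simple average of the two channels.
    mono = [(l + r) / 2.0 for l, r in zip(left, right)]

    # Channel energies (eps guards against division by zero on silent frames).
    e_left = sum(l * l for l in left) + eps
    e_right = sum(r * r for r in right) + eps
    cross = sum(l * r for l, r in zip(left, right))

    # Level difference in dB: relates to the perceived left/right position.
    level_diff_db = 10.0 * math.log10(e_left / e_right)
    # Normalized correlation: relates to how diffuse the image sounds.
    correlation = cross / math.sqrt(e_left * e_right)
    return mono, level_diff_db, correlation


def synthesize_stereo(mono, level_diff_db):
    """Rebuild a stereo pair from the mono down mix and the level difference.

    Only the level difference is restored here; this sketch makes no attempt
    to restore the correlation parameter.
    """
    ratio = 10.0 ** (level_diff_db / 20.0)            # amplitude ratio left/right
    g_right = math.sqrt(2.0 / (1.0 + ratio * ratio))  # so g_left^2 + g_right^2 = 2
    g_left = ratio * g_right
    return [g_left * m for m in mono], [g_right * m for m in mono]


# Usage: a tone that is twice as loud (in amplitude) on the left as on the right.
left = [math.sin(2.0 * math.pi * k / 16.0) for k in range(64)]
right = [0.5 * s for s in left]
mono, level_diff_db, correlation = extract_parameters(left, right)
left_out, right_out = synthesize_stereo(mono, level_diff_db)
```

In this toy case the decoder output has the same level difference as the input, because the two gains are chosen so that their ratio matches the transmitted parameter. Only the mono samples and two numbers per frame would need to be transmitted, which is the source of the bit-rate saving described in the text.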
Parametric Stereo performance

Extensive listening tests have shown that Parametric Stereo can achieve high-quality stereo audio using parameter bit rates between 1 and 8 kbps. Moreover, compared to conventional stereo techniques (such as mid/side coding), the additional compression gain of Parametric Stereo is about 40%. Parametric Stereo can be fully integrated with other parametric coding techniques, such as Spectral Band Replication (SBR). The combination of AAC, SBR and Parametric Stereo is currently known as 'aacPlus v2', 'enhanced aacPlus' or 'HE-AAC/PS' and is standardized in 3GPP and MPEG-4. 'aacPlus v2' is the most powerful audio coder on the market today, delivering high-quality stereo audio at bit rates as low as 24 kbps.

Product support

Parametric Stereo is the first commercially available application that incorporates Spatial Audio Coding technology. The range of products that support aacPlus v2 is expanding rapidly. Many mobile phones are capable of decoding aacPlus v2 encoded content. Below is a list of PC-based encoders and decoders:
Future developments

The spatial coding approach of Parametric Stereo has been extended to multi-channel audio. This extension is referred to as 'MPEG Surround'.

More information

EURASIP J. Applied Signal Proc.: 9, 1305-1322.
John Wiley & Sons, 2007

(c) 2007 www.jeroenbreebaart.com