Spatial audio


The human auditory system includes a variety of delicate processes for estimating spatial parameters of the environment, allowing the location of sources and reflecting objects to be judged. The reproduction of spatial environments can be approached in several ways. Soundfield reconstruction methods aim to control the soundfield in a region surrounding the listener, either just around the head so that the listener is free to orient the head, or over an extended region within which the listener is free to move. Binaural methods, on the other hand, aim to control only the sound entering the ears at any time, either through headphones or through the combined control of multiple fixed loudspeakers. In either case the target binaural signals depend on the position and orientation of the head, so full control is only possible with head-tracking and interactive processing.

The research here has investigated different aspects of these approaches, including source synthesis and effects in low-resolution soundfield synthesis, complex source synthesis in high-resolution systems and studio environments, and binaural synthesis of near sources using virtual soundfield methods. The following sections give more information.




Soundfield control with general boundaries

The problem of controlling the interior field of a given continuous boundary of monopole drivers is solved formally using the simple source method, which involves the exterior Helmholtz problem, equivalent to a scattering problem. A more practical and general method was sought, one that solves directly for a discrete boundary and can specify a sub-region of the interior as the target.

A solution has been developed by extending the Ambisonic decoding process from controlling a single region to multiple overlapping regions simultaneously. This allows the interior of arbitrary boundaries, or arbitrary sweet spots, to be controlled completely. This provides a much more flexible tool than standard HOA or wavefield approaches.
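As a rough illustration of solving directly for a discrete boundary with a sub-region of the interior as target, the sketch below uses plain regularised least-squares pressure matching at sample points. This is a simpler, swapped-in technique and not the distributed modal constraint method described here. The speaker layout, frequency, target region, and regularisation weight `beta` are all assumed values chosen for illustration.

```python
import numpy as np

def monopole_matrix(points, speakers, k):
    """Free-field pressure at each point due to each unit-strength monopole."""
    d = np.linalg.norm(points[:, None, :] - speakers[None, :, :], axis=2)
    return np.exp(1j * k * d) / (4.0 * np.pi * d)

def solve_gains(points, speakers, target, k, beta=1e-6):
    """Tikhonov-regularised least squares for the complex speaker gains."""
    A = monopole_matrix(points, speakers, k)
    AH = A.conj().T
    return np.linalg.solve(AH @ A + beta * np.eye(A.shape[1]), AH @ target)

# Assumed layout: 32 monopoles on a circle of radius 2 m in the horizontal plane.
# Any other discrete boundary shape can be substituted here.
angles = np.linspace(0.0, 2.0 * np.pi, 32, endpoint=False)
speakers = np.stack(
    [2.0 * np.cos(angles), 2.0 * np.sin(angles), np.zeros(32)], axis=1
)

# Assumed target sub-region: a small off-centre grid of sample points.
gx, gy = np.meshgrid(np.linspace(0.2, 0.8, 7), np.linspace(-0.3, 0.3, 7))
points = np.stack([gx.ravel(), gy.ravel(), np.zeros(gx.size)], axis=1)

k = 2.0 * np.pi * 200.0 / 343.0           # wavenumber at 200 Hz, c = 343 m/s
direction = np.array([1.0, 0.0, 0.0])     # plane wave travelling along +x
target = np.exp(1j * k * (points @ direction))

gains = solve_gains(points, speakers, target, k)
achieved = monopole_matrix(points, speakers, k) @ gains
rel_error = np.linalg.norm(achieved - target) / np.linalg.norm(target)
print(f"relative error in target region: {rel_error:.3e}")
```

Moving or reshaping the grid of `points` changes where accuracy is concentrated, which is the sense in which a sweet spot can be specified. The regularisation weight trades accuracy against driver energy.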

Menzies, D. 'Soundfield Control with Distributed Modal Constraints', Acta Acustica united with Acustica, to appear.
Menzies, D. 'Sound Synthesis for General Enclosures', Ambisonic Symposium, IRCAM Paris, May 5-7 2010.


Some examples:

Speakers on an L-shaped boundary, filling out the interior with a plane wave. The bottom plot shows the absolute relative error.

 

Speakers on a dome with a squashed, raised sweet area. Waves from two directions shown. The energy is minimised and accuracy is focused on the desired region:

dome.jpg

dome.jpg


Control of four regions independently. This becomes increasingly difficult as the regions become closer and more opposed:

indep.jpg


opposed

Interior point sources can be represented in exotic ways, over continuous regions that can extend even over 180 degrees from the source, and at surrounding islands:




Near-field binaural synthesis

An area where binaural systems are still developing is the provision for source distance variation down to the near-field. If this can be achieved accurately and practically, it would enable a variety of applications involving objects that are within manual interaction distance, that is, within arm's length.

One approach to this is to derive near HRTFs from distant HRTFs using physical principles. Although promising, there are a number of complications. By varying the construction of the virtual soundfield used it is possible to improve the HRTFs calculated. This line of research also opens up more generally the question of representing point sources with freefields, and has implications for real soundfield construction, for example using Wavefield methods.

Menzies, D. 'Near-field HRTFs from Point Source Representations', Ambisonic Symposium, IEM Graz, June 24-28 2009.
Menzies, D. and Al-Akaidi, M. 'Nearfield Binaural Synthesis and Ambisonics', J.Acous.Soc.Am. March 2007.


The following diagrams show the approximation error for a source located at the origin, represented by a focused source that has been further re-expanded using a Fourier-Bessel expansion with maximum order N. This allows far-field accuracy to be traded for near-field accuracy.

3DFSEerr.jpg




Production with complex sources in the studio

In the studio environment sources are frequently recorded with a single microphone, thus losing directional source information. Even when multiple microphones are used, the information is treated in a way that does not fully reproduce the directive qualities of the object. The direct signal from a source is usually narrowly confined in space, whereas the reverberant signal arrives from all directions. The pattern of reverberation depends on the direction in which the sound leaves the source. Since many sources have complex patterns of directivity that change rapidly with time, the resulting reverberant field changes too. A simple method is found to approximate features of this field using multiple microphone recordings processed with reverberators, resulting in a stereo image that appears natural, like a stereo crossed-pair recording of the source in a real room. A more compact parametric representation of the complex source is also considered.

Menzies, D. ‘Parametric Representation of Complex Sources in Reflective Environments’, Proc. AES 128th International Convention, Paris, May 2010.

Studio test materials are available here.

The following diagram illustrates how different reflections originate from different parts of the source, in a simple room shape.

room.jpg




Soundfield synthesis of complex sources

A general source with directivity can be represented by a spherical harmonic encoding. In order to render the source with high-order soundfield reconstruction, the free-field expansion of the source field is required about any point, even in the near-field of the source. A solution is found, providing the natural extension to 'O-format' considered in earlier work.

Menzies, D. and Al-Akaidi, M. 'Ambisonic Synthesis of Complex Sources', J. Audio Eng. Soc, October 2007

The images below show the direct field of a complex source at O, and part of this field accurately resynthesized at B around the listener, using a high-order Ambisonic encoding derived from the source.





Source location and effects in 1st order Ambisonics

1st order Ambisonics is an early soundfield encoding and reconstruction system that works efficiently with small numbers of speakers, and was originally designed to address deficiencies in quadraphonic systems. A suite of source rendering and processing methods was developed to address the lack of digital tools. These include through-centre panning, using a technique called W-panning to simulate extended object width with independently controlled gain. A frequency spreading option was also added. A soundfield feedback network was designed for creating spatially accurate echoes and reflections, using rotation in the feedback loop to spread reflections in different directions.

wpan.jpg
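The idea of placing a rotation inside a soundfield feedback loop can be sketched as follows. This is a minimal illustration and not the LAmb implementation: it assumes traditional B-format with W weighted by 1/√2, azimuth measured anticlockwise from the front, a single delay line, and hypothetical function names. W-panning itself is not reproduced here.

```python
import math

def encode_bformat(sample, azimuth, elevation=0.0):
    """Encode a mono sample as a traditional B-format frame (W, X, Y, Z)."""
    w = sample / math.sqrt(2.0)
    x = sample * math.cos(azimuth) * math.cos(elevation)
    y = sample * math.sin(azimuth) * math.cos(elevation)
    z = sample * math.sin(elevation)
    return (w, x, y, z)

def rotate_z(frame, angle):
    """Rotate a B-format frame about the vertical axis by `angle` radians."""
    w, x, y, z = frame
    c, s = math.cos(angle), math.sin(angle)
    return (w, c * x - s * y, s * x + c * y, z)

def rotating_echo(frames, delay, feedback, angle):
    """Feedback delay on B-format frames with a rotation in the loop.

    Each pass round the loop rotates the delayed soundfield again, so
    successive echoes arrive from successively rotated directions.
    """
    out = []
    for n, frame in enumerate(frames):
        if n >= delay:
            fed_back = rotate_z(out[n - delay], angle)
            frame = tuple(a + feedback * b for a, b in zip(frame, fed_back))
        out.append(frame)
    return out

# A unit impulse from straight ahead, followed by silence.
silence = (0.0, 0.0, 0.0, 0.0)
frames = [encode_bformat(1.0, 0.0)] + [silence] * 8
echoes = rotating_echo(frames, delay=4, feedback=0.5, angle=math.pi / 2)
```

With these assumed settings the first echo (frame 4) arrives from the left at half amplitude and the second (frame 8) from the rear at quarter amplitude.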

O-format is a 1st order encoding of source directionality, in a sense the reverse of the 1st order Ambisonics encoding B-format. A simple approximation was found for rendering O-format sources in terms of the dominance operation.

oformat.jpg
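For readers unfamiliar with the dominance operation, the sketch below shows the forward-dominance transform as it is commonly stated in the Ambisonics literature. It assumes traditional B-format with W weighted by 1/√2, and it illustrates dominance only, not the O-format rendering approximation itself. The function name and the parameter `lam` are illustrative.

```python
import math

def dominance_x(frame, lam):
    """Apply forward dominance along the X axis to a B-format frame.

    With W weighted by 1/sqrt(2), a source straight ahead is scaled by
    `lam` and a source directly behind by 1/lam. Y and Z are unchanged.
    """
    w, x, y, z = frame
    a = 0.5 * (lam + 1.0 / lam)
    b = lam - 1.0 / lam
    w_out = a * w + b * x / math.sqrt(8.0)
    x_out = a * x + b * w / math.sqrt(2.0)
    return (w_out, x_out, y, z)

# Unit sources encoded straight ahead and directly behind.
front = (1.0 / math.sqrt(2.0), 1.0, 0.0, 0.0)
rear = (1.0 / math.sqrt(2.0), -1.0, 0.0, 0.0)

boosted_front = dominance_x(front, 2.0)
reduced_rear = dominance_x(rear, 2.0)
```

Setting `lam` to 1 leaves the soundfield unchanged, and values below 1 favour the rear instead of the front.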


Menzies, D. 'W-panning and O-format, Tools for object spatialization', AES 22nd International Conference on Virtual, Synthetic and Entertainment Audio, June 2002.
Menzies, D. 'New Performance Instruments for Electroacoustic Music', PhD thesis, University of York Electronics Dept, 1998. (British Library 1999)

A real-time application, LAmb, was built for Silicon Graphics machines, incorporating these sound processes with graphical controls and additional features for object control using an external MIDI keyboard, together with a recorder. It was designed for live diffusion as well as studio mixing.

lamb.tar.gz

LAmb tutorial (pdf)

Menzies, D. 'LAmb, an Introduction and Tutorial', Bourges Synthese, June 1997.


lamb.jpg

