Auditory Saliency Model

by Susan Denham <denham@plymouth.ac.uk>

Centre for Theoretical and Computational Neuroscience, University of Plymouth

2008-02-16

Auditory Saliency Model code

What is the Module?

This is the Matlab implementation of the auditory saliency and perceptual onsets model reported in Coath et al. (2007).

What are the inputs and outputs?

results = auditoryPOnsets(fName,thresh0,div1,doDisplay)

Inputs:

  • fName name of the sound file containing the stimulus (.wav or .au)
  • thresh0 initial threshold for event detection; default = 1.
  • div1 divisor for use in threshold adaptation (should be >=1); default = 2.
  • doDisplay flag; when set, produces a plot of the saliency trace and detected events, and plays a soundtrack with the original stimulus superimposed on the detected event track; default = 0.

Outputs:

results data structure in which the peripheral model responses are returned:

  • .stim structure describing the stimulus, with fields:
      • .sOrig original stimulus
      • .sDet detected event track
      • .fs sampling rate
  • .saliency continuous saliency trace
  • .pOnsets discrete perceptual onsets extracted from the saliency using an adaptive decision threshold; stored as a matrix with column 1 containing event times in seconds, and column 2 a measure of event saliency; each row corresponds to a detected event.
  • .eResp response of the cochlear model
  • .tResp response of the transient model
  • .cortResp response of the cortical filters
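
As a minimal sketch of how the .pOnsets matrix described above might be used, the following computes inter-onset intervals and keeps only the more salient events. The matrix values and the minSal cut-off are made up for illustration and stand in for results.pOnsets; they are not model output.

```matlab
% Stand-in for results.pOnsets (values invented for illustration only):
% column 1 = event time in seconds, column 2 = event saliency.
pOnsets = [0.25 1.8; 0.75 0.6; 1.25 2.4; 2.00 1.1];

onsetTimes = pOnsets(:, 1);          % event times (s)
onsetSal   = pOnsets(:, 2);          % saliency of each event

% Inter-onset intervals between successive detected events (s).
ioi = diff(onsetTimes);

% Keep only events at or above a hypothetical saliency cut-off.
minSal = 1.0;                        % assumption, not a model parameter
strong = pOnsets(onsetSal >= minSal, :);

% Locate the most salient event.
[maxSal, iMax] = max(onsetSal);
fprintf('%d of %d events kept; most salient at %.2f s\n', ...
        size(strong, 1), size(pOnsets, 1), onsetTimes(iMax));
```

With real data, replace the first assignment with pOnsets = results.pOnsets; the rest of the sketch relies only on the two-column layout stated above.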

What are the parameters?

The auditory saliency model itself has essentially no adjustable parameters; it was previously tuned to approximate some of the characteristics of neural responses (for details, see Coath, 2005).

The principal adjustable parameters relate to the decision threshold for detecting discrete perceptual onsets. thresh0 sets the initial event threshold (the first event is the first peak with height > thresh0) and is useful for ignoring spurious low-level early activity in noisy conditions. div1 determines the degree to which the threshold is adapted by successive events.
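
The exact adaptation rule is not spelled out here, so the following is only an illustrative sketch of how a threshold of this kind could behave, not the rule used inside auditoryPOnsets. It assumes that after each detected peak of height p the threshold is reset to p/div1; the toy saliency trace and that update step are both assumptions.

```matlab
% Illustrative sketch only: NOT the rule implemented in auditoryPOnsets.
% Assumed rule: after a detected peak of height p, threshold = p / div1.
saliency = [0 0.2 0.1 1.5 0.3 0.9 0.2 0.4 0.1 2.0 0.5];  % toy trace
thresh0  = 1;    % initial threshold
div1     = 2;    % divisor, should be >= 1

thresh = thresh0;
events = zeros(0, 2);                 % rows: [sample index, peak height]
for k = 2:numel(saliency) - 1
    % A local maximum: higher than the previous sample, not lower than the next.
    isPeak = saliency(k) > saliency(k-1) && saliency(k) >= saliency(k+1);
    if isPeak && saliency(k) > thresh
        events(end+1, :) = [k, saliency(k)]; %#ok<AGROW>
        thresh = saliency(k) / div1;  % assumed adaptation step
    end
end
disp(events)
```

Under this assumed rule, div1 = 1 would require each new event to exceed the previous one, while larger values let progressively weaker peaks through; the early peak of height 0.2 is ignored because it lies below thresh0.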

What are the system requirements?

The only requirements are Matlab (the code was developed in version R2007b) and the Matlab Signal Processing Toolbox. A continuous-time version of this code, suitable for direct translation into C, has also been developed and will be uploaded later.

How to Cite

Coath, M., Denham, S.L., Smith, L.M., Honing, H., Hazan, A., Holonowicz, P., Purwins, H. (2007). An auditory model for the detection of perceptual onsets and beat tracking in singing. Neural Information Processing Systems, Workshop on Music Processing in the Brain, Vanc., Dec 2007.

How to Use, an example

The function call has the following form:

>> results = auditoryPOnsets(fName,thresh0,div1,doDisplay)

An example of the use of this function:

>> results = auditoryPOnsets('ines1.wav',0.5,2,1)

The call returns results with the following fields, produces the displays shown below, and plays the original sound with event markers superimposed:

results =

        stim: [1x1 struct]
    saliency: [1x2846 double]
     pOnsets: [45x2 double]
       eResp: [30x14226 double]
       tResp: [30x2846 double]
    cortResp: [303x2846 double]

Figure: Intermediate outputs

Figure: Saliency trace and perceptual onsets