Saliency-Driven Unstructured Acoustic Scene Classification Using Latent Perceptual Indexing
by Kalinli, Ozlem, Sundaram, Shiva and Narayanan, Shrikanth
Abstract:
Automatic classification of real-life, complex, and unstructured acoustic scenes is a challenging task because the number of acoustic sources present in the audio stream is unknown and the sources overlap in time. In this work, we present a novel approach to classifying such unstructured acoustic scenes. Motivated by the bottom-up attention model of the human auditory system, salient events of an audio clip are extracted in an unsupervised manner and presented to the classification system. Similar to latent semantic indexing of text documents, the classification system uses a unit-document frequency measure to index the clip in a continuous, latent space. This allows for developing a completely class-independent approach to audio classification. Our results on the BBC sound effects library indicate that, using the saliency-driven attention selection approach presented in this paper, a 17.5% relative improvement can be obtained in frame-based classification and a 25% relative improvement can be obtained using the latent audio indexing approach.
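The abstract describes the indexing only by analogy to latent semantic indexing, so the following is a minimal sketch of that analogy, not the authors' implementation: it assumes audio clips have already been quantized into discrete acoustic units, builds a weighted unit-document matrix, and uses a truncated SVD to place clips in a continuous latent space. The vocabulary size, the toy counts, the TF-IDF weighting, and the cosine-similarity lookup are all illustrative assumptions.

# Illustrative sketch only: latent indexing of audio clips in the spirit of
# latent semantic indexing (LSI). Toy data and weighting scheme are assumed
# for the example; they are not the paper's exact pipeline.
import numpy as np

# Rows: discrete acoustic units (e.g. from clustering salient segments);
# columns: audio clips ("documents"). Entries are unit-in-clip counts.
counts = np.array([
    [4, 0, 1, 0],
    [2, 3, 0, 1],
    [0, 5, 2, 0],
    [1, 0, 3, 4],
], dtype=float)

# TF-IDF style weighting of the unit-document matrix.
tf = counts / counts.sum(axis=0, keepdims=True)
df = (counts > 0).sum(axis=1, keepdims=True)
idf = np.log(counts.shape[1] / df)
weighted = tf * idf

# Truncated SVD projects the indexed clips into a continuous latent space.
k = 2
U, s, Vt = np.linalg.svd(weighted, full_matrices=False)
clip_latent = Vt[:k].T  # one k-dimensional row per indexed clip

def fold_in(unit_counts):
    """Fold a new clip's weighted unit vector into the latent space
    (standard LSI fold-in: Sigma_k^{-1} U_k^T d)."""
    v = (unit_counts / unit_counts.sum()) * idf.ravel()
    return (U[:, :k].T @ v) / s[:k]

# Classify a new clip by cosine similarity to the indexed clips.
query = fold_in(np.array([3.0, 1.0, 0.0, 2.0]))
sims = clip_latent @ query / (
    np.linalg.norm(clip_latent, axis=1) * np.linalg.norm(query) + 1e-12)
print("nearest clip:", int(np.argmax(sims)))

In the paper itself, the units are obtained in an unsupervised, saliency-driven manner and the evaluation uses the BBC sound effects library; the toy matrix above merely stands in for that data.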
Reference:
Saliency-Driven Unstructured Acoustic Scene Classification Using Latent Perceptual Indexing (Kalinli, Ozlem, Sundaram, Shiva and Narayanan, Shrikanth), In Proceedings of IEEE MMSP, 2009.
Bibtex Entry:
@inproceedings{kalinli_saliency-driven_2009,
	address = {Rio de Janeiro, Brazil},
	title = {Saliency-{Driven} {Unstructured} {Acoustic} {Scene} {Classification} {Using} {Latent} {Perceptual} {Indexing}},
	url = {http://ict.usc.edu/pubs/Saliency-Driven%20Unstructured%20Acoustic%20Scene%20Classification%20Using%20Latent%20Perceptual%20Indexing.pdf},
	abstract = {Automatic classification of real-life, complex, and unstructured acoustic scenes is a challenging task because the number of acoustic sources present in the audio stream is unknown and the sources overlap in time. In this work, we present a novel approach to classifying such unstructured acoustic scenes. Motivated by the bottom-up attention model of the human auditory system, salient events of an audio clip are extracted in an unsupervised manner and presented to the classification system. Similar to latent semantic indexing of text documents, the classification system uses a unit-document frequency measure to index the clip in a continuous, latent space. This allows for developing a completely class-independent approach to audio classification. Our results on the BBC sound effects library indicate that, using the saliency-driven attention selection approach presented in this paper, a 17.5\% relative improvement can be obtained in frame-based classification and a 25\% relative improvement can be obtained using the latent audio indexing approach.},
	booktitle = {Proceedings of {IEEE} {MMSP}},
	author = {Kalinli, Ozlem and Sundaram, Shiva and Narayanan, Shrikanth},
	month = oct,
	year = {2009}
}