Signature Cluster Model Selection for Incremental Gaussian Mixture Cluster Modeling in Agglomerative Hierarchical Speaker Clustering
by Han, Kyu J. and Narayanan, Shrikanth
Abstract:
Agglomerative hierarchical speaker clustering (AHSC) has been widely used for classifying speech data by speaker characteristics. Its bottom-up, one-way structure of merging the closest cluster pair at every recursion step, however, makes it difficult to recover from incorrect merging. Hence, making AHSC robust to incorrect merging is an important issue. In this paper we address this problem in the framework of AHSC based on incremental Gaussian mixture models, which we previously introduced for better representing variable cluster size. Specifically, to minimize contamination in cluster models by heterogeneous data, we select and keep updating a representative (or signature) model for each cluster during AHSC. Experiments on meeting speech excerpts (4 hours total) verify that the proposed approach improves average speaker clustering performance by approximately 20% (relative).
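Illustrative sketch (not the authors' method): the bottom-up, one-way merging that the abstract describes can be pictured with a generic agglomerative loop. Everything here is an assumption made for illustration: the function names `centroid` and `agglomerate`, the use of 1-D feature values, and the centroid distance, which stands in for the model-based distances a speaker clustering system would use. It does not implement incremental Gaussian mixture models or signature model selection.

```python
from itertools import combinations


def centroid(cluster):
    """Mean of a list of 1-D feature values (hypothetical stand-in for a cluster model)."""
    return sum(cluster) / len(cluster)


def agglomerate(points, target_clusters):
    """Greedy bottom-up clustering: merge the closest pair until target_clusters remain.

    Assumption: clusters are compared by the absolute distance between centroids.
    """
    clusters = [[p] for p in points]  # start with one cluster per point
    while len(clusters) > target_clusters:
        # Find the closest pair of clusters by centroid distance (i < j always holds).
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: abs(
                centroid(clusters[pair[0]]) - centroid(clusters[pair[1]])
            ),
        )
        # Merge cluster j into cluster i. The merge is never revisited, which is
        # the one-way structure that makes an incorrect merge hard to undo.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    print(agglomerate([0.0, 0.1, 5.0, 5.2, 9.0], 2))
```

Because each merged cluster is carried forward unchanged, any heterogeneous data absorbed early stays in that cluster for every later distance computation, which is the contamination the abstract sets out to limit.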
Reference:
Signature Cluster Model Selection for Incremental Gaussian Mixture Cluster Modeling in Agglomerative Hierarchical Speaker Clustering (Han, Kyu J. and Narayanan, Shrikanth), In Proceedings of Interspeech 2009, 2009.
Bibtex Entry:
@inproceedings{han_signature_2009,
	address = {Brighton, UK},
	title = {Signature {Cluster} {Model} {Selection} for {Incremental} {Gaussian} {Mixture} {Cluster} {Modeling} in {Agglomerative} {Hierarchical} {Speaker} {Clustering}},
	url = {http://ict.usc.edu/pubs/Signature%20Cluster%20Model%20Selection.pdf},
	abstract = {Agglomerative hierarchical speaker clustering (AHSC) has been widely used for classifying speech data by speaker characteristics. Its bottom-up, one-way structure of merging the closest cluster pair at every recursion step, however, makes it difficult to recover from incorrect merging. Hence, making AHSC robust to incorrect merging is an important issue. In this paper we address this problem in the framework of AHSC based on incremental Gaussian mixture models, which we previously introduced for better representing variable cluster size. Specifically, to minimize contamination in cluster models by heterogeneous data, we select and keep updating a representative (or signature) model for each cluster during AHSC. Experiments on meeting speech excerpts (4 hours total) verify that the proposed approach improves average speaker clustering performance by approximately 20\% (relative).},
	booktitle = {Proceedings of {Interspeech} 2009},
	author = {Han, Kyu J. and Narayanan, Shrikanth},
	month = oct,
	year = {2009}
}