Efficient Scalable Speech Compression for Scalable Speech Recognition
by Naveen Srinivasamurthy, Antonio Ortega, Shrikanth Narayanan
Abstract:
We propose a scalable recognition system for reducing recognition complexity. Scalable recognition can be combined with scalable compression in a distributed speech recognition (DSR) application to reduce both the computational load and the bandwidth requirement at the server. A low-complexity preprocessor eliminates the unlikely classes so that the complex recognizer needs to consider only the reduced subset of classes when recognizing the unknown utterance. We show that our system makes it fairly straightforward to trade off reductions in complexity against performance degradation. Results of preliminary experiments using the TI-46 word digit database show that the proposed scalable approach can provide a 40% speedup, while operating under 1.05 kbps, compared to baseline recognition using uncompressed speech.
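The two-stage idea in the abstract (a cheap preprocessor prunes unlikely classes, then a costly recognizer scores only the survivors) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the preprocessor selects classes, so keeping the top `keep` classes is an assumption, and every name below (`two_stage_recognize`, `cheap_score`, `full_score`, the toy `templates`) is hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A scorer maps (features, class label) to a score; higher means more likely.
Scorer = Callable[[Sequence[float], str], float]


def two_stage_recognize(
    features: Sequence[float],
    classes: Sequence[str],
    cheap_score: Scorer,
    full_score: Scorer,
    keep: int,
) -> Tuple[str, List[str]]:
    """Prune classes with a cheap scorer, then pick the best with a costly one.

    `keep` (assumed top-k rule) sets the trade-off: a smaller value means fewer
    costly evaluations but a higher risk of pruning the correct class.
    """
    if not 1 <= keep <= len(classes):
        raise ValueError("keep must be between 1 and the number of classes")
    # Stage 1: rank all classes with the low-complexity scorer.
    ranked = sorted(classes, key=lambda c: cheap_score(features, c), reverse=True)
    shortlist = ranked[:keep]
    # Stage 2: run the expensive scorer only on the shortlisted classes.
    best = max(shortlist, key=lambda c: full_score(features, c))
    return best, shortlist


# Toy data (hypothetical): one template vector per word class.
templates: Dict[str, List[float]] = {
    "zero": [0.0, 0.0, 0.0],
    "one": [1.0, 0.0, 1.0],
    "two": [2.0, 1.0, 0.0],
    "three": [3.0, 1.0, 1.0],
}
full_calls: List[str] = []  # records which classes the costly scorer evaluated


def cheap_score(x: Sequence[float], label: str) -> float:
    # Low-complexity stand-in: compare only the first feature.
    return -abs(x[0] - templates[label][0])


def full_score(x: Sequence[float], label: str) -> float:
    # Costly stand-in: negative squared distance over all features.
    full_calls.append(label)
    return -sum((a - b) ** 2 for a, b in zip(x, templates[label]))


utterance = [1.1, 0.1, 0.9]
best, shortlist = two_stage_recognize(
    utterance, list(templates), cheap_score, full_score, keep=2
)
```

In this toy run the costly scorer is evaluated on two of the four classes. Raising `keep` recovers the baseline (all classes scored by the full recognizer), which is one way to picture the complexity versus accuracy trade-off the abstract describes.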
Reference:
Efficient Scalable Speech Compression for Scalable Speech Recognition (Naveen Srinivasamurthy, Antonio Ortega, Shrikanth Narayanan), In Proceedings of the IEEE Conference on Multimedia and Expo, 2000.
Bibtex Entry:
@inproceedings{srinivasamurthy_efficient_2000,
	title = {Efficient {Scalable} {Speech} {Compression} for {Scalable} {Speech} {Recognition}},
	url = {http://ict.usc.edu/pubs/Efficient%20Scalable%20Speech%20Compression%20for%20Scalable%20Speech%20Recognition.pdf},
	abstract = {We propose a scalable recognition system for reducing recognition complexity. Scalable recognition can be combined with scalable compression in a distributed speech recognition (DSR) application to reduce both the computational load and the bandwidth requirement at the server. A low-complexity preprocessor eliminates the unlikely classes so that the complex recognizer needs to consider only the reduced subset of classes when recognizing the unknown utterance. We show that our system makes it fairly straightforward to trade off reductions in complexity against performance degradation. Results of preliminary experiments using the TI-46 word digit database show that the proposed scalable approach can provide a 40\% speedup, while operating under 1.05 kbps, compared to baseline recognition using uncompressed speech.},
	booktitle = {Proceedings of the {IEEE} {Conference} on {Multimedia} and {Expo}},
	author = {Srinivasamurthy, Naveen and Ortega, Antonio and Narayanan, Shrikanth},
	year = {2000}
}