Combining lexical, syntactic and prosodic cues for improved online dialog act tagging
by Sridhar, Vivek Kumar Rangarajan, Bangalore, Srinivas and Narayanan, Shrikanth
Abstract:
Prosody is an important cue for identifying dialog acts. In this paper, we show that modeling the sequence of acoustic-prosodic values as n-gram features with a maximum entropy model for dialog act (DA) tagging can perform better than conventional approaches that use coarse representation of the prosodic contour through summative statistics of the prosodic contour. The proposed scheme for exploiting prosody results in an absolute improvement of 8.7% over the use of most other widely used representations of acoustic correlates of prosody. The proposed scheme is discriminative and exploits context in the form of lexical, syntactic and prosodic cues from preceding discourse segments. Such a decoding scheme facilitates online DA tagging and offers robustness in the decoding process, unlike greedy decoding schemes that can potentially propagate errors. Our approach is different from traditional DA systems that use the entire conversation for offline dialog act decoding with the aid of a discourse model. In contrast, we use only static features and approximate the previous dialog act tags in terms of lexical, syntactic and prosodic information extracted from previous utterances. Experiments on the Switchboard-DAMSL corpus, using only lexical, syntactic and prosodic cues from three previous utterances, yield a DA tagging accuracy of 72% compared to the best case scenario with accurate knowledge of previous DA tags (oracle), which results in 74% accuracy. © 2009 Elsevier Ltd. All rights reserved.
Reference:
Combining lexical, syntactic and prosodic cues for improved online dialog act tagging (Sridhar, Vivek Kumar Rangarajan, Bangalore, Srinivas and Narayanan, Shrikanth), In Computer Speech and Language, volume 23, 2009.
Bibtex Entry:
@article{sridhar_combining_2009,
	title = {Combining lexical, syntactic and prosodic cues for improved online dialog act tagging},
	volume = {23},
	url = {http://ict.usc.edu/pubs/Combining%20lexical,%20syntactic%20and%20prosodic%20cues%20for%20improved%20online%20dialog%20act%20tagging.pdf},
	abstract = {Prosody is an important cue for identifying dialog acts. In this paper, we show that modeling the sequence of acoustic-prosodic values as n-gram features with a maximum entropy model for dialog act (DA) tagging can perform better than conventional approaches that use coarse representation of the prosodic contour through summative statistics of the prosodic contour. The proposed scheme for exploiting prosody results in an absolute improvement of 8.7\% over the use of most other widely used representations of acoustic correlates of prosody. The proposed scheme is discriminative and exploits context in the form of lexical, syntactic and prosodic cues from preceding discourse segments. Such a decoding scheme facilitates online DA tagging and offers robustness in the decoding process, unlike greedy decoding schemes that can potentially propagate errors. Our approach is different from traditional DA systems that use the entire conversation for offline dialog act decoding with the aid of a discourse model. In contrast, we use only static features and approximate the previous dialog act tags in terms of lexical, syntactic and prosodic information extracted from previous utterances. Experiments on the Switchboard-DAMSL corpus, using only lexical, syntactic and prosodic cues from three previous utterances, yield a DA tagging accuracy of 72\% compared to the best case scenario with accurate knowledge of previous DA tags (oracle), which results in 74\% accuracy. © 2009 Elsevier Ltd. All rights reserved.},
	number = {4},
	journal = {Computer Speech and Language},
	author = {Sridhar, Vivek Kumar Rangarajan and Bangalore, Srinivas and Narayanan, Shrikanth},
	month = oct,
	year = {2009},
	pages = {407--422}
}