Multimodal Learning for Identifying Opportunities for Empathetic Responses
by Leili Tavabi, Kalin Stefanov, Setareh Nasihati Gilani, David Traum and Mohammad Soleymani
Abstract:
Embodied interactive agents possessing emotional intelligence and empathy can create natural and engaging social interactions. Providing appropriate responses by interactive virtual agents requires the ability to perceive users’ emotional states. In this paper, we study and analyze behavioral cues that indicate an opportunity to provide an empathetic response. Emotional tone in language and facial expressions are strong indicators of dramatic sentiment in conversation that warrants an empathetic response. To automatically recognize such instances, we develop a multimodal deep neural network for identifying opportunities when the agent should express positive or negative empathetic responses. We train and evaluate our model using audio, video and language from human-agent interactions in a wizard-of-Oz setting, using the wizard’s empathetic responses and annotations collected on Amazon Mechanical Turk as ground-truth labels. Our model outperforms a text-based baseline, achieving an F1-score of 0.71 on a three-class classification task. We further investigate the results and evaluate the capability of such a model to be deployed for real-world human-agent interactions.
Reference:
Multimodal Learning for Identifying Opportunities for Empathetic Responses (Leili Tavabi, Kalin Stefanov, Setareh Nasihati Gilani, David Traum and Mohammad Soleymani), In Proceedings of the 2019 International Conference on Multimodal Interaction, ACM, 2019.
Bibtex Entry:
@inproceedings{tavabi_multimodal_2019,
	address = {Suzhou, China},
	title = {Multimodal {Learning} for {Identifying} {Opportunities} for {Empathetic} {Responses}},
	isbn = {978-1-4503-6860-5},
	url = {https://dl.acm.org/doi/10.1145/3340555.3353750},
	doi = {10.1145/3340555.3353750},
	abstract = {Embodied interactive agents possessing emotional intelligence and empathy can create natural and engaging social interactions. Providing appropriate responses by interactive virtual agents requires the ability to perceive users’ emotional states. In this paper, we study and analyze behavioral cues that indicate an opportunity to provide an empathetic response. Emotional tone in language and facial expressions are strong indicators of dramatic sentiment in conversation that warrants an empathetic response. To automatically recognize such instances, we develop a multimodal deep neural network for identifying opportunities when the agent should express positive or negative empathetic responses. We train and evaluate our model using audio, video and language from human-agent interactions in a wizard-of-Oz setting, using the wizard’s empathetic responses and annotations collected on Amazon Mechanical Turk as ground-truth labels. Our model outperforms a text-based baseline, achieving an F1-score of 0.71 on a three-class classification task. We further investigate the results and evaluate the capability of such a model to be deployed for real-world human-agent interactions.},
	booktitle = {Proceedings of the 2019 {International} {Conference} on {Multimodal} {Interaction}},
	publisher = {ACM},
	author = {Tavabi, Leili and Stefanov, Kalin and Nasihati Gilani, Setareh and Traum, David and Soleymani, Mohammad},
	month = oct,
	year = {2019},
	keywords = {UARC, Virtual Humans},
	pages = {95--104}
}