OpenMM: An Open-Source Multimodal Feature Extraction Tool
by Michelle Renee Morales, Stefan Scherer, Rivka Levitan
Abstract:
The primary use of speech is in face-to-face interactions, and situational context and human behavior therefore intrinsically shape and affect communication. In order to usefully model situational awareness, machines must have access to the same streams of information humans have access to. In other words, we need to provide machines with features that represent each communicative modality: face and gesture, voice and speech, and language. This paper presents OpenMM: an open-source multimodal feature extraction tool. We build upon existing open-source repositories to present the first publicly available tool for multimodal feature extraction. The tool provides a pipeline for researchers to easily extract visual and acoustic features. In addition, the tool also performs automatic speech recognition (ASR) and then uses the transcripts to extract linguistic features. We evaluate OpenMM's multimodal feature set on deception, depression and sentiment classification tasks and show its performance is very promising. This tool provides researchers with a simple way of extracting multimodal features and consequently a richer and more robust feature representation for machine learning tasks.
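To make the pipeline described in the abstract concrete, the following is a minimal, purely illustrative Python sketch: per-modality feature extractors (visual, acoustic), an ASR step that produces a transcript, transcript-based linguistic features, and concatenation into a single multimodal vector. All function names and feature dimensions below are hypothetical placeholders for illustration only, not OpenMM's actual API.

# Illustrative sketch of a multimodal feature pipeline of the kind the abstract
# describes. Placeholder implementations stand in for real extractors.
import numpy as np

def extract_visual_features(video_path: str) -> np.ndarray:
    """Placeholder for face/gesture features from a facial-analysis toolkit."""
    return np.zeros(64)

def extract_acoustic_features(video_path: str) -> np.ndarray:
    """Placeholder for voice/speech features (e.g., prosodic and spectral descriptors)."""
    return np.zeros(32)

def transcribe(video_path: str) -> str:
    """Placeholder for automatic speech recognition (ASR)."""
    return "example transcript"

def extract_linguistic_features(transcript: str) -> np.ndarray:
    """Placeholder for transcript-based features (here, simple lexical counts)."""
    return np.array([len(transcript.split()), len(transcript)], dtype=float)

def multimodal_features(video_path: str) -> np.ndarray:
    """Concatenate visual, acoustic, and linguistic features into one vector."""
    visual = extract_visual_features(video_path)
    acoustic = extract_acoustic_features(video_path)
    linguistic = extract_linguistic_features(transcribe(video_path))
    return np.concatenate([visual, acoustic, linguistic])

if __name__ == "__main__":
    # "interview_01.mp4" is a hypothetical input file name.
    print(multimodal_features("interview_01.mp4").shape)

The resulting concatenated vector is the kind of multimodal representation the paper evaluates on deception, depression, and sentiment classification tasks.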
Reference:
OpenMM: An Open-Source Multimodal Feature Extraction Tool (Michelle Renee Morales, Stefan Scherer, Rivka Levitan), In Proceedings of Interspeech 2017, ISCA, 2017.
Bibtex Entry:
@inproceedings{morales_openmm:_2017,
	address = {Stockholm, Sweden},
	title = {{OpenMM}: {An} {Open}-{Source} {Multimodal} {Feature} {Extraction} {Tool}},
	url = {https://www.researchgate.net/publication/319185055_OpenMM_An_Open-Source_Multimodal_Feature_Extraction_Tool},
	doi = {10.21437/Interspeech.2017-1382},
	abstract = {The primary use of speech is in face-to-face interactions, and situational context and human behavior therefore intrinsically shape and affect communication. In order to usefully model situational awareness, machines must have access to the same streams of information humans have access to. In other words, we need to provide machines with features that represent each communicative modality: face and gesture, voice and speech, and language. This paper presents OpenMM: an open-source multimodal feature extraction tool. We build upon existing open-source repositories to present the first publicly available tool for multimodal feature extraction. The tool provides a pipeline for researchers to easily extract visual and acoustic features. In addition, the tool also performs automatic speech recognition (ASR) and then uses the transcripts to extract linguistic features. We evaluate OpenMM's multimodal feature set on deception, depression and sentiment classification tasks and show its performance is very promising. This tool provides researchers with a simple way of extracting multimodal features and consequently a richer and more robust feature representation for machine learning tasks.},
	booktitle = {Proceedings of {Interspeech} 2017},
	publisher = {ISCA},
	author = {Morales, Michelle Renee and Scherer, Stefan and Levitan, Rivka},
	month = aug,
	year = {2017},
	keywords = {Virtual Humans},
	pages = {3354--3358}
}