ICT Researcher Giving YouTube Views a Whole New Meaning
Computer scientist Louis-Philippe Morency is analyzing online videos to capture the nuances of how people communicate opinions through words and actions
Louis-Philippe Morency spends much of his workday watching YouTube. He logs hours at his desk viewing videos of people expounding on everything from presidential politics to peanut butter preferences.
But Morency is no slacker. He’s a scientist at the University of Southern California Institute for Creative Technologies whose focus is teaching computers to identify and understand the ways people convey emotion – including those times when we say one thing and mean the opposite. And in an interesting twist for those studying human communication, it turns out computers themselves are becoming one of the best places to explore how people express themselves these days.
“There is a growing field of opinion mining right now, where people study internet posts like Amazon book reviews or other text-based product and movie critiques to find out how people feel about a topic,” said Morency, who is also a research assistant professor of computer science at the USC Viterbi School of Engineering. “We are taking this field one step further by focusing on online videos which provide verbal and non-verbal communication clues beyond just words.”
Most people can cite countless cases of misreading either written or body language: the tone-deaf email whose joke lands as an insult, or the conversation where only the accompanying smile or stare reveals whether a statement is sincere.
“By looking at more than just text we can learn when someone is using sarcasm, for example saying they love something when their facial expressions and body language indicate that they hate it,” said Morency.
Social scientists have advanced understanding of everything from autism to cross-cultural differences by studying how people use verbal and non-verbal forms of communication. In the past, researchers needed live subjects to study. But with the increasing volume of videos posted online, the Internet has become an invaluable resource. For his latest effort – figuring out how to identify when someone is sharing a positive, negative or neutral opinion – YouTube provides a limitless library of likes and loathes.
“There are a lot of people sharing their sentiments on YouTube,” said Morency. “The goal of this work is to see if we can find a way to analyze these millions of videos and accurately assess what kinds of views they are expressing.”
To do this, Morency and his colleagues created a proof-of-concept data set of about 50 YouTube videos that feature people expressing their opinions. The videos were input into a computer program Morency developed that zeroes in on aspects of the speaker’s language, speech patterns and facial expressions to determine the type of opinion being shared.
Morency’s small sample has already identified several advantages to analyzing gestures and speech patterns over looking at writing alone. First, people don’t always use obvious polarizing words like love and hate each time they express an opinion. So software programmed to search for these “obvious” occurrences can miss many other valuable posts.
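The limitation of text-only opinion mining is easy to see in a toy sketch. The lexicon and function below are illustrative assumptions, not any real opinion-mining system: a keyword matcher catches reviews that use obvious polarizing words but returns nothing useful when the opinion is phrased any other way.

```python
# Toy lexicon of "obvious" polarizing words (an illustrative assumption,
# not a real opinion-mining vocabulary).
POLAR_WORDS = {"love": 1, "great": 1, "hate": -1, "awful": -1}

def text_only_polarity(review: str) -> str:
    """Score a review by summing lexicon hits over its words."""
    tokens = review.lower().split()
    score = sum(POLAR_WORDS.get(t.strip(".,!?"), 0) for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "unknown"  # no obvious polarizing words found

print(text_only_polarity("I love this peanut butter!"))  # positive
print(text_only_polarity("This blew me away"))           # unknown - opinion missed
```

A clearly enthusiastic review like "This blew me away" is invisible to the keyword matcher, which is exactly the gap that adding non-verbal cues aims to close.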
Also, Morency found that people smile and look at the camera more when sharing a positive view. Their voices become higher pitched when they have a positive or negative opinion, and they start to use a lot more pauses when they are neutral.
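These cues can be sketched as a simple late-fusion classifier. Everything below is a hypothetical illustration of the idea, not Morency's actual model: the feature names, weights, and thresholds are invented, and the rules just encode the observations above (smiles and camera gaze suggest positivity, raised pitch marks a strong opinion either way, frequent pauses suggest neutrality).

```python
from dataclasses import dataclass

@dataclass
class VideoFeatures:
    """Hypothetical per-video cues inspired by the findings described here."""
    smile_ratio: float     # fraction of frames with a detected smile
    gaze_at_camera: float  # fraction of frames looking at the camera
    pitch_delta: float     # mean pitch rise over the speaker's baseline
    pause_ratio: float     # fraction of speaking time spent in pauses
    polarity_words: int    # net count of positive minus negative words

def classify_opinion(f: VideoFeatures) -> str:
    """Fuse verbal and non-verbal cues into a positive/negative/neutral call."""
    # Signed score: text polarity plus non-verbal positivity cues.
    score = f.polarity_words
    score += 2.0 * (f.smile_ratio - 0.5)
    score += 1.0 * (f.gaze_at_camera - 0.5)
    # Intensity: raised pitch signals a strong (non-neutral) opinion,
    # while frequent pauses push toward neutral.
    intensity = abs(score) + f.pitch_delta - 2.0 * f.pause_ratio
    if intensity < 0.5:
        return "neutral"
    return "positive" if score > 0 else "negative"

rave = VideoFeatures(smile_ratio=0.8, gaze_at_camera=0.9,
                     pitch_delta=0.6, pause_ratio=0.1, polarity_words=2)
print(classify_opinion(rave))  # positive
```

A real system would learn such weights from labeled videos rather than hand-tune them; the sketch only shows how independently extracted text, audio, and facial features can be combined into one decision.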
“These early findings are promising but we still have a long way to go,” said Morency. “What they tell us is that what you say, how you say it, and the gestures you make while speaking all play a role in pinpointing the correct sentiment.”
Morency first demonstrated his YouTube model at the International Conference on Multimodal Interaction in Spain last fall. He has since expanded the dataset to include close to 500 videos and will submit results from this larger sample for publication later this year.
The YouTube opinion dataset is also available to other researchers by contacting Morency’s Multimodal Communication and Machine Learning lab at ICT. In the academic community, Morency foresees his research and database serving as resources for scientists working to understand human verbal and non-verbal communication, helping to identify conditions like autism or depression or to build more engaging educational systems. Potential commercial uses include marketing and survey analysis.
As for Morency, he plans to continue to view how people behave over computers in order to make computers behave more like people.
And that is an effort worth watching.