Collecting Better Training Data using Biased Agent Policies in Negotiation Dialogues
by Vasily Konovalov, Oren Melamud, Ron Artstein, Ido Dagan
Abstract:
When naturally occurring data is characterized by a highly skewed class distribution, supervised learning often benefits from reducing this skew. Human-agent dialogue data is commonly highly skewed when using standard agent policies. Hence, we suggest that agent policies need to be reconsidered in the context of training data collection. Specifically, in this work we implemented biased agent policies that are optimized for data collection in the negotiation domain. Empirical evaluations show that our method is successful in collecting a reasonably balanced corpus in the highly skewed Job-Candidate domain. Furthermore, using this balanced corpus to train a negotiation intent classifier yields notable performance improvements relative to naturally distributed data.
Reference:
Collecting Better Training Data using Biased Agent Policies in Negotiation Dialogues (Vasily Konovalov, Oren Melamud, Ron Artstein, Ido Dagan), In Proceedings of WOCHAT, the Second Workshop on Chatbots and Conversational Agent Technologies, Zerotype, 2016.
Bibtex Entry:
@inproceedings{konovalov_collecting_2016,
	address = {Los Angeles},
	title = {Collecting {Better} {Training} {Data} using {Biased} {Agent} {Policies} in {Negotiation} {Dialogues}},
	url = {http://workshop.colips.org/wochat/documents/RP-270.pdf},
	abstract = {When naturally occurring data is characterized by a highly skewed class distribution, supervised learning often benefits from reducing this skew. Human-agent dialogue data is commonly highly skewed when using standard agent policies. Hence, we suggest that agent policies need to be reconsidered in the context of training data collection. Specifically, in this work we implemented biased agent policies that are optimized for data collection in the negotiation domain. Empirical evaluations show that our method is successful in collecting a reasonably balanced corpus in the highly skewed Job-Candidate domain. Furthermore, using this balanced corpus to train a negotiation intent classifier yields notable performance improvements relative to naturally distributed data.},
	booktitle = {Proceedings of {WOCHAT}, the {Second} {Workshop} on {Chatbots} and {Conversational} {Agent} {Technologies}},
	publisher = {Zerotype},
	author = {Konovalov, Vasily and Melamud, Oren and Artstein, Ron and Dagan, Ido},
	month = sep,
	year = {2016},
	keywords = {Virtual Humans}
}