Kenji Sagae, Andrew Gordon: “Clustering Words by Syntactic Similarity Improves Dependency Parsing of Predicate-Argument Structures”

October 7, 2009 | Paris, France

Speakers: Kenji Sagae, Andrew Gordon
Host: International Conference on Parsing Technologies (IWPT-09)

We present an approach for deriving syntactic word clusters from parsed text, grouping words according to their unlexicalized syntactic contexts. We then explore how these clusters can leverage a large corpus of trees generated by a high-accuracy parser to improve the accuracy of a second parser, one based on a different formalism that represents a different level of sentence structure. In our experiments, we use phrase-structure trees to produce syntactic word clusters that are used by a predicate-argument dependency parser, significantly improving its accuracy.
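
As a rough illustration of grouping words by unlexicalized syntactic context, the sketch below builds a count vector for each word from toy phrase-structure trees and merges words whose vectors are similar. The abstract does not spell out the feature definition or the clustering algorithm, so everything here is an assumption made for illustration: the context is taken to be the path of node labels from the root down to the word's part-of-speech tag, similarity is cosine, and clustering is a greedy average-link agglomerative procedure with a hypothetical `threshold` parameter. The trees and function names are invented for the example.

```python
from collections import Counter, defaultdict
import math

# Toy phrase-structure trees as nested tuples: (label, child, ...).
# A preterminal is (POS, word). These trees are invented for illustration.
TREES = [
    ("S", ("NP", ("DT", "the"), ("NN", "dog")), ("VP", ("VBD", "barked"))),
    ("S", ("NP", ("DT", "the"), ("NN", "cat")), ("VP", ("VBD", "slept"))),
    ("S", ("NP", ("DT", "a"), ("NN", "dog")),
          ("VP", ("VBD", "saw"), ("NP", ("DT", "the"), ("NN", "cat")))),
]


def word_contexts(tree, path=()):
    """Yield (word, context) pairs for every leaf in the tree.

    The context is the label path from the root to the preterminal.
    It contains no words, so it is unlexicalized (assumed feature definition).
    """
    label, *children = tree
    path = path + (label,)
    if len(children) == 1 and isinstance(children[0], str):
        yield children[0], "/".join(path)
    else:
        for child in children:
            yield from word_contexts(child, path)


def context_vectors(trees):
    """Map each word to a Counter of the contexts it was seen in."""
    vectors = defaultdict(Counter)
    for tree in trees:
        for word, context in word_contexts(tree):
            vectors[word][context] += 1
    return dict(vectors)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(vectors, threshold=0.5):
    """Greedy average-link agglomerative clustering (assumed algorithm).

    Repeatedly merges the most similar pair of clusters and stops once
    the best average similarity falls below `threshold`.
    """
    clusters = [[word] for word in sorted(vectors)]

    def average_link(c1, c2):
        total = sum(cosine(vectors[a], vectors[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, i, j = max(
            (average_link(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    for members in cluster(context_vectors(TREES)):
        print(sorted(members))
```

On this toy input the determiners, the nouns, and the verbs end up in three separate clusters. In a setting like the one the abstract describes, cluster identifiers of this kind could then be supplied as additional features to a second parser, although how the features are actually integrated is not covered by this sketch.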