Detecting Dependency Parse Errors with Minimal Resources

Markus Dickinson and Amber Smith

Proceedings of IWPT'11.

To detect errors in automatically obtained dependency parses, we take a grammar-based approach. In particular, we develop methods that incorporate n-grams of different lengths and use information about possible parse revisions. As we demonstrate, these methods allow annotators to focus on problematic parses, with the potential to find over half the parse errors by examining only 20% of the data. A key result is that methods using a small gold grammar outperform methods using much larger grammars containing noise. Thus, to perform annotation error detection on newly parsed data, one needs only a small grammar.
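
The general idea of scoring parses against a small gold grammar can be illustrated with a minimal sketch. This is not the released scoring code, and it does not model parse revisions. It assumes that each dependency rule is a tuple of dependent labels around a "HEAD" marker, that rules are padded with hypothetical "START" and "END" symbols, and that a rule's score is simply the sum of the gold-grammar counts of its n-grams. All function names are illustrative.

```python
from collections import Counter


def ngrams(seq, n):
    """Return all contiguous n-grams of seq as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def pad(rule):
    """Add boundary symbols (an assumption of this sketch) around a rule."""
    return ["START"] + list(rule) + ["END"]


def build_grammar(gold_rules, n=2):
    """Count the n-grams that occur in a collection of gold rules."""
    counts = Counter()
    for rule in gold_rules:
        counts.update(ngrams(pad(rule), n))
    return counts


def score_rule(rule, grammar, n=2):
    """Sum gold counts of the rule's n-grams; a lower score is more suspicious."""
    return sum(grammar[g] for g in ngrams(pad(rule), n))


def rank_for_review(parsed_rules, grammar, fraction=0.2, n=2):
    """Return the lowest-scoring fraction of rules for an annotator to inspect."""
    ranked = sorted(parsed_rules, key=lambda r: score_rule(r, grammar, n))
    cutoff = max(1, int(len(ranked) * fraction))
    return ranked[:cutoff]


# Toy gold grammar; the labels are illustrative only.
gold = [("det", "HEAD", "amod"), ("det", "HEAD"), ("nsubj", "HEAD", "obj")]
grammar = build_grammar(gold)

# Rules read off automatically parsed data (also illustrative).
parsed = [
    ("det", "HEAD"),
    ("det", "HEAD", "amod"),
    ("nsubj", "HEAD", "obj"),
    ("obj", "HEAD", "det"),
    ("det", "HEAD", "obj"),
]
to_review = rank_for_review(parsed, grammar, fraction=0.2)
```

In this toy setup, a rule whose bigrams never occur in the gold grammar receives a score of zero and is ranked first for review, which mirrors the intuition that an annotator can concentrate on a small, low-scoring portion of the parsed data.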

The code we used to score dependency parses is now available (in a .tgz bundle).
