Error Detection with Discontinuous Constituents

Markus Dickinson and Walt Detmar Meurers

Proceedings of the Midwest Computational Linguistics Colloquium (MCLC-04), Bloomington, Indiana, 2004.

The idea that variation in annotation can indicate annotation errors has recently been explored to detect errors in part-of-speech (pos) annotation. The method for detecting variation in pos annotation that we proposed in Dickinson and Meurers (2003a, 2003b) is related to the common interannotator agreement evaluation, which compares multiple annotations of the same sentence at the same corpus position. Our inconsistency detection approach differs from the interannotator agreement approach, however, in that it compares occurrences of identical words in similar contexts throughout a single annotated version of the corpus. In this paper, we discuss how this approach can be extended to detect errors in syntactic annotation containing discontinuous constituents.
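
The core idea of comparing identical words in similar contexts within one annotated corpus can be illustrated with a minimal sketch. It is not the implementation from the paper: the function name `variation_candidates`, the fixed window of neighbouring words used as "similar context", the `<s>`/`</s>` padding symbols, and the toy corpus with its tags are all assumptions made for illustration, and the sketch covers only pos annotation, not discontinuous constituents.

```python
from collections import defaultdict


def variation_candidates(corpus, context=1):
    """Collect words that receive more than one tag in an identical context.

    corpus:  list of sentences, each a list of (word, tag) pairs.
    context: number of neighbouring words on each side that must match
             (an illustrative stand-in for "similar context").
    Returns a dict mapping (left_words, word, right_words) to the set of
    tags observed for that word in that context.
    """
    seen = defaultdict(set)
    for sentence in corpus:
        words = [word for word, _ in sentence]
        # Pad so that words at sentence edges still have a full window.
        padded = ["<s>"] * context + words + ["</s>"] * context
        for i, (word, tag) in enumerate(sentence):
            left = tuple(padded[i:i + context])
            right = tuple(padded[i + context + 1:i + 2 * context + 1])
            seen[(left, word, right)].add(tag)
    # Only contexts with more than one tag show variation.
    return {key: tags for key, tags in seen.items() if len(tags) > 1}


# Toy data invented for this example; the tags are not taken from any corpus.
corpus = [
    [("to", "TO"), ("ward", "VB"), ("off", "RP"), ("a", "DT")],
    [("to", "TO"), ("ward", "VB"), ("off", "IN"), ("a", "DT")],
    [("turned", "VBD"), ("off", "RP"), ("the", "DT")],
]

candidates = variation_candidates(corpus)
for (left, word, right), tags in candidates.items():
    print(" ".join(left), f"[{word}]", " ".join(right), sorted(tags))
```

In this toy corpus only "off" between "ward" and "a" is flagged, because it is tagged both RP and IN there. Such variation is only a candidate for an error; a word can be genuinely ambiguous even in a matching context, so flagged cases still need to be checked.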

BibTeX entry:

  author =       {Markus Dickinson and W. Detmar Meurers},
  title =        {Error Detection with Discontinuous Constituents},
  booktitle =    {Proceedings of the Midwest Computational Linguistics Colloquium (MCLC-04)},
  address =      {Bloomington, Indiana},
  year =         {2004},
  url =          {}