Evaluating Distributional Properties of Tagsets

Markus Dickinson and Charles Jochim

Proceedings of the 7th Language Resources and Evaluation Conference (LREC 2010). Valletta, Malta.

We investigate which distributional properties should be present in a tagset by examining different mappings of current part-of-speech tagsets. Given the importance of distributional information, we present a simple model for evaluating how a tagset mapping captures distribution. In addition to an accuracy metric capturing the internal quality of a tagset, we introduce a way to evaluate the external quality of tagset mappings so that we can ensure that the mapping retains linguistically important information from the original tagset.

BibTeX entry:

  author =       {Markus Dickinson and Charles Jochim},
  title =        {Evaluating Distributional Properties of Tagsets},
  booktitle =    {Proceedings of the 7th Language Resources and 
                  Evaluation Conference (LREC 2010)},
  address =      {Valletta, Malta},
  pages =        {},
  url =          {\url{http://cl.indiana.edu/~md7/papers/dickinson-jochim10.html}},
  year =         {2010}