Extending dependency treebanks with good sentences

Alexander Volokh, Günter Neumann; Proceedings of KONVENS 2012 (Main track: poster presentations), pp. 218-222, September 2012.


For many resource-poor languages, additional annotated data would be beneficial. However, the annotation process is tedious and expensive. We propose a metric for selecting the most promising sentences for annotation. Annotating only good sentences saves time and can yield better results even with a smaller amount of annotated data. We demonstrate how our method works by parsing a Finnish dependency treebank with MaltParser.
