Extending the STTS for the annotation of spoken language

Ines Rehbein, Sören Schalowski; Proceedings of KONVENS 2012 (Main track: poster presentations), pp. 238-242, September 2012.


This paper presents an extension to the Stuttgart-Tübingen Tagset (STTS), the standard part-of-speech tagset for German, for the annotation of spoken language. The additional tags cover hesitations, backchannel signals, interruptions, onomatopoeia and uninterpretable material. They capture phenomena specific to spoken language while preserving interoperability with existing corpora of written language.

