Creating annotated resources for polarity classification in Czech

Kateřina Veselovská, Jan Hajič Jr., Jana Šindlerová; Proceedings of KONVENS 2012 (PATHOS 2012 workshop), pp. 296-304, September 2012.


Although the automatic extraction of subjective opinions and emotions has been at the forefront of recent linguistic research for some time, there have been only a few attempts to build sentiment analysis systems for morphologically rich languages (see Hayeon and Hyopil, 2010). This paper presents the first steps towards reliable polarity classification based on Czech data. We describe a method for annotating Czech evaluative structures and build a standard unigram-based Naive Bayes classifier on three different types of annotated texts. Furthermore, we analyze existing results for both manual and automatic annotation, some of which are promising and close to state-of-the-art performance (see Cui, 2006).
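
For readers unfamiliar with the technique, the following is a minimal sketch of a unigram-based multinomial Naive Bayes polarity classifier. It is not the authors' implementation. The function names (`train`, `classify`), the add-one (Laplace) smoothing, the choice to skip tokens unseen in training, and the four toy Czech training sentences are all assumptions made for illustration; the abstract does not specify the smoothing, preprocessing, or data format used in the paper.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Collect class and per-class unigram counts from (tokens, label) pairs."""
    class_counts = Counter()            # number of documents per label
    word_counts = defaultdict(Counter)  # unigram counts per label
    vocab = set()                       # all unigrams seen in training
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def classify(model, tokens):
    """Return the label with the highest log posterior for the given tokens."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior: share of training documents carrying this label.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # assumption: ignore tokens unseen in training
            # Add-one smoothing keeps unseen (token, label) pairs non-zero.
            numerator = word_counts[label][tok] + 1
            score += math.log(numerator / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data; real input would be the annotated Czech texts.
train_docs = [
    ("skvělý film doporučuji".split(), "pos"),
    ("výborný výkon herců".split(), "pos"),
    ("nudný film nedoporučuji".split(), "neg"),
    ("špatný scénář slabý výkon".split(), "neg"),
]
model = train(train_docs)
print(classify(model, "skvělý výkon".split()))
print(classify(model, "nudný scénář".split()))
```

The sketch treats each surface word form as a separate feature. For a morphologically rich language such as Czech, inflected forms of the same word then count as distinct unigrams, which is one reason preprocessing choices such as lemmatization can matter in this setting.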

