Statistical denormalization for Arabic text

Mohammed Moussa, Mohammed Fakhr, Kareem Darwish; Proceedings of KONVENS 2012 (Main track: poster presentations), pp. 228-232, September 2012.


In this paper, we focus on a sub-problem of Arabic text error correction, namely Arabic Text Denormalization. Text Denormalization is considered an important post-processing step when performing machine translation into Arabic. We examine different approaches for denormalization via the use of language modeling, stemming, and sequence labeling. We show the effectiveness of different approaches and how they can be combined to attain better results. We perform intrinsic evaluation as well as extrinsic evaluation in the context of translating from English to Arabic.
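To make the task concrete, here is a minimal sketch of what denormalization involves. It is not the method from the paper. It assumes the normalization scheme widely used in Arabic NLP (hamzated alef forms collapsed to bare alef, ta marbuta to ha, alef maqsura to ya) and restores each normalized word to the form most often seen in a training corpus. The mapping table and the function names `normalize`, `train_table`, and `denormalize` are illustrative assumptions. This unigram lookup is context-free, and that limitation is what approaches such as language modeling and sequence labeling aim to address.

```python
from collections import Counter, defaultdict

# Assumed normalization table (a common convention, not taken from the paper).
NORMALIZE = str.maketrans({
    "\u0623": "\u0627",  # alef with hamza above -> bare alef
    "\u0625": "\u0627",  # alef with hamza below -> bare alef
    "\u0622": "\u0627",  # alef with madda       -> bare alef
    "\u0629": "\u0647",  # ta marbuta            -> ha
    "\u0649": "\u064A",  # alef maqsura          -> ya
})


def normalize(word):
    """Collapse the letter variants listed in NORMALIZE."""
    return word.translate(NORMALIZE)


def train_table(corpus_tokens):
    """Map each normalized form to its most frequent original spelling."""
    counts = defaultdict(Counter)
    for token in corpus_tokens:
        counts[normalize(token)][token] += 1
    return {norm: forms.most_common(1)[0][0] for norm, forms in counts.items()}


def denormalize(tokens, table):
    """Restore each normalized token; unseen tokens are left unchanged."""
    return [table.get(token, token) for token in tokens]
```

Because the lookup ignores context, a normalized form that corresponds to two different valid words is always restored to the more frequent one.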
