WordNet-based lexical simplification of a document

S. Rebecca Thomas, Sven Anderson; Proceedings of KONVENS 2012 (Main track: oral presentations), pp. 80-88, September 2012.


We explore algorithms for automatically generating a limited-size lexicon from a document, such that the lexicon covers as much of the original document's semantic space as possible, as specifically as possible. We evaluate six related algorithms that automatically derive limited-size vocabularies from Wikipedia articles, focusing on nouns and verbs. The proposed algorithms combine Personalized PageRank (Agirre and Soroa, 2009) with principles of information maximization, beginning with a user-supplied document and constructing a small customized vocabulary using WordNet. The best-performing algorithm relies on word-sense disambiguation with sentence-level context information at the earliest stage of analysis, indicating that this computationally costly task is nonetheless valuable.
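
As a rough illustration of the Personalized PageRank idea mentioned in the abstract, the following minimal sketch runs power iteration on a tiny hand-made concept graph, with random restarts biased toward the concepts found in the context. It is not the authors' implementation: the graph, the node labels, the function name `personalized_pagerank`, and the damping value are all assumptions chosen for illustration, and a real system would build the graph from WordNet relations.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Power-iteration Personalized PageRank.

    graph: dict mapping each node to a non-empty list of neighbour nodes
           (undirected edges are listed in both directions).
    seeds: set of nodes that receive the restart (teleport) probability.
    """
    nodes = list(graph)
    # Restart distribution: uniform over the seed nodes, zero elsewhere.
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        # Every node first receives its share of the restart mass.
        new_rank = {n: (1.0 - damping) * teleport[n] for n in nodes}
        # Each node then passes its damped rank equally to its neighbours.
        for n in nodes:
            share = damping * rank[n] / len(graph[n])
            for m in graph[n]:
                new_rank[m] += share
        rank = new_rank
    return rank


# Hypothetical toy graph; the labels are illustrative, not real WordNet synset IDs.
graph = {
    "dog": ["canine"],
    "canine": ["dog", "wolf", "animal"],
    "wolf": ["canine"],
    "animal": ["canine", "cat"],
    "cat": ["animal"],
}
seeds = {"dog", "wolf"}  # concepts assumed to appear in the context
scores = personalized_pagerank(graph, seeds)
for node in sorted(scores, key=scores.get, reverse=True):
    print(f"{node}: {scores[node]:.3f}")
```

Concepts close to the seeds accumulate more rank than distant ones, which is the property that lets context words favour one candidate sense over another.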
