Ranking Sentences by Complexity


Work in readability has typically focused on document-level measures of text difficulty, but many natural language generation applications would benefit from improved models of text difficulty for individual sentences. Fortunately, research in psycholinguistics on the processing of sentences by human readers has yielded promising features, such as surprisal (Hale, 2001), integration cost (Gibson, 2000), embedding depth (van Schijndel and Schuler, 2012; van Schijndel et al., 2013), and idea density (Kintsch, 1972; Kintsch and Keenan, 1973).

In this thesis, I evaluate the effectiveness of these features for ranking sentences by their complexity. Using sentence-aligned corpora drawn from Wikipedia (Hwang et al., 2015, ESEW) and edited news texts (Vajjala, 2015, OSE) as a source of labeled sentence pairs (ESEW) and triples (OSE), I train a series of linear models based on these psycholinguistic features.

My results show that psycholinguistic features improve model accuracy significantly over a simple baseline using word length and sentence length features, with gains in the 0.5 to 3 percentage point range. However, overall model accuracy remains in the 70-80% range, suggesting that these features may not be worth the extra time required to parse sentences using a cognitively-plausible parser.
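
As a rough illustration of the pairwise ranking setup described above, the following minimal sketch trains a linear model on feature differences between the two sentences of a labeled pair, using only the baseline features (sentence length and word length). It is not the thesis implementation: the perceptron update, the exact feature definitions, the function names, and the toy sentence pairs are all assumptions made for the example.

```python
# Minimal sketch of pairwise complexity ranking with a linear model.
# Assumptions (not from the thesis): perceptron training, whitespace
# tokenization, these exact feature definitions, and the toy data below.


def baseline_features(sentence):
    """Return [sentence length in tokens, mean token length in characters]."""
    words = sentence.split()
    if not words:
        return [0.0, 0.0]
    return [float(len(words)), sum(len(w) for w in words) / len(words)]


def feature_difference(sent_a, sent_b):
    """Feature vector of sent_a minus feature vector of sent_b."""
    fa, fb = baseline_features(sent_a), baseline_features(sent_b)
    return [x - y for x, y in zip(fa, fb)]


def train_pairwise_perceptron(pairs, epochs=20):
    """Learn weights from (more_complex, simpler) sentence pairs.

    A pair is ranked correctly when the weighted feature difference
    (complex minus simple) is positive; otherwise the weights are updated.
    """
    weights = [0.0, 0.0]
    for _ in range(epochs):
        for complex_s, simple_s in pairs:
            diff = feature_difference(complex_s, simple_s)
            score = sum(w * d for w, d in zip(weights, diff))
            if score <= 0:
                weights = [w + d for w, d in zip(weights, diff)]
    return weights


def more_complex(weights, sent_a, sent_b):
    """Return True if the model ranks sent_a as more complex than sent_b."""
    diff = feature_difference(sent_a, sent_b)
    return sum(w * d for w, d in zip(weights, diff)) > 0


# Hypothetical toy pairs: (more complex sentence, simpler sentence).
pairs = [
    ("The committee subsequently postponed the deliberations indefinitely .",
     "The group delayed the talks ."),
    ("Precipitation is anticipated throughout the afternoon and evening .",
     "Rain is expected today ."),
]
weights = train_pairwise_perceptron(pairs)
```

Under these assumptions, psycholinguistic features such as surprisal or embedding depth would enter as additional entries in the feature vector; computing them requires a parser and is outside the scope of this sketch.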

Master's thesis

My Master’s thesis on using psycholinguistic features to rank sentences by their linguistic complexity. This work formed the basis for my EACL 2017 submission with Vera Demberg.