Dave Howcroft

Computational Linguist

Reach me on gmail: dave.howcroft

[PGP Public Key]

Find me on: StackOverflow, GitHub, Twitter

About Me

I research the factors that make a text more or less difficult to read and ways to incorporate this knowledge into natural language generation systems. On the complexity side of things, I am interested in the role played by factors like surprisal, embedding depth, dependency length, and idea density in reading comprehension. In addition to factoring these features into models of generation, I have worked on grammar induction for microplanning and the influence of discourse markers on fluency judgements.


Resource Acquisition and Induction for Natural Language Generation

For the last year and a half I have been focusing on making it easier to start new projects in NLG or to port existing systems to new domains. The emphasis in this work is on inducing 'grammars' for microplanning, originally in a template-derived approach (White & Howcroft 2015) but now using Bayesian non-parametric approaches.

Measuring Sentential Complexity

My Master's thesis (Howcroft 2015) evaluated the discriminative power of psycholinguistic metrics in ranking sentences according to their complexity. Using the English and Simple English Wikipedia Corpus (ESEW; Hwang et al. 2015) and the One Stop English Corpus (OSE; Vajjala 2015), I trained an averaged perceptron model using both traditional features (like word and sentence length) and psycholinguistically motivated features (like surprisal and embedding depth). The psycholinguistic features resulted in a small but significant improvement in accuracy.
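To give a flavor of the pairwise ranking setup, here is a minimal averaged-perceptron sketch. The feature set and toy data below are illustrative placeholders, not the actual features or corpora from the thesis.

```python
# Minimal sketch of a pairwise averaged perceptron for complexity ranking.
# The features and data are toy examples, not those used in the thesis.
import numpy as np

def train_ranker(pairs, n_features, epochs=10):
    """pairs: list of (simple_vec, complex_vec) feature arrays.
    Learns weights w such that w . complex_vec > w . simple_vec."""
    w = np.zeros(n_features)
    total = np.zeros(n_features)  # running sum of weights for averaging
    count = 0
    for _ in range(epochs):
        for simple, complex_ in pairs:
            if w.dot(complex_) - w.dot(simple) <= 0:  # ranking mistake
                w += complex_ - simple
            total += w
            count += 1
    return total / count  # averaged weights reduce overfitting to late updates

# Toy feature vectors: [sentence length, mean surprisal]
pairs = [
    (np.array([5.0, 2.0]), np.array([20.0, 6.0])),
    (np.array([8.0, 3.0]), np.array([25.0, 7.0])),
]
w = train_ranker(pairs, n_features=2)

# The learned weights should score the harder sentence higher.
harder = np.array([30.0, 8.0])
easier = np.array([6.0, 2.5])
assert w.dot(harder) > w.dot(easier)
```

The averaging step (returning the mean of all intermediate weight vectors rather than the final one) is the standard trick that makes the perceptron competitive for ranking tasks.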

Information-Theoretic Approaches to Diachronic Questions

Cynthia A. Johnson, Rachel Steindel Burdin, Rory Turnbull, and I are examining adjectival paradigms in Middle and New High German using expected relative entropy. For an overview of the project, you can check out an old handout.
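For readers unfamiliar with the measure, here is a small sketch of expected relative entropy: the Kullback-Leibler divergence of each context-conditioned distribution from the marginal, weighted by how often each context occurs. The contexts, endings, and probabilities below are made-up illustrations, not our German data.

```python
# Hedged sketch of expected relative entropy over morphosyntactic contexts.
# All distributions here are hypothetical, not drawn from the actual study.
import math

def kl(p, q):
    """Relative entropy D(p || q) in bits; assumes q > 0 wherever p > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# P(ending | context) for two hypothetical contexts, with context priors.
contexts = {
    "strong": ([0.7, 0.2, 0.1], 0.6),  # (distribution over endings, P(context))
    "weak":   ([0.3, 0.5, 0.2], 0.4),
}

# Marginal distribution over endings, averaging out the context.
n_endings = 3
baseline = [sum(prior * dist[i] for dist, prior in contexts.values())
            for i in range(n_endings)]

# Expected relative entropy: context-weighted average of each context's
# divergence from the marginal.
expected_re = sum(prior * kl(dist, baseline)
                  for dist, prior in contexts.values())
print(round(expected_re, 3))
```

Because the baseline is the marginal distribution, this quantity equals the mutual information between context and ending: it is zero when the contexts are uninformative and grows as the paradigm differentiates them.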

In January 2014 we presented a poster on our work at the LSA's Friday Morning Plenary Poster Session.

Generating Contrastive Expressions

In 2012 and 2013 I worked with Michael White on the generation of contrastive expressions and presented our work at ENLG. Unfortunately there's no video, but you should read the paper if you're interested:

David M. Howcroft, Crystal Nakatsu, and Michael White. 2013. "Enhancing the Expression of Contrast in the SPaRKy Restaurant Corpus". In Proceedings of the 14th European Workshop on Natural Language Generation. [PDF]

Links and Other Resources

So what is linguistics anyway? There are plenty of sources out there you can find to answer this question for yourself, but one of the best breakdowns I've seen is the two-page introduction Basic Facts about Linguistics (PDF) by Carl Pollard.