Crowdsourcing and evaluating text quality


Over the last decade, crowdsourcing has become a standard method for collecting training data for NLP tasks and for evaluating NLP systems on criteria such as text quality. Many evaluations, however, are still ill-defined.

In the practical portion of this talk, I present an overview of current tasks addressed with crowdsourcing in computational linguistics, along with tools for implementing them. This overview is meant to be interactive: I will share some of the best or most interesting tasks I am aware of, but I would also like us to have a conversation about how you are using crowdsourcing.

After this discussion of tasks, tools, and best practices, I introduce a new research program from the Heriot-Watt NLP Lab looking at human and automatic evaluations for natural language generation. This includes foundational work to make our evaluations better defined, experimental work developing new reading time measures to assess readability, and modeling work seeking new methods of quality estimation that improve upon metrics like BLEU and BERTScore.

20 Jan 2020 15:00 — 16:00
422 Seminar Room
Sir Alwyn Williams Building, University of Glasgow, Glasgow, G12 8QN, United Kingdom
Dave Howcroft
Research Associate

Dave Howcroft is a computational linguist working in the Interaction Lab at Heriot-Watt University's School of Mathematical and Computer Sciences.