
Workshops, etc., on Evaluating Text Quality

There's been a lot of work on evaluating text quality over the years. This page aims to collect past and current workshops, shared tasks, and other resources relating to evaluation. Please get in touch if you see something missing!

Crowdsourcing and evaluating text quality

Over the last decade, crowdsourcing has become a standard method both for collecting training data for NLP tasks and for evaluating NLP systems on dimensions such as text quality. Many such evaluations, however, remain ill-defined. In the practical portion of this …

Arguing for consistency in the human evaluation of natural language generation systems