Publication:
Lessons Learned from a Citizen Science Project for Natural Language Processing

Authors

Klie, Jan-Christoph
Lee, Ji-Ung
Stowe, Kevin
Moosavi, Nafise Sadat
Bates, Luke
Petrak, Dominic
Eckart de Castilho, Richard
Gurevych, Iryna

Publication Date

2023

Language

English

Abstract

Many Natural Language Processing (NLP) systems use annotated corpora for training and evaluation. However, labeled data is often costly to obtain and scaling annotation projects is difficult, which is why annotation tasks are often outsourced to paid crowdworkers. Citizen Science is an alternative to crowdsourcing that is relatively unexplored in the context of NLP. To investigate whether and how well Citizen Science can be applied in this setting, we conduct an exploratory study into engaging different groups of volunteers in Citizen Science for NLP by re-annotating parts of a pre-existing crowdsourced dataset. Our results show that this can yield high-quality annotations and attract motivated volunteers, but also requires considering factors such as scalability, participation over time, and legal and ethical issues. We summarize lessons learned in the form of guidelines and provide our code and data to aid future work on Citizen Science.

Publisher

Association for Computational Linguistics (ACL)

Subject

Computer science, Learning systems, Artificial intelligence

Source

Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023)

DOI

10.18653/v1/2023.eacl-main.261

Copyrights Note

© 2023 Association for Computational Linguistics.