Publication:
Lessons learned from a Citizen Science Project for Natural Language Processing

dc.contributor.coauthorKlie, Jan-Christoph
dc.contributor.coauthorLee, Ji-Ung
dc.contributor.coauthorStowe, Kevin
dc.contributor.coauthorMoosavi, Nafise Sadat
dc.contributor.coauthorBates, Luke
dc.contributor.coauthorPetrak, Dominic
dc.contributor.coauthorde Castilho, Richard Eckart
dc.contributor.coauthorGurevych, Iryna
dc.contributor.departmentDepartment of Computer Engineering
dc.contributor.departmentKUIS AI (Koç University & İş Bank Artificial Intelligence Center)
dc.contributor.facultymemberYes
dc.contributor.kuauthorŞahin, Gözde Gül
dc.contributor.schoolcollegeinstituteCollege of Engineering
dc.contributor.schoolcollegeinstituteResearch Center
dc.date.accessioned2024-12-29T09:37:08Z
dc.date.issued2023
dc.description.abstractMany Natural Language Processing (NLP) systems use annotated corpora for training and evaluation. However, labeled data is often costly to obtain and scaling annotation projects is difficult, which is why annotation tasks are often outsourced to paid crowdworkers. Citizen Science is an alternative to crowdsourcing that is relatively unexplored in the context of NLP. To investigate whether and how well Citizen Science can be applied in this setting, we conduct an exploratory study into engaging different groups of volunteers in Citizen Science for NLP by re-annotating parts of a pre-existing crowdsourced dataset. Our results show that this can yield high-quality annotations and attract motivated volunteers, but also requires considering factors such as scalability, participation over time, and legal and ethical issues. We summarize lessons learned in the form of guidelines and provide our code and data to aid future work on Citizen Science. © 2023 Association for Computational Linguistics.
dc.description.fulltextNo
dc.description.harvestedfromManual
dc.description.indexedbyWOS
dc.description.indexedbyScopus
dc.description.openaccessN/A
dc.description.peerreviewstatusN/A
dc.description.publisherscopeInternational
dc.description.readpublishN/A
dc.description.sponsoredbyTubitakEuN/A
dc.description.sponsorshipThis work has been funded by the German Research Foundation (DFG) as part of the Evidence project (GU 798/27-1), UKP-SQuARE (GU 798/29-1), INCEpTION (GU 798/21-1) and PEER (GU 798/28-1), and within the project “The Third Wave of AI” funded by the Hessian Ministry of Higher Education, Research, Science and the Arts (HMWK). Further, it has been funded by the German Federal Ministry of Education and Research and HMWK within their joint support of the National Research Center for Applied Cybersecurity ATHENE.
dc.description.studentonlypublicationNo
dc.description.studentpublicationNo
dc.description.versionN/A
dc.identifier.doi10.18653/v1/2023.eacl-main.261
dc.identifier.embargoN/A
dc.identifier.isbn978-195942944-9
dc.identifier.quartileTo be checked
dc.identifier.scopus2-s2.0-85159859181
dc.identifier.urihttps://doi.org/10.18653/v1/2023.eacl-main.261
dc.identifier.urihttps://hdl.handle.net/20.500.14288/22276
dc.identifier.wos1181056902041
dc.keywordsComputational linguistics
dc.keywordsCrowdsourcing
dc.keywordsEthical technology
dc.language.isoeng
dc.publisherAssociation for Computational Linguistics (ACL)
dc.relation.affiliationKoç University
dc.relation.collectionKoç University Institutional Repository
dc.relation.ispartofEACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference
dc.relation.openaccessN/A
dc.rightsN/A
dc.subjectComputer science
dc.subjectLearning systems
dc.subjectArtificial intelligence
dc.titleLessons learned from a Citizen Science Project for Natural Language Processing
dc.typeConference Proceeding
dspace.entity.typePublication
local.contributor.kuauthorŞahin, Gözde Gül
relation.isOrgUnitOfPublication89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication77d67233-829b-4c3a-a28f-bd97ab5c12c7
relation.isOrgUnitOfPublication.latestForDiscovery89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isParentOrgUnitOfPublication8e756b23-2d4a-4ce8-b1b3-62c794a8c164
relation.isParentOrgUnitOfPublicationd437580f-9309-4ecb-864a-4af58309d287
relation.isParentOrgUnitOfPublication.latestForDiscovery8e756b23-2d4a-4ce8-b1b3-62c794a8c164
