Publication:
Team Curie at HSD-2Lang 2024: hate speech detection in Turkish and Arabic tweets using BERT-based models

dc.conference.date: March 22, 2024
dc.conference.location: St. Julian's, Malta
dc.conference.organizer: 7th Workshop on Challenges and Applications of Automated Extraction of Socio-Political Events from Text, CASE 2024
dc.contributor.coauthor: Hürriyetoğlu, Ali
dc.contributor.department: Graduate School of Sciences and Engineering
dc.contributor.facultymember: No
dc.contributor.kuauthor: Barkhordar, Ehsan
dc.contributor.kuauthor: Topçu, Işık Sulal
dc.contributor.schoolcollegeinstitute: GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
dc.date.accessioned: 2024-12-29T09:36:46Z
dc.date.issued: 2024
dc.description.abstract: This study focuses on hate speech detection in Turkish and Arabic tweets using advanced BERT-based models. Performance metrics demonstrate the models' effectiveness, with the Turkish variant achieving a 71.8% F1 score and the Arabic model a 76.9% F1 score, ranking them fourth and third, respectively, in a competitive leaderboard. Performance enhancements were realized through targeted preprocessing, including emoji translation and user mention exclusion, and thoughtful data balancing approaches. Future directions include refining model accuracy and broadening language support. Our reproducible approach and detailed findings are accessible on GitHub.
dc.description.fulltext: No
dc.description.harvestedfrom: Manual
dc.description.indexedby: Scopus
dc.description.openaccess: N/A
dc.description.peerreviewstatus: N/A
dc.description.publisherscope: International
dc.description.readpublish: N/A
dc.description.sponsoredbyTubitakEu: EU
dc.description.sponsorship: This work is supported by the European Research Council Politus Project (ID: 101082050) and European Union's HORIZON projects EFRA (ID: 101093026) and ECO-Ready (ID: 101084201).
dc.description.studentonlypublication: Yes
dc.description.studentpublication: Yes
dc.description.version: N/A
dc.identifier.embargo: N/A
dc.identifier.endpage: 220
dc.identifier.grantno: 101093026
dc.identifier.grantno: 101084201
dc.identifier.isbn: 9798891760707
dc.identifier.quartile: N/A
dc.identifier.scopus: 2-s2.0-85190289452
dc.identifier.startpage: 215
dc.identifier.uri: https://hdl.handle.net/20.500.14288/22162
dc.keywords: F1 scores
dc.keywords: Modeling accuracy
dc.keywords: Performance enhancements
dc.keywords: Performance metrics
dc.keywords: Speech detection
dc.keywords: Turkish
dc.language.iso: eng
dc.publisher: Association for Computational Linguistics (ACL)
dc.relation.affiliation: Koç University
dc.relation.collection: Koç University Institutional Repository
dc.relation.ispartof: CASE 2024 - 7th Workshop on Challenges and Applications of Automated Extraction of Socio-Political Events From Text, Proceedings of the Workshop
dc.relation.openaccess: N/A
dc.rights: N/A
dc.subject: Speech recognition
dc.title: Team Curie at HSD-2Lang 2024: hate speech detection in Turkish and Arabic tweets using BERT-based models
dc.type: Conference Proceeding
dspace.entity.type: Publication
local.contributor.kuauthor: Barkhordar, Ehsan
local.contributor.kuauthor: Topçu, Işık Sulal
relation.isOrgUnitOfPublication: 3fc31c89-e803-4eb1-af6b-6258bc42c3d8
relation.isOrgUnitOfPublication.latestForDiscovery: 3fc31c89-e803-4eb1-af6b-6258bc42c3d8
relation.isParentOrgUnitOfPublication: 434c9663-2b11-4e66-9399-c863e2ebae43
relation.isParentOrgUnitOfPublication.latestForDiscovery: 434c9663-2b11-4e66-9399-c863e2ebae43