Publication:
KUISAIL at SemEval-2020 Task 12: BERT-CNN for offensive speech identification in social media

dc.contributor.departmentDepartment of Computer Engineering
dc.contributor.kuauthorYüret, Deniz
dc.contributor.kuauthorSafaya, Ali
dc.contributor.kuauthorIsentemiz, Moutasem
dc.contributor.kuprofileFaculty Member
dc.contributor.kuprofilePhD Student
dc.contributor.kuprofileMaster Student
dc.contributor.yokid179996
dc.contributor.yokidN/A
dc.contributor.yokidN/A
dc.date.accessioned2024-11-09T23:12:31Z
dc.date.issued2020
dc.description.abstractIn this paper, we describe our approach of using pre-trained BERT models with Convolutional Neural Networks for sub-task A of the Multilingual Offensive Language Identification shared task (OffensEval 2020), which is part of SemEval 2020. We show that combining a CNN with BERT performs better than using BERT on its own, and we emphasize the importance of pre-trained language models for downstream tasks. Our system ranked 4th with a macro-averaged F1 score of 0.897 in Arabic, 4th with a score of 0.843 in Greek, and 3rd with a score of 0.814 in Turkish. Additionally, we present ArabicBERT, a set of pre-trained transformer language models for Arabic that we share with the community.
dc.description.indexedbyScopus
dc.description.openaccessYES
dc.description.publisherscopeInternational
dc.identifier.doiN/A
dc.identifier.isbn9781-9521-4831-6
dc.identifier.linkhttps://www.scopus.com/inward/record.uri?eid=2-s2.0-85118416740&partnerID=40&md5=0ea75bf0a2e6450e0159116791ac4892
dc.identifier.urihttps://hdl.handle.net/20.500.14288/9830
dc.keywordsComputational linguistics
dc.keywordsConvolutional neural networks
dc.keywordsSocial networking (online)
dc.keywordsSpeech recognition
dc.keywordsConvolutional neural network
dc.keywordsDown-stream
dc.keywordsF1 scores
dc.keywordsLanguage identification
dc.keywordsLanguage model
dc.keywordsOffensive languages
dc.keywordsSocial media
dc.keywordsSpeech identification
dc.keywordsSubtask
dc.keywordsTurkish
dc.keywordsSemantics
dc.languageEnglish
dc.publisherInternational Committee for Computational Linguistics
dc.source14th International Workshops on Semantic Evaluation, SemEval 2020 - co-located 28th International Conference on Computational Linguistics, COLING 2020, Proceedings
dc.subjectCyberbullying
dc.subjectHate speech
dc.subjectSocial networks
dc.titleKUISAIL at SemEval-2020 Task 12: BERT-CNN for offensive speech identification in social media
dc.typeConference proceeding
dspace.entity.typePublication
local.contributor.authorid0000-0002-7039-0046
local.contributor.authoridN/A
local.contributor.authoridN/A
local.contributor.kuauthorYüret, Deniz
local.contributor.kuauthorSafaya, Ali
local.contributor.kuauthorIsentemiz, Moutasem
local.publication.orgunit1College of Engineering
local.publication.orgunit1Graduate School of Sciences and Engineering
local.publication.orgunit2Department of Computer Engineering
relation.isOrgUnitOfPublication89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication.latestForDiscovery89352e43-bf09-4ef4-82f6-6f9d0174ebae
