Publication:
KUISAIL at SemEval-2020 Task 12: BERT-CNN for offensive speech identification in social media

dc.conference.date: DEC 12-13, 2020
dc.conference.location: Barcelona, Spain
dc.conference.organizer: 14th International Workshops on Semantic Evaluation (SemEval)
dc.contributor.department: KUIS AI (Koç University & İş Bank Artificial Intelligence Center)
dc.contributor.facultymember: Yes
dc.contributor.kuauthor: Isentemiz, Moutasem
dc.contributor.kuauthor: Safaya, Ali
dc.contributor.kuauthor: Yüret, Deniz
dc.contributor.schoolcollegeinstitute: Research Center
dc.date.accessioned: 2024-11-09T23:12:31Z
dc.date.issued: 2020
dc.description.abstract: In this paper, we describe our approach of using pre-trained BERT models with Convolutional Neural Networks for sub-task A of the Multilingual Offensive Language Identification shared task (OffensEval 2020), which is part of SemEval 2020. We show that combining CNN with BERT is better than using BERT on its own, and we emphasize the importance of utilizing pre-trained language models for downstream tasks. Our system ranked 4th with a macro-averaged F1-score of 0.897 in Arabic, 4th with a score of 0.843 in Greek, and 3rd with a score of 0.814 in Turkish. Additionally, we present ArabicBERT, a set of pre-trained transformer language models for Arabic that we share with the community.
dc.description.fulltext: Yes
dc.description.harvestedfrom: Manual
dc.description.indexedby: Scopus
dc.description.indexedby: WOS
dc.description.openaccess: YES
dc.description.peerreviewstatus: N/A
dc.description.publisherscope: International
dc.description.readpublish: N/A
dc.description.sponsoredbyTubitakEu: EU
dc.description.studentonlypublication: No
dc.description.studentpublication: Yes
dc.description.version: Post-print
dc.identifier.embargo: No
dc.identifier.filenameinventoryno: IR06867
dc.identifier.grantno: 714868
dc.identifier.isbn: 9781952148316
dc.identifier.quartile: N/A
dc.identifier.scopus: 2-s2.0-85118416740
dc.identifier.uri: https://hdl.handle.net/20.500.14288/9830
dc.identifier.wos: 001361895500271
dc.keywords: Computational linguistics
dc.keywords: Convolutional neural networks
dc.keywords: Social networking (online)
dc.keywords: Speech recognition
dc.keywords: Convolutional neural network
dc.keywords: Down-stream
dc.keywords: F1 scores
dc.keywords: Language identification
dc.keywords: Language model
dc.keywords: Offensive languages
dc.keywords: Social media
dc.keywords: Speech identification
dc.keywords: Subtask
dc.keywords: Turkishs
dc.keywords: Semantics
dc.language.iso: eng
dc.publisher: International Committee for Computational Linguistics
dc.relation.affiliation: Koç University
dc.relation.collection: Koç University Institutional Repository
dc.relation.ispartof: Proceedings of the Fourteenth Workshop on Semantic Evaluation
dc.relation.openaccess: Yes
dc.rights: Other
dc.subject: Cyberbullying
dc.subject: Hate speech
dc.subject: Social networks
dc.title: KUISAIL at SemEval-2020 Task 12: BERT-CNN for offensive speech identification in social media
dc.type: Conference Proceeding
dspace.entity.type: Publication
local.contributor.kuauthor: Yüret, Deniz
local.contributor.kuauthor: Safaya, Ali
local.contributor.kuauthor: Isentemiz, Moutasem
relation.isOrgUnitOfPublication: 77d67233-829b-4c3a-a28f-bd97ab5c12c7
relation.isOrgUnitOfPublication.latestForDiscovery: 77d67233-829b-4c3a-a28f-bd97ab5c12c7
relation.isParentOrgUnitOfPublication: d437580f-9309-4ecb-864a-4af58309d287
relation.isParentOrgUnitOfPublication.latestForDiscovery: d437580f-9309-4ecb-864a-4af58309d287

Files

Original bundle

Name: IR06867.pdf
Size: 158.74 KB
Format: Adobe Portable Document Format