KUISAIL at SemEval-2020 Task 12: BERT-CNN for offensive speech identification in social media

dc.contributor.department	Department of Computer Engineering
dc.contributor.department	Graduate School of Sciences and Engineering
dc.contributor.kuauthor	Isentemiz, Moutasem
dc.contributor.kuauthor	Safaya, Ali
dc.contributor.kuauthor	Yüret, Deniz
dc.contributor.schoolcollegeinstitute	College of Engineering
dc.contributor.schoolcollegeinstitute	GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
dc.date.accessioned	2024-11-09T23:12:31Z
dc.date.issued	2020
dc.description.abstract	In this paper, we describe our approach to using pre-trained BERT models with Convolutional Neural Networks for sub-task A of the Multilingual Offensive Language Identification shared task (OffensEval 2020), which is part of SemEval 2020. We show that combining a CNN with BERT performs better than using BERT on its own, and we emphasize the importance of utilizing pre-trained language models for downstream tasks. Our system ranked 4th with a macro-averaged F1-score of 0.897 in Arabic, 4th with a score of 0.843 in Greek, and 3rd with a score of 0.814 in Turkish. Additionally, we present ArabicBERT, a set of pre-trained transformer language models for Arabic that we share with the community.
dc.description.indexedby	Scopus
dc.description.openaccess	YES
dc.description.publisherscope	International
dc.description.sponsoredbyTubitakEu	N/A
dc.identifier.isbn	978-1-952148-31-6
dc.identifier.link	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85118416740&partnerID=40&md5=0ea75bf0a2e6450e0159116791ac4892
dc.identifier.uri	https://hdl.handle.net/20.500.14288/9830
dc.keywords	Computational linguistics
dc.keywords	Convolutional neural networks
dc.keywords	Social networking (online)
dc.keywords	Speech recognition
dc.keywords	Convolutional neural network
dc.keywords	Down-stream
dc.keywords	F1 scores
dc.keywords	Language identification
dc.keywords	Language model
dc.keywords	Offensive languages
dc.keywords	Social media
dc.keywords	Speech identification
dc.keywords	Subtask
dc.keywords	Turkish
dc.keywords	Semantics
dc.language.iso	eng
dc.publisher	International Committee for Computational Linguistics
dc.relation.ispartof	14th International Workshops on Semantic Evaluation, SemEval 2020 - co-located 28th International Conference on Computational Linguistics, COLING 2020, Proceedings
dc.subject	Cyberbullying
dc.subject	Hate speech
dc.subject	Social networks
dc.title	KUISAIL at SemEval-2020 Task 12: BERT-CNN for offensive speech identification in social media
dc.type	Conference Proceeding
dspace.entity.type	Publication
local.contributor.kuauthor	Yüret, Deniz
local.contributor.kuauthor	Safaya, Ali
local.contributor.kuauthor	Isentemiz, Moutasem
local.publication.orgunit1	College of Engineering
local.publication.orgunit1	GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
local.publication.orgunit2	Department of Computer Engineering
local.publication.orgunit2	Graduate School of Sciences and Engineering
person.familyName	Isentemiz
person.familyName	Safaya
person.familyName	Yüret
person.givenName	Moutasem
person.givenName	Ali
person.givenName	Deniz
relation.isOrgUnitOfPublication	89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication	3fc31c89-e803-4eb1-af6b-6258bc42c3d8
relation.isOrgUnitOfPublication.latestForDiscovery	89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isParentOrgUnitOfPublication	8e756b23-2d4a-4ce8-b1b3-62c794a8c164
relation.isParentOrgUnitOfPublication	434c9663-2b11-4e66-9399-c863e2ebae43
relation.isParentOrgUnitOfPublication.latestForDiscovery	8e756b23-2d4a-4ce8-b1b3-62c794a8c164

Collections

Publications without Fulltext