Publication: KUISAIL at SemEval-2020 Task 12: BERT-CNN for offensive speech identification in social media
dc.contributor.department | Department of Computer Engineering | |
dc.contributor.department | N/A | |
dc.contributor.department | N/A | |
dc.contributor.kuauthor | Yüret, Deniz | |
dc.contributor.kuauthor | Safaya, Ali | |
dc.contributor.kuauthor | Isentemiz, Moutasem | |
dc.contributor.kuprofile | Faculty Member | |
dc.contributor.kuprofile | PhD Student | |
dc.contributor.kuprofile | Master Student | |
dc.contributor.other | Department of Computer Engineering | |
dc.contributor.schoolcollegeinstitute | College of Engineering | |
dc.contributor.schoolcollegeinstitute | Graduate School of Sciences and Engineering | |
dc.contributor.schoolcollegeinstitute | Graduate School of Sciences and Engineering | |
dc.contributor.yokid | 179996 | |
dc.contributor.yokid | N/A | |
dc.contributor.yokid | N/A | |
dc.date.accessioned | 2024-11-09T23:12:31Z | |
dc.date.issued | 2020 | |
dc.description.abstract | In this paper, we describe our approach to combining pre-trained BERT models with Convolutional Neural Networks for sub-task A of the Multilingual Offensive Language Identification shared task (OffensEval 2020), which is part of SemEval 2020. We show that combining a CNN with BERT performs better than using BERT on its own, and we emphasize the importance of utilizing pre-trained language models for downstream tasks. Our system ranked 4th with a macro-averaged F1 score of 0.897 in Arabic, 4th with a score of 0.843 in Greek, and 3rd with a score of 0.814 in Turkish. Additionally, we present ArabicBERT, a set of pre-trained transformer language models for Arabic that we share with the community. | |
dc.description.indexedby | Scopus | |
dc.description.openaccess | YES | |
dc.description.publisherscope | International | |
dc.identifier.doi | N/A | |
dc.identifier.isbn | 978-1-952148-31-6 | |
dc.identifier.link | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85118416740&partnerID=40&md5=0ea75bf0a2e6450e0159116791ac4892 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14288/9830 | |
dc.keywords | Computational linguistics | |
dc.keywords | Convolutional neural networks | |
dc.keywords | Social networking (online) | |
dc.keywords | Speech recognition | |
dc.keywords | Convolutional neural network | |
dc.keywords | Down-stream | |
dc.keywords | F1 scores | |
dc.keywords | Language identification | |
dc.keywords | Language model | |
dc.keywords | Offensive languages | |
dc.keywords | Social media | |
dc.keywords | Speech identification | |
dc.keywords | Subtask | |
dc.keywords | Turkish | |
dc.keywords | Semantics | |
dc.language | English | |
dc.publisher | International Committee for Computational Linguistics | |
dc.source | 14th International Workshop on Semantic Evaluation, SemEval 2020 - co-located with the 28th International Conference on Computational Linguistics, COLING 2020, Proceedings | |
dc.subject | Cyberbullying | |
dc.subject | Hate speech | |
dc.subject | Social networks | |
dc.title | KUISAIL at SemEval-2020 Task 12: BERT-CNN for offensive speech identification in social media | |
dc.type | Conference proceeding | |
dspace.entity.type | Publication | |
local.contributor.authorid | 0000-0002-7039-0046 | |
local.contributor.authorid | N/A | |
local.contributor.authorid | N/A | |
local.contributor.kuauthor | Yüret, Deniz | |
local.contributor.kuauthor | Safaya, Ali | |
local.contributor.kuauthor | Isentemiz, Moutasem | |
relation.isOrgUnitOfPublication | 89352e43-bf09-4ef4-82f6-6f9d0174ebae | |
relation.isOrgUnitOfPublication.latestForDiscovery | 89352e43-bf09-4ef4-82f6-6f9d0174ebae |