Researcher: Safaya, Ali
Name Variants
Safaya, Ali
Search Results
Publication Metadata only
KUISAIL at SemEval-2020 Task 12: BERT-CNN for offensive speech identification in social media (International Committee for Computational Linguistics, 2020)
Department of Computer Engineering; N/A; N/A; Yüret, Deniz; Safaya, Ali; Isentemiz, Moutasem; Faculty Member; PhD Student; Master Student; Department of Computer Engineering; College of Engineering; Graduate School of Sciences and Engineering; Graduate School of Sciences and Engineering; 179996; N/A; N/A
In this paper, we describe our approach to utilize pre-trained BERT models with Convolutional Neural Networks for sub-task A of the Multilingual Offensive Language Identification shared task (OffensEval 2020), which is a part of SemEval 2020. We show that combining CNN with BERT is better than using BERT on its own, and we emphasize the importance of utilizing pre-trained language models for downstream tasks. Our system ranked 4th with a macro-averaged F1-score of 0.897 in Arabic, 4th with a score of 0.843 in Greek, and 3rd with a score of 0.814 in Turkish. Additionally, we present ArabicBERT, a set of pre-trained transformer language models for Arabic that we share with the community.

Publication Metadata only
Mukayese: Turkish NLP strikes back (Assoc Computational Linguistics-Acl, 2022)
Kurtuluş, Emirhan; Göktoğan, Arda; N/A; Department of Computer Engineering; Safaya, Ali; Yüret, Deniz; PhD Student; Faculty Member; Department of Computer Engineering; Koç Üniversitesi İş Bankası Yapay Zeka Uygulama ve Araştırma Merkezi (KUIS AI)/ Koç University İş Bank Artificial Intelligence Center (KUIS AI); Graduate School of Sciences and Engineering; College of Engineering; N/A; 179996
Having sufficient resources for language X lifts it from the under-resourced languages class, but not necessarily from the under-researched class. In this paper, we address the problem of the absence of organized benchmarks in the Turkish language. We demonstrate that languages such as Turkish are left behind the state-of-the-art in NLP applications. As a solution, we present MUKAYESE, a set of NLP benchmarks for the Turkish language that contains several NLP tasks. We work on one or more datasets for each benchmark and present two or more baselines. Moreover, we present four new benchmarking datasets in Turkish for language modeling, sentence segmentation, and spell checking. All datasets and baselines are available under: https://github.com/alisafaya/mukayese

Publication Open Access
Mukayese: Turkish NLP strikes back (Association for Computational Linguistics (ACL), 2022)
Kurtuluş, Emirhan; Göktoğan, Arda; Department of Computer Engineering; Yüret, Deniz; Safaya, Ali; Faculty Member; Department of Computer Engineering; Koç Üniversitesi İş Bankası Yapay Zeka Uygulama ve Araştırma Merkezi (KUIS AI)/ Koç University İş Bank Artificial Intelligence Center (KUIS AI); College of Engineering; Graduate School of Sciences and Engineering; 179996; N/A
Having sufficient resources for language X lifts it from the under-resourced languages class, but not necessarily from the under-researched class. In this paper, we address the problem of the absence of organized benchmarks in the Turkish language. We demonstrate that languages such as Turkish are left behind the state-of-the-art in NLP applications. As a solution, we present MUKAYESE, a set of NLP benchmarks for the Turkish language that contains several NLP tasks. We work on one or more datasets for each benchmark and present two or more baselines. Moreover, we present four new benchmarking datasets in Turkish for language modeling, sentence segmentation, and spell checking.