Researcher:
Safaya, Ali

Job Title

PhD Student

First Name

Ali

Last Name

Safaya

Name Variants

Safaya, Ali

Search Results

Now showing 1 - 3 of 3
  • Publication
    KUISAIL at SemEval-2020 Task 12: BERT-CNN for offensive speech identification in social media
    (International Committee for Computational Linguistics, 2020) Department of Computer Engineering; N/A; N/A; Yüret, Deniz; Safaya, Ali; Isentemiz, Moutasem; Faculty Member; PhD Student; Master Student; Department of Computer Engineering; College of Engineering; Graduate School of Sciences and Engineering; Graduate School of Sciences and Engineering; 179996; N/A; N/A
    In this paper, we describe our approach to utilizing pre-trained BERT models with Convolutional Neural Networks for sub-task A of the Multilingual Offensive Language Identification shared task (OffensEval 2020), which is part of SemEval 2020. We show that combining CNN with BERT is better than using BERT on its own, and we emphasize the importance of utilizing pre-trained language models for downstream tasks. Our system ranked 4th with a macro-averaged F1-score of 0.897 in Arabic, 4th with a score of 0.843 in Greek, and 3rd with a score of 0.814 in Turkish. Additionally, we present ArabicBERT, a set of pre-trained transformer language models for Arabic that we share with the community.
  • Publication
    Mukayese: Turkish NLP strikes back
    (Assoc Computational Linguistics-Acl, 2022) Kurtuluş, Emirhan; Göktoğan, Arda; N/A; Department of Computer Engineering; Safaya, Ali; Yüret, Deniz; PhD Student; Faculty Member; Department of Computer Engineering; Koç Üniversitesi İş Bankası Yapay Zeka Uygulama ve Araştırma Merkezi (KUIS AI)/ Koç University İş Bank Artificial Intelligence Center (KUIS AI); Graduate School of Sciences and Engineering; College of Engineering; N/A; 179996
    Having sufficient resources for language X lifts it from the under-resourced languages class, but not necessarily from the under-researched class. In this paper, we address the problem of the absence of organized benchmarks in the Turkish language. We demonstrate that languages such as Turkish are left behind the state-of-the-art in NLP applications. As a solution, we present MUKAYESE, a set of NLP benchmarks for the Turkish language that contains several NLP tasks. We work on one or more datasets for each benchmark and present two or more baselines. Moreover, we present four new benchmarking datasets in Turkish for language modeling, sentence segmentation, and spell checking. All datasets and baselines are available under: https://github.com/alisafaya/mukayese
  • Publication (Open Access)
    Mukayese: Turkish NLP strikes back
    (Association for Computational Linguistics (ACL), 2022) Kurtuluş, Emirhan; Göktoğan, Arda; Department of Computer Engineering; Yüret, Deniz; Safaya, Ali; Faculty Member; Department of Computer Engineering; Koç Üniversitesi İş Bankası Yapay Zeka Uygulama ve Araştırma Merkezi (KUIS AI)/ Koç University İş Bank Artificial Intelligence Center (KUIS AI); College of Engineering; Graduate School of Sciences and Engineering; 179996; N/A
    Having sufficient resources for language X lifts it from the under-resourced languages class, but not necessarily from the under-researched class. In this paper, we address the problem of the absence of organized benchmarks in the Turkish language. We demonstrate that languages such as Turkish are left behind the state-of-the-art in NLP applications. As a solution, we present MUKAYESE, a set of NLP benchmarks for the Turkish language that contains several NLP tasks. We work on one or more datasets for each benchmark and present two or more baselines. Moreover, we present four new benchmarking datasets in Turkish for language modeling, sentence segmentation, and spell checking.