Researcher: Başkaya, Osman
Name Variants: Başkaya, Osman
Search Results (showing 1 - 4 of 4)
1. AI-KU: using co-occurrence modeling for semantic similarity (Association for Computational Linguistics (ACL), 2014) [Publication, Metadata only]
Authors: Başkaya, Osman (Master Student, Graduate School of Sciences and Engineering)
Abstract: In this paper, we describe our unsupervised method submitted to the Cross-Level Semantic Similarity task at SemEval 2014, which computes semantic similarity between two text fragments of different sizes. Our method models each text fragment using the co-occurrence statistics of either the words that occur in it or their substitutes. The co-occurrence modeling step provides a dense, low-dimensional embedding for each fragment, which allows us to calculate semantic similarity with various similarity metrics. Although our current model ignores syntactic information, we achieved promising results and outperformed all baselines. © 8th International Workshop on Semantic Evaluation, SemEval 2014, co-located with the 25th International Conference on Computational Linguistics, COLING 2014, Proceedings.

2. Semi-supervised learning with induced word senses for state of the art word sense disambiguation (AI Access Foundation, 2016) [Publication, Metadata only]
Authors: Jurgens, David; Başkaya, Osman (Master Student, Graduate School of Sciences and Engineering)
Abstract: Word Sense Disambiguation (WSD) aims to determine the meaning of a word in context, and successful approaches are known to benefit many applications in Natural Language Processing. Although supervised learning has been shown to provide superior WSD performance, current sense-annotated corpora do not contain a sufficient number of instances per word type to train supervised systems for all words. While unsupervised techniques have been proposed to overcome this data sparsity problem, such techniques have not outperformed supervised methods.
In this paper, we propose a new approach to building semi-supervised WSD systems that combines a small amount of sense-annotated data with information from Word Sense Induction, a fully unsupervised technique that automatically learns the different senses of a word based on how it is used. In three experiments, we show how sense induction models may be effectively combined to ultimately produce high-performance semi-supervised WSD systems that exceed the performance of state-of-the-art supervised WSD techniques trained on the same sense-annotated data. We anticipate that our results and released software will also benefit evaluation practices for sense induction systems and those working in low-resource languages by demonstrating how to quickly produce accurate WSD systems with minimal annotation effort.

3. AI-KU: using substitute vectors and co-occurrence modeling for word sense induction and disambiguation (Association for Computational Linguistics (ACL), 2013) [Publication, Metadata only]
Authors: Başkaya, Osman; Cirik, Volkan (Master Students, Graduate School of Sciences and Engineering); Yüret, Deniz (Faculty Member, Department of Computer Engineering, College of Engineering; Koç Üniversitesi İş Bankası Yapay Zeka Uygulama ve Araştırma Merkezi (KUIS AI) / Koç University İş Bank Artificial Intelligence Center (KUIS AI))
Abstract: Word sense induction aims to discover the different senses of a word from a corpus using unsupervised learning approaches. Once a sense inventory is obtained for an ambiguous word, word sense discrimination approaches choose the best-fitting single sense for a given context from the induced sense inventory. However, there may not be a clear distinction between one sense and another, and more than one induced sense can be suitable for a given context.
The graded word sense method allows a word to be labeled with more than one sense. In contrast to the most common approach, which applies clustering or graph partitioning to a representation of the first- or second-order co-occurrences of a word, we propose a system that creates a substitute vector for each target word from the most likely substitutes suggested by a statistical language model. Word samples are then drawn according to the probabilities of these substitutes, and the results of the co-occurrence model are clustered. This approach outperforms the other systems on the graded word sense induction task in SemEval-2013.

4. AI-KU: using co-occurrence modeling for semantic similarity (Association for Computational Linguistics (ACL), 2014) [Publication, Open Access]
Authors: Başkaya, Osman (Department of Computer Engineering, College of Engineering)
Abstract: identical to the first result above; this is the open-access record of the same paper.
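The co-occurrence similarity idea behind the first and last results can be illustrated with a toy sketch. This is not the authors' implementation: the paper learns dense, low-dimensional embeddings from co-occurrence statistics, whereas the sketch below uses raw sparse co-occurrence counts and plain cosine similarity, and the function names, tokenization, and window size are all invented for the example.

```python
import math
from collections import Counter

def cooccurrence_features(tokens, window=2):
    """Represent a text fragment as counts of word pairs that co-occur
    within a small window (a sparse stand-in for the dense embeddings
    learned in the paper)."""
    feats = Counter()
    n = len(tokens)
    for i in range(n):
        for j in range(i + 1, min(n, i + window + 1)):
            # Sort the pair so (a, b) and (b, a) count as one feature.
            feats[tuple(sorted((tokens[i], tokens[j])))] += 1
    return feats

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(c * v[k] for k, c in u.items() if k in v)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

a = cooccurrence_features("the cat sat on the mat".split())
b = cooccurrence_features("a cat was sitting on a mat".split())
print(cosine(a, b))
```

Identical fragments score 1.0; fragments that share few co-occurring pairs, as in the usage above, score close to 0.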