Publication:
Learning morphological disambiguation rules for Turkish

dc.contributor.department: Department of Computer Engineering
dc.contributor.department: Department of Computer Engineering
dc.contributor.kuauthor: Türe, Ferhan
dc.contributor.kuauthor: Yüret, Deniz
dc.contributor.kuprofile: Undergraduate Student
dc.contributor.kuprofile: Faculty Member
dc.contributor.other: Department of Computer Engineering
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.yokid: N/A
dc.contributor.yokid: 179996
dc.date.accessioned: 2024-11-09T23:46:54Z
dc.date.issued: 2006
dc.description.abstract: In this paper, we present a rule-based model for morphological disambiguation of Turkish. The rules are generated by a novel decision list learning algorithm using supervised training. Morphological ambiguity (e.g. lives = live+s or life+s) is a challenging problem for agglutinative languages like Turkish, where close to half of the words in running text are morphologically ambiguous. Furthermore, a word can take an unlimited number of suffixes, so the number of possible morphological tags is unlimited. We attempted to cope with these problems by training a separate model for each of the 126 morphological features recognized by the morphological analyzer. The resulting decision lists independently vote on each of the potential parses of a word, and the final parse is selected based on our confidence in these votes. The accuracy of our model (96%) is slightly above the best previously reported results, which use statistical models. For comparison, when we train a single decision list on full tags instead of using separate models for each feature, we get 91% accuracy.
dc.description.indexedby: Scopus
dc.description.openaccess: YES
dc.description.publisherscope: International
dc.description.sponsoredbyTubitakEu: N/A
dc.identifier.doi: 10.3115/1220835.1220877
dc.identifier.link: https://www.scopus.com/inward/record.uri?eid=2-s2.0-84858435058&doi=10.3115%2f1220835.1220877&partnerID=40&md5=39587bf525c9097f0d248365b4c392d3
dc.identifier.quartile: N/A
dc.identifier.scopus: 2-s2.0-84858435058
dc.identifier.uri: https://aclanthology.org/N06-1042/
dc.identifier.uri: https://hdl.handle.net/20.500.14288/14038
dc.keywords: Computational linguistics
dc.keywords: Learning algorithms
dc.keywords: Agglutinative language
dc.keywords: Decision lists
dc.keywords: Morphological analyzer
dc.keywords: Morphological disambiguation
dc.keywords: Morphological features
dc.keywords: Rule-based models
dc.keywords: Single decision
dc.keywords: Supervised trainings
dc.keywords: Text processing
dc.language: English
dc.publisher: Association for Computational Linguistics (ACL)
dc.source: HLT-NAACL 2006 - Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings of the Main Conference
dc.subject: Computer engineering
dc.title: Learning morphological disambiguation rules for Turkish
dc.type: Conference proceeding
dspace.entity.type: Publication
local.contributor.authorid: N/A
local.contributor.authorid: 0000-0002-7039-0046
local.contributor.kuauthor: Türe, Ferhan
local.contributor.kuauthor: Yüret, Deniz
relation.isOrgUnitOfPublication: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication.latestForDiscovery: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
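
Illustrative sketch (not part of the record): the abstract describes one decision list per morphological feature, with the lists voting independently on each candidate parse of a word. The Python below is a minimal sketch of that idea under stated assumptions, not the authors' implementation. The names Rule, apply_decision_list and select_parse, the context dictionary, the toy rules, and the scoring rule (summing signed confidences) are all hypothetical; the abstract only says the final parse is selected based on confidence in the votes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Sequence

# Hypothetical context representation: attributes of the word's neighbours.
Context = Dict[str, str]


@dataclass(frozen=True)
class Rule:
    """One decision-list entry: if `condition` holds, predict `present`."""
    condition: Callable[[Context], bool]
    present: bool       # does the correct parse contain this feature?
    confidence: float   # assumed to come from training statistics


def apply_decision_list(rules: Sequence[Rule], context: Context) -> Rule:
    """Decision lists are ordered: the first matching rule decides."""
    for rule in rules:
        if rule.condition(context):
            return rule
    raise ValueError("a decision list needs a default rule that always matches")


def select_parse(
    candidates: Sequence[FrozenSet[str]],
    models: Dict[str, Sequence[Rule]],
    context: Context,
) -> FrozenSet[str]:
    """Let each per-feature model vote on every candidate parse.

    Assumption: a vote adds the rule's confidence when the parse agrees
    with the prediction and subtracts it otherwise.
    """
    def score(parse: FrozenSet[str]) -> float:
        total = 0.0
        for feature, rules in models.items():
            rule = apply_decision_list(rules, context)
            agrees = (feature in parse) == rule.present
            total += rule.confidence if agrees else -rule.confidence
        return total

    return max(candidates, key=score)


# Toy English analogue of the abstract's example: lives = live+s or life+s.
verb_parse = frozenset({"Verb"})
noun_parse = frozenset({"Noun"})
models = {
    "Verb": [
        Rule(lambda c: c.get("prev") in {"he", "she", "it"}, True, 0.9),
        Rule(lambda c: True, False, 0.6),  # default rule
    ],
    "Noun": [
        Rule(lambda c: c.get("prev") in {"their", "our"}, True, 0.8),
        Rule(lambda c: True, False, 0.5),  # default rule
    ],
}

chosen_after_she = select_parse([verb_parse, noun_parse], models, {"prev": "she"})
chosen_after_their = select_parse([verb_parse, noun_parse], models, {"prev": "their"})
```

With these toy rules, a preceding "she" favours the verb parse and a preceding "their" favours the noun parse. Because each feature has its own model, an unseen combination of features can still be scored, which is one plausible reading of why the abstract contrasts this setup with a single decision list over full tags.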
