Publication:
Modeling morphologically rich languages using splitwords and unstructured dependencies

dc.contributor.department: Department of Computer Engineering
dc.contributor.kuauthor: Yüret, Deniz
dc.contributor.kuauthor: Biçici, Ergün
dc.contributor.kuprofile: Faculty Member
dc.contributor.kuprofile: PhD Student
dc.contributor.other: Department of Computer Engineering
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.schoolcollegeinstitute: Graduate School of Sciences and Engineering
dc.contributor.yokid: 179996
dc.contributor.yokid: N/A
dc.date.accessioned: 2024-11-09T23:50:51Z
dc.date.issued: 2009
dc.description.abstract: We experiment with splitting words into their stem and suffix components for modeling morphologically rich languages. We show that using a morphological analyzer and disambiguator results in a significant perplexity reduction in Turkish. We present flexible n-gram models, Flex-Grams, which assume that the n-1 tokens that determine the probability of a given token can be chosen anywhere in the sentence rather than the preceding n-1 positions. Our final model achieves 27% perplexity reduction compared to the standard n-gram model.
dc.description.indexedby: Scopus
dc.description.openaccess: YES
dc.description.publisherscope: International
dc.description.sponsorship: Asian Federation of Natural Language Processing (AFNLP)
dc.description.sponsorship: Association for Computational Linguistics (ACL)
dc.identifier.doi: 10.3115/1667583.1667690
dc.identifier.isbn: 9781-6173-8258-1
dc.identifier.link: https://www.scopus.com/inward/record.uri?eid=2-s2.0-84859062288&doi=10.3115%2f1667583.1667690&partnerID=40&md5=d7266a1e61d9de32751e585fe9d85dba
dc.identifier.quartile: N/A
dc.identifier.scopus: 2-s2.0-84859062288
dc.identifier.uri: http://dx.doi.org/10.3115/1667583.1667690
dc.identifier.uri: https://hdl.handle.net/20.500.14288/14609
dc.keywords: Computational linguistics
dc.keywords: Natural language processing systems
dc.keywords: Text processing
dc.keywords: Morphological analyzer
dc.keywords: N-gram modeling
dc.keywords: N-gram models
dc.keywords: Turkishs
dc.keywords: % reductions
dc.keywords: Splittings
dc.keywords: Modeling languages
dc.language: English
dc.publisher: Association for Computational Linguistics (ACL)
dc.source: ACL-IJCNLP 2009 - Joint Conf. of the 47th Annual Meeting of the Association for Computational Linguistics and 4th Int. Joint Conf. on Natural Language Processing of the AFNLP, Proceedings of the Conf.
dc.subject: Computer engineering
dc.title: Modeling morphologically rich languages using splitwords and unstructured dependencies
dc.type: Conference proceeding
dspace.entity.type: Publication
local.contributor.authorid: 0000-0002-7039-0046
local.contributor.authorid: 0000-0002-2293-2031
local.contributor.kuauthor: Yüret, Deniz
local.contributor.kuauthor: Biçici, Ergün
relation.isOrgUnitOfPublication: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication.latestForDiscovery: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
