Publication: Modeling morphologically rich languages using splitwords and unstructured dependencies
dc.contributor.department | Department of Computer Engineering | |
dc.contributor.kuauthor | Yüret, Deniz | |
dc.contributor.kuauthor | Biçici, Ergün | |
dc.contributor.kuprofile | Faculty Member | |
dc.contributor.kuprofile | PhD Student | |
dc.contributor.other | Department of Computer Engineering | |
dc.contributor.schoolcollegeinstitute | College of Engineering | |
dc.contributor.schoolcollegeinstitute | Graduate School of Sciences and Engineering | |
dc.contributor.yokid | 179996 | |
dc.contributor.yokid | N/A | |
dc.date.accessioned | 2024-11-09T23:50:51Z | |
dc.date.issued | 2009 | |
dc.description.abstract | We experiment with splitting words into their stem and suffix components for modeling morphologically rich languages. We show that using a morphological analyzer and disambiguator results in a significant perplexity reduction in Turkish. We present flexible n-gram models, Flex-Grams, which assume that the n-1 tokens that determine the probability of a given token can be chosen from anywhere in the sentence rather than only from the preceding n-1 positions. Our final model achieves a 27% perplexity reduction compared to the standard n-gram model. | |
dc.description.indexedby | Scopus | |
dc.description.openaccess | YES | |
dc.description.publisherscope | International | |
dc.description.sponsorship | Asian Federation of Natural Language Processing (AFNLP) | |
dc.description.sponsorship | Association for Computational Linguistics (ACL) | |
dc.identifier.doi | 10.3115/1667583.1667690 | |
dc.identifier.isbn | 978-1-61738-258-1 | |
dc.identifier.link | https://www.scopus.com/inward/record.uri?eid=2-s2.0-84859062288&doi=10.3115%2f1667583.1667690&partnerID=40&md5=d7266a1e61d9de32751e585fe9d85dba | |
dc.identifier.quartile | N/A | |
dc.identifier.scopus | 2-s2.0-84859062288 | |
dc.identifier.uri | http://dx.doi.org/10.3115/1667583.1667690 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14288/14609 | |
dc.keywords | Computational linguistics | |
dc.keywords | Natural language processing systems | |
dc.keywords | Text processing | |
dc.keywords | Morphological analyzer | |
dc.keywords | N-gram modeling | |
dc.keywords | N-gram models | |
dc.keywords | Turkish | |
dc.keywords | Perplexity reduction | |
dc.keywords | Word splitting | |
dc.keywords | Modeling languages | |
dc.language | English | |
dc.publisher | Association for Computational Linguistics (ACL) | |
dc.source | ACL-IJCNLP 2009 - Joint Conf. of the 47th Annual Meeting of the Association for Computational Linguistics and 4th Int. Joint Conf. on Natural Language Processing of the AFNLP, Proceedings of the Conf. | |
dc.subject | Computer engineering | |
dc.title | Modeling morphologically rich languages using splitwords and unstructured dependencies | |
dc.type | Conference proceeding | |
dspace.entity.type | Publication | |
local.contributor.authorid | 0000-0002-7039-0046 | |
local.contributor.authorid | 0000-0002-2293-2031 | |
local.contributor.kuauthor | Yüret, Deniz | |
local.contributor.kuauthor | Biçici, Ergün | |
relation.isOrgUnitOfPublication | 89352e43-bf09-4ef4-82f6-6f9d0174ebae | |
relation.isOrgUnitOfPublication.latestForDiscovery | 89352e43-bf09-4ef4-82f6-6f9d0174ebae |