Department: Department of Computer Engineering
Record date: 2024-11-09
Publication year: 2009
ISBN: 9781-6173-8258-1
DOI: 10.3115/1667583.1667690
Scopus EID: 2-s2.0-84859062288
DOI link: http://dx.doi.org/10.3115/1667583.1667690
Handle: https://hdl.handle.net/20.500.14288/14609
Subject: Computer engineering
Title: Modeling morphologically rich languages using splitwords and unstructured dependencies
Type: Conference proceeding
Scopus record: https://www.scopus.com/inward/record.uri?eid=2-s2.0-84859062288&doi=10.3115%2f1667583.1667690&partnerID=40&md5=d7266a1e61d9de32751e585fe9d85dba

Abstract: We experiment with splitting words into their stem and suffix components for modeling morphologically rich languages. We show that using a morphological analyzer and disambiguator results in a significant perplexity reduction in Turkish. We present flexible n-gram models, Flex-Grams, which assume that the n-1 tokens that determine the probability of a given token can be chosen anywhere in the sentence rather than the preceding n-1 positions. Our final model achieves 27% perplexity reduction compared to the standard n-gram model.