Publication:
Modeling morphologically rich languages using splitwords and unstructured dependencies

Publication Date

2009

Language

English

Type

Conference proceeding

Abstract

We experiment with splitting words into their stem and suffix components for modeling morphologically rich languages. We show that using a morphological analyzer and disambiguator results in a significant perplexity reduction in Turkish. We present flexible n-gram models, Flex-Grams, which assume that the n-1 tokens that determine the probability of a given token can be chosen anywhere in the sentence rather than only from the preceding n-1 positions. Our final model achieves a 27% perplexity reduction compared to the standard n-gram model.
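
The two ideas in the abstract, split-word tokens and conditioning on a position other than the immediately preceding one, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `+` marker on suffix tokens, the add-one smoothing, and the single fixed `offset` parameter are all assumptions made for illustration, and the Flex-Gram models in the paper may choose the conditioning positions differently.

```python
import math
from collections import Counter

def split_words(analyses):
    """Flatten (stem, suffix) pairs into one token stream.

    Suffix tokens get a leading '+' (an illustrative convention) so they
    stay distinct from stems; an empty suffix adds no token.
    """
    tokens = []
    for stem, suffix in analyses:
        tokens.append(stem)
        if suffix:
            tokens.append("+" + suffix)
    return tokens

def train_bigram(sentences, offset=1):
    """Count (context, token) pairs, taking the context `offset` tokens back.

    offset=1 is the standard bigram; a larger offset is a simplified
    stand-in for letting the conditioning token sit elsewhere in the sentence.
    """
    pair_counts, ctx_counts, vocab = Counter(), Counter(), set()
    for sent in sentences:
        padded = ["<s>"] * offset + sent
        vocab.update(sent)
        for i in range(offset, len(padded)):
            ctx, tok = padded[i - offset], padded[i]
            pair_counts[(ctx, tok)] += 1
            ctx_counts[ctx] += 1
    return pair_counts, ctx_counts, vocab

def perplexity(sentences, model, offset=1):
    """Per-token perplexity under add-one smoothing (assumed, for simplicity)."""
    pair_counts, ctx_counts, vocab = model
    vocab_size = len(vocab) + 1  # one extra slot for unseen tokens
    log_sum, n = 0.0, 0
    for sent in sentences:
        padded = ["<s>"] * offset + sent
        for i in range(offset, len(padded)):
            ctx, tok = padded[i - offset], padded[i]
            p = (pair_counts[(ctx, tok)] + 1) / (ctx_counts[ctx] + vocab_size)
            log_sum += math.log(p)
            n += 1
    return math.exp(-log_sum / n)

# Turkish "evler" (houses) = ev + ler, "kitaplar" (books) = kitap + lar.
corpus = [
    split_words([("ev", "ler"), ("kitap", "lar")]),
    split_words([("kitap", "lar"), ("ev", "")]),
]
standard = train_bigram(corpus, offset=1)
skipping = train_bigram(corpus, offset=2)
ppl_standard = perplexity(corpus, standard, offset=1)
ppl_skipping = perplexity(corpus, skipping, offset=2)
```

With `offset=2`, each stem in this toy corpus is conditioned on the previous stem rather than on the suffix between them, which is the kind of non-adjacent dependency the abstract describes. The corpus is far too small for the two perplexity values to say anything about the paper's results.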

Description

Source:

ACL-IJCNLP 2009: Joint Conference of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the AFNLP, Proceedings of the Conference

Publisher:

Association for Computational Linguistics (ACL)

Keywords:

Subject

Computer engineering
