Modeling morphologically rich languages using splitwords and unstructured dependencies

Departments

Department of Computer Engineering

School / College / Institute

College of Engineering

Graduate School of Sciences and Engineering


KU-Authors

Biçici, Ergün

Yüret, Deniz

Date

2009

Type

Conference Proceeding

Embargo Status

N/A

Abstract

We experiment with splitting words into their stem and suffix components to model morphologically rich languages. We show that using a morphological analyzer and disambiguator yields a significant perplexity reduction for Turkish. We present flexible n-gram models, Flex-Grams, which assume that the n-1 tokens determining the probability of a given token can be chosen from anywhere in the sentence rather than only from the preceding n-1 positions. Our final model achieves a 27% perplexity reduction compared to the standard n-gram model.
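
The idea of conditioning on a token at a position other than the immediately preceding one can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`context_token`, `train`, `perplexity`), the single fixed `offset`, the add-one smoothing, and the hand-split toy corpus are all assumptions made for illustration. The abstract does not specify how Flex-Grams select conditioning positions or how they are smoothed.

```python
import math
from collections import Counter

PAD = "<pad>"  # hypothetical placeholder for positions outside the sentence


def context_token(sentence, i, offset):
    """Return the token at relative position `offset` from index i, or PAD."""
    j = i + offset
    return sentence[j] if 0 <= j < len(sentence) else PAD


def train(sentences, offset):
    """Count (context, token) pairs, taking the context at a fixed offset."""
    pair_counts, ctx_counts, vocab = Counter(), Counter(), set()
    for s in sentences:
        for i, tok in enumerate(s):
            ctx = context_token(s, i, offset)
            pair_counts[(ctx, tok)] += 1
            ctx_counts[ctx] += 1
            vocab.add(tok)
    return pair_counts, ctx_counts, vocab


def perplexity(model, sentences, offset):
    """Per-token perplexity under add-one smoothed conditional estimates."""
    pair_counts, ctx_counts, vocab = model
    v = len(vocab) + 1  # one extra slot reserved for unseen tokens
    log_sum, n = 0.0, 0
    for s in sentences:
        for i, tok in enumerate(s):
            ctx = context_token(s, i, offset)
            p = (pair_counts[(ctx, tok)] + 1) / (ctx_counts[ctx] + v)
            log_sum += math.log2(p)
            n += 1
    return 2 ** (-log_sum / n)


# Toy corpus, hand-split into stem and suffix tokens (illustrative only).
corpus = [
    ["ev", "+ler", "+de"],
    ["ev", "+de"],
    ["kitap", "+lar", "+da"],
    ["kitap", "+da"],
]

standard = train(corpus, offset=-1)  # condition on the previous token
flexible = train(corpus, offset=-2)  # condition on the token two back
pp_standard = perplexity(standard, corpus, offset=-1)
pp_flexible = perplexity(flexible, corpus, offset=-2)
```

With `offset=-1` this reduces to an ordinary smoothed bigram model over stem and suffix tokens; other negative offsets condition on an earlier, non-adjacent token. Offsets that look ahead (positive values) give per-token conditionals that need not multiply into a normalized sentence probability, which the sketch does not address.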

Publisher

Association for Computational Linguistics (ACL)

Subject

Computer engineering

Source

ACL-IJCNLP 2009: Joint Conference of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the AFNLP, Proceedings of the Conference

DOI

10.3115/1667583.1667690

URI

https://doi.org/10.3115/1667583.1667690
https://hdl.handle.net/20.500.14288/14609
