Publication: Context-based sentence alignment in parallel corpora
dc.contributor.department | Graduate School of Sciences and Engineering | |
dc.contributor.schoolcollegeinstitute | GRADUATE SCHOOL OF SCIENCES AND ENGINEERING | |
dc.date.accessioned | 2024-11-09T23:26:06Z | |
dc.date.issued | 2008 | |
dc.description.abstract | This paper presents a language-independent, context-based sentence alignment technique for parallel corpora. We can view the problem of aligning sentences as finding translations of sentences chosen from different sources. Unlike current approaches, which rely on pre-defined features and models, our algorithm employs features derived from the distributional properties of words and does not use any language-dependent knowledge. We make use of the context of sentences and the notion of Zipfian word vectors, which effectively model the distributional properties of words in a given sentence. We take the context to be the frame in which reasoning about sentence alignment is done. We evaluate the performance of our system using two measures: sentence alignment accuracy and sentence alignment coverage. We compare our system with commonly used sentence alignment systems and show that it performs 1.2149 to 1.6022 times better in reducing the error rate in alignment accuracy and coverage for moderately sized corpora. | |
dc.description.indexedby | WOS | |
dc.description.indexedby | Scopus | |
dc.description.openaccess | NO | |
dc.description.publisherscope | International | |
dc.description.sponsoredbyTubitakEu | N/A | |
dc.description.volume | 4919 | |
dc.identifier.isbn | 978-3-540-78134-9 | |
dc.identifier.issn | 0302-9743 | |
dc.identifier.scopus | 2-s2.0-49949083112 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14288/11490 | |
dc.identifier.wos | 253658200037 | |
dc.keywords | Sentence alignment | |
dc.keywords | Context | |
dc.keywords | Zipfian word vectors | |
dc.keywords | Multilingual | |
dc.language.iso | eng | |
dc.publisher | Springer-Verlag Berlin | |
dc.relation.ispartof | Computational Linguistics and Intelligent Text Processing | |
dc.subject | Computer science | |
dc.subject | Artificial intelligence | |
dc.subject | Computer science | |
dc.subject | Theory and methods | |
dc.title | Context-based sentence alignment in parallel corpora | |
dc.type | Conference Proceeding | |
dspace.entity.type | Publication | |
local.contributor.kuauthor | Biçici, Ergun | |
local.publication.orgunit1 | GRADUATE SCHOOL OF SCIENCES AND ENGINEERING | |
local.publication.orgunit2 | Graduate School of Sciences and Engineering | |
relation.isOrgUnitOfPublication | 3fc31c89-e803-4eb1-af6b-6258bc42c3d8 | |
relation.isOrgUnitOfPublication.latestForDiscovery | 3fc31c89-e803-4eb1-af6b-6258bc42c3d8 | |
relation.isParentOrgUnitOfPublication | 434c9663-2b11-4e66-9399-c863e2ebae43 | |
relation.isParentOrgUnitOfPublication.latestForDiscovery | 434c9663-2b11-4e66-9399-c863e2ebae43 |