Publication:
Local context selection for aligning sentences in parallel corpora

dc.contributor.department: Graduate School of Sciences and Engineering
dc.contributor.schoolcollegeinstitute: GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
dc.date.accessioned: 2024-11-09T23:03:58Z
dc.date.issued: 2007
dc.description.abstract: This paper presents a novel language-independent, context-based sentence alignment technique for parallel corpora. We view sentence alignment as the problem of finding translations of sentences chosen from different sources. Unlike current approaches, which rely on pre-defined features and models, our algorithm employs features derived from the distributional properties of sentences and does not use any language-dependent knowledge. We make use of the context of sentences and introduce the notion of Zipfian word vectors, which effectively model the distributional properties of a given sentence. We take the context to be the frame in which the reasoning about sentence alignment is done. We examine alternatives for local context models and demonstrate that our context-based sentence alignment algorithm performs better than prominent sentence alignment techniques. Our system dynamically selects the local context for a pair of sets of sentences that maximizes the correlation. We evaluate the performance of our system using two measures: sentence alignment accuracy and sentence alignment coverage. We compare our system with commonly used sentence alignment systems and show that it performs 1.1951 to 1.5404 times better in reducing the error rate in alignment accuracy and coverage.
dc.description.indexedby: WOS
dc.description.indexedby: Scopus
dc.description.openaccess: NO
dc.description.publisherscope: International
dc.description.sponsoredbyTubitakEu: N/A
dc.description.volume: 4635
dc.identifier.eissn: 1611-3349
dc.identifier.isbn: 978-3-540-74254-8
dc.identifier.issn: 0302-9743
dc.identifier.scopus: 2-s2.0-37249071820
dc.identifier.uri: https://hdl.handle.net/20.500.14288/8558
dc.identifier.wos: 250748900007
dc.language.iso: eng
dc.publisher: Springer-Verlag Berlin
dc.relation.ispartof: Modeling and Using Context
dc.subject: Computer science, artificial intelligence
dc.subject: Computer science, software engineering
dc.title: Local context selection for aligning sentences in parallel corpora
dc.type: Conference Proceeding
dspace.entity.type: Publication
local.contributor.kuauthor: Biçici, Ergun
local.publication.orgunit1: GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
local.publication.orgunit2: Graduate School of Sciences and Engineering
relation.isOrgUnitOfPublication: 3fc31c89-e803-4eb1-af6b-6258bc42c3d8
relation.isOrgUnitOfPublication.latestForDiscovery: 3fc31c89-e803-4eb1-af6b-6258bc42c3d8
relation.isParentOrgUnitOfPublication: 434c9663-2b11-4e66-9399-c863e2ebae43
relation.isParentOrgUnitOfPublication.latestForDiscovery: 434c9663-2b11-4e66-9399-c863e2ebae43
