Publication: Cross-lingual visual pre-training for multimodal machine translation
dc.contributor.coauthor | Çağlayan, O. | |
dc.contributor.coauthor | Kuyu, M. | |
dc.contributor.coauthor | Amaç, M. S. | |
dc.contributor.coauthor | Madhyastha, P. | |
dc.contributor.coauthor | Erdem, E. | |
dc.contributor.coauthor | Specia, L. | |
dc.contributor.department | Department of Computer Engineering | |
dc.contributor.kuauthor | Erdem, Aykut | |
dc.contributor.schoolcollegeinstitute | College of Engineering | |
dc.date.accessioned | 2024-11-09T12:01:17Z | |
dc.date.issued | 2021 | |
dc.description.abstract | Pre-trained language models have been shown to improve performance in many natural language tasks substantially. Although the early focus of such models was single-language pre-training, recent advances have resulted in cross-lingual and visual pre-training methods. In this paper, we combine these two approaches to learn visually-grounded cross-lingual representations. Specifically, we extend translation language modelling (Lample and Conneau, 2019) with masked region classification and perform pre-training with three-way parallel vision & language corpora. We show that when fine-tuned for multimodal machine translation, these models obtain state-of-the-art performance. We also provide qualitative insights into the usefulness of the learned grounded representations. | |
dc.description.fulltext | YES | |
dc.description.indexedby | Scopus | |
dc.description.openaccess | YES | |
dc.description.publisherscope | International | |
dc.description.sponsoredbyTubitakEu | EU - TÜBİTAK | |
dc.description.sponsorship | MMVC Project | |
dc.description.sponsorship | Scientific and Technological Research Council of Turkey (TÜBİTAK) | |
dc.description.sponsorship | British Council Newton Fund Institutional Links Grant Programme | |
dc.description.sponsorship | European Union (EU) | |
dc.description.sponsorship | Horizon 2020 | |
dc.description.sponsorship | MultiMT Project | |
dc.description.sponsorship | ERC Starting Grant | |
dc.description.sponsorship | Air Force Office of Scientific Research | |
dc.description.sponsorship | TUBA GEBIP Fellowship | |
dc.description.version | Publisher version | |
dc.identifier.doi | 10.18653/v1/2021.eacl-main.112 | |
dc.identifier.embargo | NO | |
dc.identifier.filenameinventoryno | IR02976 | |
dc.identifier.isbn | 978-195408502-2 | |
dc.identifier.quartile | N/A | |
dc.identifier.scopus | 2-s2.0-85107296187 | |
dc.identifier.uri | https://doi.org/10.18653/v1/2021.eacl-main.112 | |
dc.keywords | Cross-lingual | |
dc.keywords | Improve performance | |
dc.keywords | Language model | |
dc.keywords | Machine translations | |
dc.keywords | Natural languages | |
dc.keywords | Parallel vision | |
dc.keywords | Region classifications | |
dc.keywords | State-of-the-art performance | |
dc.language.iso | eng | |
dc.publisher | Association for Computational Linguistics (ACL) | |
dc.relation.grantno | 219E054 | |
dc.relation.grantno | 352343575 | |
dc.relation.grantno | 678017 | |
dc.relation.grantno | FA8655-20-1-7006 | |
dc.relation.ispartof | EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference | |
dc.relation.uri | http://cdm21054.contentdm.oclc.org/cdm/ref/collection/IR/id/9624 | |
dc.subject | Visual languages | |
dc.title | Cross-lingual visual pre-training for multimodal machine translation | |
dc.type | Conference Proceeding | |
dspace.entity.type | Publication | |
local.contributor.kuauthor | Erdem, Aykut | |
local.publication.orgunit1 | College of Engineering | |
local.publication.orgunit2 | Department of Computer Engineering | |
relation.isOrgUnitOfPublication | 89352e43-bf09-4ef4-82f6-6f9d0174ebae | |
relation.isOrgUnitOfPublication.latestForDiscovery | 89352e43-bf09-4ef4-82f6-6f9d0174ebae | |
relation.isParentOrgUnitOfPublication | 8e756b23-2d4a-4ce8-b1b3-62c794a8c164 | |
relation.isParentOrgUnitOfPublication.latestForDiscovery | 8e756b23-2d4a-4ce8-b1b3-62c794a8c164 |