Publication: Cross-lingual visual pre-training for multimodal machine translation
| dc.contributor.coauthor | Caglayan, Ozan | |
| dc.contributor.coauthor | Kuyu, Menekse | |
| dc.contributor.coauthor | Amac, Mustafa Sercan | |
| dc.contributor.coauthor | Madhyastha, Pranava | |
| dc.contributor.coauthor | Erdem, Aykut | |
| dc.contributor.coauthor | Specia, Lucia | |
| dc.contributor.department | Department of Computer Engineering | |
| dc.contributor.facultymember | Yes | |
| dc.contributor.kuauthor | Erdem, Aykut | |
| dc.contributor.schoolcollegeinstitute | College of Engineering | |
| dc.date.accessioned | 2024-11-09T22:50:20Z | |
| dc.date.issued | 2021 | |
| dc.description.abstract | Pre-trained language models have been shown to substantially improve performance in many natural language tasks. Although the early focus of such models was single-language pre-training, recent advances have resulted in cross-lingual and visual pre-training methods. In this paper, we combine these two approaches to learn visually grounded cross-lingual representations. Specifically, we extend translation language modelling (Lample and Conneau, 2019) with masked region classification and perform pre-training with three-way parallel vision & language corpora. We show that when fine-tuned for multimodal machine translation, these models obtain state-of-the-art performance. We also provide qualitative insights into the usefulness of the learned grounded representations. | |
| dc.description.fulltext | No | |
| dc.description.harvestedfrom | Manual | |
| dc.description.indexedby | WOS | |
| dc.description.indexedby | Scopus | |
| dc.description.openaccess | NO | |
| dc.description.peerreviewstatus | N/A | |
| dc.description.publisherscope | International | |
| dc.description.readpublish | N/A | |
| dc.description.sponsoredbyTubitakEu | EU - TÜBİTAK | |
| dc.description.sponsorship | This work was supported in part by the TÜBA GEBİP fellowship awarded to Erkut Erdem; the MMVC project funded by TÜBİTAK [219E054, 352343575] and the British Council through the Newton Fund Institutional Links grant programme [219E054, 352343575]; the MultiMT project (H2020 ERC Starting Grant No. 678017); and the Air Force Office of Scientific Research [FA8655-20-1-7006]. Lucia Specia, Pranava Madhyastha, and Ozan Caglayan also received support from the MultiMT project, while Lucia Specia was additionally supported by the Air Force Office of Scientific Research. | |
| dc.description.version | N/A | |
| dc.identifier.embargo | N/A | |
| dc.identifier.endpage | 1324 | |
| dc.identifier.grantno | 678017 | |
| dc.identifier.grantno | 219E054 | |
| dc.identifier.isbn | 9781954085022 | |
| dc.identifier.quartile | N/A | |
| dc.identifier.scopus | 2-s2.0-85107296187 | |
| dc.identifier.startpage | 1317 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.14288/6658 | |
| dc.identifier.wos | 000863557001034 | |
| dc.keywords | Cross-lingual visual pre-training | |
| dc.keywords | Multimodal machine translation | |
| dc.language.iso | eng | |
| dc.publisher | Association for Computational Linguistics | |
| dc.relation.affiliation | Koç University | |
| dc.relation.collection | Koç University Institutional Repository | |
| dc.relation.ispartof | 16th Conference of the European Chapter of the Association for Computational Linguistics | |
| dc.relation.openaccess | N/A | |
| dc.rights | N/A | |
| dc.subject | Computer science | |
| dc.subject | Artificial intelligence | |
| dc.subject | Linguistics | |
| dc.title | Cross-lingual visual pre-training for multimodal machine translation | |
| dc.type | Conference Proceeding | |
| dspace.entity.type | Publication | |
| local.contributor.kuauthor | Erdem, Aykut | |
| relation.isOrgUnitOfPublication | 89352e43-bf09-4ef4-82f6-6f9d0174ebae | |
| relation.isOrgUnitOfPublication.latestForDiscovery | 89352e43-bf09-4ef4-82f6-6f9d0174ebae | |
| relation.isParentOrgUnitOfPublication | 8e756b23-2d4a-4ce8-b1b3-62c794a8c164 | |
| relation.isParentOrgUnitOfPublication.latestForDiscovery | 8e756b23-2d4a-4ce8-b1b3-62c794a8c164 |
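The abstract above describes a pre-training objective that combines translation language modelling (TLM) over a parallel sentence pair with masked region classification (MRC) over image regions. As an illustration of how such training examples can be constructed, here is a minimal, self-contained sketch; the function and token names (`build_tlm_mrc_example`, `[MASK]`, `[MASK_REGION]`) are hypothetical and do not come from the authors' code.

```python
import random

MASK = "[MASK]"

def build_tlm_mrc_example(src_tokens, tgt_tokens, region_labels,
                          mask_prob=0.15, rng=None):
    """Build one pre-training example in the spirit of TLM + MRC.

    TLM masks words across the concatenated source/target pair so the
    model can attend to the other language (and the image) as context;
    MRC masks object-class labels of visual regions so the model must
    classify them from their visual features. This is an illustrative
    sketch, not the authors' implementation.
    """
    rng = rng or random.Random()

    def mask_seq(tokens, mask_token):
        inp, labels = [], []
        for tok in tokens:
            if rng.random() < mask_prob:
                inp.append(mask_token)
                labels.append(tok)   # model must predict the original
            else:
                inp.append(tok)
                labels.append(None)  # position is not scored
        return inp, labels

    # TLM: one masked sequence over the concatenated sentence pair.
    text_inp, text_labels = mask_seq(
        src_tokens + ["[SEP]"] + tgt_tokens, MASK)
    # MRC: mask some region labels for visual-feature classification.
    region_inp, region_lbls = mask_seq(region_labels, "[MASK_REGION]")
    return text_inp, text_labels, region_inp, region_lbls
```

In an actual model the masked text positions would feed a cross-entropy loss over the vocabulary and the masked region positions a classification loss over object categories; the two losses are summed during pre-training.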
