Publication: Visually grounded language learning for robot navigation
dc.contributor.department | Department of Computer Engineering | |
dc.contributor.kuauthor | Yemez, Yücel | |
dc.contributor.kuauthor | Ünal, Emre | |
dc.contributor.kuauthor | Can, Ozan Arkan | |
dc.contributor.kuprofile | Faculty Member | |
dc.contributor.kuprofile | Other | |
dc.contributor.other | Department of Computer Engineering | |
dc.contributor.schoolcollegeinstitute | College of Engineering | |
dc.contributor.schoolcollegeinstitute | Graduate School of Sciences and Engineering | |
dc.contributor.yokid | 107907 | |
dc.contributor.yokid | N/A | |
dc.contributor.yokid | N/A | |
dc.date.accessioned | 2024-11-09T11:39:49Z | |
dc.date.issued | 2019 | |
dc.description.abstract | We present an end-to-end deep learning model for robot navigation from raw visual pixel input and natural language text instructions. The proposed model is an LSTM-based sequence-to-sequence neural network architecture with attention, which is trained on instruction-perception data samples collected in a synthetic environment. We conduct experiments on the SAIL dataset, which we reconstruct in 3D so as to generate the 2D images associated with the data. Our experiments show that the performance of our model is on a par with the state of the art, despite the fact that it learns navigational language with end-to-end training from raw visual data. | |
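The abstract above describes an LSTM-based sequence-to-sequence model with attention over instruction tokens. As a rough illustrative sketch only (not the paper's implementation; the exact attention variant is not specified in this record, so dot-product attention is assumed here), the core attention step that mixes encoder states into a context vector can be written as:

```python
import numpy as np

def attention_context(encoder_states, decoder_state):
    """Dot-product attention: score each encoder state against the decoder
    state, softmax over time steps, and return the weighted context vector.
    encoder_states: (T, d) array of instruction-token encodings (assumed).
    decoder_state:  (d,) current decoder hidden state (assumed)."""
    scores = encoder_states @ decoder_state          # (T,) alignment scores
    weights = np.exp(scores - scores.max())          # numerically stable softmax
    weights /= weights.sum()
    context = weights @ encoder_states               # (d,) context vector
    return weights, context

# Tiny usage example with random data: 5 instruction tokens, hidden size 8.
rng = np.random.default_rng(0)
enc = rng.normal(size=(5, 8))
dec = rng.normal(size=8)
w, ctx = attention_context(enc, dec)
```

In a full model, `ctx` would be concatenated with the visual features and decoder state at each navigation step; the names above are hypothetical placeholders.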
dc.description.fulltext | YES | |
dc.description.indexedby | Scopus | |
dc.description.openaccess | YES | |
dc.description.publisherscope | International | |
dc.description.sponsoredbyTubitakEu | TÜBİTAK | |
dc.description.sponsorship | Scientific and Technological Research Council of Turkey (TÜBİTAK) | |
dc.description.version | Publisher version | |
dc.format | ||
dc.identifier.doi | 10.1145/3347450.3357655 | |
dc.identifier.embargo | NO | |
dc.identifier.filenameinventoryno | IR01982 | |
dc.identifier.isbn | 9781450369183 | |
dc.identifier.link | https://doi.org/10.1145/3347450.3357655 | |
dc.identifier.quartile | N/A | |
dc.identifier.scopus | 2-s2.0-85074933738 | |
dc.identifier.uri | https://hdl.handle.net/20.500.14288/158 | |
dc.keywords | Instruction following | |
dc.keywords | Natural language processing | |
dc.keywords | Robot navigation | |
dc.keywords | Visual grounding | |
dc.language | English | |
dc.publisher | Association for Computing Machinery (ACM) | |
dc.relation.grantno | N/A | |
dc.relation.uri | http://cdm21054.contentdm.oclc.org/cdm/ref/collection/IR/id/8610 | |
dc.source | MULEA '19: 1st International Workshop on Multimodal Understanding and Learning for Embodied Applications | |
dc.subject | Computer engineering | |
dc.title | Visually grounded language learning for robot navigation | |
dc.type | Conference proceeding | |
dspace.entity.type | Publication | |
local.contributor.authorid | 0000-0002-7515-3138 | |
local.contributor.authorid | N/A | |
local.contributor.authorid | N/A | |
local.contributor.kuauthor | Yemez, Yücel | |
local.contributor.kuauthor | Ünal, Emre | |
local.contributor.kuauthor | Can, Ozan Arkan | |
relation.isOrgUnitOfPublication | 89352e43-bf09-4ef4-82f6-6f9d0174ebae | |
relation.isOrgUnitOfPublication.latestForDiscovery | 89352e43-bf09-4ef4-82f6-6f9d0174ebae |