Publication:
Visually grounded language learning for robot navigation

dc.contributor.department: Department of Computer Engineering
dc.contributor.department: Graduate School of Sciences and Engineering
dc.contributor.kuauthor: Can, Ozan Arkan
dc.contributor.kuauthor: Ünal, Emre
dc.contributor.kuauthor: Yemez, Yücel
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.schoolcollegeinstitute: GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
dc.date.accessioned: 2024-11-09T11:39:49Z
dc.date.issued: 2019
dc.description.abstract: We present an end-to-end deep learning model for robot navigation from raw visual pixel input and natural language text instructions. The proposed model is an LSTM-based sequence-to-sequence neural network architecture with attention, trained on instruction-perception data samples collected in a synthetic environment. We conduct experiments on the SAIL dataset, which we reconstruct in 3D so as to generate the 2D images associated with the data. Our experiments show that the performance of our model is on par with the state of the art, despite the fact that it learns navigational language with end-to-end training from raw visual data.
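The abstract describes an LSTM-based sequence-to-sequence model with attention that maps a text instruction plus raw pixel observations to navigation actions. Below is a minimal sketch of that kind of architecture in PyTorch; all module names, layer sizes, and the four-action space are illustrative assumptions and are not taken from the paper.

# Sketch of a visually grounded instruction-following model of the kind
# the abstract describes. Dimensions and the action set (e.g. forward,
# left, right, stop) are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class InstructionFollower(nn.Module):
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128, n_actions=4):
        super().__init__()
        # Encoder: embeds and encodes the instruction token sequence.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Visual front end: a small CNN over raw RGB pixel input.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(32 * 4 * 4, hidden_dim), nn.ReLU(),
        )
        # Decoder: one LSTM step per navigation action.
        self.decoder = nn.LSTMCell(2 * hidden_dim, hidden_dim)
        self.policy = nn.Linear(hidden_dim, n_actions)

    def forward(self, tokens, images):
        # tokens: (B, T) instruction word ids; images: (B, S, 3, H, W)
        enc_out, (h, c) = self.encoder(self.embed(tokens))
        h, c = h.squeeze(0), c.squeeze(0)
        logits = []
        for t in range(images.size(1)):
            vis = self.cnn(images[:, t])  # (B, hidden)
            # Dot-product attention of decoder state over encoder outputs.
            scores = torch.bmm(enc_out, h.unsqueeze(2)).squeeze(2)  # (B, T)
            ctx = torch.bmm(F.softmax(scores, dim=1).unsqueeze(1),
                            enc_out).squeeze(1)  # (B, hidden)
            h, c = self.decoder(torch.cat([vis, ctx], dim=1), (h, c))
            logits.append(self.policy(h))
        return torch.stack(logits, dim=1)  # (B, S, n_actions)

# Usage: model = InstructionFollower(vocab_size=200)
#        out = model(torch.randint(0, 200, (2, 12)), torch.rand(2, 5, 3, 64, 64))

End-to-end training of such a model would then minimize, for example, the cross-entropy between predicted and demonstrated actions over the instruction-perception pairs mentioned in the abstract.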
dc.description.fulltext: YES
dc.description.indexedby: Scopus
dc.description.openaccess: YES
dc.description.publisherscope: International
dc.description.sponsoredbyTubitakEu: TÜBİTAK
dc.description.sponsorship: Scientific and Technological Research Council of Turkey (TÜBİTAK)
dc.description.version: Publisher version
dc.identifier.doi: 10.1145/3347450.3357655
dc.identifier.embargo: NO
dc.identifier.filenameinventoryno: IR01982
dc.identifier.isbn: 9781450369183
dc.identifier.quartile: N/A
dc.identifier.scopus: 2-s2.0-85074933738
dc.identifier.uri: https://doi.org/10.1145/3347450.3357655
dc.keywords: Instruction following
dc.keywords: Natural language processing
dc.keywords: Robot navigation
dc.keywords: Visual grounding
dc.language.iso: eng
dc.publisher: Association for Computing Machinery (ACM)
dc.relation.grantno: N/A
dc.relation.ispartof: MULEA '19: 1st International Workshop on Multimodal Understanding and Learning for Embodied Applications
dc.relation.uri: http://cdm21054.contentdm.oclc.org/cdm/ref/collection/IR/id/8610
dc.subject: Computer engineering
dc.title: Visually grounded language learning for robot navigation
dc.type: Conference Proceeding
dspace.entity.type: Publication
local.contributor.kuauthor: Yemez, Yücel
local.contributor.kuauthor: Ünal, Emre
local.contributor.kuauthor: Can, Ozan Arkan
local.publication.orgunit1: College of Engineering
local.publication.orgunit1: GRADUATE SCHOOL OF SCIENCES AND ENGINEERING
local.publication.orgunit2: Department of Computer Engineering
local.publication.orgunit2: Graduate School of Sciences and Engineering
relation.isOrgUnitOfPublication: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication: 3fc31c89-e803-4eb1-af6b-6258bc42c3d8
relation.isOrgUnitOfPublication.latestForDiscovery: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isParentOrgUnitOfPublication: 8e756b23-2d4a-4ce8-b1b3-62c794a8c164
relation.isParentOrgUnitOfPublication: 434c9663-2b11-4e66-9399-c863e2ebae43
relation.isParentOrgUnitOfPublication.latestForDiscovery: 8e756b23-2d4a-4ce8-b1b3-62c794a8c164

Files

Original bundle

Name: 8610.pdf
Size: 1.32 MB
Format: Adobe Portable Document Format