Publication:
Visually grounded language learning for robot navigation

dc.contributor.department: Department of Computer Engineering
dc.contributor.kuauthor: Yemez, Yücel
dc.contributor.kuauthor: Ünal, Emre
dc.contributor.kuauthor: Can, Ozan Arkan
dc.contributor.kuprofile: Faculty Member
dc.contributor.kuprofile: Other
dc.contributor.other: Department of Computer Engineering
dc.contributor.schoolcollegeinstitute: College of Engineering
dc.contributor.schoolcollegeinstitute: Graduate School of Sciences and Engineering
dc.contributor.yokid: 107907
dc.contributor.yokid: N/A
dc.contributor.yokid: N/A
dc.date.accessioned: 2024-11-09T11:39:49Z
dc.date.issued: 2019
dc.description.abstract: We present an end-to-end deep learning model for robot navigation from raw visual pixel input and natural text instructions. The proposed model is an LSTM-based sequence-to-sequence neural network architecture with attention, trained on instruction-perception data samples collected in a synthetic environment. We conduct experiments on the SAIL dataset, which we reconstruct in 3D so as to generate the 2D images associated with the data. Our experiments show that the performance of our model is on a par with the state of the art, despite the fact that it learns navigational language with end-to-end training from raw visual data. (An illustrative sketch of such an architecture follows this record.)
dc.description.fulltext: YES
dc.description.indexedby: Scopus
dc.description.openaccess: YES
dc.description.publisherscope: International
dc.description.sponsoredbyTubitakEu: TÜBİTAK
dc.description.sponsorship: Scientific and Technological Research Council of Turkey (TÜBİTAK)
dc.description.version: Publisher version
dc.format: pdf
dc.identifier.doi: 10.1145/3347450.3357655
dc.identifier.embargo: NO
dc.identifier.filenameinventoryno: IR01982
dc.identifier.isbn: 9781450369183
dc.identifier.link: https://doi.org/10.1145/3347450.3357655
dc.identifier.quartile: N/A
dc.identifier.scopus: 2-s2.0-85074933738
dc.identifier.uri: https://hdl.handle.net/20.500.14288/158
dc.keywords: Instruction following
dc.keywords: Natural language processing
dc.keywords: Robot navigation
dc.keywords: Visual grounding
dc.language: English
dc.publisher: Association for Computing Machinery (ACM)
dc.relation.grantno: 1.79769313486232E+308
dc.relation.uri: http://cdm21054.contentdm.oclc.org/cdm/ref/collection/IR/id/8610
dc.source: MULEA '19: 1st International Workshop on Multimodal Understanding and Learning for Embodied Applications
dc.subject: Computer engineering
dc.title: Visually grounded language learning for robot navigation
dc.type: Conference proceeding
dspace.entity.type: Publication
local.contributor.authorid: 0000-0002-7515-3138
local.contributor.authorid: N/A
local.contributor.authorid: N/A
local.contributor.kuauthor: Yemez, Yücel
local.contributor.kuauthor: Ünal, Emre
local.contributor.kuauthor: Can, Ozan Arkan
relation.isOrgUnitOfPublication: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
relation.isOrgUnitOfPublication.latestForDiscovery: 89352e43-bf09-4ef4-82f6-6f9d0174ebae
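
The abstract above describes the architecture but the record carries no code. Below is a minimal, illustrative Python (PyTorch) sketch of an LSTM sequence-to-sequence model that attends over instruction words while emitting one action per navigation step. It is an assumption-laden reconstruction, not the authors' implementation: the class name Seq2SeqNavigator and all layer sizes (emb, hid, img_feat) are hypothetical, and in the paper's end-to-end setting visual_feats would come from a CNN over raw pixels, omitted here.

import torch
import torch.nn as nn

class Seq2SeqNavigator(nn.Module):
    """Hypothetical sketch: LSTM encoder over the instruction, LSTM decoder
    over time steps with additive attention, in the spirit of the
    instruction-following architecture the abstract describes."""
    def __init__(self, vocab_size, n_actions, emb=64, hid=128, img_feat=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        # Encoder reads the natural-language instruction token by token.
        self.encoder = nn.LSTM(emb, hid, batch_first=True)
        # Decoder consumes the current visual observation plus attended context.
        self.decoder = nn.LSTMCell(img_feat + hid, hid)
        self.attn = nn.Linear(hid * 2, 1)        # additive attention score
        self.policy = nn.Linear(hid, n_actions)  # action logits per step

    def forward(self, instr_ids, visual_feats):
        # instr_ids: (B, T_words); visual_feats: (B, T_steps, img_feat),
        # e.g. CNN features of the agent's current view at each step.
        enc_out, _ = self.encoder(self.embed(instr_ids))  # (B, T_words, hid)
        B, T_steps, _ = visual_feats.shape
        h = enc_out.new_zeros(B, enc_out.size(-1))
        c = torch.zeros_like(h)
        logits = []
        for t in range(T_steps):
            # Attend over instruction words with the current decoder state.
            scores = self.attn(torch.cat(
                [enc_out, h.unsqueeze(1).expand_as(enc_out)], dim=-1)).squeeze(-1)
            ctx = (scores.softmax(-1).unsqueeze(-1) * enc_out).sum(1)  # (B, hid)
            h, c = self.decoder(
                torch.cat([visual_feats[:, t], ctx], dim=-1), (h, c))
            logits.append(self.policy(h))
        return torch.stack(logits, dim=1)  # (B, T_steps, n_actions)

Under these assumptions, a forward pass model(instr_ids, visual_feats) yields per-step action logits; training on instruction-perception samples would then plausibly use cross-entropy against the demonstrated action sequence, though the paper itself should be consulted for the actual objective.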

Files

Original bundle

Name: 8610.pdf
Size: 1.32 MB
Format: Adobe Portable Document Format