Publication:
Visually grounded language learning for robot navigation

Embargo Status

NO

Abstract

We present an end-to-end deep learning model for robot navigation from raw visual pixel input and natural-language text instructions. The proposed model is an LSTM-based sequence-to-sequence neural network architecture with attention, trained on instruction-perception data samples collected in a synthetic environment. We conduct experiments on the SAIL dataset, which we reconstruct in 3D in order to generate the 2D images associated with the data. Our experiments show that the performance of our model is on a par with the state of the art, even though it learns navigational language end to end from raw visual data.
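
The abstract describes the architecture only at a high level. As a rough, hypothetical sketch in PyTorch (not the authors' implementation), the code below shows one way an instruction encoder, a CNN over raw pixels, and an attentive LSTM action decoder could be wired together; all module names, layer sizes, and the four-way action space are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class InstructionEncoder(nn.Module):
    """Encodes a tokenized natural-language instruction with an LSTM."""
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def forward(self, tokens):                # tokens: (B, T) integer ids
        embedded = self.embed(tokens)         # (B, T, E)
        outputs, (h, c) = self.lstm(embedded)
        return outputs, (h, c)                # outputs: (B, T, H)

class VisualEncoder(nn.Module):
    """A small CNN mapping raw RGB pixel observations to a feature vector."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(32, feat_dim)

    def forward(self, images):                # images: (B, 3, H, W)
        x = self.conv(images).flatten(1)      # (B, 32)
        return self.fc(x)                     # (B, feat_dim)

class NavigationDecoder(nn.Module):
    """LSTM decoder that attends over the encoded instruction at each step
    and combines the attended context with the current visual feature to
    predict the next navigation action."""
    def __init__(self, hidden_dim=128, visual_dim=128, num_actions=4):
        super().__init__()
        self.cell = nn.LSTMCell(hidden_dim + visual_dim, hidden_dim)
        self.attn = nn.Linear(hidden_dim, hidden_dim)
        self.action_head = nn.Linear(hidden_dim, num_actions)

    def forward(self, enc_outputs, state, visual_feat):
        h, c = state
        # Dot-product attention over the instruction tokens.
        scores = torch.bmm(enc_outputs, self.attn(h).unsqueeze(2))  # (B, T, 1)
        weights = F.softmax(scores, dim=1)
        context = (weights * enc_outputs).sum(dim=1)                # (B, H)
        h, c = self.cell(torch.cat([context, visual_feat], dim=1), (h, c))
        return self.action_head(h), (h, c)                          # action logits

# Hypothetical single decoding step:
#   enc_out, (h, c) = encoder(tokens)           # tokens: (B, T) long tensor
#   state = (h.squeeze(0), c.squeeze(0))         # LSTMCell expects (B, H)
#   logits, state = decoder(enc_out, state, visual_encoder(images))

In such a setup the decoder would be unrolled over the navigation episode, re-attending to the instruction and re-encoding the current pixel observation at every step; the exact training details in the paper may differ.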

Publisher

Association for Computing Machinery (ACM)

Subject

Computer engineering

Source

MULEA '19: 1st International Workshop on Multimodal Understanding and Learning for Embodied Applications

DOI

10.1145/3347450.3357655
