Publication Date
Spring 2021
Degree Type
Master's Project
Degree Name
Master of Science in Computer Science (MSCS)
Department
Computer Science
First Advisor
Christopher J Pollett
Second Advisor
Robert Chun
Third Advisor
Sunhera Paul
Keywords
Affordance Prediction, Heat Map, ConvLSTM
Abstract
The rapid development of autonomous robots is transforming the manufacturing and healthcare industries, but these robots still face many challenges. One such challenge is their inability to manipulate an unknown object without human supervision. One way for autonomous robots to learn to manipulate unknown objects is affordance learning [1]. An affordance describes the action a user can perform on an object in given surroundings. This report describes our proposed model for detecting and predicting the affordance of an object from videos by leveraging spatio-temporal feature extraction through ConvLSTM and Fully Convolutional Networks. Our model is built on an encoder-decoder architecture. The encoder consists of a CNN that captures the spatial features of the input frames and a ConvLSTM that captures their temporal dynamics. The decoder uses the encoder's output to classify the affordance of a given task and to predict the interaction region between the human and the object in the form of a heatmap. It is composed of an LSTM, which classifies the affordance, and a Fully Convolutional Network, which predicts the heatmap of the interaction region.
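The data flow described in the abstract can be illustrated with a small shape-tracing sketch. It is not the report's implementation: it only tracks tensor shapes through the four stages (CNN, ConvLSTM, LSTM classifier, FCN heatmap head). All layer sizes, the function names, the number of affordance classes, and the assumption that the heatmap is produced at input-frame resolution are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    """Shape of a per-frame feature map: channels x height x width."""
    channels: int
    height: int
    width: int


def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def cnn_encoder(frame, out_channels=64, kernel=3, stride=2, padding=1):
    """Assumed stand-in for the CNN: one strided convolution per frame."""
    return Shape(
        out_channels,
        conv_out(frame.height, kernel, stride, padding),
        conv_out(frame.width, kernel, stride, padding),
    )


def conv_lstm(feature, num_frames, hidden_channels=32):
    """ConvLSTM over num_frames feature maps, assuming 'same' padding,
    so the final hidden state keeps the spatial size of its input."""
    if num_frames < 1:
        raise ValueError("need at least one frame")
    return Shape(hidden_channels, feature.height, feature.width)


def lstm_classifier(state, num_affordances):
    """Classification branch: one score per (hypothetical) affordance class."""
    return (num_affordances,)


def fcn_heatmap(state, frame):
    """FCN branch: assumed to upsample to a 1-channel map at frame resolution."""
    return Shape(1, frame.height, frame.width)


def trace(frame, num_frames, num_affordances):
    """Return the shape produced by each stage of the encoder-decoder."""
    feature = cnn_encoder(frame)
    state = conv_lstm(feature, num_frames)
    return {
        "encoder_cnn": feature,
        "encoder_convlstm": state,
        "affordance_scores": lstm_classifier(state, num_affordances),
        "heatmap": fcn_heatmap(state, frame),
    }


if __name__ == "__main__":
    # Hypothetical input: 8 RGB frames of 224x224 and 5 affordance classes.
    for stage, shape in trace(Shape(3, 224, 224), 8, 5).items():
        print(stage, shape)
```

The sketch makes one structural point explicit: both decoder branches read the same encoder state, so the affordance label and the interaction heatmap are predicted from shared spatio-temporal features.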
Recommended Citation
Matharu, Bhumika Kaur, "Detecting and predicting visual affordance of objects in a given environment" (2021). Master's Projects. 990.
DOI: https://doi.org/10.31979/etd.wjn4-au32
https://scholarworks.sjsu.edu/etd_projects/990