Publication Date

Spring 5-22-2019

Degree Type

Master's Project

Degree Name

Master of Science (MS)


Department

Computer Science

First Advisor

Katerina Potika

Second Advisor

Sami Khuri

Third Advisor

Robert Chun


Graphs are a powerful way to model network data, with objects as nodes and the relationships between them as links. Such graphs contain a wealth of valuable information about the underlying data, which can be extracted, analyzed, and visualized using Machine Learning (ML). The challenge is that graphs are non-Euclidean structures, so they cannot be used directly with most ML techniques, which expect Euclidean inputs such as grids or sequences. To overcome this challenge, the graph structure first needs to be encoded into an equivalent Euclidean representation in the form of a low-dimensional vector. This low-dimensional vector is called an embedding vector, and the encoding process is called node embedding. Traditionally, user-defined heuristics and matrix-factorization-based methods were used for node embedding. However, these methods are slow and perform poorly on large and complex graphs. In recent years, various ML techniques have been developed that learn the encoding of the graph automatically, and do so faster and more efficiently. A few of these techniques, called Graph Convolutional Networks (GCNs), use variants of convolutional neural networks adapted for graphs and are implemented using deep neural networks. The aim of this project is two-fold: first, to develop a unified framework focusing on three major GCN techniques in order to analyze, evaluate, and compare their performance on select benchmark datasets for the task of node classification; and second, to implement a new aggregator for one of the techniques, GraphSAGE, and compare its performance with the existing GCN methods as well as the other aggregators provided by GraphSAGE.
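
To make the idea of an aggregator concrete, the following is a minimal sketch of a mean-style neighborhood aggregation step in plain Python. It is an illustration only, not the aggregator developed in this project and not GraphSAGE's actual implementation: the function name `mean_aggregate`, the toy graph, and the feature values are all hypothetical, and the learned weight matrices and nonlinearity that a real GCN layer applies are omitted.

```python
def mean_aggregate(adjacency, features):
    """Concatenate each node's own features with the mean of its neighbors' features.

    adjacency: dict mapping node id -> list of neighbor ids (assumed format)
    features:  dict mapping node id -> list of floats, all of equal length
    """
    aggregated = {}
    for node, own in features.items():
        neighbors = adjacency.get(node, [])
        dim = len(own)
        if neighbors:
            # Element-wise mean over the neighbors' feature vectors.
            mean = [
                sum(features[n][i] for n in neighbors) / len(neighbors)
                for i in range(dim)
            ]
        else:
            # Isolated node: use a zero vector as the neighborhood summary.
            mean = [0.0] * dim
        aggregated[node] = list(own) + mean
    return aggregated


# Hypothetical toy graph: a path 0 - 1 - 2 with 2-dimensional node features.
adjacency = {0: [1], 1: [0, 2], 2: [1]}
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
embeddings = mean_aggregate(adjacency, features)
```

In a trained model, the concatenated vector would typically be passed through a learned linear map and a nonlinearity, and the step would be repeated over several layers so that each node's embedding reflects a wider neighborhood.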

Available for download on Friday, May 22, 2020