Publication Date

Spring 2019

Degree Type

Master's Project

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

Katerina Potika

Second Advisor

Sami Khuri

Third Advisor

Mike Wu

Keywords

GloVe, NLP, DeepWalk, node2vec, GloVeNoR

Abstract

A graph is a powerful abstract data type that can be used to model entities (nodes) and the relationships between them (edges). Many real-world networks, such as biological, computer, and friendship networks, can be represented as graphs. Graphs can be mined to extract interesting patterns and interactions between the participating entities. Recently, various Artificial Intelligence (AI) and Machine Learning (ML) techniques have been used for this purpose. To apply them, the nodes of a graph must be represented as low-dimensional feature vectors. Node embedding is the process of generating a d-dimensional feature vector for each node of a graph, such that structurally similar nodes remain close in the d-dimensional space.

There are many state-of-the-art methods, such as node2vec and DeepWalk, for computing node embeddings. These techniques borrow methods used in Natural Language Processing (NLP) to compute word embeddings, such as the Skip-Gram model. This project explores the idea of porting the GloVe (Global Vectors for Word Representation) model, a popular technique for word embeddings, to graphs, yielding a new method called GloVeNoR that computes node embeddings. We evaluate the model's quality by comparing it with node2vec and DeepWalk on the problem of community detection on five different datasets. We observe that GloVeNoR discovers similar or better communities than the other existing models on all five datasets.
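
To make the idea of porting GloVe to graphs more concrete, the following is a minimal sketch of one plausible preprocessing step, not the project's actual implementation. It assumes that node co-occurrences are collected from uniform random walks (in the style of DeepWalk), which the abstract does not confirm. The 1/distance count weighting and the weighting function f(x) follow the original GloVe model for words. The function names, the toy graph, and all parameter values are hypothetical.

```python
import random
from collections import defaultdict


def random_walks(adj, walks_per_node, walk_length, rng):
    """Generate uniform random walks from every node (assumed, DeepWalk-style)."""
    walks = []
    for _ in range(walks_per_node):
        for start in adj:
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adj[walk[-1]]
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


def cooccurrence_counts(walks, window):
    """Symmetric co-occurrence counts; each pair contributes 1/distance, as in GloVe."""
    counts = defaultdict(float)
    for walk in walks:
        for i, u in enumerate(walk):
            for j in range(i + 1, min(i + window + 1, len(walk))):
                v = walk[j]
                weight = 1.0 / (j - i)
                counts[(u, v)] += weight
                counts[(v, u)] += weight
    return counts


def glove_weight(x, x_max=100.0, alpha=0.75):
    """GloVe weighting function f(x), which caps the influence of frequent pairs."""
    return (x / x_max) ** alpha if x < x_max else 1.0


# Hypothetical toy graph: two triangles joined by the edge (2, 3).
adj = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2, 4, 5],
    4: [3, 5],
    5: [3, 4],
}

rng = random.Random(0)
walks = random_walks(adj, walks_per_node=10, walk_length=8, rng=rng)
counts = cooccurrence_counts(walks, window=2)
```

In a GloVe-style model, the resulting counts would then be fitted by d-dimensional node vectors through a weighted least-squares objective, with `glove_weight` supplying the per-pair weight. Nodes inside the same triangle co-occur more often than nodes in different triangles, which is the kind of signal a community detection evaluation would rely on.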
