Master of Science (MS)
Knowledge Graph Construction, SpaCy, TEI, Wikidata
Textbooks are written and organized to facilitate learning and understanding. Sections such as the glossary at the end of a textbook guide readers to the topics of interest. However, creating the index terms in a glossary, which highlight the key terminology and related terms, takes manual effort. Knowledge graphs (KGs), which have been used to represent and even reason over data and knowledge, can potentially capture a textbook's important terms, concepts, and their relations. Popular since Google introduced its Knowledge Graph, KGs combine graph structure and data to capture and model enormous numbers of relational facts in fields ranging from social media to the sciences. Recently, techniques have been developed to extract knowledge bases from textbooks. Once we have the knowledge graph of a textbook, we can perform completion tasks, predicting missing entities or relations, by representing the knowledge graph in a low-dimensional space.
The main objective of the project is to apply knowledge graph construction techniques to textbooks. The main challenge is the absence of a domain-specific schema for each textbook. We use different entity and relation extraction models to capture logical and semantic information related to the textbook topic. A Text Encoding Initiative (TEI) model was employed to extract hierarchical concepts from a textbook; spaCy NLP and Google Cloud NLP were able to extract semantic information from the main textual content of a textbook. A case study on a cloud computing textbook was conducted and evaluated with each of the approaches.
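As a rough illustration of how hierarchical concepts might be read out of a TEI-encoded textbook, the sketch below walks nested `<div>`/`<head>` elements and emits parent-child triples. It is not the project's implementation: the sample document, its chapter and section titles, the function name `heading_triples`, and the relation name `hasSubtopic` are all assumptions made for illustration. Only the TEI namespace URI and the Python standard-library XML calls are taken as given.

```python
import xml.etree.ElementTree as ET

# Standard TEI P5 namespace; the "tei" prefix is just a local alias.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented sample document (assumption): a chapter with two sections.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div type="chapter">
        <head>Cloud Computing Basics</head>
        <div type="section">
          <head>Virtualization</head>
        </div>
        <div type="section">
          <head>Service Models</head>
        </div>
      </div>
    </body>
  </text>
</TEI>"""


def heading_triples(tei_xml):
    """Return (parent, 'hasSubtopic', child) triples from nested TEI divs.

    'hasSubtopic' is a hypothetical relation name chosen for this sketch.
    """
    root = ET.fromstring(tei_xml)
    body = root.find(".//tei:body", TEI_NS)
    triples = []
    if body is None:
        return triples

    def walk(div, parent_title):
        # A div's title is the text of its direct <head> child, if any.
        head = div.find("tei:head", TEI_NS)
        title = head.text.strip() if head is not None and head.text else None
        if title and parent_title:
            triples.append((parent_title, "hasSubtopic", title))
        # Recurse into directly nested divs, carrying the nearest known title.
        for child in div.findall("tei:div", TEI_NS):
            walk(child, title or parent_title)

    for top in body.findall("tei:div", TEI_NS):
        walk(top, None)
    return triples


if __name__ == "__main__":
    for triple in heading_triples(SAMPLE_TEI):
        print(triple)
```

Triples of this shape capture only the structural hierarchy of the book; semantic relations drawn from the running text would come from a separate entity and relation extraction step.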
Yao, Yutong, "Application Of Knowledge Graph Techniques On Textbooks" (2023). Master's Projects. 1226.
Available for download on Friday, May 24, 2024