Publication Date

Fall 2025

Degree Type

Master's Project

Degree Name

Master of Science in Computer Science (MSCS)

Department

Computer Science

First Advisor

Christopher Pollett

Second Advisor

Navrati Saxena

Third Advisor

Prasanna Nikhil Sathwik Vadlamani

Keywords

Large Language Models, LLMs, Mathics, Lean, Chain-of-Thought prompting, Deepseek, Post-tuning

Abstract

Large language models (LLMs) perform well on natural-language tasks but fall short on mathematics and theorem proving, as they rely on surface language patterns rather than understanding a problem and reasoning through its solution. This project addresses issues such as limited exposure to structured mathematical datasets, difficulty generating outputs that require multi-step reasoning, and short context windows. We fine-tune pre-trained LLMs on structured datasets such as MATH, GSM8K, open-r1, deepseek-prover, and OpenBootstrappedTheorem. We integrate two software tools, Mathics and Lean, and enhance reasoning through Chain-of-Thought (CoT) prompting. Additionally, we conduct experiments using state-of-the-art Mixture of Experts (MoE) models and parameter-efficient fine-tuning (PEFT) techniques such as LoRA and DoRA. The outcome is improved model performance on complex math problems, particularly on formal theorem-proving datasets, a comparatively understudied domain in recent LLM research. This work is a step toward developing and fine-tuning models that can handle challenging mathematical and logical domains.
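
To illustrate the parameter-efficient fine-tuning mentioned in the abstract, below is a minimal sketch of attaching LoRA adapters to a causal language model using the Hugging Face transformers and peft libraries. The base model name, target modules, and hyperparameters are illustrative assumptions, not the project's actual configuration.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

# Assumed base model for illustration; the project's actual checkpoint is not named in the abstract.
base_model = "deepseek-ai/deepseek-math-7b-base"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.bfloat16)

# LoRA adds small trainable low-rank adapter matrices to selected projection
# layers, so only a tiny fraction of the weights are updated during fine-tuning.
config = LoraConfig(
    r=16,                                  # rank of the low-rank update
    lora_alpha=32,                         # scaling applied to the adapter output
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt (assumed)
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()         # typically well under 1% of all weights

The wrapped model can then be trained as usual (for example with a standard Trainer loop) on the structured math and theorem-proving datasets listed above; only the adapter weights are updated.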

Available for download on Saturday, December 19, 2026
