Publication Date

Spring 2025

Degree Type

Thesis

Degree Name

Master of Science (MS)

Department

Computer Engineering

Advisor

Mahima Suresh; Jorjeta Jetcheva; Wencen Wu

Abstract

Quantization has become a key approach for reducing the storage and computational demands of deep neural networks while maintaining high accuracy. Although 8-bit quantization is well established for convolutional architectures such as ResNet50 and MobileNetV2, its application to graph-based vision models remains underexplored. In this work, we extend quantization-aware training to Vision Graph Neural Networks (ViGs) and compare them with quantized CNNs on the CIFAR-100 dataset. To ensure a fair comparison, all models share the same training hyperparameters, including learning rate, batch size, optimizer, and number of epochs. We apply several techniques to preserve performance at low-bit precision. First, Pauta Quantization clips activation outliers based on statistical thresholds. Next, Attention Quantization Distillation (AQD) encourages the quantized network to mimic channel-wise attention patterns from full-precision activations in order to retain feature representations. Finally, Stochastic Quantization Distillation (SQD) injects randomness into the quantization process to make it more robust. Comprehensive experiments on CIFAR-100 reveal that 8-bit quantization delivers an excellent trade-off between efficiency and accuracy across both architectures. Notably, our quantized ViGs nearly match the top-1 accuracy of their quantized ResNet50 and MobileNetV2 counterparts while achieving a comparable inference cost and memory footprint. These findings demonstrate that quantized Vision GNNs are a practical alternative to CNNs for deployment on resource-constrained devices. Future research will explore fine-tuning AQD and other distillation parameters.
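
As a minimal illustration of the three techniques named above, the sketch below shows how they might fit together in a PyTorch quantization-aware training step. Pauta's criterion is the classical three-sigma rule, so the clipping constant k = 3 follows directly; everything else here (function names, the channel-attention formulation, the uniform affine quantizer with a straight-through estimator) is an illustrative assumption, not the thesis implementation.

    import torch
    import torch.nn.functional as F

    def pauta_clip(x, k=3.0):
        # Pauta Quantization step: clip activation outliers outside mean +/- k*std
        mu, sigma = x.mean().item(), x.std().item()
        return x.clamp(mu - k * sigma, mu + k * sigma)

    def fake_quantize(x, bits=8, stochastic=False):
        # Uniform affine fake quantization; stochastic=True gives the SQD-style
        # randomized rounding that injects noise into the quantization process.
        qmax = 2 ** bits - 1
        lo, hi = x.min().detach(), x.max().detach()
        scale = (hi - lo).clamp(min=1e-8) / qmax
        q = (x - lo) / scale
        rounded = torch.floor(q + torch.rand_like(q)) if stochastic else torch.round(q)
        q = q + (rounded - q).detach()  # straight-through estimator for gradients
        return q.clamp(0, qmax) * scale + lo

    def channel_attention(feat):
        # Channel-wise attention summary of an (N, C, H, W) feature map
        return F.softmax(feat.abs().mean(dim=(2, 3)), dim=1)

    def aqd_loss(fp_feat, q_feat):
        # AQD: push the quantized network's channel attention toward the
        # frozen full-precision pattern
        return F.mse_loss(channel_attention(q_feat),
                          channel_attention(fp_feat.detach()))

    # Example: one feature map passing through the quantized path
    x = torch.randn(4, 64, 8, 8)
    x_q = fake_quantize(pauta_clip(x), bits=8, stochastic=True)
    distill_term = aqd_loss(fp_feat=x, q_feat=x_q)

In training, a distillation term of this kind would typically be added to the task loss at each instrumented layer, with its weight tuned per bit width.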

Available for download on Sunday, August 02, 2026
