Publication Date
Fall 2025
Degree Type
Thesis
Degree Name
Master of Science (MS)
Department
Applied Data Science
Advisor
Vishnu Pendyala; Guannan Liu; Ronald Mak
Abstract
Early stopping prevents overfitting in neural network training, yet conventional methods based on validation loss minima may stop training before models have fully converged. This paper proposes five Page-Hinkley (PH) test variants that monitor train-validation divergence dynamics, variance patterns, and generalization trends to detect sustained changes in training behavior. Through comprehensive evaluation on classification (CIFAR-10, Fashion-MNIST) and regression (5 datasets) tasks with exhaustive hyperparameter search over 1,254 configurations per experiment (23,826 total configuration evaluations across 95 training runs), we find that PH methods can identify when training dynamics have fundamentally shifted. The normalized divergence variant (PH-NormDiv) achieves near-zero or slightly negative mean shortfall (−0.022 ± 0.134 percentage points in classification) when compared against validation-loss-minimum stopping, suggesting these methods may identify training points with comparable or potentially superior test performance. PH methods significantly outperform all baseline methods on the primary performance shortfall metric (p = 4.8×10⁻⁸ vs. the best baseline, Cohen's d = 1.96), with statistically significant improvements observed in 5 of 6 evaluation metrics in classification (p < 0.05). Post-hoc evaluation identifies hyperparameter configurations achieving reliable performance: PH-NormDiv with δ = 0.05, λ = 40, w = 5–10 performs consistently across all 4 classification experiments without tuning. The results suggest that accumulated evidence of training dynamic changes may provide alternative stopping signals to single-epoch validation loss comparisons, particularly for high-capacity models.
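To make the underlying mechanism concrete, the sketch below shows a standard (upward-drift) Page-Hinkley test applied to a per-epoch train-validation loss gap. It is a minimal illustration of the generic PH formulation, not the thesis's exact variants: the class name, the smoothing window, and the monitored signal are assumptions, and the parameter names delta, lam, and window stand in for the δ, λ, and w reported in the abstract.

```python
import numpy as np

class PageHinkley:
    """Generic upward-drift Page-Hinkley change detector (illustrative sketch).

    delta  : tolerated magnitude of change per step (corresponds to δ)
    lam    : detection threshold on the PH statistic (corresponds to λ)
    window : moving-average window applied to the raw signal (corresponds to w)
    """

    def __init__(self, delta=0.05, lam=40.0, window=5):
        self.delta = delta
        self.lam = lam
        self.window = window
        self.history = []    # raw signal values, kept for smoothing
        self.mean = 0.0      # running mean of the smoothed signal
        self.cum = 0.0       # cumulative deviation m_t
        self.min_cum = 0.0   # running minimum M_t of m_t
        self.n = 0

    def update(self, value):
        """Feed one observation; return True once a sustained upward shift is detected."""
        self.history.append(value)
        smoothed = float(np.mean(self.history[-self.window:]))

        self.n += 1
        self.mean += (smoothed - self.mean) / self.n          # incremental mean
        self.cum += smoothed - self.mean - self.delta          # m_t update
        self.min_cum = min(self.min_cum, self.cum)             # M_t update
        return (self.cum - self.min_cum) > self.lam            # PH alarm rule


# Illustrative usage: monitor the train-validation gap and stop on a detected drift.
detector = PageHinkley(delta=0.05, lam=40.0, window=5)
for epoch, (train_loss, val_loss) in enumerate(zip([0.9, 0.6, 0.4], [0.95, 0.7, 0.6])):
    if detector.update(val_loss - train_loss):
        print(f"Early stop signal at epoch {epoch}")
        break
```

The key design point is that the decision rests on accumulated evidence (m_t − M_t) rather than a single-epoch comparison against the best validation loss, which is the contrast the abstract draws with validation-loss-minimum stopping.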
Recommended Citation
Tran, Minett, "Applying Page-Hinkley Tests to Neural Network Early Stopping: An Empirical Investigation" (2025). Master's Theses. 5738.
DOI: https://doi.org/10.31979/etd.apwf-x98g
https://scholarworks.sjsu.edu/etd_theses/5738