Faculty Research, Scholarly, and Creative Activity

An empirical analysis of the shift and scale parameters in BatchNorm

Publication Date

8-1-2023

Document Type

Article

Publication Title

Information Sciences

Volume

637

DOI

10.1016/j.ins.2023.118951

Abstract

Batch Normalization (BatchNorm) is a technique that improves the training of deep neural networks, especially Convolutional Neural Networks (CNN). It has been empirically demonstrated that BatchNorm increases performance, stability, and accuracy, although the reasons for such improvements are unclear. BatchNorm includes a normalization step as well as trainable shift and scale parameters. In this paper, we empirically examine the relative contribution to the success of BatchNorm of the normalization step, as compared to the re-parameterization via shifting and scaling. To conduct our experiments, we implement two new optimizers in PyTorch, namely, a version of BatchNorm that we refer to as AffineLayer, which includes the re-parameterization step without normalization, and a version with just the normalization step, that we call BatchNorm-minus. We compare the performance of our AffineLayer and BatchNorm-minus implementations to standard BatchNorm, and we also compare these to the case where no batch normalization is used. We experiment with four ResNet architectures (ResNet18, ResNet34, ResNet50, and ResNet101) over a standard image dataset and multiple batch sizes. Among other findings, we provide empirical evidence that the success of BatchNorm may derive primarily from improved weight initialization.
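
The four configurations compared in the abstract can be sketched in PyTorch as follows. This is a minimal illustration and not the authors' implementation: the class body of AffineLayer, the helper name make_norm, and the choice to model each variant as a drop-in layer for 4-D convolutional feature maps are all assumptions. The sketch relies only on the standard affine=False option of nn.BatchNorm2d to obtain normalization without the trainable shift and scale.

```python
import torch
import torch.nn as nn


class AffineLayer(nn.Module):
    """Trainable per-channel scale and shift with no normalization.

    Assumed sketch of the idea named in the abstract, not the paper's code.
    """

    def __init__(self, num_channels: int):
        super().__init__()
        # Initialized like BatchNorm's affine parameters: scale 1, shift 0.
        self.weight = nn.Parameter(torch.ones(num_channels))
        self.bias = nn.Parameter(torch.zeros(num_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (N, C, H, W); broadcast parameters over N, H, and W.
        scale = self.weight.view(1, -1, 1, 1)
        shift = self.bias.view(1, -1, 1, 1)
        return x * scale + shift


def make_norm(kind: str, num_channels: int) -> nn.Module:
    """Hypothetical helper returning one of the four compared variants."""
    if kind == "batchnorm":
        # Normalization plus trainable shift and scale.
        return nn.BatchNorm2d(num_channels)
    if kind == "batchnorm_minus":
        # Normalization only: affine=False removes the trainable parameters.
        return nn.BatchNorm2d(num_channels, affine=False)
    if kind == "affine":
        # Trainable shift and scale only.
        return AffineLayer(num_channels)
    if kind == "none":
        # No batch normalization at all.
        return nn.Identity()
    raise ValueError(f"unknown variant: {kind}")


if __name__ == "__main__":
    x = torch.randn(8, 3, 4, 4) * 5 + 2  # deliberately unnormalized input
    for kind in ("batchnorm", "batchnorm_minus", "affine", "none"):
        layer = make_norm(kind, 3)
        n_params = sum(p.numel() for p in layer.parameters())
        print(kind, tuple(layer(x).shape), "trainable parameters:", n_params)
```

Under these assumptions, the normalization-only variant has no trainable parameters, while the affine-only variant has the same two parameters per channel as standard BatchNorm but leaves the activation statistics untouched at initialization.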

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Department

Computer Science

Recommended Citation

Yashna Peerthum and Mark Stamp. "An empirical analysis of the shift and scale parameters in BatchNorm" Information Sciences (2023). https://doi.org/10.1016/j.ins.2023.118951
