Improving Neural ODE Training with Temporal Adaptive Batch Normalization

Su Zheng, Zhengqi Gao, Fan Keng Sun, Duane S. Boning, Bei Yu*, Martin Wong

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding › Conference proceeding › peer-review

4 Citations (Scopus)

Abstract

Neural ordinary differential equations (Neural ODEs) are a family of continuous-depth neural networks where the evolution of hidden states is governed by learnable temporal derivatives. We identify a significant limitation in applying traditional Batch Normalization (BN) to Neural ODEs, due to a fundamental mismatch: BN was initially designed for discrete neural networks with no temporal dimension, whereas Neural ODEs operate continuously over time. To bridge this gap, we introduce temporal adaptive Batch Normalization (TA-BN), a novel technique that acts as the continuous-time analog to traditional BN. Our empirical findings reveal that TA-BN enables the stacking of more layers within Neural ODEs, enhancing their performance. Moreover, when confined to a model architecture consisting of a single Neural ODE followed by a linear layer, TA-BN achieves 91.1% test accuracy on CIFAR-10 with 2.2 million parameters, making it the first unmixed Neural ODE architecture to approach MobileNetV2-level parameter efficiency. Extensive numerical experiments on image classification and physical system modeling substantiate the superiority of TA-BN compared to baseline methods.
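The mismatch described in the abstract can be illustrated with a small sketch. The abstract does not say how TA-BN makes normalization depend on time, so the following is only one possible reading, not the authors' implementation: normalization statistics are stored at a few grid times and linearly interpolated for any intermediate time t. The function names, the time grid, and all numbers are hypothetical, and the grid is assumed to be sorted with at least two points.

```python
from bisect import bisect_right


def interpolate_stats(time_grid, values, t):
    """Linearly interpolate per-feature statistics stored at grid times.

    Assumption (hypothetical): time_grid is sorted and has >= 2 entries,
    and values[k] holds the per-feature statistics at time_grid[k].
    """
    t = min(max(t, time_grid[0]), time_grid[-1])  # clamp t to the grid range
    i = min(bisect_right(time_grid, t) - 1, len(time_grid) - 2)
    w = (t - time_grid[i]) / (time_grid[i + 1] - time_grid[i])
    return [(1 - w) * a + w * b for a, b in zip(values[i], values[i + 1])]


def time_dependent_normalize(x, t, time_grid, means, variances, eps=1e-5):
    """Normalize feature vector x with statistics looked up at time t."""
    mean = interpolate_stats(time_grid, means, t)
    var = interpolate_stats(time_grid, variances, t)
    return [(xi - m) / (v + eps) ** 0.5 for xi, m, v in zip(x, mean, var)]


# Made-up statistics at two grid times, for illustration only.
time_grid = [0.0, 1.0]
means = [[0.0, 0.0], [2.0, 4.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
out = time_dependent_normalize([1.0, 2.0], 0.5, time_grid, means, variances)
```

A conventional BN layer would apply one set of statistics regardless of t; here the same input is normalized differently at t = 0.0 and t = 0.5 because the interpolated mean changes along the time axis.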

Original language: English
Title of host publication: Proceedings of 38th Conference on Neural Information Processing Systems, NeurIPS 2024
Editors: A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, C. Zhang
Publisher: Neural Information Processing Systems Foundation
Pages: 95875-95895
Number of pages: 21
ISBN (Electronic): 9798331314385
Publication status: Published - Dec 2024
Event: 38th Conference on Neural Information Processing Systems, NeurIPS 2024 - Vancouver Convention Center, Vancouver, Canada
Duration: 9 Dec 2024 → 15 Dec 2024
https://neurips.cc/Conferences/2024
https://openreview.net/group?id=NeurIPS.cc/2024
https://proceedings.neurips.cc/paper_files/paper/2024 (Conference Proceedings)

Publication series

Name: Advances in Neural Information Processing Systems
Publisher: Neural Information Processing Systems Foundation
Volume: 37
ISSN (Print): 1049-5258

Conference

Conference: 38th Conference on Neural Information Processing Systems, NeurIPS 2024
Country/Territory: Canada
City: Vancouver
Period: 9/12/24 → 15/12/24
