PipeMoE: Accelerating Mixture-of-Experts through Adaptive Pipelining

Shaohuai Shi, Xinglin Pan, Xiaowen Chu*, Bo Li

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding > Conference proceeding > peer-review

2 Citations (Scopus)

Abstract

Large models have attracted much attention in AI. The sparsely activated mixture-of-experts (MoE) technique pushes model size to the trillion-parameter level with only a sub-linear increase in computation: an MoE layer can be equipped with many separate experts, but only one or two of them need to be trained for each input. However, dynamically activating experts introduces extensive communication in distributed training. In this work, we propose PipeMoE, which adaptively pipelines the communication and computation in MoE layers to hide as much of the communication time as possible. Specifically, we first identify the root cause of why a higher pipeline degree does not always yield better performance when training MoE models. We then formulate an optimization problem that minimizes the training iteration time. To solve it, we build performance models for the computation and communication tasks in MoE and develop an optimal solution that determines the pipeline degree with minimal iteration time. We conduct extensive experiments with 174 typical MoE layers and two real-world NLP models on a 64-GPU cluster. Experimental results show that PipeMoE almost always chooses the best pipeline degree and outperforms state-of-the-art MoE training systems by 5%-77% in training time.
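
The trade-off in the abstract, that a higher pipeline degree does not always help, can be illustrated with a toy model. The sketch below is not PipeMoE's performance model or solver. It assumes that every communication and computation task costs a fixed startup time plus a per-token time, that each chunk passes through dispatch, expert computation, and combine, and that all communication shares one link. The names `LinearCost`, `layer_time`, and `pick_degree` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class LinearCost:
    """Assumed cost model (not from the paper): startup + per_token * tokens."""
    startup: float    # fixed overhead paid once per launched task
    per_token: float  # additional time per token handled by the task

    def time(self, tokens: float) -> float:
        return self.startup + self.per_token * tokens


def layer_time(tokens: float, degree: int,
               comm: LinearCost, comp: LinearCost) -> float:
    """Finish time of one MoE layer whose input is split into `degree` chunks.

    Communication tasks share one link and run in a fixed order:
    all dispatches first, then all combines.
    """
    if degree < 1:
        raise ValueError("degree must be at least 1")
    t_comm = comm.time(tokens / degree)
    t_comp = comp.time(tokens / degree)

    link_free = 0.0  # time at which the communication link is next idle
    gpu_free = 0.0   # time at which the compute stream is next idle
    expert_done = []
    for _ in range(degree):
        link_free += t_comm               # dispatch this chunk
        start = max(gpu_free, link_free)  # experts wait for their dispatch
        gpu_free = start + t_comp
        expert_done.append(gpu_free)
    for done in expert_done:
        start = max(link_free, done)      # combine waits for link and experts
        link_free = start + t_comm
    return link_free


def pick_degree(tokens: float, comm: LinearCost, comp: LinearCost,
                candidates: Iterable[int] = (1, 2, 4, 8)) -> int:
    """Exhaustively pick the candidate degree with the smallest modeled time.

    With ascending candidates, ties resolve to the smallest degree.
    """
    return min(candidates, key=lambda r: layer_time(tokens, r, comm, comp))
```

In this toy model, each extra chunk pays the startup cost again, so splitting further only pays off while the overlap gained exceeds the added overhead. With a large `startup` the modeled optimum is a degree of 1, while with a small `startup` a higher degree wins.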

Original language: English
Title of host publication: INFOCOM 2023 - IEEE Conference on Computer Communications
Publisher: IEEE
Number of pages: 10
ISBN (Electronic): 9798350334142
Publication status: Published - 17 May 2023
Event: 42nd IEEE International Conference on Computer Communications, INFOCOM 2023 - Hybrid, New York City, United States
Duration: 17 May 2023 - 20 May 2023

Publication series

Name: Proceedings - IEEE INFOCOM
Volume: 2023-May
ISSN (Print): 0743-166X

Conference

Conference: 42nd IEEE International Conference on Computer Communications, INFOCOM 2023
Country/Territory: United States
City: Hybrid, New York City
Period: 17/05/23 - 20/05/23

Scopus Subject Areas

  • Computer Science (all)
  • Electrical and Electronic Engineering

User-Defined Keywords

  • Communication-Efficient Training
  • Distributed Deep Learning
  • Mixture-of-Experts
  • Pipelining
