Communication-efficient decentralized learning with sparsification and adaptive peer selection

Zhenheng Tang, Shaohuai Shi, Xiaowen Chu

Research output: Chapter in book/report/conference proceeding › Conference proceeding › peer-review

35 Citations (Scopus)

Abstract

Increasing the size of machine learning models, especially deep neural networks, can improve their generalization capability. However, large models require more training data and more computing resources (such as GPU clusters) to train. In distributed training, the communication overhead of exchanging gradients or models among workers becomes a potential bottleneck that limits system scalability. Recently, many research efforts have aimed to reduce the communication time of the two main distributed deep learning architectures: centralized and decentralized.
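The abstract stops at the problem statement, so as a rough, non-authoritative illustration of what the "sparsification" in the title refers to, the sketch below shows generic top-k compression of a gradient before it is exchanged with a peer. The function names (topk_sparsify, desparsify), the NumPy setting, and the 1% ratio are illustrative assumptions only; they are not the authors' algorithm, which additionally involves adaptive peer selection not shown here.

```python
import numpy as np

def topk_sparsify(grad, k_ratio=0.01):
    """Keep only the k largest-magnitude entries of a gradient.

    Returns (indices, values) so that only about k_ratio of the
    gradient needs to be communicated to a peer.
    """
    flat = grad.ravel()
    k = max(1, int(k_ratio * flat.size))
    # indices of the k entries with the largest absolute value
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    return idx, flat[idx]

def desparsify(idx, values, shape):
    """Rebuild a dense gradient from the sparse (indices, values) pair."""
    dense = np.zeros(int(np.prod(shape)))
    dense[idx] = values
    return dense.reshape(shape)

# Toy usage: a worker sparsifies its local gradient before sending it.
rng = np.random.default_rng(0)
grad = rng.standard_normal((256, 128))
idx, vals = topk_sparsify(grad, k_ratio=0.01)
recovered = desparsify(idx, vals, grad.shape)
print(f"sent {idx.size} of {grad.size} entries "
      f"({idx.size / grad.size:.1%} of the original volume)")
```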
Original language: English
Title of host publication: Proceedings - 2020 IEEE 40th International Conference on Distributed Computing Systems, ICDCS 2020
Publisher: IEEE
Pages: 1207-1208
Number of pages: 2
ISBN (Electronic): 9781728170022
DOIs
Publication status: Published - Nov 2020
Event: 40th IEEE International Conference on Distributed Computing Systems, ICDCS 2020 - Singapore, Singapore
Duration: 29 Nov 2020 - 1 Dec 2020

Publication series

Name: Proceedings - International Conference on Distributed Computing Systems
Volume: 2020-November

Conference

Conference: 40th IEEE International Conference on Distributed Computing Systems, ICDCS 2020
Country/Territory: Singapore
City: Singapore
Period: 29/11/20 - 1/12/20

Scopus Subject Areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

User-Defined Keywords

  • Adaptive Peer Selection
  • Deep Learning
  • Distributed Learning
  • Federated Learning
  • Model Sparsification
