A convergence analysis of distributed SGD with communication-efficient gradient sparsification

Shaohuai Shi, Kaiyong Zhao, Qiang WANG, Zhenheng Tang, Xiaowen CHU

Research output: Chapter in book/report/conference proceedingConference contributionpeer-review

8 Citations (Scopus)

Abstract

Gradient sparsification is a promising technique to significantly reduce the communication overhead in decentralized synchronous stochastic gradient descent (S-SGD) algorithms. Yet, many existing gradient sparsification schemes (e.g., Top-k sparsification) have a communication complexity of O(kP ), where k is the number of selected gradients by each worker and P is the number of workers. Recently, the gTop-k sparsification scheme has been proposed to reduce the communication complexity from O(kP ) to O(k log P ), which significantly boosts the system scalability. However, it remains unclear whether the gTop-k sparsification scheme can converge in theory. In this paper, we first provide theoretical proofs on the convergence of the gTop-k scheme for non-convex objective functions under certain analytic assumptions. We then derive the convergence rate of gTop-k S-SGD, which is at the same order as the vanilla minibatch SGD. Finally, we conduct extensive experiments on different machine learning models and data sets to verify the soundness of the assumptions and theoretical results, and discuss the impact of the compression ratio on the convergence performance.

Original languageEnglish
Title of host publicationProceedings of the 28th International Joint Conference on Artificial Intelligence, IJCAI 2019
EditorsSarit Kraus
PublisherInternational Joint Conferences on Artificial Intelligence
Pages3411-3417
Number of pages7
ISBN (Electronic)9780999241141
DOIs
Publication statusPublished - 2019
Event28th International Joint Conference on Artificial Intelligence, IJCAI 2019 - Macao, China
Duration: 10 Aug 201916 Aug 2019

Publication series

NameIJCAI International Joint Conference on Artificial Intelligence
Volume2019-August
ISSN (Print)1045-0823

Conference

Conference28th International Joint Conference on Artificial Intelligence, IJCAI 2019
Country/TerritoryChina
CityMacao
Period10/08/1916/08/19

Scopus Subject Areas

  • Artificial Intelligence

Fingerprint

Dive into the research topics of 'A convergence analysis of distributed SGD with communication-efficient gradient sparsification'. Together they form a unique fingerprint.

Cite this