Benchmarking the Performance and Energy Efficiency of AI Accelerators for AI Training

Yuxin Wang, Qiang Wang, Shaohuai Shi, Xin He, Zhenheng Tang, Kaiyong Zhao, Xiaowen Chu

Research output: Chapter in book/report/conference proceeding › Conference proceeding › peer-review

43 Citations (Scopus)

Abstract

Deep learning has become widely used in complex AI applications. Yet, training a deep neural network (DNN) model requires a considerable amount of computation, long running times, and substantial energy. Nowadays, many-core AI accelerators (e.g., GPUs and TPUs) are designed to improve the performance of AI training. However, processors from different vendors perform dissimilarly in terms of performance and energy consumption. To investigate the differences among several popular off-the-shelf processors (i.e., Intel CPU, NVIDIA GPU, AMD GPU, and Google TPU) in training DNNs, we carry out a comprehensive empirical study on the performance and energy efficiency of these processors by benchmarking a representative set of deep learning workloads, including computation-intensive operations, classical convolutional neural networks (CNNs), recurrent neural networks (LSTMs), Deep Speech 2, and Transformer. Unlike existing end-to-end benchmarks, which only report training time, we investigate the impact of the hardware, the vendor's software library, and the deep learning framework on the performance and energy consumption of AI training. Our evaluation methods and results not only provide an informative guide for end users to select proper AI accelerators, but also expose opportunities for hardware vendors to improve their software libraries.
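The benchmarking approach described in the abstract — timing a computation-intensive operation and combining the elapsed time with a power reading to estimate energy — can be illustrated with a minimal sketch. This is not the paper's actual harness: the matrix size, repeat count, and the NumPy CPU matmul stand in for the paper's accelerator kernels, and the power figure would in practice come from a tool such as nvidia-smi (NVIDIA GPUs) or RAPL counters (Intel CPUs).

```python
import time
import numpy as np

def benchmark_matmul(n=1024, repeats=5):
    """Time a dense n x n float32 matrix multiply and report throughput.

    A matmul of two n x n matrices costs roughly 2 * n**3 floating-point
    operations, which gives the GFLOPS estimate below.
    """
    a = np.random.rand(n, n).astype(np.float32)
    b = np.random.rand(n, n).astype(np.float32)
    a @ b  # warm-up run, excluded from timing

    start = time.perf_counter()
    for _ in range(repeats):
        a @ b
    elapsed = (time.perf_counter() - start) / repeats  # seconds per matmul

    gflops = 2 * n**3 / elapsed / 1e9
    return elapsed, gflops

def energy_joules(avg_power_watts, elapsed_seconds):
    """Energy = average power x time. The power sample would come from a
    vendor tool (e.g. nvidia-smi, rocm-smi) rather than being hard-coded."""
    return avg_power_watts * elapsed_seconds
```

Dividing the GFLOPS figure by the measured power (GFLOPS per watt) gives the kind of energy-efficiency metric the study compares across processors.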

Original language: English
Title of host publication: Proceedings - 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGRID 2020
Editors: Laurent Lefevre, Carlos A. Varela, George Pallis, Adel N. Toosi, Omer Rana, Rajkumar Buyya
Publisher: IEEE
Pages: 744-751
Number of pages: 8
ISBN (Electronic): 9781728160955
DOIs
Publication status: Published - May 2020
Event: 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGRID 2020 - Melbourne, Australia
Duration: 11 May 2020 - 14 May 2020

Publication series

Name: Proceedings - 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGRID 2020

Conference

Conference: 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGRID 2020
Country/Territory: Australia
City: Melbourne
Period: 11/05/20 - 14/05/20

Scopus Subject Areas

  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality

User-Defined Keywords

  • AI Accelerator
  • Computation-intensive Operations
  • Convolution Neural Networks
  • CPU
  • Deep Learning
  • Deep Speech 2
  • GPU
  • Recurrent Neural Networks
  • TPU
  • Transformer
