Benchmarking the memory hierarchy of modern GPUs

Xinxin Mei, Kaiyong Zhao, Chengjian Liu, Xiaowen CHU

Research output: Chapter in book/report/conference proceeding > Conference proceeding > peer-review

41 Citations (Scopus)

Abstract

Memory access efficiency is a key factor for fully exploiting the computational power of Graphics Processing Units (GPUs). However, many details of the GPU memory hierarchy are not released by the vendors. We propose a novel fine-grained benchmarking approach and apply it to two popular GPUs, namely Fermi and Kepler, to expose the previously unknown characteristics of their memory hierarchies. Specifically, we investigate the structures of different cache systems, such as data cache, texture cache, and the translation lookaside buffer (TLB). We also investigate the impact of bank conflicts on shared memory access latency. Our benchmarking results offer a better understanding of the mysterious GPU memory hierarchy, which can help in software optimization and the modelling of GPU architectures. Our source code and experimental results are publicly available.

Original language: English
Title of host publication: Network and Parallel Computing - 11th IFIP WG 10.3 International Conference, NPC 2014, Proceedings
Publisher: Springer Verlag
Pages: 144-156
Number of pages: 13
ISBN (Print): 9783662449165
Publication status: Published - 2014
Event: 11th IFIP WG 10.3 International Conference on Network and Parallel Computing, NPC 2014 - Ilan, Taiwan, Province of China
Duration: 18 Sept 2014 to 20 Sept 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8707 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 11th IFIP WG 10.3 International Conference on Network and Parallel Computing, NPC 2014
Country/Territory: Taiwan, Province of China
City: Ilan
Period: 18/09/14 to 20/09/14

Scopus Subject Areas

  • Theoretical Computer Science
  • General Computer Science
