TY - GEN
T1 - ESet
T2 - 25th International Conference on Computer Communications and Networks, ICCCN 2016
AU - Liu, Chengjian
AU - Chu, Xiaowen
AU - Liu, Hai
AU - Leung, Yiu Wing
N1 - Publisher Copyright: © 2016 IEEE. Copyright 2017 Elsevier B.V., All rights reserved.
PY - 2016/9/14
Y1 - 2016/9/14
AB - Erasure coding has been extensively deployed in distributed storage systems to ensure high reliability and low storage overhead. However, erasure coding requires much more disk I/O to recover a damaged data block than replication does, resulting in very long data recovery times. Data placement algorithms can be tailored to speed up the data recovery process by exploiting I/O parallelism. However, existing algorithms that achieve good I/O parallelism for replication cannot be applied directly to erasure-coded storage systems, and other algorithms for both replication-based and erasure-coded storage systems overlook the importance of recovery I/O parallelism, which may jeopardize the service quality and reliability of these systems. In this paper, we present a data placement strategy named ESet that provides efficient recovery for each host in a distributed storage system. We define a configurable parameter, the overlapping factor, that lets system administrators easily achieve the desired recovery I/O parallelism. Our simulation results show that ESet can significantly improve data recovery performance without violating the reliability requirement, by distributing data and code blocks across different failure domains.
UR - http://www.scopus.com/inward/record.url?scp=84991745909&partnerID=8YFLogxK
U2 - 10.1109/ICCCN.2016.7568521
DO - 10.1109/ICCCN.2016.7568521
M3 - Conference proceeding
AN - SCOPUS:84991745909
T3 - 2016 25th International Conference on Computer Communications and Networks, ICCCN 2016
BT - 2016 25th International Conference on Computer Communications and Networks, ICCCN 2016
PB - IEEE
Y2 - 1 August 2016 through 4 August 2016
ER -