TY - GEN
T1 - Towards server-side repair for erasure coding-based distributed storage systems
AU - Chen, Bo
AU - Ammula, Anil Kumar
AU - Curtmola, Reza
N1 - Publisher Copyright:
Copyright © 2015 ACM.
PY - 2015/3/2
Y1 - 2015/3/2
N2 - Erasure coding is one of the main mechanisms to add redundancy in a distributed storage system, by which a file with k data segments is encoded into a file with n coded segments such that any k coded segments can be used to recover the original k data segments. Each coded segment is stored at a storage server. Under an adversarial setting in which the storage servers can exhibit Byzantine behavior, remote data checking (RDC) can be used to ensure that the stored data remains retrievable over time. The main previous RDC scheme to offer such strong security guarantees, HAIL, has an inefficient repair procedure, which puts a high load on the data owner when repairing even one corrupt data segment. In this work, we propose RDC-EC, a novel RDC scheme for erasure code-based distributed storage systems that can function under an adversarial setting. With RDC-EC we offer a solution to an open problem posed in previous work and build the first such system that has an efficient repair phase. The main insight is that RDC-EC is able to reduce the load on the data owner during the repair phase (i.e., lower bandwidth and computation) by shifting most of the burden from the data owner to the storage servers during repair. RDC-EC is able to maintain the advantages of systematic erasure coding: optimal storage for a certain reliability level and sub-file access. We build a prototype for RDC-EC and show experimentally that RDC-EC can efficiently handle large amounts of data.
KW - Cloud storage
KW - Erasure coding
KW - Remote data integrity checking
KW - Server-side repair
UR - http://www.scopus.com/inward/record.url?scp=84928156551&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84928156551&partnerID=8YFLogxK
U2 - 10.1145/2699026.2699122
DO - 10.1145/2699026.2699122
M3 - Conference contribution
AN - SCOPUS:84928156551
T3 - CODASPY 2015 - Proceedings of the 5th ACM Conference on Data and Application Security and Privacy
SP - 281
EP - 288
BT - CODASPY 2015 - Proceedings of the 5th ACM Conference on Data and Application Security and Privacy
PB - Association for Computing Machinery
T2 - 5th ACM Conference on Data and Application Security and Privacy, CODASPY 2015
Y2 - 2 March 2015 through 4 March 2015
ER -