James S. Plank, Department of Computer Science, University of Tennessee, 203 Claxton Complex, Knoxville, TN 37996, plank@cs.utk.edu, 865-974-4397
Lihao Xu, Department of Computer Science, Wayne State University, 5143 Cass Avenue, Detroit, MI 48202, lihao@cs.wayne.edu
The 5th IEEE International Symposium on Network Computing and Applications (IEEE NCA06), Cambridge, MA, July 2006. http://www.cs.utk.edu/~plank/plank/papers/NCA-2006.html
NOTE: NCA's page limit is rather severe: 8 pages. As a result, the final paper is pretty much a hatchet job of the original submission. I would recommend reading the technical report version of this paper, because it presents the material with some accompanying tutorial material and is easier to read. The technical report is available at: http://www.cs.utk.edu/~plank/plank/papers/CS-05-569.html. Please cite this paper, however. If this work is published in a journal, I will put a link to it on the above web sites.
Optimizing Cauchy Reed-Solomon Codes for Fault-Tolerant Network Storage Applications
James S. Plank, Department of Computer Science, University of Tennessee, Knoxville, TN 37996, plank@cs.utk.edu

Abstract
In the past few years, all manner of storage applications, ranging from disk array systems to distributed and wide-area systems, have started to grapple with the reality of tolerating multiple simultaneous failures of storage nodes. Unlike the single-failure case, which is optimally handled with RAID Level-5 parity, the multiple-failure case is more difficult because optimal general-purpose strategies are not yet known. Erasure coding is the field of research that deals with these strategies, and this field has blossomed in recent years. Despite this research, the decades-old Reed-Solomon erasure code remains the only space-optimal (MDS) code for all but the smallest storage systems. The best performing implementations of Reed-Solomon coding employ a variant