Article

Overview of Binary Locally Repairable Codes for Distributed Storage Systems

1
Department of Information and Communication Engineering, Chosun University, Gwangju 61452, Korea
2
Department of Electrical and Computer Engineering, Institute of New Media and Communications, Seoul National University, Seoul 08826, Korea
*
Author to whom correspondence should be addressed.
Electronics 2019, 8(6), 596; https://doi.org/10.3390/electronics8060596
Submission received: 9 April 2019 / Revised: 20 May 2019 / Accepted: 22 May 2019 / Published: 28 May 2019

Abstract

This paper summarizes the details of recently proposed binary locally repairable codes (BLRCs) and their features. The construction of codes over a small alphabet size of symbols is of particular interest for efficient hardware implementation. Therefore, BLRCs are highly noteworthy because no multiplication is required during the encoding, decoding, and repair processes. We explain the various construction approaches of BLRCs such as cyclic code based, bipartite graph based, anticode based, partial spread based, and generalized Hamming code based techniques. We also describe code generation methods based on modifications of linear codes such as extending, shortening, expurgating, and augmenting. Finally, we summarize and compare the parameters of the discussed constructions.

1. Introduction

Efficient distributed storage systems (DSSs) are considered to be crucial infrastructure for handling big data. These systems must be able to reliably store data over a long duration by introducing redundancy and storing data in a distributed manner across several storage nodes, which may be individually unreliable and could generate failures. Large data centers and peer-to-peer storage systems such as OceanStore [1] from Berkeley and BigTable from Google [2] are famous examples of distributed storage systems.
Owing to cost issues, large data centers use many commercial storage devices such as hard disk drives and solid state drives (HDDs/SSDs). As a result, device failure occurs regularly, rather than as an exception. The data are therefore typically stored in a redundant manner to protect them against potential failures. The traditional storage method for large storage services such as cloud storage is triplication, i.e., triple replication of each symbol. For example, the Google file system [3] and Hadoop [4] adopt this approach. However, because triplication requires three times the storage space, Facebook deploys a (14, 10) Reed–Solomon (RS) code in its warehouse cluster [5]. Although RS codes are efficient for handling a specified number of erasures, all of the code symbols must be communicated and reconstructed to repair erasures. Thus, more efficient storage methods have been actively researched, including regeneration codes (RCs), fractional repetition codes (FRCs), and locally repairable codes (LRCs) [6,7,8,9,10,11,12]. An RC attempts to minimize the number of transmitted symbols, while the objective of an LRC is to minimize the number of disk reads required to repair a single lost node. In some respects, an LRC is essentially a block code with an additional parameter referred to as locality. There have been excellent reviews on distributed storage codes (e.g., [13,14,15,16]). Moreover, a review article on this topic has recently been published [17]. However, to the best of the authors' knowledge, no review paper deals only with binary LRC (BLRC) constructions, which are practically useful.
In most of the early suggestions for LRC constructions, the alphabet size of the stored symbols is very large. However, for efficient and convenient hardware implementation, the construction of codes over a small alphabet size for the stored symbols is of particular interest. For example, BLRCs are of special interest because multiplication is not necessary during the encoding, decoding, and repair processes.
This paper summarizes the recently proposed constructions of BLRCs and their features. The code construction methods discussed in this paper are categorized as in Figure 1. The construction methods of BLRCs are explained using cyclic code based, bipartite graph based, anticode based, partial spread based, and generalized Hamming code based approaches. In addition, the construction of BLRCs using modification methods for linear codes such as extending, shortening, expurgating, augmenting, and lengthening is discussed. This paper is organized into several sections. In Section 2, the basic concepts used in the coding techniques for distributed storage systems are introduced. In addition, the characteristics of RC, LRC, and FRC are explained, including the meaning of locality and availability. In Section 3, generation methods of LRCs are summarized with respect to individual types and features, with a focus on BLRC. Finally, the main conclusions are summarized in Section 4.

2. Preliminaries

2.1. Classification of Storage Codes for DSS

There are several types of codes for data storage systems such as regeneration codes, locally repairable codes, and fractional repetition codes. Regeneration codes are a class of codes that enhance data reliability and facilitate the efficient repair of failed nodes in distributed storage systems [18,19]. The key metric of these codes is the network bandwidth, which is intended to optimize the amount of data communicated to repair a single failure node. In the case of node failure, it is necessary to recover the data stored in the failed node or restore them in the replacement node. This is called repair or regeneration of a node. During the repair process, data are typically downloaded from the remaining nodes. In this case, downloading the entire message is a waste of network resources. Therefore, regeneration codes are introduced to reduce the amount of downloaded data during the repair process while retaining the storage efficiency of traditional maximum distance separable (MDS) codes.
The earliest LRCs were proposed as pyramid codes [20,21,22]. The formal definition of LRC with a tradeoff between locality r and the minimum distance d first appears in [23]. LRCs focus on optimization of the number of nodes accessed for node repair and reconstruction. These codes are introduced in [24] and developed further in [25,26]. In addition, LRCs were recently utilized in distributed storage systems, such as Windows Azure storage [27] and Facebook HDFS-RAID [28].
There are several approaches for the construction of efficient storage codes for distributed storage systems as follows:
nonlinear codes [25,29];
vector codes [30,31,32];
codes over bounded alphabets [33];
codes with short local MDS [24,30]; and
codes with local regeneration [30,32].
A more detailed review of each method can be found in [17].

2.2. Locality, Recoverability, and Availability for Hot Data

Several criteria are used to evaluate the performance of distributed storage codes. This subsection introduces the concepts and definitions of the most important ones, including locality, reliability, and availability.
Let C be an $(n, k)$ q-ary code of length n and dimension k over a finite field $\mathbb{F}_q$. The locality of the ith coordinate of C is r if the value of the ith symbol of a codeword of C can be represented as a function of r other coordinates, and no such set of coordinates with cardinality less than r exists. For a linear code, this means that a coordinate has locality r if it can be expressed as a linear combination of r other coordinates. The set of such r coordinates that can repair the ith symbol is called a “repair set”. An $(n, k)$ code C with locality r is denoted as an $(n, k, r)$ locally repairable code. In addition to maximizing the distance between codewords, the maximally recoverable LRC (MR-LRC) is defined as a code that can correct all theoretically correctable erasure patterns under the locality constraints.
If the ith symbol c i in a codeword is lost, it can be recovered by reading r other symbols in the codeword. In this case, the locality can be classified into two cases: “information locality r” if all information symbols have locality r, and “all-symbol locality r” if all symbols have locality r. In the case of node failure, the decoding complexity of LRCs can be decoupled from the code length n.
Other construction schemes for LRCs are intended to build codes with maximal recoverability (MR) called MR-LRCs, or partial MDS codes. Some examples are found in [34,35,36,37]. For MR-LRCs, it is important to not only maximize the global distance but also to correct any erasure patterns within a theoretical bound. Therefore, they are considered as a stronger class of LRCs than optimal LRCs [37].
Another important performance criterion is availability [38,39,40,41]. Availability is a very important feature when “hot data” are accessed. Hot data are data that are frequently accessed simultaneously by many users in front-end systems. A binary linear code C of length n is called a t-available r-locally repairable code if every coordinate i, $1 \le i \le n$, has at least t parity checks, each with $r + 1$ nonzero elements, whose supports are disjoint apart from coordinate i. A symbol has availability t if it can be read in parallel by t disjoint groups of symbols. These t reads have locality r if each read involves up to r symbols. Replication provides high availability for hot data. For example, if replication is performed three times, each symbol can be read in parallel three times, so the availability is $t = 3$ and the locality of these reads is $r = 1$. One possible solution is an LRC with multiple disjoint recovery sets.
There are two types of availabilities, namely information-symbol availability and all-symbol availability. If an ( n , k , r , d ) LRC supports availability t for local repair on each of k information symbols, it is referred to as an ( n , k , r , t , d ) LRC with information symbol availability. If an ( n , k , r , d ) LRC supports availability t for all n symbols, it is referred to as an ( n , k , r , t , d ) LRC with all-symbol availability [42].

3. Binary Locally Repairable Codes

When LRCs were first introduced, no restriction was placed on the field size. For the Singleton-like bound in [31], an optimal construction matching the bound exists for field size $q > n + 1$, where the optimal LRCs are constructed using an algebraic structure. However, the coding complexity can be significantly reduced using BLRCs.
Compared to q-ary LRCs, BLRCs are known to be advantageous in terms of implementation in practical systems. In [43], the advantages of ( n , k , d , r ) = ( 15 , 10 , 4 , 6 ) BLRC are discussed and compared with ( 16 , 10 , 4 , 5 ) non-binary LRC, (14,10) RS code, and three-replication with four metrics including encoding complexity, repair complexity, mean time to data loss, and storage capacity. The authors of [43] further analyzed the advantages of BLRCs with a high Hamming distance and average locality [44,45]. In this section, we introduce bounds for BLRCs and various construction methods of BLRCs.

3.1. Bounds for the Binary Locally Repairable Codes

The bounds and constructions of BLRCs are quite different from those of q-ary LRCs. For the bound, the maximum code dimension of BLRCs is smaller than that of q-ary LRC and the corresponding optimal construction of the former should be made by different motivations such as easy implementation. Initially, we discuss the useful bounds for BLRCs.
Let us start with a general bound on LRC that shows a tradeoff relationship between rate k / n , minimum distance d, and locality r [23]. For linear LRCs with information locality r, there are tradeoffs among n, k, d, and r. Let C be an ( n , k , r ) LRC. Assuming that r | k and ( r + 1 ) | n , the rate is bounded as follows:
$$\frac{k}{n} \le \frac{r}{r+1}.$$
In addition, the minimum distance is bounded by [31]
$$d \le n - k - \left\lceil \frac{k}{r} \right\rceil + 2,$$
which is called a Singleton-like bound because it is a generalization of the classical Singleton bound for linear codes, which is recovered when $r = k$. It is well-known that a q-ary $(n, k, d)$ MDS code achieves the Singleton bound. An optimal $(n, k, r)$ LRC achieves the Singleton-like bound with equality. We can consider the two extreme cases $r = k$ and $r = 1$. For $r = k$, we have $d \le n - k + 1$, and an $(n, k)$ RS code is an $(n, k, r = k)$ optimal LRC. For $r = 1$, we have $d \le n - k - k + 2 = 2(n/2 - k + 1)$, and the duplication of an $(n/2, k)$ RS code is an $(n, k, r = 1)$ optimal LRC. Therefore, we are interested in the case of $1 < r < k$.
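As a quick numerical check of the Singleton-like bound, the following minimal Python sketch evaluates $n - k - \lceil k/r \rceil + 2$ for given parameters. The function name `singleton_like_bound` is an illustrative choice and does not come from the cited works.

```python
def singleton_like_bound(n: int, k: int, r: int) -> int:
    """Upper bound on the minimum distance d of an (n, k, r) LRC."""
    if not 1 <= r <= k <= n:
        raise ValueError("expected 1 <= r <= k <= n")
    ceil_k_over_r = -(-k // r)  # integer ceiling of k / r
    return n - k - ceil_k_over_r + 2


# r = k recovers the classical Singleton bound n - k + 1.
print(singleton_like_bound(14, 10, 10))
# r = 1 gives 2 * (n / 2 - k + 1), the duplication case.
print(singleton_like_bound(8, 3, 1))
```

The two calls correspond to the extreme cases $r = k$ and $r = 1$ discussed above; intermediate values of r interpolate between them.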
For the bounds of BLRCs, Cadambe–Mazumdar (C-M) [33], linear programming [46], and L -space bounds [47,48] are introduced. The first bound, considering the alphabet size, is given as
$$k \le \min_{t \in \mathbb{Z}^{+}} \left[ t r + k_{opt}^{(q)}\big(n - (r+1)t,\ d\big) \right], \tag{1}$$
where $k_{opt}^{(q)}(n, d)$ denotes the largest possible dimension of an $(n, k, d)$ linear code over $\mathbb{F}_q$. The C-M bound is often used to determine whether a given BLRC with short code length is optimal [32]. However, because the exact value of $k_{opt}^{(q)}(n, d)$ can only be obtained in limited cases with relatively short code length, it is difficult to apply the C-M bound to evaluate the optimality of general BLRCs.
In addition, a linear programming bound was proposed using the Delsarte linear programming method, which is known to be tighter than the C-M bound for BLRCs for some parameters [49]. However, both bounds are expressed in the implicit forms and, thus, it is difficult to apply these bounds to BLRCs with long code lengths.
For an $(n, k, d)$ linear LRC C, the L-space bound was recently proposed using sphere packing [47,48]. The L-space is defined as the dual of the linear space generated by a minimum set of local parity checks of C with overall support covering all coordinates. For an $(n, k, d, r)$ BLRC with disjoint repair groups, where $d = 2t + 2$ and $n = (r + 1) l$, the following bound holds, depending on the parity of $t + 1$ [50].
(i)
If t + 1 is odd, we have
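$$k \le \frac{rn}{r+1} - \left\lceil \log_2 \left( \sum_{0 \le i_1 + \cdots + i_l \le \lfloor (d-1)/4 \rfloor} \ \prod_{j=1}^{l} \binom{r+1}{2 i_j} \right) \right\rceil. \tag{2}$$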
(ii)
If t + 1 is even, we have
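$$k \le \frac{rn}{r+1} - \left\lceil \log_2 \left( \sum_{0 \le i_1 + \cdots + i_l \le \lfloor (d-1)/4 \rfloor} \ \prod_{j=1}^{l} \binom{r+1}{2 i_j} + \frac{\sum_{i_1 + \cdots + i_l = d/4} \ \prod_{j=1}^{l} \binom{r+1}{2 i_j}}{\lfloor n/(t+1) \rfloor} \right) \right\rceil. \tag{3}$$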
These bounds are advantageous in two ways compared to the previous bounds. Firstly, the L -space bound is known to be tighter than the C-M bound for BLRCs with long code lengths. In addition, the inequality of the bound is expressed in an explicit form, i.e., the value of the bound is easily derived for BLRCs with long code lengths. Furthermore, the improved L -space bound is induced with the refined packing radius for BLRCs with 4 | d [50].
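To illustrate how explicit the L-space bound is, the sketch below evaluates case (i), i.e., odd $t + 1$. It assumes the bound reads $k \le rn/(r+1) - \lceil \log_2 S \rceil$, where S is the sum of $\prod_{j} \binom{r+1}{2 i_j}$ over all $(i_1, \ldots, i_l)$ with $i_1 + \cdots + i_l \le \lfloor (d-1)/4 \rfloor$; the rounding brackets are our reading of the formula. The function name is hypothetical. The sum is computed as the total of the low-order coefficients of a truncated polynomial power.

```python
from math import comb


def lspace_bound_odd(n: int, r: int, d: int) -> int:
    """Upper bound on k for an (n, k, d, r) BLRC with disjoint repair groups,
    d = 2t + 2 with t + 1 odd (assumed form of the L-space bound)."""
    if n % (r + 1) != 0:
        raise ValueError("(r + 1) must divide n")
    if d < 2 or d % 2 != 0 or (d // 2) % 2 != 1:
        raise ValueError("need d = 2t + 2 with t + 1 odd")
    l = n // (r + 1)          # number of disjoint repair groups
    cap = (d - 1) // 4        # maximum allowed value of i_1 + ... + i_l
    base = [comb(r + 1, 2 * i) for i in range(cap + 1)]
    poly = [1] + [0] * cap    # coefficients of base(x)^0, truncated at x^cap
    for _ in range(l):
        nxt = [0] * (cap + 1)
        for a, ca in enumerate(poly):
            for b, cb in enumerate(base):
                if a + b <= cap:
                    nxt[a + b] += ca * cb
        poly = nxt
    total = sum(poly)
    # (total - 1).bit_length() equals ceil(log2(total)) for integers total >= 1.
    return r * l - (total - 1).bit_length()


print(lspace_bound_odd(15, 2, 6))
```

For $r = 2$ and $d = 6$ the sum reduces to $1 + 3l$, so the bound becomes $k \le 2l - \lceil \log_2(1 + 3l) \rceil$.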
A bound in an explicit form for $d \ge 5$ is given in [48]. For an $(n, k, d)$ linear BLRC with locality r, such that $d \ge 5$ and $2 \le r \le \frac{n}{2} - 2$, it follows that
$$k \le \frac{nr}{r+1} - \min \left\{ \log_2 \left( 1 + \frac{rn}{2} \right),\ \frac{rn}{(r+1)(r+2)} \right\}. \tag{4}$$
In the next subsection, we introduce the construction of BLRCs with various parameters and motivations, some of which are optimal or near-optimal with respect to the aforementioned bounds.

3.2. Classification of Binary Locally Repairable Codes

For the construction of BLRCs, various methods have been proposed based on the following:
(i)
cyclic codes [51,52,53,54];
(ii)
random vectors [42];
(iii)
bipartite graph [44,55,56];
(iv)
anticodes [57];
(v)
partial spread [50,58];
(vi)
generalized Hamming code [47,48]; and
(vii)
modification of codes [53,59].
In the following subsections, the various types of constructions of BLRCs are summarized.

3.3. BLRCs from Cyclic Codes

Goparaju and Calderbank proposed several constructions of BLRCs from cyclic codes [51]. Cyclic codes inherently enjoy efficient structures for encoder and decoder implementation. The q-cyclotomic coset M i , n is defined as
$$M_{i,n} = \{ i q^{j} \bmod n \mid 0 \le j < a \},$$
where a is the smallest positive integer that satisfies $i q^{a} \equiv i \pmod n$. The defining set of an $(n, k, d)$ cyclic code C is defined as
$$D_C = \{ i \mid g(\alpha^{i}) = 0,\ 0 \le i \le n - 1 \},$$
where the generator polynomial $g(x)$ has roots in the splitting field $\mathbb{F}_{q^s}$, $n \mid (q^s - 1)$. Using optimal cyclic codes in terms of the Singleton bound, three BLRC constructions are suggested as follows.
Construction (CC1) [51]:
Let $n = 2^m - 1$, let $r + 1$ be a factor of n, and let α be a primitive element of $\mathbb{F}_{2^m}$. Let C be a cyclic code whose generator polynomial $g(x)$ has the defining set
$$D_C = \{ j \mid j \equiv 0 \pmod{r+1},\ 0 \le j \le n - 1 \}.$$
Then, C is an LRC with locality r and dimension $k = rn/(r + 1)$.
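A small worked example may help here. It assumes the defining set of Construction (CC1) is the set of multiples of $r + 1$, which is consistent with the stated dimension; under that assumption the generator polynomial is $g(x) = x^{\nu} + 1$ with $\nu = n/(r+1)$, and every codeword satisfies $c_i + c_{i+\nu} + \cdots + c_{i+r\nu} = 0$ (indices modulo n). The function names below are hypothetical.

```python
def encode_cc1(message, n, r):
    """Encode k = n - n/(r+1) message bits as m(x) * (x^nu + 1) over GF(2)."""
    if n % (r + 1) != 0:
        raise ValueError("(r + 1) must divide n")
    nu = n // (r + 1)
    if len(message) != n - nu:
        raise ValueError("message must have k = n - n/(r+1) bits")
    codeword = [0] * n
    for j, bit in enumerate(message):
        codeword[j] ^= bit        # contribution of m_j * x^j
        codeword[j + nu] ^= bit   # contribution of m_j * x^(j + nu)
    return codeword


def repair_symbol(codeword, i, n, r):
    """Recover symbol i from the r other symbols of its repair group."""
    nu = n // (r + 1)
    value = 0
    for step in range(1, r + 1):
        value ^= codeword[(i + step * nu) % n]
    return value


msg = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]          # k = 10 bits for n = 15, r = 2
cw = encode_cc1(msg, 15, 2)
print(all(repair_symbol(cw, i, 15, 2) == cw[i] for i in range(15)))
```

Each repair reads only r symbols and uses XOR alone, which is the practical appeal of the binary setting.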
Construction (CC2) [51]:
Let $n = 2^m - 1$ with even m, and let the locality be $r = 2$. Let C be a cyclic code in which the generator polynomial $g(x)$ has the defining set
$$D_C = \{ j \mid j \equiv 0 \pmod 3,\ 0 \le j \le n - 1 \} \cup M_{1,n}.$$
Then, C is an LRC of dimension $k = \frac{2}{3}(2^m - 1) - m$ and distance $d \ge 6$.
Construction (CC2) is shown to be distance-optimal among the set of linear codes that have disjoint locality parity checks.
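The dimension claimed for Construction (CC2) can be checked by counting the elements of the defining set, since $k = n - |D_C|$ for a cyclic code with simple roots. The sketch below assumes that the first part of the defining set is the set of multiples of 3 and that cosets are taken with $q = 2$; the function names are illustrative.

```python
from math import gcd


def cyclotomic_coset(i, n, q=2):
    """Return the q-cyclotomic coset of i modulo n as a list."""
    if gcd(q, n) != 1:
        raise ValueError("q and n must be coprime")
    coset, x = [], i % n
    while x not in coset:
        coset.append(x)
        x = (x * q) % n
    return coset


def cc2_defining_set(m):
    """Multiples of 3 together with the coset of 1, for n = 2^m - 1, m even."""
    if m % 2 != 0:
        raise ValueError("m must be even so that 3 divides n")
    n = 2 ** m - 1
    return n, set(range(0, n, 3)) | set(cyclotomic_coset(1, n))


n, defining_set = cc2_defining_set(4)
k = n - len(defining_set)
print(n, sorted(defining_set), k)
```

For $m = 4$ this gives a code of length 15 and dimension 6, in agreement with $k = \frac{2}{3}(2^m - 1) - m$.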
Construction (CC3) [51]:
Let $n = 2^m - 1$. Let α be a primitive element of $\mathbb{F}_q$. The generator polynomial with the defining set
$$D_C = \{ j \mid j \equiv 0 \pmod 3,\ 0 \le j \le n - 1 \} \cup M_{1,n} \cup M_{-1,n}$$
can construct a BLRC that satisfies the inequality $k \ge \frac{2}{3}(2^m - 1) - 2m$ for even k, $d = 10$, and $r = 2$.
The BLRC construction from the ( 7 , 4 , 3 ) binary Hamming code is expressed in the following construction.
Construction (CC4) [51]:
For $3 \mid m$, we have $7 \mid n$ when $n = 2^m - 1$. Let C be a cyclic code in which the generator polynomial $g(x)$ has the defining set
$$D_C = \{ j \mid j \bmod 7 \in \{0, 3, 5, 6\},\ 0 \le j \le n - 1 \}.$$
Then, C is a three-available two-local LRC with dimension $k = 3n/7$ and minimum distance $d = 4$. The corresponding parity check polynomial $h(x)$ is then given as
$$h(x) = 1 + x^{n/7} + x^{3n/7}.$$
Extending the results in [51], Zeh and Yaakobi proposed several construction methods for BLRCs in [52]. These constructions generate BLRCs with locality 2. Construction (CC5) is based on binary reversible codes. Let $D_C^{[l]}$ be the set $\{ (i + l) \mid i \in D_C \}$. Let $D_L$ be the defining set of an $(r + 1, r, 2)$ single parity check code, which can correct one erasure in a block of length $r + 1$. Then, a BLRC can be obtained as in Construction (CC5).
Construction (CC5) [52]:
For odd m, let $n = 2^m + 1$, so that $3 \mid n$. Let L be a $(3, 2, 2)$ single parity check code with $D_L = \{0\}$, and let the defining set be given as
$$D_C = D_L \cup D_L^{[3]} \cup \cdots \cup D_L^{[n-3]} \cup M_{1,n} = \{ j \mid j \equiv 0 \pmod 3,\ 0 \le j \le n - 1 \} \cup M_{1,n}.$$
The corresponding code C is then an $(n, k, d, r)$ BLRC, where $k = \frac{2}{3}(2^m + 1) - 2m$, $d \ge 10$, and $r = 2$.
In addition, Construction (CC4) was extended to obtain codes with a higher Hamming distance at the cost of a small reduction of the rate as follows:
Construction (CC6) [52]:
Let $n = 2^m - 1$ and $7 \mid n$ (i.e., $3 \mid m$). Let $D_C$ be the defining set given as
$$D_C = \{ j \mid j \bmod 7 \in \{0, 3, 5, 6\},\ 0 \le j \le n - 1 \} \cup M_{1,n} = \{ \ldots, -9, -8, -7, -4, -2, -1, 0, 3, 5, 6, 7, 10, 12, 13, \ldots \} \cup M_{1,n}.$$
Then, the corresponding code C is a BLRC with $k = 3n/7 - m$, $d \ge 12$, locality $r = 2$, and availability $t = 2$.
This construction was extended to a construction based on the $(2^a - 1, a, 2^{a-1})$ simplex code L with availability $2^{a-1} - 1$ and locality 2 as follows.
Construction (CC7) [52]:
Let $n = 2^m - 1$, which is divisible by $2^a - 1$ (i.e., $a \mid m$). Let L be a $(2^a - 1, a, 2^{a-1})$ cyclic simplex code with the defining set given as
$$D_L = \{ 0, 3, 5, 6, 2^{a-1} + 1, \ldots, 2^a - 1 \} \cup M_{1,n}.$$
The corresponding code C is then a BLRC with $d \ge 2^a + 2^{a-1}$, $r = 2$, $t = 2^{a-1} - 1$, and dimension $k = \frac{a}{2^a - 1}(2^m - 1) - m$.
Another example of BLRCs was proposed by Tamo, Barg, Goparaju, and Calderbank in 2016 as in the following construction.
Construction (CC8) [54]:
Let α be an nth root of unity and let z be an integer such that $(2^z - 1) \mid n$ and $z \ge 1$. Let D be an $(n, k)$ binary cyclic code whose defining set contains the coset $\alpha G_{2^z - 1}$ of the group $G_{2^z - 1} = \langle \alpha^{2^z - 1} \rangle$. Then, the locality of D is bounded as $r \le 2^{z-1} - 1$. Moreover, each symbol of the codewords in D has at least $2^{z-1}$ recovery sets $A_i$ of size $2^{z-1} - 1$.
A BLRC that can satisfy the explicit bound given in Equation (4) is also proposed in [60] as follows:
Construction (CC9) [60]:
For $(r + 1) \mid n$, let $v = \frac{n}{r+1}$ and $u = r + 1$, where $\gcd(u, v) = 1$ and $u, v \ge 2$. Let $g(x)$ be a generator polynomial of the cyclic BLRC and β be a uth root of unity. Then, a $(uv,\ uv - \deg(g(x)),\ 4,\ u - 1)$ BLRC can be constructed using the generator polynomials given by
(i) 
For $2 \mid r$, $g(x) = (x^v + 1)\, g_1(x)$, where $g_1(x)$ is the minimal polynomial of β over $\mathbb{F}_2$.
(ii) 
For $r = 2^m - 1$, $g(x) = (x^v + 1)(x + 1)^{2^{m-1}}$, where m is a positive integer.

3.4. BLRCs from Random Vectors

A family of high-rate BLRCs with locality two and uneven availabilities was proposed in [42], which requires intermediate procedures. The uneven availability is represented as an availability profile. For its construction, a k-tuple binary column vector $z_k$ with a nonzero element at a random position is required. Let $Z(x)$ be a random function that converts x into a binary vector of the same length by changing a zero element into a nonzero element. From $z_k$, $k \times k$ square matrices $P_{k,l}$ for $1 \le l \le k - 1$ are constructed individually by increasing l as follows:
$$P_{k,l} = \left[ Z^{l}(z_k) \ \ Z^{l}_{(1)}(z_k) \ \ \cdots \ \ Z^{l}_{(k-1)}(z_k) \right],$$
where $Z^{l}(z_k)$ is generated from $Z^{l-1}(z_k)$ by the lexicographical order of construction, and $Z^{l}_{(i)}(z_k)$ is the vector obtained by cyclically shifting $Z^{l}(z_k)$ downward i times. Then, a $k \times k(k - 2)$ matrix $P_k$ for the parity part of the generator matrix in a systematic form is generated by concatenating the matrices $P_{k,1}, P_{k,2}, \ldots, P_{k,k-2}$ as follows:
$$P_k = \left[ P_{k,1} \ P_{k,2} \ \cdots \ P_{k,k-2} \right] = \left[ Z^{1}(z_k) \ \ Z^{1}_{(1)}(z_k) \ \cdots \ Z^{1}_{(k-1)}(z_k) \ \ \cdots \ \ Z^{k-2}(z_k) \ \ Z^{k-2}_{(1)}(z_k) \ \cdots \ Z^{k-2}_{(k-1)}(z_k) \right].$$
Construction (RV) [42]:
Let $G(n, k)$ denote the generator matrix of the proposed $(n, k)$ BLRC C in a systematic form. Then, a $k \times n$ systematic generator matrix $G(n, k)$ is constructed as
$$G(n, k) = [\, I_k \mid P_k \,].$$
It should be noted that the $k \times k(k - 1)$ generator matrix $G(n, k)$ has a code rate of $R = 1/(k - 1)$.
An $(n, k)$ BLRC C from Construction (RV) has all-symbol locality $r = 2$, and the all-symbol availability profile is given by
$$t = [\, k - 1, \ldots, k - 1,\ 2, \ldots, 2,\ 1, \ldots, 1 \,],$$
where the numbers of $(k - 1)$s, 2s, and 1s are k, $k(k - 3)$, and k, respectively, and the ith value denotes the availability for local repair of the ith symbol of a codeword in C.

3.5. BLRCs from Bipartite Graph

In coding theory, a Tanner graph is a bipartite graph with two sets of vertices, a set of n variable nodes and a set of $(n - k)$ check nodes, that represents the constraints of an error correcting code. Suppose that the n variable nodes are partitioned into $l = n/(r + 1)$ groups. All variable nodes in each group are linked to a unique check node called the local check node, and the other check nodes are called the global check nodes. Then, the constructed BLRC can achieve locality r for all symbols.
Construction (BG) [44]:
Let $H_{BL} = I_{\frac{n}{r+1}} \otimes \mathbf{1}_{r+1} \in \mathbb{F}_2^{\frac{n}{r+1} \times n}$ and $H_{BG} = \mathbf{1}_{\frac{n}{r+1}} \otimes H_0(r) \in \mathbb{F}_2^{\lceil \log_2 (r+1) \rceil \times n}$, where ⊗ denotes the Kronecker product, $\mathbf{1}_{r+1}$ denotes the all-one vector of length $r + 1$, and $H_0(r)$ is the parity check matrix of an $(r + 1,\ r + 1 - \lceil \log_2 (r+1) \rceil)$ Hamming code such as $H_0(r) = (0, 1, \ldots, r) \in \mathbb{F}_2^{\lceil \log_2 (r+1) \rceil \times (r+1)}$, whose columns are the binary expansions of $0, 1, \ldots, r$. Then, the parity check matrix of the BLRC based on a bipartite graph, with parameters $\left( n,\ \frac{rn}{r+1} - \lceil \log_2 (r+1) \rceil,\ 4,\ r \right)$, is given as
$$H = \begin{bmatrix} H_{BL} \\ H_{BG} \end{bmatrix} \in \mathbb{F}_2^{(n-k) \times n}.$$
The minimum distance of the code defined by the parity check matrix H in Construction (BG) is 4. This BLRC is optimal in some cases. Even when it is not optimal, it is shown that this code has a near-optimal code rate with a rate gap of $O\!\left( \frac{\log r}{n} \right)$.
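The parameters of Construction (BG) can be verified by brute force for a small instance. The sketch below assumes that the columns of $H_0(r)$ are the binary expansions of $0, 1, \ldots, r$ and packs each column of H into an integer (low bits for the local checks, high bits for the global checks). The helper names are hypothetical, and the search is only practical for short codes.

```python
from itertools import combinations


def bg_columns(n, r):
    """Columns of H = [H_BL; H_BG], one integer per column."""
    if n % (r + 1) != 0:
        raise ValueError("(r + 1) must divide n")
    groups = n // (r + 1)
    return [(1 << g) | (j << groups)      # local check bit | binary expansion of j
            for g in range(groups) for j in range(r + 1)]


def gf2_rank(vectors):
    """Rank over GF(2) of integer-packed vectors (XOR basis by leading bit)."""
    basis = {}
    for v in vectors:
        while v:
            lead = v.bit_length() - 1
            if lead not in basis:
                basis[lead] = v
                break
            v ^= basis[lead]
    return len(basis)


def smallest_dependent_set(columns, limit):
    """Size of the smallest set of columns that XOR to zero, up to limit."""
    for size in range(1, limit + 1):
        for subset in combinations(columns, size):
            acc = 0
            for col in subset:
                acc ^= col
            if acc == 0:
                return size
    return None


cols = bg_columns(12, 3)
k = len(cols) - gf2_rank(cols)
print(len(cols), k, smallest_dependent_set(cols, 4))
```

For $n = 12$ and $r = 3$ the sketch reports a code of dimension 7 in which no three columns are dependent but some four are, matching the stated parameters $(12, 7, 4, 3)$.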
In addition, an expander graph based construction of BLRC exists [55,56]. Suppose we have two sets V and C that satisfy the following conditions:
$|V| = n$, $|C| = \frac{nt}{r+1}$;
the degree of each $v \in V$ is t; and
the degree of each $c \in C$ is $r + 1$.
For $0 < \alpha, \gamma \le 1$, the bipartite graph $G = (V \cup C, E)$ is a $(t, r + 1, \alpha, t\gamma)$-expander if, for any subset $V' \subseteq V$, $|V'| \le \alpha n$ implies that the size of the subset of C connected to $V'$ is greater than $t \gamma |V'|$. In addition, the length of the shortest cycle of the graph G is greater than 4. As such, we can have the following construction:
Construction (EG) [55,56]:
Let $H_E$ be an $m \times n$ parity check matrix $[h_{i,j}]$, where $1 \le i \le m$ and $1 \le j \le n$, whose columns correspond to the vertices of V and whose rows correspond to the vertices of C. Then, $h_{i,j}$ is equal to one if the corresponding vertices $c_i$ and $v_j$ are connected with an edge. For $t < r + 1$, the code $C_E$ constructed from $H_E$ is an $(n, k, \delta, r, t)$ BLRC.
In Construction (EG), γ is chosen from the range $\left[ \frac{1}{1+r},\ 1 - \frac{1}{t} \right)$ and α is determined as a solution of the following equation:
$$\frac{(t-1)\, h(\alpha)}{t} - \frac{h(\alpha \gamma (r+1))}{r+1} - \delta \gamma (r+1)\, h\!\left( \frac{1}{\gamma (r+1)} \right) = 0,$$
where $h(x) = -x \log_2 x - (1 - x) \log_2 (1 - x)$. The probability that G is a $(t, r + 1, \alpha', t\gamma)$-expander is greater than $1 - O\!\left( n^{-(t(1-\gamma) - 1)} \right)$ for $0 < \alpha' < \alpha$. In addition, the code rate is bounded by
$$R \ge 1 - \frac{t}{r+1} - o(1),$$
where the equality holds when $H_E$ is a full rank matrix.

3.6. BLRC from Anticode

An anticode A of length n is a code that may contain repeated codewords in F 2 n and has an upper bound on the distance between codewords [61]. Contrary to the minimum distance in generic error correcting codes, the maximum distance δ is defined as the maximum Hamming distance between any pair of codewords in A . This anticode is a core ingredient of the following BLRC.
The generator matrix $G_A$ of the anticode A is a $k \times n$ matrix, and all codewords in A can be expressed as linear combinations of the k rows of $G_A$. If the rank of $G_A$ is γ, then each codeword in A occurs $2^{k - \gamma}$ times. Let $A_{s,2}$ be the anticode of length $n = \binom{s}{2}$ whose generator matrix $G_A$ has as its columns all weight-2 vectors of length s.
Construction (AC1) [57]:
Let $S_m$ be a binary simplex code of length $2^m - 1$, dimension m, and minimum Hamming distance $2^{m-1}$. Let $G_m$ be the generator matrix of $S_m$, whose columns consist of all possible nonzero vectors in $\mathbb{F}_2^m$. We prepend $m - s$ zeros to every column of the generator matrix $G_A$ of $A_{s,2}$ to construct an $m \times \binom{s}{2}$ matrix $G_A'$. By deleting the columns of $G_A'$ from $G_m$, we can construct a generator matrix G of the BLRC $C_{m,s,2}$ with parameters $\left( 2^m - \binom{s}{2} - 1,\ m,\ 2^{m-1} - \lfloor s^2/4 \rfloor \right)$ and locality 2.
For $3 \le s \le 5$, the code $C_{m,s,2}$ satisfies the C-M bound in Equation (1). Moreover, three instances of Construction (AC1) with locality $r = 2$ are listed in [57]:
The code $C_{m,3,2}$ from the anticode $A_{3,2}$ is a $(2^m - 4,\ m,\ 2^{m-1} - 2)$ LRC.
The code $C_{m,4,2}$ from the anticode $A_{4,2}$ is a $(2^m - 7,\ m,\ 2^{m-1} - 4)$ LRC.
The code $C_{m,5,2}$ from the anticode $A_{5,2}$ is a $(2^m - 11,\ m,\ 2^{m-1} - 6)$ LRC.
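The three instances above can be reproduced with a few lines of Python. This is a minimal sketch under the assumption that the deleted columns are exactly the weight-2 vectors supported on s fixed coordinates; columns of the generator matrix are packed into integers, and the function names are illustrative.

```python
def ac1_columns(m, s):
    """Columns of G: all nonzero m-bit vectors except the weight-2 vectors
    supported on the s low-order coordinates."""
    if not 2 <= s <= m:
        raise ValueError("need 2 <= s <= m")
    deleted = {(1 << a) | (1 << b) for a in range(s) for b in range(a + 1, s)}
    return [c for c in range(1, 1 << m) if c not in deleted]


def min_distance(columns, m):
    """Minimum weight over all nonzero messages u; bit j of the codeword is <u, column_j>."""
    return min(sum(bin(u & c).count("1") % 2 for c in columns)
               for u in range(1, 1 << m))


def has_locality_two(columns):
    """True if every column is the XOR of two other columns of the code."""
    present = set(columns)
    return all(any((c ^ a) in present for a in columns) for c in columns)


for m, s in [(4, 3), (4, 4), (5, 5)]:
    cols = ac1_columns(m, s)
    print(m, s, len(cols), min_distance(cols, m), has_locality_two(cols))
```

Because encoding and repair reduce to XORs of stored symbols, such checks run instantly for small m.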
Construction (AC2) [57]:
Let $A_{t;2,3,\ldots,t-1}$, $3 \le t \le m$, be an anticode such that its generator matrix $G_A$ consists of all columns of weight in $\{2, 3, \ldots, t - 1\}$. Then, $m - t$ zeros are prepended to every column of $G_A$ to form an $m \times \sum_{i=2}^{t-1} \binom{t}{i}$ matrix whose columns are deleted from $G_m$ to obtain a generator matrix G for the code $C_{m,t}$, which becomes a $(2^m - 2^t + t + 1,\ m,\ 2^{m-1} - 2^{t-1} + 2)$ LRC with locality $r = 2$.
This code achieves the Griesmer bound [62].
Construction (AC3) [57]:
Let A m 1 be an anticode with generator matrix given by
$$G_A = \begin{pmatrix} 1 & \mathbf{0} \\ \mathbf{0} & G_{m-1} \end{pmatrix},$$
where $G_{m-1}$ is the generator matrix of the simplex code $S_{m-1}$. Let C be a code obtained based on the Farrell construction using the simplex code $S_m$ and the anticode $A_{m-1}$. Then, C is a $(2^{m-1} - 1,\ m,\ 2^{m-2} - 1)$ BLRC with locality $r = 3$.
It is also shown that this code can satisfy the bound in Equation (1).
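The parameters of Construction (AC3) can also be checked numerically. The sketch assumes that the anticode generator matrix consists of the first unit vector together with the columns of $G_{m-1}$ padded with a leading zero, so that the columns remaining after deletion are all vectors of the form $(1, g)$ with nonzero $g \in \mathbb{F}_2^{m-1}$. This reading is an assumption, and the function names are hypothetical.

```python
from itertools import combinations


def ac3_columns(m):
    """Remaining columns (1, g), g nonzero, packed with the leading 1 as the top bit."""
    top = 1 << (m - 1)
    return [top | g for g in range(1, 1 << (m - 1))]


def min_distance(columns, m):
    """Minimum codeword weight over all nonzero messages."""
    return min(sum(bin(u & c).count("1") % 2 for c in columns)
               for u in range(1, 1 << m))


def locality(columns, target):
    """Smallest number of other columns whose XOR equals the target column."""
    others = [c for c in columns if c != target]
    for size in range(1, len(others) + 1):
        for subset in combinations(others, size):
            acc = 0
            for c in subset:
                acc ^= c
            if acc == target:
                return size
    return None


cols = ac3_columns(4)
print(len(cols), min_distance(cols, 4), {locality(cols, c) for c in cols})
```

For $m = 4$ the resulting code has length 7, minimum distance 3, and locality 3 for every symbol; locality 2 is impossible here because the XOR of two columns always has a zero in the leading position.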

3.7. BLRCs from Partial Spread

To introduce BLRCs constructed from partial spread, the definition of partial t-spread is given.
Definition [50]:
A partial t-spread of $\mathbb{F}_q^m$ is a collection $S = \{W_1, \ldots, W_l\}$ of t-dimensional subspaces of $\mathbb{F}_q^m$ such that $W_i \cap W_j = \{0\}$ for $1 \le i < j \le l$. Moreover, S is maximal if it has the largest possible size. In particular, if $\bigcup_{i=1}^{l} W_i = \mathbb{F}_q^m$, then S is a t-spread. If $t \mid m$, a t-spread of $\mathbb{F}_q^m$ exists.
Now, we can define a BLRC C with parity check matrix given by
$$H = \begin{bmatrix} H_L \\ H_G \end{bmatrix}. \tag{5}$$
Then, a BLRC C with parameters $\left( n,\ k \ge \frac{rn}{r+1} - t \lceil \log_2 n \rceil,\ d \ge 2t + 2,\ r \right)$ can be constructed in the following way:
Construction (PS1) [50]:
Let $\mathbf{1}_n$ be the all-one vector of length n. Let $H_L = I_{\frac{n}{r+1}} \otimes \mathbf{1}_{r+1}$ and let $H_G$ be a $t \lceil \log_2 n \rceil \times n$ matrix that has the binary expansions of the vectors $\{a_1, a_2, \ldots, a_n\}$ as its columns, where $a_i = (\beta_i, \beta_i^{3}, \ldots, \beta_i^{2t-1})^{T}$ and $\beta_1, \ldots, \beta_n$ are distinct elements of the finite field $\mathbb{F}_{2^{\lceil \log_2 n \rceil}}$. Then, the parity check matrix of a BLRC C is given as in Equation (5).
For the further extension of Construction (PS1), the parity check matrix can be given as
$$H = \begin{bmatrix} H_L \\ H_G \end{bmatrix} = \begin{bmatrix} H_L^{1} & H_L^{2} & \cdots & H_L^{l} \\ H_G^{1} & H_G^{2} & \cdots & H_G^{l} \end{bmatrix}, \tag{6}$$
where $l = \frac{n}{r+1}$. For $i \in [1, l]$, $H_L^{i}$ is an $l \times (r + 1)$ matrix whose ith row is the all-one vector of length $r + 1$ and whose other rows are all-zero vectors. Moreover, $H_G^{i}$ is the ith $(n - k - l) \times (r + 1)$ submatrix of $H_G = (H_G^{1} \ H_G^{2} \ \cdots \ H_G^{l})$. It is well-known that if any $d - 1$ columns of the parity check matrix H are linearly independent, the minimum distance of the linear code is greater than or equal to d. Furthermore, for a collection of any $a_i$ columns $\{c_1^{i}, c_2^{i}, \ldots, c_{a_i}^{i}\}$ of $H_G^{i}$, if $\sum_{i=1}^{l} \sum_{j=1}^{a_i} c_j^{i} \ne 0$, then $d \ge 2t + 2$, where $a_1, a_2, \ldots, a_l$ satisfy the following two conditions:
(i)
For $1 \le i \le l$, $a_i$ is even, where $0 \le a_i \le \min\{2t, r + 1\}$; and
(ii)
$2 \le \sum_{i=1}^{l} a_i \le 2t$.
Then, we can construct two k-optimal ( n , k , d , r ) BLRCs with disjoint repair groups as in the following construction.
Construction (PS2) [50]:
Let $r = 2^t$ and let $\{W_1, \ldots, W_a\}$ be a maximum partial 2t-spread of $\mathbb{F}_2^{s}$. In addition, let $\{e_1^{(i)}, e_2^{(i)}, \ldots, e_{2t}^{(i)}\}$ be a basis of $W_i$. For $t \ge 3$, there exists a $(2^t,\ 2^t - 2t,\ 5)$ binary linear code with the parity check matrix $H_b$. Let $\mathrm{supp}(x)$ be the set of indices corresponding to nonzero coordinates of a vector x. For $i \in [1, a]$, let $T^{(i)}$ be the set $\{0\} \cup \{ f_j \mid 1 \le j \le 2^t \}$, where $f_j = \sum_{u \in \mathrm{supp}(h_j)} e_u^{(i)}$ and $h_j$ is the jth column of $H_b$. When $t = 1, 2$, $T^{(i)} = \{0, e_1^{(i)}, e_2^{(i)}, \ldots, e_{2t}^{(i)}\}$. Let $H_G^{i}$ be an $s \times (2^t + 1)$ matrix whose columns consist of the vectors in $T^{(i)}$. Then, we can define a BLRC with a parity check matrix H as in Equation (6), where $\frac{s}{r} < l \le a$.
A set $T \subseteq \mathbb{F}$ is τ-wise weakly independent over $\mathbb{F}_2 \subseteq \mathbb{F}$ if no set $T' \subseteq T$, where $2 \le |T'| \le \tau$, has the sum of its elements equal to zero. Then, we have $d \ge 6$ if the columns of $H_G$ satisfy the following conditions:
(i)
$c_1^{i} + c_2^{i} \ne 0$ for $1 \le i \le l$;
(ii)
$c_1^{i} + c_2^{i} + c_3^{i} + c_4^{i} \ne 0$ for $1 \le i \le l$; and
(iii)
$c_1^{i} + c_2^{i} + c_1^{j} + c_2^{j} \ne 0$ for $1 \le i \ne j \le l$.
Construction (PS3) [50]:
Let r = 2 t + 2 ( t + 1 ) / 2 1 , and { W 1 , W 2 , , W a } be a maximum partial ( 2 t + 1 ) -spread of F 2 s and the basis of W i is { e 1 ( i ) , e 2 ( i ) , , e 2 t + 1 ( i ) } . When t 3 , there is a ( 2 t + 2 ( t + 1 ) / 2 1 , 2 t + 2 ( t + 1 ) / 2 2 t 2 , 5 ) binary linear code. Let T ( i ) be the same set in Construction (PS2) for 1 i a . For t = 1 , 2 , T ( i ) is defined as { 0 , e 1 ( i ) , e 2 ( i ) , , e 2 t + 1 ( i ) } . Let H G i be an s × ( 2 t + 1 ) matrix whose columns consist of the vectors in T ( i ) . Then, a BLRC C can be constructed using a parity check matrix H in Equation (6) for s r < l a .
Let $A_q(m, k, d)$ be the maximal cardinality of subspace codes over $\mathbb{F}_q^m$ with minimum distance $d$ and dimension $k$. Then, we can construct a BLRC as follows:
Construction (PS4) [50]:
Let $n = 3l$ such that $l \leq \frac{2^{2m+1} - 2}{3}$ for $m \geq 2$. Then, there exists an $(n, k, 6, 2)$ BLRC $C$ with dimension given as
$$k = \begin{cases} 2l - 2m, & \text{if } l \in [A_2(2m-1, 2, 4) + 2,\ A_2(2m, 2, 4)] \\ 2l - 2m - 1, & \text{if } l \in [A_2(2m, 2, 4) + 1,\ A_2(2m+1, 2, 4)], \end{cases}$$
where it is optimal with respect to the bound in Equation (2). The following construction is nearly optimal with respect to the bound in Equation (2).
Construction (PS5) [50]:
Let $\{W_1, W_2, \ldots, W_a\}$ be a maximum partial two-spread of $\mathbb{F}_2^s$. The basis of $W_i$ is given as $\{e_1^{(i)}, e_2^{(i)}\}$. Then, a $(4l, 3l - s - 1, 6, 3)$ BLRC $C$ with parity check matrix $H$ of the form in Equation (6) for $\frac{s+1}{3} < l \leq a$ can be constructed using the submatrices $H_G^i$ for $0 < i \leq l$, which are given as
$$H_G^i = \begin{pmatrix} 0 & e_1^{(i)} & e_2^{(i)} & e_1^{(i)} + e_2^{(i)} \\ 1 & 0 & 0 & 0 \end{pmatrix}.$$
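As a sanity check of Construction (PS5), the sketch below builds the parity check columns for $s = 4$ and $l = 3$ and computes $(n, k, d)$ by exhaustive search. It rests on several assumptions that are not in the source: vectors are encoded as integers, the partial two-spread is found greedily (so it need not be maximum), and the helper names are hypothetical.

```python
from itertools import combinations


def partial_2_spread(s):
    """Greedily pick 2-dim subspaces of F_2^s meeting only in zero.

    Each subspace is returned as a basis pair (e1, e2) of nonzero ints.
    A greedy search is an assumption of this sketch; it yields a partial
    spread, not necessarily a maximum one.
    """
    used, spread = set(), []
    for e1, e2 in combinations(range(1, 2 ** s), 2):
        members = {e1, e2, e1 ^ e2}
        if not members & used:
            spread.append((e1, e2))
            used |= members
    return spread


def ps5_columns(spread, l, s):
    """Columns of H: l group bits (H_L), then s spread bits, then one flag bit."""
    cols = []
    for i, (e1, e2) in enumerate(spread[:l]):
        top = 1 << (s + 1 + i)            # all-one row of H_L for group i
        cols += [top | 1,                 # column (0, 1)^T of H_G^i
                 top | (e1 << 1),
                 top | (e2 << 1),
                 top | ((e1 ^ e2) << 1)]
    return cols


def code_parameters(cols):
    """Brute-force (n, k, d) of the code whose parity check columns are cols."""
    n = len(cols)
    count, d = 0, n
    for x in range(2 ** n):
        syndrome = 0
        for j in range(n):
            if (x >> j) & 1:
                syndrome ^= cols[j]
        if syndrome == 0:
            count += 1
            if x:
                d = min(d, bin(x).count("1"))
    return n, count.bit_length() - 1, d


s, l = 4, 3
n, k, d = code_parameters(ps5_columns(partial_2_spread(s), l, s))
# The text gives n = 4l = 12 and k = 3l - s - 1 = 4; d should be at least 6.
print(n, k, d)
```

The exhaustive search visits all $2^n$ vectors, so it is practical only for small $l$.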
Another construction based on the partial t-spread is also proposed in [58]. Let $q$ be a prime power and $V_m(q)$ be the vector space of dimension $m$ over $\mathbb{F}_q$.
Construction (PS6) [58]:
Given an integer $r \geq 2$, determine the smallest integer $t$ such that $r + 1 \leq t + \lfloor t/2 \rfloor$. An integer $m$ such that $\frac{m+1}{r} \leq l$ can be chosen, and there exists a partial $t$-spread $S$ of $V_m(2)$ with a size of at least $l$. Let $B_i = \{b_{i,0}, b_{i,1}, \ldots, b_{i,t-1}\}$ be a basis of $W_i \in S$ and $C_i = \{c_{i,0}, c_{i,1}, \ldots, c_{i,\lfloor t/2 \rfloor - 1}\}$ be a set whose elements are defined as $c_{i,j} = b_{i,2j} + b_{i,2j+1}$ for $i = 0, 1, \ldots, l-1$ and $j = 0, 1, \ldots, \lfloor t/2 \rfloor - 1$. Finally, let $U_i = B_i \cup C_i$ for $i = 0, 1, \ldots, l-1$. Let $s$ be an integer such that $\frac{m+1}{r} \leq s \leq l$, and use any $r+1$ vectors in $U_i$ as the $r+1$ columns of each submatrix $H_G^i$ for $i = 0, 1, \ldots, s-1$. Then, the BLRC $\mathcal{C}_{s,m,r}$ has length $n = (r+1)s$, dimension $k = rs - m$, minimum distance $d \geq 6$, and locality $r$.
Then, the BLRCs $\mathcal{C}_{4,4,2}$ and $\mathcal{C}_{5,4,2}$ obtained from Construction (PS6) are optimal. In addition, for $s = 4, 5, \ldots, 9$, the BLRCs $\mathcal{C}_{s,5,2}$ from Construction (PS6) are almost optimal in terms of the C-M bound, and for $s = 3, 4, \ldots, 9$, the BLRCs $\mathcal{C}_{s,6,3}$ from Construction (PS6) are almost optimal with respect to the C-M bound.
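The parameter selection in Construction (PS6) is simple arithmetic and can be sketched as follows. The function names are hypothetical, and the sketch only evaluates the stated formulas; it does not check that a partial $t$-spread of the required size exists.

```python
def smallest_t(r):
    """Smallest integer t with r + 1 <= t + floor(t / 2)."""
    t = 1
    while t + t // 2 < r + 1:
        t += 1
    return t


def ps6_parameters(s, m, r):
    """Length and dimension (n, k) of C_{s,m,r} as stated in the text.

    Existence of a partial t-spread of V_m(2) with at least s elements
    is assumed here and is not verified.
    """
    if r < 2 or r * s < m + 1:
        raise ValueError("need r >= 2 and (m + 1) / r <= s")
    return (r + 1) * s, r * s - m


print(smallest_t(2))            # 2, since 2 + floor(2 / 2) = 3 = r + 1
print(ps6_parameters(4, 4, 2))  # (12, 4)
print(ps6_parameters(5, 4, 2))  # (15, 6)
```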

3.8. BLRCs from Generalized Hamming Code

Suppose that $s$ and $t$ are two positive integers such that $2t \mid s$ and $\frac{s}{2t} \geq 2$. Let $A$ be a $2t \times 2^t$ binary parity check matrix such that any four columns of this matrix are linearly independent. For $t \leq 2$, $A$ can be chosen as the identity matrix. For $t \geq 3$, $A$ is the parity check matrix of a $(2^t, 2^t - 2t, 5)$ binary linear code that can be built from non-primitive cyclic codes with length $2^t + 1$. Let $\beta$ be the primitive root of $x^{2^t+1} - 1$, and let $M(x)$ denote the minimum polynomial of $\beta$. The degree of $M(x)$ is $2t$. Let $A'$ be a parity check matrix defining the binary cyclic code with parameters $(2^t + 1, 2^t - 2t, 6)$ that is generated by $(x - 1)M(x)$. Then, the set $\{\beta^i \mid i = -2, -1, 0, 1, 2\}$ forms a subset of the roots of $(x - 1)M(x)$. By deleting one coordinate of $A'$, we can construct the parity check matrix $A$ of the punctured code with parameters $(2^t, 2^t - 2t, 5)$. In addition, $B$ is defined as a matrix whose columns are all nonzero $\frac{s}{2t}$-tuples from $\mathbb{F}_{2^{2t}}$ with the first nonzero element equal to 1. Then, $B$ is an $\frac{s}{2t} \times \frac{2^s - 1}{2^{2t} - 1}$ parity check matrix of a $2^{2t}$-ary Hamming code. Using the matrices $A$ and $B$, a BLRC construction is provided as follows.
Construction (GH1) [47,48]:
Suppose that $a_1, \ldots, a_{2^t} \in \mathbb{F}_{2^{2t}}$ are the $2^t$ elements corresponding to the columns of $A$, and the $i$th column of $B$ is denoted by a vector $\beta_i$ for $1 \leq i \leq \frac{2^s - 1}{2^{2t} - 1}$. Let $C$ be a binary linear code with the parity check matrix given as
$$H = \begin{pmatrix} L_1 & L_2 & \cdots & L_l \\ H_1 & H_2 & \cdots & H_l \end{pmatrix},$$
where $l = \frac{2^s - 1}{2^{2t} - 1}$ and, for $1 \leq i \leq l$, $L_i$ is an $l \times (2^t + 1)$ matrix whose $i$th row is an all-one vector and whose other rows are all-zero vectors, and $H_i$ is an $s \times (2^t + 1)$ matrix over $\mathbb{F}_2$ whose columns are binary expansions of the vectors $\{0, a_1\beta_i, a_2\beta_i, \ldots, a_{2^t}\beta_i\}$.
It is shown that this construction can satisfy the bound given in Equation (4).
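The sizes that appear in Construction (GH1) follow directly from $s$ and $t$. The sketch below only evaluates those sizes; the function name and the returned dictionary are illustrative assumptions. Since $H$ has $l + s$ rows, the dimension satisfies $k \geq n - l - s$, with equality when $H$ has full row rank, which the sketch does not verify.

```python
def gh1_parameters(s, t):
    """Block sizes of the GH1 parity check matrix for given s and t."""
    if s % (2 * t) != 0 or s // (2 * t) < 2:
        raise ValueError("need 2t | s and s / (2t) >= 2")
    l = (2 ** s - 1) // (2 ** (2 * t) - 1)  # number of columns of B
    r = 2 ** t                              # each L_i covers r + 1 = 2^t + 1 symbols
    n = l * (r + 1)                         # total number of columns of H
    rows = l + s                            # rows of H, so k >= n - rows
    return {"l": l, "r": r, "n": n, "rows": rows}


print(gh1_parameters(4, 1))  # {'l': 5, 'r': 2, 'n': 15, 'rows': 9}
```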
Shortening an LRC also gives another LRC. Let $C$ be an $(n, k, d)$ BLRC with locality $r$ such that $n \geq 2(r+1)$ and $k \geq 2r$. Then, an $(n', k', d')$ BLRC $C'$ with locality $r$ can be obtained by shortening $C$, where the parameters of $C'$ satisfy $n' = n - (r+1)$, $k' \geq k - r$, and $d' \geq d$.
Construction (GH2) [48]:
By applying the shortening $(r+1)$ times to $C$, we have an $(n - (r+1), \geq k - r, \geq d)$ BLRC.
This kind of code modification approach can be extended to the well-known code modification methods such as extending, shortening, expurgating, augmenting, and lengthening [53], as in the following subsection.

3.9. BLRCs from Code Modification

It is well-known that there are various code modification methods for linear codes. For BLRCs, we can also use these modification methods to generate codes with new parameters [53]. Let $C$ be an $(n, k, d)$ binary code with locality $r$ and let $d^\perp$ be the minimum distance of its dual code, $C^\perp$. By adding a parity bit to each codeword in $C$ with parameters $(n, k, d)$, the extended code $C_{ext}$ with parameters $(n+1, k, d_{ext})$ can be obtained. This can be formally presented as
$$C_{ext} = \left\{ (c_1, \ldots, c_n, c_{n+1}) \;\middle|\; (c_1, \ldots, c_n) \in C,\ c_{n+1} = \sum_{i=1}^{n} c_i \right\},$$
where $d_{ext} = d + 1$ for odd $d$ and $d_{ext} = d$ for even $d$ [53]. For BLRCs, we are interested in the locality of the derived codes for a given $C$ with locality $r$. Let $C_{ext}^\perp$ be the dual code of $C_{ext}$. If the maximum Hamming weight among codewords in the dual code $C^\perp$ is $n - r$, then the locality of the extended code $C_{ext}$ is $r_{ext} = r$. If the maximum Hamming weight among codewords in $C^\perp$ is $n + 1 - d^\perp$, then the locality of the extended code $C_{ext}$ is $r_{ext} = d^\perp - 1$. Finally, if $C$ is an $(n, k, d)$ cyclic code with an odd minimum distance $d$, then the locality of the dual code $C_{ext}^\perp$ of the extended code of $C$ is $r_{ext}^\perp = d$ [53].
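The effect of extension on the minimum distance can be illustrated on a small code. The sketch below uses the $(7, 4, 3)$ Hamming code as an example; this choice, the integer encoding of codewords, and the helper names are assumptions made for illustration only.

```python
def hamming_7_4():
    """All 16 codewords of the (7, 4, 3) Hamming code as 7-bit ints.

    Column j of the parity check matrix is the binary expansion of j + 1.
    """
    code = []
    for x in range(2 ** 7):
        syndrome = 0
        for j in range(7):
            if (x >> j) & 1:
                syndrome ^= j + 1
        if syndrome == 0:
            code.append(x)
    return code


def weight(x):
    return bin(x).count("1")


def min_distance(code):
    return min(weight(c) for c in code if c)


def extend(code, n):
    """Append an overall parity bit c_{n+1} = c_1 + ... + c_n (mod 2)."""
    return [c | ((weight(c) & 1) << n) for c in code]


code = hamming_7_4()
ext = extend(code, 7)
print(min_distance(code), min_distance(ext))  # 3 4  (odd d grows by one)
print(min_distance(extend(ext, 8)))           # 4    (even d is unchanged)
```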
Shortening can also be applied to derive new BLRCs. By deleting the codewords in $C$ with a nonzero value in the last coordinate and removing the last coordinate from the remaining codewords, we obtain the shortened code $C_s$ of $C$. This can be formally represented as
$$C_s = \{ (c_1, \ldots, c_{n-1}) \mid (c_1, \ldots, c_{n-1}, 0) \in C \}.$$
For an original $(n, k, d)$ binary linear code, it is known that the parameters of the shortened code are given as $(n-1, k-1, d_s \geq d)$. Moreover, if the original code is a BLRC with locality $r \geq 2$, then the locality of the shortened code $C_s$ is $r$ or $r - 1$. Let $C_s^\perp$ be the dual of $C_s$ and let $d^\perp \geq 3$ be the minimum distance of the dual code $C^\perp$. Then, for an $(n, k, d)$ cyclic code $C$, the locality of the code $C_s$ is either $d^\perp - 2$ or $d^\perp - 1$ [53].
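Shortening can be sketched in the same style. The example below shortens the $(7, 4, 3)$ Hamming code in its last coordinate; the example code, the integer encoding of codewords, and the helper names are illustrative assumptions.

```python
def hamming_7_4():
    """Codewords of the (7, 4, 3) Hamming code as 7-bit ints."""
    code = []
    for x in range(2 ** 7):
        syndrome = 0
        for j in range(7):
            if (x >> j) & 1:
                syndrome ^= j + 1   # column j of H is the binary form of j + 1
        if syndrome == 0:
            code.append(x)
    return code


def min_distance(code):
    return min(bin(c).count("1") for c in code if c)


def shorten(code, n):
    """Keep codewords whose last coordinate (bit n - 1) is 0, then drop it.

    Because that bit is zero, dropping it leaves the integer unchanged.
    """
    last = 1 << (n - 1)
    return [c for c in code if not c & last]


short = shorten(hamming_7_4(), 7)
# 8 codewords of length 6, i.e. a (6, 3) code whose distance is not below 3.
print(len(short), min_distance(short))
```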
Next, expurgation can also be used to generate a new BLRC from an $(n, k, d)$ BLRC $C$ that contains odd weight codewords. The expurgated code $C_{exp}$ of $C$ is the subcode of $C$ obtained by selecting only the even weight codewords, that is,
$$C_{exp} = \{ c \mid c \in C,\ \text{the Hamming weight } w(c) \text{ is even} \}.$$
The corresponding parameters of $C_{exp}$ are given as $(n, k-1, w_e)$, where $w_e$ is the minimum even Hamming weight of the nonzero codewords in $C$. Let $C_{exp}^\perp$ be the dual code of $C_{exp}$. Then, we have $C_{exp}^\perp = C^\perp \cup \overline{C^\perp}$ [53].
As the inverse of the expurgation described above, the augmented code $C_a$ of an $(n, k, d)$ code $C$ without the all-one codeword $\mathbf{1}$ is defined as the code $C \cup \{C + \mathbf{1}\}$, whose parameters are given as $(n, k+1, \min\{d, n - w_{\max}\})$, where $w_{\max}$ is the maximum Hamming weight of codewords in $C$. If the code $C$ is cyclic, then the expurgated and augmented codes of $C$ are also cyclic [53].
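Expurgation and augmentation can be checked to be inverse operations on a small example. The sketch below again uses the $(7, 4, 3)$ Hamming code, which is an illustrative choice; the helper names and the integer encoding of codewords are assumptions.

```python
def hamming_7_4():
    """Codewords of the (7, 4, 3) Hamming code as 7-bit ints."""
    code = []
    for x in range(2 ** 7):
        syndrome = 0
        for j in range(7):
            if (x >> j) & 1:
                syndrome ^= j + 1   # column j of H is the binary form of j + 1
        if syndrome == 0:
            code.append(x)
    return code


def weight(x):
    return bin(x).count("1")


def expurgate(code):
    """Keep only the even weight codewords."""
    return [c for c in code if weight(c) % 2 == 0]


def augment(code, n):
    """Return C together with its coset C + (all-one word)."""
    ones = (1 << n) - 1
    return sorted(set(code) | {c ^ ones for c in code})


code = hamming_7_4()
exp = expurgate(code)                         # (7, 3, 4): only weights 0 and 4 remain
aug = augment(exp, 7)                         # adds the coset exp + 1
print(len(exp), min(weight(c) for c in exp if c))
print(aug == sorted(code))                    # augmentation recovers the Hamming code
```

Here $w_{\max} = 4$ for the expurgated code, so $\min\{d, n - w_{\max}\} = \min\{4, 3\} = 3$, which matches the minimum distance of the recovered Hamming code.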
Another example of BLRC from the code modification methods is presented in [60] using the shortened expurgated Hamming code.
Construction (SE-Hamming) [60]:
Let $\beta$ be a primitive element of $\mathbb{F}_{2^m}$ and let $n \geq 9$ be a positive integer divisible by 3 such that $\frac{2n}{3} \leq 2^m - 1$. Let $C_E$ be a $(2^m - 1, 2^m - m - 2, 4)$ expurgated Hamming code with the generator polynomial $g(x) = (x + 1) g_1(x)$, where $g_1(x)$ is the minimal polynomial of $\beta$ over $\mathbb{F}_2$. Then, a $\left(\frac{2n}{3}, \frac{2n}{3} - m - 1, 4\right)$ shortened expurgated Hamming code $C_S$ can be generated by shortening the first $\left(2^m - \frac{2n}{3} - 1\right)$ information bits of $C_E$. The concatenation of $C_S$ and an $\left(n, \frac{2n}{3}\right)$ cyclic code with parity check polynomial $x^{2n/3} + x^{n/3} + 1$ as an inner code then yields an $\left(n, \frac{2n}{3} - \left\lceil \log_2\left(\frac{2n}{3} + 1\right) \right\rceil - 1, d \geq 6, 2\right)$ LRC $C_C$.
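The parameters of the SE-Hamming construction depend only on $n$. The sketch below evaluates them under the assumption that $m$ is taken as the smallest integer with $\frac{2n}{3} \leq 2^m - 1$, which is what the $\lceil \log_2(\frac{2n}{3} + 1) \rceil$ term in the stated dimension suggests; the function name and returned fields are hypothetical.

```python
def se_hamming_parameters(n):
    """Parameters of the concatenated LRC for a given length n."""
    if n < 9 or n % 3 != 0:
        raise ValueError("n must be at least 9 and divisible by 3")
    outer_len = 2 * n // 3
    # bit_length(x) equals ceil(log2(x + 1)) for positive x, so this is the
    # smallest m with outer_len <= 2^m - 1 (an assumption of this sketch).
    m = outer_len.bit_length()
    shortened_bits = 2 ** m - outer_len - 1   # information bits removed from C_E
    k = outer_len - m - 1                     # dimension of C_S and of the LRC
    return {"n": n, "k": k, "m": m, "shortened_bits": shortened_bits, "r": 2}


print(se_hamming_parameters(9))   # n = 9:  m = 3, k = 2
print(se_hamming_parameters(21))  # n = 21: m = 4, k = 9
```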

3.10. Summary of BLRC Constructions

We summarize the discussed BLRC construction methods in Table 1. Generally, in Table 1, X denotes the case that the equality of the bound is not achieved for all parameters. For the case of the C-M bound, $k_{opt}$ is assumed to satisfy the Singleton bound for given $n$ and $d$.

4. Conclusions

This paper summarizes the recently proposed constructions for BLRCs and their features. To achieve efficient hardware implementation, the codes are constructed over the binary field because the need for multiplications is obviated during the encoding, decoding, and repair processes. We explain the various construction methods of BLRCs using cyclic code based, random vector based, bipartite or expander graph based, anticode based, partial spread based, and generalized Hamming code based approaches. In addition, construction methods of the BLRCs using code modification methods for linear codes such as extending, shortening, expurgating, and augmenting are introduced.
We selectively review important achievements on BLRCs from the authors' perspectives, and thus the authors' biases are inevitably reflected. A result not being reviewed here does not mean that it is unimportant. We also apologize in advance for any missing citations or missing new research results, because this area is actively researched and many papers have been published in a relatively short period of time.

Author Contributions

The authors contributed equally to surveying the literature; conceptualizing the review; and writing, reviewing, editing, and drafting the manuscript.

Funding

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (Ministry of Science and ICT) (No. NRF-2017R1A2B2010588).

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AC: Anticode
BG: Bipartite graph
BLRC: Binary locally repairable code
CC: Cyclic code
C-M: Cadambe–Mazumdar
DSS: Distributed storage system
EG: Expander graph
FRC: Fractional repetition code
GH: Generalized Hamming code
LRC: Locally repairable code
MDS: Maximum distance separable
MR: Maximal recoverability
MR-LRC: Maximal recoverable-LRC
PS: Partial spread
RC: Regeneration code
RV: Random vector

References

  1. Rhea, S.; Wells, C.; Eaton, P.; Geels, D.; Zhao, B.; Weatherspoon, H.; Kubiatowicz, J. Maintenance-free global data storage. IEEE Internet Comput. 2001, 5, 40–49.
  2. Chang, F.; Dean, J.; Ghemawat, S.; Hsieh, W.C.; Wallach, D.A.; Burrows, M.; Chandra, T.; Fikes, A.; Gruber, R.E. Bigtable: A distributed storage system for structured data. ACM Trans. Comput. Syst. (TOCS) 2008, 26.
  3. Ghemawat, S.; Gobioff, H.; Leung, S.-T. The Google file system. In Proceedings of the 19th ACM Symp. Operating Systems Principles, Bolton Landing, NY, USA, 19–22 October 2003; pp. 20–43.
  4. Borthakur, D. The Hadoop Distributed File System: Architecture and Design. 2007. Available online: https://svn.apache.org/repos/asf/hadoop/common/tags/release-0.16.0/docs/hdfs_design.pdf (accessed on 9 April 2019).
  5. Rashmi, K.V.; Shah, N.B.; Gu, D.; Kuang, H.; Borthakur, D.; Ramchandran, K. A solution to the network challenges of data recovery in erasure-coded distributed storage systems: A study on the Facebook warehouse cluster. In Proceedings of the 5th USENIX Workshop on Hot Topics in Storage and File Systems, San Jose, CA, USA, 6–9 October 2013.
  6. Dimakis, A.G.; Prabhakaran, V.; Ramchandran, K. Decentralized erasure codes for distributed networked storage. IEEE Trans. Inf. Theory 2006, 52, 2809–2816.
  7. Wu, Y.; Dimakis, A.G.; Ramchandran, K. Deterministic regenerating codes for distributed storage. In Proceedings of the Annual Allerton Conference on Communication, Control, and Computing, Urbana-Champaign, IL, USA, 18 September 2007; pp. 1–5.
  8. Rashmi, K.; Shah, N.; Kumar, P.V.; Ramchandran, K. Explicit construction of optimal exact regenerating codes for distributed storage. In Proceedings of the 47th Annual Allerton Conference on Communication, Control, and Computing, Urbana-Champaign, IL, USA, 18 September 2009; pp. 1243–1249.
  9. Kim, Y.-S.; Park, H.; No, J.-S. Construction of new fractional repetition codes from relative difference sets with λ = 1. Entropy 2017, 19, 5637.
  10. Park, H.; Kim, Y.-S. Construction of fractional repetition codes with variable parameters for distributed storage systems. Entropy 2016, 18, 441.
  11. Tamo, I.; Barg, A. A family of optimal locally recoverable codes. IEEE Trans. Inf. Theory 2014, 60, 4661–4676.
  12. Rouayheb, S.E.; Ramchandran, K. Fractional repetition codes for repair in distributed storage systems. In Proceedings of the 48th Annual Allerton Conference on Communication, Control, and Computing, Monticello, IL, USA, 29 September–1 October 2010; pp. 1510–1517.
  13. Dimakis, A.G.; Ramchandran, K.; Wu, Y.; Suh, C. A survey on network codes for distributed storage. Proc. IEEE 2011, 99, 476–489.
  14. Datta, A.; Oggier, F.E. An overview of codes tailor-made for better repairability in networked distributed storage systems. SIGACT News 2013, 44, 89–105.
  15. Li, J.; Li, B. Erasure coding for cloud storage systems: A survey. Tsinghua Sci. Technol. 2013, 18, 259–272.
  16. Liu, S.; Oggier, F. An overview of coding for distributed storage systems. In Network Coding and Subspace Designs; Springer: Berlin, Germany, 2018; pp. 363–383.
  17. Balaji, S.B.; Krishnan, M.N.; Vajha, M.; Ramkumar, V.; Sasidharan, B.; Kumar, P.V. Erasure coding for distributed storage: An overview. Sci. China Inf. Sci. 2018, 61, 100301.
  18. Rashmi, K.V.; Shah, N.B.; Ramchandran, K.; Kumar, P.V. Regenerating codes for errors and erasures in distributed storage. In Proceedings of the 2012 IEEE International Symposium on Information Theory Proceedings, Cambridge, MA, USA, 1–6 July 2012; pp. 1202–1206.
  19. Dimakis, A.G.; Godfrey, P.B.; Wu, Y.; Wainwright, M.J.; Ramchandran, K. Network coding for distributed storage systems. IEEE Trans. Inf. Theory 2010, 56, 4539–4551.
  20. Huang, C.; Chen, M.; Li, J. Pyramid codes: Flexible schemes to trade space for access efficiency in reliable data storage systems. In Proceedings of the IEEE International Symposium on Network Computing and Applications (NCA 2007), Cambridge, MA, USA, 12–14 July 2007; pp. 79–86.
  21. Huang, C.; Chen, M.; Li, J. Pyramid codes: Flexible schemes to trade space for access efficiency in reliable data storage systems. ACM Trans. Storage (TOS) 2013, 9, 3:1–3:28.
  22. Oggier, F.; Datta, A. Self-repairing homomorphic codes for distributed storage systems. In Proceedings of the IEEE INFOCOM, Shanghai, China, 10–15 April 2011; pp. 1215–1223.
  23. Gopalan, P.; Huang, C.; Simitci, H.; Yekhanin, S. On the locality of codeword symbols. IEEE Trans. Inf. Theory 2012, 58, 6925–6934.
  24. Prakash, N.; Kamath, G.M.; Lalitha, V.; Kumar, P.V. Optimal linear codes with a local-error-correction property. In Proceedings of the IEEE International Symposium on Information Theory Proceedings (ISIT 2012), Cambridge, MA, USA, 1–6 July 2012; pp. 2776–2780.
  25. Forbes, M.; Yekhanin, S. On the locality of codeword symbols in non-linear codes. Discret. Math. 2014, 324, 78–84.
  26. Tamo, I.; Papailiopoulos, D.S.; Dimakis, A.G. Optimal locally repairable codes and connections to matroid theory. IEEE Trans. Inf. Theory 2016, 62, 6661–6671.
  27. Calder, B.; Wang, J.; Ogus, A.; Nilakantan, N.; Skjolsvold, A.; McKelvie, S.; Xu, Y.; Srivastav, S.; Wu, J.; Simitci, H.; et al. Windows Azure storage: A highly available cloud storage service with strong consistency. In Proceedings of the 23rd ACM Symposium Operating Systems Principles (SOSP'11), Cascais, Portugal, 23–26 October 2011; pp. 143–157.
  28. Mehrabi, M.; Ardakani, M.; Khabbazian, M. Minimizing the update complexity of Facebook HDFS-RAID locally repairable code. In Proceedings of the 2017 IEEE 86th Vehicular Technology Conference (VTC-Fall), Toronto, ON, Canada, 24–27 September 2017; pp. 1–5.
  29. Papailiopoulos, D.; Dimakis, A.G. Distributed storage codes through Hadamard designs. In Proceedings of the IEEE International Symposium on Information Theory Proceedings (ISIT 2011), St. Petersburg, Russia, 31 July–5 August 2011; pp. 1230–1234.
  30. Silberstein, N.; Rawat, A.S.; Koyluoglu, O.O.; Vishwanath, S. Optimal locally repairable codes via rank-metric codes. In Proceedings of the IEEE International Symposium on Information Theory (ISIT), Istanbul, Turkey, 7–12 July 2013; pp. 1819–1823.
  31. Papailiopoulos, D.S.; Dimakis, A.G. Locally repairable codes. IEEE Trans. Inf. Theory 2014, 60, 5843–5855.
  32. Kamath, G.M.; Prakash, N.; Lalitha, V.; Kumar, P.V. Codes with local regeneration and erasure correction. IEEE Trans. Inf. Theory 2014, 60, 4637–4660.
  33. Cadambe, V.R.; Mazumdar, A. Bounds on the size of locally recoverable codes. IEEE Trans. Inf. Theory 2015, 61, 5787–5794.
  34. Chen, M.; Huang, C.; Li, J. On the maximally recoverable property for multi-protection group codes. In Proceedings of the IEEE International Symposium on Information Theory, Nice, France, 24–29 June 2007; pp. 486–490.
  35. Blaum, M.; Hafner, J.L.; Hetzler, S. Partial-MDS codes and their application to RAID type of architectures. IEEE Trans. Inf. Theory 2013, 59, 4510–4519.
  36. Gopalan, P.; Huang, C.; Jenkins, B.; Yekhanin, S. Explicit maximally recoverable codes with locality. IEEE Trans. Inf. Theory 2014, 60, 5245–5256.
  37. Martinez-Penas, U.; Kschischang, F.R. Universal and dynamic locally repairable codes with maximal recoverability via sum-rank codes. In Proceedings of the 2018 56th Annual Allerton Conference on Communication, Control, and Computing, Monticello, IL, USA, 2–5 October 2018; pp. 792–799.
  38. Shah, N.B.; Lee, K.; Ramchandran, K. When do redundant requests reduce latency? In Proceedings of the 51st Annual Allerton Conference on Communication, Control, and Computing, Monticello, IL, USA, 2–4 October 2013; pp. 731–738.
  39. Joshi, G.; Liu, Y.; Soljanin, E. On the delay-storage trade-off in content download from coded distributed storage systems. IEEE J. Sel. Areas Commun. 2014, 32, 989–997.
  40. Liang, G.; Kozat, U. Tofec: Achieving optimal throughput-delay trade-off of cloud storage using erasure codes. In Proceedings of the IEEE Conference Computer Communication (IEEE INFOCOM), Toronto, ON, Canada, 27 April–2 May 2014; pp. 826–834.
  41. Rawat, A.S.; Papailiopoulos, D.S.; Dimakis, A.G.; Vishwanath, S. Locality and availability in distributed storage. IEEE Trans. Inf. Theory 2016, 62, 4481–4493.
  42. Lee, K.-S.; Park, H.; No, J.-S. New binary locally repairable codes with locality 2 and uneven availabilities for hot data. Entropy 2018, 20, 636.
  43. Shahabinejad, M.; Khabbazian, M.; Ardakani, M. An efficient binary locally repairable codes for Hadoop distributed file system. IEEE Commun. Lett. 2014, 18, 1287–1290.
  44. Shahabinejad, M.; Khabbazian, M.; Ardakani, M. A class of binary locally repairable codes. IEEE Trans. Commun. 2016, 64, 3182–3193.
  45. Shahabinejad, M.; Khabbazian, M.; Ardakani, M. On the average locality of locally repairable codes. IEEE Trans. Commun. 2018, 66, 2773–2783.
  46. Hu, S.; Tamo, I.; Barg, A. Combinatorial and LP bounds for LRC codes. In Proceedings of the IEEE International Symposium on Information Theory (ISIT 2016), Barcelona, Spain, 10–15 July 2016; pp. 1008–1012.
  47. Wang, A.; Zhang, Z.; Lin, D. Bounds and constructions for linear locally repairable codes over binary fields. In Proceedings of the IEEE International Symposium on Information Theory (ISIT), Aachen, Germany, 25–30 June 2017; pp. 2033–2037.
  48. Wang, A.; Zhang, Z.; Lin, D. Bounds for binary linear locally repairable codes via a sphere-packing approach. IEEE Trans. Inf. Theory 2019.
  49. Agarwal, A.; Barg, A.; Hu, S.; Mazumdar, A.; Tamo, I. Combinatorial alphabet-dependent bounds for locally recoverable codes. IEEE Trans. Inf. Theory 2018, 64, 3481–3492.
  50. Ma, J.; Ge, G. Optimal binary linear locally repairable codes with disjoint repair groups. arXiv 2017, arXiv:1711.07138v1.
  51. Goparaju, S.; Calderbank, R. Binary cyclic codes that are locally repairable. In Proceedings of the IEEE International Symposium on Information Theory (ISIT), Honolulu, HI, USA, 29 June–4 July 2014; pp. 676–680.
  52. Zeh, A.; Yaakobi, E. Optimal linear and cyclic locally repairable codes over small fields. In Proceedings of the IEEE Information Theory Workshop (ITW), Jerusalem, Israel, 26 April–1 May 2015; pp. 1–5.
  53. Huang, P.; Yaakobi, E.; Uchikawa, H.; Siegel, P.H. Binary linear locally repairable codes. IEEE Trans. Inf. Theory 2016, 62, 6268–6283.
  54. Tamo, I.; Barg, A.; Goparaju, S.; Calderbank, R. Cyclic LRC codes, binary LRC codes, and upper bounds on the distance of cyclic codes. Int. J. Inf. Coding Theory 2016, 3, 345–364.
  55. Tamo, I.; Barg, A.; Frolov, A. Bounds on the parameters of locally recoverable codes. IEEE Trans. Inf. Theory 2016, 62, 3070–3083.
  56. Kruglik, S.; Nazirkhanova, K.; Frolov, A. New bounds and generalizations of locally recoverable codes with availability. IEEE Trans. Inf. Theory 2019.
  57. Silberstein, N.; Zeh, A. Optimal binary locally repairable codes via anticodes. In Proceedings of the IEEE International Symposium on Information Theory (ISIT), Hong Kong, China, 14–19 June 2015; pp. 1247–1251.
  58. Nam, M.Y.; Song, H.Y. Binary locally repairable codes with minimum distance at least 6 based on partial t-spreads. IEEE Commun. Lett. 2017, 21, 1683–1686.
  59. Kim, C.; No, J.-S. New constructions of binary LRCs with disjoint repair groups and locality 3 using existing LRCs. IEEE Commun. Lett. 2019, 23, 406–409.
  60. Kim, C.; No, J.-S. New constructions of binary and ternary locally repairable codes using cyclic codes. IEEE Commun. Lett. 2018, 22, 228–231.
  61. Farrell, P. Linear binary anticodes. Electron. Lett. 1970, 6, 419–421.
  62. MacWilliams, F.J.; Sloane, N.J.A. The Theory of Error-Correcting Codes; North Holland: Amsterdam, The Netherlands, 1988; p. 547.
Figure 1. Classification of binary locally repairable codes.
Table 1. Summary of parameters of various BLRC constructions.
Codes n k d r t S C-M
CC1 2 m 1 r n / ( r + 1 ) 2r1OX
CC2 2 m 1 2 3 ( 2 m 1 ) m 6 21XX
CC3 2 m 1 2 3 ( 2 m 1 ) 2 m 1021XX
CC4 2 m 1 3 n / 7 423XX
CC5 2 m + 1 2 3 ( 2 m + 1 ) 2 m 10 21XX
CC6 2 m 1 3 n / 7 m 12 23XX
CC7 2 a 1 a 2 a 1 ( 2 m 1 ) m 2 a 1 + 2 a 2 2 a 1 1 XX
CC8nk 2 z 1 2 z 1 1 1XX
CC9n n deg ( g ( x ) ) 4 u 1 1XX
RV k 2 k k 2 k 2 2 2 *OX
BGn r n r + 1 log 2 ( r + 1 ) 4r1OX
EGnkdrtOX
AC1 2 m s 2 1 m 2 m 1 s 2 4 21XO
AC2 2 m 2 t + t + 1 m 2 m 1 2 t 1 + 2 21XX
AC3 2 m 1 1 m 2 m 2 1 31XO
PS1n r n r + 1 t log 2 n 2 t + 2 r1OX
PS2nkd 2 t tOO
PS3nkd 2 t + 2 t + 1 2 1 tOO
PS4 3 l 2 l 3 m or 2 l 2 m 1 62tOO
PS5 4 l 3 l s 1 6 31OX
PS6 ( r + 1 ) s r s m 6 rtOX
GH1 2 t + 1 2 t l s 1 drtOX
GH2 n r 1 k r d rtOX
SEn 2 n 3 log 2 ( 2 n 3 + 1 ) 1 6 21OX
* This scheme has an uneven availability represented as an availability profile.

Share and Cite

MDPI and ACS Style

Kim, Y.-S.; Kim, C.; No, J.-S. Overview of Binary Locally Repairable Codes for Distributed Storage Systems. Electronics 2019, 8, 596. https://doi.org/10.3390/electronics8060596
