We show how to implement the CREW PRAM algorithm of the previous section in MapReduce, and discuss the complexity issues. First, we present the MapReduce model of computation.
4.1. The MapReduce Model of Computation
The MapReduce model allows global computation on a distributed system in its theoretical formulation. Therefore, bounding the number of computational steps is a requirement for the design of a practical algorithm.
The MapReduce programming paradigm is a sequence $P = \mu_1 \rho_1 \cdots \mu_R \rho_R$, where $\mu_i$ is a mapper and $\rho_i$ is a reducer for $1 \le i \le R$. First, we describe this paradigm and then discuss how to implement it on a distributed system. Since the input/output phases are inherent to any parallel algorithm and have standard solutions, the sequence P does not include the I/O phases, and the input to $\mu_1$ is a multiset $U_0$ where each element is a (key, value) pair. The input to each mapper $\mu_i$ is a multiset $U_{i-1}$ output by the reducer $\rho_{i-1}$, for $1 < i \le R$. Mapper $\mu_i$ is run on each pair $(k, v)$ in $U_{i-1}$, mapping $(k, v)$ to a set of new (key, value) pairs. The input to reducer $\rho_i$ is $U'_i$, the union of the sets output by $\mu_i$. For each key k, $\rho_i$ reduces the subset of pairs of $U'_i$ with the key component equal to k to a new set of pairs with key component still equal to k. $U_i$ is the union of these new sets.
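The round structure just described can be sketched in a few lines of Python. This is only an illustrative single-machine simulation, not a distributed implementation: the names `run_round` and `run_program`, and the toy mapper and reducer, are hypothetical. The reducer is written so that it returns only values, which enforces the rule that a reducer cannot change the key.

```python
from collections import defaultdict


def run_round(pairs, mapper, reducer):
    """Simulate one mapper/reducer round on a list of (key, value) pairs."""
    # Map phase: the mapper sees one pair at a time and may emit pairs
    # with new keys.
    mapped = []
    for key, value in pairs:
        mapped.extend(mapper(key, value))

    # Group the mapped pairs by key, so that each key is reduced on its own.
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)

    # Reduce phase: the reducer returns new values only, so every output
    # pair keeps the key of the group it came from.
    output = []
    for key, values in groups.items():
        for new_value in reducer(key, values):
            output.append((key, new_value))
    return output


def run_program(pairs, rounds):
    """Run a sequence of (mapper, reducer) rounds, feeding each output forward."""
    for mapper, reducer in rounds:
        pairs = run_round(pairs, mapper, reducer)
    return pairs


# Toy example (hypothetical): count how often each symbol occurs.
def emit_symbol(key, value):
    return [(value, 1)]


def sum_counts(key, values):
    return [sum(values)]


result = run_program([(0, "a"), (1, "b"), (2, "a")], [(emit_symbol, sum_counts)])
print(sorted(result))  # [('a', 2), ('b', 1)]
```

The number of entries in `rounds` plays the role of the parameter R discussed below.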
In a distributed system implementation, a key is associated with a processor. All the pairs with a given key are processed by the same node, but more keys can be associated to it in order to lower the scale of the system involved. Mappers are in charge of the data distribution, since they can generate new key values. On the other hand, reducers just process the data stored in the distributed memory, since they output for a set of pairs with a given key another set of pairs with the same given key.
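As a minimal sketch of the key-to-node association, the snippet below assigns every key to one of a fixed number of nodes by hashing. The hashing scheme, the function names `node_of` and `partition`, and the choice of three nodes are assumptions made for illustration; the text only requires that all pairs with a given key end up on the same node and that one node may serve several keys.

```python
import zlib
from collections import defaultdict


def node_of(key, num_nodes):
    """Deterministically pick a node for a key (hypothetical hashing scheme)."""
    return zlib.crc32(repr(key).encode("utf-8")) % num_nodes


def partition(pairs, num_nodes):
    """Distribute (key, value) pairs so that equal keys share a node."""
    nodes = defaultdict(list)
    for key, value in pairs:
        nodes[node_of(key, num_nodes)].append((key, value))
    return dict(nodes)


# 16 matrix-entry keys plus one repeated key, spread over 3 nodes.
pairs = [((i, j), i * j) for i in range(4) for j in range(4)]
pairs.append(((1, 1), 99))
placement = partition(pairs, num_nodes=3)

for node, stored in sorted(placement.items()):
    print(node, len(stored))
```

With fewer nodes than keys, each node processes several keys, which is the way the scale of the system is lowered.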
The following complexity requirements are stated as necessary for a practical interest in [11]:
R is polylogarithmic in the input size n;
the number of processors (or nodes in the Web) involved is $O(n^{1-\epsilon})$ with $0 < \epsilon < 1$;
the amount of memory for each node is $O(n^{1-\epsilon})$;
mappers and reducers take polynomial time in n.
In [11], it is also shown that a $t(n)$-time CREW PRAM algorithm using subquadratic work space and a subquadratic number of processors can be implemented by MapReduce with a simulation satisfying the above requirements if $t(n)$ is polylogarithmic. Indeed, the parameter R of the simulation is $O(t(n))$, while the subquadratic work space is partitioned among a sublinear number of processors taking polynomial computational time.
Such requirements are necessary but not sufficient to guarantee a speed-up of the computation. Obviously, the total running time of mappers and reducers cannot be higher than the sequential one, and this is trivially implicit in what is stated in [11]. The non-trivial bottleneck is the communication cost of the computational phase. This needs to be checked experimentally, since R can be polylogarithmic in the input size. Generally speaking, a MapReduce implementation has a practical interest if R is about ten units or less. If this is obtained from the simulation of a CREW PRAM algorithm, it might be preferable to the simulation of an EREW PRAM algorithm with a higher number of iterations.
4.2. Decoding LZC-Compressed Files in MapReduce
The MapReduce implementation of the decoder decompressing the sequence of pointers of the previous section is , with . The number of iterations is since, generally speaking, the simulation of a CREW PRAM algorithmic step is realized by two mappers and two reducers, where the reducers compute the memory requests and the corresponding information that must be provided to the processors while the mappers route the memory requests and the information to the reducers responsible for the particular processor [11]. In this particular case, the keys correspond to the matrix entries, and the reducers compute the memory requests by looking at the values associated with the keys corresponding to the matrix entries, storing the last non-null components on the columns. As far as the other reducers are concerned, the information that must be provided to the processors is already computed, since the procedure just consists of copying values from columns to columns. Therefore, such reducers just identify the processors (or the keys) for the mappers routing the information.
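The general scheme of simulating one CREW PRAM read step with two mappers and two reducers can be sketched as follows. This is a generic toy illustration, not the decoder of this section: the key layout (`("proc", p)` for processors, `("mem", a)` for memory cells), the value tags, and all function names are hypothetical. Two processors read the same cell, which is the concurrent read that the CREW model allows.

```python
from collections import defaultdict


def run_round(pairs, mapper, reducer):
    """One simulated round; the reducer returns values and keeps the key."""
    groups = defaultdict(list)
    for key, value in pairs:
        for new_key, new_value in mapper(key, value):
            groups[new_key].append(new_value)
    return [(k, out) for k, values in groups.items() for out in reducer(k, values)]


def route_requests(key, value):
    # First mapper: send each processor's read request to the cell it wants.
    if key[0] == "proc":
        _, address = value
        return [(("mem", address), ("req", key[1]))]
    return [(key, value)]  # memory cells stay where they are


def answer_requests(key, values):
    # First reducer: the cell attaches its content to every request it received.
    cell = next(v[1] for v in values if v[0] == "cell")
    replies = [("reply", v[1], cell) for v in values if v[0] == "req"]
    return [("cell", cell)] + replies


def route_replies(key, value):
    # Second mapper: send each reply back to the requesting processor.
    if value[0] == "reply":
        _, processor, content = value
        return [(("proc", processor), ("got", content))]
    return [(key, value)]


def deliver(key, values):
    # Second reducer: nothing left to compute, pass the values through.
    return list(values)


state = [
    (("mem", 0), ("cell", "a")),
    (("mem", 1), ("cell", "b")),
    (("proc", 0), ("want", 0)),
    (("proc", 1), ("want", 0)),  # concurrent read of cell 0
    (("proc", 2), ("want", 1)),
]
state = run_round(state, route_requests, answer_requests)
state = run_round(state, route_replies, deliver)
received = {key[1]: value[1] for key, value in state if key[0] == "proc"}
print(received)
```

In this sketch the reducers never change a key; all data movement between processors and memory cells is done by the mappers, in line with the division of labour described above.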
The input to is a multiset of cardinality m, where each element is a pair with and for . The output of is , where each element in is a pair with and such that and . Then, reducer outputs the set , where is obtained from by reducing each element to the element . In other words, (as every other mapper of the sequence with an even index) routes a memory request for every processor. Since is the first mapper, it also does the job of computing the memory request (that is, subtracting the alphabet cardinality from the pointer value). Afterwards, this job is done by each reducer with an odd index for the next mapper. Reducer (as every other reducer with an even index) computes the keys that the next mapper will use to route the information.
The keys computed by are used by mapper . The output of is , where each element in is a pair with and such that and . Then, reducer outputs the set , where is the set of elements with key and value such that and . So, reducer does the job that did by itself. Therefore, mapper operates in a slightly different way from as every other mapper with an even index.
Mapper outputs , where each element in is a pair with or and such that and , that is, . Then, reducer outputs the set , where is obtained from by reducing each element or in to the element or . To complete the first two CREW PRAM algorithmic steps, we describe mapper and reducer .
Mapper outputs . Each element in is a pair with or and such that or and . Then, reducer outputs the set , where is the set of elements with key and value such that and , similarly to . Now, we can provide the MapReduce implementation of the generic step.
At the k-th step, for , if k is even mapper outputs . Each element in is a pair with key , for and , and value such that and ; that is, . Then, reducer outputs the set , where is obtained from by reducing each element to the element .
If k is odd, mapper outputs . Each element in is a pair with and such that and for some χ with and . Then, if , reducer outputs the set , where is the set of elements with key and value such that and .
Reducer outputs the set , where each is reduced to . Mapper outputs . Each element in is either equal to , where and is the length of the factor encoded by or to a pair with key equal to and value such that . Then, reducer outputs , where each element is obtained from the two elements and in .
Finally, outputs by mapping the element to the element , where a is the alphabet character target of q. Then, reducer outputs by reducing the set of elements {} to the element , where is the target of obtained by concatenating each alphabet character a for .
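The last round can be illustrated with a small sketch under simplifying assumptions. Here each pair is assumed to have the pointer index i as its key and a value `(position, q)`, where q is already smaller than the alphabet cardinality and therefore denotes a single character. This key and value layout, the two-letter alphabet, and the function names are hypothetical and chosen only to show the map-then-concatenate pattern.

```python
from collections import defaultdict

ALPHABET = "ab"  # hypothetical alphabet; q indexes into it


def run_round(pairs, mapper, reducer):
    """One simulated round; the reducer returns values and keeps the key."""
    groups = defaultdict(list)
    for key, value in pairs:
        for new_key, new_value in mapper(key, value):
            groups[new_key].append(new_value)
    return [(k, out) for k, values in groups.items() for out in reducer(k, values)]


def to_character(key, value):
    # Mapper: replace the pointer value q by the alphabet character it denotes.
    position, q = value
    return [(key, (position, ALPHABET[q]))]


def concatenate(key, values):
    # Reducer: order the characters by position and join them into one factor.
    return ["".join(ch for _, ch in sorted(values))]


# Toy data: three pointers whose targets have lengths 1, 2 and 3.
pairs = [
    (0, (0, 0)),
    (1, (1, 1)), (1, (0, 0)),
    (2, (0, 0)), (2, (2, 0)), (2, (1, 1)),
]
factors = dict(run_round(pairs, to_character, concatenate))
print(factors)
```

Concatenating the factors in key order then yields the decoded string, a step that belongs to the output phase rather than to the rounds themselves.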