TY - JOUR
T1 - Parallel Algorithms for the Classes of ±2b DESCEND and ASCEND Computations on a SIMD Hypercube
AU - Nassimi, David
N1 - Funding Information:
Manuscript received July 9, 1991; revised February 6, 1992 and August 18, 1992. This work was supported in part by the New Jersey Institute of Technology under Grant SBR-421980. The author is with the Department of Computer and Information Science, New Jersey Institute of Technology, Newark, NJ 07102. IEEE Log Number 9213613.
PY - 1993/12
Y1 - 1993/12
N2 - There is a wide class of parallel algorithms which require communications in the form of a 2b permutation and — 2b permutation. (A 2b permutation of X elements, where N = 2n, moves the element from position i to position i + 2b mod N. The inverse is called a -2b permutation.) We first examine the complexity of performing a 2b permutation of X elements on an X -PE SIMD hypercube. (The hypercube model assumes that all data movements are restricted to a single dimension at a time.) We prove that, given any mapping of the elements to processors, log N - b steps are needed to perform a 2bpermutation. This lower bound suggests that if a parallel algorithm requires f(N) communication steps of type ±2bit may require Q(f(N) log N) steps on a SIMD hypercube. We identify an important class of parallel computations, called ±2b descend, which includes Batcher’s odd-even merge and many other algorithms. A computation in this class performs log X iterations on an N -element input A[0: N - 1]. For b =log N - 1. • • •. 0. iteration b computes the new value of each A[i] as a function of A[i]. A[i + 2b] and A[i — 2b]. We obtain an efficient O(log N) algorithm for this class on a SIMD hypercube. We discuss several problems which are solved elegantly using this general algorithm. We also study a related class of parallel computations called ±2b -ascend. A computation in this class has a structure similar to ±2b-descend, except that the iterations run in the increasing order of b. We develop a simple O(log2N/log log N) algorithm for this class on a SIMD hypercube, assuming Ω(log log N) space per PE. It remains open whether this class can be implemented on a SIMD hypercube in O(log N) time. Index Terms— Cyclic shift, odd-even merge, parallel prefix, ±2b -ascend, ±2b -descend, 2b permutation, PM2I interconnection, SIMD hypercube.
AB - There is a wide class of parallel algorithms which require communications in the form of a 2b permutation and — 2b permutation. (A 2b permutation of X elements, where N = 2n, moves the element from position i to position i + 2b mod N. The inverse is called a -2b permutation.) We first examine the complexity of performing a 2b permutation of X elements on an X -PE SIMD hypercube. (The hypercube model assumes that all data movements are restricted to a single dimension at a time.) We prove that, given any mapping of the elements to processors, log N - b steps are needed to perform a 2bpermutation. This lower bound suggests that if a parallel algorithm requires f(N) communication steps of type ±2bit may require Q(f(N) log N) steps on a SIMD hypercube. We identify an important class of parallel computations, called ±2b descend, which includes Batcher’s odd-even merge and many other algorithms. A computation in this class performs log X iterations on an N -element input A[0: N - 1]. For b =log N - 1. • • •. 0. iteration b computes the new value of each A[i] as a function of A[i]. A[i + 2b] and A[i — 2b]. We obtain an efficient O(log N) algorithm for this class on a SIMD hypercube. We discuss several problems which are solved elegantly using this general algorithm. We also study a related class of parallel computations called ±2b -ascend. A computation in this class has a structure similar to ±2b-descend, except that the iterations run in the increasing order of b. We develop a simple O(log2N/log log N) algorithm for this class on a SIMD hypercube, assuming Ω(log log N) space per PE. It remains open whether this class can be implemented on a SIMD hypercube in O(log N) time. Index Terms— Cyclic shift, odd-even merge, parallel prefix, ±2b -ascend, ±2b -descend, 2b permutation, PM2I interconnection, SIMD hypercube.
UR - http://www.scopus.com/inward/record.url?scp=0027885676&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0027885676&partnerID=8YFLogxK
U2 - 10.1109/71.250118
DO - 10.1109/71.250118
M3 - Article
AN - SCOPUS:0027885676
SN - 1045-9219
VL - 4
SP - 1372
EP - 1381
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
IS - 12
ER -