TY - GEN
T1 - Using PRAM algorithms on a uniform-memory-access shared-memory architecture
AU - Bader, David A.
AU - Illendula, Ajith K.
AU - Moret, Bernard M.E.
AU - Weisse-Bernstein, Nina R.
PY - 2001
Y1 - 2001
N2 - The ability to provide uniform shared-memory access to a significant number of processors in a single SMP node brings us much closer to the ideal PRAM parallel computer. In this paper, we develop new techniques for designing a uniform shared-memory algorithm from a PRAM algorithm and present the results of an extensive experimental study demonstrating that the resulting programs scale nearly linearly across a significant range of processors (from 1 to 64) and across the entire range of instance sizes tested. This linear speedup with the number of processors is, to our knowledge, the first ever attained in practice for intricate combinatorial problems. The example we present in detail here is a graph decomposition algorithm that also requires the computation of a spanning tree; this problem is not only of interest in its own right, but is representative of a large class of irregular combinatorial problems that have simple and efficient sequential implementations and fast PRAM algorithms, but have no known efficient parallel implementations. Our results thus offer promise for bridging the gap between the theory and practice of shared-memory parallel algorithms.
AB - The ability to provide uniform shared-memory access to a significant number of processors in a single SMP node brings us much closer to the ideal PRAM parallel computer. In this paper, we develop new techniques for designing a uniform shared-memory algorithm from a PRAM algorithm and present the results of an extensive experimental study demonstrating that the resulting programs scale nearly linearly across a significant range of processors (from 1 to 64) and across the entire range of instance sizes tested. This linear speedup with the number of processors is, to our knowledge, the first ever attained in practice for intricate combinatorial problems. The example we present in detail here is a graph decomposition algorithm that also requires the computation of a spanning tree; this problem is not only of interest in its own right, but is representative of a large class of irregular combinatorial problems that have simple and efficient sequential implementations and fast PRAM algorithms, but have no known efficient parallel implementations. Our results thus offer promise for bridging the gap between the theory and practice of shared-memory parallel algorithms.
UR - http://www.scopus.com/inward/record.url?scp=0347264011&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0347264011&partnerID=8YFLogxK
U2 - 10.1007/3-540-44688-5_11
DO - 10.1007/3-540-44688-5_11
M3 - Conference contribution
AN - SCOPUS:0347264011
SN - 3540425004
SN - 9783540425007
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 129
EP - 144
BT - Algorithm Engineering - 5th International Workshop, WAE 2001, Proceedings
T2 - 5th International Workshop on Algorithm Engineering, WAE 2001
Y2 - 28 August 2010 through 31 August 2010
ER -