TY - GEN
T1 - Size oblivious programming with infinimem
AU - Koduru, Sai Charan
AU - Gupta, Rajiv
AU - Neamtiu, Iulian
N1 - Publisher Copyright: © Springer International Publishing Switzerland 2016.
PY - 2016
Y1 - 2016
N2 - Many recently proposed BigData processing frameworks make programming easier, but typically expect the datasets to fit in the memory of either a single multicore machine or a cluster of multicore machines. When this assumption does not hold, these frameworks fail. We introduce the InfiniMem framework that enables size oblivious processing of large collections of objects that do not fit in memory by making them disk-resident. InfiniMem is easy to program with: the user just indicates the large collections of objects that are to be made disk-resident, while InfiniMem transparently handles their I/O management. The InfiniMem library can manage a very large number of objects in a uniform manner, even though the objects have different characteristics and relationships which, when processed, give rise to a wide range of access patterns requiring different organizations of data on the disk. We demonstrate the ease of programming and versatility of InfiniMem with 3 different probabilistic analytics algorithms, 3 different graph processing size oblivious frameworks; they require minimal effort, 6–9 additional lines of code. We show that InfiniMem can successfully generate a mesh with 7.5 million nodes and 300 million edges (4.5GB on disk) in 40 min and it performs the PageRank computation on a 14GB graph with 134 million vertices and 805 million edges at 14 min per iteration on an 8-core machine with 8GB RAM. Many graph generators and processing frameworks cannot handle such large graphs. We also exploit InfiniMem on a cluster to scale-up an object-based DSM.
AB - Many recently proposed BigData processing frameworks make programming easier, but typically expect the datasets to fit in the memory of either a single multicore machine or a cluster of multicore machines. When this assumption does not hold, these frameworks fail. We introduce the InfiniMem framework that enables size oblivious processing of large collections of objects that do not fit in memory by making them disk-resident. InfiniMem is easy to program with: the user just indicates the large collections of objects that are to be made disk-resident, while InfiniMem transparently handles their I/O management. The InfiniMem library can manage a very large number of objects in a uniform manner, even though the objects have different characteristics and relationships which, when processed, give rise to a wide range of access patterns requiring different organizations of data on the disk. We demonstrate the ease of programming and versatility of InfiniMem with 3 different probabilistic analytics algorithms, 3 different graph processing size oblivious frameworks; they require minimal effort, 6–9 additional lines of code. We show that InfiniMem can successfully generate a mesh with 7.5 million nodes and 300 million edges (4.5GB on disk) in 40 min and it performs the PageRank computation on a 14GB graph with 134 million vertices and 805 million edges at 14 min per iteration on an 8-core machine with 8GB RAM. Many graph generators and processing frameworks cannot handle such large graphs. We also exploit InfiniMem on a cluster to scale-up an object-based DSM.
UR - http://www.scopus.com/inward/record.url?scp=84961157501&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84961157501&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-29778-1_1
DO - 10.1007/978-3-319-29778-1_1
M3 - Conference contribution
AN - SCOPUS:84961157501
SN - 9783319297774
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 3
EP - 19
BT - Languages and Compilers for Parallel Computing - 28th International Workshop, LCPC 2015, Revised Selected Papers
A2 - Shen, Xipeng
A2 - Mueller, Frank
A2 - Tuck, James
PB - Springer Verlag
T2 - 28th International Workshop on Languages and Compilers for Parallel Computing, LCPC 2015
Y2 - 9 September 2015 through 11 September 2015
ER -