We consider the problem of computing data-cube marginals in a single round of MapReduce, focusing on the relationship between the reducer size and the replication rate. We first simplify the problem by assuming that every dimension has the same extent. For this case, several recursive constructions meet or come close to the minimum possible replication rate for a given reducer size. These ideas extend in two directions: we relax the assumption that all extents are equal, and we consider how to compute marginals from lower-order marginals rather than from the raw data cube.
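To make the two quantities concrete, the following is a minimal sketch of a naive one-round scheme, not one of the recursive constructions mentioned above. It assumes a cube with `n` dimensions of equal extent `d`, takes a first-order marginal to be the sum over exactly one dimension, and uses the usual definitions: replication rate as the average number of reducers each input is sent to, and reducer size as the largest number of inputs any one reducer receives. The function name `first_order_marginals` and the `"*"` placeholder for the aggregated dimension are illustrative choices, not taken from the source.

```python
from collections import defaultdict
from itertools import product


def first_order_marginals(cube, n):
    """Simulate one MapReduce round that sums over each single dimension.

    cube maps n-tuples of coordinates to numeric values (assumed layout).
    Returns (marginals, replication_rate, reducer_size).
    """
    groups = defaultdict(list)

    # Map phase: each cell goes to n reducers, one per aggregated dimension.
    # The reducer key is the cell's coordinates with that dimension set to "*".
    for coords, value in cube.items():
        for dim in range(n):
            key = coords[:dim] + ("*",) + coords[dim + 1:]
            groups[key].append(value)

    # Reduce phase: each reducer sums the values it received.
    marginals = {key: sum(values) for key, values in groups.items()}

    # Replication rate: key-value pairs emitted per input cell.
    replication_rate = sum(len(v) for v in groups.values()) / len(cube)
    # Reducer size: the most inputs received by any single reducer.
    reducer_size = max(len(v) for v in groups.values())
    return marginals, replication_rate, reducer_size


# Example: a 3-dimensional cube with extent 4 in every dimension, all values 1.
n, d = 3, 4
cube = {coords: 1 for coords in product(range(d), repeat=n)}
marginals, r, q = first_order_marginals(cube, n)
```

In this naive scheme every reducer receives only `d` inputs, but each input is replicated `n` times. Allowing larger reducers that each cover several marginals is what lets the replication rate drop, which is the tradeoff the abstract refers to.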
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Computer Networks and Communications
- Computational Theory and Mathematics
- Applied Mathematics