TY - JOUR

T1 - Straggler- and Adversary-Tolerant Secure Distributed Matrix Multiplication Using Polynomial Codes

AU - Byrne, Eimear

AU - Gnilke, Oliver W.

AU - Kliewer, Jörg

N1 - Funding Information:
This work was supported in part by U.S. National Science Foundation grants 1815322, 1908756, 2107370 in addition to the UCD Seed Funding-Horizon Scanning scheme (grant no. 54584).
Publisher Copyright:
© 2023 by the authors.

PY - 2023/2

Y1 - 2023/2

N2 - Large matrix multiplications commonly take place in large-scale machine-learning applications. Often, the sheer size of these matrices prevents carrying out the multiplication at a single server. Therefore, these operations are typically offloaded to a distributed computing platform with a master server and a large number of workers in the cloud, operating in parallel. For such distributed platforms, it has recently been shown that coding over the input data matrices can reduce the computational delay by introducing a tolerance against straggling workers, i.e., workers whose execution time significantly lags behind the average. In addition to exact recovery, we impose a security constraint on both matrices to be multiplied. Specifically, we assume that workers can collude and eavesdrop on the content of these matrices. For this problem, we introduce a new class of polynomial codes with fewer non-zero coefficients than the degree plus one. We provide closed-form expressions for the recovery threshold and show that our construction improves the recovery threshold of existing schemes in the literature, in particular for larger matrix dimensions and a moderate to large number of colluding workers. In the absence of any security constraints, we show that our construction is optimal in terms of recovery threshold.

AB - Large matrix multiplications commonly take place in large-scale machine-learning applications. Often, the sheer size of these matrices prevents carrying out the multiplication at a single server. Therefore, these operations are typically offloaded to a distributed computing platform with a master server and a large number of workers in the cloud, operating in parallel. For such distributed platforms, it has recently been shown that coding over the input data matrices can reduce the computational delay by introducing a tolerance against straggling workers, i.e., workers whose execution time significantly lags behind the average. In addition to exact recovery, we impose a security constraint on both matrices to be multiplied. Specifically, we assume that workers can collude and eavesdrop on the content of these matrices. For this problem, we introduce a new class of polynomial codes with fewer non-zero coefficients than the degree plus one. We provide closed-form expressions for the recovery threshold and show that our construction improves the recovery threshold of existing schemes in the literature, in particular for larger matrix dimensions and a moderate to large number of colluding workers. In the absence of any security constraints, we show that our construction is optimal in terms of recovery threshold.

KW - distributed computation

KW - distributed learning

KW - information theoretic security

KW - matrix multiplication

KW - polynomial codes

UR - http://www.scopus.com/inward/record.url?scp=85148946024&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85148946024&partnerID=8YFLogxK

U2 - 10.3390/e25020266

DO - 10.3390/e25020266

M3 - Article

AN - SCOPUS:85148946024

SN - 1099-4300

VL - 25

JO - Entropy

JF - Entropy

IS - 2

M1 - 266

ER -