TY - JOUR
T1 - Private and Secure Distributed Matrix Multiplication with Flexible Communication Load
AU - Aliasgari, Malihe
AU - Simeone, Osvaldo
AU - Kliewer, Jörg
N1 - Funding Information:
Manuscript received September 1, 2019; revised December 18, 2019; accepted February 2, 2020. Date of publication February 6, 2020; date of current version March 9, 2020. This work was supported in part by the European Research Council (ERC) under the European Union's Horizon 2020 Research and Innovation Programme under Grant 725731, and in part by the U.S. NSF Grant CNS-1526547 and Grant CCF-1525629. This article was presented in part at the IEEE International Symposium on Information Theory (ISIT), 2019 [1]. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Xiaodong Lin. (Corresponding author: Malihe Aliasgari.) Malihe Aliasgari and Jörg Kliewer are with the Department of Electrical and Computer Engineering, New Jersey Institute of Technology, Newark, NJ 07102 USA (e-mail: ma839@njit.edu; jkliewer@njit.edu).
Publisher Copyright:
© 2005-2012 IEEE.
PY - 2020
Y1 - 2020
AB - Large matrix multiplications are central to large-scale machine learning applications. These operations are often carried out on a distributed computing platform with a master server and multiple workers in the cloud operating in parallel. For such distributed platforms, it has been recently shown that coding over the input data matrices can reduce the computational delay, yielding a trade-off between recovery threshold, i.e., the number of workers required to recover the matrix product, and communication load, i.e., the total amount of data to be downloaded from the workers. In this paper, in addition to exact recovery requirements, we impose security and privacy constraints on the data matrices, and study the recovery threshold as a function of the communication load. We first assume that both matrices contain private information and that workers can collude to eavesdrop on the content of these data matrices. For this problem, we introduce a novel class of secure codes, referred to as secure generalized PolyDot (SGPD) codes, that generalize state-of-the-art non-secure codes for matrix multiplication. SGPD codes allow a flexible trade-off between recovery threshold and communication load for a fixed maximum number of colluding workers while providing perfect secrecy for the two data matrices. We then study a connection between secure matrix multiplication and private information retrieval. We specifically assume that one of the data matrices is taken from a public set known to all the workers. In this setup, the identity of the matrix of interest should be kept private from the workers. For this model, we present a variant of generalized PolyDot codes that can guarantee both secrecy of one matrix and privacy for the identity of the other matrix for the case of no colluding servers.
KW - Coded distributed computation
KW - distributed learning
KW - information theoretic security
KW - private information retrieval
KW - secret sharing
UR - http://www.scopus.com/inward/record.url?scp=85082171767&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85082171767&partnerID=8YFLogxK
U2 - 10.1109/TIFS.2020.2972166
DO - 10.1109/TIFS.2020.2972166
M3 - Article
AN - SCOPUS:85082171767
SN - 1556-6013
VL - 15
SP - 2722
EP - 2734
JO - IEEE Transactions on Information Forensics and Security
JF - IEEE Transactions on Information Forensics and Security
M1 - 8985291
ER -