TY - JOUR
T1 - Secure large-scale genome-wide association studies using homomorphic encryption
AU - Blatt, Marcelo
AU - Gusev, Alexander
AU - Polyakov, Yuriy
AU - Goldwasser, Shafi
N1 - Publisher Copyright: © 2020 National Academy of Sciences. All rights reserved.
PY - 2020/5/26
Y1 - 2020/5/26
AB - Genome-wide association studies (GWASs) seek to identify genetic variants associated with a trait, and have been a powerful approach for understanding complex diseases. A critical challenge for GWASs has been the dependence on individual-level data that typically have strict privacy requirements, creating an urgent need for methods that preserve the individual-level privacy of participants. Here, we present a privacy-preserving framework based on several advances in homomorphic encryption and demonstrate that it can perform an accurate GWAS analysis for a real dataset of more than 25,000 individuals, keeping all individual data encrypted and requiring no user interactions. Our extrapolations show that it can evaluate GWASs of 100,000 individuals and 500,000 single-nucleotide polymorphisms (SNPs) in 5.6 h on a single server node (or in 11 min on 31 server nodes running in parallel). Our performance results are more than one order of magnitude faster than prior state-of-the-art results using secure multiparty computation, which requires continuous user interactions, with the accuracy of both solutions being similar. Our homomorphic encryption advances can also be applied to other domains where large-scale statistical analyses over encrypted data are needed.
KW - Encrypted computing
KW - Genome-wide association studies
KW - Homomorphic encryption
UR - http://www.scopus.com/inward/record.url?scp=85085496664&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85085496664&partnerID=8YFLogxK
U2 - 10.1073/pnas.1918257117
DO - 10.1073/pnas.1918257117
M3 - Article
C2 - 32398369
AN - SCOPUS:85085496664
SN - 0027-8424
VL - 117
JO - Proceedings of the National Academy of Sciences of the United States of America
JF - Proceedings of the National Academy of Sciences of the United States of America
IS - 21
M1 - 11608
ER -