TY - JOUR
T1 - A transfer learning approach via Procrustes analysis and mean shift for cancer drug sensitivity prediction
AU - Turki, Turki
AU - Wei, Zhi
AU - Wang, Jason T.L.
N1 - Funding Information:
This project was funded by the Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah, Saudi Arabia under Grant No. (KEP-3-611-39). The authors, therefore, acknowledge with thanks DSR technical and financial support.
Publisher Copyright:
© 2018 World Scientific Publishing Europe Ltd.
PY - 2018/6/1
Y1 - 2018/6/1
N2 - Transfer learning (TL) algorithms aim to improve prediction performance in a target task (e.g., the prediction of cisplatin sensitivity in triple-negative breast cancer patients) by transferring knowledge from auxiliary data of a related task (e.g., the prediction of docetaxel sensitivity in breast cancer patients), where the distribution and even the feature space of the data pertaining to the tasks can differ. In real-world applications, we sometimes have only a limited training set in the target task but have auxiliary data from a related task. Supervised learning, however, requires a sufficiently large training set in the target task to perform well in predicting future test examples of that task. In this paper, we propose a TL approach for cancer drug sensitivity prediction that combines three techniques. First, we shift the representation of a subset of examples from the auxiliary data of a related task to a representation closer to the training set of the target task. Second, we align the shifted representation of the selected auxiliary examples to the target training set. Third, we train machine learning algorithms using both the target training set and the aligned examples. We evaluate our approach against baseline approaches using the area under the receiver operating characteristic (ROC) curve (AUC) on real clinical trial datasets pertaining to multiple myeloma, non-small cell lung cancer, triple-negative breast cancer, and breast cancer. Experimental results show that our approach outperforms the baseline approaches and that the improvements are statistically significant.
KW - Transfer learning
KW - cancer genomics
KW - clinical informatics
KW - precision medicine
UR - http://www.scopus.com/inward/record.url?scp=85049178653&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85049178653&partnerID=8YFLogxK
U2 - 10.1142/S0219720018400140
DO - 10.1142/S0219720018400140
M3 - Article
C2 - 29945499
AN - SCOPUS:85049178653
SN - 0219-7200
VL - 16
JO - Journal of Bioinformatics and Computational Biology
JF - Journal of Bioinformatics and Computational Biology
IS - 3
M1 - 1840014
ER -