TY - JOUR
T1 - Improved TrAdaBoost and its Application to Transaction Fraud Detection
AU - Zheng, Lutao
AU - Liu, Guanjun
AU - Yan, Chungang
AU - Jiang, Changjun
AU - Zhou, Mengchu
AU - Li, Maozhen
N1 - Funding Information:
Manuscript received March 9, 2020; revised July 2, 2020; accepted August 10, 2020. Date of publication August 27, 2020; date of current version November 10, 2020. This work was supported in part by the National Key Research and Development Program of China under Grant 2018YFB2100801, and in part by the Fundamental Research Funds for the Central Universities of China under Grant 22120190198. (Corresponding authors: Guanjun Liu; Changjun Jiang.) Lutao Zheng is with the Department of FinTech Research and Development, CaiTong Security Company Ltd., Hangzhou 310000, China (e-mail: zhenglutao103@163.com).
Publisher Copyright:
© 2014 IEEE.
PY - 2020/10
Y1 - 2020/10
N2 - AdaBoost is a boosting-based machine learning method under the assumption that the data in training and testing sets have the same distribution and input feature space. It increases the weights of those instances that are wrongly classified in a training process. However, the assumption does not hold in many real-world data sets. Therefore, AdaBoost is extended to transfer AdaBoost (TrAdaBoost) that can effectively transfer knowledge from one domain to another. TrAdaBoost decreases the weights of those instances that belong to the source domain but are wrongly classified in a training process. It is more suitable for the case that data are of different distribution. Can it be improved for some special transfer scenarios, e.g., the data distribution changes slightly over time? We find that the distribution of credit card transaction data can change with the changes in the transaction behaviors of users, but the changes are slow most of the time. These changes are yet important for detecting transaction fraud since they result in a so-called concept drift problem. In order to make TrAdaBoost more suitable for the abovementioned case, we, thus, propose an improved TrAdaBoost (ITrAdaBoost) in this article. It updates (i.e., increases or decreases) the weight of a wrongly classified instance in a source domain according to the distribution distance from the instance to a target domain, and the calculation of distance is based on the theory of reproducing kernel Hilbert space. We do a series of experiments over five data sets, and the results illustrate the advantage of ITrAdaBoost.
AB - AdaBoost is a boosting-based machine learning method under the assumption that the data in training and testing sets have the same distribution and input feature space. It increases the weights of those instances that are wrongly classified in a training process. However, the assumption does not hold in many real-world data sets. Therefore, AdaBoost is extended to transfer AdaBoost (TrAdaBoost) that can effectively transfer knowledge from one domain to another. TrAdaBoost decreases the weights of those instances that belong to the source domain but are wrongly classified in a training process. It is more suitable for the case that data are of different distribution. Can it be improved for some special transfer scenarios, e.g., the data distribution changes slightly over time? We find that the distribution of credit card transaction data can change with the changes in the transaction behaviors of users, but the changes are slow most of the time. These changes are yet important for detecting transaction fraud since they result in a so-called concept drift problem. In order to make TrAdaBoost more suitable for the abovementioned case, we, thus, propose an improved TrAdaBoost (ITrAdaBoost) in this article. It updates (i.e., increases or decreases) the weight of a wrongly classified instance in a source domain according to the distribution distance from the instance to a target domain, and the calculation of distance is based on the theory of reproducing kernel Hilbert space. We do a series of experiments over five data sets, and the results illustrate the advantage of ITrAdaBoost.
KW - Boosting learning
KW - E-commerce
KW - transaction fraud detection
KW - transfer learning
UR - http://www.scopus.com/inward/record.url?scp=85095974553&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85095974553&partnerID=8YFLogxK
U2 - 10.1109/TCSS.2020.3017013
DO - 10.1109/TCSS.2020.3017013
M3 - Article
AN - SCOPUS:85095974553
SN - 2329-924X
VL - 7
SP - 1304
EP - 1316
JO - IEEE Transactions on Computational Social Systems
JF - IEEE Transactions on Computational Social Systems
IS - 5
M1 - 9178971
ER -