TY - JOUR
T1 - Are missing values important for earnings forecasts? A machine learning perspective
AU - Uddin, Ajim
AU - Tao, Xinyuan
AU - Chou, Chia Ching
AU - Yu, Dantong
N1 - Funding Information:
This work was supported by US National Institutes of Health [UL1TR003017].
Publisher Copyright:
© 2022 New Jersey Institute of Technology.
PY - 2022
Y1 - 2022
N2 - Analysts' forecasts are one of the most common and important estimators for firms' future earnings. However, they are challenging to fully utilize because of missing values. This study applies machine learning techniques to estimate missing values in individual analysts' forecasts and subsequently to predict firms' future earnings based on both estimated and observed forecasts. After estimating missing values, forecast error is reduced by 41% compared to the mean forecast, suggesting that missing values after estimating are indeed useful for earnings forecasts. We analyze multiple estimation methods and show that the out-performance of matrix factorization (MF) is consistent using different evaluation measures and across firms. Finally, we propose a stochastic gradient descent based coupled matrix factorization (CMF) to augment the estimation quality of missing values with multiple datasets. CMF further reduces the error of earnings forecasts by 19% compared to MF with a single dataset.
KW - Analysts' earnings forecast
KW - Coupled matrix factorization
KW - Firm earnings prediction
KW - Machine learning
KW - Missing value estimation
UR - http://www.scopus.com/inward/record.url?scp=85122391988&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85122391988&partnerID=8YFLogxK
U2 - 10.1080/14697688.2021.1963825
DO - 10.1080/14697688.2021.1963825
M3 - Article
AN - SCOPUS:85122391988
SN - 1469-7688
VL - 22
SP - 1113
EP - 1132
JO - Quantitative Finance
JF - Quantitative Finance
IS - 6
ER -