Stochastic gradient descent (SGD) provides a scalable way to compute parameter estimates in applications involving large-scale or streaming data. A variant, averaged implicit SGD (AI-SGD), has been shown to be more stable and more efficient. Although the asymptotic properties of AI-SGD are well established, statistical inference based on it, such as interval estimation, remains unexplored. The bootstrap method is not computationally feasible because it requires repeatedly resampling from the entire data set, and the plug-in method is not applicable when no explicit formula for the covariance matrix is available. In this paper, we propose a scalable procedure for conducting statistical inference based on the AI-SGD estimator. Upon the arrival of each observation, the proposed procedure updates the AI-SGD estimate together with a large number of randomly perturbed AI-SGD estimates. We derive large-sample theoretical properties of the proposed procedure and examine its performance via simulation studies.
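The procedure described above can be sketched as follows. This is an illustrative toy implementation, not the paper's exact algorithm: it uses implicit SGD for linear regression (where the implicit update has a closed form), Polyak–Ruppert averaging, and mean-one exponential random weights on the step size as the perturbation scheme; the step-size schedule, weight distribution, and number of perturbed copies `B` are all assumptions.

```python
import numpy as np

def ai_sgd_perturbed(X, y, B=40, gamma0=1.0, alpha=0.6, rng=None):
    """One pass of averaged implicit SGD (AI-SGD) for linear regression,
    run alongside B randomly perturbed copies, each updated online as
    every observation arrives (a sketch of the perturbation idea)."""
    rng = np.random.default_rng(rng)
    n, p = X.shape
    theta = np.zeros(p)           # implicit SGD iterate
    theta_bar = np.zeros(p)       # running average: the AI-SGD estimate
    thetas_p = np.zeros((B, p))   # perturbed iterates
    bars_p = np.zeros((B, p))     # perturbed running averages
    for t in range(n):
        g = gamma0 / (1.0 + t) ** alpha   # decaying step size (assumed)
        x, yt = X[t], y[t]
        xx = x @ x
        # Implicit update solves theta_t = theta_{t-1} + g*(y - x'theta_t)*x,
        # which for squared loss has the closed form below.
        theta = theta + (g / (1.0 + g * xx)) * (yt - x @ theta) * x
        theta_bar += (theta - theta_bar) / (t + 1)
        # Perturbed copies: multiply the step by a mean-one random weight,
        # so each copy is a randomly reweighted AI-SGD run.
        w = rng.exponential(1.0, size=B)
        for b in range(B):
            gb = g * w[b]
            rb = yt - x @ thetas_p[b]
            thetas_p[b] += (gb / (1.0 + gb * xx)) * rb * x
            bars_p[b] += (thetas_p[b] - bars_p[b]) / (t + 1)
    # Coordinate-wise 95% intervals from the perturbed averages
    lo, hi = np.percentile(bars_p, [2.5, 97.5], axis=0)
    return theta_bar, lo, hi
```

Only the current iterates and running averages are stored, so each observation is processed once and memory is O(B·p), which is what makes the procedure feasible for streaming data.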
Keywords
- big data
- interval estimation
- stochastic gradient descent
- streaming data