Symbolic data have become increasingly popular in the era of big data. In this paper, we consider density estimation and regression for interval-valued data, a special type of symbolic data that is common in astronomy and official statistics. We propose kernel estimators with adaptive bandwidths that account for the variability of each interval. Specifically, we derive cross-validation bandwidth selectors for density estimation and extend the Nadaraya–Watson estimator to regression with interval-valued data. We assess the performance of the proposed methods against existing kernel methods through extensive simulation studies and real data analysis.
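As a rough illustration of what a per-interval bandwidth can look like, the following Python sketch builds a Gaussian kernel density estimate from interval midpoints and widens each kernel according to the width of its interval. This is not the estimator proposed in the paper. The function name `interval_kde`, the midpoint summary, and the inflation rule `h_i = sqrt(h^2 + w_i^2 / 12)` (the variance of a uniform distribution on the interval) are all assumptions made for this example, and the global bandwidth `h` is fixed by hand rather than chosen by cross-validation.

```python
import numpy as np


def interval_kde(lower, upper, grid, h):
    """Gaussian kernel density estimate for interval-valued observations.

    Assumption (not taken from the paper): each interval [lower_i, upper_i]
    is summarised by its midpoint, and its kernel bandwidth is inflated by
    the variance of a uniform distribution on the interval, w_i^2 / 12.
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    grid = np.asarray(grid, dtype=float)

    centers = (lower + upper) / 2.0
    widths = upper - lower

    # Per-interval bandwidth: global h combined with within-interval spread.
    h_i = np.sqrt(h**2 + widths**2 / 12.0)

    # Rows index grid points, columns index observations (broadcasting).
    z = (grid[:, None] - centers[None, :]) / h_i[None, :]
    kernels = np.exp(-0.5 * z**2) / (np.sqrt(2.0 * np.pi) * h_i[None, :])

    # Average the n kernels to obtain the density estimate on the grid.
    return kernels.mean(axis=1)


# Hypothetical example data: three intervals, one of them degenerate.
lower = np.array([0.0, 1.0, 2.5])
upper = np.array([1.0, 3.0, 2.5])
grid = np.linspace(-20.0, 25.0, 9001)
density = interval_kde(lower, upper, grid, h=0.5)
```

Under this assumed rule, a degenerate interval (lower equal to upper) gets the plain bandwidth `h`, so the sketch reduces to an ordinary Gaussian kernel density estimate when every observation is a single point.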
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Statistics, Probability and Uncertainty
Keywords
- Cross validation
- Nadaraya–Watson estimator
- kernel density estimation
- symbolic data