TY - JOUR
T1 - Assessing the impact of training sample selection on accuracy of an urban classification
T2 - A case study in Denver, Colorado
AU - Jin, Huiran
AU - Stehman, Stephen V.
AU - Mountrakis, Giorgos
N1 - Funding Information: Stehman was supported by the United States Geological Survey [Grant/Cooperative Agreement Number G12AC20221]. Jin and Mountrakis were supported by the National Aeronautics and Space Administration Biodiversity Program [grant number NNX09AK16G].
PY - 2014/3
Y1 - 2014/3
N2 - Understanding the factors that influence the performance of classifications over urban areas is of considerable importance to applications of remote-sensing-derived products in urban design and planning. We examined the impact of training sample selection on a binary classification of urban and nonurban for the Denver, Colorado, metropolitan area. Complete coverage reference data for urban and nonurban cover were available for the year 1997, which allowed us to examine variability in accuracy of the classification over multiple repetitions of the training sample selection and classification process. Four sampling designs for selecting training data were evaluated. These designs represented two options for stratification (spatial and class-specific) and two options for sample allocation (proportional to area and equal allocation). The binary urban and nonurban classification was obtained by employing a decision tree classifier with Landsat imagery. The decision tree classifier was applied to 1000 training samples selected by each of the four training data sampling designs, and accuracy for each classification was derived using the complete coverage reference data. The allocation of sample size to the two classes had a greater effect on classifier performance than the spatial distribution of the training data. The choice of proportional or equal allocation depends on which accuracy objectives have higher priority for a given application. For example, proportionally allocating the training sample to urban and nonurban classes favoured user's accuracy of urban whereas equally allocating the training sample to the two classes favoured producer's accuracy of urban. Although this study focused on urban and nonurban classes, the results and conclusions likely generalize to any binary classification in which the two classes represent disproportionate areas.
UR - http://www.scopus.com/inward/record.url?scp=84896836907&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84896836907&partnerID=8YFLogxK
U2 - 10.1080/01431161.2014.885152
DO - 10.1080/01431161.2014.885152
M3 - Article
AN - SCOPUS:84896836907
SN - 0143-1161
VL - 35
SP - 2067
EP - 2081
JO - International Journal of Remote Sensing
JF - International Journal of Remote Sensing
IS - 6
ER -