Uncertainty is an intrinsic aspect of demand response because electrical loads are subject to many random factors and their capabilities are often not directly measurable until they have been deployed. Demand response algorithms must therefore balance exploiting well-characterized, high-value loads against learning about poorly characterized but potentially valuable ones; this is a manifestation of the classical tradeoff between exploration and exploitation. We address this tradeoff in a restless bandit framework, a generalization of the well-known multi-armed bandit problem. The formulation yields index policies, in which loads are ranked by a scalar index and those with the highest indices are deployed. Index policies are particularly appropriate for demand response because the indices have explicit analytical expressions that can be evaluated separately for each load, making them both simple and scalable. We numerically evaluate the performance of the index policy and discuss the implications of such policies for demand response.
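To make the index-policy idea concrete, the sketch below uses a UCB-style exploration bonus as an illustrative stand-in for a scalar index; the paper's actual index expressions, the load fields, and the function names are all hypothetical here. The key structural point it demonstrates is that each load's index is computed independently from that load's own statistics, so ranking and deployment scale linearly in the number of loads.

```python
import math

def load_index(mean_est, n_obs, t, bonus_weight=1.0):
    # Illustrative scalar index: estimated value plus an exploration
    # bonus that shrinks as a load accumulates observations.
    # (UCB-style placeholder, NOT the paper's derived index.)
    if n_obs == 0:
        return float("inf")  # always try an unobserved load once
    return mean_est + bonus_weight * math.sqrt(2.0 * math.log(t) / n_obs)

def deploy_top_k(loads, t, k):
    # Rank loads by their index and deploy the k highest.
    ranked = sorted(
        loads,
        key=lambda ld: load_index(ld["mean"], ld["n"], t),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical load population: one well-characterized good load,
# one poorly characterized load, one moderately observed load.
loads = [
    {"id": "A", "mean": 0.9, "n": 50},
    {"id": "B", "mean": 0.5, "n": 2},
    {"id": "C", "mean": 0.7, "n": 20},
]
chosen = deploy_top_k(loads, t=100, k=2)
```

With these numbers the rarely observed load "B" receives a large exploration bonus and is deployed alongside "C", illustrating how an index policy automatically mixes exploration of uncertain loads with exploitation of known ones.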