Many complex sensor network applications require deploying a large number of inexpensive and small sensors in a vast geographical region to achieve quality through quantity. Hierarchical clustering is generally considered as an efficient and scalable way to facilitate the management and operation of such large-scale networks and minimize the total energy consumption for prolonged lifetime. Judicious selection of cluster heads for data integration and communication is critical to the success of applications based on hierarchical sensor networks organized as layered clusters. We investigate the problem of selecting nodes in a pre-deployed sensor network to be the cluster heads to minimize the total energy needed for data gathering.We rigorously derive an analytical formula to optimize the number of cluster heads in sensor networks under uniform node distribution, and propose a Distance-based Crowdedness Clustering algorithm to determine the cluster heads in sensor networks under general node distribution. The results from an extensive set of experiments on a large number of simulated sensor networks illustrate the performance superiority of the proposed solution over the clustering schemes based on k-means algorithm.