General Circulation Models (GCMs) allow for the simulation of several climate variables through the year 2100. GCM simulations, however, are too coarse to resolve climate change at the local scale. Hence, spatial downscaling of these simulations is required, and statistical downscaling is often used for this purpose. Statistical downscaling uses large-scale GCM outputs to predict a local-scale field (e.g., temperature). In this paper, we develop a new deep learning approach, named AIG-TRANSFORMER, which employs a novel attention-based input grouping (AIG) neural network followed by a transformer, for the statistical downscaling of the weekly averages of maximum (Tmax) and minimum (Tmin) temperatures using GCM-simulated climatic fields (climate variables). We formulate the downscaling problem as a multivariate time series forecasting task, with multiple GCM-simulated climatic fields as input features. We employ an attention mechanism within the AIG network to give selective importance to the input features while reducing the size of the input fed to the transformer. To test AIG-TRANSFORMER, we perform statistical downscaling over the Hackensack-Passaic Watershed in northeastern New Jersey. We compare our new deep learning approach against several existing machine learning methods, including random forests, support vector regression, and long short-term memory networks. Experimental results show that AIG-TRANSFORMER outperforms the existing methods for downscaling both the maximum and minimum temperatures, with a Nash-Sutcliffe Efficiency coefficient of 0.84 for Tmax and 0.85 for Tmin. We further apply AIG-TRANSFORMER to produce long-term projections over the 20-year period from 2030 to 2049, and report the annual means for the maximum and minimum temperatures.
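The Nash-Sutcliffe Efficiency (NSE) reported above is a standard skill score for hydrological and climatological models: it equals 1 for a perfect fit and 0 when the model predicts no better than the observed mean. The following is a minimal sketch of its standard definition, not code from the paper itself:

```python
import numpy as np

def nse(observed, predicted):
    """Nash-Sutcliffe Efficiency.

    NSE = 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)
    Values closer to 1 indicate better agreement between the
    predicted (downscaled) and observed series.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual_ss = np.sum((observed - predicted) ** 2)   # model error
    total_ss = np.sum((observed - observed.mean()) ** 2)  # variance of obs
    return 1.0 - residual_ss / total_ss
```

For example, `nse([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8])` evaluates to 0.98, while a prediction that simply returns the observed mean yields an NSE of 0.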