There has been growing interest in using convolutional neural networks (CNNs) in the fields of image forensics and steganalysis, and some promising results have been reported recently. These works mainly focus on the architectural design of CNNs, usually, a single CNN model is trained and then tested in experiments. It is known that, neural networks, including CNNs, are suitable to form ensembles. From this perspective, in this paper, we employ CNNs as base learners and test several different ensemble strategies. In our study, at first, a recently proposed CNN architecture is adopted to build a group of CNNs, each of them is trained on a random subsample of the training dataset. The output probabilities, or some intermediate feature representations, of each CNN, are then extracted from the original data and pooled together to form new features ready for the second level of classification. To make best use of the trained CNN models, we manage to partially recover the lost information due to spatial subsampling in the pooling layers when forming feature vectors. Performance of the ensemble methods are evaluated on BOSSbase by detecting S-UNIWARD at 0.4 bpp embedding rate. Results have indicated that both the recovery of the lost information, and learning from intermediate representation in CNNs instead of output probabilities, have led to performance improvement.