Federated Learning (FL), an emerging paradigm for fast intelligent acquisition at the network edge, enables joint training of a machine learning model over distributed data sets and computing resources with limited disclosure of local data. Communication is a critical enabler of large-scale FL due to significant amount of model information exchanged among edge devices. In this paper, we consider a network of wireless devices sharing a common fading wireless channel for the deployment of FL. Each device holds a generally distinct training set, and communication typically takes place in a Device-To-Device (D2D) manner. In the ideal case in which all devices within communication range can communicate simultaneously and noiselessly, a standard protocol that is guaranteed to converge to an optimal solution of the global empirical risk minimization problem under convexity and connectivity assumptions is Decentralized Stochastic Gradient Descent (DSGD). DSGD integrates local SGD steps with periodic consensus averages that require communication between neighboring devices. In this paper, wireless protocols are proposed that implement DSGD by accounting for the presence of path loss, fading, blockages, and mutual interference. The proposed protocols are based on graph coloring for scheduling and on both digital and analog transmission strategies at the physical layer, with the latter leveraging over-The-Air computing via sparsity-based recovery.