Networks of spiking neurons and Winner-Take-All spiking circuits (WTA-SNNs) can detect information encoded in spatio-temporal multi-valued events. Such events are described by the timing of occurrences of interest, e.g., clicks, as well as by categorical numerical values assigned to each occurrence, e.g., like or dislike. Other use cases include object recognition from data collected by neuromorphic cameras, which produce, for each pixel, signed bits at the times of sufficiently large brightness variations. Existing schemes for training WTA-SNNs are limited to rate-encoding solutions, and are hence able to detect only spatial patterns. More general training algorithms for arbitrary WTA-SNNs inherit the challenges of training (binary) Spiking Neural Networks (SNNs): most notably, the non-differentiability of threshold functions, the recurrent behavior of spiking neural models, and the difficulty of implementing backpropagation in neuromorphic hardware. In this paper, we develop a variational online local training rule for WTA-SNNs, referred to as VOWEL, that leverages only local pre- and post-synaptic information for visible circuits, and an additional common reward signal for hidden circuits. The method is based on probabilistic generalized linear neural models, control variates, and variational regularization. Experimental results on real-world neuromorphic datasets with multi-valued events demonstrate the advantages of WTA-SNNs over conventional binary SNNs trained with state-of-the-art methods, especially in the presence of limited computing resources.
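
To make the notion of a probabilistic WTA circuit concrete, the following is a minimal illustrative sketch and not the model defined in this paper. It assumes a circuit of C units in which at most one unit spikes per time step, with spike probabilities given by a softmax over the units' membrane potentials plus a "silent" option whose potential is fixed at zero. The function names `wta_spike_probs` and `sample_wta`, and the fixed zero potential for silence, are assumptions made for illustration only.

```python
import numpy as np

def wta_spike_probs(potentials):
    """Categorical probabilities for a hypothetical WTA circuit.

    Index 0 is the assumed "silent" outcome (potential fixed at 0);
    indices 1..C correspond to the C units of the circuit.
    """
    logits = np.concatenate(([0.0], np.asarray(potentials, dtype=float)))
    logits -= logits.max()          # shift for numerical stability
    p = np.exp(logits)
    return p / p.sum()

def sample_wta(potentials, rng):
    """Sample a one-hot (or all-zero) spike vector for one time step."""
    probs = wta_spike_probs(potentials)
    k = rng.choice(len(probs), p=probs)
    spikes = np.zeros(len(probs) - 1, dtype=int)
    if k > 0:                       # k == 0 means no unit spikes
        spikes[k - 1] = 1
    return spikes, probs

rng = np.random.default_rng(0)
spikes, probs = sample_wta([0.5, -1.0], rng)
```

With C = 2, the two units can stand for the signed events of a neuromorphic camera pixel, while a binary spiking neuron corresponds to the special case C = 1.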