Information measures, such as the entropy and the Kullback-Leibler (KL) divergence, are typically introduced from an abstract viewpoint based on a notion of "surprise." Accordingly, the entropy of a given random variable (rv) is larger if its realization, when revealed, is on average more "surprising." The goal of this lecture note is to present a principled and intuitive introduction to information measures that builds on inference, i.e., estimation and hypothesis testing. Specifically, entropy and conditional entropy measures are defined using variational characterizations that can be interpreted in terms of the minimum Bayes risk in an estimation problem. Divergence measures are similarly described using variational expressions derived via mismatched estimation or binary hypothesis testing principles. The classical Shannon entropy and the KL divergence are recovered as special cases of more general families of information measures.
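As a small illustration of the estimation viewpoint, the sketch below checks numerically the standard special case in which the loss is the log-loss: the Shannon entropy is then the smallest average loss achievable by any predictive distribution, and the KL divergence is the extra loss incurred by a mismatched one. The function names, the example distribution `p`, and the candidate list are assumptions made for illustration only and are not taken from the lecture note.

```python
import math

def cross_entropy(p, q):
    """Average log-loss (in nats) of predicting with q when X is drawn from p."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def entropy(p):
    """Shannon entropy: the average log-loss of the matched predictor q = p."""
    return cross_entropy(p, p)

def kl_divergence(p, q):
    """Excess average log-loss of the mismatched predictor q over the optimum."""
    return cross_entropy(p, q) - entropy(p)

# Hypothetical true distribution and a few candidate predictive distributions.
p = [0.5, 0.25, 0.25]
candidates = [
    [0.5, 0.25, 0.25],   # matched predictor
    [1/3, 1/3, 1/3],     # uniform guess
    [0.7, 0.2, 0.1],     # overconfident guess
]

# Among the candidates, the matched predictor attains the smallest loss.
losses = [cross_entropy(p, q) for q in candidates]
best = candidates[losses.index(min(losses))]

for q, loss in zip(candidates, losses):
    print(q, "loss:", round(loss, 4), "KL:", round(kl_divergence(p, q), 4))
```

Searching over only three candidates is of course not a proof; it merely shows the pattern that the variational characterization formalizes, namely that no predictor does better than `p` itself and that the gap equals the KL divergence.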
All Science Journal Classification (ASJC) codes
- Signal Processing
- Electrical and Electronic Engineering
- Applied Mathematics