TY - GEN
T1 - Conditional Mutual Information-Based Generalization Bound for Meta Learning
AU - Rezazadeh, Arezou
AU - Jose, Sharu Theresa
AU - Durisi, Giuseppe
AU - Simeone, Osvaldo
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/7/12
Y1 - 2021/7/12
AB - Meta-learning optimizes an inductive bias, typically in the form of the hyperparameters of a base-learning algorithm, by observing data from a finite number of related tasks. This paper presents an information-theoretic bound on the generalization performance of any given meta-learner, which builds on the conditional mutual information (CMI) framework of Steinke and Zakynthinou (2020). In the proposed extension to meta-learning, the CMI bound involves a training meta-supersample obtained by first sampling 2N independent tasks from the task environment, and then drawing 2M independent training samples for each sampled task. The meta-training data fed to the meta-learner is modelled as being obtained by randomly selecting N tasks from the available 2N tasks and M training samples per task from the available 2M training samples per task. The resulting bound is explicit in two CMI terms, which measure the information that the meta-learner output and the base-learner output provide about which training data are selected, given the entire meta-supersample. Finally, we present a numerical example that illustrates the merits of the proposed bound in comparison to prior information-theoretic bounds for meta-learning.
UR - http://www.scopus.com/inward/record.url?scp=85115094462&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85115094462&partnerID=8YFLogxK
U2 - 10.1109/ISIT45174.2021.9518020
DO - 10.1109/ISIT45174.2021.9518020
M3 - Conference contribution
AN - SCOPUS:85115094462
T3 - IEEE International Symposium on Information Theory - Proceedings
SP - 1176
EP - 1181
BT - 2021 IEEE International Symposium on Information Theory, ISIT 2021 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE International Symposium on Information Theory, ISIT 2021
Y2 - 12 July 2021 through 20 July 2021
ER -