Incorporation of Optimal Computing Budget Allocation for Ordinal Optimization into Learning Automata

Junqi Zhang, Cheng Wang, Di Zang, Mengchu Zhou

Research output: Contribution to journal, Article, peer-review

21 Scopus citations

Abstract

A learning automaton (LA) is a powerful tool for reinforcement learning. Its action probability vector plays two roles: 1) deciding when the LA converges, i.e., the total computing budget it uses, and 2) allocating the computing budget among actions to identify the optimal one. These two intertwined roles lead to a problem: most of the computing budget goes to the currently estimated optimal action because of its high action probability, regardless of whether such an allocation helps identify the true optimal one. This work proposes a new class of LA that avoids using the action probability vector for computing budget allocation. Instead, the vector is used only to determine whether the LA has converged, and optimal computing budget allocation (OCBA) is employed to allocate the computing budget in a way that maximizes the probability of identifying the true optimal actions. The ϵ-optimality of the proposed LA is proven. Simulations verify its advantages over existing algorithms.
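
To make the allocation idea concrete, the following is a minimal sketch of the classic OCBA allocation rule from the ordinal optimization literature. It is not the algorithm proposed in the paper, whose details the abstract does not give. The function name `ocba_fractions` and the inputs are hypothetical; the sketch assumes that a larger estimated mean reward is better, that every standard deviation is positive, and that the best estimated mean is unique.

```python
import math


def ocba_fractions(means, stds):
    """Budget fractions under the classic OCBA rule (illustrative sketch).

    Assumptions (not from the paper): larger mean is better, all stds > 0,
    and exactly one action has the highest estimated mean.
    """
    k = len(means)
    best = max(range(k), key=lambda i: means[i])

    weights = [0.0] * k
    for i in range(k):
        if i != best:
            # Non-best actions: weight proportional to (sigma_i / delta_i)^2,
            # where delta_i is the gap to the estimated best mean.
            delta = means[best] - means[i]
            weights[i] = (stds[i] / delta) ** 2

    # Best action: N_b = sigma_b * sqrt(sum over i != b of (N_i / sigma_i)^2).
    weights[best] = stds[best] * math.sqrt(
        sum((weights[i] / stds[i]) ** 2 for i in range(k) if i != best)
    )

    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical estimates for three actions.
fractions = ocba_fractions([0.8, 0.7, 0.4], [0.4, 0.4, 0.4])
print(fractions)
```

Under this rule, the close competitor (mean 0.7) receives far more budget than the clearly inferior action (mean 0.4), whereas a probability-driven allocation would concentrate samples on the current best estimate.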

Original language: English (US)
Article number: 7165689
Pages (from-to): 1008-1017
Number of pages: 10
Journal: IEEE Transactions on Automation Science and Engineering
Volume: 13
Issue number: 2
State: Published - Apr 2016

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Electrical and Electronic Engineering

Keywords

  • Learning automata (LA)
  • optimal computing budget allocation (OCBA)
  • ordinal optimization
