A New Learning Automaton for Selecting an Arbitrary Subset of Actions

Junqi Zhang, Peng Zu, Peng Zhan Qiu, Meng Chu Zhou

Research output: Contribution to journal › Article › peer-review

Abstract

As a powerful reinforcement learning method, the learning automaton (LA) has been studied, analyzed, and applied to various engineering systems for decades. However, state-of-the-art LA-based methods can select only the optimal action or the optimal subset of actions. They cannot select an arbitrary target subset, such as the best and worst actions together or the actions within a given rank range. To solve the problem of selecting a given arbitrary subset of actions, this work proposes a novel pursuit learning scheme, called a discretized equal pursuit reward-inaction algorithm for arbitrary subset selection (DEP RI-AS). The proposed scheme pursues the currently estimated target subset and makes the selection probabilities of all actions in that subset equal, so as to increase the convergence speed. A proof of its ε-optimality property is presented. Comparison experiments, parameter analysis, and a real-world application demonstrate its effectiveness in selecting a given subset of user-desired actions.
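The pursuit idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' DEP RI-AS algorithm, whose exact update rules are not given on this page. It assumes Bernoulli rewards, a target subset specified by estimated ranks (0 = highest estimated reward probability), and a fixed discretized step. All names (`estimated_subset`, `pursue_subset`, `run`) and parameters (`resolution`, `init_samples`, `threshold`) are hypothetical.

```python
import random


def estimated_subset(estimates, target_ranks):
    """Indices of the actions whose estimated rank (0 = best) is in target_ranks."""
    order = sorted(range(len(estimates)), key=lambda i: estimates[i], reverse=True)
    return {order[k] for k in target_ranks}


def pursue_subset(probs, subset, step):
    """Rewarded update (assumed form): shrink every probability outside
    `subset` by `step`, floored at 0, then share the remaining mass
    equally among the actions inside `subset`."""
    new = [0.0] * len(probs)
    outside_mass = 0.0
    for i, p in enumerate(probs):
        if i not in subset:
            new[i] = max(p - step, 0.0)
            outside_mass += new[i]
    share = (1.0 - outside_mass) / len(subset)
    for i in subset:
        new[i] = share
    return new


def run(reward_probs, target_ranks, resolution=200, init_samples=10,
        max_iters=100_000, threshold=0.99, seed=0):
    """Simulate the sketch on a Bernoulli environment (an assumption)."""
    rng = random.Random(seed)
    r = len(reward_probs)
    step = 1.0 / (r * resolution)  # discretized step size (assumed form)
    rewards, pulls = [0] * r, [0] * r

    # Seed the reward estimates by sampling every action a few times.
    for i in range(r):
        for _ in range(init_samples):
            rewards[i] += rng.random() < reward_probs[i]
            pulls[i] += 1

    probs = [1.0 / r] * r
    for _ in range(max_iters):
        action = rng.choices(range(r), weights=probs)[0]
        rewarded = rng.random() < reward_probs[action]
        pulls[action] += 1
        rewards[action] += rewarded
        if rewarded:  # reward-inaction: probabilities change only on reward
            estimates = [w / z for w, z in zip(rewards, pulls)]
            subset = estimated_subset(estimates, target_ranks)
            probs = pursue_subset(probs, subset, step)
            if sum(probs[i] for i in subset) >= threshold:
                break

    estimates = [w / z for w, z in zip(rewards, pulls)]
    return probs, estimated_subset(estimates, target_ranks)


if __name__ == "__main__":
    # Hypothetical environment: ask for the best and the worst of five actions.
    final_probs, chosen = run([0.9, 0.7, 0.5, 0.3, 0.1], target_ranks={0, 4})
    print(sorted(chosen), [round(p, 3) for p in final_probs])
```

Equalizing the probabilities inside the pursued subset keeps every target action sampled at the same rate, which is one plausible reading of why the abstract links equal probabilities to faster convergence.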

Original language: English (US)
Pages (from-to): 568-577
Number of pages: 10
Journal: IEEE Transactions on Systems, Man, and Cybernetics: Systems
Volume: 54
Issue number: 1
DOIs
State: Published - Jan 1 2024

All Science Journal Classification (ASJC) codes

  • Software
  • Control and Systems Engineering
  • Human-Computer Interaction
  • Computer Science Applications
  • Electrical and Electronic Engineering

Keywords

  • Arbitrary subset
  • learning automaton (LA)
  • pursuit LA
