TY - GEN
T1 - Incorporation of ordinal optimization into learning automata for high learning efficiency
AU - Zhang, Junqi
AU - Wang, Cheng
AU - Zang, Di
AU - Zhou, Mengchu
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/10/7
Y1 - 2015/10/7
N2 - Learning automata (LA) represent important learning mechanisms with applications in automated system design, biological system modeling, computer vision, and transportation. They play critical roles in modeling a process as well as generating the appropriate signal to control it. They update their action probabilities in accordance with the inputs received from the environment and can improve their own performance during operation. The action probability vector in LA takes charge of two functions: 1) the cost of convergence, i.e., the size of the sampling budget; and 2) the allocation of the sampling budget among actions to identify the optimal one. These two intertwined functions lead to a problem: the sampling budget mostly goes to the currently estimated optimal action because of its high action probability, regardless of whether it can help identify the real optimal action. This work proposes a new class of LA that separates the allocation of the sampling budget from the action probability vector. It uses the action probability vector to determine the size of the sampling budget and then uses Optimal Computing Budget Allocation (OCBA) to allocate that budget in a way that maximizes the probability of identifying the true optimal action. Simulation results verify a significant speedup ranging from 10.93% to 65.94% over the best existing LA algorithms.
AB - Learning automata (LA) represent important learning mechanisms with applications in automated system design, biological system modeling, computer vision, and transportation. They play critical roles in modeling a process as well as generating the appropriate signal to control it. They update their action probabilities in accordance with the inputs received from the environment and can improve their own performance during operation. The action probability vector in LA takes charge of two functions: 1) the cost of convergence, i.e., the size of the sampling budget; and 2) the allocation of the sampling budget among actions to identify the optimal one. These two intertwined functions lead to a problem: the sampling budget mostly goes to the currently estimated optimal action because of its high action probability, regardless of whether it can help identify the real optimal action. This work proposes a new class of LA that separates the allocation of the sampling budget from the action probability vector. It uses the action probability vector to determine the size of the sampling budget and then uses Optimal Computing Budget Allocation (OCBA) to allocate that budget in a way that maximizes the probability of identifying the true optimal action. Simulation results verify a significant speedup ranging from 10.93% to 65.94% over the best existing LA algorithms.
UR - http://www.scopus.com/inward/record.url?scp=84952787991&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84952787991&partnerID=8YFLogxK
U2 - 10.1109/CoASE.2015.7294262
DO - 10.1109/CoASE.2015.7294262
M3 - Conference contribution
AN - SCOPUS:84952787991
T3 - IEEE International Conference on Automation Science and Engineering
SP - 1206
EP - 1211
BT - 2015 IEEE Conference on Automation Science and Engineering
PB - IEEE Computer Society
T2 - 11th IEEE International Conference on Automation Science and Engineering, CASE 2015
Y2 - 24 August 2015 through 28 August 2015
ER -