Improved online sequential extreme learning machine: A new intelligent evaluation method for AZ-style algorithms

Xiali Li, Shuai He, Zhi Wei, Licheng Wu

Research output: Contribution to journal › Article › peer-review

Abstract

Research on computer game playing for Go, chess, and Japanese chess stands out as one of the notable landmarks in the progress of artificial intelligence. The AlphaGo, AlphaGo Zero, and AlphaZero algorithms, which are called AlphaZero-style (AZ-style) algorithms in some literature [1], have achieved superhuman performance by using deep reinforcement learning (DRL). However, AZ-style algorithms have several drawbacks in practical applications: training details are unavailable, model training requires expensive equipment, and without such equipment slow self-play training results in low evaluation accuracy. To address these problems to a certain extent, the paper proposes an improved online sequential extreme learning machine (IOS-ELM), a new method for evaluating board positions in AZ-style algorithms. First, the theoretical principles of IOS-ELM are given. Second, the study takes Gomoku as the application object and uses IOS-ELM to evaluate board positions in an AZ-style algorithm, discussing in detail the loss during training and the hyperparameters that affect performance. Under the same experimental conditions, the proposed method reduces the number of training parameters by a factor of 14, the training time to 15%, and the evaluation error by 13% compared with the board evaluation network used in the original AZ-style algorithms.
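For orientation, the sketch below shows the standard online sequential extreme learning machine update that IOS-ELM builds on. It is not the paper's improved variant, whose modifications are not described in this abstract. It assumes a single scalar output (for example, a board-position value), a fixed random tanh hidden layer, one sample per update, and a ridge-regularized start in place of an initial batch phase. The class name `OSELM`, the `ridge` parameter, and the feature encoding are hypothetical choices made for illustration.

```python
import math
import random


def hidden_features(x, weights, biases):
    """Fixed random hidden layer: tanh(w . x + b) for each hidden node."""
    return [math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)
            for w, b in zip(weights, biases)]


class OSELM:
    """Standard OS-ELM with one output, updated one sample at a time."""

    def __init__(self, n_inputs, n_hidden, ridge=1e-3, seed=0):
        rng = random.Random(seed)
        # Hidden-layer parameters are drawn once and never trained.
        self.weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                        for _ in range(n_hidden)]
        self.biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        # Output weights: the only trained parameters.
        self.beta = [0.0] * n_hidden
        # P tracks the inverse of (H^T H + ridge * I); it starts at I / ridge.
        self.P = [[1.0 / ridge if i == j else 0.0 for j in range(n_hidden)]
                  for i in range(n_hidden)]

    def predict(self, x):
        h = hidden_features(x, self.weights, self.biases)
        return sum(b * hi for b, hi in zip(self.beta, h))

    def update(self, x, target):
        """Recursive least-squares step for one (input, target) pair."""
        h = hidden_features(x, self.weights, self.biases)
        Ph = [sum(p * hj for p, hj in zip(row, h)) for row in self.P]
        denom = 1.0 + sum(hi * phi for hi, phi in zip(h, Ph))
        error = target - sum(b * hi for b, hi in zip(self.beta, h))
        n = len(h)
        # P <- P - (P h)(P h)^T / (1 + h^T P h), using the symmetry of P.
        for i in range(n):
            for j in range(n):
                self.P[i][j] -= Ph[i] * Ph[j] / denom
        # beta <- beta + P_new h * error, where P_new h equals (P h) / denom.
        for i in range(n):
            self.beta[i] += Ph[i] / denom * error
```

Because only the output weights `beta` are fitted, and each update is a closed-form least-squares step rather than a gradient pass, a model of this kind has far fewer trained parameters than a deep evaluation network. That is the general property the abstract's comparison relies on.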

Original language: English (US)
Article number: 8821351
Pages (from-to): 124891-124901
Number of pages: 11
Journal: IEEE Access
Volume: 7
State: Published - 2019

All Science Journal Classification (ASJC) codes

  • Computer Science(all)
  • Materials Science(all)
  • Engineering(all)

Keywords

  • AlphaZero
  • Artificial intelligence
  • deep reinforcement learning
  • evaluation method
  • online sequential extreme learning machine
