Exact-win strategy for overcoming AlphaZero

Yen Chi Chen, Chih Hung Chen, Shun Shii Lin*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

The Monte-Carlo Tree Search (MCTS) used in AlphaZero can easily miss a critical move because it samples the search space and concentrates on the most promising moves. In addition, MCTS may sample a move many times even after that move has been explored and its game-theoretical value determined. In this paper, we propose Exact-win-MCTS, which uses sub-tree information (WIN, LOSS, DRAW, and UNKNOWN) to prune unneeded moves and so increase the chance of discovering critical moves. Our method improves and generalizes some previous MCTS variations as well as the AlphaZero approach. The experiments show that Exact-win-MCTS substantially increases the playing strength of Tic-Tac-Toe, Connect4, and, especially, Go programs. Finally, our Exact-win Zero defeats Leela Zero, a replication of AlphaZero and currently one of the best open-source Go programs, with a significant 61% win rate. Therefore, we are pleased to announce that Exact-win-MCTS has overcome the AlphaZero approach without using extra training time, playing time, or computer resources. As far as we know, this is the first practical idea with concrete experiments to beat the AlphaZero approach.
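The pruning idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract names only the four labels (WIN, LOSS, DRAW, UNKNOWN), so the names `Status`, `Node`, `resolve`, `backpropagate_exact`, and `selectable_children`, and the exact propagation rules, are assumptions made for illustration. The sketch assumes a two-player zero-sum game, that each label is stored from the point of view of the player to move at that node, and that a node's `children` list covers all of its legal moves.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    UNKNOWN = 0
    WIN = 1
    LOSS = 2
    DRAW = 3


@dataclass(eq=False)
class Node:
    # Assumption: status is from the viewpoint of the player to move here.
    status: Status = Status.UNKNOWN
    children: List["Node"] = field(default_factory=list, repr=False)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add_child(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def resolve(node: Node) -> Status:
    """Derive a node's exact status from its children (hypothetical rules)."""
    if not node.children:
        return node.status
    statuses = [c.status for c in node.children]
    # A child where the opponent (to move there) loses is a winning move.
    if Status.LOSS in statuses:
        return Status.WIN
    # Without a proven win, any unresolved child keeps this node open.
    if Status.UNKNOWN in statuses:
        return Status.UNKNOWN
    # All children resolved: a draw is the best remaining outcome, if any.
    if Status.DRAW in statuses:
        return Status.DRAW
    # Every move leads to a position the opponent wins.
    return Status.LOSS


def backpropagate_exact(node: Node) -> None:
    """Push a newly determined status up the tree until nothing changes."""
    current = node.parent
    while current is not None:
        new_status = resolve(current)
        if new_status is current.status:
            break
        current.status = new_status
        current = current.parent


def selectable_children(node: Node) -> List[Node]:
    """Pruning step: the search skips children whose value is already exact."""
    return [c for c in node.children if c.status is Status.UNKNOWN]
```

In a full search, `selectable_children` would presumably feed the usual selection rule (for example PUCT in an AlphaZero-style program), so that simulations are spent only on moves whose game-theoretical value is still open.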

Original language: English
Title of host publication: 2018 International Conference on Computational Intelligence and Intelligent Systems, CIIS 2018
Publisher: Association for Computing Machinery
Pages: 26-31
Number of pages: 6
ISBN (Electronic): 9781450365956
DOIs
Publication status: Published - 2018 Nov 17
Event: 2018 International Conference on Computational Intelligence and Intelligent Systems, CIIS 2018 - Phuket, Thailand
Duration: 2018 Nov 17 to 2018 Nov 19

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 2018 International Conference on Computational Intelligence and Intelligent Systems, CIIS 2018
Country/Territory: Thailand
City: Phuket
Period: 2018/11/17 to 2018/11/19

Keywords

  • AlphaZero
  • Exact-win zero
  • Exact-win-MCTS
  • Leela zero
  • MCTS pruning

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
