Simulation balancing (SB) is a new technique for tuning the parameters of a playout policy for a Monte-Carlo game-playing program. So far, this algorithm has only been tested in an artificial setting: it was limited to 5 × 5 and 6 × 6 Go, and required a stronger external program that served as a supervisor. In this article, the effectiveness of simulation balancing is demonstrated in a realistic setting. A state-of-the-art program, ERICA, learned an improved playout policy on the 9 × 9 board, without requiring any external expert to provide position evaluations. Evaluations were collected by letting the program analyze positions by itself. These evaluations were run with playout parameters estimated by the minorization-maximization (MM) algorithm. Thanks to simulation balancing, ERICA's winning rate against FUEGO 0.4 improved from 69% (with playout parameters trained by MM) to 78% (with playout parameters trained by SB).