Batch MCTS, Part 1

It’s not straightforward to get good performance out of AlphaZero’s Monte Carlo Tree Search. Search wants to perform inference for one game state at a time — the new leaf state visited at the end of each game simulation — and the results of that inference affect the subsequent simulation by changing node values.

On the other hand neural network performance (at least on GPUs) benefits greatly from batching. In my browser at the moment, using the WebGL TensorFlow backend and my current Kingdomino model, inference is over 10x faster per data point with a batch size of 64 than a batch size of 1.

In the self-play component of AlphaZero it doesn’t matter how long it takes to complete any particular episode; we only care about the overall rate of episode completion. It’s fine for one thread to take a long time to generate a batch of episodes concurrently as long as the total time per episode is less than it would be when generating episodes sequentially.

Single-threaded concurrent episodes with batch inference can be implemented nicely in TypeScript using Generators. The tree search implementation can still use the natural recursive approach. Tree search methods are generator functions which yield state values and receive inference results from those yield expressions. Recursing into a child node uses yield *.

Add a couple more generator functions to loop over simulations for one state and over actions for one episode, and then harness the resulting episode generators together with this utility function:

/**
 * Drives {@link generators} to completion, using {@link f} to provide the
 * parameters to {@link Generator.next} for batches of generators that yield
 * intermediate values
 *
 * @return the return values of {@link generators}
 */
export function driveGenerators<ItemT, ReturnT, NextT>(
  generators: ReadonlyArray<Generator<ItemT, ReturnT, NextT>>,
  f: (items: ReadonlyArray<ItemT>) => ReadonlyArray<NextT>
): ReadonlyArray<ReturnT> {...}

Then you’ve got batching MCTS on one thread with each episode behaving the same as it would in non-batching MCTS. TypeScript’s Generators that can receive as well as emit values make this implementation more safe, efficient, and natural than it would be in most other languages.