The previous post described a batch-inference MCTS implementation for self-play. Each batch consisted of one state from each of N concurrent episodes. Unfortunately, that approach doesn’t work for playing a competitive episode, since in that case there’s ...
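For reference, the self-play batching recalled above can be sketched roughly as follows. This is a minimal illustration, not the code from the previous post: `Episode`, `select_leaf`, `apply_evaluation`, `dummy_model`, and `batched_search_step` are hypothetical names, the tree search is stubbed out, and the "model" is a placeholder that returns a uniform policy and a zero value.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple

State = Tuple[int, ...]                  # hypothetical state encoding
Evaluation = Tuple[List[float], float]   # (policy over actions, value estimate)

NUM_ACTIONS = 3  # assumed size of the action space


@dataclass
class Episode:
    """Stand-in for one self-play episode that owns its own search tree."""
    state: State
    evaluations: List[Evaluation] = field(default_factory=list)

    def select_leaf(self) -> State:
        # A real MCTS would descend its tree here; this stub returns the root state.
        return self.state

    def apply_evaluation(self, evaluation: Evaluation) -> None:
        # A real MCTS would expand the leaf and back up the value here.
        self.evaluations.append(evaluation)


def dummy_model(batch: Sequence[State]) -> List[Evaluation]:
    """Placeholder for a single network forward pass over the whole batch."""
    uniform = [1.0 / NUM_ACTIONS] * NUM_ACTIONS
    return [(list(uniform), 0.0) for _ in batch]


def batched_search_step(
    episodes: Sequence[Episode],
    model: Callable[[Sequence[State]], List[Evaluation]],
) -> int:
    """Run one search step for every episode using a single model call."""
    # One leaf state per concurrent episode forms the batch.
    batch = [episode.select_leaf() for episode in episodes]
    results = model(batch)
    # Hand each result back to the episode that produced the matching state.
    for episode, result in zip(episodes, results):
        episode.apply_evaluation(result)
    return len(batch)


episodes = [Episode(state=(i,)) for i in range(4)]  # N = 4 concurrent episodes
batch_size = batched_search_step(episodes, dummy_model)
```

With N episodes in flight, each search step costs one model call on a batch of N states instead of N separate calls, which is where the batching gain comes from in this sketch.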