After DeepMind released their "Player of Games", I decided it would be a good Christmas project to make an AI for a game I've always loved: Liar's Dice (or Dudo).
I realized there was a really simple way to implement Counterfactual Regret Minimization (CFR) with a value neural network trained from self-play.
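The core update in CFR is "regret matching": each player accumulates counterfactual regrets for the actions not taken, and the next strategy plays each action in proportion to its positive regret. Here is a minimal sketch of that step in Python (illustrative only, not the code from the repo):

```python
import numpy as np

def regret_matching(regrets):
    """Turn a vector of accumulated counterfactual regrets into a strategy.

    Actions with positive regret are played in proportion to that regret;
    if no action has positive regret, fall back to the uniform strategy.
    """
    positive = np.maximum(regrets, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    return np.ones_like(regrets) / len(regrets)

# Example: regrets [1, 3, -2] give strategy [0.25, 0.75, 0.0]
strategy = regret_matching(np.array([1.0, 3.0, -2.0]))
```

Averaging these strategies over many iterations of self-play is what converges toward a Nash equilibrium.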
It starts learning from completely random play, but after about a million games the model is close to the Nash equilibrium. The PyTorch model is converted to ONNX and runs entirely in the browser.
The code is on https://github.com/thomasahle/liars-dice . I will try to write a blog post about how it works later.