Stockfish would beat AlphaZero handily. Even the original paper's test conditions were not fair, so Stockfish may have been much better than the results of the original matches made it seem.
AlphaZero would not be competitive in its original implementation. Its neural net is too small (this is why it took only 4 hours to train: it stopped learning after 4 hours). Stockfish 14 just doubled its network size relative to Stockfish 13.
Stockfish 8 actually won games against AlphaZero. But Stockfish 11 (which is still a classical-evaluation engine with no neural net support) totally decimates Stockfish 8, and Stockfish 13 (which uses a neural net) totally decimates Stockfish 11. Stockfish 14 just got 30 Elo points stronger than 13.
As it stands now, Stockfish 14 is the strongest chess entity humanity has ever seen.
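For a sense of scale on that 30-point figure, the standard Elo expected-score formula converts a rating gap into an expected score per game. This is only a rough sketch: the function name is made up for illustration, and engine rating lists measure Elo under their own test conditions, so the numbers are a guide rather than a prediction.

```python
def expected_score(elo_diff: float) -> float:
    """Expected score per game (win = 1, draw = 0.5, loss = 0)
    for the side that is rated `elo_diff` points higher."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


if __name__ == "__main__":
    # Gaps chosen for illustration; 30 is the figure quoted above.
    for diff in (0, 30, 100, 200):
        print(f"+{diff} Elo -> expected score {expected_score(diff):.3f}")
```

By this formula a 30-point edge works out to an expected score of roughly 54%, so it shows up over a long match rather than in any single game.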
I think this is a good idea, but in no way should it be the only constrained game. Others could be time constraints (total time per game and/or total time per move), depth constraints (if they both work off of BFS/similar), and probably many other constraints that those more familiar with the engines can come up with.
Most chess games are already time constrained (though a ref can call it if someone in a non-winning position is trying to run out the opponent's clock).
[edit]
I was imperfectly remembering the rules. If it is theoretically impossible for the player with time left to win (even with blunders by the player who is out of time), then it is a draw.
In addition, a player with less than 2 minutes on the clock may request a draw; see Article 10.2 which includes this subsection:
> a. If the arbiter agrees the opponent is making no effort to win the game by normal means, or that it is not possible to win by normal means, then he shall declare the game drawn. Otherwise he shall postpone his decision or reject the claim.
I'm not super into chess, but I was under the impression that running out the clock was a totally legitimate tactic. I'm surprised to hear that a referee has discretion to end the game based on it.
The issue is when positions are completely equal and there is no reasonable way to make progress. It might still be technically possible to win in such positions, but it would require someone to make extremely bad moves and almost certainly lose. If there is no rule that forces draws in such positions, then players will just keep moving pieces without purpose until either someone's time runs out or the 50-move rule (no capture or pawn move) or threefold repetition kicks in.
An arbiter 100% cannot stop a game because of that. Time management is part of the game in speed chess anyway, and for longer time controls there is usually a delay/increment so running out the clock in a clearly lost position isn't viable.
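To see why an increment takes the sting out of clock-running, here is a toy sketch of a clock with a per-move increment. The function name and all the numbers (2 minutes left, 30-second increment, 20 seconds per move) are hypothetical and picked only for illustration; real time controls vary by event.

```python
from typing import Optional


def remaining_after_moves(start_seconds: float,
                          increment_seconds: float,
                          seconds_per_move: float,
                          moves: int) -> Optional[float]:
    """Time left after `moves` moves when an increment is added after
    each completed move, or None if the clock hits zero first."""
    clock = start_seconds
    for _ in range(moves):
        clock -= seconds_per_move      # thinking time spent on this move
        if clock <= 0:
            return None                # flag fell before the move was made
        clock += increment_seconds     # increment credited after the move
    return clock


if __name__ == "__main__":
    # Hypothetical scenario: 2 minutes left, 20 seconds spent per move.
    print(remaining_after_moves(120, 30, 20, 50))  # with a 30 s increment
    print(remaining_after_moves(120, 0, 20, 50))   # with no increment
```

In this toy setup, a player who moves faster than the increment gains time on every move, so the opponent cannot simply wait for the flag to fall; with no increment the same player runs out within a handful of moves.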
I slightly misremembered; the player stops the clock and calls the arbiter; if the arbiter agrees that their opponent is not attempting to win by normal means, the arbiter may award a draw. If the arbiter disagrees with the player making this claim, the opponent gets a 2-minute bonus.
I am not a chess player but I have never heard of referees stopping the game if you don't "give up" in a losing position and have extra time. That sounds ridiculous but I would love to know if it applies in certain tournaments and the reasoning behind it.
You are allowed to keep thinking as long as you have time on your clock. It isn't really considered good sportsmanship, but it is legal.
A recent instance I saw: an adult was in an almost-lost position with over an hour on the clock while his opponent had 10 minutes. He let his clock run down to almost nothing, then played quickly before finally letting the clock run to zero in a lost (mate in 2) position. He got mocked for this in the local forums.
It's also common for a kid to make a blunder and then sit there sad/crying for an hour. You try to encourage them to resign, though.
All fun ideas, but power draw and/or hardware cost limits seem essential for a fair game whereas depth limits and no castling are rather just interesting experiments.
Also, time limit doesn't seem that different from a power draw limit if you're a computer.
Given that AlphaZero will presumably never be publicly available, I think you might be interested in TCEC which has fair fights between Stockfish and LeelaChessZero (which Stockfish has won recently).
> I think you might be interested in TCEC which has fair fights between Stockfish and LeelaChessZero
This is pretty questionable in my judgment, actually. TCEC's GPU hardware is 4x Nvidia V100 data center class GPUs, with a pretty powerful processor to boot. A quick search suggests that ONE of these will run you close to $10k, so we're talking about an all-in system worth mid five figures.
Meanwhile, the CPU hardware is pretty dated at this point. They have 4x Intel E5-4669v4, which is from early 2016. It's not easy to find this processor for sale any more (because, again, it's old), but prices seem to run in the $750 - $1500 range if you look on places like eBay. Even on eBay, a V100 is likely to run you $7K+.
I don't know that it's possible to compare "performance" between GPUs and CPUs in a one to one way, but looking at cost, it seems pretty clear that you'd have to spend a lot more to get a system that allows Leela to play at the kind of level you see on TCEC.
Looking at power consumption tells a similar story. Nvidia's data sheet for the V100 shows a maximum power consumption of 250 watts per GPU, so 1000W when running at maximum load (as a chess engine is presumably likely to do). Meanwhile, Intel places the TDP of the E5-4669v4 CPU at 135 watts. Even assuming they're undershooting that by a bit, we're probably talking 600 watts for that system ... on a rather old CPU model.
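The arithmetic behind that comparison, as a quick sketch. The wattages are the ones quoted above; the variable names are just for illustration, and the sketch assumes four of each chip and ignores everything else in either box (RAM, cooling, and the host CPU that the GPU machine also needs).

```python
# Figures quoted in the comment above; counts assume 4 GPUs and 4 CPUs.
GPU_COUNT, GPU_MAX_WATTS = 4, 250   # Nvidia V100 maximum power per card
CPU_COUNT, CPU_TDP_WATTS = 4, 135   # Intel E5-4669v4 TDP per socket

gpu_total = GPU_COUNT * GPU_MAX_WATTS   # GPU side, rated maximum
cpu_total = CPU_COUNT * CPU_TDP_WATTS   # CPU side, rated TDP
cpu_padded = 600                        # the comment's rounded-up estimate

print(f"GPU side: {gpu_total} W")
print(f"CPU side: {cpu_total} W rated, {cpu_padded} W with headroom")
print(f"Ratio (rated):  {gpu_total / cpu_total:.2f}x")
print(f"Ratio (padded): {gpu_total / cpu_padded:.2f}x")
```

On rated figures that is 1000 W against 540 W, which is where the roughly 600 W estimate for the CPU system comes from once some headroom is added.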
I'd say it's not a fair comparison. I'm not mad about it, because at the end of the day computer chess tournaments are for entertainment. It's much better if the best neural net programs are competitive with more traditional chess engines, even if by "objective" standards they are weaker.
TCEC's goal is to keep the ratios similar to the AlphaZero paper, not to match price, power, or any other benchmark. Increasing the hardware of one would require an increase in the hardware of the other. But the hardware is donated, so it's hard to be too critical.
But why are we comparing used 2021 prices when this hardware wasn't purchased in today's market? Especially when GPUs are 1.5-2x MSRP right now. Even very old GPU prices are insane. I recently sold a 980 Ti for nearly what I paid for it new. The E5-4669v4's MSRP was $7k, so they are not far off. The V100 is pretty dated too, as it is from 2017, and doesn't have FP16 which is heavily used by Leela. For this and several other reasons, a single 3090 is actually faster than 4x V100s according to their own benchmarks[1]. A single V100 is approximately equal to a 3080 or 2080 Ti in performance.
Maybe you should also check out the CCCC[2] hardware, which is even stronger for both: 2x A100 vs 2x AMD EPYC 7H12.
I'm inclined to agree. I am more impressed by an engine that plays well on limited hardware than one that plays well on faster/more expensive hardware.
This is one of the reasons Core War was so intriguing; all the programs battling it out were running on the same hardware, each given an even slice of compute time. To win, you must then find ways to do the same amount of work in less time, while keeping your footprint small.
When the day comes (and I think it will, if our civilization lasts long enough) that a computer finally "solves" chess, it will be a momentous achievement, but ultimately boring.
I think TCEC is as close to a fair fight as you're going to get. Obviously, it can't be perfect. A few caveats to your comments (which I mostly agree with). One is that it's easier to add GPUs to a high-end machine than CPUs. I'm surprised a 4-socket motherboard (evidently) exists. Secondly, at the consumer level, you can get a consumer GPU with similar processing power to the datacenter GPUs for much less.