That's great. There is also this effort in New Zealand: https://mackwelloco.com/, focusing on "Modern Steam" techniques, with a small engine, targeting areas without easy access to diesel.
Steam engines are fascinating pieces of technology. The page for the trust itself is full of interesting engineering details: https://www.a1steam.com/.
Btw, some great books on Steam locomotives/engines:
https://www.amazon.com/gp/product/B07NQ9JG2M - American steam locomotives, 1880–1960. The author drove locomotives and was the transportation curator at the Smithsonian. He wrote it over 30 years, and you can tell how much care he put into the details and the history.
https://www.amazon.com/gp/product/B072BFJB3Z - The Perfectionists: How Precision Engineers Created the Modern World. Talks about how precision engineering was critical to the invention and wider use of steam engines.
Right now there is a lot of experimentation to try adjusting the network architecture. The current leading approach is a much larger net which takes in attack information per square (eg. is this piece attacked by more pieces than it's defended by?). That network is a little slower, but the additional information seems to be enough to be stronger than the current architecture.
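To make the idea concrete, here is a toy sketch of the kind of per-square "attack balance" input feature described above, restricted to knight moves only for brevity (a real implementation would of course cover every piece type, pins, etc. — this is purely illustrative):

```python
# Toy sketch of a per-square attack-balance input feature, knights only.
# Squares are (file, rank) pairs with 0 <= file, rank < 8.

KNIGHT_DELTAS = [(1, 2), (2, 1), (2, -1), (1, -2),
                 (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

def knight_attacks(sq):
    """Squares a knight on `sq` attacks, staying on the board."""
    f, r = sq
    return {(f + df, r + dr) for df, dr in KNIGHT_DELTAS
            if 0 <= f + df < 8 and 0 <= r + dr < 8}

def attack_balance(white_knights, black_knights):
    """For each square: (# white attackers) - (# black attackers)."""
    balance = {}
    for f in range(8):
        for r in range(8):
            sq = (f, r)
            w = sum(sq in knight_attacks(k) for k in white_knights)
            b = sum(sq in knight_attacks(k) for k in black_knights)
            balance[sq] = w - b
    return balance
```

A feature plane like this gets fed to the network alongside the usual piece placements, trading a slower forward pass for richer input.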
Btw, the original Shogi developers really did something amazing. The nodchip trainer is all custom code, and trains extremely strong nets. There are all sorts of subtle tricks embedded in there as well that led to stronger nets. Not to mention, getting the quantization (float32 -> int16/int8) working gracefully is a huge challenge.
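For a flavor of why the quantization step is delicate, here is a simplified sketch of symmetric linear quantization (float -> int8). This is not the actual nodchip code — the scale choice, rounding, and clipping below are all simplified for illustration:

```python
# Simplified symmetric quantization: map floats to signed integers with a
# shared per-layer scale. Real NNUE pipelines are considerably subtler
# (per-layer ranges, accumulator headroom, saturation behavior).

def quantize(weights, bits=8):
    """Quantize a list of floats to signed `bits`-bit integers."""
    qmax = 2 ** (bits - 1) - 1                      # e.g. 127 for int8
    scale = (max(abs(w) for w in weights) / qmax) or 1.0
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Approximate recovery of the original floats."""
    return [x * scale for x in q]
```

The hard part in practice is making sure the accumulated error across layers doesn't visibly change the evaluation, which is where the subtle tricks come in.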
Just wanted to say thanks for many years of fantastic work on both Stockfish and Leela. The computer chess community owes you a huge debt of gratitude!
Interesting how before A0 it was mainly "search matters the most", with crazy low branching factors to get deeper. It seems that humans were just better in search heuristics than in evaluation ones.
The great thing about these community driven efforts is that it is indeed feasible to reproduce these super expensive efforts. I'm a bystander now, as new maintainers have taken over, and they are doing a fantastic job pushing things forward.
This is also how Stockfish got to be the #1 engine. By being open source, and having the testing framework (https://tests.stockfishchess.org) use donated computer time from volunteers, it was able to make fast, continuous progress. It flipped what was previously a disadvantage (if you are open source, everyone can copy your ideas), into an advantage - as you can't easily set up a fishtest like system with an engine that isn't already developed in public.
Markdeep is fantastic. I wrote a fairly long article [1] with a lot of code samples, and it was so nice to not worry about formatting. The styling is elegant without having to try :).
James put together a really nice summary of the ideas and the projects!
It was almost a year ago that lc0 was launched, since then the community (led by Alexander Lyashuk, author of the current engine) has taken it to a totally different level. Follow along at http://lczero.org!
Gcp has also done an amazing job with Leela Zero, with a very active community on the Go side. http://zero.sjeng.org
Of course, DeepMind really did something amazing with AlphaZero. It’s hard to overstate how dominant minimax search has been in chess. For another approach (MCTS/NN) to even be competitive with 50+ years of research is amazing. And all that without any human knowledge!
Still, Stockfish keeps on improving - Stockfish 10 is significantly stronger than the version AlphaZero played in the paper (no fault of DeepMind; SF just improves quickly). We need a public exhibition match to settle the score, ideally with some GM commentary :). To round out the links, you can watch Stockfish improve here: http://tests.stockfishchess.org.
MCTS is not a traditional depth first minimax framework. Key concepts like alpha-beta don’t apply. Although it is proven to converge to minimax in the limit, the game trees are so large this is not relevant. You could use the network in a minimax searcher, but it’s so much slower than a conventional evaluation function it’s unlikely to be competitive.
It is kind of the case, but it does not need to expand the whole node to find the maximum. It samples some children instead from a NN (the Monte Carlo aspect)
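To sketch how that child selection works: AlphaZero-style MCTS scores each child by its running value estimate Q plus an exploration bonus U weighted by the policy network's prior. Something like the following (the exploration constant here is illustrative, and real implementations differ in details like the +1 terms):

```python
import math

# PUCT-style child selection, as used in AlphaZero-style MCTS.
# Each child carries: P (prior from the policy net), N (visit count),
# W (total backed-up value). c_puct trades exploitation vs. exploration.

def puct_select(children, c_puct=1.5):
    """Pick the child maximizing Q + U. `children` is a list of dicts."""
    total_n = sum(c["N"] for c in children)

    def score(c):
        q = c["W"] / c["N"] if c["N"] else 0.0          # mean value so far
        u = c_puct * c["P"] * math.sqrt(total_n + 1) / (1 + c["N"])
        return q + u

    return max(children, key=score)
```

Note how an unvisited child with a high prior can outrank a well-visited one — that is the "sampling some children" behavior in practice: low-prior moves may simply never be expanded.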
I strongly suspect alphazero is easily beatable, once you have your hands on it. This is just from experience that most neural network style systems are weak against adversarial opponents who understand their internals.
Of course I can't be sure, because Google refuses to give out anyone access to alphazero, or a network trained with it. Personally, that gives me more confidence they know there are significant exploitable weaknesses.
No need to wait for AlphaZero, you can try Leela Chess Zero today. From my experience the network without search has some blind spots, but the tree search is pretty effective in fixing them.
Adversarial? If the model exclusively trains against itself, you can’t really insert anything there. Do you mean, play confusing moves at the beginning of the game?
The way adversarial attacks trick image recognition systems is by varying pixels of the input image slightly to manipulate the output of the neural network.
For AlphaZero, the input is the board, which you can't manipulate arbitrarily. You can run an evaluation of a board after a move, see if it's significantly different from the evaluation AlphaZero comes up with, and maybe try to exploit that. But if you have a better evaluation of some state than AlphaZero does, you're likely a stronger player anyway, so this extra step is unnecessary. Most of the value of the bot comes from the evaluation function of a board, along with some hyper-parameters - and the evaluation is both the most important part and the most difficult to replicate.
That doesn't follow. For you to confuse it, you need to change the inputs. For images, this is fine, we can smoothly change lots of little things. For chess games or go you don't have that freedom.
The current best weights are available. They're not AlphaZero's, but I would expect such issues to be general: if there are issues with Leela Zero, they may transfer, and if you don't see issues with Leela Zero, they're unlikely to exist in AlphaZero (and if they do, they may be very particular to subtle training differences).
Would be very interested to see what you find if you get the chance.
You can change the inputs: it depends on when (ply) and which move you play. Some moves are uncommon enough to make it possible for you to uncover something?
You absolutely can change the inputs, but the point I wanted to make is that unlike images, where you can make human-irrelevant changes, you can't really do that with chess or go.
If you want to construct a particular position on the board, you'd likely need to use multiple steps, require the AI to play very particular moves and then the outcome would be a certain move from the AI. Even then, a simple incorrect classification doesn't help all that much, you need your opponent to make repeated mistakes.
I think in reality if you uncovered a type of move it wasn't expecting you are likely to uncover a new strategy in general rather than a trick. Image classification however lets you play uninterrupted with tiny pixel value changes, and you only need a single incorrect output to "win".
I suspect it's a bit harder for the network to be overfit like this, though it probably does have some gaps in its evaluation. However, those gaps would have to persist beyond its search horizon and not concretely affect material or mobility, and it just seems vanishingly unlikely you'll find any systematic way to exploit anything.
If you are interested in compression at all, be sure to take a trip through Charles Bloom's blog [1]. It's an incredible read, he covers everything from the basics all the way through state of the art algorithms.
A great example is this post [2], where he talks about how to correctly implement a Huffman encoder/decoder. It's a lot trickier than most books make it sound. For example, most Huffman codes used in practice are length limited, to allow the decoder to use smaller lookup tables. There are a bunch of surprisingly interesting tricks to get that to work well from the encoding side (which symbols do you shorten, and which do you lengthen, relative to the optimal code?).
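One piece of this that's easy to show concretely is the canonical code assignment: once you've settled on (length-limited) code lengths, the actual bit patterns can be derived purely from the lengths, which is what makes small table-driven decoders possible. A minimal sketch, following the same counting scheme DEFLATE uses:

```python
# Sketch: assigning canonical Huffman codes from code lengths alone.
# Codes of the same length are consecutive integers; moving to a longer
# length left-shifts the running code. This is why a decoder only needs
# the lengths (not the tree) to reconstruct the whole code.

def canonical_codes(lengths):
    """lengths: dict symbol -> code length. Returns symbol -> bitstring."""
    symbols = sorted(lengths, key=lambda s: (lengths[s], s))
    codes, code, prev_len = {}, 0, 0
    for s in symbols:
        code <<= (lengths[s] - prev_len)   # extend when the length grows
        codes[s] = format(code, "0{}b".format(lengths[s]))
        code += 1
        prev_len = lengths[s]
    return codes
```

The genuinely tricky part Bloom discusses — choosing which lengths to clamp when limiting the maximum — happens before this step; the assignment itself is mechanical.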
I like Charles Bloom's blog, but I'm not sure if it's very approachable (after all, it's "rants" :-). If you want more diversity in the readings, ryg blog [1] makes good reading. Start with the most recent series on efficiently reading bits [2].
What I would really love to know is how they get those results with their newest set of algorithms. They seem to beat every compression algorithm, including zstd and lz4, in both ratio and speed. And how we can all benefit from those improvements.
Check out https://github.com/glinscott/leela-chess. We are getting close to kicking off the distributed version now that we have validated it's possible to get a strong network through supervised training.
Question: when you switch to self-play reinforcement learning, do you plan on starting from the network obtained in supervised learning, or tabula rasa? I understand starting tabula rasa will require more computing power/time, but if you start from the supervised learning network, isn't there a risk you inherit human biases in playing style? It would also defeat the purpose of having the system discover existing chess theory, and possibly new theory.
There is a public distributed effort happening for Go right now: http://zero.sjeng.org/. They've been doing a fantastic job, and just recently fixed a big training bug that has resulted in a large strength increase.
I ported over from GCP's Go implementation to chess: https://github.com/glinscott/leela-chess. The distributed part isn't ready to go yet, we are still working the bugs out using supervised training, but will be launching soon!
We are using data from both human grandmaster games and self-play games of a recent Stockfish version. Both have resulted in networks that play reasonable openings, but we had some issues with the value head not understanding good positions. We think we have a line on why this is happening (too few weights in the final stage of the network), but this is exactly the purpose of the supervised learning debugging phase :).
This is really cool! The chart on that page makes it look like Leela Zero is already much much better than AlphaZero (~7400 Elo vs ~5200 Elo). I suspect I'm misinterpreting something though, could you clarify?
Leela Zero's Elo graph assumes that 0 Elo is completely random play, as a simple reference point.
On the other hand, AlphaGo uses the more common Elo scale, where 0 is roughly equivalent to a beginner who knows the rules, so you can't directly compare the two.
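The reason the anchor matters is that Elo only ever encodes rating *differences* — the standard relation maps a rating gap to an expected score, so shifting both scales by a constant changes every absolute number without changing any predicted result:

```python
# The standard Elo relation: expected score of A vs. B depends only on
# the rating gap, not on where zero is anchored.

def expected_score(r_a, r_b):
    """Expected (probability-weighted) score of player A against B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
```

So Leela Zero's ~7400 vs. AlphaZero's ~5200 says nothing about head-to-head strength; only gaps measured on the same ladder are comparable.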
I've been having fun following along with Leela Zero, it's a great way to understand how a project like this goes at significant scale. Good luck with Leela Chess, I'm excited for it!
Sequential probability ratio test: essentially, a test that distinguishes between two hypotheses with high probability, while stopping as soon as the accumulated evidence allows.
In the case of Leela Zero the idea is to train new networks continuously and have them fight against the current best network, which is replaced only when a new network is statistically stronger, according to SPRT.
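Here is a deliberately simplified SPRT sketch on plain win/loss outcomes. Real engine-testing setups model draws (and often game pairs) rather than a Bernoulli trial, and the hypothesized rates below are made up for illustration — this only shows the core accept/reject logic:

```python
import math

# Simplified SPRT over win/loss results. H0: win rate p0, H1: win rate p1.
# The log-likelihood ratio accumulates per game; the test stops as soon as
# it crosses either Wald bound, otherwise more games are played.

def sprt(wins, losses, p0=0.50, p1=0.52, alpha=0.05, beta=0.05):
    """Return 'H1' (new net stronger), 'H0' (not stronger), or 'continue'."""
    llr = (wins * math.log(p1 / p0)
           + losses * math.log((1 - p1) / (1 - p0)))
    lower = math.log(beta / (1 - alpha))     # accept H0 below this
    upper = math.log((1 - beta) / alpha)     # accept H1 above this
    if llr >= upper:
        return "H1"
    if llr <= lower:
        return "H0"
    return "continue"
```

The appeal over a fixed-length match is that clearly stronger (or clearly weaker) networks get decided in far fewer games, so testing resources concentrate on the close calls.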
Congratulations to Stockfish! The community is amazing, and the patches keep on flowing. The sheer number of ideas is pretty incredible. If you are interested in contributing, head over to http://tests.stockfishchess.org/tests. You can submit a test, and it will be run by the virtual cluster of user donated machines.
It's been over four years since I put fishtest up, and in that time, there have been over 20,000 tests submitted. The really cool thing is that this distributed testing framework is only possible with an open source engine. So instead of being a disadvantage (everyone can read your ideas), it turns into an advantage!
Basically, we use a two-phase test to maximize testing resources. First a short time control test (15s/game), using more lenient SPRT termination criteria, then, a long time control (60s/game) test using more stringent criteria. That combined with setting the SPRT bounds to allow us to measure 2-3 ELO improvements has allowed the progress of Stockfish to be almost only improvements. Previously when developing an engine, you'd make 10 changes, and if you were lucky, 2 or 3 would be good enough to make up for the other bad or neutral ones.
If you look at the graphs on http://www.sp-cc.de/, you can see that it just keeps getting better, one small improvement at a time.
A good overview of modern steam techniques is here: http://advanced-steam.org/5at/principles-of-modern-steam/, where they focus on dramatically more efficient and clean burning engines. A fascinating rabbit hole to go down :).