I absolutely love love love seeing things like this.
I've been doing a fair bit with Monte Carlo search lately to explore different spaces, but mine has been in the context of Llama.cpp and Magic: The Gathering. [1] I'm going to compare and contrast my approaches with the approach used by the authors.
First, I want to ask a question (either to the authors, or to the general audience), about trees vs. DAGs. The author wrote:
> It was only after the initial tree implementation that we considered that there would be a lot of repeated wells: after all, in this game you can reach the same position in a number of different ways. A lot of deliberation and a re-factored codebase later, we opted for a directed acyclic graph (DAG) instead, taking identical positions and merging them into the same entry in a graph, rather than making them distinct entries on a tree. This complicated things significantly, but reduced our memory needs by an order of magnitude, at least. Refactoring from tree searches to DAG searches was more work than we’d expected to put in to the project, but was yielding promising results.
I'm curious about why this was such a large refactor. I've made this change in two different projects now, and in both cases, I was able to change from an exhaustive tree structure to what is essentially a DAG by simply searching for duplicate states and culling any branches that are equivalent. For instance, I implemented this change in Llama.cpp and got a >400% speedup -- and it was essentially a two-line change -- no drastic refactoring needed. [2]
Am I missing something here? How is a culled tree structure fundamentally different from a "true" DAG -- and more importantly -- would I stand to gain even more performance by switching to something else? I don't see what's so complicated about this, but I feel like I must be missing something.
> we discovered that the majority of our time was now spent accessing the hash of positions, since whenever a new positions was explored, we had to check if it already existed. A bit of a further dig and we discovered that most of that time was spent running a hashing algorithm on the positions data for comparison. Which again, made sense…but we knew all our positions were unique among each other. We didn’t need to do any hashing, we could just use the position’s representation as a key directly, if we could find a reasonable way to encode it. So we replaced the hashing algorithm with our own custom version that encoded the position as a binary representation of position, rotation and piece type. This was much faster than having to hash a position, and we knew it guaranteed uniqueness. This doubled the speed of our move finder.
In the case of my Magic: The Gathering searcher, I am using a text summary of the game state (essentially `gameState.toString()`) -- what's on the battlefield, what's in the hand, what's in the graveyard, how much mana is available, etc -- as the key. I sort the cards in each location alphabetically, so that it doesn't matter which order they were put there -- only what's actually there. This GREATLY increased the speed of execution, but again -- it was just a simple change.
That said, the fire graph showing how much time is spent in string processing is eye-opening, and it makes me think that I should probably profile my Python code with a similar tool, because a binary representation might be worth it.
It's also fascinating to me that the author attempted to train an ML model to play the game. My initial thought was that an exhaustive search was the only way to approach this (because it's perhaps the only way to arrive at a max score that is provably correct), but I think I am not truly appreciating the vastness of the search space at play here, and exhaustively searching all possible inputs might just be too unimaginably large.
All that said, this read has kept me on the edge of my seat. Kudos to the developers and authors for writing this up for us -- and CONGRATULATIONS ON CAPTURING THE RECORD!!! Even though it was later surpassed, that should in no way diminish the satisfaction of the accomplishments here.
It's an absolute blast reading this. I've enjoyed it, I've learned a ton, and I'm going to try implementing some of their suggestions in my own projects!! Thank you!! It's articles like this that keep me coming back to Hacker News day after day. :)
> It's also fascinating to me that the author attempted to train an ML model to play the game. My initial thought was that an exhaustive search was the only way to approach this (because it's perhaps the only way to arrive at a max score that is provably correct), but I think I am not truly appreciating the vastness of the search space at play here, and exhaustively searching all possible inputs might just be too unimaginably large.
I think when we did the math with our best emulator it would take 100 billion years? There are, by our estimation or a game lasting a thousand moves 10^23 possible wells. With current computing, even if we threw all the resources of the planet at it, we'd still be a few orders of magnitude short of finishing in our lifetimes. It is a very very very large search space.
>Searching for duplicate states and culling any branches that are equivalent.
Branch pruning and comparison are both expensive relatively speaking. You want, in a perfect world, to essentially have a hashmap lookup for duplicate nodes, so deciding if its new or not is O(1), rather than having to look at your node and all its children to know if it's duplicated. It doesn't matter unless you're really chasing performance.
Also this was our first serious rust project so implementing it was a bit more daunting than expected. In a sane codebase it's probably not a bad refactor, at the point we did it we had six methods with a bunch of stuff jammed into them and names like `explore_tree` so it was a challenge, made significantly easier by rust types, and more frustrating by the borrow checker.
I've been doing a fair bit with Monte Carlo search lately to explore different spaces, but mine has been in the context of Llama.cpp and Magic: The Gathering. [1] I'm going to compare and contrast my approaches with the approach used by the authors.
First, I want to ask a question (either to the authors, or to the general audience), about trees vs. DAGs. The author wrote:
> It was only after the initial tree implementation that we considered that there would be a lot of repeated wells: after all, in this game you can reach the same position in a number of different ways. A lot of deliberation and a re-factored codebase later, we opted for a directed acyclic graph (DAG) instead, taking identical positions and merging them into the same entry in a graph, rather than making them distinct entries on a tree. This complicated things significantly, but reduced our memory needs by an order of magnitude, at least. Refactoring from tree searches to DAG searches was more work than we’d expected to put in to the project, but was yielding promising results.
I'm curious about why this was such a large refactor. I've made this change in two different projects now, and in both cases, I was able to change from an exhaustive tree structure to what is essentially a DAG by simply searching for duplicate states and culling any branches that are equivalent. For instance, I implemented this change in Llama.cpp and got a >400% speedup -- and it was essentially a two-line change -- no drastic refactoring needed. [2]
Am I missing something here? How is a culled tree structure fundamentally different from a "true" DAG -- and more importantly -- would I stand to gain even more performance by switching to something else? I don't see what's so complicated about this, but I feel like I must be missing something.
> we discovered that the majority of our time was now spent accessing the hash of positions, since whenever a new positions was explored, we had to check if it already existed. A bit of a further dig and we discovered that most of that time was spent running a hashing algorithm on the positions data for comparison. Which again, made sense…but we knew all our positions were unique among each other. We didn’t need to do any hashing, we could just use the position’s representation as a key directly, if we could find a reasonable way to encode it. So we replaced the hashing algorithm with our own custom version that encoded the position as a binary representation of position, rotation and piece type. This was much faster than having to hash a position, and we knew it guaranteed uniqueness. This doubled the speed of our move finder.
In the case of my Magic: The Gathering searcher, I am using a text summary of the game state (essentially `gameState.toString()`) -- what's on the battlefield, what's in the hand, what's in the graveyard, how much mana is available, etc -- as the key. I sort the cards in each location alphabetically, so that it doesn't matter which order they were put there -- only what's actually there. This GREATLY increased the speed of execution, but again -- it was just a simple change.
That said, the fire graph showing how much time is spent in string processing is eye-opening, and it makes me think that I should probably profile my Python code with a similar tool, because a binary representation might be worth it.
It's also fascinating to me that the author attempted to train an ML model to play the game. My initial thought was that an exhaustive search was the only way to approach this (because it's perhaps the only way to arrive at a max score that is provably correct), but I think I am not truly appreciating the vastness of the search space at play here, and exhaustively searching all possible inputs might just be too unimaginably large.
All that said, this read has kept me on the edge of my seat. Kudos to the developers and authors for writing this up for us -- and CONGRATULATIONS ON CAPTURING THE RECORD!!! Even though it was later surpassed, that should in no way diminish the satisfaction of the accomplishments here.
It's an absolute blast reading this. I've enjoyed it, I've learned a ton, and I'm going to try implementing some of their suggestions in my own projects!! Thank you!! It's articles like this that keep me coming back to Hacker News day after day. :)
* [1] - https://github.com/HanClinto/mtg_belcher_montecarlo
* [2] - https://github.com/ggerganov/llama.cpp/pull/6616