King Domino: How to Build an Intelligent Graph Search?

If you’re lucky, you had a chance to play this tight and quick gem called King Domino. Don’t take my word for it, take Germany’s:

I won’t waste time teaching you the rules here, as you can find them on YouTube. But to score a grid, for each connected component, multiply its size by the number of kings.

So now, I assume you know the rules, you love the game, and you’re as curious as I am about how to algorithmically think about it. Well, you’re in good company. I’m really excited about the simplicity of the rules and the path-dependent nature of the grid building. So excited in fact, that I built out a platform to play the game in Python:

The premise is to abstract as classes the following components:

  • Domino
    • One two-sided carboard piece, replete with nostalgic artwork
  • Side
    • Two of which comprise the Domino
  • Board
    • The dynamically growing grid comprised of dominos
  • Move
    • The choice of placing a given domino to expand the existing board
    • This is of course what we want to be smart about using Artificial Intelligence

Theese classes have nice features implemented like __hash__, __eq__, __ne__, __repr__. This allows for a lot of convenience, and one can work with Python sets of Moves or Dominos and so forth. Then, the game can simply be played by choosing dominos and adding Dominos to a Board via the Move object.

Note this is single-player King Domino (for now).

The King Domino Board is simply a 9×9 grid with the Castle in the center. It doesn’t feel this way when you’re playing the game, but a moment’s reflection will ensure you this is a safe way to think about it Since this is coded in Python, we will begin indexing at 0 and so the center castle is placed at (4,4). An acceptable Move is then anything adjacent to that. Moves are instanced by specifying a side and a coordinate (i, j) for that side. Then the rotation is specified via (Di, Dj) so that the second side is placed at (i + Di, j + Dj).

Example code to place a Domino at (4,5 / 5,5) would be:

import kingDomino as kD
myBoard = kD.Board()
d = Domino(Side(kings=0, terrain='wheat'), Side(kings=0, terrain='wheat'), 1)
mv = Move(domino=d, first_side=0, i=4, j=5, Di=1, Dj=0)

Now, as you may have noticed, the 9×9 grid representation is actually more than we need. There is a lot of symmetry one can exploit to remove squares. For example, a grid in the upper left of the 5×5 grid can be rotated to be in the upper right, lower right, or lower left. Really, those are all just grids where the Castle is in the corner. It turns out there are only six positions for the Castle to be, as shown here:

The white polka-dotted region covers the six tiles that represent unique places the Castle. Adding a seventh position would be redundant!

So in fact, a computer really only has to think of a subset of the 9×9 grid with a staircase carved out of it. Something like this:

A 9×9 grid provides a nice square way to think about the space for King Domino tiles. Indeed it’s sufficient and makes the programming easy. But if you’re trying to save processing power, you can use the abridged grid depicted by the shaded checkerboard region above. (Note this graphic indexed at 1 instead of 0)

In the github link above, the simpler 9×9 grid is used. One can also imagine a computer intelligence that considers 6 separate 5×5 grids at the beginning of the game. For each, it chooses an optimal move and then applies a supervisory choice to discard grids when they appear to be suboptimal. Ultimately, the computer will have a single 5×5 grid to consider.

One interesting question that arises is: What is the Optimal board? You can try to work it out yourself if you have the game handy. Take a minute or two and give yourself full authority to choose any tile. What’s your score?

Now, in the github link above, you can try your luck at the most naive and computationally slow strategy conceivable to answer that question: a Depth-First Grid Search of all possible King Domino Boards given the 48 tiles. Simply clone and run (Python 3). The good news is that the algorithm is only 12-lines of code, a recursive depth-first search. We don’t have to worry about recursive runaway because the max-depth is obviously limited (you can only place 12 dominos). Let’s see what the board looks like when I run this on my machine and choose an intermediate result:

Local optimum from naive recursive depth-first search of king domino graph. This grid scores 62, and was the forerunner after several hours of searching. That is a very good score and if you built this grid in the game, you’d probably win. You’d do so in spite of only placing 11 dominos.

It is intermediate of course, because the naive implementation of grid-search is too slow. It’s in fact much worse than the grid my 8 year-old nephew was able to generate using his intuition rather than searching every grid:

The grid generated by my 8-year old nephew when he was tasked to build the maximum score grid. It scores a whopping 124! The caves alone nearly beat the local grid search optima of 62. Is this the optimal grid?

Is there a better way to grid-search algorithmically? What do you suggest?

Stay tuned for a much smarter approach to answer the simple max-scoring grid question. For the more nuanced challenge of how to play against intelligent competition – ie: how to optimally auction for tiles and build the grid, can we look to Tensorflow? Can deep learning outmaneuver an amateur human contender?

The conquest continues…

Variance of Catan

Have you ever noticed that every time you play Settlers of Catan, someone invariably casts themselves as the unfortunate recipient of the worst luck to have ever befallen a human soul? I’ve sure done it (after all, it does help garner trades). But there are some cognitive distortions at play. For one, any event that ever happens has probability arbitrarily close to zero. So anytime someone looks back on an event and asks something like, “What was the probability of that???” You can pretty safely say zero, after you condition on virtually everything that happened before it. And that’s the tricky thing with Catan! Even normal-looking histograms feel weird when you live and die by the fate of each and every roll.

This did make me want to look a layer more deeply into the strategy. I of course felt that I won because I placed my houses in the best spots compared to my competitors (and as any Catan player knows, kept as a low a profile as possible!). But my Orange competitor seemed to attribute her loss to a curse, perhaps owing to a crime committed in a former life.

This post assumes you’ve already played Settlers of Catan at least once, so I won’t rehash the rules here. If you haven’t, you should stop reading this and give it a go. It is fun (though I have my quams with the mechanic, I’ll save that for a later post)!

The initial setup (roads removed for clarity). I was brown and the badly hexed, out-of-luck friend was orange. The rules say you roll two 6-sided dice every turn. Whatever tiles are rolled, produce resources you can use to build your strength and ultimately win the game. So a roll of an ‘8’ would give orange & red one “Rock Ore” and give brown (me) one “Wood”.

Well, clearly when placing these little homes, one wants to concentrate on tiles with numbers close to 7 because they are more likely to be rolled on two 6-sided’s. If you add it up, everyone has about the same probability of getting a resources on a given roll. Brown and White have the slight advantage (20/32 resources per roll) vs. Orange and Red (19/32resources per roll). But this is only looking at the first moment. No one invests that way! We’ve got to look at the second moment too, the variance!

Aside – imagine a game. Flip a coin: heads you get $20 Million, tails you get $0. Would you rather play that game or take $10 Million and walk away? Almost anyone would just take the $10 Million. I sure as hell would. Are you kidding? But both options (playing the game vs. taking the $10 Million) actually have the same expected payout of $10 Million. It’s just, they have wildly different variances. In fact, the game has infinitely more variance since taking the money and walking away has zero. Now, what if $10.2 Million was given to you with probably 98%? We’ve added some variance, but most people would still probably choose this option (it’s still the better one for almost everyone). What if it was $18.18 Million at 55%? In all cases, the expected payout is not changing! Only the variance is changing. In other words, given equal expected payout, it often makes sense to choose the strategy with the least uncertainty. Let’s go back to Catan, and the case of the Orange friend.

\begin{tabular}{l|rrrr}  & Brown & Red & Orange & White \\  \hline  Prob(0 Resources) & 12/32 & 13/32 & 16/32 & 12/32 \\  Prob(1 Resource) & 20/32 & 19/32 & 13/32 & 20/32 \\  Prob(2 Resources) & 0 & 0 & 19/32 & 20/32 \\  Expected Payout & 20/32 & 19/32 & 19/ 32 & 20/32 \\  Variance of Payout & 240/1024 & 247/1024 & 439/1024 & 240/1024 \\  Sharpe Ratio & 1.29 & 1.21 & .91 & 1.29 \\  \end{tabular}

It turns out the variance of payoff for Orange is about twice that of the other players. Poor Orange is in a pretty bad position to begin with! It may be that they’ll win of course if the rolls come up all 10’s, but they have a pretty good probability of catching a “bad streak of luck” and getting donuts while everyone else is getting a slow steady stream of resources.

So after we played, I noticed that Catan is a lot like portfolio management. There’s little gain in ramping up variance without increasing the expected return. Even big very sophisticated companies, like Blackrock (which manages Trillions of dollars), handle this in a really simple way. The spirit of it is simply to try to maximize the ratio of expected value to standard deviation. That ratio is called the Sharpe Ratio, and I provide it in the last row of the table above. Brown and White lead the board in that stat with Orange in real trouble.

How did this game turn out?

Final position at end of game. Brown wins with 4 Castles and 2 Settlements. White ends with a respectable 8 points on Longest road, 2 Settlements, and 2 Castles.

There are all kinds of strategies going on in Settlers of Catan (and don’t get me started on trading mistakes). For example, only Brown and White touched every type of resource at the start. But without some kind resource stream, it doesn’t really matter how good you are, or how charismatic. You’re probably going to lose. At least in this single sample point, the Sharpe Ratio would have done a good job at predicting performance. If anyone knows of a good Catan data stream, please post in comments. I will see just how good a job Sharpe Ratio alone does at predicting the final rankings!

The takeaway? Make sure to diversify your numbers! You want a tile with a six and a tile with an eight, not two tiles with eights. Now, go forth and dominate Catan.

R squared

This quantity is often misunderstood and/or misrepresented. Let’s assume we have some random variable y with mean \bar{y} and some estimator for it \hat{y}. The commonly accepted fundamental definition of R^2 is:

R^2=1 - \frac{\sum(\hat{y_i} - y_i)^2}{\sum(\bar{y_i} - y_i)^2} = 1 - \frac{SSE}{SST}

Sometimes people call this the “fraction of explained variance”. But that’s only the right way to look at it under special circumstances. All the equation above shows is that R^2 “compares” the model \hat{y} to the model \bar{y}. It’s a little like comparing a random walk model to something someone clever cooked up hoping to beat it. Note immediately that there’s nothing stopping R^2 from being negative, and so the square in notation is unfortunate in that respect. If you choose a bad enough model, say \hat{y} = \bar{y} - \alpha for any \alpha \in \mathbb{R} with \alpha \neq 0 , well then R^2 will be negative*.

But the comparison made here is not simply comparing the errors of the two models. Instead it compares two variances (which harkens back to Principal Component Analysis, and I will update this blog with that discussion) and then subtracts that ratio from one. If the model for \hat{y} has the property that the errors are zero on average, then the numerator is proportional to the variance of the errors. The denominator is proportional to the variance of y. So, if errors are denoted by \epsilon, that ratio is just \frac{V(\epsilon)}{V(y)}. So whenever the variance of errors are greater than the variance of y, R^2 ends up in negative territory.

In terms of this fraction of explained variance interpretation, that turns out to only be the case when Cov(y, \hat{y}) \geq \frac{1}{2}V(\hat{y}), which follows from simple relation below:

V(\epsilon) = V(y) + V(\hat{y}) - 2Cov(y, \hat{y})

Although that all seems obvious, given the simple equations above, it’s actually *not* always done in mainstream practice. For example, the statsmodels package in Python gets this mixed up. Try running sm.OLS(Y, X, const=False) and checking out the excellent R^2. Better yet, check out statsmodels VIF calc, which makes the same mistake. I will add more decision around that to this blog.

* To see that any constant guess other than the mean itself will result in a negative R squared, consider one such guess as \hat{y} = \bar{y} + \alpha. Then we have,

SSE = \sum (\bar{y} - y_i + \alpha)^2 = \sum \bar{y}^2 + y_i^2 + \alpha^2 + 2 \bar{y}\alpha - 2 y_i \alpha - 2 \bar{y} y_i

=\sum (\bar{y} - y_i)^2 + \alpha^2 - 2\alpha(\bar{y} - y_i)

=SST + \epsilon

and it remains to show that \epsilon is greater than zero:

\epsilon = \sum \alpha^2 - 2\alpha(\bar{y}-y_i) = n\alpha^2 - 2n\alpha \bar{y} + 2\alpha \sum{y_i} = n\alpha^2

And therefore R^2 = 1 - SSE/SST = - \epsilon / SST < 0

Some useful discussions below