02.19.09

Sky-High Boggle Scores with Simulated Annealing

Posted in boggle at 1:09 am by danvk

Don’t let the sixteen month hiatus fool you. There’s just no end to Boggle posts on danvk.org!

In case you’d forgot, we’ve developed a blazing fast boggle solver capable of scoring 10,000 Boggle boards a second. What to do with this? Other than some interesting analyses, the most interesting question is:

What is the highest-scoring Boggle Board?

In this post, we’ll try to answer that question using Simulated Annealing. Here’s a sneak peak at one of the exceptionally word-rich boards we’ll find:

p e r s
l a t g
s i n e
t e r s
3625 points

Follow me past the fold for more…

Here’s the basic idea of Simulated Annealing:

  1. Generate a random board.
  2. Score it.
  3. Change the board somehow.
  4. Score it.
  5. Based on the scores, decide whether to keep the old board or the new one.
  6. Repeat from step 3.

If we make good decisions in step #5, the boards we keep will be progressively higher scoring.

The big decisions come in steps #3 and #5. What does it mean to change a board?

I decided to use two basic permutations to change the board: re-rolling a die:

a b c d
e f g h
i j k l
m n o p
18 points
a b c d
e f g h
i r k l
m n o p
76 points

and swapping two dice:

a b c d
e f g h
i r k l
m n o p
76 points
a b c r
e f g h
i d k l
m n o p
55 points

Clearly one of these changes was beneficial and the other was not.

Generally speaking, we should accept the changes that increase the score in step #5 and reject the changes that don’t. This is a perfectly reasonable strategy. It’s called Gradient Descent and is widely-used.

With Simulated Annealing, we sometimes go with the lower-scoring board. The idea is to not get stuck in a rut. Maybe our board is pretty good. It’s scores higher than all the boards around it. But there are lots of boards a little more distant that are even higher-scoring. With Simulated Annealing, we can take a short-term loss for a long-term gain.

The exact details aren’t too interesting. If they interest you, take a look at my code.

Now for the fun part! Here are the results of a sample run. I’ve concatenated the letters on each row of the board so that they’ll fit on a single line.

Gen. Score Reign Board
0 19 1 kqiqwhyyjthdsibk
1 4 1 iqiqjhyywthdskbk
2 5 1 ibiqjhyywthdskqk
3 9 1 ibiqjhcywthdskqk
4 10 1 ibiqjychsthdwkqk
5 8 1 ibicjyqhsthdwkqk
6 9 1 ibicjyqdsthywkqk
7 23 1 ibicjygdhtsywkqk
8 16 1 ibncjygdhtsywkqa
9 15 1 ibncyygdhtsymkqa
10 15 1 ibncyygwhtsytkda
11 38 1 ibncyygahtsyhkdw
12 146 2 ibscyogahtnyhkdw
14 140 4 ibscyogahtnbhkdw
18 165 2 ibbcyogahtnshkdw
20 159 1 ibbcdogahynshktw
21 169 1 ibbcdogahynshmtw
22 151 4 ibbcdogahynshbtw
26 150 3 hbbcdogaiynshbtw
29 142 2 hbbcdogagynshbtw
31 148 1 hbbcdogaygnshbtw
32 151 1 hbbqdogaygnshbtw
33 174 7 hbtmdogaygnshbtw
40 175 2 hdtmbogaygnshbtw
42 172 3 hdtmbogaygnstbtw
45 165 1 hdtmbogaygnetbtw
46 274 1 hdtmbogaygnetbts
47 298 5 hdtmbogabgnetyts
52 296 8 hdtmbogabgnswyte
60 387 1 hdtmboeabgnswytg
61 403 7 hdtgboeabmnswytg
68 409 3 hdtpboeabmnswytg
71 413 6 hdtpboeabmnsgytg
77 419 3 hptdboeabmnsgytg
80 416 8 hptdbaeobmnsgytg
88 421 9 phtdbaeobmnsgytg
97 449 40 phtdbaeoymnsgytg
137 454 2 phtsbaeoymnsgytg
139 463 11 phtseaeoymnsgytg
150 581 3 phtsmaeoyensgstg
153 662 26 yhtsmaeopensgstg
179 690 1 yhtsmaespenogstg
180 734 2 lhtsmaespenogstg
182 754 13 lctsoaespenogstg
195 849 1 lctsoasepenogstg
196 912 39 lctseasepenogstg
235 974 10 lctseasepenoistg
245 978 10 lctseasedenoistg
255 1046 25 lctseasedsnoietg
280 1281 53 lctseasedsrointg
333 1511 2 lctseasedsronitg
335 1511 8 lctseasedsronitg
343 1515 17 lctseaiedsronitg
360 1537 2 lctseaiedsronitp
362 1643 32 lctseaiedtronisp
394 1708 21 lstseaiedtronisp
415 1727 46 lstseiaedtronisp
461 1744 40 lstseiaedtrinosp
501 1873 36 lstseiaentrinosp
537 2053 9 lstseiaentrisosp
546 2162 13 lstseiaentrigosp
559 2445 42 sltseiaentrigosp
601 2445 21 sltseiaentrigosp
622 2524 58 sltseiaentrigesp
680 2597 37 stlseiaentrigesp
717 2614 232 stlseiaentragesp
949 3090 276 stlseiaentrpgesa
1225 3151 26 stlseiaentrpdesa
1251 3182 n/a stlseiaentrpdeso

Some fun words on that last board: “desperate”, “entreaties”, “operated” and “serenest”. No shortage of 11-pointers on this board!

That’s all for now. Next time we’ll look at some of the highest-scoring boards that simulated annealing finds.

As usual, you can find all the source code for this project on the performance-boggle project at Google Code.

9 Comments

  1. Craig Fratrik said,

    February 19, 2009 at 10:09 am

    Very cool dan.

    The way I way I learned simulated annealing, as time went on, you were less likely to take the lesser board. Like steel particles as they cool during annealing.

    Also, I seem to remember that you considered all boards that were one move away, and typically, you took the best one of those. The annealing part was that sometimes you just took a random choice.

    Finally, it’s a good thing I came to comment. The color of the text didn’t come across in google reader.

  2. Greg said,

    February 19, 2009 at 10:38 am

    Thats a pretty fun little analysis. I could see it being useful too. Imagine someone wanted to create a software-based boggle game for people to play online or whatnot. It would be more fun in general to play on high-scoring boards than low scoring ones, so weighting the algorithm for generating boards to be a little less random could improve the playability of the game.

  3. danvk said,

    February 19, 2009 at 12:41 pm

    Craig: that’s one of the details that interested readers will find in my code (http://code.google.com/p/performance-boggle/source/browse/trunk/anneal.cc — only 150 lines!). I use the metallurgy analogy. There’s an ambient temperature that decreases over time. At lower temperatures, I’m less likely to choose a lower-scoring board. You can see this in the board progression. At first, it’s pretty random. Then it starts to only improve.

    Craig paragraph 2: That’s not an approach I’d considered, but I like it. My code is fast enough that I can easily consider all boards at a distance <= 2. (more on that in a later post)

    Craig paragraph 3: Reader must strip my CSS.

    Greg: Sounds like an idea for a startup! =) It only took about two seconds to do the full annealing run I included in this post, so it would be completely feasible. You could even imagine requesting a low-, medium- or high-scoring board. Or a high-scoring board with a “qu” on it.

  4. Greg said,

    February 19, 2009 at 2:51 pm

    I recall a story of two slot machines that had the exact same payout structure and frequency (ie: if you were blind, they were identical machines), but one would earn the casino far more money.

    The difference was that one of them would show far more “almost wins” where the user was one click of one of the wheels away from winning alot of spare change. The psychological “near miss” kept people at the machine longer. Debatably, they enjoyed the experience more.

  5. Mike said,

    March 15, 2009 at 9:28 pm

    Awesome stuff, Dan!

    We should definitely get lunch together at some point… haven’t seen you in at least a year…

    Also, as a security guy, I present for your amusement: cross-site boggling

  6. danvk.org » Solving Boggle by Taking Option Three said,

    August 4, 2009 at 6:18 pm

    [...] we spoke about Boggle, we used Simulated Annealing to get at the question “what is the highest-scoring [...]

  7. danvk.org » Breaking 3×3 Boggle said,

    August 8, 2009 at 10:51 am

    [...] Find a very high-scoring board (maybe this way) [...]

  8. JohnPaul Adamovsky said,

    February 22, 2010 at 12:04 am

    10,000 Boards/second is impressive.

    Question: How does it scale on multi-core shared memory CPU’s?

    In November 2009, deepsearch.c was capable of scoring 70,000 random 5×5 boards per second with the TWL06 lexicon on a Core2 Q9450 @ 3GHZ.

    How? An immutable finite state automaton operating as a perfect and complete hash table. That’s how.

    RSLCS
    DEIAE
    GNTRP
    SMIDR

    This board is the best Best 5×5 Boggle board for TWL06. It’s worth 10,769 points. The word list is mind boggling.

    search “Boggle deepsearch” on Google. http://www.pathcom.com/~vadco/deep.html

    Final question: Why bother changing more than one tile, when the smallest change possible will lead to the best board every time? Does your algorithm guarantee the highest possible board in a period of time shorter than over night?

  9. JohnPaul Adamovsky said,

    February 22, 2010 at 12:12 am

    Whoops, missed a line in that board.

    RSLCS
    DEIAE
    GNTRP
    ATESE
    SMIDR

    That’s the one, I’d love to be able to edit my comment.

    All the very best,

    JohnPaul Adamovsky