08.04.09

Solving Boggle by Taking Option Three

Posted in boggle at 6:18 pm by danvk

There is exciting news to report in the land of Boggle!

Last we spoke about Boggle, we used Simulated Annealing to get at the question “what is the highest-scoring Boggle board?”

We found a board with 3625 points on it this way. It would have been nice to say that it was the best of all possible boards, but that would have been too rash. While it is a very good board and I have never seen a higher-scoring one, that doesn’t mean there isn’t one out there. Maybe I’ve just been looking in the wrong places.

To prove that the 3625-pointer is the best board, we’d need to show that every other board scores fewer points. In a previous post, I estimated that there were 2^69 possible boggle boards. At 10,000 boards/second, this would take 1.9 billion years of compute time!

When you run up against a computationally intractable problem, there are a few standard ways to deal with it:

  1. Give up.
  2. Come up with a brilliant new algorithm to solve the problem more quickly.
  3. Solve a simpler problem.

Option 1 is the easiest. Option 2 is the hardest. And option 3 is the compromise we’ll be taking for the next few danvk.org boggle posts. Our simpler problem today: 3×3 boggle.

If we drop seven letters (4×4 – 3×3 = 7), we’re left with only 26^9 / 8 ≅ 2^39 boards to consider. If we can achieve 10,000 boards/sec, this would be just over two years’ worth of computation. That’s still a lot, but it’s much more reasonable than 1.9 billion years!

I believe I have solved the 3×3 boggle problem using an approach that is only slightly more clever than this. I used simulated annealing to find very high-scoring 3×3 boards. This was the best one I found, along with its 4×4 best board buddy:


p e r
l a t
d e s
545 points

 

p e r s
l a t g
s i n e
t e r s
3625 points

It’s worth noting that the optimal 3×3 board has a 2×3 region in common with the 3625 point 4×4 board.

In the next post we’ll talk about how I showed that this board was higher-scoring than all 2^39 others in significantly less than two years (it took one day). As a teaser, I’ve included all the 3×3 boards with more than 500 points worth of words below the fold.

Read the rest of this entry »

02.19.09

Sky-High Boggle Scores with Simulated Annealing

Posted in boggle at 1:09 am by danvk

Don’t let the sixteen month hiatus fool you. There’s just no end to Boggle posts on danvk.org!

In case you’d forgot, we’ve developed a blazing fast boggle solver capable of scoring 10,000 Boggle boards a second. What to do with this? Other than some interesting analyses, the most interesting question is:

What is the highest-scoring Boggle Board?

In this post, we’ll try to answer that question using Simulated Annealing. Here’s a sneak peak at one of the exceptionally word-rich boards we’ll find:

p e r s
l a t g
s i n e
t e r s
3625 points

Follow me past the fold for more…

Read the rest of this entry »

10.09.07

A Java Surprise

Posted in boggle, programming at 11:44 pm by danvk

java.png
I’ve always been a Java and Eclipse naysayer, but I’m afraid new experiences are forcing me to reevaluate my skepticism. The last time I used Java was JDK 1.3 on a Sparc workstation back in early 2004. Eclipse was hella slow on that hardware, and somehow my workspace wound up in a temporary directory. This was a very bad thing, because as soon as I logged out, my project was gone forever. So I had good reason to swear off Eclipse.

More generally, Java left off a mighty stink back in 2004. Any GUI that I ran on the Mac would look out of place and felt clunky. Performance was poor. But in retrospect, I suspect much of the rank Java smell was really coming from the design patterns gibberish I was being force-fed at the same time. Why use a simple array when you could use an AbstractListFactory that does the same thing with 10x code bloat?

Regular readers only get one guess what program I wrote to get in the swing of things.
Read the rest of this entry »

08.02.07

How many Boggle boards are there?

Posted in boggle, math at 8:45 pm by danvk

I’ve taken a several months break from my Boggle series, mostly because I think everyone got tired of it. I’m going to come back to it, but hopefully at a slower pace this time around.

Last time, we completed a board solver that could score 10,000 boards per second. So what to do with all that power? One option was to compute statistics on random boards. Another is to hunt for the holy grail of Boggle: the highest possible scoring Boggle board. The next few posts will be devoted to the search for this board.

Before undertaking any search, you need to get a feel for your search space. In our case, that’s the set of all 4×4 Boggle boards. How many are there? We can do a few back-of-the-envelope calculations.

To create a board, you roll 16 dice. Each has six possible letters on it, which gives 6^16 possibilities. These dice land in some permutation on the board, which gives another factor of 16!. Finally, a 4×4 board has eight-fold symmetry, which takes us down to 6^16 * 16! / 8 = 7.3e24 ≈ 2 ^ 83.

That’s one upper bound. But it assumed that all 6*16 = 96 symbols on the dice were unique. Obviously they’re not. After a roll, each of the 16 squares will have one of 26 letters on it. Divide by the symmetries, and you get 26 ^ 16 / 8 = 5e21 ≈ 2^72. Much better!

I haven’t been able to come up with any better upper bounds than these. The main flaw in the second approximation is that not all boards can be rolled with the sixteen dice that Hasbro provides. A board of all z’s or qu’s simply can’t occur. If we knew the probability that any sixteen character sequence could be rolled, this would give a true approximation of the number of distinct boards.

The easiest way to do this is with a computer program. It picks a random sequence of sixteen characters, checks whether this board can be rolled, and repeats several thousand times. I believe that checking whether a given board can be rolled is NP-Complete, but in this case the greedy approximation works quite well. I wrote a program (leave a comment if you want to see it) to do this, and processed one million 16-character sequences. Only 84,492 could be represented with the usual dice, or 8.4%. This gives a total of

(26 ^ 16 / 8) * (84,492 / 1,000,000) = 4.6e20 ≈ 2^69.

If you like confidence intervals, my sample size of one million boards gives a 95% confidence interval of [4.545e24, 4.666e24] for the total number of boards. Pretty good.

So, assuming we could enumerate all these boards quickly, how long would it take for our faster solver to find the best board? At 10k boards/sec, we’re looking at 4.5e16 seconds = 1.9 billion years! Clearly we need to do better.

05.02.07

JRuby Performance, Yowch!

Posted in boggle, programming at 10:58 pm by danvk

I took JRuby 0.9.9 for a spin with the exceptionally-inefficient Boggle program from a few months back. Here are the numbers:

$ time ruby short.rb c a t d l i n e m a r o p e t s
2338
ruby short.rb c a t d l i n e m a r o p e t s  241.95s user 1.20s system 97% cpu 4:08.35 total
$ time jruby short.rb c a t d l i n e m a r o p e t s
2338
jruby short.rb c a t d l i n e m a r o p e t s  1178.86s user 40.84s system 108% cpu 18:44.44 total

I’d heard JRuby was slow, but this is spectacular. Four times slower than the already-slow Ruby?

I’d always thought that the point of JRuby was to run Ruby programs on the JVM, and hence get the benefits of the JVM’s JIT. I guess not. With that kind of performance, the only possible justification is the ability to use Java libraries.

« Previous Page« Previous entries Next entries »Next Page »