# 02.11.07

## Some Boggle Statistics

Posted in boggle at 10:46 pm by danvk

With a fast boggle solver in hand, it’s time for some fun statistics. These are all based on boggle boards rolled with real boggle dice. I’m going sans-code this time, but if you’re interested in seeing it, feel free to holla.
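For anyone who wants to reproduce the setup, here is a minimal sketch of what "rolling real boggle dice" could look like. It is not the code behind these numbers. The `DICE` list is an assumption (the commonly cited layout of the newer 16-die English set) and `roll_board` is a made-up name, so check the letters against a physical set before trusting them.

```python
import random

# ASSUMPTION: commonly cited faces of the 16 dice in the newer English
# Boggle set. Verify against a real set; "Q" stands for the "Qu" face.
DICE = [
    "AAEEGN", "ABBJOO", "ACHOPS", "AFFKPS",
    "AOOTTW", "CIMOTU", "DEILRX", "DELRVY",
    "DISTTY", "EEGHNW", "EEINSU", "EHRTVW",
    "EIOSST", "ELRTTY", "HIMNQU", "HLNNRZ",
]


def roll_board(rng=random):
    """Return a 4x4 board as a list of four rows of single letters."""
    dice = list(DICE)
    rng.shuffle(dice)                           # each die lands in a random cell
    faces = [rng.choice(die) for die in dice]   # each die shows a random face
    return [faces[row * 4:row * 4 + 4] for row in range(4)]


if __name__ == "__main__":
    for row in roll_board(random.Random(0)):
        print(" ".join(row))
```

Shuffling the dice before picking faces matters: every die is used exactly once per board, which is what separates "real dice" boards from boards filled with independently drawn letters.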

Most common words:

| Word (3 letters) | Freq (%) | Word (4 letters) | Freq (%) | Word (5 letters) | Freq (%) |
|---|---|---|---|---|---|
| toe | 19.258 | teen | 6.718 | eaten | 2.034 |
| tee | 19.074 | tees | 6.564 | enate | 2 |
| ten | 17.944 | tent | 6.02 | sente | 1.954 |
| net | 17.944 | note | 5.976 | setae | 1.944 |
| tea | 17.65 | tone | 5.838 | tense | 1.86 |
| set | 17.51 | teat | 5.804 | tease | 1.856 |
| eta | 17.176 | toes | 5.664 | teeth | 1.788 |
| ate | 17.176 | toea | 5.548 | eater | 1.788 |
| tae | 16.518 | nets | 5.432 | teens | 1.712 |
| eat | 16.518 | test | 5.344 | seton | 1.702 |
| tie | 16.432 | rete | 5.208 | notes | 1.702 |
| het | 15.684 | nett | 5.204 | tents | 1.646 |
| ret | 15.108 | nest | 5.174 | retie | 1.632 |
| eth | 14.938 | tens | 5.172 | steno | 1.624 |
| oes | 14.698 | sent | 5.156 | sheet | 1.618 |
| the | 14.542 | neat | 5.146 | ester | 1.618 |
| eon | 14.474 | etna | 5.144 | oaten | 1.61 |
| one | 14.366 | ante | 5.144 | teats | 1.608 |
| ose | 13.82 | thee | 5.064 | tones | 1.606 |
| see | 13.78 | tote | 5.052 | enter | 1.596 |

I looked these words up and they all check out. See the Scrabble dictionary if you’re not convinced.

How many words can we expect to find on each board?

That looks like a log-normal distribution. The mean is 98.53 words.

How many points?

That’s also a log-normal distribution with the characteristically long tail. The mean is 140.97 points per board.
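The post doesn't spell out how points were counted. Assuming the standard Boggle scoring table (an assumption on my part, not something stated here), a board's score could be computed along these lines. `word_score` and `board_score` are hypothetical names.

```python
def word_score(word):
    """Points for one word under the standard Boggle scoring table (assumed)."""
    n = len(word)
    if n < 3:
        return 0    # too short to count
    if n <= 4:
        return 1    # 3 and 4 letters
    if n == 5:
        return 2
    if n == 6:
        return 3
    if n == 7:
        return 5
    return 11       # 8 letters or more


def board_score(words):
    """Total points for a board; each distinct word is counted once."""
    return sum(word_score(w) for w in set(words))


if __name__ == "__main__":
    print(board_score(["toe", "teen", "eaten", "thrashers"]))
```

The two means above work out to about 1.43 points per word (140.97 / 98.53), which fits a scheme where most words are short and worth a single point.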

How many words of each length can we expect to find on a board? Here’s a histogram of the number of words of each length on a board:

Those also look like log-normals, with four-letter words being most common.
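"Looks like a log-normal" can be checked a bit more rigorously: if the counts are log-normal, their logarithms should be roughly normal, so the skewness of the logs should sit near zero. Here's a rough sketch of that check using only the standard library. `log_moments` is a hypothetical helper, and boards with a zero count are simply dropped, which is itself an assumption.

```python
import math
import random
import statistics


def log_moments(counts):
    """Return (mean, std dev, skewness) of log(count), skipping zero counts."""
    logs = [math.log(c) for c in counts if c > 0]
    mu = statistics.fmean(logs)
    sigma = statistics.pstdev(logs)
    skew = sum(((x - mu) / sigma) ** 3 for x in logs) / len(logs)
    return mu, sigma, skew


if __name__ == "__main__":
    # Synthetic stand-in data, NOT the Boggle results from this post.
    rng = random.Random(0)
    fake_counts = [rng.lognormvariate(4.0, 0.5) for _ in range(20000)]
    mu, sigma, skew = log_moments(fake_counts)
    print(f"mu={mu:.3f} sigma={sigma:.3f} skew={skew:.3f}")
```

A skewness well away from zero on the log scale would suggest the log-normal label is only a loose description of the histogram.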

Put another way, what’s the likelihood of finding a word of a given length on a board?

| Length | Likelihood |
|---|---|
| 3 | 99.97994% |
| 4 | 99.901% |
| 5 | 98.62% |
| 6 | 87.56% |
| 7 | 56.21% |
| 8 | 21.36% |
| 9 | 3.94% |
| 10 | 0.442% |
| 11 | 0.0362% |
| 12 | 0.00228% |
| 13 | 0.0001% |
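The rarest rows deserve a grain of salt. If these likelihoods come from the 5,000,000-board sample (my assumption), then 0.0001% corresponds to only about 5 boards. A quick sketch of the binomial standard error shows how noisy an estimate from 5 hits is. `likelihood_with_error` is a hypothetical helper.

```python
import math


def likelihood_with_error(hits, boards):
    """Estimated probability and its binomial standard error."""
    p = hits / boards
    se = math.sqrt(p * (1 - p) / boards)
    return p, se


if __name__ == "__main__":
    # ASSUMPTION: 0.0001% of a 5,000,000-board sample, i.e. about 5 boards.
    p, se = likelihood_with_error(5, 5_000_000)
    print(f"p = {100 * p:.5f}%, relative error = {se / p:.0%}")
```

For rare events the relative error is close to one over the square root of the number of hits, so an estimate built on a handful of boards is good to an order of magnitude and not much more.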

For context, the longest word I’ve ever found in a game was “thrashers” at nine letters.

The most common words were based on a 50,000-board sample. The graphs are based on a 5,000,000-board sample. Feel free to contact me if you’d like the source or the Excel spreadsheet.
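For the curious, the word-frequency tally itself is simple once a solver hands back the words on each board. This is a sketch under that assumption, not the code used for this post. `word_frequencies` is a hypothetical name, and its input is any iterable of per-board word lists.

```python
from collections import Counter


def word_frequencies(boards_words):
    """Percentage of boards on which each word appears at least once."""
    counts = Counter()
    total = 0
    for words in boards_words:
        total += 1
        counts.update(set(words))   # count a word once per board
    return {word: 100.0 * n / total for word, n in counts.items()}


if __name__ == "__main__":
    # Tiny made-up sample, NOT real solver output.
    sample = [["toe", "tee"], ["toe"], ["ten", "toe"], []]
    for word, pct in sorted(word_frequencies(sample).items()):
        print(word, pct)
```

With 50,000 boards, one board is worth 0.002 percentage points, and the frequencies in the table do appear to be multiples of 0.002.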

## 1 Comment

1. Exile from GROGGS said,

October 27, 2010 at 1:20 pm

Interesting. I’m approaching this from the other direction – attempting to calculate some answers, following the observation (playing Scramble) that words seem to crop up on successive boards with surprising regularity, but the first post I’m writing is using these results as they stand. Thanks!