<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>danvk.org &#187; programming</title>
	<atom:link href="http://www.danvk.org/wp/category/programming/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.danvk.org/wp</link>
	<description>Keepin' static like wool fabric since 2006</description>
	<lastBuildDate>Tue, 29 Jun 2010 01:45:25 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Introducing lmnowave</title>
		<link>http://www.danvk.org/wp/2010-03-22/introducing-lmnowave/</link>
		<comments>http://www.danvk.org/wp/2010-03-22/introducing-lmnowave/#comments</comments>
		<pubDate>Mon, 22 Mar 2010 07:02:29 +0000</pubDate>
		<dc:creator>danvk</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[crosswords]]></category>
		<category><![CDATA[lmnowave]]></category>

		<guid isPermaLink="false">http://www.danvk.org/wp/?p=668</guid>
		<description><![CDATA[Last winter, a dear friend of mine moved from San Francisco to Brooklyn. With an entire continent between us, my principal crossword puzzle buddy and I looked in vain to the internet for help. Was there truly no good way to do a crossword together online? The New York Times offered an applet, but it [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://www.danvk.org/wp/wp-content/uploads/2010/03/logo.png" alt="logo" title="logo" width="167" height="101" align="right" class="alignright size-full wp-image-688" />Last winter, a <a href="http://ericaricardo.com/">dear friend</a> of mine moved from San Francisco to Brooklyn. With an entire continent between us, my principal crossword puzzle buddy and I looked in vain to the internet for help. Was there truly no good way to do a crossword together online?</p>
<p>The New York Times <a href="http://select.nytimes.com/premium/xword/puzzles.html">offered</a> an applet, but it proved to be finicky and would only let us do the most recent day&#8217;s puzzle. A friend&#8217;s <a href="http://neugierig.org/software/lmnopuz/">project</a> offered hope, but only led to &#8220;Service Temporarily Unavailable&#8221;.</p>
<p>Enter: <b><a target="_blank" href="https://wave.google.com/wave/#restored:wave:googlewave.com!w%252B8X8AwPsDA.1">lmnowave</a></b>!</p>
<p>lmnowave is a crossword puzzle gadget for Google Wave. To do a crossword puzzle with a friend, you&#8217;ll both need <a href="http://wave.google.com/">Google Wave accounts</a>.</p>
<p>Once you&#8217;ve got that taken care of, click this big link to get going:</p>
<h2><a target="_blank" href="https://wave.google.com/wave/#restored:wave:googlewave.com!w%252B8X8AwPsDA.1">lmnowave installer</a></h2>
<p>You should see something like this:</p>
<p><img src="http://www.danvk.org/wp/wp-content/uploads/2010/03/installer2.png" alt="lmnowave installer" title="installer" width="458" height="245" class="alignright size-full wp-image-697" /></p>
<p>Click the &#8220;Install Icon&#8221; and create a new wave. You&#8217;ll see a crossword puzzle icon in your toolbar:</p>
<style type="text/css">.bordered { border: solid 1px black; }</style>
<p><img src="http://www.danvk.org/wp/wp-content/uploads/2010/03/add_icon.png" alt="puzzle icon" title="add_icon" width="360" height="93" class="alignright size-full wp-image-675 bordered" /></p>
<p>Click it to add a crossword gadget. It should look like this:</p>
<p><img src="http://www.danvk.org/wp/wp-content/uploads/2010/03/drag_screen1.png" alt="load screen" title="drag_screen" width="429" height="383" class="alignright size-full wp-image-677 bordered" /></p>
<p>If you&#8217;re using Chrome or Safari, you may get a warning about not being able to upload puzzle files. This is fine &mdash; just switch to <a href="http://firefox.com/">Firefox</a> for a minute or try one of the built-in Onion puzzles.</p>
<p>If you have a .puz file on your computer (perhaps from your <a href="http://select.nytimes.com/premium/xword/puzzles.html">Times subscription</a>), drag it onto the big lmnowave icon:</p>
<p><img src="http://www.danvk.org/wp/wp-content/uploads/2010/03/dragging2.png" alt="dragging a puz file" title="dragging" width="481" height="303" class="alignright size-full wp-image-682 bordered" /></p>
<p>The puzzle will load instantly. Now drag a friend into the wave:</p>
<p><img src="http://www.danvk.org/wp/wp-content/uploads/2010/03/adding_erica2.png" alt="Adding a friend" title="adding_erica" width="444" height="361" class="alignright size-full wp-image-683 bordered" /></p>
<p>and you&#8217;re ready to compete or collaborate as you see fit! Each player gets his or her own color, so you can keep track of who&#8217;s filled in each square:</p>
<p><img src="http://www.danvk.org/wp/wp-content/uploads/2010/03/solved_puzzle.png" alt="partially-solved puzzle" title="solved_puzzle" width="313" height="264" class="alignright size-full wp-image-686 bordered" /></p>
<p>lmnowave is an open-source project written entirely in JavaScript. If you&#8217;d like to contribute, <a href="http://github.com/danvk/lmnowave/">check it out</a> on GitHub. Run into a bug or have a feature request? Let me know <a href="http://github.com/danvk/lmnowave/issues">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.danvk.org/wp/2010-03-22/introducing-lmnowave/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Crossword Word Frequency</title>
		<link>http://www.danvk.org/wp/2009-12-26/crossword-word-frequency/</link>
		<comments>http://www.danvk.org/wp/2009-12-26/crossword-word-frequency/#comments</comments>
		<pubDate>Sat, 26 Dec 2009 17:45:02 +0000</pubDate>
		<dc:creator>danvk</dc:creator>
				<category><![CDATA[math]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://www.danvk.org/wp/?p=633</guid>
		<description><![CDATA[In a previous post, I discussed downloading several years&#8217; worth of New York Times Crosswords and categorizing them by day of week. Now, some analysis! Here were the most common words over the last 12 years, along with the percentage of puzzles in which they occurred: Percentage Word Length 6.218% ERA 3 5.703% AREA 4 [...]]]></description>
			<content:encoded><![CDATA[<p>In a previous post, I discussed downloading several years&#8217; worth of New York Times Crosswords and categorizing them by day of week. Now, some analysis!</p>
<p>Here were the most common words over the last 12 years, along with the percentage of puzzles in which they occurred:</p>
<table class="thin sortable draggable">
<tr>
<th>Percentage</th>
<th>Word</th>
<th>Length</th>
</tr>
<tr>
<td>6.218%</td>
<td>ERA
<td>3</td>
</tr>
<tr>
<td>5.703%</td>
<td>AREA
<td>4</td>
</tr>
<tr>
<td>5.413%</td>
<td>ERE
<td>3</td>
</tr>
<tr>
<td>5.055%</td>
<td>ELI
<td>3</td>
</tr>
<tr>
<td>4.854%</td>
<td>ONE
<td>3</td>
</tr>
<tr>
<td>4.585%</td>
<td>ALE
<td>3</td>
</tr>
<tr>
<td>4.496%</td>
<td>ORE
<td>3</td>
</tr>
<tr>
<td>4.361%</td>
<td>ERIE
<td>4</td>
</tr>
<tr>
<td>4.339%</td>
<td>ALOE
<td>4</td>
</tr>
<tr>
<td>4.317%</td>
<td>ETA
<td>3</td>
</tr>
<tr>
<td>4.317%</td>
<td>ALI
<td>3</td>
</tr>
<tr>
<td>4.227%</td>
<td>OLE
<td>3</td>
</tr>
<tr>
<td>4.205%</td>
<td>ARE
<td>3</td>
</tr>
<tr>
<td>4.138%</td>
<td>ESS
<td>3</td>
</tr>
<tr>
<td>4.138%</td>
<td>EDEN
<td>4</td>
</tr>
<tr>
<td>4.138%</td>
<td>ATE
<td>3</td>
</tr>
<tr>
<td>4.048%</td>
<td>IRE
<td>3</td>
</tr>
<tr>
<td>4.048%</td>
<td>ARIA
<td>4</td>
</tr>
<tr>
<td>4.004%</td>
<td>ANTE
<td>4</td>
</tr>
<tr>
<td>3.936%</td>
<td>ESE
<td>3</td>
</tr>
<tr>
<td>3.936%</td>
<td>ENE
<td>3</td>
</tr>
<tr>
<td>3.914%</td>
<td>ADO
<td>3</td>
</tr>
<tr>
<td>3.869%</td>
<td>ELSE
<td>4</td>
</tr>
<tr>
<td>3.825%</td>
<td>NEE
<td>3</td>
</tr>
<tr>
<td>3.758%</td>
<td>ACE
<td>3</td>
</tr>
</table>
<p>(You can click the column headings to sort.)</p>
<p>So &#8220;ERA&#8221; appears, on average, in about 23 puzzles per year. How about if we break this down by day of week? Follow me past the fold&#8230;</p>
<p><script type=text/javascript src="/dragtable/sorttable.js"></script><br />
<script type=text/javascript src="/dragtable/dragtable.js"></script></p>
<style type=text/css>
  /* Sortable tables */
  table.sortable thead {
    background-color:#eee;
    color:#666666;
    font-weight: bold;
    cursor: default;
  }
  table.thin, table.thin td, table.thin tr, table.thin th {
    border: thin solid black;
    border-collapse: collapse;
  }
</style>
<p><span id="more-633"></span></p>
<p><b>Monday:</b></p>
<table class="thin sortable draggable">
<tr>
<th>Percentage</th>
<th>Word</th>
<th>Length</th>
</tr>
<tr>
<td>9.404%</td>
<td>ALOE
<td>4</td>
</tr>
<tr>
<td>8.777%</td>
<td>AREA
<td>4</td>
</tr>
<tr>
<td>7.837%</td>
<td>ERIE
<td>4</td>
</tr>
<tr>
<td>6.426%</td>
<td>ONE
<td>3</td>
</tr>
<tr>
<td>6.426%</td>
<td>IDEA
<td>4</td>
</tr>
<tr>
<td>6.426%</td>
<td>ARIA
<td>4</td>
</tr>
<tr>
<td>6.270%</td>
<td>ONCE
<td>4</td>
</tr>
<tr>
<td>6.270%</td>
<td>EDEN
<td>4</td>
</tr>
<tr>
<td>6.113%</td>
<td>ERA
<td>3</td>
</tr>
<tr>
<td>6.113%</td>
<td>ELSE
<td>4</td>
</tr>
<tr>
<td>6.113%</td>
<td>ASEA
<td>4</td>
</tr>
<tr>
<td>5.799%</td>
<td>ERE
<td>3</td>
</tr>
<tr>
<td>5.643%</td>
<td>ORE
<td>3</td>
</tr>
<tr>
<td>5.643%</td>
<td>ETAL
<td>4</td>
</tr>
<tr>
<td>5.643%</td>
<td>ARE
<td>3</td>
</tr>
<tr>
<td>5.643%</td>
<td>ANTE
<td>4</td>
</tr>
<tr>
<td>5.486%</td>
<td>OREO
<td>4</td>
</tr>
<tr>
<td>5.486%</td>
<td>ALEE
<td>4</td>
</tr>
<tr>
<td>5.329%</td>
<td>TREE
<td>4</td>
</tr>
<tr>
<td>5.329%</td>
<td>ESS
<td>3</td>
</tr>
<tr>
<td>5.329%</td>
<td>ELI
<td>3</td>
</tr>
<tr>
<td>5.329%</td>
<td>ACRE
<td>4</td>
</tr>
<tr>
<td>5.172%</td>
<td>TSAR
<td>4</td>
</tr>
<tr>
<td>5.172%</td>
<td>ANTI
<td>4</td>
</tr>
<tr>
<td>5.016%</td>
<td>ORAL
<td>4</td>
</tr>
</table>
<p>The four-letter words are more common now. Also look how much higher the percentages are. There&#8217;s less variety in the fill of Monday puzzles. &#8220;ALOE&#8221; and &#8220;ARIA&#8221; are classic crossword words, not to mention &#8220;OREO&#8221;.</p>
<p><b>Saturday:</b></p>
<table class="thin sortable draggable">
<tr>
<th>Percentage</th>
<th>Word</th>
<th>Length</th>
</tr>
<tr>
<td>3.286%</td>
<td>ERA
<td>3</td>
</tr>
<tr>
<td>2.973%</td>
<td>ONE
<td>3</td>
</tr>
<tr>
<td>2.973%</td>
<td>ETE
<td>3</td>
</tr>
<tr>
<td>2.817%</td>
<td>TEN
<td>3</td>
</tr>
<tr>
<td>2.817%</td>
<td>EVE
<td>3</td>
</tr>
<tr>
<td>2.817%</td>
<td>ETA
<td>3</td>
</tr>
<tr>
<td>2.660%</td>
<td>IRE
<td>3</td>
</tr>
<tr>
<td>2.660%</td>
<td>ERR
<td>3</td>
</tr>
<tr>
<td>2.660%</td>
<td>ERE
<td>3</td>
</tr>
<tr>
<td>2.504%</td>
<td>OTIS
<td>4</td>
</tr>
<tr>
<td>2.504%</td>
<td>OLE
<td>3</td>
</tr>
<tr>
<td>2.504%</td>
<td>ENE
<td>3</td>
</tr>
<tr>
<td>2.504%</td>
<td>ELL
<td>3</td>
</tr>
<tr>
<td>2.504%</td>
<td>ELI
<td>3</td>
</tr>
<tr>
<td>2.504%</td>
<td>ARE
<td>3</td>
</tr>
<tr>
<td>2.504%</td>
<td>ARA
<td>3</td>
</tr>
<tr>
<td>2.504%</td>
<td>ALA
<td>3</td>
</tr>
<tr>
<td>2.504%</td>
<td>ACE
<td>3</td>
</tr>
<tr>
<td>2.347%</td>
<td>RTE
<td>3</td>
</tr>
<tr>
<td>2.347%</td>
<td>ICE
<td>3</td>
</tr>
<tr>
<td>2.347%</td>
<td>ATE
<td>3</td>
</tr>
<tr>
<td>2.347%</td>
<td>ALE
<td>3</td>
</tr>
<tr>
<td>2.191%</td>
<td>TSE
<td>3</td>
</tr>
<tr>
<td>2.191%</td>
<td>TERSE
<td>5</td>
</tr>
<tr>
<td>2.191%</td>
<td>SRI
<td>3</td>
</tr>
</table>
<p>Lots of three-letter words and <i>much</i> lower percentages. &#8220;OTIS&#8221; is surprising to me, but I don&#8217;t do many Saturday puzzles, so who am I to say?</p>
<p>It would be really interesting to combine this with some <a href="http://en.wikipedia.org/wiki/Document_frequency">document frequency</a> numbers for the English language. This would find words which are much more common in crosswords than they are in general, i.e. crosswordese.</p>
<p>I&#8217;d include everything necessary to reproduce this here, but the puzzles are not free. See <a href="/xword-freq/">this directory</a> for the program I used to tabulate the statistics and complete word counts, both overall and for each day of the week. The first puzzle in my collection was 1996-10-23 and the last was 2009-01-19.</p>
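<p>For readers who want to try this on their own puzzle collection, here is a minimal Python sketch of the tabulation. It is not the program linked above: the function name <code>puzzle_frequency</code> and the toy answer lists are made up for illustration, and it assumes each puzzle has already been parsed into a list of answer words.</p>

```python
from collections import Counter
from typing import Dict, Iterable, List


def puzzle_frequency(puzzles: Iterable[List[str]]) -> Dict[str, float]:
    """Return, for each word, the percentage of puzzles that contain it."""
    counts: Counter = Counter()
    total = 0
    for words in puzzles:
        total += 1
        # Using a set counts each word at most once per puzzle.
        counts.update(set(words))
    return {word: 100.0 * n / total for word, n in counts.items()}


# Made-up answer lists standing in for parsed puzzle files.
toy_puzzles = [
    ["ERA", "AREA", "OREO"],
    ["ERA", "ALOE"],
    ["AREA", "ERA", "ERA"],
    ["ALOE", "ERIE"],
]
freq = puzzle_frequency(toy_puzzles)

# Print in the same column order as the tables: percentage, word, length.
for word, pct in sorted(freq.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{pct:.3f}% {word} {len(word)}")
```

<p>Running the same function on only the Monday or only the Saturday puzzles would give the per-day tables.</p>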
]]></content:encoded>
			<wfw:commentRss>http://www.danvk.org/wp/2009-12-26/crossword-word-frequency/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Breaking 3&#215;3 Boggle</title>
		<link>http://www.danvk.org/wp/2009-08-08/breaking-3x3-boggle/</link>
		<comments>http://www.danvk.org/wp/2009-08-08/breaking-3x3-boggle/#comments</comments>
		<pubDate>Sat, 08 Aug 2009 17:35:04 +0000</pubDate>
		<dc:creator>danvk</dc:creator>
				<category><![CDATA[boggle]]></category>
		<category><![CDATA[math]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://www.danvk.org/wp/?p=516</guid>
		<description><![CDATA[Why is finding the highest-scoring Boggle board so difficult? It&#8217;s because there are so many boards to consider: 2^72 for the 4&#215;4 case and 2^40 for the 3&#215;3 case. At 10,000 boards/second the former corresponds to about 2 billion years of compute time, and the latter just two years. Just enumerating all 2^72 boards would [...]]]></description>
			<content:encoded><![CDATA[<p>Why is finding the highest-scoring Boggle board so difficult? It&#8217;s because there are so many boards to consider: 2^72 for the 4&#215;4 case and 2^40 for the 3&#215;3 case. At <a href="http://www.danvk.org/wp/2007-02-10/one-last-boggle-boost/">10,000 boards/second</a> the former corresponds to about 2 billion years of compute time, and the latter just two years. Just enumerating all 2^72 boards would take over 100,000 years.</p>
<p>So we have to come up with a technique that doesn&#8217;t involve looking at every single board. And I&#8217;ve come up with just such a method! This is the &#8220;exciting news&#8221; I alluded to in the last post.</p>
<p>Here&#8217;s the general technique:</p>
<ol>
<li>Find a very high-scoring board (maybe <a href="http://www.danvk.org/wp/2009-02-19/sky-high-boggle-scores-with-simulated-annealing/">this way</a>).</li>
<li>Consider a large class of boards.</li>
<li>Come up with an upper bound on the highest score achieved by any board in the class.</li>
<li>If it&#8217;s lower than the score in step #1, we can eliminate all the boards in the class. If it&#8217;s not, subdivide the class and repeat step #2 with each subclass.</li>
</ol>
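<p>The four steps above can be sketched as a short recursive function. This is a rough Python illustration and not the actual breaker program: <code>break_class</code> and <code>toy_bound</code> are hypothetical names, and the toy rule (one point per vowel on the board) is invented so that the example runs without a dictionary.</p>

```python
from typing import Callable, List, Tuple

BoardClass = Tuple[str, ...]  # one string of candidate letters per cell


def break_class(cells: BoardClass,
                upper_bound: Callable[[BoardClass], int],
                best_score: int) -> List[str]:
    """Return the concrete boards that the bound could not rule out."""
    # Step 4: a bound below the best known score eliminates the whole class.
    if upper_bound(cells) < best_score:
        return []
    # Otherwise split on the cell with the most candidate letters.
    index = max(range(len(cells)), key=lambda i: len(cells[i]))
    if len(cells[index]) == 1:
        return ["".join(cells)]  # a single board is left; score it exactly
    survivors: List[str] = []
    for letter in cells[index]:
        subclass = cells[:index] + (letter,) + cells[index + 1:]
        survivors.extend(break_class(subclass, upper_bound, best_score))
    return survivors


def toy_bound(cells: BoardClass) -> int:
    """Toy rule (an assumption for this sketch): a board scores one point
    per vowel, so the number of cells that could hold a vowel bounds the
    score of every board in the class."""
    return sum(1 for choices in cells if any(c in "aeiou" for c in choices))


print(break_class(("ab", "cd"), toy_bound, best_score=1))  # ['ac', 'ad']
print(break_class(("ab", "cd"), toy_bound, best_score=2))  # []
```

<p>In the first call the subclass starting with &#8220;b&#8221; is thrown away without looking at its individual boards. In the second call the whole class is eliminated in one step.</p>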
<p><b>Classes of Boards</b><br />
By &#8220;class of boards&#8221;, I mean something like this:</p>
<style type="text/css">
.board { text-align: center; border-collapse: collapse; }
.board tbody td { border: 1px solid black; border-collapse: collapse; padding: 4px 8px 4px 8px; }
.board tbody td { font-weight: bold; }
.notable { color: red; }
.change td { padding: 2px 5px 2px 5px; }
.mb { font-family: monospace; padding: 0px 4px 0px 4px; }
</style>
<p><center></p>
<table class="board">
<tr>
<td>{a,e,i,o,u}</td>
<td>{a,e,i,o,u}</td>
<td>r</td>
</tr>
<tr>
<td>{b,c,d,f,g,h}</td>
<td>a</td>
<td>t</td>
</tr>
<tr>
<td>d</td>
<td>e</td>
<td>{r,s,t,v}</td>
</tr>
</table>
<p></center></p>
<p>The squares that contain a set of letters can take on <i>any</i> of those letters. So this board is part of that class:</p>
<p><center></p>
<table class="board">
<tr>
<td>a</td>
<td>i</td>
<td>r</td>
</tr>
<tr>
<td>d</td>
<td>a</td>
<td>t</td>
</tr>
<tr>
<td>d</td>
<td>e</td>
<td>s</td>
</tr>
<tfoot>
<tr>
<td colspan=3><a href="/boggle3.php?quick=airdatdes">189 points</a></td>
</tr>
</tfoot>
</table>
<p></center></p>
<p>and so is this:</p>
<p><center></p>
<table class="board">
<tr>
<td>o</td>
<td>u</td>
<td>r</td>
</tr>
<tr>
<td>f</td>
<td>a</td>
<td>t</td>
</tr>
<tr>
<td>d</td>
<td>e</td>
<td>t</td>
</tr>
<tfoot>
<tr>
<td colspan=3><a href="/boggle3.php?quick=ourfatdet">114 points</a></td>
</tr>
</tfoot>
</table>
<p></center></p>
<p>All told, there are 5 * 5 * 6 * 4 = 600 boards that are part of this class, each with its own score. Other fun classes of boards include &#8220;boards with only vowels&#8221; (1,953,125 members) and &#8220;boards with only consonants&#8221; (794,280,046,581 members).</p>
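<p>These counts are just the product of the number of choices in each cell. Here is a quick Python check (the helper name <code>class_size</code> is made up for this sketch, and &#8220;y&#8221; is counted as a consonant):</p>

```python
from math import prod
from typing import Sequence


def class_size(cells: Sequence[str]) -> int:
    """Number of boards in a class: the product of the choices per cell."""
    return prod(len(choices) for choices in cells)


VOWELS = "aeiou"
CONSONANTS = "bcdfghjklmnpqrstvwxyz"  # the 21 other letters, y included

# The example class from above, listed row by row.
example = [VOWELS, VOWELS, "r",
           "bcdfgh", "a", "t",
           "d", "e", "rstv"]

print(class_size(example))           # 600
print(class_size([VOWELS] * 9))      # 1953125
print(class_size([CONSONANTS] * 9))  # 794280046581
```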
<p>Follow me past the fold for more&#8230;<br />
<span id="more-516"></span></p>
<p><b>Upper Bounds</b><br />
Now on to step #3 of the general technique: calculating an upper bound. This is going to be easier if we introduce some mathematical notation:</p>
<p><center></p>
<table>
<tr>
<td align=right><i>b</i></td>
<td>=</td>
<td>A boggle board</td>
</tr>
<tr>
<td align=right><i>Score(b)</i></td>
<td>=</td>
<td>sum of the scores of all the words contained on b</td>
</tr>
<tr>
<td align=right><i><b>B</b></i></td>
<td>=</td>
<td>a class of boards, i.e. <img src='http://s.wordpress.com/latex.php?latex=b%20%5Cin%20B&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='b \in B' title='b \in B' class='latex' /></td>
</tr>
<tr>
<td align=right><i>Score(<b>B</b>)</i></td>
<td>=</td>
<td><img src='http://s.wordpress.com/latex.php?latex=max%28%5C%7BScore%28b%29%20%7C%20b%20%5Cin%20B%5C%7D%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='max(\{Score(b) | b \in B\})' title='max(\{Score(b) | b \in B\})' class='latex' /></td>
</tr>
</table>
<p></center></p>
<p>An upper bound is a function <img src='http://s.wordpress.com/latex.php?latex=f%28B%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='f(B)' title='f(B)' class='latex' /> such that <img src='http://s.wordpress.com/latex.php?latex=f%28B%29%20%5Cgeq%20Score%28B%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='f(B) \geq Score(B)' title='f(B) \geq Score(B)' class='latex' />, i.e. <img src='http://s.wordpress.com/latex.php?latex=f%28B%29%20%5Cgeq%20Score%28b%29%20%5Cforall%20b%20%5Cin%20B&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='f(B) \geq Score(b) \forall b \in B' title='f(B) \geq Score(b) \forall b \in B' class='latex' />.</p>
<p>There&#8217;s one really easy upper bound: <img src='http://s.wordpress.com/latex.php?latex=Score%28B%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='Score(B)' title='Score(B)' class='latex' />! This just enumerates all the boards in the class B, scores each and takes the maximum score. It&#8217;s very expensive to compute for a large class of boards and hence not very practical. You and I both know that no board containing only consonants has any points on it. We don&#8217;t need to enumerate through all 794 billion such boards to determine this.</p>
<p>With upper bounds, there&#8217;s a trade-off between how hard they are to compute and how &#8220;tight&#8221; they are, i.e. how closely they approximate <img src='http://s.wordpress.com/latex.php?latex=Score%28B%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='Score(B)' title='Score(B)' class='latex' />. <img src='http://s.wordpress.com/latex.php?latex=Score%28B%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='Score(B)' title='Score(B)' class='latex' /> is very tight but is hard to compute. At the other end of the spectrum, we know that all the words on a board are in the dictionary. So we could just sum up the scores of all the words in the dictionary and get a number, say 1,000,000. Then <img src='http://s.wordpress.com/latex.php?latex=f%28B%29%20%3D%201%2C000%2C000&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='f(B) = 1,000,000' title='f(B) = 1,000,000' class='latex' /> is an upper bound. It is very easy to compute, but is not very tight.</p>
<p>The trick is to hit some sort of sweet spot that strikes a good balance between &#8220;tightness&#8221; and ease of computation. Over the rest of this blog post, I&#8217;ll present two upper bounds that do this. Upper bounds have the nice property that if <img src='http://s.wordpress.com/latex.php?latex=f%28B%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='f(B)' title='f(B)' class='latex' /> and <img src='http://s.wordpress.com/latex.php?latex=g%28B%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='g(B)' title='g(B)' class='latex' /> are two upper bounds, then <img src='http://s.wordpress.com/latex.php?latex=h%28B%29%20%3D%20min%28f%28B%29%2C%20g%28B%29%29&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='h(B) = min(f(B), g(B))' title='h(B) = min(f(B), g(B))' class='latex' /> is also an upper bound. So by finding two bounds, we&#8217;ll get a third that&#8217;s better than either one alone.</p>
<p><b>sum/union</b><br />
The idea of this bound is to find all the words that can possibly occur in a class of boards. Since each word can only be found once, we can add the scores of all these words to get an upper bound.</p>
<p>To get the list of words, we use the same <a href="http://www.danvk.org/wp/2007-02-01/tries-the-perfect-data-structure/">depth-first search strategy</a> as we did to find words on a single board. The wrinkle is that, when we encounter a cell with multiple possible letters, we have to do a separate depth-first search for each.</p>
<p>At first glance, it doesn&#8217;t seem like this would be tractable for a board class like this one (alternating vowels and consonants):</p>
<p><center></p>
<table class="board">
<tr>
<td>{a,e,i,o,u}</td>
<td>{b-d,f-h,j-n,p-t,v-z}</td>
<td>{a,e,i,o,u}</td>
</tr>
<tr>
<td>{b-d,f-h,j-n,p-t,v-z}</td>
<td>{a,e,i,o,u}</td>
<td>{b-d,f-h,j-n,p-t,v-z}</td>
</tr>
<tr>
<td>{a,e,i,o,u}</td>
<td>{b-d,f-h,j-n,p-t,v-z}</td>
<td>{a,e,i,o,u}</td>
</tr>
</table>
<p></center></p>
<p>In addition to the branching from going different directions on each square, there&#8217;s also a huge amount of branching from trying each letter on each square. But we&#8217;re saved by the same lesson we learned in <a href="http://www.danvk.org/wp/2007-01-30/boggle-3-succeed-by-not-being-stupid/">boggle post #3</a>: the dictionary is exceptionally effective at pruning thorny search trees. If we prune search trees like &#8216;bqu&#8217; that don&#8217;t begin words, then there doesn&#8217;t wind up being that much work to do.</p>
<p>We can find all possible words on the above board in just under 1 second. This is about 10,000 times slower than it takes to score a conventional board, but it&#8217;s certainly tractable. The resulting score is 195,944. Given that no board scores higher than 545 points, this is a wild overestimate. But at least it&#8217;s a better bound than a million!</p>
<p>This technique does especially well on boards like this one, which contains all consonants:</p>
<p><center></p>
<table class="board">
<tr>
<td>{b-d,f-h,j-n,p-t,v-z}</td>
<td>{b-d,f-h,j-n,p-t,v-z}</td>
<td>{b-d,f-h,j-n,p-t,v-z}</td>
</tr>
<tr>
<td>{b-d,f-h,j-n,p-t,v-z}</td>
<td>{b-d,f-h,j-n,p-t,v-z}</td>
<td>{b-d,f-h,j-n,p-t,v-z}</td>
</tr>
<tr>
<td>{b-d,f-h,j-n,p-t,v-z}</td>
<td>{b-d,f-h,j-n,p-t,v-z}</td>
<td>{b-d,f-h,j-n,p-t,v-z}</td>
</tr>
</table>
<p></center></p>
<p>This takes 0.23 seconds to score and results in a bound of 208 points (it contains words like &#8216;crypt&#8217; and &#8216;gypsy&#8217;). We&#8217;ve already <a href="http://www.danvk.org/wp/2009-08-04/solving-boggle-by-taking-option-three/">found</a> a single board that has <a href="/boggle3.php?quick=perlatdes">545 points</a> on it. So we can eliminate this entire class of 794 billion boards. That&#8217;s a speed of over 3 trillion boards/second! Of course, this board class is not typical.</p>
<p>It&#8217;s also worth pointing out why this upper bound isn&#8217;t tight. Consider this class of boards:</p>
<p><center></p>
<table class="board">
<tr>
<td>{a,i}</td>
<td>r</td>
<td>z</td>
</tr>
<tr>
<td>f</td>
<td>z</td>
<td>z</td>
</tr>
<tr>
<td>z</td>
<td>z</td>
<td>z</td>
</tr>
</table>
<p></center></p>
<p>You can find both &#8220;fir&#8221; and &#8220;far&#8221; on boards in this class, but there aren&#8217;t any boards that contain <i>both</i>. So while &#8220;fir&#8221; and &#8220;far&#8221; each contribute a point to the upper bound, together they should really contribute only a single point. The sum/union bound doesn&#8217;t take into account the relationships between various letter choices. It&#8217;s the best trade-off between computability and &#8220;tightness&#8221; we&#8217;ve seen so far, but it&#8217;s not good enough to make the problem tractable.</p>
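<p>Here is a small Python sketch of the sum/union idea, run on the &#8220;fir&#8221;/&#8220;far&#8221; class. It makes several assumptions that are not from the real program: a two-word toy dictionary, a set of prefixes standing in for the Trie, standard Boggle scoring (3 and 4 letter words are worth 1 point), and the hypothetical names <code>sum_union_bound</code> and <code>word_score</code>.</p>

```python
from typing import List, Sequence, Set

# Neighbors on a 3x3 board, with cells numbered row by row from 0 to 8.
NEIGHBORS: List[List[int]] = [
    [r * 3 + c
     for r in range(max(0, row - 1), min(3, row + 2))
     for c in range(max(0, col - 1), min(3, col + 2))
     if (r, c) != (row, col)]
    for row in range(3) for col in range(3)
]


def word_score(word: str) -> int:
    """Standard Boggle scoring (assumed here, not taken from the post)."""
    if len(word) < 3:
        return 0
    return {3: 1, 4: 1, 5: 2, 6: 3, 7: 5}.get(len(word), 11)


def sum_union_bound(cells: Sequence[str], words: Set[str]) -> int:
    """Sum the scores of every distinct word found on any board in the class."""
    prefixes = {w[:n] for w in words for n in range(1, len(w) + 1)}
    found: Set[str] = set()

    def dfs(cell: int, prefix: str, used: Set[int]) -> None:
        for letter in cells[cell]:         # a separate search per letter
            candidate = prefix + letter
            if candidate not in prefixes:  # dictionary pruning
                continue
            if candidate in words:
                found.add(candidate)       # the set counts each word once
            for nxt in NEIGHBORS[cell]:
                if nxt not in used:
                    dfs(nxt, candidate, used | {cell})

    for start in range(9):
        dfs(start, "", set())
    return sum(word_score(w) for w in found)


toy_words = {"fir", "far"}
board_class = ["ai", "r", "z",
               "f", "z", "z",
               "z", "z", "z"]
print(sum_union_bound(board_class, toy_words))  # 2

# A class with one letter per cell is a single board, so the "bound" is exact.
for first in board_class[0]:
    print(first, sum_union_bound([first] + board_class[1:], toy_words))  # 1 each
```

<p>The bound for the class is 2, even though each of the two boards in it scores only 1.</p>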
<p><b>max/no mark</b><br />
In the sum/union upper bound, we dealt with multiple possible letters on the same square by trying each and adding the resulting scores (taking care not to count any word twice). But why take the sum of all choices when we know that any given board can only take on one of the possibilities? It would result in a much better bound if we took the max of the scores resulting from each possible choice, rather than the sum. This is the idea behind the &#8220;max/no mark&#8221; bound.</p>
<p>This is a huge win over sum/union, especially when there are many squares containing many possible letters. It does have one major drawback, though. The sum/union bound took advantage of the fact that each word could only be found once. With the max/no mark bound, the bookkeeping for this becomes completely intractable. The words we find by making a choice on one square may affect the results of a choice somewhere else. We can&#8217;t make the choices independently. The optimal set of choices becomes an optimization problem in its own right.</p>
<p>Rather than deal with this, max/no mark just throws up its hands. This is what the &#8220;no mark&#8221; refers to. In the past, we&#8217;ve recorded the words we find by <a href="http://www.danvk.org/wp/2007-02-10/one-last-boggle-boost/">marking the Trie</a>. By not marking the Trie with found words, we accept that we&#8217;ll double-count words sometimes. But it still winds up being an excellent upper bound.</p>
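<p>To make the idea concrete, here is a rough Python sketch of a max/no mark style bound. It is an illustration under stated assumptions and not the code from the real program: the dictionary is a toy, a set of prefixes stands in for the Trie, the scoring is standard Boggle scoring, and the names <code>max_no_mark_bound</code>, <code>best_choice</code> and <code>descend</code> are made up.</p>

```python
from typing import FrozenSet, List, Sequence, Set

# Neighbors on a 3x3 board, with cells numbered row by row from 0 to 8.
NEIGHBORS: List[List[int]] = [
    [r * 3 + c
     for r in range(max(0, row - 1), min(3, row + 2))
     for c in range(max(0, col - 1), min(3, col + 2))
     if (r, c) != (row, col)]
    for row in range(3) for col in range(3)
]


def word_score(word: str) -> int:
    """Standard Boggle scoring (assumed here, not taken from the post)."""
    if len(word) < 3:
        return 0
    return {3: 1, 4: 1, 5: 2, 6: 3, 7: 5}.get(len(word), 11)


def max_no_mark_bound(cells: Sequence[str], words: Set[str]) -> int:
    """Take the max over the letters of a cell and the sum over directions."""
    prefixes = {w[:n] for w in words for n in range(1, len(w) + 1)}

    def best_choice(cell: int, prefix: str, used: FrozenSet[int]) -> int:
        # Only one letter can actually sit on this cell, so keep the best.
        best = 0
        for letter in cells[cell]:
            candidate = prefix + letter
            if candidate in prefixes:  # dictionary pruning
                best = max(best, descend(cell, candidate, used | {cell}))
        return best

    def descend(cell: int, prefix: str, used: FrozenSet[int]) -> int:
        # No marking: a word is scored every time a path spells it.
        score = word_score(prefix) if prefix in words else 0
        for nxt in NEIGHBORS[cell]:
            if nxt not in used:
                score += best_choice(nxt, prefix, used)
        return score

    return sum(best_choice(start, "", frozenset()) for start in range(9))


# The class from the sum/union section: only one of "fir"/"far" is counted.
fir_far = ["ai", "r", "z", "f", "z", "z", "z", "z", "z"]
print(max_no_mark_bound(fir_far, {"fir", "far"}))  # 1

# The price of not marking: "tat" can be spelled in two ways on this board.
tat = ["t", "a", "t", "z", "z", "z", "z", "z", "z"]
print(max_no_mark_bound(tat, {"tat"}))  # 2
```

<p>The first result is tighter than adding a point for each of the two words. The second shows the double-counting: a single board whose only word is worth one point gets a bound of two.</p>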
<p>Let&#8217;s try some of our previous examples:</p>
<p><center></p>
<table class="board">
<tr>
<td>{a,e,i,o,u}</td>
<td>{a,e,i,o,u}</td>
<td>r</td>
</tr>
<tr>
<td>{b,c,d,f,g,h}</td>
<td>a</td>
<td>t</td>
</tr>
<tr>
<td>d</td>
<td>e</td>
<td>{r,s,t,v}</td>
</tr>
<tfoot>
<tr>
<td colspan=3 align=center>sum/union: 2880</td>
</tr>
<tr>
<td colspan=3>max/no mark: 1307</td>
</tr>
</tfoot>
</table>
<p></center></p>
<p>Alternating vowels and consonants:</p>
<p><center></p>
<table class="board">
<tr>
<td>{a,e,i,o,u}</td>
<td>{b-d,f-h,j-n,p-t,v-z}</td>
<td>{a,e,i,o,u}</td>
</tr>
<tr>
<td>{b-d,f-h,j-n,p-t,v-z}</td>
<td>{a,e,i,o,u}</td>
<td>{b-d,f-h,j-n,p-t,v-z}</td>
</tr>
<tr>
<td>{a,e,i,o,u}</td>
<td>{b-d,f-h,j-n,p-t,v-z}</td>
<td>{a,e,i,o,u}</td>
</tr>
<tfoot>
<tr>
<td colspan=3>sum/union: 195944</td>
</tr>
<tr>
<td colspan=3>max/no mark: 15692</td>
</tr>
</tfoot>
</table>
<p></center></p>
<p>A class that can be entirely eliminated:</p>
<p><center></p>
<table class="board">
<tr>
<td>{b,d,f,g,j,k,m,p,v,w,x,z}</td>
<td>a</td>
<td>{s,y}</td>
</tr>
<tr>
<td>{i,o,u}</td>
<td>y</td>
<td>a</td>
</tr>
<tr>
<td>{s,y}</td>
<td>{c,h,l,n,r,t}</td>
<td>{c,h,l,n,r,t}</td>
</tr>
<tfoot>
<tr>
<td colspan=3>sum/union: 2497</td>
</tr>
<tr>
<td colspan=3>max/no mark: 447</td>
</tr>
</tfoot>
</table>
<p></center></p>
<p>max/no mark isn&#8217;t always better than sum/union:</p>
<p><center></p>
<table class="board">
<tr>
<td>{b,d}</td>
<td>a</td>
<td>{b,d}</td>
</tr>
<tr>
<td>a</td>
<td>{b,d}</td>
<td>a</td>
</tr>
<tr>
<td>{b,d}</td>
<td>a</td>
<td>{b,d}</td>
</tr>
<tfoot>
<tr>
<td colspan=3>sum/union: 9</td>
</tr>
<tr>
<td colspan=3>max/no mark: 132</td>
</tr>
</tfoot>
</table>
<p></center></p>
<p>This is something of a worst-case because, while there are relatively few distinct words, there are many different ways to find them.</p>
<p><b>Putting it all together</b><br />
Our two bounds do well in different situations. max/no mark works best when there are lots of choices to be made on particular cells and there are relatively few ways to make any particular word. sum/union works best when there are lots of possibilities but relatively few distinct words. Putting them together results in a bound that&#8217;s good enough to find the best 3&#215;3 boggle board using the technique described at the beginning of this post.</p>
<p>Given an initial class of boards, we wind up with what I call a &#8220;breaking tree&#8221;. If the initial class has an upper bound less than 545 points, then we&#8217;re done. Otherwise, we pick a cell to split and try each possibility.</p>
<p>Here&#8217;s a relatively small breaking tree that results from running <a href="http://code.google.com/p/performance-boggle/source/browse/trunk/3x3/ibucket_breaker.cc">this program</a>:</p>
<pre>
$ ./3x3/ibucket_breaker --best_score 520 --break_class "bdfgjkmpvwxz a sy iou xyz aeiou sy chlnrt chlnrt"
(     0%) (0;1/1) bdfgjkmpvwxz a sy iou xyz aeiou sy chlnrt chlnrt (820, 77760 reps)
                            split cell 4 (xyz) Will evaluate 3 more boards...
(     0%)  (1;1/3) bdfgjkmpvwxz a sy iou x aeiou sy chlnrt chlnrt (475, 25920 reps)
(33.333%)  (1;2/3) bdfgjkmpvwxz a sy iou y aeiou sy chlnrt chlnrt (703, 25920 reps)
                            split cell 5 (aeiou) Will evaluate 5 more boards...
(33.333%)   (2;1/5) bdfgjkmpvwxz a sy iou y a sy chlnrt chlnrt (447, 5184 reps)
(    40%)   (2;2/5) bdfgjkmpvwxz a sy iou y e sy chlnrt chlnrt (524, 5184 reps)
                            split cell 3 (iou) Will evaluate 3 more boards...
(    40%)    (3;1/3) bdfgjkmpvwxz a sy i y e sy chlnrt chlnrt (346, 1728 reps)
(42.222%)    (3;2/3) bdfgjkmpvwxz a sy o y e sy chlnrt chlnrt (431, 1728 reps)
(44.444%)    (3;3/3) bdfgjkmpvwxz a sy u y e sy chlnrt chlnrt (339, 1728 reps)
(46.667%)   (2;3/5) bdfgjkmpvwxz a sy iou y i sy chlnrt chlnrt (378, 5184 reps)
(53.333%)   (2;4/5) bdfgjkmpvwxz a sy iou y o sy chlnrt chlnrt (423, 5184 reps)
(    60%)   (2;5/5) bdfgjkmpvwxz a sy iou y u sy chlnrt chlnrt (318, 5184 reps)
(66.667%)  (1;3/3) bdfgjkmpvwxz a sy iou z aeiou sy chlnrt chlnrt (509, 25920 reps)
</pre>
<p>The numbers in parentheses are the upper bounds. When they get below 520 (the parameter I set on the command line), a sub-class is fully broken.</p>
<p>Using this technique and the following partition of the 26 letters:</p>
<ul>
<li>bdfgjvwxz</li>
<li>aeiou</li>
<li>lnrsy</li>
<li>chkmpt</li>
</ul>
<p>I was able to go through all 262,144 (=4^9) board classes in about six hours on a single machine. This resulted in the boards I listed in the <a href="http://www.danvk.org/wp/2009-08-04/solving-boggle-by-taking-option-three/">last post</a>. Six hours is a big improvement over two years!</p>
<p>If that same factor (two years to six hours) held for the 4&#215;4 case, then we&#8217;d be down to 380 years of compute time to find the best 4&#215;4 boggle board. Or, equivalently, 138 days on 1000 machines. That&#8217;s still a lot. We&#8217;re not quite there yet, but we&#8217;re getting closer!</p>
<p>Code for the program that went through all possible board classes can be found <a href="http://code.google.com/p/performance-boggle/source/browse/trunk/#trunk/paper">here</a>. While <a href="http://ai.stanford.edu/~chuongdo/boggle/index.html">many</a> <a href="http://ankurdave.com/AnkurDaveExtendedEssay2009.pdf">people</a> have found high-scoring boards, I haven&#8217;t found any previous work on this upper bounding approach. So if you have any ideas/suggestions on how to improve the bound, they&#8217;re probably novel and useful!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.danvk.org/wp/2009-08-08/breaking-3x3-boggle/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Chart of time.h Functions</title>
		<link>http://www.danvk.org/wp/2009-02-24/chart-of-timeh-functions/</link>
		<comments>http://www.danvk.org/wp/2009-02-24/chart-of-timeh-functions/#comments</comments>
		<pubDate>Wed, 25 Feb 2009 05:04:02 +0000</pubDate>
		<dc:creator>danvk</dc:creator>
				<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://www.danvk.org/wp/?p=459</guid>
		<description><![CDATA[Here&#8217;s a handy chart of the C Standard Library functions in time.h: The ovals are data types and the rectangles are functions. The three basic types are: time_t: number of seconds since the start of the UNIX epoch. This is always UTC! struct tm: A broken-down date, split into years, months, seconds, etc. In Python, [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s a handy chart of the <a href="http://en.wikipedia.org/wiki/C_Standard_Library">C Standard Library</a> functions in <code><a href="http://en.wikipedia.org/wiki/Time.h">time.h</a></code>:</p>
<p><img src="http://www.danvk.org/wp/wp-content/uploads/2009/02/unixtime.png" alt="unixtime" title="unixtime" width="400" height="370" class="aligncenter size-full wp-image-458" /></p>
<p>The ovals are data types and the rectangles are functions. The three basic types are:</p>
<ul>
<li><b>time_t</b>: number of seconds since the start of the UNIX epoch. This is always UTC!</li>
<li><b>struct tm</b>: A broken-down date, split into years, months, seconds, etc. In Python, it&#8217;s a tuple.</li>
<li><b>string</b>: Any string representation of a time, e.g. &#8220;Wed Jun 30 21:49:08 1993&#8221;.</li>
</ul>
<p>Generally you either want a <code>time_t</code> (because it&#8217;s easy to do arithmetic with) or a <code>string</code> (because it&#8217;s pretty to look at). So to get from a <code>time_t</code> to a <code>string</code>, you should use something like <code>strftime("%Y-%m-%d", localtime(time()))</code>. To go the other way, you&#8217;d use <code>mktime(strptime(str, "%Y-%m-%d"))</code>.</p>
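<p>Since Python&#8217;s <code>time</code> module mirrors these functions closely, the round trip can be sketched there without C&#8217;s buffers and pointers. This is only an illustration; the date string and variable names are arbitrary choices of mine.</p>

```python
import time

FMT = "%Y-%m-%d"

# time_t -> struct tm -> string (the "pretty" direction)
now = time.time()                          # seconds since the epoch, like time_t
today = time.strftime(FMT, time.localtime(now))

# string -> struct tm -> time_t (the "arithmetic" direction)
# Note the argument order: strptime takes (string, format), strftime takes (format, tm).
parsed = time.strptime("2009-02-24", FMT)  # a struct_time, Python's struct tm
midnight = time.mktime(parsed)             # local midnight as seconds since the epoch

# Going back out through localtime/strftime recovers the original date string.
round_trip = time.strftime(FMT, time.localtime(midnight))
print(today, round_trip)
```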
<p>This library has been around <a href="http://books.google.com/books?id=D7FVAAAAMAAJ&#038;q=mktime+date:0-1982&#038;dq=mktime+date:0-1982&#038;lr=&#038;as_brr=0&#038;as_pt=ALLTYPES&#038;ei=8c-kSdfON5POkAS21byrBg&#038;pgis=1">since at least 1982</a>. It&#8217;s been replicated in many other languages (Python, Perl, Ruby). We seem to be stuck with it.</p>
<p>Read on for my rant about why this is all idiotic.<br />
<span id="more-459"></span></p>
<p>Let me just say that I think this is a <i>horrible</i> system. You almost never want to use <code>struct tm</code>. Most of the time, you want to go between <code>strings</code> and <code>time_t</code>. But lonely <code>ctime</code> is the only function that makes this jump, and it doesn&#8217;t let you set the output format or time zone.</p>
<p>The names are not exactly descriptive, either. They all end in &#8220;time&#8221;, which makes some sense. <code>strptime</code> and <code>strftime</code> are even OK, if a bit cryptic. The <code>p</code> stands for &#8220;parse&#8221; and the <code>f</code> stands for &#8220;format&#8221;, à la <code>printf</code>. The parameter order is hard to remember, though. Don&#8217;t use <code>gmtime</code> unless you have a good reason. <code>ctime</code> and <code>asctime</code> are nonsensical, but I don&#8217;t use them much, either. My greatest loathing is reserved for <code>localtime</code> and <code>mktime</code>. I can <i>never</i> remember which of these does which. The only mnemonic I can think of: <code>mktime</code> <i>m</i>a<i>k</i>es a <i>time</i>_t from a struct tm.</p>
<p>For another exercise in head-scratching, follow the role of time zones through this chart. <code>time_t</code> knows no time zones &#8212; it&#8217;s always UTC. To get to <code>struct tm</code>, you need to specify a time zone. This is not made explicit in the struct, however, so you need to do your own bookkeeping. The time zone for conversion isn&#8217;t a parameter or anything sensible like that, either. You just get two choices: GMT (UTC) or local time. And if you choose <code>gmtime</code>, you&#8217;ll never be able to get back to <code>time_t</code> because the standard library has no inverse for it. (Some systems supply a <code>mkgmtime</code> or <code>timegm</code> function.)</p>
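<p>Python inherited the same gap: its <code>time</code> module has <code>gmtime</code> but no inverse, and the missing function lives in a different module as <code>calendar.timegm</code>. A small sketch of the UTC round trip, using an arbitrary timestamp of my choosing:</p>

```python
import calendar
import time

t = 1235538242                     # 2009-02-25 05:04:02 UTC, as seconds since the epoch

utc_parts = time.gmtime(t)         # time_t -> struct tm, in UTC
back = calendar.timegm(utc_parts)  # struct tm (UTC) -> time_t

print(utc_parts.tm_year, utc_parts.tm_mon, utc_parts.tm_mday, utc_parts.tm_hour)  # 2009 2 25 5
print(back == t)                   # True
```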
<p>How would I design it? <code>struct tm</code> would lose its place at the center of everything. There would be sensibly-named functions to go between <code>time_t</code> and <code>string</code>:</p>
<ul>
<li><code>time_t parsetime(format, string[, timezone])</code></li>
<li><code>string formattime(format, time_t[, timezone])</code></li>
</ul>
<p>And if you really need them:</p>
<ul>
<li><code>splittime(time_t, struct tm*)</code></li>
<li><code>time_t packtime(struct tm*)</code></li>
</ul>
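<p>To make the first pair concrete, here is a rough Python sketch of what <code>parsetime</code> and <code>formattime</code> might look like. Only the names and argument order come from the list above; the implementation, the choice of UTC as the default when no time zone is given, and the use of <code>datetime.tzinfo</code> objects for the time zone argument are all assumptions of mine.</p>

```python
from datetime import datetime, timedelta, timezone

def parsetime(fmt, s, tz=timezone.utc):
    """string -> seconds since the epoch, reading s as a wall-clock time in tz."""
    return datetime.strptime(s, fmt).replace(tzinfo=tz).timestamp()

def formattime(fmt, t, tz=timezone.utc):
    """seconds since the epoch -> string, rendered as a wall-clock time in tz."""
    return datetime.fromtimestamp(t, tz).strftime(fmt)

FMT = "%Y-%m-%d %H:%M:%S"
pacific = timezone(timedelta(hours=-8))   # fixed UTC-8 offset, for illustration only

print(formattime(FMT, 1235538242))            # 2009-02-25 05:04:02
print(formattime(FMT, 1235538242, pacific))   # 2009-02-24 21:04:02
print(parsetime(FMT, "2009-02-24 21:04:02", pacific))  # 1235538242.0
```

<p>The time zone is an explicit argument in both directions, so no struct has to carry it around implicitly.</p>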
<p>Was that really so hard?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.danvk.org/wp/2009-02-24/chart-of-timeh-functions/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Draggable Table Columns</title>
		<link>http://www.danvk.org/wp/2008-06-12/draggable-table-columns/</link>
		<comments>http://www.danvk.org/wp/2008-06-12/draggable-table-columns/#comments</comments>
		<pubDate>Thu, 12 Jun 2008 07:41:44 +0000</pubDate>
		<dc:creator>danvk</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://www.danvk.org/wp/2008-06-12/draggable-table-columns/</guid>
		<description><![CDATA[Inspired by the sorttable library, I&#8217;ve done some Javascript hacking over the last day and created dragtable, a complementary library which lets you drag column headers around to rearrange HTML tables. A demo will make everything clear: Name Date Favorite Color Dan 1984-07-12 Blue Alice 1980-07-22 Green Ryan 1990-09-23 Orange Bob 1966-04-21 Red Drag the [...]]]></description>
			<content:encoded><![CDATA[<p>Inspired by the <a href="http://www.kryogenix.org/code/browser/sorttable/">sorttable</a> library, I&#8217;ve done some Javascript hacking over the last day and created <a href="/wp/dragtable/">dragtable</a>, a complementary library which lets you drag column headers around to rearrange HTML tables. A demo will make everything clear:</p>
<table width=100%>
<tr>
<td align=center>
<table id=table class="thin draggable" cellpadding=2>
<tr>
<th>Name</th>
<th>Date</th>
<th>Favorite Color</th>
</tr>
<tr>
<td>Dan</td>
<td>1984-07-12</td>
<td>Blue</td>
</tr>
<tr>
<td>Alice</td>
<td>1980-07-22</td>
<td>Green</td>
</tr>
<tr>
<td>Ryan</td>
<td>1990-09-23</td>
<td>Orange</td>
</tr>
<tr>
<td>Bob</td>
<td>1966-04-21</td>
<td>Red</td>
</tr>
</table>
</td>
</tr>
</table>
<p>Drag the column headers to rearrange the table. dragtable is incredibly easy to use. To make a table rearrangeable, just add <code>class=draggable</code> to the <code>table</code> tag. And, if you set <code>class="draggable sortable"</code>, you can have a table that&#8217;s simultaneously sortable and rearrangeable! For more details and a download link, check out the <a href="/wp/dragtable/">dragtable</a> page.</p>
<p>I&#8217;m calling this v0.9 since I&#8217;m sure there are plenty of bugs and tweaks left to make. I&#8217;d love to get some feedback, so take it for a spin and tell me what you think!</p>
<p><b>Update:</b> I&#8217;ve added full-column dragging and bumped the version to 1.0. Head on over to the <a href="/dragtable/">dragtable</a> page, grab a copy, and let me know what you think!</p>
<p><script type=text/javascript src="/dragtable/sorttable.js"></script><br />
<script type=text/javascript src="/dragtable/dragtable.js"></script></p>
<style type=text/css>
  /* Sortable tables */
  table.sortable thead {
    background-color:#eee;
    color:#666666;
    font-weight: bold;
    cursor: default;
  }
  table.thin, table.thin td, table.thin tr, table.thin th {
    border: thin solid black;
    border-collapse: collapse;
  }
</style>
]]></content:encoded>
			<wfw:commentRss>http://www.danvk.org/wp/2008-06-12/draggable-table-columns/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>Reading Old GW-Basic Programs</title>
		<link>http://www.danvk.org/wp/2008-02-03/reading-old-gw-basic-programs/</link>
		<comments>http://www.danvk.org/wp/2008-02-03/reading-old-gw-basic-programs/#comments</comments>
		<pubDate>Sun, 03 Feb 2008 10:50:44 +0000</pubDate>
		<dc:creator>danvk</dc:creator>
				<category><![CDATA[personal]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://www.danvk.org/wp/2008-02-03/reading-old-gw-basic-programs/</guid>
		<description><![CDATA[I found a disk image I&#8217;d made of an old hard drive of mine today (circa 1995) and had some fun browsing through my files. Back then, I was programming in a combination of QBASIC and GW-BASIC. It&#8217;s easy to read old QBASIC programs, since QB saved code as human-readable text. Not so, GW-BASIC. To [...]]]></description>
			<content:encoded><![CDATA[<p>I found a disk image I&#8217;d made of an old hard drive of mine today (circa 1995) and had some fun browsing through my files. Back then, I was programming in a combination of <a href="http://en.wikipedia.org/wiki/QBASIC">QBASIC</a> and <a href="http://en.wikipedia.org/wiki/GW-BASIC">GW-BASIC</a>. It&#8217;s easy to read old QBASIC programs, since QB saved code as human-readable text.</p>
<p>Not so, GW-BASIC. To save space, it stored code in a compact, binary format. This seems like an unnecessary optimization now, but back in 1984 it made a lot of sense. GW-BASIC was an interactive environment, and it stored all your code in memory. Memory was a scarce resource at the time, so every byte counted. Hence the binary format.</p>
<p>I wanted to read my old GW-BASIC programs, so I dug around and found <a href="http://www.chebucto.ns.ca/~af380/GW-BASIC-tokens.html">this discussion</a> of the GW-BASIC binary file format. It&#8217;s incredibly detailed, which let me whip up a decoder in Python over two solid hours of hacking. Without further ado, here it is:</p>
<p><a href="http://www.danvk.org/wp/gw-basic-program-decoder/">GW-Basic Program Decoder</a></p>
<p>For a sample decoding, see below the fold.<br />
<span id="more-285"></span></p>
<p>Here&#8217;s a 20-questions style program I wrote on August 15, 1995:</p>
<pre>
   10 CLS
   20 DIM G$(15),RP(15),RD(15)
   30 PRINT "In this game I will make up a three digit code. The code will be between 100 and 999. You have 15 guesses. After each guess, I will tell you how many digits are correct and how many are in the right position. If you enter an invalid guess, ";
   35 PRINT "or a duplicate answer, I will let you try again. "
   40 PRINT
   50 PRINT "Press any key when ready. . . "
   60 GOSUB 1000
   70 CLS
   80 GOSUB 2000
   90 IF CODE < 100 OR CODE >999 THEN GOTO 80
  100 CODE$=RIGHT$(STR$(CODE),3)
  110 D$=CODE$
  120 FLAG=0
  130 GOSUB 1500
  140 IF FLAG=1 THEN GOTO 80
  150 FOR I=1 TO 15
  160     CLS
  170     IF I=1 THEN GOTO 230
  180     FOR J=1 TO I-1
  190             PRINT "Guess #";J;": ";G$(J);". results: ";
  200             PRINT "digits: ";RD(J);"positions: ";RP(J)
  210     NEXT J
  220     REM
  230     PRINT
  240     PRINT "enter guess #";I;
  250     INPUT G
  260     IF G<100 OR G>999 THEN 240
  270     G$(I)=STR$(G)
  280     IF LEN(G$(I))=0 THEN GOTO 240
  290     G$(I)=RIGHT$(G$(I),3)
  300     D$=G$(I)
  310     FLAG = 0
  320     GOSUB 1500
  330     IF FLAG=1 THEN GOTO 240
  340     FLAG=0
  350     GOSUB 1400
  360     IF FLAG=1 THEN 240
  370     GOSUB 1100
  380     IF RP(I)<>3 THEN GOTO 410
  390     PRINT "Good job!! You guessed the code in";I;"try's."
  400     END
  410 NEXT I
  420 PRINT "Your guesses are up!"
  430 PRINT "The code was";CODE$
  440 END
 1000 R$=INPUT$(1)
 1010 RETURN
 1100 FOR K=1 TO 3
 1110    FOR L=1 TO 3
 1120            G$=MID$(G$(I),K,1)
 1130            C$=MID$(CODE$,L,1)
 1140            IF G$<>C$ THEN GOTO 1170
 1150            IF K<>L THEN RD(I)=RD(I)+1
 1160            IF K=L THEN RP(I)=RP(I)+1
 1170    NEXT L
 1180 NEXT K
 1190 RETURN
 1400 IF I=1 THEN RETURN
 1410 FOR K=1 TO I-1
 1420    IF G$(I)=G$(K) THEN FLAG=1 : RETURN
 1430 NEXT K
 1460 RETURN
 1500 REM
 1510 FOR K=1 TO 3
 1520    FOR L=K+1 TO 3
 1530            IF MID$(D$,K,1)=MID$(D$,L,1) THEN FLAG=1: RETURN
 1540    NEXT L
 1550 NEXT K
 1560 RETURN
 2000 OPEN "r", 1, "b:amunt.ran", 2: FIELD 1, 2 AS T$
 2010 IF LOF(1)=0 THEN CODE = INT(100+RND*999): T$=MKI$(0): GOTO 2050
 2020 GET 1,1
 2030 FOR I=1 TO CVI(T$): CODE=RND:NEXT
 2040 CODE = INT(100+RND*999)
 2050 LSET T$=MKI$(CVI(T$)+1): PUT 1,1
 2060 CLOSE #1: RETURN
</pre>
<p>Oh, for the days of unstructured programming and rampant use of the &#8220;GOTO&#8221; statement. The convention was to number your lines 10, 20, 30, &#8230; so that you could go back and add extra lines between your originals. Hence line 35 above. I must have been really ambitious jumping to line 1000!</p>
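<p>For contrast, here is roughly how the scoring subroutine at lines 1100&#8211;1190 reads as structured code. This is my own Python paraphrase of the listing above, not output from the decoder; the function and variable names are made up for the example.</p>

```python
def score(guess, code):
    """Compare two 3-digit strings the way lines 1100-1190 do.

    Returns (digits, positions): matching digits found in a different
    position (RD in the listing) and in the same position (RP).
    """
    digits = positions = 0
    for k, g in enumerate(guess):
        for l, c in enumerate(code):
            if g != c:
                continue          # line 1140: skip non-matching pairs
            if k == l:
                positions += 1    # line 1160
            else:
                digits += 1       # line 1150
    return digits, positions

print(score("123", "132"))  # (2, 1)
print(score("123", "123"))  # (0, 3)
```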
]]></content:encoded>
			<wfw:commentRss>http://www.danvk.org/wp/2008-02-03/reading-old-gw-basic-programs/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Using Track Parser</title>
		<link>http://www.danvk.org/wp/2007-12-24/using-track-parser/</link>
		<comments>http://www.danvk.org/wp/2007-12-24/using-track-parser/#comments</comments>
		<pubDate>Mon, 24 Dec 2007 19:27:29 +0000</pubDate>
		<dc:creator>danvk</dc:creator>
				<category><![CDATA[music]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://www.danvk.org/wp/?p=245</guid>
		<description><![CDATA[Pitchfork Media has released their two standard year-end lists, the Top 100 Tracks of 2007 and the Top 50 Albums of 2007. As usual, they&#8217;ve been lampooned all over the web, including one critique in pie chart form. For me, they made for perfect listening on a long car drive this weekend. In my case, [...]]]></description>
			<content:encoded><![CDATA[<p><a href='http://www.pitchforkmedia.com/article/feature/47681-staff-list-top-100-tracks-of-2007' title='pitchfork-tracks.png'><img width=145 src='http://www.danvk.org/wp/wp-content/uploads/2007/12/pitchfork-tracks.png' alt='pitchfork-tracks.png' border=0 style="padding-left:5px;" align=right /></a> <a href="http://www.pitchforkmedia.com/">Pitchfork Media</a> has released their two standard year-end lists, the <a href="http://www.pitchforkmedia.com/article/feature/47681-staff-list-top-100-tracks-of-2007">Top 100 Tracks of 2007</a> and the <a href="http://www.pitchforkmedia.com/article/feature/47446-staff-list-top-50-albums-of-2007">Top 50 Albums of 2007</a>. As usual, they&#8217;ve been lampooned all over the web, including one critique in <a href="http://nymag.com/daily/entertainment/2007/12/pitchforks_top_100_tracks_of_2.html">pie chart form</a>. For me, they made for perfect listening on a long car drive this weekend.</p>
<p>In my case, this list led to a good use of my <a href="http://dougscripts.com/itunes/scripts/ss.php?sp=trackparser">Track Parser</a> script, which is in all likelihood the most useful program I&#8217;ve ever written. It&#8217;s an AppleScript for iTunes (i.e. Mac only, sorry) that lets you apply regular expressions to track names/tags. Here&#8217;s how I used it today&#8230;</p>
<p>Through some strange turn of events (certainly nothing to do with <a href="http://www.mininova.org/tor/1054972">this</a>), I found myself with a playlist of the top 100 tracks. The music was all there, but none of the songs had their &#8220;Artist&#8221; field filled in! Here&#8217;s where my Track Parser script came in.</p>
<p>I <a href="http://www.google.com/search?q=list%20of%20pitchfork%202007%20singles">googled around</a> and quickly found <a href="http://idolator.com/tunes/year_end-analysis/pitchfork-thinks-lcd-soundsystems-all-my-friends-is-something-great-334602.php">this page</a>, which has some commentary on the list, as well as what we&#8217;re interested in: a copy of all the songs/artists in simple text form. (For what it&#8217;s worth, I agree with his reactions.)</p>
<p>I copied the list and ran two regular expressions to get it down to just the artist (<code>s/ ".*//g</code> and <code>^\d*: </code> if you must know). The tracks are in reverse order of what we want (100 to 1 instead of 1 to 100). So I ran <code>pbpaste | tac | pbcopy</code> to put the #1 track at the top of the list. Or I would have, if Mac OS X had the <a href="http://en.wikipedia.org/wiki/Tac_(Unix)">tac</a> command. Instead, I ran this monstrosity:</p>
<pre>pbpaste | perl -ne 'push @x, $_; END { print for reverse @x }' | pbcopy</pre>
<p>to do the same thing. In retrospect, I should have just sorted my playlist in reverse track order.</p>
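<p>The whole clean-up (both regular expressions plus the reversal) fits in a few lines of Python. This is a sketch rather than what I actually ran, and the sample lines are placeholders that only assume the <code>number: artist "title"</code> layout implied by the two patterns.</p>

```python
import re

# Placeholder input in the assumed 'N: Artist "Title"' layout, counting down to 1.
listing = '3: Artist C "Song Three"\n2: Artist B "Song Two"\n1: Artist A "Song One"\n'

artists = []
for line in listing.splitlines():
    line = re.sub(r' ".*', "", line)    # drop the quoted title and everything after it
    line = re.sub(r"^\d*: ", "", line)  # drop the leading rank
    artists.append(line)

artists.reverse()                       # the job tac would have done
print("\n".join(artists))
```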
<p>Next I went into iTunes and selected my songs. I ran &#8220;Track Parser (Clipboard)&#8221; from the Scripts menu, clicked &#8220;New Pattern&#8221; and put in &#8220;%a&#8221; to extract the artist from each line. Track Parser handled the rest. Total time: about five minutes.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.danvk.org/wp/2007-12-24/using-track-parser/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>C++ STL sort weirdness</title>
		<link>http://www.danvk.org/wp/2007-12-16/c-stl-sort-weirdness/</link>
		<comments>http://www.danvk.org/wp/2007-12-16/c-stl-sort-weirdness/#comments</comments>
		<pubDate>Mon, 17 Dec 2007 04:59:18 +0000</pubDate>
		<dc:creator>danvk</dc:creator>
				<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://www.danvk.org/wp/?p=241</guid>
		<description><![CDATA[I ran into a weird bug at work this past week. What does this code do? #include #include #include struct Compare { int operator()(int a, int b) { return a - b; } }; int main(int argc, char** argv) { std::vector blah; for (int i=0; i]]></description>
			<content:encoded><![CDATA[<p>I ran into a weird bug at work this past week. What does this code do?</p>
<textarea name="code" class="C++:nocontrols:nogutter" cols="60" rows="10">
#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>

struct Compare {
  int operator()(int a, int b) { return a - b; }
};

int main(int argc, char** argv) {
  std::vector<int> blah;
  for (int i=0; i<20; i++) blah.push_back(20 - i);
  std::sort(blah.begin(), blah.end(), Compare());
  std::copy(blah.begin(), blah.end(),
            std::ostream_iterator<int>(std::cout, "\n"));
}
</textarea>
<p>If you said &#8220;segfault&#8221;, give yourself a pat on the back! Bonus points if you know that changing the &#8220;20&#8221; to a &#8220;16&#8221; will prevent the segfault.</p>
<p>After spending several hours staring at this, I figured out what was going on. Rather than taking a compare function that returns an int (like <a href="http://www.gnu.org/software/libc/manual/html_node/Array-Sort-Function.html">qsort</a> or Perl&#8217;s sort), <code>std::sort</code> wants a &#8220;LessThan&#8221; function:</p>
<textarea name="code" class="C++:nocontrols:nogutter" cols="60" rows="10">
struct LessThan {
  bool operator()(int a, int b) { return a < b; }
};
</textarea>
<p>If I&#8217;ve ever been happy that C++ silently casts ints to bools, I have now done my penance. I&#8217;m still somewhat surprised that std::sort segfaults when given a strange comparison function, rather than returning an unsorted list.</p>
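<p>The underlying problem is that <code>std::sort</code> requires its comparator to be a strict weak ordering, and <code>a - b</code> converted to <code>bool</code> is true whenever the two values differ, in either direction. Python won&#8217;t segfault, but it can model the conversion. The sketch below is only an illustration of that broken contract, with function names of my own choosing:</p>

```python
from functools import cmp_to_key

def compare(a, b):
    """Three-way comparison, as qsort or Perl's sort expect."""
    return a - b

def compare_as_bool(a, b):
    """What a less-than slot sees once the int result is converted to bool."""
    return bool(compare(a, b))

def less_than(a, b):
    """A proper less-than predicate."""
    return a < b

# A valid less-than can never hold in both directions at once.
print(compare_as_bool(1, 2), compare_as_bool(2, 1))  # True True
print(less_than(1, 2), less_than(2, 1))              # True False

# A three-way comparator still works where the API actually expects one.
values = [20 - i for i in range(20)]
print(sorted(values, key=cmp_to_key(compare)) == list(range(1, 21)))  # True
```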
]]></content:encoded>
			<wfw:commentRss>http://www.danvk.org/wp/2007-12-16/c-stl-sort-weirdness/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>NS-Tower in a Canvas Tag</title>
		<link>http://www.danvk.org/wp/2007-10-26/ns-tower-in-a-canvas-tag/</link>
		<comments>http://www.danvk.org/wp/2007-10-26/ns-tower-in-a-canvas-tag/#comments</comments>
		<pubDate>Fri, 26 Oct 2007 21:49:16 +0000</pubDate>
		<dc:creator>danvk</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://www.danvk.org/wp/?p=228</guid>
		<description><![CDATA[I recently noticed that Rice has unceremoniously purged my Owlnet site, so I&#8217;ll be moving some of its content over here. First up: my JavaScript implementation of Nagi-P Software&#8217;s NS-Tower. This is one of the few games I&#8217;ve ever seen with only one control: jump. Your character bounces off the walls and you have to [...]]]></description>
			<content:encoded><![CDATA[<p><img src='http://www.danvk.org/wp/wp-content/uploads/2007/10/nsticonw.gif' alt='nsticonw.gif' align=left style="padding-right:5px;" width=48 height=48 /> I recently noticed that Rice has <a href="http://www.owlnet.rice.edu/~danvk/">unceremoniously purged</a> my Owlnet site, so I&#8217;ll be moving some of its content over here. First up: my <a href="http://www.danvk.org/tower/nstower.html">JavaScript implementation</a> of <a href="http://www.nagi-p.com/eng/">Nagi-P Software&#8217;s</a> <a href="http://www.nagi-p.com/eng/nstw.html">NS-Tower</a>.</p>
<p>This is one of the few games I&#8217;ve ever seen with only one control: jump. Your character bounces off the walls and you have to power him up for jumps. My record is 282 floors on Hard. Can you beat it? No fair using the JavaScript version, though. More details below (warning: it takes a hard right turn for the nerdy)&#8230;<br />
<span id="more-228"></span></p>
<p>The main problem with my JS-Tower is that the levels don&#8217;t get harder as you climb. To come up with a method for placing platforms, I disassembled the bytecode of the <a href="http://www.nagi-p.com/java/nstowere.html">Java version</a> of NS-Tower and discovered this method:</p>
<textarea name="code" class="C++:nocontrols:nogutter" cols="60" rows="10">
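public int getPlatCol(int row) {
	// Random offset of 0, 16, 32, 48 or 64 (Math.round returns a long, hence the cast).
	int jitter = (int) Math.round( 4.0 * Math.random() ) * 16;
	int r = 16 + jitter
		   + 64 * ( (row%4)/2 )   // integer division: 0, 0, 1, 1 over each group of four rows
		   + 144 * ( row%2 );     // alternates 0, 144
	return r;
}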
</textarea>
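<p>To see the pattern at a glance, here is a small Python transcription of the method above with the deterministic and random parts split apart. The split and the helper names are mine; the constants are the ones in the Java code.</p>

```python
import math
import random

def base_col(row):
    """Deterministic part of the column: repeats every four rows."""
    return 16 + 64 * ((row % 4) // 2) + 144 * (row % 2)

def plat_col(row, rng=random):
    """Base column plus a random offset of 0, 16, 32, 48 or 64."""
    jitter = math.floor(4.0 * rng.random() + 0.5) * 16  # mirrors Java's Math.round
    return base_col(row) + jitter

print([base_col(r) for r in range(8)])
# [16, 160, 80, 224, 16, 160, 80, 224]
```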
<p>It&#8217;s basically a pattern of four repeating platforms, plus some randomness. My JavaScript version would be more fun if I could discover the function used for platform generation by the Mac/PC versions. Here are some of the options, in order of increasing difficulty:</p>
<ol>
<li>Contact Nagi-P software and ask for source. This was painless, but seeing as their site hasn&#8217;t been updated in years, I doubt I&#8217;ll get a response. I&#8217;ve tried tracking Akihiko Kusanagi down, but have had limited success.</li>
<li>Disassemble the Windows NS-Tower. This is orders of magnitude more difficult than disassembling Java bytecode, but it may be just the excuse I need to try out <a href="http://www.datarescue.com/">IDA Pro</a>. Presumably platform generation uses just a few opcodes; it&#8217;s just finding them that will be the trouble.</li>
<li>Do a statistical analysis of NS-Tower platforms. I&#8217;ve gone surprisingly far down this road, but I&#8217;m afraid it will only lead to madness. The same four-platform sequence is present, but it starts to break down around floor 100. Here&#8217;s a <a href="http://danvk.org/tower/hard.html">sample run</a>.</li>
</ol>
<p>Any other ideas?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.danvk.org/wp/2007-10-26/ns-tower-in-a-canvas-tag/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Java Surprise</title>
		<link>http://www.danvk.org/wp/2007-10-09/a-java-surprise/</link>
		<comments>http://www.danvk.org/wp/2007-10-09/a-java-surprise/#comments</comments>
		<pubDate>Wed, 10 Oct 2007 06:44:25 +0000</pubDate>
		<dc:creator>danvk</dc:creator>
				<category><![CDATA[boggle]]></category>
		<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://www.danvk.org/wp/?p=224</guid>
		<description><![CDATA[I&#8217;ve always been a Java and Eclipse naysayer, but I&#8217;m afraid new experiences are forcing me to reevaluate my skepticism. The last time I used Java was JDK 1.3 on a Sparc workstation back in early 2004. Eclipse was hella slow on that hardware, and somehow my workspace wound up in a temporary directory. This [...]]]></description>
			<content:encoded><![CDATA[<p><img src='http://www.danvk.org/wp/wp-content/uploads/2007/10/java.png' alt='java.png' align=right style="padding-left: 10px;" /><br />
I&#8217;ve always been a <a href="http://en.wikipedia.org/wiki/Java_(programming_language)">Java</a> and <a href="http://en.wikipedia.org/wiki/Eclipse_(software)">Eclipse</a> naysayer, but I&#8217;m afraid new experiences are forcing me to reevaluate my skepticism. The last time I used Java was JDK 1.3 on a Sparc workstation back in early 2004. Eclipse was hella slow on that hardware, and somehow my workspace wound up in a temporary directory. This was a very bad thing, because as soon as I logged out, my project was gone forever. So I had good reason to swear off Eclipse.</p>
<p>More generally, Java gave off a mighty stink back in 2004. Any GUI that I ran on the Mac looked out of place and felt clunky. Performance was poor. But in retrospect, I suspect much of the rank Java smell was really coming from the <a href="https://sys.cs.rice.edu/course/comp314/07/">design patterns gibberish</a> I was being force-fed at the same time. Why use a simple array when you could use an AbstractListFactory that does the same thing with 10x code bloat?</p>
<p>Regular readers only get one guess what program I wrote to get in the swing of things.<br />
<span id="more-224"></span><br />
The <a href="http://performance-boggle.googlecode.com/svn/trunk/">C++ code</a> translated almost line-for-line to Java. Most of the changes were either syntax (&#8220;->&#8221; to &#8220;.&#8221;) or swapping <code>String</code> for <code>char*</code>. It took a while to remember that everything is a pointer in Java. The Java statement <code>Trie t = new Trie();</code> is the equivalent of C++&#8217;s <code>Trie* t = new Trie;</code> and not plain old <code>Trie t;</code>, which uses the stack. Another major pain was the lack of unsigned types, which I used as hash codes in some tests.</p>
<p>Here&#8217;s the <a href="http://performance-boggle.googlecode.com/svn/trunk/java/boggle/">Java code</a>, for those interested. I expected Java code to be an order of magnitude slower than the <a href="http://performance-boggle.googlecode.com/svn/trunk/">equivalent C++</a>, so this initial benchmark surprised me:</p>
<pre>
C++:  Evaluated 21632 boards in 0.694 seconds = 31,162 bds/sec
Java: Evaluated 21632 boards in 1.108 seconds = 19,523 bds/sec
</pre>
<p>That&#8217;s only a <b>37% performance penalty</b>, far less than the 90% or more I was expecting. Since my code doesn&#8217;t create/destroy any objects during the critical section, I&#8217;m really testing the effectiveness of the JIT. Since this code is a direct C conversion, it makes sense that it does well.</p>
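<p>For the record, the 37% is a throughput figure. A quick Python check using the numbers from the benchmark output above (the variable names are mine) shows that the same gap, measured as wall-clock time, comes to roughly 60% longer:</p>

```python
cpp_rate = 31162    # boards/sec, from the C++ line above
java_rate = 19523   # boards/sec, from the Java line above
cpp_secs, java_secs = 0.694, 1.108

throughput_penalty = 1 - java_rate / cpp_rate   # fraction fewer boards per second
extra_time = java_secs / cpp_secs - 1           # fraction more wall-clock time

print(f"{throughput_penalty:.1%}")  # 37.3%
print(f"{extra_time:.1%}")          # 59.7%
```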
<p>I don&#8217;t understand the internals of Java well enough to know how valid my benchmark is. Profiling C++ code is relatively straightforward, since nothing is going on behind your back. Not the case with Java. For example, does that 1.108 seconds include the time it took the JIT to compile my code? That might explain the whole perf difference. Also, am I fully optimizing? I hear there&#8217;s a difference between the <a href="http://java.sun.com/j2se/1.3/docs/guide/performance/hotspot.html#client">client</a> and <a href="http://java.sun.com/j2se/1.3/docs/guide/performance/hotspot.html#server">server</a> JITs. Which one am I using? How can I tell? Lazyweb?</p>
<p>A few more thoughts on the whole experience:</p>
<ul>
<li>Eclipse is just great, especially for someone whose knowledge of the JDK is rusty. Having &#8220;<code>string.</code>&#8221; bring up a list of methods was great. This was less useful for more obscure Java-isms, like <code>System.getProperty("user.dir")</code> to get the current working directory.</li>
<li>When Eclipse spots an error in your code, it suggests a solution, which you can double-click to perform. This was just perfect for a rusty coder. It added lines like <code>import java.io.File;</code> to my code, which would have taken me a while to figure out on my own. That&#8217;s always a nuisance in C++.</li>
<li>I can&#8217;t figure out what command line Eclipse is using to run my code. I have no idea how it&#8217;s running my unit tests. Is there any way of finding this out?</li>
<li>JUnit was very easy to use inside Eclipse. I particularly liked the ability to write unit tests inside the class they were testing. Just tag a method with <code>@Test</code> and it becomes a unit test. Very cool.</li>
<li>I found the whole StringBuilder business pretty clunky. Writing <code>"a" + "b"</code> is a special kludge for <code>new StringBuilder().append("a").append("b").toString()</code>. I also missed <code>printf</code> when writing to stdout. Is there no natural way to mix numbers and strings?</li>
<li>Java GUIs on the Mac still aren&#8217;t quite there. When I moused over the icons in Eclipse, there was noticeable flicker.</li>
</ul>
<p>A tool like Eclipse is a great boon in understanding a large, foreign code base. One thing I&#8217;ve discovered in my past year at Google is that, while I&#8217;ve become a fairly good coder through experience, I have almost no external experience working with other people&#8217;s code. Make no mistake, navigating other people&#8217;s code is a skill. And to a surprising extent, it&#8217;s a skill disjoint from coding itself. We have a few tools to help with this at Google, but nothing quite so low-latency as what Eclipse pops up when you type a period. The possibility of tools like Eclipse is a strong argument for languages with a simple syntax. When anyone can parse a language, they can build amazing tools like this one.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.danvk.org/wp/2007-10-09/a-java-surprise/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
