Regular expressions

Finding An Anagram with Linq and Regex

Posted on Updated on

Once again, thanks to Will Shortz and NPR for supplying a challenging puzzle so we can play with some fun .NET features!

Solution

This time, the puzzle was easy; the challenge was getting a list of song titles. I experienced the following issues getting a song list:

  • Wikipedia (normally the primo place for lists) didn’t have one.
  • The first decent alternate site
    • Spread its list over 85 pages
    • Used an ‘anti-robot’ policy that blocked my IP if I tried to access too many pages too fast. Argh!
  • Fortunately, I was able to find GoldStandardSongList.com, a nice site with 5,000 titles all on one page☺

Code Techniques Used

The Secret Trick to Solving Anagrams

SlowThe first time you try to solve an anagram, you might try some inefficient method, like building every letter-combination of your candidate. That would be super-slow, because there are a lot of combinations for a 15 letter song title. Listing all combinations is an O(n!) algorithm. (Check-out Wikipedia Big-O Notation if you never  took a class in algorithm analysis.)

­……

Faster Method: Sort Your Candidate’s Letters

FastHere’s a better way: sort the letters for the target and candidate, then compare the result. For example:

  • ‘COOL HIT FARE IN LA’ → ‘aacefhiillnoort’
  • ‘HOTEL CALIFORNIA’    → ‘aacefhiillnoort’

Obviously, because the the sorted strings are equal, the two original strings must be anagrams. This is much faster, because you can sort a string in O(n∙ ln) time.

 

Compare n-Factorial Vs. n ∙Ln(n)

For comparison,

  • 15!            = 1,307,674,368,000 – n factorial
  • 15∙ln(15) = 17                             – n ln n

Hint: if you don’t like staring at an hourglass, use my algorithm! Would you rather run 17 operations, or 1.3 billion?

TriumphantFaceFor fun, challenge your bone-headed co-worker to a race and laugh while he pounds his screen in frustration!

 

Use Linq to Clean-Up your Song Title and Sort Letters: Easy!

Here’s the method I wrote that takes a song title as input, cleans it up, and sorts the letters:

private string CleanAndSortLetters(string input) {
  string result = input.Where(c => char.IsLetter(c))
                        .Select(c => char.ToLower(c))
                        .OrderBy(c => c)
                        .Aggregate("", (p, c) => p + c);
  return result;
}

Highlights

  • char.IsLetter  – Should be obvious! It removes spaces, quote marks, parenthesis, etc.
  • OrderBy – Sorts the letters in the string. Note that Linq will treat your string as an IEnumerable. Basically a list of char’s.
  • Aggregate – combines the char list back into a string

Use Regular Expressions to Extract Song Titles from a Web Site

The challenge is to extract just the song title from all the HTML markup.

SongListDisplay
Here’s how the HTML list looks in your browser. To the novice, it looks easy to grab just column-two from the formatted table.

Here’s the Some of the Same Table in HTML Markup

HtmlMarkup
You can barely see the ‘payload’ (Hotel California) for all the markup. Grabbing just that text is not as easy as it looks!

Regular Expressions Find Your Pattern!

LookForAPatternRegular expressions are a powerful means to match patterns, but kinda hard to learn. They’re worth the effort for a task like this, because doing it differently would take forever and be just as complicated.

Here’s my Regular Expression in action – it looks crazy, but I’ll explain it below:

//Build a pattern to match the song titles:
string pattern = "class=\"table1column2\"[^>]+><p>" +
				".*?<span[^>]+>([^<]+)</span>";
//Use the pattern to build a regular expression
Regex re = new Regex(pattern, 
        RegexOptions.Compiled | RegexOptions.IgnoreCase 
		| RegexOptions.Singleline);

//Find all the matches in the string 'pageText':
Match m = re.Match(pageText);

Quick note: the RegexOptions “Compiled” and “IgnoreCase” should be obvious. But “SingleLine” is counter-intuitive: you should use it when you try to match text containing linebreaks.

Also, I have some other code to loop through the matches, very easy but shown below, and also to extract the capture group and use it, again shown below.

Explanation of the Regex Pattern

The Regex packs a lot of meaning into a small space. Here’s a detailed explanation of every section. As you can see, it takes longer to explain it than it does to just run it. Skip it unless you have a super attention span!

Regex Fragment Explanation Match Comment
class=”table1column2″ Match the exact text <td class=”table1column2″ width=”30%” style=”padding-top: 0in; padding-bottom: 0in” valign=”top”><p>
<span style=”font-weight: bold; font-family:Arial, sans-serif”>Hotel California</span>
[^>]+ Any character that is not >, one or more repetitions. <td class=”table1column2″ width=”30%” style=”padding-top: 0in; padding-bottom: 0in” valign=”top”><p>
<span style=”font-weight: bold; font-family:Arial, sans-serif”>Hotel California</span>
The so-called “negated character class” is a great way to skip over text until it
hits a special character you know is always present, like the >
> Literal match <td class=”table1column2″ width=”30%” style=”padding-top: 0in; padding-bottom: 0in” valign=”top”><p>
<span style=”font-weight: bold; font-family:Arial, sans-serif”>Hotel California</span>
.*? Any character, as few as possible (“lazy star”) <td class=”table1column2″ width=”30%” style=”padding-top: 0in; padding-bottom: 0in” valign=”top”><p>
<span
style=”font-weight: bold; font-family:Arial, sans-serif”>Hotel California</span>
I would prefer to use a negated character class, but there is a line-break in the text
I wish to skip over. Lazy star grabs matches until we hit the expression below.
<span Literal match <td class=”table1column2″ width=”30%” style=”padding-top: 0in; padding-bottom: 0in” valign=”top”><p>
<span style=”font-weight: bold; font-family:Arial, sans-serif”>Hotel California</span>
This terminates the lazy star match above
[^>]+ Any character that is not >, one or more repetitions <td class=”table1column2″ width=”30%” style=”padding-top: 0in; padding-bottom: 0in” valign=”top”><p>
<span style=”font-weight: bold; font-family:Arial, sans-serif”>Hotel California</span>
Another negated character class – super helpful!
> Another literal match <td class=”table1column2″ width=”30%” style=”padding-top: 0in; padding-bottom: 0in” valign=”top”><p>
<span style=”font-weight: bold; font-family:Arial, sans-serif”>Hotel California</span>
([^<]+) Finally! My Capture Group; grab everything until we hit a < <td class=”table1column2″ width=”30%” style=”padding-top: 0in; padding-bottom: 0in” valign=”top”><p>
<span style=”font-weight: bold; font-family:Arial, sans-serif”>Hotel California</span>
Basically another negated character class, wrapped in parenthesis
</span> Literal match <td class=”table1column2″ width=”30%” style=”padding-top: 0in; padding-bottom: 0in” valign=”top”><p>
<span style=”font-weight: bold; font-family:Arial, sans-serif”>Hotel California<span>

Here’s the Code to Solve the Puzzle

As you can see, the actual code is quite short. Again, explaining it is harder than just doing it!

private const string SONG_URL = 
  @"https://www.goldstandardsonglist.com/Pages_Sort_2a/Sort_2a.htm";
  
private void btnSolve_Click(object sender, RoutedEventArgs e) {
  string anagramThis = "COOL HIT FARE IN LA";
  Mouse.OverrideCursor = Cursors.Wait;

  string matchThis = CleanAndSortLetters(anagramThis);

  string pageText;
  //Open the song URL in a web client and read all the text
  using (WebClient wc = new WebClient()) {
    using (StreamReader sr = new StreamReader(wc.OpenRead(SONG_URL))) {
      pageText = sr.ReadToEnd();
    }
  }

  string pattern = "class=\"table1column2\"[^>]+" +
           "><p>.*?<span[^>]+>([^<]+)</span>";
  Regex re = new Regex(pattern, 
      RegexOptions.Compiled | RegexOptions.IgnoreCase);

  Match m = re.Match(pageText);
  //Examine every match
  while (m.Success) {
    //Grab the capture group and remove line breaks etc.
    string songName = m.Groups[1].Value
      .Replace("\r", "")
      .Replace("\n", " ");
    string candidate = CleanAndSortLetters(songName);
    //The key to solving the puzzle:
    if (candidate == matchThis)
      txtAnswwer.Text += songName + "\n";

    m = m.NextMatch();
  }
  Mouse.OverrideCursor = null;
}

Code Comments

I left-out the method ‘CleanAndSortLetters‘ because I already listed it above.

If you download my code and run it, you can set a breakpoint and see that it takes several seconds to download the song list. But, after that slow operation, the rest of the code runs in a few milliseconds!

Summary

SummarySolving the puzzle is easy with Linq; you just use the Linq operators to sort the letters, remove spaces etc, and convert to lower case. After that, you compare it to the target; if they are the same, you’ve found a winner!

 

 

The hard part is getting your list of song titles. Get the raw HTML with WebClient.Open, then use Regular Expressions to extract patterns from your HTML markup. Most of my Regex pattern consists of literal matches, or else negated character classes. As you remember, negated character classes allow you to match until you hit a designated character, such as < or >.

Download the Code

You can download my code here from my Dropbox account; note that they will force you to create a free account before proceeding with the download.

Solved the NPR Sunday Puzzle from June 1, 2014

Posted on Updated on

The Challenge

This challenge comes from listener Dan Pitt of Palo Alto, Calif. Take the name of a well-known American businessman — first and last names. Put the last name first. Insert an M between the two names. The result names a food item. What is it?

Challenge URL: http://www.npr.org/2014/06/01/317697851/getting-to-know-the-in-crowd

Facing the Challenge

Writing the code to solving this one was simple; the real challenge was getting the right data to solve! For some reason, businessmen don’t get much attention on Wikipedia or other sites. Wikipedia has lists of chefs, dentists, copyrighters, etc. but no list of businessmen! I was able to find a list of Entrepreneurs and CEOs, but those are not quite the same thing. Fortunately, the two lists were sufficient to solve the puzzle.

Likewise, finding the foods was not too easy, mainly because I found a huge list at USDA, but said list turned-out to be a list of processed foods that did not include the answer. Fortunately, I was able to solve using the next place I looked for foods, namely, a list of fruits from Wikipedia. I guess USDA doesn’t need to list fruits on their web site.

Nonetheless, it was quite satisfying to solve the puzzle!

Screen shot with solution
Screen shot showing the program in action and arriving at the solution, ‘Elon Musk’

Code to Solve the Puzzle, Given that the Data is Available

Let’s jump right into the guts of the issue, solving the puzzle. First, here is the code to test if a given name is a solution.

//This method will return true if fullName can be manipulated to match a key
//in the food dictionary, and will set the output variable 'answer'
public bool NameIsASolution(string fullName, 
                       Dictionary<string, string> foodDic, out string answer) {
    answer = string.Empty;
    //Only look at names that have at least one space, i.e. a first and last name
    if (fullName.Contains(' ')) {
        //Get the first and last name, using the 
	//positions of the first and last space in the name
        string firstName = fullName.Substring(0, fullName.IndexOf(' '));
        int p = fullName.LastIndexOf(' ');
        string lastName = fullName.Substring(p + 1, fullName.Length - p - 1);

        //Reverse the last and first names, while 
	//inserting the letter 'm'
        string candidate = lastName + 'm' + firstName;
        //The dictionary uses hashing technology to efficiently 
	//tell us if the candidate is a key in it:
        if (foodDic.ContainsKey(candidate)) {
            answer = foodDic[candidate];
        }
        return foodDic.ContainsKey(candidate);
    } else {
        return false;
    }
}

Note:

  • We identify the first name (if present) using IndexOf(‘ ‘)
  • In other words, find the first space in the name
  • We identify the last name using LastIndexof(‘ ‘)
  • Which won’t work if a name ends with “Jr.”, but we don’t care for this puzzle
  • The dictionary has already been constructed to strip-out spaces and punctionation in the key
  • But each value contains the original, unmanipulated fruit name

The code above gets called for every entry in two files:

  1. BUSINESSMAN_FILE
  2. CEO_FILE

Why two files? Because I elected to save the businessmen in a different format, including their country and area of specialization. So processing each line in the file requires slightly more work, to extract just the name from each line. In contrast, the food and CEO files list a simple name on each line.

Here is the code that fires when user clicks the button to solve the puzzle:

private void btnSolve_Click(object sender, RoutedEventArgs e) {
    //The original food name is the value, the key is the food in lower case with no spaces or punctuation
    Dictionary<string, string> foodDic = new Dictionary<string, string>();
    using (StreamReader sr = File.OpenText(FOOD_FILE)) {
        while (sr.Peek() != -1) {
            string aLine = sr.ReadLine();
            //Use Linq to get only letters (the Where-clause) and then reassemble
            //them from char to string (the Aggregate clause)
            string aKey = aLine
                                .ToLower()
                                .Where(c => c >= 'a' && c <= 'z')
                                .Aggregate("", (p, c) => p + c);
            if (!foodDic.ContainsKey(aKey)) {
                foodDic.Add(aKey, aLine);
            }
        }
    }

    //Process each line in the businessman file
    string answer = string.Empty;
    using (StreamReader sr = File.OpenText(BUSINESSMAN_FILE)) {
        while (sr.Peek() != -1) {
            string aLine = sr.ReadLine();
            //The full name is all the text preceding the first tab:
            string fullName = aLine.Substring(0, aLine.IndexOf('\t'));
            if (NameIsASolution(fullName.ToLower(), foodDic, out answer)) {
                txtAnswer.Text += answer + "<-->" + fullName + "\n";
                stbMessage.Text = "Solved!";
            }
        }
    }

    //Now process each line in the CEO file:
    using (StreamReader sr = File.OpenText(CEO_FILE)) {
        while (sr.Peek() != -1) {
            string aLine = sr.ReadLine();
            if (NameIsASolution(aLine.ToLower(), foodDic, out answer)) {
                txtAnswer.Text += answer + "<-->" + aLine + "\n";
                stbMessage.Text = "Solved!";
            }
        }
    }
}

As you can see, the code above

  • Opens the food file and builds a dictionary with the contents
  • Opens the businessmen file, finds the name on each line, and tests it to see if it is a solution
  • Does the same thing for the CEO file

Getting the Data

If you have read my blog before, you know that I generally adopt the same approach:

  1. Go to a URL with a .NET WebClient
  2. Download the html via the OpenRead method
  3. Identify the data using a regular expression
  4. Save the regular expression matches to a file

You can read one of my other blog entries to get the code for that technique; today I will just list the regular expression that I used for each URL.

URL Regex
http://en.wikipedia.org/ wiki/List_of_chief_executive_officers <li><a         #Literal
[^>]+         #Anything but >
>                #Literal
([^<]+)       #Capture group, anything but <
</a>           #Literal
(                 #Start capture group
\s*              #Whitespace
\(                #Escaped parenthesis
[^)]+           #Anything but )
\),               #Escaped parenthesis, comma
)?                #End capture group, optional
([^<]+)?     #Start capture group, anything but <, optional
</li>            #Literal
http://en.wikipedia.org/
wiki/
List_of_fruits
<span       #Literal
\s+           #Whitespace
id=”          #Literal
([^”]+)      #Capture group, anything but quote
”               #Literal
http://en.wikipedia.org/wiki/List_of_chief_executive_officers  <tr>         #Literal
\r?\n?        #Optional carriage return linefeed
<td><a      #Literal
[^>]+        #Anything but >
>               #Literal
.*?             #Anything, lazy match
</a></td> #Literal match
\r?\n?        #Optional carriage return linefeed
<td><a      #Literal
[^>]+         #Anything but >
>               #Literal
([^<]+)      #Capture Group, anything but <
</a></td> #Literal match
http://ndb.nal.usda.gov/ndb/foods  <a                             #Literal
\s+                             #Whitespace
href=”/ndb/foods/show #Literal
[^>]+                         #Anything but >
>                                #Literal
([^<]+)                       #Capture group, anything but <
</a>                           #Literal
\r?\n?                         #Optional carriage return linefeed
\s+                            #Whitespace
</td>                         #Literal

Summary

Use regular expressions to capture the data from web sites. Use string manipulation to reverse the businessman name and inject the letter ‘M’. Build a dictionary of foods using the food name as the key. For each manipulated businessman name, test if the dictionary contains that value as a key; if so, we have a winner!

 

Body Parts From Literary Character – NPR Puzzle!

Posted on

The Challenge:

This week’s challenge comes from listener Steve Baggish of Arlington, Mass. Name a title character from a classic work of fiction, in 8 letters. Change the third letter to an M. The result will be two consecutive words naming parts of the human body. Who is the character, and what parts of the body are these?

Link to the puzzle: http://www.npr.org/2014/02/09/273644645/break-loose-break-loose-kick-off-your-sunday-shoes

Solution to the puzzle
Three solutions to the puzzle

Satisfying, But Complex!

I was able to solve this one with

The Algorithm

The hard part was identifying names from my book list, especially when you consider that the winner is no longer a common name, even though he is featured in the title of three books. Many more book characters are not common names! The basic idea is to examine the list of books and consider, as a name candidate, any words that meet the following criteria:

  1. If it is used in a possessive, such as “Charlotte’s Web”
  2. Or if it is found in our first-name list (boys and girls), and has 8 letters
  3. Or if it is a word pair, starting with an entry in our first-name list
  4. Or if the length of the title is exactly 8 letters, excluding spaces, hyphens, like “Jane Eyre”
  5. Finally, if  it is a word or word pair preceded by a preposition, such as “Flowers for Algernon”

Once we have the list of candidate names, we compare them against our list of body parts. After, of course, substituting ‘m’ for the 3rd character.

The shortest body part in my list has length 3; each literary name has length 8. We also know, from Will’s directions, that the two body-part words are consecutive. Therefore, each literary name can be split three different ways, which we perform with a loop htat splits the name  into 3 candidates, such as

  • gum/liver
  • guml/iver
  • gumli/ver

I elected to store my books and associated names in a dictionary; the name (e.g. ‘gulliver’) will be my key, while the value will be the book title. You can loop through a .NET dictionary using a for-each loop, and each entry will be a KeyValuePair, in my case, using string data type for the key and string data type for the value.

For each dictionary pair (name/book title), split the name, and if we can find both resulting words in our body parts list, we have a winner! As discussed above, there are 3 ways  to split each name.

Code Snippet for the Win

//Loop through the dictionary 'characterDic'; each entry is a name (the key) and a value (book title)
foreach (KeyValuePair<string, string> kvp in characterDic) {
    //replace 3th letter with 'M', 
    //for example 'gulliver' --> 'gumliver'
    string theCharacter = kvp.Key.Substring(0, 2) 
                    + "m" + kvp.Key.Substring(3);

    //Chop theCharacter at several points to see 
    //if they might be body parts
    for (int divPt = 3; divPt <= 5; divPt++) {
        string word1 = theCharacter.Substring(0, divPt);
        string word2 = theCharacter.Substring(divPt);
        //Check if both words are found in our body 
        //parts list using BinarySearch
        //Binary search returns an index, which will be 
        //greater or equal to zero, where the search item was found
        if (bodyPartsList.BinarySearch(word1) >= 0 
            && bodyPartsList.BinarySearch(word2) >= 0) {
            //Append the two words, plus the book 
            //title(s) to the text box for ansers
            txtAnswers.Text += word1 + "+" + word2 + " ? " + kvp.Value + "\n";
        }
    }
}

Notes

  • Binary search is very fast (it runs in O(Log(n)) time; I could have used a dictionary for even more speed but my lists aren’t that big. As noted above, when you run a .NET binary search, it returns the index of the entry if it finds what you searched for.
  • I used a dictionary for the characters and their associated title characters, not for the speed, but because it offers a convenient way to store a pair and avoid duplicates. As you probably noticed from my solution screen shot, ‘Gulliver’ is the title character in 3 books
    • To handle that, I elected to use ‘Gulliver’ for the dictionary key, when I found other books with the same title character, I simply appended the new book name to the entry whose key is ‘Gulliver’

Code Listing

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Windows;
using System.Windows.Controls;
using System.Windows.Input;
using System.IO;
using System.Text.RegularExpressions;

namespace Feb_9_2014 {
    /// <summary>
    /// Interaction logic for MainWindow.xaml
    /// </summary>
    public partial class MainWindow : Window {
        private const string BOOK_LIST 
            = @"C:\Users\Randy\Documents\Puzzles\Data\CleanedBookList.txt";
        private const string BODY_PARTS_LIST 
            = @"C:\Users\Randy\Documents\Puzzles\Data\BodyPartsList.txt";
        private const string BOY_NAME_LIST 
            = @"C:\Users\Randy\Documents\Puzzles\Data\CommonBoyNames.txt";
        private const string GIRL_NAME_LIST 
            = @"C:\Users\Randy\Documents\Puzzles\Data\CommonGirlNames.txt";
        private const string PREPOSITION_LIST 
            = @"C:\Users\Randy\Documents\Puzzles\Data\EnglishPrepositions.txt";
        private const string WORD_LIST 
            = @"C:\Users\Randy\Documents\Puzzles\Data\web2.txt";

        //links to body parts list: 
        //http://www.ego4u.com/en/cram-up/vocabulary/body-parts
        //http://www.enchantedlearning.com/wordlist/body.shtml

        //Booklist from http://www.goodreads.com

        public MainWindow() {
            InitializeComponent();
        }

        /// <summary>
        /// Fired when user clicks the Solve button
        /// </summary>
        /// <param name="sender">The button</param>
        /// <param name="e">Event args</param>
        private void btnSolve_Click(object sender, RoutedEventArgs e) {
            Mouse.OverrideCursor = Cursors.Wait;

            //Build a unique list of names from the census list of first names:
            List<string> firstNameList = new List<string>();
            AugmentNameListFromFile(BOY_NAME_LIST, firstNameList);
            AugmentNameListFromFile(GIRL_NAME_LIST, firstNameList);

            //Most of the work is done here, creating potential 
            //character names from m list of books
            Dictionary<string, string> characterDic 
                = Build8LetterCharcterList(firstNameList);

            List<string> bodyPartsList = BuildBodyPartsList();

            //Now use our list of potential character names 
            //and the body parts list to solve:
            foreach (KeyValuePair<string, string> kvp in characterDic) {
                //replace 3th letter with 'M', 
                //for example 'gulliver' --> 'gumliver'
                string theCharacter = kvp.Key.Substring(0, 2) 
                                + "m" + kvp.Key.Substring(3);

                //Chop theCharacter at several points to see 
                //if they might be body parts
                for (int divPt = 3; divPt <= 5; divPt++) {
                    string word1 = theCharacter.Substring(0, divPt);
                    string word2 = theCharacter.Substring(divPt);
                    //Check if both words are found in our body 
                    //parts list using BinarySearch
                    //Binary search returns an index, which will be 
                    //greater or equal to zero, where the search item was found
                    if (bodyPartsList.BinarySearch(word1) >= 0 
                        && bodyPartsList.BinarySearch(word2) >= 0) {
                        //Append the two words, plus the book 
                        //title(s) to the text box for ansers
                        txtAnswers.Text += word1 + "+" + word2 + " ? " + kvp.Value + "\n";
                    }
                }
            }
            Mouse.OverrideCursor = null;
        }

        /// <summary>
        /// Reads a file, finds the boy/girl name at position 1-15, 
        /// and adds to the list if not already present
        /// </summary>
        /// <param name="fName">File Name containing boy/girl names</param>
        /// <param name="firstNameList">List to augment</param>
        /// <remarks>
        /// The list we build will be sorted by virtue of how we insert new names
        /// </remarks>
        private void AugmentNameListFromFile(string fName, List<string> firstNameList) {
            using (StreamReader sr = File.OpenText(fName)) {
                while (sr.Peek() != -1) {
                    string aName = sr.ReadLine().ToLower().Substring(0, 15).TrimEnd();
                    int p = firstNameList.BinarySearch(aName);
                    if (p < 0) {
                        //When p < 0, it means the name is not present in the list
                        //Also, when we invert p, that becomes the index where 
                        //it belongs to keep the list sorted
                        firstNameList.Insert(~p, aName);
                    }
                }
            }
        }

        /// <summary>
        /// Builds a dictionary of character names (8 letters long) 
        /// and the book titles they are found in
        /// </summary>
        /// <param name="firstNameList">List of fist names</param>
        /// <returns>The dictionary</returns>
        /// <remarks>
        /// We will consider a word a name if
        ///     1) it is used in the possessive, such as "Charlotte's Web"
        ///     2) It is found in our first-name list and has 8 letters
        ///     3) It is a word pair starting with an entry in our first-name list
        ///     4) The length of the title is 8 letters, 
        ///        excluding spaces, hyphens, like Jane Eyre
        ///     5) It is a word or word pair preceded by a preposition, 
        ///        such as Flowers for Algernon
        /// </remarks>
        private Dictionary<string, string> Build8LetterCharcterList(List<string> firstNameList) {
            Dictionary<string, string> result = new Dictionary<string, string>();

            //the word list will contain almost every English 
            //word that is not in our firstname list
            List<string> wordList = BuildWordList(firstNameList);
            //A list of 50 words like to/as/for
            List<string> prepositionList = BuildPrepositionList();

            //This Regular expression will detect 8 (or 9 with a space) 
            //letter words followed by apostrphe s
            Regex rePossive = new Regex(@"
                            (           #start capture group
                            \b          #Demand a word break
                            [A-Za-z\s]  #any letter a-z, lower or upper case, plus space
                            {8,9}       #The preceeding must have length 8 or 9
                            \b          #demand another word break
                            )           #close our capture group
                            's          #demand apostrphe s",
                            RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace);

            //Loop through the book list and see if each title 
            //matches any of the listed conditions
            //Each method will add to the dictionary using the 
            //character name as the key and the title as the value
            using (StreamReader sr = File.OpenText(BOOK_LIST)) {
                while (sr.Peek() != -1) {
                    string title = sr.ReadLine();
                    CheckPossive(title, rePossive, result, wordList);
                    Check8LetterNames(firstNameList, result, title);
                    CheckFullTitle(title, result);
                    CheckPreposition(title, prepositionList, wordList, result);
                }
            }
            return result;
        }

        /// <summary>
        /// Opens the WORD_LIST file and adds every entry which 
        /// is not found in the list of first names
        /// </summary>
        /// <param name="firstNameList">Sorted list of first names</param>
        /// <returns><The list of words/returns>
        /// <remarks>
        /// The word list is useful to reject potential character names 
        /// which might be a common English word
        /// </remarks>
        private List<string> BuildWordList(List<string> firstNameList) {
            List<string> result = new List<string>();
            using (StreamReader sr = File.OpenText(WORD_LIST)) {
                while (sr.Peek() != -1) {
                    string aWord = sr.ReadLine().ToLower();
                    if (firstNameList.BinarySearch(aWord) < 0) {
                        result.Add(aWord);
                    }
                }
            }
            return result;
        }

        /// <summary>
        /// Builds a list of prepositions (string) from the file 
        /// PREPOSITION_LIST, which is already sorted
        /// </summary>
        /// <returns>The list of words</returns>
        private List<string> BuildPrepositionList() {
            List<string> result = new List<string>();
            using (StreamReader sr = File.OpenText(PREPOSITION_LIST)) {
                while (sr.Peek() != -1) {
                    result.Add(sr.ReadLine());
                }
            }

            return result;
        }

        /// <summary>
        /// Examines every word in the specified title to determine 
        /// if it matches the possive form (and have 8-9 characters)
        /// </summary>
        /// <param name="title">A book title to search</param>
        /// <param name="rePossive">A regular expression that will 
        /// match a possessive form</param>
        /// <param name="characterTitleDic">Dictionary to add results to</param>
        /// <param name="wordList">List of english words to exclude</param>
        private void CheckPossive(string title, Regex rePossive, 
                    Dictionary<string, string> characterTitleDic, 
                    List<string> wordList) {
            Match m = rePossive.Match(title);
            if (m.Success) {
                string candidate = m.Groups[1].Value.ToLower().Trim().Replace(" ", "");
                if (candidate.Length == 8 && wordList.BinarySearch(candidate) < 0) {
                    if (characterTitleDic.ContainsKey(candidate)) {
                        characterTitleDic[candidate] += ", " + title;
                    } else {
                        characterTitleDic.Add(candidate, title);
                    }
                }
            }
        }

        /// <summary>
        /// Examines the specified title to see if any words could 
        /// be an 8-letter first name or else a first name followed
        /// by any other word when the pair has a combined length of 8
        /// </summary>
        /// <param name="firstNameList">List of first names</param>
        /// <param name="characterTitleDic">List to add the result to</param>
        /// <param name="title">Book title that might contain names</param>
        /// <remarks>
        /// We could have a hit under two conditions: 
        ///   1) We find an 8-letter name in the title, such as 'JONATHAN'
        ///   2) We find a name of any length followed by another word, 
        ///       such as 'Jane Eyre', the two having a combined
        ///      length of 8
        /// </remarks>
        private void Check8LetterNames(List<string> firstNameList, 
                        Dictionary<string, string> characterTitleDic, string title) {
            string[] words = title.ToLower().Split(' ');

            for (int i = 0; i< words.Length; i++) {
                string aWord = words[i];
                //Check case 1, 8 letter names
                if (aWord.Length == 8 && firstNameList.BinarySearch(aWord) >= 0) {
                    if (characterTitleDic.ContainsKey(aWord)) {
                        characterTitleDic[aWord] += ", " + title;
                    } else {
                        characterTitleDic.Add(aWord, title);
                    }
                } else if (i < words.Length - 1 
                           && firstNameList.BinarySearch(aWord) > 0) {
                    //Case two, we have a first name followed 
                    //by another name with total length of 8
                    if (aWord.Length + words[i + 1].Length == 8) {
                        string candidate = aWord + words[i+1];
                        if (candidate.Length == 8) {
                            if (characterTitleDic.ContainsKey(candidate)) {
                                characterTitleDic[candidate] += ", " + title;
                            } else {
                                characterTitleDic.Add(candidate, title);
                            }
                        }
                    }
                }
            }
        }

        /// <summary>
        /// We will identify any titles having length 8 (excluding 
        /// spaces) and add them to the dictionary
        /// </summary>
        /// <param name="title">A book title</param>
        /// <param name="characterTitleDic">Where we store hits</param>
        /// <remarks>
        /// For example, 'Jane Eyre'
        /// </remarks>
        private void CheckFullTitle(string title, 
                        Dictionary<string, string> characterTitleDic) {
            string candidate = title.Replace(" ", "").ToLower();
            if (candidate.Length == 8) {
                if (characterTitleDic.ContainsKey(candidate)) {
                    characterTitleDic[candidate] += "," + title;
                } else {
                    characterTitleDic.Add(candidate, title);
                }
            }
        }

        /// <summary>
        /// Identify 8-letter words following a preposition 
        /// that are not common English words
        /// </summary>
        /// <param name="title">A book title</param>
        /// <param name="prepositionList">List of prepositions</param>
        /// <param name="wordList">List of English words 
        ///         that are not first names</param>
        /// <param name="characterTitleDic">Dictionary to 
        ///         augment with hits</param>
        /// <remarks>
        /// Sample: 'A Ball for Daisy'. 'for' is the preposition,
        ///         'Daisy' is the word that follows.
        /// </remarks>
        private void CheckPreposition(string title, List<string> prepositionList, 
                        List<string> wordList, Dictionary<string, string> characterTitleDic) {
            string[] words = title.ToLower().Split(' ');
            for (int i = 0; i < words.Length; i++) {
                string aWord = words[i];
                if (prepositionList.BinarySearch(aWord) > 0 
                    && i < words.Length - 1 
                    && words[i+1].Length == 8 
                    && wordList.BinarySearch(words[i+1]) < 0) {
                    if (characterTitleDic.ContainsKey(words[i + 1])) {
                        characterTitleDic[words[i + 1]] += ", " + title;
                    } else {
                        characterTitleDic.Add(words[i + 1], title);
                    }
                }
            }
        }

        /// <summary>
        /// Builds a string list from the body parts in the 
        /// file BODY_PARTS_LIST, which is already sorted
        /// </summary>
        /// <returns>The sorted list</returns>
        private List<string> BuildBodyPartsList() {
            List<string> result = new List<string>();
            using (StreamReader sr = File.OpenText(BODY_PARTS_LIST)) {
                while (sr.Peek() != -1) {
                    result.Add(sr.ReadLine().ToLower());
                }
            }

            return result;
        }
    }
}

Now, the XAML


<Window x:Class="Feb_9_2014.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="Title Character to Body Parts" Height="350" Width="725"
        WindowStartupLocation="CenterScreen" 
        Icon="/Feb_9_2014;component/Images/Book.jpg">
    <Grid>
        <Grid.Background>
            <ImageBrush Opacity=".3" 
                ImageSource="/Feb_9_2014;component/Images/gullivers600x390.jpg" />
        </Grid.Background>
        <Grid.RowDefinitions>
            <RowDefinition Height="auto" />
            <RowDefinition Height="auto" />
            <RowDefinition />
        </Grid.RowDefinitions>
        <Grid.ColumnDefinitions>
            <ColumnDefinition Width="auto" />
            <ColumnDefinition Width="auto" />
            <ColumnDefinition />
        </Grid.ColumnDefinitions>

        <Border Grid.ColumnSpan="3">
            <Border.Background>
                <LinearGradientBrush EndPoint="0.5,1" StartPoint="0.5, 0">
                    <GradientStop Color="Black" Offset="0" />
                    <GradientStop Color="#FFD5CCC0" Offset="1" />
                </LinearGradientBrush>
            </Border.Background>
            <TextBlock HorizontalAlignment="Center" FontSize="36" 
                       Foreground="Chartreuse" 
                       Text="Title Character Plus M ? 2 Body Parts">
                <TextBlock.Effect>
                    <DropShadowEffect />
                </TextBlock.Effect>
            </TextBlock>
        </Border>

        <TextBlock Grid.Row="1" Grid.ColumnSpan="3" FontSize="14" 
                   TextWrapping="Wrap" Margin="3" >
            This week's challenge comes from listener Steve Baggish of 
            Arlington, Mass. Name a title character from a classic work 
            of fiction, in 8 letters. Change the third letter to an M. 
            The result will be two consecutive words naming parts 
            of the human body. Who is the character, and what parts 
            of the body are these?
        </TextBlock>

        <Button Grid.Row="2" Content="_Solve" Name="btnSolve" 
                Height="30" VerticalAlignment="Top" Margin="5 0" 
                Click="btnSolve_Click" FontSize="14" FontWeight="Black" />
        <Label Grid.Row="2" Grid.Column="1" Content="_Answer:" 
               Target="{Binding ElementName=txtAnswers}" FontWeight="Bold" />
        <TextBox Grid.Row="2" Grid.Column="3" Name="txtAnswers" 
                 AcceptsReturn="True" TextWrapping="Wrap" 
                 Background="Transparent" FontWeight="Bold"
                 FontSize="16"/>

    </Grid>
</Window>

That’s all!

NPR Puzzle: Famous Person with 4 Double-Letters in Name

Posted on

The Challenge:

Name a famous person whose first and last names together contain four doubled letters — all four of these being different letters of the alphabet. Who is it? For example, Buddy Holly’s name has two doubled letters, D and L.

Link to challenge

Screen shot showing solution
Screen shot showing some of the over 40 matching names

Well, that was kind of fun, and I enjoyed the results. For starts, I found over 40 answers. For seconds, I love some of the names that qualify, like ‘Bobby “Boogaloo” Watts‘ (2 b’s, 2 o’s, 2 o’s again, 2 t’s), ‘Dimebag Darrell Abbott‘, ‘Boccaccio Boccaccino‘, ‘Blind Mississippi Morris’ or ‘William M. “Boss” Tweed‘ (qualifies because the trailing ‘m’ in William pairs with his middle initial. While they all qualify, I’m sure the person Will was thinking of was Tennessee Williams.

Algorithm

As you might guess, solving the puzzle is easy if you have the right list. But getting that list was quite a challenge! First, let’s discuss processing the list(s) of names to solve.

As you can see from my screen shot, I elected to make a number of name lists, motivated by the mechanics of capturing them from Wikipedia. In terms of solving the puzzle, that just means we need a loop to process each file. We read each file one name at a time. When we read a name, we use another loop to process the letters in that name. By ‘processing’, I mean that we examine the letter and compare it to the previous letter, if they match, we increase our counter. After processing the line, if our counter equals 4, we have a winner!

List answerList = new List();
string folder = Path
                .GetDirectoryName(Assembly
                .GetExecutingAssembly()
                .Location);

//Get all the text files in the exe folder, hold result in an array:
string[] fileList = Directory.GetFiles(folder, "*.txt");
//process every file we found
foreach(string aFile in fileList) {
    using (StreamReader sr = File.OpenText(aFile)) {
        while (sr.Peek() != -1) {
            string aPerson = sr.ReadLine();
            //Remove blanks convert to lower case
            string letters = aPerson.ToLower().Replace(" ", "");
            //reject some bad names that contain bad letters like ( or ,
            if (!letters.Contains("(") && !letters.Contains(",")) {
                char lastLtr = letters[0];
                int hitCount = 0;
                //loop through the letters
                for (int i = 1; i < letters.Length; i++) {
                     if (letters[i] == lastLtr) {
                         hitCount++;
                         //avoid triple letters by resetting previous
                         lastLtr = '';
                     } else {
                         lastLtr = letters[i];
                     }
                 }
                 if (hitCount >= 4) {
                    //look for previous entries:
                    NameFilePair prev = answerList.FirstOrDefault(n => n.Person == aPerson);
                    if (prev == null) {
                        NameFilePair nfp = new NameFilePair {
                            FileName = Path.GetFileName(aFile),
                            Person = aPerson
                        };
                        answerList.Add(nfp);
                    }
                }
            }
        }
    }
}

Downloading the Name Lists

As I mentioned above, the hardest part was getting the lists of names. As usual, Wikipedia had the most complete list, but scattered across many pages. I was able to start at this page http://en.wikipedia.org/wiki/Lists_of_people_by_nationality and programmatically follow the links on every page to other pages. Basically, my code loads the main page, identifies two kinds of links on the page:

  • Links to other pages of name lists
  • Links to actual articles about people

On most pages, the same pattern is used for each kind of link, though I had to do a bit of manual cleaning. As usual, I utilized regular expressions to match the text.

Here’s the complete code listing:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Windows;
using System.Windows.Controls;
using System.IO;                        //To open/close files
using System.Net;                       //For WebClient
using System.Text.RegularExpressions;   //To match text patterns
using System.Windows.Input;             //For Mouse override cusor
using System.ComponentModel;            //For Background worker
using System.Reflection;                //To get exe path

namespace Jan19_2014 {
    /// <summary>Downloads names and solves the puzzle</summary>
    public partial class MainWindow : Window {

        //Allows us to display a cancel button and update the status bar on different thread
        BackgroundWorker _WikipediaFetcher;

        /// <summary>Default constructor</summary>
        public MainWindow() {
            InitializeComponent();
        }

        /// <summary>
        /// Fires when user clicks the button 'Solve'
        /// </summary>
        /// <param name="sender">The button</param>
        /// <param name="e">Event args</param>
        /// <remarks>
        /// Iterates all the files in the project folder having 'txt' extension,
        /// approximately 705 total.
        /// 
        /// Examines every name in each file by
        ///     - Concatenating the letters
        ///     - Looping through the result
        ///     - Comparing the last letter to the current
        ///     - Counting as a hit when the prev letter matches current
        ///     - Adding to the answer list when the hit count equals 4
        /// </remarks>
        private void btnSolve_Click(object sender, RoutedEventArgs e) {
            List<NameFilePair> answerList = new List<NameFilePair>();
            string folder = Path
                            .GetDirectoryName(Assembly
                                            .GetExecutingAssembly()
                                            .Location);
            
            //Get all the text files in the exe folder:
            string[] fileList = Directory.GetFiles(folder, "*.txt");
            //process every file we found
            foreach(string aFile in fileList) {
                using (StreamReader sr = File.OpenText(aFile)) {
                    while (sr.Peek() != -1) {
                        string aPerson = sr.ReadLine();
                        //Remove blanks convert to lower case
                        string letters = aPerson.ToLower().Replace(" ", "");
                        //reject some bad names that contain bad letters
                        if (!letters.Contains("(") && !letters.Contains(",")) {
                            char lastLtr = letters[0];
                            int hitCount = 0;
                            //loop through the letters
                            for (int i = 1; i < letters.Length; i++) {
                                if (letters[i] == lastLtr) {
                                    hitCount++;
                                    //avoid triple letters by resetting previous
                                    lastLtr = '';
                                } else {
                                    lastLtr = letters[i];
                                }
                            }

                            if (hitCount >= 4) {
                                //look for previous entries:
                                NameFilePair prev = answerList.
                                            FirstOrDefault(n => n.Person == aPerson);
                                if (prev == null) {
                                    NameFilePair nfp = new NameFilePair {
                                        FileName = Path.GetFileName(aFile),
                                        Person = aPerson
                                    };
                                    answerList.Add(nfp);
                                }
                            }
                        }
                    }
                }

            }
            //bind the results to the grid
            grdAnswer.ItemsSource = answerList;

            MessageBox.Show("Solving Done", "Mission Accomplished", 
                MessageBoxButton.OK, MessageBoxImage.Information);
        }

        /// <summary>
        /// Fired when user clicks the button to get Wikipedia people
        /// </summary>
        /// <param name="sender">The button</param>
        /// <param name="e">Event args</param>
        /// <remarks>
        /// Sets-up the background worker and fires it off.
        /// </remarks>
        private void btnGetWikipediaRecursive_Click(object sender, RoutedEventArgs e) {
            if ((string)btnGetWikipediaRecursive.Content == "_Go") {
                _WikipediaFetcher = new BackgroundWorker();
                _WikipediaFetcher.WorkerSupportsCancellation = true;
                _WikipediaFetcher.WorkerReportsProgress = true;
                _WikipediaFetcher.RunWorkerCompleted += 
                    new RunWorkerCompletedEventHandler(WikipediaFetcher_RunWorkerCompleted);
                _WikipediaFetcher.ProgressChanged += 
                    new ProgressChangedEventHandler(WikipediaFetcher_ProgressChanged);
                _WikipediaFetcher.DoWork += new DoWorkEventHandler(WikipediaFetcher_DoWork);

                tbProgress.Visibility = System.Windows.Visibility.Visible;
                btnGetWikipediaRecursive.Content = "_Cancel";

                _WikipediaFetcher.RunWorkerAsync(txtWikipediaPeopleUrl.Text);
            } else {
                btnGetWikipediaRecursive.Content = "_Go";
                _WikipediaFetcher.CancelAsync();
            }
        }

        /// <summary>
        /// Fires when the background worker is kicked-off
        /// </summary>
        /// <param name="sender">The worker</param>
        /// <param name="e">Event args containing the URL as the e.Argument</param>
        /// <remarks>
        /// This method will start visiting a Wikipedia page that has links
        /// to other pages which may, in turn, contain other pages.
        /// 
        /// In general, each page has two types of payload we care about:
        ///     - the link to another page
        ///     - a person's name
        /// We identify each with a regular expression, but in this method,m
        /// we initialize the to regular expressiosn and merely pass them
        /// to a subroutine 'CapturePeople'.
        /// 
        /// This method iterates a number of links on the main page (which
        /// is rather irregular, so we set-up boundaries within which we search)
        /// and calls CapturePeople on each. CapturePeople is recursive,
        /// so it will handle any additional links therein.
        /// </remarks>
        void WikipediaFetcher_DoWork(object sender, DoWorkEventArgs e) {
            string peoplePattern = @"
                                    <li><a			#Literal match
                                    [^>]+?			#Anything except >
                                    >				#Literal match
                                    ([^<]+?)		#Capture group, anything except <
                                    <				#Literal match
                                    ";
            Regex rePeople = new Regex(peoplePattern, 
                RegexOptions.IgnorePatternWhitespace | RegexOptions.Multiline);

            List<string> visitedUrlList = new List<string>();
            using (WebClient wc = new WebClient()) {
                string wikiUrl = e.Argument as string;
                using (StreamReader sr = new StreamReader(wc.OpenRead(wikiUrl))) {
                    string allText = sr.ReadToEnd();
                    int p0 = allText.IndexOf("class=\"div-col columns column-width");
                    int p1 = allText.IndexOf("id=\"See_also");
                    string payLoad = allText.Substring(p0, p1 - p0);


                    string linkPattern = @"
                            <li><a		#Literal match
                            \s			#Whitespace
                            href=""		#Literal match
                            (/wiki/List_of_[^""]+?)	#Capture group, anything but quote
                            ""			#Literal match
                            ";
                    Regex reLink = new Regex(linkPattern, 
                        RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace);
                    Match m = reLink.Match(payLoad);
                    while (m.Success) {
                        _WikipediaFetcher.ReportProgress(0, m.Groups[1].Value);
                        if (_WikipediaFetcher.CancellationPending) {
                            break;
                        }
                        CapturePeople(m.Groups[1].Value, visitedUrlList, 
                                        wc, reLink, rePeople);

                        m = m.NextMatch();
                    }
                }
            }
        }

        /// <summary>
        /// Captures the people listed on the URL in question, writing to a file based on
        /// the URL, and follows any links on the page to additional lists.
        /// </summary>
        /// <param name="listUrl">The Url to load and capture people form</param>
        /// <param name="visitedUrlList">List of links we have already visited</param>
        /// <param name="wc">Web client to fetch web pages</param>
        /// <param name="reLink">Regex that identifies a typical Wikipedia list link</param>
        /// <param name="rePeople">Regex that identifies a typical Wikipedia person name</param>
        private void CapturePeople(string listUrl, List<string> visitedUrlList, 
                                WebClient wc, Regex reLink, Regex rePeople) {
            string url = "http://www.Wikipedia.org" + listUrl;
            int ndx = visitedUrlList.BinarySearch(url);
            if (ndx < 0)
                visitedUrlList.Insert(~ndx, url);
            else
                return;

            try {
                using (StreamReader sr = new StreamReader(wc.OpenRead(url))) {
                    string allText = sr.ReadToEnd();

                    //Limit matching to prior to these two IDs; generally
                    //any false matches that follow are for things like referecnes
                    int p = allText.IndexOf("id=\"See_also");
                    int p2 = allText.IndexOf("id=\"References");
                    Match linkMatch = reLink.Match(allText);
                    //iterate the matches
                    while (linkMatch.Success && linkMatch.Index < p) {
                        if (_WikipediaFetcher.CancellationPending)
                            return;

                        _WikipediaFetcher.ReportProgress(0, listUrl);
                        if (!url.EndsWith(linkMatch.Groups[1].Value)) {
                            CapturePeople(linkMatch.Groups[1].Value, 
                                    visitedUrlList, wc, reLink, rePeople);
                        }
                        linkMatch = linkMatch.NextMatch();
                    }

                    _WikipediaFetcher.ReportProgress(0, url);
                    Match m = rePeople.Match(allText);
                    int p3 = allText.IndexOf("class=\"mw-headline");
                    if (p3 >= 0)
                        allText = allText.Substring(p3);
                    string fName = listUrl.Substring(listUrl.LastIndexOf("/") + 1) + ".txt";
                    using (StreamWriter sw = File.CreateText(fName)) {
                        while (m.Success && (p == -1 || m.Index < p) &&
                                            (p2 == -1 || m.Index < p2)) {
                            string person = m.Groups[1].Value;
                            sw.WriteLine(person);

                            m = m.NextMatch();
                        }
                    }
                }
            } catch (Exception ex) {
                Console.WriteLine(ex.Message);
            }

        }

        /// <summary>
        /// Fired when the background worker has something to report,
        /// namely, the URL it is processing
        /// </summary>
        /// <param name="sender"></param>
        /// <param name="e"></param>
        void WikipediaFetcher_ProgressChanged(object sender, ProgressChangedEventArgs e) {
            tbProgress.Text = (string)e.UserState;
        }

        /// <summary>
        /// Fired when the background worker is done
        /// </summary>
        /// <param name="sender">The background worker</param>
        /// <param name="e">Event args</param>
        void WikipediaFetcher_RunWorkerCompleted(object sender, RunWorkerCompletedEventArgs e) {
            tbProgress.Visibility = System.Windows.Visibility.Hidden;
            MessageBox.Show("Done Capturing People", "Mission Accomplished", 
                             MessageBoxButton.OK, MessageBoxImage.Information);
        }
    }
}

Here’s my little class, used to hold a person and the file they came from:

    public class NameFilePair {
        public string Person { get; set; }
        public string  FileName { get; set; }
    }

Finally, here’s the XAML

<Window x:Class="Jan19_2014.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="Famous Person With 4 Double Letters"
        WindowStartupLocation="CenterScreen"
        Height="380" Width="925">
    <Grid>
        <Grid.ColumnDefinitions>
            <ColumnDefinition Width="auto" />
            <ColumnDefinition />
        </Grid.ColumnDefinitions>
        
        <!-- Left side/vertical letters -->
        <Border Background="Brown">
            <TextBlock FontSize="48" FontFamily="Impact" 
                       Foreground="LightGreen" VerticalAlignment="Top" >
                <ItemsControl ItemsSource="PUZZLE" />
            </TextBlock>
        </Border>
        
        <DockPanel Grid.Column="2">
            <!-- Status bar holds slider and progress display -->
            <StatusBar DockPanel.Dock="Bottom">
                <Label Content="Zoom:" Target="{Binding ElementName=sldZoom}" />
                <Slider Name="sldZoom" Width="50" Minimum=".5" Maximum="3" Value="1" />
                <TextBlock Name="tbProgress" Visibility="Hidden" />
            </StatusBar>
            
            <!-- Nested grid hods everything except the status bar and vertical PUZZLE -->
            <Grid >
                <Grid.LayoutTransform>
                    <!-- Works with the slider to enlarge/shrink the page -->
                    <ScaleTransform   ScaleX="{Binding ElementName=sldZoom,Path=Value}" 
                                      ScaleY="{Binding ElementName=sldZoom,Path=Value}" />
                </Grid.LayoutTransform>
                <Grid.ColumnDefinitions>
                    <ColumnDefinition Width="auto" />
                    <ColumnDefinition />
                    <ColumnDefinition Width="auto" />
                </Grid.ColumnDefinitions>
                <Grid.RowDefinitions>
                    <RowDefinition Height="auto" />
                    <RowDefinition Height="auto" />
                    <RowDefinition Height="auto" />
                    <RowDefinition />
                </Grid.RowDefinitions>
                
                <Border Grid.ColumnSpan="3">
                    <Border.Background>
                        <LinearGradientBrush EndPoint="1,0.5" StartPoint="0,0.5">
                            <GradientStop Color="BurlyWood" Offset="1" />
                            <GradientStop Color="Brown" Offset="0" />
                        </LinearGradientBrush>
                    </Border.Background>
                    <TextBlock Text="Famous Person's Name with 4 Double Letters" 
                               FontFamily="Arial" FontStyle="Italic" 
                               FontSize="40" TextWrapping="Wrap" 
                               Foreground="LightGreen" VerticalAlignment="Center" >
                    </TextBlock>
                </Border>

                <TextBlock Grid.Row="1" Grid.ColumnSpan="3" TextWrapping="Wrap" 
                           FontSize="14" Margin="5 0">
                    <Bold>Next week's challenge from Ed Pegg Jr. of MathPuzzle.com:</Bold>
                    Name a famous person whose first and last names together 
                    contain four doubled letters — all four 
                    of these being different letters of the alphabet. Who 
                    is it? For example, Buddy Holly's name has two doubled 
                    letters, D and L.
                </TextBlock>
                
                
                <Label Grid.Row="2" Content="_Wikipedia Start URL:" 
                       Target="{Binding ElementName=txtWikipediaPeopleUrl}" 
                       VerticalAlignment="Center" />
                <TextBox Grid.Row="2" Grid.Column="1" Name="txtWikipediaPeopleUrl" 
                         Text="http://en.wikipedia.org/wiki/Lists_of_people_by_nationality"
                         VerticalAlignment="Center" />
                <Button Grid.Row="2" Grid.Column="2" Content="_Go" 
                        Name="btnGetWikipediaRecursive"
                        Height="30" Margin="3" Click="btnGetWikipediaRecursive_Click" />


                <Label Grid.Row="3" Content="_Answers:" 
                       Target="{Binding ElementName=txtAnswer}" />
                <DataGrid Grid.Row="3" Grid.Column="1" Name="grdAnswer" 
                          AutoGenerateColumns="True" />
                <Button Grid.Row="3" Grid.Column="2" Name="btnSolve" 
                        Content="_Solve" Height="30" Margin="5" 
                        Click="btnSolve_Click" Grid.ColumnSpan="2"
                        
                        VerticalAlignment="Top" />
            </Grid>
        </DockPanel>            
    </Grid>

    <!-- Build our own Icon using letters '4-2', meaing 4 doubles -->
    <Window.Icon>
        <DrawingImage>
            <DrawingImage.Drawing>
                <GeometryDrawing>
                    <GeometryDrawing.Geometry>
                        <RectangleGeometry Rect="0,0,20,20" />
                    </GeometryDrawing.Geometry>
                    <GeometryDrawing.Brush>
                        <VisualBrush>
                            <VisualBrush.Visual>
                                <TextBlock Text="4-2" FontSize="8"
                                                Background="Brown"
                                                Foreground="LightGreen" />
                            </VisualBrush.Visual>
                        </VisualBrush>
                    </GeometryDrawing.Brush>
                </GeometryDrawing>
            </DrawingImage.Drawing>
        </DrawingImage>
    </Window.Icon>
</Window>

Solved! NPR Sunday Puzzle: “Wizard Word” Analog

Posted on

That was a nice, enjoyable puzzle. I suspect that people who tried solving without computers had a tough time :-(, whereas, with code, it was simply fun 🙂 The only difficulty was remembering to strip hyphens and spaces from candidate brand names before checking to see if they met the criteria.

Challenge:

The word “wizard” has the peculiar property that its letters can be grouped in pairs — A and Z, D and W, and I and R — that are opposite each other in the alphabet. That is, A and Z are at opposite ends of the alphabet, D and W are four letters in from their respective ends, and I and R are nine letters in from their respective ends. Can you name a well-known brand name in six letters that has this same property?

Link to the challenge

User interface of the program that solved the puzzle
User interface of the program that solved the puzzle

The Winning Algorithm!

The best way to solve the puzzle is to take a candidate brand name and

  1. Sort its letters
  2. Iterate the 1st 3 letters,
  3. Comparing
    • letter 1 <–>letter 6,
    • letter 2 <–>  letter 5,
    • letter 3 <–> letter 4
  4. By ‘comparing’, I mean checking if the first letter is as far from the beginning of the alphabet as the second letter is from the end
  5. This works because we already sorted the letters, ensuring that we won’t compare two letters from the beginning of the alphabet against each other
  6. The comparison is simple in C#, because you can do arithmetic on char variables
    • For example, you can subtract ‘z’ – ‘y’ and get 1 for an answer (i.e. 26 – 25 = 1)
  7. Also,  in C# (or other C-based languages), you can easily run a ‘for-loop’ that controls two variables at a time
    • like this: for (int i = 1, j = 6; i <= 3; i++, j–) { … }
    • In other words, run a loop that moves i from 1 to 3, and j from 6 to 4, simultaneously!

The following code accomplishes what I just said, and it is pretty short! This method will return true if the candidate ‘aWord’ passes the test (e.g., if aWord == ‘wizard’), and false otherwise:

public bool WordHasLetterSymmetry(string aWord) {
    bool hasSymmetry = true;
    //sort the candidate word and put the result into an array of char, one letter per position:
    char[] sorted = aWord.Select(c => c).OrderBy(c => c).ToArray();
    //Iterate the letter array; p0 will index the 1st 3 letters, p2 will index the reflected letters 6-5
    for (int p1 = 0, p2 = 5; p1 < 3; p1++, p2--) {
        //Check if the letter at position p1 is as far from the beginning as letter at p2 is from the end:
        if (sorted[p1] - 'a' != 'z' - sorted[p2]) {
            hasSymmetry = false;
            break;
        }
    }
    return hasSymmetry;
}

I was able to get the brand name list from http://www.namedevelopment.com/brand-names.html. I extracted the brand names using techniques previously discussed on this blog, using regular expressions.

Complete code listing:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Windows;
using System.Windows.Controls;
using System.Windows.Data;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

namespace Dec29_2013 {
    /// <summary>
    /// Holds the event methods and other code to solve the puzzle
    /// </summary>
    public partial class MainWindow : Window {
        /// <summary>Where we save the brand names</summary>
        private const string BRAND_FILE_NAME = "BrandNames.txt";
        public MainWindow() {
            InitializeComponent();
        }

        /// <summary>
        /// Fires when user clicks the button to get brand names
        /// </summary>
        /// <param name="sender">The button</param>
        /// <param name="e">Event args</param>
        private void btnGetBrandNames_Click(object sender, RoutedEventArgs e) {
            string pattern = @"
                                <li>			#literal match
                                ([^<]+)			#Capture group - anything but <
                                </li>			#Literal match
                                ";
            Regex reBrand = new Regex(pattern, 
                            RegexOptions.IgnorePatternWhitespace | RegexOptions.Multiline);

            //Some listing have a company name, we will replace
            //example: 365 Every Day Value (Whole Foods)
            string companyPattern = @"
                                    \s		#Whitespace
                                    \(		#Escaped open parenthesis
                                    [^)]+?	#Anything but ), 1 or more
                                    \)		#Escaped close parenthesis
                                    ";
            Regex reCompany = new Regex(companyPattern, 
                        RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace);

            //create the file
            using (StreamWriter sw = File.CreateText(BRAND_FILE_NAME)) {
                //the web client will go to the url
                using (WebClient wc = new WebClient()) {
                    using (StreamReader sr = new StreamReader(wc.OpenRead(txtBrandNameList.Text))) {
                        //get the entire page
                        string allText = sr.ReadToEnd();

                        //start matching the text using the regular expression
                        Match m = reBrand.Match(allText);
                        while (m.Success) {
                            string aName = m.Groups[1].Value.Trim();
                            //clean-up by remving company names and degree symbols
                            sw.WriteLine(reCompany.Replace(aName, "").Replace("°", "°"));

                            m = m.NextMatch();
                        }
                    }
                }
            }
            MessageBox.Show("Done Capturing Brands", 
                "Mission Accomplished", MessageBoxButton.OK, MessageBoxImage.Exclamation);
        }

        /// <summary>
        /// Fires when user clicks the button to solve the puzzle
        /// </summary>
        /// <param name="sender">The button</param>
        /// <param name="e">Event args</param>
        private void btnSolve_Click(object sender, RoutedEventArgs e) {
            txtAnswer.Clear();
            using (StreamReader sr = File.OpenText(BRAND_FILE_NAME)) {
                while (sr.Peek() != -1) {
                    string aLine = sr.ReadLine();
                    string aBrand = aLine.ToLower().Replace("-", "")
                                    .Replace("'", "").Replace(" ", "");
                    if (aBrand.Length == 6) {
                        bool ok = WordHasLetterSymmetry(aBrand);
                        if (ok) {
                            txtAnswer.Text += aLine  + "\n";
                        }
                    }
                }
            }
        }

        /// <summary>
        /// Determines whether the word meets the puzzle criteria
        /// </summary>
        /// <param name="aWord">A word that might have the specified symmetry</param>
        /// <returns>True if the word qualifies, false otherwise</returns>
        public bool WordHasLetterSymmetry(string aWord) {
            bool hasSymmetry = true;
            char[] sorted = aWord.Select(c => c).OrderBy(c => c).ToArray();
            for (int p1 = 0, p2 = 5; p1 < 3; p1++, p2--) {
                if (sorted[p1] - 'a' != 'z' - sorted[p2]) {
                    hasSymmetry = false;
                    break;
                }
            }
            return hasSymmetry;
        }

        /*
         Some additional Brand URLs and the regex patterns that extract their payload:
         http://www.forbes.com/powerful-brands/list/
        <h3>				#Literal match
        ([^<]+?)			#Capture Group = anything but <
        </h3></a></td>		#Literal Match 

        http://en.wikipedia.org/wiki/List_of_generic_and_genericized_trademarks
        <tr>			#Literal text
        \n?				#Optional newline
        <td>			#Literal text
        (				#Group 1, '<a' tag
        <a				#Literal match
        \s				#Whitespace
        [^>]+			#Anything except >
        >				#Closing > for '<a' 
        )?				#End of group 1, ? makes it optional
        ([^<]+?)		#Group 2, the one we want, anything except <
        (?(1)			#start a conditional, based on capture group 1
        </a></td>		#if we matched group 1, then </a></td>
        |				#Tne false-clause of the conditional
        </td>)			#Literal match

        http://en.wikipedia.org/wiki/List_of_brand_name_food_products
        <li><a			#Literal Match
        [^>]+?			#Anything but >, one or more
        >				#Literal match
        ([^<]+?)		#Capture group, anything but <
        </a>			#Literal match          
         */
    }
}

Here’s the XAML for the UI

<Window x:Class="Dec29_2013.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Height="330" Width="640"
        WindowStartupLocation="CenterScreen"
        Title="MainWindow" >
    <Grid>
        <Grid.RowDefinitions>
            <RowDefinition Height="auto" />
            <RowDefinition Height="auto" />
            <RowDefinition Height="auto" />
            <RowDefinition />
        </Grid.RowDefinitions>
        <Grid.ColumnDefinitions>
            <ColumnDefinition Width="auto" />
            <ColumnDefinition />
            <ColumnDefinition Width="auto" />
        </Grid.ColumnDefinitions>

        <Border Grid.ColumnSpan="3">
            <Border.Background>
                <LinearGradientBrush EndPoint="0.5,1" StartPoint="0.5,0">
                    <GradientStop Color="GreenYellow" Offset="0" />
                    <GradientStop Color="LightGreen" Offset="1" />
                </LinearGradientBrush>
            </Border.Background>
            <TextBlock HorizontalAlignment="Center" FontSize="40" 
                       Text="Brand Name Letter-Pair Symmetry" 
                       Foreground="SaddleBrown">
                <TextBlock.Effect>
                    <DropShadowEffect Color="LightGreen" BlurRadius="5" 
                                      Opacity=".8" />
                </TextBlock.Effect>
            </TextBlock>
        </Border>

        <Label Grid.Row="1" Content="Challenge:" />
        <TextBlock Grid.Row="1" Grid.Column="1" TextWrapping="Wrap" 
                   FontSize="13" >
            <Bold> Next week's challenge from listener Steve 
            Daubenspeck of Fleetwood, Pa.: </Bold>
            The word "wizard" has the peculiar property that its 
            letters can be grouped in pairs — A and Z, D and W, 
            and I and R — that are opposite each other in the alphabet.
            That is, A and Z are at opposite ends of 
            the alphabet, D and W are four letters in from their 
            respective ends, and I and R are nine letters in from their 
            respective ends. Can you name a well-known brand name in 
            six letters that has this same property?
        </TextBlock>

        <Label Grid.Row="2" Content="Brand Name _Url:" 
               Target="{Binding ElementName=txtBrandNameList}" />
        <TextBox Grid.Row="2" Grid.Column="1" x:Name="txtBrandNameList" 
                 Text="http://www.namedevelopment.com/brand-names.html" />
        <Button Grid.Row="2" Grid.Column="2" Content="Get _Brand Names" 
                Height="30" Margin="3" x:Name="btnGetBrandNames"
                Click="btnGetBrandNames_Click"/>

        <Label Grid.Row="6" Content="_Answer:" Target="{Binding ElementName=txtAnswer}" />
        <TextBox Grid.Row="6" Grid.Column="1" x:Name="txtAnswer" 
                 AcceptsReturn="True" TextWrapping="Wrap" MinHeight="100" />
        <Button Grid.Row="6" Grid.Column="2" Content="_Solve" Height="30" 
                Width="{Binding ElementName=btnGetBrandNames,Path=ActualWidth}"
                VerticalAlignment="Top" Name="btnSolve" Click="btnSolve_Click" />
    </Grid>
</Window>

Solved! NPR Sunday Puzzle for December 22, 2013

Posted on Updated on

I had to struggle a bit on this one, mainly because every time I searched for ‘filmmaker’ on the web, I got a list of feature-film producers and directors. After I wised-up and downloaded a list of  documentary producers, solving was a piece of cake!

 

Solution - Sun +Filmmaker
Screen shot shows the answer to the puzzle (plus a bunch of buttons to get filmmakers from ever more locations!)

The Challenge

Think of a well-known filmmaker, first and last names. Add “S-U-N” before this person’s first name and last name. In each case, you’ll form a common English word. Who is the filmmaker?

Link to the challenge

Algorithm

The algorithm is straightforward; assume you have:

  1. A file containing filmmaker names, one per line
  2. A List<string> of words starting with ‘sun’, sorted
  3. A Textbox ‘txtAnswer’ to display the answer

Simply loop through the list of names, grabbing the first name and last name with the ‘Split’ function, which will give you an array containing each name part as a separate entry in the array. Concatenate ‘sun’ with the last name and check if the resulting candidate exists in the list – an efficient technique is to use BinarySearch, which returns the index where the candidate is found (but returns negative number if not found). If the last name exists in the array, try the same on the first name. If they both exist in the list, we have a winner!

Code That Solves the Puzzle with a File of Filmmakers ‘fName’ and a List of Words ‘sunList’

private void CheckFileAgainstSunWords(string fName, List sunList) {
    //Open the file of filmmakers:
    using (StreamReader sr = File.OpenText(fName)) {
        ///Loop while there is data to read:
        while (sr.Peek() != -1) {
            //Get a filmmaker name, converting to lower case and removing leading/trailing blanks:
            string aLine = sr.ReadLine().Trim().ToLower();
            //Split the name by the space characters, put each word into an array slot:
            string[] nameParts = aLine.Split(' ');
            //the last name is the last entry in the arrary; prepend 'sun to form a candidate
            string candidate2 = "sun" + nameParts[nameParts.Length - 1];
            //The list 'sunList' is sorted, a binary search will return a non-negative position of
            //the candidate is in the list
            if (sunList.BinarySearch(candidate2) >= 0) {
                //We got a hit on the filmmaker last name, so try the first 
                //name (zeroth entry in the array)
                string candidate1 = "sun" + nameParts[0];
                //As above, if the list contains the candidate, binary search returns a non-negative index
                if (sunList.BinarySearch(candidate1) > 0) {
                    //Append the filmmaker name to the text box
                    txtAnswer.Text += aLine;
                }
            }
        }
    }
}

Complete Code List

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Windows;
using System.Windows.Controls;
using System.Net;
using System.IO;
using System.Text.RegularExpressions;
using System.Windows.Input;

namespace Dec22_2013 {
    /// <summary>
    /// Code to solve the puzzle
    /// </summary>
    public partial class MainWindow : Window {

        /// <summary>Where we save the film maker names</summary>
        private const string PRODUCER_FILE_NAME = "Producers.txt";

        /// <summary>Default constructor</summary>
        public MainWindow() {
            InitializeComponent();
        }

        /// <summary>
        /// Fires when user clicks the button
        /// </summary>
        /// <param name="sender">The Button</param>
        /// <param name="e">Event args</param>
        private void btnGetProducers_Click(object sender, RoutedEventArgs e) {
            string pattern = @"
                                ^<b><a		#Literal text
                                \s+?		#Whitespace, one or more
                                [^>]+?		#Anything except >
                                >		#Literal text
                                ([^<]+?)	#Capture group - anything except <
                                </a></b>	#Literal match
                                ";
            Mouse.OverrideCursor = Cursors.Wait;
            CaptureTextFromUrl(txtProducerUrl.Text, pattern, PRODUCER_FILE_NAME);

            Mouse.OverrideCursor = null;
            MessageBox.Show("Producers Downloaded", "Mission Complete", 
                             MessageBoxButton.OK, MessageBoxImage.Information);
        }

        /// <summary>
        /// Gets list of film makers from a different url
        /// </summary>
        /// <param name="sender">The button</param>
        /// <param name="e">Event args</param>
        private void btnWikipediaProducers_Click(object sender, RoutedEventArgs e) {
            string pattern = @"
                                <li><a		#Literal match
                                [^>]+?		#Anything exept >, one or more
                                >		#Literal match
                                ([^<]+?)	#Capture group, anything but <
                                </a>		#Literal match
                                ";
            CaptureTextFromUrl(txtWikipediaUrl.Text, pattern, PRODUCER_FILE_NAME);
        }

        /// <summary>
        /// Gets more film makers from different URL
        /// </summary>
        /// <param name="sender">The button</param>
        /// <param name="e">Event args</param>
        private void btnGetMoreProducers_Click(object sender, RoutedEventArgs e) {
            string pattern = @"
                            \d{4}\s		#4digits and a space
                            /\s+?		#/ and space
                            ([^)]+?)	        #Capture group - anything but )
                            \)			#Escaped close-parenthesis
                            ";
            CaptureTextFromUrl(txtMoreProducers.Text, pattern, PRODUCER_FILE_NAME);
        }

        /// <summary>
        /// Gets more directors from different URL
        /// </summary>
        /// <param name="sender">The button</param>
        /// <param name="e">Event args</param>
        private void btnMoreDirectors_Click(object sender, RoutedEventArgs e) {
            string pattern = @"
                            csv_column_4"">	#Literal text
                            ([^<]+?)		#Capture group, anything but <
                            <			#Literal match
                            ";
            List<string> previousNameList = new List<string>();
            using (StreamReader sr = File.OpenText(PRODUCER_FILE_NAME)) {
                while (sr.Peek() != -1) {
                    previousNameList.Add(sr.ReadLine().ToLower());
                }
            }

            Regex re = new Regex(pattern, 
                       RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace);
            using (WebClient wc = new WebClient()) {
                using (StreamReader sr = new StreamReader(wc.OpenRead(txtMoreDirectors.Text))) {
                    string allText = sr.ReadToEnd();
                    Match m = re.Match(allText);
                    while (m.Success) {
                        int p = m.Groups[1].Value.IndexOf(',');
                        if (p > 0) {
                            string director = m.Groups[1].Value.Substring(p + 2).ToLower() 
                                              + " " 
                                              + m.Groups[1].Value.Substring(0, p).ToLower();
                            p = previousNameList.BinarySearch(director);
                            if (p < 0) {
                                previousNameList.Insert(~p, director);
                            }
                        }

                        m = m.NextMatch();
                    }
                }
            }

            using (StreamWriter sw = File.CreateText(PRODUCER_FILE_NAME)) {
                foreach (string prodName in previousNameList) {
                    sw.WriteLine(prodName);
                }
            }
        }

        /// <summary>
        /// Standardized way to capture text from specified url using the regex
        /// and saving to the specified file
        /// </summary>
        /// <param name="url">Web location where data is retrieved from</param>
        /// <param name="rePattern">Regex customized for the page in question</param>
        /// <param name="fName">File to add the data to</param>
        private void CaptureTextFromUrl(string url, string rePattern, string fName) {
            List<string> previousNameList = new List<string>();
            using (StreamReader sr = File.OpenText(fName)) {
                while (sr.Peek() != -1) {
                    previousNameList.Add(sr.ReadLine().ToLower());
                }
            }

            Regex re = new Regex(rePattern, 
                     RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace);
            using (WebClient wc = new WebClient()) {
                using (StreamReader sr = new StreamReader(wc.OpenRead(url))) {
                    string allText = sr.ReadToEnd();
                    Match m = re.Match(allText);
                    while (m.Success) {
                        string aName = m.Groups[1].Value.ToLower();
                        int p = previousNameList.BinarySearch(aName);
                        if (p < 0) {
                            previousNameList.Insert(~p, aName);
                        }

                        m = m.NextMatch();
                    }
                }
            }

            using (StreamWriter sw = File.CreateText(PRODUCER_FILE_NAME)) {
                foreach (string prodName in previousNameList) {
                    sw.WriteLine(prodName);
                }
            }
        }

        /// <summary>
        /// Solves the puzzle using a list of English words
        /// </summary>
        /// <param name="sender"></param>
        /// <param name="e"></param>
        private void btnSolve_Click(object sender, RoutedEventArgs e) {

            //Load the  word list and record 'Sun' words in a list
            string fName = @"C:\Users\Randy\Documents\Puzzles\Data\2of12inf.txt";
            List<string> sunList = new List<string>();
            using (StreamReader sr = File.OpenText(fName)) {
                while (sr.Peek() != -1) {
                    string aLine = sr.ReadLine().ToLower();
                    if (aLine.StartsWith("sun")) {
                        sunList.Add(aLine);
                    } else if (string.Compare(aLine, "sun") > 0) {
                        break;
                    }
                }
            }

            txtAnswer.Clear();
            //Bounce the film maker file against the director list
            CheckFileAgainstSunWords(PRODUCER_FILE_NAME, sunList);
            string directorFile = @"C:\Users\Randy\Documents\Puzzles\Data\DirectorList.txt";

            //Try again with a different file
            CheckFileAgainstSunWords(directorFile, sunList);
            MessageBox.Show("Solving Complete", "Mission Accomplished", 
                            MessageBoxButton.OK, MessageBoxImage.Exclamation);
        }

        /// <summary>
        /// Unified way to check a film maker name against the word list
        /// </summary>
        /// <param name="fName">File containing film maker names</param>
        /// <param name="sunList">List of words starting with 'sun'</param>
        private void CheckFileAgainstSunWords(string fName, List<string> sunList) {
            using (StreamReader sr = File.OpenText(fName)) {
                while (sr.Peek() != -1) {
                    string aLine = sr.ReadLine().Trim().ToLower();
                    string[] nameParts = aLine.Split(' ');
                    string candidate2 = "sun" + nameParts[nameParts.Length - 1];
                    if (sunList.BinarySearch(candidate2) >= 0) {
                        string candidate1 = "sun" + nameParts[0];
                        if (sunList.BinarySearch(candidate1) > 0) {
                            txtAnswer.Text += aLine;
                        }
                    }
                }
            }
        }

        /// <summary>
        /// Removes dupes and makes the names lower case
        /// </summary>
        /// <param name="sender"></param>
        /// <param name="e"></param>
        private void btnFixNameLIst_Click(object sender, RoutedEventArgs e) {
            List<string> all = new List<string>();
            using (StreamReader sr = File.OpenText(PRODUCER_FILE_NAME)) {
                while (sr.Peek() != -1) {
                    string aLine = sr.ReadLine().ToLower();
                    int p = all.BinarySearch(aLine);
                    if (p < 0) {
                        all.Insert(~p, aLine);
                    }
                }
            }

            using (StreamWriter sw = File.CreateText(PRODUCER_FILE_NAME)) {
                foreach (string aName in all) {
                    sw.WriteLine(aName);
                }
            }
        }

        /// <summary>
        /// Fires when user clicks the button to get more filme makers
        /// </summary>
        /// <param name="sender">The button</param>
        /// <param name="e">Event args</param>
        private void btnEvenMore_Click(object sender, RoutedEventArgs e) {
            string pattern = @"
                            ^\s+		#Beginning of line, whitespace
                            ([\w\s]+?)		#Capture group, letters and spaces
                            (?:\s-)?		#optional space hyphen
                            \s			#Whitespace
                            <br>		#Literal match
                            ";
            CaptureTextFromUrl(txtEvenMore.Text, pattern, PRODUCER_FILE_NAME);
        }

        /// <summary>
        /// Fires when user clicks the button, gets a list of film makers
        /// </summary>
        /// <param name="sender">The button</param>
        /// <param name="e">Event args</param>
        private void btnGetDocumentaryMakers_Click(object sender, RoutedEventArgs e) {
            string pattern = @"
                            <li><a		#Literal match
                            [^>]+?		#Anything but >, 1 or more
                            >			#Literal match
                            ([^<]+?)	        #Capture group, anything but <
                            </a>		#Literal match
                            ";
            CaptureTextFromUrl(txtDocumentaryMakers.Text, pattern, PRODUCER_FILE_NAME);
        }

    }
}

Here’s the Xaml!

<Window x:Class="Dec22_2013.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        WindowStartupLocation="CenterScreen"    
        Title="Puzzle Dec. 22, 2013" Height="450" Width="655">

    <Grid>
        <Grid.RowDefinitions>
            <RowDefinition Height="auto" />
            <RowDefinition Height="auto" />
            <RowDefinition Height="auto" />
            <RowDefinition Height="auto" />
            <RowDefinition Height="auto" />
            <RowDefinition Height="auto" />
            <RowDefinition Height="auto" />
            <RowDefinition Height="auto" />
            <RowDefinition />
        </Grid.RowDefinitions>
        <Grid.ColumnDefinitions>
            <ColumnDefinition Width="auto" />
            <ColumnDefinition />
            <ColumnDefinition Width="auto" />
        </Grid.ColumnDefinitions>

        <Border Grid.ColumnSpan="3">
            <Border.Background>
                <LinearGradientBrush EndPoint="0.5,1" StartPoint="0.5,0">
                    <GradientStop Color="DarkGoldenrod" Offset="0" />
                    <GradientStop Color="Gold"  Offset="1" />
                </LinearGradientBrush>
            </Border.Background>
            <TextBlock FontSize="40" Text="Sun + Movie Maker Puzzle" 
                       HorizontalAlignment="Center" Foreground="GreenYellow">
                <TextBlock.Effect >
                    <DropShadowEffect />
                </TextBlock.Effect>
            </TextBlock>
        </Border>

        <TextBlock Grid.Row="1"  Grid.ColumnSpan="3" FontSize="13" 
                   TextWrapping="Wrap" Margin="3">
            <Bold>Next week's challenge from listener Andrew Chaikin of San Francisco: </Bold>
            Think of a well-known filmmaker, first and last names. Add "S-U-N" before 
            this person's first name and last name. In each case, you'll form a common 
            English word. Who is the filmmaker?
        </TextBlock>

        <Label Grid.Row="2" Content="IMDB Producer _Url:" 
                 Target="{Binding ElementName=txtProducerUrl}" />
        <TextBox Grid.Row="2" Grid.Column="1" x:Name="txtProducerUrl" 
                 Text="http://www.imdb.com/list/ZFLM-BUjNlA/" />
        <Button Grid.Row="2" Grid.Column="2" Height="30" Margin="3" x:Name="btnGetProducers" 
                Content="Get _Producers" Click="btnGetProducers_Click" />

        <Label Grid.Row="3" Content="Wikipedia Producer URL:" 
                Target="{Binding ElementName=txtWikipediaUrl}" />
        <TextBox Grid.Row="3" Grid.Column="1" x:Name="txtWikipediaUrl"
                 Text="http://en.wikipedia.org/wiki/List_of_film_producers" />
        <Button Grid.Row="3" Grid.Column="2" x:Name="btnWikipediaProducers" 
                Height="30" Margin="3"
                Content="Wikipedia Producers"
                Click="btnWikipediaProducers_Click" />

        <Label Grid.Row="4" Content="_More Producers" 
               Target="{Binding ElementName=txtMoreProducers}" />
        <TextBox Grid.Row="4" Grid.Column="1" x:Name="txtMoreProducers"
                 Text="http://www.theyshootpictures.com/gf1000_all1000films_ex1000.htm" />
        <Button Grid.Row="4" Grid.Column="2" x:Name="btnGetMoreProducers" 
                Height="30" Margin="3" Content="M_ore Producers" 
                Click="btnGetMoreProducers_Click" />

        <Label Grid.Row="5" Content="More _Directors:" 
               Target="{Binding ElementName=txtMoreDirectors}" />
        <TextBox Grid.Row="5" Grid.Column="1" x:Name="txtMoreDirectors"
                 Text="http://www.theyshootpictures.com/gf1000_all1000films_table.php" />
        <Button Grid.Row="5" Grid.Column="2" x:Name="btnMoreDirectors" Height="30" 
                Margin="3" Content="More _Directors" Click="btnMoreDirectors_Click" />

        <Label Grid.Row="6" Content="Even More:" />
        <TextBox Grid.Row="6" Grid.Column="1" x:Name="txtEvenMore" 
                 Text="http://www.cinemateca.org/people/film_directors.htm" />
        <Button Grid.Row="6" Grid.Column="2" x:Name="btnEvenMore" Content="_Even More"
                Height="30" Margin="3" Click="btnEvenMore_Click" />

        <Label Grid.Row="7" Content="_Documentary Makers:" />
        <TextBox Grid.Row="7" Grid.Column="1" x:Name="txtDocumentaryMakers" 
                Text="http://en.wikipedia.org/wiki/List_of_directors_and_producers_of_documentaries" />
        <Button Grid.Row="7" Grid.Column="2" x:Name="btnGetDocumentaryMakers" Height="30" Margin="3"
                Content="Documentary Makers" Click="btnGetDocumentaryMakers_Click" />

        <Label Grid.Row="8" Content="_Answer:" Target="{Binding ElementName=txtAnswer}" />
        <TextBox Grid.Row="8" Grid.Column="1" x:Name="txtAnswer" />
        <Button Grid.Row="8" Grid.Column="2" x:Name="btnSolve" Height="30" Margin="3" 
                Width="{Binding ElementName=btnGetProducers,Path=ActualWidth}"
                VerticalAlignment="Top" Content="_Solve"
                 Click="btnSolve_Click" />
    </Grid>

    <!-- In lieu of linking to a file, I elected to draw a sun in Xaml -->
    <Window.Icon>
        <DrawingImage>
            <DrawingImage.Drawing>
                <GeometryDrawing Brush="Gold">
                    <GeometryDrawing.Pen>
                        <Pen Brush="Yellow" Thickness="1" />
                    </GeometryDrawing.Pen>
                    <GeometryDrawing.Geometry>
                        <EllipseGeometry RadiusX="25" RadiusY="25" />
                    </GeometryDrawing.Geometry>
                </GeometryDrawing>
            </DrawingImage.Drawing>
        </DrawingImage>
    </Window.Icon>    
</Window>

That’s all!

NPR Dancing Puzzle, December 1st, 2013

Posted on Updated on

Challenge: Name a dance. Change one of the letters to a U. The resulting letters can be rearranged to name an event at which this dance is done. What is it?

Although similar to the last several NPR puzzles, this one was a bit tough! Not for coding reasons, but because I couldn’t find a list of social events. I had to go to online thesauri repeatedly until I found the correct event name.

 

Hula (dance) --> Luau
Screen shot shows the solution to the puzzle.

The heart of the solution is to

  1. Build a dictionary of events, again using the sorted event name as the key and the original event name as the value
  2. Iterate the dance list
  3. For each dance name, form a candidate by
    1. Iterating the letters
    2. Replacing each letter with u
    3. Sorting the resulting string to form a candidate
    4. And checking whether that candidate is a key in our event name dictionary

If it seems like I am going a bit fast, please take a look a this previous post, where I used the same anagram dictionary technique to find other anagrams .

I created two extension methods, ‘Left‘ and ‘Right‘, to make my sample a bit easier to read:

  • MyString.Left(i) will return all the characters left of position i
  • MyString.Right(i) will return all the character to the right of position i

For example,

  • “hula”.Left(1) returns “h”.
  • “hula”.Right(1) returns “la”

I’ll show the (very simple) code for these two extension methods shortly, in the mean time, here’s the heart of the solution. This code snippet assumes

  1. That you have a file containing dance names named “DANCE_FILE”
  2. Which you have cleaned-up to exclude non-alpha characters like blanks and hyphens
  3. And that you have an event name dictionary “eventDic” similar to what I described above
//Open the dance file and process each dance name (one per line)
using (StreamReader sr = File.OpenText(DANCE_FILE)) {
    while (sr.Peek() != -1) {
        string aDance = sr.ReadLine().Trim().ToLower();

        //Iterate the dance name chracters, injecting u at every 
        //position, then sorting the result, to form a candidate:
        for (int i = 0; i < danceName.Length -1; i++) {
            //Combine the left side + u + right side and sort the result:
            string candidate = (danceName.Left(i) + "u" + danceName.Right(i))
                                .Select(c => c)
                                .OrderBy(c => c)
                                .Aggregate("", (t, c) => t + c);

            //If the candidate's sorted letters match the sorted letters of
            //an event name, that means they are anagrams!
            if (eventDic.ContainsKey(candidate)) {
                txtAnswer.Text += string.Format("Dance name: '{0}'; " 
                                    + "Event name: '{1}'\n", 
                                    aDance, eventDic[candidate]);
            }
        }
    }
}

If you’re gagging on the funky syntax (“.Select”, “.Aggregate”, etc.),  be advised that I skimmed over this because I explained it in a previous post and didn’t want to bore repeat visitors. These are Lambda Expressions, and I explain their usage in depth in this post, except that there I used Lambda Expressions instead of Linq syntax.

I promised to show my extension class, so here it is:

public static class StringExtensions {
    public static string Left(this string str, int rightPos) {
        if (rightPos <= 0)
            return "";
        else
            return str.Substring(0, rightPos);
    }

    public static string Right(this string str, int rightPos) {
        if (rightPos < str.Length - 1)
            return str.Substring(rightPos + 1);
        else
            return "";

    }
}

There is nothing in here too difficult; the point is that .NET allows you to write your own extension methods which behave as if they were part of the .NET string class. A simple class like this can help make your code more readable. Note the class is static, and also note the use of ‘this string str‘ as a parameter. That’s all it takes to make an extension method!

I used one more interesting bit of coding while getting a list of dance names. If you’ve read my previous puzzle articles, you know that I use regular expressions to get lists from web pages, frequently from Wikipedia. In this case, I got my list of dances from here. If you go to that link, you will see that the main list is preceded by a bunch of other links, which happen to match the same pattern as the dance names! In order to deal with this, I modified my regular expression to exclude any hits whose ‘payload’ starts with ‘List’. To do so, I used the Regex feature ‘Negative Lookahead‘, which was fairly new to me.

Here’s my documented regex to capture dance names from that page (negative lookahead highlighted):


<li><a                  #Literal match
\s+                     #Whitespace, one or more
href=\"/wiki/           #Literal match
[^>]+                   #Anything except >, one or more
>                       #Literal match
(                       #Start of our group
(?!List)                #Exclude matches starting with List
[^<]+                   #Anything except <
)                       #end of group
</a>                    #Literal match

In case you are totally baffled,

  • The regex will take raw HTML as input
  • That is, the HTML from Wikipedia
  • Which looks like this:
    <li><a href="/wiki/Hula" title="Hula">Hula</a></li>
  • This regular expression contains a capture group (starts on line 6)
  • Basically, it looks for
    • “<li><a” (line 1)
    • followed by space (line 2)
    • followed by href=\”/wiki/ (line 3)
    • followed by anything except a ‘>’  (this is a “negated character class“) (line 4)
    • Followed by > (line 5)
    • The parenthesis on line 6 starts a capture group
    • The negative lookahead (line 7) rejects any text starting with ‘article’
    • On line 8 we specify the text we will capture, namely: anything except <
    • Line 9 we close the capture group with a close-parenthesis
    • And specify that it must be followed by </a> (line 10)

I have listed the entire code below (138 lines, it took way longer for me to explain it here!)


using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Windows;
using System.Text.RegularExpressions;
using System.Net;                           //For WebClient
using System.IO;

namespace Dec_1_2013 {
    /// <summary>
    /// Solves the puzzle by responding to each button click form the main page
    /// </summary>
    /// <remarks>
    /// Has code to fetch dance lists from two web pages and 
    /// then use those dance lists to /// solve the puzzle
    /// </remarks>
    public partial class MainWindow : Window {
        private const string DANCE_FILE = "DanceList.txt";
        public MainWindow() {
            InitializeComponent();
        }

        /// <summary>
        /// Fires when user clicks the button to get the dance list from the web
        /// </summary>
        /// <param name="sender">The button</param>
        /// <param name="e">Event args</param>
        /// <remarks>
        /// Goes the the URL in the textbox, gets the page content, extracts matches 
        /// with the regex, saves to file
        /// </remarks>
        private void btnGetDances_Click(object sender, RoutedEventArgs e) {
            string pattern = @"
                            <li><a			    #Literal match
                            \s+				    #Whitespace, one or more
                            href=\""/wiki/	    #Literal match
                            [^>]+			    #Anything except >, one or more
                            >				    #Literal match
                            (				    #Start of our group
                            (?!List)		    #Exclude matches starting with List
                            [^<]+			    #Anything except <
                            )				    #end of group
                            </a>			    #Literal match
                            ";
            Regex reDance = new Regex(pattern
                    , RegexOptions.IgnorePatternWhitespace | RegexOptions.Multiline);

            //Go to the web page, get the text, extract 
            //all the matches using the regex, save to file
            using (WebClient wc = new WebClient()) {
                using (StreamReader sr = new StreamReader(wc.OpenRead(txtDanceURL.Text))) {
                    string allText = sr.ReadToEnd();

                    //There exist a number of false matches after the last dance on 
                    //the page (marked by the text 'see also'), find the position now
                    int stopPos = allText.IndexOf("See also");
                    using (StreamWriter sw = File.CreateText(DANCE_FILE)) {
                        Match m = reDance.Match(allText);

                        //keep looping so long as we have more matches and have 
                        //not exceeded the last dance position
                        while (m.Success && m.Index < stopPos) {
                            sw.WriteLine(m.Groups[1].Value);
                            m = m.NextMatch();
                        }
                    }
                }                
            }
            MessageBox.Show("Done", "Dances Captured");
        }

        /// <summary>
        /// Fired when user clicks the button to solve the puzzle
        /// </summary>
        /// <param name="sender">The button</param>
        /// <param name="e">Event args</param>
        /// <remarks>
        /// Load the dance list from file into a dictionary
        ///     - Each dictionary key is the sorted dance name
        ///     - Each dictionary value is the UNsorted dance name, the original value
        /// 
        /// Loop through every social event in the text box
        ///      - Try replacing each letter in the event name with 'u'
        ///      - Sort the result and check for a match in the dictionary
        ///      - If we have a hit, display in the textbox
        /// </remarks>
        private void btnSolve_Click(object sender, RoutedEventArgs e) {
            txtAnswer.Clear();

            //Snag the events from the textbox, put them into an array after splitting on comma char
            string[] events = txtEventNames.Text.ToLower().Split(',');
            Dictionary<string, string> eventDic = new Dictionary<string, string>();
            foreach (string evt in events) {
                //Remove non-alpha letters, sort, recombine into a new string
                string sortedEvent = evt.Where(ev => ev>= 'a' && ev <= 'z')
                                     .OrderBy(ev => ev)
                                     .Aggregate("", (t, c) => t + c);

                //store in the dictionary
                if (eventDic.ContainsKey(sortedEvent))
                    eventDic[sortedEvent] += ", " + evt;
                else
                    eventDic.Add(sortedEvent, evt.Trim());
            }

            //Now open the dance file and process each dance name (one per line)
            using (StreamReader sr = File.OpenText(DANCE_FILE)) {
                while (sr.Peek() != -1) {
                    string aDance = sr.ReadLine().Trim().ToLower();
                    //Remove non-alpha characters
                    string danceName = (aDance.Where(d => d >= 'a' && d <= 'z'))
                                                .Aggregate("", (t, c) => t + c);

                    //Iterate the dance name chracters, injecting u at every 
                    //position, to form a candidate
                    for (int i = 0; i < danceName.Length -1; i++) {
                        //Combine the left side + u + right side and sort the result:
                        string candidate = (danceName.Left(i) + "u" + danceName.Right(i))
                                            .Select(c => c)
                                            .OrderBy(c => c)
                                            .Aggregate("", (t, c) => t + c);

                        //If the candidate's sorted letters match the sorted letters of
                        //an event name, that means they are anagrams!
                        if (eventDic.ContainsKey(candidate)) {
                            txtAnswer.Text += string.Format("Dance name: '{0}'; "
                                                + "Event name: '{1}'\n", 
                                                aDance, eventDic[candidate]);
                        }
                    }
                }
            }
            MessageBox.Show("Done", "Matching Complete", MessageBoxButton.OK, 
                         MessageBoxImage.Exclamation);
        }

    }
}

And here is the XAML:


<Window x:Class="Dec_1_2013.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        WindowStartupLocation="CenterScreen"
        Title="Dec 1st, 2013" Height="460" Width="555">
    <Grid>
        <Grid.RowDefinitions>
            <RowDefinition Height="auto" />
            <RowDefinition Height="auto" />
            <RowDefinition Height="auto" />
            <RowDefinition />
            <RowDefinition Height="auto" />
        </Grid.RowDefinitions>
        <Grid.ColumnDefinitions>
            <ColumnDefinition Width="auto" />
            <ColumnDefinition/>
            <ColumnDefinition  Width="auto" />
        </Grid.ColumnDefinitions>

        <Border Grid.ColumnSpan="3">
            <Border.Background>
                <LinearGradientBrush EndPoint="0.5,1" StartPoint="0.5,0">
                    <GradientStop Color="DarkRed" Offset="0" />
                    <GradientStop Color="Crimson" Offset="1" />
                </LinearGradientBrush>
            </Border.Background>
            <TextBlock HorizontalAlignment="Center" FontSize="36" 
                       Foreground="AntiqueWhite" 
                       Text="Dance Name to Event Name">
                <TextBlock.Effect>
                    <DropShadowEffect BlurRadius="4" Color="LightSalmon" 
                       Opacity="0.5" />
                </TextBlock.Effect>
            </TextBlock>
        </Border>

        <TextBlock Grid.Row="1" Grid.ColumnSpan="3" TextWrapping="Wrap" 
                   FontSize="14">
            <Bold>Next week's challenge: </Bold>
            Name a dance. Change one of the letters to a U. The 
            resulting letters can be rearranged to name an event at 
            which this dance is done. What is it?
        </TextBlock>

        <Label Grid.Row="2" Content="Dance _URL:" 
               Target="{Binding ElementName=txtDanceURL}" />
        <TextBox Grid.Row="2" Grid.Column="1" x:Name="txtDanceURL" 
                 Text="http://en.wikipedia.org/wiki/List_of_dances" />
        <Button Grid.Row="2" Grid.Column="2" x:Name="btnGetDances" 
                Content="Get _Dance List" Height="30"
                HorizontalAlignment="Right" Margin="3" Click="btnGetDances_Click" />

        <Label Grid.Row="3" Content="_Events:" 
               Target="{Binding ElementName=txtEventNames}" />
        <TextBox Grid.Row="3" Grid.Column="1" x:Name="txtEventNames" 
                 AcceptsReturn="True" TextWrapping="WrapWithOverflow"
                 VerticalAlignment="Stretch">
                social function, fundraiser, function, debutante ball, 
                photo opportunity, jubilation, funeral, groundbreaking, 
                graduation, sacramental union, communion, masquerade, ritual, 
                inauguration, presidential inauguration, papal inauguration,
                audience, changing of the guard, courtesy call, 
                investiture, yule, house party, circumstance,  nuptials, 
                induction, Maundy, bunfight, musical, burlesque, puppet play, 
                tournament, occurrence, masquerade ball, masque, masquerade party, 
                nauch, nautch, nautch dance, duet, pas de deux, pas de quatre, 
                ritual dance, ritual dancing, nude dancing, beaux arts ball,
                buffet, housewarming party, housewarming, costume party, 
                hurricane party, June Event, Open house, houseparty, 
                Symposium, Musical Occasion, Cultural festivals, Burning Man,
                Hula Festival, Saturnalia, Fat Tuesday, blowout, Ludi Saeculares,
                secular games, perambulation, church festival, luau, slumber party,
                reunion, banquet, musical soiree, soiree musicale
        </TextBox>
        <Button Grid.Row="4" Grid.Column="2" Content="_Solve" Height="30" 
                Margin="3" HorizontalAlignment="Right"
                VerticalAlignment="Top" Click="btnSolve_Click" x:Name="btnSolve"
                Width="{Binding ElementName=btnGetDances, Path=ActualWidth}" />

        <Label Grid.Row="4" Content="_Answer:" 
               Target="{Binding ElementName=txtAnswer}" />
        <TextBox Grid.Row="4" Grid.Column="1" x:Name="txtAnswer" />
    </Grid>
</Window>