Month: June 2022

NPR Puzzle for June 12, 2022

Posted on

Famous American’s last name manipulated to become a European capital.

This week’s challenge comes from listener Theodore Regan, of Scituate, Mass. Take the last name of a famous 20th-century American. The 5th, 6th, 7th, and 1st, 2nd, and 3rd letters, in that order, name a European capital. Who is the person, and what capital is it?

Link to the challenge
Reposition Letters in a Famous American’s Last Name to get a European Capital

Techniques Used

  • String Manipulation
  • Simple File IO
  • Screen Scraping (to get the famous Americans lists)
  • XPath (Only discussed in my comments in the download)
  • HtmlAgilityPack (also only in the download)

Synopsis

This one should have been easy, but I struggled some. First, I had to get a good enough list of famous Americans. That was tricky because my list source was 56 separate pages on Wikipedia (one page per state or territory). The large number of pages was not a problem as much as the differing formats on each page. Then I accidentally assumed the person’s name should be six letters long, resulting in no output until I realized my mistake.

Even though I struggled a bit, I’m pretty sure that this puzzle is still easier to solve by coding than by traditional methods. Yea, team coders!

However, having struggled to build the famous Americans files (also the European capitals), the algorithm is straightforward: 1) read the capitals into a list, 2) read each person from the 56 files, 3) parse each input line to extract the last name, 4) manipulate the last name per Will’s instructions, and 5) search the capitals list to see if the manipulated name matches any capital.

The heart of the algorithm is encapsulated in the following four lines of code:

var candidate = lastName[4..7] + lastName[..3];
var ndx = capitalList.BinarySearch(candidate, StringComparer.OrdinalIgnoreCase);
if (ndx >= 0) 
    txtResults.Text += $"{fullName} - {capitalList[ndx]}\n";

Don’t worry if it doesn’t make sense yet, I’ll explain it below. 😊

Screen shot showing the solution. I was surprised to see three “famous” Lindbergh’s! Note that Charles Lindbergh is claimed by four states: he was born in Michigan, worked in Missouri, lived in New Jersey and died in Hawaii. I elected to keep my code simple, displaying the names multiple times, because the answer is clear enough, and I wished to keep my code simple so you can follow more easily.

Here’s the Code!

Comments below with a number correspond to a numbered bullet point in the explanation deeper down:

private void btnSolve_Click(object sender, RoutedEventArgs e) {
  //Load the capitals into a list via a method defined below:
  var capitalList = LoadCapitalList();

  var folderName = @"C:\Users\Randy\Documents\Puzzles\Data\Americans";
  //1) The folder holds 56 files, one for each state or territory
  var fList = Directory.GetFiles(folderName);
  //Loop through all 56 files and read each line:
  foreach(var fileName in fList) {
    using(var sr = File.OpenText(fileName)) {
      //2) Peek tells us when we hit the end of the file
      while(sr.Peek() != -1) {
        var aLine = sr.ReadLine() ?? "";
        //3) Each line holds multiple fields separated by | characters
        //The first field is the person's name, so find the 1st |
        var p = aLine.IndexOf('|');
        var fullName = "";
        if (p >= 0)
          //4) Extract everything before the |, using range operators:
          fullName = aLine[..p];
        else
          fullName = aLine;

        //Now that we have the full name, extract the last name
        var lastName = "";
        //Look for the last space, the last name starts after it:
        p = fullName.LastIndexOf(' ');
        if (p >= 0)
          //5) 
          lastName = fullName[(p + 1)..];
        else
          //Some people only have one name, so there is no space
          lastName = fullName;

        if (lastName.Length >= 7) {
	  //5) Swap the letters per Will's instructions:
          var candidate = lastName[4..7] + lastName[..3];
          //6) Binary search is fast and allows mixed-case matches:
          var ndx = capitalList.BinarySearch(candidate, 
                                  StringComparer.OrdinalIgnoreCase);
	  //If ndx is non-negative, that means we have a solution!
          if (ndx >= 0) 
            txtResults.Text += $"{fullName} - {capitalList[ndx]}\n";
        }
      }
    }
  }
}

Discussion

  1. var fList = Directory.GetFiles(folderName);
    • The method GetFiles returns an array of file names which we can loop through
    • The GetFiles method returns an array, which I iterate, so as to process every file in the folder
  2. while(sr.Peek() != -1) {
    • ‘sr’ is a StreamReader; the Peek method tells us when we reach the end of the file by returning -1
  3. var p = aLine.IndexOf('|');
    • My files containing famous people uses the pipe (‘|’) character to separate fields
    • So that is why I search for a ‘|’, because the name ends where the ‘|’ starts.
  4. fullName = aLine[..p];
    • Here I am using the range operator to extract a sub-string from the variable aLine.
    • The syntax above is equivalent to aLine[0..p], meaning “start at position 0 and finish prior to p“.
    • However, we can omit the 0 because it is implied when omitted
  5. var candidate = lastName[4..7] + lastName[..3];
    • The candidate might be a European capital; per Will’s instructions, we put letters 4-7 at the start, concatenate letters 1-3 at the end. Again, I am using range operators
  6. var ndx = capitalList.BinarySearch(candidate, StringComparer.OrdinalIgnoreCase);
    • The binary search method is a super fast way to search a list, also convenient
    • Note that the candidate I just built looks like this: “berLin“. Note that ‘b’ is lower case and ‘L’ is upper case, because that’s how they appeared in the source, ‘Lindebergh‘.
    • Also note that the entry in capitalList has different capitalization, ‘Berlin‘, and under normal string comparison, these two strings are not equal!
    • I handle this by specifying ‘StringComparer.OrdinalIgnoreCase‘ as a parameter to the search. This means the strings will be compared ignoring case.
    • Note that the result returned by the BinarySearch method is a negative number if nothing is found, otherwise it is the index of the item searched for. In my list of 65 capitals, Berlin is at index 7. Note that C# uses zero-based indexing, meaning that the first entry has index 0, which is confusing to many, including me 😒

Method ‘LoadCapitalList‘ Loads the Capitals File into a List

The method shown below was invoked in the code listing above:

private List<string> LoadCapitalList() {
  var fName = @"C:\Users\Randy\source\repos\GetFamousAmericans\GetFamousAmericans\EuropeanCapitals.txt";
  var capitalList = new List<string>();
  using (var sr = new StreamReader(File.OpenRead(fName)))
    while (sr.Peek() != -1) {
      //'Readline' returns null or a string, the null coalesce
      //operator ?? says 'convert any null to an empty string'
      var aLine = sr.ReadLine() ?? "";
      //The file has two fields separated by '~'; the 1st field is the capital
      var p = aLine.IndexOf("~");
      var capital = aLine[..p];      
      capitalList.Add(capital);
    }

  //Binary search only works on sorted lists, so sort it!
  capitalList.Sort();
  return capitalList;
}

Remarks – Loading the File

The code is pretty straightforward, using the same techniques as shown above to open the file and loop through it. Note that this file is different because its delimiter is a tilde, not a pipe.

Code to Get Famous Americans

In order to keep this post short and sweet, I omitted explaining how I did that. However, you can download my code and examine the comments in the file to learn more. Note that the code to get the famous Americans is considerably more complex than the code to solve the puzzle. To download the HTML pages and parse them, I used HtmlAgilityPack, which in turn relies on XPath. That allowed me to extract the parts of the page I wanted, ignoring all the text and formatting.

Get the Code!

Click here to download my code; note that to do so, you will need a free DropBox account. You will need to revise any variables containing file paths to match the path on your PC. Unless you have folders name ‘Randy’!

Puzzle June 5, 2022

Posted on

Country Name Containing a Deodorant and an Air Freshener

This week’s challenge comes from listener Ben Bass of Chicago. The name of what country contains a deodorant and an air freshener in consecutive letters?

Link to the challenge
Find a country name containing brand names for deodorant and air freshener

Techniques Used

  • String manipulation
  • Simple file handling
Screen shot showing the solution

Synopsis

Another super easy one! There aren’t that many consumer products to search for, and for some reason, people have made web pages for those products. So I was able to copy the product names manually off those web pages and save as a pair of files I used to solve the puzzle. Happily, I already had a list of country names, so no work was required for that.

After making my files, the process involved 1) reading the country, deodorant and freshener names into three lists, 2) iterating the country list, 3) for each country, searching the name for a deodorant, 4) if a hit resulted, searching again for a freshener.

Since Will didn’t say whether the deodorant was first, I actually needed two search sections, one that searches for deodorant being first in the country, the other for the freshener being first.

The similarities in loading the three lists resulted in consolidating my code to a method to do the work. Similarly, I wrote a single method to search a country name for the contents of two anonymous lists, thereby allowing me to invoke the same code twice but with the lists in different order. The result was another consolidation of code, again making the resulting program shorter and easier to understand.

Here’s the Code!

First, the Main Method, Which Makes use of My Two Methods

private void btnSolve_Click(object sender, RoutedEventArgs e) {
  //1) Make a list of country names:
  var fName = @"C:\Users\Randy\Documents\Puzzles\Data\Countries.txt";
  var countryList = ReadFileIntoList(fName);

  //2) Make another list of deodorant names:
  fName = @"C:\Users\Randy\Documents\Puzzles\Data\Deodorants.txt";
  var deodorantList = ReadFileIntoList(fName);

  //3) And now another list of air fresheners:
  fName = @"C:\Users\Randy\Documents\Puzzles\Data\AirFresheners.txt";
  var freshenerList = ReadFileIntoList(fName);

  //4) Loop through the countries and check each to see if it matches:
  foreach (var country in countryList) {
    //This block of code checks for deodorant first, followed by air freshener
    var solution = FindListEntriesInCountryName(country, deodorantList, freshenerList);
    if (!string.IsNullOrWhiteSpace(solution))
      txtResult.Text += solution;

    //5) This block of code swaps the list order to check for
    //air freshener first, followed by deodorant:
    solution = FindListEntriesInCountryName(country, freshenerList, deodorantList);
    if (!string.IsNullOrWhiteSpace(solution))
      txtResult.Text += solution;
  }
}

Remarks

The code above is pretty straightforward. Code blocks marked above as items 1, 2, 3 serve to load the three files into three lists. Each returned list contains string entries

The code blocks marked as items 4 and 5 above serve to search the country name for deodorant name and air freshener. The parameter order is swapped when I invoke ‘FindListEntriesInCountryName‘, meaning that block 4 searches for deodorant followed by an entry in the freshener list, and block 5 does vice versa.

My Two Methods

/// Open a file, read each line and add it to a list private List ReadFileIntoList(string fName) {
  var result = new List<string>();
  using (var sr = File.OpenText(fName))
    while (sr.Peek() != -1) {
      var addMe = sr.ReadLine() ?? "";
      result.Add(addMe);
    }
  return result;
}

/// Searches a country name using entries from two listsprivate string FindListEntriesInCountryName(string country, 
                      List<string> list1, List<string> list2) {
  var result = "";
  foreach (var deodorant in list1) {
    var p1 = country.IndexOf(deodorant, StringComparison.CurrentCultureIgnoreCase);
    if (p1 >= 0) {
      var startPos = p1 + deodorant.Length;
      if (startPos < country.Length)
        foreach (var freshener in list2) {
          var p2 = country.IndexOf(freshener, startPos,   StringComparison.CurrentCultureIgnoreCase);
          if (p2 >= 0) {
            result = $"{country} - {deodorant} - {freshener}\n";
            break;
          }
        }
    }
  }
  return result;
}

Remarks on the Two Methods

ReadFileIntoList

The first method opens the file whose name was provided as a parameter, ‘fName‘. File.OpenText gives us a StreamReader which has two methods we use: Peek and ReadLine.

var addMe = sr.ReadLine() ?? ""; 

What’s that weird looking code above? Note that ReadLine might theoretically return null; I deal with that issue by using the ?? operator (the null coalescing operator). The string I supply after the ?? operator (an empty string) is substituted whenever a null is returned. I could have substituted “WTF” instead of an empty string, but that would have been silly 😆.

FindListEntriesInCountryName

The second method takes a country name as input, plus two lists named ‘List1’ and ‘List2’. First we iterate list 1, using the IndexOf method to check if a list entry is present in the list. IndexOf will return a negative number if it fails, so when we find a number 0 or higher, we know we have a hit.

In that case, we check if the second list might contain another match. It must start after the first hit, so I use a variable ‘startPos‘ to indicate where we should look inside the country name. For example, if I found deodorant ‘DogBreath’ at position 2 in the country name, then startPos would be 11, because the length of the deodorant is 9 and 9 + 2 = 11.

In actuality, ‘Ban’ is at position 0, so I look for entries in the freshener list starting at position 3. Remember that string indices start at position 0, not position 1.

‘BAN’ starts at position 0, so ‘Glade’ should start after position 3

Can you Spot my Cheat?

If you’re paying close attention, you might realize that Will says the 2nd entry should start immediately after the first. But my code allows intervening letters. I left this in place for simplicity and because I found a single solution. But theoretically, my code could have found ‘BanBlahBlahBlahGladeSh as an answer!

Download my Code

Click here to download my code; note that you will need a (free) DropBox account to do so.