Month: June 2022
NPR Puzzle for June 12, 2022
Famous American’s last name manipulated to become a European capital.
This week’s challenge comes from listener Theodore Regan, of Scituate, Mass. Take the last name of a famous 20th-century American. The 5th, 6th, 7th, and 1st, 2nd, and 3rd letters, in that order, name a European capital. Who is the person, and what capital is it?
Link to the challenge
Techniques Used
- String Manipulation
- Simple File IO
- Screen Scraping (to get the famous Americans lists)
- XPath (Only discussed in my comments in the download)
- HtmlAgilityPack (also only in the download)
Synopsis
This one should have been easy, but I struggled some. First, I had to get a good enough list of famous Americans. That was tricky because my list source was 56 separate pages on Wikipedia (one page per state or territory). The large number of pages was not a problem as much as the differing formats on each page. Then I accidentally assumed the person’s name should be six letters long, resulting in no output until I realized my mistake.
Even though I struggled a bit, I’m pretty sure that this puzzle is still easier to solve by coding than by traditional methods. Yea, team coders!
However, having struggled to build the famous Americans files (also the European capitals), the algorithm is straightforward: 1) read the capitals into a list, 2) read each person from the 56 files, 3) parse each input line to extract the last name, 4) manipulate the last name per Will’s instructions, and 5) search the capitals list to see if the manipulated name matches any capital.
The heart of the algorithm is encapsulated in the following four lines of code:
var candidate = lastName[4..7] + lastName[..3]; var ndx = capitalList.BinarySearch(candidate, StringComparer.OrdinalIgnoreCase); if (ndx >= 0) txtResults.Text += $"{fullName} - {capitalList[ndx]}\n";
Don’t worry if it doesn’t make sense yet, I’ll explain it below. 😊
Here’s the Code!
Comments below with a number correspond to a numbered bullet point in the explanation deeper down:
private void btnSolve_Click(object sender, RoutedEventArgs e) { //Load the capitals into a list via a method defined below: var capitalList = LoadCapitalList(); var folderName = @"C:\Users\Randy\Documents\Puzzles\Data\Americans"; //1) The folder holds 56 files, one for each state or territory var fList = Directory.GetFiles(folderName); //Loop through all 56 files and read each line: foreach(var fileName in fList) { using(var sr = File.OpenText(fileName)) { //2) Peek tells us when we hit the end of the file while(sr.Peek() != -1) { var aLine = sr.ReadLine() ?? ""; //3) Each line holds multiple fields separated by | characters //The first field is the person's name, so find the 1st | var p = aLine.IndexOf('|'); var fullName = ""; if (p >= 0) //4) Extract everything before the |, using range operators: fullName = aLine[..p]; else fullName = aLine; //Now that we have the full name, extract the last name var lastName = ""; //Look for the last space, the last name starts after it: p = fullName.LastIndexOf(' '); if (p >= 0) //5) lastName = fullName[(p + 1)..]; else //Some people only have one name, so there is no space lastName = fullName; if (lastName.Length >= 7) { //5) Swap the letters per Will's instructions: var candidate = lastName[4..7] + lastName[..3]; //6) Binary search is fast and allows mixed-case matches: var ndx = capitalList.BinarySearch(candidate, StringComparer.OrdinalIgnoreCase); //If ndx is non-negative, that means we have a solution! if (ndx >= 0) txtResults.Text += $"{fullName} - {capitalList[ndx]}\n"; } } } } }
Discussion
var fList = Directory.GetFiles(folderName);
- The method
GetFiles
returns an array of file names which we can loop through - The
GetFiles
method returns an array, which I iterate, so as to process every file in the folder
- The method
while(sr.Peek() != -1) {
- ‘sr’ is a
StreamReader
; thePeek
method tells us when we reach the end of the file by returning -1
- ‘sr’ is a
var p = aLine.IndexOf('|');
- My files containing famous people uses the pipe (‘|’) character to separate fields
- So that is why I search for a ‘|’, because the name ends where the ‘|’ starts.
fullName = aLine[..p];
- Here I am using the
range
operator to extract a sub-string from the variableaLine
. - The syntax above is equivalent to
aLine[0..p]
, meaning “start at position 0 and finish prior to p“. - However, we can omit the 0 because it is implied when omitted
- Here I am using the
var candidate = lastName[4..7] + lastName[..3];
- The
candidate
might be a European capital; per Will’s instructions, we put letters 4-7 at the start, concatenate letters 1-3 at the end. Again, I am using range operators
- The
var ndx = capitalList.BinarySearch(candidate, StringComparer.OrdinalIgnoreCase);
- The binary search method is a super fast way to search a list, also convenient
- Note that the candidate I just built looks like this: “berLin“. Note that ‘b’ is lower case and ‘L’ is upper case, because that’s how they appeared in the source, ‘Lindebergh‘.
- Also note that the entry in
capitalList
has different capitalization, ‘Berlin‘, and under normal string comparison, these two strings are not equal! - I handle this by specifying ‘
StringComparer.OrdinalIgnoreCase
‘ as a parameter to the search. This means the strings will be compared ignoring case. - Note that the result returned by the
BinarySearch
method is a negative number if nothing is found, otherwise it is the index of the item searched for. In my list of 65 capitals, Berlin is at index 7. Note that C# uses zero-based indexing, meaning that the first entry has index 0, which is confusing to many, including me 😒
Method ‘LoadCapitalList‘ Loads the Capitals File into a List
The method shown below was invoked in the code listing above:
private List<string> LoadCapitalList() { var fName = @"C:\Users\Randy\source\repos\GetFamousAmericans\GetFamousAmericans\EuropeanCapitals.txt"; var capitalList = new List<string>(); using (var sr = new StreamReader(File.OpenRead(fName))) while (sr.Peek() != -1) { //'Readline' returns null or a string, the null coalesce //operator ?? says 'convert any null to an empty string' var aLine = sr.ReadLine() ?? ""; //The file has two fields separated by '~'; the 1st field is the capital var p = aLine.IndexOf("~"); var capital = aLine[..p]; capitalList.Add(capital); } //Binary search only works on sorted lists, so sort it! capitalList.Sort(); return capitalList; }
Remarks – Loading the File
The code is pretty straightforward, using the same techniques as shown above to open the file and loop through it. Note that this file is different because its delimiter is a tilde, not a pipe.
Code to Get Famous Americans
In order to keep this post short and sweet, I omitted explaining how I did that. However, you can download my code and examine the comments in the file to learn more. Note that the code to get the famous Americans is considerably more complex than the code to solve the puzzle. To download the HTML pages and parse them, I used HtmlAgilityPack, which in turn relies on XPath. That allowed me to extract the parts of the page I wanted, ignoring all the text and formatting.
Get the Code!
Click here to download my code; note that to do so, you will need a free DropBox account. You will need to revise any variables containing file paths to match the path on your PC. Unless you have folders name ‘Randy’!
Puzzle June 5, 2022
Country Name Containing a Deodorant and an Air Freshener
This week’s challenge comes from listener Ben Bass of Chicago. The name of what country contains a deodorant and an air freshener in consecutive letters?
Link to the challenge
Techniques Used
- String manipulation
- Simple file handling
Synopsis
Another super easy one! There aren’t that many consumer products to search for, and for some reason, people have made web pages for those products. So I was able to copy the product names manually off those web pages and save as a pair of files I used to solve the puzzle. Happily, I already had a list of country names, so no work was required for that.
After making my files, the process involved 1) reading the country, deodorant and freshener names into three lists, 2) iterating the country list, 3) for each country, searching the name for a deodorant, 4) if a hit resulted, searching again for a freshener.
Since Will didn’t say whether the deodorant was first, I actually needed two search sections, one that searches for deodorant being first in the country, the other for the freshener being first.
The similarities in loading the three lists resulted in consolidating my code to a method to do the work. Similarly, I wrote a single method to search a country name for the contents of two anonymous lists, thereby allowing me to invoke the same code twice but with the lists in different order. The result was another consolidation of code, again making the resulting program shorter and easier to understand.
Here’s the Code!
First, the Main Method, Which Makes use of My Two Methods
private void btnSolve_Click(object sender, RoutedEventArgs e) { //1) Make a list of country names: var fName = @"C:\Users\Randy\Documents\Puzzles\Data\Countries.txt"; var countryList = ReadFileIntoList(fName); //2) Make another list of deodorant names: fName = @"C:\Users\Randy\Documents\Puzzles\Data\Deodorants.txt"; var deodorantList = ReadFileIntoList(fName); //3) And now another list of air fresheners: fName = @"C:\Users\Randy\Documents\Puzzles\Data\AirFresheners.txt"; var freshenerList = ReadFileIntoList(fName); //4) Loop through the countries and check each to see if it matches: foreach (var country in countryList) { //This block of code checks for deodorant first, followed by air freshener var solution = FindListEntriesInCountryName(country, deodorantList, freshenerList); if (!string.IsNullOrWhiteSpace(solution)) txtResult.Text += solution; //5) This block of code swaps the list order to check for //air freshener first, followed by deodorant: solution = FindListEntriesInCountryName(country, freshenerList, deodorantList); if (!string.IsNullOrWhiteSpace(solution)) txtResult.Text += solution; } }
Remarks
The code above is pretty straightforward. Code blocks marked above as items 1, 2, 3 serve to load the three files into three lists. Each returned list contains string entries
The code blocks marked as items 4 and 5 above serve to search the country name for deodorant name and air freshener. The parameter order is swapped when I invoke ‘FindListEntriesInCountryName
‘, meaning that block 4 searches for deodorant followed by an entry in the freshener list, and block 5 does vice versa.
My Two Methods
/// Open a file, read each line and add it to a list private List ReadFileIntoList(string fName) { var result = new List<string>(); using (var sr = File.OpenText(fName)) while (sr.Peek() != -1) { var addMe = sr.ReadLine() ?? ""; result.Add(addMe); } return result; }/// Searches a country name using entries from two lists private string FindListEntriesInCountryName(string country, List<string> list1, List<string> list2) { var result = ""; foreach (var deodorant in list1) { var p1 = country.IndexOf(deodorant, StringComparison.CurrentCultureIgnoreCase); if (p1 >= 0) { var startPos = p1 + deodorant.Length; if (startPos < country.Length) foreach (var freshener in list2) { var p2 = country.IndexOf(freshener, startPos, StringComparison.CurrentCultureIgnoreCase); if (p2 >= 0) { result = $"{country} - {deodorant} - {freshener}\n"; break; } } } } return result; }
Remarks on the Two Methods
ReadFileIntoList
The first method opens the file whose name was provided as a parameter, ‘fName
‘. File.OpenTex
t gives us a StreamReader
which has two methods we use: Peek
and ReadLine
.
var addMe = sr.ReadLine() ?? "";
What’s that weird looking code above? Note that ReadLine might theoretically return null; I deal with that issue by using the ??
operator (the null coalescing operator). The string I supply after the ??
operator (an empty string) is substituted whenever a null is returned. I could have substituted “WTF” instead of an empty string, but that would have been silly 😆.
FindListEntriesInCountryName
The second method takes a country name as input, plus two lists named ‘List1’ and ‘List2’. First we iterate list 1, using the IndexOf
method to check if a list entry is present in the list. IndexOf
will return a negative number if it fails, so when we find a number 0 or higher, we know we have a hit.
In that case, we check if the second list might contain another match. It must start after the first hit, so I use a variable ‘startPos
‘ to indicate where we should look inside the country name. For example, if I found deodorant ‘DogBreath’ at position 2 in the country name, then startPos
would be 11, because the length of the deodorant is 9 and 9 + 2 = 11.
In actuality, ‘Ban’ is at position 0, so I look for entries in the freshener list starting at position 3. Remember that string indices start at position 0, not position 1.
Can you Spot my Cheat?
If you’re paying close attention, you might realize that Will says the 2nd entry should start immediately after the first. But my code allows intervening letters. I left this in place for simplicity and because I found a single solution. But theoretically, my code could have found ‘BanBlahBlahBlahGladeSh as an answer!
Download my Code
Click here to download my code; note that you will need a (free) DropBox account to do so.