NR Puzzle Feb 25, 2022

Posted on March 4, 2022 Updated on March 4, 2022

This week’s challenge comes from listener Alan Hochbaum, of Duluth, Ga. Name a famous actor — first and last names. Remove the last letter of each name. You’ll be left with an animal and an adjective that describes that animal, respectively. Who is the actor?
Link to Puzzle at NPR

Take a list of actors, remove a letter from first and last names, and get an animal and an adjective.

Another easy puzzle, and again I elected to solve it using Python. Skills and techniques include:

File processing
String manipulation
Simple Regular Expressions
Binary search
Loops and basic logic

Screenshot from Visual Studio Code Shows the Terminal Print-out of my 3 Candidate Solutions

Comments

I’m pretty confident in my code, and I have a lists of 16,000 actors, 5,000 adjectives, and 2,000 animals. But, I don’t have all the actors, animals and adjectives! So there could be other solutions that I missed, even though my code works satisfactorily. “Redd Foxx” is not a well known actor these days. Sorry if either is your favorite actor, but I’ve never heard of my other two actors, Cody Longo and Sean Whalen. So, come Sunday, I would not be surprised if Will says some other actor is the solution. But, as you can see, given my data, these three are legitimate solutions.

Redd Foxx in the Breakthrough Television Comedy “Sanford and Son”

Strategy

Load the animal file into a list. Note my file is alphabetically sorted
Load the adjective file into another list
Open the actor file and process each entry
- Split the name into separate words
  - Using a regular expression expressing to control the separator character
Take the first word – call it the first name,
- Take the last word – call it the actor’s last name
  - Avoid any suffix, such as ‘Jr.’, ‘Sr.’, etc.
- Use a binary search to quickly search the animal and adjective lists
- If we get a hit for first and last names, print a candidate solution

Here’s The Code

Step 1: Load the two Files

import re
# Open and load the animals file; note it is presorted
fName = "C:\\Users\\Randy\\Documents\\Puzzles\\Data\\CommonAnimals.txt"
animals = []
# Note: file is already sorted
fHandle = open(fName, 'rt')
for aLine in fHandle:
    animals.append(aLine.strip())

# Open and load the adjectives file, also presorted
adjectives = []
# This file is sorted too
fName = "C:\\Users\\Randy\\Documents\\Puzzles\\Data\\Adjectives.txt"
fHandle = open(fName, 'rt')
for aLine in fHandle:
    adjectives.append(aLine.strip())

A couple notes:

I use a double backslash \\ in the file path because a single backslash is used to designate special characters
I invoke the ‘strip’ method primarily to remove the linefeed

Step 2: Define the Binary Search Method

def binary_search(list1, n):  
    low = 0  
    high = len(list1) - 1  
    mid = 0  
  
    while low <= high:  
        # Compute index for our next guess, the middle of high and low
        mid = (high + low) >> 1
  
        # Check if n is present at mid
        if list1[mid] < n:  
            low = mid + 1  
  
        # If n is greater, compare to the right of mid
        elif list1[mid] > n:  
            high = mid - 1  
  
        # If n is smaller, compared to the left of mid
        else:  
            return mid  
  
    # element not found, return -1
    return -1

Notes

Binary search is extremely fast (it runs in O (log(n)) time). Meaning that the search time is proportionate to the log of the list size. For example, if the list has a million entries, binary search would find an entry in an average 6 operations (because Log(1000000) = 6). Note that it works if the list is sorted.
low, mid and high are indexes to the list. ‘mid‘ is the middle of the current search subsection.
We examine the element at position mid
- If it matches our target ‘n’, we found it
- But if it is smaller, recompute high (which then causes mid to be recomputed)
- Alternatively, if it is higher, recompute low (which also causes mid to be recomputed next guess)
Bit-shift for even more speed: the following is the faster version of division by 2:
- mid = (high + low) >> 1
It does this by shifting the bits to the right one position. I do this for a minor increase in speed; note that division is by far the slowest arithmetic operator

Step 3: Process the Actor File

# Use this list to make sure we get the correct surname
suffixes = ['Jr', 'Sr', 'II', 'III']
fName = "C:\\Users\\Randy\\Documents\\Puzzles\\Data\\Actors\\Actors.txt"
# The file has some non-standard letters, so open with utf-8
fHandle = open(fName, 'rt', encoding="utf8")
for aLine in fHandle:
    # By splitting the name, we can get each name part as 
    # a separate entry in the list 'tokens':
    tokens = re.split(r'[ ,.]', aLine.strip())
    first = tokens[0][:-1]
    last = ''
    lastIndex = len(tokens) - 1
    if tokens[lastIndex] in suffixes and lastIndex >= 2:
        last = tokens[lastIndex -1][:-1]
    elif lastIndex >= 1:
        last = tokens[lastIndex][:-1]

    # At this point, the first and last names have been stripped of
    # their last characters. Some names might not work 
    # because they have length 1,
    # such as the actor 'X Brands' or 'A.J. Trauth', 
    # so skip over empty first/last names
    if last != '' and first != '':
        # see if first name starts corresponds to an animal
        aniIndex = binary_search(animals, first)
        # -1 means no match; so now let's see 
        # if first name works with adjectives:
        if aniIndex == -1:
            adjIndex = binary_search(adjectives, first.lower())

            if adjIndex != -1:
                aniIndex = binary_search(animals, last)
                if aniIndex != -1:
                    print(aLine.strip(), '-->',  adjectives[adjIndex], animals[aniIndex].lower())
        else:
            # First name is an animal; try last name for adjectives
            adjIndex = binary_search(adjectives, last.lower())
            if adjIndex >= 0:
                print(aLine.strip(), '-->', adjectives[adjIndex], animals[aniIndex].lower())

Comments

I use a regular expression to split the actor name into a list of separate strings
- tokens = re.split(r'[ ,.]', aLine.strip())
For example, “William Collier, Jr.”
- Becomes [“William”, “Collier”, “Jr”], i.e. a list of 3 strings
The separator characters (delimiters) are space, comma and period, specified here:
- r'[ ,.]'
The first name is at position 0 of the list, the first element
The following line of code creates a variable ‘first’ by removing the last character from the first element:
- first = tokens[0][:-1]
- Using string slicing
Usually, the last string in my list is the actor’s surname, but sometimes they have a suffix like “Sr.”
The following code tests if the last string is such a suffix:
- if tokens[lastIndex] in suffixes and lastIndex >= 2:
After assigning the first and last name, we use those variables to perform a binary search
If first is an adjective and last is an animal, print a solution
Or, if last is an adjective and first is an animal, print

Download the Code

You can download my code and data files here. Note that you will be required to create a free DropBox account to do so.

This entry was posted in Puzzle Solving! and tagged Binary Search, Puzzle-solving, Python.

One thought on “NR Puzzle Feb 25, 2022”

Sunday Puzzle for April 10, 2022 « Randy's Rants, Ravings and Reflections said:
April 11, 2022 at 5:04 am

[…] to search a list. That is important when searching a big list. Since I discussed this code before (NPR Puzzle February 5, 2022), I will only list the code and not discuss the […]

	Sunday Puzzle for Ap… on NR Puzzle Feb 25, 2022
	Isadora Wickersham on Use DataTriggers to Initiate W…
	Eq on Body Parts From Literary Chara…
	Programmatically Ret… on Search TFS the Easy Way…
	Use a DataBound Smil… on Visual Studio Quick Tip: Snipp…

Randy's Rants, Ravings and Reflections

A code enthusiast's musings