Coding a Wordle solving assistant in Python
Wordle from NYT is a well-known puzzle game played by over 300,000 people every day. The game is such a phenomenon that typing “wordle” into Google brings up a custom animated doodle next to the search bar.

I’m an occasional player. The game has simple rules and a simple objective, but getting to the finish line is easier said than done. I’ve been stumped a few times and have spent more than 20 minutes trying to guess the word of the day. Then one day, I decided to do what all gamers eventually do when cornered. Cheat.
I decided to write a program to help me beat the game. Here it is:
The program progressively eliminates invalid words based on the provided input to help the player make good guesses. The program is not omniscient and cannot crack the game in one turn. That would require hacking into NYT. In this post I’ll cover some things I learned while writing the program.
The first step in writing such a program is to obtain a word list. I got mine from a kindly professor here: https://www-personal.umich.edu/~jlawler/wordlist.html. There are 69,905 words in this list. I’m sure the English language has more words than that, but this was a good starting point. The list is also simple and clean, with one word on each line and no other columns.
The next step was to filter this list and extract all the five-letter words. I’ve recently begun using Linux full time at home, and this task was quite simple in Linux.
$ egrep '^[a-z]{5}$' wordlist > five_letter_words.txt
And just like that, 69,905 words are filtered down to 5,169 words in less than a second. Linux is great (but sometimes you have to convert DOS files to Unix files, because with DOS-style CRLF line endings the trailing carriage return stops the $ anchor from matching).
Finally, the Python code. The program loads the list of five-letter words into memory and progressively whittles the list down as letters are removed from play. The user begins by loading Wordle and then running wordle-buddy in a second tab. wordle-buddy will wait patiently for the user to input their first guess.
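For anyone without egrep at hand, the same filter can be sketched in Python. This is not part of wordle-buddy as far as this post shows; the function names five_letter_words and filter_file are made up for illustration. Stripping both "\r" and "\n" sidesteps the CRLF problem mentioned above.

```python
import re

# Same pattern as the egrep command: exactly five lowercase letters.
FIVE_LETTERS = re.compile(r"[a-z]{5}")


def five_letter_words(lines):
    """Return the entries that consist of exactly five lowercase letters."""
    words = []
    for line in lines:
        word = line.rstrip("\r\n")  # handles both Unix LF and DOS CRLF endings
        if FIVE_LETTERS.fullmatch(word):
            words.append(word)
    return words


def filter_file(src_path, dst_path):
    """Hypothetical helper: read src_path, write the five-letter words to dst_path."""
    with open(src_path, encoding="utf-8") as src:
        words = five_letter_words(src)
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write("\n".join(words) + "\n")
    return len(words)
```

Calling filter_file("wordlist", "five_letter_words.txt") would mirror the shell command, assuming the downloaded list is saved as "wordlist" in the current directory.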

Suppose the secret word is “BLACK” (as on Sunday, April 10, 2022).
1st try: FROST

Result: No matches.
Now the user has to inform wordle-buddy of their guess, and the outcome.

Enter FROST at the first prompt, and the match string in the next prompt. What is a match string? To make it simple:
For GREY (non-match) letters: Enter a “/” (slash)
For YELLOW (wrong spot) letters: Enter a “#”
For GREEN (correct) letters: Enter a “+”
For FROST the match string is “/////” (five slashes).
Next, wordle-buddy uses this information to eliminate all words containing letters that were removed from play. After one turn, the list of possible matches was reduced from 5,169 to 952.
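In normal play you read the colours off the Wordle board and type the match string by hand. For testing, though, it helps to generate the string from a known secret word. The sketch below is my own helper, not something taken from wordle-buddy; the name match_string is hypothetical, and the handling of repeated letters is my understanding of how Wordle colours them (greens are claimed first, then yellows are handed out left to right while unmatched copies remain).

```python
from collections import Counter


def match_string(guess, secret):
    """Build the '/', '#', '+' string for a guess against a known secret."""
    guess, secret = guess.upper(), secret.upper()
    result = ["/"] * len(guess)

    # Secret letters that were not matched exactly are available for yellows.
    unmatched = Counter(s for g, s in zip(guess, secret) if g != s)

    # First pass: mark the greens.
    for i, (g, s) in enumerate(zip(guess, secret)):
        if g == s:
            result[i] = "+"

    # Second pass: mark yellows while unmatched copies of the letter remain.
    for i, g in enumerate(guess):
        if result[i] != "+" and unmatched[g] > 0:
            result[i] = "#"
            unmatched[g] -= 1

    return "".join(result)


print(match_string("FROST", "BLACK"))  # /////
```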
The program has a list browser that enables the user to view the remaining words to make the next guess.

You can actually enter regex here if you want, but I’m prepending the ^ (caret) anchor automatically. Typically the player would enter a single letter here to see a sub-list of all the words that begin with that letter. Or just enter !rand to get 10 random suggestions.
Hitting enter without specifying any pattern will bring up the whole list. Do this once the number of words remaining is less than 20.
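The browser behaviour described above can be sketched roughly as follows. The function name browse and its parameters are assumptions for illustration; only the caret prefix, the !rand command, and the empty-input behaviour come from the description.

```python
import random
import re


def browse(words, pattern="", sample_size=10):
    """Return the slice of the remaining words that the user asked to see."""
    if pattern == "!rand":
        # Up to sample_size random suggestions (fewer if the list is short).
        return random.sample(words, min(sample_size, len(words)))
    if not pattern:
        # Plain Enter: show everything that is left.
        return list(words)
    # Prepend the caret so the pattern is anchored to the start of the word.
    regex = re.compile("^" + pattern)
    return [word for word in words if regex.search(word)]
```

With this sketch, browse(words, "b") lists the words starting with b, and a fuller pattern such as "[bc]la" also works because the input is treated as a regular expression.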
Let’s finish the play through.
2nd try: CUMIN
Match: #////
61 words remain
3rd try: DECAY
Match: //##/
13 words remain
4th try: BLACK
Success.
The code below is the heart of the program:
This code takes the user-supplied guess and the match string, eliminates invalid words, and returns a list of the remaining valid guesses. For characters marked “/” (grey), Python scans each word in the active word list and eliminates all words that contain that character. For characters marked “#” (yellow), Python scans each word and eliminates all words that do not contain the character. Finally, for the green characters, Python eliminates all words that do not have the matching character at the correct position.
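As a rough illustration of those three rules, here is a minimal sketch. It is not the original wordle-buddy code; the name filter_words is hypothetical, and it assumes the word list is lowercase. It also adds two refinements that the description does not mention, both marked in the comments: a yellow letter rules out words with that letter in the same spot, and a grey letter is ignored if the same letter is yellow or green elsewhere in the guess (otherwise a repeated letter could eliminate the real answer).

```python
def filter_words(words, guess, match):
    """Return the words still possible after a guess and its match string."""
    guess = guess.lower()
    # Assumption (not in the description): letters known to be in the secret
    # because they are yellow or green somewhere in this guess.
    present = {g for g, m in zip(guess, match) if m in "#+"}

    remaining = []
    for word in words:
        keep = True
        for i, (g, m) in enumerate(zip(guess, match)):
            if m == "/":
                # Grey: drop words containing the letter.
                if g not in present and g in word:
                    keep = False
            elif m == "#":
                # Yellow: the letter must appear, but not at this position
                # (the position check is an added refinement).
                if g not in word or word[i] == g:
                    keep = False
            elif m == "+":
                # Green: the letter must be at exactly this position.
                if word[i] != g:
                    keep = False
            if not keep:
                break
        if keep:
            remaining.append(word)
    return remaining


words = ["black", "frost", "cumin", "decay", "clack", "blank", "coach"]
words = filter_words(words, "FROST", "/////")
words = filter_words(words, "CUMIN", "#////")
words = filter_words(words, "DECAY", "//##/")
print(words)  # ['black']
```

The tiny seven-word list is made up for the demo; with the real list the counts would follow the play-through above.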
Update: Found a better wordlist at https://github.com/DevangThakkar/wordle_archive. Using that now.