Alex Mason

Softwarrister, friend to animals, beloved by all

Solving Wordle Without Playing Wordle

All the Wordle solver programs I’ve seen work by playing the game like normal and computing the guess that provides the most information toward the answer. Asking the game to tell you about the letters is easy. It’s practically cheating. I’m looking for a challenge.

When you finish the daily Wordle, you can copy your results to the clipboard with the letters omitted, so you can show people how you did without giving away hints. That’s the intention, at least. I believe those result posts provide enough hints to solve Wordle from the outside.

I proceed to discuss some theory and potential issues, but you can skip to the Colab notebook containing the technical analysis if you want.

Definition of Wordle

The player’s objective is to guess a daily 5-letter word in up to 6 guesses. The game color-codes the letters of each guess: green means the letter is in the correct position in the answer, yellow means the letter appears in the answer at a different position, and grey (or white, or black) means the letter is not in the answer at all. Repeated letters add one wrinkle: each letter of the answer can only be claimed once, with greens taking priority. If the answer is GHOST and the guess is GOOSE, the second O will be green, but the first O will be grey, yielding the coloring:

g o o s e

🟩⬜🟩🟩⬜
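
For concreteness, here is a minimal sketch of that coloring rule in Python. This is my own code, not anything from the game itself; the function name and the G/Y/- notation are just choices for the sketch.

```python
from collections import Counter

def color(guess: str, answer: str) -> str:
    """Return a 5-character coloring: 'G' green, 'Y' yellow, '-' grey."""
    result = ["-"] * 5
    # Greens are assigned first; unmatched answer letters remain claimable.
    leftovers = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "G"
        else:
            leftovers[a] += 1
    # Remaining guess letters go yellow only while unclaimed copies remain.
    for i, g in enumerate(guess):
        if result[i] == "-" and leftovers[g] > 0:
            result[i] = "Y"
            leftovers[g] -= 1
    return "".join(result)

print(color("goose", "ghost"))  # G-GG-, matching the GHOST/GOOSE example above
```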

Skill and Chance

A player can lose just because of bad luck. For example, if they discover that the word ends in IGHT, the possible answers include EIGHT, BIGHT, SIGHT, RIGHT, FIGHT, NIGHT, LIGHT, and MIGHT: more than the six guesses allow. However, if they get lucky and learn the ending on an early guess, they can spend guesses on “BRIEF” and “SMILE,” which between them cover the first letters of seven of the eight candidates. A yellow letter (other than I) points to the answer, while nothing but grey points to NIGHT.
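
A quick check of that covering argument (only the first letters matter once IGHT is fixed):

```python
# The eight -IGHT candidates from the example above.
candidates = ["EIGHT", "BIGHT", "SIGHT", "RIGHT", "FIGHT", "NIGHT", "LIGHT", "MIGHT"]
probe_letters = set("BRIEF") | set("SMILE")
# Which candidates' first letters do the two probe guesses fail to touch?
print([w for w in candidates if w[0] not in probe_letters])  # ['NIGHT']
```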

An Information Game

When you have the list of words the game accepts, you can build a complete directed graph whose edges are labeled with the coloring of each guess-answer pair. Take a word’s outward edges, say, to be the colorings a guess could yield if that word were the answer; together they cover some subset of the 243 (3^5) possible colorings. That subset is, in effect, the word’s “fingerprint,” provided it differs from every other word’s subset. It may be that not all words have unique sets. Chris Chow reports that the game accepts around 10,000-15,000 words as guesses, so there is good reason to suspect that most words would hit every coloring, but it’s worth investigating.

From the fingerprints, you can index the other way: for each possible coloring, build the set of all words whose fingerprints contain that coloring. Then, given a list of colorings from publicly shared (or simulated) guesses, you can take the intersection of the sets corresponding to those colorings. If we are lucky, only one word survives the intersection.
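
Here is a rough sketch of that pipeline. The names (GUESSES, ANSWERS, fingerprints, candidates) are mine, and the word lists are tiny stand-ins so the block runs on its own; the real lists would come from the accepted-word list discussed below.

```python
from collections import Counter, defaultdict
from functools import reduce

def color(guess: str, answer: str) -> str:
    # Same coloring logic as the earlier sketch.
    result, leftovers = ["-"] * 5, Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "G"
        else:
            leftovers[a] += 1
    for i, g in enumerate(guess):
        if result[i] == "-" and leftovers[g] > 0:
            result[i], leftovers[g] = "Y", leftovers[g] - 1
    return "".join(result)

# GUESSES: every word the game accepts; ANSWERS: plausible daily answers.
GUESSES = ["ghost", "goose", "below", "belts", "crawl", "crake", "hello"]
ANSWERS = ["ghost", "below", "crawl", "hello"]

# A word's fingerprint: the set of colorings it can produce as the answer.
fingerprints = {
    answer: frozenset(color(guess, answer) for guess in GUESSES)
    for answer in ANSWERS
}

# Index the other way: for each coloring, which answers could produce it?
words_by_coloring = defaultdict(set)
for answer, prints in fingerprints.items():
    for coloring in prints:
        words_by_coloring[coloring].add(answer)

def candidates(observed_colorings):
    """Intersect the answer sets for every coloring seen in the wild."""
    return reduce(set.intersection,
                  (words_by_coloring[c] for c in observed_colorings))

# Two colorings observed from shared result grids (toy data).
print(candidates(["G-GG-", "---YY"]))  # {'ghost'} with these toy lists
```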

But what if, hypothetically, a word only ever produced a few oft-repeated colorings? AAAAA may not be in the word list, but it’s an easy example because none of its colorings contain yellow. The intersection of a few green/grey-only colorings might end up rather large, but most of its members also produce other colorings. We can’t guarantee that the public social media posts will contain a word’s exhaustive list of colorings, so we can’t rule those members out on that basis. In that situation, picking the most strictly matched element of the intersection would give a more likely, but still uncertain, guess. Luck will always be a factor, but one small advantage is that the list of possible answers is much smaller than the list of allowed guesses.
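
One way to make “most strictly matched” concrete (this is my reading, not anything the game defines): among the surviving candidates, favor the one whose fingerprint is most fully covered by the colorings we actually observed. A sketch, reusing the fingerprints mapping from the previous snippet:

```python
def rank_by_strictness(candidate_words, observed_colorings, fingerprints):
    """Order surviving candidates by the fraction of their fingerprint that
    the observed colorings cover. One reading of "most strictly matched":
    the true answer tends to leave more of its own fingerprint behind."""
    observed = set(observed_colorings)
    return sorted(
        candidate_words,
        key=lambda w: len(observed & fingerprints[w]) / len(fingerprints[w]),
        reverse=True,
    )
```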

Before trawling the HTTP APIs, we can discover the theoretical limits through analysis.

Domain

The most recent list of accepted words I can find is here. If we can’t establish unique fingerprints for each possible answer, we may still arrive at a successful solver by weighting each coloring in a fingerprint by how likely players are to make the guesses that produce it. That was a top-heavy sentence. Suppose we’re building the fingerprint for the word “HELLO.” For each coloring, we want to know how likely a player is to think of the guesses behind it, based on how often those words are used. It’s more likely that 🟩🟩🟩⬜⬜ came from guessing “BELTS” against “BELOW” than from guessing “CRAKE” (a type of bird) against “CRAWL.” A simple way to get those numbers might be to pull frequencies from Google Ngram, if it offers a suitable API.

Another consideration is a word’s likelihood of being an answer. There used to be a shortlist of fewer than 3,000 words that could be answers, but the New York Times dropped that restriction and now selects the daily answer at the editor’s discretion. Still, we can rely on the editor to avoid words that most people have never seen. This can be quantified with the same frequency metrics as above, and may even be reduced to a simple threshold.
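
Both ideas, weighting guesses by how common the word is and filtering out implausible answers, boil down to a word-frequency lookup. A minimal sketch, using the wordfreq package purely as a stand-in for the Ngram data mentioned above; the cutoff value is arbitrary.

```python
# pip install wordfreq -- a stand-in for Google Ngram frequencies.
from wordfreq import word_frequency

def guess_weight(word: str) -> float:
    """Rough proxy for how likely a player is to think of this word."""
    return word_frequency(word.lower(), "en")

def plausible_answers(words, min_freq=1e-6):
    """Drop words the editor is unlikely to pick; the cutoff is a guess."""
    return [w for w in words if word_frequency(w.lower(), "en") >= min_freq]

print(guess_weight("belts") > guess_weight("crake"))   # expected: True
print(plausible_answers(["hello", "crake", "below"]))  # depends on the cutoff
```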

Study & Findings

I conducted a full analysis in Google Colab, and the notebook is linked here. Sorry if it’s a bit messy.

Much to my surprise, all of the words in the list have unique fingerprints! There are 2^243 possible fingerprints, and I had assumed that most words’ fingerprints would contain 230-240 colorings, but in fact the average is around 150. There is one caveat, however: some fingerprints are subsets of others, meaning the set-intersection strategy alone cannot differentiate them. Often these involve words we can confidently exclude because they’re so obscure, but I recorded examples of genuine ambiguity in the notebook.
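
For the record, detecting those subset cases is a short loop over the fingerprints mapping from the earlier sketch; quadratic in the number of answers, which is fine at this scale.

```python
def subset_pairs(fingerprints):
    """Yield (a, b) where a's fingerprint is contained in b's: no set of
    observed colorings can confirm a while also ruling out b."""
    for a, fa in fingerprints.items():
        for b, fb in fingerprints.items():
            if a != b and fa <= fb:
                yield a, b
```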

In well-behaved cases, at least, the number of unique colorings required to single out an answer centers around 100, often far fewer once you discount obscure words. Narrowing the field to five possible answers centers around 50. We can’t be sure without gathering live data, but there are probably enough posts out there to solve most or all Wordles.

I have other priorities at the moment, but it will be fun to finally bring the concept to life in the indeterminate future. I’m open to ideas, suggestions, and collaboration.

Cover art: Piet Mondrian, Composition with Grid #1

Mondrian once said, “The realization of equivalent relationships is of the highest importance for life.”