Is Nerdle Harder Than Wordle?

Nerdle is like Wordle, but for numbers. The words are short arithmetic sentences like “1+2+7=10” (always 8 characters long). You get 6 tries to solve a word, just like Wordle. A certain group chat got very, very excited about Nerdle this week and since some of them were freakishly good at it, not including yours truly, I’m trying to compensate by contributing a solver and some analysis. The source code for this post lives here.

We Need a Dictionary

By now, there are many solvers for Wordle. Those solvers rely on a well-defined dictionary of 2315 possible solutions and 12972 allowed guess words, which can be harvested from the Wordle source code. Nerdle doesn’t provide a dictionary, so we’ll need to generate our own. The rules state:

Rules

There are 8 “letters”

A “letter” is one of “0123456789+-*/=”

And a word must be a calculation that is mathematically correct. So it must have one “=”

Also, the number on the right of the “=” is just a number (not a calculation)

Standard order of operations applies, so calculate * and / before + and –

Order matters in nerdle. If the answer we’re looking for is 10+20=30, then 20+10=30 isn’t close enough.

Naively, there are 15^8 ≈ 2.5 billion candidate ‘nerdles’. That sounds like a lot, but we can cheat and just generate possible left-hand sides (on the order of 14^6 + 14^5 + … + 14 ≈ 8 million possibilities), and then check whether each expression evaluates to an integer with the right number of digits.

It turns out that this generates a torrent of obscenely irritating words like 1----1=0 (which is a valid guess in the interface) or 01++01=2 (also valid) or 0*0001=0 and its cousins. The Nerdle website states in another dialog:

As far as we can work out, there are over 100,000 valid words but we have chosen 17,723 valid “words” as there are quite a lot we thought you wouldn’t like.

This is a little more cheerfully vague than we might wish, but thankfully there is also an FAQ page that provides two additional rules:

Nerdle answers don’t use leading zeros or lone zeros, even though some may be accepted as valid guesses. So you won’t find an answer like this: 0+5+5=10, or like this: 01+2+1=4.

Nerdle answers don’t use numbers that are negative, even though some may be accepted as valid guesses. So you won’t find an answer like this: -5-6=-11.

Somewhat to my surprise, these rules are enough to let us exactly recreate a dictionary of 17,723 words. Each rule has some additional gotchas:

  • while 0 cannot appear by itself on the left-hand side, it is allowed to appear as the answer on the right-hand side, although negative numbers are not allowed on either side.
  • the prohibition against negative numbers extends to all sequences of consecutive operators like ++ and -+ and ---, although these are accepted in the interface.

I was stuck on various lists of 16,000 to 80,000 words for some time before hitting on the magic combination of exceptions above, so: here is the list of words. The repository also contains Python and C++ dictionary generators, which run in about 8 seconds and 5 seconds on my 2020 MacBook Pro, respectively.
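For readers who want to experiment without cloning the repository, here is a minimal sketch of a generator that follows the rules as described above. It is not the repository's code: the names evaluate, is_answer, and generate_answers are made up for illustration, the use of exact Fraction arithmetic for intermediate divisions is an assumption, and it is not guaranteed to reproduce the 17,723-word list exactly.

```python
import re
from fractions import Fraction
from itertools import product

TOKEN = re.compile(r"\d+|[+\-*/]")


def evaluate(lhs):
    """Exact value of an expression like '9/2*4', or None if it is malformed."""
    tokens = TOKEN.findall(lhs)
    # Tokens must alternate number, operator, ..., number (no '++', no leading '-').
    if len(tokens) % 2 == 0 or "".join(tokens) != lhs:
        return None
    if any((i % 2 == 0) != tok.isdigit() for i, tok in enumerate(tokens)):
        return None
    total, sign, term = Fraction(0), 1, Fraction(int(tokens[0]))
    for op, num in zip(tokens[1::2], tokens[2::2]):
        if op == "*":
            term *= int(num)
        elif op == "/":
            if int(num) == 0:
                return None
            term /= int(num)
        else:  # '+' or '-' closes the current product term
            total += sign * term
            sign = 1 if op == "+" else -1
            term = Fraction(int(num))
    return total + sign * term


def is_answer(word, length=8):
    """Apply the answer rules: one '=', plain number on the right, no zeros-led numbers."""
    if len(word) != length or word.count("=") != 1:
        return False
    lhs, rhs = word.split("=")
    # Right-hand side: digits only (so never negative); a lone 0 is allowed here.
    if not rhs.isdigit() or (len(rhs) > 1 and rhs[0] == "0"):
        return False
    # Left-hand side: no lone zeros and no leading zeros.
    if any(number[0] == "0" for number in re.findall(r"\d+", lhs)):
        return False
    return evaluate(lhs) == int(rhs)


def generate_answers(length=8):
    """Brute-force every left-hand side, then keep the ones whose result fits."""
    found = []
    for lhs_len in range(1, length - 1):
        for chars in product("0123456789+-*/", repeat=lhs_len):
            lhs = "".join(chars)
            value = evaluate(lhs)
            if value is None or value < 0 or value.denominator != 1:
                continue
            word = f"{lhs}={value.numerator}"
            if is_answer(word, length):
                found.append(word)
    return found
```

Calling generate_answers() with the default length of 8 walks all of the roughly 8 million left-hand sides, so expect this unoptimized version to be noticeably slower than the repository's generators. Passing a smaller length, such as generate_answers(5), is a quick way to sanity-check the rules.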

How Hard is Nerdle?

Compared to Wordle, both the number of possible answers and the number of legal guesses are an order of magnitude larger, so you might assume that Nerdle is harder. In fact, this is not necessarily true, due to the uniformity of the distribution of Nerdle characters and the smaller alphabet.

For comparison, here’s the same chart for the Wordle dictionary, which can be obtained from the source code at https://www.powerlanguage.co.uk/wordle/:

You notice that in the Wordle dictionary, most characters appear in 10 – 25% of the words (with a few devilishly hard characters that appear in only a tiny fraction.) But in Nerdle, most characters appear in about 50% of words, with 25% in the worst case. 50% turns out to be great from an information theory perspective, since characters that show up more frequently don’t reduce your search space enough and characters that show up less frequently are risky to waste a guess on.
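The per-character percentages discussed here can be computed with a few lines of Python. This is only a sketch: char_coverage is a hypothetical helper name, and the four-word sample stands in for the full dictionary.

```python
from collections import Counter


def char_coverage(words):
    """Fraction of words in which each character appears at least once."""
    counts = Counter()
    for word in words:
        counts.update(set(word))  # set(): count each character once per word
    return {ch: counts[ch] / len(words) for ch in sorted(counts)}


# Tiny stand-in for the real word list.
sample = ["1+2+7=10", "48-32=16", "43-26=17", "10+20=30"]
coverage = char_coverage(sample)
```

Running the same function over the Nerdle and Wordle word lists gives the numbers needed for a side-by-side comparison.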

Random Algorithm

When you input a guess, some of the words that were previously possibilities are ruled out. Only words that would produce the sequence of patterns you’ve seen for all your guesses can remain. So a simple heuristic, similar to how a human might play Nerdle, is to keep guessing a random word from the set of words that are still possible until you find the right one.

How well does this work? I ran a simulation of this strategy, trying 1000 random starting words and picking each guess randomly from the set of words that could still be allowed.

This strategy is surprisingly effective, taking an average of 3.37 guesses. This is already better than the best average score you can get at Wordle with completely optimal play, so Nerdle is, it seems, potentially easier to solve than its predecessor (ignoring the difficulty of coming up with valid guesses, which is probably harder than finding words for Wordle). Indeed, the random strategy finds the correct answer in two guesses 6% of the time! 99.7% of trials solved the puzzle within the limit of 6 guesses. So it’s safe to say that if you are able to just keep guessing Nerdle words that are consistent with the clues you’ve gotten so far, you are almost certain to beat the game.
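The random strategy is short enough to sketch in full. The code below is an illustration rather than the simulation used for the numbers above: grade here is a stand-in written for this example (G for the right character in the right place, P for the right character in the wrong place, B otherwise), play_random is a hypothetical name, and the six-word list replaces the real dictionary.

```python
import random
from collections import Counter


def grade(guess, target):
    """Wordle-style feedback: G = right place, P = wrong place, B = absent."""
    result = ["B"] * len(guess)
    unmatched = Counter()
    for i, (g, t) in enumerate(zip(guess, target)):
        if g == t:
            result[i] = "G"
        else:
            unmatched[t] += 1  # target characters still available for 'P' marks
    for i, g in enumerate(guess):
        if result[i] != "G" and unmatched[g] > 0:
            result[i] = "P"
            unmatched[g] -= 1
    return "".join(result)


def play_random(target, words, rng):
    """Guess random still-possible words; return how many guesses were needed.

    Assumes target is one of words, otherwise the pool would run dry.
    """
    pool = list(words)
    guesses = 0
    while True:
        guess = rng.choice(pool)
        guesses += 1
        if guess == target:
            return guesses
        seen = grade(guess, target)
        # Keep only the words that would have produced the same feedback.
        pool = [w for w in pool if grade(guess, w) == seen]


words = ["1+2+7=10", "48-32=16", "43-26=17", "10+20=30", "20+10=30", "9/2*4=18"]
rng = random.Random(0)
results = [play_random(target, words, rng) for target in words]
```

Averaging results over many random targets from the full word list is all it takes to estimate the mean number of guesses.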

Best Starting Word

The first step to improving the random strategy is to find a good starting word (or words). Without actually simulating all possible Nerdle games, we can say that a good first word should narrow the search space as much as possible. Of course, how much your search space narrows depends on what result (call it a “grade”) you get after you input the word. Each grade corresponds to a different set of candidate words (call it a “pool”). We would like to minimize the expected size of the pool we end up in.

Your daily dose of math: Suppose, for a guess w, you have k pools of sizes s_1, …, s_k. Each pool corresponds to the answers that are still possible given a particular grade G. The sizes sum to s_1 + s_2 + … + s_k = S = 17723. The likelihood that you will end up in a particular pool is proportional to its size, so the expected pool size is (s_1 / S) * s_1 + … + (s_k / S) * s_k = (s_1^2 + … + s_k^2) / S. Since S is fixed, we want to minimize the sum of squares of the pool sizes.
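A quick numerical check makes the formula concrete. In this sketch, expected_pool_size is a hypothetical helper and the pool sizes are invented: twelve candidate words split evenly into three pools, versus the same twelve split very unevenly.

```python
def expected_pool_size(sizes):
    """Sum of squared pool sizes divided by the total number of words."""
    total = sum(sizes)
    return sum(s * s for s in sizes) / total


even = expected_pool_size([4, 4, 4])     # (16 + 16 + 16) / 12 = 4.0
skewed = expected_pool_size([10, 1, 1])  # (100 + 1 + 1) / 12 = 8.5
```

The even split leaves 4 words on average, while the skewed split leaves 8.5, because most of the time you land in the big pool. A good guess is one that spreads the answers across many small pools.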

So we can simply go through all the possible starter words and check which ones have the best expected pool size. Because this is on the order of 17723^2 ≈ 314 million operations, I haven’t actually computed this value exhaustively yet, but a preliminary search has produced a few words like 48-32=16 and 43-26=17 with expected pool sizes of approximately 10 each. This is actually fantastic – it means that if you pick one of these words, you can expect to have only about 10 feasible Nerdles to choose between in the next step (although there is significant variance here.) Here’s the code snippet I used to compute pool sizes; see this file for definitions of the helper methods used here.

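from collections import Counter, defaultdict

def pick_best(target_pool, guess_pool=None):
    # By default, only guess words that could still be the answer.
    if guess_pool is None:
        guess_pool = target_pool
    # pools_by_guess[guess][grade] = number of targets that produce that grade
    pools_by_guess = defaultdict(Counter)
    for guess in guess_pool:
        for target in target_pool:
            score = grade(guess, target)  # grade() is defined in the helper file
            pools_by_guess[guess][score] += 1
    best, best_score = None, float("inf")
    for guess, pools in pools_by_guess.items():
        # Expected pool size: (s_1^2 + ... + s_k^2) / S, where S = number of targets
        total = sum(size ** 2 for size in pools.values())
        avg = total / sum(pools.values())
        if avg < best_score:
            best_score = avg
            best = guess
            print(f"{guess} has expected pool size {avg}")
    return best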

Extending the Best Starting Word Strategy

What if we start with 48-32=16 each time, and then use the same heuristic described above to find the best guess given the new pool of possible solutions? Here are my results for this strategy, over 1000 trials:

Using the heuristic, the average number of guesses drops to 3.125 and 99% of trials take 4 or fewer guesses. I’m guessing that a really well-tuned minimax algorithm could drop this average under 3 guesses per game (but probably not too far under: two-guess solutions simply aren’t possible in many cases.)
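Put together, the strategy is a small loop: guess, filter the pool by the grade you got, pick the next guess that minimizes the sum of squared pool sizes, and repeat. The sketch below illustrates that loop under a few assumptions: grade is a stand-in written for this example, best_guess and solve are hypothetical names, guesses are drawn only from the words still possible, and the six-word list replaces the real dictionary.

```python
from collections import Counter


def grade(guess, target):
    """Wordle-style feedback: G = right place, P = wrong place, B = absent."""
    result = ["B"] * len(guess)
    unmatched = Counter()
    for i, (g, t) in enumerate(zip(guess, target)):
        if g == t:
            result[i] = "G"
        else:
            unmatched[t] += 1
    for i, g in enumerate(guess):
        if result[i] != "G" and unmatched[g] > 0:
            result[i] = "P"
            unmatched[g] -= 1
    return "".join(result)


def best_guess(pool):
    """Word in pool whose grades split pool with the smallest sum of squares."""
    def cost(guess):
        sizes = Counter(grade(guess, target) for target in pool).values()
        return sum(n * n for n in sizes)
    return min(pool, key=cost)


def solve(target, words, first_guess):
    """Number of guesses needed, starting from a fixed first guess.

    Assumes target is one of words.
    """
    pool, guess, guesses = list(words), first_guess, 1
    while guess != target:
        seen = grade(guess, target)
        # Keep only the words consistent with the feedback just received.
        pool = [w for w in pool if grade(guess, w) == seen]
        guess = best_guess(pool)
        guesses += 1
    return guesses


words = ["1+2+7=10", "48-32=16", "43-26=17", "10+20=30", "20+10=30", "9/2*4=18"]
counts = [solve(target, words, "48-32=16") for target in words]
```

Because best_guess rescans the whole pool for every candidate, it is quadratic in the pool size. That is fine after the first guess has shrunk the pool, which is also why fixing the starting word ahead of time saves most of the work.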

Use the Solver

Sample solver output

I’ve written a simple interactive Python program that tells you what guesses to make and lets you report what grade you’ve gotten from the web interface. You can get it from the repository here: https://gitlab.com/jacob.brazeal/nerdle-solver. Let me know if you have any questions or problems!

Takeaway

Despite a much larger dictionary, Nerdles can be solved somewhat faster on average than Wordles. However, it may be harder to find valid guesses for Nerdle than it is to think of words for Wordle, so I think which one is more difficult for you probably depends on that factor.

I think the strategy here and the simulations are reasonably good, but someone could test this strategy exhaustively for all 17,723 Nerdle solutions with 17,723 starting words to get a very precise result. That’s a few hundred million scenarios to simulate, but there is adequate cloud compute out there in the ether for that workload. If you do, I’ll happily edit this post to incorporate your results.

5 thoughts on “Is Nerdle Harder Than Wordle?”

  1. my (human) average is 3.97, including off days. Without doing the stats, nerdle seems a much easier problem than wordle, great for kids, though.
    interesting to learn about the restrictions. I had noticed some sums had fractional results as part of their calc chains

  2. I was searching whether you can actually lose at nerdle (as opposed to give up). When I’ve played it seems by about the third guess the possible solutions are a very small number and by the fourth may have only a couple. I can’t imagine how there could be more than one solution if you ever got to the sixth guess. I do assume certain rules that seem to be valid, and paying attention to which positions you have gotten correct and which numbers/operations you have correct but in the wrong position.

  3. I was searching to see if one can even lose at nerdle, as opposed to just give up. This assuming a few things, such as paying attention to numbers/operators that are in the correct position or are correct but not in the right place, and not using two operators in a row. Seems that by the third try there aren’t many possible answers left and by the fourth there often may only be a couple.

  4. Hmm, interesting study. Since day one I naively started with the word 12+35=47. I track just a little better than the “random” distribution. When I get a purple =, my next guess always tries to put = in position 7, but I wonder if the pools are more favourable if I aim to put it at position 5. Regarding outliers, I find these all to be of the form 3x-20=1x or similar. I did blow out on one such, but can’t remember if I made a misstep there.
