Levenshtein squares: symmetrical word puzzles embedded in Connections
Earlier this week, I had a terrible idea for a Connections puzzle. I found a lovely tool to create them and sent my monstrosity of a board to my friends for laughs. We've been bouncing silly puzzles back and forth ever since then. A 1-letter puzzle led to a 2-letter puzzle, which then begat a 3-letter puzzle. This made me wonder: what would a 4-letter puzzle that takes full advantage of its inherent symmetry with the game of Connections look like?
To be clear, the goal here is not to design a traditional Connections puzzle around 4-letter words. I wanted something a little more abstract - something that leans hard into the maximal four-ness of the format (fourmat?). This design space grants us the opportunity to establish 1:1 mappings between the 4 word groups (and/or the 4 words in each group) and the 4 letters in each word. Swapping letters to create new words seems like a somewhat canonical mechanic to try here; let's give it a shot.
We'll start with a secret word for each group, then derive from it 4 clue words for the group that each have a different letter index substituted:
To ensure the full puzzle only has one valid solution, let's choose secret words that have a different letter in each position (e.g. ruling out BODY and BRAT as secret words, but not RATE and TEAR). Here's an example:
The clue words are shown in order here for illustration, but they'll all be shuffled together for the actual puzzle (and the secret words won't be shown, of course).
A troublesome strategy that emerges from this design is to look for exactly 3 of a letter in a given position across all the clue words. Those 3 words are guaranteed to belong to the same group, leaving you with one more word to find. 75% of the puzzle is trivialized by this realization! Letter distribution analysis turns out to be very powerful for this kind of puzzle - we need to nerf it somehow.
One option here is to look for secret words whose clue words can borrow letters from the other secret words. Here's a simplified example of what we're looking for:
Choosing which secret word to borrow from is trivial when there are only two groups; when there are four, the swaps for each index have to form a permutation of the hidden words' letters at that same index. (Thus, the swaps within a group typically won't spell out a hidden word as illustrated above.) Borrowing letters for substitutions like this perfectly balances the letter distributions, meaning the player can no longer make trivial inferences based on single columns. The resulting structure is also pleasantly symmetrical; I've tentatively (and non-authoritatively) named it a Levenshtein square.
All of that took me hours to figure out, by the way. I had to spend a long time rotating four-dimensional word clusters in my head to convince myself that this was (1) possible and (2) tractable to search for valid puzzles. Initially, it seemed like the combinatorial space was immense, perhaps involving 16! (factorial) permutations to evaluate per possible puzzle. The answer thankfully turned out to be a lot lower - just 9 permutations times 4 positions. Together with some hashmap optimizations and early pruning of dead-ends, it's possible to brute-force the entire search space for a 1,000-word dictionary in just a few seconds on my computer.
Unfortunately, there are no perfectly balanced puzzles contained in my 1,000-word dictionary. I had to resort to an additional, ~5,000-word dictionary full of archaic and obscure words to find boards that met my criteria. But I found them, and now you can experience them for yourself!
These puzzles were optimized to contain as many "common" clue words as possible; only ~30 of the boards my script found had 10 or more common words.¹ Aside from the mandatory archaic words, these puzzles meet my definition of a Levenshtein Square perfectly. I think there's good potential for a daily game based around this concept.
The perfect 4x4 Levenshtein square requires some compromises, though. Ideally, I'd like to limit the board to the smaller dictionary and compromise on something other than the word bank. Deviating from the perfect definition generally seems like a good way to make the problem space more interesting. Maybe the puzzle should allow swaps of two indexes², or multiple words within a group with the same substitution index. Maybe the hidden words can be permitted to share letters at a given index; we can check for uniqueness of a solution procedurally instead. Much remains to be explored!
¹ The most common words I've found in a puzzle is 13, but those puzzles all require some uncommon words that are tough to accept.² This is known as the Damerau-Levenshtein distance.