If you were to apply a bijective function to each letter in each word of a language (e.g. English), how many pre-existing words would you obtain in the resulting image?

Since that's a pretty convoluted way of explaining things, let's try a more concrete example.

We'll take the well-known rot13 substitution cipher (a simple example of a bijection between the set of letters in the English alphabet and itself), and apply it to every letter in a chosen word. For most words, the result will be non-sensical gibberish. There does exist, however, a subset of valid English words that map into other valid English words.

Example

rot13('sync') = 'flap'

How many of these words exist? To answer this, I wrote a small Python script that loads up the words in my system dictionary into a Redis set. Another set of the rot13'ed words is then stored, and the set intersection of the original and transformed words is calculated:

After a simple cardinality check, we have our answer: 256 words1.

The great part about this little script is that nearly everything is done natively in Redis - the only thing Python is needed for is loading the words into the database, and the implementation of the bijective function that we wish to apply.

This is a very contrived example, but the ease with which I was able to map my thought process to code was fantastic. No need to think about tables, rows or joins - just sets, and operations on sets. The simplicity of it is almost shocking.

Pretty neat, eh?

  1. This result does not discard any single letter words (e.g. "a"), which will always trivially map into another letter when using rot13.