The Wonders and Simplicity of Redis Sets
by jperras
If you were to apply a bijective function to each letter in each word of a language (e.g. English), how many pre-existing words would you obtain in the resulting image?
Since that’s a pretty convoluted way of explaining things, let’s try a more concrete example.
We’ll take the well-known
substitution cipher (a simple example of a bijection between the set of letters in the English alphabet and itself), and apply it to every letter in a chosen word. For most words, the result will be non-sensical gibberish. There does exist, however, a subset of valid English words that map into other valid English words.
Example:
rot13('sync') = 'flap'
How many of these words exist? To answer this, I wrote a small Python script that loads up the words in my system dictionary into a Redis set. Another set of the rot13′ed words is then stored, and the set intersection of the original and transformed words is calculated:
After a simple cardinality check, we have our answer: 256 words[1].
The great part about this little script is that nearly everything is done natively in Redis – the only thing Python is needed for is loading the words into the database, and the implementation of the bijective function that we wish to apply.
This is a very contrived example, but the ease with which I was able to map my thought process to code was fantastic. No need to think about tables, rows or joins – just sets, and operations on sets. The simplicity of it is almost shocking.
Pretty neat, eh?
[1] This result does not discard any single letter words (e.g. “a”), which will always trivially map into another letter when using rot13.
Comments
If I was storing a list of arbitrary key/values in redis and then wanted to get a grouped return on the intersection of a bunch of k/vs on all the data, could you do that easily with Redis?
For example, if I had various a=1, b=2, c=5, d=89, e=78 etc (different attributes and values, some the same, millions of times) and wanted to get a grouped resultset of (a=1, c=5, re=45) returned where all those elements that match all three are grouped together then all elements that match just 2 and then finally all that match just one, can Redis do that?