How to design Redis data structures in order to perform queries similar to DB queries in redis?

10,140

Solution 1

Redis is not designed to structure the data in SQL way. Beside a in-memory key value store, it supports five types of data structures: Strings, Hashes, Lists, Sets and Sorted Sets. At high level this is a sufficient hint that Redis is designed to solve performance problems that arises due to high computation in relational data models. However, if you want to execute sql query in a in-memory structure, you may want to look at memsql.

Solution 2

Let's break down the SQL statement into different components and I'll try to show how redis can accomplish various parts.

Select J.JobID, J.JobName from Job J;

We translate each row in "Job" into a hash in redis using the SQL primary index as the redis natural index in redis. For example: SQL

==JobId==|==Name==
123        Fred

Redis HSET Job:123 Name Fred which can be conceptualized as

Job-123 => {"Name":"Fred"}

Thus we can store columns as hash fields in redis

Let's say we do the same thing for JobInfo. Each JobInfo object has its own ID

JobInfo-876 => {"meta1": "some value", "meta2": "bla", "JobID": "123"}

In sql normally we would make a secondary index on JobInfo.JobID but in NoSql land we maintain our own secondary indexes.

Sorted Sets are great for this. Thus when we want to fetch JobInfo objects by some field, JobId in this case we can add it to a sorted set like this ZADD JobInfo-JobID 123 JobInfo-876

This results in a set with 1 element in it {JobInfo-876} which has a score of 123. I realize that forcing all JobIDs into the float range for the score is a bad idea, but work with me here.

Now when we want to find all JobInfo objects for a given JobID we just do a log(N) lookup into the index. ZRANGEBYSCORE JobInfo-JobID 123 123 which returns "JobInfo-876"

Now to implement simple joins we simply reuse this JobInfo-JobID index by storing Job keys by their JobIDs. ZADD JobInfo-JobID 123 Job-123

Thus when doing something akin to SELECT J.JobID, J.Name, B.meta1 FROM Job, JobInfo USING (JobID).

This would translate to scanning through the JobInfo-JobID secondary index and reorganizing the Job and JobInfo objects returned. ZRANGEBYSCORE JobInfo-JobID -inf +inf WITHSCORES 5 -> (Job-123, JobInfo-876)

These objects all share the same JobID. CLient side you'd then asynchronously fetch the needed fields. Or you could embed these lookups in a lua script. This lua script could make redis hang for a long time. Normally redis tries to be fair with clients and prefers you to have short batched queries instead of one long query.

Now we come to a big problem, what if we want to combine secondary indexes. Let's say we have a secondary index on JobInfo.Status, and another on Job.JobType. If we make a set of all jobs with the right JobType and use that as a filter on the JobInfo-JobID shared secondary index then we not only eliminate the bad Job elements but also every JobInfo element. We could, I guess fetch the scores(JobID) on the intersection and refetch all JobInfo objects with those scores, but we lose some of the filtering we did.

It is at this point where redis breaks down.

Here is an article on secondary indexes from the creator of redis himself: http://redis.io/topics/indexes He touches multi-dimensional indexes for filtering purposes. As you can see he designed the data structures in a very versatile way. One that is the most appealing is the fact that sorted set elements with the same score are stored in lexicographical order. Thus you can easily have all elements have a score of 0 and piggyback on Redis's speed and use it more like cockroachDB, which relies on a global order to implement many SQL features.

Solution 3

The other answer are completely correct for redis up to version 3.4

The latest releases of redis, from 4.0 onward, include supports for modules.

Modules are extremelly powerfull and it happens that I just wrote a small module to embed SQLite into redis itself; rediSQL.

With that module you can actually use a fully functional SQL database inside your redis instace.

Share:
10,140
user888270
Author by

user888270

Updated on June 06, 2022

Comments

  • user888270
    user888270 almost 2 years

    I have tables like Job, JobInfo. And i want to perform queries like below -

    "SELECT J.JobID FROM Job J, JobInfo B WHERE B.JobID = J.JobID AND BatchID=5850 AND B.Status=0 AND J.JobType<>2"

    How shall i go about writing my redis data types so that i can map such queries in redis?

    IF i try to map the rows of table job in a redis hash for e.g. (hash j jobid 1 status 2) & similarly the rows of table JobInfo in again a redis hash as (hash jinfo jobid 1 jobtype 3.)

    So my tables can be a set of hashes. Job table can be set with entries JobSet:jobid & JobInfo table can be set with entries like JobInfoSet:jobid

    But i am confused in when i will do a SINTER on JobSet & JobInfoSet. how am i going to query that hash to get keys? As in the hash content of set jobSet is not identical to hash content of table JobInfoSet (they may have different key value pair.

    So what exactly am i going to get as an output of SINTER? And how am i going to query that output as key-value pair?

    So the tables will be a collection of redis hashes

  • Brian
    Brian over 7 years
    CORRECTION:ZRANGEBYSCORE JobInfo-JobID -inf +inf WITHSCORES 5 -> (Job-123, JobInfo-876)
  • Brian
    Brian over 7 years
    CORRECTION: ZRANGEBYSCORE JobInfo-JobID -inf +inf WITHSCORES 123 -> (Job-123, JobInfo-876)