Architecture for Redis cache & Mongo for persistence

node.js mongodb caching redis

22,160

It is actually sensible to associate Redis and MongoDB: they are good team players. You will find more information here:

MongoDB with redis

One critical point is the level of resiliency you need. Both Redis and MongoDB can be configured to achieve an acceptable level of resiliency, and these considerations should be discussed at design time. This may also constrain the deployment options: if you want master/slave replication for both Redis and MongoDB, you need at least 4 boxes (Redis and MongoDB should not be deployed on the same machine).

Now, it may be a bit simpler to keep Redis for queuing, pub/sub, etc. and store the user data in MongoDB only. The rationale is that you do not have to design similar data access paths (the difficult part of this job) for two stores featuring different paradigms. Also, MongoDB has built-in horizontal scalability (replica sets, auto-sharding, etc.) while Redis has only do-it-yourself scalability.

Regarding the second question, writing to both stores would be the easiest way to do it. There is no built-in feature to replicate Redis activity to MongoDB. Designing a daemon listening to a Redis queue (where activity would be posted) and writing to MongoDB is not that hard though.
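The daemon idea above can be sketched as follows. This is only a minimal illustration, not a definitive implementation: `FakeQueue` and `FakeCollection` are hypothetical in-memory stand-ins for a Redis list and a MongoDB collection, and the `QueuedPost` fields are assumed for the example. A real daemon would use the actual client libraries and handle reconnection and failures.

```typescript
// Hypothetical post shape; the field names are illustrative only.
interface QueuedPost {
  id: string;
  author: string;
  body: string;
}

// In-memory stand-in for a Redis list used as a FIFO queue.
class FakeQueue {
  private items: string[] = [];
  push(value: string): void {
    this.items.unshift(value); // newest at the head
  }
  pop(): string | undefined {
    return this.items.pop(); // oldest from the tail
  }
}

// In-memory stand-in for a MongoDB collection keyed by post id.
class FakeCollection {
  readonly docs = new Map<string, QueuedPost>();
  upsert(post: QueuedPost): void {
    this.docs.set(post.id, post);
  }
}

// Application side: post the activity to the queue.
function publishPost(queue: FakeQueue, post: QueuedPost): void {
  queue.push(JSON.stringify(post));
}

// Daemon side: drain whatever is queued and persist it.
// Upserting by id keeps the write idempotent if a message is replayed.
function drainQueue(queue: FakeQueue, collection: FakeCollection): number {
  let written = 0;
  for (let raw = queue.pop(); raw !== undefined; raw = queue.pop()) {
    collection.upsert(JSON.parse(raw) as QueuedPost);
    written += 1;
  }
  return written;
}

const activityQueue = new FakeQueue();
const postCollection = new FakeCollection();
publishPost(activityQueue, { id: "p1", author: "alice", body: "hello" });
publishPost(activityQueue, { id: "p2", author: "bob", body: "world" });
const drainedCount = drainQueue(activityQueue, postCollection);
```

The trade-off compared with writing to both stores from the application is that persistence becomes asynchronous: a post is visible in Redis before it is durable in MongoDB.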

Ryan Ogle

Updated on July 09, 2022

Comments

Ryan Ogle almost 2 years

The Setup:
Imagine a 'twitter like' service where a user submits a post, which is then read by many (hundreds, thousands, or more) users.

My question is regarding the best way to architect the cache & database to optimize for quick access & many reads, but still keep the historical data so that users may (if they want) see older posts. The assumption here is that 90% of users would only be interested in the new stuff, and that the old stuff will get accessed occasionally. The other assumption here is that we want to optimize for the 90%, and it's OK if the older 10% take a little longer to retrieve.

With this in mind, my research seems to strongly point in the direction of using a cache for the 90%, and then also storing the posts in another longer-term persistent system. So my idea thus far is to use Redis for the cache. The advantages are that Redis is very fast, and that it has built-in pub/sub, which would be perfect for publishing posts to many people. And then I was considering using MongoDB as a more permanent data store to store the same posts, which will be accessed there as they expire off of Redis.

Questions:
1. Does this architecture hold water? Is there a better way to do this?
2. Regarding the mechanism for storing posts in both the Redis & MongoDB, I was thinking about having the app do 2 writes: 1st - write to Redis, it then is immediately available for the subscribers. 2nd - after successfully storing to Redis, write to MongoDB immediately. Is this the best way to do it? Should I instead have Redis push the expired posts to MongoDB itself? I thought about this, but I couldn't find much information on pushing to MongoDB from Redis directly.
- Sergio Tulentsev almost 12 years
  
  Redis won't push to MongoDB. You have to do it yourself. Or just write to both places at the same time (as you suggested).
- Geert-Jan almost 12 years
  
  I'd always push to the more robust store first (MongoDB in this case), or as Sergio suggested, async at the same time. Never the other way around.
- user636525 over 11 years
  
  My question is, would you store only the ids of posts in the cache, or the whole lists of post objects?
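The write ordering suggested in the comments above (more robust store first, then the cache) might look roughly like this. It is a sketch under stated assumptions: `durableStore` is a hypothetical in-memory stand-in for MongoDB, the cache write is passed in as a function so a failure can be simulated, and `CACHE_TTL_MS` is an arbitrary retention period chosen for the example. It caches whole post objects, which is only one possible answer to the ids-versus-objects question.

```typescript
// Hypothetical post shape for illustration.
interface NewPost {
  id: string;
  body: string;
}

// Signature of a cache write; a real one would call the Redis client.
type CacheWrite = (post: NewPost, expiresAt: number) => void;

// Stand-in for the durable store (MongoDB in the discussion).
const durableStore = new Map<string, NewPost>();

// Assumed retention for "new" posts in the cache.
const CACHE_TTL_MS = 60000;

// Returns true when the post was also cached, false when only the
// durable write succeeded.
function savePost(post: NewPost, now: number, writeCache: CacheWrite): boolean {
  // 1. Durable write first: if this throws, nothing has been cached yet.
  durableStore.set(post.id, post);
  // 2. Cache write second: losing it only means a slower read later.
  try {
    writeCache(post, now + CACHE_TTL_MS);
    return true;
  } catch (_err) {
    return false;
  }
}

// Stand-in for the cache: whole post object plus expiry time.
const cachedPosts = new Map<string, { post: NewPost; expiresAt: number }>();
const workingCache: CacheWrite = (post, expiresAt) => {
  cachedPosts.set(post.id, { post, expiresAt });
};
const brokenCache: CacheWrite = () => {
  throw new Error("cache unavailable");
};

const cachedOk = savePost({ id: "a", body: "first" }, 0, workingCache);
const cachedFailed = savePost({ id: "b", body: "second" }, 0, brokenCache);
```

With this ordering a cache outage never loses a post; the cost is that subscribers see it slightly later than with a Redis-first write.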
Geert-Jan almost 12 years

I'm curious, any links/background on why Redis and Mongo shouldn't be deployed on the same machine?
Didier Spezia almost 12 years

It is due to the fact that MongoDB maps the data files in memory. So it uses the virtual memory mechanism to access the data, whose structure is designed to favor locality (B-trees are used for indexes, for instance). With MongoDB, when the data do not fit in memory, the machine will swap, and it is designed for this.
Didier Spezia almost 12 years

On the contrary, Redis is a pure main-memory data store, based on memory-oriented data structures (hash tables, lists, skip lists, etc.) which do not enforce any kind of locality. Because it is single-threaded, performance is dramatically impacted when Redis memory is swapped out.
Didier Spezia almost 12 years

So if you put MongoDB and Redis on the same box and the MongoDB data do not fit in memory, MongoDB will "steal" memory from Redis via the OS paging mechanism. The consequence is a major performance drop for Redis.
Geert-Jan almost 12 years

Thanks, good to know. On boxes where both Mongo and Redis data fit in Ram completely I take it this isn't a problem?
Didier Spezia almost 12 years

Correct. If everything fits in memory, there is no issue.
farnoy over 11 years

Can't we limit mongo with cgroups so that redis has at least max-memory-limit available at all times?
Didier Spezia over 11 years

I have never tried using cgroups with mongo, but it should work. Please note Redis requires more memory than max-memory-limit (communication buffers, etc.). You will probably have to measure to size the cgroups config.
Lion789 over 10 years

So in the end does it make more sense to write to redis and mongodb at the same time? Also, to do a read, should redis be queried first and if it does not exist query mongo?
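The read path asked about here (cache first, durable store on a miss) can be sketched as below. This is an illustration only: both stores are hypothetical in-memory maps standing in for Redis and MongoDB, `READ_TTL_MS` is an assumed expiry, and the choice to re-cache an old post after a miss is a design decision for the example, not something the thread prescribes.

```typescript
// Hypothetical post shape for illustration.
interface StoredPost {
  id: string;
  body: string;
}

interface ReadResult {
  post: StoredPost;
  source: "cache" | "durable";
}

// Stand-ins for MongoDB (archive) and Redis (recent posts with expiry).
const archive = new Map<string, StoredPost>();
const recent = new Map<string, { post: StoredPost; expiresAt: number }>();

// Assumed cache lifetime.
const READ_TTL_MS = 60000;

function readPost(id: string, now: number): ReadResult | undefined {
  // 1. Try the cache; an expired entry counts as a miss.
  const hit = recent.get(id);
  if (hit !== undefined && hit.expiresAt > now) {
    return { post: hit.post, source: "cache" };
  }
  // 2. Fall back to the durable store.
  const post = archive.get(id);
  if (post === undefined) {
    return undefined;
  }
  // 3. Re-populate the cache so repeated reads of this post are fast again.
  recent.set(id, { post, expiresAt: now + READ_TTL_MS });
  return { post, source: "durable" };
}

archive.set("old", { id: "old", body: "archived post" });
const firstRead = readPost("old", 0); // miss, served from the durable store
const secondRead = readPost("old", 1000); // served from the cache
const thirdRead = readPost("old", 1000 + READ_TTL_MS); // entry expired, durable again
const missingRead = readPost("nope", 0); // unknown id
```

Whether step 3 is worthwhile depends on the access pattern: if old posts are rarely read twice, skipping the re-cache keeps Redis memory for the recent 90%.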