LRU cache design

70,466

Solution 1

A linked list + hashtable of pointers to the linked list nodes is the usual way to implement LRU caches. This gives O(1) operations (assuming a decent hash). Advantage of this (being O(1)): you can do a multithreaded version by just locking the whole structure. You don't have to worry about granular locking etc.

Briefly, the way it works:

On an access of a value, you move the corresponding node in the linked list to the head.

When you need to remove a value from the cache, you remove from the tail end.

When you add a value to cache, you just place it at the head of the linked list.

Thanks to doublep, here is site with a C++ implementation: Miscellaneous Container Templates.

Solution 2

This is my simple sample c++ implementation for LRU cache, with the combination of hash(unordered_map), and list. Items on list have key to access map, and items on map have iterator of list to access list.

#include <list>
#include <unordered_map>
#include <assert.h>

using namespace std;

template <class KEY_T, class VAL_T> class LRUCache{
private:
        list< pair<KEY_T,VAL_T> > item_list;
        unordered_map<KEY_T, decltype(item_list.begin()) > item_map;
        size_t cache_size;
private:
        void clean(void){
                while(item_map.size()>cache_size){
                        auto last_it = item_list.end(); last_it --;
                        item_map.erase(last_it->first);
                        item_list.pop_back();
                }
        };
public:
        LRUCache(int cache_size_):cache_size(cache_size_){
                ;
        };

        void put(const KEY_T &key, const VAL_T &val){
                auto it = item_map.find(key);
                if(it != item_map.end()){
                        item_list.erase(it->second);
                        item_map.erase(it);
                }
                item_list.push_front(make_pair(key,val));
                item_map.insert(make_pair(key, item_list.begin()));
                clean();
        };
        bool exist(const KEY_T &key){
                return (item_map.count(key)>0);
        };
        VAL_T get(const KEY_T &key){
                assert(exist(key));
                auto it = item_map.find(key);
                item_list.splice(item_list.begin(), item_list, it->second);
                return it->second->second;
        };

};

Solution 3

I see here several unnecessary complicated implementations, so I decided to provide my implementation as well. The cache has only two methods, get and set. Hopefully it is better readable and understandable:

#include<unordered_map>
#include<list>

using namespace std;

template<typename K, typename V = K>
class LRUCache
{

private:
    list<K>items;
    unordered_map <K, pair<V, typename list<K>::iterator>> keyValuesMap;
    int csize;

public:
    LRUCache(int s) :csize(s) {
        if (csize < 1)
            csize = 10;
    }

    void set(const K key, const V value) {
        auto pos = keyValuesMap.find(key);
        if (pos == keyValuesMap.end()) {
            items.push_front(key);
            keyValuesMap[key] = { value, items.begin() };
            if (keyValuesMap.size() > csize) {
                keyValuesMap.erase(items.back());
                items.pop_back();
            }
        }
        else {
            items.erase(pos->second.second);
            items.push_front(key);
            keyValuesMap[key] = { value, items.begin() };
        }
    }

    bool get(const K key, V &value) {
        auto pos = keyValuesMap.find(key);
        if (pos == keyValuesMap.end())
            return false;
        items.erase(pos->second.second);
        items.push_front(key);
        keyValuesMap[key] = { pos->second.first, items.begin() };
        value = pos->second.first;
        return true;
    }
};

Solution 4

Here is my implementation for a basic, simple LRU cache.

//LRU Cache
#include <cassert>
#include <list>

template <typename K,
          typename V
          >
class LRUCache
    {
    // Key access history, most recent at back
    typedef std::list<K> List;

    // Key to value and key history iterator
    typedef unordered_map< K,
                           std::pair<
                                     V,
                                     typename std::list<K>::iterator
                                    >
                         > Cache;

    typedef V (*Fn)(const K&);

public:
    LRUCache( size_t aCapacity, Fn aFn ) 
        : mFn( aFn )
        , mCapacity( aCapacity )
        {}

    //get value for key aKey
    V operator()( const K& aKey )
        {
        typename Cache::iterator it = mCache.find( aKey );
        if( it == mCache.end() ) //cache-miss: did not find the key
            {
            V v = mFn( aKey );
            insert( aKey, v );
            return v;
            }

        // cache-hit
        // Update access record by moving accessed key to back of the list
        mList.splice( mList.end(), mList, (it)->second.second );

        // return the retrieved value
        return (it)->second.first;
        }

private:
        // insert a new key-value pair in the cache
    void insert( const K& aKey, V aValue )
        {
        //method should be called only when cache-miss happens
        assert( mCache.find( aKey ) == mCache.end() );

        // make space if necessary
        if( mList.size() == mCapacity )
            {
            evict();
            }

        // record k as most-recently-used key
        typename std::list<K>::iterator it = mList.insert( mList.end(), aKey );

        // create key-value entry, linked to the usage record
        mCache.insert( std::make_pair( aKey, std::make_pair( aValue, it ) ) );
        }

        //Purge the least-recently used element in the cache
    void evict()
        {
        assert( !mList.empty() );

        // identify least-recently-used key
        const typename Cache::iterator it = mCache.find( mList.front() );

        //erase both elements to completely purge record
        mCache.erase( it );
        mList.pop_front();
        }

private:
    List mList;
    Cache mCache;
    Fn mFn;
    size_t mCapacity;
    };

Solution 5

I implemented a thread-safe LRU cache two years back.

LRU is typically implemented with a HashMap and LinkedList. You can google the implementation detail. There are a lot of resources about it(Wikipedia have a good explanation too).

In order to be thread-safe, you need put lock whenever you modify the state of the LRU.

I will paste my C++ code here for your reference.

Here is the implementation.

/***
    A template thread-safe LRU container.

    Typically LRU cache is implemented using a doubly linked list and a hash map.
    Doubly Linked List is used to store list of pages with most recently used page
    at the start of the list. So, as more pages are added to the list,
    least recently used pages are moved to the end of the list with page
    at tail being the least recently used page in the list.

    Additionally, this LRU provides time-to-live feature. Each entry has an expiration
    datetime.
***/
#ifndef LRU_CACHE_H
#define LRU_CACHE_H

#include <iostream>
#include <list>

#include <boost/unordered_map.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/make_shared.hpp>
#include <boost/date_time/posix_time/posix_time.hpp>
#include <boost/thread/mutex.hpp>

template <typename KeyType, typename ValueType>
  class LRUCache {
 private:
  typedef boost::posix_time::ptime DateTime;

  // Cache-entry
  struct ListItem {
  ListItem(const KeyType &key,
           const ValueType &value,
           const DateTime &expiration_datetime)
  : m_key(key), m_value(value), m_expiration_datetime(expiration_datetime){}
    KeyType m_key;
    ValueType m_value;
    DateTime m_expiration_datetime;
  };

  typedef boost::shared_ptr<ListItem> ListItemPtr;
  typedef std::list<ListItemPtr> LruList;
  typedef typename std::list<ListItemPtr>::iterator LruListPos;
  typedef boost::unordered_map<KeyType, LruListPos> LruMapper;

  // A mutext to ensuare thread-safety.
  boost::mutex m_cache_mutex;

  // Maximum number of entries.
  std::size_t m_capacity;

  // Stores cache-entries from latest to oldest.
  LruList m_list;

  // Mapper for key to list-position.
  LruMapper m_mapper;

  // Default time-to-live being add to entry every time we touch it.
  unsigned long m_ttl_in_seconds;

  /***
      Note : This is a helper function whose function call need to be wrapped
      within a lock. It returns true/false whether key exists and
      not expires. Delete the expired entry if necessary.
  ***/
  bool containsKeyHelper(const KeyType &key) {
    bool has_key(m_mapper.count(key) != 0);
    if (has_key) {
      LruListPos pos = m_mapper[key];
      ListItemPtr & cur_item_ptr = *pos;

      // Remove the entry if key expires
      if (isDateTimeExpired(cur_item_ptr->m_expiration_datetime)) {
        has_key = false;
        m_list.erase(pos);
        m_mapper.erase(key);
      }
    }
    return has_key;
  }

  /***
      Locate an item in list by key, and move it at the front of the list,
      which means make it the latest item.
      Note : This is a helper function whose function call need to be wrapped
      within a lock.
  ***/
  void makeEntryTheLatest(const KeyType &key) {
    if (m_mapper.count(key)) {
      // Add original item at the front of the list,
      // and update <Key, ListPosition> mapper.
      LruListPos original_list_position = m_mapper[key];
      const ListItemPtr & cur_item_ptr = *original_list_position;
      m_list.push_front(cur_item_ptr);
      m_mapper[key] = m_list.begin();

      // Don't forget to update its expiration datetime.
      m_list.front()->m_expiration_datetime = getExpirationDatetime(m_list.front()->m_expiration_datetime);

      // Erase the item at original position.
      m_list.erase(original_list_position);
    }
  }

 public:

  /***
      Cache should have capacity to limit its memory usage.
      We also add time-to-live for each cache entry to expire
      the stale information. By default, ttl is one hour.
  ***/
 LRUCache(std::size_t capacity, unsigned long ttl_in_seconds = 3600)
   : m_capacity(capacity), m_ttl_in_seconds(ttl_in_seconds) {}

  /***
      Return now + time-to-live
  ***/
  DateTime getExpirationDatetime(const DateTime &now) {
    static const boost::posix_time::seconds ttl(m_ttl_in_seconds);
    return now + ttl;
  }

  /***
      If input datetime is older than current datetime,
      then it is expired.
  ***/
  bool isDateTimeExpired(const DateTime &date_time) {
    return date_time < boost::posix_time::second_clock::local_time();
  }

  /***
      Return the number of entries in this cache.
   ***/
  std::size_t size() {
    boost::mutex::scoped_lock lock(m_cache_mutex);
    return m_mapper.size();
  }

  /***
      Get value by key.
      Return true/false whether key exists.
      If key exists, input paramter value will get updated.
  ***/
  bool get(const KeyType &key, ValueType &value) {
    boost::mutex::scoped_lock lock(m_cache_mutex);
    if (!containsKeyHelper(key)) {
      return false;
    } else {
      // Make the entry the latest and update its TTL.
      makeEntryTheLatest(key);

      // Then get its value.
      value = m_list.front()->m_value;
      return true;
    }
  }

  /***
      Add <key, value> pair if no such key exists.
      Otherwise, just update the value of old key.
  ***/
  void put(const KeyType &key, const ValueType &value) {
    boost::mutex::scoped_lock lock(m_cache_mutex);
    if (containsKeyHelper(key)) {
      // Make the entry the latest and update its TTL.
      makeEntryTheLatest(key);

      // Now we only need to update its value.
      m_list.front()->m_value = value;
    } else { // Key exists and is not expired.
      if (m_list.size() == m_capacity) {
        KeyType delete_key = m_list.back()->m_key;
        m_list.pop_back();
        m_mapper.erase(delete_key);
      }

      DateTime now = boost::posix_time::second_clock::local_time();
      m_list.push_front(boost::make_shared<ListItem>(key, value,
                                                     getExpirationDatetime(now)));
      m_mapper[key] = m_list.begin();
    }
  }
};
#endif

Here is the unit tests.

#include "cxx_unit.h"
#include "lru_cache.h"

struct LruCacheTest
  : public FDS::CxxUnit::TestFixture<LruCacheTest>{
  CXXUNIT_TEST_SUITE();
  CXXUNIT_TEST(LruCacheTest, testContainsKey);
  CXXUNIT_TEST(LruCacheTest, testGet);
  CXXUNIT_TEST(LruCacheTest, testPut);
  CXXUNIT_TEST_SUITE_END();

  void testContainsKey();
  void testGet();
  void testPut();
};


void LruCacheTest::testContainsKey() {
  LRUCache<int,std::string> cache(3);
  cache.put(1,"1"); // 1
  cache.put(2,"2"); // 2,1
  cache.put(3,"3"); // 3,2,1
  cache.put(4,"4"); // 4,3,2

  std::string value_holder("");
  CXXUNIT_ASSERT(cache.get(1, value_holder) == false); // 4,3,2
  CXXUNIT_ASSERT(value_holder == "");

  CXXUNIT_ASSERT(cache.get(2, value_holder) == true); // 2,4,3
  CXXUNIT_ASSERT(value_holder == "2");

  cache.put(5,"5"); // 5, 2, 4

  CXXUNIT_ASSERT(cache.get(3, value_holder) == false); // 5, 2, 4
  CXXUNIT_ASSERT(value_holder == "2"); // value_holder is still "2"

  CXXUNIT_ASSERT(cache.get(4, value_holder) == true); // 4, 5, 2
  CXXUNIT_ASSERT(value_holder == "4");

  cache.put(2,"II"); // {2, "II"}, 4, 5

  CXXUNIT_ASSERT(cache.get(2, value_holder) == true); // 2, 4, 5
  CXXUNIT_ASSERT(value_holder == "II");

  // Cache-entries : {2, "II"}, {4, "4"}, {5, "5"}
  CXXUNIT_ASSERT(cache.size() == 3);
  CXXUNIT_ASSERT(cache.get(2, value_holder) == true);
  CXXUNIT_ASSERT(cache.get(4, value_holder) == true);
  CXXUNIT_ASSERT(cache.get(5, value_holder) == true);
}

void LruCacheTest::testGet() {
  LRUCache<int,std::string> cache(3);
  cache.put(1,"1"); // 1
  cache.put(2,"2"); // 2,1
  cache.put(3,"3"); // 3,2,1
  cache.put(4,"4"); // 4,3,2

  std::string value_holder("");
  CXXUNIT_ASSERT(cache.get(1, value_holder) == false); // 4,3,2
  CXXUNIT_ASSERT(value_holder == "");

  CXXUNIT_ASSERT(cache.get(2, value_holder) == true); // 2,4,3
  CXXUNIT_ASSERT(value_holder == "2");

  cache.put(5,"5"); // 5,2,4
  CXXUNIT_ASSERT(cache.get(5, value_holder) == true); // 5,2,4
  CXXUNIT_ASSERT(value_holder == "5");

  CXXUNIT_ASSERT(cache.get(4, value_holder) == true); // 4, 5, 2
  CXXUNIT_ASSERT(value_holder == "4");


  cache.put(2,"II");
  CXXUNIT_ASSERT(cache.get(2, value_holder) == true); // {2 : "II"}, 4, 5
  CXXUNIT_ASSERT(value_holder == "II");

  // Cache-entries : {2, "II"}, {4, "4"}, {5, "5"}
  CXXUNIT_ASSERT(cache.size() == 3);
  CXXUNIT_ASSERT(cache.get(2, value_holder) == true);
  CXXUNIT_ASSERT(cache.get(4, value_holder) == true);
  CXXUNIT_ASSERT(cache.get(5, value_holder) == true);
}

void LruCacheTest::testPut() {
  LRUCache<int,std::string> cache(3);
  cache.put(1,"1"); // 1
  cache.put(2,"2"); // 2,1
  cache.put(3,"3"); // 3,2,1
  cache.put(4,"4"); // 4,3,2
  cache.put(5,"5"); // 5,4,3

  std::string value_holder("");
  CXXUNIT_ASSERT(cache.get(2, value_holder) == false); // 5,4,3
  CXXUNIT_ASSERT(value_holder == "");

  CXXUNIT_ASSERT(cache.get(4, value_holder) == true); // 4,5,3
  CXXUNIT_ASSERT(value_holder == "4");

  cache.put(2,"II");
  CXXUNIT_ASSERT(cache.get(2, value_holder) == true); // II,4,5
  CXXUNIT_ASSERT(value_holder == "II");

  // Cache-entries : {2, "II"}, {4, "4"}, {5, "5"}
  CXXUNIT_ASSERT(cache.size() == 3);
  CXXUNIT_ASSERT(cache.get(2, value_holder) == true);
  CXXUNIT_ASSERT(cache.get(4, value_holder) == true);
  CXXUNIT_ASSERT(cache.get(5, value_holder) == true);
}

CXXUNIT_REGISTER_TEST(LruCacheTest);
Share:
70,466

Related videos on Youtube

user297850
Author by

user297850

Updated on April 16, 2022

Comments

  • user297850
    user297850 about 2 years

    Least Recently Used (LRU) Cache is to discard the least recently used items first How do you design and implement such a cache class? The design requirements are as follows:

    1) find the item as fast as we can

    2) Once a cache misses and a cache is full, we need to replace the least recently used item as fast as possible.

    How to analyze and implement this question in terms of design pattern and algorithm design?

  • Admin
    Admin about 14 years
    @Moron: I would use a doubly-linked list.
  • leo7r
    leo7r about 14 years
    @darid: I think you can also use a singly-linked list and make the hash table point to the previous node to the one that contains the hashed value.
  • Admin
    Admin about 14 years
    @darid: Just 'linked list' does not imply singly linked list. Besides, for singly linked list, making a random node the head is not O(1).
  • Admin
    Admin about 14 years
    @Rafal: Why complicate matters? Given a doubly linked list implementation and a hashtable implementation, you can put them together easily to create a LRU implementaiton. Trying to maintain the linked list structure in the hashtable will make you implement the linked list methods yourself and you will not be able to use off the shelf ones...
  • Admin
    Admin almost 14 years
    By the way, there a linked hash table implementation for C++ in Miscellaneous Container Templates. Its documentation contains example exactly for this usecase.
  • Admin
    Admin almost 14 years
    @doublep: Thanks, I have added your link to the answer.
  • sprite
    sprite about 13 years
    What you are saying is pretty much the Linked List + Hash Table method.
  • elif
    elif over 11 years
    There is also a linked hash map implementation in java that has a LRU option: LinkedHashMap
  • Casey Patton
    Casey Patton over 11 years
    So is your hash-map from keys to Node pointers? And does the node store both the key and the value? So in order to do a get you have to follow the node pointer and grab the value?
  • jax
    jax about 8 years
    Why do you use an list and unordeed map?
  • Chintan Parikh
    Chintan Parikh over 7 years
    list is implemented as doubly linked list internally, and unordered_map is basically a hash table. so this is efficient solution, in terms of time & space complexity.
  • Joe Black
    Joe Black over 5 years
    are you locking both the hashtable and linkedlist when doing either get or put? Wouldnt that make it almost sequential even for CPUs with multiple cores?
  • Yang Liu
    Yang Liu over 5 years
    @JoeBlack. Yes, I am locking both hashtable and linkedlist, this because get and put method both edit the "state" of these two data structures. We have to do it like this, right? We can't let two threads editing the data structure at the same time.
  • piepi
    piepi over 2 years
    @RafałDowgird Hey, can you elaborate on what you meant by previous node to the one that contains the hashed value? How would I control the previous node and/or the hashed value? I am taking LinkedHashMap or ConcurrentHashMap in java as examples.