How would you implement a hashtable in language x?

27,903

Solution 1

A hash table a data structure that allows lookup of items in constant time. It works by hashing a value and converting that value to an offset in an array. The concept of a hash table is fairly easy to understand, but implementing is obviously harder. I'm not pasting the whole hash table here, but here are some snippets of a hash table I made in C a few weeks ago...

One of the basics of creating a hash table is having a good hash function. I used the djb2 hash function in my hash table:

int ComputeHash(char* key)
{
    int hash = 5381;
    while (*key)
    hash = ((hash << 5) + hash) + *(key++);
    return hash % hashTable.totalBuckets;
}

Then comes the actual code itself for creating and managing the buckets in the table

typedef struct HashTable{
    HashTable* nextEntry;
    char*      key;
    char*      value;
}HashBucket;

typedef struct HashTableEntry{
    int totalBuckets;       // Total number of buckets allocated for the hash table
    HashTable** hashBucketArray; // Pointer to array of buckets
}HashTableEntry;
HashTableEntry hashTable;

bool InitHashTable(int totalBuckets)
{
    if(totalBuckets > 0)
    {
        hashTable.totalBuckets = totalBuckets;
        hashTable.hashBucketArray = (HashTable**)malloc(totalBuckets * sizeof(HashTable));
        if(hashTable.hashBucketArray != NULL)
        {
            memset(hashTable.hashBucketArray, 0, sizeof(HashTable) * totalBuckets);
            return true;
        }
    }
    return false;
}

bool AddNode(char* key, char* value)
{
    int offset = ComputeHash(key);
    if(hashTable.hashBucketArray[offset] == NULL)
    {
        hashTable.hashBucketArray[offset] = NewNode(key, value);
        if(hashTable.hashBucketArray[offset] != NULL)
            return true;
    }

    else
    {
        if(AppendLinkedNode(hashTable.hashBucketArray[offset], key, value) != NULL)
            return true;
    }
    return false;
}

HashTable* NewNode(char* key, char* value)
{
    HashTable* tmpNode = (HashTable*)malloc(sizeof(HashTable));
    if(tmpNode != NULL)
    {
        tmpNode->nextEntry = NULL;
        tmpNode->key   = (char*)malloc(strlen(key));
        tmpNode->value = (char*)malloc(strlen(value));

        strcpy(tmpNode->key, key);
        strcpy(tmpNode->value, value);
    }
    return tmpNode;
}

AppendLinkedNode finds the last node in the linked list and appends a new node to it.

The code would be used like this:

if(InitHashTable(100) == false) return -1;
AddNode("10", "TEN");

Finding a node is a simple as:

HashTable* FindNode(char* key)
{
    int offset = ComputeHash(key);
    HashTable* tmpNode = hashTable.hashBucketArray[offset];
    while(tmpNode != NULL)
    {
        if(strcmp(tmpNode->key, key) == 0)
            return tmpNode;
        tmpNode = tmpNode->nextEntry;
    }
    return NULL;
}

And is used as follows:

char* value = FindNode("10");

Solution 2

A Java implementation in < 60 LoC

import java.util.ArrayList;
import java.util.List;
import java.util.Random;


public class HashTable {

    class KeyValuePair {

        Object key;

        Object value;

        public KeyValuePair(Object key, Object value) {
            this.key = key;
            this.value = value;
        }
    }

    private Object[] values;

    private int capacity;

    public HashTable(int capacity) {
        values = new Object[capacity];
        this.capacity = capacity;
    }

    private int hash(Object key) {
        return Math.abs(key.hashCode()) % capacity;
    }

    public void add(Object key, Object value) throws IllegalArgumentException {

        if (key == null || value == null)
            throw new IllegalArgumentException("key or value is null");

        int index = hash(key);

        List<KeyValuePair> list;
        if (values[index] == null) {
            list = new ArrayList<KeyValuePair>();
            values[index] = list;

        } else {
            // collision
            list = (List<KeyValuePair>) values[index];
        }

        list.add(new KeyValuePair(key, value));
    }

    public Object get(Object key) {
        List<KeyValuePair> list = (List<KeyValuePair>) values[hash(key)];
        if (list == null) {
            return null;
        }
        for (KeyValuePair kvp : list) {
            if (kvp.key.equals(key)) {
                return kvp.value;
            }
        }
        return null;
    }

    /**
     * Test
     */
    public static void main(String[] args) {

        HashTable ht = new HashTable(100);

        for (int i = 1; i <= 1000; i++) {
            ht.add("key" + i, "value" + i);
        }

        Random random = new Random();
        for (int i = 1; i <= 10; i++) {
            String key = "key" + random.nextInt(1000);
            System.out.println("ht.get(\"" + key + "\") = " + ht.get(key));
        }   
    }
}

Solution 3

A HashTable is a really simple concept: it is an array in which key and value pairs are placed into, (like an associative array) by the following scheme:

A hash function hashes the key to a (hopefully) unused index into the array. the value is then placed into the array at that particular index.

Data retrieval is easy, as the index into the array can be calculated via the hash function, thus look up is ~ O(1).

A problem arises when a hash function maps 2 different keys to the same index...there are many ways of handling this which I will not detail here...

Hash tables are a fundamental way of storing and retrieving data quickly, and are "built in" in nearly all programming language libraries.

Solution 4

I was looking for a completely portable C hash table implementation and became interested in how to implement one myself. After searching around a bit I found: Julienne Walker's The Art of Hashing which has some great tutorials on hashing and hash tables. Implementing them is a bit more complex than I thought but it was a great read.

Solution 5

Here is my code for a Hash Table, implemented in Java. Suffers from a minor glitch- the key and value fields are not the same. Might edit that in the future.

public class HashTable
{
    private LinkedList[] hashArr=new LinkedList[128];
    public static int HFunc(int key)
    {
        return key%128;
    }


    public boolean Put(Val V)
    {

        int hashval = HFunc(V.getKey());
        LinkedNode ln = new LinkedNode(V,null);
        hashArr[hashval].Insert(ln);
        System.out.println("Inserted!");
        return true;            
    }

    public boolean Find(Val V)
    {
        int hashval = HFunc(V.getKey());
        if (hashArr[hashval].getInitial()!=null && hashArr[hashval].search(V)==true)
        {
            System.out.println("Found!!");
            return true;
        }
        else
        {
            System.out.println("Not Found!!");
            return false;
        }

    }
    public boolean delete(Val v)
    {
        int hashval = HFunc(v.getKey());
        if (hashArr[hashval].getInitial()!=null && hashArr[hashval].delete(v)==true)
        {
            System.out.println("Deleted!!");
            return true;
        }
        else 
        {
            System.out.println("Could not be found. How can it be deleted?");
            return false;
        }
    }

    public HashTable()
    {
        for(int i=0; i<hashArr.length;i++)
            hashArr[i]=new LinkedList();
    }

}

class Val
{
    private int key;
    private int val;
    public int getKey()
    {
        return key;
    }
    public void setKey(int k)
    {
        this.key=k;
    }
    public int getVal()
    {
        return val;
    }
    public void setVal(int v)
    {
        this.val=v;
    }
    public Val(int key,int value)
    {
        this.key=key;
        this.val=value;
    }
    public boolean equals(Val v1)
    {
        if (v1.getVal()==this.val)
        {
            //System.out.println("hello im here");
            return true;
        }
        else 
            return false;
    }
}

class LinkedNode
{
    private LinkedNode next;
    private Val obj;
    public LinkedNode(Val v,LinkedNode next)
    {
        this.obj=v;
        this.next=next;
    }
    public LinkedNode()
    {
        this.obj=null;
        this.next=null;
    }
    public Val getObj()
    {
        return this.obj;    
    }
    public void setNext(LinkedNode ln)
    {
        this.next = ln;
    }

    public LinkedNode getNext()
    {
        return this.next;
    }
    public boolean equals(LinkedNode ln1, LinkedNode ln2)
    {
        if (ln1.getObj().equals(ln2.getObj()))
        {
            return true;
        }
        else 
            return false;

    }

}

class LinkedList
{
    private LinkedNode initial;
    public LinkedList()
    {
        this.initial=null;
    }
    public LinkedList(LinkedNode initial)
    {
        this.initial = initial;
    }
    public LinkedNode getInitial()
    {
        return this.initial;
    }
    public void Insert(LinkedNode ln)
    {
        LinkedNode temp = this.initial;
        this.initial = ln;
        ln.setNext(temp);
    }
    public boolean search(Val v)
    {
        if (this.initial==null)
            return false;
        else 
        {
            LinkedNode temp = this.initial;
            while (temp!=null)
            {
                //System.out.println("encountered one!");
                if (temp.getObj().equals(v))
                    return true;
                else 
                    temp=temp.getNext();
            }
            return false;
        }

    }
    public boolean delete(Val v)
    {
        if (this.initial==null)
            return false;
        else 
        {
            LinkedNode prev = this.initial;
            if (prev.getObj().equals(v))
            {
                this.initial = null;
                return true;
            }
            else
            {
                LinkedNode temp = this.initial.getNext();
            while (temp!=null)
            {
                if (temp.getObj().equals(v))
                {
                    prev.setNext(temp.getNext());
                    return true;
                }
                else
                {
                    prev=temp;
                    temp=temp.getNext();
                }
            }
            return false;
            }
        }
    }
}
Share:
27,903
mk.
Author by

mk.

funkadelicrex

Updated on July 09, 2022

Comments

  • mk.
    mk. almost 2 years

    The point of this question is to collect a list of examples of hashtable implementations using arrays in different languages. It would also be nice if someone could throw in a pretty detailed overview of how they work, and what is happening with each example.

    Edit:

    Why not just use the built in hash functions in your specific language?

    Because we should know how hash tables work and be able to implement them. This may not seem like a super important topic, but knowing how one of the most used data structures works seems pretty important to me. If this is to become the wikipedia of programming, then these are some of the types of questions that I will come here for. I'm not looking for a CS book to be written here. I could go pull Intro to Algorithms off the shelf and read up on the chapter on hash tables and get that type of info. More specifically what I am looking for are code examples. Not only for me in particular, but also for others who would maybe one day be searching for similar info and stumble across this page.

    To be more specific: If you had to implement them, and could not use built-in functions, how would you do it?

    You don't need to put the code here. Put it in pastebin and just link it.

  • Gene Belitski
    Gene Belitski over 12 years
    A minor fix is needed in the line 3 of the above snippet: k.GetHashCode() may return a negative number (for example, "Key with negative hashcode".GetHashCode() returns -648767821), which, in turn, will cause System.IndexOutOfRangeException when calculating a bucket number for a such key by function f.
  • tymtam
    tymtam almost 12 years
    How is this related to the question?
  • Jimmy Huch
    Jimmy Huch over 10 years
    I'm not sure you coould use char* value = FindNode("10"); since FindNode returns HashTable*. So you would be looking at something along the lines of: char* value = FindNode("10")->value;
  • João dos Reis
    João dos Reis over 9 years
    Agreed this may not answer the question, and that the question may not be constructive, but at least I finally understand why hash tables are so quick to retrieve values! Very embarrassing that until now I thought it was voodoo.
  • João dos Reis
    João dos Reis over 9 years
    Not sure what that list.add(new KeyValuePair(key, value)); is doing at the end of the add function?
  • Korbi
    Korbi over 9 years
    I don't really get that question. It adds the key -> value mapping to the internal array so it can be retrieved later in the get() call.
  • João dos Reis
    João dos Reis over 9 years
    Sorry, I wasn't thinking straight. I was thinking it was defined in that method, so it's just going to be garbage collected, but you're using it as a handle onto an element in the values member variable.