There are different forms of tables that assist us in getting information extremely efficiently. The ideal hash table is just an array of a fixed size; the size is determined by the application in which it will be used. The hash table includes key values with links to the related records. The essential principle behind a hash table is to enter a key value into a hash table location; the location will be calculated from the key value itself. Address calculation indexing, or hashing, refers to the one-to-one correlation between a key value and an index in a hash table. We will cover hashing algorithms and data structure in this part.
Hash Table
A hash table is a data structure that stores data associatively. Data is kept in an array format in a hash table, with each data item having its own unique index value. If we know the index of the needed data, we can get to it quickly. As a result, it becomes a data structure in which insertion and search operations are extremely quick, regardless of data size. Hash Table employs an array as a storage medium and uses the hash technique to produce an index where an element is to be put or to be located.
Hashing
Hashing is a technique for converting a set of key values into a set of array indexes. This is frequently represented by a shorter, fixed-length value or key that reflects the original string and makes it simpler to discover or use it. The most common use for hashing is the creation of hash tables.
Linear Probing
The hashing algorithm utilises an array index that has previously been used. In such a circumstance, we may search for the next empty spot in the array by searching into the following cell till we discover an empty cell. This method is known as linear probing.
Basic Operations
The following are the core primary operations of a hash table.
Search – In a hash table, search for an element.
Insert – Insert an element into a hash table.
Delete – removes an element from a hash table.
DataItem
Define a data item that contains some data and a key that will be used to search the hash table.
struct DataItem {
int data;
int key;
};
Hash Method
Define a hashing technique for computing the hash code of the data item’s key.
int hashCode(int key){
return key % SIZE;
}
Search Operation
Whenever an element has to be found, compute the hash code of the provided key and use that hashcode as an index in the array to find the element. If the element is not located at the computed hash code, use linear probing to obtain it ahead.
struct DataItem *search(int key){
//get the hash
int hashIndex = hashCode(key);
//move in array until an empty
while(hashArray[hashIndex] != NULL){
if(hashArray[hashIndex]->key == key)
return hashArray[hashIndex];
//go to next cell
++hashIndex;
//wrap around the table
hashIndex %= SIZE;
}
return NULL;
}
Insert Operation
Whenever an element is to be introduced, compute the hash code of the provided key and use that hashcode as an index in the array to get the index. Use linear probing for empty location if an element is discovered at the calculated hash code.
void insert(int key,int data){
struct DataItem *item = (struct DataItem*) malloc(sizeof(struct DataItem));
item->data = data;
item->key = key;
//get the hash
int hashIndex = hashCode(key);
//move in array until an empty or deleted cell
while(hashArray[hashIndex] != NULL && hashArray[hashIndex]->key != -1){
//go to next cell
++hashIndex;
//wrap around the table
hashIndex %= SIZE;
}
hashArray[hashIndex] = item;
}
Delete Operation
When an element needs to be removed, compute the hash code of the key supplied and identify the index using that hashcode as the index in the array. If an element is not located at the computed hash code, use linear probing to advance it. When detected, place a fake item there to keep the hashtable’s performance unchanged.
struct DataItem* delete(struct DataItem* item){
int key = item->key;
//get the hash
int hashIndex = hashCode(key);
//move in array until an empty
while(hashArray[hashIndex] !=NULL){
if(hashArray[hashIndex]->key == key){
struct DataItem* temp = hashArray[hashIndex];
//assign a dummy item at deleted position
hashArray[hashIndex] = dummyItem;
return temp;
}
//go to next cell
++hashIndex;
//wrap around the table
hashIndex %= SIZE;
}
return NULL;
}
Algorithm
Hashing-based search algorithms are divided into two components. The first step is to generate a hash function that converts the search key to an array index. Various keys should ideally correspond to different indices. Because this ideal is typically out of reach, we must accept the chance that two or more distinct keys hash to the same array index. As a result, the second component of a hashing search is a collision-resolution procedure that handles this issue. Collision Resolution
Separate Chaining
Separate chaining entails creating a linked list with a key-value pair for each search array index. The objects that have collided are connected together in a single linked list, which may be browsed to retrieve the item with a unique search key. Collision resolution by chaining with a linked list is a typical way of implementing hash tables.
Open Addressing
Another method used to resolve collisions in hash tables is called open addressing. Here, rather than building secondary data structures, we resolve collisions by looking elsewhere in the table. Specifically, we have a sequence of hash functions 〈h0, h1, h2,…….,hm-1〉, such that for any item x, the probe sequence is a permutation of 〈0, 1, 2, . . . , m − 1〉. In other words, different hash functions in the sequence always map x to different locations in the hash table. We search for x using the following algorithm, which returns the array index i if T[i] = x, ‘absent’ if x is not in the table, but there is an empty slot, and ‘full’ if x is not in the table and there are no empty slots.
The algorithm for inserting a new item into the table is similar; only the second-to-last line is changed to T[hi (x)] ← x. Notice that for an open-addressed hash table, the load factor is never bigger than 1. Just as with chaining, we’d like to pretend that the sequence of hash values is truly random for analysis purposes. Specifically, most open-address hashing analysis uses the following assumption, which is impossible to enforce in practice but leads to reasonably predictive results for most applications.
Conclusion
Thus it is concluded that a hash table is a data structure that stores data associatively. The hash table includes key values with links to the related records. The essential principle behind a hash table is that we must enter a key value into a hash table location; the location will be calculated from the key value itself. Address calculation indexing, or hashing, refers to the one-to-one correlation between a key value and an index in a hash table.