Hash Table is a data structure which stores data in an associative manner. In a hash table, data are stored in an array, where each data value is placed at an index computed from its key. Access to data becomes very fast if we know the index of the desired data.
Hash tables provide very fast insertion and search operations irrespective of the size of the data, when a suitable hashing method and collision-resolution scheme are used. A hash table uses an array as storage and a hash function to compute the index at which an element should be inserted or searched for.
Hashing is a technique to convert a (typically large) range of key values into a (smaller) range of array indices. A simple and commonly used hash function for integer keys is the division method, which uses the modulo operator:
index = key % SIZE
Example: consider a hash table of size 20. The following items (in the format (key, value)) are to be stored.
| Sr. No. | Key | Hash | Array Index |
|---|---|---|---|
| 1 | 1 | 1 % 20 = 1 | 1 |
| 2 | 2 | 2 % 20 = 2 | 2 |
| 3 | 42 | 42 % 20 = 2 | 2 |
| 4 | 4 | 4 % 20 = 4 | 4 |
| 5 | 12 | 12 % 20 = 12 | 12 |
| 6 | 14 | 14 % 20 = 14 | 14 |
| 7 | 17 | 17 % 20 = 17 | 17 |
| 8 | 13 | 13 % 20 = 13 | 13 |
| 9 | 37 | 37 % 20 = 17 | 17 |
A collision occurs when two different keys map to the same array index. Collisions are inevitable when the key space is larger than the table size. A collision-resolution strategy is required to handle such cases and still provide efficient operations.
Linear probing is an open addressing collision-resolution method. If the computed index is already occupied, linear probing searches the next cell (index+1), then index+2, and so on, wrapping around the table, until an empty cell is found.
Continuing the previous example, using linear probing to place items results in the following final indices:
| Sr. No. | Key | Hash | Array Index | After Linear Probing, Array Index |
|---|---|---|---|---|
| 1 | 1 | 1 % 20 = 1 | 1 | 1 |
| 2 | 2 | 2 % 20 = 2 | 2 | 2 |
| 3 | 42 | 42 % 20 = 2 | 2 | 3 |
| 4 | 4 | 4 % 20 = 4 | 4 | 4 |
| 5 | 12 | 12 % 20 = 12 | 12 | 12 |
| 6 | 14 | 14 % 20 = 14 | 14 | 14 |
| 7 | 17 | 17 % 20 = 17 | 17 | 17 |
| 8 | 13 | 13 % 20 = 13 | 13 | 13 |
| 9 | 37 | 37 % 20 = 17 | 17 | 18 |
Notes about linear probing:
The primary operations supported by a hash table are:
We can represent an entry (data item) stored in the hash table as a struct containing the key and associated data:
struct DataItem { int data; int key; };
A simple hash function using the division method:
int hashCode(int key){ return key % SIZE; }
To search for an element with a given key, compute its hash index and check the slot. If the key at that slot does not match, use the chosen collision-resolution probing sequence (for example, linear probing) to continue searching until the element is found or an empty slot is reached.
struct DataItem *search(int key) { // get the hash int hashIndex = hashCode(key); // move in array until an empty while(hashArray[hashIndex] != NULL) { if(hashArray[hashIndex]->key == key) return hashArray[hashIndex]; // go to next cell ++hashIndex; // wrap around the table hashIndex %= SIZE; } return NULL; }
To insert a new (key, data) pair, compute the hash index for the key. If the computed index is empty or marked deleted, place the item there. Otherwise, follow the probing sequence until an available slot is found.
void insert(int key,int data) { struct DataItem *item = (struct DataItem*) malloc(sizeof(struct DataItem)); item->data = data; item->key = key; // get the hash int hashIndex = hashCode(key); // move in array until an empty or deleted cell while(hashArray[hashIndex] != NULL && hashArray[hashIndex]->key != -1) { // go to next cell ++hashIndex; // wrap around the table hashIndex %= SIZE; } hashArray[hashIndex] = item; }
Deletion must preserve the property that probe sequences used during search still find items inserted earlier. One common approach in open addressing is to place a special dummy item (a tombstone) at the deleted position rather than setting it to NULL. This allows searches that would otherwise skip over reinserted elements to continue correctly.
struct DataItem* delete(struct DataItem* item) { int key = item->key; // get the hash int hashIndex = hashCode(key); // move in array until an empty while(hashArray[hashIndex] != NULL) { if(hashArray[hashIndex]->key == key) { struct DataItem* temp = hashArray[hashIndex]; // assign a dummy item at deleted position hashArray[hashIndex] = dummyItem; return temp; } // go to next cell ++hashIndex; // wrap around the table hashIndex %= SIZE; } return NULL; }
Alternatives to using tombstones are:
Load factor, usually denoted α, is defined as:
α = number_of_elements / table_size
It helps determine expected performance. Typical guidelines:
Summary: Hash tables provide efficient key-based storage and retrieval by converting keys to array indices using hash functions. Collisions must be handled using strategies such as chaining or open addressing (linear probing, quadratic probing, double hashing). Proper choice of hash function, table size and load factor management (rehashing) are essential to maintain average-case O(1) performance for insert, search and delete operations.
| 1. What is hashing in computer science engineering (CSE)? | ![]() |
| 2. How does hashing work in computer science engineering (CSE)? | ![]() |
| 3. What are the advantages of using hashing in computer science engineering (CSE)? | ![]() |
| 4. What are the different types of hash functions used in computer science engineering (CSE)? | ![]() |
| 5. How is collision resolution handled in hashing in computer science engineering (CSE)? | ![]() |
![]() | Explore Courses for Computer Science Engineering (CSE) exam |
![]() | Get EduRev Notes directly in your Google search |