Short Notes: Hashing | Short Notes for Computer Science Engineering - Computer Science Engineering (CSE) PDF Download

Download, print and study this document offline
Please wait while the PDF view is loading
 Page 1


Hashing 
Hashing is a common method of accessing data records using the hash table. 
Hashing can be used to build, search, or delete from a table.
Hash Table: A hash table is a data structure that stores records in an array, called a 
hash table. Hash table can be used for quick insertion and searching.
Load Factor: The ratio of the number of items in a table to the table's size is called 
the load factor.
Hash Function:
• It is a method for computing table index from key.
• A good hash function is simple, so it can be computed quickly.
• The major advantage of hash tables is their speed.
• If the hash function is slow, this speed will be degraded.
• The purpose of a hash function is to take a range of key values and transform 
them into index values in such a way that the key values are distributed 
randomly across all the indices of the hash table.
• Goal of hash function:The probability of any two keys hashing to the same 
slot is 1/N.
There are many hash functions approaches as follows:
Division Method:
• Mapping a key K into one of m slots by taking the remainder of K divided by 
m.
h (K) = K mod m
• Example: Assume a table has 8 slots (m=8). Using division method, insert the 
following elements into the hash table. 36,18, 72, 43, and 6 are inserted in the
Page 2


Hashing 
Hashing is a common method of accessing data records using the hash table. 
Hashing can be used to build, search, or delete from a table.
Hash Table: A hash table is a data structure that stores records in an array, called a 
hash table. Hash table can be used for quick insertion and searching.
Load Factor: The ratio of the number of items in a table to the table's size is called 
the load factor.
Hash Function:
• It is a method for computing table index from key.
• A good hash function is simple, so it can be computed quickly.
• The major advantage of hash tables is their speed.
• If the hash function is slow, this speed will be degraded.
• The purpose of a hash function is to take a range of key values and transform 
them into index values in such a way that the key values are distributed 
randomly across all the indices of the hash table.
• Goal of hash function:The probability of any two keys hashing to the same 
slot is 1/N.
There are many hash functions approaches as follows:
Division Method:
• Mapping a key K into one of m slots by taking the remainder of K divided by 
m.
h (K) = K mod m
• Example: Assume a table has 8 slots (m=8). Using division method, insert the 
following elements into the hash table. 36,18, 72, 43, and 6 are inserted in the
order.
36 % S = 4 
18 % 8 = 2 
72 % 8 = 0
43 % 8 = 3 
6 % 8 = 6
[0] [1] [2]
[3] [4] [5] [6]
[7]
72
18 43
36
6
Mid-Square Method:
Mapping a key K into one of m slots, by getting the some middle digits from value
K2.
h(k) = K2 and get middle (logio m) digits
Example: 3121 is a key and square of 3121 is 9740641. Middle part is 406 (with a
table size of 1000)
Folding Method:
Divide the key K into some sections, besides the last section, have same length.
Then, add these sections together.
• Shift folding (123456 is folded as 12+34+56)
• Folding at the boundaries (123456 is folded as 12+43+56)
Radix Hashing:
Transform the number into another base and then divide by the maximum address.
• Example: Addresses from 0 to 99, key = 453 in base 11 = 382. Hash address = 
382 mod 99 = 85.
Problems with hashing:
• Collision: No matter what the hash function, there is the possibility that two 
different keys could resolve to the same hash address. This situation is 
known as a collision.
• Handling the Collisions: The following techniques can be used to handle the 
collisions.
° Chaining
° Double hashing (Re-hashing)
° Open Addressing (Linear probing, Quadratic probing, Random probing), 
etc.
Chaining:
Page 3


Hashing 
Hashing is a common method of accessing data records using the hash table. 
Hashing can be used to build, search, or delete from a table.
Hash Table: A hash table is a data structure that stores records in an array, called a 
hash table. Hash table can be used for quick insertion and searching.
Load Factor: The ratio of the number of items in a table to the table's size is called 
the load factor.
Hash Function:
• It is a method for computing table index from key.
• A good hash function is simple, so it can be computed quickly.
• The major advantage of hash tables is their speed.
• If the hash function is slow, this speed will be degraded.
• The purpose of a hash function is to take a range of key values and transform 
them into index values in such a way that the key values are distributed 
randomly across all the indices of the hash table.
• Goal of hash function:The probability of any two keys hashing to the same 
slot is 1/N.
There are many hash functions approaches as follows:
Division Method:
• Mapping a key K into one of m slots by taking the remainder of K divided by 
m.
h (K) = K mod m
• Example: Assume a table has 8 slots (m=8). Using division method, insert the 
following elements into the hash table. 36,18, 72, 43, and 6 are inserted in the
order.
36 % S = 4 
18 % 8 = 2 
72 % 8 = 0
43 % 8 = 3 
6 % 8 = 6
[0] [1] [2]
[3] [4] [5] [6]
[7]
72
18 43
36
6
Mid-Square Method:
Mapping a key K into one of m slots, by getting the some middle digits from value
K2.
h(k) = K2 and get middle (logio m) digits
Example: 3121 is a key and square of 3121 is 9740641. Middle part is 406 (with a
table size of 1000)
Folding Method:
Divide the key K into some sections, besides the last section, have same length.
Then, add these sections together.
• Shift folding (123456 is folded as 12+34+56)
• Folding at the boundaries (123456 is folded as 12+43+56)
Radix Hashing:
Transform the number into another base and then divide by the maximum address.
• Example: Addresses from 0 to 99, key = 453 in base 11 = 382. Hash address = 
382 mod 99 = 85.
Problems with hashing:
• Collision: No matter what the hash function, there is the possibility that two 
different keys could resolve to the same hash address. This situation is 
known as a collision.
• Handling the Collisions: The following techniques can be used to handle the 
collisions.
° Chaining
° Double hashing (Re-hashing)
° Open Addressing (Linear probing, Quadratic probing, Random probing), 
etc.
Chaining:
• A chain is simply a linked list of all the elements with the same hash key.
• A linked list is created at each index in the hash table.
• Hash Function: h (K) = K mod m
• Example: Assume a table has 8 slots (m=8). Using the chaining, insert the 
following elements into the hash table. 36,18, 72, 43, 6,10, 5, and 15 are 
inserted in the order.
Hash key = key % table size
36 % S
=
4
13 % S
=
2
72 % s
=
0
43 % s
=
3
6 % s
=
6
10 % s
=
2
5 % s
=
5
15 % s
=
7
• A data item's key is hashed to the index in simple hashing, and the item is 
inserted into the linked list at that index.
• Other items that hash to the same index are simply added to the linked list.
• Cost is proportional to length of list.
• Average length = N / M, where N is size of array and M is the number of linked 
lists.
• Theorem: Let a = N / M > 1 be average length of list. For any t > 1, probability 
that list length > t o r is exponentially small in t.
• Analysis of Chaining:
° If M is too large, then too many empty chains.
° If M is too small, then chains are too long.
• There is no need to search for empty cells in the array or table.
• Advantages of Chaining: Unbounded elements can be stored (No bound to the 
size of table).
Page 4


Hashing 
Hashing is a common method of accessing data records using the hash table. 
Hashing can be used to build, search, or delete from a table.
Hash Table: A hash table is a data structure that stores records in an array, called a 
hash table. Hash table can be used for quick insertion and searching.
Load Factor: The ratio of the number of items in a table to the table's size is called 
the load factor.
Hash Function:
• It is a method for computing table index from key.
• A good hash function is simple, so it can be computed quickly.
• The major advantage of hash tables is their speed.
• If the hash function is slow, this speed will be degraded.
• The purpose of a hash function is to take a range of key values and transform 
them into index values in such a way that the key values are distributed 
randomly across all the indices of the hash table.
• Goal of hash function:The probability of any two keys hashing to the same 
slot is 1/N.
There are many hash functions approaches as follows:
Division Method:
• Mapping a key K into one of m slots by taking the remainder of K divided by 
m.
h (K) = K mod m
• Example: Assume a table has 8 slots (m=8). Using division method, insert the 
following elements into the hash table. 36,18, 72, 43, and 6 are inserted in the
order.
36 % S = 4 
18 % 8 = 2 
72 % 8 = 0
43 % 8 = 3 
6 % 8 = 6
[0] [1] [2]
[3] [4] [5] [6]
[7]
72
18 43
36
6
Mid-Square Method:
Mapping a key K into one of m slots, by getting the some middle digits from value
K2.
h(k) = K2 and get middle (logio m) digits
Example: 3121 is a key and square of 3121 is 9740641. Middle part is 406 (with a
table size of 1000)
Folding Method:
Divide the key K into some sections, besides the last section, have same length.
Then, add these sections together.
• Shift folding (123456 is folded as 12+34+56)
• Folding at the boundaries (123456 is folded as 12+43+56)
Radix Hashing:
Transform the number into another base and then divide by the maximum address.
• Example: Addresses from 0 to 99, key = 453 in base 11 = 382. Hash address = 
382 mod 99 = 85.
Problems with hashing:
• Collision: No matter what the hash function, there is the possibility that two 
different keys could resolve to the same hash address. This situation is 
known as a collision.
• Handling the Collisions: The following techniques can be used to handle the 
collisions.
° Chaining
° Double hashing (Re-hashing)
° Open Addressing (Linear probing, Quadratic probing, Random probing), 
etc.
Chaining:
• A chain is simply a linked list of all the elements with the same hash key.
• A linked list is created at each index in the hash table.
• Hash Function: h (K) = K mod m
• Example: Assume a table has 8 slots (m=8). Using the chaining, insert the 
following elements into the hash table. 36,18, 72, 43, 6,10, 5, and 15 are 
inserted in the order.
Hash key = key % table size
36 % S
=
4
13 % S
=
2
72 % s
=
0
43 % s
=
3
6 % s
=
6
10 % s
=
2
5 % s
=
5
15 % s
=
7
• A data item's key is hashed to the index in simple hashing, and the item is 
inserted into the linked list at that index.
• Other items that hash to the same index are simply added to the linked list.
• Cost is proportional to length of list.
• Average length = N / M, where N is size of array and M is the number of linked 
lists.
• Theorem: Let a = N / M > 1 be average length of list. For any t > 1, probability 
that list length > t o r is exponentially small in t.
• Analysis of Chaining:
° If M is too large, then too many empty chains.
° If M is too small, then chains are too long.
• There is no need to search for empty cells in the array or table.
• Advantages of Chaining: Unbounded elements can be stored (No bound to the 
size of table).
• Disadvantage of Chaining: Too many linked lists (overhead of linked lists) 
Open Addressing:
In open addressing, when a data item can’t be placed at the hashed index value, 
another location in the array is sought. We’ll explore three methods of open 
addressing, which vary in the method used to find the next empty location. These 
methods are linear probing, quadratic probing, and double hashing.
Linear Probing:
• When using a linear probing method the item will be stored in the next 
available slot in the table, assuming that the table is not already full.
• This is implemented via a linear searching for an empty slot, from the point of 
collision.
• If the end of table is reached during the linear search, the search will wrap 
around to the beginning of the table and continue from there.
• Example: Assume a table has 8 slots (m=8). Using Linear probing, insert the 
following elements into the hash table. 36,18, 72, 43, 6,10, 5, and 15 are 
inserted in the order.
Hash key = key % table size
36 % 8
=
4
18 % 8
=
2
72 % 8
=
0
43 % 8
=
3
6 % 8
=
6
10 % 8
=
2
5 % 8
=
5
15 % 8
=
7
[ 0]
[1]
[ 2]
[3]
[4]
[5] 
[S ] 
[7]
72
15
18
43
36
10
6
5
Relationship between probe length (P) and load factor (L) for linear probing: 
° For a successful search: P = (1 +1 /(1-L)2) /2
Page 5


Hashing 
Hashing is a common method of accessing data records using the hash table. 
Hashing can be used to build, search, or delete from a table.
Hash Table: A hash table is a data structure that stores records in an array, called a 
hash table. Hash table can be used for quick insertion and searching.
Load Factor: The ratio of the number of items in a table to the table's size is called 
the load factor.
Hash Function:
• It is a method for computing table index from key.
• A good hash function is simple, so it can be computed quickly.
• The major advantage of hash tables is their speed.
• If the hash function is slow, this speed will be degraded.
• The purpose of a hash function is to take a range of key values and transform 
them into index values in such a way that the key values are distributed 
randomly across all the indices of the hash table.
• Goal of hash function:The probability of any two keys hashing to the same 
slot is 1/N.
There are many hash functions approaches as follows:
Division Method:
• Mapping a key K into one of m slots by taking the remainder of K divided by 
m.
h (K) = K mod m
• Example: Assume a table has 8 slots (m=8). Using division method, insert the 
following elements into the hash table. 36,18, 72, 43, and 6 are inserted in the
order.
36 % S = 4 
18 % 8 = 2 
72 % 8 = 0
43 % 8 = 3 
6 % 8 = 6
[0] [1] [2]
[3] [4] [5] [6]
[7]
72
18 43
36
6
Mid-Square Method:
Mapping a key K into one of m slots, by getting the some middle digits from value
K2.
h(k) = K2 and get middle (logio m) digits
Example: 3121 is a key and square of 3121 is 9740641. Middle part is 406 (with a
table size of 1000)
Folding Method:
Divide the key K into some sections, besides the last section, have same length.
Then, add these sections together.
• Shift folding (123456 is folded as 12+34+56)
• Folding at the boundaries (123456 is folded as 12+43+56)
Radix Hashing:
Transform the number into another base and then divide by the maximum address.
• Example: Addresses from 0 to 99, key = 453 in base 11 = 382. Hash address = 
382 mod 99 = 85.
Problems with hashing:
• Collision: No matter what the hash function, there is the possibility that two 
different keys could resolve to the same hash address. This situation is 
known as a collision.
• Handling the Collisions: The following techniques can be used to handle the 
collisions.
° Chaining
° Double hashing (Re-hashing)
° Open Addressing (Linear probing, Quadratic probing, Random probing), 
etc.
Chaining:
• A chain is simply a linked list of all the elements with the same hash key.
• A linked list is created at each index in the hash table.
• Hash Function: h (K) = K mod m
• Example: Assume a table has 8 slots (m=8). Using the chaining, insert the 
following elements into the hash table. 36,18, 72, 43, 6,10, 5, and 15 are 
inserted in the order.
Hash key = key % table size
36 % S
=
4
13 % S
=
2
72 % s
=
0
43 % s
=
3
6 % s
=
6
10 % s
=
2
5 % s
=
5
15 % s
=
7
• A data item's key is hashed to the index in simple hashing, and the item is 
inserted into the linked list at that index.
• Other items that hash to the same index are simply added to the linked list.
• Cost is proportional to length of list.
• Average length = N / M, where N is size of array and M is the number of linked 
lists.
• Theorem: Let a = N / M > 1 be average length of list. For any t > 1, probability 
that list length > t o r is exponentially small in t.
• Analysis of Chaining:
° If M is too large, then too many empty chains.
° If M is too small, then chains are too long.
• There is no need to search for empty cells in the array or table.
• Advantages of Chaining: Unbounded elements can be stored (No bound to the 
size of table).
• Disadvantage of Chaining: Too many linked lists (overhead of linked lists) 
Open Addressing:
In open addressing, when a data item can’t be placed at the hashed index value, 
another location in the array is sought. We’ll explore three methods of open 
addressing, which vary in the method used to find the next empty location. These 
methods are linear probing, quadratic probing, and double hashing.
Linear Probing:
• When using a linear probing method the item will be stored in the next 
available slot in the table, assuming that the table is not already full.
• This is implemented via a linear searching for an empty slot, from the point of 
collision.
• If the end of table is reached during the linear search, the search will wrap 
around to the beginning of the table and continue from there.
• Example: Assume a table has 8 slots (m=8). Using Linear probing, insert the 
following elements into the hash table. 36,18, 72, 43, 6,10, 5, and 15 are 
inserted in the order.
Hash key = key % table size
36 % 8
=
4
18 % 8
=
2
72 % 8
=
0
43 % 8
=
3
6 % 8
=
6
10 % 8
=
2
5 % 8
=
5
15 % 8
=
7
[ 0]
[1]
[ 2]
[3]
[4]
[5] 
[S ] 
[7]
72
15
18
43
36
10
6
5
Relationship between probe length (P) and load factor (L) for linear probing: 
° For a successful search: P = (1 +1 /(1-L)2) /2
° For an unsuccessful search: P = (1 +1 /(1-L))/2
• Analysis of Linear Probing:
° If load factor is too small then too many empty cells.
Quadratic Probing:
If the collision occurs, instead of moving one cell, move i2 cells from the point of 
collision, where i is the number of attempts to resolve the collision. Example:
• Example: Assume a table has 10 slots. Using Quadratic probing, insert the 
following elements in the given order.
89,18, 49, 58 and 69 are inserted into a hash table.
89% 10 = 9 
18% 10 = 8 
49% 10 = 9
49 = 9 is occupied, so 1 attempt 
needed. Insert at9+l2 = 10 = 0 
location.
58 % 10 = 8 3 attempts - 32 = 9 spots
5 8 = 8 is occupied, and next 3 
locations are occupied.
So 8 + 32 = 17 = 7 location.
69 % 10 = 9 2 attempts - 22 = 4 spots
\
Read More
90 docs

Top Courses for Computer Science Engineering (CSE)

Explore Courses for Computer Science Engineering (CSE) exam

Top Courses for Computer Science Engineering (CSE)

Signup for Free!
Signup to see your scores go up within 7 days! Learn & Practice with 1000+ FREE Notes, Videos & Tests.
10M+ students study on EduRev
Related Searches

Sample Paper

,

Viva Questions

,

Summary

,

Objective type Questions

,

Extra Questions

,

past year papers

,

shortcuts and tricks

,

Previous Year Questions with Solutions

,

Short Notes: Hashing | Short Notes for Computer Science Engineering - Computer Science Engineering (CSE)

,

ppt

,

Short Notes: Hashing | Short Notes for Computer Science Engineering - Computer Science Engineering (CSE)

,

study material

,

Short Notes: Hashing | Short Notes for Computer Science Engineering - Computer Science Engineering (CSE)

,

mock tests for examination

,

Semester Notes

,

practice quizzes

,

pdf

,

Important questions

,

Free

,

Exam

,

video lectures

,

MCQs

;