Hashing in DBSM | Database Management System (DBMS) - Software Development PDF Download

Introduction

In modern database management systems (DBMS), efficient data retrieval is crucial for optimal performance. One popular technique used to achieve this is hashing. Hashing involves mapping data to a fixed-size array called a hash table, enabling fast and direct access to specific data elements. In this article, we will explore the concept of hashing in DBMS, understand its benefits, and see some examples and code snippets to illustrate its application.

What is Hashing?

Hashing is a technique used in database management systems to efficiently locate and retrieve data by mapping it to a fixed-size array. It involves applying a hash function to convert the data into a hash value or index, which is used to determine the position of the data in the hash table. The hash table acts as a lookup table, allowing for direct access to the desired data element.

Benefits of Hashing in DBMS

Hashing offers several advantages in DBMS:

  • Fast Data Retrieval: Hashing enables direct access to data elements, resulting in faster retrieval compared to other methods, such as linear search or binary search.
  • Constant Time Complexity: When implemented correctly, hashing provides a constant time complexity of O(1) for both insertion and retrieval operations, regardless of the size of the dataset.
  • Space Efficiency: Hash tables occupy a fixed amount of memory, typically proportional to the number of distinct keys or data elements, making them memory-efficient.

Hash Functions

A hash function takes an input (data element) and produces a unique hash value or index. The hash value is used to determine the position of the data in the hash table.
Some key characteristics of a good hash function include:

  • Deterministic: Given the same input, a hash function must always produce the same hash value.
  • Uniform Distribution: The hash values should be evenly distributed across the hash table to minimize collisions.
  • Efficiency: Hash functions should have fast computation time to ensure efficient data processing.

Hash Collision

Hash collision occurs when two different inputs produce the same hash value. Collisions are unavoidable due to the limited number of hash values and the infinite number of possible inputs. When a collision occurs, it must be handled to maintain the integrity and accuracy of the data stored in the hash table.

Hashing Examples and Code Explanation

Let's explore two examples to understand the application of hashing in DBMS.

Example 1: Student Records
Suppose we have a database of student records containing their names and corresponding roll numbers. We can use hashing to quickly retrieve the details of a student given their roll number.

# Hash function for roll numbers

def hash_function(roll_number):

    hash_value = roll_number % 100

    return hash_value


# Hash table to store student records

student_table = [None] * 100


# Inserting a student record

def insert_student_record(roll_number, name):

    index = hash_function(roll_number)

    student_table[index] = name


# Retrieving a student record

def get_student_record(roll_number):

    index = hash_function(roll_number)

    return student_table[index]


# Example usage

insert_student_record(101, "John")

insert_student_record(202, "Alice")


print(get_student_record(101))  # Output: John

print(get_student_record(202))  # Output: Alice

Explanation:

  • The 'hash_function' takes a roll number as input and calculates the hash value by taking the modulus (%) with 100. This ensures that the hash value remains within the index range of the hash table.
  • The 'insert_student_record' function uses the hash value to store the student's name in the hash table at the corresponding index.
  • The 'get_student_record' function retrieves the student's name by using the hash value to access the hash table.

Example 2: Employee Database
Let's consider an employee database where each employee is identified by their unique employee ID. We can use hashing to efficiently retrieve employee details using their ID.

# Hash function for employee IDs

def hash_function(employee_id):

    hash_value = employee_id % 1000

    return hash_value


# Hash table to store employee records

employee_table = [None] * 1000


# Inserting an employee record

def insert_employee_record(employee_id, name):

    index = hash_function(employee_id)

    employee_table[index] = name


# Retrieving an employee record

def get_employee_record(employee_id):

    index = hash_function(employee_id)

    return employee_table[index]


# Example usage

insert_employee_record(12345, "John Doe")

insert_employee_record(67890, "Jane Smith")


print(get_employee_record(12345))  # Output: John Doe

print(get_employee_record(67890))  # Output: Jane Smith

Explanation:

  • Similar to the previous example, the 'hash_function' computes the hash value for employee IDs by taking the modulus with 1000.
  • The 'insert_employee_record' function inserts the employee's name into the hash table at the calculated index.
  • The 'get_employee_record' function retrieves the employee's name by accessing the hash table using the hash value.

Sample Problems and Solutions

Problem 1: Implement a hash table to store the frequency count of words in a text document.

Create a hash table where the words are hashed based on their string value. Maintain a count for each word encountered, updating the count whenever a word is encountered again.

Problem 2: Implement a hash table to store contact information (name and phone number) for a phone book.

Use hashing to store contact details based on the hash value of the phone number. This allows for quick retrieval of contact information given a phone number.

Conclusion

Hashing is a powerful technique used in database management systems to achieve efficient data retrieval. By leveraging hash functions and hash tables, DBMS can provide fast and direct access to specific data elements, leading to improved performance. Understanding hashing and its applications can greatly benefit beginners in the field of database management.

The document Hashing in DBSM | Database Management System (DBMS) - Software Development is a part of the Software Development Course Database Management System (DBMS).
All you need of Software Development at this link: Software Development
75 videos|44 docs

Top Courses for Software Development

75 videos|44 docs
Download as PDF
Explore Courses for Software Development exam

Top Courses for Software Development

Signup for Free!
Signup to see your scores go up within 7 days! Learn & Practice with 1000+ FREE Notes, Videos & Tests.
10M+ students study on EduRev
Related Searches

mock tests for examination

,

past year papers

,

Hashing in DBSM | Database Management System (DBMS) - Software Development

,

practice quizzes

,

Hashing in DBSM | Database Management System (DBMS) - Software Development

,

Semester Notes

,

Free

,

Sample Paper

,

Exam

,

Summary

,

pdf

,

study material

,

Important questions

,

video lectures

,

MCQs

,

Previous Year Questions with Solutions

,

ppt

,

Viva Questions

,

Extra Questions

,

shortcuts and tricks

,

Hashing in DBSM | Database Management System (DBMS) - Software Development

,

Objective type Questions

;