File Organization | Database Management System (DBMS) - Computer Science Engineering (CSE) PDF Download

File Organization in DBMS

A database consists of a huge amount of data. In an RDBMS the data is grouped into tables, and each table holds related records. A user sees the data in the form of tables, but in actuality this huge amount of data is kept on physical storage in the form of files.

File: A file is a named collection of related information that is recorded on secondary storage such as magnetic disks, magnetic tapes, and optical disks.

What is File Organization?
File organization refers to the logical relationships among the various records that constitute a file, particularly with respect to the means of identifying and accessing any specific record. In simple terms, storing the records of a file in a certain order is called file organization. File structure refers to the format of the label and data blocks and of any logical control record.

Types of File Organizations
Various methods have been introduced to organize files. Each method has advantages and disadvantages depending on how records are accessed or selected, so it is up to the programmer to choose the file organization method best suited to the requirements.

Some types of File Organizations are:

Sequential File Organization
Heap File Organization
Hash File Organization
B+ Tree File Organization
Clustered File Organization

Each of these file organizations is discussed in the sections that follow, along with the differences between them and the advantages and disadvantages of each method.

1. Sequential File Organization
The simplest method of file organization is the sequential method. In this method the records are stored one after another in a sequential manner. There are two ways to implement this method:

Pile File Method: This method is quite simple: records are stored in a sequence, one after the other, in the order in which they are inserted into the table.
(i) Insertion of new record
Let R1, R3, and so on up to R5 and R4 be the records in the sequence. Here, a record is simply a row in a table. Suppose a new record R2 has to be inserted in the sequence; it is simply placed at the end of the file.
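The pile insertion described above can be sketched in a few lines of Python. This is only an illustration: the list stands in for the file, and the function names and starting contents are assumptions made for the example.

```python
# Hypothetical in-memory stand-in for a pile file: a list of records
# kept strictly in arrival order.
pile_file = ["R1", "R3", "R5", "R4"]


def pile_insert(file, record):
    """Append the record at the end of the file; no ordering is maintained."""
    file.append(record)


def pile_search(file, record):
    """Scan from the start and return the position of the record, or -1."""
    for position, current in enumerate(file):
        if current == record:
            return position
    return -1


pile_insert(pile_file, "R2")
print(pile_file)                     # ['R1', 'R3', 'R5', 'R4', 'R2']
print(pile_search(pile_file, "R2"))  # 4
```

Insertion is a single append, but a search may have to visit every record before it finds the one it needs.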

Sorted File Method: In this method, as the name suggests, whenever a new record has to be inserted, it is always inserted in sorted (ascending or descending) order. Records may be sorted on the primary key or on any other key.
(i) Insertion of new record
Let us assume that there is a preexisting sorted sequence of records R1, R3, and so on up to R7 and R8. Suppose a new record R2 has to be inserted in the sequence; it is inserted at the end of the file and the sequence is then sorted again.
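A minimal Python sketch of sorted insertion follows. It assumes records are represented only by integer keys (R1 becomes 1, and so on), and the function names are made up for the example. The first function mirrors the append-then-sort description above. The second uses the standard `bisect.insort` to place the key directly at its sorted position, which is a common alternative rather than something the text prescribes.

```python
import bisect


def insert_then_sort(keys, new_key):
    """Append at the end of the file, then re-sort the whole sequence."""
    keys.append(new_key)
    keys.sort()


def insert_in_position(keys, new_key):
    """Alternative: find the sorted position and insert there directly."""
    bisect.insort(keys, new_key)


sorted_file = [1, 3, 7, 8]   # assumed keys for R1, R3, R7, R8
insert_then_sort(sorted_file, 2)
print(sorted_file)           # [1, 2, 3, 7, 8]

other_file = [1, 3, 7, 8]
insert_in_position(other_file, 2)
print(other_file)            # [1, 2, 3, 7, 8]
```

Both approaches give the same final order. The difference lies only in how much work each insertion does.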
(ii) Pros and Cons of Sequential File Organization
(a) Pros:
A. Fast and efficient when a huge amount of data is processed in sequence.
B. Simple design.
C. Files can easily be stored on magnetic tapes, i.e. a cheaper storage mechanism.
(b) Cons:
A. Time is wasted because we cannot jump directly to the required record; we have to move through the file sequentially.
B. The sorted file method is inefficient because sorting the records takes time and space.

2. Heap File Organization
Heap file organization works with data blocks. In this method records are inserted at the end of the file, into the data blocks. No sorting or ordering is required. If a data block is full, the new record is stored in some other block. This other data block need not be the very next data block; it can be any block in memory. It is the responsibility of the DBMS to store and manage the new records.

(i) Insertion of new record
Suppose we have five records in the heap, R1, R5, R6, R4 and R3, and suppose a new record R2 has to be inserted in the heap. Since the last data block, i.e. data block 3, is full, R2 will be inserted in any of the data blocks selected by the DBMS, let's say data block

If we want to search, delete or update data in heap file organization, we traverse the data from the beginning of the file until we find the requested record. Thus if the database is very large, searching, deleting or updating a record will take a lot of time.
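The behaviour described above can be sketched as follows. The block capacity, the rule of picking the first block with free space, and the function names are all assumptions made for the illustration. A real DBMS chooses the block by its own policy.

```python
BLOCK_CAPACITY = 2  # assumed number of records per data block


def heap_insert(blocks, record):
    """Place the record in any block with room; add a new block if none has."""
    for block in blocks:
        if len(block) < BLOCK_CAPACITY:
            block.append(record)
            return
    blocks.append([record])


def heap_search(blocks, record):
    """Scan blocks from the beginning; return (block_no, slot) or None."""
    for block_no, block in enumerate(blocks):
        for slot, current in enumerate(block):
            if current == record:
                return block_no, slot
    return None


# Three data blocks; the last one is full, the middle one has a free slot.
blocks = [["R1", "R5"], ["R6"], ["R4", "R3"]]
heap_insert(blocks, "R2")
print(blocks)                     # [['R1', 'R5'], ['R6', 'R2'], ['R4', 'R3']]
print(heap_search(blocks, "R2"))  # (1, 1)
```

The search has no ordering to exploit, so in the worst case it reads every block, which is why this organization becomes slow on large files.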
(ii) Pros and Cons of Heap File Organization

Pros
1. Fetching and retrieving records is faster than in sequential file organization, but only in the case of small databases.
2. When a huge amount of data needs to be loaded into the database at one time, this method of file organization is best suited.
Cons
1. Problem of unused memory blocks.
2. Inefficient for larger databases.

3. Hash File Organization
In a database management system, when we want to retrieve particular data, it becomes very inefficient to search through all the index values to reach the desired data. In this situation, the hashing technique comes into the picture.
Hashing is an efficient technique for directly locating the desired data on disk without using an index structure. Data is stored in the data blocks whose addresses are generated by a hash function. The memory location where these records are stored is called a data block or data bucket.

Data bucket: Data buckets are the memory locations where the records are stored. These buckets are also considered the unit of storage.
Hash function: A hash function is a mapping function that maps the set of all search keys to actual record addresses. Generally, the hash function uses the primary key to generate the hash index, that is, the address of the data block. A hash function can be anything from a simple mathematical function to a complex one.
Hash index: The prefix of an entire hash value is taken as the hash index. Every hash index has a depth value that signifies how many bits are used for computing the hash function. With n bits, 2^n buckets can be addressed. When all these bits are consumed, the depth value is increased by one and twice as many buckets are allocated.

[Figure: how a hash function works]

Hashing is further divided into two subcategories: static hashing and dynamic hashing.

(i) Static Hashing
In static hashing, when a search-key value is provided, the hash function always computes the same address. For example, if we generate an address for STUDENT_ID = 104 using a mod 5 hash function, it always results in the same bucket address, 4 (104 mod 5 = 4). The bucket address never changes. Hence the number of data buckets in memory remains constant throughout in static hashing.
(ii) Operations

Insertion: When a new record is inserted into the table, the hash function h generates a bucket address for the new record based on its hash key K.
Bucket address = h(K)
Searching: When a record needs to be searched for, the same hash function is used to retrieve the bucket address for the record. For example, if we want to retrieve the whole record for ID 104, and the hash function is mod 5 on that ID, the bucket address generated is 4. We then go directly to address 4 and retrieve the whole record for ID 104. Here the ID acts as the hash key.
Deletion: To delete a record, we first use the hash function to fetch the record that is to be deleted. Then we remove the record at that address in memory.
Updation: The data record that needs to be updated is first located using the hash function, and then the data record is updated.
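The four operations can be sketched with a fixed number of buckets and a mod 5 hash function. The bucket layout (a Python list per bucket, holding key and record pairs) and the function names are assumptions made for the illustration. The buckets here are unbounded lists, so this sketch never runs out of room.

```python
NUM_BUCKETS = 5  # fixed for the lifetime of the file (static hashing)

buckets = [[] for _ in range(NUM_BUCKETS)]


def h(key):
    """Hash function: bucket address = key mod 5."""
    return key % NUM_BUCKETS


def insert(key, record):
    buckets[h(key)].append((key, record))


def search(key):
    """Go straight to bucket h(key) and look for the key there."""
    for stored_key, record in buckets[h(key)]:
        if stored_key == key:
            return record
    return None


def update(key, new_record):
    bucket = buckets[h(key)]
    for i, (stored_key, _) in enumerate(bucket):
        if stored_key == key:
            bucket[i] = (key, new_record)
            return True
    return False


def delete(key):
    bucket = buckets[h(key)]
    for i, (stored_key, _) in enumerate(bucket):
        if stored_key == key:
            del bucket[i]
            return True
    return False


insert(104, "student 104")
print(h(104))       # 4
print(search(104))  # student 104
```

Every operation starts with the same call to `h`, so only one bucket is ever examined for a given key.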

Now suppose we want to insert a new record into the file, but the data bucket address generated by the hash function is not empty, i.e. data already exists at that address. This is a critical situation to handle. In static hashing this situation is called bucket overflow.

How will we insert data in this case?
There are several methods provided to overcome this situation. Some commonly used methods are discussed below:
(i) Open Hashing
In the open hashing method, the next available data block is used to store the new record instead of overwriting the older one. This method is also called linear probing.
For example, D3 is a new record that needs to be inserted, and the hash function generates the address 105. But that bucket is already full, so the system searches for the next available data bucket, 123, and assigns D3 to it.
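A minimal sketch of linear probing follows, assuming a small table in which each slot holds one record and `None` marks a free slot. The table size, the wrap-around rule, and the function name are assumptions made for the example, and the addresses are deliberately smaller than the 105 and 123 used above.

```python
def linear_probe_insert(table, home, record):
    """Insert at `home`, or at the next free slot, wrapping around the table."""
    size = len(table)
    for step in range(size):
        slot = (home + step) % size
        if table[slot] is None:
            table[slot] = record
            return slot
    raise OverflowError("no free bucket left")


table = ["D1", None, "D2", None, None]
# Suppose the hash function sends D3 to slot 0, which is already occupied.
print(linear_probe_insert(table, 0, "D3"))  # 1
print(table)                                # ['D1', 'D3', 'D2', None, None]
```

The existing record is never overwritten. The new one simply lands in the first free slot found after its home address.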

(ii) Closed hashing
In the closed hashing method, a new data bucket is allocated with the same address and is linked after the full data bucket. This method is also known as overflow chaining.
For example, we have to insert a new record D3 into the table. The static hash function generates the data bucket address 105, but this bucket is too full to store the new data. In this case a new data bucket is added at the end of data bucket 105 and is linked to it. The new record D3 is then inserted into the new bucket.
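Overflow chaining can be sketched with a small linked structure. The `Bucket` class, its capacity of two records, and the function names are assumptions made for the illustration.

```python
class Bucket:
    """A data bucket with a fixed capacity and a link to an overflow bucket."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed to be at least 1
        self.records = []
        self.overflow = None      # next bucket in the chain, if any


def chained_insert(bucket, record):
    """Walk the chain to a bucket with room, adding an overflow bucket if needed."""
    while len(bucket.records) >= bucket.capacity:
        if bucket.overflow is None:
            bucket.overflow = Bucket(bucket.capacity)
        bucket = bucket.overflow
    bucket.records.append(record)


def chained_search(bucket, record):
    """Follow the chain from the home bucket until the record is found."""
    while bucket is not None:
        if record in bucket.records:
            return True
        bucket = bucket.overflow
    return False


bucket_105 = Bucket(capacity=2)
for record in ["D1", "D2", "D3"]:
    chained_insert(bucket_105, record)

print(bucket_105.records)           # ['D1', 'D2']
print(bucket_105.overflow.records)  # ['D3']
```

All records that hash to the same address stay reachable from that address, at the cost of following links when the chain grows.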

Quadratic probing: Quadratic probing is very similar to open hashing or linear probing. The only difference is that in linear probing the distance between the old and the new bucket grows linearly, whereas in quadratic probing a quadratic function is used to determine the new bucket address.
Double hashing: Double hashing is another method similar to linear probing. Here the difference is fixed, as in linear probing, but this fixed difference is calculated by using another hash function. Hence the name double hashing.
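The three probing schemes differ only in how the next address is computed, which the sketch below makes concrete by listing the first few addresses each one tries. The table size of 11, the second hash function `1 + key mod (size - 1)`, and the function names are assumptions chosen for the example, not definitions taken from the text.

```python
def linear_probes(home, size, attempts):
    """Addresses tried by linear probing: home, home+1, home+2, ..."""
    return [(home + i) % size for i in range(attempts)]


def quadratic_probes(home, size, attempts):
    """Addresses tried by quadratic probing: home, home+1, home+4, home+9, ..."""
    return [(home + i * i) % size for i in range(attempts)]


def double_hash_probes(key, size, attempts):
    """Fixed step like linear probing, but the step comes from a second hash."""
    home = key % size
    step = 1 + (key % (size - 1))  # assumed second hash; always in 1..size-1
    return [(home + i * step) % size for i in range(attempts)]


print(linear_probes(3, 11, 4))        # [3, 4, 5, 6]
print(quadratic_probes(3, 11, 4))     # [3, 4, 7, 1]
print(double_hash_probes(25, 11, 4))  # [3, 9, 4, 10]
```

In double hashing two keys with the same home address usually get different steps, so they follow different probe sequences.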

(iii) Dynamic Hashing
The drawback of static hashing is that it does not expand or shrink dynamically as the size of the database grows or shrinks. In dynamic hashing, data buckets are added or removed dynamically as the number of records increases or decreases. Dynamic hashing is also known as extendible hashing (written as extended hashing in some notes).
In dynamic hashing, the hash function is made to produce a large number of values. For example, suppose there are three data records D1, D2 and D3, and the hash function generates the addresses 1001, 0101 and 1010 respectively. This method considers only part of the address, initially only the first bit, when storing the data. So it tries to load all three records at addresses 0 and 1.
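But the problem is that no bucket address remains for D3, because D1 already occupies address 1. The bucket has to grow dynamically to accommodate D3. So the address is changed to have 2 bits rather than 1 bit, and the existing data is updated to have 2-bit addresses. Then it tries to accommodate D3. In this example D1 and D3 still share the two-bit prefix 10, so a third bit (100 versus 101) is needed before they fall into separate buckets.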
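The idea of addressing buckets by a growing prefix can be sketched as follows. This is a deliberately simplified illustration, not a full dynamic hashing implementation: there is no directory and no individual bucket splitting, and the bucket capacity of one record and the function names are assumptions. It groups records by the first `depth` bits of their hash address and increases the depth until every bucket fits.

```python
def group_by_prefix(addresses, depth):
    """Group record names by the first `depth` bits of their hash address."""
    buckets = {}
    for name, bits in addresses.items():
        buckets.setdefault(bits[:depth], []).append(name)
    return buckets


def smallest_depth(addresses, capacity=1):
    """Increase the depth until no bucket holds more than `capacity` records."""
    width = len(next(iter(addresses.values())))
    for depth in range(1, width + 1):
        buckets = group_by_prefix(addresses, depth)
        if all(len(names) <= capacity for names in buckets.values()):
            return depth, buckets
    raise ValueError("addresses collide even at full width")


addresses = {"D1": "1001", "D2": "0101", "D3": "1010"}

print(group_by_prefix(addresses, 1))  # {'1': ['D1', 'D3'], '0': ['D2']}
print(group_by_prefix(addresses, 2))  # {'10': ['D1', 'D3'], '01': ['D2']}
print(smallest_depth(addresses))      # (3, {'100': ['D1'], '010': ['D2'], '101': ['D3']})
```

With these particular addresses D1 and D3 agree on their first two bits, so under the one-record-per-bucket assumption the depth has to reach three before each record has its own bucket.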

4. B+ Tree File Organization
A B+ tree, as the name suggests, uses a tree-like structure to store records in a file. It uses the concept of key indexing, where the primary key is used to sort the records. For each primary key, an index value is generated and mapped to the record. The index of a record is the address of the record in the file.
A B+ tree is very similar to a binary search tree, with the difference that a node can have more than two children instead of just two. All the information is stored in the leaf nodes, and the intermediate nodes act as pointers to the leaf nodes. The information in the leaf nodes always remains a sorted sequential linked list.
[Figure: example B+ tree with root node 56]

In the above diagram, 56 is the root node, which is also called the main node of the tree.
The intermediate nodes here consist only of the addresses of leaf nodes. They do not contain any actual records. The leaf nodes contain the actual records. All leaf nodes are at the same level, so the tree is balanced.
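The lookup path through such a tree can be sketched as follows. The tree is built by hand rather than by insertion, and only the root key 56 comes from the text. The other keys, the record strings, the class names, and the rule that keys equal to a separator go to the right child are assumptions made for the illustration.

```python
import bisect


class Leaf:
    def __init__(self, keys, records):
        self.keys = keys        # sorted keys
        self.records = records  # records[i] belongs to keys[i]
        self.next = None        # link to the next leaf in key order


class Internal:
    def __init__(self, keys, children):
        self.keys = keys          # separator keys only, no records
        self.children = children  # len(children) == len(keys) + 1


def find_leaf(node, key):
    """Descend from the root to the leaf that would contain `key`."""
    while isinstance(node, Internal):
        node = node.children[bisect.bisect_right(node.keys, key)]
    return node


def search(root, key):
    leaf = find_leaf(root, key)
    i = bisect.bisect_left(leaf.keys, key)
    if i < len(leaf.keys) and leaf.keys[i] == key:
        return leaf.records[i]
    return None


def range_scan(root, low, high):
    """Find the leaf for `low`, then follow the leaf links up to `high`."""
    leaf = find_leaf(root, low)
    found = []
    while leaf is not None:
        for key, record in zip(leaf.keys, leaf.records):
            if key > high:
                return found
            if key >= low:
                found.append(record)
        leaf = leaf.next
    return found


left = Leaf([12, 34], ["rec12", "rec34"])
right = Leaf([56, 78], ["rec56", "rec78"])
left.next = right
root = Internal([56], [left, right])

print(search(root, 56))          # rec56
print(search(root, 40))          # None
print(range_scan(root, 30, 60))  # ['rec34', 'rec56']
```

The range scan shows why the leaves are linked: once the first leaf is found, neighbouring records are read without returning to the intermediate nodes.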

Pros and Cons of B+ Tree File Organization

Pros
1. Tree traversal is easier and faster.
2. Searching becomes easy as all records are stored only in the leaf nodes, which form a sorted sequential linked list.
3. There is no restriction on B+ tree size. It may grow or shrink as the size of the data increases or decreases.
Cons
1. Inefficient for static tables.

5. Cluster File Organization
In cluster file organization, two or more related tables/records are stored within the same file, known as a cluster. These files have two or more tables in the same data block, and the key attributes that are used to map these tables together are stored only once.
Thus it lowers the cost of searching and retrieving various records in different files, as they are now combined and kept in a single cluster.
For example, suppose we have two tables or relations, Employee and Department. These tables are related to each other.
Therefore these tables can be combined using a join operation and can be seen in a cluster file.
If we have to insert, update or delete any record, we can do so directly. Data is sorted based on the primary key or the key with which searching is done. The cluster key is the key with which the tables are joined.

Types of Cluster File Organization: There are two ways to implement this method:

Indexed Clusters: In indexed clustering, the records are grouped based on the cluster key and stored together. The above-mentioned Employee and Department relationship is an example of an indexed cluster, where the records are grouped based on the Department ID.
Hash Clusters: This is very similar to an indexed cluster, with the only difference that instead of storing the records based on the cluster key, we generate a hash key value and store together the records with the same hash key value.
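The two variants can be contrasted in a short sketch. The sample rows, the column layout, the bucket count of four, and the function names are all hypothetical. Only the idea of grouping Employee and Department rows by Department ID comes from the text.

```python
from collections import defaultdict

# Hypothetical sample data: (dep_id, dep_name) and (emp_id, emp_name, dep_id).
departments = [(10, "Sales"), (20, "Engineering")]
employees = [(1, "Asha", 10), (2, "Ben", 20), (3, "Chen", 10)]


def build_indexed_cluster(departments, employees):
    """Group rows by the cluster key itself; the key is stored once per cluster."""
    clusters = {}
    for dep_id, dep_name in departments:
        clusters[dep_id] = {"department": dep_name, "employees": []}
    for emp_id, emp_name, dep_id in employees:
        clusters[dep_id]["employees"].append((emp_id, emp_name))
    return clusters


def build_hash_cluster(departments, employees, num_buckets=4):
    """Group rows by a hash of the cluster key instead of the key itself."""
    buckets = defaultdict(list)
    for dep_id, dep_name in departments:
        buckets[dep_id % num_buckets].append(("department", dep_id, dep_name))
    for emp_id, emp_name, dep_id in employees:
        buckets[dep_id % num_buckets].append(("employee", emp_id, emp_name, dep_id))
    return dict(buckets)


indexed = build_indexed_cluster(departments, employees)
print(indexed[10])  # {'department': 'Sales', 'employees': [(1, 'Asha'), (3, 'Chen')]}

hashed = build_hash_cluster(departments, employees)
print(sorted(hashed))  # [0, 2]
```

In both cases a department and its employees end up stored together, so joining them needs no lookup in a second file.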


FAQs on File Organization - Database Management System (DBMS) - Computer Science Engineering (CSE)

1. What is file organization in DBMS?

Ans. File organization in DBMS refers to the way in which data is stored in a file or database. It determines the efficiency and speed of data retrieval and storage operations. Different file organization techniques include sequential, indexed sequential, direct, and hashed organization.

2. What is sequential file organization?

Ans. Sequential file organization is a technique in which data is stored in a sequential order based on a key field. Records are stored one after another in a linear fashion, making it easy to retrieve data in the order they are stored. However, sequential organization is not suitable for random access or frequent updates.

3. What is indexed sequential file organization?

Ans. Indexed sequential file organization is a combination of sequential and indexed organization techniques. In this method, records are stored sequentially like in sequential organization, but an index is built to provide direct access to specific records based on key values. This allows for efficient random access and faster retrieval of data.

4. What is direct file organization?

Ans. Direct file organization, also known as random file organization, is a method in which records are assigned a fixed location based on their key value. This allows for direct access to any record without the need to scan through the entire file. Direct organization is suitable for applications that require frequent random access and updates.

5. What is hashed file organization?

Ans. Hashed file organization is a technique that uses a hash function to determine the location of records in a file. The hash function converts the key value of a record into a physical address, allowing for direct access to the record. Hashed organization provides fast access to records but may lead to collisions if two records have the same hash value.

About this Document

4.85/5 Rating

Sep 28, 2025 Last updated
