Database Recovery Techniques | Database Management System (DBMS) - Computer Science Engineering (CSE) PDF Download

Database Recovery Techniques in DBMS

Database systems, like any other computer system, are subject to failures but the data stored in it must be available as and when required. When a database fails it must possess the facilities for fast recovery. It must also have atomicity i.e. either transactions are completed successfully and committed (the effect is recorded permanently in the database) or the transaction should have no effect on the database.
There are both automatic and non-automatic ways for both, backing up of data and recovery from any failure situations. The techniques used to recover the lost data due to system crash, transaction errors, viruses, catastrophic failure, incorrect commands execution etc. are database recovery techniques. So to prevent data loss recovery techniques based on deferred update and immediate update or backing up data can be used.
Recovery techniques are heavily dependent upon the existence of a special file known as a system log. It contains information about the start and end of each transaction and any updates which occur in the transaction. The log keeps track of all transaction operations that affect the values of database items. This information is needed to recover from transaction failure.

  • The log is kept on disk start_transaction(T): This log entry records that transaction T starts the execution.
  • read_item(T, X): This log entry records that transaction T reads the value of database item X.
  • write_item(T, X, old_value, new_value): This log entry records that transaction T changes the value of the database item X from old_value to new_value. The old value is sometimes known as a before an image of X, and the new value is known as an afterimage of X.
  • commit(T): This log entry records that transaction T has completed all accesses to the database successfully and its effect can be committed (recorded permanently) to the database.
  • abort(T): This records that transaction T has been aborted.
  • checkpoint: Checkpoint is a mechanism where all the previous logs are removed from the system and stored permanently in a storage disk. Checkpoint declares a point before which the DBMS was in consistent state, and all the transactions were committed.

A transaction T reaches its commit point when all its operations that access the database have been executed successfully i.e. the transaction has reached the point at which it will not abort (terminate without completing). Once committed, the transaction is permanently recorded in the database. Commitment always involves writing a commit entry to the log and writing the log to disk. At the time of a system crash, item is searched back in the log for all transactions T that have written a start_transaction(T) entry into the log but have not written a commit(T) entry yet; these transactions may have to be rolled back to undo their effect on the database during the recovery process

  • Undoing: If a transaction crashes, then the recovery manager may undo transactions i.e. reverse the operations of a transaction. This involves examining a transaction for the log entry write_item(T, x, old_value, new_value) and setting the value of item x in the database to old-value.There are two major techniques for recovery from non-catastrophic transaction failures: deferred updates and immediate updates.
  • Deferred update: This technique does not physically update the database on disk until a transaction has reached its commit point. Before reaching commit, all transaction updates are recorded in the local transaction workspace. If a transaction fails before reaching its commit point, it will not have changed the database in any way so UNDO is not needed. It may be necessary to REDO the effect of the operations that are recorded in the local transaction workspace, because their effect may not yet have been written in the database. Hence, a deferred update is also known as the No-undo/redo algorithm
  • Immediate update: In the immediate update, the database may be updated by some operations of a transaction before the transaction reaches its commit point. However, these operations are recorded in a log on disk before they are applied to the database, making recovery still possible. If a transaction fails to reach its commit point, the effect of its operation must be undone i.e. the transaction must be rolled back hence we require both undo and redo. This technique is known as undo/redo algorithm.
  • Caching/Buffering:  In this one or more disk pages that include data items to be updated are cached into main memory buffers and then updated in memory before being written back to disk. A collection of in-memory buffers called the DBMS cache is kept under control of DBMS for holding these buffers. A directory is used to keep track of which database items are in the buffer. A dirty bit is associated with each buffer, which is 0 if the buffer is not modified else 1 if modified.
  • Shadow paging: It provides atomicity and durability. A directory with n entries is constructed, where the ith entry points to the ith database page on the link. When a transaction began executing the current directory is copied into a shadow directory. When a page is to be modified, a shadow page is allocated in which changes are made and when it is ready to become durable, all pages that refer to original are updated to refer new replacement page.

Some of the backup techniques are as follows

  • Full database backup: In this full database including data and database, Meta information needed to restore the whole database, including full-text catalogs are backed up in a predefined time series.
  • Differential backup: It stores only the data changes that have occurred since last full database backup. When same data has changed many times since last full database backup, a differential backup stores the most recent version of changed data. For this first, we need to restore a full database backup.
  • Transaction log backup: In this, all events that have occurred in the database, like a record of every single statement executed is backed up. It is the backup of transaction log entries and contains all transaction that had happened to the database. Through this, the database can be recovered to a specific point in time. It is even possible to perform a backup from a transaction log if the data files are destroyed and not even a single committed transaction is lost.

Log based Recovery in DBMS

Atomicity property of DBMS states that either all the operations of transactions must be performed or none. The modifications done by an aborted transaction should not be visible to database and the modifications done by committed transaction should be visible.
To achieve our goal of atomicity, user must first output to stable storage information describing the modifications, without modifying the database itself. This information can help us ensure that all modifications performed by committed transactions are reflected in the database. This information can also help us ensure that no modifications made by an aborted transaction persist in the database.

Log and log records

The log is a sequence of log records, recording all the update activities in the database. In a stable storage, logs for each transaction are maintained. Any operation which is performed on the database is recorded is on the log. Prior to performing any modification to database, an update log record is created to reflect that modification.

An update log record represented as: <Ti, Xj, V1, V2> has these fields:

  1. Transaction identifier: Unique Identifier of the transaction that performed the write operation.
  2. Data item: Unique identifier of the data item written.
  3. Old value: Value of data item prior to write.
  4. New value: Value of data item after write operation.

Other type of log records are:

  1. <Ti start>: It contains information about when a transaction Ti starts.
  2. <Ti commit>: It contains information about when a transaction Ti commits.
  3. <Ti abort>: It contains information about when a transaction Ti aborts.

Undo and Redo Operations

Because all database modifications must be preceded by creation of log record, the system has available both the old value prior to modification of data item and new value that is to be written for data item. This allows system to perform redo and undo operations as appropriate:

  1. Undo: using a log record sets the data item specified in log record to old value.
  2. Redo: using a log record sets the data item specified in log record to new value.

The database can be modified using two approaches

  1. Deferred Modification Technique: If the transaction does not modify the database until it has partially committed, it is said to use deferred modification technique.
  2. Immediate Modification Technique: If database modification occur while transaction is still active, it is said to use immediate modification technique.

Recovery using Log records

After a system crash has occurred, the system consults the log to determine which transactions need to be redone and which need to be undone.

  1. Transaction Ti needs to be undone if the log contains the record <Ti start> but does not contain either the record <Ti commit> or the record <Ti abort>.
  2. Transaction Ti needs to be redone if log contains record <Ti start> and either the record <Ti commit> or the record <Ti abort>.

Use of Checkpoints

When a system crash occurs, user must consult the log. In principle, that need to search the entire log to determine this information. There are two major difficulties with this approach:

  1. The search process is time-consuming.
  2. Most of the transactions that, according to our algorithm, need to be redone have already written their updates into the database. Although redoing them will cause no harm, it will cause recovery to take longer.

To reduce these types of overhead, user introduce checkpoints. A log record of the form <checkpoint L> is used to represent a checkpoint in log where L is a list of transactions active at the time of the checkpoint. When a checkpoint log record is added to log all the transactions that have committed before this checkpoint have <Ti commit> log record before the checkpoint record. Any database modifications made by Ti is written to the database either prior to the checkpoint or as part of the checkpoint itself. Thus, at recovery time, there is no need to perform a redo operation on Ti.
After a system crash has occurred, the system examines the log to find the last <checkpoint L> record. The redo or undo operations need to be applied only to transactions in L, and to all transactions that started execution after the record was written to the log. Let us denote this set of transactions as T. Same rules of undo and redo are applicable on T as mentioned in Recovery using Log records part.

Note that user need to only examine the part of the log starting with the last checkpoint log record to find the set of transactions T, and to find out whether a commit or abort record occurs in the log for each transaction in T. For example, consider the set of transactions {T0, T1, . . ., T100}. Suppose that the most recent checkpoint took place during the execution of transaction T67 and T69, while T68 and all transactions with subscripts lower than 67 completed before the checkpoint. Thus, only transactions T67, T69, . . ., T100 need to be considered during the recovery scheme. Each of them needs to be redone if it has completed (that is, either committed or aborted); otherwise, it was incomplete, and needs to be undone.

The document Database Recovery Techniques | Database Management System (DBMS) - Computer Science Engineering (CSE) is a part of the Computer Science Engineering (CSE) Course Database Management System (DBMS).
All you need of Computer Science Engineering (CSE) at this link: Computer Science Engineering (CSE)
62 videos|66 docs|35 tests

Top Courses for Computer Science Engineering (CSE)

FAQs on Database Recovery Techniques - Database Management System (DBMS) - Computer Science Engineering (CSE)

1. What are the common causes of database failures that require recovery techniques?
Ans. Common causes of database failures that require recovery techniques include system crashes, software bugs, power outages, hardware failures, and human errors. These events can lead to data corruption or loss, making it necessary to recover the database to a consistent state.
2. What is the purpose of database recovery techniques in DBMS?
Ans. The purpose of database recovery techniques in DBMS is to ensure data consistency and durability in the event of failures. These techniques help recover the database to a consistent state, where all committed transactions are reflected correctly, and any uncommitted or incomplete transactions are rolled back or undone.
3. What are the different types of database recovery techniques?
Ans. The different types of database recovery techniques include redo logging, undo logging, checkpointing, shadow paging, and write-ahead logging (WAL). Redo logging records the changes made to the database and re-applies them after a failure, while undo logging undoes any incomplete transactions. Checkpointing involves periodically saving the database state to reduce recovery time. Shadow paging maintains multiple versions of the database, allowing recovery by simply switching to a consistent version. WAL ensures that changes are written to a log before modifying the actual database.
4. What is the role of transaction logs in database recovery techniques?
Ans. Transaction logs play a crucial role in database recovery techniques. They record all the changes made to the database, including both committed and uncommitted transactions. In the event of a failure, the transaction logs are used to roll back any uncommitted or incomplete transactions and re-apply the committed transactions to bring the database back to a consistent state.
5. How does write-ahead logging (WAL) ensure database consistency during recovery?
Ans. Write-ahead logging (WAL) ensures database consistency during recovery by guaranteeing that changes are first written to a log before modifying the actual database. This means that before any modification is made to the database, the corresponding log records are written and flushed to disk. In the event of a failure, the log records can be used to re-apply the changes to the database, ensuring that all committed transactions are reflected correctly. This technique prevents data corruption and maintains the consistency of the database.
62 videos|66 docs|35 tests
Download as PDF
Explore Courses for Computer Science Engineering (CSE) exam

Top Courses for Computer Science Engineering (CSE)

Signup for Free!
Signup to see your scores go up within 7 days! Learn & Practice with 1000+ FREE Notes, Videos & Tests.
10M+ students study on EduRev
Related Searches

past year papers

,

study material

,

Semester Notes

,

shortcuts and tricks

,

Previous Year Questions with Solutions

,

mock tests for examination

,

Database Recovery Techniques | Database Management System (DBMS) - Computer Science Engineering (CSE)

,

Free

,

Viva Questions

,

Database Recovery Techniques | Database Management System (DBMS) - Computer Science Engineering (CSE)

,

ppt

,

Exam

,

MCQs

,

Extra Questions

,

Sample Paper

,

Summary

,

Objective type Questions

,

practice quizzes

,

Database Recovery Techniques | Database Management System (DBMS) - Computer Science Engineering (CSE)

,

video lectures

,

Important questions

,

pdf

;