ACID Properties
A transaction is a single logical unit of work which accesses and possibly modifies the contents of a database. Transactions access data using read and write operations.
In order to maintain consistency in a database, before and after transaction, certain properties are followed. These are called ACID properties.
Atomicity
By this, we mean that either the entire transaction takes place at once or doesn’t happen at all. There is no midway i.e. transactions do not occur partially. Each transaction is considered as one unit and either runs to completion or is not executed at all. It involves following two operations.
—Abort: If a transaction aborts, changes made to database are not visible.
—Commit: If a transaction commits, changes made are visible.
Atomicity is also known as the ‘All or nothing rule’.
Consider the following transaction T consisting of T1 and T2: Transfer of 100 from account X to account Y.
If the transaction fails after completion of T1 but before completion of T2.( say, after write(X) but before write(Y)), then amount has been deducted from X but not added to Y. This results in an inconsistent database state. Therefore, the transaction must be executed in entirety in order to ensure correctness of database state.
Consistency
This means that integrity constraints must be maintained so that the database is consistent before and after the transaction. It refers to correctness of a database. Referring to the example above,
The total amount before and after the transaction must be maintained.
Total before T occurs = 500 + 200 = 700.
Total after T occurs = 400 + 300 = 700.
Therefore, database is consistent. Inconsistency occurs in case T1 completes but T2 fails. As a result T is incomplete.
Isolation
This property ensures that multiple transactions can occur concurrently without leading to inconsistency of database state. Transactions occur independently without interference. Changes occurring in a particular transaction will not be visible to any other transaction until that particular change in that transaction is written to memory or has been committed. This property ensures that the execution of transactions concurrently will result in a state that is equivalent to a state achieved these were executed serially in some order.
Let X= 500, Y = 500.
Consider two transactions T and T”.
Suppose T has been executed till Read (Y) and then T’’ starts. As a result , interleaving of operations takes place due to which T’’ reads correct value of X but incorrect value of Y and sum computed by
T’’: (X+Y = 50, 000+500=50, 500)
is thus not consistent with the sum at end of transaction:
T: (X+Y = 50, 000 + 450 = 50, 450).
This results in database inconsistency, due to a loss of 50 units. Hence, transactions must take place in isolation and changes should be visible only after a they have been made to the main memory.
Durability:
This property ensures that once the transaction has completed execution, the updates and modifications to the database are stored in and written to disk and they persist even is system failure occurs. These updates now become permanent and are stored in a non-volatile memory. The effects of the transaction, thus, are never lost.
The ACID properties, in totality, provide a mechanism to ensure correctness and consistency of a database in a way such that each transaction is a group of operations that acts a single unit, produces consistent results, acts in isolation from other operations and updates that it makes are durably stored.
Concurrency Control -Introduction
Concurrency Control deals with interleaved execution of more than one transaction. In the next article, we will see what is serializability and how to find whether a schedule is serializable or not.
What is Transaction?
A set of logically related operations is known as transaction. The main operations of a transaction are:
Read(A): Read operations Read(A) or R(A) reads the value of A from the database and stores it in a buffer in main memory.
Write (A): Write operation Write(A) or W(A) writes the value back to the database from buffer.
Let us take a debit transaction from an account which consists of following operations:
Assume A’s value before starting of transaction is 5000.
But it may also be possible that transaction may fail after executing some of its operations. The failure can be because of hardware, software or power etc. For example, if debit transaction discussed above fails after executing operation 2, the value of A will remain 5000 in the database which is not acceptable by the bank. To avoid this, Database has two important operations:
Commit: After all instructions of a transaction are successfully executed, the changes made by transaction are made permanent in the database.
Rollback: If a transaction is not able to execute all operations successfully, all the changes made by transaction are undone.
Properties of a transaction
Atomicity: As a transaction is set of logically related operations, either all of them should be executed or none. A debit transaction discussed above should either execute all three operations or none.If debit transaction fails after executing operation 1 and 2 then its new value 4000 will not be updated in the database which leads to inconsistency.
Consistency: If operations of debit and credit transactions on same account are executed concurrently, it may leave database in an inconsistent state.
Table 1
Isolation: Result of a transaction should not be visible to others before transaction is committed. For example, Let us assume that A’s balance is Rs. 5000 and T1 debits Rs. 1000 from A. A’s new balance will be 4000. If T2 credits Rs. 500 to A’s new balance, A will become 4500 and after this T1 fails. Then we have to rollback T2 as well because it is using value produced by T1. So a transaction results are not made visible to other transactions before it commits.
Durable: Once database has committed a transaction, the changes made by the transaction should be permanent. e.g.; If a person has credited $500000 to his account, bank can’t say that the update has been lost. To avoid this problem, multiple copies of database are stored at different locations.
What is a Schedule?
A schedule is series of operations from one or more transactions. A schedule can be of two types:
Question: Consider the following transaction involving two bank accounts x and y:
The constraint that the sum of the accounts x and y should remain constant is that of?
Solution: As discussed in properties of transactions, consistency properties says that sum of accounts x and y should remain constant before starting and after completion of transaction. So, the correct answer is B.
Next article- Serializability of Schedules
How to test if two schedules are View Equal or not ?
Two schedules S1 and S2 are said to be view equal iff following below conditions are satisfied :
1) Initial Read
If a transaction T1 reading data item A from initial database in S1 then in S2 also T1 should read A from initial database.
Transaction T2 is reading A form initial database.
2) Updated Read
If Ti is reading A which is updated by Tj in S1 then in S2 also Ti should read A which is updated by Tj.
Above two schedule are not view equal as in S1 :T3 is reading A updated by T2, in S2 T3 is reading A updated by T1.
3) Final Write operation
If a transaction T1 updated A at last in S1, then in S2 also T1 should perform final write operations.
Above two schedule are not view as Final write operation in S1 is done by T1 while in S2 done by T2.
View Serializability: A Schedule is called view serializable if it is view equal to a serial schedule (no overlapping transactions).
Conflict Serializability
As discussed in Concurrency control, serial schedules have less resource utilization and low throughput. To improve it, two are more transactions are run concurrently. But concurrency of transactions may lead to inconsistency in database. To avoid this, we need to check whether these concurrent schedules are serializable or not.
Conflict Serializable: A schedule is called conflict serializable if it can be transformed into a serial schedule by swapping non-conflicting operations.
Conflicting operations: Two operations are said to be conflicting if all conditions satisfy:
Example: –
Consider the following schedule:
S1: R1(A), W1(A), R2(A), W2(A), R1(B), W1(B), R2(B), W2(B)
If Oi and Oj are two operations in a transaction and Oi< Oj (Oi is executed before Oj), same order will follow in schedule as well. Using this property, we can get two transactions of schedule S1 as:
T1: R1(A), W1(A), R1(B), W1(B)
T2: R2(A), W2(A), R2(B), W2(B)
Possible Serial Schedules are: T1->T2 or T2->T1
-> Swapping non-conflicting operations R2(A) and R1(B) in S1, the schedule becomes,
S11: R1(A), W1(A), R1(B), W2(A), R2(A), W1(B), R2(B), W2(B)
-> Similarly, swapping non-conflicting operations W2(A) and W1(B) in S11, the schedule becomes,
S12: R1(A), W1(A), R1(B), W1(B), R2(A), W2(A), R2(B), W2(B)
S12 is a serial schedule in which all operations of T1 are performed before starting any operation of T2. Since S has been transformed into a serial schedule S12 by swapping non-conflicting operations of S1, S1 is conflict serializable.
Let us take another Schedule:
S2: R2(A), W2(A), R1(A), W1(A), R1(B), W1(B), R2(B), W2(B)
Two transactions will be:
T1: R1(A), W1(A), R1(B), W1(B)
T2: R2(A), W2(A), R2(B), W2(B)
Possible Serial Schedules are: T1->T2 or T2->T1
Original Schedule is:
S2: R2(A), W2(A), R1(A), W1(A), R1(B), W1(B), R2(B), W2(B)
Swapping non-conflicting operations R1(A) and R2(B) in S2, the schedule becomes,
S21: R2(A), W2(A), R2(B), W1(A), R1(B), W1(B), R1(A), W2(B)
Similarly, swapping non-conflicting operations W1(A) and W2(B) in S21, the schedule becomes,
S22: R2(A), W2(A), R2(B), W2(B), R1(B), W1(B), R1(A), W1(A)
In schedule S22, all operations of T2 are performed first, but operations of T1 are not in order (order should be R1(A), W1(A), R1(B), W1(B)). So S2 is not conflict serializable.
Conflict Equivalent: Two schedules are said to be conflict equivalent when one can be transformed to another by swapping non-conflicting operations. In the example discussed above, S11 is conflict equivalent to S1 (S1 can be converted to S11 by swapping non-conflicting operations). Similarly, S11 is conflict equivalent to S12 and so on.
Note 1: Although S2 is not conflict serializable, but still it is conflict equivalent to S21 and S21 because S2 can be converted to S21 and S22 by swapping non-conflicting operations.
Note 2: The schedule which is conflict serializable is always conflict equivalent to one of the serial schedule. S1 schedule discussed above (which is conflict serializable) is equivalent to serial schedule (T1->T2).
Question: Consider the following schedules involving two transactions. Which one of the following statement is true?
S1: R1(X) R1(Y) R2(X) R2(Y) W2(Y) W1(X)
S2: R1(X) R2(X) R2(Y) W2(Y) R1(Y) W1(X)
Solution: Two transactions of given schedules are:
T1: R1(X) R1(Y) W1(X)
T2: R2(X) R2(Y) W2(Y)
Let us first check serializability of S1:
S1: R1(X) R1(Y) R2(X) R2(Y) W2(Y) W1(X)
To convert it to a serial schedule, we have to swap non-conflicting operations so that S1 becomes equivalent to serial schedule T1->T2 or T2->T1. In this case, to convert it to a serial schedule, we must have to swap R2(X) and W1(X) but they are conflicting. So S1 can’t be converted to a serial schedule.
Now, let us check serializability of S2:
S2: R1(X) R2(X) R2(Y) W2(Y) R1(Y) W1(X)
Swapping non conflicting operations R1(X) and R2(X) of S2, we get
S2’: R2(X) R1(X) R2(Y) W2(Y) R1(Y) W1(X)
Again, swapping non conflicting operations R1(X) and R2(Y) of S2’, we get
S2’’: R2(X) R2(Y) R1(X) W2(Y) R1(Y) W1(X)
Again, swapping non conflicting operations R1(X) and W2(Y) of S2’’, we get
S2’’’: R2(X) R2(Y) W2(Y) R1(X) R1(Y) W1(X)
which is equivalent to a serial schedule T2->T1.
So, correct option is C. Only S2 is conflict serializable.
Recoverability of Schedules
We have discussed the basics of Transactions and Schedules in Concurrency Control (Introduction) article. As discussed, a transaction may not execute completely due to hardware failure, system crash or software issues. In that case, we have to rollback the failed transaction. But some other transaction may also have used values produced by failed transaction. So we have to rollback those transactions as well.
Above table shows a schedule with two transactions, T1 reads and writes A and that value is read and written by T2. T2 commits. But later on, T1 fails. So we have to rollback T1. Since T2 has read the value written by T1, it should also be rollbacked. But we have already committed that. So this schedule is irrecoverable schedule.
Irrecoverable Schedule: When Tj is reading the value updated by Ti and Tj is committed before commit of Ti, the schedule will be irrecoverable.
Table 2 shows a schedule with two transactions, T1 reads and writes A and that value is read and written by T2. But later on, T1 fails. So we have to rollback T1. Since T2 has read the value written by T1, it should also be rollbacked. As it has not committed, we can rollback T2 as well. So it is recoverable with cascading rollback.
Recoverable with cascading rollback: If Tj is reading value updated by Ti and commit of Tj is delayed till commit of Ti , the schedule is called recoverable with cascading rollback.
Table 3 shows a schedule with two transactions, T1 reads and writes A and commits and that value is read by T2. But if T1 fails before commit, no other transaction has read its value, so there is no need to rollback other transaction. So this is a cascadeless recoverable schedule.
Cascadeless Recoverable: If Tj reads value updated by Ti only after Ti is commited, the schedule will be cascadeless recoverable.
Question: Which of the following scenarios may lead to an irrecoverable error in a database system?
(A) A transaction writes a data item after it is read by an uncommitted transaction.
(B) A transaction reads a data item after it is read by an uncommitted transaction.
(C) A transaction reads a data item after it is written by a committed transaction.
(D) A transaction reads a data item after it is written by an uncommitted transaction.
Answer: See the example discussed in Table 1, a transaction is reading a data item after it is written by an uncommitted transaction, the schedule will be irrecoverable.
62 videos|66 docs|35 tests
|
1. What is a transaction in computer science engineering? |
2. What is concurrency control in computer science engineering? |
3. How does locking help in concurrency control? |
4. What is the role of timestamps in concurrency control? |
5. What is optimistic concurrency control? |
|
Explore Courses for Computer Science Engineering (CSE) exam
|