分类: Oracle
2010-08-14 21:23:50
Most DBMSs offer a number of transaction isolation levels which control the degree of locking which occurs when selecting data. For many database applications, the majority of database transactions can be constructed in such a way as to not require high isolation levels, thus reducing the locking overhead for the system. The programmer must carefully analyze database access code to ensure that any relaxation of isolation does not cause difficult-to-find software bugs. Conversely, at higher isolation levels the possibility of is increased, which also requires careful analysis and programming techniques to avoid.
The isolation levels defined by the / standard are:
This isolation level specifies that all transactions occur in a completely isolated fashion; i.e., as if all transactions in the system had executed serially, one after the other. The DBMS may execute two or more transactions at the same time only if the illusion of serial execution can be maintained.
However, many databases (e.g. Oracle , PostgreSQL ) do not guarantee that there is some serial ordering of the transactions that will result in the same outcome, instead implementing . This explicitly contradicts the ANSI/ISO SQL 99 standard (see ISO/IEC9075-2:1999(E), page 83). Other databases (MS SQL Server ) support both serializable and snapshot isolation modes.
With a lock-based concurrency control DBMS implementation, requires that range locks are acquired when a query uses a ranged WHERE clause. When using non-lock concurrency control, no lock is acquired; however, if the system detects a concurrent transaction in progress which would violate the serializability illusion, it must force that transaction to , and the application will have to restart the transaction.
All data records read by a statement cannot be changed; however, if the SELECT statement contains any ranged WHERE clauses, can occur. In this isolation level, the transaction acquires read locks on all retrieved data, but does not acquire range locks.
Data records retrieved by a query are not prevented from modification by some other transactions. may occur, meaning data retrieved in a statement may be modified by some other transaction when it commits. In this isolation level, read locks are acquired on selected data but they are released immediately whereas write locks are released at the end of the transaction.
In this isolation level, dirty reads are allowed. One transaction may see uncommitted changes made by some other transaction.
The default isolation level of different varies quite widely. Most databases which feature transactions allow the user to set any isolation level. Some DBMSs also require additional syntax when performing a SELECT statement which is to acquire locks.
However, the definitions above have been criticised in the paper as being ambiguous, and as not accurately reflecting the isolation provided by many databases:
In these examples two transactions take place. In the first transaction, Query 1 is performed. Then, in the second transaction, Query 2 is performed and committed. Finally, in the first transaction, Query 1 is performed again.
The queries use the following data table.
id | name | age |
---|---|---|
1 | Joe | 20 |
2 | Jill | 25 |
A phantom read occurs when, in the course of a transaction, two identical queries are executed, and the collection of rows returned by the second query is different from the first. This can occur when range locks are not acquired on performing a SELECT.
Transaction 1 | Transaction 2 |
---|---|
/* Query 1 */ |
|
/* Query 2 */ |
|
/* Query 1 */ |
Note that transaction 1 executed the same query twice. If the highest level of isolation were maintained, the same set of rows should be returned both times, and indeed that is what is mandated to occur in a database operating at the SQL SERIALIZABLE isolation level. However, at the lesser isolation levels, a different set of rows may be returned the second time.
In the SERIALIZABLE isolation mode, Query 1 would result in all records with age in the range 10 to 30 being locked, thus Query 2 would block until the first transaction was committed. In REPEATABLE READ mode, the range would not be locked, allowing the record to be inserted and the second execution of Query 1 to return the new row in its results.
Also known as Inconsistent Analysis, non-repeatable reads may occur in a lock-based concurrency control method when read locks are not acquired when performing a SELECT. Under multiversion concurrency control, non-repeatable reads may occur when the requirement that a transaction affected by a must roll back is relaxed.
Transaction 1 | Transaction 2 |
---|---|
/* Query 1 */ |
|
/* Query 2 */ |
|
/* Query 1 */ |
In this example, Transaction 2 commits successfully, which means that its changes to the row with id 1 should become visible. However, Transaction 1 has already seen a different value for age in that row. At the SERIALIZABLE and REPEATABLE READ isolation level, the DBMS must return the old value. At READ COMMITTED and READ UNCOMMITTED, the DBMS may return the updated value; this is a non-repeatable read.
There are two basic strategies used to prevent non-repeatable reads. The first is to delay the execution of Transaction 2 until Transaction 1 has committed or rolled back. This method is used when locking is used, and produces the serial T1, T2. A serial schedule does not exhibit non-repeatable reads.
In the other strategy, which is used in multiversion concurrency control, Transaction 2 is permitted to commit first, which provides for better concurrency. However, Transaction 1, which commenced prior to Transaction 2, must continue to operate on a past version of the database — a snapshot of the moment it was started. When Transaction 1 eventually tries to commit, the DBMS looks to see if the result of committing Transaction 1 would be equivalent to the schedule T1, T2. If it is, then Transaction 1 can succeed. If it cannot be seen to be equivalent, however, Transaction 1 must roll back with a serialization failure.
Using a lock-based concurrency control method, at the REPEATABLE READ isolation mode, the row with ID = 1 would be locked, thus blocking Query 2 until the first transaction was committed or rolled back. In READ COMMITTED mode the second time Query 1 was executed the age would have changed.
Under multiversion concurrency control, at the SERIALIZABLE isolation level, both SELECT queries see a snapshot of the database taken at the start of Transaction 1. Therefore, they return the same data. However, if Transaction 1 were then to attempt to UPDATE that row as well, a serialization failure would occur and Transaction 1 would be forced to roll back.
At the READ COMMITTED isolation level, each query sees a snapshot of the database taken at the start of each query. Therefore, they each see different data for the updated row. No serialization failure is possible in this mode (because no promise of serializability is made) and Transaction 1 will not have to be retried.
A dirty read occurs when a transaction reads data from a row that has been modified by another transaction, but not yet committed.
Dirty reads work similarly to non-repeatable reads, however the second transaction would not need to be committed for the first query to return a different result. The only thing that may be prevented in the READ UNCOMMITTED mode is that updates appearing in the results out of order; that is, earlier updates will always appear in a result set before later updates.
Transaction 2 has changed a row, but has not committed the changes. Transaction 1 reads the uncommitted data but his view of the data may be wrong if Transaction 2 rolls back his changes and updates his own changes to the database.
Transaction 1 | Transaction 2 |
---|---|
/* Query 1 */ |
|
/* Query 2 */ |
|
/* Query 1 */ |
|
ROLLBACK; /* lock-based DIRTY READ */ |
Isolation level | Dirty reads | Non-repeatable reads | Phantoms |
---|---|---|---|
Read uncommited | X | X | X |
Read Committed | - | X | X |
Repeatable Read | - | - | X |
Serializable | - | - | - |
"X" means that the isolation level has the problem, while "-" means that it does not have the problem.
Range Lock | Read Lock | Write Lock | |
---|---|---|---|
Read Uncommited | X | X | X |
Read Committed | X | X | V |
Repeatable Read | X | V | V |
Serializable | V | V | V |
V means the method locks for the operation.