If you have been jumping around trying to find a simple, straightforward explanation of SQL Server isolation levels or simply trying to understand what “snapshot isolation” is, fear no more, hopefully you are in the right place to get even more confused. The reality is, there is not simple explanation for database isolation levels as its understanding is associated to a whole bunch of different concepts. What I’ll do for you in this post is to disclose, and try to explain in simple terms, all related concepts to database isolation levels, and hopefully it should be enough for you to have a good grasp of this concept. Note that even though this is a SQL Server post the same concepts apply for most RDBMSs.
In one sentence, the concepts associated to database isolation levels are transactions, ACID properties, concurrency, optimistic concurrency, pessimistic concurrency, read phenomena, locking, and multiversion (row versioning). In this post I’ll explain briefly each concept and hopefully they should be enough for you to understand what isolation levels mean in the database world.
If you don’t know what a database transaction is, seriously consider to find a different kind of job, kidding… A transaction can be defined as a set of one or more operations applied to a database as one logical operation. The topic of transactions could have another whole post on its own; however, for now it is enough for you to know that a transaction can be something as simple as a select operation or something as complex as 1000 lines of T-SQL statements doing different kind of insertions and updates across different tables, the important bit here is to understand that all operations are applied as one logical change.
To learn more about transactions go to http://technet.microsoft.com/en-us/library/ms172400(v=sql.105).aspx
Atomicity, Consistency, Isolation, and Duration (ACID) are four fundamental properties of transactions. First, atomicity is about transactions performing all changes as if they were a single operation; that is, either all changes are performed or all of them are reverted. Second, consistency states that only valid data will be written to the database. For instance, a transaction cannot finish successfully if it has inserted a duplicated primary key. Third, isolation is about transactions no impacting each other execution; that is, there cannot be more than one operation modifying the same data at the same time. Finally, durability states that once transactions complete their operations successfully, changes are to be stored permanently and cannot be undone.
Concurrency, under our context, means to allow multiple operations on an object at the same time. This concept is directly related to the isolation property of ACID as the degree of concurrency allowed by the database determines the levels/types of isolation that can be used by transactions.
You may be wondering how can this be implemented? The answer is not simple. For example, on a hypothetical read-only table a database engine can allow 5 users to run different select statements at the same time as it is able to provide all users with the result concurrently (all users are reading); however, this is not the case for 4 users running selects statements and 1 user running an update statement. In this last case, the database system must find a way to either process all select statements first and then process the update statement or vice versa, but certainly the database cannot allow write and read operations at the same time on the same object as the end result would be unknown. This process is known as concurrency control and there are two types, pessimistic and optimistic.
This type of concurrency control assumes that most operations won’t be accessing objects concurrently; therefore, operations do not perform any kind of check or validation before executing. This improves on speed and resource usage but it does not guarantees the correct concurrent access to objects. If for whatever reason operations try to access and modify the same resource at the same time then these fail and should be repeated. Simply put, locks are not used and updating (writing data) conflicts are solved by rolling back operations.
This mechanism of concurrency control assumes that most operations will be accessing objects concurrently; therefore, operations perform validations and checks to make sure access to resources is sequential which assures adequate concurrent access to resources. If for whatever reason operations try to access and modify the same resource at the same time then one operation waits until the other finishes. Overall, this mechanism can assure correct concurrent access to resources but suffers from extra overhead and deadlocks.
This is the fancy name given to a set of common issues that happen with RDBMs and concurrent access.
Locking and Row versioning (multiversioning) are mechanisms to ensure transactional integrity and data consistency when multiple operations are accessing the same resource concurrently.
Is the process of reserving the set of resources or objects required by a transaction to perform its operations correctly. Pessimistic concurrency is implemented through locking. When objects are locked they cannot be accessed by other processes. There are different types of locks that determine what’s being locked. Locking and types of lock will be discusses in a subsequent post.
Multiversion (Row versioning)
Also known as optimistic concurrency control is a mechanism that does not require locks. Before data are modified, copies are created and made available to be used whenever concurrent access is required. For example, with row versioning when a statement SR needs to read a table T and a statement SW needs to write to the table T then SR reads from the latest data copy available from T and SW writes directly to T, hence reads do not block writes and vice versa. Note that SW (and only SW as it’s the writing statement) still requires locks to prevent any other statement from modifying T at the same time.
Note that both, Locking and Row versioning, are different concurrency control approaches to implement the same ACID property: isolation. Also, remember that a transaction is deemed to have been processed successfully if and only if it complies with the four fundamental properties of Atomicity, Consistency, Isolation, and Durability.
As with most concepts in Computer Science, there is a trade-off between Locking and Multiversion. On the one hand, locking means extra overhead to setup and manage locks. Being a pessimistic concurrency control, in this approach locks have to be set and maintained for every operation (read/write). On the other hand, multiversion, being an optimistic approach, does not require locks, hence, do not suffers from deadlocks but it does require extra temporary space to store the “versioned” values and does not guarantee successful concurrent access to resources.
If you are really interested about row versioning the following document is a MUST read:
SQL Server 2005 Row Versioning-Based Transaction Isolation
Locking Mechanisms (pessimistic concurrency control)
Allows dirty reads
Prevents dirty reads but suffers from not repeatable reads
Prevents dirty reads, and not repeatable reads but suffers from phantom reads.
Multiversion Mechanisms (optimistic concurrency control)
Read committed snapshot
Does optimistic reads and pessimistic writes. Optimistic reads are achieved through row versioning and pessimistic writes through locking (see above).
Does optimistic reads and optimistic writes.
(Note that by default SQL Server uses read committed)
That’s it. I hope this basic information helped you to understand SQL Server Isolation Levels. I know, you may be thinking, after reading all this post is that all you have to say about isolation levels? Remember the idea of this post was to disclose the concepts related to isolation levels. Based on this information it is entirely up to you, as developer, to decide the best isolation mechanism to use in your queries according to your specific problems. Hopefully, in a different post I’ll be discussing the internals and specifics of each isolation level in SQL Server and providing some guidance as to when to use them.