A New Basis for an SQL Isolation Level Standard |
||
Atul Adya |
Barbara Liskov |
Patrick O' Neil |
Microsoft Research |
Lab. for Comp. Science., MIT |
Umass/Boston |
Abstract
Isolation levels are provided by all commercial database systems to allow programmers to trade off consistency for a potential gain in performance. The current ANSI SQL standard provides isolation level definitions, but the definitions have been shown to be ambiguous and locking-centric, and no new definitions have been proposed to correct the problem. We have developed a new characterization for "portable" isolation levels: definitions that apply not only to locking implementations, but also to optimistic and multi-version concurrency control schemes. Unlike earlier definitions, our specifications handle predicates correctly at all isolation levels. Using our definitions we can also characterize Cursor Stability and Snapshot Isolation, which are both supported by commercial systems, and a new level called PL-2+ which is the weakest level that ensures consistent reads. We believe that our isolation level characterization is appropriate as a basis for! ! a new ANSI SQL standard.
1. Introduction and Motivation
The concept of different isolation levels (also called degrees of consistency) was introduced in [GLPT76] and further refined in [Date90] . The definitions in [GLPT76] and [Date90] formed the basis of the ANSI SQL-92 standard [ANSI92], one of whose stated goals was to provide a definition that is implementation-independent. All commercial databases such as SQL Server, IBM DB2, and Oracle, attempt to implement this ANSI standard. However, a paper in SIGMOD-95 [BBG+95] showed that the existing ANSI definitions are ambiguous. That paper proposed different definitions that avoided the ambiguity problems and fixed a few isolation holes, but as stated in [BBG+95] these definitions were "disguised versions of locking" which disallowed optimistic and multi-version mechanisms. Thus, these definitions failed to meet the goals of [ANSI92] for implementation-independence. At present then, we have no definition of isolation levels that is implementation-independe! ! nt. We believe implementation-independence is important because it provides flexibility to implementers, which can lead to more portable code and better performance. For example, optimistic CC can outperform locking in some environments, such as large-scale, wide-area distributed systems [AGLM95, Gru97] and mobile systems. Gemstone [MSOP86] and Oracle [Ora95] provide multi-version optimistic implementations of isolation levels (Oracle provides a variation of Snapshoit Isolation designated "Serializable"). At present, isolation levels such as Cursor Stability and Snapshot Isolation that are supported by commercial systems, do not have implementation-independent specifications.
2. Implementation-Independent Specifications of ANSI Isolation Levels
We have developed new definitions to correct these problems, while trying to support the intent of the specifications in [ANSI92]. For each SQL-92 level, we provide a corresponding portable isolation level that is precise and implementation-independent. Our levels for transactional isolation are called PL-1, PL-2, and PL-3 where PL-3 is the same as serializability. There are also a few levels that fall between PL-2 and PL-3. Details of our definitions have been detailed in [Adya99].
2.1 Features of Our Specifications
Our specifications have the following important attributes:
2.2 Conceptual Basis for our Specifications
Our new definitions are based on the principle that any set of consistency specifications must not allow transactions to observe violations of multi-object constraints; these are invariants of the type x + y = 10, that involve multiple objects. The approach suggested in [BBG95] captures multi-object constraints by disallowing conflicting operations to run concurrently on individual objects, i.e., the conditions are specified in terms of single-object histories. However, optimistic schemes allow conflicting operations to execute simultaneously and still correctly preserve multi-object constraints; the reason is that the consistency checks take all objects that were accessed by the committing transaction into account by avoiding cycles of changes that could not occur in a serial schedule. Thus, any consistency definition that tries to capture such constraints using a fixed number of objects and transactions, as was the case with [ANSI92], will be either i! ! ncorrect or overly restrictive in disallowing valid histories.
We have created our isolation definitions using a combination of constraints on object histories and serialization graphs [BHG87]. These graphs provide a simple way of capturing multi-object constraints. Each node in the graph corresponds to some committed transaction T, and edges are added between nodes corresponding to reads and writes performed by two transactions on the same data item. Some of our conditions are specified in terms of the different types of cycles (involving specified types of conflicts) that must be disallowed in these graphs, e.g., serializability disallows all types of cycles whereas the lowest isolation level disallows only cycles involving updates (and not reads). We also use graphs for defining correctness conditions in mixed systems in which different transactions may commit at different isolation levels.
Consistency conditions using graphs and different types of conflicts and dependencies have appeared multiple times in the literature, to define serializability [BHG87, KSS97,GR93], semantics-based correctness criteria [AAS93], and extended transaction models [CR94]. But our approach is the first that applies these techniques to defining the ANSI isolation level standard in a flexible, implementation-independent manner.
Another virtue of our isolation definitions is that they allow an application to request different isolation guarantees for committed and running transactions. This characteristic provides more flexibility to system builders and allows efficient implementations for various isolation levels. The approach in [BBG+95] requires that a concurrency control scheme provide the same guarantees for running and committed transactions (a lock-based implementation does indeed have this property). Thus, our definitions provide additional flexibility, and this ensures that a wide range of concurrency control mechanisms will be permitted by our isolation specifications. For example, AOCC [Adya94, AGLM95, Gru97] is disallowed by the definitions in [BBG+95] but permitted by our specifications.
In addition to supporting levels specified in [ANSI-92], we have developed definitions to support commercial levels such as Cursor Stability [Date90], Snapshot Isolation [BBG+95], and Oracle’s Read Consistency [Ora95]. We characterize these levels by extending the graphs used for defining the ANSI levels; different types of nodes and edges are added to capture the constraints relevant to each level. These definitions demonstrate that the graph-based approach for specifying isolation levels is flexible; if new isolation levels become important in the future, we believe our graph approach can be used to specify them.
We have also developed two new and useful levels between degree 2 and degree 3 and related them to existing commercial consistency guarantees. Our first level, PL-2+, is the weakest level that ensures that transactions do not observe violated multi-object constraints. However, it allows transactions to update the database in an inconsistent manner (but of course that’s always possible when a lower isolation level allows a reader to see an inconsistent set of values). Thus, PL-2+ lies "halfway" between degrees 2 and 3 since it ensures consistent reads but allows inconsistent writes. Level PL-2+ ensures that a transaction is placed after all transactions that causally affect it, i.e., it provides a notion of "causal consistency". Thus it disallows all the phenomena that Snapshot Isolation was intended to disallow. Since PL-2+ is weaker than Snapshot Isolation, it has the potential of being implemented more efficiently, especially in a distributed client-server system. Th! ! us, PL-2+ may be preferable to Snapshot Isolation in some cases.
Our second new level, PL-2L, captures one of the useful properties of a lock-based implementation of degree 2. It ensures that a transaction observes a monotonically increasing prefix of the database history as it executes, e.g., in an online auction system, if a transaction closes the auction to sell a product and a user transaction observes this closure and then the price of the product, PL-2L will ensure that the user observes the final price of the product. PL-2L can be useful for legacy applications that execute at degree 2 and assume such monotonicity properties are provided by a locking implementation; when the system is changed from locking to a different concurrency control mechanism, PL-2L can be used to ensure that these applications continue to run correctly. An interesting observation about PL-2L is that it is similar to the Read Consistency guarantees provided by the Oracle server [Ora95]. Level PL-2L disallows phenomena that Read Consistency was intended to d! ! isallow. Since PL-2L is weaker than Read Consistency, it may be less expensive to provide PL-2L than Read Consistency, especially in distributed client-server systems.
References
[AAS93] D. Agrawal, A. E. Abbadi, and A. K. Singh. Consistency and Orderability: Semantics-Based Correctness Criteria for Databases. ACM Transactions on Database Systems (TODS), 18(3):460 - 486, Sept. 1993.
[Adya94] A. Adya. Transaction Management for Mobile Objects Using Optimistic Concurrency Control. Master's thesis, Massachusetts Institute of Technology, Jan.1994. Also available as MIT LCS Technical Report MIT/LCS/TR-626.
[Adya99] A. Adya. Weak Consistency: A Generalized Theory and Optimistic Implementations for Distributed Transactions. Ph.D. thesis, Massachusetts Institute of Technology, Cambridge, MA, March 1999.
[AGLM95] A. Adya, R. Gruber, B. Liskov, and U. Maheshwari. Efficient Optimistic Concurrency Control using Loosely Synchronized Clocks. In Proc. of ACM SIGMOD, pages 23 - 34, San Jose, CA, May 1995.
[BBG+ 95] H. Berenson, P. Bernstein, J. Gray, J. Melton, E. O'Neil, and P. O'Neil. A Critique of ANSI SQL Isolation Levels. In Proceedings of ACM SIGMOD, pages 1 - 10, San Jose, CA, May 1995.
[BHG87] P. A. Bernstein, V. Hadzilacos, and N. Goodman. Concurrency Control and Recovery in Database Systems. Addison Wesley, 1987.
[BOS91] P. Butterworth, A. Otis, and J. Stein. The GemStone Object Database Management System. Comm. of the ACM, 34(10):64 -77, October 1991.
[CR94] P. Chrysanthis and K. Ramamritham. Synthesis of Extended Transaction Models using ACTA. ACM Transactions on Database Systems (TODS), 19(3):450 - 491, Sept. 1994.
[Dat90] C. J. Date. An Introduction to Database Systems. Addison-Wesley, Fifth edition, 1990.
[GLPT76] J. Gray, R. Lorie, G. Putzolu, I. Traiger. Granularity of Locks and Degrees of Consistency in a Shared Database. In Modeling in Data Base Management Systems. Amsterdam: Elsevier North-Holland, 1976. Also available in Chapter 3 of Readings in Database Systems, Second Edition, M. Stonebraker Editor, Morgan Kaufmann, 1994.
[Gru97] R. Gruber. Optimism vs. Locking: A Study of Concurrency Control for Client-Server Object-Oriented Databases. Ph.D. thesis, MIT, Cambridge, MA, 1997.
[Ora95] Oracle Corporation. Concurrency Control, Transaction Isolation and Serializability in SQL92 and Oracle7, July 1995.