zbMATH — the first resource for mathematics

Transaction processing: concepts and techniques. (English) Zbl 0781.68006
San Mateo, CA: Morgan Kaufmann. XXXII, 1070 p. (1993).
The authors have written a heavy book: at more than a thousand pages, it is the most comprehensive one written on this subject in the past few years. It is not hard to predict that it will also be among the most influential books in computer science and computer engineering in this decade. There are at least three reasons for this judgement:
First: the book covers the most important challenge of computer science today, namely the design of a software infrastructure as the basis for reliable application development in a distributed environment.
Second: all the essential concepts are presented both at a high theoretical level and in outstanding technical depth.
Third: the book is written from a system perspective; approaches to distribution from different areas of computer science are presented in a novel, unifying way. It is therefore a source of knowledge and new insight for scientists, practitioners and students in all system-related disciplines, e.g. databases, operating systems or communications, to mention just a few.
Distributed computing is understood (not only by the authors) as the key paradigm of computer science, engineering and application in this decade. In ten years' time, stand-alone computers will be rare exceptions. Instead, applications will be spread across different – typically geographically distributed – computing systems. Imagine, for example, the collaboration of scientists in different countries on a particular experiment, or the cooperation of companies in a common project. Another example is the routine business of a travel agency, which includes flight, hotel and perhaps rental car reservation, accounting and sometimes cancellation, where all activities belong to a single customer order and therefore have to be processed in a consistent manner.
The notion of a transaction is at the heart of all these kinds of applications: “a set of actions on the physical or abstract application state” having particular properties: atomicity, consistency, isolation and durability. The challenge now is to explore methods and develop techniques which guarantee these properties under all circumstances. This is what “Transaction processing: concepts and techniques” is all about.
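To make the atomicity property concrete, here is a minimal sketch of the travel-agency order mentioned above. It is not taken from the book (whose own code is in C and SQL); it uses Python's standard sqlite3 module, and the table name reservations, the functions book_trip and count_reservations, and the fail_at parameter are all hypothetical names introduced only for illustration.

```python
import sqlite3


def book_trip(conn, customer, fail_at=None):
    """Book flight, hotel and car as one unit of work.

    fail_at is a hypothetical switch that simulates a failing reservation step.
    Returns True if all three reservations were committed, False otherwise.
    """
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if an exception escapes it.
        with conn:
            for item in ("flight", "hotel", "car"):
                if item == fail_at:
                    raise RuntimeError(f"{item} reservation failed")
                conn.execute(
                    "INSERT INTO reservations (customer, item) VALUES (?, ?)",
                    (customer, item),
                )
        return True
    except RuntimeError:
        return False


def count_reservations(conn, customer):
    """Return how many reservations are stored for the given customer."""
    row = conn.execute(
        "SELECT COUNT(*) FROM reservations WHERE customer = ?", (customer,)
    ).fetchone()
    return row[0]


# Illustrative in-memory database; a real system would be distributed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reservations (customer TEXT, item TEXT)")
conn.commit()
```

If the car reservation fails after the flight and hotel have been inserted, the rollback removes those two rows as well, so the order is either recorded completely or not at all. A single local database only hints at the problem; guaranteeing the same behaviour across independent, distributed systems is the harder task the book addresses.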
The book is also an exercise in constructing complex systems. There is no theoretical framework for this kind of engineering task. However, there are many engineering criteria and principles that serve as its pragmatic basis. The authors demonstrate in a convincing way that a complex system is more than the sum of its parts. The way the pieces are put together is measured against the theoretical framework of transaction processing and the pragmatic engineering principles. This makes the book outstanding. The reader will learn how transactional systems are constructed, why the decisions are made and what the tradeoffs are. This is much more than most computer science books have to offer: they often remain vague even when specific problems are to be solved. The authors know what they are writing about: they have been among the first and most influential researchers who developed the theoretical concepts of transaction processing, and they have actively participated in building systems on these grounds. This fact is reflected on each page and makes the book unique.
The book is organized into seven subject areas which are to a large degree self-contained. These are: basics, fault tolerance, transaction-oriented computing, concurrency control, recovery, transactional file system and products.
Despite its importance, transaction processing is an ill-defined term. Consequently, the first chapter presents different perceptions of it. A common denominator is the ability of transactions to deal with exceptions due to computing faults – an extremely important theme in systems composed of many different, mostly independent subsystems: operating software, communication, databases, repositories, even application generators.
Part II, on the fault tolerance of computing systems, is recommended even for those not primarily interested in transactions: starting with empirical studies of reliability in computing systems, the authors present technical solutions for different kinds of hardware and software faults. The in-depth discussion of problems (real ones, not invented ones), the development of solution concepts and their thorough implementation (in C and SQL) are what make the book so fascinating.
The core of the book consists of the chapters on transaction processing concepts (part III) and on techniques guaranteeing the desirable ACID properties (parts IV and V). Starting with the concepts of flat transactions, as implemented in relational database systems, and spheres of control as a general notion of control, transactions are introduced as a means of structuring computations in well-defined steps with well-defined properties. The theoretical foundations of isolation concepts and the implementation of isolation by locking techniques are presented in part IV. Again: concepts, theory and implementation go hand in hand. To my knowledge, subsystems like the lock manager have never been published in the open literature at this level of detail. The book is not at all an implementation handbook, however; rather, the code presented for important subsystems demonstrates how to get from system concepts to working systems.
Part V deals with architectures for consistency and durability. While chapter 9 develops a log manager architecture, chapter 10 introduces the overall picture of transaction management. The architectural concepts and tradeoffs form the basis for the implementation of a transaction manager in chapter 11. New features which appear in some new systems, e.g. heterogeneous commit protocols, are discussed in chapter 12. Part VI is on implementing a file system as part of a transactional architecture. The chapters on file and buffer management, file organization and access techniques conclude the main part of the book. Some people might object that it is more appropriate to present B-trees or hash algorithms in books on data structures, as is often done. But the value of these (and all the other) chapters lies not primarily in the algorithm, but in the discussion of tradeoffs: why has it been implemented in this way? Again: the experience of the authors in designing and constructing systems makes the book a constant source of new insights.
The last part is a survey of transaction monitor products, most of them described at a level of detail which allows the systems to be compared.
The book is organized in an ideal manner: the different parts are mostly self-contained, and each chapter is introduced by an overview of the main concepts. The summaries at the end of the chapters or important subchapters help the reader to remember the essentials.
Each chapter is accompanied by a set of exercises of varying difficulty, followed by the solutions. This makes the book suitable even for self-study. The book is in some sense orthogonal to typical computer science courses. This is not a fault of the authors but is due to the traditional organization of (most) current computer science curricula. Nevertheless, I recommend it strongly to all computer science students and professors, system designers and implementors, and even application programmers. The system perspective of the book will help to tear down the doubtful walls between different areas of computer science.
It is a pleasure to read the book, thanks to the excellent style, the fundamental concepts of system-oriented computer science it covers and, last but not least, the authors' good sense of humour.
However, it is not easy to read: it requires a good deal of effort to grasp the ideas – as all good books do.

68-01 Introductory exposition (textbooks, tutorial papers, etc.) pertaining to computer science
68P15 Database theory
68N25 Theory of operating systems