Friday, September 02, 2005

BTrees and Logical Undos

Certain BTree operations require Logical Undo to be performed during rollbacks and restart recovery. This is because, in a BTree, keys can move from one page to another, and therefore an undo operation may have to be applied to a different page from the one where the original action took place.

To take an example, suppose transaction T1 deletes the key D from page P1. Another transaction T2 comes along and inserts the key B into page P1. This causes a page split, after which, all keys greater than B end up in a new page P2. Now, if the first transaction T1 aborts, then it needs to reinstate the deleted key D. However, if it looks at page P1, it will find that the deleted key no longer belongs there. Therefore it must locate the new page P2 and perform the undo action there.

There are a couple of design decisions that affect the complexity of the logical undo actions:

If the BTree is designed to physically remove keys when they are deleted, then undo actions require the key to be physically re-inserted. This means that during the undo of a key delete, page splits can occur if the insert cannot be performed due to the page being full. The splits can recurse up the tree causing more page splits.

The advantage of the physical removal option is that the BTree never contains data that is superfluous. If a large number of rows are deleted, the BTree will be immediately compacted, as the pages used by deleted keys would be removed. Keeping the BTree dense and small is good for performance as fewer pages need to be scanned during traversals.

The other approach is to use logical key deletes, where the key is not physically removed but merely marked as deleted. To undo the delete, only the mark needs to be reset. Logical deletes allows undo operations to proceed without page splits, and therefore Logical Undos can be page oriented.

The problem with logical deletes, of course is that these keys must be garbage collected at some point and key traversals need to obtain instant locks on the deleted keys to determine whether they are truly deleted or not. The presence of a large number of deleted keys can also make the BTree less dense, and increase the path for traversals.

It seems to me that the logical delete key approach is more suited to a scenario where there are frequent rollbacks.

The issue of logical deletes is not discussed at any great length anywhere. There is a brief discussion of this in Mohan's paper Commit_LSN: A Novel and Simple Method for Reducing Locking and Latching in Transaction Processing systems.

A BTree implementation that uses logical key deletes is in Apache Derby.

No comments: