Top positive review
18 people found this helpful
Some important ideas but much work remains to be done
on August 16, 2008
In software development a 'refactoring' is a change that improves code quality without changing functionality. Refactoring helps keep an application maintainable over its life-cycle as requirements evolve, and is particularly of interest to those adopting modern 'agile' methodologies. This book comprises five general chapters on database refactoring - about 70 pages - followed by a 200 page catalog of various refactorings. The refactorings are classified as 'structural', 'data quality', 'referential integrity', 'architectural' and 'methods'. An additional chapter catalogs 'transformations', more on which in a moment. Each catalog entry uses a template including 'Motivation', 'Tradeoffs', 'Schema Update Mechanics', 'Data-Migration Mechanics' and 'Access Program Update Mechanics'. The 'Mechanics' sections include example code fragments for Oracle, JDBC and Hibernate.
Several of the structural refactorings are just simple database schema changes: rename/drop column/table/view. Adding is not really a refactoring so add column/table/view were cataloged as 'transformations' - changes that do affect the application, a distinction that appears to me a little clumsy. Some structural refactorings are more interesting: merge/split columns/tables, move column, introduce/remove surrogate key, introduce calculated column, introduce associative table.
The data quality refactorings include introduce/drop default values, null or check constraints, standardize codes, formats and data types, use consistent keys and lookup tables. Most of these are common best practices, seeing them cataloged as refactorings didn't yield me any new insights. Only replacing type codes with flags was of special interest.
Referential integrity refactorings include the obvious add/drop foreign keys with optional cascading delete, but also using triggers to create a change history and hard vs. soft deletes. Using before and after delete triggers to implement soft deletes is probably the best example in the book.
Architectural refactorings include using CRUD methods (ie. stored procedures or functions to select/insert/update/delete records), query functions that return cursor refs, interchanging methods and views, implementing methods in stored procedures, using materialized views and using local mirror tables vs. remote 'official' data sources. All these are common design techniques and the discussion of motivation and tradeoffs is particularly relevant.
The final section on method refactorings is more abbreviated and covers typical code refactorings. These qualify for inclusion only because databases include stored procedures, but they have nothing to do with schema evolution.
An important aspect of this book is that the catalog of refactorings is presented in the context of evolutionary database development described in the first five chapters: this approach emphasises an iterative approach, automated regression testing, configuration control of schema objects and easy availability of personalized application database environments for developers. Refactorings and transformations are intended to be applied one by one, and an automated regression test suite used to maintain confidence that a change does not introduce an application defect. Change control and a change tracking mechanism are essential to manage the application of schema changes to integration, QA and production environments.
What do I like about this book? The catalog of refactorings is thorough (some might say pedantic) which makes it a good learning tool for new database developers and DBAs, and as a shared reference for communicating on larger projects and in larger organizations. Experienced DBAs working on smaller projects are less likely to find it useful.
What don't I like? Relatively little is provided about the tools required to make regular refactoring practical, the authors simply state that these are being worked on. utPLSQL is not mentioned at all. The discussion on tracking changes is thin (but check out the LiquiBase project on Sourceforge). No guidance is provided on how you might use Ant to build and maintain developer database environments. Little is covered on the tough topic of building and maintaining test data sets. A final pet peeve: no discussion of refactoring across multiple schemas shared by an application suite.
In summary this book sketches out some important ideas but much work remains to be done. The catalog takes a number of established techniques and best practices and places them in a new framework which at least provides value to some for now.