- Series: The David Hume Series
- Paperback: 408 pages
- Publisher: Center for the Study of Language and Inf (December 1, 1999)
- Language: English
- ISBN-10: 1575862174
- ISBN-13: 978-1575862170
- Product Dimensions: 6 x 1 x 9 inches
- Shipping Weight: 1.2 pounds
- Average Customer Review: 1 customer review
- Amazon Best Sellers Rank: #3,354,878 in Books
Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison
Time Warps, String Edits, and Macromolecules is a young classic in computational science, that is, scientific analysis from a computational perspective. The book is the first, and still the best, compilation of papers explaining how to measure the distance between sequences and how to compute that measure effectively. It contains lucid explanations of the basic techniques, well-annotated examples of applications, and mathematical analysis of their computational (algorithmic) complexity.
Top customer reviews
A general overview of sequence comparison is given in chapter 1, with applications to molecular biology, human speech, computer science, coding theory, gas chromatography, and bird songs discussed. The author discusses how deletion-insertion, compression-expansion, and substitution are employed in sequence comparison. Different metrics are introduced, such as the Levenshtein distance. Dynamic programming, which pretty much dominates the book, is also introduced here.
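For readers who have not met the Levenshtein distance before, here is a minimal sketch of the dynamic programming idea in Python. It assumes unit costs for insertion, deletion, and substitution; the function name `levenshtein` is my own and is not taken from the book.

```python
def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance between a and b, computed row by row."""
    # prev[j] = distance between the empty prefix of a and b[:j]
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty prefix of b
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca from a
                curr[j - 1] + 1,           # insert cb into a
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


print(levenshtein("kitten", "sitting"))
```

Only two rows of the table are kept at any time, so memory grows with the length of `b` rather than with the product of the two lengths.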
Part 1 of the book discusses sequence comparison in molecular biology. The use of dynamic programming is emphasized, and its importance continues to this day. The advantages of the dynamic programming method are outlined, and it is shown how to find the substring of a longer sequence that agrees best with a shorter sequence. In addition, given an RNA molecule with a known nucleotide sequence, methods are discussed for predicting the way different parts of the molecule will bond to each other. These methods are also based on dynamic programming. Mathematicians who are considering research in the field, or who are about to enter it, will profit from the section on the biological background. The treatment of RNA secondary structures is excellent.
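The substring search mentioned above can be sketched as a small change to the usual edit-distance table: a match is allowed to start anywhere in the longer sequence at no cost, and the best value in the last row is taken. This is my own illustration with unit costs, not the book's exact formulation, and the name `best_substring_distance` is hypothetical.

```python
from __future__ import annotations


def best_substring_distance(pattern: str, text: str) -> tuple[int, int]:
    """Return (distance, end): the smallest unit-cost edit distance between
    pattern and any substring of text, and the exclusive end index in text
    where that distance is first reached."""
    m = len(pattern)
    # col[i] = best distance between pattern[:i] and a substring of text
    # ending at the current text position (initially: before any text).
    col = list(range(m + 1))
    best, best_end = col[m], 0
    for j, ct in enumerate(text, start=1):
        new = [0]  # a match may start at any position of text for free
        for i, cp in enumerate(pattern, start=1):
            new.append(min(
                col[i] + 1,               # extra text character ct
                new[i - 1] + 1,           # pattern character cp left unmatched
                col[i - 1] + (cp != ct),  # align cp with ct
            ))
        col = new
        if col[m] < best:
            best, best_end = col[m], j
    return best, best_end


print(best_substring_distance("ACGT", "TTACGTTT"))
```

Recovering the start of the matching substring would need back-pointers, which are left out here to keep the sketch short.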
In part 2, the emphasis is on speech processing and what is called "time-warping", a technique for comparing functions by altering the time axis. An interesting application is given to the comparison of bird songs. An algorithm is given for adjusting the time scales of two songs to bring them into optimal alignment. In addition, the differences between compression-expansion and deletion-insertion are discussed in this part.
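To give a feel for time-warping, here is a minimal sketch of the discrete version commonly known as dynamic time warping. It assumes two sampled signals given as lists of numbers and uses the absolute difference as the local cost; both choices are mine, and the book treats more general formulations.

```python
import math


def dtw_distance(x, y):
    """Minimal dynamic time warping cost between numeric sequences x and y."""
    n, m = len(x), len(y)
    # D[i][j] = cheapest warping cost aligning x[:i] with y[:j]
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i][j] = cost + min(
                D[i - 1][j],      # x advances while y holds (y is stretched)
                D[i][j - 1],      # y advances while x holds (x is stretched)
                D[i - 1][j - 1],  # both advance together
            )
    return D[n][m]


# The same shape sampled at different speeds warps to zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Unlike deletion and insertion, the stretching steps reuse a sample instead of dropping it, which is the compression-expansion idea in its simplest form.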
In part 3, a modified Smith-Waterman algorithm is employed to find similar portions in two sequences. This is called local alignment in computational biology, and it is shown in detail how to define the recurrences for the alignment and how to keep track of the pointers for backtracking. This part also generalizes the operations of substitution and Levenshtein distance. In addition, the strategy of doing sequence comparison by allowing transpositions is discussed. Such a strategy entails a generalized concept of trace, wherein trace lines can intersect each other, leading to entangling of the traces into knots or plaids. The usual dynamic programming techniques must then be extended to deal with these complications. One particular algorithm for this is discussed, called CELLAR, which involves the construction of a directed graph whose paths correspond to admissible sequences of generalizations of traces, called cuts. The computational complexity of this algorithm is discussed. In addition, an O(n^2 / log n) algorithm is given for computing string-edit distances.
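The recurrence and the pointer bookkeeping for local alignment can be sketched as follows. This is the plain textbook Smith-Waterman scheme with a linear gap penalty, not the modified version in the book, and the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) are assumptions chosen only for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for the best local alignment."""
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]       # scores
    ptr = [[None] * (m + 1) for _ in range(n + 1)]  # back-pointers
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap
            left = H[i][j - 1] + gap
            score = max(0, diag, up, left)  # 0 lets an alignment restart here
            H[i][j] = score
            if score > 0:
                ptr[i][j] = "diag" if score == diag else "up" if score == up else "left"
            if score > best:
                best, best_pos = score, (i, j)
    # Follow the pointers back from the best cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while ptr[i][j] is not None:
        move = ptr[i][j]
        if move == "diag":
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif move == "up":
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(smith_waterman("XXACGTXX", "YYACGTYY"))
```

The only differences from a global alignment table are the floor at zero and the choice to start the traceback at the highest-scoring cell rather than at the corner.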
The last part of the book deals with comparisons between random sequences. Combinatorial arguments are used to derive upper bounds on the expected length of the longest common subsequence of two random sequences. Other miscellaneous results on common subsequences of two random sequences are also given.
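The quantity being bounded can be explored numerically. The sketch below computes the length of a longest common subsequence by dynamic programming and averages it over pairs of uniformly random sequences. The helper `mean_lcs_ratio`, the four-letter alphabet, and the trial count are my own assumptions, and a simulation like this only estimates the expected length; it proves nothing about the bounds in the book.

```python
import random


def lcs_length(a, b):
    """Length of a longest common subsequence of sequences a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)          # extend a common subsequence
            else:
                curr.append(max(prev[j], curr[j - 1]))  # skip one character
        prev = curr
    return prev[-1]


def mean_lcs_ratio(n, alphabet="ACGT", trials=20, seed=0):
    """Average LCS length divided by n over random pairs of length n."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    total = 0
    for _ in range(trials):
        a = rng.choices(alphabet, k=n)
        b = rng.choices(alphabet, k=n)
        total += lcs_length(a, b)
    return total / (trials * n)


print(mean_lcs_ratio(200))
```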