Bioinformatics with Python Cookbook First Published Edition
Tiago Antao (Author)
There is a newer edition of this item.
Learn how to use modern Python bioinformatics libraries and applications to do cutting-edge research in computational biology
About This Book
- Discover and learn the most important Python libraries and applications for complex bioinformatics analysis
- Focuses on the most modern tools for research with next-generation sequencing, genomics, population genetics, phylogenomics, and proteomics
- Uses real-world examples and teaches you to implement high-impact research methods
Who This Book Is For
If you have intermediate-level knowledge of Python and are well aware of the main research and vocabulary in your bioinformatics topic of interest, this book will help you develop your knowledge further.
What You Will Learn
- Gain a deep understanding of Python's fundamental bioinformatics libraries and be exposed to the most important data science tools in Python
- Process genome-wide data with Biopython
- Analyze and perform quality control on next-generation sequencing datasets using libraries such as PyVCF or PySAM
- Use DendroPy and Biopython for phylogenetic analysis
- Perform population genetics analysis on large datasets
- Simulate complex demographies and genomic features with simuPOP
In Detail
If you are either a computational biologist or a Python programmer, you will probably relate to the expression "explosive growth, exciting times". Python is arguably the main programming language for big data, and the deluge of data in biology, mostly from genomics and proteomics, makes bioinformatics one of the most exciting fields in data science.
Using the hands-on recipes in this book, you'll be able to do practical research and analysis in computational biology with Python. We cover modern, next-generation sequencing libraries and explore real-world examples of how to handle real data. The main focus of the book is the practical application of bioinformatics, but we also cover modern programming techniques and frameworks to deal with the ever-increasing deluge of bioinformatics data.
Product details
- Publisher : Packt Publishing; First Published edition (June 25, 2015)
- Language : English
- Paperback : 306 pages
- ISBN-10 : 1782175113
- ISBN-13 : 978-1782175117
- Item Weight : 1.16 pounds
- Dimensions : 7.5 x 0.69 x 9.25 inches
- Best Sellers Rank: #3,103,262 in Books
- #761 in Bioinformatics (Books)
- #3,312 in Python Programming
- #7,860 in Computer Programming Languages
About the author

Tiago Antao has a long career in the field of scientific computing, having been a researcher at the Universities of Oxford and Cambridge. He is a co-author of several Python-based scientific software packages, such as Biopython.
He currently works in the software development industry, focusing on developing high-performance software solutions using Python.
Customer reviews
Top reviews from the United States
Two things you need to know right away about the book: 1) This is not a bioinformatics book, as it covers only the Python implementation of the algorithms but not their theoretical background. Furthermore, the techniques covered are intermediate and advanced bioinformatics topics (e.g. next-generation sequencing, population genetics, protein structure visualization). 2) By the same token, it is not an introductory Python book: knowledge of the language and its most common scientific packages (NumPy, SciPy, and Matplotlib) is assumed. On the other hand, Chapter 1, which covers the installation of the software, is a little wasted because of this.
The materials in each chapter are presented clearly and concisely. It is very easy to follow which Python packages are being used and the differences in syntax between Python 2 and 3. The selected examples are very relatable to the biomedical scientist, making it a nice change of pace from all the finance examples commonly encountered in most Python books. The code runs smoothly without any changes, and the explanations of the commands are good, although a little short every now and then.
In summary, this is a great book if you are looking to implement advanced bioinformatics techniques in Python. The writing style of the author makes it very user-friendly and if he decided to write another book covering how to implement basic bioinformatic techniques (probably a Biopython manual) I wouldn't hesitate to add it to my library.
The book is well organized and structured. The first chapter helps the reader to set up a Python environment through Anaconda or Docker. In the following chapter the author starts to talk about bioinformatics, and the first topic he touches on is NGS (next-generation sequencing). I found this chapter very interesting, as it shows how to deal efficiently with the usually huge FASTQ files and how to perform simple but useful analyses such as quality control of the sequencing reads by position. In the same chapter the author talks about the variant call format (VCF) and how to work with it through libraries such as PyVCF. The next chapter is dedicated to whole genome sequences in FASTA format and to genome annotations in GFF/GTF formats. The following chapters are about population genetics and phylogenetics. Even though I’m not an expert in these fields and was not aware of the formats used (such as Genepop), I could enjoy the reading, and the code was straightforward enough to be easily understandable. Then the author explains how to use PDB (the Protein Data Bank) and the software PyMOL. A very interesting (and in my opinion important) chapter is the last one, where the author discusses how to deal efficiently with big genomics datasets. Here the author talks about concurrency and IPython parallel computing, and shows how it is possible to speed up your Python code by using libraries such as Cython and Numba.
Overall the book is well written and organized and is a good cookbook for bioinformaticians.
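For readers unfamiliar with the per-position quality control on FASTQ reads mentioned in the review above, the idea can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not a recipe taken from the book; it assumes plain four-line FASTQ records with Phred+33 quality encoding, and the function name `mean_quality_by_position` is hypothetical.

```python
import io
from collections import defaultdict


def mean_quality_by_position(handle, offset=33):
    """Return the mean Phred quality at each read position.

    Assumes four-line FASTQ records (no wrapped sequences) and
    Phred+33 encoding; both are assumptions made for this sketch.
    """
    totals = defaultdict(int)  # position -> sum of quality scores
    counts = defaultdict(int)  # position -> number of reads covering it
    # Reading line by line keeps memory use flat even for very large files.
    for line_number, line in enumerate(handle):
        # The quality string is the fourth line of every record.
        if line_number % 4 != 3:
            continue
        for position, char in enumerate(line.rstrip("\r\n")):
            totals[position] += ord(char) - offset
            counts[position] += 1
    return [totals[p] / counts[p] for p in sorted(totals)]


# Two made-up reads, used only to show the call.
example = io.StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nACG\n+\n!!!\n"
)
print(mean_quality_by_position(example))  # [20.0, 20.0, 20.0, 40.0]
```

For a real dataset, the same function could be given an open file handle instead of the in-memory example, for instance `open(path)` or `gzip.open(path, "rt")` for a compressed file.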
Top reviews from other countries
Python basics are mandatory.



