- Paperback: 504 pages
- Publisher: O'Reilly Media; 1 edition (July 10, 2009)
- Language: English
- ISBN-10: 0596516495
- ISBN-13: 978-0596516499
- Product Dimensions: 7 x 1.2 x 9.2 inches
- Shipping Weight: 1.5 pounds
- Average Customer Review: 49 customer reviews
- Amazon Best Sellers Rank: #43,821 in Books
Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit 1st Edition
About the Author
Steven Bird is Associate Professor in the Department of Computer Science and Software Engineering at the University of Melbourne, and Senior Research Associate in the Linguistic Data Consortium at the University of Pennsylvania. He completed a PhD on computational phonology at the University of Edinburgh in 1990, supervised by Ewan Klein. He later moved to Cameroon to conduct linguistic fieldwork on the Grassfields Bantu languages under the auspices of the Summer Institute of Linguistics. More recently, he spent several years as Associate Director of the Linguistic Data Consortium where he led an R&D team to create models and tools for large databases of annotated text. At Melbourne University, he established a language technology research group and has taught at all levels of the undergraduate computer science curriculum. In 2009, Steven is President of the Association for Computational Linguistics.
Ewan Klein is Professor of Language Technology in the School of Informatics at the University of Edinburgh. He completed a PhD on formal semantics at the University of Cambridge in 1978. After some years working at the Universities of Sussex and Newcastle upon Tyne, Ewan took up a teaching position at Edinburgh. He was involved in the establishment of Edinburgh's Language Technology Group in 1993, and has been closely associated with it ever since. From 2000 to 2002, he took leave from the University to act as Research Manager for the Edinburgh-based Natural Language Research Group of Edify Corporation, Santa Clara, and was responsible for spoken dialogue processing. Ewan is a past President of the European Chapter of the Association for Computational Linguistics and was a founding member and Coordinator of the European Network of Excellence in Human Language Technologies (ELSNET).
Edward Loper has recently completed a PhD on machine learning for natural language processing at the University of Pennsylvania. Edward was a student in Steven's graduate course on computational linguistics in the fall of 2000, and went on to be a TA and share in the development of NLTK. In addition to NLTK, he has helped develop two packages for documenting and testing Python software, epydoc and doctest.
Top customer reviews
The book has several strengths. It is tightly integrated with Python and NLTK code. There are numerous examples throughout, and the authors walk through and modify them to clarify how NLTK works. The sizeable reference sections at the end of each chapter are also valuable. These sections include a large number of both introductory and advanced sources. There is also useful integration with the NLTK web site, which provides and points to additional resources.
Not to be missed are the end-of-chapter questions. Readers have come to expect little from these learning aids; they usually invite us to parrot back a small number of key concepts or try a few calculations or code segments. This book's questions go far beyond the norm. They introduce new concepts, encourage writing and comparing several versions of a program, and otherwise extend each chapter's contents. Even readers who don't plan to complete these exercises should read them closely.
Weaknesses are few. As noted, the book may assume too much Python and NLP background for some users. It has a narrow focus and is not organized for use as a reference book. Readers who want something a little more modular and reference-like might prefer Jacob Perkins' Python 3 Text Processing with NLTK 3 Cookbook. David Mertz's Text Processing in Python is an older source, but still useful as well.
1. Know the basics of natural language processing (NLP) or linguistics;
2. Know the Python programming language or are willing to learn it;
3. Are using the NLTK library or plan to do so.
NLTK is a Python library that offers many standard NLP tools (tokenizers, POS taggers, parsers, chunkers and others). It comes with samples of several dozen text corpora typically used in NLP applications, as well as with interfaces to dictionary-like resources such as WordNet and VerbNet. No FrameNet, though. NLTK is well documented, so you might not need this book initially. However, it definitely helps to have it on your desk if you are serious about using NLTK.
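For readers who have not seen NLTK, here is a minimal sketch of three of the tool types mentioned above: a tokenizer, a frequency count, and a chunker. It is not taken from the book or from this review. The example sentence and the hand-assigned part-of-speech tags are assumptions chosen so that nothing has to be downloaded; a real workflow would normally use one of NLTK's trained taggers instead.

```python
# Illustrative sketch only: uses NLTK components that need no corpus downloads.
from nltk import FreqDist, RegexpParser
from nltk.tokenize import RegexpTokenizer

sentence = "The quick brown fox jumps over the lazy dog."

# Tokenizer: keep runs of word characters, which drops the punctuation.
tokens = RegexpTokenizer(r"\w+").tokenize(sentence)

# Frequency distribution over the lowercased tokens.
freq = FreqDist(token.lower() for token in tokens)

# POS tags assigned by hand (an assumption), standing in for a trained tagger.
tags = ["DT", "JJ", "JJ", "NN", "VBZ", "IN", "DT", "JJ", "NN"]
tagged = list(zip(tokens, tags))

# Chunker: an optional determiner, any adjectives, then a noun form an NP.
chunker = RegexpParser("NP: {<DT>?<JJ>*<NN>}")
tree = chunker.parse(tagged)

# Collect the words of every NP chunk found in the sentence.
noun_phrases = [
    " ".join(word for word, _tag in subtree.leaves())
    for subtree in tree.subtrees(lambda t: t.label() == "NP")
]

print(tokens)
print(freq["the"])
print(noun_phrases)
```

The corpora and resources such as WordNet are separate downloads, so they are left out of this sketch.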
The first chapters are a bit messy, as they attempt to introduce all three themes (NLP, NLTK and Python) together. Beginners may have some difficulty sorting things out. By the time you reach the WordNet section, you have either gotten lost in the forest, realized that you would never understand this topic without the book, or both. However, if you are a bit patient and try out all the simple code examples, you'll make it eventually. In my opinion, NLTK remains the simplest, most elegant and well-rounded library of its kind.