Python 3 Text Processing with NLTK 3 Cookbook Illustrated Edition
Jacob Perkins (Author)
About This Book
- Break text down into its component parts for spelling correction, feature extraction, and phrase transformation
- Learn how to do custom sentiment analysis and named entity recognition
- Work through the natural language processing concepts with simple and easy-to-follow programming recipes
Who This Book Is For
This book is intended for Python programmers interested in learning how to do natural language processing. Maybe you've learned the limits of regular expressions the hard way, or you've realized that human language cannot be deterministically parsed like a computer language. Perhaps you have more text than you know what to do with, and need automated ways to analyze and structure that text. This Cookbook will show you how to train and use statistical language models to process text in ways that are practically impossible with standard programming tools. A basic knowledge of Python and the basic text processing concepts is expected. Some experience with regular expressions will also be helpful.
What You Will Learn
- Tokenize text into sentences, and sentences into words
- Look up words in the WordNet dictionary
- Apply spelling correction and word replacement
- Access the built-in text corpora and create your own custom corpus
- Tag words with parts of speech
- Chunk phrases and recognize named entities
- Grammatically transform phrases and chunks
- Classify text and perform sentiment analysis
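As a rough illustration of the first item, and of the regular-expression limits mentioned under "Who This Book Is For", here is a minimal sketch that uses only Python's standard `re` module. It is not taken from the book and is not NLTK's method; the helper names `naive_sentences` and `naive_words` are hypothetical. It shows a naive splitter breaking a sentence at an abbreviation, which is the kind of case a trained tokenizer is meant to handle.

```python
import re

# Hypothetical helpers for illustration only: a naive regex baseline,
# not the tokenizers that NLTK provides.

def naive_sentences(text):
    """Split wherever '.', '!' or '?' is followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def naive_words(sentence):
    """Return word runs (keeping internal apostrophes) and single punctuation marks."""
    return re.findall(r"\w+(?:'\w+)*|[^\w\s]", sentence)

text = "Dr. Smith can't attend. Please reschedule!"
sentences = naive_sentences(text)
print(sentences)  # "Dr." is wrongly split off as its own sentence
print(naive_words(sentences[1]))
```

The regex has no way of knowing that "Dr." is an abbreviation rather than the end of a sentence, so it yields three "sentences" instead of two.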
In Detail
This book will show you the essential techniques of text and language processing. Starting with tokenization, stemming, and the WordNet dictionary, you'll progress to part-of-speech tagging, phrase chunking, and named entity recognition. You'll learn how various text corpora are organized, as well as how to create your own custom corpus. Then, you'll move on to text classification with a focus on sentiment analysis. And because NLP can be computationally expensive on large bodies of text, you'll try a few methods for distributed text processing. Finally, you'll be introduced to a number of other small but complementary Python libraries for text analysis, cleaning, and parsing.
This cookbook provides simple, straightforward examples so you can quickly learn text processing with Python and NLTK.
Product details
- Publisher : Packt Publishing; Illustrated edition (August 26, 2014)
- Language : English
- Paperback : 304 pages
- ISBN-10 : 1782167854
- ISBN-13 : 978-1782167853
- Item Weight : 1.16 pounds
- Dimensions : 7.5 x 0.69 x 9.25 inches
- Best Sellers Rank: #2,312,095 in Books
- #387 in Natural Language Processing (Books)
- #2,104 in Data Processing
- #2,521 in Python Programming
About the author

Jacob Perkins is an open source programmer, NLP hacker, and startup entrepreneur. He is currently the CTO & co-founder of Weotta, a semantic search engine for local events, activities, restaurants and more. His major open source contributions are to NLTK, a Python toolkit for natural language processing, and Seahorse, the GNOME encryption key application.
Customer reviews
Top reviews from the United States
This book pales in comparison in communication, content, and utility as it relates to both NLTK and Python (in general) - you don't even get a table of contents.
A lot of content provided without proper resources.
Many of the URLs and packages he references are outdated and not useful at all.
Very irresponsible author: I contacted him about many issues and he never answered.
So, if you want to know how to apply NLTK, WordNet, SciPy, NumPy, and the like to the problems you are facing right now, this is your book. It will have you up and running in hours, not weeks, with plenty of code recipes included. The author also maintains an excellent blog, streamhacker.com, if you want to get a sense of his knowledge and writing style before buying. It's how I found this book.
An excellent next book, if you need a more complete book to build your own fundamental tools, rather than simply adopting NLTK, is Fundamentals of Predictive Text Mining by Weiss. Fundamentals will take you from the launch point provided by this book into computational and predictive methods.







