Text Processing in Python 1st Edition

4.4 out of 5 stars 19 customer reviews
ISBN-13: 978-0321112545
ISBN-10: 0321112547

Frequently Bought Together

  • Text Processing in Python
  • Natural Language Processing with Python
  • Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython
Total price: $106.39

Editorial Reviews

From the Back Cover

Text Processing in Python is an example-driven, hands-on tutorial that carefully teaches programmers how to accomplish numerous text processing tasks using the Python language. Filled with concrete examples, this book provides efficient and effective solutions to specific text processing problems and practical strategies for dealing with all types of text processing challenges.

Text Processing in Python begins with an introduction to text processing and contains a quick Python tutorial to get you up to speed. It then delves into essential text processing subject areas, including string operations, regular expressions, parsers and state machines, and Internet tools and techniques. Appendixes cover such important topics as data compression and Unicode. A comprehensive index and plentiful cross-referencing offer easy access to available information. In addition, exercises throughout the book provide readers with further opportunity to hone their skills either on their own or in the classroom. A companion Web site (http://gnosis.cx/TPiP) contains source code and examples from the book.

Here is some of what you will find in this book:

  • When do I use formal parsers to process structured and semi-structured data? Page 257
  • How do I work with full text indexing? Page 199
  • What patterns in text can be expressed using regular expressions? Page 204
  • How do I find a URL or an email address in text? Page 228
  • How do I process a report with a concrete state machine? Page 274
  • How do I parse, create, and manipulate internet formats? Page 345
  • How do I handle lossless and lossy compression? Page 454
  • How do I find codepoints in Unicode? Page 465


About the Author

David Mertz came to writing about programming via the unlikely route of first being a humanities professor. Along the way, he was a senior software developer, and now runs his own development company, Gnosis Software ("We know stuff!"). David writes regular columns and articles for IBM developerWorks, Intel Developer Network, O'Reilly ONLamp, and other publications.


Product Details

  • Paperback: 544 pages
  • Publisher: Addison-Wesley Professional; 1 edition (June 12, 2003)
  • Language: English
  • ISBN-10: 0321112547
  • ISBN-13: 978-0321112545
  • Product Dimensions: 6.8 x 1.2 x 9.1 inches
  • Shipping Weight: 1.9 pounds
  • Average Customer Review: 4.4 out of 5 stars (19 customer reviews)
  • Amazon Best Sellers Rank: #761,060 in Books

Customer Reviews

Top Customer Reviews

Format: Paperback
Text Processing in Python, by David Mertz, 2003, Addison Wesley, 520 pages.
If you have read an introductory book or two about programming, but you are far from being an expert, then you will benefit a lot from reading this book. If you are a competent programmer in any other language, you will benefit from this book. If you are an expert Python programmer, you will also benefit from this book.
As you know, there are many good introductory texts about Python. This is not one of them: it is an advanced book, but not an inaccessible one. David Mertz has a unique style and focus that we have become familiar with from his "Charming Python" series of articles on the IBM Developer Network. Dr. Mertz is more interested in facilitating our learning process than in lecturing us, and rather than fill his pages with impressive examples designed to illustrate his expertise, he gently guides us by offering subtle yet important examples of code and analysis that make us think for ourselves.
He has a special talent for programming in the functional style, and this is a great introduction to that style of Python programming. Thus, this is also a good guide to using the newer features introduced into Python in the last few revisions, which often facilitate the functional style of programming.
The text includes, in an appendix, a 40 page tutorial covering the basic Python language. This tutorial is, like the book, unique in its approach and is worthwhile even for experienced Pythonistas, as it sheds light on some of the underlying ideas behind the syntax and semantics, and it also illustrates the functional style of programming, which is sometimes quite useful when doing text processing. And, despite its many other virtues, this is a book about text processing.
51 of 53 people found this helpful.
Format: Paperback Verified Purchase
Yes, I mean it: this is a beautiful book. If your aesthetic sensibilities have been informed, directly or indirectly, by Kernighan and Ritchie's influential book on C, you'll know what I mean.
I've been programming computers in various capacities since I was in my early teens (the mid-1970s) and I've been through a number of languages. Not long ago I discovered Python, and I suspect I won't need to learn any other languages for quite a long time. Guido van Rossum is a wizard.
If you're interested in learning Python, don't start here. If you've got some programming background already, Guido's tutorial (which comes bundled with the Python download) will be enough to get you rolling. I personally recommend all of O'Reilly's books on the subject (_Learning Python_ for the absolute beginner, Mark Lutz's idiosyncratic but highly useful _Programming Python_ for the next level up, the magisterial _Python Cookbook_ for pretty much anybody, and the _Nutshell_ book to be placed permanently next to your keyboard). There are others as well, and after you've gotten started, you'll be a better judge than I am of what will be most useful to you. (But I'd skip the vastly overpriced and not-very-deep _Python Programming Patterns_ unless you can buy it used.)
This one's for later; although it does offer some beginning instruction in Python, it isn't really an introductory book. However, if you do any text processing with Python -- which you almost undoubtedly do if you use Python at all -- then you _do_ want this book even if you don't know it yet.
Most of what you'll want to know is in chapter two, which sets out the basics of string processing in Python.
41 of 44 people found this helpful.
Format: Paperback
This is the only book that really attacks the issue of string processing using Python. Unfortunately it didn't attack the text processing problems that I wanted discussed.
Also, in the area of regular expressions, the examples didn't use the Python library directly. Instead, a wrapper function was used for many of the examples, which detracted from using the book as a reference for this purpose.
I found that Python has several different ways to do string processing. Also, some of those ways come up with conflicting results. At the time of this writing the authors of Python are re-organizing and improving this area.
What is truly great about the book is the discussion of state machines, parsers, and functional programming. Although these topics detract somewhat from the focus on string processing, this book is perhaps the only popular Python book out there that does these topics justice. I thought they were very well written.
My overall complaint is that this book includes too many things outside of text processing using the core Python language. But other readers may appreciate this aspect more than I did. If you want coverage of handling email specifically, the author covers that. Same with HTML processing and other specialized topics. I just wanted the lowdown on using the full string processing capabilities of the core Python language, not necessarily all the specialized libraries.
I found string processing to be messy with Python but found Ruby to be much easier. That is perhaps because Ruby is a newer language and it has some features of Perl built in. Ruby, however, does not have the range of libraries that Python has, nor does it have as nice a Windows GUI.
22 of 24 people found this helpful.
