- Paperback: 544 pages
- Publisher: Addison-Wesley Professional; 1 edition (June 12, 2003)
- Language: English
- ISBN-10: 0321112547
- ISBN-13: 978-0321112545
- Product Dimensions: 6.9 x 1.2 x 9.1 inches
- Shipping Weight: 1.9 pounds
- Average Customer Review: 17 customer reviews
- Amazon Best Sellers Rank: #1,459,955 in Books
Text Processing in Python 1st Edition
From the Back Cover
Text Processing in Python is an example-driven, hands-on tutorial that carefully teaches programmers how to accomplish numerous text processing tasks using the Python language. Filled with concrete examples, this book provides efficient and effective solutions to specific text processing problems and practical strategies for dealing with all types of text processing challenges.
Text Processing in Python begins with an introduction to text processing and contains a quick Python tutorial to get you up to speed. It then delves into essential text processing subject areas, including string operations, regular expressions, parsers and state machines, and Internet tools and techniques. Appendixes cover such important topics as data compression and Unicode. A comprehensive index and plentiful cross-referencing offer easy access to available information. In addition, exercises throughout the book provide readers with further opportunity to hone their skills either on their own or in the classroom. A companion Web site (http://gnosis.cx/TPiP) contains source code and examples from the book.
Here is some of what you will find in this book:
- When do I use formal parsers to process structured and semi-structured data? Page 257
- How do I work with full text indexing? Page 199
- What patterns in text can be expressed using regular expressions? Page 204
- How do I find a URL or an email address in text? Page 228
- How do I process a report with a concrete state machine? Page 274
- How do I parse, create, and manipulate internet formats? Page 345
- How do I handle lossless and lossy compression? Page 454
- How do I find codepoints in Unicode? Page 465
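As a rough illustration of one question from the list above (finding a URL or an email address in text), here is a minimal sketch using Python's standard `re` module. The patterns, the helper names `find_urls` and `find_emails`, and the address `someone@example.com` are assumptions made for this example; they are not taken from the book, and real-world URL and address syntax is considerably more permissive than these patterns allow.

```python
import re

# Deliberately loose, illustrative patterns (assumptions, not the book's own).
URL_RE = re.compile(r"https?://[^\s<>]+")
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def find_urls(text):
    """Return URL-like substrings, trimming trailing sentence punctuation."""
    return [m.rstrip(".,;:!?") for m in URL_RE.findall(text)]

def find_emails(text):
    """Return email-like substrings in order of appearance."""
    return EMAIL_RE.findall(text)

# The email address below is a hypothetical placeholder.
sample = ("Source code is at http://gnosis.cx/TPiP. "
          "Write to someone@example.com for details.")
print(find_urls(sample))
print(find_emails(sample))
```

Trimming trailing punctuation after the match keeps the URL pattern simple, at the cost of mangling the rare URL that legitimately ends in a period or comma.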
About the Author
David Mertz came to writing about programming via the unlikely route of first being a humanities professor. Along the way, he was a senior software developer, and now runs his own development company, Gnosis Software ("We know stuff!"). David writes regular columns and articles for IBM developerWorks, Intel Developer Network, O'Reilly ONLamp, and other publications.
Top customer reviews
On its strengths, this book is probably best suited for programmers who aren't afraid to learn advanced material. It covers in great detail everything you ever wanted to know about Python string processing (and honestly probably a bit more). It has a very readable style, and overall is exceptionally informative. Examples are clear, pointed, and useful.
On its weaknesses, some material (i.e., parsers) might be extremely dense and hard to understand if you don't have a CS or Linguistics degree. On the other hand, if you do understand it (and the explanation is pretty good), you will end up a much better programmer for it.
Overall, I'd recommend this book for professionals with a theory background who need to do advanced Python work. I'd also recommend it to people without a theory background, but only if they're not afraid of getting their feet wet. People who are afraid of learning should probably avoid this book.
4 stars mostly because I'm not really sure how to evaluate this book.
I've been programming computers in various capacities since I was in my early teens (the mid-1970s) and I've been through a number of languages. Not long ago I discovered Python, and I suspect I won't need to learn any other languages for quite a long time. Guido van Rossum is a wizard.
If you're interested in learning Python, don't start here. If you've got some programming background already, Guido's tutorial (which comes bundled with the Python download) will be enough to get you rolling. I personally recommend all of O'Reilly's books on the subject (_Learning Python_ for the absolute beginner, Mark Lutz's idiosyncratic but highly useful _Programming Python_ for the next level up, the magisterial _Python Cookbook_ for pretty much anybody, and the _Nutshell_ book to be placed permanently next to your keyboard). There are others as well, and after you've gotten started, you'll be a better judge than I am of what will be most useful to you. (But I'd skip the vastly overpriced and not-very-deep _Python Programming Patterns_ unless you can buy it used.)
This one's for later; although it does offer some beginning instruction in Python, it isn't really an introductory book. However, if you do any text processing with Python -- which you almost undoubtedly do if you use Python at all -- then you _do_ want this book even if you don't know it yet.
Most of what you'll want to know is in chapter two, which sets out the basics of string processing in Python. The other, fancier stuff in the later chapters may be handy sometimes, but author David Mertz himself will tell you not to overcomplicate things; if you can do what you need to do using string operations, do so.
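To make the "use string operations when they suffice" advice concrete, here is a small sketch that is not taken from the book. It splits a hypothetical `key = value` line first with plain string methods and then with a regular expression; the sample line and variable names are assumptions chosen for illustration.

```python
import re

# Hypothetical input line, invented for this example.
line = "  name = Text Processing in Python  "

# Plain string methods: split on the first '=' and strip whitespace.
key, _, value = line.partition("=")
key, value = key.strip(), value.strip()

# The same extraction with a regular expression, for comparison.
match = re.match(r"\s*(\w+)\s*=\s*(.*?)\s*$", line)

print((key, value))
print(match.groups())
```

Both approaches yield the same pair here, but the string-method version is easier to read and harder to get subtly wrong.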
Read the rest of it too, though. There's good stuff here on e.g. regular expressions and parsing that you'll find interesting and possibly useful. Just don't rush out and start trying to apply it when it isn't necessary.
Mertz is an excellent teacher. He tends to approach things from a foundation of "functional programming" -- of which I'm not particularly a fan, but he has a healthy sense of its limitations and his comments on the subject are refreshing. (If you're interested in functional programming, get a book on Haskell, which is actually a very cool language. But me, I like imperative languages just fine and I don't have any problem with "side effects" as long as they're deliberate or at least controlled.) At any rate, Mertz won't lock you in to a functional approach, but he will teach you some function-oriented stuff that will be useful to you no matter what your preferred programming style.
And his exposition is well organized and wonderfully lucid. If you're the sort of person who likes books that have a chapter zero, you'll enjoy his style.
Unless you have a strong programming background, then, you probably won't want to start your Python bookshelf with this one. But I recommend making it one of your first five.
Mertz is an exceptionally smart guy. A few of the things in this book were over my head, but most of it was not. He offers terrific insights into programming in general, and probably the best Python overview / tutorial I have ever seen (in one of the Appendices).
Also, in the area of regular expressions, the examples didn't use the Python library directly. Instead, a wrapper function was used for many of the examples, which detracted from the book's usefulness as a reference for this purpose.
I found that Python has several different ways to do string processing. Also, some of those ways come up with conflicting results. At the time of this writing the authors of Python are re-organizing and improving this area.
What is truly great about the book is the discussion of state machines, parsers, and functional programming. Although these topics detract somewhat from the focus on string processing, this book is perhaps the only popular Python book out there that does these topics justice. I thought they were very well written.
My overall complaint is that this book includes too many things outside of text processing using the core Python language. But other readers may appreciate this aspect more than I did. If you want coverage on handling email specifically, the author covers that. Same with HTML processing and other specialized topics. I just wanted the lowdown on using the full string processing capabilities of the core Python language, not necessarily all the specialized libraries.
I found string processing to be messy with Python but found Ruby to be much easier. That is perhaps because Ruby is a newer language and it has some features of Perl built in. Ruby, however, does not have the range of libraries available for Python, nor does it have as nice a Windows GUI.
Overall, if you are looking for a book on text processing, this is the only book out there, and a big plus with this book is what you will learn about functional programming, state machines, and parsers.
The author worked hard to produce a book in this specialized area. He has lots of code examples. Highly recommended for Python programmers.
Sugar Land, TX
Most recent customer reviews
The first chapter dives into functional programming using obscure and terse high order...