18 of 18 people found the following review helpful
on July 16, 2002
I was definitely interested when I first heard that O'Reilly were publishing a book on LWP. LWP is a definitive collection of perl modules covering everything you could think of doing with URIs, HTML, and HTTP. While 'web services' are the buzzword friendly technology of the day, sometimes you need to roll your sleeves up and get a bit dirty scraping screens and hacking at HTML. For such a deep subject, this book weighs in at a slim 242 pages. This is a very good thing. I'm far too busy to read these massive shelf-destroying tomes that seem to be churned out recently.
It covers everything you need to know with concise examples, which is what makes this book really shine. You start with the basics using LWP::Simple through to more advanced topics using LWP::UserAgent, HTTP::Cookies, and WWW::RobotRules. Sean shows finger saving tips and shortcuts that take you more than a couple notches above what you can learn from the lwpcook manpage, with enough depth to satisfy somebody who is an experienced LWP hacker.
This book is a great reference: just flick through and you'll find a relevant chapter with an example to save the day. Chapters include filling in forms and extracting data from HTML using regular expressions, then more advanced topics using HTML::TokeParser, and then my preferred tool, the author's own HTML::TreeBuilder. The book ends with a chapter on spidering, with excellent coverage of design and warnings to get you started on your web trawling.
15 of 15 people found the following review helpful
on July 12, 2003
If you aren't yet comfortable using object-oriented Perl modules, the multitude of examples will at least allow you to see how it's done even if you're a bit fuzzy on what's happening 'underneath' when you call object methods. If you're comfortable learning how to do something without knowing exactly why it works, then the author's clear step-by-step explanations and numerous progressively more powerful examples should make this book accessible even to relatively inexperienced Perl programmers.
More experienced programmers will understand better why things work, but any Perl programmer will set this book down feeling empowered to turn the web into their own valet. No longer do you need to check multiple sites looking for interesting information. Instead, you can readily author code to do that for you and alert you when items of interest are found. You can use these tools to free up personal time, to harvest information to inform business decisions, to automate tedious web application testing, and a zillion other things.
The author's clear exploration of the relevant Perl modules leaves the reader with a good depth of understanding of what these modules do, when you might want to use which module, and how to use them for real world tasks. Before reading the book, I knew of these modules, but they were a rather intimidating pile. I'd used a few of them on occasion for rather limited projects, but was reluctant to invest the time required to read all of the documentation from the whole collection. Mountains of method-level documentation do not a tutorial make. This book takes all of that information, selects the most important parts, and ensures that those parts are covered in progressively more powerful and/or flexible examples.
If you know Perl and you're sick of 'working the web' to get information and you want the web to work for you instead, then you need this book. I had a personal project that was on the back burner for a couple of years because it just sounded too hard. The weekend after I finished this book, I wrote what I had previously thought to be the hard part of that project and it was both easy and fun. This book makes hard things not just possible, but actually easy.
14 of 14 people found the following review helpful
on August 7, 2002
As a web programmer, I had worked on several web automation projects and written simple crawlers even before I read "Perl & LWP". It was the first book I've read on the subject, and I'm by no means disappointed. The book is very well organized, very informative, and hits the nail on the head. I am pleased.
I noticed some inaccuracies in the discussions, and some chopped-off paragraphs and sentences. But this doesn't affect the usability of the book much. Author Sean Burke does a great job of walking one through most aspects of web automation and data extraction from the web using Perl and LWP (libwww in Perl).
The code the book gives is very well organized, well written, and easy to debug. The steps are pretty consistent across all the examples:
a) Inspect the HTML source code of the page;
b) Determine the tokens and patterns of interest;
c) Write the first code;
d) Fine-tune the code.
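To make those four steps concrete, here is a minimal sketch of the same workflow. It is not taken from the book: it is written in Python with the standard `html.parser` module rather than Perl and LWP, and the page source, the `headline` class, and the names `SAMPLE_HTML`, `HeadlineParser`, and `extract_headlines` are all made up for illustration.

```python
from html.parser import HTMLParser

# Step a) Inspect the HTML source. A hypothetical page is inlined here
# so the sketch runs without any network access.
SAMPLE_HTML = """
<html><body>
  <h2 class="headline"><a href="/story/1">First story</a></h2>
  <p>Filler text</p>
  <h2 class="headline"><a href="/story/2">Second   story</a></h2>
</body></html>
"""


class HeadlineParser(HTMLParser):
    """Step b) Pattern of interest: <a> tags inside <h2 class="headline">."""

    def __init__(self):
        super().__init__()
        self.in_headline = False   # True while inside a matching <h2>
        self.current_href = None   # href of the <a> currently open, if any
        self.headlines = []        # collected (href, text) pairs

    # Step c) First version of the code: track the tags we care about.
    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h2" and attrs.get("class") == "headline":
            self.in_headline = True
        elif tag == "a" and self.in_headline:
            self.current_href = attrs.get("href")

    def handle_data(self, data):
        if self.current_href is not None:
            # Step d) Fine-tuning: collapse runs of whitespace in link text.
            text = " ".join(data.split())
            if text:
                self.headlines.append((self.current_href, text))

    def handle_endtag(self, tag):
        if tag == "a":
            self.current_href = None
        elif tag == "h2":
            self.in_headline = False


def extract_headlines(html):
    """Return a list of (href, text) pairs for every headline link."""
    parser = HeadlineParser()
    parser.feed(html)
    parser.close()
    return parser.headlines


if __name__ == "__main__":
    for href, text in extract_headlines(SAMPLE_HTML):
        print(href, text)
```

On a real site the inlined string would be replaced by the fetched page, and step d) is where most of the repeat work goes as the page layout changes.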
As usual, I'll be commenting on individual chapters to give you an idea of the coverage of the book in more detail...
9 of 9 people found the following review helpful
on August 30, 2002
This book is a comprehensive and authoritative guide to web automation. It reads as both a gentle tutorial and a well organized reference. Basic HTTP operation, regexp HTML parsing, tokenizing, cookie authentication, form handling, and robot spidering are covered extensively in numerous case studies and practical examples.
Naturally, I was impressed by the simple, consistent treatment of examples: inspect source and find the interesting bits, code things up and then enhance to suit. :-)
What I found particularly satisfying is the sane way of working that the author assumes. So many people seem to just bungle their way through web programming while ignoring basics like the robots.txt file. This book helps to prevent this.
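For anyone unsure what honoring robots.txt involves, here is a minimal sketch. It is not the book's code: it uses Python's standard `urllib.robotparser` rather than the Perl WWW::RobotRules module, and the rules text, the `ExampleBot` agent name, the `example.com` URLs, and the helper names `build_rules` and `allowed` are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt, inlined so the sketch runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_rules(robots_txt):
    """Parse robots.txt text into a rules object."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


def allowed(rules, url, agent="ExampleBot"):
    """Return True if the given agent may fetch the URL under these rules."""
    return rules.can_fetch(agent, url)


if __name__ == "__main__":
    rules = build_rules(ROBOTS_TXT)
    for url in ("http://example.com/index.html",
                "http://example.com/private/data.html"):
        print(url, "allowed" if allowed(rules, url) else "blocked")
```

A well-behaved spider would run a check like `allowed()` before every request and skip any URL that is blocked.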
One would think that only a thick tome would be sufficient to cover such vast territory, but the author (who is an active LWP module developer) does a fabulous job covering this extensive subject matter.
I recommend this book both to anyone starting out on their way to working with the underside of the web and to accomplished professionals in need of a full reference manual.
8 of 8 people found the following review helpful
on July 11, 2002
A great book for anyone who wishes to automate daily tasks on the web. Sean does an outstanding job of showing how Perl can be used to extract and manipulate not just data but useful information efficiently from the web's vast data resources. I've already adapted an example from this book (link-checking spider) for sites I maintain. Yes, I've known of the LWP module prior to this book. But as a lazy programmer, I rely on others to show me the way. Sean does just that...
7 of 7 people found the following review helpful
on March 15, 2003
If you are unfamiliar with LWP and web scraping, or HTML parsing using tokens and trees, I strongly recommend this book. It's the best *introduction* to these topics I've been able to find. Sean's style is clear and concise, just what I expect from an O'Reilly book.
To get the most out of this book, you'll want to be familiar with object-oriented programming in Perl, because (with the exception of LWP::Simple) all the modules discussed in this book use objects.
Also, don't expect the LWP sample code in the book to work correctly. Many of the sites that the scripts try to "scrape" have changed their layout since this book was published, breaking the scripts. This isn't a problem though, because the samples Sean provides are very short and clear, so it's not necessary to run them in order to figure out how they work.
10 of 12 people found the following review helpful
on September 1, 2002
Disclaimer: The author is an online-type-friend and I used to work with the author of the foreword. I even got my copy for free.
If the above hasn't totally disqualified me from commenting, I just wanted to note some things most reviewers have ignored.
The book is an excellent resource for two kinds of people.
Many people scan technical books looking for little scripts and thingies; a few lines changed and BOOM! They have the program they always wanted. Sean provides those in abundance.
It is also a good resource for a complete novice to learn about the hodgepodge of technologies we call the web - the ... wire protocol, markup languages, tree-based parsers, and encodings, to name just a few. The author is an expert in all of these, but has restrained himself to provide just enough information to get a programmer going. I was impressed time and again with how he manages to give the reader exactly enough knowledge to get their tasks done, with short but accurate explanations and pointers on where to learn more.
Best of all, this is a funny technical book. Usually if a technical book has pretensions to humor, it jabs you in the arm repeatedly with lots of groaner puns and dumb cartoons, in order to fill the space between bland code sections. But Sean has sprinkled the *code sections* with his dada sense of humor, which also highlights the difference between mere placeholder data and the concept being illustrated. And then the text gets right back to the point.
This is a slim work (242 pages, no thicker than my thumb) but packs a lot of value for your money. So buy it already.
My only criticism is that it is exclusively focused on consuming services on the web - like downloading TV listings and so on. But you can use everything Sean talks about to also *publish* information; for instance, making some nifty Perl-based thing to update your online journal from MS Word or something. Or to aggregate information that's out there, and feed it back onto the web. Nevertheless, if you've got half a brain it will be obvious how to do this stuff once you've absorbed everything you'll get from this book.
3 of 3 people found the following review helpful
on July 13, 2007
This is not your typical clunker with endless pages of filler material. It gets right to the point. If you want to learn about using Perl to interact with the internet, this would be a good book to help you get there. I have purchased several Perl books that supposedly teach you how to write code for use with the internet, but they are difficult to understand, and most of the examples just don't work. This book is an exception to that trend. It is the only one I have found so far that has useable, workable examples. The subject matter is still challenging, but Burke is able to explain it enough to give you a clue. If you are looking for help in handling HTTP programmatically, then here is your book.
3 of 3 people found the following review helpful
on February 24, 2011
I needed to use LWP to interact with various web APIs; 'Perl & LWP' turned out to be exactly what I was looking for. Although it's dated, the book contains a wealth of information about the module and working with HTTP in Perl.
It is worth noting that in 2007, the book's author, Sean Burke, published the text of the book on his personal website at [...]. If you're thinking of purchasing the Kindle edition of this book (like I ended up doing), you may be better off using his site. Of course, if you want a physical copy of the book, Amazon is still a great way to go.
2 of 2 people found the following review helpful
on August 18, 2006
I bought this book to automatically gather information on Japanese stocks (for example, charts, price, volume, PER, PBR, ROE, ROA, news, and messages on the Yahoo! Japan BBS for stocks) from the web every day.
Somehow this book has not yet been translated into Japanese.
I think this book would sell very well if it were translated into Japanese. There is a lot of demand.
The book is self-contained in its coverage of the web, so you need only a little Perl knowledge and no prior knowledge of Internet protocols (HTTP) at all.
In most cases, all you need to do is make small modifications to an example program from the book for your own use.