Statistical Language Learning (Language, Speech, and Communication) Reprint Edition
This is a lovely book.―David Nye
- Publisher : A Bradford Book; Reprint edition (September 1, 1996)
- Language : English
- Paperback : 190 pages
- ISBN-10 : 0262531410
- ISBN-13 : 978-0262531412
- Reading age : 18 years and up
- Item Weight : 11.2 ounces
- Dimensions : 6 x 0.43 x 9 inches
- Best Sellers Rank: #3,305,639 in Books
Top reviews from the United States
The first chapter ("The Standard Model") is probably included just for comparison to the statistical model, so it's a bit surprising to find good coverage of chart parsing there.
If you are interested in some of the 3 topics mentioned above, consider buying this book. Don't forget that it was written in 1993 (so it's pretty old). For this reason (or maybe others) it is less well known than other similar writings in the field, so you may also surprise some of your colleagues with it :-)
I can still remember reading this book's preface so many years ago, and feeling the excitement! A new age is upon us--the old paradigms are crumbling, but not to worry! Statistical parsing will save the day! And Dr. Charniak gave us such an incredibly easy on-ramp: this book is fat-free, crystal-clear, and mercifully short. Ideal for busy professors and grad students, you too could change your paradigm in only 10 days of assiduous reading.
But in 2006, this book no longer sounds like a manifesto for a new era. Indeed, it sounds more like a survey and summary of the last 10 years' worth of research in statistical language processing. After so many years, statistical parsers are still only around 90% accurate in picking the best parse for a sentence. Rereading this book makes it PAINFULLY obvious that the field of NLP has been treading water for over a decade.
What could have gone so wrong??? I went back to the preface to find out, and my conclusion is that although Dr. Charniak accurately diagnosed the disease, he prescribed the wrong medicine.
What was the disease? Listen to how cogently he characterises what was wrong with NLP in 1993: "...language understanding depends on a lot of "real-world knowledge" ...But....the study of knowledge representation....is not going anywhere fast....AI has become notorious for the production of countless non-monotonic logics and almost as many logics of knowledge and belief, and none of the work shows any obvious application to actual knowledge representation problems".
The coup de grâce is delivered by Dr. Charniak as follows: "Thus many of us in AI-NLP have found ourselves in the position of basing our research on the successful completion of others' research--a completion that is looking more and more problematic." One would expect Dr. Charniak at this point to say something like "so let's all pitch in and give the knowledge-representation folks a hand for a while." But no. Instead, he suggests that we all start inducing statistical parsers.
Now let's step back for a moment and marvel at what a great strategy this is. If your current search space is exhausted, you have no choice--you must create a new search space for yourself. I'm reminded of the marketing plan for Altoids. The market for breath mints is incredibly crowded, so what's a newcomer to do? Create a _new_ space (strong mints) and give the consumer a reason to buy Altoids (they're curiously strong!). Dr. Charniak made a similar move when he wrote this book--create a new space (statistical parsing) where he could be number one (I wrote the book). Historically, this strategy worked flawlessly--statistical parsing has so dominated the field that you are hard-pressed to find a course at any university called "natural language processing" which doesn't deal exclusively with inducing grammars from corpora. It's not too much of an exaggeration to say that every NLP textbook written for 10 years after this book is just a retread of this slim volume, but with horrendous page counts and larded with extraneous techniques. Apologies to Dijkstra, but this book truly is an improvement on all of its successors.
So it's impossible to critique this book on clarity, subject matter, historical significance, prophetic powers, or ultimate success and influence on the field. Truly, Dr. Charniak was talking about a revolution. So what's not to like?
Well.......this book does an excellent job of identifying the problem: our knowledge representation methods are weak. Where I fault this book is that it doesn't present a solution to the problem it identifies. Statistical parsing has absolutely no hope of helping out with the knowledge representation problem. Say current statistical parsers were more than 90% accurate--say they were 100% accurate, i.e. we can parse any dang sentence. The knowledge representation problem remains. A simple example: given the two sentences "There are three chickens. Every chicken has two legs.", can you write a program which can answer the question "How many legs are there?" Does statistical parsing help? Parsing each of those sentences is no problem, even for the nonstatistical parsers available in 1993. There is no syntactic ambiguity at all, but even if you have the correct parse trees, those two sentences bristle with semantic difficulties. How do we represent the plurals (three chickens, two legs)? How do we know that "every chicken" ranges over the chickens described by "three chickens"? How do we go from parse trees to the multiplication problem "3 chickens times 2 legs per chicken" to yield the desired answer "6 legs"? It seems to me that although millions of dollars and millions of person-hours have been spent in NLP in the 13 years since this book was published, almost no effort has gone into answering these sorts of semantic problems--the problems which this book so eloquently identifies as what is holding up NLP research.
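To make the chicken example concrete, here is a minimal Python sketch of what answering the question might look like once the semantics are already in hand. Everything in it is an assumption for illustration: the names `Exists`, `EveryHas`, and `count_parts` are hypothetical, and the two logical forms are written by hand rather than derived from any parse tree. That is exactly the point of the example: the arithmetic is trivial, and the hard part (treating plurals as cardinalities, and letting "every chicken" range over the three chickens introduced earlier) is done by the programmer, not by a parser.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exists:
    """Hand-written meaning of 'There are <count> <kind>s.' (hypothetical form)."""
    kind: str
    count: int


@dataclass(frozen=True)
class EveryHas:
    """Hand-written meaning of 'Every <owner> has <count> <part>s.' (hypothetical form)."""
    owner: str
    part: str
    count: int


def count_parts(facts, part):
    """Return how many `part` entities the facts license."""
    # Plurals are represented as plain cardinalities per kind.
    population = {f.kind: f.count for f in facts if isinstance(f, Exists)}
    total = 0
    for f in facts:
        if isinstance(f, EveryHas) and f.part == part:
            # Assumption: the universal ranges over the individuals
            # introduced by an earlier Exists fact of the same kind.
            total += population.get(f.owner, 0) * f.count
    return total


# "There are three chickens. Every chicken has two legs."
facts = [Exists("chicken", 3), EveryHas("chicken", "leg", 2)]
legs = count_parts(facts, "leg")
print(legs)  # 3 chickens * 2 legs per chicken
```

Nothing here touches syntax at all. Getting from the correct parse trees to the two `facts` entries is the step that remains unaddressed.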
13 years is a long time. I've gone from a young and good-looking grad student to a bald 40-year-old since this book was published, and I do believe that this whole endeavor of inducing statistical grammars from corpora is also showing its age. 13 years is longer than we gave to neural networks or to unification grammars to prove their worth. It's time for a new paradigm.
And I can't think of anybody better than Dr. Charniak to once again show us the way! How I would love to see another slim little book from him--another slim little book which will change the way NLP is done for years to come!