- Series: Adaptive Computation and Machine Learning series
- Hardcover: 800 pages
- Publisher: The MIT Press (November 18, 2016)
- Language: English
- ISBN-10: 0262035618
- ISBN-13: 978-0262035613
- Product Dimensions: 7 x 1 x 9 inches
- Shipping Weight: 2.9 pounds
- Average Customer Review: 109 customer reviews
- Amazon Best Sellers Rank: #943 in Books
Deep Learning (Adaptive Computation and Machine Learning series) Hardcover – November 18, 2016
Written by three experts in the field, Deep Learning is the only comprehensive book on the subject. It provides much-needed broad perspective and mathematical preliminaries for software engineers and students entering the field, and serves as a reference for authorities. (Elon Musk, cochair of OpenAI; cofounder and CEO of Tesla and SpaceX)
This is the definitive textbook on deep learning. Written by major contributors to the field, it is clear, comprehensive, and authoritative. If you want to know where deep learning came from, what it is good for, and where it is going, read this book. (Geoffrey Hinton FRS, Emeritus Professor, University of Toronto; Distinguished Research Scientist, Google)
Deep learning has taken the world of technology by storm since the beginning of the decade. There was a need for a textbook for students, practitioners, and instructors that includes basic concepts, practical aspects, and advanced research topics. This is the first comprehensive textbook on the subject, written by some of the most innovative and prolific researchers in the field. This will be a reference for years to come. (Yann LeCun, Director of AI Research, Facebook; Silver Professor of Computer Science, Data Science, and Neuroscience, New York University)
[T]he AI bible... the text should be mandatory reading by all data scientists and machine learning practitioners to get a proper foothold in this rapidly growing area of next-gen technology. (Daniel D. Gutierrez, insideBIGDATA)
About the Author
Ian Goodfellow is Research Scientist at OpenAI. Yoshua Bengio is Professor of Computer Science at the Université de Montréal. Aaron Courville is Assistant Professor of Computer Science at the Université de Montréal.
Top customer reviews
Why? Because this book also makes very clear, and is completely honest about the fact, that neural networks are a 'folk' technology (though the authors do not use those words). Neural networks work (in fact they work unbelievably well, at least, as Geoffrey Hinton himself has remarked, given unbelievably powerful computers), but the underlying theory is very limited and there is no reason to think that it will become less limited. The lack of a theory means that there is no convincing 'gradient', to use an appropriate metaphor, for future development. A constant theme here is that 'this works better than that' for practical reasons, not for underlying theoretical reasons. Neural networks are engineering, not applied mathematics, and this is very much, and very effectively, an engineer's book.
Bad mistake. Only a few of the reviews clearly state the obvious problems with this book. Oddly enough, these informative
reviews tend to attract aggressively negative comments. There is a surprising disconnect between the majority of positive
reviews of this book and the reality of how it is written. This sometimes happens with a first book in an important area or
with pioneer authors who have a cult following.
First of all, it is not clear who the audience is: the writing does not provide details at the level one
expects from a textbook. It also does not provide a good overview ("big picture thinking"). Advanced readers
would not gain much either, because it is too superficial when it comes to the advanced topics (the final 35% of the book).
More than half of this book reads like the bibliographic notes section of a book, with
(horrendously incomplete) portions of the math thrown in at places. In other words, these portions read
like a prose description of a bibliography, with equations thrown in for annotation. In several chapters, the level of
detail is closer to an expanded ACM Computing Surveys article than to a textbook.
At the other extreme of audience expectation, we have a review of linear algebra at the beginning,
which is a waste of useful space that could have been spent on actual explanations in other
chapters. If you don't know linear algebra already, you cannot really hope to follow
anything (especially given the way the book is written). In any case, the amount of linear
algebra introduced in that chapter is too little to be of much use, so who is it for?
As a practical matter, Part I of the book is mostly redundant or off-topic for a neural network book
(it contains linear algebra, probability, and so on),
and Part III is written in a superficial way, so only a third of the book is remotely useful.
Other than a chapter on optimization algorithms (a good description of algorithms like
Adam), I do not see even a single chapter that has done a half-decent job of presenting
algorithms with the proper conceptual framework. The presentation style is unnecessarily terse
and dry, and is stylistically closer to a research paper than to a book.
It is understood that any machine learning book would have some mathematical sophistication, but the
main problem is a lack of concern on the part of the authors for readability and an inability to
put themselves in the reader's shoes (surprisingly enough, some defensive responses to negative reviews tend to place
blame on math-phobic readers). At the end of the day, it is the authors' responsibility to make
notational and organizational choices that are likely to maximize understanding.
Good mathematicians have excellent manners when choosing notation (you don't use nested
subscripts/superscripts/functions if you possess the clarity to do it more simply).
And no, math equations are not the same as algorithms; they are only a small part of one. Where is the rest?
Where is the algorithm described? Where is the conceptual framework?
Where is the intuition? Where is the pseudocode? Where are the illustrations? Where are the examples?
No, I am not asking for recipes or Python code. Just some decent writing, details, and explanations.
The sections on applications, LSTMs, and convolutional neural networks are hand-wavy in places and
read like "you can do this to achieve that." It is impossible to fully reconstruct the methods from the description provided.
A large part of the book (including restricted Boltzmann machines)
is so tightly integrated with probabilistic graphical models (PGM) that it loses its neural network focus.
This portion is also in the latter part of the book, which is written in a rather superficial way, and
it therefore implicitly creates another prerequisite: being very familiar with PGM (sort-of knowing it wouldn't be enough).
Keep in mind that the PGM view of neural networks is not the dominant view today, from either a practitioner
or a research point of view. So why the focus on PGM, if they don't have the space to elaborate?
On the one hand, the authors make a futile attempt at promoting accessibility by discussing redundant
prerequisites like basic linear algebra and probability. On the other hand, the PGM-heavy approach implicitly
extends the prerequisites to include an even more advanced machine learning topic than neural networks
(with a 1200+ page book of its own). What the authors are doing is the equivalent of trying to teach someone
how to multiply two numbers as a special case of tensor multiplication. Even for RNNs with deterministic hidden states,
they feel the need to couch them as graphical models. It is useful to connect areas, but mixing them
is a bad idea. Look at Hinton's course. It explains the connection between Boltzmann machines and PGM
very nicely, but one can easily follow RBMs without having to bear the constant burden of a PGM-centric view.
One factor that I think played a role in these strategic errors of judgement is that the
lead author is a fresh PhD graduate. There is no substitute for experience when it comes to maturity
in writing ability (irrespective of how good a researcher someone is). Mature writers have the ability to put
themselves in the reader's shoes and have a good sense of what is conceptually important. The
authors clearly miss the forest for the trees, with chapter titles like "Confronting
the partition function." The book shows that being a first book in an important area, with the name of
a pioneer author on it, is not necessarily a qualification for being considered a good book.
I am not hesitant to call it out. The emperor has no clothes.
a must-have if you're interested in the field