Buy new: $38.99
Buy used: $32.75
Deep Learning from Scratch: Building with Python from First Principles 1st Edition
With the resurgence of neural networks in the 2010s, deep learning has become essential for machine learning practitioners and even many software engineers. This book provides a comprehensive introduction for data scientists and software engineers with machine learning experience. You'll start with deep learning basics and move quickly to the details of important advanced architectures, implementing everything from scratch along the way.
Author Seth Weidman shows you how neural networks work using a first principles approach. You'll learn how to apply multilayer neural networks, convolutional neural networks, and recurrent neural networks from the ground up. With a thorough understanding of how neural networks work mathematically, computationally, and conceptually, you'll be set up for success on all future deep learning projects.
This book provides:
- Extremely clear and thorough mental models, accompanied by working code examples and mathematical explanations, for understanding neural networks
- Methods for implementing multilayer neural networks from scratch, using an easy-to-understand object-oriented framework
- Working implementations and clear-cut explanations of convolutional and recurrent neural networks
- Implementation of these neural network concepts using the popular PyTorch framework
- ISBN-10: 935213902X
- ISBN-13: 978-9352139026
- Edition: 1st
- Publisher: O'Reilly Media
- Publication date: October 15, 2019
- Language: English
- Dimensions: 7 x 0.53 x 9.19 inches
- Print length: 250 pages
From the brand
Sharing the knowledge of experts
O'Reilly's mission is to change the world by sharing the knowledge of innovators. For over 40 years, we've inspired companies and individuals to do new things (and do them better) by providing the skills and understanding that are necessary for success.
Our customers are hungry to build the innovations that propel the world forward. And we help them do just that.
From the Publisher
From the Preface
If you’ve tried to learn about neural networks and deep learning, you’ve probably encountered an abundance of resources, from blog posts to MOOCs (massive open online courses, such as those offered on Coursera and Udacity) of varying quality and even some books—I know I did when I started exploring the subject a few years ago. However, if you’re reading this preface, it’s likely that each explanation of neural networks that you’ve come across is lacking in some way. I found the same thing when I started learning: the various explanations were like blind men describing different parts of an elephant, but none describing the whole thing. That is what led me to write this book.
These existing resources on neural networks mostly fall into two categories. Some are conceptual and mathematical, containing both the drawings one typically finds in explanations of neural networks, of circles connected by lines with arrows on the ends, and extensive mathematical explanations of what is going on so you can "understand the theory."
Other resources have dense blocks of code that, if run, appear to show a loss value decreasing over time and thus a neural network “learning.”
Explanations like this, of course, don't give much insight into "what is really going on": the underlying mathematical principles, the individual neural network components such code contains and how they work together, and so on.
What would a good explanation of neural networks contain? For an answer, it is instructive to look at how other computer science concepts are explained: if you want to learn about sorting algorithms, for example, there are textbooks that will contain:
- An explanation of the algorithm, in plain English
- A visual explanation of how the algorithm works, of the kind that you would draw on a whiteboard during a coding interview
- Some mathematical explanation of “why the algorithm works”
- Pseudocode implementing the algorithm
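For instance, the pseudocode element might look like the following minimal Python sketch of insertion sort (an illustration added here, not taken from the preface):

```python
def insertion_sort(items):
    """Sort a list in place by growing a sorted prefix one element at a time."""
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        # Shift larger elements of the sorted prefix right until current fits.
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current
    return items

print(insertion_sort([3, 1, 4, 1, 5]))  # [1, 1, 3, 4, 5]
```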
One rarely, if ever, finds these elements of an explanation of neural networks side by side, even though it seems obvious to me that a proper explanation of neural networks should be done this way; this book is an attempt to fill that gap.
Product details
- ASIN: 1492041416
- Publisher: O'Reilly Media; 1st edition (October 15, 2019)
- Language: English
- Paperback: 250 pages
- ISBN-10: 935213902X
- ISBN-13: 978-9352139026
- Item Weight: 14.4 ounces
- Dimensions: 7 x 0.53 x 9.19 inches
- Best Sellers Rank: #475,024 in Books
- #194 in Computer Neural Networks
- #555 in Python Programming
- #802 in Artificial Intelligence & Semantics
Customer reviews
Top reviews
Top reviews from the United States
You will want to check the author's GitHub repo for this book. In some later chapters, parts of the code are omitted from print in the book but are present in the GitHub repo. There are also some minor corrections and updates that have been made to the repo since the book was printed. This was an excellent resource for me.
- An explanation of how Jacobian tensors are used to compute partial derivatives of matrix transformations (best placed in an appendix) would clarify how the back-propagation equations in the book were obtained.
* e.g., computing full Jacobians in code would use up significant computer resources; therefore, rather than computing them for back-propagation, we use only their essential values (hence the partial derivatives in the code examples).
* Jacobians make the concept of backpropagation intuitive to understand.
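To make the reviewer's point concrete: for a dense layer Y = XW, the full Jacobian of Y with respect to X is a 4-D tensor, but back-propagation only ever needs its product with the upstream gradient, and that product collapses to an ordinary matrix multiplication. A minimal NumPy sketch (my own illustration with made-up shapes, not code from the book):

```python
import numpy as np

# Hypothetical shapes for illustration: batch of 32, 10 inputs, 5 outputs.
X = np.random.randn(32, 10)
W = np.random.randn(10, 5)
dY = np.random.randn(32, 5)   # upstream gradient of the loss w.r.t. Y = X @ W

# The full Jacobian dY/dX would have shape (32, 5, 32, 10); we never build it.
# Its product with dY reduces to a single matrix multiplication:
dX = dY @ W.T                 # gradient of the loss w.r.t. X, shape (32, 10)
dW = X.T @ dY                 # gradient of the loss w.r.t. W, shape (10, 5)
```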
How did this get published? O'Reilly used to mean quality, not "we didn't proofread anything, the FIRST diagram is wrong, and the code doesn't work". The errata forum contains tons of reported errors, not one of which has been acknowledged or corrected in the years since they were reported.
The book MIGHT have good content, but I'd have to read it online (not why I bought a paper book) and trust that what I read wasn't riddled with errors.
Skip this book. Find something where the author and publisher give a damn.
Top reviews from other countries
For instance, the code on page 150 suddenly introduces a variable named 'fil' that has never been used before, neither in the book nor in the GitHub repo. The code from Chapter 5 onwards does not account for the 'inference' variable that is introduced earlier. The CNN code isn't even written as part of a class; 'self' is missing basically everywhere. In addition, arguments of functions like '_output' are not consistent with preceding code shown in the book. Another terrible example is the code on page 154. It's simply a mess: it isn't consistent with the Layer class that is used, arguments are missing [for instance in super().__init__()], the variable name for the convolution operation (which is actually introduced on the page before) is wrong, and more.
Don't get me wrong, I understand that you cannot explain every single line of code and that some basic functions are not shown over and over again (for instance, dropout and weight_init are not shown in the Conv2D class on page 154 but are used to define the model on the next page, which is fine because they are introduced earlier in the book), BUT I expect variable names and function arguments to be consistent, AT LEAST with preceding code shown in the very same chapter.
This book sets out to:
1) Instruct the reader in the mathematics involved in deep learning in a clear, concise, and comprehensive manner.
2) Expound on the concepts and theories involved in neural networks and deep learning models through Python code and visual aids such as diagrams.
3) Illustrate how to build neural networks and deep learning models from scratch.
Chapter 1) Foundations: Chapter one touches on calculus (the chain rule, derivatives), linear algebra (vectors, matrices & operations), and nested functions. The author works through an example and elucidates the mathematics involved, walks through the code line by line, and provides visual aids such as diagrams to illustrate the process flow of the concepts.
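In the spirit of that chapter, the derivative of a nested function can be computed numerically via the chain rule; the sketch below uses central finite differences (function names are my own and may differ from the book's):

```python
import numpy as np

def deriv(func, x, delta=1e-3):
    """Approximate the derivative of func at each element of x
    with a central finite difference."""
    return (func(x + delta) - func(x - delta)) / (2 * delta)

def chain_deriv(f2, f1, x):
    """Derivative of the nested function f2(f1(x)) via the chain rule:
    f2'(f1(x)) * f1'(x)."""
    return deriv(f2, f1(x)) * deriv(f1, x)

x = np.array([0.5, 1.0, 2.0])
# d/dx sin(x)^2 = 2*sin(x)*cos(x); the numerical result should match closely.
print(chain_deriv(np.square, np.sin, x))
print(2 * np.sin(x) * np.cos(x))
```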
Chapter 2) Fundamentals: Chapter two extends the mathematical principles, deep learning concepts, and explanations of chapter one. Here the reader is exposed to building a traditional linear regression model from scratch, as well as building a neural network. The author gives sufficient explanation of why the neural network model is able to give more accurate predictions of the 'target' from the 'features'.
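To give a flavor of the from-scratch approach described here, one gradient-descent step for linear regression might look like the following (a sketch of the general technique, not the book's exact code):

```python
import numpy as np

def linear_regression_step(X, y, W, b, lr=0.01):
    """One gradient-descent step for linear regression with MSE loss.
    X: (n, k) features, y: (n,) targets, W: (k,) weights, b: scalar bias."""
    preds = X @ W + b                     # forward pass
    loss = np.mean((preds - y) ** 2)      # mean squared error
    dpreds = 2 * (preds - y) / len(y)     # backward pass through the loss
    dW = X.T @ dpreds                     # gradient w.r.t. the weights
    db = dpreds.sum()                     # gradient w.r.t. the bias
    return W - lr * dW, b - lr * db, loss
```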
Chapter 3) Deep Learning from Scratch: In chapter three, the reader gets to learn about 'layers', 'operations', and 'classes'. Towards the end of the chapter, the reader sees the various pieces of the deep learning process integrated into a deep learning example built from scratch.
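A simplified sketch of the kind of object-oriented framework the review describes, where each operation knows its forward output and its input gradient (this is my reconstruction following the review's description, not the book's verbatim code):

```python
import numpy as np

class Operation:
    """Base class: subclasses define _output and _input_grad."""
    def forward(self, input_):
        self.input_ = input_
        self.output = self._output()
        return self.output

    def backward(self, output_grad):
        # Chain the upstream gradient through this operation.
        return self._input_grad(output_grad)

class Sigmoid(Operation):
    def _output(self):
        return 1.0 / (1.0 + np.exp(-self.input_))

    def _input_grad(self, output_grad):
        # d(sigmoid)/dx = sigmoid * (1 - sigmoid)
        return self.output * (1.0 - self.output) * output_grad
```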
Chapter 4) Extensions: Chapter four starts off with the 'loss function'; the chapter also covers activation functions other than the 'Sigmoid' and explains why these activation functions might accelerate learning. Next, the chapter covers 'momentum' and illustrates that momentum is the most important extension of the stochastic gradient descent optimization technique. The chapter then briefly discusses three essential techniques, namely: (1) learning rate decay, (2) weight initialization, and (3) dropout. The reader will learn how each of these techniques enables the neural network to find successively more optimal solutions. The concepts in this chapter are elucidated by breaking them down into three aspects: (1) Math, (2) Intuition, and (3) Code.
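The momentum update described here fits in a few lines; a minimal sketch (my own, with conventional parameter names):

```python
def sgd_momentum_update(param, grad, velocity, lr=0.01, momentum=0.9):
    """SGD with momentum: the velocity accumulates a decaying sum of past
    gradients, smoothing the direction of each parameter update."""
    velocity = momentum * velocity + grad
    param = param - lr * velocity
    return param, velocity
```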
Chapter 5) Convolutional Neural Networks: In chapter five, the reader will learn about 'convolutional operations' and 'feature maps'. The author elaborates on the processes involved in the 'multichannel convolution operation', explaining that each feature map is the set of features detected by a particular set of weights. In another section, the author further elaborates that each feature map "is a linear combination of convolving m1 different filters over that same location in each of the corresponding m1 feature maps from the prior layer." The author also expounds on the concepts of 'convolutional filters' and 'convolutional layers', and illustrates the operations of '1D convolutions with batches: forward pass & backward pass' before moving on to 2D convolutions and the code involved in '2D convolutions: forward pass & backward pass'. The author then explains that the three Python functions '_output', '_input_grad', & '_param_grad' are the functions needed to create a 'Conv2DOperation', which forms the core of the 'Conv2DLayers' used in the CNNs illustrated in the book.
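The 1D convolution forward pass the chapter builds up to can be sketched as follows (an illustrative reconstruction, not the book's code; zero padding is chosen so the output length matches the input's):

```python
import numpy as np

def conv_1d(inp, filt):
    """1D convolution with zero padding: each output element is the dot
    product of the filter with a window of the padded input."""
    pad = len(filt) // 2
    padded = np.concatenate([np.zeros(pad), inp, np.zeros(pad)])
    out = np.zeros_like(inp, dtype=float)
    for i in range(len(inp)):
        out[i] = np.dot(padded[i:i + len(filt)], filt)
    return out

print(conv_1d(np.array([1., 2., 3., 4.]), np.array([1., 1., 1.])))
# [3. 6. 9. 7.]
```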
Chapter 6) Recurrent Neural Networks: In chapter six, the reader will learn that 'Recurrent Neural Networks' (RNNs) are a class of neural network architectures meant for handling sequences of data, designed to take in such sequences and return a correct prediction as output. In this chapter, the author uncovers a key limitation of the framework the book has been using so far, namely handling branching. The author touches on 'Automatic Differentiation' and shows how 'Gradient Accumulation' works in Python code. Later in the chapter, the author reintroduces RNNs, illustrating the 'First Class for RNNs: RNNLayer' and the 'Second Class for RNNs: RNNNodes' before putting the two classes together. The RNN is implemented in code, and the forward and backward methods are expounded upon. In the next part of the chapter, the author covers 'Vanilla RNNNodes' and their limitations, and illustrates two advanced variants of the vanilla RNN: (1) 'Gated Recurrent Units' (GRU) and (2) 'Long Short-Term Memory' (LSTM). Chapter six has no lack of visual illustrations, flowcharts, and diagrams of the workings of RNNs.
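The core of the vanilla RNN the chapter describes is a single recurrent step: the new hidden state mixes the current input with the previous hidden state. A minimal sketch (shapes and names here are illustrative assumptions, not the book's classes):

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One forward step of a vanilla RNN."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

# Illustrative sizes: input dim 3, hidden dim 4, a sequence of 5 steps.
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(3, 4))
W_hh = rng.normal(size=(4, 4))
b_h = np.zeros(4)
h = np.zeros(4)
for x_t in rng.normal(size=(5, 3)):
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)  # hidden state carries history
```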
Chapter 7) PyTorch: In chapter seven, the author mainly covers PyTorch, starting off with 'Tensors' and deep learning with PyTorch, then moving to the elements of PyTorch, namely: model, layer, optimizer, & loss. The author then goes through examples of neural networks implemented with PyTorch. Next, the author touches upon 'Convolutional Neural Networks' (CNNs) and 'Long Short-Term Memory' (LSTM) using PyTorch. In the later parts of the chapter, the author illustrates 'Unsupervised Learning', with autoencoders used as an example.
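The model/layer/optimizer/loss decomposition the chapter covers maps directly onto PyTorch's API; a minimal training step (my own example with dummy data, not the book's code) looks like this:

```python
import torch
from torch import nn

# A model built from layers, plus an optimizer and a loss function.
model = nn.Sequential(nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

X = torch.randn(64, 28 * 28)         # dummy batch of inputs
y = torch.randint(0, 10, (64,))      # dummy class labels

optimizer.zero_grad()
loss = loss_fn(model(X), y)
loss.backward()                      # autograd fills in all gradients
optimizer.step()                     # one parameter update
```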
APPENDIX A) Deep Dive: In this section, the author fills in gaps in the explanations of mathematical concepts illustrated in the book, such as the 'Matrix Chain Rule'. The author supplements the explanation of the 'Gradient of the Loss with Respect to the Bias Terms', and also adds a section on how to implement 'Convolutions via Matrix Multiplication' efficiently in NumPy.
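The idea behind convolutions via matrix multiplication is to stack the sliding windows into rows of a matrix, after which the whole convolution becomes one matmul. A 1D sketch of the trick (my illustration; the appendix treats the more general case):

```python
import numpy as np

def conv_1d_via_matmul(inp, filt):
    """1D convolution (no padding) as a matrix multiplication: each row
    of `windows` is one receptive field, so the whole convolution is
    a single matrix-vector product."""
    n_out = len(inp) - len(filt) + 1
    windows = np.stack([inp[i:i + len(filt)] for i in range(n_out)])
    return windows @ filt

print(conv_1d_via_matmul(np.array([1., 2., 3., 4.]), np.array([1., 1., 1.])))
# [6. 9.]
```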
There is no lack of mathematical illustration, diagrams of concepts and the process flow of neural networks and deep learning, or lines of code implementing deep learning from scratch in this book. The book is readily comprehensible for the lay person, and I would recommend it to anyone interested in deep learning and neural networks. Also, check the website for this book as well as the author's GitHub page, and try to implement the code, modifying it a little to supplement your understanding of the whole deep learning process. Overall, although this book is short in length and a lot of concepts could be further illustrated, I would rate it a 5/5.