- Series: Adaptive Computation and Machine Learning series
- Hardcover: 322 pages
- Publisher: A Bradford Book; 1st Edition (March 1, 1998)
- Language: English
- ISBN-10: 0262193981
- ISBN-13: 978-0262193986
- Product Dimensions: 7 x 0.8 x 9 inches
- Shipping Weight: 1.8 pounds
- Average Customer Review: 27 customer reviews
- Amazon Best Sellers Rank: #44,340 in Books
Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning) 1st Edition
This is a highly intuitive and accessible introduction to the recent major developments in reinforcement learning, written by two of the field's pioneering contributors. (Dimitri P. Bertsekas and John N. Tsitsiklis, Professors, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology)
This book not only provides an introduction to learning theory but also serves as a tremendous source of ideas for further development and applications in the real world. (Toshio Fukuda, Nagoya University, Japan; President, IEEE Robotics and Automation Society)
Reinforcement learning has always been important in the understanding of the driving force behind biological systems, but in the last two decades it has become increasingly important, owing to the development of mathematical algorithms. Barto and Sutton were the prime movers in leading the development of these algorithms and have described them with wonderful clarity in this new text. I predict it will be the standard text. (Dana Ballard, Professor of Computer Science, University of Rochester)
The widely acclaimed work of Sutton and Barto on reinforcement learning applies some essentials of animal learning, in clever ways, to artificial learning systems. This is a very readable and comprehensive account of the background, algorithms, applications, and future directions of this pioneering and far-reaching work. (Wolfram Schultz, University of Fribourg, Switzerland)
About the Author
Richard S. Sutton is Senior Research Scientist, Department of Computer Science, University of Massachusetts.
Top customer reviews
The authors define reinforcement learning as learning how to map situations to actions so as to maximize a numerical reward. A machine engaged in reinforcement learning discovers on its own which actions optimize the reward by trying them out. This ability to learn from experience distinguishes it from a machine engaged in supervised learning, which needs labeled examples to guide it to the proper concept or knowledge. The authors emphasize the "exploration-exploitation" tradeoff that reinforcement-learning machines must deal with as they interact with the environment.
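The exploration-exploitation tradeoff the review mentions can be sketched with a simple epsilon-greedy bandit. This is a hypothetical illustration, not code from the book; the function name, action-value setup, and Gaussian reward model are all assumptions:

```python
import random

def epsilon_greedy_bandit(true_means, epsilon=0.1, steps=1000, seed=0):
    """Estimate action values from experience, mostly exploiting the
    best-looking action but exploring a random one with probability epsilon."""
    rng = random.Random(seed)
    n_actions = len(true_means)
    estimates = [0.0] * n_actions   # incremental sample-average estimates
    counts = [0] * n_actions
    for _ in range(steps):
        if rng.random() < epsilon:                        # explore
            action = rng.randrange(n_actions)
        else:                                             # exploit
            action = max(range(n_actions), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[action], 1.0)       # noisy reward signal
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates
```

With no labeled examples at all, the agent's value estimates come purely from the rewards its own action choices produce, which is the distinction from supervised learning drawn above.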
For the authors, a reinforcement learning system consists of a `policy', a `reward function', a `value function', and a `model' of the environment. A policy is a mapping from the states of the environment that are perceived by the machine to the actions to be taken in those states. The reward function maps each perceived state of the environment to a number (the reward). A value function specifies what is good for the machine over the long run. A model, as the name implies, is a representation of the behavior of the environment. The authors emphasize that all of the reinforcement learning methods discussed in the book are concerned with the estimation of value functions, but they point out that other techniques are available for solving reinforcement learning problems, such as genetic algorithms and simulated annealing.
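The four components listed above might be sketched as a minimal container. This is purely illustrative: the class name, state names, and representation choices (tabular policy and value function, deterministic model) are assumptions, not anything from the book:

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

@dataclass
class RLAgentComponents:
    """Hypothetical bundle of the four parts of a reinforcement learning system."""
    policy: Dict[str, str]                 # perceived state -> action to take
    reward_fn: Callable[[str], float]      # perceived state -> immediate reward
    value_fn: Dict[str, float]             # state -> estimated long-run return
    model: Dict[Tuple[str, str], str]      # (state, action) -> predicted next state

# A one-step toy world: from "s0", moving "right" reaches the rewarding "goal".
agent = RLAgentComponents(
    policy={"s0": "right"},
    reward_fn=lambda s: 1.0 if s == "goal" else 0.0,
    value_fn={"s0": 0.0, "goal": 0.0},
    model={("s0", "right"): "goal"},
)
```

The reward function gives only immediate desirability, while the value function is meant to capture long-run desirability, which is why the book's methods focus on estimating the latter.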
The authors use dynamic programming, Monte Carlo simulation, and temporal-difference learning to solve the reinforcement learning problem, but they emphasize that none of these methods offers a free lunch. An entire chapter is devoted to each of them, however, giving the reader a good overview of the strengths and weaknesses of each approach. The differences between them usually boil down to issues of performance rather than accuracy of the generated solutions. Temporal-difference learning is in fact viewed in the book as a combination of Monte Carlo and dynamic programming techniques, and, in the opinion of this reviewer, has produced some of the most impressive successes among applications based on reinforcement learning. One of these is TD-Gammon, developed to play backgammon, which is also discussed in the book.
The authors emphasize that these three main strategies for solving reinforcement learning problems are not mutuallyexclusive. Instead, each can be used in combination with the others, and they devote a few chapters to illustrating how this "unified" approach can be advantageous for reinforcement learning problems. They do this with explicit algorithms, not just philosophical discussion. These discussions are very interesting and beautifully illustrate the idea that there is no "free lunch" among the different algorithms involved in reinforcement learning.
In the last chapter of the book the authors survey some of the more successful applications of reinforcement learning, one of which has already been mentioned. Another is the `acrobot', a two-link, underactuated robot that models, to some extent, the motion of a gymnast on a high bar. The control goal is to swing the acrobot's tip above the first joint, with appropriate rewards given until this goal is reached. The authors use the `Sarsa' learning algorithm, developed earlier in the book, to solve this reinforcement learning problem. The acrobot exemplifies the current intense interest in machine learning of physical motion and intelligent control theory.
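The Sarsa update used for the acrobot can be sketched as one tabular step. Sarsa is on-policy: its target uses the action the agent actually takes next, not the greedy one. The toy states, actions, and parameters here are illustrative assumptions, not the acrobot setup:

```python
def sarsa_update(Q, s, a, r, s2, a2, alpha=0.1, gamma=0.99):
    """On-policy Sarsa step on the quintuple (s, a, r, s2, a2):
    nudge Q(s, a) toward r + gamma * Q(s2, a2)."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s2, a2)] - Q[(s, a)])
    return Q

# One illustrative step on a tiny toy table:
Q = {("s", "up"): 0.0, ("s2", "up"): 2.0}
sarsa_update(Q, "s", "up", 1.0, "s2", "up")   # Q[("s","up")] is now ~0.298
```

Because the next action `a2` comes from the behavior policy itself, Sarsa learns the value of the policy it is actually following, exploration included.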
Another example discussed in this chapter is elevator dispatching, which the authors include as an example of a problem that cannot be dealt with efficiently by dynamic programming. The problem is studied with Q-learning and a neural network trained by backpropagation.
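For contrast with Sarsa, a one-step tabular Q-learning update can be sketched as follows. Q-learning is off-policy: its target bootstraps from the best next action regardless of what the agent actually does next. The states, actions, and parameters are again illustrative, not from the elevator study (which used a neural network rather than a table):

```python
def q_learning_update(Q, s, a, r, s2, actions, alpha=0.1, gamma=0.99):
    """Off-policy Q-learning step: nudge Q(s, a) toward
    r + gamma * max over a2 of Q(s2, a2)."""
    best_next = max(Q[(s2, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
    return Q

# Toy example: the target uses the better next action ("R", value 3.0).
Q = {("s", "L"): 0.0, ("s", "R"): 0.0, ("s2", "L"): 1.0, ("s2", "R"): 3.0}
q_learning_update(Q, "s", "L", 0.0, "s2", ["L", "R"])   # Q[("s","L")] is now ~0.297
```

The `max` over next actions is the only difference from the Sarsa rule, but it lets Q-learning estimate the optimal values even while following an exploratory policy.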
The authors also treat a problem of great importance in the cellular phone industry, namely that of dynamic channel allocation. This problem is formulated as a semi-Markov decision problem, and reinforcement learning techniques were used to minimize the probability of blocking a call. Reinforcement learning has become very important in the communications industry of late, as well as in queuing networks.
However, in my opinion it is neither well structured nor well written. The book has no clear separation between the theory and the examples given to demonstrate its applications, so the theoretical ideas are blurred instead of clarified. After going through the examples it is always possible to figure out how things work, but this should not be necessary.
After reading this book you will definitely know the basics (and more) about reinforcement learning. However, I somehow expected more because of the names of the authors. Perhaps this is not only a problem of this book but of the field of reinforcement learning itself.
The book provides numerous step-by-step algorithms that make it relatively easy to get started writing your own. The presentation uses minimal mathematics and avoids the difficult theory supporting the convergence proofs, making it a nice introduction for undergraduates and graduates alike. Throughout the presentation, though, there is evidence of extensive experience applying these methods to a range of classical problems in artificial intelligence.
Students interested in a stronger theoretical foundation should look at Neuro-Dynamic Programming (Optimization and Neural Computation Series, 3). My recent book, Approximate Dynamic Programming: Solving the Curses of Dimensionality (Wiley Series in Probability and Statistics), puts far more emphasis on mathematical modeling, and presents the field more from the perspective of the operations research community. For an edited volume with a number of contributions from both artificial intelligence and control theory, see Handbook of Learning and Approximate Dynamic Programming (IEEE Press Series on Computational Intelligence).
Operations Research and Financial Engineering