Transformers for Natural Language Processing: Build innovative deep neural network architectures for NLP with Python, PyTorch, TensorFlow, BERT, RoBERTa, and more
| Format | Price | New from | Used from |
|---|---|---|---|
| Kindle | $31.49 (Read with Our Free App) | — | — |
| Paperback | $35.74 - $40.71 | 8 New from $40.71 | 9 Used from $31.27 |
Publisher's Note: A new edition of this book is out now that includes working with GPT-3 and comparing the results with other models. It includes even more use cases, such as causal language analysis and computer vision tasks, as well as an introduction to OpenAI's Codex.
Key Features
- Build and implement state-of-the-art language models, such as the original Transformer, BERT, T5, and GPT-2, using concepts that outperform classical deep learning models
- Go through hands-on applications in Python using Google Colaboratory Notebooks with nothing to install on a local machine
- Test transformer models on advanced use cases
Book Description
The transformer architecture has proved to be revolutionary, outperforming the classical RNN and CNN models in use today. With an apply-as-you-learn approach, Transformers for Natural Language Processing examines in detail deep learning for machine translation, speech-to-text, text-to-speech, language modeling, question answering, and many more NLP domains with transformers.
The book takes you through NLP with Python and examines various eminent models and datasets within the transformer architecture created by pioneers such as Google, Facebook, Microsoft, OpenAI, and Hugging Face.
The book trains you in three stages. The first stage introduces you to transformer architectures, starting with the original transformer, before moving on to RoBERTa, BERT, and DistilBERT models. You will discover training methods for smaller transformers that can outperform GPT-3 in some cases. In the second stage, you will apply transformers for Natural Language Understanding (NLU) and Natural Language Generation (NLG). Finally, the third stage will help you grasp advanced language understanding techniques such as optimizing social network datasets and fake news identification.
By the end of this NLP book, you will understand transformers from a cognitive science perspective and be proficient in applying pretrained transformer models by tech giants to various datasets.
What you will learn
- Use the latest pretrained transformer models
- Grasp the workings of the original Transformer, GPT-2, BERT, T5, and other transformer models
- Create language understanding Python programs using concepts that outperform classical deep learning models
- Use a variety of NLP platforms, including Hugging Face, Trax, and AllenNLP
- Apply Python, TensorFlow, and Keras programs to sentiment analysis, text summarization, speech recognition, machine translations, and more
- Measure the productivity of key transformers to define their scope, potential, and limits in production
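As a flavor of what "grasping the workings of the original Transformer" involves, the scaled dot-product attention step at the heart of the architecture can be sketched in a few lines of NumPy. The toy matrices below are illustrative values invented for this sketch, not taken from the book:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Core Transformer operation: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Row-wise softmax (subtracting the max for numerical stability)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Toy 3-token example with d_model = 4 and d_v = 2 (illustrative values only)
Q = np.array([[1.0, 0.0, 1.0, 0.0],
              [0.0, 1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0, 0.0]])
K = Q.copy()
V = np.array([[0.5, 0.0],
              [0.0, 0.5],
              [0.5, 0.5]])
output, weights = scaled_dot_product_attention(Q, K, V)
# Each row of `weights` sums to 1; `output` holds one d_v-dim vector per token
```

In the full architecture this operation runs in parallel across multiple heads, with Q, K, and V produced by learned projections of the input embeddings.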
Who this book is for
Since the book does not teach basic programming, you must be familiar with neural networks, Python, PyTorch, and TensorFlow in order to follow the transformer implementations.
Readers who can benefit the most from this book include experienced deep learning & NLP practitioners and data analysts & data scientists who want to process the increasing amounts of language-driven data.
Table of Contents
- Getting Started with the Model Architecture of the Transformer
- Fine-Tuning BERT Models
- Pretraining a RoBERTa Model from Scratch
- Downstream NLP Tasks with Transformers
- Machine Translation with the Transformer
- Text Generation with OpenAI GPT-2 and GPT-3 Models
- Applying Transformers to Legal and Financial Documents for AI Text Summarization
- Matching Tokenizers and Datasets
- Semantic Role Labeling with BERT-Based Transformers
- Let Your Data Do the Talking: Story, Questions, and Answers
(N.B. Please use the Look Inside option to see further chapters)
- ISBN-10 : 1800565798
- ISBN-13 : 978-1800565791
- Publisher : Packt Publishing
- Publication date : January 29, 2021
- Language : English
- Dimensions : 7.5 x 0.87 x 9.25 inches
- Print length : 384 pages
Editorial Reviews
Review
"After looking through so many sources, I can attest that not only will this book help you get started rapidly with using, training, and transfer-learning transformers, but it will also help you understand transformers in deep philosophical ways. I am VERY grateful to Denis for writing this book AND FOR writing it very well."
--Thom Ives, Lead Data Scientist at UL Prospector, Owner of Integrated Machine Learning & AI
"Transformers have taken the NLP world by storm in the last couple of years and have become indispensable for both academic research and industrial practice in NLP. For me, the major benefit of this book has been its comprehensive coverage. The Transformer models covered include not only the popular ones such as BERT, GPT-3 and T5, but also less well-known ones such as RoBERTa and ELECTRA. Examples are provided using Hugging Face (both PyTorch and TensorFlow), AllenNLP, and Trax (Google Brain) libraries."
--Sujit Pal, Technology Research Director at Elsevier Labs, Co-author of Deep Learning with TensorFlow 2 and Keras, Second Edition
"There is much attention and hype surrounding transformers and this book authored by Denis provides well-balanced coverage of the theory and practice that allows newcomers to get a good grasp of the concept, before following along with several code examples provided in the book. Particularly, the book provides a solid background on the architecture of transformers before covering popular models such as BERT, RoBERTa, and GPT-2. It also takes readers through several use cases (text summarization, labeling, Q&A, sentiment analysis and fake news detection) that they can follow along. I am already using this book as a reference for implementing some of the tutorial videos for my YouTube channel Data Professor."
--Chanin Nantasenamat, Ph.D., Associate Professor of Bioinformatics and Founder of Data Professor YouTube channel
About the Author
Denis Rothman graduated from Sorbonne University and Paris-Diderot University, patenting one of the very first word2matrix embedding solutions. He is the author of three cutting-edge AI solutions: one of the first AI cognitive chatbots, built more than 30 years ago; a profit-orientated AI resource-optimizing system; and an AI APS (Advanced Planning and Scheduling) solution based on cognitive patterns, used worldwide in aerospace, rail, energy, apparel, and many other fields. Designed initially as a cognitive AI bot for IBM, it then went on to become a robust APS solution used to this day.
Product details
- Publisher : Packt Publishing (January 29, 2021)
- Language : English
- Paperback : 384 pages
- ISBN-10 : 1800565798
- ISBN-13 : 978-1800565791
- Item Weight : 1.45 pounds
- Dimensions : 7.5 x 0.87 x 9.25 inches
- Best Sellers Rank: #1,109,359 in Books (See Top 100 in Books)
- #188 in Natural Language Processing (Books)
- #339 in Computer Neural Networks
- #1,523 in Artificial Intelligence & Semantics
About the author

My core belief is that you only really know something once you have taught somebody how to do it.
I graduated from Sorbonne University and Paris-Diderot University. I wrote and registered a patent for one of the very first word2vector embeddings and word piece tokenization solutions 30+ years ago as a student and started a company to deploy AI. I went full speed from the start to:
- begin my career, authoring one of the first AI cognitive NLP chatbots, applied as a language teacher for Moët et Chandon and other companies.
- author an AI resource optimizer for IBM and apparel producers.
- author an Advanced Planning and Scheduling (APS) solution used worldwide.
From the start, I became an expert in explainable AI (XAI), adding mandatory, interpretable, acceptance-based explanation data and explanation interfaces to the solutions implemented for major corporate aerospace, apparel, and supply chain projects.
As a full-stack AI developer and instructor, I write programs daily, mostly in Python, TensorFlow, PyTorch, C++, and Java. I find it essential to get my hands on code before explaining and implementing it.
If you wish, there is more information on my LinkedIn profile:
https://www.linkedin.com/in/denis-rothman-0b034043/
Customer reviews
Top reviews
Top reviews from the United States
Reviewed in the United States on May 9, 2022
The chapters on translation and generation are particularly interesting because they lead to a discussion of text generation with GPT-2 and GPT-3. This is a topic that requires massive computing because of the number of parameters involved in these models: roughly 1.5 billion for GPT-2 and 175 billion for GPT-3. Three of the later chapters are devoted to word extraction. The final three chapters are devoted to language understanding.
I have no hesitation in recommending this book to any student of modern AI.
So to summarize, do not waste your money on this kind of book pretending to teach you how to do NLP with Transformers. Plan two months of homework studying the Hugging Face, Stanza, and AllenNLP repos and watching the Stanford YouTube videos from Manning. Free and much, much better.
Reviewed in the United States 🇺🇸 on May 9, 2022
In my case, I had some familiarity with Transformers prior to reading the book, in the sense that I had fine-tuned a few Transformer models and run inference on them, both natively and using the Hugging Face API. However, my understanding of Transformers had been ad hoc and driven by the needs of the application.
The major benefit of this book for me has been its comprehensive coverage. The book starts with the theory behind Transformers and basic operations such as pre-training and fine-tuning, and the rationale behind them. After the first few introductory chapters, each chapter covers an individual application of Transformers in a very example-driven manner, introducing the most appropriate model for the task. Applications cover standard tasks such as classification and sequence classification, as well as more esoteric ones such as semantic role labeling and few-shot and zero-shot learning. Discussions cover not only the Transformer-related portions of each application but other aspects as well, including datasets, data pre-processing, and evaluation. The Transformer models covered include not only popular ones such as BERT, GPT-3, and T5, but also less well-known ones such as RoBERTa and ELECTRA. Examples are provided using the Hugging Face (both PyTorch and TensorFlow), AllenNLP, and Trax (Google Brain) libraries.
As an NLP practitioner, you need Transformers to be part of your toolkit, and this book can help. Even if you have used Transformers in your work before, this book will likely teach you something new that will make you more effective with them (check the Table of Contents).
DISCLAIMER: I was provided with a free review copy of the book and requested to provide an honest review, which I have done above.
Top reviews from other countries
Overall, a great book to have on your bookshelf if you're into ML and NLP.
All in all, a decent attempt at an overview of transformers, transformer-based architectures, and their applications. However, if you are looking for step-by-step conceptual explanations of transformers and their variations, this is NOT your go-to reference. Personally, I would recommend Getting Started with Google BERT by Sudharsan Ravichandiran. On the other hand, if you are looking for clear steps to set up your first transformers notebook project, this could be a resource to refer to.
On a totally different note, if you are still interested in the book, the publisher offers both print and ebook (with immediate access) for a lower price of about 29 euros.
It expertly introduces transformers and mentors the reader in building innovative deep neural network architectures for NLP.
The book covers almost all game-changing applications for natural language processing (NLP), natural language understanding (NLU), and natural language generation (NLG).
The book is very useful even for beginners in the domain as the questions of each chapter are answered in the Appendix.