Transformers for Natural Language Processing: Build, train, and fine-tune deep neural network architectures for NLP with Python, PyTorch, TensorFlow, BERT, and GPT-3, 2nd Edition
| Format | Price | New from | Used from |
|---|---|---|---|
| Kindle | $21.09 (Read with Our Free App) | - | - |
| Paperback | $37.79 | 11 from $37.79 | 5 from $51.98 |
BONUS OpenAI ChatGPT, GPT-4, and DALL-E notebooks in the book's GitHub repository – Start coding with these SOTA transformers.
OpenAI's GPT-3 and Hugging Face transformers for language tasks in one book. Plus, get a taste of the future of transformers, including computer vision tasks and code writing and assistance with Codex and GitHub Copilot.
Purchase of the print or Kindle book includes a free eBook in PDF format
Key Features
- Pretrain a BERT-based model from scratch using Hugging Face
- Fine-tune powerful transformer models, including OpenAI's GPT-3, to learn the logic of your data
- Perform root cause analysis on hard NLP problems
Book Description
Transformers are...well...transforming the world of AI. There are many platforms and models out there, but which ones best suit your needs?
Transformers for Natural Language Processing, 2nd Edition, guides you through the world of transformers, highlighting the strengths of different models and platforms, while teaching you the problem-solving skills you need to tackle model weaknesses.
You'll use Hugging Face to pretrain a RoBERTa model from scratch, from building the dataset to defining the data collator to training the model.
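The pretraining pipeline described above centers on masked language modeling. As a rough illustration (plain Python, not the book's Hugging Face code; the function name and token IDs are hypothetical), the masking step a data collator performs looks something like this:

```python
import random

def mask_tokens(token_ids, mask_id, mask_prob=0.15, seed=0):
    """Toy sketch of what an MLM data collator does: hide roughly
    15% of the tokens and keep the originals as labels, with -100
    marking positions the training loss should ignore."""
    rng = random.Random(seed)
    inputs, labels = [], []
    for tok in token_ids:
        if rng.random() < mask_prob:
            inputs.append(mask_id)  # token hidden from the model
            labels.append(tok)      # model must recover the original
        else:
            inputs.append(tok)
            labels.append(-100)     # ignored by the loss
    return inputs, labels

inputs, labels = mask_tokens([5, 17, 42, 8, 99, 3], mask_id=4)
```

In the real Hugging Face pipeline this job is done by a collator such as `DataCollatorForLanguageModeling`, which additionally replaces some masked positions with random tokens rather than the mask token.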
If you're looking to fine-tune a pretrained model, including GPT-3, then Transformers for Natural Language Processing, 2nd Edition, shows you how with step-by-step guides.
The book investigates machine translation, speech-to-text, text-to-speech, question answering, and many more NLP tasks. It provides techniques to solve hard language problems and may even help with fake news anxiety (read Chapter 13 for more details).
You'll see how cutting-edge platforms, such as OpenAI, have taken transformers beyond language into computer vision tasks and code creation using Codex.
By the end of this book, you'll know how transformers work and how to implement them and resolve issues like an AI detective!
What you will learn
- Find out how ViT and CLIP label images (including blurry ones!) and create images from a sentence using DALL-E
- Discover new techniques to investigate complex language problems
- Compare and contrast the results of GPT-3 against T5, GPT-2, and BERT-based transformers
- Carry out sentiment analysis, text summarization, casual speech analysis, machine translations, and more using TensorFlow, PyTorch, and GPT-3
- Measure the productivity of key transformers to define their scope, potential, and limits in production
Who this book is for
If you want to learn about and apply transformers to your natural language (and image) data, this book is for you.
You'll need a good understanding of Python and deep learning, and a basic understanding of NLP, to get the most from this book. Many platforms covered in this book provide interactive user interfaces, which allow readers with a general interest in NLP and AI to follow several chapters. And don't worry if you get stuck or have questions; this book gives you direct access to our AI/ML community and the author, Denis Rothman, who will be there to guide you on your transformers journey!
Table of Contents
- What are Transformers?
- Getting Started with the Architecture of the Transformer Model
- Fine-Tuning BERT Models
- Pretraining a RoBERTa Model from Scratch
- Downstream NLP Tasks with Transformers
- Machine Translation with the Transformer
- The Rise of Suprahuman Transformers with GPT-3 Engines
- Applying Transformers to Legal and Financial Documents for AI Text Summarization
- Matching Tokenizers and Datasets
- Semantic Role Labeling with BERT-Based Transformers
(N.B. Please use the Look Inside option to see further chapters)
- ISBN-10: 1803247339
- ISBN-13: 978-1803247335
- Edition: 2nd ed.
- Publisher: Packt Publishing
- Publication date: March 25, 2022
- Language: English
- Dimensions: 7.5 x 1.28 x 9.25 inches
- Print length: 564 pages
From the Publisher
- Learn how to use BertViz, the Language Interpretability Tool (LIT), and Local Interpretable Model-Agnostic Explanations (LIME) to visualize and interpret the inner workings of transformers.
- Acquire the skills to solve false model outputs, applying the right language tools to get to the root cause of the problem.
- Run semantic role label experiments with transformer models to understand how these models approach such tasks and analyze casual speech.
Book Topics and Platforms Used:
| Book Topic | Transformers for Natural Language Processing, 2nd Edition | Transformers for Natural Language Processing, 1st Edition |
|---|---|---|
| Pretraining a BERT transformer | Hugging Face | Hugging Face |
| Fine-tuning transformer models | Hugging Face and OpenAI | Hugging Face |
| Natural language translation | Trax | Trax |
| Text summarization | Hugging Face and OpenAI | Hugging Face |
| Training a tokenizer | OpenAI and NLTK | - |
| Semantic role labeling (SRL) testing | AllenNLP | AllenNLP |
| Question-answering tasks | Hugging Face, OpenAI, AllenNLP, and Haystack | Hugging Face, AllenNLP, and Haystack |
| Sentiment analysis | Hugging Face, OpenAI, and AllenNLP | Hugging Face and AllenNLP |
| Vision transformers | Hugging Face and OpenAI | - |
| Creating code from sentences | OpenAI | - |
Editorial Reviews
Review
"Transformers for Natural Language Processing, Second Edition, is a reference for everyone interested in understanding how transformers work both from a theoretical and practical perspective. The author does a tremendous job of explaining how to use transformers step by step with a hands-on approach. After reading this book, you will be ready to use this state-of-the-art set of techniques for empowering your deep learning applications, including popular models such as BERT, RoBERTa, T5, and GPT-3.
The first edition always has a place on my desk, and now so will the second edition."
--Antonio Gulli, Engineering Director for the Office of the CTO, Google
About the Author
Denis Rothman graduated from Sorbonne University and Paris-Diderot University, where he designed one of the very first patented word2matrix embeddings and patented AI conversational agents. He began his career authoring one of the first AI cognitive natural language processing (NLP) chatbots, applied as an automated language teacher for Moet et Chandon and other companies. He authored an AI resource optimizer for IBM and apparel producers. He then authored an advanced planning and scheduling (APS) solution used worldwide.
Product details
- Publisher : Packt Publishing; 2nd ed. edition (March 25, 2022)
- Language : English
- Paperback : 564 pages
- ISBN-10 : 1803247339
- ISBN-13 : 978-1803247335
- Item Weight : 3.53 ounces
- Dimensions : 7.5 x 1.28 x 9.25 inches
- Best Sellers Rank: #24,856 in Books (See Top 100 in Books)
- #2 in Word Processing Books
- #11 in Natural Language Processing (Books)
- #11 in Computer Neural Networks
About the author

My core belief is that you only really know something once you have taught somebody how to do it.
I graduated from Sorbonne University and Paris-Diderot University. I wrote and registered a patent for one of the very first word2vector embeddings and word piece tokenization solutions 30+ years ago as a student and started a company to deploy AI. I went full speed from the start to:
- begin my career authoring one of the first AI cognitive NLP chatbots, applied as a language teacher for Moët et Chandon and other companies;
- author an AI resource optimizer for IBM and apparel producers;
- author an Advanced Planning and Scheduling (APS) solution used worldwide.
I rapidly became an expert in explainable AI (XAI), adding mandatory, acceptance-based explanation data and explanation interfaces to the solutions implemented for major corporate aerospace, apparel, and supply chain projects.
As a full-stack AI developer and instructor, I write programs daily, mostly in Python, TensorFlow, PyTorch, C++, and Java. I find it essential to get my hands on code before explaining and implementing it.
If you wish, there is more information on my LinkedIn profile:
https://www.linkedin.com/in/denis-rothman-0b034043/
Customer reviews
Top reviews
Top reviews from the United States
Reviewed in the United States on August 20, 2022
Transformers for Natural Language Processing is the best book I have ever read, and I am never going back. I don’t have to, and you can’t make me. And why would I want to?
The Rise of Suprahuman Transformers with GPT-3 Engines — incidentally, the title of the text's seventh chapter — has changed the game for me and for the world. No longer are the best and most powerful technologies locked away in the silos of large research institutions; they are broken apart and distributed piecemeal to anyone with the inclination to see the grain and to make with it what you will.
Bread has always been a staple of Western societies, but there is a new one today as well: the industrialized billion and soon-to-be trillion parameter models which Denis Rothman walks you through starting from first principles: the Attention Mechanism of Neural Networks.
When we Attend to something, we consider everything in the context of everything else, and we do so all at once. Washing tokens — a technical definition. Does that mean anything to you? Me neither. But the building blocks of the original Transformer Model, as outlined in the seminal and electric research paper published by the team at Google Brain? Yes. I know exactly how that works now. So will you. You will learn how to build a Multi-Headed Attention Encoder-Decoder network with TensorFlow and the components of said model will be forever etched into your mind like a solemn hymn:
Input Embedding
Add and Norm
Multi-Headed Attention
Add and Norm
Feed Forward
Add and Norm
Multi-Headed Attention
…
On and on until the result you are left with is an entirely new kind of Machine Learning model free from the limitations of Convolutions and LSTMs.
We need not have a Long and Short Term Memory of that which we learn in this text, because Denis Rothman is really only showing us how we can get started. The choices he makes in this text are directed choices — choices directed towards realizing for yourself the Call to Arms which has mobilized me to embrace the challenge of the moment. The call is simple: to become an Industry 4.0 AI Specialist.
The Fourth Industrial Revolution is all about connecting things to things. It is not only about generating new wealth with original creations, but by architecting and orchestrating building blocks across domains to arrive at something completely new.
Is everything perfect? No. The book is 500 pages long. I wish it were 5,000. I want 10x more. But if you are looking to work with the Transformer Models that will dominate the future, Transformers for Natural Language Processing Is All You Need.
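The attention computation the reviewer gestures at can be sketched in a few lines of NumPy; this is a bare single-head sketch with illustrative shapes, not the book's full multi-head TensorFlow build:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: every query scores every key
    at once, and a row-wise softmax turns the scores into weights
    over the values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                            # weighted mix of values

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((3, 4)) for _ in range(3))
out = attention(Q, K, V)  # one output vector per query token
```

Stacking this computation across several heads, interleaved with the add-and-norm and feed-forward sublayers listed in the review, yields the encoder-decoder architecture the chapter builds.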
Now for the fonts. The font sizes are very small, which makes reading the text a challenging task. It takes away all the joy of reading the book (the content itself is good).
I bought several other books from this publisher in the past, and the fonts were larger; I enjoyed those books (and the font sizes!) very much.
This year (2022), I bought this book and another book; it turns out both have very small fonts, which made reading a very unpleasant experience. I really felt sorry about the publisher's choices on font sizes. [I don't have a magnifying glass and don't want to carry one.]
If they continue to publish books in such small fonts, I will have to stop buying from this publisher and switch to other publishers. The reason is simple: I want to protect my eyes, I don't want to see paper and trees wasted, and I want to get back the joy and pleasure of reading books.
Consider p.10: "The expression 'Artificial Intelligence' was first used by John McCarthy in 1956 when it was established that machines could learn."
John was a friend of mine. He did coin the term AI, and it had absolutely nothing to do with machine learning.
Then on p.11: "It seemed that everybody in AI was on the right track for all these years. Markov Fields, RNNs, and CNNs evolved into multiple other models."
As someone who contributed to a variety of AI subfields during "all these years," I find this flat-out offensive. Only a tiny fraction of the work was in the intellectual line leading to transformers.
The diatribe on p.11-12 titled "What resources should we use?" presents a hypothetical example where someone looking for an AI job doesn't get it -- not because he or she doesn't understand the technology (Rothman doesn't appear to realize that actually understanding technology is a thing), but because he or she is insufficiently familiar with the AI ecosystem in use at some particular company. I have spent decades as a scientist and as a CEO. I care what potential employees understand, and never whether or not they've used some specific piece of software.
As a fairly randomly selected example of all of this, look at p.409: "Let's peek into the code to see how the model works," which is immediately followed by, "In this section, we will see how DALL-E reconstructs images."
Except that's not what happens. There are about 20 lines of code, most of which are Python imports. Then there is a definition of the encoder and decoder via a call to load_model, and Rothman writes, "I added the enc and dec cells so that you can look into the encoder and decoder blocks to see how this hybrid model works: the convolutional functionality in a transformer model and the concatenation of text and image input."
There is nothing here that will help you understand how anything works. You'll understand how to use a specific application in a specific instance (reproducing a picture of Rothman's cat, it turns out). But you won't learn much, if anything, about what's actually going on.
If you want to understand transformers, buy a different book. If you really insist on buying this one, maybe you'll get my copy, which I am returning to Amazon.