Text to Speech Synthesis: New Paradigms and Advances (Prentice Hall Imsc Press Multimedia Series) [Hardcover]

Shrikanth Narayanan (Author), Abeer Alwan (Author)
Book Description

Prentice Hall Imsc Press Multimedia Series August 3, 2004
Text to speech synthesis (TTS) is a critical research and application area in the field of multimedia interfaces. Recent advances in TTS will affect a wide range of disciplines, from education, business, and entertainment applications to medical aids. Until recently, speech synthesis relied on models and rule-based approaches. While these yielded intelligible speech, the voice quality was unacceptable for widespread adoption. Fortunately, there has recently been a major paradigm shift in how speech synthesis is done: from rule-based to explicitly data-driven methods. Advances in computing and corpus-driven methodologies have opened exciting possibilities for research and development in this domain, yielding highly natural-sounding speech. The book focuses on recent advances and new paradigms in text to speech synthesis, contributed by leading experts from academia and industry around the world. No other book of this nature comprehensively documents the recent research trends. This is important not only for researchers and students in the field but also for potential customers and other beneficiaries of the results. The book's chapters address key current topic areas in TTS: data-driven systems and unit selection; hybrid schemes (the interplay between data-driven and knowledge-based techniques); prosody models and generation; and expressive speech synthesis.

Editorial Reviews

From the Back Cover

Recent advances in speech synthesis will enable the development of high-quality natural voice systems with broad application in education, business, entertainment, and medicine. Text to Speech Synthesis is the first book to comprehensively document these new research trends and paradigms, balancing coverage of research and applications. It brings together seminal research by leaders in the field, drawn from both academic and industrial laboratories worldwide.

The authors and editors offer broad coverage of several key areas, including new unit selection approaches, speech representations and modeling, data-driven synthesis schemes, and expressive speech synthesis.

Coverage includes:

  • Unit Selection Methods: Reducing discontinuities at synthesis time in corpus-based speech processing, voice quality variation, and join costs
  • Hidden Markov Model (HMM)-Based Synthesis: Advanced uses of speech recognition technology, HMM-based multilingual speech synthesis, and new prosody control techniques
  • Expressive Speech Synthesis: Challenges, questions, and avenues of research, including diphone transplantation and minimization of pitch modification
  • Speech Representation and Models: A new articulatory modeling paradigm for controlling synthesis quality

This is an essential resource for all researchers working in speech synthesis and related areas such as multimedia signal processing, linguistics, and spoken user interfaces. It will also be valuable to any engineer, developer, or manager who must evaluate the latest speech technologies or integrate them into practical applications.



About the Author

Dr. Shrikanth Narayanan is an associate professor at the Signal and Image Processing Institute of USC's Electrical Engineering Department. He founded and directs USC's Speech Analysis and Interpretation Laboratory, and serves as research area director of the Integrated Media Systems Center, an NSF Engineering Research Center. He is an associate editor of IEEE Transactions on Speech and Audio Processing, serves on the speech communication technical committee of the Acoustical Society of America, and was a Principal Member of Technical Staff at AT&T Laboratories.

Dr. Abeer Alwan, a professor of electrical engineering at UCLA, established and directs the Speech Processing and Auditory Perception Laboratory there. Her research interests include modeling human speech production and perception mechanisms and applying these models to speech-processing applications such as noise-robust automatic speech recognition, compression, and synthesis. She is a Fellow of the Acoustical Society of America and recently served as editor-in-chief of the journal Speech Communication.




Product Details

  • Hardcover: 288 pages
  • Publisher: Prentice Hall PTR (August 3, 2004)
  • Language: English
  • ISBN-10: 013145661X
  • ISBN-13: 978-0131456617
  • Product Dimensions: 9.2 x 7.2 x 0.9 inches
  • Shipping Weight: 1.8 pounds
  • Average Customer Review: 4.0 out of 5 stars (1 customer review)
  • Amazon Best Sellers Rank: #3,925,190 in Books

 

Customer Reviews

1 Review
5 star:    (0)
4 star:    (1)
3 star:    (0)
2 star:    (0)
1 star:    (0)

Most Helpful Customer Reviews

1 of 2 people found the following review helpful:
4.0 out of 5 stars compare TTS to ASR, May 22, 2005
The field of TTS has been steadily improving. But still not perfect. If you listen to an extended TTS audio, you are unlikely to imagine it was a single human recording. Here, the editors provide a set of research papers that map out the boundary of TTS.

What I found the most interesting was the chapter comparing it with Automatic Speech Recognition. The latter is a much harder problem. Especially if you want speaker independence. And the input audio can have noise. Whereas TTS is effectively noise-free. The input text is always precisely known. But the chapter points out an ironic difference that is somewhat of a mirror image. ASR accuracy can be easily and objectively measured, by comparing the ASR's output text with the text transcribed by a human listener. Whereas the "goodness" of a TTS audio output is very subjectively determined.

This is one major unsolved TTS problem.
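
To illustrate the objective ASR measurement described above: the usual metric is word error rate (WER), the word-level edit distance between the recognizer's output and a human transcript, divided by the number of words in the transcript. The sketch below is not taken from the book; the function name and the sample sentences are made up for illustration, and it assumes simple whitespace tokenization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; real scoring tools also
    normalize case and punctuation before comparing.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One word ("the") is dropped out of six reference words.
print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

No comparably simple formula exists for TTS output, which is why its quality is usually judged by human listeners.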

The chapter goes into some of the ASR methods that have been brought successfully into TTS research. Most notable is the use of Hidden Markov Models. In ASR work, this was perhaps the biggest innovation in the last 10 years. It also shows some preliminary promise for TTS.