Cassandra: The Definitive Guide [Paperback]

Eben Hewitt
3.2 out of 5 stars (12 customer reviews)

List Price: $39.99
Price: $35.99 & FREE Shipping
You Save: $4.00 (10%)

Formats

Kindle Edition: $17.27
Paperback: $35.99

Book Description

November 29, 2010

What could you do with data if scalability wasn't a problem? With this hands-on guide, you'll learn how Apache Cassandra handles hundreds of terabytes of data while remaining highly available across multiple data centers -- capabilities that have attracted Facebook, Twitter, and other data-intensive companies. Cassandra: The Definitive Guide provides the technical details and practical examples you need to assess this database management system and put it to work in a production environment.

Author Eben Hewitt demonstrates the advantages of Cassandra's nonrelational design, and pays special attention to data modeling. If you're a developer, DBA, application architect, or manager looking to solve a database scaling issue or future-proof your application, this guide shows you how to harness Cassandra's speed and flexibility.

  • Understand the tenets of Cassandra's column-oriented structure
  • Learn how to write, update, and read Cassandra data
  • Discover how to add or remove nodes from the cluster as your application requires
  • Examine a working application that translates from a relational model to Cassandra's data model
  • Use examples for writing clients in Java, Python, and C#
  • Use the JMX interface to monitor a cluster's usage, memory patterns, and more
  • Tune memory settings, data storage, and caching for better performance

Frequently Bought Together

Cassandra: The Definitive Guide + HBase: The Definitive Guide + Hadoop: The Definitive Guide
Price for all three: $95.93


Editorial Reviews

About the Author

Eben Hewitt is Director of Application Architecture at a publicly traded company where he is responsible for the design of their mission-critical, global-scale web, mobile and SOA integration projects. He has written several programming books, including Java SOA Cookbook (O'Reilly).


Product Details

  • Paperback: 332 pages
  • Publisher: O'Reilly Media; 1 edition (November 29, 2010)
  • Language: English
  • ISBN-10: 1449390412
  • ISBN-13: 978-1449390419
  • Product Dimensions: 7.1 x 0.7 x 9.2 inches
  • Shipping Weight: 1 pound
  • Average Customer Review: 3.2 out of 5 stars (12 customer reviews)
  • Amazon Best Sellers Rank: #157,263 in Books

More About the Author

Eben Hewitt is the Chief Information Officer at O'Reilly Media. For nearly 15 years, he has worked in positions throughout IT, most recently on large-scale web and SOA integration projects, event-driven architecture, rules engines, distributed software, and messaging systems.

His team's architecture work for an integration with Google, Inc. won the 2011 Fusion Middleware Innovation Award from Oracle Corp.

Hewitt is the author of several technical books, including Cassandra: The Definitive Guide and Java SOA Cookbook, and he is a contributor to 97 Things Every Software Architect Should Know.

He is a popular speaker at international conferences, a TOGAF certified architect, and a certified Scrum Master.

Follow Eben on Twitter at @ebenhewitt or visit http://www.ebenhewitt.com.

Customer Reviews

Most Helpful Customer Reviews
55 of 57 people found the following review helpful
Format: Paperback
I'm not a database person but I've worked with SQL databases (esp. MySQL) and have read a few papers about non-relational databases, particularly Google's Bigtable. I understand the "web-scale" data challenge and see how a distributed, fault-tolerant, tunable open-source database like Cassandra can be an incredibly useful tool for addressing it. Therefore I was really looking forward to the publication of Eben Hewitt's Cassandra, The Definitive Guide. I was hoping that it would lay out all the important things a person would need to know in order to decide whether Cassandra made sense for their project and, if it did, how specifically they would use it.

Now that the book's out and I've had a chance to read it once through, I have to say that it does not meet my expectations. The author is clearly very interested in his subject and also very anxious to share insights not only into Cassandra but into modern non-relational databases in general (to the extent of including a 25-page appendix "The Nonrelational Landscape" at the end of the book). He does a pretty good job of explaining how Cassandra works at the level of distributed storage including scaling as well as availability and consistency. And though I haven't gone through the steps, he seems to give pretty good instructions for installing, configuring and monitoring a Cassandra cluster.

What he doesn't cover nearly as well as I was hoping (and would have expected from an O'Reilly book) is data modeling in Cassandra and the actual APIs for putting data into the database and getting data out (i.e. querying). It's not that he doesn't cover these subjects at all. In fact he devotes two chapters to data modeling (Chapter 3 The Cassandra Data Model and Chapter 4 Sample Application) and two to APIs (Chapter 7 Reading and Writing Data and Chapter 8 Clients), and these chapters contain a lot of useful information. The problem is that the information I really want is either mixed in with other, for me, less important information and/or is too limited or even not present at all.

Here are some things that I would have expected to be presented in reasonably full, coherent form in a "definitive guide" to Cassandra:

Data modeling:

Column families, supercolumns and columns - what are they for, how do you use them effectively? Especially supercolumns, which, in conjunction with the intrinsically sparse data representation, allow you to blur the distinction between structure and data and store data in "wide" format and even as out-and-out row-specific lists. He touches on matters of this sort, including in the design patterns at the end of his Data Modeling chapter, but doesn't integrate them into a coherent account of how to use the Cassandra data representation model.

Lack of joins - what are the alternatives? He addresses this issue too, but mostly says, denormalize your tables and design for common queries - or even more bluntly, precompute the results of your common queries and put them into your database. This may be a good approach in some situations, but leaves a lot of questions like, when do you precompute your query results, where and how, what triggers the computation, and how do you handle data changes that invalidate previously precomputed query results (one of the problems that normalization and joins were originally designed to solve). Also, I believe he does not say very much about implementing joins and other complex queries on the client side. Does Cassandra have properties that determine more vs. less efficient ways of doing this? How important is planning for locality in your column family organization? And supercolumns for maintaining lists/sets so that you don't have to assemble them at query time?

APIs:

Primary API - what is it? As the author explains, Cassandra doesn't have a query language, so he can't offer a chapter on the Cassandra equivalent of, say, SQL for relational databases. But Cassandra does have an API that lets you put data in and get data out, if not also other things like creating and deleting column families, supercolumns and columns. I was really expecting a chapter (or appendix or whatever) listing out the complete set of API requests and responses, either in some language-neutral format or in terms of the "native" Cassandra language, i.e. Java, ideally with additional information on "bindings" for other client-side languages like PHP, Python and so on. Again the information is sort of there, but not pulled together.

Higher-level wrappers - what are they about? The author talks about Thrift and Avro as (at least somewhat) high-level languages for communicating with Cassandra, but doesn't lay out in any coherent way what those languages are. These tools may be very familiar to some, but I'm sure not to all. He does provide enough information - especially in the form of external links - to make it possible to start exploring these tools, but I would have expected the book to give a pretty good idea of what they're about without having to go off and read other material.

While I am, overall, dissatisfied with the book, I found it both an interesting read and an engaging introduction to the world of Cassandra. It also undeniably offers a wealth of information, even if it's not exactly the information a person may be looking for. For this reason I'm rating it 3 stars.
21 of 21 people found the following review helpful
Format: Paperback
The information in this book is solid enough but its chaotic structure and lack of support for the code examples make it hard to justify a purchase.

The book was written against version 0.7b2 of Cassandra. That beta status alone should be a warning of the perils of premature publication. None of the code examples work (or indeed compile) with the current API (0.7b5). Downloading the latest code from the author's spartan support site offers little gain. The zip ball contains a readme file noting that the code did work once and suggesting the reader fix it themselves.

There is a consistent pattern of requiring the reader to understand terms which are first defined several chapters later: slices, for example, or setting up the Cassandra JMX interface, which is required for data loading in chapter 4 but first described in chapter 8.

Annoying, especially as there is solid information here and it's not badly written. Had the O'Reilly editors been more proactive, ignored the me-first commercial pressures, delayed publication until the API stabilized, and sorted out the structural problems in the writing, this could have been a solid read.
6 of 7 people found the following review helpful
Format: Paperback | Amazon Verified Purchase
This book is a fine introduction to Cassandra itself, and even to the whole genre of nonrelational databases. Where it falls down is if you want to start using Cassandra for an actual product. The fault doesn't lie with the book, but with the confused state of Cassandra clients. Basically no one codes directly to Cassandra: people code to one of the various Cassandra clients such as Thrift, Avro, Hector, Chirper, Pelops, etc. Cassandra has many clients, none of which is the clear leader, and none of which really solves the full problem of writing to Cassandra.

Given that the only real way to learn a system is to code to it, this presents a real challenge. The current book will give you an overview and feel for Cassandra but will not by itself allow you to start using it.
Most Recent Customer Reviews
2.0 out of 5 stars No technical depth and full of fallacies
Writer is a PM, not a tech guy. I could not trust the contents in the book, and there were no real examples to show the subtleties of Cassandra.
Published 1 month ago by S. Wang
2.0 out of 5 stars Nothing definite about this Guide
First up, I have nothing against the author. The author comes across as a genuine guy who is actually willing to invest energy in explanations.
Published 2 months ago by Rajeev Jha
2.0 out of 5 stars Not well written
This book has good material, but it is getting old fast, and it was not well written.
It uses terms without prior definition, it speaks only from the framework of "I already know...
Published 16 months ago by Sarah Baker
5.0 out of 5 stars Best Book on NoSQL Databases--Cassandra is Excellent too
I will be giving a huge presentation on all the NoSQL Databases for the IndianaJUG group in January.
Published 20 months ago by Tom Hunter
4.0 out of 5 stars Cassandra: The Definitive Guide
Original review written by Roberto Bentivoglio, JUG Lugano, www.juglugano.ch

Cassandra is one of the most famous NoSQL databases.
Published on May 17, 2011 by JUG Lugano
4.0 out of 5 stars Informative and good help to start using Cassandra
In general I liked the book, though some sections could be better, so let's start with them. Sections with code are not very good because the version used is already old and byte arrays...
Published on April 24, 2011 by Irakli Kobiashvili
3.0 out of 5 stars Interesting, not completely current, some code does not work
Cassandra: The Definitive Guide is one of the few in-depth books available on Cassandra at the moment.
Published on February 20, 2011 by John Brady
5.0 out of 5 stars Going to production
A good technical overview of Cassandra. You will get the most out of this book if you are willing to get your hands dirty (or already have) and actually boot up Cassandra and play with...
Published on January 16, 2011 by Ilya Grigorik
4.0 out of 5 stars Great book on Cassandra
I feel this book is very useful for anyone who wants to get a solid understanding of Cassandra. Anyone who has been trying to learn Cassandra will find that there is very little...
Published on January 1, 2011 by Tommy Li