Hands-On Data Analysis with Pandas - Second Edition: A Python data science handbook for data collection, wrangling, analysis, and visualization 2nd ed. Edition
Get to grips with pandas by working with real datasets and master data discovery, data manipulation, data preparation, and handling data for analytical tasks
Key Features
- Perform efficient data analysis and manipulation tasks using pandas 1.x
- Apply pandas to different real-world domains with the help of step-by-step examples
- Make the most of pandas as an effective data exploration tool
Extracting valuable business insights is no longer a 'nice-to-have', but an essential skill for anyone who handles data in their enterprise. Hands-On Data Analysis with Pandas is here to help beginners and those who are migrating their skills into data science get up to speed in no time.
This book will show you how to analyze your data, get started with machine learning, and work effectively with the Python libraries often used for data science, such as pandas, NumPy, matplotlib, seaborn, and scikit-learn.
Using real-world datasets, you will learn how to use the pandas library to perform data wrangling to reshape, clean, and aggregate your data. Then, you will learn how to conduct exploratory data analysis by calculating summary statistics and visualizing the data to find patterns. In the concluding chapters, you will explore some applications of anomaly detection, regression, clustering, and classification using scikit-learn to make predictions based on past data.
This updated edition will equip you with the skills you need to use pandas 1.x to efficiently perform various data manipulation tasks, reliably reproduce analyses, and visualize your data for effective decision making - valuable knowledge that can be applied across multiple domains.
What you will learn
- Understand how data analysts and scientists gather and analyze data
- Perform data analysis and data wrangling using Python
- Combine, group, and aggregate data from multiple sources
- Create data visualizations with pandas, matplotlib, and seaborn
- Apply machine learning algorithms to identify patterns and make predictions
- Use Python data science libraries to analyze real-world datasets
- Solve common data representation and analysis problems using pandas
- Build Python scripts, modules, and packages for reusable analysis code
This book is for data science beginners, data analysts, and Python developers who want to explore each stage of data analysis and scientific computing using a wide range of datasets. Data scientists looking to implement pandas in their machine learning workflow will also find plenty of valuable know-how as they progress.
You'll find it easier to follow along with this book if you have a working knowledge of the Python programming language, but a Python crash-course tutorial is provided in the code bundle for anyone who needs a refresher.
Table of Contents
- Introduction to Data Analysis
- Working with Pandas DataFrames
- Data Wrangling with Pandas
- Aggregating Pandas DataFrames
- Visualizing Data with Pandas and Matplotlib
- Plotting with Seaborn and Customization Techniques
- Financial Analysis - Bitcoin and the Stock Market
- Rule-Based Anomaly Detection
- Getting Started with Machine Learning in Python
- Making Better Predictions - Optimizing Models
- Machine Learning Anomaly Detection
- The Road Ahead
From the brand
Packt is a leading publisher of technical learning content with the ability to publish books on emerging tech faster than any other.
Our mission is to increase the shared value of deep tech knowledge by helping tech pros put software to work.
We help the most interesting minds and ground-breaking creators on the planet distill and share the working knowledge of their peers.
From the Publisher
What makes this second edition of Hands-On Data Analysis with Pandas stand out from other pandas titles?
Hands-On Data Analysis with Pandas is not your typical data science book. Say goodbye to the stereotypical datasets that most tutorials and books use and say hello to real-world data with real-world issues; after all, the data you will work with in real life won’t be perfect either.
This book shows you how to work with realistic datasets, so you can master the use of pandas for data analysis. Elements of software engineering are also included throughout the chapters, which will strengthen your programming skills—you’ll learn how to build scripts with command-line arguments, package analysis code in classes, and build Python packages for modular and reusable analysis code.
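As a rough illustration of what a script with command-line arguments and analysis code packaged in a class can look like, here is a minimal sketch. It is not taken from the book: the class `ColumnSummary`, the `--column` flag, and the sample data are all hypothetical names invented for this example; only `argparse` and the pandas calls are standard.

```python
import argparse

import pandas as pd


class ColumnSummary:
    """Hypothetical class that packages one small piece of analysis."""

    def __init__(self, df):
        self.df = df

    def mean_of(self, column):
        # Average of a single numeric column.
        return self.df[column].mean()


def parse_args(argv=None):
    # With argv=None, argparse reads the real command line (sys.argv).
    parser = argparse.ArgumentParser(description="Summarize one column.")
    parser.add_argument("--column", required=True, help="column to average")
    return parser.parse_args(argv)


# Passing a list simulates running: python summarize.py --column close
args = parse_args(["--column", "close"])

# Made-up data standing in for a file the script would normally load.
df = pd.DataFrame({"close": [10.0, 12.0, 14.0]})
print(ColumnSummary(df).mean_of(args.column))
```

Keeping the parsing in its own function and the analysis in a class means each piece can be imported and reused from a notebook or another module.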
What's new in the second edition of Hands-On Data Analysis with Pandas?
In this edition, the code examples have been updated for newer versions of the libraries used. The book also features new and revised examples highlighting new features in pandas 1.2. In addition, there are significant changes to the content of some chapters, while others have new examples and/or datasets.
What are the key takeaways for the readers buying this book?
Working with data doesn’t preclude good programming skills. This book will instill confidence and teach the concepts needed to write quality data science code using pandas and other Python data science libraries. You'll be able to apply new data wrangling and visualization skills to a variety of real-world datasets and have the confidence to search for solutions to common problems in both the documentation and resources like Stack Overflow with a solid foundation in pandas.
Product details
- Publisher : Packt Publishing; 2nd edition (April 29, 2021)
- Language : English
- Paperback : 788 pages
- ISBN-10 : 1800563450
- ISBN-13 : 978-1800563452
- Item Weight : 2.97 pounds
- Dimensions : 9.25 x 7.5 x 1.62 inches
- Best Sellers Rank: #895,689 in Books
- #129 in Database Storage & Design
- #306 in Data Mining (Books)
About the author

Stefanie Molin is a software engineer and data scientist at Bloomberg in New York City, where she tackles tough problems in information security, particularly those revolving around data wrangling/visualization, building tools for gathering data, and knowledge sharing. She holds a Bachelor of Science degree in operations research from Columbia University's Fu Foundation School of Engineering and Applied Science, as well as a master’s degree in computer science, with a specialization in machine learning, from Georgia Tech. In her free time, she enjoys traveling the world, inventing new recipes, and learning new languages spoken among both people and computers.
Customer reviews
Top reviews from the United States
The datasets are intuitive. This is not a boring, text-heavy book. Instead, lots of example code appears on every single page, illustrating the features. The story and example code flow together, not skipping around or showing disjointed points. The chapters follow your workflow, from data ingestion and EDA to data cleaning, data wrangling, visualization, and finally to applications.
Thorough treatment is given to data cleaning, data wrangling, and data enrichment as separate topics, going into deep detail on how to reshape and reindex data frames, how to do proper joins on data frames (left, right, inner, and outer), and how to do many other data cleaning and wrangling steps. For example, you'll learn how to set a new index, and why you should do that. And when combining rows from different dataframes, you can leave yourself a new indicator column that shows you which table contributed the row. Pandas has many features like this that professionals should know, and Stefanie Molin shows the "how to".
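The joins and the indicator column mentioned above can be sketched in pandas roughly as follows. This is one reading of what the reviewer describes, not code from the book: the two small DataFrames and their column names (`station`, `temp_c`, `city`) are made up for illustration, while `merge` with `how=` and `indicator=True`, and `set_index`, are standard pandas calls.

```python
import pandas as pd

# Hypothetical data, invented for illustration (not from the book).
readings = pd.DataFrame(
    {"station": ["A", "B", "C"], "temp_c": [21.5, 19.0, 23.1]}
)
stations = pd.DataFrame(
    {"station": ["B", "C", "D"], "city": ["Boston", "Chicago", "Denver"]}
)

# An outer join keeps rows from both tables. indicator=True adds a `_merge`
# column recording whether each row came from the left table only, the
# right table only, or both.
merged = readings.merge(stations, on="station", how="outer", indicator=True)

# Using the join key as the index allows label-based lookups with .loc.
merged = merged.set_index("station")

print(merged)
```

Changing `how` to `"left"`, `"right"`, or `"inner"` switches the join type while the rest of the call stays the same.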
Of course there's a GitHub link so you can download the example datasets. Honestly, I'm only up through data wrangling - have not even reached the financial analysis, machine learning, and advanced visualization code. I can hardly wait to work all the examples in person. (As you know, reading is good, but building the code is by far the most effective way to learn.)
Thanks, Stefanie, for devoting the time to making this a wonderfully detailed and usable guide on how to use Pandas to solve my customer's problems. What a joy to read and use. This is the first and best book you should buy for Pandas.
Also, it approaches the teaching of pandas with both a data analyst perspective and a software engineer perspective. To be successful in data science today we need to wear both of those hats, so for someone coming from an analysis background without formal software engineering training, the book helps demystify concepts like virtualenv, simulations, source control, etc.
It’s not only about learning pandas but about using pandas the right way.
Top reviews from other countries
One of the best books on Python and its data analysis libraries. Recommended.



