Andrii Gakhov

About Andrii Gakhov
Andrii Gakhov is a mathematician and software engineer holding a Ph.D. in mathematical modeling and numerical methods. He taught for a number of years in the School of Computer Science at V. Karazin Kharkiv National University in Ukraine and currently works as a software practitioner for ferret go GmbH, the leading community moderation, automation, and analytics company in Germany. His fields of interest include machine learning, stream mining, and data analysis.
The best way to reach the author is via Twitter @gakhov or by visiting his webpage at https://www.gakhov.com.
Titles By Andrii Gakhov
A technical book about popular space-efficient data structures and fast algorithms that are extremely useful in modern Big Data applications.
Probabilistic data structures is a common name for data structures based mostly on different hashing techniques. Unlike regular (or deterministic) data structures, they always provide approximate answers, but with reliable ways to estimate the possible errors. Fortunately, the potential losses and errors are fully compensated for by extremely low memory requirements, constant query time, and scaling, three factors that become essential in Big Data applications.
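To make the trade-off concrete, here is a minimal sketch of one well-known probabilistic data structure for membership querying, a Bloom filter. It is an illustration only, not code from the book or from the PDSA library: the class name `BloomFilter`, its parameters, and the choice of SHA-256 with double hashing are all assumptions made for this example. It answers "possibly present" or "definitely absent", and its memory use depends on the expected number of items and the target error rate rather than on the size of the items themselves.

```python
import hashlib
import math


class BloomFilter:
    """Toy Bloom filter (hypothetical example, not the PDSA library API)."""

    def __init__(self, capacity: int, error_rate: float = 0.01) -> None:
        # Standard sizing formulas: m = -n * ln(p) / (ln 2)^2 bits,
        # k = (m / n) * ln 2 hash functions.
        self.size = max(1, math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.size / capacity * math.log(2)))
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions from two 64-bit slices of one SHA-256 digest
        # (double hashing); this hashing scheme is an assumption of the sketch.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.size

    def add(self, item: str) -> None:
        # Set every bit position derived from the item.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


bf = BloomFilter(capacity=1000, error_rate=0.01)
bf.add("alice")
print("alice" in bf)   # True: an added item is never reported as absent
print(len(bf.bits))    # bytes used, independent of how long the items are
```

A query for an item that was never added usually returns False but can occasionally return True; that false positive rate is the estimable error the paragraph above refers to.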
About the book
The purpose of this book is to introduce technology practitioners, including software architects and developers, as well as technology decision makers, to probabilistic data structures and algorithms.
While it is impossible to cover all the existing amazing solutions, this book aims to highlight their common ideas and important areas of application, including membership querying, counting, stream mining, and similarity estimation.
This is not a book for scientists only, but to get the most out of it you will need basic mathematical knowledge and an understanding of the general theory of data structures and algorithms.
What you will learn
Reading the book, you will get a theoretical and practical understanding of probabilistic data structures and learn about their common uses.
- Learn how to solve practical issues of massive data handling
- Master the theoretical aspects of probabilistic data structures
- Identify the right data structures for your particular problems
What's inside?
This book consists of six chapters, each preceded by an introduction and followed by a brief summary and a bibliography for further reading related to that chapter. Every chapter is dedicated to one particular problem in Big Data applications. It starts with an in-depth explanation of the problem and then introduces data structures and algorithms that can be used to solve it efficiently.
- Hashing
- Membership
- Cardinality
- Frequency
- Rank
- Similarity
This book on the Web
You can find errata, examples, and additional information at pdsa.gakhov.com. If you have a comment or a technical question about the book, would like to report an error you found, or want to raise any other issue, send an email to pdsa@gakhov.com.
If you are also interested in a Cython implementation that includes many of the data structures and algorithms from this book, please check out our free and open-source Python library called PDSA at https://github.com/gakhov/pdsa. Everybody is welcome to contribute at any time.