“If you torture the data long enough, Nature will confess,” said Ronald Coase, winner of the 1991
Nobel Prize in Economics. The statement is still true. However, achieving this lofty goal is not easy. First,
“long enough” may, in practice, be “too long” in many applications and thus unacceptable. Second,
to get a “confession” from large data sets, one needs state-of-the-art “torturing” tools. Third,
Nature is very stubborn: it does not yield easily, and may be unwilling to reveal its secrets at all.
Fortunately, while being aware of the above facts, the reader (a data miner) will find several
efficient data mining tools described in this excellent book. The book discusses issues spanning
the whole spectrum of approaches, methods, techniques and algorithms falling under
the umbrella of data mining. It starts with data understanding and preprocessing, then goes through
a set of methods for supervised and unsupervised learning, and concludes with model assessment,
data security and privacy issues. It is this approach of following the knowledge discovery
process that makes this book a rare one indeed, and thus an indispensable addition to the many other
books on data mining.
To be more precise, this is a book on knowledge discovery from data. As for the data sets, it is
safe to say that no part of modern human activity is left untouched by the need and the desire
to collect data. The consequence of such a state of affairs is obvious.
We are surrounded by, or perhaps even immersed in, an ocean of all kinds of data (such as
measurements, images, patterns, sounds, web pages, tunes, etc.) that are generated by various types
of sensors, cameras, microphones, pieces of software and/or other human-made devices. Thus we
are in dire need of automatically extracting as much information as possible from the data that
we more or less wisely generate. We need to master the existing approaches, algorithms and
procedures for knowledge discovery from data, and to develop new ones. This is exactly what the authors,
world-leading experts on data mining in all its various guises, have done. They present the
reader with a large spectrum of data mining methods in a graceful yet rigorous way.