We are deluged by data—scientific data, medical data, demographic data, financial
data, and marketing data. People have no time to look at this data. Human
attention has become a precious resource. So, we must find ways to automatically
analyze the data, to automatically classify it, to automatically summarize it, to
automatically discover and characterize trends in it, and to automatically flag
anomalies. This is one of the most active and exciting areas of the database
research community. Researchers in areas such as statistics, visualization, artificial
intelligence, and machine learning are contributing to this field. The breadth of
the field makes it difficult to grasp its extraordinary progress over the last few
years.
Jiawei Han and Micheline Kamber have done a wonderful job of organizing and
presenting data mining in this very readable textbook. They begin by giving quick
introductions to database and data mining concepts with particular emphasis on
data analysis. They review the current product offerings by presenting a general
framework that covers them all. They then cover in a chapter-by-chapter tour the
concepts and techniques that underlie classification, prediction, association, and
clustering. These topics are presented with examples, a tour of the best algorithms
for each problem class, and pragmatic rules of thumb about when to apply each
technique. I found this presentation style to be very readable, and I certainly
learned a lot from reading the book.
Jiawei Han and Micheline Kamber have been
leading contributors to data mining research. This is the text they use with their
students to bring them up to speed on the field. The field is evolving very rapidly,
but this book is a quick way to learn the basic ideas, and to understand where the
field is today. I found it very informative and stimulating, and I expect you will
too.