Fifteen years ago, Michael and I wrote the first version of this book. A little more than 400 pages, the book fulfilled our goal of surveying the field of data mining by bridging the gap between the technical and the practical, by helping business people understand the data mining techniques and by helping technical people understand the business applications of these techniques. When Bob Elliott, our editor at Wiley, asked us to write the third edition of Data Mining Techniques, we happily said “yes,” conveniently forgetting the sacrifices that writing a book requires in our personal lives. We also knew that the new edition would be considerably reworked from the previous two editions.
In the past 15 years, the field has broadened and so has the book, both figuratively and literally. The second edition, published in 2004 and expanded to 600 pages, introduced two key new technical chapters covering survival analysis and statistical algorithms that had then become (and still are) increasingly important for data miners. Once again, this edition introduces new technical areas, particularly text mining and principal components, along with a wealth of new examples and enhanced technical descriptions in all the chapters. These examples come from a broad cross-section of industries, including financial services, retailing, telecommunications, media, insurance, health care, and web-based services.
As practitioners in the field, we have also continued to learn. Between us, we now have about half a century of experience in data mining. Since 1999, Michael and I have been teaching courses through the Business Knowledge Series at SAS Institute (this series is separate from the software side of the business and brings in outside experts to teach non-software-specific courses), through the Data Warehouse Institute, and in onsite classes at many different companies. Our role as instructors in these courses has introduced us to thousands of diverse business people working in many industries. One of these courses, “Business Data Mining Techniques,” was based on the second edition of this book. These courses provide a wealth of feedback about the subject of data mining, about what people are doing in the real world, and about how best to present these ideas so they can be readily understood. Much of this feedback is reflected in this new edition. We seem to learn as much from our students as our students learn from us.