| An overview of the multidisciplinary field of data mining, this book focuses specifically on new methodologies and case studies. Included are case studies written by 44 leading scientists and talented young scholars from seven different countries. Topics covered include data mining based on rough sets, the impact of missing data, and mining free text for structure. In addition, the book addresses the four basic mining operations, each supported by numerous mining techniques: predictive model creation supported by supervised induction techniques; link analysis supported by association discovery and sequence discovery techniques; database segmentation supported by clustering techniques; and deviation detection supported by statistical techniques.
Data acquired for analysis can take many different forms. We will describe the analysis of data that can be thought of as samples drawn from a population, and the conclusions will be phrased as properties of this larger population. We will focus on very simple models. As the investigator’s understanding of a problem area improves, the statistical models tend to become more complex. Some examples of such areas are genetic linkage studies, ecosystem studies, and functional MRI investigations, where the signals extracted from measurements are very weak but potentially extremely useful for the application area. Experiments are typically analyzed using a combination of visualization, Bayesian analysis, and conventional test- and confidence-based statistics. In engineering and commercial applications of data mining, the goal is not normally to arrive at eternal truths, but to support decisions in design and business. Nevertheless, because of the competitive nature of these activities, one can expect well-founded analytical methods and understandable models to provide more useful answers than ad hoc ones.
About the Author: John Wang is a professor in the department of information and decision sciences at Montclair State University. He holds a Ph.D. in operations research from Temple University and has worked as an assistant professor at Beijing University of Sciences and Technology. He has served as a referee for Operations Research and IEEE Transactions on Control Systems Technology. His current research interests include optimization, nonlinear programming, and manufacturing systems engineering. He lives in Upper Montclair, New Jersey. |