| Data mining, or knowledge discovery in databases, is a large area of study covered by numerous theoretical and practical textbooks. In this book, we take a focused and comprehensive look at one topic within this field: mining data that is represented as a graph. We attempt to cover the full breadth of the topic, including graph manipulation, visualization, and representation, mining techniques for graph data, and application of these ideas to problems of current interest.
The book is divided into three parts. Part I, Graphs, offers an introduction to basic graph terminology and techniques. In Part II, Mining Techniques, we take a detailed look at computational techniques for extracting patterns from graph data. This part provides an overview of the state of the art in frequent substructure mining, link analysis, graph kernels, and graph grammars. Part III, Applications, describes the application of mining techniques to four graph-based application domains: chemical graphs, bioinformatics data, Web graphs, and social networks.
The book is targeted toward graduate students, faculty, and researchers from industry and academia who have some familiarity with basic computer science and data mining concepts. The book is designed so that individuals with no background in analyzing graph data can learn how to represent the data as graphs, extract patterns or concepts from the data, and see how researchers apply the methodologies to real datasets.
For those readers who would like to experiment with the techniques found in this book or test their own ideas on graph data, we have set up a Web page for the book at http://www.eecs.wsu.edu.mgd. This site contains additional information on current techniques for mining graph data. Links are also given to implementations of the techniques described in this book, as well as graph datasets that can be used for testing new or existing algorithms.
With the advent of and continued prospect for large databases containing relational and graphical information, the discovery of knowledge in such data is an important challenge to the scientific and industrial communities. Fielded applications for mining graph data from real-world domains have the potential to make significant contributions of new knowledge. We hope that this book accelerates progress toward meeting this challenge. |