In enterprises, a large volume of data has been collected and stored in data warehouses. Advances in data gathering, storage, and distribution have created a need to integrate data warehousing and data mining techniques. Mining data warehouses raises unique issues and requires special attention. Data warehousing and data mining are interrelated and require holistic techniques from both disciplines. The “Advanced Topics in Data Warehousing and Mining” series addresses some of the issues related to mining data warehouses. Volume 1, which opens the series, includes 12 chapters in four sections, contributed by authors and editorial board members from the International Journal of Data Warehousing and Mining.
With the large number of companies using the Internet to distribute and collect information, knowledge discovery on the Web, or Web mining, has become an important research area. Web mining can be divided into three areas, namely Web content mining, Web structure mining, and Web usage mining (also called Web log mining) (Cooley, Srivastava, & Mobasher, 1997). Web content mining focuses on the discovery of information stored on the Internet, that is, the domain of the various search engines. Web structure mining can be used to improve the structural design of a Web site. Web usage mining, the main topic of this chapter, focuses on knowledge discovery from the usage of individual Web sites.
Web usage mining is mainly based on the activities recorded in the Web log, the log file written by the Web server recording individual requests made to the server. An important notion in a Web log is the existence of user sessions. A user session is a sequence of requests from a single user within a certain time window. Of particular interest is the discovery of frequently performed sequences of actions by the Web user, that is, frequent sequences of visited Web pages.
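To make the notions of user session and frequent sequence concrete, the following is a minimal Python sketch, not the method developed in this chapter. It rests on several assumptions that do not come from the text: each log entry has already been reduced to a `(user, timestamp, page)` tuple, the "time window" is interpreted as an inactivity timeout of 1,800 seconds, a "sequence" means a run of consecutively visited pages, and the function names `sessionize` and `frequent_sequences` as well as the sample requests are purely illustrative.

```python
from collections import Counter, defaultdict


def sessionize(requests, timeout=1800):
    """Group (user, timestamp, page) requests into per-user sessions.

    Assumption: a new session starts whenever the gap between two
    consecutive requests of the same user exceeds `timeout` seconds.
    """
    by_user = defaultdict(list)
    for user, ts, page in requests:
        by_user[user].append((ts, page))

    sessions = []
    for hits in by_user.values():
        hits.sort(key=lambda hit: hit[0])  # order each user's requests by time
        current, last_ts = [], None
        for ts, page in hits:
            if last_ts is not None and ts - last_ts > timeout:
                sessions.append(current)  # gap too long: close the session
                current = []
            current.append(page)
            last_ts = ts
        if current:
            sessions.append(current)
    return sessions


def frequent_sequences(sessions, length=2, min_support=2):
    """Return contiguous page sequences occurring in >= min_support sessions."""
    counts = Counter()
    for pages in sessions:
        # Use a set so a sequence is counted at most once per session.
        seen = {tuple(pages[i:i + length])
                for i in range(len(pages) - length + 1)}
        counts.update(seen)
    return {seq: n for seq, n in counts.items() if n >= min_support}


# Hypothetical sample data: timestamps are seconds since an arbitrary origin.
requests = [
    ("u1", 0, "/home"), ("u1", 60, "/products"), ("u1", 120, "/cart"),
    ("u2", 10, "/home"), ("u2", 50, "/products"),
    ("u1", 5000, "/home"), ("u1", 5030, "/about"),
]

sessions = sessionize(requests)
frequent = frequent_sequences(sessions, length=2, min_support=2)
print(sessions)
print(frequent)
```

In this sample, user `u1` produces two sessions because of the long pause before the request at time 5000, and the only page pair supported by at least two sessions is `/home` followed by `/products`. Real Web logs would additionally require parsing the server's log format and deciding how to identify a "single user", both of which are left out of this sketch.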