Since the 1960s, database systems have played a significant role in the
information technology field. By the mid-1960s, several systems were
commercially available. Hierarchical and network database systems provided
two different perspectives and data models to organize data collections. In 1970,
E. F. Codd published the paper "A Relational Model of Data for Large Shared
Data Banks," proposing a model based on relational table structures. Relational
databases became appealing for industries in the 1980s, and their wide adoption
fostered new research and development activities toward advanced data models
such as the object-oriented and extended relational models. The online transaction processing
(OLTP) support provided by relational database systems was fundamental to
the success of this data model. Even though traditional operational systems
were the best solution to manage transactions, new needs related to data analysis and
decision support tasks led in the late 1980s to a new architectural model called the data
warehouse. It includes extraction, transformation, and loading (ETL) primitives and
online analytical processing (OLAP) support to analyze data. From OLTP to OLAP,
from transaction to analysis, from data to information, from the entity-relationship
data model to a star/snowflake one, and from a customer-oriented perspective to
a market-oriented one, data warehouses emerged as a data repository architecture to
perform data analysis and mining tasks. Relational, object-oriented, transactional,
spatiotemporal, and multimedia data warehouses are some examples of database
sources. Yet, the World Wide Web can be considered another fundamental and
distributed data source (in the Web 2.0 era it stores crucial information – from a
market perspective – about user preferences, navigation, and access patterns).
Accessing and processing large amounts of data distributed across several countries
requires a huge amount of computational power, storage, middleware services,
specifications, and standards.