High Availability and Disaster Recovery: Concepts, Design, Implementation

High Availability and Disaster Recovery: Concepts, Design, Implementation, 9783540244608 (3540244603), Springer, 2006

Companies and other organizations depend more than ever on the availability of their information technology, and most mission-critical business processes are IT-based. Business continuity, the ability to do business under any circumstances, is an essential requirement that modern companies face. High availability and disaster recovery are IT's contributions to fulfilling this requirement. Companies will be confronted with such demands to an even greater extent in the future, since their credit ratings will be lower without such precautions.

Both high availability and disaster recovery are realized through redundant systems. Redundancy can and should be implemented on different abstraction levels: from the hardware, the operating system, and middleware components up to the backup computing center in case of a disaster. This book presents requirements, concepts, and realizations of redundant systems on all abstraction levels, and all examples refer to UNIX and Linux systems.

During the last 15 years I have been involved in the planning, deployment, and operation of IT systems for major companies. Those systems are mission-critical: the customers’ business depends on their availability. The systems are required to be highly available and to protect against all kinds of problems, from hardware failures, software issues, and human errors through to physical disasters.

I learned that there are misunderstandings between customers, planners, and vendors about what high availability is, what can be achieved with IT systems, and where their limitations are. I also recognized that disaster recovery is only a feature of high availability, but is often seen as an independent topic.

This book addresses this area with an end-to-end view and makes it available as a single piece of material: from requirements gathering to planning, implementation, and operations. It also supplies another missing piece: an approach to developing an architecture that leads to highly available systems that are robust and able to recover from the relevant failure scenarios. Identifying failure scenarios, however, is still a kind of art, mostly based on individual experience. Selection of a solution is driven by a process, and not by products that claim protection but do not consider the whole picture.

With that end-to-end view, we get a structured approach that leads from requirements to possible failure scenarios to a successful solution. That was the motivation for writing this book. It addresses these topics and is targeted at all parties involved, enabling them to speak a common language and manage their mutual expectations. The goal of this book is to explain and discuss architecture, technology, solutions, and processes. Since products and features are too short-lived for the aim of this book, it does not review, compare, or recommend any particular products.