Clustering for High Availability

Clustering for high availability, sometimes called failover clustering, is much like the buddy system in scuba diving. The idea behind the buddy system is very simple. If your system (air supply) fails, you will be down (unable to breathe) for a short period of time (down time) until you can locate, notify, and gain access (fail over) to your buddy's system (air supply).

Clustering for high availability significantly reduces the system down time caused by both unplanned failures and planned maintenance of hardware and software. It is important to note, however, that even with failover clustering you will experience some down time. Just as with the buddy system in scuba diving, you will be down (unable to breathe) for a short period of time.

Microsoft Cluster Service (MSCS) is one of many solutions for high availability on the Windows platform. Microsoft includes this optionally installed service in Windows NT 4.0 Enterprise Edition, Windows 2000 Advanced Server (AS) and Datacenter Server (DC), and Windows Server 2003 Enterprise Edition and Datacenter Edition. The Cluster Service can be installed as part of the initial operating system load or at a later time. The Windows NT 4.0 Server, Windows 2000 Server, and Windows Server 2003 Standard Edition products do not include the Microsoft Cluster Service.


Once MSCS is installed and configured, it achieves high availability by providing an environment in which applications such as database management systems (DBMS) can move from one server to another in the event of a hardware or software failure. Both active/passive and active/active cluster configurations are supported. The difference between the two is that an active/passive cluster has at least one server that remains idle: the passive server has no other responsibilities during normal day-to-day operations and is there only to take over when an active server fails. In an active/active cluster, every server carries its own workload during normal operations.


The process of resources moving from one server to another due to a hardware or software failure is called fail over. Fail back, also called fallback, refers to moving applications or databases back to their primary or preferred server. Fail back can be configured to happen immediately after the failed server is back online, or it can be deferred until a more appropriate time so as not to cause any additional outages.

High availability is also provided for outages normally associated with system maintenance. This technique, referred to as rolling upgrades, relies on clustering support for high availability when systems need to be brought down for hardware or software upgrades. Instead of bringing the whole system down for maintenance, the clustered resources are moved temporarily to another server in the cluster while the maintenance is performed.
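The fail over and fail back behavior described above can be pictured with a small model. This is only an illustrative sketch in Python, not how MSCS is implemented: the class names, the `immediate_failback` flag, and the node names are all hypothetical, and the "first surviving node" choice is a simplification.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ResourceGroup:
    """A clustered application (for example a DBMS instance) and where it runs."""
    name: str
    preferred_node: str
    current_node: str


@dataclass
class Cluster:
    """Hypothetical model of a failover cluster; not the MSCS API."""
    nodes: dict[str, bool]            # node name -> is the node online?
    groups: list[ResourceGroup] = field(default_factory=list)
    immediate_failback: bool = False  # False means fail back is deferred

    def node_failed(self, node: str) -> None:
        """Fail over: move every group on the failed node to a surviving node."""
        self.nodes[node] = False
        survivors = [n for n, online in self.nodes.items() if online]
        if not survivors:
            raise RuntimeError("no surviving node to fail over to")
        for group in self.groups:
            if group.current_node == node:
                group.current_node = survivors[0]

    def node_restored(self, node: str) -> None:
        """Bring a node back online; fail back now only if the policy says so."""
        self.nodes[node] = True
        if self.immediate_failback:
            self.fail_back(node)

    def fail_back(self, node: str) -> None:
        """Fail back: return groups to their preferred node if it is online."""
        if not self.nodes[node]:
            return
        for group in self.groups:
            if group.preferred_node == node:
                group.current_node = node
```

With `immediate_failback` left at `False`, a restored node simply rejoins the cluster and an administrator calls `fail_back` at a convenient time. The same two moves, a deliberate move away followed by a later move back, are what a rolling upgrade amounts to in this model.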

The basic components are two servers, which form a cluster once the MSCS software is installed and configured on both. Before the MSCS software is installed, the two servers must be able to communicate with each other over a network. Although not required, a dedicated private network between the two servers is highly recommended, so that they can exchange heartbeats without interference from traffic on the public network. Both servers must also have access to shared storage.
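To make the heartbeat idea concrete, here is a minimal Python sketch of how one server might decide that its partner has gone silent. It is an assumption-laden illustration rather than the MSCS mechanism: the `HeartbeatMonitor` class, its method names, and the timeout value are hypothetical.

```python
import time


class HeartbeatMonitor:
    """Track the most recent heartbeat received from the partner server."""

    def __init__(self, timeout_seconds: float, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds  # hypothetical tolerance
        self._clock = clock                     # injectable for testing
        self._last_seen = clock()

    def record_heartbeat(self) -> None:
        """Call whenever a heartbeat arrives over the private network."""
        self._last_seen = self._clock()

    def partner_is_alive(self) -> bool:
        """The partner counts as alive until the timeout elapses in silence."""
        return (self._clock() - self._last_seen) <= self.timeout_seconds
```

A delayed heartbeat looks the same as a dead partner to this check, which is one way to see why a private network free of public traffic is recommended.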

All of the database editions of DB2 UDB v8 for Windows support the Microsoft Cluster Service (MSCS). DB2 UDB integrates directly into MSCS by registering a cluster-aware resource type named DB2. Other DB2 UDB services, such as the DB2 Database Administration Server, can be clustered as well by registering them as generic services. DB2 UDB simplifies the configuration tasks required to enable a DB2 instance for high availability with a DB2 productivity tool called DB2MSCS (db2mscs.exe). DB2MSCS is a DB2 system command utility used to HA-enable a DB2 instance. It reads a configuration file and creates or modifies MSCS resources to establish HA support for the instance.
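As a rough idea of what such a configuration file looks like, the Python sketch below renders a set of `KEYWORD=value` lines. The keyword names are believed to match IBM's db2mscs documentation but should be verified against the Command Reference for your DB2 release. Every value (cluster name, user, addresses, disk names) is hypothetical, and the helper function itself is not part of DB2.

```python
def render_db2mscs_config(settings: dict) -> str:
    """Render settings as KEYWORD=value lines, one per line."""
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


# Hypothetical values for a single-partition instance; adjust for your site.
example_settings = {
    "DB2_INSTANCE": "DB2",
    "CLUSTER_NAME": "EXAMPLECLUSTER",
    "DB2_LOGON_USERNAME": "db2admin",
    "GROUP_NAME": "DB2 Group",
    "IP_NAME": "DB2 IP Address",
    "IP_ADDRESS": "192.0.2.10",
    "IP_SUBNET": "255.255.255.0",
    "IP_NETWORK": "Public Network",
    "NETNAME_NAME": "DB2 Network Name",
    "NETNAME_VALUE": "DB2SRV",
    "DISK_NAME": "Disk E:",
    "INSTPROF_DISK": "Disk E:",
}

if __name__ == "__main__":
    # Write the file, then run the utility by hand, for example:
    #   db2mscs -f:db2mscs.cfg
    # (invocation as documented by IBM; confirm for your release)
    with open("db2mscs.cfg", "w", encoding="ascii") as cfg:
        cfg.write(render_db2mscs_config(example_settings))
```

Generating the file from one dictionary keeps the instance name, IP address, and disk choices in a single place, which helps when the same settings have to be reviewed before the utility is run against a live cluster.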



Copyright © 1998 - 2018 Ten Digit Consulting, LLC | All rights Reserved