Clustering for High Availability
Clustering for high availability, sometimes called failover clustering, is much like the buddy system in scuba
diving. The idea behind the buddy system is simple: if your system (air supply) fails, you will be down
(unable to breathe) for a short period of time (downtime) until you are able to locate, notify, and gain access to (fail over to) your buddy's system (air supply).
Clustering for high availability significantly reduces system downtime caused by both unplanned
hardware and software failures and planned maintenance. It is important to note, however, that even with failover clustering you
will experience some downtime. Just as with the buddy system in scuba diving, you will be down (unable to breathe) for a short period of time.
Microsoft Cluster Service (MSCS) is one of many solutions for high availability on the Windows platform.
Microsoft includes this optionally installed service in Windows NT 4.0 Enterprise Edition, Windows 2000
Advanced Server (AS) and Datacenter Server (DC), and Windows Server 2003 Enterprise Edition and Datacenter
Edition. The Cluster Service can be installed as part of the initial operating system load or at a later
time. The standard Windows NT 4.0 Server, Windows 2000 Server, and Windows Server 2003 Standard Edition products do not include the Microsoft Cluster Service.
Once MSCS is installed and configured, high availability is achieved by providing an environment in which applications
such as database management systems (DBMS) can move from one server to another in the event of a
hardware or software failure. Both active/passive and active/active cluster configurations are supported. The
difference between the two is that a cluster configured as active/passive has at least one server
that remains idle. The passive server has no other responsibilities during normal day-to-day operations and exists only to take over when an active server fails. In an active/active configuration, every server carries its own workload and also stands ready to take over work from a failed server.
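The distinction between the two configurations can be illustrated with a minimal Python sketch. The Node class, the cluster_mode function, and the node and workload names are hypothetical and are not part of MSCS; the sketch only encodes the rule stated above, that an active/passive cluster has at least one idle server.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """A cluster server and the workloads it runs during normal operation."""
    name: str
    workloads: list[str] = field(default_factory=list)


def cluster_mode(nodes: list[Node]) -> str:
    """Classify a cluster by whether any server sits idle in normal operation."""
    has_idle_node = any(not node.workloads for node in nodes)
    return "active/passive" if has_idle_node else "active/active"


# Hypothetical two-node layouts.
active_passive = [Node("node1", ["DB2 instance A"]), Node("node2")]
active_active = [Node("node1", ["DB2 instance A"]), Node("node2", ["DB2 instance B"])]

print(cluster_mode(active_passive))
print(cluster_mode(active_active))
```

In the second layout a failure of either server leaves the survivor running both workloads, so each server should be sized with that combined load in mind.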
The process of moving resources from one server to another because of a hardware or software failure is called
failover. Failback, also called fallback, refers to moving applications or databases back to their
primary or preferred server. Failback can be configured to occur immediately after the failed
server is back online, or it can be deferred until a more appropriate time so as not to cause any additional
outages. High availability is also provided for outages normally associated with system maintenance. This
technique, referred to as rolling upgrades, relies on clustering support for high availability when systems
need to be brought down for hardware or software upgrades. Instead of bringing the whole system down for
maintenance, the clustered resources are moved temporarily to another server in the cluster while the maintenance is performed.
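The failover and failback sequence can be traced with a small toy model in Python. This is a sketch under stated assumptions, not how MSCS is implemented: the Cluster class, its method names, and the group and node names are all hypothetical, and the model simply picks the first surviving node as the failover target.

```python
class Cluster:
    """Toy model of resource groups moving between cluster nodes."""

    def __init__(self, nodes):
        self.online = {node: True for node in nodes}
        self.groups = {}  # group name -> {"preferred": node, "owner": node}

    def add_group(self, name, preferred):
        self.groups[name] = {"preferred": preferred, "owner": preferred}

    def owner(self, name):
        return self.groups[name]["owner"]

    def move(self, name, target):
        """Move a group to a node that is online (also used for planned moves)."""
        if not self.online[target]:
            raise RuntimeError(f"{target} is offline")
        self.groups[name]["owner"] = target

    def fail(self, node):
        """Mark a node as failed and fail its groups over to a survivor."""
        self.online[node] = False
        survivors = [n for n, up in self.online.items() if up]
        for name, group in self.groups.items():
            if group["owner"] == node:
                if not survivors:
                    raise RuntimeError("no surviving node to fail over to")
                self.move(name, survivors[0])

    def recover(self, node, immediate_failback=False):
        """Bring a node back online; failback is deferred unless requested."""
        self.online[node] = True
        if immediate_failback:
            self.fail_back(node)

    def fail_back(self, node):
        """Return every group whose preferred server is this node."""
        for name, group in self.groups.items():
            if group["preferred"] == node:
                self.move(name, node)


cluster = Cluster(["node1", "node2"])
cluster.add_group("DB2 Group", preferred="node1")

cluster.fail("node1")            # failover
after_failover = cluster.owner("DB2 Group")
cluster.recover("node1")         # node is back, failback deferred
after_recovery = cluster.owner("DB2 Group")
cluster.fail_back("node1")       # failback at a convenient time
after_failback = cluster.owner("DB2 Group")
print(after_failover, after_recovery, after_failback)
```

A rolling upgrade follows the same pattern with a planned move in place of a failure: move the group to the other node, service the idle server, then move the group back.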
The basic components consist of two servers that establish a cluster when the MSCS software is installed and
configured on both servers. Before the MSCS software is installed, the two servers must be able to
communicate with each other over a network. Although not required, a dedicated
private network between the two servers is highly recommended; the servers can use it to exchange heartbeats without
interference from traffic on the public network. Both servers must also have access to shared storage.
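The role of the heartbeat can be sketched in Python as follows. The text does not describe how MSCS detects a failed peer, so the HeartbeatMonitor class, the one-second interval, and the three-missed-beats limit are assumptions chosen only to illustrate the idea: a peer is presumed down once its heartbeats stop arriving for long enough.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat received from each peer (illustrative only)."""

    def __init__(self, interval=1.0, missed_limit=3):
        self.interval = interval          # assumed seconds between heartbeats
        self.missed_limit = missed_limit  # assumed beats missed before a peer is presumed down
        self.last_seen = {}

    def record(self, peer, now):
        """Note that a heartbeat from the peer arrived at time 'now'."""
        self.last_seen[peer] = now

    def is_alive(self, peer, now):
        """A peer is alive if it was heard from within the allowed window."""
        last = self.last_seen.get(peer)
        if last is None:
            return False
        return (now - last) < self.interval * self.missed_limit


monitor = HeartbeatMonitor()
monitor.record("node2", now=10.0)
print(monitor.is_alive("node2", now=11.5))  # within the 3-second window
print(monitor.is_alive("node2", now=13.5))  # 3.5 seconds of silence
```

Heartbeats delayed by congestion on a shared network could push a healthy peer past the window, which is the kind of interference a dedicated private network avoids.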
All of the database editions of DB2 UDB v8 for Windows support the Microsoft Cluster Service (MSCS).
DB2 UDB integrates directly into MSCS by registering a cluster-aware resource type
named DB2. Other DB2 UDB services, such as the DB2 Database Administration Server, can be clustered as
well by registering them as generic services. DB2 UDB simplifies the configuration tasks
required to enable a DB2 instance for high availability (HA) by providing a DB2 productivity tool called
DB2MSCS (db2mscs.exe), a DB2 system command utility that enables a DB2 instance for HA.
The DB2MSCS utility uses a configuration file to create or modify the MSCS resources needed to establish HA support for the instance.
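As a rough illustration of what such a configuration file might contain, the Python sketch below renders a handful of KEYWORD=value lines. The keyword names shown (DB2_INSTANCE, CLUSTER_NAME, GROUP_NAME, IP_ADDRESS, DISK_NAME) are assumptions that should be checked against the DB2 Command Reference for the installed version, all values are made up, and a real file will likely need further entries. The render_db2mscs_config helper is hypothetical and is not part of DB2.

```python
def render_db2mscs_config(settings):
    """Render settings as KEYWORD=value lines, one per line, in insertion order."""
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


# Assumed keyword names and hypothetical values; verify against the DB2 documentation.
settings = {
    "DB2_INSTANCE": "DB2",
    "CLUSTER_NAME": "MYCLUSTER",
    "GROUP_NAME": "DB2 Group",
    "IP_ADDRESS": "192.0.2.10",
    "DISK_NAME": "Disk E:",
}

config_text = render_db2mscs_config(settings)
print(config_text)
```

The rendered text would be saved to a file and passed to db2mscs.exe; the exact command-line syntax should also be confirmed in the DB2 Command Reference.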