Nov 26 2012
 

Much as a chain is only as strong as its weakest link, the effectiveness of a high availability cluster is limited by any single points of failure (SPOFs) that exist within its deployment. To ensure the highest levels of availability, SPOFs must be removed, and there is a straightforward method for ridding the cluster of these weak links.

First, identify any SPOFs that exist, paying particular attention to servers, network connections, and storage devices. Modern servers come with redundant, error-correcting memory, data striping across hard disks, and multiple CPUs, which eliminates most hardware components as SPOFs. Software and human error, however, can still cause server or application downtime. Deploying a high availability cluster solution that monitors the health of servers and critical applications, and that takes automatic recovery actions in the event of failure, eliminates this SPOF. All clustering solutions provide basic ping tests to validate server functionality, but only the more advanced offerings also track application health and can automatically recover from detected failures. This deeper level of detection and recovery minimizes downtime.

Architecting all components of the cluster for redundancy is paramount to maximizing uptime. Connections to storage often represent a SPOF, so it is critical that multipathing be built into any shared storage configuration. Linux DM Multipath (DM-MPIO) reroutes block I/O to an alternate path in the event of a path failure, eliminating every component in the path from server to storage as a potential SPOF and providing automatic recovery should a failure occur.
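To give a feel for what that looks like in practice, here is a minimal sketch of a DM-MPIO setup on a Debian/Ubuntu system. The package commands are standard, but the blacklist pattern, path policy, and device names are illustrative only and will vary with your HBAs and storage array.

# apt-get install multipath-tools    (userspace tools for DM-MPIO)

A bare-bones /etc/multipath.conf might look like this:

    defaults {
        user_friendly_names yes          # present devices as /dev/mapper/mpathN
        path_grouping_policy failover    # use one path at a time; fail over on error
    }
    blacklist {
        devnode "^sda$"                  # keep the local boot disk out of multipath
    }

# service multipath-tools restart    (pick up the new configuration)
# multipath -ll                      (list multipath devices and the status of each path)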

But even when configured with multipathing, shared storage/SANs still represent single points of failure, as does the physical data center where they are located. To provide further protection, off-site replication of critical data combined with cross-site clustering must be deployed. Combined with network redundancy between sites, this solution removes all SPOFs. Real-time replication ensures that an up-to-date copy of business-critical data is always available; replicating off-site to a backup data center or into a cloud service also protects against primary data center outages caused by fire, power failures, and similar events.
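This post doesn’t prescribe a particular replication tool, but as one hedged illustration of what real-time, host-based replication to a remote site can look like on Linux, a DRBD resource definition along the following lines mirrors a local block device to a disaster-recovery node over IP. The hostnames, addresses, and devices are placeholders; commercial tools such as SIOS DataKeeper or array-based replication can fill the same role.

    resource r0 {
        protocol A;                      # asynchronous replication, tolerant of WAN latency
        on primary-node {                # hypothetical hostname of the production server
            device    /dev/drbd0;        # replicated device the application uses
            disk      /dev/sdb1;         # local backing disk
            address   10.10.1.10:7789;
            meta-disk internal;
        }
        on dr-node {                     # hypothetical hostname of the off-site server
            device    /dev/drbd0;
            disk      /dev/sdb1;
            address   192.168.50.10:7789;
            meta-disk internal;
        }
    }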

Application-level monitoring and automatic recovery, multipathing for shared storage, and data replication for off-site protection each eliminate potential single points of failure within your cluster architecture. Paying attention to these components during cluster design and deployment will ensure the greatest possible levels of uptime.

Nov 08 2012
 

A software iSCSI target can be a great way to set up shared storage when you don’t have enough dough to afford pricey SAN hardware. The iSCSI target acts just like a real hardware iSCSI array, except that it’s just a piece of software running on a traditional server (or even a VM!). Setting up an iSCSI target is an easy and low-cost way to get the shared storage you need, whether you’re using a clustering product like Microsoft Windows Server Failover Clustering (WSFC) or a cluster filesystem such as GFS or OCFS, or you simply want to get the most out of your virtualization platform (be it VMware, XenServer, or Hyper-V) by enabling storage pooling and live migration.

About LIO-Target

Recently, the Linux kernel has adopted LIO-Target as the standard iSCSI target for Linux. LIO-Target is available in Linux kernels 3.1 and higher. LIO-Target supports SCSI-3 Persistent Reservations, which are required by Windows Server Failover Clustering, VMware vSphere, and other clustering products. The LUNs (disks) presented by the iSCSI target can be entire disks, partitions, or even just plain old files on the filesystem. LIO-Target supports all of these options.

Below, we’ll walk through the steps to configure LIO-Target on an Ubuntu 12.04 server (other recent distros will probably work also, but the steps may vary slightly).

Configuration Steps

First, install the LIO-Target packages:

# apt-get install --no-install-recommends targetcli python-urwid

LIO-Target is controlled using the targetcli command-line utility.

The first step is to create the backing store for the LUN. In this example, we’ll use a file-backed LUN, which is just a normal file on the filesystem of the iSCSI target server.

# targetcli

/> cd backstores/
/backstores> ls
o- backstores …………………………………………………… […]
o- fileio …………………………………………. [0 Storage Object]
o- iblock …………………………………………. [0 Storage Object]
o- pscsi ………………………………………….. [0 Storage Object]
o- rd_dr ………………………………………….. [0 Storage Object]
o- rd_mcp …………………………………………. [0 Storage Object]

/backstores> cd fileio

/backstores/fileio> help create  (for help)

/backstores/fileio> create lun0 /root/iscsi-lun0 2g  (create 2GB file-backed LUN)

Now the LUN is created. Next we’ll set up the target so client systems can access the storage.

/backstores/fileio/lun0> cd /iscsi

/iscsi> create   (create iqn and target port group)

Created target iqn.2003-01.org.linux-iscsi.murray.x8664:sn.31fc1a672ba1.
Selected TPG Tag 1.
Successfully created TPG 1.
Entering new node /iscsi/iqn.2003-01.org.linux-iscsi.murray.x8664:sn.31fc1a672ba1/tpgt1

/iscsi/iqn.20…a672ba1/tpgt1> set attribute authentication=0   (turn off chap auth)

/iscsi/iqn.20…a672ba1/tpgt1> cd luns

/iscsi/iqn.20…a1/tpgt1/luns> create /backstores/fileio/lun0   (create the target LUN)
Selected LUN 0.
Successfully created LUN 0.
Entering new node /iscsi/iqn.2003-01.org.linux-iscsi.murray.x8664:sn.31fc1a672ba1/tpgt1/luns/lun0

/iscsi/iqn.20…gt1/luns/lun0> cd ../../portals

iSCSI traffic can consume a lot of bandwidth, so you’ll probably want the iSCSI traffic to be on a dedicated (or SAN) network, rather than your public network.

/iscsi/iqn.20…tpgt1/portals> create 10.10.102.164  (create portal to listen for connections)
Using default IP port 3260
Successfully created network portal 10.10.102.164:3260.
Entering new node /iscsi/iqn.2003-01.org.linux-iscsi.murray.x8664:sn.31fc1a672ba1/tpgt1/portals/10.10.102.164:3260

/iscsi/iqn.20….102.164:3260> cd ..

/iscsi/iqn.20…tpgt1/portals> create 10.11.102.164
Using default IP port 3260
Successfully created network portal 10.11.102.164:3260.
Entering new node /iscsi/iqn.2003-01.org.linux-iscsi.murray.x8664:sn.31fc1a672ba1/tpgt1/portals/10.11.102.164:3260

/iscsi/iqn.20…102.164:3260> cd ../../acls

Now you’ll just need to register the iSCSI initiators (client systems). To do this, you’ll need to find the initiator names of the systems. For Linux, this will usually be in /etc/iscsi/initiatorname.iscsi. For Windows, the initiator name is found on the Configuration tab of the iSCSI Initiator Properties panel.
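On a Linux client with open-iscsi installed, for example, you can read the IQN straight from that file (the value shown here is the one registered below):

# cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.1994-05.com.redhat:f5b312caf756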

/iscsi/iqn.20…a1/tpgt1/acls> create iqn.1994-05.com.redhat:f5b312caf756   (register initiator — this IQN is the IQN of the initiator — do this for each initiator that will access the target)
Successfully created Node ACL for iqn.1994-05.com.redhat:f5b312caf756
Created mapped LUN 0.
Entering new node /iscsi/iqn.2003-01.org.linux-iscsi.murray.x8664:sn.31fc1a672ba1/tpgt1/acls/iqn.1994-05.com.redhat:f5b312caf756

/iscsi/iqn.20….102.164:3260> cd /

Now, remember to save the configuration. Without this step, the configuration will not be persistent.
/> saveconfig  (SAVE the configuration!)

/> exit


You’ll now need to connect your initiators to the target. You’ll generally need to provide the IP address of the target to connect to it. After the connection is made, the client systems will see a new disk. The disk will need to be formatted before use.
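On a Linux client, for instance, a rough open-iscsi session might look like the following. The target IQN is the one created above, and /dev/sdX stands in for whatever device name your system actually assigns, so double-check it before formatting.

# apt-get install open-iscsi                               (install the software initiator)
# iscsiadm -m discovery -t sendtargets -p 10.10.102.164    (discover targets on the portal)
# iscsiadm -m node -T iqn.2003-01.org.linux-iscsi.murray.x8664:sn.31fc1a672ba1 \
      -p 10.10.102.164 --login                             (log in to the target)
# dmesg | tail                                             (note the new disk, e.g. /dev/sdX)
# mkfs.ext4 /dev/sdX                                       (format it; Windows or cluster filesystems will use their own tools)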

And that’s it! You’re ready to use your new SAN. Have fun!

Nov 07 2012
 

Host-based or storage-based?

Data can be replicated from two common platforms: the server host that operates on the data, or the storage array that holds the data.

Host-based replication doesn’t lock users into a particular storage array from any one vendor. SIOS SteelEye DataKeeper, for example, can replicate from any array to any array, regardless of vendor. This ability ultimately lowers costs and provides users the flexibility to choose what is right for their environment. Most host-based replication solutions can also replicate data natively over IP networks, so users don’t need to buy expensive hardware to achieve this functionality.

Storage-based replication is OS-independent and adds no processing overhead. However, vendors often demand that users replicate from and to similar arrays. This requirement can be costly, especially when you use a high-performance disk at your primary site — and now must use the same at your secondary site. Also, storage-based solutions natively replicate over Fibre Channel and often require extra hardware to send data over IP networks, further increasing costs.

When creating remote replicas for business continuity, the decision whether to deploy a host- or storage-based solution depends heavily on the platform that is being replicated and the business requirements for the applications that are in use. If the business demands zero impact to operations in the event of a site disaster, then host-based techniques provide the only feasible solution.

Host-based solutions are storage-agnostic, providing IT managers complete freedom to choose any storage that matches the needs of the enterprise. Host-based replication software functions with any storage hardware that can be mounted to the application platform, offering heterogeneous storage support. Host-based solutions that operate at the block or volume level are also ideally suited for cluster configurations.

One disadvantage is that host-based solutions consume server resources and can affect overall server performance. Despite this possibility, a host-based solution might still be appropriate when IT managers need a multi-vendor storage infrastructure or have a legacy investment or internal expertise in a specific host-based application.

A storage-based alternative does provide the benefit of an integrated solution from a dedicated storage vendor. These solutions leverage the controller of the storage array as an operating platform for replication functionality. The tight integration of hardware and software gives the storage vendor unprecedented control over the replication configuration and allows for service-level guarantees that are difficult to match with alternative replication approaches. Most storage vendors have also tailored their products to complement server virtualization and use key features such as virtual machine storage failover. Some enterprises might also have a long-standing business relationship with a particular storage vendor; in such cases, a storage solution might be a relevant fit.

High quality of service comes at a cost, however. Storage-based replication invariably sets a precondition of like-to-like storage device configuration. This means that two similarly configured high-end storage arrays must be deployed to support replication functionality, increasing costs and tying the organization to one vendor’s storage solution.

This locking in to a specific storage vendor can be a drawback. Some storage vendors have compatibility restrictions within their storage-array product line, potentially making technology upgrades and data migration expensive. When investigating storage alternatives, IT managers should pay attention to the total cost of ownership: The cost of future license fees and support contracts will affect expenses in the longer term.

Cost is a key consideration, but it is affected by several factors beyond the cost of the licenses. Does the solution require dedicated hardware, or can it be used with pre-existing hardware? Will the solution require network infrastructure expansion and if so, how much? If you are using replication to place secondary copies of data on separate servers, storage, or sites, realize that this approach implies certain hardware redundancies. Replication products that provide options to redeploy existing infrastructure to meet redundant hardware requirements demand less capital outlay.

Before deciding between a host-based or storage-based replication solution, carefully weigh the pros and cons of each, summarized below.

Host-Based Replication

Pros:
  • Storage agnostic
  • Sync and async replication
  • Data can reside on any storage
  • Unaffected by storage upgrades

Cons:
  • Uses computing resources on the host

Best fit:
  • Multi-vendor storage environment
  • Need the option of sync or async
  • Implementing a failover cluster
  • Replicating to multiple targets

Storage-Based Replication

Pros:
  • Single vendor for storage and replication
  • No burden on the host system
  • OS agnostic

Cons:
  • Vendor lock-in
  • Higher cost
  • Data must reside on the array
  • Distance limitations of Fibre Channel

Best fit:
  • Prefer a single vendor
  • Limited distance and controlled environment
  • Replicating to a single target
Nov 07 2012
 

Are you familiar with the Availability Equation? In a nutshell, this equation shows how the total time needed to restore an application to usability is equal to the time required to detect that an application is experiencing a problem plus the time required to perform a recovery action:

T_restore = T_detect + T_recover

The equation introduces the key concepts of high availability (HA): clustering, problem detection, and subsequent recovery. HA solutions monitor the health of business application components; when problems are detected, these solutions act to restore them to service. The objective of deploying an HA solution is to minimize downtime.
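For example, if it takes a cluster 60 seconds to notice that a database has stopped responding and another 120 seconds to bring it back into service, users see 180 seconds of downtime; cutting detection time to 10 seconds trims the outage to 130 seconds without changing the recovery mechanism at all.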

Reducing detection and recovery times are two important tasks of any HA solution that you choose to deploy. Today’s applications are combinations of technologies: servers, storage, network infrastructure, and so on. When reviewing your HA options, be certain that you understand the technologies that each solution uses to detect and recover from all outage types. Each technology has a direct impact on service restoration times.

One technology that is crucial to providing the fastest possible restoration time is known as local detection and recovery (aka service-level problem detection and recovery). In a basic clustering solution, servers are connected and configured such that one or more servers can take over the operations of another in the event of a server failure. The server nodes in the cluster continuously send small data packets, often called heartbeat signals, to each other to indicate that they are “alive”.

In simple clustered environments, when one server stops generating heartbeats, other cluster members assume that this server is down and begin the process of taking over responsibility for that server’s domain of operation. This approach is adequate for detecting failure at the server level. But unless problems cause the interruption or cessation of heartbeat signals, server-level detection is inadequate. More than that, it can actually magnify the extent and impact of an outage.

For example, if Apache processes hang, the server may still send heartbeats — even though the Web server subsystem has ceased to perform its primary function. Rather than restart the Apache subsystem on the same or a different server, a basic server-level clustering solution would restart the entire software stack of the failed server on a backup server, thereby causing interruption to users and extending recovery time.

Using local detection and recovery, advanced clustering solutions deploy health-monitoring agents within individual cluster servers, to monitor individual system components such as a file system, a database, user-level application, IP address, and so on. These agents use heuristics that are specific to the monitored component. Therefore, the agents can predict and detect operational issues and then take the most appropriate recovery action. Often, the most efficient recovery method is to stop and restart the problem subsystem on the same server.
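As a loose illustration of the kind of heuristic such an agent might apply (this is a deliberately crude sketch, not code from any particular HA product), a web-server check could probe the service itself and attempt a local restart before escalating:

    #!/bin/sh
    # Crude illustrative health check: probe Apache over HTTP, try a local
    # restart first, and only escalate if local recovery fails.
    URL="http://localhost/"                    # hypothetical health-check URL

    if ! curl -sf -m 5 "$URL" > /dev/null; then
        service apache2 restart                # local recovery: restart on the same node
        sleep 10
        if ! curl -sf -m 5 "$URL" > /dev/null; then
            logger "apache check failed after local restart; escalating to failover"
            exit 1                             # non-zero exit tells the cluster to take over
        fi
    fi
    exit 0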

By detecting failures at a finer granularity than server-level heartbeats, and by enabling recovery within the same physical server, you can greatly reduce the time needed to restore an application to users. Solutions such as the SteelEye Protection Suite for Linux from SIOS provide this level of detection and recovery for your environment. Make certain that whichever HA solution you deploy also supports local detection and recovery.

Nov 01 2012
 

When selecting a high-availability (HA) solution, you should consider several criteria. These range from the total cost of the solution, to the ease with which you can configure and manage the cluster, to the specific restrictions placed on hardware and software. This post touches briefly on 12 of the most important checklist items.

1. Support for standard OS and application versions

Solutions that require enterprise or advanced versions of the OS, database, or application software can greatly reduce the cost benefits of moving to a commodity server environment. By deploying the proper HA middleware, you can make standard versions of applications and OSs highly available and meet the uptime requirements of your business environment.

2. Support for a variety of data storage configurations

When you deploy an HA cluster, the data that the protected applications require must be available to all systems that might need to bring the applications into service. You can share this data via data replication, by using shared SCSI or Fibre Channel storage, or by using a NAS device. Whichever method you decide to deploy, the HA product that you use must be able to support all data configurations so that you can change as your business needs dictate.

3. Ability to use heterogeneous solution components

Some HA clustering solutions require that every system within the cluster has identical configurations. This requirement is common among hardware-specific solutions in which clustering technology is meant to differentiate servers or storage and among OS vendors that want to limit the configurations they are required to support. This restriction limits your ability to deploy scaled-down servers as temporary backup nodes and to reuse existing hardware in your cluster deployment. Deploying identically configured servers might be the correct choice for your needs, but the decision shouldn’t be dictated by your HA solution provider.

4. Support for more than two nodes within a cluster

The number of nodes that can be supported in a cluster is an important measure of scalability. Entry-level HA solutions typically limit you to one two-node cluster, usually in active/passive mode. Although this configuration provides increased availability (via the addition of a standby server), it can still leave you exposed to application downtime. In a two-node cluster configuration, if one server is down for any reason, then the remaining server becomes a single point of failure. By clustering three or more nodes, you not only gain the ability to provide higher levels of protection, but you can also build highly scalable configurations.

5. Support for active/active and active/standby configurations

In an active/standby configuration, one server is idle, waiting to take over the workload of its cluster member. This setup has the obvious disadvantage of underutilizing your compute resource investment. To get the most benefit from your IT expenditure, ensure that cluster nodes can run in an active/active configuration.

6. Detection of problems at node and individual service levels

All HA software products can detect problems with cluster server functionality. This task is done by sending heartbeat signals between servers within the cluster and initiating a recovery if a cluster member stops delivering the signals. But advanced HA solutions can also detect another class of problems, one that occurs when individual processes or services encounter problems that render them unavailable but that do not cause servers to stop sending or responding to heartbeat signals. Given that the primary function of HA software is to ensure that applications are available to end users, detecting and recovering from these service level interruptions is a crucial feature. Ensure that your HA solution can detect both node- and service-level problems.

7. Support for in-node and cross-node recovery

The ability to perform recovery actions both across cluster nodes and within a node is also important. In cross-node recovery, one node takes over the complete domain of responsibility for another: when system-level heartbeats are missed, the server that should have sent them is assumed to be out of operation, and other cluster members begin recovery operations. With in-node (local) recovery, the cluster first attempts to restore a failed service within the server on which it is running, typically by stopping and restarting the service and any dependent system resources. This recovery method is much faster and minimizes downtime.

8. Transparency to client connections of server-side recovery

Server-side recovery of an application or even of an entire node should be transparent to client-side users. Through the use of virtualized IP addresses or server names, the mapping of virtual compute resources onto physical cluster entities during recovery, and automatic updating of network routing tables, no changes to client systems are necessary for the systems to access recovered applications and data. Solutions that require manual client-side configuration changes to access recovered applications greatly increase recovery time and introduce the risk of additional errors due to required human interaction. Recovery should be automated on both the servers and clients.
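To make the mechanics concrete, here is roughly what moving a virtual IP address between Linux nodes involves; a real HA product performs these steps automatically, and the address and interface below are placeholders (the second command assumes the iputils arping).

# ip addr add 10.0.0.100/24 dev eth0     (bring up the virtual IP on the recovery node)
# arping -U -I eth0 -c 3 10.0.0.100      (send gratuitous ARP so switches and clients learn the address's new location)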

9. Protection for planned and unplanned downtime

In addition to providing protection against unplanned service outages, the HA solution that you deploy should be usable as an administration tool to lessen downtime caused by maintenance activities. By providing a facility to allow on-demand movement of applications between cluster members, you can migrate applications and users onto a second server while performing maintenance on the first. This can eliminate the need for maintenance windows in which IT resources are unavailable to end users. Ensure that your HA solution provides a simple and secure method for performing manual (on-demand) movement of applications and needed resources among cluster nodes.

10. Off-the-shelf protection for common business functions

Every HA solution that you evaluate should include tested and supported agents or modules that are designed to monitor the health of specific system resources: file systems, IP addresses, databases, applications, and so on. These modules are often called recovery modules. By deploying vendor-supplied modules, you benefit both from the run-time experience that the vendor and other customers have already accumulated and from the assurance of ongoing support and maintenance of these solution components.

11. Ability to easily incorporate protection for custom business applications

There will likely be applications, perhaps custom to your corporation, that you want to protect but for which there are no vendor-supplied recovery modules. It is important, therefore, that you have a method for easily incorporating your business application into your HA solution’s protection schema. You should be able to do this without modifying your application, and especially without having to embed any vendor-specific APIs. A software developer’s kit that provides examples and a step-by-step process for protecting your application should be available, along with vendor-supplied support services, to assist as needed.
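As a rough sketch of how lightweight such an integration can be, many products can drive a custom application through a plain script that exposes start, stop, and monitor entry points. Everything below (paths, names, the use of start-stop-daemon) is hypothetical rather than any vendor's actual interface.

    #!/bin/sh
    # Hypothetical wrapper exposing start/stop/monitor entry points for a custom
    # application, without modifying the application itself.
    APP="/opt/myapp/bin/myapp"            # placeholder path to the application binary
    PIDFILE="/var/run/myapp.pid"

    case "$1" in
      start)
          start-stop-daemon --start --background --make-pidfile \
              --pidfile "$PIDFILE" --exec "$APP"
          ;;
      stop)
          start-stop-daemon --stop --pidfile "$PIDFILE"
          ;;
      monitor|status)
          kill -0 "$(cat "$PIDFILE" 2>/dev/null)" 2>/dev/null   # exit 0 only if the process is alive
          ;;
      *)
          echo "Usage: $0 {start|stop|monitor}"; exit 2
          ;;
    esac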

12. Ease of cluster deployment and management

A common myth surrounding HA clusters is that they are costly and complex to deploy and administer. This is not necessarily true. Cluster administration interfaces should be wizard-driven to assist with initial cluster configuration, should include auto-discovery of new elements as they are added to the cluster, and should allow for at-a-glance status monitoring of the entire cluster. Also, any cluster metadata must be stored in an HA fashion, not on a single quorum disk within the cluster, where corruption or an outage could cause the entire cluster to fall apart.

 

By looking for the capabilities on this checklist, you can make the best decision for your particular HA needs.