Nov 01, 2012
 

When selecting a high-availability (HA) solution, you should consider several criteria. These range from the total cost of the solution, to the ease with which you can configure and manage the cluster, to the specific restrictions placed on hardware and software. This post touches briefly on 12 of the most important checklist items.

1. Support for standard OS and application versions

Solutions that require enterprise or advanced versions of the OS, database, or application software can greatly reduce the cost benefits of moving to a commodity server environment. By deploying the proper HA middleware, you can make standard versions of applications and OSs highly available and meet the uptime requirements of your business environment.

2. Support for a variety of data storage configurations

When you deploy an HA cluster, the data that the protected applications require must be available to all systems that might need to bring the applications into service. You can share this data via data replication, by using shared SCSI or Fibre Channel storage, or by using a NAS device. Whichever method you decide to deploy, the HA product that you use should support all of these data configurations so that you can change approaches as your business needs dictate.

3. Ability to use heterogeneous solution components

Some HA clustering solutions require that every system within the cluster has identical configurations. This requirement is common among hardware-specific solutions in which clustering technology is meant to differentiate servers or storage and among OS vendors that want to limit the configurations they are required to support. This restriction limits your ability to deploy scaled-down servers as temporary backup nodes and to reuse existing hardware in your cluster deployment. Deploying identically configured servers might be the correct choice for your needs, but the decision shouldn’t be dictated by your HA solution provider.

4. Support for more than two nodes within a cluster

The number of nodes that can be supported in a cluster is an important measure of scalability. Entry-level HA solutions typically limit you to one two-node cluster, usually in active/passive mode. Although this configuration provides increased availability (via the addition of a standby server), it can still leave you exposed to application downtime. In a two-node cluster configuration, if one server is down for any reason, then the remaining server becomes a single point of failure. By clustering three or more nodes, you not only gain the ability to provide higher levels of protection, but you can also build highly scalable configurations.

5. Support for active/active and active/standby configurations

In an active/standby configuration, one server is idle, waiting to take over the workload of its cluster member. This setup has the obvious disadvantage of underutilizing your compute resource investment. To get the most benefit from your IT expenditure, ensure that cluster nodes can run in an active/active configuration.

6. Detection of problems at node and individual service levels

All HA software products can detect problems with cluster server functionality. This task is done by sending heartbeat signals between servers within the cluster and initiating a recovery if a cluster member stops delivering the signals. But advanced HA solutions can also detect another class of problems, one that occurs when individual processes or services encounter problems that render them unavailable but that do not cause servers to stop sending or responding to heartbeat signals. Given that the primary function of HA software is to ensure that applications are available to end users, detecting and recovering from these service level interruptions is a crucial feature. Ensure that your HA solution can detect both node- and service-level problems.

7. Support for in-node and cross-node recovery

The ability to perform recovery actions both across cluster nodes and within a node is also important. In cross-node recovery, one node takes over the complete domain of responsibility for another. When system-level heartbeats are missed, the server that should have sent the heartbeats is assumed to be out of operation, and other cluster members begin recovery operations. With in-node (local) recovery, the HA software first attempts to restore a failed service within the server on which it is running, typically by stopping and restarting the service and any dependent system resources. This recovery method is much faster and minimizes downtime.

8. Transparency to client connections of server-side recovery

Server-side recovery of an application or even of an entire node should be transparent to client-side users. Through the use of virtualized IP addresses or server names, the mapping of virtual compute resources onto physical cluster entities during recovery, and automatic updating of network routing tables, no changes to client systems are necessary for the systems to access recovered applications and data. Solutions that require manual client-side configuration changes to access recovered applications greatly increase recovery time and introduce the risk of additional errors due to required human interaction. Recovery should be automated on both the servers and clients.

9. Protection for planned and unplanned downtime

In addition to providing protection against unplanned service outages, the HA solution that you deploy should be usable as an administration tool to lessen downtime caused by maintenance activities. By providing a facility to allow on-demand movement of applications between cluster members, you can migrate applications and users onto a second server while performing maintenance on the first. This can eliminate the need for maintenance windows in which IT resources are unavailable to end users. Ensure that your HA solution provides a simple and secure method for performing manual (on-demand) movement of applications and needed resources among cluster nodes.

10. Off-the-shelf protection for common business functions

Every HA solution that you evaluate should include tested and supported agents or modules that are designed to monitor the health of specific system resources: file systems, IP addresses, databases, applications, and so on. These modules are often called recovery modules. By deploying vendor-supplied modules, you benefit from the testing and run-time experience that the vendor and other customers have already accumulated. You also have the assurance of ongoing support and maintenance of these solution components.

11. Ability to easily incorporate protection for custom business applications

There will likely be applications, perhaps custom to your corporation, that you want to protect but for which there are no vendor-supplied recovery modules. It is important, therefore, that you have a method for easily incorporating your business application into your HA solution’s protection schema. You should be able to do this without modifying your application, and especially without having to embed any vendor-specific APIs. A software developer’s kit that provides examples and a step-by-step process for protecting your application should be available, along with vendor-supplied support services, to assist as needed.

12. Ease of cluster deployment and management

A common myth surrounding HA clusters is that they are costly and complex to deploy and administer. This is not necessarily true. Cluster administration interfaces should be wizard-driven to assist with initial cluster configuration, should include auto-discovery of new elements as they are added to the cluster, and should allow for at-a-glance status monitoring of the entire cluster. Also, any cluster metadata must be stored in an HA fashion, not on a single quorum disk within the cluster, where corruption or an outage could cause the entire cluster to fall apart.

 

By looking for the capabilities on this checklist, you can make the best decision for your particular HA needs.

Oct 11, 2012
 

Are you looking for a powerful yet easy-to-implement high availability/disaster recovery solution for your SAP environment? If so, you will want to take a look at the SteelEye Protection Suite (SPS) for Linux, from SIOS Technologies. SPS provides integrated high availability and data replication functionality that works with any server or storage configuration. Support for SAP is provided out of the box, without the need for any scripting or customization.

SPS for Linux was recently certified by SAP under the "SAP NetWeaver High Availability Cluster 730 Certification" (NW-HA-CLU 730).

A list of certified HA solutions for SAP can be found here:  http://scn.sap.com/docs/DOC-31701

For more information on SPS for Linux’s SAP functionality, please visit:

http://us.sios.com

or

http://us.sios.com/wp-content/uploads/2011/05/SPS-for-Linux-SAP-July-23.pdf

Oct 08, 2012
 

When you want to replicate data across multi-site or wide area network (WAN) configurations, you first need to answer one important question: Is there sufficient bandwidth to successfully replicate the partition and keep the mirror in the mirroring state as the source partition is updated throughout the day? Keeping the mirror in the mirroring state is crucial. A partition switchover is allowed only when the mirror is in the mirroring state.

Therefore, an important early step in any successful data replication solution is determining your network bandwidth requirements. How can you measure the rate of change—the value that indicates the amount of network bandwidth needed to replicate your data?

Establish Basic Rate of Change

First, use these commands to determine the basic daily rate of change for the files or partitions that you want to mirror; for example, to measure the amount of data written in a day for /dev/sda3, run this command at the beginning of the day:

MB_START=`awk '/sda3 / { print $10 / 2 / 1024 }' /proc/diskstats`

Wait for 24 hours, then run this command:

MB_END=`awk '/sda3 / { print $10 / 2 / 1024 }' /proc/diskstats`

The daily rate of change, in megabytes, is then MB_END – MB_START.
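If you'd rather capture both samples in one step, here is a minimal sketch of the same measurement as a script. It assumes the device is sda3, that the machine stays up for the full 24 hours, and that bc is installed:

#!/bin/sh
# Sample sectors written for sda3 (field 10 of /proc/diskstats), converted to MB
MB_START=`awk '/sda3 / { print $10 / 2 / 1024 }' /proc/diskstats`
sleep 86400    # wait 24 hours
MB_END=`awk '/sda3 / { print $10 / 2 / 1024 }' /proc/diskstats`
ROC=`echo "$MB_END - $MB_START" | bc`
echo "Daily rate of change: $ROC MB"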

The amounts of data that you can push through various network connections are as follows:

  • For T1 (1.5Mbps): 14,000 MB/day (14 GB)
  • For T3 (45Mbps): 410,000 MB/day (410 GB)
  • For Gigabit (1Gbps): 5,000,000 MB/day (5 TB)
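These figures follow from simple arithmetic, rounded down (presumably to leave headroom for protocol overhead). For the T1 example: 1.5 Mbps ÷ 8 = 0.1875 MB/sec, and 0.1875 MB/sec × 86,400 seconds/day ≈ 16,200 MB/day theoretical, which the list above discounts to roughly 14,000 MB/day of usable replication capacity.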

Establish Detailed Rate of Change

Next, you'll need to measure the detailed rate of change. The best way to collect this data is to log disk write activity for some period (e.g., one day) to determine the peak disk-write periods. To do so, create a cron job that logs the system timestamp followed by a dump of /proc/diskstats. For example, to collect disk stats every 2 minutes, add this line to /etc/crontab:

*/2 * * * * root ( date ; cat /proc/diskstats ) >> /path_to/filename.txt

Wait for the determined period (e.g., one day, one week), then disable the cron job and save the resulting /proc/diskstats output file in a safe location.

Analyze and Graph Detailed Rate of Change Data

Next you should analyze the detailed rate of change data. You can use the roc-calc-diskstats utility for this task. This utility takes the /proc/diskstats output file and calculates the rate of change of the disks in the dataset. To run the utility, use this command:

# ./roc-calc-diskstats <interval> <start_time> <diskstats-data-file> [dev-list]

For example, the following dumps a summary (with per-disk peak I/O information) to the output file results.txt:

# ./roc-calc-diskstats 2m "Jul 22 16:04:01" /root/diskstats.txt sdb1,sdb2,sdc1 > results.txt

Here are sample results from the results.txt file:

Sample start time: Tue Jul 12 23:44:01 2011

Sample end time: Wed Jul 13 23:58:01 2011

Sample interval: 120s #Samples: 727 Sample length: 87240s

(Raw times from file: Tue Jul 12 23:44:01 EST 2011, Wed Jul 13 23:58:01 EST 2011)

Rate of change for devices dm-31, dm-32, dm-33, dm-4, dm-5, total

dm-31 peak:0.0 B/s (0.0 b/s) (@ Tue Jul 12 23:44:01 2011) average:0.0 B/s (0.0 b/s)

dm-32 peak:398.7 KB/s (3.1 Mb/s) (@ Wed Jul 13 19:28:01 2011) average:19.5 KB/s (156.2 Kb/s)

dm-33 peak:814.9 KB/s (6.4 Mb/s) (@ Wed Jul 13 23:58:01 2011) average:11.6 KB/s (92.9 Kb/s)

dm-4 peak:185.6 KB/s (1.4 Mb/s) (@ Wed Jul 13 15:18:01 2011) average:25.7 KB/s (205.3 Kb/s)

dm-5 peak:2.7 MB/s (21.8 Mb/s) (@ Wed Jul 13 10:18:01 2011) average:293.0 KB/s (2.3 Mb/s)

total peak:2.8 MB/s (22.5 Mb/s) (@ Wed Jul 13 10:18:01 2011) average:349.8 KB/s (2.7 Mb/s)

To help you understand your specific bandwidth needs over time, you can graph the detailed rate of change data. The following dumps graph data to results.csv (as well as dumping the summary to results.txt):

# export OUTPUT_CSV=1

# ./roc-calc-diskstats 2m "Jul 22 16:04:01" /root/diskstats.txt sdb1,sdb2,sdc1 2> results.csv > results.txt

SIOS has created a template spreadsheet, diskstats-template.xlsx, which contains sample data that you can overwrite with your data from roc-calc-diskstats. The following steps walk through the process of using the spreadsheet.

  1. Open results.csv, and select all rows, including the total column.
  2. Open diskstats-template.xlsx, and select the diskstats.csv worksheet.
  3. In cell A1, right-click and select Insert Copied Cells.
  4. Adjust the bandwidth value in the cell toward the bottom left of the worksheet to reflect the amount of bandwidth (in megabits per second) that you have allocated for replication. The cells to the right are automatically converted to bytes per second to match the collected raw data.
  5. Take note of the following row and column numbers in the sample data:
    • Total (row 6)
    • Bandwidth (row 9)
    • Last datapoint (column R)
  6. Select the bandwidth vs ROC worksheet.
  7. Right-click the graph and choose Select Data.
  8. In the Select Data Source dialog box, choose bandwidth in the Legend Entries (Series) list, and then click Edit.
  9. In the Edit Series dialog box, use the following syntax in the Series values field: =diskstats.csv!$B$<row>:$<final_column>$<row> (for the sample data, the bandwidth spread is B9 to R9).
  10. Click OK to close the Edit Series box.
  11. In the Select Data Source box, choose ROC in the Legend Entries (Series) list, and then click Edit.
  12. In the Edit Series dialog box, use the same syntax for the ROC series (for the sample data, the spread is B6 to R6).
  13. Click OK to close the Edit Series box, then click OK to close the Select Data Source box.

The Bandwidth vs ROC graph updates. Analyze your results to determine whether you have sufficient bandwidth to support data replication.

Next Steps

If your Rate of Change exceeds your available bandwidth, you will need to consider some of the following points to ensure your replication solution performs optimally:

  • Enable compression in your replication solution or in the network hardware. (DataKeeper for Linux, which is part of the SteelEye Protection Suite for Linux, supports this type of compression.)
  • Create a local, non-replicated storage repository for temporary data and swap files that don’t need to be replicated.
  • Reduce the amount of data being replicated.
  • Increase your network capacity.
Sep 25, 2012
 

The primary advantage of running a MySQL cluster is obviously high availability (HA). To get the most from this type of solution, you will want to eliminate as many potential single points of failure as possible. Conventional wisdom says that you can’t form a cluster without some type of shared storage, which technically represents a single point of failure in your clustering architecture. However, there are solutions, such as the SteelEye Protection Suite (SPS) for Linux, that allow you to eliminate storage as a single point of failure by providing real-time data replication between cluster nodes. Let’s look at a typical scenario: You form a cluster that leverages local, replicated storage to protect a MySQL database.

This step-by-step discussion presumes that you’re working with an evaluation copy of SPS in a lab environment. We’re also presuming that you’ve confirmed that the primary and secondary servers and the network all meet the requirements for running this type of setup. (You can find details of these requirements in the SIOS SteelEye Protection Suite for Linux MySQL with Data Replication Evaluation Guide.)

Getting started

Before you begin setting up your cluster, you'll need to configure the storage. The data that you want to replicate needs to reside on a separate file system or logical volume. Keep in mind that the size of the target disk, whether you're using a partition or logical volume, must be equal to or larger than the source.

In this example, we presume that you’re using a disk partition. (However, LVM is also fully supported.) First, partition the local storage for use with SteelEye DataKeeper. On the primary server, identify a free, unused disk partition to use as the MySQL repository or create a new partition. Use the fdisk utility to partition the disk, then format the partition and temporarily mount it at /mnt. Move any existing data from /var/lib/mysql/ into this new disk partition (assuming a default MySQL configuration). Unmount and then remount the partition at /var/lib/mysql. You don’t need to add this partition to /etc/fstab, as it will be mounted automatically by SPS.
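For reference, here is a rough sketch of those commands; the device name /dev/sdb1 and the ext3 file system are assumptions, so adjust both to match your environment:

# fdisk /dev/sdb                    # create a new partition, e.g. /dev/sdb1
# mkfs.ext3 /dev/sdb1               # format the new partition
# mount /dev/sdb1 /mnt              # temporarily mount it at /mnt
# mv /var/lib/mysql/* /mnt/         # move existing MySQL data onto the new partition
# umount /mnt
# mount /dev/sdb1 /var/lib/mysql    # remount at the final location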

On the secondary server, configure your disk as you did on the primary server.

Installing MySQL

Next you’ll deal with MySQL. On the primary server, install both the mysql and mysql-server RPM packages (if they don’t already exist on the system) and apply any required dependencies. Verify that your local disk partition is still mounted at /var/lib/mysql. If necessary, initialize a sample MySQL database. Ensure that all the files in your MySQL data directory (/var/lib/mysql) have the correct permissions and ownership, and then manually start the MySQL daemon from the command line. (Note: Do not start MySQL via the service command or the /etc/init.d/ script.)

Connect with the mysql client to verify that MySQL is running.
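On a yum-based distribution, that sequence might look roughly like this (a sketch; package names assume the stock MySQL packages for your distribution):

# yum install mysql mysql-server    # install packages if not already present
# mount | grep /var/lib/mysql       # verify the data partition is still mounted
# /usr/bin/mysqld_safe &            # start the daemon manually (not via service or /etc/init.d)
# mysql                             # connect with the client to verify MySQL is running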

Update and verify the root password for your MySQL configuration. Then create a MySQL configuration file, such as the sample file shown here:

———-

# cat /var/lib/mysql/my.cnf

[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
pid-file=/var/lib/mysql/mysqld.pid
user=root
port=3306
# Default to using old password format for compatibility with mysql 3.x
# clients (those using the mysqlclient10 compatibility package).
old_passwords=1

# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
# symbolic-links=0

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

[client]
user=root
password=SteelEye

———-

In this example, we place this file in the same directory that we will later replicate (/var/lib/mysql/my.cnf). Delete the original MySQL configuration file (in /etc).

On the secondary server, install both the mysql and mysql-server RPM packages if necessary, apply any dependencies, and ensure that all the files in your MySQL data directory (/var/lib/mysql) have the correct permissions and ownership.

Installing SPS for Linux

Next, install SPS for Linux. For ease of installation, SIOS provides a unified installation script (called “setup”) for SPS for Linux. Instructions for how to obtain this software are in an email that comes with the SPS for Linux evaluation license keys.

Download the software and evaluation license keys on both the primary and secondary servers. On each server, run the installer script, which will install a handful of prerequisite RPMs, the core clustering software, and any optional ARKs that are needed. In this case, you will want to install the MySQL ARK (steeleye-lkSQL) and the DataKeeper (i.e., Data Replication) ARK (steeleye-lkDR). Apply the license key via the /opt/LifeKeeper/bin/lkkeyins command and start SPS for Linux via its start script, /opt/LifeKeeper/lkstart.
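In command form, the per-node steps look roughly like this (the license file path is hypothetical):

# ./setup                                           # unified installer; select the MySQL and DataKeeper ARKs
# /opt/LifeKeeper/bin/lkkeyins /tmp/evaluation.lic  # apply the evaluation license key
# /opt/LifeKeeper/lkstart                           # start SPS for Linux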

At this point you have SPS installed, licensed, and running on both of your nodes, and your disk and the MySQL database that you want to protect are configured.

In the next post, we’ll look at the remaining steps in the shared-nothing clustering process:

  • Creating communication (Comm) paths, i.e. heartbeats, between the primary and target servers
  • Creating an IP resource
  • Creating the mirror and launching data replication
  • Creating the MySQL database resource
  • Creating the MySQL IP address dependency
Sep 25, 2012
 

The previous post introduced the advantages of running a MySQL cluster, using a shared-nothing storage configuration. We also began walking through the process of setting up the cluster, using data replication and SteelEye Protection Suite (SPS) for Linux. In this post, we complete the process. Let’s get started.

Creating Comm paths

Now it’s time to access the SteelEye LifeKeeper GUI. LifeKeeper is an integrated component of SPS for Linux, and the LifeKeeper GUI is a Java-based application that can be run as a native Linux app or as an applet within a Java-enabled Web browser. (The GUI is based on Java RMI with callbacks, so hostnames must be resolvable or you might receive a Java 115 or 116 error.)

To start the GUI application, enter this command on either of the cluster nodes:

/opt/LifeKeeper/bin/lkGUIapp &

Alternatively, to open the GUI applet from a Web browser, go to http://<hostname>:81.

The first step is to make sure that you have at least two TCP communication (Comm) paths between each primary server and each target server, for heartbeat redundancy. This way, the failure of one communication line won’t cause a split-brain situation. Verify the paths on the primary server. The following screenshots walk you through the process of logging into the GUI, connecting to both cluster nodes, and creating the Comm paths.

Step 1: Connect to primary server

tutorial image

Step 2: Connect to secondary server

tutorial image

Step 3: Create the Comm path

tutorial image

Step 4: Choose the local and remote servers

tutorial image

tutorial image

Step 5: Choose device type

tutorial image

Next, you are presented with a series of dialogue boxes. For each box, provide the required information and click Next to advance. (For each field in a dialogue box, you can click Help for additional information.)

Step 6: Choose IP address for local server to use for Comm path

tutorial image

Step 7: Choose IP address for remote server to use for Comm path

tutorial image

Step 8: Enter Comm path priority on local server

tutorial image

After entering data in all the required fields, click Create. You’ll see a message that indicates that the network Comm path was successfully created.

Step 9: Finalize Comm path creation

tutorial image

Click Next. If you chose multiple local IP addresses or remote servers and set the device type to TCP, then the procedure returns you to the setup wizard to create the next Comm path. When you’re finished, click Done in the final dialogue box. Repeat this process until you have defined all the Comm paths you plan to use.

Verify that the communications paths are configured properly by viewing the Server Properties dialogue box. From the GUI, select Edit > Server > Properties, and then choose the CommPaths tab. The displayed state should be ALIVE. You can also check the server icon in the right-hand primary pane of the GUI. If only one Comm path has been created, the server icon is overlayed with a yellow warning icon. A green heartbeat checkmark indicates that at least two Comm paths are configured and ALIVE.
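If you prefer the command line, you can also check Comm path status with LifeKeeper's status utility (assuming the default install path); both paths should report ALIVE:

# /opt/LifeKeeper/bin/lcdstatus -q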

Step 10: Review Comm path state

tutorial image

Creating and extending an IP resource

In the LifeKeeper GUI, create an IP resource and extend it to the secondary server by completing the following steps. This virtual IP can move between cluster nodes along with the application that depends on it. By using a virtual IP as part of your cluster configuration, you provide seamless redirection of clients upon switchover or failover of resources between cluster nodes because they continue to access the database via the same FQDN/IP.

Step 11: Create resource hierarchy

tutorial image

Step 12: Choose IP ARK

tutorial image

Enter the appropriate information for your configuration, using the following recommended values. (Click the Help button for further information.) Click Next to continue after entering the required information.

  • Resource Type: Choose IP Address as the resource type and click Next.
  • Switchback Type: Choose Intelligent and click Next.
  • Server: Choose the server on which the IP resource will be created (your primary server) and click Next.
  • IP Resource: Enter the virtual IP information and click Next. (This is an IP address that is not in use anywhere on your network. All clients will use this address to connect to the protected resources.)
  • Netmask: Enter the IP subnet mask that your TCP/IP resource will use on the target server. Any standard netmask for the class of the specific TCP/IP resource address is valid. The subnet mask, combined with the IP address, determines the subnet that the TCP/IP resource will use and should be consistent with the network configuration. In this sample configuration, 255.255.255.0 is used as the subnet mask on both networks.
  • Network Connection: This is the physical Ethernet card with which the IP address interfaces. Choose the network connection that will allow your virtual IP address to be routable, select the correct NIC, and click Next.
  • IP Resource Tag: Accept the default value and click Next. This value affects only how the IP is displayed in the GUI. The IP resource will be created on the primary server.

LifeKeeper creates and validates your resource. After receiving the message that the resource has been created successfully, click Next.

Step 13: Review notice of successful resource creation

tutorial image

Now you can complete the process of extending the IP resource to the secondary server.

Step 14: Extend IP resource to secondary server

tutorial image

The process of extending the IP resource starts automatically after you finish creating an IP address resource and click Next. You can also start this process from an existing IP address resource, by right-clicking the active resource and selecting Extend Resource Hierarchy. Use the information in the following table to complete the procedure.

  • Switchback Type: Leave as intelligent and click Next.
  • Template Priority: Leave as default (1).
  • Target Priority: Leave as default (10).
  • Network Interface: This is the physical Ethernet card with which the IP address interfaces. Choose the network connection that will allow your virtual IP address to be routable. The correct physical NIC should be selected by default. Verify it, and then click Next.
  • IP Resource Tag: Leave as default.
  • Target Restore Mode: Choose Enable and click Next.
  • Target Local Recovery: Choose Yes to enable local recovery for the IP resource on the target server.
  • Backup Priority: Accept the default value.

 

After receiving the message that the hierarchy extension operation is complete, click Finish and then click Done.

Your IP resource (example: 192.168.197.151) is now fully protected and can float between cluster nodes, as needed. In the LifeKeeper GUI, you can see that the IP resource is listed as Active on the primary cluster node and Standby on the secondary cluster node.

Step 15: Review IP resource state on primary and secondary nodes

tutorial image

Creating a mirror and beginning data replication

You’re ready to set up and configure the data replication resource, which you’ll use to synchronize MySQL data between cluster nodes. For this example, the data to replicate is in the /var/lib/mysql partition on the primary cluster node. The source volume must be mounted on the primary server, the target volume must not be mounted on the secondary server, and the target volume size must be equal to or larger than the source volume size.

The following screenshots illustrate the next series of steps.

Step 16: Create resource hierarchy

tutorial image

Step 17: Choose Data Replication ARK

tutorial image

Use these values in the Data Replication wizard.

  • Switchback Type: Choose Intelligent.
  • Server: Choose LinuxPrimary (the primary cluster node or mirror source).
  • Hierarchy Type: Choose Replicate Existing Filesystem.
  • Existing Mount Point: Choose the mounted partition to replicate; in this example, /var/lib/mysql.
  • Data Replication Resource Tag: Leave as default.
  • File System Resource Tag: Leave as default.
  • Bitmap File: Leave as default.
  • Enable Asynchronous Replication: Leave as default (Yes).

Click Next to begin the creation of the data replication resource hierarchy. The GUI will display the following message.

Step 18: Begin creation of Data Replication resource

tutorial image

Click Next to begin the process of extending the data replication resource. Accept all default settings. When asked for a target disk, choose the free partition on your target server that you created earlier in this process. Make sure to choose a partition that is as large as or larger than the source volume and that is not mounted on the target system.

Step 19: Begin extension of Data Replication resource

tutorial image

Eventually, you are prompted to choose the network over which you want the replication to take place. In general, separating your user and application traffic from your replication traffic is best practice. This sample configuration has two separate network interfaces, our “public NIC” on the 192.168.197.X subnet and a “private/backend NIC” on the 192.168.198.X subnet. We will configure replication to go over the back-end network 192.168.198.X, so that user and application traffic is not competing with replication.

Step 20: Choose network for replication traffic

tutorial image

Click Next to continue through the wizard. Upon completion, your resource hierarchy will look like this:

Step 21: Review Data Replication resource hierarchy

tutorial image

Creating the MySQL resource hierarchy

You need to create a MySQL resource to protect the MySQL database and make it highly available between cluster nodes. At this point, MySQL must be running on the primary server but not running on the secondary server.

From the GUI toolbar, click Create Resource Hierarchy. Select MySQL Database and click Next. Proceed through the Resource Creation wizard, providing the following values.

  • Switchback Type: Choose Intelligent.
  • Server: Choose LinuxPrimary (primary cluster node).
  • Location of my.cnf: Enter /var/lib/mysql. (Earlier in the MySQL configuration process, you created a my.cnf file in this directory.)
  • Location of MySQL executables: Leave as default (/usr/bin), because you're using a standard MySQL install/configuration in this example.
  • Database tag: Leave as default.

 

Click Create to define the MySQL resource hierarchy on the primary server. Click Next to extend the file system resource to the secondary server. In the Extend wizard, choose Accept Defaults. Click Finish to exit the Extend wizard. Your resource hierarchy should look like this:

Step 22: Review MySQL resource hierarchy

tutorial image

Creating the MySQL IP address dependency

Next, you’ll configure MySQL to depend on a virtual IP (192.168.197.151) so that the IP address follows the MySQL database as it moves.

From the GUI toolbar, right-click the mysql resource. Choose Create Dependency from the context menu. In the Child Resource Tag drop-down menu, choose ip-192.168.197.151. Click Next, click Create Dependency, and then click Done. Your resource hierarchy should now look like this:

Step 23: Review MySQL IP resource hierarchy

tutorial image

At this point in the evaluation, you’ve fully protected MySQL and its dependent resources (IP addresses and replicated storage). Test your environment, and you’re ready to go.

You can find much more information and detailed steps for every stage of the evaluation process in the SIOS SteelEye Protection Suite for Linux MySQL with Data Replication Evaluation Guide. To download an evaluation copy of SPS for Linux, visit the SIOS website or contact SIOS at info@us.sios.com.

Aug 22, 2012
 

Implementing high availability (HA) at the VMware layer is great. Why would you need anything else? Well, as useful as the solution is — and it does help to protect against some types of failures — VMware HA alone simply doesn’t cover all the bases.

According to Gartner Research, most unplanned outages are caused by application failure (40 percent of outages) or admin error (40 percent). Hardware, network, power, or environmental problems cause the rest (20 percent total). VMware HA focuses on protection against hardware failures, but a good application-clustering solution picks up the slack in other areas. Here are a few things to consider when architecting the proper HA strategy for your VMware environment.

Time to repair

Shorten outages with application-level monitoring and clustering. What about recovery speed? In a perfect world, there would be no failures, outages, or downtime. But if an unplanned outage does occur, the next best thing is to get back up and running fast. This equation (shown here in its classic MTBF/MTTR form) represents the total availability of your environment:
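Availability = MTBF / (MTBF + MTTR)

Here MTBF is the mean time between failures, and MTTR (mean time to repair) covers everything from detecting the failure to restoring service.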

As you can see, detection time is a crucial piece of the equation. Here’s another place where VMware HA alone doesn’t quite cut it. VMware HA treats each virtual machine (VM) as a “black box” and has no real visibility into the health or status of the applications that are running inside. The VM and OS running inside might be just fine, but the application could be stopped, hung, or misconfigured, resulting in an outage for users.

Even when a host server failure is the issue, you must wait for VMware HA to restart the affected VMs on another host in the VMware cluster. That means that applications running on those VMs are down until 1) the outage is detected, 2) the OS boots fully on the new host system, 3) the applications restart, and 4) users reconnect to the apps.

By clustering at the application layer between multiple VMs, you are not only protected against application-level outages, you also shorten your outage-recovery time. The application can simply be restarted on a standby VM, which is already booted up and waiting to take over. To maximize availability, the VMs involved should live on different physical servers — or even better, separate VMware HA clusters or even separate datacenters!

Eliminate storage as a potential single point of failure (SPOF). Traditional clustering solutions, including VMware HA, require shared storage and typically protect applications or services only within a single data center. Technically, the shared-storage device represents an SPOF in your architecture. If you lose access to the back-end storage, your cluster and applications are down for the count. The goal of any HA solution is to increase overall availability by eliminating as many potential SPOFs as possible.

So how can you augment a native VMware HA cluster to provide greater levels of availability? To protect your entire stack, from hardware to applications, start with VMware HA. Next, you need a way to monitor and protect the applications. Clustering at the application level (i.e., within the VM) is the natural choice. Be sure to choose a clustering solution that supports host-based data replication (i.e., a shared-nothing configuration) so that you don’t need to go through the expense and complexity of setting up SAN-based replication. SAN replication solutions also typically lock you into a single storage vendor. On top of that, to cluster VMs by using shared storage, you generally need to enable Raw Device Mapping (RDM), which means that you lose access to many powerful VMware functions, such as vMotion.

Going with a shared-nothing cluster configuration eliminates the storage tier as an SPOF and at the same time allows you to use vMotion to migrate your VMs between physical hosts – it’s a win/win. A shared-nothing cluster is also an excellent solution for disaster recovery because the standby VM can reside at a different data center.

Cover all the bases. Application-failover clustering, layered over VMware HA, offers the best of both worlds. You can enjoy built-in hardware protection and application awareness, greater flexibility and scalability, and faster recovery times. Even better, the solution doesn’t need to break the bank.

Aug 13, 2012
 

When most people think about setting up a cluster, it usually involves two or more servers and a SAN or some other type of shared storage. SANs are typically costly and complex to set up and maintain, and they technically represent a potential single point of failure (SPOF) in your cluster architecture. These days, more and more people are turning to companies like Fusion-io, with their lightning-fast ioDrives, to accelerate critical applications. These storage devices sit inside the server (i.e., they aren't "shared disks") and therefore can't be used as cluster disks with many traditional clustering solutions. Fortunately, there are solutions that allow you to form a failover cluster when there is no shared storage involved: a "shared nothing" cluster.

Diagram: traditional cluster (shared SAN storage) vs. "shared nothing" cluster (replicated local storage)

 

When leveraging data replication as part of a cluster configuration, it’s critical that you have enough bandwidth so that data can be replicated across the network just as fast as it’s written to disk.  The following are tuning tips that will allow you to get the most out of your “shared nothing” cluster configuration, when high-speed storage is involved:

Network

  • Use a 10 Gbps NIC: Flash-based storage devices from Fusion-io (or similar products from OCZ, LSI, etc.) are capable of writing data at speeds in the hundreds of MB/sec (750+ MB/sec). A 1 Gbps NIC can only push a theoretical maximum of ~125 MB/sec, so anyone taking advantage of an ioDrive's potential can easily write data much faster than could be pushed through a 1 Gbps network connection. To ensure that you have sufficient bandwidth between servers to facilitate real-time data replication, a 10 Gbps NIC should always be used to carry replication traffic.
  • Enable Jumbo Frames: Assuming that your network cards and switches support it, enabling jumbo frames can greatly increase your network's throughput while at the same time reducing CPU cycles. To enable jumbo frames, perform the following configuration (example from a RedHat/CentOS/OEL Linux server):
    • ifconfig <interface_name> mtu 9000
    • Edit /etc/sysconfig/network-scripts/ifcfg-<interface_name> file and add “MTU=9000” so that the change persists across reboots
    • To verify end-to-end jumbo frame operation, run this command: ping -s 8900 -M do <IP-of-other-server>
  • Change the NIC’s transmit queue length:
    • /sbin/ifconfig <interface_name> txqueuelen 10000
    • Add this to /etc/rc.local to preserve the setting across reboots

TCP/IP Tuning

  • Increase netdev_max_backlog (the kernel's backlog queue for incoming packets):
    • Set “net.core.netdev_max_backlog = 100000” in /etc/sysctl.conf
  • Other TCP/IP tuning that has been shown to increase replication performance:
    • Note: these are example values and some might need to be adjusted based on your hardware configuration
    • Edit /etc/sysctl.conf and add the following parameters:
      • net.core.rmem_default = 16777216
      • net.core.wmem_default = 16777216
      • net.core.rmem_max = 16777216
      • net.core.wmem_max = 16777216
      • net.ipv4.tcp_rmem = 4096 87380 16777216
      • net.ipv4.tcp_wmem = 4096 65536 16777216
      • net.ipv4.tcp_timestamps = 0
      • net.ipv4.tcp_sack = 0
      • net.core.optmem_max = 16777216
      • net.ipv4.tcp_congestion_control=htcp
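After adding these parameters to /etc/sysctl.conf, you can load them immediately, without a reboot:

# /sbin/sysctl -p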

Typically you will also need to make adjustments to your cluster configuration, which will vary based on the clustering and replication technology you decide to implement. In this example, I'm using the SteelEye Protection Suite for Linux (aka SPS, aka LifeKeeper), from SIOS Technologies, which allows users to form failover clusters leveraging just about any back-end storage type: Fibre Channel SAN, iSCSI, NAS, or, most relevant to this article, local disks that are synchronized/replicated in real time between cluster nodes. SPS for Linux includes integrated, block-level data replication functionality that makes it very easy to set up a cluster when there is no shared storage involved.

SteelEye Protection Suite (SPS) for Linux configuration recommendations:

  • Allocate a small (~100 MB) disk partition on the Fusion-io drive to hold the bitmap file. Create a file system on this partition and mount it, for example, at /bitmap:
    • # mount | grep /bitmap
    • /dev/fioa1 on /bitmap type ext3 (rw)
  • Prior to creating your mirror, adjust the following parameters in /etc/default/LifeKeeper
    • Insert: LKDR_CHUNK_SIZE=4096
      • Default value is 64
    • Edit: LKDR_SPEED_LIMIT=1500000
      • (Default value is 50000)
      • LKDR_SPEED_LIMIT specifies the maximum bandwidth that a resync will ever take — this should be set high enough to allow resyncs to go at the maximum speed possible
    • Edit: LKDR_SPEED_LIMIT_MIN=200000
      • (Default value is 20000)
      • LKDR_SPEED_LIMIT_MIN specifies how fast the resync should be allowed to go when there is other I/O going on at the same time — as a rule of thumb, this should be set to half or less of the drive’s maximum write throughput in order to avoid starving out normal I/O activity when a resync occurs
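Taken together, the relevant lines in /etc/default/LifeKeeper should read as follows after editing:

LKDR_CHUNK_SIZE=4096
LKDR_SPEED_LIMIT=1500000
LKDR_SPEED_LIMIT_MIN=200000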

From here, go ahead and create your mirrors and configure the cluster as you normally would.

Jul 31, 2012
 

Welcome to LinuxClustering.net. Visit us to find information, techniques, tips, and best practices in the areas of high availability, data replication, and disaster recovery for your critical Linux applications. If there are particular topics you would like to see discussed, please let us know. Enjoy!
