TonyT

Sep 25, 2012
 

The previous post introduced the advantages of running a MySQL cluster, using a shared-nothing storage configuration. We also began walking through the process of setting up the cluster, using data replication and SteelEye Protection Suite (SPS) for Linux. In this post, we complete the process. Let’s get started.

Creating Comm paths

Now it’s time to access the SteelEye LifeKeeper GUI. LifeKeeper is an integrated component of SPS for Linux, and the LifeKeeper GUI is a Java-based application that can be run as a native Linux app or as an applet within a Java-enabled Web browser. (The GUI is based on Java RMI with callbacks, so hostnames must be resolvable or you might receive a Java 115 or 116 error.)
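
If you are working in a lab without DNS, a couple of /etc/hosts entries on each node are enough to satisfy those RMI hostname lookups. This is only a sketch: LinuxPrimary is the primary node used throughout this walkthrough, while LinuxSecondary and both addresses are hypothetical placeholders for your environment.

# /etc/hosts (add on both cluster nodes) -- example entries only
192.168.197.101   LinuxPrimary      # hypothetical public address of the primary node
192.168.197.102   LinuxSecondary    # hypothetical name/address of the secondary node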

To start the GUI application, run the following command on either of the cluster nodes:

/opt/LifeKeeper/bin/lkGUIapp &

Alternatively, to open the GUI applet from a Java-enabled Web browser, go to http://<hostname>:81.

The first step is to make sure that you have at least two TCP communication (Comm) paths between each primary server and each target server, for heartbeat redundancy. This way, the failure of one communication line won’t cause a split-brain situation. Verify the paths on the primary server. The following screenshots walk you through the process of logging into the GUI, connecting to both cluster nodes, and creating the Comm paths.

Step 1: Connect to primary server

tutorial image

Step 2: Connect to secondary server

tutorial image

Step 3: Create the Comm path

tutorial image

Step 4: Choose the local and remote servers

tutorial image

tutorial image

Step 5: Choose device type

tutorial image

Next, you are presented with a series of dialogue boxes. For each box, provide the required information and click Next to advance. (For each field in a dialogue box, you can click Help for additional information.)

Step 6: Choose IP address for local server to use for Comm path

tutorial image

Step 7: Choose IP address for remote server to use for Comm path

tutorial image

Step 8: Enter Comm path priority on local server

tutorial image

After entering data in all the required fields, click Create. You’ll see a message that indicates that the network Comm path was successfully created.

Step 9: Finalize Comm path creation

tutorial image

Click Next. If you chose multiple local IP addresses or remote servers and set the device type to TCP, then the procedure returns you to the setup wizard to create the next Comm path. When you’re finished, click Done in the final dialogue box. Repeat this process until you have defined all the Comm paths you plan to use.

Verify that the communication paths are configured properly by viewing the Server Properties dialogue box. From the GUI, select Edit > Server > Properties, and then choose the CommPaths tab. The displayed state should be ALIVE. You can also check the server icon in the right-hand primary pane of the GUI. If only one Comm path has been created, the server icon is overlaid with a yellow warning icon. A green heartbeat checkmark indicates that at least two Comm paths are configured and ALIVE.

Step 10: Review Comm path state

tutorial image
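
If you prefer to double-check from a shell, LifeKeeper ships a command-line status utility. This is just a sanity check, assuming the default installation path:

# Summarize LifeKeeper resources and Comm paths from the command line
# (run on either node; each Comm path should be reported as ALIVE)
/opt/LifeKeeper/bin/lcdstatus -q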

Creating and extending an IP resource

In the LifeKeeper GUI, create an IP resource and extend it to the secondary server by completing the following steps. This virtual IP can move between cluster nodes along with the application that depends on it. By using a virtual IP as part of your cluster configuration, you provide seamless redirection of clients upon switchover or failover of resources between cluster nodes, because clients continue to access the database via the same FQDN/IP.

Step 11: Create resource hierarchy

tutorial image

Step 12: Choose IP ARK

tutorial image

Enter the appropriate information for your configuration, using the following recommended values. (Click the Help button for further information.) Click Next to continue after entering the required information.

Field / Tips

Resource Type: Choose IP Address as the resource type and click Next.
Switchback Type: Choose Intelligent and click Next.
Server: Choose the server on which the IP resource will be created (your primary server) and click Next.
IP Resource: Enter the virtual IP information and click Next. (This is an IP address that is not in use anywhere on your network. All clients will use this address to connect to the protected resources.)
Netmask: Enter the IP subnet mask that your TCP/IP resource will use on the target server. Any standard netmask for the class of the specific TCP/IP resource address is valid. The subnet mask, combined with the IP address, determines the subnet that the TCP/IP resource will use and should be consistent with the network configuration. In this sample configuration, 255.255.255.0 is used as the subnet mask on both networks.
Network Connection: This is the physical Ethernet card with which the IP address interfaces. Choose the network connection that will allow your virtual IP address to be routable, select the correct NIC, and click Next.
IP Resource Tag: Accept the default value and click Next. This value affects only how the IP is displayed in the GUI. The IP resource will be created on the primary server.

LifeKeeper creates and validates your resource. After receiving the message that the resource has been created successfully, click Next.

Step 13: Review notice of successful resource creation

tutorial image

Now you can complete the process of extending the IP resource to the secondary server.

Step 14: Extend IP resource to secondary server

tutorial image

The process of extending the IP resource starts automatically after you finish creating an IP address resource and click Next. You can also start this process from an existing IP address resource, by right-clicking the active resource and selecting Extend Resource Hierarchy. Use the information in the following table to complete the procedure.

Field / Recommended Entries or Notes

Switchback Type: Leave as Intelligent and click Next.
Template Priority: Leave as default (1).
Target Priority: Leave as default (10).
Network Interface: This is the physical Ethernet card with which the IP address interfaces. Choose the network connection that will allow your virtual IP address to be routable. The correct physical NIC should be selected by default; verify it and then click Next.
IP Resource Tag: Leave as default.
Target Restore Mode: Choose Enable and click Next.
Target Local Recovery: Choose Yes to enable local recovery for the IP resource on the target server.
Backup Priority: Accept the default value.

 

After receiving the message that the hierarchy extension operation is complete, click Finish and then click Done.

Your IP resource (example: 192.168.197.151) is now fully protected and can float between cluster nodes, as needed. In the LifeKeeper GUI, you can see that the IP resource is listed as Active on the primary cluster node and Standby on the secondary cluster node.
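
You can also confirm the virtual IP from outside the GUI with a couple of standard commands; the address below is simply the example VIP used in this walkthrough:

# On the primary node: the virtual IP should be bound to the chosen NIC
ip addr show | grep 192.168.197.151

# From any host on the 192.168.197.x network: the VIP should answer
ping -c 3 192.168.197.151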

Step 15: Review IP resource state on primary and secondary nodes

tutorial image

Creating a mirror and beginning data replication

You’re ready to set up and configure the data replication resource, which you’ll use to synchronize MySQL data between cluster nodes. For this example, the data to replicate is in the /var/lib/mysql partition on the primary cluster node. The source volume must be mounted on the primary server, the target volume must not be mounted on the secondary server, and the target volume size must be equal to or larger than the source volume size.
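
Before creating the mirror, it is worth confirming those prerequisites from a shell. A minimal sketch, assuming /var/lib/mysql is the source mount point and /dev/sdb1 is a hypothetical target partition on the secondary server:

# On the primary (source) node: confirm the partition is mounted and note its size
df -h /var/lib/mysql

# On the secondary (target) node: confirm a candidate partition exists that is at
# least as large as the source and is NOT mounted
fdisk -l
mount | grep sdb1    # should return nothing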

The following screenshots illustrate the next series of steps.

Step 16: Create resource hierarchy

tutorial image

Step 17: Choose Data Replication ARK

tutorial image

Use these values in the Data Replication wizard.

Field / Recommended Entries or Notes

Switchback Type: Choose Intelligent.
Server: Choose LinuxPrimary (the primary cluster node or mirror source).
Hierarchy Type: Choose Replicate Existing Filesystem.
Existing Mount Point: Choose the mounted partition to replicate; in this example, /var/lib/mysql.
Data Replication Resource Tag: Leave as default.
File System Resource Tag: Leave as default.
Bitmap File: Leave as default.
Enable Asynchronous Replication: Leave as default (Yes).

Click Next to begin the creation of the data replication resource hierarchy. The GUI will display the following message.

Step 18: Begin creation of Data Replication resource

tutorial image

Click Next to begin the process of extending the data replication resource. Accept all default settings. When asked for a target disk, choose the free partition on your target server that you created earlier in this process. Make sure to choose a partition that is as large as or larger than the source volume and that is not mounted on the target system.

Step 19: Begin extension of Data Replication resource

tutorial image

Eventually, you are prompted to choose the network over which you want the replication to take place. In general, separating your user and application traffic from your replication traffic is best practice. This sample configuration has two separate network interfaces: a "public NIC" on the 192.168.197.X subnet and a "private/backend NIC" on the 192.168.198.X subnet. We will configure replication to go over the backend network (192.168.198.X) so that user and application traffic does not compete with replication.

Step 20: Choose network for replication traffic

tutorial image

Click Next to continue through the wizard. Upon completion, your resource hierarchy will look like this:

Step 21: Review Data Replication resource hierarchy

tutorial image

Creating the MySQL resource hierarchy

You need to create a MySQL resource to protect the MySQL database and make it highly available between cluster nodes. At this point, MySQL must be running on the primary server but not running on the secondary server.
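
A quick way to confirm that state from a shell, assuming a standard RHEL/CentOS-style setup (service names and credentials may differ in your environment):

# On the primary node: mysqld should be running and answering
ps -ef | grep [m]ysqld
mysqladmin ping          # may require -u/-p credentials or a --socket path

# On the secondary node: mysqld should NOT be running
ps -ef | grep [m]ysqld   # should return nothing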

From the GUI toolbar, click Create Resource Hierarchy. Select MySQL Database and click Next. Proceed through the Resource Creation wizard, providing the following values.

Field / Recommended Entries or Notes

Switchback Type: Choose Intelligent.
Server: Choose LinuxPrimary (the primary cluster node).
Location of my.cnf: Enter /var/lib/mysql. (Earlier in the MySQL configuration process, you created a my.cnf file in this directory; an illustrative example appears after this table.)
Location of MySQL executables: Leave as default (/usr/bin), because this example uses a standard MySQL install/configuration.
Database tag: Leave as default.
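
For reference, a minimal my.cnf of the sort created earlier might look like the sketch below. This is an illustrative reconstruction rather than the exact file from the earlier post; the key points are that the file lives in /var/lib/mysql and that datadir points at the replicated partition.

# /var/lib/mysql/my.cnf -- illustrative example only
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
pid-file=/var/lib/mysql/mysqld.pid
user=mysql

[client]
socket=/var/lib/mysql/mysql.sock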

 

Click Create to define the MySQL resource hierarchy on the primary server. Click Next to extend the file system resource to the secondary server. In the Extend wizard, choose Accept Defaults. Click Finish to exit the Extend wizard. Your resource hierarchy should look like this:

Step 22: Review MySQL resource hierarchy

tutorial image

Creating the MySQL IP address dependency

Next, you’ll configure MySQL to depend on a virtual IP (192.168.197.151) so that the IP address follows the MySQL database as it moves.

In the LifeKeeper GUI, right-click the mysql resource and choose Create Dependency from the context menu. In the Child Resource Tag drop-down menu, choose ip-192.168.197.151. Click Next, click Create Dependency, and then click Done. Your resource hierarchy should now look like this:

Step 23: Review MySQL IP resource hierarchy

tutorial image

At this point in the evaluation, you’ve fully protected MySQL and its dependent resources (IP addresses and replicated storage). Test your environment, and you’re ready to go.
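
A simple smoke test, assuming the example virtual IP and a MySQL account you can log in with remotely (the user name here is just a placeholder): connect through the VIP, note which node answers, then switch the hierarchy over in the LifeKeeper GUI and run the same query again.

# Connect to MySQL through the protected virtual IP and report the serving host
mysql -h 192.168.197.151 -u testuser -p -e "SELECT @@hostname;"

# In the LifeKeeper GUI, bring the mysql hierarchy In Service on the standby node
# to force a switchover, then repeat the query; the other node should now answer.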

You can find much more information and detailed steps for every stage of the evaluation process in the SIOS SteelEye Protection Suite for Linux MySQL with Data Replication Evaluation Guide. To download an evaluation copy of SPS for Linux, visit the SIOS website or contact SIOS at info@us.sios.com.

Aug 22, 2012
 

Implementing high availability (HA) at the VMware layer is great. Why would you need anything else? Well, as useful as the solution is — and it does help to protect against some types of failures — VMware HA alone simply doesn’t cover all the bases.

According to Gartner Research, most unplanned outages are caused by application failure (40 percent of outages) or admin error (40 percent). Hardware, network, power, or environmental problems cause the rest (20 percent total). VMware HA focuses on protection against hardware failures, but a good application-clustering solution picks up the slack in other areas. Here are a few things to consider when architecting the proper HA strategy for your VMware environment.


Shorten outages with application-level monitoring and clustering. What about recovery speed? In a perfect world, there would be no failures, outages, or downtime. But if an unplanned outage does occur, the next best thing is to get up and running again fast. This equation represents the total availability of your environment:
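
In its standard form, with the repair term broken out the way the rest of this post discusses it:

Availability = MTBF / (MTBF + MTTR)

Here MTBF is the mean time between failures, and MTTR, the mean time to repair, is the sum of the time to detect the failure, the time to bring the application back up, and the time it takes users to reconnect.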

As you can see, detection time is a crucial piece of the equation. Here’s another place where VMware HA alone doesn’t quite cut it. VMware HA treats each virtual machine (VM) as a “black box” and has no real visibility into the health or status of the applications that are running inside. The VM and OS running inside might be just fine, but the application could be stopped, hung, or misconfigured, resulting in an outage for users.

Even when a host server failure is the issue, you must wait for VMware HA to restart the affected VMs on another host in the VMware cluster. That means that applications running on those VMs are down until 1) the outage is detected, 2) the OS boots fully on the new host system, 3) the applications restart, and 4) users reconnect to the apps.

By clustering at the application layer between multiple VMs, you not only protect against application-level outages, you also shorten your outage-recovery time. The application can simply be restarted on a standby VM, which is already booted up and waiting to take over. To maximize availability, the VMs involved should live on different physical servers, or better yet, in separate VMware HA clusters or even separate data centers.

Eliminate storage as a potential single point of failure (SPOF). Traditional clustering solutions, including VMware HA, require shared storage and typically protect applications or services only within a single data center. Technically, the shared-storage device represents an SPOF in your architecture. If you lose access to the back-end storage, your cluster and applications are down for the count. The goal of any HA solution is to increase overall availability by eliminating as many potential SPOFs as possible.

So how can you augment a native VMware HA cluster to provide greater levels of availability? To protect your entire stack, from hardware to applications, start with VMware HA. Next, you need a way to monitor and protect the applications. Clustering at the application level (i.e., within the VM) is the natural choice. Be sure to choose a clustering solution that supports host-based data replication (i.e., a shared-nothing configuration) so that you don’t need to go through the expense and complexity of setting up SAN-based replication. SAN replication solutions also typically lock you into a single storage vendor. On top of that, to cluster VMs by using shared storage, you generally need to enable Raw Device Mapping (RDM), which means that you lose access to many powerful VMware functions, such as vMotion.

Going with a shared-nothing cluster configuration eliminates the storage tier as an SPOF and at the same time allows you to use vMotion to migrate your VMs between physical hosts – it’s a win/win. A shared-nothing cluster is also an excellent solution for disaster recovery because the standby VM can reside at a different data center.

Cover all the bases. Application-failover clustering, layered over VMware HA, offers the best of both worlds. You can enjoy built-in hardware protection and application awareness, greater flexibility and scalability, and faster recovery times. Even better, the solution doesn’t need to break the bank.

Aug 13, 2012
 

When most people think about setting up a cluster, it usually involves two or more servers and a SAN or some other type of shared storage. SANs are typically costly and complex to set up and maintain, and they technically represent a potential single point of failure (SPOF) in your cluster architecture. These days, more and more people are turning to companies like Fusion-io, with their lightning-fast ioDrives, to accelerate critical applications. These storage devices sit inside the server (i.e., they aren't "shared disks") and therefore can't be used as cluster disks with many traditional clustering solutions. Fortunately, there are solutions that allow you to form a failover cluster when there is no shared storage involved: a "shared nothing" cluster.

Traditional cluster vs. "shared nothing" cluster (diagram)

 

When leveraging data replication as part of a cluster configuration, it’s critical that you have enough bandwidth so that data can be replicated across the network just as fast as it’s written to disk.  The following are tuning tips that will allow you to get the most out of your “shared nothing” cluster configuration, when high-speed storage is involved:

Network

  • Use a 10 Gbps NIC: Flash-based storage devices from Fusion-io (or similar products from OCZ, LSI, and others) can write data at many hundreds of MB/sec (750+ MB/sec for an ioDrive). A 1 Gbps NIC can only push a theoretical maximum of ~125 MB/sec, so anyone taking advantage of an ioDrive's potential can easily write data much faster than a 1 Gbps network connection can carry it. To ensure that you have sufficient bandwidth between servers for real-time data replication, always use a 10 Gbps NIC to carry replication traffic.
  • Enable Jumbo Frames: Assuming that your network cards and switches support it, enabling jumbo frames can greatly increase your network's throughput while at the same time reducing CPU cycles. To enable jumbo frames, perform the following configuration (example from a RedHat/CentOS/OEL Linux server):
    • ifconfig <interface_name> mtu 9000
    • Edit /etc/sysconfig/network-scripts/ifcfg-<interface_name> file and add “MTU=9000” so that the change persists across reboots
    • To verify end-to-end jumbo frame operation, run this command: ping -s 8900 -M do <IP-of-other-server>
  • Change the NIC’s transmit queue length:
    • /sbin/ifconfig <interface_name> txqueuelen 10000
    • Add this to /etc/rc.local to preserve the setting across reboots (a consolidated example follows this list)
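
Putting those pieces together, the persistent configuration might look like the following sketch. The interface name eth1 and its address are only examples; substitute whichever NIC carries your replication traffic:

# /etc/sysconfig/network-scripts/ifcfg-eth1 -- hypothetical 10 Gbps replication NIC
DEVICE=eth1
BOOTPROTO=static
IPADDR=192.168.198.101      # hypothetical backend/replication address
NETMASK=255.255.255.0
ONBOOT=yes
MTU=9000

# /etc/rc.local -- reapply the transmit queue length at boot
/sbin/ifconfig eth1 txqueuelen 10000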

TCP/IP Tuning

  • Change the NIC’s netdev_max_backlog:
    • Set “net.core.netdev_max_backlog = 100000” in /etc/sysctl.conf
  • Other TCP/IP tuning that has shown to increase replication performance:
    • Note: these are example values and some might need to be adjusted based on your hardware configuration
    • Edit /etc/sysctl.conf and add the following parameters (apply them as shown after this list):
      • net.core.rmem_default = 16777216
      • net.core.wmem_default = 16777216
      • net.core.rmem_max = 16777216
      • net.core.wmem_max = 16777216
      • net.ipv4.tcp_rmem = 4096 87380 16777216
      • net.ipv4.tcp_wmem = 4096 65536 16777216
      • net.ipv4.tcp_timestamps = 0
      • net.ipv4.tcp_sack = 0
      • net.core.optmem_max = 16777216
      • net.ipv4.tcp_congestion_control=htcp
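
Once those values are in /etc/sysctl.conf, you can load them without a reboot:

# Apply the settings from /etc/sysctl.conf immediately
sysctl -p

# Spot-check one of the values
sysctl net.core.rmem_max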

Typically you will also need to make adjustments to your cluster configuration, which will vary based on the clustering and replication technology you decide to implement. In this example, I'm using the SteelEye Protection Suite for Linux (aka SPS, aka LifeKeeper), from SIOS Technologies, which allows users to form failover clusters leveraging just about any back-end storage type: Fibre Channel SAN, iSCSI, NAS, or, most relevant to this article, local disks that need to be synchronized/replicated in real time between cluster nodes. SPS for Linux includes integrated, block-level data replication functionality that makes it very easy to set up a cluster when there is no shared storage involved.

SteelEye Protection Suite (SPS) for Linux configuration recommendations:

  • Allocate a small (~100 MB) disk partition on the Fusion-io drive to hold the bitmap file. Create a filesystem on this partition and mount it, for example, at /bitmap (see the sketch after this list):
    • # mount | grep /bitmap
    • /dev/fioa1 on /bitmap type ext3 (rw)
  • Prior to creating your mirror, adjust the following parameters in /etc/default/LifeKeeper
    • Insert: LKDR_CHUNK_SIZE=4096
      • Default value is 64
    • Edit: LKDR_SPEED_LIMIT=1500000
      • (Default value is 50000)
      • LKDR_SPEED_LIMIT specifies the maximum bandwidth that a resync will ever take — this should be set high enough to allow resyncs to go at the maximum speed possible
    • Edit: LKDR_SPEED_LIMIT_MIN=200000
      • (Default value is 20000)
      • LKDR_SPEED_LIMIT_MIN specifies how fast the resync should be allowed to go when there is other I/O going on at the same time — as a rule of thumb, this should be set to half or less of the drive’s maximum write throughput in order to avoid starving out normal I/O activity when a resync occurs
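
For reference, the following sketch carries out both recommendations. The device name /dev/fioa1 matches the example above and assumes the ~100 MB partition has already been created with fdisk or parted; adjust names and paths for your layout:

# Create and mount a small ext3 filesystem for the bitmap file
mkfs.ext3 /dev/fioa1
mkdir -p /bitmap
mount /dev/fioa1 /bitmap
echo "/dev/fioa1  /bitmap  ext3  defaults  0 0" >> /etc/fstab

# Adjust the replication tunables in /etc/default/LifeKeeper
echo "LKDR_CHUNK_SIZE=4096" >> /etc/default/LifeKeeper
sed -i 's/^LKDR_SPEED_LIMIT=.*/LKDR_SPEED_LIMIT=1500000/' /etc/default/LifeKeeper
sed -i 's/^LKDR_SPEED_LIMIT_MIN=.*/LKDR_SPEED_LIMIT_MIN=200000/' /etc/default/LifeKeeper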

From here, go ahead and create your mirrors and configure the cluster as you normally would.

Jul 31, 2012
 

Welcome to LinuxClustering.net. Visit us to find information, techniques, tips, and best practices in the areas of high availability, data replication, and disaster recovery for your critical Linux applications. If there are particular topics you would like to see discussed, please let us know. Enjoy!
