Xen and Heartbeat

Xen (a system for running multiple virtual Linux machines) has some obvious benefits for testing Heartbeat (the clustering system): the cheapest new machine on sale in Australia can be used to simulate a four-node cluster. I’m not sure whether there is any production use for a cluster running under Xen (I look forward to seeing some comments or other blog posts about this).

Most cluster operations run on a Xen virtual machine in the same way as they would under physically separate machines, and Xen even supports simulating a SAN or fiber-channel shared storage device if you use the syntax phy:/dev/VG/LV,hdd,w! in the Xen disk configuration line (the exclamation mark means that the volume is writable even if someone else is writing to it).
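
For context, a shared volume of this kind sits in the disk list of each DomU's configuration file. The sketch below only prints such a line; the helper name xen_disk_line, the root volume path and the hda device name are placeholders for illustration.

```shell
#!/bin/sh
# Print a Xen "disk = [...]" line for a DomU with a private root volume
# and a shared data volume. All names here are illustrative placeholders.
# $1 = private root LV path, $2 = shared LV path
xen_disk_line() {
  # "w" is normal read-write access; "w!" keeps the volume writable even
  # when another domain already has it open for writing.
  printf "disk = [ 'phy:%s,hda,w', 'phy:%s,hdd,w!' ]\n" "$1" "$2"
}

xen_disk_line /dev/VG/node-0-root /dev/VG/LV
```

Every node of the simulated cluster would list the same shared volume with w!, while each keeps its own root volume.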

The one missing feature is the ability to STONITH a failed node. This is quite critical as the design of Heartbeat is that a service on a node which is not communicating will not be started on another node until the failed node comes up after a reboot or the STONITH sub-system states that it has rebooted it or turned it off. This means that the failure of a node implies the permanent failure of all services on it until/unless the node can be STONITH’d.

To solve this problem I have written a quick Xen STONITH module. The first issue is how to communicate between the DomU’s (Xen virtual machines) and the Dom0 (the physical host). It seemed that the best way to do this is to ssh to special accounts on the Dom0 and then use sudo to run a script that calls the Xen xm utility to actually restart the node. That way the Xen virtual machine gets limited access to the Dom0, and the shell script could even be written to allow each VM to only manage a sub-set of the VMs on the host (so you could have multiple virtual clusters on the one physical host and prevent them from messing with each other through accident or malice).
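
One way to restrict each virtual cluster to its own nodes is sketched below. It assumes a separate Dom0 account per cluster; the account names xen-a and xen-b, the node names and the function names are all hypothetical. It relies on sudo setting SUDO_USER to the name of the invoking account.

```shell
#!/bin/sh
# Hypothetical allow lists: Dom0 account name -> nodes it may manage.
allowed_nodes() {
  case "$1" in
    xen-a) echo "node-0 node-1" ;;
    xen-b) echo "node-2 node-3" ;;
    *)     echo "" ;;
  esac
}

# Return 0 if account $1 may manage node $2, 1 otherwise.
may_manage() {
  for n in $(allowed_nodes "$1"); do
    [ "$n" = "$2" ] && return 0
  done
  return 1
}

# In a script run through sudo the check could look like this
# (commented out here so that the sketch has no side effects):
#   may_manage "${SUDO_USER:-}" "$2" || { echo "Not permitted" >&2; exit 1; }
```

With a check like this at the top of the Dom0 script, a compromised or misconfigured node in one cluster could not reset nodes belonging to another.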

xen ALL=NOPASSWD:/usr/local/sbin/xen-stonith

Above is the relevant section from my /etc/sudoers file. It allows user xen to execute the script /usr/local/sbin/xen-stonith as root to do the work.

One thing to note is that from each of the DomU’s you must be able to ssh from root on the node to the specified account for the Xen STONITH service without using a password and without any unreasonable delay (i.e. put UseDNS no in /etc/ssh/sshd_config on the Dom0).
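
A rough way to check the sshd setting, plus the usual key setup, is sketched below. The function name usedns_disabled is made up for this example, and the key commands are left as comments because they change system state. The account xen@10.1.0.1 is the one used in the configuration further down.

```shell
#!/bin/sh
# Return 0 if the given sshd_config file contains an active "UseDNS no" line.
# $1 = path to the config file (normally /etc/ssh/sshd_config on the Dom0)
usedns_disabled() {
  grep -Eiq '^[[:space:]]*UseDNS[[:space:]]+no[[:space:]]*$' "$1"
}

# Typical key setup, run as root on each DomU:
#   ssh-keygen -t rsa -N "" -f /root/.ssh/id_rsa
#   ssh-copy-id xen@10.1.0.1
#   ssh -o BatchMode=yes xen@10.1.0.1 true   # must succeed without prompting
```

BatchMode makes ssh fail instead of asking for a password, which is the behaviour the STONITH plugin needs when it runs unattended.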

The section below (which isn’t in the feed) contains complete scripts for configuring this.


Here is the /usr/local/sbin/xen-stonith script to run on the Dom0:

#!/bin/sh

if [ "$1" = "list" ]; then
  xm list
  exit 0
fi

nodes="node-0 node-1"
for n in $nodes ; do
  if [ "$2" = "$n" ]; then
    case "$1" in
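    reset)
# if it isn't running then exit with success
# (the trailing space stops node-1 from matching node-10)
      if ! xm list | grep -q "^$2 "
      then
        echo "$2 not running - OK"
        exit 0
      fi
# otherwise destroy and re-create it
      xm shutdown "$2"
      sleep 2
# report success only if both the destroy and the create worked
      if xm destroy "$2" && xm create "/etc/xen/$2"
      then
        echo "Successfully rebooted $2"
        exit 0
      fi
      echo "Failed to reboot $2" >&2
      exit 1
      ;;
    on)
# succeed if the create works or the domain is already running
      xm create "/etc/xen/$2" || xm list | grep -q "^$2 " && echo "Started $2"
      exit $?
      ;;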
    off)
      xm shutdown $2
      sleep 2
      # can this fail?
      if ! xm destroy $2
      then
        echo "Failed to stop $2"
        exit 1
      fi
      echo "Stopped $2"
      exit 0
      ;;
    esac
    echo "Invalid operation $1" >&2
    exit 1
  fi
done
echo "Invalid node $2" >&2
exit 1
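
The running check in the script depends on matching a domain name at the start of a line of xm list output. A stricter variant compares the whole first column, so that node-1 is not reported as running merely because node-10 is. This is a sketch run against sample text rather than a live xm list; the column layout of the sample and the function name domain_running are assumptions.

```shell
#!/bin/sh
# Return 0 if a domain named exactly $1 appears in "xm list" style output
# read from standard input. The first line is treated as the column header.
domain_running() {
  awk -v name="$1" 'NR > 1 && $1 == name { found = 1 } END { exit !found }'
}

# Sample output in the assumed column layout, used instead of a live "xm list".
sample='Name                              ID Mem(MiB) VCPUs State  Time(s)
Domain-0                           0      512     1 r-----  1234.5
node-10                            3      256     1 -b----    12.3'

if echo "$sample" | domain_running node-1; then
  echo "node-1 is running"
else
  echo "node-1 is not running"
fi
```

On the Dom0 the same function would be fed from the real command, for example xm list | domain_running "$2".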

Here is the script to write the XML data for the STONITH service into the CIB database. The two configuration options are hostlist, which contains a space-separated list of all hosts that can be rebooted by Xen, and sshaccount, which contains the SSH account to log in to for the purpose of running sudo, expressed in the form user@host:

#!/bin/bash
if [ "$1" = "start" ]; then
  cibadmin --obj_type resources --cib_create -p << END
  <clone id="XenStonithSet">
    <instance_attributes>
      <attributes>
        <nvpair name="clone_max" value="2"/>
        <nvpair name="clone_node_max" value="1"/>
      </attributes>
    </instance_attributes>
    <primitive id="XenStonith" class="stonith" type="external/xen">
      <operations>
        <op name="monitor" interval="20s" timeout="40s"
        prereq="nothing"/>
        <op name="start" timeout="40s" prereq="nothing"/>
      </operations>
      <instance_attributes>
        <attributes>
          <nvpair name="hostlist" value="node-0 node-1"/>
          <nvpair name="sshaccount" value="xen@10.1.0.1"/>
        </attributes>
      </instance_attributes>
    </primitive>
  </clone>
END
  sleep 1
  cibadmin --obj_type constraints --cib_create -p << END
  <rsc_location id="XenStonith-cons:0" rsc="XenStonithSet">
      <rule id="XenStonith-rule-node-0" score="INFINITY">
        <expression id="XenStonith-expression-node-0"
          attribute="#uname" operation="eq" value="node-0"/>
      </rule>
  </rsc_location>
END
  cibadmin --obj_type constraints --cib_create -p << END
  <rsc_location id="XenStonith-cons:1" rsc="XenStonithSet">
      <rule id="XenStonith-rule-node-1" score="INFINITY">
        <expression id="XenStonith-expression-node-1"
          attribute="#uname" operation="eq" value="node-1"/>
      </rule>
  </rsc_location>
END
else
  cibadmin -D --obj_type resources -X '<clone id="XenStonithSet">'
  cibadmin -D --obj_type constraints -X \
          '<rsc_location id="XenStonith-cons:0">'
  cibadmin -D --obj_type constraints -X \
          '<rsc_location id="XenStonith-cons:1">'
fi
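
The two rsc_location blocks in the script differ only in the index and the node name, so for a larger cluster they could be generated in a loop. The sketch below only prints the XML and does not touch the CIB; the function name location_xml is made up for this example.

```shell
#!/bin/sh
# Print one rsc_location constraint for the XenStonithSet clone.
# $1 = constraint index, $2 = node name
location_xml() {
  cat << END
<rsc_location id="XenStonith-cons:$1" rsc="XenStonithSet">
  <rule id="XenStonith-rule-$2" score="INFINITY">
    <expression id="XenStonith-expression-$2"
      attribute="#uname" operation="eq" value="$2"/>
  </rule>
</rsc_location>
END
}

# Emit one constraint per node, numbering them from zero.
i=0
for node in node-0 node-1; do
  location_xml "$i" "$node"
  i=$((i + 1))
done
```

To load the constraints, each call could be piped into the same command that the script uses, i.e. location_xml "$i" "$node" | cibadmin --obj_type constraints --cib_create -p.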

Finally, here is the STONITH script that is stored at /usr/lib/stonith/plugins/external/xen in the virtual machines. It was inspired significantly by the ssh script, so I am claiming joint copyright with Lars Marowsky-Bree of SUSE:

#!/bin/sh
#
# External STONITH module for xen.
#
# Copyright (c) 2004 SUSE LINUX AG - Lars Marowsky-Bree <lmb@suse.de>
# Copyright (c) 2007 Russell Coker <russell@coker.com.au>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of version 2 of the GNU General Public License as
# published by the Free Software Foundation.
#
# This program is distributed in the hope that it would be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# Further, this software is distributed without any warranty that it is
# free of the rightful claim of any third person regarding infringement
# or the like.  Any license provided herein, whether implied or
# otherwise, applies only to this software file.  Patent licenses, if
# any, provided herein do not apply to combinations of this program with
# other software, or any other product whatsoever.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write the Free Software Foundation,
# Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
#

SSH_COMMAND="ssh $sshaccount"
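# Use the full path so that the command matches the sudoers entry even if
# /usr/local/sbin is not in the PATH of the remote account.
COMMAND="sudo -u root /usr/local/sbin/xen-stonith"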

# Rewrite the hostlist to accept "," as a delimiter for hostnames too.
hostlist=`echo $hostlist | tr ',' ' '`

doit()
{
        for h in $hostlist
        do
                if [ "$h" != "$2" ]
                then
                        continue
                fi
                $SSH_COMMAND "$COMMAND $1 $2" || exit 1
                exit 0
        done
        exit 1
}

case $1 in
gethosts)
        for h in $hostlist ; do
                echo $h
        done
        exit 0
        ;;
on)
        if [ "$(hostname)" = "$2" ]; then
                echo "Error, can't turn yourself on!"
                exit 1
        fi
        doit on $2
        ;;
off)
        doit off $2
        ;;
reset)
        if [ "$(hostname)" = "$2" ]; then
                echo "Error, resetting yourself makes no sense from Heartbeat"
                exit 1
        fi
        doit reset $2
        ;;
status)
        if [ -z "$hostlist" ]
        then
                exit 1
        fi
        if ! $SSH_COMMAND "$COMMAND list" > /dev/null
        then
                exit 1
        fi
        exit 0
        ;;
getconfignames)
        echo "hostlist sshaccount"
        exit 0
        ;;
getinfo-devid)
        echo "xen STONITH device"
        exit 0
        ;;
getinfo-devname)
        echo "xen STONITH external device"
        exit 0
        ;;
getinfo-devdescr)
        echo "Xen-based Linux host reset"
        exit 0
        ;;
getinfo-devurl)
        echo "http://etbe.coker.com.au/category/ha/"
        exit 0
        ;;
getinfo-xml)
        cat << SSHXML
<parameters>
<parameter name="hostlist" unique="1" required="1">
<content type="string" />
<shortdesc lang="en">
Hostlist
</shortdesc>
<longdesc lang="en">
The list of hosts that the STONITH device controls
</longdesc>
</parameter>
<parameter name="sshaccount" unique="1" required="1">
<content type="string" />
<shortdesc lang="en">
SSH Account
</shortdesc>
<longdesc lang="en">
The account to log in to (in user@host form) that can run "sudo -u root xen-stonith"
</longdesc>
</parameter>
</parameters>
SSHXML
        exit 0
        ;;
*)
        exit 1
        ;;
esac
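
The plugin never sets hostlist or sshaccount itself: external STONITH plugins receive their configuration options as environment variables of the same names. The sketch below mimics only the gethosts operation, to show the effect of the comma rewriting; it is a stand-in for illustration, not the plugin.

```shell
#!/bin/sh
# Stand-in for the plugin's "gethosts" operation: the hostlist parameter
# arrives as a variable and may use commas or spaces as separators.
gethosts() {
  for h in $(echo "$hostlist" | tr ',' ' '); do
    echo "$h"
  done
}

hostlist="node-0,node-1"
gethosts
```

The real plugin can be exercised by hand on a DomU in the same way, for example hostlist="node-0 node-1" sshaccount=xen@10.1.0.1 /usr/lib/stonith/plugins/external/xen status, which should exit with status 0 if the ssh and sudo setup is working.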

Update: if this interests you then you may want to read the other posts filed under the Xen and HA categories.

2 comments to Xen and Heartbeat

  • sadegh

    Hi,
    very good.
    my question is if we have only one domu on each machine
    (VMM) and our mechanism for failover is Migration how about Changes on Heartbeat and Xen?

    Thank you.
    Sadegh

  • 1. I heard about people running entire Xen clusters on single physical servers with, e.g., a firewall on one guest, a mail server on another, a web server on yet another etc.

    2. You can communicate with xend over a socket. See “xend-http-server” in xend-config.sxp(5).

    3. To create a limited access to ssh I’d suggest considering use of special ssh private/public keys (without a pass-phrase) and a command="" parameter in authorized_keys. That way only the holder of the private key can invoke the command and the server can limit the command to be invoked by that key (look for “AUTHORIZED_KEYS” in sshd(8)).

    About (2) – you can’t pass arguments to the command but I’m pretty confident it is allowed to read from standard input. Otherwise you can create different key-pairs for different commands if you have to.

    –Amos