The Red Hat High Availability Add-On ships with a number of different fencing agents for different hypervisors. Cluster nodes that are VMs running on a KVM/libvirt host require the fence_virtd software fencing device to be configured.
Our goal is to configure the STONITH agent fence_xvm in a RHEL cluster. To do this, we first need to configure libvirt fencing on our physical KVM host.
Installation
Our KVM server runs CentOS 7.
On the hypervisor, install the following packages:
[kvm]# yum install fence-virtd fence-virtd-libvirt fence-virtd-multicast
Create a shared secret key:
[kvm]# mkdir /etc/cluster
[kvm]# dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=4k count=1
Configuration
Configure the fence_virtd daemon. The important things are to select the libvirt back end and the multicast listener.
Also, make sure to select the correct interface used for communication between the cluster nodes (it’s br0 in our case).
[kvm]# fence_virtd -c
Module search path [/usr/lib64/fence-virt]:

Available backends:
    libvirt 0.3
Available listeners:
    multicast 1.2

Listener modules are responsible for accepting requests
from fencing clients.

Listener module [multicast]:

The multicast listener module is designed for use environments
where the guests and hosts may communicate over a network using
multicast.

The multicast address is the address that a client will use to
send fencing requests to fence_virtd.

Multicast IP Address [225.0.0.12]:

Using ipv4 as family.

Multicast IP Port [1229]:

Setting a preferred interface causes fence_virtd to listen only
on that interface.  Normally, it listens on all interfaces.
In environments where the virtual machines are using the host
machine as a gateway, this *must* be set (typically to virbr0).
Set to 'none' for no interface.

Interface [br0]:

The key file is the shared key information which is used to
authenticate fencing requests.  The contents of this file must
be distributed to each physical host and virtual machine within
a cluster.

Key File [/etc/cluster/fence_xvm.key]:

Backend modules are responsible for routing requests to the
appropriate hypervisor or management layer.

Backend module [libvirt]:

Configuration complete.

=== Begin Configuration ===
fence_virtd {
	listener = "multicast";
	backend = "libvirt";
	module_path = "/usr/lib64/fence-virt";
}

listeners {
	multicast {
		key_file = "/etc/cluster/fence_xvm.key";
		address = "225.0.0.12";
		interface = "br0";
		family = "ipv4";
		port = "1229";
	}
}

backends {
	libvirt {
		uri = "qemu:///system";
	}
}

=== End Configuration ===
Replace /etc/fence_virt.conf with the above [y/N]? y
Enable and start the service on the hypervisor:
[kvm]# systemctl enable fence_virtd
[kvm]# systemctl start fence_virtd
This is important: do not forget to open UDP port 1229 in the firewall on the hypervisor:
[kvm]# firewall-cmd --permanent --add-port=1229/udp
[kvm]# firewall-cmd --reload
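To confirm that the daemon is running, listening on UDP 1229 and the firewall port is open, something along these lines can be used (a quick check, not part of the original steps):
[kvm]# systemctl status fence_virtd
[kvm]# ss -ulpn | grep 1229
[kvm]# firewall-cmd --list-ports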
Copy the fence secret key /etc/cluster/fence_xvm.key to all cluster nodes. Make sure that the file name and the path are the same as on the hypervisor.
[kvm]# for i in $(seq 1 3); do \
  ssh node$i mkdir /etc/cluster; \
  scp /etc/cluster/fence_xvm.key node$i:/etc/cluster/; \
done
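To make sure the key is identical everywhere, the checksums can be compared (a quick sketch, assuming SSH access to the nodes as in the loop above):
[kvm]# md5sum /etc/cluster/fence_xvm.key
[kvm]# for i in $(seq 1 3); do ssh node$i md5sum /etc/cluster/fence_xvm.key; done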
Configure Fence Agent fence_xvm
This configuration applies to cluster nodes and not the hypervisor.
Our cluster nodes have the RHEL High Availability Add-On configured. See here for more info.
Install the fence-virt package on every cluster node.
[nodex]# yum install fence-virt
Verify the SELinux context of the shared key:
[nodex]# ls -Z /etc/cluster/fence_xvm.key
-rw-r--r--. root root unconfined_u:object_r:cluster_conf_t:s0 /etc/cluster/fence_xvm.key
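If the key ends up with a generic context such as etc_t instead (as reported in one of the comments below), restoring the default file labels should be enough on a system with the HA packages installed; a minimal sketch:
[nodex]# restorecon -Rv /etc/cluster
[nodex]# ls -Z /etc/cluster/fence_xvm.key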
Open TCP port 1229 in the firewall on all cluster nodes:
[nodex]# firewall-cmd --permanent --add-port=1229/tcp
[nodex]# firewall-cmd --reload
Check fencing:
[nodex]# fence_xvm -o list
nfs                  66bc6e9e-73dd-41af-85f0-e50b34e1fc07 on
node1                c6220f3a-f937-4470-bfae-d3a3f49e2500 on
node2                2711db33-da71-4119-85da-ae7b294d9d4a on
node3                632121f5-6e40-4910-b863-f4f16d7abcaf on
Try fencing one of the cluster nodes:
[node1]# fence_xvm -o off -H node2
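The fenced VM can be powered back on the same way, assuming the libvirt backend supports the on action (it normally does):
[node1]# fence_xvm -o on -H node2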
Add a stonith resource to the pacemaker cluster:
[node1]# pcs stonith create fence_node1 fence_xvm \
  key_file="/etc/cluster/fence_xvm.key" \
  action="reboot" \
  port="node1" \
  pcmk_host_list="node1.hl.local"
The port is the name of the VM as seen by libvirt (virsh list), and pcmk_host_list contains the name of the cluster node.
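The domain names can be confirmed on the hypervisor with virsh; the names returned should match the fence_xvm -o list output above (nfs, node1, node2, node3):
[kvm]# virsh list --all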
This approach creates a separate stonith resource per cluster node, so the command is repeated for each node (sketched below).
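For reference, the resources for the remaining two nodes would follow the same pattern as fence_node1 above (a sketch, assuming the node2.hl.local and node3.hl.local hostnames used in the host map below):
[node1]# pcs stonith create fence_node2 fence_xvm \
  key_file="/etc/cluster/fence_xvm.key" \
  action="reboot" \
  port="node2" \
  pcmk_host_list="node2.hl.local"
[node1]# pcs stonith create fence_node3 fence_xvm \
  key_file="/etc/cluster/fence_xvm.key" \
  action="reboot" \
  port="node3" \
  pcmk_host_list="node3.hl.local"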
Alternatively, a host map can be used:
[node1]# pcs stonith create fence_all fence_xvm \
  key_file="/etc/cluster/fence_xvm.key" \
  action="reboot" \
  pcmk_host_map="node1.hl.local:node1,node2.hl.local:node2,node3.hl.local:node3" \
  pcmk_host_list="node1,node2,node3" \
  pcmk_host_check=static-list
The command above would create a single stonith resource that can fence all cluster nodes.
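The stonith resource can then be checked, and fencing tested through the cluster stack itself (a quick sketch; note that pcs stonith fence reboots the target node):
[node1]# pcs stonith show
[node1]# pcs stonith fence node2.hl.local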
Comments
It has worked. Great, thanks a lot!
I got an error message in my environment, as below. I omitted the action parameter and it worked. I think there is a default action that is used instead.
Another thing: in my environment the SELinux file context is as below. I haven't modified the context.
[root@node1 cluster]# ls -Z /etc/cluster/fence_xvm.key
-rw-r--r--. root root unconfined_u:object_r:etc_t:s0 /etc/cluster/fence_xvm.key
[root@node1 cluster]# pcs stonith create fence_all fence_xvm key_file="/etc/cluster/fence_xvm.key" action=reboot pcmk_host_map="node1.public.example.com:node1,node2.public.example.com:node2,node3.public.example.com:node3" pcmk_host_list="node1,node2,node3" pcmk_host_check=static-list
Error: stonith option 'action' is deprecated and should not be used, use pcmk_off_action, pcmk_reboot_action instead, use --force to override
[root@node1 cluster]# pcs stonith create fence_all fence_xvm key_file="/etc/cluster/fence_xvm.key" pcmk_host_map="node1.public.example.com:node1,node2.public.example.com:node2,node3.public.example.com:node3" pcmk_host_list="node1,node2,node3" pcmk_host_check=static-list
[root@node1 cluster]# pcs stonith show
fence_all (stonith:fence_xvm): Started node1.public.example.com
[root@node1 cluster]#
Well done!
Hi Tomas,
Great tutorial. I am attempting to configure the cluster nodes. I see you mentioned 1229/TCP.
Is it TCP or UDP?
You’ve got the answer right there, 1229/TCP.
This is way too cool, however:
- When you create a resource with pcs stonith create you must use pcmk_off_action=reboot instead of action=reboot
- Opening UDP 1229 on the hypervisor is not enough, something still blocks the UDP traffic
I couldn't find any solution to the firewall issue on the hypervisor, do you have any idea? I opened TCP 1229 on the cluster nodes and UDP 1229 on the hypervisor.
This may be a firewall zone issue.
# firewall-cmd --get-active-zones
libvirt
interfaces: virbr0
public
interfaces: eth0
Just add the port to the zone appropriate to the interface, in my case libvirt, to fix it:
# firewall-cmd --add-port=1229/udp --permanent --zone=libvirt
success
# firewall-cmd --reload
Thanks, that’s very useful.