About six years ago, my company set up a two-node DNS cluster using Heartbeat with Pacemaker on Ubuntu 14.04.
This was to provide our customers with a recursive DNS server located in our Datacentre.
The two-node cluster is set up to be CNS only (caching name server). I have another private ANS (authoritative name server), which pushes zone updates out to these, so it's not public facing.
Rather than doing a do-release-upgrade on the live environment and risking everything going BANG, I thought it would be wise to set up a UAT environment running Ubuntu 20.04 (the latest release at the time of writing).
I quickly found out that when you do apt-get install pacemaker, it comes bundled with corosync, and there is no way to stop this. Even apt-get install pacemaker corosync- just complains and errors out, and if you install pacemaker and then try to uninstall corosync afterwards, it removes pacemaker as well! #annoying!
I also tried ignoring it, installing heartbeat (apt-get install heartbeat) and trying to use it with pacemaker. It turns out that no matter what I did, heartbeat just wouldn't work or play with pacemaker. This was extremely annoying, as I wanted to follow the guide I worked on six years ago! I even came across a post where someone had done something similar to us, which was an interesting read. I reached out to the guy who made the post and asked whether he had tried upgrading his setup to Ubuntu 20.04, but I believe he has moved on. So I thought I would write a short guide on how to do what I did six years ago, but using Corosync instead of Heartbeat.
At the end of the day it doesn't actually matter: both Corosync and Heartbeat are messaging layers and work the same way with Pacemaker.
So here we go.
Install Corosync and Pacemaker
sudo apt-get install pacemaker
Note that Corosync is installed as a dependency of the Pacemaker package. Corosync and Pacemaker are now installed, but they need to be configured before they will do anything useful.
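If you want to double-check what was pulled in, something along these lines should show both packages and the Corosync version (the exact versions will depend on your Ubuntu release):
apt-cache policy pacemaker corosync
corosync -v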
Create Cluster Authorization Key
In order to allow nodes to join a cluster, Corosync requires that each node possesses an identical cluster authorization key. On the primary server, install the haveged package:
sudo apt-get install haveged
This software package allows us to easily increase the amount of entropy on our server, which is required by the corosync-keygen script. On the primary server, run the corosync-keygen script:
sudo corosync-keygen
This will generate a 128-byte cluster authorization key, and write it to /etc/corosync/authkey.
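If you want to confirm the key was actually created, a quick look at the file should show it owned by root and readable only by root:
sudo ls -l /etc/corosync/authkey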
Now that we no longer need the haveged package, let’s remove it from the primary server:
sudo apt-get remove --purge haveged
sudo apt-get clean
On the primary server, copy the authkey to the secondary server:
sudo scp /etc/corosync/authkey username@secondary_ip:/tmp
On the secondary server, move the authkey file to the proper location, and restrict its permissions to root:
sudo mv /tmp/authkey /etc/corosync
sudo chown root: /etc/corosync/authkey
sudo chmod 400 /etc/corosync/authkey
Now both servers should have an identical authorization key in the /etc/corosync/authkey file.
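An easy way to confirm the two copies really are identical is to compare checksums; run this on each server and the output should match:
sudo sha256sum /etc/corosync/authkey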
Configure Corosync Cluster
In order to get our desired cluster up and running, we must configure Corosync on both servers. Open the corosync.conf file for editing in your favorite editor (we'll use vi):
sudo vi /etc/corosync/corosync.conf
Here is a Corosync configuration file that will allow your servers to communicate as a cluster. Be sure to replace the placeholder values with the appropriate values for your environment.
bindnetaddr should be set to the private IP address of the server you are currently working on. The two placeholders in the nodelist should be set to the indicated server's private IP address. With the exception of bindnetaddr, the file should be identical on both servers.
Replace the contents of corosync.conf with this configuration, with the changes that are specific to your environment:
totem {
  version: 2
  cluster_name: lbcluster
  transport: udpu
  interface {
    ringnumber: 0
    bindnetaddr: server_private_IP_address
    broadcast: yes
    mcastport: 5405
  }
}

quorum {
  provider: corosync_votequorum
  two_node: 1
}

nodelist {
  node {
    ring0_addr: primary_private_IP_address
    name: primary
    nodeid: 1
  }
  node {
    ring0_addr: secondary_private_IP_address
    name: secondary
    nodeid: 2
  }
}

logging {
  to_logfile: yes
  logfile: /var/log/corosync/corosync.log
  to_syslog: yes
  timestamp: on
}
The totem section, which refers to the Totem protocol that Corosync uses for cluster membership,
specifies how the cluster members should communicate with each other. In our setup, the important settings include transport:
udpu (specifies unicast mode) and bindnetaddr (specifies which network address Corosync should bind to).
The quorum section specifies that this is a two-node cluster, so only a single node is required for quorum (two_node: 1).
This is a workaround for the fact that achieving quorum normally requires at least three nodes in a cluster.
This setting will allow our two-node cluster to elect a coordinator (DC), which is the node that controls the cluster at any given time.
The nodelist section specifies each node in the cluster, and how each node can be reached.
Here, we configure both our primary and secondary nodes, and specify that they can be reached via their respective private IP addresses.
The logging section specifies that the Corosync logs should be written to /var/log/corosync/corosync.log.
If you run into any problems with the rest of this guide, be sure to look here while you troubleshoot.
Save and exit.
Reboot
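If you would rather not reboot, restarting the two services on both nodes should have the same effect (this is an alternative I'd expect to work, not what we originally did), and tailing the Corosync log is a quick way to watch the nodes join:
sudo systemctl restart corosync
sudo systemctl restart pacemaker
sudo tail -f /var/log/corosync/corosync.log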
You'll see now, if you log on as root (or do "sudo -s") and run crm_mon, that both servers are online and in a cluster. Now it's time to set up resources for the servers to share. This is the script we wrote, which still works fine. We have two public IPs which float between these boxes, so if one goes offline, the other will take over. Because BIND9 doesn't listen on 0.0.0.0 (it binds to the individual addresses present when it scans the interfaces), we need to reload bind9 when a failover occurs so BIND can start answering queries on the floating IP.
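For reference, this is roughly the kind of named.conf.options the setup assumes; the ACL and directory values here are illustrative examples, not our production config:
acl "customers" { 192.0.2.0/24; };    # example range only, replace with your customer networks

options {
    directory "/var/cache/bind";
    recursion yes;                    # caching / recursive name server
    allow-recursion { customers; };
    # BIND binds to the specific addresses present when it scans the interfaces,
    # not to the 0.0.0.0 wildcard, so a floating IP added by IPaddr2 is only
    # picked up after a rescan, which is what the rndc reload resource triggers.
    listen-on { any; };
};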
My Script
You need to go into the CRM shell: type crm and press Enter, then type configure and paste in these commands, obviously modifying them to suit your needs. (In our cluster the node names are dns1 and dns2, so substitute your own node names in the location constraints.)
property stonith-enabled=false
primitive dns1-ip IPaddr2 params ip="Public Floating IP"
location dns1-ip-on-dns1 dns1-ip 100: dns1
primitive dns1-restart-bind ocf:heartbeat:anything params binfile="/usr/sbin/rndc" cmdline_options="reload"
group dns1-group dns1-ip dns1-restart-bind meta target-role="Started"
primitive dns2-ip IPaddr2 params ip="Second Public Floating IP"
location dns2-ip-on-dns2 dns2-ip 100: dns2
primitive dns2-restart-bind ocf:heartbeat:anything params binfile="/usr/sbin/rndc" cmdline_options="reload"
group dns2-group dns2-ip dns2-restart-bind meta target-role="Started"
commit
exit
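Once you have committed the configuration, it's worth sanity-checking it from outside the crm shell, for example with:
sudo crm configure show
sudo crm_mon -1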
Now you have a two-node cluster sharing resources!
Useful Commands
crm node commands:
standby dnsx
online dnsx
crm resource commands:
migrate <resource> <host>
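For example, with our node and resource names (adjust these to match your own cluster) the commands look like this; note that newer crmsh versions also accept move/clear in place of migrate/unmigrate:
crm node standby dns1                  # take dns1 out of service, its resources fail over to dns2
crm node online dns1                   # bring dns1 back into the cluster
crm resource migrate dns1-group dns2   # force dns1-group onto dns2
crm resource unmigrate dns1-group      # clear the constraint created by migrate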