Archives

Categories

more about Heartbeat

In a comment on my blog post “a Heartbeat developer comments on my blog post” Alan Robertson writes: I got in a hurry on my math because of the emergency. So, there are even more assumptions (errors?) than I documented. In particular, the probability model I gave was for a particular node to fail. So […]

a Heartbeat developer comments on my blog post

Alan Robertson (a major contributor to the Heartbeat project) commented on my post failure probability and clusters. His comment deserves wider readership than a comment generally gets so I’m making a post out of it. Here it is:

One of my favorite phrases is “complexity is the enemy of reliability” . This is absolutely true, […]

2 node vs 3+ node clusters

A comment on my post about the failure probability of clusters suggested that a six node cluster that has one node fail should become a five node cluster.

The problem with this is what to do when nodes recover from a failure. For example if a six node cluster had a node fail and became […]

failure probability and clusters

When running a high-availability cluster of two nodes it will generally be configured such that if one node fails then the other runs. Some common operation (such as accessing a shared storage device or pinging a router) will be used by the surviving node to determine that the other node is dead and that it’s […]

heartbeat – what defines a cluster?

In Debian bug 418210 there is discussion of what constitutes a cluster.

I believe that the node configuration lines in the config file /etc/ha.d/ha.cf should authoritatively define what is in the cluster and any broadcast packets from other nodes should be ignored.

Currently if you have two clusters sharing the same VLAN and they both […]