Bridging and Redundancy

I’ve been working on a redundant wireless network for a client. The network has two sites joined by a pair of links (primary and backup), each with its own dedicated wireless hardware. The hardware is not 802.11 and has a proprietary controller in the device, so it can’t be used as an interface on a Linux box.

When I first started work the devices were configured in a fully bridged mode, so I decided to use Linux Bridging (with brctl) to bridge an Ethernet port connected to the LAN with only one of the two wireless devices. The remote end had a Linux box that would bridge both the wireless devices at its end (there were four separate end-points as the primary and backup links were entirely independent). This meant of course that packets would go over the active link and then return via the inactive link, but the needless data transfer on the unused link didn’t cause any problems.
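The local side of that arrangement can be sketched with brctl roughly as follows. This is only an illustration under assumed names: br0, eth0, eth1 and eth2 are hypothetical and are not taken from the client's setup. By default the sketch just prints the commands; it only runs them when APPLY=1 is set, which needs root.

```shell
#!/bin/sh
# Sketch only: all device names below are hypothetical.
BRIDGE=br0       # bridge device
LAN_IF=eth0      # Ethernet port connected to the LAN
PRIMARY_IF=eth1  # port wired to the primary wireless device
BACKUP_IF=eth2   # port wired to the backup wireless device (kept out of the bridge)

# Print each command; execute it only when APPLY=1 is set.
run() {
  echo "$*"
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  fi
}

# Bridge the LAN port with the primary wireless port only.
setup_bridge() {
  run brctl addbr "$BRIDGE"
  run brctl addif "$BRIDGE" "$LAN_IF"
  run brctl addif "$BRIDGE" "$PRIMARY_IF"
  run ifconfig "$LAN_IF" up
  run ifconfig "$PRIMARY_IF" up
  run ifconfig "$BACKUP_IF" down
  run ifconfig "$BRIDGE" up
}

setup_bridge
```

Keeping the backup port out of the bridge is what avoids a loop at this end, since the wireless devices themselves don't run STP.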

The wireless devices claimed to implement bridging but didn’t implement STP (Spanning Tree Protocol), and they munged every packet to have the MAC address of the wireless device (unlike a Linux bridge, which preserves the MAC address). The lack of STP meant that both links couldn’t be bridged at both ends without creating a loop. The devices also only forwarded IP packets, so the STP frames sent by Linux hosts or switches (which are not IP) would never cross the links and I couldn’t use those STP implementations to prevent loops.

Below (in the part of this post which shouldn’t be in the RSS feed) I have included the script I wrote to manage a failover bridge. It pings the router at the other end while the primary link is in use; if it can’t reach it, it removes the Ethernet device that corresponds to the primary link from the bridge and adds the device for the secondary link. I had an hourly cron job that would flip it back to the primary link if it was on the secondary.

I ended up not using this in production because there were some other routers on the network which couldn’t cope with a MAC address changing and needed a reboot after such changes (even waiting 15 minutes didn’t result in the new MAC being reliably detected). So I’m posting it here for the benefit of anyone who is interested.

#!/usr/bin/perl -w

use strict;
use Sys::Syslog;
use POSIX;

my $cfg_file = "/etc/bridged.cfg";
my $daemon = 1;

openlog("bridged", "pid cons", "local1") or die "openlog";

if($ARGV[0] and $ARGV[0] eq "-d")
{
  $daemon = 0;
}

# log a fatal error to syslog and exit
sub error
{
  my $msg = shift;
  syslog("err", "%s", $msg);
  exit(1);
}

open(CFG, "<" . $cfg_file) or die "Can't read \"$cfg_file\".";
# defaults, overridden by the config file
my %cfg = ( SLEEP => '30',
            PING_INTERVAL => '3',
            PING_COUNT => '3',
);
while(<CFG>)
{
  chomp;
  if($_ =~ /^#/ or length($_) == 0) { next; }
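  # split on the first "=" only, dropping spaces and tabs around it
  my ($name, $val) = split(/[ \t]*=[ \t]*/, $_, 2);
  # skip lines that are not of the form "name = value"
  if(!defined($val)) { next; }
  $name =~ s/^[ \t]*//;
  $val =~ s/[ \t]*$//;
  $cfg{$name} = $val;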
}
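close(CFG);

# the bridge, the two devices and the ping target have no defaults
foreach my $key ("BRIDGE", "IF1", "IF2", "PING1")
{
  defined($cfg{$key}) or die "No $key in \"$cfg_file\".";
}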

system("ifconfig $cfg{IF1} up");
system("brctl addif $cfg{BRIDGE} $cfg{IF1}");
system("brctl setpathcost $cfg{BRIDGE} $cfg{IF1} 25");
system("brctl delif $cfg{BRIDGE} $cfg{IF2}");
system("ifconfig $cfg{IF2} down");

my $route1 = 1;

if($daemon == 1)
{
  my $rc = fork();
  # fork() returns undef on failure, so check that before comparing
  if(!defined($rc))
  {
    error("Can't fork.");
  }

  if($rc > 0)
  {
# parent
    exit(0);
  }

  POSIX::setsid();
  my $pidfile = "/var/run/bridged.pid";
  open(PID, ">" . $pidfile) or error("Can't open $pidfile.");
  print PID getpid() . "\n" or error("Can't write to $pidfile.");
  close(PID);

  open(STDIN, "</dev/null") or error("Can't open /dev/null.");
  open(STDOUT, ">/dev/null") or error("Can't open /dev/null.");
  open(STDERR, ">/dev/null") or error("Can't open /dev/null.");
}
syslog("info", "starting up, 60 second delay");
sleep(60);
syslog("info", "now running");

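# returns the number of ping replies received from $ip (0 if none)
sub ping_host
{
  my $ip = shift;
  my $count = 0;
  if(open(CMD, "ping -i $cfg{PING_INTERVAL} -c $cfg{PING_COUNT} $ip |"))
  {
    while(<CMD>)
    {
      # summary line, e.g. "3 packets transmitted, 3 received, 0% packet loss"
      if(/transmitted, (\d+) (?:packets )?received/) { $count = $1; }
    }
    close(CMD);
  }
  if($count == 0)
  {
    syslog("err", "Host $ip can't be pinged");
  }
  return $count;
}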

while(sleep($cfg{SLEEP}))
{
  if($route1)
  {
    unlink("$cfg{SIGNAL}");
    if(ping_host($cfg{PING1}) == 0)
    {
      system("ifconfig $cfg{IF2} up");
      system("brctl addif $cfg{BRIDGE} $cfg{IF2}");
      system("brctl setpathcost $cfg{BRIDGE} $cfg{IF2} 50");
      system("brctl delif $cfg{BRIDGE} $cfg{IF1}");
      system("ifconfig $cfg{IF1} down");
      syslog("warning", "Interface $cfg{IF1} is broken, using interface $cfg{IF2}");
      $route1 = 0;
    }
  }
  else
  {
    if(stat("$cfg{SIGNAL}"))
    {
      unlink("$cfg{SIGNAL}");
      system("ifconfig $cfg{IF1} up");
      system("brctl addif $cfg{BRIDGE} $cfg{IF1}");
      system("brctl setpathcost $cfg{BRIDGE} $cfg{IF1} 25");
      system("brctl delif $cfg{BRIDGE} $cfg{IF2}");
      system("ifconfig $cfg{IF2} down");
      syslog("warning", "Interface $cfg{IF1} is considered to be fixed");
      sleep(60);
      $route1 = 1;
    }
  }
}
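
The script reads "name = value" pairs from /etc/bridged.cfg, skipping blank lines and lines that start with "#". The key names below are the ones the script uses; every value is a hypothetical placeholder for illustration, not the configuration from the client's network.

```
# /etc/bridged.cfg (sample, all values are placeholders)
BRIDGE = br0
IF1 = eth1
IF2 = eth2
PING1 = 192.0.2.1
SIGNAL = /var/run/bridged.primary
SLEEP = 30
PING_INTERVAL = 3
PING_COUNT = 3
```

While on the secondary link the script switches back to IF1 as soon as the SIGNAL file exists, so the hourly cron job presumably only needs to create that file, for example with a crontab line such as "0 * * * * touch /var/run/bridged.primary" (the path is again an assumption). The script doesn't test the primary link before switching back; it waits 60 seconds and then resumes pinging, and fails over again if the router still can't be reached.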
