Linux, politics, and other interesting things
I have just configured IPVS on a Xen server for load balancing between multiple virtual hosts. The benefit is not load balancing but management. With two virtual machines providing a service I can gracefully shut one down for maintenance and have the other take the load. When there are two machines providing a service a load balancing configuration is much better than a hot-spare, one reason is the fact that there may be application scaling issues that prevent one machine with twice the resources from giving as much performance as two smaller machines. Another is the fact that if you have a machine configured but never used there will always be some doubt as to whether it would work…
The first thing to do is to assign the IP address of the service to the front-end machine so that other machines on the segment (IE routers) will be able to send data to it. If the address for the service is 10.0.0.5 then the command “ip addr add dev eth0 10.0.0.5/24 broadcast +” will make it a secondary address on the eth0 interface. On a Debian system you would add the line “up ip addr add dev eth0 10.0.0.5/24 broadcast + || true” to the appropriate section of /etc/network/interfaces, for a Red Hat system it seems that /etc/rc.local is the best place for it. I expect that it would be possible to merely advertise the IP address via ARP without adding it to the interface, but the ability to ping the IPVS server on the service address seems useful and there seems no benefit in not assigning the address.
There are three methods used by IPVS for forwarding packets, gatewaying/routing (the default), IPIP encapsulation (tunneling), and masquerading. The gatewaying/routing method requires the back-end server to respond to requests on the service address. That would mean assigning the address to the back-end server without advertising it via ARP (which seems likely to have some issues for managing the system). The IPIP encapsulation method requires setting up IPIP which seemed like it would be excessively difficult (although maybe not more than required to set up masquerading). The masquerading option (which I initially chose) rewrites the packets to have the IP address of the real server. So for example if the service address is 10.0.0.5 and the back-end server has the address 10.0.1.5 then it will see packets addresses to 10.0.1.5. A benefit of masquerading is that it allows you to use different ports, so for example you could have a non-virtualised mail server listening on port 25 and a back-end server for a virtual service listening on port 26. While there is no practical limit to the number of private IP addresses that you might use it seems easier to manage servers listening on different ports with the same IP address – and there is the issue of server programs that are not written to support binding to an IP address.
ipvsadm -A -t 10.0.0.5:25 -s lblc -p
ipvsadm -a -t 10.0.0.5:25 -r 10.0.1.5 -m
The above two commands create an IPVS configuration that listens on port 25 of IP address 10.0.0.5 and then masquerades connections to 10.0.1.5 on port 25 (the default is to use the same port).
Now the problem is in getting the packets to return via the IPVS server. If the IPVS server happens to be your default gateway then it’s not a problem and it will already be working after the above two commands (if a service is listening on 10.0.1.5 port 25).
If the IPVS server is not the default gateway and you have only one IP address on the back-end server then this will require using netfilter to mark the packets and then route based on the packet matching. Marking via netfilter also seems to be the only well documented way of doing similar things. I spent some time working on this and didn’t get it working. However having multiple IP addresses per server is a recommended practice anyway (a back-end interface for communication between servers as well as a front-end interface for public data).
ip rule add from 10.0.1.5 table 1
ip route add default via 10.0.0.1 table 1
I use the above two commands to set up a new routing table for the data for the virtual service. The first line causes any packets from 10.0.1.5 to be sent to routing table 1 (I currently have a rough plan to have table numbers match ethernet device numbers, the data in question is going out device eth1). The second line adds a default router to table 1 which sends all packets to 10.0.0.1 (the private IP address of the IPVS server).
Then it SHOULD all be working, but in the network that I’m using (RHEL4 DomU and RHEL5 Dom0 and IPVS) it doesn’t. For some reason the data packets from the DomU are not seen as part of the same TCP stream (both in Net Filter connection tracking and by the TCP code in the kernel). So I get an established connection (3 way handshake completed) but no data transfer. The server sends the SMTP greeting repeatedly but nothing is received. At this stage I’m not sure whether there is something missing in my configuration or whether there’s a bug in IPVS. I would be happy to send tcpdump output to anyone who wants to try and figure it out.
My next attempt at this was via routing. I removed the “-m” option from the ipvsadm command and added the service IP address to the back-end with the command “ifconfig lo:0 10.0.0.5 netmask 255.255.255.255” and configured the mail server to bind to port 25 on address 10.0.0.5. Success at last!
Now I just have to get Piranha working to remove back-end servers from the list when they fail.
Update: It’s quite important that when adding a single IP address to device lo:0 you use a netmask of 255.255.255.255. If you use the same netmask as the front-end device (which would seem like a reasonable thing to do) then (with RHEL4 kernels at least) you get proxy ARPs by default. For example you used netmask 255.255.255.0 to add address 10.0.0.5 to device lo:0 then on device eth0 the machine will start answering ARP requests for 10.0.0.6 etc. Havoc then ensues.