Red Hat, Microsoft, and Virtualisation Support

Red Hat has just announced a deal with MS for support of RHEL virtual machines on Windows Server and Windows virtual machines on RHEL [1]. It seems that this deal won’t deliver anything before “calendar H2 2009” so nothing will immediately happen – but the amount of testing to get these things working correctly is significant.

Red Hat has stated that “the agreements contain no patent or open source licensing components” and “the agreements contain no financial clauses, other than industry-standard certification/validation testing fees” so it seems that there is nothing controversial in this. Of course that hasn’t stopped some people from getting worked up about it.

I think that this deal is a good thing. I have some clients who run CentOS and RHEL servers (that I installed and manage) as well as some Windows servers. Some of these clients have made decisions about the Windows servers that concern me (such as not using ECC RAM, RAID, or backups). It seems to me that if I were to use slightly more powerful hardware for the Linux servers I could run Windows virtual machines for those clients and manage all the backups at the block device level (without bothering the Windows sysadmins). This also has the potential to save the client some costs in terms of purchasing hardware and managing it.
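As a rough sketch of what a block-level backup of a guest could look like, assume each Windows guest's disk is an LVM logical volume on the Linux host. The volume group, volume name, snapshot size, and destination path below are hypothetical, and the function only builds the commands rather than running them. A snapshot of a running guest is crash-consistent at best, so this illustrates the idea and is not a tested procedure.

```python
import shlex


def snapshot_backup_commands(vg, lv, dest, snap_size="2G"):
    """Return the commands for one snapshot-based backup of a guest disk.

    vg, lv, dest and snap_size are illustrative parameters, not values
    taken from any real installation.
    """
    snap = f"{lv}-backup-snap"
    origin = f"/dev/{vg}/{lv}"
    snap_dev = f"/dev/{vg}/{snap}"
    return [
        # 1. Create a copy-on-write snapshot of the guest's block device.
        ["lvcreate", "--snapshot", "--size", snap_size, "--name", snap, origin],
        # 2. Copy the frozen snapshot to the backup destination.
        ["dd", f"if={snap_dev}", f"of={dest}", "bs=4M"],
        # 3. Remove the snapshot so it stops consuming space.
        ["lvremove", "--force", snap_dev],
    ]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for cmd in snapshot_backup_commands("vg0", "win2008", "/backup/win2008.img"):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to review them before anything touches a production volume; the snapshot size has to be large enough to absorb the writes the guest makes while the copy runs.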

When this deal with MS produces some results (maybe in 6 months' time) I will recommend that some of my clients convert CentOS machines to RHEL to take advantage of it. If my clients take my advice in this regard then it will result in a small increase in revenue and market share for RHEL. So Red Hat’s action in this regard seems to be a good business decision for them. If my clients take my advice and allow me to use virtualisation to better protect their critical data that is on Windows servers then it will be a significant benefit for the users.

4 comments to Red Hat, Microsoft, and Virtualisation Support

  • Andrew Hardy

    What about the Windows system state backup? Snapshot backups on the host have always been fraught with peril (VMWare VCB anyone? Junk!) because you need to be able to quiesce the guest. Writing some kind of script on the guest to do a ssb->flatfile is a little ugly; you want to just be able to do a snapshot and that’s that.

    I agree this deal is a good thing. A great thing. But it’s not automagically going to resolve some inherent design flaws. At the end of the day the point of a backup is to restore, and the Windows sysadmins are going to need a degree of comfort that they can restore. You should not have to do a two-step restore (restore the snap and then restore the individual files) to do a partial recovery. Design to recover.

    Would love to have an offline discussion with you, Russell, about your thoughts on this and some other things. Are you on IRC anymore? :)


  • etbe

    Sam Varghese quoted me in ITWire. One thing that interested me was the different perspectives shown by the other people who were quoted.

    Andrew: If the virtualised OS works correctly then it will write the data to the virtual disk in an order such that at any time it will be in a state that can be correctly recovered (journal replay or rollback on both the filesystem and on any databases running on it). If the virtual machine works correctly then it will commit data to the physical media in the same order and things will be fine.

    In operation of LVM snapshots with Xen (running on RHEL5 on AMD64, and Debian Etch and Lenny on i386) I have never had a problem with implementing this. On one AMD64 Xen server which is a Frankenstein setup of Debian/Lenny with a CentOS kernel I use regular files on an Ext3 filesystem for disk images and I have found that “xm destroy” routinely causes filesystem corruption.

    Now these are just my test results; of course they can’t prove that there isn’t a horrible bug lurking which I have been lucky enough to avoid so far. ;) But I have done such things often enough over a long enough period to become convinced that such problems are at least unlikely.

    If you have some evidence of problems in this regard that you can publish then I would be very interested to see it. Bugs in VMWare don’t particularly interest me though as I have no plans to use it regardless.

  • Andrew Hardy

    I come from a commercial world where you always ensured the Windows backup clients had “backup system state” enabled otherwise the system was unrecoverable.

    My point is that Windows sysadmins in such a corporate team aren’t comfortable with their backups performed by other sysadmins on other types of systems. They aren’t convinced, by a long shot, that they can recover.

    For instance, I haven’t seen Windows file systems recovered by replaying journals or rollbacks on the file systems. It’s an entirely different scale and different type of team. Your level of comfort is much higher!

    From a design perspective, what about partial system recovery? You don’t want to have to recover an entire snap somewhere so that you can manually copy (or script the copying of) the things you need.

    It will work just fine for simple scenarios, and not work well for more complex types of recovery. I’ve seen large solutions relying on snapshots for terabytes of data, and I’ve seen those storage frames run out of space and not be able to protect the data.

    These are not very technical examples, without much in the way of detailed bits & bytes on filesystems to back up the argument, and I certainly can’t point at bugs in any products to say “here’s the problem”. They’re design problems. I hope you can appreciate my argument :)


  • etbe

    Andrew: It boggles the mind that an OS could be so badly designed that you can’t just snapshot it. Such snapshot backups are a matter of routine on Linux systems and always work well.

    While you have some good points about serious installations with remotely competent people running them, I still believe that snapshot backups have something to offer Windows systems. I run several Linux servers at sites that have Windows servers with no backups at all and no RAID! They are one drive failure away from disaster! Anything is better than that.