Table of Contents
The Problem
Just over 2 years ago my Dell T320 server had a motherboard failure [1]. I recently bought another T320 that had been gutted (no drives, PSUs, or RAM) and put the bits from my one in it.
I installed Debian and the resulting installation wouldn’t boot, I tried installing with both UEFI and BIOS modes with the same result. Then I realised that the disks I had installed were available even though I hadn’t gone through the RAID configuration (I usually make a separate RAID-0 for each disk to work best with BTRFS or ZFS). I tried changing the BIOS setting for SATA disks between “RAID” and “AHCI” modes which didn’t change things and realised that the BIOS setting in question probably applies to the SATA connector on the motherboard and that the RAID card was in “IT” mode which means that each disk is seen separately.
If you are using ZFS or BTRFS you don’t want to use a RAID-1, RAID-5, or RAID-6 on the hardware RAID controller, if there are different versions of the data on disks in the stripe then you want the filesystem to be able to work out which one is correct. To use “IT” mode you have to flash a different unsupported firmware on the RAID controller and then you either have to go to some extra effort to make it bootable or have a different device to boot from.
The Root Causes
Dell has no reason to support unusual firmware on their RAID controllers. Installing different firmware on a device that is designed for high availability is going to have some probability of data loss and perhaps more importantly for Dell some probability of customers returning hardware during the support period and acting innocent about why it doesn’t work. Dell has a great financial incentive to make it difficult to install Dell firmware on LSI cards from other vendors which have equivalent hardware as they don’t want customers to get all the benefits of iDRAC integration etc without paying the Dell price premium.
All the other vendors have similar financial incentives so there is no official documentation or support on converting between different firmware images. Dell’s support for upgrading the Dell version is pretty good, but it aborts if it sees something different.
The Attempts
I tried following the instructions in this document to flash back to Dell firmware [2]. This document is about the H310 RAID card in my Dell T320 AKA a “LSI SAS 9211-8i”. The sas2flash.efi program didn’t seem to do anything, it returned immediately and didn’t give an error message.
This page gives a start of how to get inside the Dell firmware package but doesn’t work [3]. It didn’t cover the case where sasdupie aborts with an error because it detects the current version as “00.00.00.00” not something that the upgrade program is prepared to upgrade from. But it’s a place to start looking for someone who wants to try harder at this.
The Solution
Dell tower servers have as a standard feature an internal USB port for a boot device. So I created a boot image on a spare USB stick and installed it there and it then loads the kernel and mounts the filesystem from a SATA hard drive. Once I got that working everything was fine. The Debian/Trixie installer would probably have allowed me to install an EFI device on the internal USB stick as part of the install if I had known what was going to happen.
The system is now fully working and ready to sell. Now I just need to find someone who wants “IT” mode on the RAID controller and hopefully is willing to pay extra for it.
Whatever I sell the system for it seems unlikely to cover the hours I spent working on this. But I learned some interesting things about RAID firmware and hopefully this blog post will be useful to other people, even if only to discourage them from trying to change firmware.
Leave a Reply
You must be logged in to post a comment.