BTRFS Status Dec 2014

My last problem with BTRFS was in August [1]. BTRFS has been running mostly uneventfully for me for the last 4 months, that’s a good improvement but the fact that 4 months of no problems is noteworthy for something as important as a filesystem is a cause for ongoing concern.

A RAID-1 Array

A week ago I had a minor problem with my home file server, one of the 3TB disks in the BTRFS RAID-1 started giving read errors. That’s not a big deal, I bought a new disk and did a “btrfs replace” operation which was quick and easy. The first annoyance was that the output of “btrfs device stats” reported an error count for the new device, it seems that “btrfs device replace” copies everything from the old disk including the error count. The solution is to use “btrfs device stats -z” to reset the count after replacing a device.

I replaced the 3TB disk with a 4TB disk (with current prices it doesn’t make sense to buy a new 3TB disk). As I was running low on disk space I added a 1TB disk to give it 4TB of RAID-1 capacity, one of the nice features of BTRFS is that a RAID-1 filesystem can support any combination of disks and use them to store 2 copies of every block of data. I started running a btrfs balance to get BTRFS to try and use all the space before learning from the mailing list that I should have done “btrfs filesystem resize” to make it use all the space. So my balance operation had configured the filesystem to configure itself for 2*3TB+1*1TB disks which wasn’t the right configuration when the 4TB disk was fully used. To make it even more annoying the “btrfs filesystem resize” command takes a “devid” not a device name.

I think that when BTRFS is more stable it would be good to have the btrfs utility warn the user about such potential mistakes. When a replacement device is larger than the old one it will be very common to want to use that space. The btrfs utility could easily suggest the most likely “btrfs filesystem resize” to make things easier for the user.

In a disturbing coincidence a few days after replacing the first 3TB disk the other 3TB disk started giving read errors. So I replaced the second 3TB disk with a 4TB disk and removed the 1TB disk to give a 4TB RAID-1 array. This is when it would be handy to have the metadata duplication feature and copies= option of ZFS.

Ctree Corruption

2 weeks ago a basic workstation with a 120G SSD owned by a relative stopped booting, the most significant errors it gave were “BTRFS: log replay required on RO media” and “BTRFS: open_ctree failed”. The solution to this is to run the command “btrfs-zero-log”, but that initially didn’t work. I restored the system from a backup (which was 2 months old) and took the SSD home to work on it. A day later “btrfs-zero-log” worked correctly and I recovered all the data. Note that I didn’t even try mounting the filesystem in question read-write, I mounted it read-only to copy all the data off. While in theory the filesystem should have been OK I didn’t have a need to keep using it at that time (having already wiped the original device and restored from backup) and I don’t have confidence in BTRFS working correctly in that situation.

While it was nice to get all the data back it’s a concern when commands don’t operate consistently.

Debian and BTRFS

I was concerned when the Debian kernel team chose 3.16 as the kernel for Jessie (the next Debian release). Judging by the way development has been going I wasn’t confident that 3.16 would turn out to be stable enough for BTRFS. But 3.16 is working reasonably well on a number of systems so it seems that it’s likely to work well in practice.

But I’m still deploying more ZFS servers.

The Value of Anecdotal Evidence

When evaluating software based on reports from reliable sources (IE most readers will trust me to run systems well and only report genuine bugs) bad reports have a much higher weight than good reports. The fact that I’ve seen kernel 3.16 to work reasonably well on ~6 systems is nice but that doesn’t mean it will work well on thousands of other systems – although it does indicate that it will work well on more systems than some earlier Linux kernels which had common BTRFS failures.

But the annoyances I had with the 3TB array are repeatable and will annoy many other people. The ctree coruption problem MIGHT have been initially caused by a memory error (it’s a desktop machine without ECC RAM) but the recovery process was problematic and other users might expect problems in such situations.

Tags: