ZFS on Debian/Wheezy

As storage capacities increase, so do the probability of data corruption and the amount of time required for an fsck on a traditional filesystem. Disk capacity is also increasing a lot faster than contiguous IO speed, which means that RAID rebuild times keep growing. For example my first hard disk was 70M with a transfer rate of 500K/s, so the entire contents could be read in a mere 140 seconds! The last time I tested a more recent disk, a 1TB SATA disk gave contiguous transfer rates ranging from 112MB/s down to 52MB/s, so reading the entire contents took 3 hours and 10 minutes, and the problem is worse with newer, bigger disks. The long rebuild times make greater redundancy more desirable.

BTRFS vs ZFS

Both BTRFS and ZFS checksum all data to cover the case where a disk returns corrupt data, neither needs an fsck program, and the combination of checksums and built-in RAID means that they should have less risk of data loss due to a second failure during a rebuild. ZFS supports RAID-Z, which is essentially RAID-5 with checksums on all blocks to handle the case of corrupt data, as well as RAID-Z2, which is the equivalent of RAID-6. RAID-Z is quite important if you don’t want to have half your disk space taken up by redundancy, and RAID-Z2 if you want your data to survive the loss of more than one disk, so until BTRFS has equivalent features ZFS offers significant benefits. Also BTRFS is still rather new, which is a concern for software that is critical to data integrity.

I am about to install a system to be a file server and Xen server which probably isn’t going to be upgraded a lot over the next few years. It will have 4 disks so ZFS with RAID-Z offers a significant benefit over BTRFS for capacity and RAID-Z2 offers a significant benefit for redundancy. As it won’t be upgraded a lot I’ll start with Debian/Wheezy even though it isn’t released yet because the system will be in use without much change well after Squeeze security updates end.
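
As a rough sketch of that trade-off with 4*3TB disks, the two pool types would be created along these lines (the pool name “tank” and the device names are just for illustration):

zpool create tank raidz /dev/sda /dev/sdb /dev/sdc /dev/sdd # single parity, about 9TB usable, survives 1 failed disk
zpool create tank raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd # double parity, about 6TB usable, survives 2 failed disks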

ZFS on Wheezy

Getting ZFS to basically work isn’t particularly hard, the ZFSonLinux.org site has the code and reasonable instructions for doing it [1]. The zfsonlinux code doesn’t compile out of the box on Wheezy although it works well on Squeeze. I found it easier to get the latest Ubuntu working with ZFS and then rebuild the Ubuntu packages for Debian/Wheezy, which worked well. This wasn’t particularly difficult, but it’s a pity that the zfsonlinux site didn’t support recent kernels.
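
For anyone repeating this, the rebuild went roughly as follows; this is a sketch only, as the source package name and version are placeholders for whatever the current Ubuntu packages are:

apt-get install build-essential devscripts debhelper dkms # build dependencies
dpkg-source -x zfs-linux_X.Y.Z.dsc # unpack the downloaded Ubuntu source package (placeholder name)
cd zfs-linux-X.Y.Z
dpkg-buildpackage -us -uc # build unsigned packages for Wheezy

The same needs to be done for the matching spl source package before the zfs modules will build.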

Root on ZFS

The complication with root on ZFS is that the ZFS FAQ recommends using whole disks for best performance so you can avoid alignment problems on 4K sector disks (which is an issue for any disk large enough that you want to use it with ZFS) [2]. This means you have to either use /boot on ZFS (which seems a little too experimental for me) or have a separate boot device.

Currently I have one server running with 4*3TB disks in a RAID-Z array and a single smaller disk for the root filesystem. Having a fifth disk attached by duct-tape to a system that is only designed for four disks isn’t ideal, but when you have an OS image that is backed up (and not so important) and a data store that’s business critical (but not needed every day) then a failure on the root device can be fixed the next day without serious problems. But I want to fix this and avoid creating more systems like it.

There is some good documentation on using Ubuntu with root on ZFS [3]. I considered using Ubuntu LTS for the server in question, but as I prefer Debian and I can recompile Ubuntu packages for Debian it seems that Debian is the best choice for me. I compiled those packages for Wheezy, did the install and DKMS build, and got ZFS basically working without much effort.

The problem then became getting ZFS to work for the root filesystem. The Ubuntu packages didn’t work with the Debian initramfs for some reason and modules failed to load. This wasn’t necessarily a show-stopper as I can modify such things myself, but it’s another painful thing to manage and another way that the system can potentially break on upgrade.

The next issue is the unusual way that ZFS mounts filesystems. Instead of having block devices to mount and entries in /etc/fstab the ZFS system does things for you. So if you want a ZFS volume to be mounted as root you configure the mountpoint via the “zfs set mountpoint” command. This of course means that it doesn’t get mounted if you boot with a different root filesystem and adds some needless pain to the process. When I encountered this I decided that root on ZFS isn’t a good option. So for this new server I’ll install it with an Ext4 filesystem on a RAID-1 device for root and /boot and use ZFS for everything else.
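
For example, designating a ZFS filesystem as the root filesystem involves something like this (the pool and filesystem names are assumptions for illustration):

zfs set mountpoint=/ tank/root # mounted by the ZFS tools, not via /etc/fstab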

Correct Alignment

After setting up the system with a 4 disk RAID-1 (or mirror for the pedants who insist that true RAID-1 has only two disks) for root and boot I then created partitions for ZFS. According to the fdisk output the partitions /dev/sda2, /dev/sdb2, etc had their first sector address as a multiple of 2048. As 2048 512-byte sectors is 1MiB, every partition starts on a 4K boundary, which addresses the alignment requirement for a disk that has 4K sectors.
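
Two quick ways of checking this (using the same device names as above):

fdisk -lu /dev/sda # the Start column for sda2 should be a multiple of 2048
parted /dev/sda align-check optimal 2 # reports whether partition 2 is optimally aligned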

Installing ZFS

deb http://www.coker.com.au wheezy zfs

I created the above APT repository (AMD64 only) for ZFS packages based on Darik Horn’s Ubuntu packages (thanks for the good work, Darik). Installing zfs-dkms, spl-dkms, and zfsutils gave a working ZFS system. I could probably have used Darik’s binary packages, but I think it’s best to rebuild Ubuntu packages for use on Debian.
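
The steps on a fresh system amount to something like the following sketch (assuming the key for the repository has been added to APT, see the comments below):

echo 'deb http://www.coker.com.au wheezy zfs' >> /etc/apt/sources.list
apt-get update
apt-get install zfs-dkms spl-dkms zfsutils # DKMS builds the modules for the running kernel
modprobe zfs
zpool status # reports “no pools available” until a pool is created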

The server in question hasn’t gone live in production yet (it turns out that we don’t have agreement on what the server will do). But so far it seems to be working OK.

13 comments to ZFS on Debian/Wheezy

  • I’m going to try building ZFS for Open Pandora (little Linux handheld). I have two Pandoras, one with 512MB RAM and one with 256MB RAM. I want it to work with 1TB disks over USB. Wish me luck in this insane undertaking! :P
    Google says: No results found for “ZFS on a phone”, no results found for “ZFS on a handheld”, so maybe I will be pioneering something here. I have heard that it did run okay on a PIII with 512MB, so that’s hopeful.

  • etbe

    My one ZFS production server used to have 4G of RAM and it was repeatedly getting OOM kernel panics even though I had reduced the ARC size to 512M. The server has since been upgraded to 12G of RAM; the extra RAM solves that problem, generally improves performance, and is cheaper than spending time working on it.

    I predict a poor result from trying to run ZFS on 512M.
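
    Reducing the ARC size can be done via a module option; a minimal sketch assuming the zfsonlinux zfs_arc_max parameter (512M expressed in bytes):

    echo "options zfs zfs_arc_max=536870912" > /etc/modprobe.d/zfs.conf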

  • Martin

    Many thanks for this piece. I have had zfs-fuse running for a couple of years but did not see the need to compile the zfsonlinux code. Since I am a lazy guy it seems I have waited long enough ;).

    If I experience any problems I will report back to you.

    I just love this sh*t – no home like my GNU/Linux boxes.

    Regards,
    Martin

  • Yaroslav Halchenko

    You could also have given a shout to the kfreebsd port of Debian, which has ZFS support out of the box. I wondered why not?

  • etbe

    Yaroslav: I didn’t know that kfreebsd had such support out of the box. That’s great news and I’ll keep it in mind next time I have to install a dedicated file server system.

    But for the systems I’m running right now I need to run lots of Linux applications. While kfreebsd is pretty good at doing that, there will be subtle differences, and having a slightly different system adds some extra work.

    Thanks for the comment, I’ll try to remember to give the kfreebsd people a shout in future!

  • etbe

    http://news.ycombinator.com/item?id=4356053

    ycombinator has an interesting discussion about ZFS which references this post, blumentopf says “several high-profile people in the community, Russell Coker for one [1], have started using this on production systems”.

  • My Pandora port has not succeeded yet! I built it okay (built Linux and ZFS on the device), but it does not work yet. Perhaps because Pandora is using a pre-empt kernel. In other news, I built 9base (plan 9 libs and tools) on my N900 phone, I’m not quite sure why. That only took 4 minutes, including compiling libc. Plan 9 code is nice and small!

  • Andreas

    As Wheezy was already frozen, ZFS will probably not make it into the upcoming stable release, but will it possibly be in Jessie?

  • Hi!

    I found this fantastic! I immediately went to a test machine
    and added the sources ….
    But I am getting 404 errors …!
    Any ideas?

    Thanks anyway!
    br,
    ++mabra

  • etbe

    mabra: Are the 404s from important things or just the i18n? I don’t use an i18n directory in my repository.

  • Hi!

    To your answer: I never mentioned ‘i18n’ and do not understand what this means. Now, just to be sure, I gave it
    a second try and got a PGP verification error, which I ignored. Installing ‘zfsutils’ now fails with: ‘E: Some packages could not be authenticated’.

    Do you know something about that? Do I have to register some PGP key, and if yes, how?

    Thanks anyway!

    br,
    ++mabra

  • etbe

    Run “gpg --recv-key F5C75256”, check that the fingerprint matches the below with “gpg --fingerprint F5C75256”, and then as root run “gpg --export F5C75256 | apt-key add -”.

    pub 1024D/F5C75256 1998-10-11
    Key fingerprint = D51E 60CA 9899 6009 F1A4 B4CD C2B0 79FC F5C7 5256
    uid Russell Coker <russell@coker.com.au>
    uid Russell Coker <etbe@debian.org>

  • Peter

    Many thanks for this!

    I built a system with /boot and / as ext4 filesystems and a zpool with some ZFS filesystems. After creating them everything was mounted and looked good. But after a reboot no ZFS was mounted again. To do it manually I can simply call “$ zfs mount ”. But what would be a good automatic way?

    I read about the mountall package from the PPA https://launchpad.net/~zfs-native/+archive/stable but as far as I can see it’s not in your repo. How do you do mounting during boot?