ZFS, BTRFS, XFS, EXT4 and LVM with KVM – a storage performance comparison

Written by Gionatan Danti. Posted in Virtualization

Some considerations

First, don't get me wrong: this is not a trial against BTRFS. I would like to see it as the best-performing filesystem, as it has a ton of promising features. But hey, face reality: its performance here was really, really low. You can argue that I used too old a kernel, that enterprise hardware has BBU RAID cards, or that it should perform much better with SSDs.

Yes, yes and yes. But older kernels are a fact of life: enterprise distributions (such as Red Hat, CentOS and SUSE) don't ship with bleeding-edge kernels. Moreover, spinning disks are here to stay. Finally, BTRFS itself doesn't much like hardware RAID cards, as they interfere with retrieving the correct data in case of bit rot.

So, while I plan to run some tests on an SSD-equipped system and on an old but enterprise-grade server, BTRFS really had to perform reasonably well in the common case of 7200 RPM disks.

In the end, while BTRFS is very well suited to managing many small, rarely changing files (e.g. file servers, NAS), it copes poorly with large, frequently rewritten files (such as VM images and databases).

BTRFS apart, what else can we tell from this benchmark session?

Classical LVM (preallocated volumes) remains the safer bet, performance-wise. After all, such volumes present a mostly contiguous disk space to the guest system, reducing fragmentation. On the other hand, they are not very flexible: not only do you have no thin provisioning options, but dealing with raw volumes is always a little clunky (e.g. for backup purposes). Moreover, legacy LVM snapshots are quite slow. Still, LVM allows snapshotting a single volume/VM image, which limits the snapshot performance hit to that volume. So, if you are after the fastest setup, go with normal LVM volumes.
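As a rough sketch of the preallocated setup described above, the script below creates one fully allocated volume for a guest and a legacy snapshot of it. The volume group `vg_kvm`, the volume name `vm1` and both sizes are assumptions made for illustration, not values from the benchmark. The `run` helper is also hypothetical: it only prints each command unless `DRY_RUN=0` is set, since the real commands need root and an existing volume group.

```shell
#!/bin/sh
# Sketch only: vg_kvm, vm1 and the sizes are hypothetical, not from the benchmark.
# Commands are printed unless DRY_RUN=0 is set (running them for real needs root).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

classic_lvm_plan() {
    vg="$1"; lv="$2"; size="$3"; snap_size="$4"
    # Preallocated (thick) volume, handed to the guest as a raw block device
    run lvcreate -L "$size" -n "$lv" "$vg"
    # Legacy snapshot of that single volume; it needs its own preallocated space
    run lvcreate -s -L "$snap_size" -n "${lv}-snap" "${vg}/${lv}"
}

classic_lvm_plan vg_kvm vm1 50G 5G
```

Because the snapshot is taken per volume, only the guest being snapshotted pays the legacy snapshot penalty.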

Thin LVM volumes are a good compromise: their coarse default chunk size (64KB to 8MB, but often in the 512KB+ range) means less fragmented allocation than with CoW filesystems, and you get the added benefit of thin provisioning. Moreover, the thin snapshot implementation is way faster than the legacy one.
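For illustration, a thin pool with an explicit chunk size, a thin volume and a thin snapshot could be set up roughly as follows. The names `vg_kvm`, `pool0`, `vm1`, the sizes and the 512k chunk are assumptions picked for the example, not the benchmark configuration. The hypothetical `run` helper prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: vg_kvm, pool0, vm1, the sizes and the chunk size are hypothetical.
# Commands are printed unless DRY_RUN=0 is set (running them for real needs root).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

thin_lvm_plan() {
    vg="$1"; pool="$2"; lv="$3"; chunk="$4"
    # Thin pool with an explicit chunk size (the pool's allocation unit)
    run lvcreate --type thin-pool -L 200G --chunksize "$chunk" -n "$pool" "$vg"
    # Thin volume: 50G virtual size, backed by pool space only once written
    run lvcreate -V 50G --thinpool "$pool" -n "$lv" "$vg"
    # Thin snapshot: no -L size, it shares the pool with its origin volume
    run lvcreate -s -n "${lv}-snap" "${vg}/${lv}"
}

thin_lvm_plan vg_kvm pool0 vm1 512k
```

Omitting `--chunksize` lets LVM pick a default chunk size for the pool.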

An even more interesting setup is the ThinLVM + nozeroing + filesystem combo. You get all the advantages of thin volumes (with no zeroing-imposed speed degradation) together with a filesystem's typical ease of use. The fastest upper filesystem seems to be EXT4, but XFS remains a very good choice too.
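One possible way to assemble this combo is sketched below: zeroing is switched off on an existing thin pool, a thin volume is carved out of it, and an EXT4 filesystem is created on top to hold the image files. The pool `vg_kvm/pool0`, the volume name `images`, the 500G size and the mount point are all assumptions for the example. As before, the hypothetical `run` helper only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: vg_kvm, pool0, images, the size and the mount point are hypothetical.
# Commands are printed unless DRY_RUN=0 is set (running them for real needs root).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

thin_fs_plan() {
    vg="$1"; pool="$2"; lv="$3"; mnt="$4"
    # Turn off zeroing of newly provisioned chunks on an existing thin pool
    run lvchange --zero n "${vg}/${pool}"
    # One large thin volume that will hold the VM image files
    run lvcreate -V 500G --thinpool "$pool" -n "$lv" "$vg"
    # Regular filesystem on top (EXT4 here; mkfs.xfs would be the XFS variant)
    run mkfs.ext4 "/dev/${vg}/${lv}"
    run mount "/dev/${vg}/${lv}" "$mnt"
}

thin_fs_plan vg_kvm pool0 images /var/lib/libvirt/images
```

With zeroing disabled, newly provisioned chunks are not wiped first, so the trade-off is speed against the possibility of stale data showing up in blocks that were never written.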

ZFS was a pleasant surprise: it is a proven, fast filesystem under Solaris / FreeBSD, and its native Linux implementation is also very good (and fast). It offers a load of advanced features, even more than BTRFS (e.g. a working RAID5/6 equivalent, online deduplication, etc.). The only complaint is that, while it is a native Linux kernel module, it is not a 100% direct port: it is based on the Solaris Porting Layer, a layer that emulates many Solaris-specific APIs. This, in turn, means that it is not possible to include ZFS support directly in the mainline kernel. Anyway, if you plan to use ZFS, remember to enable xattr=sa and to use the latest RPM/DEB version provided by the project, as older versions had some rare, but nasty, bugs.
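Setting the xattr=sa property mentioned above might look like the sketch below. The dataset name `tank/vmimages` is an assumption for the example, and the hypothetical `run` helper only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: tank/vmimages is a hypothetical dataset name.
# Commands are printed unless DRY_RUN=0 is set (running them for real needs ZFS and root).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

zfs_xattr_plan() {
    ds="$1"
    # Store extended attributes as system attributes instead of hidden directories
    run zfs set xattr=sa "$ds"
    # Read the property back to confirm the setting
    run zfs get -H -o value xattr "$ds"
}

zfs_xattr_plan tank/vmimages
```

The property only applies to extended attributes written after the change, so it is best set before the dataset is populated.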

Comments   

 
#11 capsicum 2016-02-14 03:42
What are the structural details of the thin LVM arrangement? The KVM information I have gives a warning that thin provisioning is not possible with LVM pools. I am new to KVM and VMs, but I do know the traditional LVM structure (PV, VG, LV or thin LV, fs).
 
 
#12 Albert Henriksen 2016-02-15 21:40
In my own tests, BTRFS performance is more than 180 times faster if you do the following:

- Disable COW on the folder containing VM image files (to reduce write amplification)
- Disable QCOW2 and use sparse RAW for VM image files (to reduce fragmentation of extents apparently caused by QCOW2 block mapping algorithm)

Both tests were on a Linux 4.2 kernel. The QCOW2 cluster size was 64K in the test using QCOW2. I only tested with COW disabled. The performance difference is likely even greater with NOCOW + RAW versus COW + QCOW2.

To convert VM images, the following commands are useful:
$ chattr +C new_images/
$ truncate -s 100G new_images/vm1.raw
$ qemu-nbd -c /dev/nbd0 old_images/vm1.qcow2
$ dd conv=notrunc,sparse bs=4M if=/dev/nbd0 of=new_images/vm1.raw

Shut down virtual machines before conversion, change XML to point to new files and restart virtual machines when done.
 
 
#13 mt 2016-03-03 11:17
Quoting Albert Henriksen:
In my own tests, BTRFS performance is more than 180 times faster if you do the following:

- Disable COW on the folder containing VM image files (to reduce write amplification)
- Disable QCOW2 and use sparse RAW for VM image files (to reduce fragmentation of extents apparently caused by QCOW2 block mapping algorithm)


But that makes btrfs useless. No snapshots, no checksumming. It's fair to test with CoW - do you have any numbers for that?
 
 
#14 Sam 2016-05-23 00:54
Hello,

I'm taking it you forgot to mount BTRFS with compression enabled (which really should be the default)?

Can you please test BTRFS and make sure you're mounting with the compress=lzo option?
 
 
#15 Sam 2016-05-23 00:58
Also just saw your note about Kernel 3.10! - we run many hundreds of VMs and not a single production server is running a kernel this old, we run between 4.4 and 4.6 on CentOS 7.

QCOW2 is also very suboptimal for modern VMs; in reality you'd always use raw devices or logical volumes.

It would be interesting to see you re-run these tests using a modern kernel, say at least 4.4 and either raw block devices or logical volumes along with mounting BTRFS properly with the compress=lzo option
 
 
#16 Luca 2016-05-23 23:28
Great article, but pagination makes it painful to read
 
 
#17 Gionatan Danti 2016-05-24 15:22
@Sam

No, I did not use any compression (which, by the way, was disabled by default).

I stick to distribution-provided kernels when possible, and 3.10.x is the current kernel for RHEL7/CentOS7.

Finally, I agree that RAW images are marginally faster than preallocated QCOW2 files, and when possible I used them. However, for the block layer/filesystem combos which do not support snapshots, I used QCOW2 to have at least partial feature parity with the more flexible alternatives.
 
 
#18 Yonsy Solis 2016-05-30 16:28
OK, you try to use distribution-provided kernels when possible, but when you compare a filesystem from an external module (ZFS from ZFS on Linux) with another filesystem from the provided kernel (BTRFS), with the old characteristics that come with it, your comparison becomes invalid.

ZFS gets more updates without upgrading the kernel. This is not the case with BTRFS, which needs an updated kernel. The kernel version is important to know in this case (and it would need to be updated for a comparison relevant to enterprise distributions; Ubuntu 16.04 LTS, for example, ships a 4.4 kernel now).
 
 
#19 Brian Candler 2016-12-15 09:37
For "raw images ZFS", do you mean you created a zvol block device, or a raw .img file sitting in a zfs dataset (filesystem)?
 
 
#20 Gionatan Danti 2016-12-15 09:52
Quoting Brian Candler:
For "raw images ZFS", do you mean you created a zvol block device, or a raw .img file sitting in a zfs dataset (filesystem)?


The latter: raw images on a ZFS filesystem
 
