ZFS, BTRFS, XFS, EXT4 and LVM with KVM – a storage performance comparison
Many configurations to test...
I/O under Linux can be configured in a multitude of ways, even more so when you factor in how libvirt/KVM supports high-level file containers such as Qcow2, QED and the like. So, while I would like to give you a broad view of the various configurations, I cannot test every option. I restricted myself to testing only a subset of possible storage configurations, and I explicitly excluded the options that do not support snapshots. So, while attaching a direct MBR- or GPT-style partition to a VM can be the fastest option (and sometimes makes sense), I did not consider this case here.
For the same reason, I generally used the distribution's default parameters. Unless noted otherwise, I benchmarked both fat and thin provisioning configurations.
The tested scenarios are listed below (a rough setup sketch follows the list):
1) Qcow2 backend on top of an XFS filesystem on top of a raw MD device. Both thin and partial (metadata-only) preallocation modes were benchmarked;
2) Logical Volume backend, both in classical LVM (fat preallocation) and thin (thin LVM target) modes. Moreover, thin LVM was analyzed with both zeroing on and off;
3) raw images on XFS and EXT4 on top of classical LVM, relying on filesystem sparse-file support for thin provisioning;
4) raw images on XFS and EXT4 on top of thin LVM, relying on the thin LVM target for thin provisioning. In this case, LVM zeroing was disabled as the to-be-zeroed blocks are directly managed inside the filesystem structures;
5) raw images on BTRFS on top of its own mirror+stripe implementation (no MD here). I benchmarked BTRFS with CoW both enabled and disabled (nodatacow mount option);
6) raw images on ZFS on top of its own mirror+stripe implementation (no MD again).
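For illustration only, here is a rough sketch of how such backends can be created; the device, volume group, pool and path names are hypothetical and the sizes arbitrary, not the exact commands used for the benchmark:
# scenario 1: Qcow2 with metadata-only preallocation on an XFS filesystem
$ qemu-img create -f qcow2 -o preallocation=metadata /var/lib/libvirt/images/vm1.qcow2 100G
# scenario 2: classical (fat) LV versus a thin LV in a thin pool, with pool zeroing toggled
$ lvcreate -L 100G -n vm1_fat vg_kvm
$ lvcreate -L 500G --thinpool pool0 vg_kvm
$ lvchange -Z n vg_kvm/pool0
$ lvcreate -V 100G --thin -n vm1_thin vg_kvm/pool0
# scenarios 3/4: sparse raw file on XFS or EXT4
$ truncate -s 100G /var/lib/libvirt/images/vm1.raw
# scenario 5: BTRFS mirror+stripe (raid10), with CoW disabled via the nodatacow mount option
$ mkfs.btrfs -d raid10 -m raid10 /dev/sdb /dev/sdc /dev/sdd /dev/sde
$ mount -o nodatacow /dev/sdb /var/lib/libvirt/images
# scenario 6: a ZFS striped-mirror pool with a dataset holding raw images
$ zpool create tank mirror sdb sdc mirror sdd sde
$ zfs create tank/images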
OK, let's see how things add up...
Comments
- Disable COW on the folder containing VM image files (to reduce write amplification)
- Disable QCOW2 and use sparse RAW for VM image files (to reduce fragmentation of extents apparently caused by QCOW2 block mapping algorithm)
Both tests were on a Linux 4.2 kernel. The QCOW2 cluster size was 64K in the test using QCOW2. I only tested with COW disabled. The performance difference is likely even greater with NOCOW + RAW versus COW + QCOW2.
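(For reference, a 64K cluster size, which is also the qemu-img default, can be set explicitly at image creation time; the path below is just an example:)
$ qemu-img create -f qcow2 -o cluster_size=64k old_images/vm1.qcow2 100G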
To convert VM images, the following commands are useful:
$ chattr +C new_images/
$ truncate -s 100G new_images/vm1.raw
$ qemu-nbd -c /dev/nbd0 old_images/vm1.qcow2
$ dd conv=notrunc,sparse bs=4M if=/dev/nbd0 of=new_images/vm1.raw
$ qemu-nbd -d /dev/nbd0
Shut down the virtual machines before conversion, change the libvirt XML to point to the new files and restart the virtual machines when done.
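(A quick way to check that the converted raw file really stays sparse is to compare its apparent size with the space actually allocated; paths as above:)
$ ls -lh new_images/vm1.raw    # apparent size, should show 100G
$ du -h new_images/vm1.raw     # blocks actually allocated, should be much smaller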
But that makes btrfs useless. No snapshots, no checksumming. It's fair to test with CoW - do you have any numbers for that?
I take it you forgot to mount BTRFS with compression enabled (which really should be the default)?
Can you please test BTRFS and make sure you're mounting with the compress=lzo option?
QCOW2 is also very suboptimal for modern VMs; in reality you'd always use raw devices or logical volumes.
It would be interesting to see you re-run these tests using a modern kernel, say at least 4.4, with either raw block devices or logical volumes, and with BTRFS properly mounted using the compress=lzo option.
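(For reference, a sketch of the suggested mount option; the device and mount point below are hypothetical:)
$ mount -o compress=lzo /dev/sdb /var/lib/libvirt/images
# or persistently, via /etc/fstab:
# /dev/sdb  /var/lib/libvirt/images  btrfs  compress=lzo  0 0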
No, I did not use any compression (which, by the way, was disabled by default).
I stick to distribution-provided kernels when possible, and 3.10.x is the current kernel for RHEL7/CentOS7.
Finally, I agree that RAW images are marginally faster than preallocated QCOW2 files, and when possible I used them. However, for the block layer/filesystem combos which do not support snapshots, I used QCOW2 to have at least partial feature parity with the more flexible alternatives.
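(To illustrate the feature-parity point: an internal snapshot can be taken directly on a QCOW2 image, something a plain raw file on a non-snapshotting block layer cannot offer; the file and snapshot names below are hypothetical:)
$ qemu-img snapshot -c clean_install vm1.qcow2
$ qemu-img snapshot -l vm1.qcow2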
ZFS gets more updates without upgrading the kernel. This is not the case with BTRFS, which needs an updated kernel. The kernel version is important to know in this case (and it will need to be updated for the comparison to be relevant to Enterprise distributions; Ubuntu 16.04 LTS, for example, now ships a 4.4 kernel).
The latter: raw images on a ZFS filesystem