I/O benchmarks: HDTune benchmark
The second test is HD Tune Pro's benchmark, which lets us examine the bandwidth and access time of the primary disk when reading 512 byte, 4 KB, 64 KB and 1024 KB sized chunks. Generally, the small-chunk benchmarks are access-time bound, while the 1024 KB benchmark is peak-bandwidth bound.
Let's see the results (please keep in mind that the HD Tune benchmarks are read-only tests):
Can you even see the 512 byte and 4 KB results? This graph is typical of a server equipped with mechanical disks: the smallest-chunk benchmarks are dominated by seek time (which is a constant, unrelated to chunk size). This is the main reason why flash-based SSDs are so fast ;)
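Why seek time flattens the small-chunk results can be seen with a back-of-the-envelope model: each read pays a fixed seek/rotational delay, then streams the chunk at the platter's sequential speed. The 8 ms seek and 100 MB/s bandwidth below are illustrative numbers for a generic mechanical disk, not measurements from this article:

```python
SEEK_S = 0.008   # illustrative average seek + rotational latency, seconds
SEQ_BW = 100e6   # illustrative sequential bandwidth, bytes/second

def throughput(chunk_bytes):
    """Effective read throughput (bytes/s) when every chunk needs a seek."""
    time_per_read = SEEK_S + chunk_bytes / SEQ_BW
    return chunk_bytes / time_per_read

for size in (512, 4 * 1024, 64 * 1024, 1024 * 1024):
    print(f"{size:>8} B chunks -> {throughput(size) / 1e6:6.2f} MB/s")
```

With these numbers the 512 byte and 4 KB chunks reach well under 1% of the sequential bandwidth, while the 1024 KB chunks get more than half of it, which is exactly the shape of the graph above.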
However, we can still isolate hypervisor speed, and we can see that Xen and KVM are considerably slower than VMware, which in turn is slightly slower than VirtualBox across the board. Another view of the same data is the following:
This time, we are not measuring performance in KB/s, but in IOPS (I/O operations per second). As you can see, VirtualBox delivers the greatest number of IOPS in the three small-chunk benchmarks, while it is only slightly behind the leader (VMware) in the 1024 KB test.
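The two views are just a unit conversion: IOPS equals bandwidth divided by chunk size. A minimal sketch (the sample figures are illustrative, not taken from the graphs):

```python
def iops(kb_per_s, chunk_kb):
    """Convert a measured bandwidth (KB/s) into I/O operations per second."""
    return kb_per_s / chunk_kb

# e.g. an illustrative 2048 KB/s measured while reading 4 KB chunks:
print(iops(2048, 4))   # 512 IOPS
```

This is why a disk with flat small-chunk bandwidth still shows very different IOPS counts at different chunk sizes.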
Finally, we want to see the disks' total access time (seek time + read latency):
This graph is yet another view of the same data: VirtualBox is the leader, closely followed by VMware, while KVM and especially Xen are significantly slower.
Do you remember that I ran these tests invalidating the host-side cache each time? What would the I/O results look like if we skipped that step and instead let the host-side cache do its work? If you are curious, these graphs are for you...
I/O operations per second:
Access time:
Wow, if a real physical disk could really guarantee that sort of results, it would be a best buy ;)
Seriously speaking, these results do not really come from the physical disk subsystem: they are a cache-to-cache copy or, if you prefer, a host-memory-to-guest-memory copy. In other words, these results really show the hypervisor's I/O overhead.
VMware is the indisputable leader, with VirtualBox in second place and Xen, far behind, in third. The most interesting thing, however, is how incredibly slow KVM is: its results are only a little better than the non-cached version. What can be the culprit here? Was the cache disabled? I think not, because the results are better than the non-cached ones – they are just barely better. Could it be a slow block subsystem? Yes, it could, but the 1024 KB test also seems to suggest slow host-to-guest memory copy performance. Whatever the cause, KVM is simply a full light-year behind in the cached tests.
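An aside on methodology: the earlier, non-cached runs required invalidating the host-side cache between tests. The article does not state the exact mechanism used; on a Linux host, one non-root way to evict a single file's pages from the page cache is `posix_fadvise(POSIX_FADV_DONTNEED)` (as opposed to the global, root-only `echo 3 > /proc/sys/vm/drop_caches`). A sketch with a hypothetical helper name:

```python
import os

def drop_file_cache(path):
    """Hint the Linux kernel to evict this file's pages from the page cache.

    Hypothetical helper for illustration; dirty pages should be flushed
    (e.g. with os.sync()) beforehand, or they may remain cached.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

Running this against a VM's disk image between benchmark passes would give each pass a cold host cache.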
UPDATE: a recent article comparing KVM vs VirtualBox can be found here: http://www.ilsistemista.net/index.php/virtualization/12-kvm-vs-virtualbox-40-on-rhel-6.html