Total benchmark run time
This article focuses on how well the host machine manages an ever-increasing number of virtual machines. To present realistic results, I ran the benchmark using 1, 2 or 3 tiles (4, 8 or 12 VMs).
The first graph depicts total wall-clock run time, i.e. how much time a complete benchmark run needs:
This first result is eloquent: enabling the write-back cache translates into much lower execution time, to the point that a 3-tile setup (12 virtual machines) performs better than a 2-tile setup (8 virtual machines) without caching.
But where does the wb-enabled case gain the most?
As you can see, the filecopy benchmark speeds up the most. This was expected: the apache benchmark is basically CPU-bound, while sysbench's complex test is fsync-write bound, a situation where a write-back cache is of little help. Still, the increased filecopy speed is a very nice bonus.
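To make the difference between the two write patterns concrete, here is a minimal Python sketch. It is not part of the benchmark suite used in this article, and the names `timed_writes` and `compare` are purely illustrative. It writes the same data twice: once as plain buffered writes (roughly what a file copy does) and once with an fsync after every block (roughly what an fsync-bound database test does). The absolute numbers depend entirely on the machine and filesystem it runs on.

```python
import os
import tempfile
import time


def timed_writes(path, chunks, chunk_size=4096, sync_each=False):
    """Write `chunks` blocks to `path`, optionally fsync-ing after each one.

    Returns the elapsed wall-clock time in seconds.
    """
    block = b"\0" * chunk_size
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(chunks):
            f.write(block)
            if sync_each:
                f.flush()             # push Python's buffer to the OS
                os.fsync(f.fileno())  # ask the OS to commit to stable storage
    return time.perf_counter() - start


def compare(chunks=200, chunk_size=4096):
    """Time buffered vs. fsync-per-block writes in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        buffered = timed_writes(
            os.path.join(tmp, "buffered.bin"), chunks, chunk_size)
        synced = timed_writes(
            os.path.join(tmp, "synced.bin"), chunks, chunk_size, sync_each=True)
    return buffered, synced


if __name__ == "__main__":
    buffered, synced = compare()
    print(f"buffered: {buffered:.4f}s, fsync per block: {synced:.4f}s")
```

In the buffered case the writes can be collected in memory and flushed later, which is exactly the kind of work a write-back cache can absorb. In the fsync case every block has to be committed before the next one is written, so there is much less for a cache to hide.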
Did you notice how emails seem to take basically no time? This depends on how the SMTP protocol works: even when overloaded by other activities, postfix tries hard to queue all incoming emails for later delivery. This delayed delivery phase is not directly timed, but it is another source of fsync-heavy writes.
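The effect of "accept now, deliver later" on the measured time can be shown with a toy model. This is only a sketch of the general idea, not how postfix is actually implemented: the functions `accept_all` and `deliver_all` are hypothetical, and an in-memory deque with a sleep stands in for the real on-disk mail queue and delivery I/O.

```python
import time
from collections import deque


def accept_all(messages, queue):
    """Timed phase: enqueue each message and return how long acceptance took."""
    start = time.perf_counter()
    for msg in messages:
        queue.append(msg)  # the sender is done as soon as the message is queued
    return time.perf_counter() - start


def deliver_all(queue, per_message_cost=0.001):
    """Untimed phase: drain the queue, simulating the slower delivery work."""
    delivered = 0
    while queue:
        queue.popleft()
        time.sleep(per_message_cost)  # stand-in for delivery I/O
        delivered += 1
    return delivered


if __name__ == "__main__":
    queue = deque()
    accept_time = accept_all([f"mail-{i}" for i in range(100)], queue)
    print(f"accepted 100 messages in {accept_time:.6f}s")
    print(f"delivered {deliver_all(queue)} messages afterwards")
```

Only the first phase would show up in a timing like the one graphed here; the delivery work still happens, just outside the measured window.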