KVM cache modes overview, continued

We can use an image to better track how things work:

Guest / Host write caching

Let start from the beginning, assuming traditional (no-barrier-passing) behavior: when a guest-side application write something, data generally go into the guest-side pagecache. The pagecache will then be periodically flushed to the virtual disk device and so to the host-side disk image file. On the other side, if the application has to write some very important data, it can bypass the pagecache and use a synchronized write semantic where a write is supposed to return if and only if the write successfully committed all the data to the (virtual) permanent storage system.

Anyway, at this point (writes flushed to virtual disk device) we have three possibilities:

  • if the cache policy is set to “writeback”, data will be cached in the host-side pagecache (red arrow);
  • if the cache policy is set to “none”, data will be immediately flushed to the physical disk/controller cache (gray arrow);
  • if the cache policy is set to “writethrough”, data will be immediately flushed to the physical disk platters (blue arrow).

Note that the only 100% safe choice is the write-through setting, as the others will not guarantee that a write returns only after the data are committed to permanent physical storage. This is not always a problem: after all, on a classic, not-virtualized system you don't have any guarantee that normal write will be immediately stored to disk. However, as we stated above, some writes must be committed to disk or significant loss will occur (eg: think to a database system or filesystem journal commits). These important writes are generally marked as “synchronous” ones and are executed with the O_SYNC or similar flags, meaning that the system is supposed to return only when all data are successfully committed to the permanent storage system.

Unfortunately, when we speak about the virtualized guest the only cache setting that guarantee a 100% permanent write, the writethrough mode, is the slower one (as you will find soon). This means that you had to make a choice between safety and performance, with the no-cache mode often used because, while not 100% safe, it was noticeably safer than write-back cache.

Now things have changed: newer KVM releases enable a “barrier-passing” feature that assure a 100% permanent data storage for guest-side synchronized writes, regardless of the caching mode in use. This means that we can potentially use the high-performance “writeback” setting without fear of data loss (see the green arrow above). However your guest operating system had to use barriers in the first place: this means that for most EXT3-based Linux distributions (as Debian) you had to manually enable barriers or use a filesystem with write barriers turned on by default (most notably EXT4).

If you virtualize an old operating system without barriers support, you had to use the write-through cache setting or at most the no-cache mode. In the latter case you don't have 100% guarantee that synchronized writes will be stored to disk; however, if your guest OS didn't support barriers, it is intrinsics unsafe on standard hardware also. So the no-cache mode seems a good bet for these barrier-less operating system, specially considering the high performance impact of write-through cache mode.

Ok, things are always wonderful in theory, but how well the new write-barrier-passing feature works in a practical environment? We will see that in a moment...