A small detour: buffering vs caching

While this page is not strictly needed to understand the article, I feel it is important to clarify some terminology.

As you probably know, modern operating systems tend to cache data aggressively, using as much otherwise unused memory as they have at their disposal.

Indeed, issuing the “free” command on a Linux terminal shows something similar to this:

             total       used       free     shared    buffers     cached
Mem:       7801072     210632    7590440          0       2296      23004
-/+ buffers/cache:     185332    7615740
Swap:      8388600          0    8388600
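
As a quick sanity check on the numbers above, the “-/+ buffers/cache” row can be recomputed from the “Mem:” row: memory held by buffers and cache is subtracted from “used” and added to “free”. The small Python sketch below does just that; the function name is my own invention, and the values are simply copied from the output above, in the same units that free printed.

```python
def buffers_cache_row(used, free, buffers, cached):
    """Recompute the '-/+ buffers/cache' row from the 'Mem:' row."""
    used_by_apps = used - buffers - cached    # memory really held by applications
    free_for_apps = free + buffers + cached   # memory that could be reclaimed if needed
    return used_by_apps, free_for_apps


# Values copied from the 'Mem:' row of the sample output.
print(buffers_cache_row(used=210632, free=7590440, buffers=2296, cached=23004))
# -> (185332, 7615740)
```

The result matches the second row of the output, which is why that row is often read as "what applications really use".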

Notice the “buffers” and “cached” columns: what do they mean?

First, we should note that, in Linux, every time you write to a file you are using the pagecache, while when you write directly to a block device (e.g. a logical volume), bypassing the filesystem, you are using buffering.

Unsurprisingly, the “cached” column represents the filesystem blocks cached for later reuse (the so-called “pagecache”). When you read something from a file, its content ends up not only in your application, but also in the pagecache. When you write something, your writes generally land in the pagecache first and hit the disks only some seconds later.
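
To make the write path a bit more concrete, here is a minimal Python sketch, assuming a Linux system and a regular file on a normal filesystem. It writes some data, reads it back right away (a read that is normally served from the pagecache, whether or not the data has reached the disk yet) and only then asks the kernel to write the dirty pages out with os.fsync(). The function name and the file name are hypothetical, chosen only for illustration.

```python
import os
import tempfile


def write_then_sync(path, payload):
    """Write payload to path, read it back, then force it to stable storage."""
    with open(path, "wb") as f:
        f.write(payload)   # lands in Python's own userspace buffer
        f.flush()          # handed to the kernel: the data is now in the pagecache
        # This read is normally satisfied by the pagecache,
        # even if nothing has been written to the disk yet.
        with open(path, "rb") as reader:
            seen = reader.read()
        os.fsync(f.fileno())  # ask the kernel to write the dirty pages out now
    return seen


with tempfile.TemporaryDirectory() as tmp:
    data = b"hello pagecache\n"
    assert write_then_sync(os.path.join(tmp, "demo.bin"), data) == data
```

Without the explicit os.fsync() call, the kernel would still write the data out, just at a moment of its own choosing.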

How about the “buffers” column? Buffers are closely related to caching, but serve a somewhat different purpose: while a cache explicitly retains data until it becomes stale or is forcibly flushed, a buffer retains data only for the shortest time needed to transfer it efficiently to/from the backing device. In other words, buffers are a necessity due to hardware constraints: for example, as small disk transfers are quite inefficient, a buffer accumulates smaller writes until the backing block device is closed. At that point, it flushes the data to the backing device and releases the memory allocated for buffering. By contrast, a cache accumulates writes up to a much higher threshold, basically ignoring the close syscall and, in order to improve future reads, it keeps a local copy of the written data even after it has been flushed to the backing device.
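
The distinction described above can be sketched with two toy Python classes. This is only an illustration of the idea, not a model of the actual kernel code: the class names are made up, a bytearray stands in for the backing device, and the cache's flush() is assumed to be triggered by some threshold or timer that is not modelled here.

```python
class ToyBuffer:
    """Accumulates small writes; on close, flushes them and releases the memory."""

    def __init__(self, device):
        self.device = device          # a bytearray standing in for the block device
        self.pending = bytearray()

    def write(self, chunk):
        self.pending += chunk         # small writes pile up in memory

    def close(self):
        self.device += self.pending   # one large transfer to the device
        self.pending = bytearray()    # nothing is retained afterwards

    def held(self):
        return len(self.pending)


class ToyCache:
    """Ignores close; after a flush it keeps a copy to serve future reads."""

    def __init__(self, device):
        self.device = device
        self.dirty = bytearray()      # written, not yet on the device
        self.clean = bytearray()      # already on the device, kept for reads

    def write(self, chunk):
        self.dirty += chunk

    def close(self):
        pass                          # closing does not trigger a flush

    def flush(self):
        self.device += self.dirty
        self.clean += self.dirty      # the copy survives the flush
        self.dirty = bytearray()

    def held(self):
        return len(self.dirty) + len(self.clean)


disk_a, disk_b = bytearray(), bytearray()
buf, cache = ToyBuffer(disk_a), ToyCache(disk_b)
for chunk in (b"ab", b"cd", b"ef"):
    buf.write(chunk)
    cache.write(chunk)
buf.close()     # the buffer is written out and emptied
cache.close()   # the cache still holds its dirty data
cache.flush()   # written out later, but the copy stays in memory
print(buf.held(), cache.held())
```

After the final flush both toy devices contain the same bytes; the only difference is how much data each object still holds in memory.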

As my KVM setup uses LVM-based virtual disks, you may wonder why, in the rest of the article, I speak about “cache” and not “buffer”. The point is that buffers can be effectively used as a long-term cache. Remember what I wrote above? Buffers are flushed when the underlying device is closed via a close() syscall. This means that if we don't close the device, buffers remain in place and data can be written to and read from them directly, rather than from the device. This is precisely what happens with Qemu/KVM on top of an LVM-based disk: while the qemu process is running, the buffers retain both read and written blocks, acting as a true cache. It's worth noting that a simple VM reboot will not drain the buffers, as the qemu process is still running. In order to completely discard the accumulated buffers, you have to shut down the VM (or kill the qemu process running it). This is the only significant difference between the buffer-backed LVM virtual disk and a real pagecache-backed file-based virtual disk: in the latter case, a shutdown will not drain the cache, so a subsequent VM start benefits from the old, still valid, data in cache.
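
If you want to watch this behaviour on your own host, one possible approach is to sample the “Buffers” and “Cached” fields of /proc/meminfo before and after shutting down the VM. The Python sketch below is a minimal helper for that, assuming a Linux host; the function names are my own and are not part of any tool used in this article.

```python
import os


def parse_meminfo(text):
    """Return {field: value} for the 'Name:   123 kB' lines of /proc/meminfo."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])   # values are reported in kB
    return fields


def buffers_and_cached(path="/proc/meminfo"):
    """Read the current 'Buffers' and 'Cached' values from the given file."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    return info.get("Buffers"), info.get("Cached")


# Only meaningful on Linux, where /proc/meminfo exists.
if os.path.exists("/proc/meminfo"):
    print(buffers_and_cached())
```

Calling buffers_and_cached() once while the VM is running and once after it has been shut down should show the first value shrinking for an LVM-backed disk, keeping in mind that other activity on the host affects both numbers as well.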

In short: while I am using LVM-based virtual disks that are not strictly pagecache-backed, the current Linux buffers implementation is, in this case, very similar to a classical cache. But if they are the same, why spend an entire page arguing about the terminology? The answer is simple: historically, pagecache was somewhat more CPU-hungry than buffering. The difference is very small, to the point that with a modern CPU it is negligible, but I am quite picky when describing benchmark results :)