Sysbench: read/write speed

The sysbench fileio test is useful for measuring I/O speed under a different usage pattern than bonnie++.

For example, the sysbench fileio test issues an fsync() every 100 write requests by default, but it also lets us select a much more aggressive pattern of one fsync() per write. The first case tries to mimic normal disk activity (eg: with fsync() called regularly by the OS), while the latter lets us mirror some database usage patterns (eg: each write must be synchronized to disk).
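To make the two patterns concrete, here is a minimal Python sketch of a writer that calls fsync() once every N writes. It is not how sysbench itself is implemented: the function name `write_blocks`, its parameters and the 200-block run are illustrative assumptions; only the 16 KB block size and the 100-writes-per-fsync default come from the text.

```python
import os
import tempfile


def write_blocks(path, n_blocks, block_size=16 * 1024, fsync_every=100):
    """Write n_blocks sequential blocks, calling fsync() after every
    `fsync_every` writes. Returns the number of fsync() calls issued.

    Hypothetical helper for illustration only; not part of sysbench.
    """
    block = b"\0" * block_size
    fsyncs = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for i in range(1, n_blocks + 1):
            os.write(fd, block)
            if i % fsync_every == 0:
                os.fsync(fd)  # force the data written so far to disk
                fsyncs += 1
    finally:
        os.close(fd)
    return fsyncs


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "testfile")
    # Default-like pattern: one fsync() every 100 writes.
    default_like = write_blocks(path, 200, fsync_every=100)
    # Database-like pattern: one fsync() after every single write.
    db_like = write_blocks(path, 200, fsync_every=1)
    print(default_like, db_like)
```

For the same amount of data, the second pattern issues 100 times as many fsync() calls, which is why it stresses the filesystem's journaling and allocation paths so much more.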

The first graph measures sequential read/write speeds, with both the 100 writes/one fsync() and the one write/one fsync() patterns:

Sysbench sequential read write speed

As you can see, with default settings (100 writes/one fsync()) all the contenders behave similarly. However, when issuing many more fsync() calls, we see that EXT3 is much quicker than the others (it scores almost 2X higher than XFS). We also see quite strange BTRFS behavior: while read speed should be unaffected by fsync() (it is a write-related operation), BTRFS loses a fair amount of read speed in this test. Why? As the read + fsync() test was performed after the write + fsync() one, it probably has to do with the way the write test lays out the data on disk: with an fsync() issued after each write, the FS has fewer chances to organize data on disk optimally.

Now, let's see random read/write speeds:

Sysbench random read write speed

We see a very mixed bag now. EXT4 seems the most penalized filesystem, while XFS seems the fastest.

However, a true random read speed of 23 MB/sec is well beyond the capabilities of a normal, mechanical hard disk. For example, consider an access time of ~10 ms: this gives us about 100 seeks/sec, and multiplying this number by the sysbench default block size (16 KB) we get a maximum of about 1600 KB/sec for reading truly random data a single block at a time. Even considering that by default sysbench uses only 2 GB of disk space, so the access time has the potential to go down to about 4 ms (250 seeks/sec, or about 4000 KB/sec), the XFS and BTRFS scores exceed the theoretical maximum achievable performance. I am not saying that this is a bad thing: after all, I checked the results many times and I can guarantee that these two FS really are faster than EXT3/4 in this test. However, these results indicate that the sysbench random test is probably not-so-random and that something else (eg: caching) can influence it.
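The back-of-the-envelope estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the function name is hypothetical, and the model assumes one seek per block read with no caching or read-ahead.

```python
def max_random_read_kb_per_s(access_time_ms, block_size_kb=16):
    """Upper bound on single-block random read throughput, assuming
    every block read costs one full access time (no cache hits)."""
    seeks_per_sec = 1000.0 / access_time_ms
    return seeks_per_sec * block_size_kb


full_stroke = max_random_read_kb_per_s(10)  # ~10 ms access time
short_stroke = max_random_read_kb_per_s(4)  # ~4 ms on a 2 GB region
measured = 23 * 1024                        # 23 MB/sec expressed in KB/sec

print(full_stroke, short_stroke, measured)
print(measured > short_stroke)
```

Even with the optimistic 4 ms access time, the bound is about 4000 KB/sec, several times lower than the 23 MB/sec measured, which supports the suspicion that something other than pure random disk access is at work.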

Also note that both XFS and BTRFS show higher read scores when I run the write + fsync() test first: it seems that the optimizations done when not issuing that many fsync() calls are counterproductive in this case. Perhaps I found a corner case where the delayed allocators of XFS and BTRFS are not so smart; by issuing one fsync() per write request, we force the FS to give up on delayed allocation and write the data immediately.

Speaking of write speed, we see that the fastest FS is, by far, EXT3. XFS is a very distant second, followed by the other two filesystems. Such a low (pseudo-)random write speed can badly affect write-intensive applications such as databases, high-volume logging programs and virtual machines.

UPDATE: while preparing the system for another benchmark, I noticed that, contrary to what is written in the Fedora 14 documentation, write barriers were not enabled on the EXT3 filesystem. Please read the updated "Conclusions" page.