Fragmentation: Linux kernel untar
We speak of fragmentation when a single file is split into multiple noncontiguous parts, called fragments. On a standard mechanical drive each head seek is very expensive (5-15 ms on most products, while the clock period of even an entry-level CPU is measured in nanoseconds, some 6 orders of magnitude less), so fragmentation is the number one enemy of any filesystem. A heavily, randomly fragmented filesystem will be much slower than a lightly fragmented one.
This problem can be addressed by using at least one of these two strategies:
- running a defragmenter on a regular basis (a program specifically written to reorganize files on disk, merging the various file fragments into larger, contiguous ones);
- implementing some non-fragmenting logic directly within the filesystem (so that for each write the filesystem tries to use contiguous disk space).
The first solution is surely less comfortable (we must deal with periodic defragmentation runs), but the defragmenter can implement complex logic that almost totally eliminates fragmentation. The second is much more transparent (we don't have to do anything special), but it can be more fragmentation-prone.
Windows historically chose the first approach: starting with Windows 2000, Microsoft has included an online defragmenter for NTFS volumes. Linux filesystems historically chose the second path: all the benchmarked filesystems integrate some non-fragmenting logic and, until recently, not all of them had an online or even offline defragmenter.
While two years ago only xfs had a stable, widely distributed online defragmenter, the situation is now much better, as both ext4 and btrfs have stable defragmenters that are included in most (if not all) distributions.
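As a minimal sketch of how these defragmenters are invoked, the helper function below maps a filesystem type to its tool; the function itself is hypothetical, but the tool names are the real ones (e4defrag from e2fsprogs, xfs_fsr from xfsprogs, and the defragment subcommand from btrfs-progs):

```shell
# Map a filesystem type to its online defragmenter command.
# defrag_tool() is a hypothetical helper for illustration only;
# the commands it prints are the actual userspace tools.
defrag_tool() {
  case "$1" in
    ext4)  echo "e4defrag <mountpoint>" ;;
    xfs)   echo "xfs_fsr <mountpoint>" ;;
    btrfs) echo "btrfs filesystem defragment -r <mountpoint>" ;;
    *)     echo "no stable online defragmenter" ;;
  esac
}
```

All three run against a mounted, live filesystem (that is what makes them "online"), typically as root.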
So, how do these filesystems behave with regard to fragmentation? Let's look at a simple, common task: untarring the Linux kernel. This operation sequentially creates over 36,000 files:
All the examined filesystems show great resilience against fragmentation: ext3 is the worst here, but it still manages to keep fragmentation at about 1.03 fragments per file. The other filesystems, being capable of more sophisticated allocation policies (e.g. extents, delayed allocation), show even better results, ending in the ideal situation where one file = one fragment.
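A fragments-per-file figure like the one above can be measured with filefrag (from e2fsprogs), which reports how many extents each file occupies. Below is a sketch, not necessarily the exact setup used for this benchmark; the mount point, the kernel tarball name, and the avg_frags helper are illustrative:

```shell
# avg_frags: read filefrag output lines like "/path/file: 3 extents found"
# on stdin and print the average number of fragments per file.
avg_frags() {
  awk -F: 'match($2, /[0-9]+/) { sum += substr($2, RSTART, RLENGTH); n++ }
           END { if (n) printf "%.2f fragments/file over %d files\n", sum/n, n }'
}

# Typical use (paths and kernel version are illustrative):
#   tar xjf linux-2.6.30.tar.bz2 -C /mnt/test
#   find /mnt/test/linux-2.6.30 -type f -exec filefrag {} + | avg_frags
```

A perfectly unfragmented tree would report exactly 1.00 fragments per file.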