In this article, we saw how both NCQ and OS-controlled queues are very useful for extracting greater performance from the typically slow disk subsystem. Moreover, the deadline scheduler again emerged as the best-performing one. CFQ is slower, but this was expected: its primary usage scenario is on desktop clients, where fairness (i.e. reasonable per-process I/O resource allocation) is more important than pure throughput.
On the server side, I recommend using the deadline scheduler and enabling all hardware-based queues, both at the disk and at the controller level (if present). The default OS queue size of 128 entries should be enough, but if you typically run at a very high queue depth (QD) you can try something bigger (but pay attention not only to total throughput, but also to latency).
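As a rough illustration of the recommendation above, here is a minimal Python sketch that inspects and changes these settings through the Linux sysfs files `/sys/block/<device>/queue/scheduler`, `/sys/block/<device>/queue/nr_requests` and `/sys/block/<device>/device/queue_depth`. The function names, the `sysfs_root` parameter and the device name `sda` are my own assumptions for the example, not something taken from the tests in this article. Also note that on recent multi-queue kernels the scheduler may be listed as `mq-deadline` rather than `deadline`, so check the available names first.

```python
from pathlib import Path


def parse_scheduler(text):
    """Parse a line such as 'noop [deadline] cfq'.

    Returns (active, available): the active scheduler is the one the
    kernel shows in square brackets.
    """
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def read_queue_settings(device, sysfs_root="/sys/block"):
    """Read the scheduler, the OS queue size and the hardware queue depth."""
    base = Path(sysfs_root) / device
    active, available = parse_scheduler((base / "queue" / "scheduler").read_text())
    nr_requests = int((base / "queue" / "nr_requests").read_text())
    # queue_depth is only exposed by some devices (e.g. SCSI/SATA disks).
    depth_file = base / "device" / "queue_depth"
    queue_depth = int(depth_file.read_text()) if depth_file.exists() else None
    return {
        "scheduler": active,
        "available": available,
        "nr_requests": nr_requests,
        "queue_depth": queue_depth,
    }


def apply_queue_settings(device, scheduler="deadline", nr_requests=128,
                         sysfs_root="/sys/block"):
    """Select a scheduler and set the OS queue size (needs root on a real system)."""
    queue = Path(sysfs_root) / device / "queue"
    _, available = parse_scheduler((queue / "scheduler").read_text())
    if scheduler not in available:
        raise ValueError(f"{scheduler!r} not available, choose from {available}")
    (queue / "scheduler").write_text(scheduler)
    (queue / "nr_requests").write_text(str(nr_requests))
```

On a real machine you would call, as root, something like `read_queue_settings("sda")` to see the current values and then `apply_queue_settings("sda", scheduler="deadline", nr_requests=128)`. Changes made this way do not survive a reboot, so treat this only as a quick way to experiment before making the setting permanent with your distribution's own tools.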
Have a nice day!