I'm sure you have heard something like “software is always behind processor features”, meaning that it usually takes many years before a new hardware feature is actively used by common, widespread software.
We have seen many examples of this trend. For instance, while the i386 processor brought 32 bit computing and other advanced capabilities to the x86 world in 1985, the first real 32 bit operating system from Microsoft was Windows NT 3.1, released in 1993: a full 8 year gap. Obviously, some time must pass between hardware support and software support, simply because you need time to write your software, and you can do that only once you have working hardware. So, a certain gap is not only understandable, but often inevitable.
However, after a certain period, the software stack really has to use the new hardware features, otherwise it will lose potential performance and functional advantages. Good examples of very significant features that are not always used to their full potential are the SSE2 and X86-64 extensions to the venerable x86 ISA, which reached the market in working products in late 2000 (SSE2, with the Pentium 4) and 2003 (X86-64, with the Opteron). These modern extensions promise great speed improvements and better flexibility; however, quite often they are not used. In this regard, it is emblematic that only with its latest and greatest operating system (Windows 7) is Microsoft actually pushing 64 bit adoption, while XP-64 and Vista-64 were much less widespread.
So, what performance advantages can we expect from using the SSE2 and X86-64 extensions? When can we reasonably assume that these new instructions will bring some speed increase? And in the end, is it really important to modify our software to use these extensions effectively? To answer these questions, we first need to know SSE2 and X86-64 better. Although SSE2 is the older of the two, I will discuss X86-64 first. To find out why, continue reading on the next page...