To all system administrators: dual-core AMD Opteron and Athlon CPUs are
prone to TSC (time stamp counter) skew, where the per-core counters
drift out of sync with each other. Only rarely will this cause complete
system failures; the symptoms are more along the lines of unexpected
behaviour and intermittent faults. When you see a ping time of -21ms
you know something is wrong! It is mostly not a problem if you run any
modern Linux distro, because the kernel uses the more modern ACPI HPET
(High Precision Event Timer). Bad luck for Windows admins though:
you'll need Windows Server 2008 to properly avoid the issue. For
Windows Server 2003 the fix is to add /usepmtimer to boot.ini, however
this carries a performance penalty and reduces the scalability of the
system.
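If you want a quick way to look for the symptom, here is a rough Python sketch (my own illustration, not an official diagnostic). It samples a high-resolution counter in a tight loop and counts how often a reading is lower than the one before it, which is the same effect that produces a negative ping time. The function names `count_backward_steps`, `sample_counter` and `linux_clocksource` are made up for this example. It assumes `time.perf_counter` is backed by the affected counter on your system (on Windows it uses QueryPerformanceCounter), and the Linux helper simply reads the standard sysfs clocksource file if it exists.

```python
import time


def count_backward_steps(samples):
    """Count readings that are smaller than the reading just before them."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if cur < prev)


def sample_counter(n=100_000, clock=time.perf_counter):
    """Take n back-to-back readings from the given clock function."""
    return [clock() for _ in range(n)]


def linux_clocksource(
    path="/sys/devices/system/clocksource/clocksource0/current_clocksource",
):
    """Return the active Linux clocksource name, or None if it cannot be read."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None  # not Linux, or sysfs is unavailable


if __name__ == "__main__":
    readings = sample_counter()
    print("backward steps:", count_backward_steps(readings))
    print("linux clocksource:", linux_clocksource())
```

A count of zero does not prove the box is healthy: the skew only shows up when the sampling thread hops between cores, so a short single-threaded run can easily miss it. Anything above zero is worth investigating.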
Recently at work some pretty major systems
(i.e. Windows Server 2003 Oracle servers) failed because of this issue,
but had those systems been Linux Oracle servers I can guarantee there
would have been no problem. Understandably I'm really unimpressed with
Microsoft and Windows. How hard can it be to patch the OS to use
ACPI HPET properly? Or even to release a patch that uses the PM timer
when an affected CPU is detected? Linux kernel devs have known about
this since 2005 and fixed it very promptly! What on earth is stopping
Microsoft from fixing this problem?
To be fair, this is not entirely Microsoft's fault, but equally, Linux
has had a built-in workaround since 2006 while Microsoft's workaround
is manual. Please leave a comment if you've had to deal with this
problem. This post ranks well on Google, so please offer solutions and
describe symptoms to help others.