You’re not going to believe this one…
I got a call from a long-time customer of my employer (it was my turn with the pager 🙁) at about 3:00 am on Monday 20/12/2004. They reported that a SCO machine on their network had gone down, and that their investigations pointed to our machine (also running SCO) as the culprit.
As the machines in question had been installed and running for about six years, and neither had been modified in quite some time, I was, to put it mildly, sceptical of their conclusions.
After a day of sifting through the system, application and other logs (thanks mostly to the excellent sar utility from the Sysstat tools), I came to the conclusion that one or more processes on our system had suddenly become very active, but I was unable to identify which.
One thing I did notice was that the uptime was approximately 248 days.
I passed the analysis on to a colleague, who found that there is a known bug (OSS456B) in some versions of SCO OpenServer that results in unpredictable behaviour once uptime exceeds 248 days, apparently some sort of internal integer overflow.
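For what it's worth, 248 days is about what you would expect if the kernel counted clock ticks in a signed 32-bit integer at 100 ticks per second. I don't know that this is what OpenServer actually does internally; the counter width and tick rate below are my assumptions, and the function names are made up for illustration. A quick sketch of the arithmetic:

```python
# Assumption (not confirmed for SCO OpenServer): uptime is tracked as a
# signed 32-bit count of clock ticks, with 100 ticks per second.
TICKS_PER_SECOND = 100
MAX_SIGNED_32 = 2**31 - 1
SECONDS_PER_DAY = 24 * 60 * 60


def days_until_overflow(ticks_per_second=TICKS_PER_SECOND, max_value=MAX_SIGNED_32):
    """Days of uptime before a counter starting at zero exceeds max_value."""
    return (max_value + 1) / ticks_per_second / SECONDS_PER_DAY


def wrap_signed_32(value):
    """Wrap an integer the way two's-complement 32-bit arithmetic would."""
    return (value + 2**31) % 2**32 - 2**31


print(round(days_until_overflow(), 2))    # 248.55
print(wrap_signed_32(MAX_SIGNED_32 + 1))  # -2147483648
```

Under those assumptions the counter goes from its largest positive value to a large negative one about 248.55 days after boot, and any code comparing tick counts or computing time intervals could start misbehaving at that point.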
In the case of our machine, once the magic number was hit, it started broadcasting packets at an enormous rate to any SCO Licence daemons on machines connected to the network, effectively mounting a DoS attack on any machine listening on the port in question.
Wow! That is incredible. As a programmer (and I use the term advisedly), that is the kind of bug I don’t need.