Thursday, January 24, 2008
SOFTWARE HORROR STORIES
The Mars Climate Orbiter crashed in September 1999 because of a "silly mistake": wrong units in a program. Story Story Report
The 1988 shooting down of an Iran Air Airbus A300 by the USS Vincennes was attributed to the cryptic and misleading output displayed by the tracking software. Story More
Death resulted from inadequate testing of the London Ambulance Service software. Story
Several deaths of cancer patients between 1985 and 1987 were due to overdoses of radiation resulting from a race condition between concurrent tasks in the Therac-25 software. Report Report Story More More More More
Errors in medical software have caused deaths. Details in B.W. Boehm, "Software and its Impact: A Quantitative Assessment," Datamation, 19(5), 48-59(1973).
An Airbus A320 crashes at an air show. Story
A China Airlines Airbus Industrie A300 crashes on April 26, 1994 killing 264. Recommendations include software modifications. Summary
The British destroyer H.M.S. Sheffield was sunk in the Falkland Islands war. According to one report, the ship's radar warning systems were programmed to identify the Exocet missile as "friendly" because the British arsenal includes the Exocet's homing device and allowed the missile to reach its target, namely the Sheffield. From "The development of software for ballistic-missile defense," by H. Lin, Scientific American, vol. 253, no. 6 (Dec. 1985), p. 48.
An error in an aircraft design program contributed to several serious air crashes. From P. Naur and B. Randell, eds., Software Engineering: Report on a Conference Sponsored by the NATO Science Committee, Brussels, NATO Scientific Affairs Division, 1968, p. 121.
An Air New Zealand airliner crashed into an Antarctic mountain; its crew had not been told that the input data to its navigational computer, which described its flight plan, had been changed. From "The development of software for ballistic-missile defense," by H. Lin, Scientific American, vol. 253, no. 6 (Dec. 1985), p. 52.
The Ariane 5 satellite launcher malfunction was caused by a faulty software exception routine resulting from a bad 64-bit floating point to 16-bit integer conversion. Report Story Story Story Story
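The narrowing conversion behind this kind of failure is easy to reproduce. The following is a minimal Python sketch, not the flight code: the function names are hypothetical, and it only contrasts three ways a 64-bit float can be squeezed into a signed 16-bit integer (silent wrap, saturation, explicit error).

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_unchecked(x: float) -> int:
    """Truncate toward zero, then keep only the low 16 bits (two's complement wrap)."""
    raw = int(x) & 0xFFFF
    return raw - 0x10000 if raw >= 0x8000 else raw


def to_int16_saturating(x: float) -> int:
    """Clamp to the representable range instead of wrapping."""
    return max(INT16_MIN, min(INT16_MAX, int(x)))


def to_int16_checked(x: float) -> int:
    """Raise explicitly so the caller has to decide what an overflow means."""
    value = int(x)
    if not INT16_MIN <= value <= INT16_MAX:
        raise OverflowError(f"{x!r} does not fit in a signed 16-bit integer")
    return value


if __name__ == "__main__":
    reading = 40000.0  # made-up value, larger than INT16_MAX
    print(to_int16_unchecked(reading))   # wraps to a negative number
    print(to_int16_saturating(reading))  # clamps to 32767
    try:
        to_int16_checked(reading)
    except OverflowError as exc:
        print("rejected:", exc)
```

Which of the three is right depends on the system; the point is that the choice should be made on purpose and the error path should be tested with out-of-range inputs.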
During the maiden flight of the Discovery space shuttle, 30 seconds of (non-critical) real-time telemetry data was lost due to a problem in the requirement stage of the software development process. Story
A train stopped in the middle of nowhere on London's Docklands Light Railway because station locations were changed after the software was deployed and there was reluctance to change the software. Story
The Dallas/Fort Worth air-traffic system began spitting out gibberish in the Fall of 1989 and controllers had to track planes on paper. "Ghost in the Machine," Time Magazine, Jan. 29, 1990. p. 58. Story
Several Space Shuttle missions have been delayed due to hardware/software interaction problems. Story
An airplane software control returned inappropriate responses to pilot inquiries during abnormal flight conditions. Story
The Pathfinder reset problem. Story More
An Iraqi Scud missile hit Dhahran barracks, leaving 28 dead and 98 wounded. The incoming missile was not detected by the Patriot defenses, whose clock had drifted 0.36 seconds during the 4-day continuous siege, the error increasing with elapsed time since the system was turned on. This software flaw prevented real-time tracking. The specifications called for tracking aircraft, not Mach 6 missiles, and for 14 hours of continuous operation, not 100. Patched software arrived via air one day later. From ACM SIGSOFT Software Engineering Notes, vol 16, #3. See Story More More More
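A drift that grows with uptime is what you get when a clock counts ticks and multiplies by a constant that cannot be stored exactly. The sketch below works through that arithmetic under assumptions that are not in the entry above: time counted in tenths of a second, and the factor 1/10 chopped to 23 bits after the binary point. With those assumptions the drift after 100 hours comes out close to, though not exactly at, the figure quoted above.

```python
from fractions import Fraction

FRACTION_BITS = 23  # assumption: bits kept after the binary point

exact_tick = Fraction(1, 10)  # one tick is meant to be 0.1 s
# Chop (truncate) 0.1 to the assumed fixed-point precision.
stored_tick = Fraction(int(exact_tick * 2**FRACTION_BITS), 2**FRACTION_BITS)
error_per_tick = exact_tick - stored_tick  # seconds lost on every tick


def drift_seconds(hours_up):
    """Accumulated clock error after `hours_up` hours of continuous operation."""
    ticks = hours_up * 3600 * 10
    return float(ticks * error_per_tick)


if __name__ == "__main__":
    for hours in (14, 100):
        print(f"{hours:>3} h uptime: drift = {drift_seconds(hours):.4f} s")
```

The error per tick is tiny, which is why a 14-hour specification would never have exposed it; only the uptime changes between the two printed lines.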
Bug-infested [air traffic control software] was scoured by software experts at Carnegie-Mellon and the Massachusetts Institute of Technology to determine whether it could be salvaged or had to be canceled outright. Story
Were a missile to approach at a certain tricky angle (all) 27 programs would fail to shoot it down. Story
The Apollo 8 spacecraft erased part of the computer's memory. From G. J. Myers, Software Reliability: Principles & Practice, p. 25.
Eighteen errors were detected during the 10-day flight of Apollo 14. From G. J. Myers, Software Reliability: Principles & Practice, p. 25.
A 1963 NORAD exercise was incapacitated because a software error caused the incorrect routing of radar information. From G. J. Myers, Software Reliability: Principles & Practice, p. 25.
The U.S. Strategic Air Command's 465L Command System, even after being operational for 12 years, still averaged one software failure per day. From G. J. Myers, Software Reliability: Principles & Practice, p. 25.
An error in a single FORTRAN statement resulted in the loss of the first American probe to Venus. From G. J. Myers, Software Reliability: Principles & Practice, p. 25.
On June 3, 1980, the North American Aerospace Defense Command (NORAD) reported that the U.S. was under missile attack. The report was traced to a faulty computer circuit that generated incorrect signals. If the developers of the software responsible for processing these signals had taken into account the possibility that the circuit could fail, the false alert might not have occurred. From "The development of software for ballistic-missile defense," by H. Lin, Scientific American, vol. 253, no. 6 (Dec. 1985), p. 48.
The manned space capsule Gemini V missed its landing point by 100 miles because its guidance program ignored the motion of the earth around the sun. From "The development of software for ballistic-missile defense," by H. Lin, Scientific American, vol. 253, no. 6 (Dec. 1985), p. 49.
Five nuclear reactors were shut down temporarily because a program testing their resistance to earthquakes used an arithmetic sum of variables instead of the square root of the sum of the squares of the variables. From "The development of software for ballistic-missile defense," by H. Lin, Scientific American, vol. 253, no. 6 (Dec. 1985), p. 49.
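To see why the choice of combination rule matters, here is a small sketch with made-up numbers; the function names are hypothetical and nothing here comes from the reactor program itself. A signed arithmetic sum lets components cancel, while the root sum of squares cannot fall below the largest single component.

```python
import math


def arithmetic_sum(components):
    """Plain signed sum: opposite-signed components cancel each other."""
    return sum(components)


def root_sum_square(components):
    """Square root of the sum of the squares of the components."""
    return math.sqrt(sum(c * c for c in components))


if __name__ == "__main__":
    loads = [3.0, -4.0]  # illustrative values only
    print(arithmetic_sum(loads))   # -1.0
    print(root_sum_square(loads))  # 5.0
```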
In a 1977 exercise, when it was connected to the command-and-control systems of several regional commands, the WWMCCS had an average success rate for message transmission of only 38 percent. From "The development of software for ballistic-missile defense," by H. Lin, Scientific American, vol. 253, no. 6 (Dec. 1985), p. 51.
Aegis was installed on the U.S.S. Ticonderoga, a Navy cruiser. After the Ticonderoga was commissioned the weapon system underwent its first operational test. In this test it failed to shoot down six out of 16 targets because of faulty software; earlier small-scale and simulation tests had not uncovered certain system errors. In addition, because of test-range limitations, at no time were more than three targets presented to the system simultaneously. For a sizable attack approaching Aegis' design limits the results would most likely have been worse. From "The development of software for ballistic-missile defense," by H. Lin, Scientific American, vol. 253, no. 6 (Dec. 1985), p. 51.
On June 19, 1985 the Strategic Defense Initiative Organization performed a simple experiment: The crew of the space shuttle was to position the shuttle so that a mirror mounted on its side could reflect a laser beamed from the top of a mountain 10,023 feet above sea level. The experiment failed because the computer program controlling the shuttle's movements interpreted the information it received on the laser's location as indicating the elevation in nautical miles instead of feet. As a result the program positioned the shuttle to receive a beam from a nonexistent mountain 10,023 nautical miles above sea level. From "The development of software for ballistic-missile defense," by H. Lin, Scientific American, vol. 253, no. 6 (Dec. 1985), p. 51.
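One common defense against a bare number being read in the wrong unit is to make the unit part of the type. The sketch below is a hypothetical `Length` class, not anything from the shuttle software; it stores metres internally and forces the caller to name the unit on the way in.

```python
from dataclasses import dataclass

METRES_PER_FOOT = 0.3048           # exact by definition
METRES_PER_NAUTICAL_MILE = 1852.0  # exact by definition


@dataclass(frozen=True)
class Length:
    """A length stored in metres; construct it only through a named unit."""
    metres: float

    @classmethod
    def from_feet(cls, feet: float) -> "Length":
        return cls(feet * METRES_PER_FOOT)

    @classmethod
    def from_nautical_miles(cls, nmi: float) -> "Length":
        return cls(nmi * METRES_PER_NAUTICAL_MILE)


if __name__ == "__main__":
    intended = Length.from_feet(10_023)
    misread = Length.from_nautical_miles(10_023)
    print(f"intended elevation: {intended.metres:,.0f} m")
    print(f"misread elevation:  {misread.metres:,.0f} m")
    print(f"ratio: {misread.metres / intended.metres:,.0f}x")
```

The same number differs by a factor of roughly six thousand depending on the unit, which is hard to miss once both values are printed in metres.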
The first operational launch attempt of the space shuttle, whose real-time operating software consists of about 500,000 lines of code, failed because of a synchronization problem among its flight-control computers. The software error responsible for the failure, which was itself introduced when another error was fixed two years earlier, would have revealed itself, on the average, once in 67 times. From "The development of software for ballistic-missile defense," by H. Lin, Scientific American, vol. 253, no. 6 (Dec. 1985), p. 52.
"The change was so simple he didn't feel he had to inform anyone that it took place and the mistake he made was so stupid. He had no idea of the damage it would cause." The day after the product shipped 50 beta testers called and reported that all the paychecks were being printed at zero dollars. Story
The Sendmail security bug. Story
INTEL processor bugs galore. List Pentium discussion
A computer-monitored house arrest inmate escaped and subsequently committed murder. This was caused by the reporting software not re-trying when it received a busy signal at the main computer number. Story
The clock in the video camera indicated a customer had withdrawn his money at the same time as a fraud occurred, so the bank forwarded his photo to the authorities. The clock had been off by about one hour. Story
The nine-hour breakdown of AT&T's long-distance telephone network in Jan. 1990, caused by an untested code patch, dramatized the vulnerability of complex computer systems everywhere. "Ghost in the Machine," Time Magazine, Jan. 29, 1990. p. 58. Story
On July 1-2, 1991, computer-software collapses in telephone switching stations disrupted service in Washington DC, Pittsburgh, Los Angeles and San Francisco. Once again, seemingly minor maintenance problems had crippled the digital System 7. About twelve million people were affected in the crash of July 1, 1991. Said the New York Times Service: "Telephone company executives and federal regulators said they were not ruling out the possibility of sabotage by computer hackers, but most seemed to think the problems stemmed from some unknown defect in the software running the networks." Within the week, a red-faced software company, DSC Communications Corporation of Plano, Texas, owned up to glitches in the signal transfer point software that DSC had designed for Bell Atlantic and Pacific Bell. The immediate cause of the July 1 crash was a single mistyped character: one tiny typographical flaw in one single line of the software. One mistyped letter, in one single line, had deprived the nation's capital of phone service. It was not particularly surprising that this tiny flaw had escaped attention: a typical System 7 station requires ten million lines of code. From The Hacker Crackdown, by Bruce Sterling, 1992. Story More More More
During a payday rush in 1989, a faulty program shut down 1,800 automated-teller machines at Tokyo's Dai-Ichi Kangyo Bank. "Ghost in the Machine," Time Magazine, Jan. 29, 1990. p. 58. Story
When an airline's reservation system went down in 1989, 14,000 travel agents had to book flights manually. "Ghost in the Machine," Time Magazine, Jan. 29, 1990. p. 58. Story
In the early 1980s, Buick had to give 80,000 V6 cars a chip transplant to fix flaws in their microprocessors. "Ghost in the Machine," Time Magazine, Jan. 29, 1990. p. 58. Story
The New York Stock Exchange opened one hour late on Dec. 18, 1995 due to a communications problem in the software. Story
Chemical Bank went down for 5 hours on July 20, 1994 due to a file update overloading the computer system. Story
There was a San Francisco 911 system crash of over 30 minutes on Oct. 12, 1995. Patched but not fixed, it still misses between 100 and 200 calls per day. Story
The hole in the ozone layer over Antarctica went undetected for an extended period because the software treated the data as anomalous: the readings were outside the specified range. Story
The Denver airport stayed closed for over a year due to software glitches in the automated baggage handling system. Story More
Bell Atlantic Corp. failed to bill approximately 400,000 AT&T customers in parts of Virginia, Maryland, Washington D.C., and West Virginia for their long-distance calls on their January 1998 bill. AT&T stated that their Operations Support Systems provided Bell Atlantic with the correct billing data for three of the twenty billing cycles, customers billed on the 2nd, 4-5th, and 7th of the month, and that a Bell Atlantic computer error failed to produce the AT&T portion of the bill. Bell Atlantic has stated that the problem was a "systems glitch", "processing error", and/or "data processing error". [Supposedly, computer tapes were used to transfer the billing details between AT&T and Bell Atlantic.] From an AT&T press release, dated 16-Jan-1998, reprinted in the Richmond Times-Dispatch, 17 Jan 1998, p. C10.
Oodles of software will fail in the year 2000. Story More More Lots more
The IRS uncovered an unintended side effect of its effort to eliminate the Year 2000 computer bug: About 1,000 taxpayers who were current in their tax installment agreements were suddenly declared in default due to a programming error. [There are 62 million lines of source code to check; the error was caused by an attempted Y2K fix.] From the Associated Press newswire (AP US & World, 23 Jan 1998, by Rob Wells).
An alert to all National Association of Miniature Enthusiasts (NAME) members: A member recently called the office to find out why she hasn't received her Houseparty Gazette. She discovered that the computer has deactivated ALL members whose memberships expire in the year 2000 and beyond. Kim ... said she had no way of knowing who those folks are unless they call her and let her know. From the rec.arts.dollhouses newsgroup.
One production line shut down when the laser-driven printer putting "sell-by" dates on products couldn't handle the 2000 date. Industry Week, Jan. 5, 1998, p. 26.
Many programs err in, or simply ignore, the century rule for leap years on the Gregorian calendar (every 4th year is a leap year, except every 100th year which is not, except every 400th year which is). For example, early releases of the popular spreadsheet program Lotus 1-2-3 treated 2000 as a non-leap year, a problem eventually fixed. However, all releases of Lotus 1-2-3 take 1900 as a leap year; by the time this error was recognized, the company deemed it too late to correct: "The decision was made at some point that a change now would disrupt formulas which were written to accommodate this anomaly". Excel, part of Microsoft Office, has the same flaw. From Calendrical Calculations, N. Dershowitz and E. M. Reingold, p. xviii.
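The century rule fits in one line, and so does the shortcut that gets 1900 wrong. A minimal sketch (the function names are made up for illustration), cross-checked against Python's standard `calendar.isleap`:

```python
import calendar


def is_leap(year: int) -> bool:
    """Gregorian rule: every 4th year, except every 100th, except every 400th."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def is_leap_every_fourth(year: int) -> bool:
    """The shortcut that ignores the century rule and treats 1900 as a leap year."""
    return year % 4 == 0


if __name__ == "__main__":
    for year in (1900, 1996, 2000, 2100):
        print(year, is_leap(year), is_leap_every_fourth(year), calendar.isleap(year))
```

The two functions agree on every year from 1901 through 2099, which is one reason the shortcut survives testing against "recent" dates.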
The New York City Taxi and Limousine Commission chose March 1, 1996 as the start date for a new, higher fare structure for cabs. Meters programmed by one company in Queens forgot about the leap day and charged customers the higher rate on February 29. The New York Times, March 1, 1997.
A computer software error at the Tiwai Point aluminum smelter in Southland, New Zealand at midnight on New Year's Eve 1996 caused more than $AU 1 million of damage. The software error was the failure to account for leap years (and considering a 366th day in the year to be invalid), causing 660 process control computers to shut down and the smelting pots to cool. The same problem occurred two hours later at Comalco's Bell Bay smelter in Tasmania (which is two hours behind New Zealand). The general manager of operations for New Zealand Aluminum Smelters, David Brewer, said "It was a complicated problem and it took quite some time [until midafternoon] to find the cause." The New Zealand Herald, January 8, 1997, and The Dominion, in Wellington, New Zealand.
A "computer error" is blamed for a false report of three deaths by an incurable disease when a woman killed her daughter and tried to kill her son and herself. From ACM SIGSOFT Software Engineering Notes, vol. 10, no. 3.
A Norwegian class gets a pornographic image because of a cache problem, when a recycled link leads to a pornographic site. From Internet Risks Forum NewsGroup (RISKS), vol. 19, issue 47.
Computers were blamed when, in three separate incidents, 3 million, 5.4 million, and 1.5 million gallons of raw sewage were dumped into Willamette River. From ACM SIGSOFT Software Engineering Notes, vol. 13, no. 3.
The U.S. national EFTPOS system crashed on 2 Jun 1997 for two hours and 100K transactions were "lost". One central processor failed and backup procedures to redistribute the load also failed. From Internet Risks Forum NewsGroup (RISKS), vol. 19, issue 21.
Computer blunders were blamed for $650M student loan losses. From ACM SIGSOFT Software Engineering Notes, vol. 20, no. 3.
An Internet routing "black hole" cuts off ISPs; MAI Network Services routing table errors directed 50,000 routing addresses to MAI; InterNIC goofed, as well, 23 Apr 1997. From ACM SIGSOFT Software Engineering Notes, vol. 22, no. 4.
Votes were lost by a computer in Toronto. The Toronto district finally abandoned computerized voting, leaving a year-old race unresolved. From ACM SIGSOFT Software Engineering Notes, vol. 15, no. 2.
A cat was registered as a voter to demonstrate risks (no pawtograph required). From ACM SIGSOFT Software Engineering Notes, vol. 20, no. 1.
A "read-ahead" synchronization glitch and/or an eager operator caused a large data entry error, and the wrong winner was announced in a Rome, Italy city election. From ACM SIGSOFT Software Engineering Notes, vol. 15, no. 1.
In a German parliamentary election, the program rounded up the Greens' 4.97%, which was below the 5% cutoff; once this was corrected, the Social Democrats attained a one-seat majority. From ACM SIGSOFT Software Engineering Notes, vol. 17, no. 3.
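A threshold test has to be made on the exact ratio, not on the figure rounded for display. The sketch below illustrates the difference with made-up vote counts; the one-decimal display precision and both function names are assumptions, not details from the report.

```python
from decimal import Decimal, ROUND_HALF_UP
from fractions import Fraction

THRESHOLD = Fraction(5, 100)  # the 5% cutoff


def clears_threshold(votes: int, total: int) -> bool:
    """Compare the exact share against the cutoff."""
    return Fraction(votes, total) >= THRESHOLD


def clears_threshold_after_rounding(votes: int, total: int) -> bool:
    """Buggy variant: round to one decimal place for display, then compare."""
    percent = Decimal(votes) / Decimal(total) * 100
    shown = percent.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
    return shown >= Decimal("5.0")


if __name__ == "__main__":
    votes, total = 497, 10_000  # a 4.97% share, illustrative counts
    print(clears_threshold(votes, total))                 # False
    print(clears_threshold_after_rounding(votes, total))  # True
```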
An Oregon computer error reversed election results. From ACM SIGSOFT Software Engineering Notes, vol. 18, no. 1.
A (CTSS) raw password file was distributed as message-of-the-day, due to an editor temporary file name confusion. See Morris and Thompson, CACM 22, 11, Nov 1979.
The U.S. Social Security Administration systems could not handle non-Anglo names, affecting $234 billion for 100,000 people, some going back to 1937. From Internet Risks Forum NewsGroup (RISKS), vol 18, issue 80.
Software prevented the correction of a recognized Olympic skating scoring error. From ACM SIGSOFT Software Engineering Notes, vol. 17, no. 2.
A computer scoring glitch at an Olympic boxing match causes the evident winner to lose. From ACM SIGSOFT Software Engineering Notes, vol. 17, no. 4.
A man's auto insurance rate triples when he turns 101 (= 1 mod 100). From ACM SIGSOFT Software Engineering Notes, vol. 12, no. 1.
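The "(= 1 mod 100)" is the whole bug: a two-digit age field turns 101 into 1. A tiny sketch, with a hypothetical rating rule invented purely to show the effect:

```python
def stored_age(true_age: int) -> int:
    """Model a two-digit age field: only the last two decimal digits survive."""
    return true_age % 100


def annual_premium(age: int, base: float = 500.0) -> float:
    """Hypothetical rule: drivers recorded as under 25 pay triple the base rate."""
    return base * 3 if age < 25 else base


if __name__ == "__main__":
    print(annual_premium(101))              # what the customer should pay
    print(annual_premium(stored_age(101)))  # what the two-digit field produces
```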
A Montreal life insurance company dies due to software bugs in its integrated system. From ACM SIGSOFT Software Engineering Notes, vol. 17, no. 2.
A computer test residue generates a false tsunami warning in Japan. From ACM SIGSOFT Software Engineering Notes, vol. 19, no. 3.
Chicago cat owners were billed $5 for unlicensed dachshunds. A database search on "DHC" (for dachshunds) found "domestic house cats" with shots but no license. From ACM SIGSOFT Software Engineering Notes, vol. 12, no. 3.
The Korean Air Lines KAL 801 accident in Guam killed 225 out of 254 aboard. A worldwide bug was discovered in barometric altimetry in Ground Proximity Warning System (GPWS). From ACM SIGSOFT Software Engineering Notes, vol. 23, no. 1.
A "computer error" affected hundreds of U.K. A-level exam results. From Internet Risks Forum NewsGroup (RISKS), vol. 19, issue 40.
The Paris police computer mismatched a Corsican city code with postal code, and was unable to collect motorists' fines. From Internet Risks Forum NewsGroup (RISKS), vol. 19, issue 41.
Netscape Communicator 4.02 and 4.01a allowed disclosure of passwords. From Internet Risks Forum NewsGroup (RISKS), vol. 19, issue 34.
A bank robbery "wanted" poster of the wrong person was due to an unchecked match. From Internet Risks Forum NewsGroup (RISKS), vol. 19, issue 29.
The Soviet Phobos I Mars probe was lost, due to a faulty software update, at a cost of 300 million rubles. Its disorientation broke the radio link and the solar batteries discharged before reacquisition. From Aviation Week, 13 Feb 1989.
An F-18 fighter plane crashed due to a missing exception condition. From ACM SIGSOFT Software Engineering Notes, vol. 6, no. 2.
An F-14 fighter plane was lost to uncontrollable spin, traced to tactical software. From ACM SIGSOFT Software Engineering Notes, vol. 9, no. 5.
A Parisian computer transforms traffic charges into big crimes. From ACM SIGSOFT Software Engineering Notes, vol. 14, no. 6.
CyberSitter censors "menu */ #define" because of the string "nu...de". From Internet Risks Forum NewsGroup (RISKS), vol. 19, issue 56.
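How the filter actually worked is not stated here, so the following is only a guess at one mechanism that would produce this false positive: discard everything that is not a letter, then search for the banned string. Both function names are hypothetical.

```python
import re


def naive_filter_hits(text: str, banned: str = "nude") -> bool:
    """Assumed mechanism: drop every non-letter, then do a substring search."""
    letters_only = re.sub(r"[^a-z]", "", text.lower())
    return banned in letters_only


def word_filter_hits(text: str, banned: str = "nude") -> bool:
    """Match the banned string only as a whole word in the original text."""
    return re.search(rf"\b{re.escape(banned)}\b", text.lower()) is not None


if __name__ == "__main__":
    source_line = "menu */ #define"
    print(naive_filter_hits(source_line))  # True: "menudefine" contains "nude"
    print(word_filter_hits(source_line))   # False
```

Whole-word matching has its own blind spots, so this is a contrast between two behaviors rather than a recommended filter.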
In a heavily loaded computer system, a steady stream of high-priority processes can prevent a low-priority process from ever getting resources. Generally, one of two things will happen. Either the process will eventually be run (at 2 A.M. Sunday, when the system is finally lightly loaded), or the computer system will eventually crash and lose all unfinished low-priority processes.... Rumor has it that, when they shut down the IBM 7094 at MIT in 1973, they found a low-priority process that had been submitted in 1967 and had not yet been run. From Silberschatz and Galvin, pp. 142-143.
GTE Corp. mistakenly printed 50,000 unlisted residential phone numbers and addresses in 19 directories that were leased to telemarketers in communities between Santa Barbara and Huntington Beach. GTE blames the problem on a software snafu. The company faces fines of up to 1.5 billion dollars, if found guilty of gross negligence. From comp.dcom.telecom newsgroup (27 Apr 1998); X Telecom Digest, Volume 18, Issue 60, Message 4 of 7.
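The usual remedy for this kind of starvation is aging: the longer a job waits, the better its effective priority becomes. The toy model below is a deliberately simplified sketch with hypothetical parameters: one low-priority job competes against a fresh high-priority job arriving on every tick, and a lower number means a better priority.

```python
def first_run_of_low_priority_job(max_ticks, aging_rate=0.0,
                                  low_priority=10, high_priority=1):
    """Return the tick at which the low-priority job first runs, or None.

    On every tick a fresh high-priority job arrives (it has waited 0 ticks),
    and the scheduler runs whichever job has the lower effective priority.
    Ties go to the newly arrived high-priority job.
    """
    for tick in range(max_ticks):
        # Aging: each tick spent waiting improves the job's effective priority.
        low_effective = low_priority - aging_rate * tick
        if low_effective < high_priority:
            return tick
    return None


if __name__ == "__main__":
    print(first_run_of_low_priority_job(1_000_000))                # None: starved
    print(first_run_of_low_priority_job(1_000_000, aging_rate=1))  # 10
```

Without aging the job never runs no matter how long the simulation lasts; with any positive aging rate the wait is bounded.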
On Sept. 19, 1989 an overflow (of a 2-byte integer) at a Washington, DC hospital caused a computer to collapse and forced staff to work manually.
On Nov. 16, 1989 an overflow (of a 2-byte integer) in the Michigan Terminal System caused a computer crash in Newcastle, followed by crashes all over the U.S.
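A signed 2-byte integer tops out at 32,767, so any counter that increases daily will eventually run out. The sketch below shows the wrap-around with `ctypes.c_int16` and then, as an assumption not stated in either entry, checks what date a day counter starting at 1 January 1900 would reach when it overflows.

```python
import ctypes
from datetime import date, timedelta

INT16_MAX = 2**15 - 1  # 32767, the largest signed 2-byte value


def wrap_int16(n: int) -> int:
    """Store n in a signed 16-bit field; ctypes wraps silently instead of raising."""
    return ctypes.c_int16(n).value


# Assumption for illustration: days are counted from 1 January 1900 as day 0.
ASSUMED_EPOCH = date(1900, 1, 1)
overflow_day = ASSUMED_EPOCH + timedelta(days=INT16_MAX + 1)

if __name__ == "__main__":
    print(wrap_int16(INT16_MAX))      # 32767
    print(wrap_int16(INT16_MAX + 1))  # -32768
    print(overflow_day.isoformat())   # first day the counter no longer fits
```

Under that assumed epoch the overflow date lines up with the first of the two incidents; a different epoch or counting convention would shift it, which may be why the second date differs.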
Midwest Telephone Company had a program to assign telephone numbers with a $5 million annual maintenance budget. In 1981, they reported: "No more than 15 known errors remain unsolved at the end of each month." In fact, people had stopped using the program and were entering numbers manually, leaving the database hopelessly outdated.
Bank of America was forced to write off a $60 million investment in a new software system and reverted to its 15-year-old predecessor.
Due to a software error, Continental Airlines consistently undercharged for plane rentals by one day.
SRI International's computer reset the time by averaging 11 clocks, though one was 12 hours off.
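Averaging is fragile when one input is wildly wrong; a median is not. A short sketch with invented offsets (ten clocks within a couple of seconds, one off by 12 hours), using the standard `statistics` module:

```python
import statistics

TWELVE_HOURS = 12 * 3600  # seconds

# Offsets from true time in seconds; illustrative values only.
clock_offsets = [-2, -1, -1, 0, 0, 0, 1, 1, 1, 2, TWELVE_HOURS]


def mean_offset(offsets):
    """Correction obtained by averaging every clock, outliers included."""
    return statistics.mean(offsets)


def median_offset(offsets):
    """Correction taken from the middle clock, which a single outlier cannot move far."""
    return statistics.median(offsets)


if __name__ == "__main__":
    print(f"mean:   {mean_offset(clock_offsets) / 60:.1f} minutes off")
    print(f"median: {median_offset(clock_offsets)} seconds off")
```

With eleven clocks, a single bad one drags the mean off by more than an hour while leaving the median untouched.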
In 1980, the ARPAnet shut down on account of a self-propagating error.
Rumor has it that a military plane flipped over when crossing the equator.
Rumor has it that an Airbus plane crashed into its hangar, since its onboard computer interpreted a bump as turbulence in the air.
A software reboot during the Apollo 11 landing forced Armstrong to land the lunar lander manually. Story
In 1989, a Swedish Gripen prototype crashed due to new software in the fly-by-wire system. Story
In 1993, a Swedish Gripen fighter plane crashed during an air show. Story
Soldiers killed. Story
Roundup of US government Y2K bugs.
French ticket reservation software took 4 months to get working. Story
In October 1995, 200,000 French civil servants were paid twice.
On May 3, 2000, Paris area telephone service collapsed. Story
Software error causes patients to be declared dead. Story
Shuttle simulator bug. Story
Software suspected in 1994 Chinook helicopter crash, killing 29. Story Report
For two days during the summer holidays in 2004, the French national railroad company's reservation system was disorganized, due to a faulty patch. Report