my friends, two disasters in one month: the air france airbus a330 (http://www.reuters.com/article/domesticNews/idUSTRE55O6P120090626) and the washington d.c. metro crash (http://www.reuters.com/article/latestCrisis/idUSN23529575).
what gives?
the corporations and governments have no answers. what they do know, they seem to be (as always) covering up.
so...let me venture an answer as to how these wrecks occur: software.
we are not yet in an age where software (s/w) should be allowed to handle the operation of aircraft and train systems or any other systems that can accidentally maim and kill people! yes, i know: s/w can handle such systems under most conditions and most of the time they operate as expected.
but, do they? how would anyone know? how can it be 'proven'? (probability and statistics don't 'prove' anything about how systems will work in the real world! the infinitesimal probability of a thousand heads in a row for an honest coin doesn't 'prove' that outcome won't occur. and it has nothing to say when it does occur!)
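to put a number on that coin example, here is a small python sketch (the function name `prob_all_heads` is mine, purely for illustration). it uses exact fractions so the answer doesn't get rounded away: tiny, yes, but not zero.

```python
from fractions import Fraction


def prob_all_heads(n: int) -> Fraction:
    """exact probability that a fair coin lands heads n times in a row."""
    return Fraction(1, 2) ** n


p = prob_all_heads(1000)
print(p > 0)     # True: improbable is not the same as impossible
print(float(p))  # roughly 9.3e-302
```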
so, what happens when the conditions are drastically different from what some s/w is programmed to handle? or, what happens when the s/w itself has an error? or, what happens when some hacker who knows the frailties of some system takes advantage of them?
i submit to you that there is no s/w that can gracefully degrade in these situations.
when conditions drastically depart from what some s/w is programmed to handle, the best that can be done is to send the s/w into a 'safe mode' where it continues to run in a type of diagnostic mode after putting the thing it is handling into a 'safety' mode of operation (satellites do this often). however, what happens when an aircraft is being tossed around by a storm? any 'safety' mode for that?
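here is roughly what i mean by a 'safe mode', as a toy python sketch. everything in it (`Envelope`, `Controller`, the "run"/"hold" commands) is made up for illustration and is not taken from any real train, satellite, or aircraft system.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"
    SAFE = "safe"


@dataclass
class Envelope:
    """hypothetical range of sensor readings the s/w was designed for."""
    low: float
    high: float

    def contains(self, reading: float) -> bool:
        return self.low <= reading <= self.high


class Controller:
    """toy controller that latches into safe mode once a reading leaves the envelope."""

    def __init__(self, envelope: Envelope) -> None:
        self.envelope = envelope
        self.mode = Mode.NORMAL
        self.log = []  # readings recorded while in safe mode, for diagnostics

    def step(self, reading: float) -> str:
        # leave normal operation the first time conditions depart from the design
        if self.mode is Mode.NORMAL and not self.envelope.contains(reading):
            self.mode = Mode.SAFE
        if self.mode is Mode.SAFE:
            self.log.append(reading)
            return "hold"  # assumed safe action, e.g. stop the train
        return "run"


c = Controller(Envelope(low=0.0, high=100.0))
print(c.step(50.0))   # run
print(c.step(150.0))  # hold
print(c.step(50.0))   # hold (stays latched until someone intervenes)
```

notice that the whole idea depends on there being a "hold" action that is actually safe. a train can stop and a satellite can park itself; an aircraft in a storm has no such option.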
as for error and hacking, when those occur, that's it. the execution stream is corrupted and the system is going to do unexpected things. all system developers can do is try to prevent errors and hacks.
now, dear reader, you may or may not know that the metro subway system in d.c. is run by s/w. the operator is just in the car to override the s/w in certain situations (like when some tourist gets caught in the doors when they are closing). in normal circumstances, the trains start and stop and move between A and B under s/w control.
but what would happen if the s/w had errors in it that would cause a train NOT to stop or simply allow a train to go on its merry way with no control? and how would we know if some s/w had or didn't have such an 'error' in it? (we would call these errors, right?) well...we wouldn't know! why? because of the way s/w is created today.
having over 30 years of programming experience in a variety of languages and contexts AND having worked with some of the managers of the metro control s/w a long time ago, i can tell you that the philosophy under which that system (and most others) is developed is one of: just get it done, then test it! in other words: minimal or no design (certainly not a mathematically rigorous design with proof of correctness), sloppy (i.e., non-structured) 1970s-style programming, and 'seat-of-the-pants' testing.
(at least that was the impression i got from what they told me and how they wanted things done on what we were working on at the time!) now, why did they hold the philosophy they held? because developing s/w any other way is way too expensive (and at the end of the day, s/w development is a business).
i remember telling those managers that doing things the way they were doing it guaranteed disaster. why? because: a jumble of code (1970s style...and i've done my share), minimally documented, soon becomes opaque to anyone but the actual programmer and often to the actual programmer as well. beyond a certain threshold number of lines of code (depending on the language) the whole system becomes a big unknown to everyone involved. a development team will then break up into groups having differing opinions about how and why the s/w works the way it seems to. but no lasting consensus will be possible as new operational 'features' come to the fore as the system is tested or used. (the irs tax calculation system is very much in this state, right now!)
in addition, i had learned that testing such a s/w jumble and then adding fixes doesn't work because adding fixes has a certain probability of introducing errors itself. so, you could end up in an infinite regress of testing and fixing.
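a back-of-the-envelope model of that regress, in python. the assumptions are mine and very crude: every fix removes exactly one error, and with some fixed probability it also introduces one new error. under that model the expected number of fixes needed to clear n errors is n / (1 - p), which blows up as p gets close to 1.

```python
def expected_fixes(initial_errors: int, p_new_error: float) -> float:
    """expected number of fixes needed to clear all errors, assuming each fix
    removes one error and adds a new one with probability p_new_error."""
    if initial_errors < 0:
        raise ValueError("initial_errors must be non-negative")
    if not 0.0 <= p_new_error <= 1.0:
        raise ValueError("p_new_error must be between 0 and 1")
    if initial_errors == 0:
        return 0.0
    if p_new_error == 1.0:
        return float("inf")  # every fix breaks something else: no end in sight
    return initial_errors / (1.0 - p_new_error)


for p in (0.0, 0.5, 0.9):
    print(p, expected_fixes(100, p))
```

with 100 known errors, a 50% chance of breaking something per fix doubles the work, and a 90% chance multiplies it by ten. real code is messier than this (one fix can introduce several errors), which only makes things worse.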
but most damning of all is that testing s/w jumbles can only find errors that show up often enough and have gross enough effects to be detected.
when a test methodology has eliminated errors (that can be detected) that have a frequency of about 1/week, most teams call the testing 'a wrap'. they then go into 'maintenance mode' where they fix an error when they find an error! yikes, will this prevent me from getting turned into hamburger on the information superhighway? don't think so.
now, clearly, calling testing a wrap doesn't mean the s/w is 'error free'. there is no lower limit on the frequency of errors. an error that only occurs once in the lifetime of a system could potentially destroy it!
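to see why rare errors slip through, here is a small python sketch. it assumes (my assumption, not a law of nature) that an error shows up at random with a constant average rate, i.e. a poisson process, so the chance of seeing it at least once during testing is 1 - exp(-rate * time). the function name and the example numbers are illustrative only.

```python
import math


def prob_error_seen(rate_per_week: float, weeks_of_testing: float) -> float:
    """chance that an error with the given average rate shows up at least
    once during testing, assuming it occurs as a poisson process."""
    if rate_per_week < 0 or weeks_of_testing < 0:
        raise ValueError("rate and duration must be non-negative")
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x
    return -math.expm1(-rate_per_week * weeks_of_testing)


weekly = prob_error_seen(1.0, 10.0)                 # error that hits about once a week
once_in_20_years = prob_error_seen(1 / 1040, 10.0)  # about once per 1040 weeks

print(weekly)            # very close to 1
print(once_in_20_years)  # under 1%
```

so ten weeks of testing will almost certainly catch the once-a-week error and will almost certainly miss the once-in-twenty-years one. the second kind is exactly the kind that kills people.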
now, it is beyond the scope of discussion here to consider the subtlety of errors. most s/w errors don't bring a system down by themselves, but make small contributions (e.g. 'memory leaks') which add up and ultimately bring a system down or make it so slow that it can't keep up with the real-time situation.
i really hammered this one home to the managers, who took it as a personal critique. needless to say, my time with them was short!
the bottom line to all this is simple. software isn't born from an act of love of creation, but as a way to make the rich owners of software companies richer!!!!!!!!!!!
the owners and executives of such companies couldn't give 'a rat's rass' about how correct one program out of the hundreds they are working on at any time is! they are much more concerned with how much it would cut into their profit margins to ensure that each program works as it is supposed to (when given proper--mathematically rigorous--requirements for the s/w...which is rarely the case). i've heard: "can't be done." "who cares?" "it's all too much!" "let the experts work on it...meanwhile, get legal to c.o.a. in any case..."
well...enough for tonite...this is just one of the ways capitalism fails us. i'll go into other ways, as current events dictate.
bye...more...l a t e r...