June 22, 2014
7 Hard Drives Failing: What are the odds? These are the odds.
I run a data center. Disk drives that are left running continuously last between two and three years. Three years is about 36 months.
The odds of a disk failing in any given month are roughly one in 36. The odds of two different drives failing in the same month are roughly one in 36 squared, or 1 in about 1,300. The odds of three drives failing in the same month are 1 in 36 cubed, or 1 in 46,656. The odds of seven different drives failing in the same month are 1 in 36 to the 7th power, or 1 in 78,364,164,096.
Of course this is very simplified because disk failure modes are more at end-of-service-life rather than linearly spread over median life. So what if I am off by a factor of 4X? This crude calculation gets us into the same astronomical ballpark. You could insure against this event happening by buying lottery tickets.
--theBuckWheat, comment at Doug Ross @ Journal: GEORGE WILL ON MIRACULOUS IRS COINCIDENCE OF CRASHED HARD DRIVES: "Religions Have Been Founded on Less"
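The arithmetic in the quoted comment can be checked with a short sketch. It assumes, as the comment does, that each drive fails independently with a 1-in-36 chance in any given month; the function name `odds_all_fail` and the 1-in-9 "four times worse" rate are illustrative choices, not part of the original comment.

```python
from fractions import Fraction

# Assumed per-drive chance of failing in a given month (from the comment).
MONTHLY_FAILURE = Fraction(1, 36)

def odds_all_fail(n_drives, p=MONTHLY_FAILURE):
    """Chance that n specific, independent drives all fail in one given month."""
    return p ** n_drives

for n in (2, 3, 7):
    print(f"{n} drives: 1 in {odds_all_fail(n).denominator:,}")

# Same calculation if the true monthly rate were four times higher (1 in 9).
worse = odds_all_fail(7, Fraction(1, 9))
print(f"7 drives at 4x the rate: 1 in {worse.denominator:,}")
```

Using exact fractions keeps the denominators readable; for seven drives at 1 in 36 the denominator is 36 to the 7th power, 78,364,164,096.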
Posted by gerardvanderleun at June 22, 2014 11:14 PM
Applied math, in the form of statistics, really allows the individual to 'level up' their reasoning skills.
The hardware equipment called 'Radar' functions off the concepts grounded in statistics and my 'radar' is getting a clear 'True Positive' for the probability of shenanigans in regards to these 'lost emails'.
The Liberals aren't too good at Math, are they?
Hmmmm! Has anyone asked if those hard drives are/were solid state?
Um, I accidentally asked my IT man to have his assistant's helper reformat. I uh...MEANT to say refoment!
An honest mistake.
Lois Lerner's hard drive "crashing" doesn't mean dick, even if true. E-mails reside on a network server.
If it's other docs, like MS Word documents, they are also not supposed to reside on an individual's hard drive if they pertain to official business. Basic government and even corporate protocol.
Let the record reflect that I am not a mathematician and certainly not a statistician. But ISTM that the data center operator's math and methodology are incorrect. I think he has made a statistical error in treating the HD failures as related events when they are independent events.
If the expected life span of an HD is 36 months, and for simplicity ignoring that failures occur nearer the end than the beginning, then each HD has a 1/36 chance of failing in any given month - regardless of what the HD one office away does. So the 1/36 odds per hard drive never change.
The incredulity therefore is not over failing HDs per se, but that the exact same people for whom the committee wants to read their emails are the ones whose HDs failed, and at the same time.
As Yogi Berra said in a different context, "It's too coincidental to be a coincidence."
I worked the math out - which I would welcome checking - at my site, and in fact the odds against it are sensibly comparable to the inverse of the number of atoms in the entire universe.
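The distinction this comment draws, between drives failing somewhere in a large fleet and one particular set of drives failing together, can be sketched as follows. The 1-in-36 monthly rate comes from the original post and the fleet of 50,000 computers is the figure this commenter uses later in the thread; the function names are hypothetical, and this is not a reconstruction of the calculation on his site.

```python
P_MONTH = 1 / 36       # per-drive monthly failure chance, as set out in the post
FLEET_SIZE = 50_000    # presumed number of email-capable IRS computers

def expected_failures(fleet_size, p=P_MONTH):
    """Average number of drives expected to fail in one month across a fleet."""
    return fleet_size * p

def prob_named_drives_all_fail(k, p=P_MONTH):
    """Chance that k particular, independent drives all fail in one given month."""
    return p ** k

print(f"expected failures per month: {expected_failures(FLEET_SIZE):.1f}")
print(f"seven named drives together: {prob_named_drives_all_fail(7):.3e}")
```

Under these assumptions well over a thousand failures a month are routine across the fleet, while the chance that seven pre-identified drives are all among them in the same month stays tiny.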
Odds of a crash are irrelevant. They are required by law to archive their emails. It doesn't matter if her computer was busted, they still have backups. Every computer those emails went to and through also has a copy. And THOSE were archived. It's a stupid lie and they know it.
Honestly, this might have worked back in 1997, but today? Nobody buys it. Nobody.
"Nobody buys it"
That's the point, the Regime is so drunk with power, so sure that they have this nation sewed up and in their back pocket that they are not even bothering to concoct good lies. Which we all know that they can do.
No, they are not following the law. They are no longer making even the pretense of compliance with any law on the books. If the Regime isn't even going to comply with the laws that they themselves enacted why would you expect them to pay any attention to any other law?
We are not that far (IMHO) from this Regime sending out their secret policemen to kill their opponents in the night.
And NOBODY does ANYTHING!
The probabilities of HD failure in your example are dependent on the number of HDs sampled. Thus they are not Mutually Exclusive.
You also are using the HD's of individual computers and not the networked and backed-up HD's of a Data Center.
"TheBuckwheats" use of the probabilities, treated as mutually exclusive, is simple but gives a relatively accurate ballpark figure; once the numbers are calculated, the lie made by "Liberal-arts" majors is starkly evident.
You asked for input, and that is mine (my education has been slowly decaying for the last 14 years, though the concepts are still intact - thankfully. I'm thinking it's nearly time for a memory refresh.)
Look, the administration are lying through their collective teeth and everybody knows they're lying. And they know everybody knows they're lying; they just don't care. I mean, what's anybody going to do about it? Nothing. Nobody is going to do a damn thing about it and they know it. I'm surprised they even bothered with the excuses, that they just didn't tell Congress they can't have the damn emails and that's that....and don't ask again.
Cond001, the probability of an individual HD failing is 1 in 36, and for each HD those odds do not change. The probability that some HD will fail within a set of HDs does indeed depend on how many are in the set. But for each individual one, it remains one in 36.
However, the commenter on Doug Ross's site who wrote, "The odds of a disk failing in any given month are roughly one in 36," really did not set the correct parameters to begin with. The implication is that all HD's fail before the 37th month, and that is not true.
Expected service life for hardware components like this really means (in this example) that by the beginning of the 37th month, half of the drives installed at the beginning of the period will have failed, so departments routinely replace components before that time.
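To see how much this reading of "service life" changes the monthly figure, here is a rough sketch. It assumes a constant failure rate per month (which the original post already admits is a simplification) and treats 36 months as the point at which half the drives have failed; the function name is hypothetical.

```python
def monthly_failure_from_median(median_months):
    """Per-month failure chance p such that half the drives survive median_months.

    Assumes a constant monthly rate: survival after t months is (1 - p) ** t,
    so (1 - p) ** median_months = 0.5.
    """
    return 1 - 0.5 ** (1 / median_months)

p = monthly_failure_from_median(36)
print(f"monthly failure chance: {p:.4f} (about 1 in {1 / p:.0f})")
```

Under that assumption the monthly chance comes out somewhat lower than 1 in 36, which makes several particular drives failing in the same month less likely still.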
But that kind of context would really complicate things, so I used the parameters that he set out just to keep it simple. That's also the reason I didn't refer to backups on the network - he didn't refer to them and besides, network backups have no effect on the 1/36 odds of HD failure.
Sorry, I do not get the reference to "the Buckwheats." But be careful libeling liberal-arts majors - my BA is in philosophy and my Master's is in theology! However, I did have a highly technical career.
I asked my chemical-engineer daughter to check my calcs and she said they looked okay to her.
As I said nine days ago, "No one believes this, nor should they. So why lie? The only possible explanation is that the emails' contents are so damaging to the White House that having everyone know you are flat-out lying through your teeth is far better than having them read the emails."
Donald Sensing is right. The only possible explanation is that the emails' contents are so damaging to the White House that having everyone know you are flat-out lying through your teeth is far better than having them read the emails. I suspect the emails lead straight to impeachment.
Sooner or later (I hope sooner) some IT person somewhere with access to an exchange server will post: "I have copies of Lois Lerner's emails. What am I offered?" Then the fun begins. . .
"Cond001, the probability of an individual HD failing is 1 in 36, and for each HD those odds do not change. The probability that some HD will fail within a set of HDs does indeed depend on how many are in the set. But for each individual one, it remains one in 36..."
For each HD to fail simultaneously and be mutually exclusive requires them to be multiplied. Thus "TheBuckWheat" is correct. (1/36)*(1/36)* etc...
Furthermore, he was comparing Data Center Hard Drives and their backups (which all individual Desktop Hard-drives are tied into) and you were comparing the individual Desktop Hard Drives only w/o backups (Different equation all together from yours).
The size of your sample space of hard drives is a variable dependent upon all the Micro-computers as a whole, and it will change your working probability when you increase or decrease the number of computers in the sample space (thus making your probability of 1/1,388 a dependent variable and not mutually exclusive). If you kept the probability of 1/36 per HD failure, you would have been better off.
Nevertheless, your findings still found the probabilities of HD failure astronomical and still brought to the front the fact that a lie has been told.
I'm sure you're a very intelligent man and I am quite familiar with your writings (with which I agree most highly). I'm glad to hear that your daughter is well educated too. You must be very proud of her.
Though I have an unfinished degree, and am a nobody of little consequence, you are still wrong.
Here's a little bit more information as to how a Data Center works and how they have built-in redundant systems to ensure that no data is lost (which coincides with "TheBuckwheat"'s premise).
1. I believe the government uses Microsoft Exchange for their email servers. They have built-in exchange mail database redundancy. So, unless they did not follow Microsoft's recommendations, they are telling a falsehood. (snip)
2. Every IT organization that I know of has hotswappable disk drives. Every server built since 2000 has them. Meaning that if a single disk goes bad it’s easy to replace. (snip)
3. ALL Servers use some form of RAID technology. The only way that data can be totally lost (Meaning difficult to bring back) is if more than a single disk goes before the first bad disk is replaced….
www . americanthinker . com/blog/2014/06/it_experts_call_bs_on_irs_claim_to_have_lost_lerner_emails.html
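Point 3 can be put in rough numbers. The sketch below is only an illustration under stated assumptions: a single-redundancy array (data is lost only if a second disk dies before the first is replaced), independent failures, and the thread's 1-in-36 monthly rate. The function name, the 30-day month, and the example of three remaining disks with a two-day replacement window are all hypothetical.

```python
def prob_second_failure(other_disks, window_days, p_month=1 / 36, days_per_month=30):
    """Chance that at least one remaining disk fails before the bad one is replaced.

    Each disk is assumed to survive a window of d days with probability
    (1 - p_month) ** (d / days_per_month), independently of the others.
    """
    survive_window = (1 - p_month) ** (window_days / days_per_month)
    return 1 - survive_window ** other_disks

# Example: three other disks in the array, bad disk swapped within two days.
print(f"{prob_second_failure(other_disks=3, window_days=2):.4f}")
```

With hot-swappable drives the replacement window is short, so under these assumptions a second failure inside it is well under one percent for a small array.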
You were uncertain as to whether you were correct or not and you seemed to telegraph that you wanted an opinion of your work - which I obliged. Had you not telegraphed this, I would have said nothing.
It's like "Casino". If you don't know it's statistically impossible for that many drives to take a crap at the same time, let alone just exactly the ones being subpoenaed, you're too dumb to be doing this for a living. If you do know and you're saying it anyway, you're in on the scam.
Cond0011, thank you for your compliments on my writing. I do not agree you are of no consequence!
However, indulge me in one more attempt at self-justification:
According to the IRS, the emails are gone because the HDs went bad and the drives were subsequently destroyed. Everything you say about the servers and RAID technology is true but not relevant to the mathematical problem as the IRS presents it because the IRS is implicitly saying that there are no email backups. The IRS position is that the emails vanished from the surface of the earth when the HDs went bad and were then destroyed.
Therefore I did not include potential backup systems in attacking the IRS' position because I wanted to attack it on the same terms as the IRS presented it. The backup systems are certainly a political issue, but within the terms the IRS itself set, not a mathematical one.
"For each HD to fail simultaneously and be mutually exclusive requires them to be multiplied. Thus "TheBuckWheat" is correct. (1/36)*(1/36)* etc..."
Reread my post and you'll see that is exactly what I did, although calculated within the universe of a presumed 50,000 email-capable computers within the entire IRS. (I assure you there are many thousands more than that, I was simply being as charitable as possible to the IRS.) What Buckwheat did was consider only those seven, but that's too narrow a set.
Thank you for your comments, but I have to say I stand by my post.
It's a little more than obvious isn't it?
To even consider the possibilities of actual equipment failure is one symptom of what's wrong with the public's reception and processing of outrageous lies and illegal behavior.
Uh, er, mm, the dog ate my homework. And six of my buddies' homework too. I'm good with that.
Can there be any doubt that the original wrong-doing and the subsequent cover up are part of a premeditated and calculated plan to subvert justice and allow a tyranny to take over?
I will go so far as to say that we will not see another presidential election.
The mid-term elections are pretty much a dog and pony show to appease any of the citizens that think we can "fix this thing".
Look at how the Cloward-Piven Strategy is unfolding around the flood of illegal aliens at our border.
Look at how Alinsky's Playbook is moving things along quite nicely in the political arena.
Look at the decreasing military power and capability while the Brownshirts, Obama's homeland army is growing by leaps and bounds.
After you're done looking at all this stuff look at beans, bibles, and bullets. Oh, and avoid crowds.
WW4 will be fought with sticks and stones, and none of us will be there.
"Cond001, the probability of an individual HD failing is 1 in 36, and for each HD those odds do not change. "
Yes, but you included the hypothetical number of computers involved in your probability. The Sample space you used was 50k, making the working probability 1/1,388.
If you had made your Sample Space 60k, the working probability would be 1/1666.67. Thus would the working probability be dependent on the Sample Space and not Mutually Exclusive as 1/36 is.
"According to the IRS, the emails are gone because the HDs went bad and the drives were subsequently destroyed. Everything you say about the servers and RAID technology is true but not relevant to the mathematical problem as the IRS presents it because the IRS is implicitly saying that there are no email backups."
You do not understand how Networked Computers work, Don. That is why various Programmers who work in Data-Centers have come forth to tell us that this IRS proclamation is a lie. Computers Networked together have Hard-Drive resources outside the Computer workstation being used.
"Reread my post and you'll see that is exactly what I did, ...What Buckwheat did was consider only those seven, but that's too narrow a set. "
TheBuckWheats comes from the standpoint of the DataCenter for all the "xx",000 of Computer Workstations and the built-in redundancies within the average Computer Network. Again, you do not understand Newworked Computers. For Each Redundant Safety for losing Data in a Network to fail simultaneously would require the multiplication of each mutually exclusive Hard Drive, thus him writing (1/36)*(1/36)*(1/36)...
So, I stand by TheBuckwheats assessment and not yours as the more accurate one.
I really hate to keep driving this home, .... especially since both of you agree with the conclusion that a lie has been told by the IRS.
cond001 wrote to me, "You do not understand how Networked Computers work, Don" and "Again, you do not understand Newworked Computers [sic]."
Well, since I used to design and install business networks for a major (though local) B2B technology company, I will have to take issue with that.
I KNOW that the IRS is lying about backups, just as the "various Programmers" you refer to do. And obviously the Oversight committee knows it as well, since that topic dominated last night's testimony. And the IRS's position has, um, evolved, since the Commissioner last night said, basically, Oh sure Lerner's hard drive was backed up but there's no way to recover from the backup those emails that you want. So the continuing backup claim by the IRS is that the emails were irretrievably lost when they destroyed the hard drive, backups be damned.
My position is that Buckwheat's definition of the parameters of the problem is too narrow. You don't agree and offer reasons why. I think your reasons fail just as you think mine do. But can't we all get along?
Absolutely not! Pistols at dawn.
I don't believe the IRS story on this for one second. I want to make that clear, but I have had two hard drives down at the same time and I only have two hard drives. So I think the odds are not that high, but they are still very high.
Anyone who believes that all the IRS emails were accidentally lost either has never worked for a company or is so naive that they still believe the fallacy of a government that is for the people.
The American people should demand that Congress force Lerner and all other IRS officials to tell the truth or sit in jail until they are ready to do so.
Your estimate of the probability for seven different hard drives crashing in the same month is completely wrong. You first state that the drives last from two to three years if run continuously. That means that if you have a data center with, say, 100 computers purchased at the same time and running continuously, then at the end of three years all the drives will have died (you must be buying cheap drives, mine typically last a bit longer). So on average 100/36 drives die each month = approx. 3 drives going out per month. If you are running 250 computers in your data center, all purchased at the same time, then you lose 250/36 drives per month = approx. 7 drives per month.
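The fleet-size point in this comment can be sketched with a simple binomial model. It assumes every drive fails independently with the same 1-in-36 chance each month, which ignores the end-of-life clustering mentioned in the original post; the function name `prob_at_least` is illustrative.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(at least k of n independent drives fail in a month), binomial model."""
    below_k = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1 - below_k

P_MONTH = 1 / 36  # per-drive monthly failure chance used throughout the thread

for fleet in (100, 250):
    average = fleet * P_MONTH
    print(f"{fleet} drives: average {average:.2f} failures/month, "
          f"P(7 or more) = {prob_at_least(7, fleet, P_MONTH):.3f}")
```

In a fleet of 250, seven or more failures in a month is unremarkable under this model. That is a different question from whether seven particular drives, named in advance, all fail in the same month.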