## SAY MORE, ARTHUR BENJAMIN!–STATISTICS AT THE PINNACLE–PART I

One of my good hiking friends posted a TED talk by Arthur Benjamin on why we should teach statistics, rather than calculus, at the pinnacle of math education. I had only two complaints about his talk. First, it was too short, less than 3 minutes. He should have gone on for an hour with that audience. They would have learned a lot from him. Second, I’d add that statistics can teach us a lot of life lessons.

I commented briefly, saying that I could easily write for 3 pages.  Then I thought, “Why not?” Few will read it, because it’s math, and well….

Anyway, I’d start off with a self-deprecating statement about my field: “We statisticians are almost never right. That’s remarkable. Never right. BUT, we know how wrong we are likely to be, because our estimates have a margin of error. Any estimate that does not have a margin of error is, to us, worthless. If that idea reached the Halls of Congress and somebody said, ‘Social Security will be bankrupt by 2028,’ I’d like someone to ask, ‘What is the margin of error?’ Why? Because somebody made a prediction about the future with data. Had somebody made a different prediction with slightly different assumptions, they would have gotten a different answer. How different? That is what margins of error are all about. We’re talking about the future, and we can’t predict the future with utter confidence.

“What is confidence?” I would ask. “Let me first define probability: Probability is the likelihood an event will occur in the future, not the past. It can be 0, no possibility at all; 1, certain it will occur; or any number in between those two. Those who gamble know something about probability. Good bridge players know probabilities of various distributions of cards; six missing cards in a suit are more likely to divide 4-2 than 3-3. What we need to learn in this society is that two options are not always equally probable. Heads-tails is 50-50; boy-girl is close enough, although not exactly 50%. But millions of illegals voted in the last election or they did not, and vaccines cause autism or they don’t. Each question still has only two possibilities, but now they aren’t equally likely. I wish the media would learn that and not assume all sides deserve equal billing. As a corollary, I wish the media would remember that strong statements require strong evidence.
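The bridge claim can be checked by counting. This is a minimal sketch, assuming the 26 unseen cards are dealt at random into two 13-card hands; the function name `split_probability` is made up for illustration.

```python
from fractions import Fraction
from math import comb

def split_probability(a: int, b: int, missing: int = 6) -> Fraction:
    """Chance the missing cards in a suit divide a-b between two unseen hands."""
    assert a + b == missing
    other_cards = 26 - missing  # unseen cards that are not in the suit
    # Ways for one particular opponent to hold exactly `a` of the missing cards.
    ways = comb(missing, a) * comb(other_cards, 13 - a)
    p = Fraction(ways, comb(26, 13))
    # Either opponent can hold the longer holding unless the split is even.
    return p if a == b else 2 * p

p_42 = split_probability(4, 2)
p_33 = split_probability(3, 3)
print(f"4-2 split: {float(p_42):.3f}")
print(f"3-3 split: {float(p_33):.3f}")
```

Under that assumption the 4-2 split comes out a little under one half and the 3-3 split a little over one third, so "two possibilities" is not the same as "equally likely."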

“Roll a die, and there is a 1/6 chance a 3 will come up; all 6 possibilities have equal probability. But when you roll two dice, there are 11 possible sums, from 2 to 12 inclusive, and their probabilities are not all 1/11. If you disagree, please see me with your wallet in hand and we will play, because the expected value of my winnings, which is my average profit or loss per game over the long run, will be in my favor. If I can bet on the fewest sums that will in the long run pay me money, I will choose 7, which has a 1/6 probability; 6 and 8, which each have a 5/36 probability; and either 5 or 9, each of which has a 1/9 probability. Together those four sums come up 20/36 of the time, better than even. We need to teach that competing ideas do not necessarily have the same probability. That means we shouldn’t give equal time to people who think alien abduction occurs just because it either does or doesn’t and they feel they should have an equal say. When we get to more significant probabilistic questions, such as whether smoking significantly increases the likelihood of lung cancer or heart disease, or whether polio vaccination dramatically decreases the likelihood of contracting polio, we can and should make appropriate public policy. Liberal theories? Nope, just laws of mathematics that can be proven and which may be applied to everyday life.
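The two-dice arithmetic can be verified by listing all 36 equally likely outcomes. The sketch below does that with exact fractions; the even-money stake of one unit per roll is an assumption added for illustration, since the text does not say what the payout would be.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely (die1, die2) outcomes give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
prob = {total: Fraction(n, 36) for total, n in counts.items()}

for total in sorted(prob):
    print(total, prob[total])

# The four sums chosen in the text: 7, 6, 8, and one of 5 or 9.
my_sums = [7, 6, 8, 5]
p_win = sum(prob[s] for s in my_sums)
print("P(win) =", p_win)  # Fraction reduces 20/36 to 5/9

# Assumed even-money bet: win 1 unit with p_win, lose 1 unit otherwise.
expected_profit = p_win * 1 + (1 - p_win) * (-1)
print("Expected profit per roll =", expected_profit)
```

With those assumed stakes, the expected profit works out to 1/9 of a unit per roll, which is what "in my favor" means over many games.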

“Furthermore, probability can be independent or dependent, and failure to remember that was in part behind the Challenger shuttle disaster. Independence means that the results of one trial don’t affect the next. Dice don’t have a memory. Dependence means that they do. When one O-ring fails, the likelihood of another’s failing increases. Pull three aces out of a deck of cards, and the probability I will draw an ace from the remaining cards is now 1/49. That is a conditional probability.
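The ace example is small enough to write out. A minimal sketch, assuming a standard 52-card deck with four aces; `p_ace` is an illustrative helper, not a library function.

```python
from fractions import Fraction

def p_ace(aces_left: int, cards_left: int) -> Fraction:
    """Probability the next card drawn is an ace, given what remains."""
    return Fraction(aces_left, cards_left)

# Full deck: 4 aces among 52 cards.
p_full_deck = p_ace(4, 52)

# After three aces are removed: 1 ace among 49 cards.
p_after_three = p_ace(4 - 3, 52 - 3)

print("Ace from a full deck:   ", p_full_deck)
print("Ace after three removed:", p_after_three)
```

The second number depends on what was drawn earlier, which is the mark of dependent events. A die rolled twice has no such dependence, so its 1/6 never changes.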

“When we make an estimate of something, we need a margin of error, a wonderful concept which teaches us to be humble and say, ‘I could be wrong,’ four words every man ought to learn before getting married, and a breath of fresh air again in the Hallowed Halls of Power. A caution, however: a margin of error doesn’t mean anything goes, that ‘anything is possible.’ Anything is possible only if one’s idea of possibility includes one-in-a-trillion events. Statistics deals with quantities like a million, a billion, and a trillion, so let me describe likelihoods for various scenarios:

• 1 in 1,000: roughly the likelihood of being dealt a full house in five-card poker (about 1 in 700; a straight flush is far rarer, about 1 in 65,000), or of correctly picking a second I have chosen at random from the last 17 minutes.
• 1 in 10,000: about the likelihood of correctly guessing a kilometer I am thinking of between Chicago and Tokyo, or correctly picking a minute I am thinking of that occurred in the past week.
• 1 in 100,000: correctly picking a millimeter at random that I am thinking about on a football field from the back of end zone to the back of the opposite end zone.  Correctly pick an hour chosen at random in the past 12 years.
• 1 in a million: Correctly pick a person chosen at random in a large city; a second chosen at random in the last 12 days; an acre I am thinking of in a large wilderness area 50 x 30 miles size.
• 1 in two billion: Correctly pick a second, chosen at random, from 1 January 1955 to now. A single second. Correctly pick a randomly chosen acre in the US.
• 1 in a trillion: Pick a day at random since the Earth was formed.
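Most of the scenarios above are unit conversions, so they are easy to sanity-check. The sketch below assumes 365.25 days per year, 640 acres per square mile, 0.9144 meters per yard, a 120-yard field including both end zones, and an Earth age of about 4.5 billion years.

```python
SECONDS_PER_DAY = 24 * 60 * 60

scenarios = {
    "seconds in 17 minutes": 17 * 60,
    "minutes in a week": 7 * 24 * 60,
    "hours in 12 years": round(12 * 365.25 * 24),
    "millimeters in 120 yards": round(120 * 0.9144 * 1000),
    "seconds in 12 days": 12 * SECONDS_PER_DAY,
    "acres in 50 x 30 miles": 50 * 30 * 640,
    "days in 4.5 billion years": round(4.5e9 * 365.25),
}

for name, n in scenarios.items():
    print(f"{name}: about 1 in {n:,}")
```

Each count lands close to the round figure in the list, except the last, which is somewhat over a trillion.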

“I think that every legislator should be compelled to know the differences among million, billion and trillion before they are allowed to run for office, so we don’t get silly statements like ‘billions and billions, and billions of acres are locked up by the federal government.’ The whole country has fewer than 2.5 billion acres. If you don’t have a sense of what a billion is, you can’t appreciate how much money a billionaire has. A billionaire could spend two thousand dollars a minute, day and night, for nearly a full year before running out of money. Ten million dollar house bought Monday morning? Paid off Thursday evening.
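The spending figures follow from simple division. A rough sketch, assuming a fortune of exactly one billion dollars and a steady two thousand dollars a minute around the clock:

```python
RATE_PER_MINUTE = 2_000
MINUTES_PER_DAY = 24 * 60

# How long exactly $1 billion lasts at this rate.
minutes_to_spend_billion = 1_000_000_000 / RATE_PER_MINUTE
days_to_spend_billion = minutes_to_spend_billion / MINUTES_PER_DAY

# How long it takes to pay for a $10 million house at the same rate.
house_days = 10_000_000 / RATE_PER_MINUTE / MINUTES_PER_DAY

print(f"$1 billion lasts about {days_to_spend_billion:.0f} days")
print(f"$10 million house paid off in about {house_days:.1f} days")
```

The house takes about three and a half days, which is how a Monday morning purchase is paid off by Thursday evening.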

“We use something called a confidence interval. That is a range around an estimate, together with a statement of how confident we are that the true value lies in the range. It isn’t probability, it’s confidence. You see, there exists a true value, but it is unknown and unknowable. The range we have will either contain that true value or it won’t. That is a 100%-0% question and not helpful. A 95% confidence interval means that if we were to take 100 different samples and obtain 100 different estimates and confidence intervals, about 95 of them would contain the true value, but we wouldn’t know which 95. See? We don’t know the answer. But we are highly confident we can construct an interval wherein it lies.
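The repeated-sampling idea behind a 95% confidence interval can be simulated. This is only a sketch under stated assumptions: the "true" mean and standard deviation are invented, the population is taken to be normal, and the interval uses the common 1.96 normal-approximation multiplier rather than any particular textbook's method.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

TRUE_MEAN, TRUE_SD = 50.0, 10.0  # hypothetical truth, unknown in real life
SAMPLE_SIZE, N_SAMPLES = 100, 1000
Z_95 = 1.96  # normal-approximation multiplier for 95% confidence

def interval(sample):
    """Return (low, high) for an approximate 95% interval around the sample mean."""
    mean = statistics.fmean(sample)
    margin = Z_95 * statistics.stdev(sample) / len(sample) ** 0.5
    return mean - margin, mean + margin

hits = 0
for _ in range(N_SAMPLES):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(SAMPLE_SIZE)]
    low, high = interval(sample)
    hits += low <= TRUE_MEAN <= high  # True counts as 1

coverage = hits / N_SAMPLES
print(f"{coverage:.1%} of intervals contained the true mean")
```

The coverage should land near 95%. In a real study only one sample is taken, so there is no way to tell whether its interval is one of the hits or one of the misses.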

“Knowing confidence intervals would have been useful for journalists who reported on the once famous 44,000-98,000 deaths annually due to medical errors. They rounded the latter figure up to 100,000 and used it, but the point estimate of 71,000 was the single best number. Zero was not possible, nor 10,000, nor a million, not if we are going to remain sensible about the world.

“The likelihood of global climate change is a prediction, which lends itself to statistics and to confidence intervals, and the IPCC was more than 95% confident years ago, a strong statement of science. It means that the interval they calculated was highly unlikely to contain 0, that is, no temperature rise. It is incumbent upon those who disagree to come up with a confidence interval of their own so that we can look at their data and see what assumptions and calculations their models use. This would prevent a lot of unnecessary arguing, and the arguments we did have would be more appropriate.

“Means and medians are basic concepts people should understand, because a mean, the average, is affected greatly by outliers, whereas the median is not nearly as sensitive.  Housing prices and salaries are much better described by the median.
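A single outlier shows the difference. The salaries below are invented purely for illustration (in thousands of dollars).

```python
import statistics

# Hypothetical salaries in thousands of dollars.
salaries = [42, 48, 51, 55, 60, 63, 70]
# The same group after one very highly paid person joins.
with_outlier = salaries + [5_000]

for label, data in (("without outlier", salaries), ("with outlier", with_outlier)):
    mean = statistics.mean(data)
    median = statistics.median(data)
    print(f"{label}: mean = {mean:.1f}, median = {median:.1f}")
```

The one extreme value pulls the mean far above what anyone else in the group earns, while the median barely moves.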

“People talk about a non-existent law called the Law of Averages. I’d not teach it, and maybe it would go away. There is The Law of Large Numbers, which says that the observed frequency of an event settles toward its true probability, given enough trials or instances.”
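The Law of Large Numbers is easy to watch in a simulated coin. A minimal sketch, assuming a fair coin modeled with Python's pseudo-random generator; `heads_fraction` is an illustrative helper.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

def heads_fraction(n_flips: int) -> float:
    """Fraction of heads in n_flips simulated fair coin flips."""
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

results = {n: heads_fraction(n) for n in (10, 100, 10_000, 100_000)}

for n, frac in results.items():
    print(f"{n:>7,} flips: {frac:.4f} heads")
```

The fraction of heads drifts toward 0.5 as the number of flips grows. Nothing here says a run of tails makes heads "due" on the next flip, which is the mistake usually hiding inside the Law of Averages.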

“I can see that a lot of you are yawning and looking fried.  I’m giving you a year’s curriculum in a few minutes.  Imagine, however, how useful all this stuff might be if I had a year to teach it to students.  I actually tried to do that in Tucson in 2011, for free, as a trial course, my swan song before leaving town 3 years later.  But I didn’t have an education degree, and the school had other priorities.  Such a shame, really.   OK, let’s take a break, and come back and I’ll finish the summary.”
