Month: November 2016

In Stunning Upset, Real World Defeats Data


Between 8pm and midnight ET last night, The New York Times’ presidential prediction swung from 81% Clinton to 95% Trump. Slate’s VoteCastr forecast of voter turnout flailed in all seven of its key states. Mike Murphy, a long-time political pundit, said on MSNBC, “Tonight, data died.” And Sam Wang at the Princeton Election Consortium, who called the election for Clinton on October 18th, now has to eat a bug.

There are only three ways to criticize analysis you disagree with. One, go after the analyst. Two, the methodology. Three, and most common, both. All of these have already happened.

Last week, Ryan Grim at The Huffington Post accused Nate Silver of abandoning polling for punditry. This morning, HBR bemoaned the sorry state of poll data in an age of cell phones and caller ID. Also today, The Chicago Tribune published a eulogy for the entire “profession of prognostication.”

But take a cleansing breath. This has happened before. On the way from natural philosophy to modern science, empiricism took a detour through alchemy.

In his excellent book Extraordinary Popular Delusions and the Madness of Crowds, historian Charles Mackay recounts tales of collective convictions gone wrong. His chapter on alchemy includes scores of mystics, from Geber in the 700s to Count Cagliostro a thousand years later, who claimed to possess the legendary philosopher’s stone.

In all that time, no one succeeded in turning base metals into gold. Some went mad in their pursuits. Princes lost fortunes financing endless experiments. But the thing that put an end to alchemy wasn’t a ban. It was an evolution into something better. The answer to pseudoscience wasn’t less science, but more.

It seems that data science is in its awkward alchemy phase. And the answer now, as it was then, isn’t less science, but more.

If changes in the real world cause selection bias and sampling problems for traditional data gathering tactics, what new ways would work better? If data-driven model-tuning produces more useful results but obscures the model itself, what new auditing methods would fix this?
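The selection-bias problem is easy to see in a toy simulation. The sketch below is purely illustrative, with made-up numbers: it assumes an electorate split 50/50 between two candidates, where supporters of candidate A are simply more likely to answer a pollster's call than supporters of B. The poll then overstates A's support even though every response is recorded honestly.

```python
import random

random.seed(0)

# Hypothetical electorate: 50% support candidate A, 50% support B.
# Assumed (illustrative) response rates: A-supporters answer a poll
# 30% of the time, B-supporters only 20% of the time.
electorate = [("A", 0.30)] * 5000 + [("B", 0.20)] * 5000

# Each voter responds with their personal response probability.
responses = [cand for cand, p in electorate if random.random() < p]
share_a = sum(1 for c in responses if c == "A") / len(responses)

print("True support for A:  50.0%")
print(f"Polled support for A: {share_a:.1%}")
```

With these assumed response rates, the polled share of A lands near 60% despite true support of 50%; no amount of extra sample size fixes it, because the error is in who responds, not in how many.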

Among today’s pollsters and data-modelers there may be Nicolas Flamels, willfully leading the credulous astray with techno-wizardry. But there are modern-day Roger Bacons, too, who fix their errors, advocate for the scientific method and advance the practice of actual data science so we can all understand the real world better tomorrow than we do today.