Software Security Needs Its Nate Silver

Nate Silver, the rock star statistician behind the New York Times FiveThirtyEight blog, became an unwilling player in the heated political rhetoric ahead of the Nov. 6 presidential election. Silver covers politics and other news from the viewpoint of a statistician: putting the rhetoric and the political consultants' alchemy aside to look at the numbers. Despite a breathless narrative about the tight race between Romney and Obama, Silver never gave the Republican presidential candidate better than a 40% chance of winning the election, based on his analysis of state and national tracking polls. On the eve of the election, he put Romney's chance of winning at just 9%.

Despite Silver's stellar track record at calling races, the political punditry and Romney's supporters were irate. Political hacks argued that elections were "one-time events" and inherently unpredictable, and that Silver's numbers therefore couldn't be trusted. Republicans argued that he was in the bag for the Democrats: a sterling example (pun intended) of the biases of a left-leaning media. As it turned out, Silver called the presidential vote correctly in all 50 states on election day, and in the vast majority of Senate and House races down the ballot as well, silencing the chorus of voices who argued that politics was too complex to boil down to numbers and averages. So much for that.

I say this not to toot Silver's horn, or President Obama's. Rather, the whole thing has me wondering: who is the software industry's Nate Silver, and when might we expect her (or him) to arrive?

Like politics before Nate Silver, the software industry is still in its dark ages - at least when it comes to quantifying complex things like "application security" and "code quality." Despite vast improvements in security testing and secure development practices, many of us still throw reason out the window when making purchasing decisions: relying on mushy metrics like a vendor's reputation or customer feedback to assess the quality and security of the software we buy and deploy, when what's needed is hard, incontrovertible data.

Data from the most recent Veracode State of Software Security Report (SOSS) backs this up. According to that report, just 16% of enterprises - fewer than one in five - have asked their software vendors to conduct an application security assessment of their product. Fully 84% of enterprises failed to test the software of any of their third-party software vendors. (It's worth noting, also, that Veracode sets the bar high - defining "enterprise" as a company with more than $500 million in annual revenue. The statistics among smaller, less affluent companies are, we may assume, no better.)

Demand for software testing is also concentrated in three verticals - financial services, IT services and technology - despite the obvious need for it elsewhere. As an example: security researchers like Kevin Fu, now at the University of Michigan, and Barnaby Jack have made no secret that medical devices - including implantable devices - are rife with security holes and lax coding. The consequences of those lax practices couldn't be greater, both for patient health and for the hospitals and doctors' offices that use the equipment. And yet Veracode's SOSS report shows that requests for application code reviews from the healthcare sector make up just 5% of all assessment requests. A similar argument could be made about the utilities and energy sector - a prime target of sophisticated attackers these days, and an industry with a long history of lax application and network security. Requests from that sector accounted for just 2% of all the assessment requests Veracode received.

And it's not as if the need isn't there. Close to 80% of the web application builds scanned by Veracode were found to leak information. Cross-site scripting vulnerabilities were found in 71% of the web applications scanned, while 67% had directory traversal and cryptographic implementation flaws. Among non-web applications, 62% were found to have problems in the way they implemented cryptography, and more than half also had problems with secure error handling and directory traversal.

So why is application testing still the exception, rather than the rule, even at large and wealthy firms? As with political prognostication - or baseball before the advent of stats-heavy "moneyball" - software purchasing is still an activity driven more by relationships and reputations than by hard data. Again - just to repeat the number - only 16% of organizations did any application testing at all. The rest are, in essence, "trusting" their vendors to deliver software that's free of exploitable vulnerabilities or other problems that might expose them to information leaks, attacks or worse. And if Veracode's data is representative, in around eight out of ten cases that trust is misplaced.

Among the small subset of companies that do test applications, Veracode found that the quality of that testing, and the outcomes derived from it, varied greatly. Many firms, Veracode concluded, pursued an "ad hoc" approach to testing - deciding whether or not to test applications on a case-by-case basis, without a clear mandate from the business about what the tests should accomplish. Others took a more data-driven and programmatic approach, with clearly defined guidelines for which applications required security assessments, clearly defined criteria for compliance (and consequences for failing to comply), and tight integration between the purchasing and application security groups within the organization.
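To make the distinction concrete, here's a minimal sketch of what a "programmatic" policy might look like when written down as code rather than decided case by case. The criticality tiers, assessment types and deadlines below are hypothetical illustrations - the SOSS report doesn't prescribe specific criteria.

```python
# Hypothetical policy-as-code sketch: which applications must be assessed,
# and what "compliant" means for each tier. Tiers, assessment types and
# deadlines are invented for illustration; they are not Veracode's criteria.
from dataclasses import dataclass
from typing import Optional

# Required assessment and remediation deadline (days) per criticality tier.
POLICY = {
    "critical": ("static + dynamic + manual pen test", 30),
    "high":     ("static + dynamic",                   60),
    "medium":   ("static",                             90),
    "low":      (None,                                 None),
}

@dataclass
class Application:
    name: str
    criticality: str            # "critical", "high", "medium" or "low"
    handles_customer_data: bool

def required_assessment(app: Application) -> Optional[str]:
    """Return the assessment this app must pass before purchase, or None."""
    tier = app.criticality
    # Anything that touches customer data is treated as at least "high".
    if app.handles_customer_data and tier in ("medium", "low"):
        tier = "high"
    return POLICY[tier][0]

# Example: a vendor-supplied billing app that stores customer records is
# bumped from "medium" to "high" and must pass static + dynamic analysis.
app = Application("vendor-billing", criticality="medium", handles_customer_data=True)
print(required_assessment(app))   # -> static + dynamic
```

The point isn't the specific rules; it's that they are explicit. Anyone in purchasing can see, before a contract is signed, what testing a given application must pass.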

The results stemming from each approach were striking. Veracode found a "significant difference" between the two when measuring how many applications eventually achieved policy compliance (whether against OWASP, SANS or an internal policy) within the 18-month window Veracode used to observe compliance efforts. Under a programmatic approach to testing, 52% of enterprise applications achieved policy compliance; in organizations that took the more ad hoc approach, only 34% did.
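For readers who want to sanity-check a gap like that, here's a minimal two-proportion z-test sketch. The sample sizes are hypothetical - the post doesn't say how many applications fell into each group - so treat this as an illustration of the method, not a verification of Veracode's statistics.

```python
# Minimal two-proportion z-test: is a 52% vs. 34% compliance gap likely to
# arise by chance? The counts below are hypothetical stand-ins, since the
# per-group application counts are not given in the post.
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z, two-sided p-value) for H0: the two proportions are equal."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                      # pooled rate under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))          # two-sided tail area
    return z, p_value

# Hypothetical counts: 104 of 200 "programmatic" apps compliant (52%) vs.
# 68 of 200 "ad hoc" apps compliant (34%).
z, p = two_proportion_z_test(104, 200, 68, 200)
print(f"z = {z:.2f}, p = {p:.4f}")    # z ≈ 3.64, p ≈ 0.0003 - significant
```

At sample sizes anywhere near these, an 18-point gap is far too large to be statistical noise - which is the kind of claim a software industry with its own Nate Silver could make routinely.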

Silver, who once worked for the insider publication Baseball Prospectus, said recently that he didn't start covering politics out of a love of the subject - or of politicians. In fact, he's expressed distaste for both. Rather, he said, he felt that political discourse and media coverage were still stuck in "the stone age" and could benefit from at least the level of analytic rigor that baseball and sports journalism now enjoy. I'd say the same is true of the software industry. We need our Nate Silver - the charismatic quant who can come in and show us that the "science" we've been relying on when making decisions about software is just alchemy, and that the right path forward relies both on high standards and on rigorous, repeatable measurement against those standards.

Comments (3)

Matt Palmer | November 16, 2012 7:50 am

I'd agree that repeatable measurements and solid analysis would be a good step forward. However, I'm not sure you really want a "charismatic quant" to show you the way forward. How do you know that Nate's predictions are any good? His track record? His source material? How do you know that this couldn't have happened purely by chance? How many other people making predictions came close, or failed, or did well last time but not this time? Out of the number of people making predictions (most of whom you will never have heard of), how many would you expect to make accurate predictions purely by chance several times in a row...? There are many potential biases here. Maybe Nate can provide us with some stats on how you can reliably distinguish between the good and the merely lucky!

john | November 17, 2012 2:54 am

Well, corporate America is not run by individuals that give a damn about you or your so-called ownership stake. It's an irresponsible, perverted system. Even the board of directors' incentives committees have a horribly incestuous relationship with the folks that they "objectively" evaluate. "They" are concerned about their own short term incentives and more importantly, payouts. If even a slice of their cash goes this quarter into even trying to understand the longer term issues like SQLi vuln or persistent XSS instead of showering customer decision makers with favors for this quarter's big deals, well, that is a problem. According to them, security is easiest described as a cost center that should be minimized. Now pipe down and I am back to the golf course. Understand your subject matter first, no whining. Quit rambling and put the blame where it belongs. Shall we have another TARP for cybersec screwups? It's already in motion.

fo | November 17, 2012 3:08 am

Great post. But Nate Silver gathers and normalizes input for one event. That's it. Web sites defend 24/7 for decades? Against determined global interests. Everyone's SQLi vote counts 100%. There are do-overs all the time. Big differences. Nate whatever would fail, because he could never get the input. Let's say every presidential vote will change the entire election one way and the other. It's apples to oranges here.
