Oct 24, 2016

A look at Vulnerabilities and Dependencies by Language

By Brian Wallace

As a Data Scientist at SourceClear, I get to analyze lots of interesting vulnerability data as well as anonymized project data. New customers often ask us what "normal" looks like when it comes to vulnerabilities in their projects, so I thought I'd take a look and share a few insights.

How many projects have vulnerabilities, and how many do they usually have?

I looked at projects analyzed with SourceClear and broke them out by language. Unsurprisingly, most projects have a handful of vulnerabilities. 80% of JavaScript projects have vulnerabilities, with an average of 7 vulnerabilities per project. Almost 60% of Java projects still have vulnerabilities, even though Java has a comparatively more robust history with security tools.

Stats 1

What's up with all these dependencies?

When we analyze your projects, we first build a full dependency graph to see what libraries are in use. The libraries you specify yourself are called 'direct' dependencies, but that's not everything. Your dependencies have dependencies of their own, called 'transitive' dependencies. Your package manager resolves this whole graph, so you end up with dozens, sometimes hundreds, of libraries inside your projects.
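To make the direct/transitive split concrete, here is a minimal Python sketch of walking a dependency graph. It is an illustration only, not how SourceClear or any package manager actually resolves dependencies: the function name `resolve_dependencies` is hypothetical, and the edges in `EXAMPLE_GRAPH` are made up for the example rather than taken from real package metadata. It also ignores version constraints, which real resolvers must handle.

```python
from collections import deque


def resolve_dependencies(graph, direct):
    """Split a project's libraries into direct and transitive sets.

    graph  -- maps a library name to the libraries it depends on
    direct -- the libraries the project itself declares
    """
    direct_set = set(direct)
    seen = set(direct_set)
    queue = deque(direct_set)

    # Breadth-first walk: follow every edge until no new library turns up.
    while queue:
        lib = queue.popleft()
        for dep in graph.get(lib, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)

    # Anything reached that the project did not declare is transitive.
    return direct_set, seen - direct_set


# Hypothetical graph: these edges are invented for illustration.
EXAMPLE_GRAPH = {
    "express": ["debug", "send"],
    "send": ["debug", "ms", "inherits"],
    "debug": ["ms"],
    "mocha": ["debug", "minimatch"],
}

direct, transitive = resolve_dependencies(EXAMPLE_GRAPH, ["express", "mocha"])
print(sorted(direct))
print(sorted(transitive))
print(len(transitive) / len(direct))  # transitive libraries per direct one
```

In this sketch, a library that is both declared by the project and pulled in by another library is counted as direct. The last line computes the same kind of ratio as the "others coming along for the ride" figure, here for a toy graph.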

How do these direct vs. transitive libraries break down by language? Java projects look pretty tame with a smaller number of dependencies. JavaScript projects are another story. For every JavaScript dependency you add, you end up with about 8 others coming along for the ride, with an average of 350 dependencies total.

Stats 2

With every app so full of dependencies, let's take a look at which ones are the most popular, both direct and transitive.

Top Java Dependencies

Direct          Transitive
guava (29%)     slf4j (46%)
avro (29%)      jackson (45%)
log4j (26%)     jackson datamapper (43%)

Top JavaScript Dependencies

Direct          Transitive
mocha (29%)     inherits (86%)
express (29%)   minimatch (80%)
eslint (20%)    ms (79%)

OK, what does this mean? Well, according to our data, 86% of JavaScript applications rely on the inherits library, for example, mostly as a transitive dependency. The top transitive libraries each appear in at least 70% of projects: inherits, ms, minimatch, mkdirp, minimist, isarray, core-util-is. That means at least 70% of JavaScript projects rely on each of these libraries, possibly without even knowing it.

Any one of these libraries could become the next left-pad, breaking the majority of JavaScript projects with a single vulnerability.
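The percentages above are the share of projects whose resolved dependency set contains a given library. A minimal sketch of that calculation follows, assuming each project has already been reduced to a set of library names. The project names and their contents in `PROJECTS` are invented, and `library_prevalence` and `widely_used` are hypothetical helper names.

```python
from collections import Counter


def library_prevalence(projects):
    """Map each library to the fraction of projects that include it.

    projects -- maps a project name to the set of libraries it resolves to
    """
    total = len(projects)
    if total == 0:
        return {}
    counts = Counter()
    for libs in projects.values():
        # set() guards against a library being listed twice in one project.
        counts.update(set(libs))
    return {lib: n / total for lib, n in counts.items()}


def widely_used(prevalence, threshold=0.7):
    """Return libraries present in at least `threshold` of projects, sorted by name."""
    return sorted(lib for lib, share in prevalence.items() if share >= threshold)


# Invented sample data, not drawn from the dataset discussed in this post.
PROJECTS = {
    "app-a": {"express", "inherits", "ms"},
    "app-b": {"inherits", "minimatch"},
    "app-c": {"inherits", "ms"},
    "app-d": {"mocha", "ms"},
}

prevalence = library_prevalence(PROJECTS)
print(widely_used(prevalence))
```

With a 70% threshold, only libraries found in three or more of these four toy projects make the list.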

Where do vulnerabilities typically come from?

In every language but Python (oddly enough), most vulnerabilities are introduced through transitive dependencies.

Stats 3
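One way to arrive at a direct-versus-transitive breakdown like this is to intersect the set of libraries with known vulnerabilities against each group. The sketch below assumes the direct and transitive sets are already known for a project. All library names are placeholders, the `vulnerable` set is invented and does not describe real advisories, and `vulnerability_origins` is a hypothetical name.

```python
def vulnerability_origins(direct, transitive, vulnerable):
    """Count vulnerable libraries by how they entered the project."""
    return {
        "direct": len(vulnerable & direct),
        "transitive": len(vulnerable & transitive),
    }


# Placeholder names only; none of this reflects real vulnerability data.
direct_libs = {"web-framework", "test-runner"}
transitive_libs = {"logger", "glob-matcher", "time-parser", "class-helper"}
vulnerable_libs = {"web-framework", "glob-matcher", "time-parser"}

origins = vulnerability_origins(direct_libs, transitive_libs, vulnerable_libs)
print(origins)
```

Here two of the three vulnerable libraries arrive transitively, which is the pattern the chart shows for most languages.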

Thanks

Drop us a line if there's something else you'd like us to dig into. Of course you can analyze your own projects with SourceClear too, to see what dependencies lurk in your projects, and which of them may be vulnerable.

By Brian Wallace

Brian is a data scientist at Veracode, working to identify actionable, data-driven insights to AppSec problems.