Our mission is to help the world's developers build software, safely. We have many areas to tackle and many features to build, but we started the journey by helping developers know what third-party code they are using, what it does, and which components have vulnerabilities, because we think this is one of the most pressing security problems facing software development today. This post is about how we track and identify vulnerabilities and what information we put in our advisories.
My colleague Sean Kinzer recently wrote two excellent posts, "Using CPEs for Open-Source vulnerabilities? Think Again" and "Why Relying On the NVD is Not Good For Open-Source Security Tools". I recommend reading them if you haven't already.
We believe that there are three important parts to identifying open-source component vulnerabilities:
As hackers and security researchers have turned their attention to open-source components, the number of disclosures has risen. In a typical week we see between five and ten relevant issues released through the NVD system, but as Sean points out in the posts referenced above, that is only a subset of the pool of disclosed issues. Most developers simply monkey-patch the component in situ, or update it and push the update to the binary distribution sites, without ever notifying the US government-run database.
We have a team of dedicated in-house security researchers and are constantly improving our back-end tools and processes, but here are some of the data sources that we track today:
For each potential issue that comes across our sights, we first decide whether it is relevant. When it is marked as relevant, a researcher analyzes it to determine what the issue is and what it affects, and then turns it into something we call an artifact. In that artifact creation process we tear down the advisory to really understand it and identify the root cause. This often means creating working exploits that we share with users. We determine whether a fix is available and whether there are potential workarounds, and we identify the vulnerable methods (see below). We also attach information about known exploits, such as those in Metasploit, to help drive prioritization in remediation.
This is labor-intensive manual work, and we will be announcing a research bounty program in the coming months. If you want to look at some great examples of completed research artifacts:
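To make the idea of an artifact more concrete, here is a minimal sketch of the kind of record the process above could produce. The class name, the field names, the example values, and the `remediation_priority` helper are all hypothetical, chosen only to mirror the items listed in the paragraph; this is not our actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VulnerabilityArtifact:
    """Hypothetical shape of a research artifact (illustrative only)."""
    title: str
    root_cause: str
    affected_component: str
    affected_versions: List[str]
    fixed_version: Optional[str] = None  # None when no fix is available yet
    workarounds: List[str] = field(default_factory=list)
    vulnerable_methods: List[str] = field(default_factory=list)
    known_exploits: List[str] = field(default_factory=list)

    def remediation_priority(self) -> str:
        # Assumed rule for illustration: a known public exploit raises priority.
        return "high" if self.known_exploits else "normal"


# Made-up example data, not a real advisory.
artifact = VulnerabilityArtifact(
    title="Unsafe deserialization in examplelib",
    root_cause="Untrusted input is passed to an object loader",
    affected_component="examplelib",
    affected_versions=["1.0", "1.1"],
    fixed_version="1.2",
    vulnerable_methods=["examplelib.load"],
    known_exploits=["public proof of concept"],
)
print(artifact.remediation_priority())
```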
We know that the vast majority of open-source component security issues are not yet disclosed. We know this because we have been doing a lot of work using data science and machine learning to examine all of the components we know about and uncover those issues. We aren't quite ready to talk about all of the details yet, but at a high level our architecture collects public open-source components when our customers use our system. We collect this open-source code using a system we call Librarian, which tracks all versions along with their binaries and source code. Using this large data set we are able to look for brand-new or similar issues, explore our hunches, and check that patches have been applied.
This is obviously "special sauce" and one of the things that makes us unique, so in the spirit of transparency I am just giving a small hint about what we are doing and where we are headed. Look for a lot more about this in the coming months. Honestly, we have to rethink the disclosure process first!
When we first built our minimum viable product (also called a prototype), all we did was identify whether people were using vulnerable components. After a little while we noticed that, despite being told they were using high-risk vulnerable components, people weren't fixing them, and we couldn't fathom why. We dug in with our early adopters, who often told us that when we alerted them they looked into it but found that they weren't using the vulnerable part of the component, or weren't using it in a way that made them vulnerable. Luckily for us, several members of the team have built commercial static code analysis tools in the past, so we knew exactly how to solve that problem.
Today we build a call graph of the user's custom code, which shows all the paths that their code takes. We do this by shallow cloning the code to the agent, so the source code never leaves the user's network under any circumstances. Each vulnerability artifact is annotated with the vulnerable methods that our research team has determined, and the list of vulnerable methods is passed down to the agent for matching.
It turns out that developers typically use the vulnerable methods of vulnerable components only about 25% of the time. That means that if you only identify vulnerable components, you get roughly three false positives for every real issue, and we all know that developers hate false positives.
Now that you know what we do behind the scenes and how we work under the hood, let's take a quick tour of how it's used and what the interface looks like.
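The matching step can be pictured as a reachability check over the call graph. The following is a minimal sketch, assuming the call graph is a plain mapping from caller to callees; the function name, the graph, and the method names are made up for illustration and do not describe how our agent is actually implemented.

```python
from collections import deque


def reachable_vulnerable_methods(call_graph, entry_points, vulnerable_methods):
    """Return the vulnerable methods reachable from the given entry points."""
    seen = set(entry_points)
    queue = deque(entry_points)
    # Breadth-first walk over every call path starting at the entry points.
    while queue:
        caller = queue.popleft()
        for callee in call_graph.get(caller, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen & set(vulnerable_methods)


# Hypothetical call graph of a user's code (caller -> callees).
call_graph = {
    "app.main": ["app.parse_upload", "app.render"],
    "app.parse_upload": ["yamllib.load"],
    "app.render": ["tmpl.render_safe"],
}
# Hypothetical vulnerable methods taken from two artifacts.
vulnerable = {"yamllib.load", "tmpl.render_unsafe"}

hits = reachable_vulnerable_methods(call_graph, ["app.main"], vulnerable)
print(sorted(hits))  # ['yamllib.load']
```

In this made-up example both libraries are present, but only `yamllib.load` is actually reached from the user's code, so only that finding would be flagged as a vulnerable method in use.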
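The arithmetic behind that ratio is simple enough to sketch. The function name below is hypothetical, and the 25% figure is the approximate rate quoted above.

```python
def false_positives_per_true_positive(usage_rate):
    """Ratio of component-only alerts that are false alarms to those that are real.

    usage_rate is the fraction of vulnerable components whose vulnerable
    methods are actually called by the user's code.
    """
    if not 0 < usage_rate <= 1:
        raise ValueError("usage_rate must be in (0, 1]")
    return (1 - usage_rate) / usage_rate


ratio = false_positives_per_true_positive(0.25)
print(ratio)  # 3.0
```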
You will notice interesting widgets:
Note: We are adding WebSockets soon so this will be updated in real time (no need for a page refresh), and we'll be adding a lot more stats and graphs. If there is data you really want now, just let us know.
Here you can see the various views that may be of interest if you want to understand what components you have.
Quickly see the repositories that contain vulnerable components.
You first see the graph at the top of the page, which gives you a quick overview and lets you do some high-level filtering. You can scroll through the entire list of vulnerabilities in that organization if you wish (we lazy load them using React) or use the search and filters. For instance, type "denial of service" in the issues search and we show you only those vulnerabilities. You will notice that in this view you can also sort to see vulnerabilities that have known exploits.
You can click into any vulnerability and see the Vulnerability Details. In the screenshot above you can see information about the issue and how to fix it. There is a lot of detail to cover here, which I will leave for a future post, but there are some highlights that are important to cover.
And of course there is even more. Each vulnerability has its own page with everything we know about the issue including the CVSS score, links to known exploits and other references about it.
Each vulnerability has its own page in our vulnerability catalog, which is fully searchable from our homepage. For more information see our main product tour: https://www.srcclr.com/product-tour.