Dec 16, 2015

Vulnerable Methods Under the Hood

By Asankhaya Sharma

Yesterday, Mark Curphey introduced Vulnerable Methods, a new feature that we released in our product. We developed the vulnerable methods technology to give our customers more accurate and detailed information when the libraries and components they use contain vulnerabilities. So far, we have seen that in the majority of cases a project that uses a vulnerable library never calls the vulnerable methods. This feature therefore helps cut down false positives.

In this article, I will explain how this feature works under the hood to give you a better understanding of how we built it.

A key aspect of the vulnerable methods technology is building and traversing call graphs of programs. So, before I describe the vulnerable methods analysis, let's take a short detour to understand call graphs.

Call Graphs

A call graph of a program captures the calling relationships between methods. It is a directed graph where each node represents a method and each edge (m1, m2) indicates that method m1 calls method m2. Due to recursion, a call graph may contain cycles.

As an example, look at the following code snippet. The main method in the CheckHash class is calling three methods from the BCrypt class that is in a different library jbcrypt.

import org.mindrot.jbcrypt.BCrypt;

public class CheckHash {

  public static void main(String[] args) {
    String candidate = args[0];
    String hashed = BCrypt.hashpw(candidate, BCrypt.gensalt(12));

    BCrypt.checkpw(candidate, hashed);
  }

}

For this simple program, if we were to construct a call graph it would look something like below:

Image of Call Graph

Even in such a simple example it is easy to see the peculiarities of building a call graph for a realistic program. Note that we only show 4 nodes in the graph. The methods hashpw, gensalt and checkpw in the BCrypt class may in turn call other methods, but those cannot be known by just analyzing the code snippet above. We need to analyze the library code as well if we want to build a complete call graph for a program.
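As a rough illustration, the four-node graph described above can be held in a simple adjacency map. This is only a sketch for exposition and not how our analysis stores graphs; the class name CallGraph, its methods and the string form of the method names are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: a call graph as an adjacency map keyed by method name.
class CallGraph {
    // caller -> set of methods it calls (every known method has an entry)
    private final Map<String, Set<String>> callees = new LinkedHashMap<>();

    // Record the edge (caller, callee), meaning "caller calls callee".
    void addCall(String caller, String callee) {
        callees.computeIfAbsent(caller, k -> new LinkedHashSet<>()).add(callee);
        // Make sure the callee also appears as a node, even with no outgoing edges.
        callees.computeIfAbsent(callee, k -> new LinkedHashSet<>());
    }

    Set<String> calleesOf(String method) {
        return callees.getOrDefault(method, Set.of());
    }

    Set<String> methods() {
        return callees.keySet();
    }

    // The graph for the CheckHash snippet: main plus the three BCrypt methods.
    static CallGraph checkHashExample() {
        CallGraph g = new CallGraph();
        g.addCall("CheckHash.main", "BCrypt.gensalt");
        g.addCall("CheckHash.main", "BCrypt.hashpw");
        g.addCall("CheckHash.main", "BCrypt.checkpw");
        return g;
    }

    public static void main(String[] args) {
        CallGraph g = checkHashExample();
        System.out.println("nodes: " + g.methods());
        System.out.println("main calls: " + g.calleesOf("CheckHash.main"));
    }
}
```

The BCrypt nodes have no outgoing edges here, which reflects the limitation noted above: their callees only become visible once the library code itself is analyzed.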

Furthermore, in an object-oriented language like Java, static call graph construction is complicated by dynamic dispatch and requires alias analysis. As part of the vulnerable methods technology, we have implemented a new call graph construction algorithm that is based on class hierarchy analysis (CHA) and rapid type analysis (RTA). We have also implemented several additional optimizations and heuristics to handle indirect calls (via threads), bridge methods (due to type erasure), and certain cases of reflection.
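To give a feel for what CHA and RTA do at a virtual call site, here is a minimal textbook-style sketch. It is not our production algorithm and ignores interfaces, abstract classes, threads, bridge methods and reflection. The class ChaSketch and the Hasher hierarchy are hypothetical names invented for the illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of resolving a virtual call with CHA and RTA.
class ChaSketch {
    private final Map<String, String> superOf = new HashMap<>();        // class -> direct superclass
    private final Map<String, Set<String>> declared = new HashMap<>();  // class -> methods it declares

    void addClass(String name, String superName, String... methods) {
        if (superName != null) superOf.put(name, superName);
        declared.put(name, new HashSet<>(Arrays.asList(methods)));
    }

    private boolean isSubtype(String cls, String ancestor) {
        for (String c = cls; c != null; c = superOf.get(c)) {
            if (c.equals(ancestor)) return true;
        }
        return false;
    }

    // The implementation a receiver of class cls would run: the nearest
    // declaration found while walking up the superclass chain.
    private String lookup(String cls, String method) {
        for (String c = cls; c != null; c = superOf.get(c)) {
            if (declared.getOrDefault(c, Set.of()).contains(method)) return c + "." + method;
        }
        return null;
    }

    // CHA: any subtype of the static receiver type may be the runtime type.
    // RTA: pass a non-null set to keep only classes the program instantiates.
    Set<String> targets(String staticType, String method, Set<String> instantiated) {
        Set<String> result = new TreeSet<>();
        for (String cls : declared.keySet()) {
            if (!isSubtype(cls, staticType)) continue;
            if (instantiated != null && !instantiated.contains(cls)) continue;
            String impl = lookup(cls, method);
            if (impl != null) result.add(impl);
        }
        return result;
    }

    // Hypothetical hierarchy: LegacyHasher inherits hash() from Md5Hasher.
    static ChaSketch example() {
        ChaSketch h = new ChaSketch();
        h.addClass("Hasher", null, "hash");
        h.addClass("BcryptHasher", "Hasher", "hash");
        h.addClass("Md5Hasher", "Hasher", "hash");
        h.addClass("LegacyHasher", "Md5Hasher");
        return h;
    }

    public static void main(String[] args) {
        ChaSketch h = example();
        System.out.println("CHA: " + h.targets("Hasher", "hash", null));
        System.out.println("RTA: " + h.targets("Hasher", "hash", Set.of("BcryptHasher")));
    }
}
```

With CHA alone, a call to hash() on a Hasher reference gets an edge to every implementation in the hierarchy. RTA prunes the edges whose receiver classes are never instantiated, which gives a smaller and more precise graph.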

Constructing a precise call graph is essential for vulnerable methods analysis. The next section explains how this call graph gets used.

Vulnerable Methods Analysis

The vulnerable methods analysis has three parts. First, we analyze a vulnerability to identify its root cause. This tells us which methods are affected by the vulnerability; we call these the vulnerable methods. Second, we analyze the library itself to find all the public methods of the library that call the vulnerable methods. Finally, when we scan a project that uses a vulnerable library, we check whether the project calls the public vulnerable methods of the library.

If all that sounds complicated, let's take an example of a real vulnerability in a library to understand it better. Consider CVE-2015-0886, which describes a vulnerability in jbcrypt, a Java library. For this vulnerability we would do the following:

Identify the root cause

The version 0.3m of the library was vulnerable to an integer overflow. The commit that fixed this issue is available on GitHub.

Image of GitHub Commit of the fix

From the fix, we can see that the method crypt_raw was modified to prevent the integer overflow. In version 0.3m, then, crypt_raw was responsible for the vulnerability, so we consider crypt_raw the vulnerable method.

Analyze the library

Once we have the vulnerable method, we need to analyze the jbcrypt library itself to see if there are other methods in the library that call crypt_raw. For this part of the analysis, we build the call graph of the library and traverse it to compute all the public methods that have a call chain (following the edges) to the node for the vulnerable method crypt_raw. In this particular case, since the library is very small, you can have a look at the source code and see that crypt_raw is called only from hashpw. We therefore also treat hashpw as a vulnerable method, then look at all the methods in the library that call hashpw, and so on. We continue this process until we cannot add any new method to the list of vulnerable methods (that is, until we reach a fixed point). At the end, the public vulnerable methods for this library include hashpw and checkpw, the latter because it calls hashpw.
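The propagation described above can be sketched as a worklist over reversed call edges. This is a simplified illustration, not our actual implementation. The class VulnerableMethodsSketch is hypothetical, and the library graph is hand-written: the hashpw and checkpw edges follow the discussion above, while the gensalt edge is an assumption added only to show a method that is not flagged.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: propagate "vulnerable" backwards until a fixed point.
class VulnerableMethodsSketch {

    // Every method with a call chain to root, including root itself.
    static Set<String> propagate(Map<String, Set<String>> calls, String root) {
        // Invert the edges so we can ask "who calls m?".
        Map<String, Set<String>> callers = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : calls.entrySet()) {
            for (String callee : e.getValue()) {
                callers.computeIfAbsent(callee, k -> new HashSet<>()).add(e.getKey());
            }
        }
        Set<String> vulnerable = new TreeSet<>();
        Deque<String> worklist = new ArrayDeque<>();
        vulnerable.add(root);
        worklist.add(root);
        // Each method enters the worklist at most once, so this terminates
        // even when the call graph has cycles.
        while (!worklist.isEmpty()) {
            String m = worklist.remove();
            for (String caller : callers.getOrDefault(m, Set.of())) {
                if (vulnerable.add(caller)) worklist.add(caller);
            }
        }
        return vulnerable;
    }

    // Keep only the methods a client project could call directly.
    static Set<String> publicVulnerable(Map<String, Set<String>> calls, String root,
                                        Set<String> publicMethods) {
        Set<String> result = propagate(calls, root);
        result.retainAll(publicMethods);
        return result;
    }

    // Hand-written, simplified library graph (caller -> callees).
    static Map<String, Set<String>> librarySketch() {
        Map<String, Set<String>> calls = new HashMap<>();
        calls.put("BCrypt.hashpw", Set.of("BCrypt.crypt_raw"));
        calls.put("BCrypt.checkpw", Set.of("BCrypt.hashpw"));
        calls.put("BCrypt.gensalt", Set.of("BCrypt.encode_base64")); // assumed edge
        return calls;
    }

    public static void main(String[] args) {
        Set<String> pub = Set.of("BCrypt.hashpw", "BCrypt.checkpw", "BCrypt.gensalt");
        System.out.println(publicVulnerable(librarySketch(), "BCrypt.crypt_raw", pub));
    }
}
```

The loop stops when the worklist is empty, which is exactly the point where no new method can be added to the vulnerable set.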

Analyze the project

This complete list of vulnerable methods for the vulnerable component is generated when we create the artifact for the vulnerability. During a scan, when we detect that a vulnerable component is present in the dependency graph of a project, we build the call graph for the project and check if the vulnerable methods of the component are called. For our example program, the method main was calling the vulnerable methods hashpw and checkpw:

Image of Call Graph with Vulnerable Method highlighted

Thus, not only can we tell if a program is using a vulnerable component, but also if it is making a call to the vulnerable method. In a more realistic project, the call chains can be very long, and there may be multiple paths to the vulnerable method. In all cases, we provide complete information on the full call chains showing all the methods in the chain and the line numbers where they are called in the code:

Image of Vulnerable Call Chains

For a developer, the call chains leading to vulnerable methods are useful when working out how to fix the vulnerability. Since we also analyze the library code, you can see the full path within your project, how it calls the library code and eventually the call to the vulnerable method.
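A minimal sketch of how such call chains could be enumerated is a depth-first walk from an entry point that stops at the first vulnerable method on each path. This is an illustration under simplifying assumptions, not our scanner: the class CallChainSketch is hypothetical, it keeps one call site per caller and callee pair, and the line numbers are counted from the import line of the CheckHash listing.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: list call chains from an entry point to vulnerable methods.
class CallChainSketch {
    // caller -> (callee -> line number of the call site inside the caller)
    private final Map<String, Map<String, Integer>> calls = new LinkedHashMap<>();

    void addCall(String caller, int line, String callee) {
        calls.computeIfAbsent(caller, k -> new LinkedHashMap<>()).put(callee, line);
    }

    List<String> chainsTo(String entry, Set<String> vulnerable) {
        List<String> chains = new ArrayList<>();
        walk(entry, vulnerable, new LinkedHashSet<>(), entry, chains);
        return chains;
    }

    private void walk(String method, Set<String> vulnerable, Set<String> onPath,
                      String chain, List<String> out) {
        if (vulnerable.contains(method)) {   // reached a vulnerable method: report the chain
            out.add(chain);
            return;
        }
        if (!onPath.add(method)) return;     // already on this path: recursion, stop here
        for (Map.Entry<String, Integer> call : calls.getOrDefault(method, Map.of()).entrySet()) {
            String next = chain + " -(line " + call.getValue() + ")-> " + call.getKey();
            walk(call.getKey(), vulnerable, onPath, next, out);
        }
        onPath.remove(method);               // allow the method on other, separate paths
    }

    // Call sites of CheckHash.main from the listing earlier in the article.
    static CallChainSketch checkHashExample() {
        CallChainSketch g = new CallChainSketch();
        g.addCall("CheckHash.main", 7, "BCrypt.hashpw");
        g.addCall("CheckHash.main", 7, "BCrypt.gensalt");
        g.addCall("CheckHash.main", 9, "BCrypt.checkpw");
        return g;
    }

    public static void main(String[] args) {
        Set<String> vulnerable = Set.of("BCrypt.hashpw", "BCrypt.checkpw");
        for (String chain : checkHashExample().chainsTo("CheckHash.main", vulnerable)) {
            System.out.println(chain);
        }
    }
}
```

Enumerating every simple path can become expensive on large graphs, so a real tool would likely bound or summarize the search; the sketch favors clarity over scale.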

Because the vulnerable methods analysis depends on identifying the root cause of each vulnerability, the SourceClear security research team will analyze and annotate the vulnerable methods information for every new artifact that we create in our catalog. As of today, vulnerable methods are available for Java, and we are already working on adding support for other languages, including Ruby, JavaScript and Python. We believe vulnerable methods are key to eliminating false positives from component vulnerabilities and to providing an accurate picture of open-source component risk.

By Asankhaya Sharma

Dr. Asankhaya Sharma is the Director of Software Engineering at Veracode. Asankhaya is a cyber security expert and technology leader with over a decade of experience in creating security products for industry, academia and the open-source community. He is passionate about building high-performing teams and taking innovative products to market. He is also an Adjunct Professor at the Singapore Institute of Technology.