Sep 20, 2017

Analyzing Apache Struts Vulnerabilities Using SGL

By Asankhaya Sharma

Recently, Equifax disclosed a large data breach that allowed hackers to steal the personal information of over 143 million Americans. The underlying issue responsible for the breach turned out to be an unpatched open-source Apache Struts component. In this blog post we will discuss the security issues that have affected Apache Struts recently and the impact they have had. We will also show how we can use our Security Graph Language (SGL) to dig deeper into the open-source ecosystem and find other related vulnerabilities.

RCE in Apache Struts

A critical remote code execution (RCE) vulnerability in Apache Struts was made public two weeks ago, just after it was patched in version 2.5.13. Catalogued as CVE-2017-9805 (and by us as SVE-5011), it allows an unauthenticated attacker to run malicious code on an application server.

The exploit is remarkably straightforward: the attacker POSTs a malicious XML payload, and the server deserializes it into a Java object and calls its methods. Creating a ProcessBuilder gives the attacker shell access, allowing her to exfiltrate credentials or pivot deeper into the victim's network.
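
The underlying class of bug is not specific to Java or XStream. As a rough analogy only, and not a reproduction of the Struts exploit, the Python sketch below uses the standard pickle module to show how a deserializer that resolves whatever the payload names ends up running a callable chosen by the sender, and how an allowlist can refuse it. The Marker and AllowlistUnpickler names are made up for illustration, and the payload is deliberately harmless.

```python
import io
import pickle


class Marker:
    """Harmless stand-in for an attacker-controlled payload (hypothetical)."""

    def __reduce__(self):
        # Tells pickle: "to rebuild me, call int('42')". The payload, not the
        # receiving code, picks which callable runs during loading.
        return (int, ("42",))


class AllowlistUnpickler(pickle.Unpickler):
    """Refuses to resolve any global that is not explicitly allowed."""

    ALLOWED = {("builtins", "list"), ("builtins", "dict")}

    def find_class(self, module, name):
        if (module, name) not in self.ALLOWED:
            raise pickle.UnpicklingError(f"blocked global {module}.{name}")
        return super().find_class(module, name)


payload = pickle.dumps(Marker())

# Naive loading runs the callable named in the payload.
naive_result = pickle.loads(payload)

# Allowlisted loading rejects the same payload before anything is called.
try:
    AllowlistUnpickler(io.BytesIO(payload)).load()
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

In the naive case the loader returns the result of the payload's chosen call rather than a Marker instance; the allowlisted loader raises before any call happens.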

Struts has a less-than-stellar track record security-wise. Just two months ago, another RCE vulnerability was disclosed, and the database SourceClear maintains shows 48 more known vulnerabilities affecting it -- not counting vulnerabilities it inherits from explicit and implicit dependencies.

SGL

In addition to informing people if they are using vulnerable libraries, we've focused our efforts on trying to find such problems in the wild. SGL is a domain-specific language for this purpose; it is our attempt to improve how vulnerabilities are described and identified.

The recent buzz surrounding the Equifax data leak has been about a few CVEs:

  • CVE-2017-5638, disclosed in March, affecting Struts Core, and supposedly the root cause of the Equifax data leak
  • CVE-2017-7525, disclosed in June and affecting jackson-databind, an explicit Struts dependency
  • CVE-2017-9805, disclosed a couple of weeks ago and affecting the Struts REST plugin

In the remainder of this post, we'll walk through a few simple examples, illustrating how one might estimate the impact of these vulnerabilities, as well as use knowledge of them to find similar issues.

Estimating Impact

SGL is currently accessed through a REPL client. It is an imperative language for graph traversals in the spirit of TinkerPop's Gremlin.

  ___  ___ _
 / __|/ __| |
 \__ \ (_ | |__
 |___/\___|____|


sgl> vulnerability(type: 'CVE', identity: "2017-7525")

  vulnerability(cwe:0, cvss:'(AV:N/AC:L/Au:N/C:P/I:P/A:P)', type:'CVE', identity:'2017-7525')

  1 items after 0.3s

This query simply checks if the given vulnerability exists in our database and returns it.

sgl> vulnerability(type: 'CVE', identity: '2017-5638') has_version_range has_library

  library(language:'java', coord1:'org.apache.struts', coord2:'struts2-core', version:'2.5')
  library(language:'java', coord1:'org.apache.struts', coord2:'struts2-core', version:'2.5.1')
  library(language:'java', coord1:'org.apache.struts', coord2:'struts2-core', version:'2.5.2')
  library(language:'java', coord1:'org.apache.struts', coord2:'struts2-core', version:'2.5.5')
  <elided>

  32 items after 0.7s

We anchor our traversal at the vulnerability and take two steps outward: the first to the version ranges it is known to affect, and the second to the libraries in those version ranges. This gives us the versions of the Struts Core libraries involved.
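
For readers unfamiliar with graph traversal languages, the two steps can be pictured as repeated edge lookups. The Python sketch below is only a toy model, not how SGL is implemented: the add_edge and step helpers, the in-memory edge table, and the version range node are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy graph: edges[label][source] -> list of targets.
edges = defaultdict(lambda: defaultdict(list))


def add_edge(src, label, dst):
    edges[label][src].append(dst)


def step(nodes, label):
    """Follow one edge label outward from every node, dropping duplicates."""
    seen, out = set(), []
    for node in nodes:
        for target in edges[label][node]:
            if target not in seen:
                seen.add(target)
                out.append(target)
    return out


vuln = ("CVE", "2017-5638")
# Made-up version range node; the real ranges are not shown in this post.
version_range = ("range", "struts2-core", "illustrative")
add_edge(vuln, "has_version_range", version_range)
for version in ("2.5", "2.5.1", "2.5.2"):
    add_edge(version_range, "has_library",
             ("org.apache.struts", "struts2-core", version))

# Anchor at the vulnerability, then take two steps outward.
libraries = step(step([vuln], "has_version_range"), "has_library")
```

Each step takes the current set of nodes and replaces it with the nodes one edge away, which is why steps can be chained left to right in a query.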

To estimate the impact of the vulnerability, we look at both its implicit and explicit dependants -- explicit dependants are declared in the library's pom.xml, while implicit ones are embedded in it in some other way, perhaps by copy-pasting or JAR shading.

sgl> vulnerability(type: "CVE", identity: "2017-5638") has_version_range has_library union(embedded_in*, dependent_on*)

  library(language:'java', coord1:'org.apache.struts', coord2:'struts2-rest-showcase', version:'2.5')
  library(language:'java', coord1:'org.apache.struts', coord2:'struts2-gxp-plugin', version:'2.5.1')
  library(language:'java', coord1:'org.apache.struts', coord2:'struts2-portlet-plugin', version:'2.5.1')
  library(language:'java', coord1:'org.apache.struts', coord2:'struts2-sitemesh-plugin', version:'2.5.1')
  library(language:'java', coord1:'com.github.a-pz', coord2:'struts2-thymeleaf3-plugin', version:'1.0.0-RELEASE')
  library(language:'java', coord1:'com.github.subchen', coord2:'jetbrick-template', version:'1.2.13')
  <elided>

  35 items after 0.5s

The * indicates that the step should run recursively and find all transitive dependants. For this example, we find only Struts libraries -- this makes sense as Struts Core is only used in applications, which aren't typically found on Maven Central.
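
One way to picture the recursive step is as a transitive closure computed separately for each edge type, with the results merged. The Python sketch below works under that assumption and also assumes the starting library itself is excluded; neither detail is confirmed SGL semantics. The edge tables are toy data and shaded-app is a made-up name.

```python
from collections import deque


def closure(starts, neighbours):
    """All nodes reachable from `starts` in one or more steps (breadth-first)."""
    seen = set()
    queue = deque(starts)
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Toy edges pointing from a library to the libraries that depend on or embed it.
dependent_on = {
    "struts2-core": ["struts2-rest-plugin"],
    "struts2-rest-plugin": ["struts2-rest-showcase"],
}
embedded_in = {
    "struts2-core": ["shaded-app"],  # hypothetical artifact bundling a copy
}

start = ["struts2-core"]
# Union of the two closures, mirroring union(embedded_in*, dependent_on*).
dependants = closure(start, embedded_in) | closure(start, dependent_on)
```

The size of the resulting set is the number the post uses as a rough proxy for impact.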

Let's compare this to what we get with the jackson-databind vulnerability.

sgl> vulnerability(type: "CVE", identity: "2017-7525") has_version_range has_library union(embedded_in*, dependent_on*)

  library(language:'java', coord1:'org.rapidoid', coord2:'rapidoid-x-demo', version:'4.0.3')
  library(language:'java', coord1:'com.fasterxml.jackson.datatype', coord2:'jackson-datatype-joda', version:'2.7.0-rc1')
  library(language:'java', coord1:'com.paxovision', coord2:'paxo-reporter', version:'1.0.11')
  library(language:'java', coord1:'org.springframework', coord2:'spring-web', version:'4.3.0.RELEASE')
  library(language:'java', coord1:'com.fasterxml.jackson.module', coord2:'jackson-module-jaxb-annotations', version:'2.8.0.rc1')
  library(language:'java', coord1:'com.truward.brikar', coord2:'brikar-common', version:'1.5.25')
  library(language:'java', coord1:'org.springframework', coord2:'spring-orm', version:'4.3.0.RELEASE')
  <elided>

  10566 items after 6.8s

Lots more results, as Jackson is a fairly ubiquitous dependency. The size of the results may be seen as a rough indicator of vulnerability impact.

sgl> vulnerability(type: "CVE", identity: "2017-9805") has_version_range has_library union(dependent_on*, embedded_in*) limit 5

  library(language:'java', coord1:'org.apache.struts', coord2:'struts2-rest-showcase', version:'2.5')
  library(language:'java', coord1:'org.apache.struts', coord2:'struts2-rest-showcase', version:'2.5.1')
  library(language:'java', coord1:'org.apache.struts', coord2:'struts2-rest-showcase', version:'2.5.2')
  library(language:'java', coord1:'org.apache.struts', coord2:'struts2-rest-showcase', version:'2.3.20')
  library(language:'java', coord1:'org.apache.struts', coord2:'struts2-rest-showcase', version:'2.3.24')

  5 items after 0.2s

Running the same query with the final vulnerability reveals the Struts Showcase application.

Another thing we could try is to look at libraries' methods. The dataset that SGL operates over contains call graphs of all libraries; we hope to someday build a global call graph of the entire open source ecosystem, and have started with a small subset of Maven Central.

sgl> vulnerability(identity: "2017-9805") has_version_range has_library has_method count

  3345

  1 items after 0.5s

sgl> vulnerability(identity: "2017-9805") has_version_range has_library has_method limit 5

  method(module_name:'null', class_name:'org/apache/struts2/rest/RestActionMapper', method_name:'isGet', descriptor:'(Ljavax/servlet/http/HttpServletRequest;)')
  method(module_name:'null', class_name:'org/apache/struts2/rest/RestActionMapper', method_name:'isPut', descriptor:'(Ljavax/servlet/http/HttpServletRequest;)')
  method(module_name:'null', class_name:'org/apache/struts2/rest/RestActionMapper', method_name:'<init>', descriptor:'()')
  method(module_name:'null', class_name:'org/apache/struts2/rest/RestActionMapper', method_name:'isPost', descriptor:'(Ljavax/servlet/http/HttpServletRequest;)')
  method(module_name:'null', class_name:'org/apache/struts2/rest/RestActionMapper', method_name:'<clinit>', descriptor:'()')

  5 items after 0.3s

Going from the Struts libraries to their methods yields 3345 results per library on average.

To explore the dataset a bit more, we can also find callers of methods of the REST plugin. This will pick up all callers across the ecosystem, not just those which declare Struts as a dependency in their pom.xml.

sgl> vulnerability(identity: "2017-9805") has_version_range has_library has_method called_by* limit 5

  method(module_name:'null', class_name:'org/apache/struts2/rest/RestActionMapper', method_name:'getMapping', descriptor:'(Ljavax/servlet/http/HttpServletRequest;Lcom/opensymphony/xwork2/config/ConfigurationManager;)')
  method(module_name:'null', class_name:'org/apache/struts2/rest/RestActionMapper', method_name:'isPut', descriptor:'(Ljavax/servlet/http/HttpServletRequest;)')
  method(module_name:'null', class_name:'org/apache/struts2/rest/RestActionSupport', method_name:'options', descriptor:'()')
  method(module_name:'null', class_name:'org/apache/struts2/rest/RestActionInvocation', method_name:'processResult', descriptor:'()')
  method(module_name:'null', class_name:'org/apache/struts2/rest/ContentTypeHandlerManager', method_name:'handleResult', descriptor:'(Lcom/opensymphony/xwork2/config/entities/ActionConfig;Ljava/lang/Object;Ljava/lang/Object;)')

  5 items after 0.3s

This returns methods from other Struts projects.
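
Conceptually, finding callers means walking the call graph backwards. The Python sketch below is a toy model under stated assumptions: it inverts a caller-to-callee table to get called-by edges, walks them lazily, and truncates the stream the way a limit would. The method names come from the output above, but the edges between them and the ExampleApp methods are invented for illustration.

```python
from collections import defaultdict, deque
from itertools import islice


def invert(calls):
    """Turn caller -> callees into callee -> callers."""
    called_by = defaultdict(set)
    for caller, callees in calls.items():
        for callee in callees:
            called_by[callee].add(caller)
    return called_by


def transitive_callers(method, called_by):
    """Lazily yield every direct or indirect caller of `method`."""
    seen = {method}
    queue = deque([method])
    while queue:
        current = queue.popleft()
        # Sorted only to make the toy output deterministic.
        for caller in sorted(called_by.get(current, ())):
            if caller not in seen:
                seen.add(caller)
                yield caller
                queue.append(caller)


# Invented call edges; ExampleApp is a hypothetical third-party application.
calls = {
    "RestActionInvocation#processResult": ["ContentTypeHandlerManager#handleResult"],
    "ExampleApp#serve": ["RestActionInvocation#processResult"],
    "ExampleApp#main": ["ExampleApp#serve"],
}

called_by = invert(calls)
target = "ContentTypeHandlerManager#handleResult"
all_callers = list(transitive_callers(target, called_by))
first_two = list(islice(transitive_callers(target, called_by), 2))
```

Because the walk follows call edges rather than declared dependencies, a caller such as the hypothetical ExampleApp would be found whether or not it lists Struts in its pom.xml.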

Going back to the particular Struts vulnerability we began with, we could use SGL to identify other vulnerabilities with the same underlying cause. The vulnerable method that our vulnerability artifact is tagged with, XStreamHandler#toObject, does not sanitize data correctly, and could be said to be the cause of the vulnerability. However, it in turn calls XStream#fromXML, which is responsible for the actual deserialization. In other words, XStream#fromXML is the sink into which unsanitized data must go to cause harm.

Call graphs can give us an overapproximation of which sources feed this particular sink.

let struts = library(coord1:'org.apache.struts', coord2:'struts2-rest-plugin') in
let xstream = method(class_name:regex `.*\/XStream`, method_name: regex `fromXML`) in
struts has_method where(calls* xstream)

This SGL program describes all potential sources in the REST plugin that transitively call XStream#fromXML. Bounding the search to sources up to 2 calls away, as in the query below, takes about 4 minutes and returns 2 results.

sgl> struts has_method where(calls*(2) xstream)

  method(module_name:'null', class_name:'org/apache/struts2/rest/handler/XStreamHandler', method_name:'toObject', descriptor:'(Lcom/opensymphony/xwork2/ActionInvocation;Ljava/io/Reader;Ljava/lang/Object;)')
  method(module_name:'null', class_name:'org/apache/struts2/rest/handler/XStreamHandler', method_name:'toObject', descriptor:'(Ljava/io/Reader;Ljava/lang/Object;)')

  2 items after 236.0s

This correctly finds XStreamHandler#toObject, and an overload which may or may not be vulnerable; we'll have to take a closer look. For this query there are only 2 results, but running it periodically would inform us if there were new issues to triage.
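
The filter in this query can be read as a depth-bounded reachability check: keep a method only if some sink is at most 2 calls away. The Python sketch below models that reading on a toy call graph. It assumes the bound counts one to two calls and that the class-name regex must match the whole name; both are guesses about SGL semantics. The call edges and the example/Far methods are invented.

```python
import re


def reaches_within(method, is_sink, calls, max_depth):
    """True if a sink is reachable from `method` in 1..max_depth calls."""
    frontier = {method}
    seen = {method}
    for _ in range(max_depth):
        frontier = {
            callee
            for m in frontier
            for callee in calls.get(m, ())
            if callee not in seen
        }
        if any(is_sink(c) for c in frontier):
            return True
        seen |= frontier
    return False


def is_xstream_from_xml(method):
    class_name, method_name, _descriptor = method
    return bool(re.fullmatch(r".*/XStream", class_name)) and method_name == "fromXML"


HANDLER = "org/apache/struts2/rest/handler/XStreamHandler"
to_object_3 = (HANDLER, "toObject",
               "(Lcom/opensymphony/xwork2/ActionInvocation;Ljava/io/Reader;Ljava/lang/Object;)")
to_object_2 = (HANDLER, "toObject", "(Ljava/io/Reader;Ljava/lang/Object;)")
from_xml = ("com/thoughtworks/xstream/XStream", "fromXML", "(Ljava/io/Reader;)")
far_entry = ("example/Far", "entry", "()")  # hypothetical
far_mid = ("example/Far", "mid", "()")      # hypothetical

# Invented edges: entry -> mid -> toObject(3 args) -> toObject(2 args) -> fromXML
calls = {
    far_entry: [far_mid],
    far_mid: [to_object_3],
    to_object_3: [to_object_2],
    to_object_2: [from_xml],
}

candidates = [far_entry, far_mid, to_object_3, to_object_2]
sources = [m for m in candidates if reaches_within(m, is_xstream_from_xml, calls, 2)]
```

Raising the depth bound widens the set of candidate sources at the cost of more false positives to triage, which is the overapproximation trade-off mentioned above.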

SGL is still very much in development, but we hope that it will advance the state of OSS security by providing greater transparency into what goes into code in the wild. If interested, request more information on our upcoming community researcher programme when Security Graph Language launches.

Dr. Asankhaya Sharma is the Director of Software Engineering at Veracode. Asankhaya is a cyber security expert and technology leader with over a decade of experience in creating security products for industry, academia and open-source community. He is passionate about building high performing teams and taking innovative products to market. He is also an Adjunct Professor at the Singapore Institute of Technology.