Jul 27, 2009

Bytecode Analysis Is Not The Same As Binary Analysis

By Chris Wysopal

Gartner analyst Neil MacDonald has written that Byte Code Analysis is not the Same as Binary Analysis. He describes the difference between statically analyzing binary code, which runs on an x86, ARM, or SPARC CPU, and statically analyzing bytecode, which runs on a virtual machine such as the Java VM or the .NET CLR.

As more companies with software security testing technology wade into the "no source available" pool (come on in, guys, the water is nice), it is important to understand what capabilities you need for software assurance when you don't have access to source. If the software you are concerned about is written in a language such as C or C++ and then compiled to an executable binary, as the majority of commercial software is, you will need true binary analysis. The analysis technology provided by Ounce Labs and Fortify Software isn't capable of understanding this native compiled code.

The other situation where you will need binary analysis is when you have access to some of the source but other parts of your software are in binary form. This is common because most C/C++ programs, written by enterprises and software vendors alike, are partially built with compiled libraries that are distributed in binary form. If you are only looking at the source-available subset of the software, you are not covering 100% of the code. You will also need binary analysis, and not just bytecode analysis, if your Java code uses JNI or your .NET assemblies call into unmanaged code.

Even within the set of bytecode analysis techniques available today there are significant differences in technology. At Veracode, we generate our software analysis model directly from the bytecode, with no lossy intermediate step back to source code. Companies that make source code static analysis tools have taken an indirect route to analysis. Their tools first use a bytecode decompiler to create source code from the bytecode, then build an analysis model from that source code. This means that any code generation decisions made by the compiler, which are present in the executing software, will be missing from the model. I would say this isn't really bytecode analysis at all. It is decompiled bytecode source analysis.

Bytecode analysis and binary analysis are important technologies for assuring the integrity of the software supply chain. These techniques are a powerful addition to first-generation static analysis, where source was required. Make sure you are getting the capabilities of true binary analysis and direct bytecode analysis to protect your organization from application security risk.
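
To make the bytecode-versus-binary distinction concrete, here is a minimal Python sketch that guesses which kind of artifact you have from the first few bytes of a file. It is an illustration only and not a description of how any vendor's product works. The function name `classify_artifact` and its return labels are made up for this example. The magic numbers are the publicly documented ones for ELF, Mach-O, PE, and Java class files. Two caveats are assumptions baked into the sketch: a .NET assembly is packaged as a PE file, so the sketch reports only "pe" and does not inspect the CLI header that would tell managed from native code; and Java class files share the `CAFEBABE` magic with Mach-O universal binaries, so the sketch uses the common heuristic that a class file's major version is 45 or higher.

```python
import struct

# Well-known leading "magic" bytes for common executable formats.
ELF_MAGIC = b"\x7fELF"           # native code on Linux and most Unix systems
PE_MAGIC = b"MZ"                 # Windows PE container (native code or .NET)
CAFEBABE = b"\xca\xfe\xba\xbe"   # Java class file OR Mach-O universal binary
MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce", b"\xfe\xed\xfa\xcf",   # 32/64-bit Mach-O, big-endian
    b"\xce\xfa\xed\xfe", b"\xcf\xfa\xed\xfe",   # 32/64-bit Mach-O, little-endian
}


def classify_artifact(header: bytes) -> str:
    """Guess the artifact type from the first 8 bytes of a file (hypothetical helper)."""
    if header.startswith(ELF_MAGIC) or header[:4] in MACHO_MAGICS:
        return "native"
    if header.startswith(CAFEBABE):
        if len(header) < 8:
            return "unknown"
        # Class file: bytes 6-7 hold the major version (45 or higher).
        # Universal binary: bytes 4-7 hold a small architecture count.
        major = struct.unpack(">H", header[6:8])[0]
        return "jvm-bytecode" if major >= 45 else "native"
    if header.startswith(PE_MAGIC):
        # Could be native code or a .NET assembly; this sketch does not
        # parse the PE headers far enough to tell the two apart.
        return "pe"
    return "unknown"


def classify_file(path: str) -> str:
    """Read the first 8 bytes of a file and classify them."""
    with open(path, "rb") as f:
        return classify_artifact(f.read(8))
```

Running `classify_file` over every file in a deployed application is a quick way to see how much of it is native code that a bytecode-only tool would never examine, for example a JNI library shipped next to a set of class files.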



Chris Wysopal, co-founder and CTO of Veracode, is recognized as an expert and a well-known speaker in the information security field. He has given keynotes at computer security events and has testified on Capitol Hill on the subjects of government computer security and how vulnerabilities are discovered in software. His opinions on Internet security are highly sought after and most major print and media outlets have featured stories on Mr. Wysopal and his work. At Veracode, Mr. Wysopal is responsible for the security analysis capabilities of Veracode technology.