by Tyler Shields
Welcome back to the series on anti-debugging. Hopefully you have your debugger and development environment handy as we are about to dive into the first round of anti-debugging code. In the first post to this series we discussed six different types of anti-debugging techniques that are in common use today. To refresh, the classification buckets that we chose to use are:
- API Based Anti-Debugging
- Exception Based Anti-Debugging
- Process and Thread Block Anti-Debugging
- Modified Code Anti-Debugging
- Hardware and Register Based Anti-Debugging
- Timing and Latency Anti-Debugging
Basic API Anti-Debugging
We’ll continue this series of posts by going into a bit more depth on the easiest of the API based anti-debugging techniques. An application programming interface (API) is used to support requests made from other applications for resources or functionality within a target service or library. In our case we will be primarily focused on the Microsoft Windows operating system API. There are a number of calls built directly into the operating system API that make detection of a debugger possible. Minor differences in thread and process meta-data are present when processes are run within a debugger. These calls typically facilitate a process or thread examination technique in order to determine if the target thread has a debugger attached.
When learning about anti-debugging, a developer will typically first be introduced to the IsDebuggerPresent() function. This function analyzes the process block of a target process to determine if the process is running under the context of a debugging session. We’ll save the details of how this actually works for a later article; suffice it to say that the target process has a flag that will contain a non-zero value if the process is being debugged. This flag is queried and returned when IsDebuggerPresent() is called. A very basic debugging detection routine would be to call this function and execute different code paths based on the response.
Prototype: BOOL WINAPI IsDebuggerPresent(void);
if (IsDebuggerPresent()) {
//Debugger Detected - Do Something Here
} else {
//No Debugger Detected - Continue
}
We could also use the API function CheckRemoteDebuggerPresent(). Contrary to first thought, this function does not target a process on a remote machine, nor does it even require that it target a process remote to itself. The call can use a parameter pointing to itself to determine if it is running inside of a debugger. In the example below we pass in a handle to our current process by calling the GetCurrentProcess() function along with a variable to hold the return value from the CheckRemoteDebuggerPresent() call.
Prototype: BOOL WINAPI CheckRemoteDebuggerPresent(__in HANDLE hProcess,
__inout PBOOL pbDebuggerPresent);
BOOL pbIsPresent = FALSE;
CheckRemoteDebuggerPresent(GetCurrentProcess(), &pbIsPresent);
if (pbIsPresent) {
//Debugger Detected - Do Something Here
} else {
//No Debugger Detected - Continue
}
While these two methods are probably the easiest and most straightforward methods of anti-debugging, they are also the most likely to be understood by a person wishing to bypass them. We can mix it up a bit and use a call to OutputDebugString() instead. OutputDebugString() is typically used to output a string value to the debugging data stream. This string is then displayed in the debugger. Due to this fact, the function OutputDebugString() acts differently based on the existence of a debugger on the running process. If a debugger is attached to the process, the function will execute normally and no error state will be registered; however, if there is no debugger attached, the call fails and LastError is set, letting us know that we are debugger free. To execute this method we set LastError to an arbitrary value of our choosing and then call OutputDebugString(). We then check GetLastError(): if our value is still there, the call succeeded and a debugger is attached; if it has been overwritten, we are debugger free.
Prototype: void WINAPI OutputDebugString(__in_opt LPCTSTR lpOutputString);
DWORD Val = 123;
SetLastError(Val);
OutputDebugString(TEXT("random")); // TEXT() matches LPCTSTR in both ANSI and Unicode builds
if(GetLastError() == Val) {
//Debugger Detected - Do Something Here
} else {
//No Debugger Detected - Continue
}
These three methods are the basic starting point for a developer wishing to implement anti-debugging into their code base. The methods are so simple they could even be implemented as macros making a call quick and easy. Numerous other API based detection methods exist with a vast array of complexity. In the next post in this series we will discuss slightly more advanced API anti-debugging techniques that will make reverse engineering and debugging even more difficult.
by Tyler Shields
For those that don’t know, anti-debugging is the implementation of one or more techniques within computer code that hinders attempts at reverse engineering or debugging a target process. Typically this is achieved by detecting minute differences in memory, operating system, process information, latency, etc. that occur when a process is started in or attached to by a debugger compared to when it is not. Most research into anti-debugging has been conducted from the vantage point of a reverse engineer attempting to bypass the techniques that have been implemented. Limited data has been presented that demonstrates anti-debugging methods in a high level language that the average developer can understand. It is with this in mind that I hope to begin a series of posts that present some of the methods of anti-debugging in a clear, concise, and well documented fashion. The end goal of this series is to arm developers with the techniques and knowledge that will allow them to add a layer of protection to their software while simultaneously educating reverse engineers in some of the anti-debugging methods used by malware authors today.
Before we delve into the intricacies of individual methods of anti-debugging let’s use this post to define the classes of anti-debugging that we will be discussing. While other classes may exist, the definition of these classes is an attempt to include the majority of anti-debugging methods in use today. There is some overlap between classifications and we may have left out some methods due to limited exposure or effectiveness.
API Based Anti-Debugging
API based anti-debugging is the most straightforward and possibly the easiest to understand for a typical developer. Using both documented and undocumented API calls, these methods query process and system information to determine the existence or operation of a debugger. They range from single line calls such as IsDebuggerPresent() and CheckRemoteDebuggerPresent() to slightly more complex methods including debugger detaching and CloseHandle() checks. These methods are generally trivial to add to an existing code base and many can even be implemented in as few as two or three lines.
Exception Based Anti-Debugging
Exception based anti-debugging is slightly different from your basic API based techniques. Many times when a debugger is attached to a process, exceptions are trapped and handled by the debugger without regard to passing the exception back to the application for continued execution. Occasionally these exceptions can even crash or terminate a process when run under a debugger and be handled gracefully when running clean. It is these discrepancies that make exception based anti-debugging techniques possible.
Process and Thread Block Anti-Debugging
Some of the API based anti-debugging methods use published functions to query information from within the process and thread blocks for our running code. Many API based detections can be subverted within a debugger by hooking the API call and returning values that indicate a clean process. One way around this subversion is to directly query the process and thread blocks, bypassing the API calls. Direct analysis of the process and thread blocks, while more complex, can lead to a more accurate and high assurance result.
Modified Code Anti-Debugging
One of the methods that a debugger uses to signal a breakpoint is to insert a break byte into the running code at the location that it wishes to stop execution. The process execution breaks when this value is seen, giving control to the debugger. When the program is resumed, the breakpoint value is removed and replaced with the original byte, the execution backed up one byte, and the program is resumed. Detection of software based breakpoints can be achieved by analyzing the process for modifications from the expected norm.
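As a rough illustration of "analyzing the process for modifications", the sketch below checksums a range of bytes at a known-good point and compares it again later, and also counts occurrences of the x86 INT 3 opcode (0xCC). The helper names are hypothetical, and an ordinary byte buffer stands in for real code memory; this is a minimal sketch under those assumptions rather than a hardened check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* x86 opcode of the one-byte INT 3 instruction used for software breakpoints. */
#define INT3_OPCODE 0xCC

/* Hypothetical helper: rotate-and-XOR checksum over a range of bytes. */
uint32_t region_checksum(const unsigned char *start, size_t len)
{
    uint32_t sum = 0;
    assert(start != NULL);
    for (size_t i = 0; i < len; i++) {
        sum = (sum << 1) | (sum >> 31);  /* rotate left by one bit */
        sum ^= start[i];                 /* mix in the next byte */
    }
    return sum;
}

/* Returns 1 when the region no longer matches the checksum recorded earlier. */
int region_modified(const unsigned char *start, size_t len, uint32_t expected)
{
    return region_checksum(start, len) != expected;
}

/* Counts bytes equal to INT3_OPCODE in the region. */
size_t count_int3(const unsigned char *start, size_t len)
{
    size_t hits = 0;
    assert(start != NULL);
    for (size_t i = 0; i < len; i++) {
        if (start[i] == INT3_OPCODE)
            hits++;
    }
    return hits;
}
```

The checksum comparison is usually the safer of the two, since 0xCC can legitimately appear as operand data or padding inside unmodified code, so a raw scan may report false positives.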
Hardware and Register Based Anti-Debugging
A second way that a debugger can break the execution of a process is by using a hardware breakpoint. A hardware breakpoint relies upon CPU registers to store the pertinent information and to detect when the target break addresses are seen on the bus. A break interrupt is triggered at the appropriate time based on these register values. Reading or modifying the hardware can allow for the detection of a debugger.
Timing and Latency Anti-Debugging
Finally timing and latency can be used as an effective anti-debugging method. When executing a program within a debugger, specifically when single stepping, a much larger latency occurs between execution of instructions. This latency can be detected and compared against a reasonable threshold to detect the existence of a debugger attached to our process.
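The threshold comparison can be sketched in portable C as follows. The function names and the choice of the C11 timespec_get() wall clock are assumptions made for illustration; Windows code would more likely read a tick counter or the CPU timestamp counter, and the threshold itself has to be tuned per machine.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Milliseconds elapsed between two wall-clock samples. */
double elapsed_ms(const struct timespec *start, const struct timespec *end)
{
    assert(start != NULL && end != NULL);
    return (double)(end->tv_sec - start->tv_sec) * 1000.0
         + (double)(end->tv_nsec - start->tv_nsec) / 1.0e6;
}

/* Returns 1 when the measured latency is above the chosen threshold. */
int latency_exceeded(const struct timespec *start, const struct timespec *end,
                     double threshold_ms)
{
    return elapsed_ms(start, end) > threshold_ms;
}

/* Hypothetical usage: time a short piece of work that should finish almost
 * instantly when nobody is single stepping through it. */
int timed_section_exceeds(double threshold_ms)
{
    struct timespec start, end;
    volatile unsigned long sink = 0;

    timespec_get(&start, TIME_UTC);
    for (unsigned long i = 0; i < 1000; i++)
        sink += i;                       /* stand-in for the protected code */
    timespec_get(&end, TIME_UTC);

    return latency_exceeded(&start, &end, threshold_ms);
}
```

A wall clock is used here on purpose: a CPU-time clock may not advance while the process sits stopped in a debugger, which would hide exactly the delay we are trying to measure.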
Each of the classes of anti-debugging outlined above has merit when used individually to protect a process. While none of them is guaranteed to protect a program from a determined reverse engineer or debugger, implementing one of these techniques (or several of them if appropriate) can sufficiently slow down the debugging process and hopefully make the attacker spend his time on other, easier ventures. In the remainder of this series on anti-debugging we will review in depth some of the more interesting methods of each of the above classes. So bring along your debugger and your development environment and let the games begin.
by Chris Eng
Another BlackHat has come and gone. As usual, it was a very busy week juggling customer meetings, recruiting, conference planning, vendor parties, and, oh yes, the actual BlackHat presentations. I had a fantastic time catching up with old friends and finally getting the opportunity to meet more of the Security Twits and others in the security community. I didn’t submit a talk this year, but nevertheless, fake Dan Kaminsky was still excited to see me.

My favorite talk, as expected, was the Sotirov/Dowd talk on How To Impress Girls With Browser Memory Protection Bypasses. The attack is a conceptually simple, yet completely reliable technique for exploiting vulnerabilities in web browsers. Of course, the media has sensationalized the impact of their findings, but ultimately, this is still significant as far as browser-based exploits are concerned (here is a more accurate report). It’s worth mentioning that part of the technique allowing them to load a .NET DLL at an arbitrary location under Vista was reliant on an implementation bug wherein the OS disables ASLR if the version in the .NET COR header was below a certain value. However, the address space spraying and stack spraying techniques are likely to be extended to other platforms utilizing similar memory protection mechanisms.
As for the girls? I can report first-hand that the ladies at TAO on Wednesday night were hanging on Alex’s every word. They were particularly impressed when he whipped out the laptop for a live demo. Unfortunately, none of the dozen iPhone owners in the immediate vicinity thought to snap a picture (too busy Twittering). Oh well.
I also enjoyed Hovav Shacham’s talk on return-oriented programming. Simply put, he described a generalization of the return-to-libc shellcode approach with the intent to demonstrate that one could achieve Turing-complete computation using “found code” in process images. By chaining together series of mini-computations ending in return (RET) instructions, it was possible to build higher-level programming constructs such as branches and loops. The nature of the x86 instruction set provides some flexibility because instructions are interpreted differently depending on how you align the instruction pointer (i.e. the old shellcode trick of searching the process image for any JMP EBX instruction and using that as your EIP). In RISC architectures such as SPARC, however, you don’t have that luxury; if your %pc isn’t aligned properly you get a bus error. So it was quite interesting to see that they were able to extend the concept to RISC. The practicality of the attack technique is limited by the fact that the shellcode is tuned to a particular binary image — if the shellcode was built using instructions extrapolated from glibc 2.3.5, it won’t work for a system running glibc 2.4.
I thought Scott Stender’s talk on Concurrency Attacks in Web Applications was interesting as well. In a nutshell, spewing thousands of simultaneous requests at web application transactions that are not thread-safe can create interesting problems. In the presentation, Scott ran his demo against a VM running on the attack machine. I found myself wondering how effective the same attack would be over the Internet — would it be significantly less reliable (or not at all)? Race conditions are generally easier to exploit locally than remotely due to more predictable execution conditions. Certainly this is an under-tested vulnerability class though.
One presentation I wasn’t able to attend but want to follow up on is Nate McFeters, John Heasman, and Rob Carter’s talk which discussed the GIFAR attack I’ve been hearing so much about lately. The gist is that you can create a file that is both a valid GIF and a valid JAR, then use some Java applet tricks to initiate HTTP requests on behalf of the victim.
Finally, the Pwnie Awards didn’t disappoint. Drama ensued over the Most Overhyped award, but at least this year some of the winners showed up to claim their awards! Halvar rapping Symantec lyrics was also quite memorable.
All in all, a fun and informative week, but as usual, I was relieved to get the hell out of Vegas and head home on Friday morning.
P.S. For a much more entertaining BlackHat/Defcon Recap, read Jennifer Jabbusch’s account of the week’s events. It’s my favorite one so far!
by Chris Eng
I spent the weekend in Berlin attending a conference called PH-Neutral, run primarily by the Phenoelit crew. This was the first European security conference I’ve attended and I found it quite different from any North American security gathering I’ve been to, such as BlackHat, CanSecWest, SOURCE Boston, BlueHat, or RSA. Everything was far more casual and laid back, which is something I had heard about European conferences but hadn’t experienced until now (even EUSecWest is held in a club whereas CanSecWest is in a Marriott).

The event was held at Die Insel, on a tiny island a few kilometers outside of Berlin’s city center, near Treptower Park. The venue is mostly used for live music so basically it feels like a dark, somewhat dingy club (certainly the bathrooms are reminiscent of a club). The presentations were on the 3rd floor in a room that probably held about 60 people in close quarters; to handle overflow, a closed-circuit feed was being simulcast on the 4th floor, which was a bit less crowded and, more importantly, opened out onto a rooftop deck which meant better ventilation. The bottom floor led out to a Biergarten with tables, beach chairs, and a stage which was used for DJing. The layout was actually pretty efficient for allowing around 200 people to mill about and socialize/network while not having to stray too far from where the talks were presented.

As far as the event itself, when I said “laid back” earlier, don’t interpret that to mean disorganized or watered down in any way. It was run with stereotypical German efficiency, from badging to presentations to the after-hours parties. The presentations were just as technical and relevant as any of the more “corporate” conferences. Unfortunately for me, I don’t know that many people in European security circles, and most of the ones I do know weren’t in attendance. Those I did meet, however, were impressively smart and well-versed. Nobody was trying to conduct business transactions or slip away for meetings, which is inevitably what happens when only technical folks are present!

For me, a few talks stood out. Fukami and BeF’s talk on SWF and the Malware Tragedy discussed methods for automated static detection of malware in Flash movies. Much of it centered on heuristics related to inconsistencies in the file format or tag structure, abnormal concentrations of strings in the constant pool, or the existence of various obfuscation techniques. Ultimately, there are false positive issues to be addressed but that is just a fact of life with static analysis, and it will be an iterative process to refine those heuristics as the attack vectors evolve. I thought this talk was particularly timely given the increasing prevalence of Flash as a conduit for exploits/malware, such as the most recent Flash 0day that made the news (granted, this was an exploit against Flash itself, not just using Flash as a delivery mechanism, but close enough).
I also enjoyed pierre’s talk on counterintelligence, basically a mélange of wiretapping and other bugging devices discovered in the wild. War stories are always interesting, particularly when it comes to the realm of physical security. One of the x-ray images he showed of a bugged pen was identical to a pen that I own (minus the bugging device of course… I hope). The feel of the talk reminded me a bit of James Atkinson’s talk at SOURCE, “Telephone Defenses Against the Dark Arts” (video: Part 1 and Part 2), which also got rave reviews.
Mike Eddington’s presentation on the Peach 2 fuzzing framework was also quite interesting. Peach 2 was released several months back but I haven’t really been paying much attention to it or any other fuzzing tool for some time. In fact the last time I really had to implement a protocol fuzzer, I was using SPIKE 2.9, so that gives you some indication of how long it’s been. Peach 2 includes some powerful built-in capabilities such as node relationships (e.g. field 1 represents the length of field 2; field 10 is a CRC-32 of fields 1 through 9), data transforms (those with battle scars from ASN.1 will be happy), state machines (packets 1 and 2 have to be normal in order to fuzz packet 3), monitoring agents (detecting when a crash happens and under what conditions), and much more. I am itching to go fuzz something now just so I can tinker with Peach.
All in all, it was a good trip and I enjoyed the opportunity to see how things are done across the pond, and to do a little sightseeing in a historic and beautiful city.
by Chris Eng
Yesterday, Dave Lewis over at LiquidMatrix Security Digest cried foul at Core Security for releasing too much detail about a recent DoS vulnerability they had discovered. His specific gripe was that they provided an IDA Pro excerpt that showed where the vulnerability was triggered. The excerpt is short, so I’ll even copy/paste it here:
.text:00405C1B mov esi, [ebp+dwLen] ; Our value from packet
...
.text:00405C20 push edi
.text:00405C21 test esi, esi ; Check value != 0
...
.text:00405C31 push esi ; Alloc with our length
.text:00405C32 mov [ebp+var_4], 0
.text:00405C39 call operator new(uint); Big values return NULL
.text:00405C3E mov ecx, esi ; Memcpy with our length
.text:00405C40 mov esi, [ebp+pDestionationAddr]
.text:00405C43 mov [ebx+4], eax ; new result is used as dest
.text:00405C46 mov edi, eax ; address without checks.
.text:00405C48 mov eax, ecx
.text:00405C4A add esp, 4
.text:00405C4D shr ecx, 2
.text:00405C50 rep movsd ; AV due to invalid
.text:00405C52 mov ecx, eax ; destination pointer.
.text:00405C54 and ecx, 3
Dave asserts that publishing 16 commented assembly instructions makes this disclosure irresponsible. But look at the code — it’s completely generic, just a textbook example of what it looks like when you forget to check a return value after calling operator new. Sure, Core gives you the exact offsets into the executable, but so what? If I have the binary, then it’s not going to be too hard to find the vulnerability anyway. It’s not like Core is giving away a proof-of-concept exploit that generates the malformed registration packet required to trigger the DoS. What’s more, they provide a detailed timeline going back to January 30th of this year describing exactly how the disclosure process with the vendor transpired. This looks extremely responsible to me; I just can’t understand what is “not cool” here.
There’s another interesting angle to this, completely unrelated to Core’s disclosure process. The vulnerability itself is described in the advisory as follows:
Un-authenticated client programs connecting to the service can send a malformed packet that causes a memory allocation operation (a call to new() operator) to fail returning a NULL pointer. Due to a lack of error-checking for the result of the memory allocation operation, the program later tries to use the pointer as a destination for memory copy operation, triggering an access violation error and terminating the service.
This may bring to mind some recent discussions on whether callers of memory allocation functions should check the return value prior to use. To summarize, one camp says “caller should check”, the other camp says “callee should exit on allocation failure.” This is a gross oversimplification and if you want more detailed arguments, read the other blog posts that I linked to. In this case, if the “exit on failure” approach were taken, the DoS scenario might still happen, whereas if the caller were checking, the error could be handled more gracefully. More fuel for the debate!
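For readers who want to see the "caller should check" side in code, here is a minimal C sketch. It uses malloc() as a stand-in for the C++ operator new in the advisory, and the function name, the demo routine, and the MAX_PAYLOAD_LEN bound are all hypothetical; none of this is taken from the affected product.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical upper bound on a packet payload; the value is illustrative. */
#define MAX_PAYLOAD_LEN (64u * 1024u)

/* Returns a heap copy of the payload, or NULL when the length is zero,
 * exceeds the bound, or the allocation fails. */
unsigned char *copy_payload(const unsigned char *src, size_t len)
{
    unsigned char *dst;

    if (src == NULL || len == 0 || len > MAX_PAYLOAD_LEN)
        return NULL;            /* refuse attacker-controlled sizes early */

    dst = malloc(len);
    if (dst == NULL)
        return NULL;            /* report the failure instead of copying */

    memcpy(dst, src, len);      /* only reached with a valid destination */
    return dst;
}

/* Usage sketch: an absurd length is refused rather than dereferenced. */
void copy_payload_demo(void)
{
    const unsigned char pkt[] = { 0x01, 0x02, 0x03, 0x04 };
    unsigned char *copy = copy_payload(pkt, sizeof pkt);

    if (copy != NULL)
        free(copy);
    assert(copy_payload(pkt, (size_t)-1) == NULL);
}
```

The difference from the disassembly above is simply that the result of the allocation is tested before it is used as the destination of the copy, so a malformed length produces an error return that the service can handle instead of an access violation.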
by Chris Eng
Finally getting around to posting our materials from the talk that Chris Wysopal and I gave at BlackHat this year entitled “Static Detection of Application Backdoors.” Here are the slide deck and the accompanying whitepaper:
Also, as a proof-of-concept, we had demonstrated using IDA Pro’s scripting framework to detect one of the backdoor examples that we discussed — suspicious cryptographic API calls. Specifically, it flags calls to known encryption, decryption, and/or key management functions where a constant value is passed to a specific argument position. This can help identify situations such as an application encrypting data with a hard-coded key. We had numerous requests to post the code, so here it is:
Veracode’s binary analysis technology uses similar (but more sophisticated) techniques. We build our own intermediate representation of the binary’s data flows, control flows, and range propagation which is not based on IDA Pro. We then scan that representation for backdoors in ways similar to the cryptoconst script. However, at BlackHat you’re not allowed to promote your own products/services, so it wasn’t appropriate for us to use it for demonstration purposes.
by Chris Wysopal
There has been some talk in the press lately about backdoors due to the recent court case where it was disclosed that federal agents planted a keystroke logger on a suspect’s computer using a trojan program. Many of the articles don’t report on the court case but raise the question as Declan McCullagh titles his article, “Will security firms detect police spyware?”
You can see the security cat and mouse game playing out between the police and suspected criminals although the roles here are reversed. The criminals are trying to secure their communications and the police are trying to break it to collect evidence. At first the police could just tap the data as it moved from sender to receiver. In response criminals started using strong end to end encryption. To get around that the police compromised the endpoint system with a backdoor to get at the clear text of the messages or passwords to log into encrypted servers. The criminal’s obvious response to that is to secure his system and try and detect any backdoor code.
Real detection of backdoor code is a challenging computer security problem to solve. This article will discuss different methods of detection and the different classes of backdoors.
Spyware, trojan keystroke loggers, or remote access trojans are what I call system backdoors, since they compromise the integrity of the whole computer system. These are typically in programs that you don’t want on your system. They are often installed via social engineering, a vulnerability, or a combination of both.
There is a different, more subtle and insidious backdoor which I call an application backdoor. This is backdoor code that is planted in an application that you do want on your system such as Wordpress, Borland Interbase, or tcpdump. All of those programs had versions that contained backdoors at one point. An application backdoor allows the attacker to bypass the designed authentication or authorization functions of the application and access its data and transactions. Sometimes an application backdoor is also a system backdoor if it allows functionality such as system commands and the application is running in an environment where the OS doesn’t prevent this.
For both system and application backdoors, once you know the program contains a backdoor you can make a signature for the program and detect instances of the program by computing a signature and looking it up in a signature database. This is how many traditional AV products find backdoor programs. The backdoor is typically detected by someone noticing a program that wasn’t supposed to be running on their computer. Then after inspection they also realize it is performing behavior on their system that they don’t want and report it to an AV vendor.
This approach doesn’t work well for custom or low population backdoors such as those used by sophisticated attackers or the police. It also doesn’t work well for application backdoors which are typically planted in the source code of the application or in the binary at the legitimate distribution site. Remember that application backdoors are in programs that are supposed to be on the system so they don’t stand out. We need better ways to detect backdoors than signatures.
Another way to detect backdoors is by looking for backdoor behavior on the system. Is a program listening on the network that isn’t supposed to? Is a program hooking into system level calls to gather keystrokes? Is a program hooking into system level calls to hide its behavior from the system log and administration tools? This is a big step up from signatures because it can detect a backdoor that hasn’t been seen before.
There are limitations to the behavior detection approach. If a backdoor program has done a good job of hiding its behavior from the system you can’t detect it. If a program is supposed to be listening on the network, detecting that doesn’t help. Some behavior may occur on a timer so the behavior detection would have to catch it in the act, perhaps within very small intervals of time. Behavior detection also does a poor job with application backdoors. How will it find a special password that bypasses normal authentication or special processing when a certain client IP address is used?
A better solution to finding backdoors that overcomes many of the behavior approach’s limitations is binary static analysis. Binary static analysis can look at all of the functionality of the program without having to wait for it to get into a particular state which is the bane of dynamic (behavioral) analysis.
Binary static analysis is adept at seeing “baked in” static passwords, keys, or IP addresses. It can detect if there are encrypted or self-modifying blocks of code which may mean someone is hiding something. It can detect functionality such as listening on the network or shimming system calls. It can also do this from a trusted clean computer whereas the behavioral analysis needs to run on the same computer on which the program being analyzed is executing.
Veracode’s binary analysis technology has scans that look for backdoors in executables. Chris Eng and I are giving a presentation titled, “Static Detection of Application Backdoors” at Black Hat in Las Vegas on August 2. We will cover detection of 4 classes of application backdoors that have:
- Special credentials
- Hidden functionality
- Unintended network activity
- Manipulation of security critical parameters
We will be giving a demo of proof of concept detection using IDA Pro and even screening a famous clip from a movie concerning backdoors. I’ll give you a hint, “Backdoors are not secrets!”
by Christien Rioux
Type safety is a feature of numerous modern programming languages. C++ is not strict about type safety, and as a result, vulnerabilities may appear in programs in unexpected ways. Here’s an example I recently discovered.
Consider this structure:
typedef struct _NOTIFYICONDATAA {
DWORD cbSize;
HWND hWnd;
UINT uID;
UINT uFlags;
UINT uCallbackMessage;
HICON hIcon;
#if (_WIN32_IE < 0x0500)
CHAR szTip[64];
#else
CHAR szTip[128];
#endif
#if (_WIN32_IE >= 0x0500)
DWORD dwState;
DWORD dwStateMask;
CHAR szInfo[256];
union {
UINT uTimeout;
UINT uVersion;
} DUMMYUNIONNAME;
CHAR szInfoTitle[64];
DWORD dwInfoFlags;
#endif
#if (_WIN32_IE >= 0x600)
GUID guidItem;
#endif
} NOTIFYICONDATAA, *PNOTIFYICONDATAA;
Note all the _WIN32_IE preprocessor macros. The problem scenario is this: in one file, you create a NOTIFYICONDATAA structure, fill in cbSize with sizeof(NOTIFYICONDATAA), and pass it to another routine in another translation unit/source file. In one source file, you have _WIN32_IE undefined, so it defaults to the latest version for your Platform SDK. In the other source file you #define _WIN32_IE to 0x0400 for backward compatibility purposes. Note that this creates a discrepancy in the structural layout, but if you only use the IE4 features of that structure, then you might think you’re okay.
This is all too common. We’ve analyzed a lot of binaries here over at Veracode and found that it is relatively common to have two versions of the same structure compiled with different #define settings and they end up with different lengths, easily leading to overflow conditions that are non-obvious to the casual analysis. In fact, if you only look at one source file at a time, you’ll always miss this case, because you need to look interprocedurally between the translation units to note that the types are different between the point of allocation and the point of use.
This underscores a fundamental issue of concern with C++: a type defined in one translation unit is considered equivalent to another type of the same name in another translation unit as long as their names are the same, regardless of their layout. The linker cares when comparing decorated/mangled names, but nothing else does. This is usually mitigated by declaring your classes/structures in a header file that is uniformly #included everywhere. However, in the scenario outlined above, it is relatively easy to have things go wrong without hearing a peep from your compiler.
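To make the mismatch concrete, here is a cut-down sketch in plain C. The two structures are hypothetical stand-ins for what each translation unit would see after the preprocessor has run (they are given different names here only so that both fit in one file), and set_tip() is an invented callee that trusts cbSize before writing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Layout seen by the unit built with the older version macro. */
struct tip_data_old {
    uint32_t cbSize;
    char     szTip[64];
};

/* Layout seen by the unit built with the newer default. */
struct tip_data_new {
    uint32_t cbSize;
    char     szTip[128];
};

/* Hypothetical callee compiled against the newer layout. It reads cbSize
 * first and refuses structures smaller than the layout it was built with. */
int set_tip(void *data, const char *tip)
{
    uint32_t size;
    struct tip_data_new *d;

    assert(data != NULL && tip != NULL);
    memcpy(&size, data, sizeof size);        /* cbSize is the first member */
    if (size < sizeof(struct tip_data_new))
        return -1;                           /* caller allocated the old layout */

    d = data;
    strncpy(d->szTip, tip, sizeof d->szTip - 1);
    d->szTip[sizeof d->szTip - 1] = '\0';
    return 0;
}
```

Without the cbSize test, a callee built against the 128-byte layout could write up to 128 bytes of szTip into a caller's 64-byte field, which is the kind of overflow described above and one that neither source file shows when read on its own.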
The security ramifications of this issue are clear, but we haven’t really been looking for them. I propose that for future correspondence, we refer to this issue as a ‘non-uniform layout’ bug. Privacy concerns prevent me from posting the examples that I have of this problem, but if anyone out there has good real-life examples of this issue they’d like to publish, I’d love to see them.
by Mike VanEmmerik
Analysis of binary files without access to the source code has become more prevalent over the last five years or so. Of course Java decompilers have been around almost as long as Java itself, but that’s not machine code. I’m talking about analysis of native machine code (x86 or PowerPC instructions), and not of object code (.o or .obj files), which have relocation and symbol information in them. In other words, the actual programs that run on real computers.
The University of Wisconsin has had their Codesurfer/x86 project since about 2003. It uses a combination of disassembly and custom static analyses to automatically analyze x86 binaries for security vulnerabilities, with a research slant. Of course, Veracode is using static binary analysis for commercial security analysis services. Researchers at the University of Arizona have been investigating alias issues and register liveness of executable code. There has been work on DSP (Digital Signal Processor) binaries in Europe and elsewhere. There are even PhD theses on binary analysis (including my own, currently under examination).
The author of IDA Pro is beta testing a decompiler-like visualization plug-in called Hex-Rays. Phrack Magazine issue 64 has an article entitled “Automated vulnerability auditing in machine code”. We’ve come a long way since any analysis of binary programs was compared to making pigs from sausages.
It seems to me that the benefits of binary analysis are moving from underground to mainstream. Binary analysis is a superset of source code analysis. Often an organization uses third party applications or libraries in software development and cannot legally or logically access source data. Additionally compiler, optimizer and OS bugs, security vulnerabilities or other malicious behavior can be reflected in an application’s security state. Binary analysis reflects the data flow of the entire compiled application as the OS/Platform may execute the intended and un-intended functionality inherent within the code.
So, you can always compile source code to put it into binary form but you cannot do the reverse for binary code. Binary analysis thus analyzes the part of your software that you have source code for and the binary part that you do not.
The increasing availability of binary analysis tools will surely lead to more effective discovery of vulnerabilities, by all parties including those generating malware. It makes sense that software developers should also take advantage of binary analysis for security checking.