Anti-Debugging Series - Part II

Welcome back to the series on anti-debugging. Hopefully you have your debugger and development environment handy, as we are about to dive into the first round of anti-debugging code. In the first post of this series we discussed six different types of anti-debugging techniques that are in common use today. To refresh, the classification buckets that we chose to use are:

  • API Based Anti-Debugging
  • Exception Based Anti-Debugging
  • Process and Thread Block Anti-Debugging
  • Modified Code Anti-Debugging
  • Hardware and Register Based Anti-Debugging
  • Timing and Latency Anti-Debugging

Basic API Anti-Debugging

We’ll continue this series of posts by going into a bit more depth on the easiest of the API based anti-debugging techniques. An application programming interface (API) is used to support requests made from other applications for resources or functionality within a target service or library. In our case we will be primarily focused on the Microsoft Windows operating system API. There are a number of calls built directly into the operating system API that make detection of a debugger possible. Minor differences in thread and process metadata are present when processes are run within a debugger. These calls typically examine a process or thread in order to determine if it has a debugger attached.

When learning about anti-debugging, a developer will typically first be introduced to the IsDebuggerPresent() function. This function examines the process block of the calling process to determine if the process is running under the context of a debugging session. We’ll save the details of how this actually works for a later article; suffice it to say that the process has a flag that will contain a non-zero value if the process is being debugged. This flag is queried and returned when IsDebuggerPresent() is called. A very basic debugging detection routine would be to call this function and execute different code paths based on the response.

Prototype: BOOL WINAPI IsDebuggerPresent(void); 

if (IsDebuggerPresent()) {
    //Debugger Detected - Do Something Here
} else {
    //No Debugger Detected - Continue
}

We could also use the API function CheckRemoteDebuggerPresent(). Contrary to first thought, this function does not target a process on a remote machine, nor does it even require that it target a process remote to itself. A process can pass in a handle to itself to determine if it is running inside of a debugger. In the example below we pass in a handle to our current process by calling the GetCurrentProcess() function, along with a pointer to a BOOL variable that receives the result of the CheckRemoteDebuggerPresent() call.

Prototype: BOOL WINAPI CheckRemoteDebuggerPresent(__in HANDLE hProcess,
           __inout PBOOL pbDebuggerPresent);

BOOL pbIsPresent = FALSE;
CheckRemoteDebuggerPresent(GetCurrentProcess(), &pbIsPresent);
if (pbIsPresent) {
    //Debugger Detected - Do Something Here
} else {
    //No Debugger Detected - Continue
}

While these two methods are probably the easiest and most straightforward methods of anti-debugging, they are also the most likely to be understood by a person wishing to bypass them. We can mix it up a bit and use a call to OutputDebugString() instead. OutputDebugString() is typically used to output a string value to the debugging data stream. This string is then displayed in the debugger. Because of this, OutputDebugString() behaves differently depending on whether a debugger is attached to the running process. If a debugger is attached to the process, the function will execute normally and no error state will be registered; however, if there is no debugger attached, LastError will be set, letting us know that we are debugger free. To execute this method we set LastError to an arbitrary value of our choosing and then call OutputDebugString(). We then check GetLastError(): if our error code remains, a debugger is attached; if it has changed, we are debugger free.

Prototype: void WINAPI OutputDebugString(__in_opt  LPCTSTR lpOutputString);

DWORD Val = 123;
SetLastError(Val);
OutputDebugString(TEXT("random")); //TEXT() matches LPCTSTR in both ANSI and Unicode builds
if(GetLastError() == Val) {
    //Debugger Detected - Do Something Here
} else {
    //No Debugger Detected - Continue
}

These three methods are the basic starting point for a developer wishing to add anti-debugging to their code base. The methods are so simple they could even be implemented as macros, making a call quick and easy. Numerous other API based detection methods exist, varying widely in complexity. In the next post in this series we will discuss slightly more advanced API anti-debugging techniques that will make reverse engineering and debugging even more difficult.
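
As a rough sketch of the macro idea, the first two checks could be wrapped as shown below. The names DEBUGGER_ATTACHED, remote_check_reports_debugger, and BAIL_IF_DEBUGGED are made up for this illustration, and the non-Windows branch is only a stub so that the file still compiles elsewhere.

```c
#ifdef _WIN32
#include <windows.h>
/* Non-zero when IsDebuggerPresent() reports a debugger on this process. */
#define DEBUGGER_ATTACHED() (IsDebuggerPresent() != 0)
#else
/* Stub for non-Windows builds (assumption: the check is Windows-only). */
#define DEBUGGER_ATTACHED() (0)
#endif

/* Hypothetical helper: asks CheckRemoteDebuggerPresent() about our own
   process and treats a failed call as "no debugger seen". */
static int remote_check_reports_debugger(void)
{
#ifdef _WIN32
    BOOL present = FALSE;
    BOOL ok = CheckRemoteDebuggerPresent(GetCurrentProcess(), &present);
    return ok && present;
#else
    return 0;
#endif
}

/* Hypothetical convenience macro: run `action` if either check fires. */
#define BAIL_IF_DEBUGGED(action)                                        \
    do {                                                                \
        if (DEBUGGER_ATTACHED() || remote_check_reports_debugger()) {   \
            action;                                                     \
        }                                                               \
    } while (0)
```

A caller would then write something like BAIL_IF_DEBUGGED(return 1); at the points where it wants the check to run.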

Anti-Debugging Series - Part I

For those that don’t know, anti-debugging is the implementation of one or more techniques within computer code that hinders attempts at reverse engineering or debugging a target process. Typically this is achieved by detecting minute differences in memory, operating system, process information, latency, etc. that occur when a process is started in or attached to by a debugger compared to when it is not. Most research into anti-debugging has been conducted from the vantage point of a reverse engineer attempting to bypass the techniques that have been implemented. Limited data has been presented that demonstrates anti-debugging methods in a high level language that the average developer can understand. It is with this in mind that I hope to begin a series of posts that present some of the methods of anti-debugging in a clear, concise, and well documented fashion. The end goal of this series is to arm developers with the techniques and knowledge that will allow them to add a layer of protection to their software while simultaneously educating reverse engineers in some of the anti-debugging methods used by malware authors today.

Before we delve into the intricacies of individual methods of anti-debugging let’s use this post to define the classes of anti-debugging that we will be discussing. While other classes may exist, the definition of these classes is an attempt to include the majority of anti-debugging methods in use today. There is some overlap between classifications and we may have left out some methods due to limited exposure or effectiveness.

API Based Anti-Debugging
API based anti-debugging is the most straightforward and possibly the easiest to understand for a typical developer. Using both documented and undocumented API calls, these methods query process and system information to determine the existence or operation of a debugger. They range from single line calls such as IsDebuggerPresent() and CheckRemoteDebuggerPresent() to slightly more complex methods including debugger detaching and CloseHandle() checks. These methods are generally trivial to add to an existing code base and many can even be implemented in as few as two or three lines.

Exception Based Anti-Debugging
Exception based anti-debugging is slightly different than your basic API based techniques. Many times when a debugger is attached to a process, exceptions are trapped and handled by the debugger without regard to passing the exception back to the application for continued execution. Occasionally these exceptions can even crash or terminate a process when run under a debugger and be handled gracefully when running clean. It is these discrepancies that make exception based anti-debugging techniques possible.

Process and Thread Block Anti-Debugging
Some of the API based anti-debugging methods use published functions to query information from within the process and thread blocks for our running code. Many API based detections can be subverted within a debugger by hooking the API call and returning values that indicate a clean process. One way around this subversion is to directly query the process and thread blocks, bypassing the API calls. Direct analysis of the process and thread blocks, while more complex, can lead to a more accurate and higher assurance result.

Modified Code Anti-Debugging
One of the methods that a debugger uses to signal a breakpoint is to insert a break byte into the running code at the location where it wishes to stop execution. The process execution breaks when this value is seen, giving control to the debugger. When the program is resumed, the breakpoint value is replaced with the original byte, the execution is backed up one byte, and the program continues. Detection of software based breakpoints can be achieved by analyzing the process for modifications from the expected norm.
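
As a rough illustration of checking for such modifications, the sketch below scans a byte range for the x86 INT 3 opcode (0xCC), the byte commonly written for a software breakpoint, and also computes a simple checksum that could be compared against a known-good value. The function names and the idea of recording a checksum at build time are assumptions made for this example. A real check would point at the program's own code and would have to allow for 0xCC bytes that occur legitimately, for instance as padding or as part of an operand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* x86 opcode for INT 3, the one-byte software breakpoint instruction. */
#define INT3_OPCODE 0xCC

/* Hypothetical helper: count INT 3 bytes in [code, code + len). */
static size_t count_int3_bytes(const uint8_t *code, size_t len)
{
    size_t hits = 0;
    assert(code != NULL);
    for (size_t i = 0; i < len; i++) {
        if (code[i] == INT3_OPCODE)
            hits++;
    }
    return hits;
}

/* Hypothetical helper: additive checksum over the same range. A value
   that differs from one recorded earlier suggests the bytes were patched. */
static uint32_t region_checksum(const uint8_t *code, size_t len)
{
    uint32_t sum = 0;
    assert(code != NULL);
    for (size_t i = 0; i < len; i++)
        sum += code[i];
    return sum;
}
```

The checksum catches any patched byte, not just 0xCC, at the cost of needing a trusted reference value to compare against.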

Hardware and Register Based Anti-Debugging
A second way that a debugger can break the execution of a process is by using a hardware breakpoint. A hardware breakpoint relies upon CPU registers to store the pertinent information and to detect when the target break addresses are seen on the bus. A break interrupt is triggered at the appropriate time based on these register values. Reading or modifying the hardware can allow for the detection of a debugger.

Timing and Latency Anti-Debugging
Finally, timing and latency can be used as an effective anti-debugging method. When executing a program within a debugger, specifically when single stepping, a much larger latency occurs between execution of instructions. This latency can be detected and compared against a reasonable threshold to detect the existence of a debugger attached to our process.
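
The threshold comparison can be sketched in portable C11 as follows. The names now_ns, interval_is_suspicious, and work_was_interrupted are invented for this illustration, and the threshold is a value you would have to tune for your own code. Wall-clock time is used because that is what keeps growing while a process sits paused in a debugger.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Wall-clock time in nanoseconds, via C11 timespec_get(). */
static uint64_t now_ns(void)
{
    struct timespec ts;
    int base = timespec_get(&ts, TIME_UTC);
    assert(base == TIME_UTC);
    (void)base;
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Returns 1 if the interval is longer than the threshold allows. */
static int interval_is_suspicious(uint64_t start_ns, uint64_t end_ns,
                                  uint64_t threshold_ns)
{
    /* The wall clock can be set backwards; treat that as inconclusive. */
    if (end_ns < start_ns)
        return 0;
    return (end_ns - start_ns) > threshold_ns;
}

/* Hypothetical protected routine: times a short block of work and
   reports whether it took longer than threshold_ns. */
static int work_was_interrupted(uint64_t threshold_ns)
{
    volatile unsigned sink = 0;
    uint64_t start = now_ns();
    for (unsigned i = 0; i < 1000; i++)
        sink += i;
    return interval_is_suspicious(start, now_ns(), threshold_ns);
}
```

A threshold that is too tight will misfire on a loaded machine, so it is safer to pick a generous value and to treat a single hit as a hint, not proof.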

Each of the classes of anti-debugging outlined above has merit when used individually to protect a process. While none of them can be guaranteed to protect a program from a determined reverse engineer, implementing one of these techniques (or several of them, if appropriate) can slow down the debugging process enough that the attacker will hopefully spend his time on other, easier ventures. In the remainder of this series on anti-debugging we will review in depth some of the more interesting methods of each of the above classes. So bring along your debugger and your development environment and let the games begin.

Microsoft Fixes 8-year Old Design Flaw in SMB

The recent Patch Tuesday release fixes an NTLM relaying issue that has been around for more than eight years.

In 2000, I wrote an advisory about NTLM relaying (CVE-2000-0834). The problem turned out to be significantly larger than I originally suggested in the advisory. The attack extended to other NTLM-based authentications on other protocols and allowed general-purpose credential theft via a man-in-the-middle attack.

The SMBRelay tool was published in 2001 by Sir Dystic of Cult Of The Dead Cow, and that really took it to the next level. The protocol completely fell apart. It kicked off a number of other analyses of the NTLM protocol that finally resulted in this patch, eight years after its discovery.

At least they got around to it. Thanks!

US Government Detects Attacks on Obama and McCain Computers

Now that the presidential race is over, Newsweek is reporting that the US Government, through the FBI and Secret Service, notified the Obama and McCain campaigns that their computers had been compromised and sensitive documents copied.

…the FBI and the Secret Service came to the campaign with an ominous warning: “You have a problem way bigger than what you understand,” an agent told Obama’s team. “You have been compromised, and a serious amount of files have been loaded off your system.” The following day, Obama campaign chief David Plouffe heard from White House chief of staff Josh Bolten, to the same effect: “You have a real problem … and you have to deal with it.” The Feds told Obama’s aides in late August that the McCain campaign’s computer system had been similarly compromised.

This information demonstrates that the US government has a sophisticated intrusion detection capability. This is likely part of the NSA internet surveillance system that was made public by an AT&T technician in 2006.

It is likely that the system has a set of watch IP ranges that are sensitive from a national security perspective. The campaigns’ computers were probably on this list. The traffic between foreign IP addresses and these watch IPs is then scrutinized for espionage. The pattern of activity flagged would be Microsoft Office documents and PDFs being retrieved or other intruder signs such as an encrypted tunnel with a foreign endpoint.

This shows that the US Government has the capability to detect some types of foreign attacks, although they probably have to be selective about the IP ranges they monitor. It’s nice to know that if the White House computers were leaking documents to China or Russia there is some detection capability, but the fact that this is done at the Internet backbone level means any IP could be targeted, and it might not just be to look for foreign intrusions.

MBTA vs MIT Students Case Continues

A hearing will be held in Boston tomorrow to decide whether or not the restraining order gagging the MIT students from talking about the vulnerabilities they have found should be lifted. Even though the Defcon presentation is widely available and the MBTA disclosed the “Confidential” memo from the MIT students in their court filings, they are seeking a permanent speech injunction. An august group of computer scientists has signed a letter which will be entered into the record for the case. This list includes: Dave Farber of Carnegie Mellon University, Steve Bellovin from Columbia University, David Wagner from UC Berkeley, Dan Wallach from Rice University, Matt Blaze from the University of Pennsylvania, and Bruce Schneier. An excerpt:

We write to express our firm belief that research on security vulnerabilities, and the sensible publication of the results of the research, are critical for scientific advancement, public safety and a robust market for secure technologies. Generally speaking, the norm in our field is that researchers take reasonable steps to protect the individuals using the systems studied. We understand that the student researchers took such steps with regard to their research, notably by planning not to present a critical element of a flaw they found. They did this so that their audience would be unable to exploit the security flaws they uncovered. . . .

The restraining order at issue in this case also fosters a dangerous information imbalance. In this case, for example, it allows the vendors of the technology and the MBTA to claim greater efficacy and security than their products warrant, then use the law to silence those who would reveal the technologies’ flaws. In this case, the law gives the public a false sense of security, achieved through law, not technical effectiveness. Preventing researchers from discussing a technology’s vulnerabilities does not make them go away - in fact, it may exacerbate them as more people and institutions use and come to rely upon the illusory protection. Yet the commercial purveyors of such technologies often do not want truthful discussions of their products’ flaws, and will likely withhold the prior approval or deny researchers access for testing if the law supports that effort. . . .

Yet at the same time that researchers need to act responsibly, vendors should not be granted complete control of the publication of such information, as it appears MBTA sought here. As noted above, vendors and users of such technologies often have an incentive to hide the flaws in the system rather than come clean with the public and take the steps necessary to remedy them. Thus, while researchers often refrain from publishing the technical details necessary to exploit the flaw, a legal ban on discussion of security flaws, such as that contained in the temporary restraining order, is especially troubling.

It will be interesting to see what arguments the MBTA uses to keep the students from speaking on a topic where all the important vulnerability information seems to have already been disclosed. Sure, the students haven’t presented a cookbook exploit tool, but they have also stated they have no intention of doing so.

Perhaps the court will investigate what the response of the MBTA and its technology vendors has been to the MiFare card vulnerabilities that were disclosed responsibly. If there has been no vigorous response to vulnerabilities responsibly disclosed many months ago, how can they say with a straight face that they are truly responding to new security information and just need more time?

BlackHat Recap

Another BlackHat has come and gone. As usual, it was a very busy week juggling customer meetings, recruiting, conference planning, vendor parties, and, oh yes, the actual BlackHat presentations. I had a fantastic time catching up with old friends and finally getting the opportunity to meet more of the Security Twits and others in the security community. I didn’t submit a talk this year, but nevertheless, fake Dan Kaminsky was still excited to see me.

My favorite talk, as expected, was the Sotirov/Dowd talk on How To Impress Girls With Browser Memory Protection Bypasses. The attack is a conceptually simple, yet completely reliable technique for exploiting vulnerabilities in web browsers. Of course, the media has sensationalized the impact of their findings, but ultimately, this is still significant as far as browser-based exploits are concerned (here is a more accurate report). It’s worth mentioning that part of the technique allowing them to load a .NET DLL at an arbitrary location under Vista was reliant on an implementation bug wherein the OS disables ASLR if the version in the .NET COR header was below a certain value. However, the address space spraying and stack spraying techniques are likely to be extended to other platforms utilizing similar memory protection mechanisms.

As for the girls? I can report first-hand that the ladies at TAO on Wednesday night were hanging on Alex’s every word. They were particularly impressed when he whipped out the laptop for a live demo. Unfortunately, none of the dozen iPhone owners in the immediate vicinity thought to snap a picture (too busy Twittering). Oh well.

I also enjoyed Hovav Shacham’s talk on return-oriented programming. Simply put, he described a generalization of the return-to-libc shellcode approach with the intent to demonstrate that one could achieve Turing-complete computation using “found code” in process images. By chaining together series of mini-computations ending in return (RET) instructions, it was possible to build higher-level programming constructs such as branches and loops. The nature of the x86 instruction set provides some flexibility because instructions are interpreted differently depending on how you align the instruction pointer (i.e. the old shellcode trick of searching the process image for any JMP EBX instruction and using that as your EIP). In RISC architectures such as SPARC, however, you don’t have that luxury; if your %pc isn’t aligned properly you get a bus error. So it was quite interesting to see that they were able to extend the concept to RISC. The practicality of the attack technique is limited by the fact that the shellcode is tuned to a particular binary image — if the shellcode was built using instructions extrapolated from glibc 2.3.5, it won’t work for a system running glibc 2.4.

I thought Scott Stender’s talk on Concurrency Attacks in Web Applications was interesting as well. In a nutshell, spewing thousands of simultaneous requests at web application transactions that are not thread-safe can create interesting problems. In the presentation, Scott ran his demo against a VM running on the attack machine. I found myself wondering how effective the same attack would be over the Internet — would it be significantly less reliable (or not at all)? Race conditions are generally easier to exploit locally than remotely due to more predictable execution conditions. Certainly this is an under-tested vulnerability class though.

One presentation I wasn’t able to attend but want to follow up on is Nate McFeters, John Heasman, and Rob Carter’s talk which discussed the GIFAR attack I’ve been hearing so much about lately. The gist is that you can create a file that is both a valid GIF and a valid JAR, then use some Java applet tricks to initiate HTTP requests on behalf of the victim.

Finally, the Pwnie Awards didn’t disappoint. Drama ensued over the Most Overhyped award, but at least this year some of the winners showed up to claim their awards! Halvar rapping Symantec lyrics was also quite memorable.

All in all, a fun and informative week, but as usual, I was relieved to get the hell out of Vegas and head home on Friday morning.

P.S. For a much more entertaining BlackHat/Defcon Recap, read Jennifer Jabbusch’s account of the week’s events. It’s my favorite one so far!

Missing the Point

A co-worker passed along this snapshot taken at the Karsten Nohl, Jake Appelbaum, and Dino Dai Zovi talk at HOPE this past weekend. The context, of course, is that the overzealous Debian developer who accidentally crippled OpenSSL back in 2006 said he did so because valgrind reported uninitialized memory use. Click through for the full-size version.

So automated software review is dangerous now? Perhaps that bullet should read “modifying code you don’t understand is dangerous.”

Minimizing the Attack Surface, Part 2

I’m finally getting around to finishing my post on minimizing attack surfaces. Here’s Part 1, in case you missed it.

First, a quick clarification. I noticed that some of the readers who commented on that first post wanted to talk about improving security through the use of various development methodologies or coding frameworks. Those are interesting tangents (and ones that I may write about in the future), but my intention with this post is to discuss a very specific problem related to how people integrate third-party code — that is, the stuff you import or link in but didn’t write yourself.

As I mentioned previously, developers have a tendency to “bolt on” third-party components to applications without understanding the security implications. Often, these components are glossed over or ignored completely during threat modeling discussions. I attempted to illustrate this with my fictitious WhizBang library example in Part 1.

When integrating a third-party component, developers familiarize themselves with the API but generally don’t care how it’s implemented. Granted, that’s how an API is supposed to work; you don’t have to futz around with code beyond the API boundary, and you can blissfully ignore parts of the library that you don’t need. In past consulting gigs, I’ve sat in threat modeling discussions where nobody knew whether a particular library generated network traffic. “We just use the API,” they say. The fact that it works is good enough; nobody seems to care how it works.

That mindset is ideal for rapid development but problematic for security. Failing to understand the complete application, as opposed to just the part you wrote, prevents you from accurately assessing its security posture.

It’s also no coincidence that web app pen testers love third-party components — we get excited when we see “bolted on” interfaces, because we know that developers tend to leave extraneous functionality exposed. The resulting findings usually generate reactions such as “I didn’t even know that servlet had an upload function.”

An Example

Here’s a close-to-home example related to my post about DWR 2.0.5 from the other day. DWR is an Ajax framework that has a variety of operating modes. In-house, we use a subset of DWR’s full functionality — specifically, we interact with it using the “plaincall” method only, so we made sure that the features we didn’t need were disabled via the configuration file. As it turned out, there were vulnerable code paths prior to the “do you have this thing disabled” check. In hindsight, if we had taken more time to understand the exposed interfaces, we could have reduced the attack surface by filtering out unneeded request patterns before they even touched the third-party code.

But wait, you say. What about maintainability? If I whitelist using a point-in-time application profile, doesn’t this create the same maintenance headache as the reviled WAF? It doesn’t have to. Certainly, one option would be to whitelist each and every unique URL that references the DWR framework, e.g.

/dwr/call/plaincall/myMethod1
/dwr/call/plaincall/myMethod2
/dwr/call/plaincall/myMethod3

But then you’d have to update the whitelist every time you added or removed functionality from your application. Also, don’t lose sight of the security goal, which is to minimize the amount of exposed third-party code. If I add or remove URLs from that list, provided they are still using the “plaincall” method, I’m hitting the same DWR dispatcher every time. So I’ve increased maintenance cost without any security benefit.

A better option is to simply tighten the URL pattern a bit in the J2EE container. Here’s the default configuration:

<servlet-mapping>
  <servlet-name>dwr-invoker</servlet-name>
  <url-pattern>/dwr/*</url-pattern>
</servlet-mapping>

Now, instead of allowing every URL starting with /dwr/ to be processed by the DWR library, you could be a little more restrictive:

<servlet-mapping>
  <servlet-name>dwr-invoker</servlet-name>
  <url-pattern>/dwr/call/plaincall/*</url-pattern>
</servlet-mapping>

In this configuration, you don’t have to worry about /dwr/call/someothercodepath any more. There is less third-party code exposed, thereby reducing the overall attack surface of the application. (NB: DWR also serves up a couple of JavaScript files, so those URL patterns will have to be whitelisted too.)
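
The same rule could also be enforced before a request ever reaches the container, for example in a proxy or filter module. Here is a minimal sketch of that check in C. The function name dwr_path_allowed is made up, the two script paths are placeholders to be replaced with whatever files your DWR version actually serves, and the path is assumed to be URL-decoded already.

```c
#include <assert.h>
#include <string.h>

/* The one DWR code path we want exposed (same prefix as the mapping above). */
static const char *const ALLOWED_PREFIX = "/dwr/call/plaincall/";

/* Placeholder exact-match entries for DWR's script files (assumed paths). */
static const char *const ALLOWED_EXACT[] = {
    "/dwr/engine.js",
    "/dwr/util.js",
};

/* Hypothetical filter: returns 1 if the request path may be passed on to
   the third-party code, 0 if it should be rejected up front. */
static int dwr_path_allowed(const char *path)
{
    assert(path != NULL);

    /* Refuse anything that tries to climb out of the allowed prefix. */
    if (strstr(path, "..") != NULL)
        return 0;

    if (strncmp(path, ALLOWED_PREFIX, strlen(ALLOWED_PREFIX)) == 0)
        return 1;

    for (size_t i = 0; i < sizeof ALLOWED_EXACT / sizeof ALLOWED_EXACT[0]; i++) {
        if (strcmp(path, ALLOWED_EXACT[i]) == 0)
            return 1;
    }
    return 0;
}
```

Because the rule is a prefix plus a short fixed list, it does not need to change when application methods are added or removed, which keeps the maintenance cost low.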

A Logical Extension

Even if you’re not a developer, you should still be thinking about attack surfaces. People download and install blogging platforms such as WordPress, Movable Type, etc. all the time, but how many take additional steps to harden their installations? The concept is the same as the OS hardening analogy I brought up at the very beginning of this discussion.

Similarly, people install third-party WordPress plugins or Joomla components without considering that most of them are written by some random programmer who is a whiz with the plugin API but knows nothing about security.

At the risk of sounding trite, always remember that security is only as strong as the weakest link.

Scrawlr: Are We Being Too Greedy?

HP released a new tool called Scrawlr yesterday that can be used to identify certain types of SQL Injection vulnerabilities in a website. It was a joint effort with Microsoft and a direct response to the mass SQL Injection attacks of late.

Scrawlr quickly came under fire on the Web Security mailing list for having some pretty major limitations. Billy Hoffman et al have been quick to point out that the tool was designed to address a very specific subset of SQL Injection vulnerability — the type affected by the mass attacks — and is not designed to be a general purpose replacement for existing SQL Injection scanners. Let’s look at the limitations, as outlined on the HP page, one by one.

Limitation: Will only crawl up to 1500 pages

Depends on what they mean by 1500 pages. For example, if I have these links on my front page, is that one URL or three?

  • http://www.veracode.com/blog/?p=111&foo=1
  • http://www.veracode.com/blog/?p=111&foo=2
  • http://www.veracode.com/blog/?p=111&foo=3

Or, does it mean that it will really only crawl 1500 pages total, so if I have the same link 1500 times on the front page, it won’t go any further? Either way, for most smaller websites this is probably fine. If you need more than 1500 you could give it different starting URLs in an attempt to improve coverage. It would be nice to have a clearer definition of what it means to “crawl up to 1500 pages” though.

Limitation: Does not support sites requiring authentication

Well, this will render it useless for the majority of enterprise apps. But there are still a lot of sites out there that don’t require authentication, including some of the ones that got hit during the mass attacks, such as the United Nations, UK government, etc.

[Update 06/26: Mike Tracy investigates further and provides a workaround that'll work for the majority of sites that use cookie-based auth]

Limitation: Does not perform Blind SQL injection

They have taken a lot of flak for this, but Billy describes it as a conscious choice:

An early version of the tool checked for blind SQL injection, but the final version of Scrawlr did not. … The biggest feedback we got from early testing was developers wanted to “see” the vulnerability. Differential analysis is kind of difficult to visualize in a way that is helpful for the average dev, and pulling the table names through blind was too much of a performance issue.

I can sort of understand this rationale. Blind SQL Injection testing is much more susceptible to false positives. As users of any commercial web scanner or source code analyzer will attest, the more time you spend chasing down FPs, the less likely you are to put any faith in future results. It’d be nice if there was a way to toggle Blind SQL Injection testing on and off, though (could be off by default so nobody gets confused).

Limitation: Cannot retrieve database contents

Who cares? Find and fix the vulnerability. Pulling down the entire database “because you can” is a total ego move.

Limitation: Does not support JavaScript or flash parsing

Nobody does this very well anyway, particularly the JavaScript part. Writing a great crawler is probably the hardest part of writing an automated web scanner and it’s one of the biggest differentiators from one product to the next. You’re not going to get that for free.

Limitation: Will not test forms for SQL Injection (POST Parameters)

This is probably the toughest one to swallow. It’s not that difficult to parse out forms from HTML, and form POSTs can represent a major chunk of the attack surface. Granted, the Chinese tool associated with the mass attacks did operate solely on GET requests (i.e. parameters in the query string) so HP can defend this again by saying the tool is really aimed at the sites being targeted by the mass attacks. I think it’s a little short-sighted though; chances are that the mass attacks will evolve and it’s better to be proactive about it than reactive.

Conclusion

It’s tough to bash someone for releasing a free tool. I personally think HP should add an option for enabling Blind SQL Injection testing, and that they should consider supporting POSTs as well as GETs. You’re basically getting a (massively) stripped-down WebInspect for free, so take it for what it is. No single tool is a panacea.

The jury is still out on how effective Scrawlr is against the things it does claim support for. Keep watching the Web Security list; the reviews are filtering in.

Minimizing the Attack Surface, Part 1

What was the first thing you learned about network security? There’s a good chance it had something to do with port scanning. After scanning a few boxes, you realized that modern operating systems have a lot of open ports by default, meaning a lot of services. Some had an obvious purpose, like telnet on tcp/23 or ftp on tcp/21. Others left you wondering, what the heck is listening on tcp/515 or tcp/7100? And remember, you couldn’t ask Google because it didn’t exist (well, maybe it did depending on when you got into security).

Your first real lesson about locking down a host was how to reduce its attack surface. You learned how to disable services using /etc/inetd.conf. Then you learned about rc.d and how to prevent unnecessary services from being launched at startup. Next, maybe you configured the Xserver to disallow remote connections or moved on to removing setuid permissions from files. As you worked, you’d periodically re-scan the box to gauge progress, asking yourself “have I removed everything I don’t need?” The underlying motivation, of course, is that an attacker can’t hack something that isn’t there.

You learned how to extend those concepts to the network — configuring firewall rules, router ACLs, VLANs, etc. Segmenting the network. Creating a DMZ. No need to dwell on this, you get the idea.

Eventually, people realized that applications had an attack surface too. Web servers and application servers got a lot of attention, followed closely by custom web applications. “What do you mean you can execute SQL queries against my database? That’s impossible, I have a firewall!”

Some companies, the ones who could afford it anyway, started to build security into their development cycle. Doing threat modeling during the design phase made sense, because hey, it’s much cheaper to fix security holes in a whiteboard drawing than it is to rewrite your authorization module from scratch after it’s in production.

Let’s talk strictly about custom web applications now. What I’ve observed is that most development groups, even the ones who actively engage in threat modeling, do not understand their web application’s attack surface. The lead architect can whiteboard a high-level diagram of all the major components and how they interact. Individual developers can go a bit deeper, telling you which files they touch, what database permissions they need, or how various pieces of data are encrypted in storage. At the end of this exercise you have a complete picture of the processes, data flows, protocols, privilege boundaries, external entities, and so on, and you’re well on your way to understanding all of the potential attack vectors.

Or are you?

What often gets overlooked or glossed over is the impact of external libraries or packages. Nobody writes everything from scratch. A typical list of third-party libraries for a Java-based Web 2.0 application might include DWR, GWT, Axis, and Dojo, plus about 30 other libraries to do everything from logging to parsing to image manipulation. Nine times out of ten, the libraries will be installed in full, using the default configuration from page one of the README file.

Why is this relevant? Because just as those old Unix boxes exposed unnecessary services, libraries expose unnecessary code. Let’s say you installed Dojo to simplify the process of creating an HTML table with rows and columns that can be sorted on demand. Did you remember to remove all the .js files you didn’t need? Or maybe you installed Axis or DWR or anything else that has its own Servlet(s) for processing requests. Have you compared what that Servlet can do against what you need it to do?

A fictitious example may help illustrate further. Imagine you just downloaded a new library called WhizBang. You follow the installation instructions to define and map two servlets in your web.xml file, WhizServlet and BangServlet, and you configure it to integrate with your web app. After a bit of trial and error, it’s functional. Yay! This is where most developers stop.

Nobody asks, “how much of this do I actually need?” Case in point, what if your application only uses WhizServlet? BangServlet is still exposed, and you don’t even use it! Similarly, what if WhizServlet takes an “action” parameter which can be either “view”, “edit”, or “delete”, and your application only uses “view”? You’re still exposing the other actions to anybody who knows the URL syntax (pretty trivial if it’s open source). You wouldn’t expose large chunks of your own code that you weren’t using, so why should it be any different with libraries?
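To make the WhizBang example concrete, here is a minimal sketch of an allowlist that only lets through the one servlet and the one action the application actually uses. WhizBang is fictitious, so everything here is hypothetical: the class name, the "/whiz" path, and the idea that the action arrives as a plain string are all assumptions made for illustration, not part of any real library.

```java
import java.util.Map;
import java.util.Set;

/**
 * Allowlist for the fictitious WhizBang endpoints.
 * All names and paths are hypothetical and exist only for illustration.
 */
final class WhizBangAllowlist {

    // Assumption: WhizServlet is mapped at "/whiz" in web.xml.
    // BangServlet has no entry because the application never calls it.
    private static final Map<String, Set<String>> ALLOWED_ACTIONS =
            Map.of("/whiz", Set.of("view"));

    private WhizBangAllowlist() {}

    /** Returns true only for servlet path and action pairs the application uses. */
    static boolean isAllowed(String servletPath, String action) {
        // Reject missing values up front; the immutable collections
        // returned by Map.of and Set.of do not accept null lookups.
        if (servletPath == null || action == null) {
            return false;
        }
        Set<String> actions = ALLOWED_ACTIONS.get(servletPath);
        return actions != null && actions.contains(action);
    }
}
```

In a real application a check like this would probably sit in a servlet filter in front of the library’s servlets, returning a 404 or 403 for anything not on the list. Better still, if BangServlet is never used, simply leave its mapping out of web.xml so there is nothing to filter in the first place.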

This post is getting kind of long so I’m going to split it up. In the next post, I’ll continue the discussion of attack surface minimization, as well as some of the tradeoffs that go along with this approach.
