
An Ounce of Prevention is Worth a Pound of Cure

A conversation on Twitter this morning started out like this:

@dinozaizovi: Finding vulnerabilities without exploiting them is like putting on a dress when you have nowhere to go.

This clever analogy spurred a discussion about the importance of proving exploitability as a prerequisite to fixing bugs. While I agree that nothing is more convincing than a working exploit, there will always be a greater volume of bugs discovered than there are vulnerability researchers to write exploits. Don’t get me wrong: as a former penetration tester, I agree that it is fun to write exploits. It just shouldn’t be a gating factor. Putting the burden of proof on the researcher to develop an exploit is not scalable, nor does it help create a development culture that improves software security over the long term.

A related topic, and one that hits closer to home for me, is how software developers deal with the results of static analysis. Static analysis is often misunderstood, particularly by people who have only dealt with dynamic analysis (fuzzing, web scanning, etc.) or penetration testing in the past. Because static analysis detects flaws without actually executing the target application, there’s an increased likelihood of finding “noise” (insignificant flaws) or false positives. On the other hand, static analysis provides broader coverage, often detecting flaws in complex code paths that a web scan or human tester would be unlikely to find. So there’s your trade-off.

Here’s a conversation I have all too frequently, paraphrased:

DEVELOPER
I don’t think I should have to fix this SQL injection flaw unless you can prove to me that it’s exploitable.

ME
Static analysis isn’t performed against a running instance of the application. Not all flaws will be exploitable vulnerabilities, but some of them almost certainly are. Here, let me show you all of the code paths where untrusted user input enters the application and eventually gets used in the ad-hoc SQL query we’ve marked as a bug.

DEVELOPER
But what’s the URL that I can click on to exploit it?

ME
Static analysis is different from a penetration test. The output of our analysis is a code path, not a URL. URL construction cannot be derived solely from the application code, because it depends on outside factors such as how the web server and application server are configured. Moreover, we don’t have the necessary context of how this flaw fits into the business logic of the application. Maybe this functionality is only accessible by certain users when their accounts are in a particular status. It might take a couple hours working closely with a developer in a test environment to come up with the attack URL. It might take several more hours to write a script around that attack URL to mine the database. On the other hand, it would take about 10 minutes to replace that ad-hoc query with a parameterized prepared statement.

DEVELOPER
Well, if you can’t demonstrate the vulnerability, then it’s not real.

ME
Demonstrating a working exploit certainly proves a system is vulnerable. But the lack of a working exploit is hardly proof that it’s not vulnerable. You could spend the time to investigate every single flaw to figure out which ones are vulnerable, or you could fix them all in such a way that you’re guaranteed none of them will be vulnerable. In our opinion, the time is better spent on the latter.

DEVELOPER
[more defensiveness]

ME
[bangs head against wall]
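
For the record, here is roughly what that 10-minute fix looks like in Java with JDBC. This is a minimal sketch, not code from any customer application: the users table, its columns, and the class and method names are all made up for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class UserLookup {
    // Hypothetical table and column names, for illustration only.
    static final String FIND_USER_SQL = "SELECT id FROM users WHERE username = ?";

    // The flawed pattern: untrusted input is concatenated into the SQL text,
    // so a quote character in the input changes the structure of the query.
    static String adHocQuery(String username) {
        return "SELECT id FROM users WHERE username = '" + username + "'";
    }

    // The fix: the SQL text is constant and the input is bound as a parameter,
    // so the driver always treats it as data rather than as SQL.
    static boolean userExists(Connection conn, String username) throws SQLException {
        try (PreparedStatement stmt = conn.prepareStatement(FIND_USER_SQL)) {
            stmt.setString(1, username);
            try (ResultSet rows = stmt.executeQuery()) {
                return rows.next();
            }
        }
    }

    public static void main(String[] args) {
        // Shows how hostile input rewrites the ad-hoc query text.
        System.out.println(adHocQuery("x' OR '1'='1"));
    }
}
```

Running main prints the ad-hoc query with the attacker's OR clause spliced in. The prepared statement never builds that string at all, which is why nobody has to argue about whether it's exploitable.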

Now imagine that conversation stretching out to 30 minutes or more. They could’ve fixed a half-dozen flaws already. And it’s not limited to SQL injection. For example, consider cross-site scripting (XSS):

DEVELOPER
I need you to prove that this XSS flaw is exploitable.

ME
How about just applying the proper output encoding so you know the untrusted input will be rendered safely by the browser?
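
To make "proper output encoding" concrete, here is a minimal sketch of an HTML encoder for untrusted strings. It assumes the value is being written into HTML element content or a quoted attribute; other contexts (JavaScript, URLs, CSS) need different encoding, and in a real application a vetted encoding library is a better bet than a hand-rolled helper like this one. The class and method names are mine.

```java
final class HtmlEncoder {
    // Encodes the characters that are significant in HTML element content
    // and in quoted attribute values. Everything else passes through as-is.
    static String escapeHtml(String untrusted) {
        StringBuilder out = new StringBuilder(untrusted.length() + 16);
        for (int i = 0; i < untrusted.length(); i++) {
            char c = untrusted.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The markup is rendered as text instead of being executed.
        System.out.println(escapeHtml("<script>alert('x')</script>"));
    }
}
```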

Buffer overflows:

DEVELOPER
I need you to prove that this buffer overflow is exploitable.

ME
How about just using a bounded copy or putting in a length check, so you know the buffer won’t overflow?

By now you get the picture. Many developers want proof, to the extent that they’ll sacrifice efficiency to get it. If we are to improve software over the long haul, developers must learn to recognize situations where it takes less time to patch a bug than to argue about its exploitability. On a more positive note, from someone who talks to static analysis customers on a daily basis, the tide is starting to turn in the right direction. But it is still an uphill battle.

How To Protect Your Users From Password Theft

Monster.com recently disclosed yet another major breach that compromised the personal data of over 1.3 million users. This is not unlike the previous breach in August 2007, though the attack vector was likely different. From a notice on their website (emphasis mine):

We recently learned our database was illegally accessed and certain contact and account data were taken, including Monster user IDs and passwords, email addresses, names, phone numbers, and some basic demographic data. The information accessed does not include resumes.

Considering the well-known tendency to use the same password on multiple websites, compounded with the fact that Monster pledged a comprehensive security review after the first breach, it’s just embarrassing that they are still storing passwords in the clear.

So let’s talk about how to properly store passwords for a web application.

Use a one-way cryptographic hash

Don’t store your passwords in the clear! If you do, an attacker just needs to find one SQL Injection vulnerability and he’s got the password for every one of your users. The idea behind using a one-way algorithm is that the hash value can’t be reversed to “decrypt” the password. So how does authentication work? When a user attempts to log in, you apply the same one-way algorithm to convert the user-provided password into the hash value, and then compare the two hashes. If they match, then the user-provided password was correct. At no time is the password ever stored in the clear.

Often, developers will hear the advice “use a hash” and interpret that as “run the plaintext password through MD5 or SHA-1 and store the result.” But that only solves part of the problem — the part about using an irreversible algorithm. It doesn’t protect against pre-computation. Let’s say you’ve used SHA-1 to hash your passwords, and your USERS table looks like this in the database:

USER          PASSWORD_HASH
admin         5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8
bob           fbb73ec5afd91d5b503ca11756e33d21a9045d9d
jim           7c6a61c68ef8b9b6b061b28c348bc1ed7921cb53

So if you wanted to obtain the original passwords you’d have to run a dictionary or brute force attack, hashing all possible password options with SHA-1 and comparing the output to the stored hashes. This would take a long time but eventually you’d figure some of them out. But what if you already had a list of all 8-character permutations and their corresponding SHA-1 hashes? Now all you have to do is look up the hashes, rather than computing them on-the-fly. This is the idea behind rainbow tables.

An attacker with a SHA-1 rainbow table covering 8-character alphanumeric combinations would quickly look up those three hashes and obtain the original passwords of “password”, “p4ssword”, and “passw0rd” respectively.
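
The lookup itself is trivial to sketch. The example below is a plain pre-computed dictionary, not a true rainbow table (which uses hash chains to trade computation for storage), and the four-word candidate list is obviously a stand-in for a real one. The class and method names are mine.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

final class PrecomputedLookup {
    // Returns the lowercase hex SHA-1 digest of the input string.
    static String sha1Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // Done once, ahead of time: map every candidate's hash back to the candidate.
    static Map<String, String> buildTable(String... candidates) {
        Map<String, String> table = new HashMap<>();
        for (String candidate : candidates) {
            table.put(sha1Hex(candidate), candidate);
        }
        return table;
    }

    public static void main(String[] args) {
        Map<String, String> table = buildTable("letmein", "password", "p4ssword", "passw0rd");
        // The admin row from the USERS table above.
        String stolenHash = "5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8";
        // No hashing at attack time, just a map lookup.
        System.out.println(table.get(stolenHash));
    }
}
```

The expensive part (hashing every candidate) happens once and can be reused against every unsalted SHA-1 database the attacker ever gets hold of.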

Use a salt

The best defense against pre-computation of raw hashes is salting. To salt a password, you append or prepend a random string of bits to the plaintext password and hash the result. You then store the salt value alongside the hash so that it can be used by the authentication routine. Look in the /etc/shadow file of any modern Unix system and you’ll see something like this:

user1:$1$lKorlp4C$RD5TSM6PaZ6oaWRVUuXT40:13740:0:99999:7:::
user2:$1$qOmA0CUm$I6IdbZDTDl6B6m7s77VPe1:13650:0:99999:7:::
user3:$1$nIEInNo5$PSxcLtvGIJArL8r2AQl74.:13749:0:99999:7:::

Let’s look at the “user1” entry in the example above, paying attention to the second field, which contains a bunch of alphanumeric characters separated by dollar signs. The first token, 1, identifies the hashing scheme (here, md5crypt). The second token, lKorlp4C, is the salt. The third token, RD5TSM6PaZ6oaWRVUuXT40, is the one-way hash that was calculated using lKorlp4C as the salt.

When the user attempts to log in, the system passes the user-provided password along with the stored salt into the hash routine (in this case, md5crypt), and compares the result to the stored hash.

Each bit of salt used doubles the amount of storage and computation required for a pre-computed table. For instance, if we used one bit of salt — either 0 or 1 — the rainbow table would have to account for two variations of every password. Eight bits of salt require 2^8, or 256 variations of every password. Use a sufficiently large salt and pre-computation becomes infeasible. For example, the md5crypt utility uses 48 bits of salt (and for an extra layer of protection, it runs 1000 iterations of MD5 to slow down dictionary attacks).

There are a couple of common mistakes that people make with regard to salting. First, don’t use the same salt every time. If you do, you’re not really increasing the search space because the attacker only has to account for a single salt value. Second, don’t worry about protecting the salt values; they’re not secrets. The added security is derived not from the secrecy of the salt but rather from the amount by which it increases the resources required for pre-computation.
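
Here is a minimal sketch of per-user salting in Java. Note that I've swapped in PBKDF2 with HMAC-SHA256 from the standard library rather than md5crypt; it follows the same recipe (random salt, many iterations) without hand-rolling anything. The salt size, iteration count, and all of the names are assumptions for illustration, not recommendations.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

final class SaltedPasswordHasher {
    static final int SALT_BYTES = 16;       // assumed salt size (128 bits)
    static final int ITERATIONS = 100_000;  // assumed work factor; tune for your hardware
    static final int HASH_BITS = 256;

    // A fresh random salt per password; it is stored next to the hash, not kept secret.
    static byte[] newSalt() {
        byte[] salt = new byte[SALT_BYTES];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    // Derives the value to store from the password and that user's salt.
    static byte[] hash(char[] password, byte[] salt) {
        try {
            PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, HASH_BITS);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 not available", e);
        }
    }

    // Login check: re-hash the attempt with the stored salt and compare the results.
    static boolean verify(char[] attempt, byte[] storedSalt, byte[] storedHash) {
        return MessageDigest.isEqual(hash(attempt, storedSalt), storedHash);
    }

    public static void main(String[] args) {
        byte[] salt = newSalt();
        byte[] stored = hash("password".toCharArray(), salt);
        System.out.println(verify("password".toCharArray(), salt, stored));
        System.out.println(verify("passw0rd".toCharArray(), salt, stored));
    }
}
```

Because every user gets a different salt, two users with the same password end up with different stored values, which is exactly what defeats the pre-computed table.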

If you have OpenSSL installed you can play around with various salt mechanisms and see what the output looks like:

$ openssl passwd -h
Usage: passwd [options] [passwords]
where options are
-crypt             standard Unix password algorithm (default)
-1                 MD5-based password algorithm
-apr1              MD5-based password algorithm, Apache variant
-salt string       use provided salt
-in file           read passwords from file
-stdin             read passwords from stdin
-noverify          never verify when reading password from terminal
-quiet             no warnings
-table             format output as table
-reverse           switch table columns

$ openssl passwd -1 password
$1$LH1SwzJI$0ho4XuPVfGlbWIcNuGIap/
$ openssl passwd -1 password
$1$eAUtQOBh$GlvJwVsyb8In5KKkvnR0E0
$ openssl passwd -1 password
$1$PgaSiWTy$ElLh6uy83Y6T4Y70AGmV20

A quick Google search shows that there is a lot of confusion about salting.

But wait, now my password recovery feature won’t work

What’s that? You say your application has one of those “Forgot My Password” features where a user can type in their username and their current password will be sent to the e-mail address on file? Clearly, that requirement depends on passwords being stored either in the clear or using a reversible mechanism such as symmetric encryption.

The answer here is to redesign your password recovery feature. Don’t let an unnecessary requirement force you into poor security practices. If you must e-mail a password, generate a temporary password that’s only valid for a short time period, and require the user to log in immediately and select a new password. This obviates the need to retrieve the original, forgotten password.
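
A temporary password along those lines could be sketched as follows. The 30-minute lifetime, the 18 random bytes, and the names are all assumptions of mine; the important parts are that the value comes from a cryptographic random source and that it carries an expiry.

```java
import java.security.SecureRandom;
import java.time.Duration;
import java.time.Instant;
import java.util.Base64;

final class TemporaryPassword {
    static final Duration LIFETIME = Duration.ofMinutes(30); // assumed validity window

    final String value;       // what gets e-mailed to the user
    final Instant expiresAt;  // after this moment the value must be rejected

    private TemporaryPassword(String value, Instant expiresAt) {
        this.value = value;
        this.expiresAt = expiresAt;
    }

    // Issues a random, URL-safe temporary password that expires LIFETIME after 'now'.
    static TemporaryPassword issue(Instant now) {
        byte[] raw = new byte[18]; // 144 random bits, 24 characters once encoded
        new SecureRandom().nextBytes(raw);
        String value = Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
        return new TemporaryPassword(value, now.plus(LIFETIME));
    }

    // True only while the temporary password is still inside its validity window.
    boolean isUsableAt(Instant when) {
        return when.isBefore(expiresAt);
    }

    public static void main(String[] args) {
        TemporaryPassword temp = issue(Instant.now());
        System.out.println(temp.value + " valid until " + temp.expiresAt);
    }
}
```

On the server side you would store the temporary value the same way as any other password (salted and hashed) and force a password change on the first successful login.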

Why not just use symmetric encryption?

Instead of storing passwords in the clear, you could encrypt them using a symmetric algorithm such as AES and have the application encrypt/decrypt as needed. While this solves the plaintext storage problem, it creates a new problem: key management. Where do you store the key? How often does it change? How many people have access to it? What do you do if/when the key is compromised? And so on. The tradeoff really isn’t worth it for something that’s more elegantly solved with salted hashes.

Layered defenses

While you’re rethinking password storage, it might be a good time to consider other common flubs such as password complexity and brute-force protections.

In conclusion

  • Storing passwords in the clear puts your users at unnecessary risk if (when) your application database is compromised
  • Use salted hashes instead of storing passwords in a recoverable format
  • Password recovery mechanisms can be implemented without needing to obtain the original password
  • As with any aspect of security architecture, use layered defenses

Have fun refactoring!

How Boring Flaws Become Interesting

One of the great challenges for consumers of static analysis products, particularly desktop tools, is dealing with the large flaw counts. You have to wade through the findings to decide what to fix and when, which can be a daunting task. At Veracode, we continuously update our analysis engine to aggressively reduce false positives, thereby enabling our customers to more efficiently triage their results. Even so, it’s not unusual for customers to ask for clarification on certain flaws as they prioritize fixes.

The other day, we ran into an example that ended up being much more interesting than it appeared. The flaw category was Insecure Temporary Files, and the question was “should I really care about this?” The flaw we identified was in a Java application, and the offending line was something like this:

tmpFile = java.io.File.createTempFile(deploymentName, ".war");

I know what you’re thinking. You think the rest of this post is about how createTempFile() uses java.util.Random instead of java.security.SecureRandom to generate filenames, and since Random is seeded with the system time, you can work backwards to figure out the seed and use it to predict all future temporary files. That’s not it, so keep reading!

We couldn’t remember specifically what was so bad about createTempFile(), aside from using a non-cryptographic PRNG, so we checked the Java API for clues:

Creates a new empty file in the specified directory, using the given prefix and suffix strings to generate its name. … To create the new file, the prefix and the suffix may first be adjusted to fit the limitations of the underlying platform. If the prefix is too long then it will be truncated, but its first three characters will always be preserved. If the suffix is too long then it too will be truncated, but if it begins with a period character (‘.’) then the period and the first three characters following it will always be preserved. Once these adjustments have been made the name of the new file will be generated by concatenating the prefix, five or more internally-generated characters, and the suffix.

This behavior was verified with a quick test program:

$ for i in `seq 1 10`; do java createTempFile; done
/tmp/prefix53363suffix
/tmp/prefix200suffix
/tmp/prefix53898suffix
/tmp/prefix26801suffix
/tmp/prefix13687suffix
/tmp/prefix2221suffix
/tmp/prefix28661suffix
/tmp/prefix61720suffix
/tmp/prefix23104suffix
/tmp/prefix29833suffix

OK, that looks about right. It does what it says it does. One of my colleagues quickly raised the question, what happens if the generated filename already exists? So he generated /tmp/prefix0suffix through /tmp/prefix65535suffix and ran the test program again.

$ for i in `seq 1 10`; do java createTempFile; done
/tmp/prefix65536suffix
/tmp/prefix65537suffix
/tmp/prefix65538suffix
/tmp/prefix65539suffix
/tmp/prefix65540suffix
/tmp/prefix65541suffix
/tmp/prefix65542suffix
/tmp/prefix65543suffix
/tmp/prefix65544suffix
/tmp/prefix65545suffix

Uh-oh, not good. So not only does createTempFile() use a pretty small search space, but when it exhausts that space, it degrades to being 100% predictable? Decompiling the relevant portion of JRE 1.6.0_07, we can see exactly how the filenames are constructed:

private static File generateFile(String s, String s1, File file)
    throws IOException
{
    if(counter == -1)
        counter = (new Random()).nextInt() & 0xffff;
    counter++;
    return new File(file, (new StringBuilder()).append(s).append(Integer.toString(counter)).append(s1).toString());
}

public static File createTempFile(String s, String s1, File file)
    throws IOException
{
    ...
    File file1;
    do
        file1 = generateFile(s, s2, file);
    while(!checkAndCreate(file1.getPath(), securitymanager));
    return file1;
}

What this tells us is that createTempFile() is actually worse than we thought. Notice that counter is only ever assigned a random value once. As soon as it has that first random value, it simply increments from that point forward. The reason we didn’t get sequential output on our first test run was because we ran the test program 10 times, initializing counter each time. Had we put the loop inside the program, it would have generated a sequential list (try it yourself if you don’t believe me).
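
If you don’t have an old JRE handy, the behavior is easy to reproduce in isolation. The sketch below is a standalone simulation of the counter logic from the decompiled generateFile() above, with the loop inside the program. It does not call the JRE’s createTempFile() or touch the filesystem, and the class and method names are mine.

```java
import java.util.Random;

final class LegacyTempNameSimulation {
    // Mirrors the static counter in the decompiled code: -1 means "not yet seeded".
    private static int counter = -1;

    // Same logic as the old generateFile(): seed once with 16 random bits,
    // then simply increment on every subsequent call.
    static String nextName(String prefix, String suffix) {
        if (counter == -1) {
            counter = new Random().nextInt() & 0xffff;
        }
        counter++;
        return prefix + counter + suffix;
    }

    public static void main(String[] args) {
        // One process, ten names: only the first one involves any randomness.
        for (int i = 0; i < 10; i++) {
            System.out.println("/tmp/" + nextName("prefix", "suffix"));
        }
    }
}
```

Once an attacker has seen a single filename from a long-running process, every later one is known.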

As luck would have it, Sun actually just fixed this problem in their latest release, Java 6 Update 11. Amazing that it went so long without being discovered. The updated function looks like this:

private static File generateFile(String s, String s1, File file)
    throws IOException
{
    long l = LazyInitialization.random.nextLong();
    if(l == 0x8000000000000000L)
        l = 0L;
    else
        l = Math.abs(l);
    return new File(file, (new StringBuilder()).append(s).append(Long.toString(l)).append(s1).toString());
}

If you’re wondering, the same bug is present in IBM Java 6 SR2, but it’s been fixed in SR3.

Returning to the original question that led us down this rathole, we came to the conclusion that yes, these types of flaws ARE worth fixing. Predictability and security rarely go hand in hand.

Tallying Twitter’s Application Security Best Practice Violations

If you were paying attention the last few days, you’ve probably read about the wave of attacks launched against the popular Twitter service. It started over the weekend, with a series of phishing attacks sent to unsuspecting Twittizens via Direct Message. Then, on Monday morning, Fox News announced Bill O’Riley (sic) was gay, CNN anchor Rick Sanchez tweeted that he was high on crack, and the Barack Obama transition team decided to raise a few bucks using affiliate referral links to survey websites. All told, 33 celebrity accounts were compromised before Twitter caught on and took control of the hacked accounts.

Naturally, people wanted to know how it was done. A Twitter blog entry provided some vague detail:

The issue with these 33 accounts is different from the Phishing scam aimed at Twitter users this weekend. These accounts were compromised by an individual who hacked into some of the tools our support team uses to help people do things like edit the email address associated with their Twitter account when they can’t remember or get stuck.

What’s interesting about that paragraph is that the celebrity account hacks were not related to the phishing attacks, as one might assume, and they had nothing to do with an exploitable vulnerability in the Twitter app itself. Just a case of somebody getting hold of an admin account. Ho-hum.

Tonight, the “hacker” explained to Wired Magazine how he did it. I’ll try to summarize the attack, but you might have to read it several times because it’s subtle and complicated. Ready? Brace yourself… He used a dictionary attack to brute force a password.

Continue reading here after you’ve picked yourself up off the floor. Here’s the money quote:

The hacker, who goes by the handle GMZ, told Threat Level on Tuesday he gained entry to Twitter’s administrative control panel by pointing an automated password-guesser at a popular user’s account. The user turned out to be a member of Twitter’s support staff, who’d chosen the weak password “happiness.”

Now let’s consider the application security best practices that Twitter could have followed when designing their service, any of which would have foiled the attack.

  • Password complexity. In case you were wondering, the only restriction on Twitter passwords is a minimum length of six characters. No mixed case, no numbers, no special characters, none of that. Although they do encourage you to “Be tricky!”
  • Brute-force protections. Clearly there’s no account lockout mechanism, unless of course “happiness” was at the top of the word list. While there is no perfect solution to brute force attacks, it would appear Twitter didn’t even try.
  • Segregation of administrative functionality. I won’t underestimate the amount of effort required to segregate the admin interface. That being said, the attack would’ve failed if Twitter admins had to perform privileged functions via a dedicated internal interface.
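
The first two items are cheap to sketch. The example below shows a simple complexity check and a failed-login counter. The thresholds are arbitrary assumptions, the names are mine, and a hard lockout like this one has its own downside (an attacker can lock out legitimate users), so treat it as an illustration of the idea rather than a design.

```java
import java.util.HashMap;
import java.util.Map;

final class LoginDefenses {
    static final int MIN_LENGTH = 8;    // assumed policy values
    static final int MAX_FAILURES = 5;

    // Failed attempts per username since the last successful login.
    private final Map<String, Integer> failures = new HashMap<>();

    // Complexity: minimum length plus at least one upper, one lower, one digit.
    static boolean meetsPolicy(String password) {
        if (password == null || password.length() < MIN_LENGTH) {
            return false;
        }
        boolean upper = false, lower = false, digit = false;
        for (char c : password.toCharArray()) {
            if (Character.isUpperCase(c)) upper = true;
            else if (Character.isLowerCase(c)) lower = true;
            else if (Character.isDigit(c)) digit = true;
        }
        return upper && lower && digit;
    }

    // Brute-force protection: refuse further attempts after too many failures.
    boolean isLocked(String username) {
        return failures.getOrDefault(username, 0) >= MAX_FAILURES;
    }

    void recordFailure(String username) {
        failures.merge(username, 1, Integer::sum);
    }

    void recordSuccess(String username) {
        failures.remove(username);
    }

    public static void main(String[] args) {
        System.out.println(meetsPolicy("happiness")); // all lowercase, fails the policy
        LoginDefenses defenses = new LoginDefenses();
        for (int i = 0; i < MAX_FAILURES; i++) {
            defenses.recordFailure("support-user");
        }
        System.out.println(defenses.isLocked("support-user"));
    }
}
```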

Any others? Leave them in the comments.

In all fairness, it’s hard to make security a top priority in ANY company, much less a startup with overworked non-security-aware developers using an agile methodology with tight iterations (making some educated guesses here about Twitter). Ideally you want to start prioritizing security before you become an attractive target. Twitter missed the boat on that one, but I bet they’re paying attention now.

Microsoft Fixes 8-year Old Design Flaw in SMB

The recent Patch Tuesday release includes a fix for NTLM relaying, an issue that has been around for more than eight years.

In 2000, I wrote an advisory about NTLM relaying (CVE-2000-0834). The problem turned out to be significantly larger than I originally suggested in the advisory. The attack extended to other NTLM-based authentications on other protocols and allowed general-purpose credential theft via a man-in-the-middle attack.

The SMBRelay tool was published in 2001 by Sir Dystic of Cult Of The Dead Cow, and that really took it to the next level. The protocol completely fell apart. It kicked off a number of other analyses of the NTLM protocol that finally resulted in this patch, eight years after its discovery.

At least they got around to it. Thanks!

Minimizing the Attack Surface, Part 2

I’m finally getting around to finishing my post on minimizing attack surfaces. Here’s Part 1, in case you missed it.

First, a quick clarification. I noticed that some of the readers who commented on that first post wanted to talk about improving security through the use of various development methodologies or coding frameworks. Those are interesting tangents (and ones that I may write about in the future), but my intention with this post is to discuss a very specific problem related to how people integrate third-party code — that is, the stuff you import or link in but didn’t write yourself.

As I mentioned previously, developers have a tendency to “bolt on” third-party components to applications without understanding the security implications. Often, these components are glossed over or ignored completely during threat modeling discussions. I attempted to illustrate this with my fictitious WhizBang library example in Part 1.

When integrating a third-party component, developers familiarize themselves with the API but generally don’t care how it’s implemented. Granted, that’s how an API is supposed to work; you don’t have to futz around with code beyond the API boundary, and you can blissfully ignore parts of the library that you don’t need. In past consulting gigs, I’ve sat in threat modeling discussions where nobody knew whether a particular library generated network traffic. “We just use the API,” they say. The fact that it works is good enough; nobody seems to care how it works.

That mindset is ideal for rapid development but problematic for security. Failing to understand the complete application, as opposed to just the part you wrote, prevents you from accurately assessing its security posture.

It’s also no coincidence that web app pen testers love third-party components — we get excited when we see “bolted on” interfaces, because we know that developers tend to leave extraneous functionality exposed. The resulting findings usually generate reactions such as “I didn’t even know that servlet had an upload function.”

An Example

Here’s a close-to-home example related to my post about DWR 2.0.5 from the other day. DWR is an Ajax framework that has a variety of operating modes. In-house, we use a subset of DWR’s full functionality — specifically, we interact with it using the “plaincall” method only, so we made sure that the features we didn’t need were disabled via the configuration file. As it turned out, there were vulnerable code paths prior to the “do you have this thing disabled” check. In hindsight, if we had taken more time to understand the exposed interfaces, we could have reduced the attack surface by filtering out unneeded request patterns before they even touched the third-party code.

But wait, you say. What about maintainability? If I whitelist using a point-in-time application profile, doesn’t this create the same maintenance headache as the reviled WAF? It doesn’t have to. Certainly, one option would be to whitelist each and every unique URL that references the DWR framework, e.g.

/dwr/call/plaincall/myMethod1
/dwr/call/plaincall/myMethod2
/dwr/call/plaincall/myMethod3

But then you’d have to update the whitelist every time you added or removed functionality from your application. Also, don’t lose sight of the security goal, which is to minimize the amount of exposed third-party code. If I add or remove URLs from that list, provided they are still using the “plaincall” method, I’m hitting the same DWR dispatcher every time. So I’ve increased maintenance cost without any security benefit.

A better option is to simply tighten the URL pattern a bit in the J2EE container. Here’s the default configuration:

<servlet-mapping>
  <servlet-name>dwr-invoker</servlet-name>
  <url-pattern>/dwr/*</url-pattern>
</servlet-mapping>

Now, instead of allowing every URL starting with /dwr/ to be processed by the DWR library, you could be a little more restrictive:

<servlet-mapping>
  <servlet-name>dwr-invoker</servlet-name>
  <url-pattern>/dwr/call/plaincall/*</url-pattern>
</servlet-mapping>

In this configuration, you don’t have to worry about /dwr/call/someothercodepath any more. There is less third-party code exposed, thereby reducing the overall attack surface of the application. (NB: DWR also serves up a couple of JavaScript files, so those URL patterns will have to be whitelisted too.)
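
If you would rather enforce this in code, say from a servlet filter sitting in front of DWR, the matching logic might look like the sketch below. It deliberately leaves out the Servlet API so it stands on its own. The script and interface paths are my assumptions about what a DWR install serves; check them against your own deployment before relying on them.

```java
final class DwrPathAllowlist {
    // Assumed paths, for illustration: the one call style we use, plus the
    // JavaScript resources that the browser needs to fetch from DWR.
    private static final String CALL_PREFIX = "/dwr/call/plaincall/";
    private static final String INTERFACE_PREFIX = "/dwr/interface/";
    private static final String[] SCRIPT_PATHS = {"/dwr/engine.js", "/dwr/util.js"};

    // Returns true only for request paths that should ever reach the DWR servlet.
    static boolean isAllowed(String path) {
        // Reject anything that could climb out of an allowed prefix.
        if (path == null || path.contains("..")) {
            return false;
        }
        if (path.startsWith(CALL_PREFIX) || path.startsWith(INTERFACE_PREFIX)) {
            return true;
        }
        for (String script : SCRIPT_PATHS) {
            if (path.equals(script)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("/dwr/call/plaincall/myMethod1"));
        System.out.println(isAllowed("/dwr/call/someothercodepath"));
    }
}
```

As with the url-pattern approach, adding or removing plaincall methods in the application requires no change to the allowlist.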

A Logical Extension

Even if you’re not a developer, you should still be thinking about attack surfaces. People download and install blogging platforms such as WordPress, Movable Type, etc. all the time, but how many take additional steps to harden their installations? The concept is the same as the OS hardening analogy I brought up at the very beginning of this discussion.

Similarly, people install third-party WordPress plugins or Joomla components without considering that most of them are written by some random programmer who is a whiz with the plugin API but knows nothing about security.

At the risk of sounding trite, always remember that security is only as strong as the weakest link.

Minimizing the Attack Surface, Part 1

What was the first thing you learned about network security? There’s a good chance it had something to do with port scanning. After scanning a few boxes, you realized that modern operating systems have a lot of open ports by default, meaning a lot of services. Some had an obvious purpose, like telnet on tcp/23 or ftp on tcp/21. Others left you wondering, what the heck is listening on tcp/515 or tcp/7100? And remember, you couldn’t ask Google because it didn’t exist (well, maybe it did depending on when you got into security).

Your first real lesson about locking down a host was how to reduce its attack surface. You learned how to disable services using /etc/inetd.conf. Then you learned about rc.d and how to prevent unnecessary services from being launched at startup. Next, maybe you configured the Xserver to disallow remote connections or moved on to removing setuid permissions from files. As you worked, you’d periodically re-scan the box to gauge progress, asking yourself “have I removed everything I don’t need?” The underlying motivation, of course, is that an attacker can’t hack something that isn’t there.

You learned how to extend those concepts to the network — configuring firewall rules, router ACLs, VLANs, etc. Segmenting the network. Creating a DMZ. No need to dwell on this, you get the idea.

Eventually, people realized that applications had an attack surface too. Web servers and application servers got a lot of attention, followed closely by custom web applications. “What do you mean you can execute SQL queries against my database? That’s impossible, I have a firewall!”

Some companies, the ones who could afford it anyway, started to build security into their development cycle. Doing threat modeling during the design phase made sense, because hey, it’s much cheaper to fix security holes in a whiteboard drawing than it is to rewrite your authorization module from scratch after it’s in production.

Let’s talk strictly about custom web applications now. What I’ve observed is that most development groups, even the ones who actively engage in threat modeling, do not understand their web application’s attack surface. The lead architect can whiteboard a high-level diagram of all the major components and how they interact. Individual developers can go a bit deeper, telling you which files they touch, what database permissions they need, or how various pieces of data are encrypted in storage. At the end of this exercise you have a complete picture of the processes, data flows, protocols, privilege boundaries, external entities, and so on, and you’re well on your way to understanding all of the potential attack vectors.

Or are you?

What often gets overlooked or glossed over is the impact of external libraries or packages. Nobody writes everything from scratch. A typical list of third-party libraries for a Java-based Web 2.0 application might include DWR, GWT, Axis, and Dojo, plus about 30 other libraries to do everything from logging to parsing to image manipulation. Nine out of ten times, the libraries will be installed in full, using the default configuration from page one of the README file.

Why is this relevant? Because just as those old Unix boxes exposed unnecessary services, libraries expose unnecessary code. Let’s say you installed Dojo to simplify the process of creating an HTML table with rows and columns that can be sorted on demand. Did you remember to remove all the .js files you didn’t need? Or maybe you installed Axis or DWR or anything else that has its own Servlet(s) for processing requests. Have you compared what that Servlet can do against what you need it to do?

A fictitious example may help illustrate further. Imagine you just downloaded a new library called WhizBang. You follow the installation instructions to define and map two servlets in your web.xml file, WhizServlet and BangServlet, and you configure it to integrate with your web app. After a bit of trial and error, it’s functional. Yay! This is where most developers stop.

Nobody asks, “how much of this do I actually need?” Case in point, what if your application only uses WhizServlet? BangServlet is still exposed, and you don’t even use it! Similarly, what if WhizServlet takes an “action” parameter which can be either “view”, “edit”, or “delete”, and your application only uses “view”? You’re still exposing the other actions to anybody who knows the URL syntax (pretty trivial if it’s open source). You wouldn’t expose large chunks of your own code that you weren’t using, so why should it be any different with libraries?
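
Sticking with the fictitious WhizBang library, a guard that exposes only what the application actually uses could be sketched like this. Everything here is hypothetical, including the class and method names; the point is simply that the allowlist reflects your usage, not the library’s full feature set.

```java
import java.util.Set;

final class WhizBangGuard {
    // The application only uses WhizServlet, and only its "view" action.
    private static final String USED_SERVLET = "WhizServlet";
    private static final Set<String> USED_ACTIONS = Set.of("view");

    // Returns true only for the servlet/action combinations the application needs.
    static boolean isPermitted(String servletName, String action) {
        if (servletName == null || action == null) {
            return false;
        }
        return USED_SERVLET.equals(servletName) && USED_ACTIONS.contains(action);
    }

    public static void main(String[] args) {
        System.out.println(isPermitted("WhizServlet", "view"));
        System.out.println(isPermitted("WhizServlet", "delete"));
        System.out.println(isPermitted("BangServlet", "view"));
    }
}
```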

This post is getting kind of long so I’m going to split it up. In the next post, I’ll continue the discussion of attack surface minimization, as well as some of the tradeoffs that go along with this approach.

Art vs. Science

I was just reading Dre’s post, R.I.P. CISSP, over at the tssci security blog, in which he predicts the upcoming OWASP People Certification Project will be the next big thing. This paragraph is quoted from James McGovern’s blog (James is the project leader):

As an Enterprise Architect, I understand the importance of the ability for a security professional to articulate risk to IT and business executives, yet I am also equally passionate that security professionals should also have the capability to sit down at a keyboard and actually do something as opposed to just talking about [it].

I agree wholeheartedly with this sentiment, and I believe the project goals are noble. So I went to read the latest OPCP draft proposal to see how they planned to tackle this admittedly difficult problem. What did I find? It’s just another test, with questions in a dozen or so broad categories. Far more specialized than CISSP, with topics that are more relevant to application security, but ultimately, still just a test.

The comment I once made about security educators/trainers is relevant here. Whatever questions end up on the OPCP test, these educators could probably answer most of them correctly without even studying. They lecture day in and day out about these topics. They have heard obscure questions and are prepared to answer them. And yet, many of them do not have any practical field experience.

A client chastised me once for making a statement that penetration testing is a mixture of art and science. He wanted to believe that it was completely scientific and could be distilled down to a checklist-type approach. I explained that while much of it can be done methodically, there is a certain amount of skill and intuition that only comes from practical experience. You learn to recognize that “gut feel” when something is amiss. He became rather incensed and, in effect, told me I was full of it. This customer went on to institute a rigid, mechanical internal process for web app pen testing that was highly inefficient and, ultimately, still relied mostly on a couple of bright people on the team who were in tune with both the art and the science.

Certifications only test the science.

Not a CISSP

One of my favorite pieces of swag from RSA was this “Not a CISSP” button that was pinned onto me by none other than Sinan Eren as I was chatting with Justine Aitel at the Immunity booth. Actually, there should have been a prize awarded just for finding the Immunity booth — they were subletting another vendor’s space for a few hours at a time, so one minute they’d be there and the next they were gone.

I digress. What inevitably happened once I started walking around with this button proudly displayed was that I would get one of two reactions. The first group — mostly current and former co-workers and acquaintances — understood the humor and got a good chuckle out of it. The second group would ponder for a bit and then ask, with some confusion, why I’d intentionally point out the fact that I’m not a CISSP. I’d give a brief answer and get back to talking about Veracode (we booth babes have responsibilities, you know).

So, why indeed? The long answer is that like many security certifications, it’s an ineffective measure of a security professional’s practical abilities. Employers and customers often assume the guy with the five magic letters on his resume is technically superior to the guy without. In my experience, it’s exactly the opposite, particularly in situations where you have to sit down at a keyboard and actually DO something as opposed to talking about it. Certainly, I’ve encountered some very notable exceptions to this observation, but we’re playing by the 80/20 rule here.

There’s a good reason for this. The trend in information security is toward specialization. Security has become such a broad umbrella of varying disciplines that it’s quite difficult to be a generalist. A security career is a balance between breadth and depth, and these days, the skilled pen tester, reverse engineer, or vulnerability researcher is more marketable than the guy who knows a little bit about dozens of different disciplines but can’t apply that knowledge in a practical situation. The CISSP subject matter illustrates this perfectly — you have cryptographic algorithms, site location principles, network security, and civil law on the same exam. I won’t even get into the complaints I’ve heard about the poorly-worded, overly simplistic exam questions or the ones that simply test one’s ability to memorize obscure facts.

I’m not claiming that there’s no value to holding the CISSP certification. It can’t hurt to have some exposure to business continuity planning, for example. The problem, as I stated in the beginning, is that the CISSP title is often interpreted as an indicator of practical abilities rather than a book-level understanding of security basics. These misaligned expectations can ultimately lead to bad hiring or staffing decisions.

Career advice, take it or leave it: If an employer or prospective employer demands that you get your CISSP in order to be hired or to progress in your career, run fast in the opposite direction and find a place where you will be valued for your cumulative experience rather than a piece of paper. Learn by doing, don’t “learn the test,” so to speak.

And that, in a nutshell, is why I love my “Not a CISSP” button.

By the way, here was my other favorite from RSA, thanks to WhiteHat. This one and “Samy is my hero” were the best out of a pretty clever selection… even though they forgot the semicolon after the single quote. <grin>

DROP Table SalesPitch

Squirreling Backdoors Into Distribution Points

So it seems that SquirrelMail 1.4.11 and 1.4.12 were recently backdoored. Similar to some high-profile backdoors in the past, this was done by modifying the distribution tarball rather than infiltrating the source code repository [1]. In this case, the backdoor was detected when a user noticed that the MD5 published on SquirrelMail’s website didn’t match the calculated MD5 from the SourceForge distribution.

Since the SVN repository remained intact, we can’t go back and examine the backdoor in detail. However, we do have a newsgroup posting that sheds a little light on the situation:

> What diff do you see between the compromised version and
> the one that is there now? I see only a comment diff in one file.

it was a small block of code that checks for a $_SERVER var. If that var was present, it would redefine SM_PATH. Under normal circumstances, this would never be executed, but we have since learned how to make it execute.

In PHP, $_SERVER is an array populated by the web server that contains information such as headers, paths, and script locations. This includes some user-supplied input such as the URL query string and the HTTP headers. SM_PATH is the filesystem path where SquirrelMail is configured to be run from. So once an attacker controls SM_PATH, it’s likely that a subsequent call to include() can be exploited to fetch and execute PHP code from a remote server. This is a typical example of a Remote File Include vulnerability.
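The backdoor itself isn't available to examine, so the following is only a simplified Python model of the logic described in the newsgroup post, not a reconstruction of the actual PHP. The key name "HTTP_X_EXAMPLE", the default path, and the file name "functions/example.php" are all hypothetical stand-ins. It shows how a request-controlled override of the base path changes what a later include would load.

```python
# Simplified model of the described behavior. The real backdoor was PHP;
# every name below is a hypothetical stand-in.
DEFAULT_SM_PATH = "/var/www/squirrelmail/"


def resolve_sm_path(server_vars: dict) -> str:
    """Return the base path, honoring an override if the request supplies one."""
    # "HTTP_X_EXAMPLE" stands in for whichever $_SERVER key the backdoor checked.
    return server_vars.get("HTTP_X_EXAMPLE", DEFAULT_SM_PATH)


def include_target(server_vars: dict, relative: str = "functions/example.php") -> str:
    """Build the string that a later include() call would receive."""
    return resolve_sm_path(server_vars) + relative


# Normal request: the variable is absent, so the configured path is used.
normal = include_target({})
# Attacker request: a crafted header sets the variable to a remote location.
attacked = include_target({"HTTP_X_EXAMPLE": "http://attacker.example/"})

print(normal)    # /var/www/squirrelmail/functions/example.php
print(attacked)  # http://attacker.example/functions/example.php
```

Under normal circumstances the override branch never fires, which matches the newsgroup description and is what makes this kind of change easy to overlook in a quick review.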

Note that the attacker backdoored the 1.5.1 distribution as well, with the same type of vulnerability but at a different location in the codebase.

I think what’s most interesting to me about this is that so many open source projects still rely on MD5 hashes for integrity checking. The minute the Xiaoyun Wang paper on MD5 collisions was released, every security practitioner in the world considered MD5 unsafe from that point forward. Even though practical attacks had not yet been formulated, the writing was on the wall. Unfortunately, the rest of the world either didn’t notice or didn’t care.

Cryptographers have since developed increasingly sophisticated attacks stemming from Wang’s original work. Recently, researchers in the Netherlands demonstrated two examples of chosen-prefix attacks which would make it possible for an attacker to take two tarballs (one original, one backdoored) and append a series of bytes to each that result in both files having the same MD5 hash. This proves beyond a shadow of a doubt that MD5 is not an effective method for verifying software integrity. There was hardly any doubt that this attack would surface eventually, so why is MD5 still in such widespread usage?

Cryptographic weaknesses aside, a lot of people completely miss the mark with hashes. MD5, SHA-1, or any other hash function is not very effective if the only place a user can verify the hash is the same website where the download is hosted. If the download point is compromised, chances are the attacker can modify the hashes printed on the website too. Even when it’s done correctly, hashes only help identify when the distribution point is compromised. They do nothing to protect against source code compromise or vulnerabilities in the development tool chain.
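As a rough sketch of what "done correctly" might look like on the user's side, the Python below hashes a downloaded file and compares it to a digest obtained somewhere other than the download site. Two assumptions to flag: SHA-256 is swapped in here in place of MD5, and the function names are made up for this example. How the trusted digest reaches the user (a signed announcement, a separate mirror, and so on) is outside the code and is the part that actually matters.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash the file in chunks so large tarballs need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_trusted_digest(path: str, trusted_hex: str) -> bool:
    """Compare the file's SHA-256 against a digest obtained out of band.

    trusted_hex must come from a channel the download site's attacker
    cannot also modify; otherwise this check proves nothing.
    """
    return sha256_of_file(path) == trusted_hex.strip().lower()
```

A caller would pass the path of the downloaded tarball and the independently obtained digest, and refuse to install if the function returns False. As noted above, a match only says the file is the one the project published, not that the published source is free of backdoors.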


[1] Static Detection of Backdoors, Chris Wysopal and Chris Eng, 2007.
