Do Not Pass QA: Catching Subtle Bugs In Act

687474703a2f2f692e696d6775722e636f6d2f6e454859716d532e706e67

Bugs happen. Severe bugs happen. Catastrophic bugs happen. There's simply no way to know how, exactly, the Goto Fail Bug – a tiny mistake which happened to disable an entire step of SSL verification deep in Apple code – ended up getting written into sslKeyExchange.c and saved. What is clear is that the bug got through Apple's QA process unnoticed and ultimately shipped on iOS and OSX. Let's consider for a moment that this bug was committed to your codebase during routine refactoring. How certain are you, really, that you would catch it? What can we do to improve the likelihood it will be caught? I'm going to be using the term "security-sensitive code" a lot, so I will use the abbreviation SSC. By SSC, I mean code that, by design, is making critical security decisions – such as around user identity, file access rights, and of course, encryption. A bug in security-sensitive code is more serious than code which is subject to the decisions of SSC. As such, an extra degree of care should be taken in writing and especially testing your SSC. Code auditing tools which may be too "noisy" to apply to a large codebase should be deployed selectively where a bug will hurt you most. As such, you should make sure you have clearly identified your security-sensitive code on a file-by-file basis. When new code is written, explicitly classify it as SSC or non-SSC.

Eyeballing It

This bug got past the just eyeballing it stage. It's not hard to see why: C syntax is very permissive when it comes to whitespace and indentation, and happens to have a peculiar "braceless condition" which binds to only the next statement. The code in question uses a style which looks concise and elegant (in my opinion anyway) but unfortunately can be visually misleading. In the original code, the bug in situ looks like this:


    if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
        goto fail;
        goto fail;
    if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
        goto fail;

With braces and indentation applied (to one of several possible standards), it would look like this:

if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0) {
        goto fail;
    }
    if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0) {
        goto fail;
    }
    goto fail;
    if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0) {
        goto fail;
    }

It is now much more visually obvious that the second goto fail is outside of a control flow block. If the braces had been there when the file was edited, the accidental second goto fail would probably never have ended up sitting there independent of an if-block in the first place. All programmers have their own very strongly held opinions about code formatting, but the lesson here is that in a language with flexible formatting, a QA-friendly standard should be chosen and enforced, especially in SSC. Enforcement doesn't need to be done by drilling it into programmers until they see braces with their eyes closed – most IDEs are happy to assist, and source files can be run through completely automated tools to clean up any slips. An example of a free tool which supports re-bracing unbraced clauses specifically would be uncrustify. Indent is how I got through C++ class in university without getting points taken off for failing to adhere to the professor's mandated standard. When he found out, he told me not to tell the other students about this standard utility – but using it isn't cheating in the real world. Additionally, make sure your QA team has access to the little conveniences in life, like colorized diffs:

When a line is changed, one sees a matching pair of red and green bars. The green bar alone shows that entirely new logic was added. In this particular case, once your eyes are drawn to the green bar, the result should look odd. (It should be noted that Apple releases open source in batches. The diff between the broken version and the older, non-broken one is most likely the combination of several distinct commits which would each produce a smaller diff.) As an aside, the colors in the above screenshot are not very red-green colorblind friendly. Whenever you use color to convey significant information, the shades used should be configurable by the end-user to be both easily distinguishable and not harsh on their eyes.

The Warning Waterfall

Compilers for the C family of programming languages are notoriously picky. Most possible warnings are actually disabled by default in major compilers because they're often very minor and annoying. As such, when you want to be picky, you reach for the -Wall ("warn all") flag or equivalent. Contrary to what the name suggests, however, -Wall is not all warnings in either gcc or clang. Running literally all warnings in gcc is a bit complicated, and clang provides -Weverything. Does this matter, versus -Wall, in practice? I pulled out a hobby project of mine which is a few hundred lines of C. Under clang -Wall, it came back completely clean. Under clang -Weverything, it comes back with ten warnings. Let's enumerate them:

Four of them are complaints that my files do not end with a newline character. This is pedantic in the extreme and easy to see why it wouldn't be included in -Wall.
One warns that a certain cast increases the alignment requirement. In context, the warning is caused by my test harness deliberately miscasting something to verify that the function call it is passed to will detect that it is invalid and fail gracefully. If it were not an intentional "mistake", I might have just unearthed an interesting bug.
Three are warnings about implicit sign conversion. In security-sensitive code, you should pay very close attention to these. Take extreme care that the intended logic of a check cannot be evaded by a sign conversion.
One is that I'm passing a non-literal format string specifier to something in the sprintf family. It turns out my API depends on the caller not allowing the parameter ultimately passed to the sprintf call to be user-controlled. That should probably be very explicitly documented, and call sites manually inspected for compliance.
The last warning is that one of my functions will never return. In this case, it was deliberate, but it wants me to add the noreturn attribute to make this clear. If it weren't deliberate, the possibility is there to really wreck a critical code path's day.

So 60% of the warnings returned by -Weverything were worth following up on to make sure my security-sensitive code is completely airtight, and the other 40% were all the same ignorable note. A large file could easily generate hundreds or even thousands of such warnings, but in security-sensitive code, this effort is worth it. In fact, clang -Weverything can catch the Goto Fail bug specifically. How? By complaining that there is dead code; SSLHashSHA1.final() and sslRawVerify() never get called and hence can be culled by the optimizer. That's a huge red flag. It's entirely possible for source to contain harmlessly dead code, of course, or even for it to be in there for deliberate reasons. However, you should definitely follow up on warnings such as this in SSC and make absolutely sure that you do not have broken control flow. Clang makes the dead code check available individually as -Wunreachable-code. Unfortunately, gcc removed support for this feature, as their implementation was optimizer-dependent and the inconsistency must have been driving someone crazy. I think that is a shame. If you're using Visual Studio for native code, well, don't look at me. I can never find anything on MSDN.

Evil Unit Tests

Unit tests! You do have them, yes? All security-sensitive code should have a corresponding test suite which verifies that every outcome of every security decision works properly. If this sounds like a lot of work, well, it is. It's not surprising that Apple's test suite did not catch the Goto Fail bug, because it's very narrow in terms of the many, many things that go on at multiple layers when negotiating an SSL connection over the network. However, that should not dissuade you from trying. Look into using readily available hacking tools to set up malicious interactions with your security-sensitive code without having to write an enormous custom tool which will itself have bugs. Be aware that downloading metasploit to your workstation may cause antiviruses to believe their day has finally come, and you to receive an anxious visit from the network security team. Speaking from experience, you ask? Maybe. If you learn anything from Goto Fail, it should be that no change is too minor to not be a risk of introducing catastrophic failure to your SSC. Always, always, always run your entire test suite before pushing any change in SSC to production.

Third Parties

I'd be remiss (read: in trouble) not to mention that there are third parties who can provide another perspective in reviewing your security-sensitive code. Every writer knows how difficult it can be to spot minor typos in their own draft, because they know what they meant. A third party can surface errors that escaped the notice of those intimately familiar with the codebase. In the interest of being completely clear, it just so happens that Goto Fail, specifically, would not be caught by Veracode's static binary analysis. This is because, in terms of the binary, nothing is demonstrably wrong; control flow proceeds smoothly and safely. The only way to tell from the binary that something is wrong is to have advance knowledge that the two functions SSLHashSHA1.final() and sslRawVerify() need to be called, and they are not. However, our static analysis can find many types of mistakes in utilizing standard security APIs, and all sorts of other minor slip-ups with big consequences. We frequently find that our customers have mistakenly shipped code with SSL certificate verification disabled – a "quick fix" for internal testing which never got removed, and puts end-users at risk of MITM attacks. This has a very similar effect to the Goto Fail bug! Code is hard. Security-sensitive code is very hard. Use all the tools at your disposal to minimize the risk of shipping your own Fail. Here's a picture, just because it's cool, of the iOS 6 binary before and after the patch for Goto Fail. IDA screenshot (and all the hard work) courtesy winocm.

Do Not Pass QA, Do Not Goto Fail: Catching Subtle Bugs In The Act

Eyeballing It

The Warning Waterfall

Evil Unit Tests

Third Parties

Related Posts

Enhancing Developer Efficiency With AI-Powered Remediation

Speed vs Security: Striking the Right Balance in Software Development with AI

Veracode Advances Cloud-Native Application Security with Longbow Acquisition