May 5, 2008

Dilbert Does Canonicalization

I was looking over the "new and improved" Dilbert website a few minutes ago, trying out some of the new features and lamenting the overzealous use of Flash. One new feature is called "Mashups." Naturally, you'd assume that this was some fancy Web 2.0 API that one might use to create a "killer app" combining Google Maps, Twitter, traffic delays, police reports, and Dilbert comics, all neatly packaged up as a privacy-invading Facebook plugin. Sorry, no such luck. "Mashups" turns out to be a way for readers to unleash their inner comedian and create customized punch lines for the daily comic, which can then be voted on by others. For example, here are the mashups from the May 3rd comic.

Below is a screenshot of some of the user-generated comics that can be viewed. I've magnified the last pane of one of the strips using Flash's "Zoom In" feature. Notice anything interesting?

Yep, it's our old friend URL encoding, commonly used by web browsers to include non-alphanumeric characters in an HTTP request. Just interpret the XX in %XX as a hex number, so %20 is the space character (decimal 32), %21 is an exclamation point (decimal 33), and so on. But why is it showing up in a Dilbert mashup?
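For anyone who wants to see the %XX rule in action, here is a minimal Python sketch. The helper name decode_by_hand and the sample string are made up for illustration, and the hand-rolled version only handles single-byte ASCII escapes; it is compared against the standard library's urllib.parse.unquote.

```python
import string
from urllib.parse import unquote


def decode_by_hand(text: str) -> str:
    """Replace each %XX with the character whose code is hex XX.

    Illustrative only: handles single-byte (ASCII) escapes, not
    multi-byte UTF-8 sequences.
    """
    out = []
    i = 0
    while i < len(text):
        pair = text[i + 1:i + 3]
        # A valid escape is '%' followed by exactly two hex digits.
        if text[i] == "%" and len(pair) == 2 and all(c in string.hexdigits for c in pair):
            out.append(chr(int(pair, 16)))  # e.g. "20" -> 32 -> " "
            i += 3
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


# Hypothetical sample, not taken from the actual site.
sample = "Hello%2C%20world%21"
decoded = decode_by_hand(sample)
print(decoded)
print(decoded == unquote(sample))
```

For ASCII-only input like this, the hand-rolled decoder and unquote should agree.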

My first thought was that someone must be poking around the Dilbert site looking for security holes. But then I noticed that it wasn't just the one strip; a lot of them had the same problem. And it seemed unlikely that there were that many security-minded people messing with the site relative to the rest of the cubicle dwellers trying to come up with funny things for Dilbert to say.

My next thought was that some developer simply forgot to call urlDecode() -- or whatever the Flash equivalent is -- on the user-supplied punch line. Except that's an oversimplification because: 1) it doesn't happen on every strip, 2) the web server usually strips off the first layer of URL encoding so the backend wouldn't see it unless it was double encoded (e.g. %2520), and 3) if you click on one of the thumbnail comics with the URL encoding anomaly, the full-size rendered version of the comic looks fine:

So clearly the "preview" code and the "full-size render" code are doing slightly different things with the same data, which may or may not have been properly decoded prior to being inserted into the database.
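To make that guess concrete, here is a small Python sketch of one way it could happen. Everything in it is an assumption for illustration: the functions render_preview and render_full are hypothetical stand-ins, not the site's actual code, and the scenario (a double-encoded punch line that gets decoded only once before storage) is speculation.

```python
from urllib.parse import quote, unquote

punch_line = "Hi there!"

# Assumption: the client encodes the punch line twice before sending it.
once = quote(punch_line)   # "Hi%20there%21"
twice = quote(once)        # "%" becomes "%25", giving "Hi%2520there%2521"

# The web server strips one layer of encoding, so the backend
# stores a value that is still encoded once.
stored = unquote(twice)


def render_preview(value: str) -> str:
    """Hypothetical thumbnail path: uses the stored value as-is."""
    return value


def render_full(value: str) -> str:
    """Hypothetical full-size path: decodes before rendering."""
    return unquote(value)


print(twice)
print(render_preview(stored))
print(render_full(stored))
```

Under these assumptions the preview would show the raw %20 and %21 while the full-size render looks normal. Strips whose punch lines were encoded only once would look fine in both places, which would also fit the observation that not every strip is affected.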

Any thoughts, readers? The pen tester in me wants to get to the bottom of this, but unlike some of the web app security people out there, I tend to be more conservative about hacking at stuff without a signed contract. Also, I don't think I can stand to read any more un-funny punch lines. But my gut tells me there is something fairly interesting going on behind the scenes here. Exploitable? Probably not. But it's a great example of how easy it is to misinterpret data.

Oh, and finally, here's a tip from Scott Adams himself on avoiding the Flash navigation and viewing the daily comic as a plain ol' GIF.


Chris Eng, Chief Research Officer, is responsible for integrating security expertise into Veracode’s technology. In addition to helping define and prioritize the security feature set of the Veracode service, he consults frequently with customers to discuss and advance their application security initiatives. With over 15 years of experience in application security, Chris brings a wealth of practical expertise to Veracode.
