April 27, 2009

Decoding the Verizon DBIR 2009 Cover

As you probably know by now, the pattern of 1s and 0s on the cover of the 2009 Verizon Data Breach Investigations Report contains a hidden message. I decided to give it a whirl and eventually figured it out. No doubt plenty of people managed to beat me to it, as evidenced by the fact that I didn't get my solution in early enough to win the cash prize -- but so far, I haven't seen anybody write up a walkthrough, so I thought I'd do one. If you haven't taken a crack at it yet and plan to, then stop reading -- SPOILERS AHEAD, as they say. Otherwise, I hope that this is helpful to anyone who is interested in learning more about basic cryptography.

I started by copying and pasting the binary digits from the cover of the report into a text file. Then I converted the bits to ASCII, resulting in the following text:

$ cat vz|unsplit|bin -d|split 72
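
For readers without the helper tools used in the pipeline above, here is a minimal Python sketch of the same bits-to-ASCII step. It assumes each character is a group of 8 bits and that whitespace between the digits carries no meaning; `bits_to_text` and `wrap` are hypothetical names for illustration, not the actual `unsplit`, `bin`, or `split` tools.

```python
def bits_to_text(bits: str) -> str:
    """Decode a string of 0s and 1s as 8-bit ASCII, ignoring whitespace."""
    digits = "".join(bits.split())  # drop spaces and line breaks
    if len(digits) % 8:
        raise ValueError("bit count is not a multiple of 8")
    return "".join(
        chr(int(digits[i:i + 8], 2)) for i in range(0, len(digits), 8)
    )


def wrap(text: str, width: int = 72) -> list[str]:
    """Break text into fixed-width lines, like the final 'split 72' step."""
    return [text[i:i + width] for i in range(0, len(text), width)]


# Toy input (not the real cover): 01001000 is 'H', 01001001 is 'I'.
print(bits_to_text("01001000 01001001"))  # HI
```

Feeding it the contents of the text file instead of the toy string should give the ciphertext, provided the 8-bit assumption holds.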

The first thing I tried was Caesar shifts, which are basically ROT-n for values of n from 1 to 25. So for n=1, A is encoded as B, B is encoded as C, and so on, all the way through Z, which is encoded as A. For n=2, A is encoded as C, B is encoded as D... you get the picture. I won't print out the results of all the decodes because it would take up too much space, but suffice it to say, nothing interesting came out of it.

Next, I tried frequency analysis, which can be an effective way of deciphering a simple substitution cipher (i.e. one where a given plaintext character is always encoded as the same ciphertext character). The ciphertext of a simple substitution cipher preserves the tendency of a written language to use certain letters more than others. For example, in English, the most frequent letter, E, appears roughly 170 times more often than the least frequent letter, Z, for a sufficiently large sample size. Here's the frequency distribution from the Verizon ciphertext:

$ cat vz|unsplit|bin -d|split 1|sort|uniq -c|sort -n
     21 D
     21 Q
     24 O
     25 X
     26 A
     26 H
     27 B
     28 L
     28 U
     29 J
     31 C
     32 M
     33 W
     34 F
     34 S
     36 P
     38 I
     40 N
     40 V
     40 Y
     41 G
     42 Z
     43 T
     44 K
     56 R
     61 E
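
The tally above is just a per-letter count. A rough Python equivalent of the `sort|uniq -c` step is sketched below, together with a quick check of how flat the distribution is. The `letter_counts` helper is a hypothetical name; the `counts` dictionary is copied from the table above.

```python
from collections import Counter


def letter_counts(text: str) -> Counter:
    """Count ASCII letters in text, case-insensitively."""
    return Counter(ch for ch in text.upper() if "A" <= ch <= "Z")


# Counts copied from the tally above.
counts = {
    "D": 21, "Q": 21, "O": 24, "X": 25, "A": 26, "H": 26, "B": 27,
    "L": 28, "U": 28, "J": 29, "C": 31, "M": 32, "W": 33, "F": 34,
    "S": 34, "P": 36, "I": 38, "N": 40, "V": 40, "Y": 40, "G": 41,
    "Z": 42, "T": 43, "K": 44, "R": 56, "E": 61,
}

total = sum(counts.values())
ratio = max(counts.values()) / min(counts.values())
print(total, round(ratio, 1))  # 900 2.9
```

A most-to-least ratio of about 2.9 is tiny next to the roughly 170 you'd expect from English text.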

As you can see, the most frequent character, E, was only three times as prevalent as the least frequent character, D, which meant it was unlikely to be a simple substitution cipher, provided the plaintext was English. The frequency distribution was far flatter than what we would expect.

Just for kicks, I tried various transposition ciphers, rearranging the 900 characters into an M-by-N grid, for different values of M and N (M*N=900), and reading down the columns instead of across the rows. Frequency analysis already told us that we shouldn't expect to see any English text, since a transposition only reorders the letters and leaves their frequencies unchanged, but I thought some visual patterns might emerge. Wrong again.

Around this point, I saw somebody on Twitter mention that there were clues embedded in the body of the report, so I started skimming through it. At the bottom of page 48 is “yr puvsser vaqrpuvssenoyr” which ROT-13 decodes to “le chiffre indechiffrable.” Here’s where I went briefly astray by using Google Translate instead of just Googling the term. The literal translation from the French is “indecipherable figure” which made me think that the clue was that the whole thing was a hoax and the front cover was just a bunch of garbage. A friend reminded me that “le chiffre indechiffrable” actually refers to a Vigenère cipher, which would have been painfully obvious if I’d used regular Google search instead of Google Translate (smacks self on head).

Logically, the Vigenère would've been the next target anyway, as it's just a simple substitution cipher with a twist. If you're not familiar with how a Vigenère cipher works, it basically uses a keyword to cycle through different substitution maps. For example, if you were encoding ZZZZZZ with the keyword FOOBAR it would come out as ENNAZQ -- the letter Z is encoded differently depending on how it aligns with the keyword. You can see why frequency analysis isn't useful here. My first inclination was to just guess the keyword outright.
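
The FOOBAR example can be checked with a short Python sketch of a Vigenère cipher. This is an illustration under the usual A=0 ... Z=25 convention, not the Crypt::Vigenere module; the `vigenere` function name and its `decrypt` flag are made up for this example. A one-letter key reduces it to a Caesar shift, so the key N reproduces the ROT-13 decode of the page 48 clue.

```python
from itertools import cycle


def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Shift each ASCII letter by the matching key letter (A=0 ... Z=25).

    Non-letters pass through unchanged and do not consume a key letter.
    """
    sign = -1 if decrypt else 1
    shifts = cycle(ord(k) - ord("A") for k in key.upper())
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("a") if ch.islower() else ord("A")
            out.append(chr((ord(ch) - base + sign * next(shifts)) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)


print(vigenere("ZZZZZZ", "FOOBAR"))  # ENNAZQ
print(vigenere("yr puvsser vaqrpuvssenoyr", "N", decrypt=True))
# le chiffre indechiffrable
```
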
I thought maybe it was something obvious such as VERIZON, VZ, RISK, DATA, BREACH, VIGENERE, etc. I grabbed Crypt::Vigenere and tried each of the guessed keywords, but none of them worked. I even wrote a quick script to brute force all 2- and 3-letter keywords, again coming up with nothing.

Then I took a different approach -- trying to guess what the decoded message might contain and working backwards. I speculated that the first word would be CONGRATULATIONS, which corresponds to a potential key of CHANGINEXMDKZRP. This didn't seem right, but the CHANGIN part of it seemed like too much of a coincidence. So I tried CONGRATS as the plaintext, which corresponded to the keyword CHANGING. I thought it was solved at this point, but decoding the entire ciphertext using CHANGING as the keyword still gave me junk.

So then I searched through the PDF for the word CHANGING, and sure enough, on page 46, one of the bullet items says “Changing default credentials is key” (clever, huh). So I decoded with a keyword of CHANGINGDEFAULTCREDENTIALS and it worked. The text decodes to the following message:

$ cat vz|unsplit|bin -d|vigenere -d changingdefaultcredentials|split 72
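
The "guess the plaintext and work backwards" step is simple arithmetic: each key letter is the ciphertext letter minus the guessed plaintext letter, modulo 26. Here's a minimal Python sketch of that idea. The function names are hypothetical, and `TOY_PLAINTEXT` is a made-up message that merely starts with CONGRATS, not the real decoded text.

```python
from itertools import cycle

A = ord("A")


def vigenere_encrypt(plaintext: str, key: str) -> str:
    """Add the repeating key to uppercase A-Z plaintext, letter by letter."""
    return "".join(
        chr((ord(p) + ord(k) - 2 * A) % 26 + A)
        for p, k in zip(plaintext, cycle(key))
    )


def vigenere_decrypt(ciphertext: str, key: str) -> str:
    """Subtract the repeating key from uppercase A-Z ciphertext."""
    return "".join(
        chr((ord(c) - ord(k)) % 26 + A)
        for c, k in zip(ciphertext, cycle(key))
    )


def key_from_crib(ciphertext: str, crib: str) -> str:
    """Recover key letters as (ciphertext - guessed plaintext) mod 26."""
    return "".join(
        chr((ord(c) - ord(p)) % 26 + A) for c, p in zip(ciphertext, crib)
    )


KEY = "CHANGINGDEFAULTCREDENTIALS"
TOY_PLAINTEXT = "CONGRATSYOUSOLVEDIT"  # hypothetical, for illustration only
toy_ciphertext = vigenere_encrypt(TOY_PLAINTEXT, KEY)

print(key_from_crib(toy_ciphertext, "CONGRATS"))             # CHANGING
print(key_from_crib(toy_ciphertext, "CONGRATULATIONS")[:7])  # CHANGIN
# A truncated key only decodes the part it lines up with:
print(vigenere_decrypt(toy_ciphertext, "CHANGING") == TOY_PLAINTEXT)  # False
print(vigenere_decrypt(toy_ciphertext, KEY) == TOY_PLAINTEXT)         # True
```

A crib that is right for the first seven letters gives seven correct key letters followed by noise, which is the same pattern as CHANGINEXMDKZRP.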

Had the message not begun with “CONGRATS”, there are some other techniques for attacking a Vigenère cipher, including trying to deduce the length of the keyword by looking for cyclical patterns in the ciphertext. Luckily, it didn't come to that because I wanted to watch TV.

I visited the embedded URL, which said that somebody had already claimed first prize but that I was still in the top three. I later found out that about ten people, including me, submitted solutions around the same time before the authors could update the congratulatory message. So I didn't win any money but it was still a lot of fun (and significantly better than the corny FBI challenge).
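
For the curious, one common way to estimate the keyword length is the index of coincidence, sketched below in Python. I didn't use it on this puzzle, and it's only one of several approaches (the Kasiski examination of repeated sequences is another). All function names here are hypothetical. The idea is to split the ciphertext into every n-th letter for each candidate length n; at the right length, each column is a plain Caesar shift and its letter distribution looks like natural language again.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Probability that two letters picked at random from text are equal."""
    n = len(text)
    if n < 2:
        return 0.0
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def average_column_ioc(ciphertext: str, key_length: int) -> float:
    """Average the index of coincidence over every key_length-th letter."""
    columns = [ciphertext[i::key_length] for i in range(key_length)]
    return sum(index_of_coincidence(col) for col in columns) / key_length


def rank_key_lengths(ciphertext: str, max_length: int = 30) -> list[int]:
    """Return candidate key lengths, most promising first."""
    scores = {
        n: average_column_ioc(ciphertext, n) for n in range(1, max_length + 1)
    }
    return sorted(scores, key=scores.get, reverse=True)
```

English text scores around 0.066 and uniformly random letters around 0.038, so the candidate lengths whose columns score closest to English are the ones to try first. Multiples of the true length score well too, so the smallest strong candidate is usually the one to start with.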

Chris Eng, Chief Research Officer, is responsible for integrating security expertise into Veracode’s technology. In addition to helping define and prioritize the security feature set of the Veracode service, he consults frequently with customers to discuss and advance their application security initiatives. With over 15 years of experience in application security, Chris brings a wealth of practical expertise to Veracode.
