CWE 117: Improper Output Sanitization for Logs


CWE 117: Improper Output Sanitization for Logs is a logging-specific example of CRLF Injection. It occurs when a user maliciously or accidentally inserts line-ending characters (CR [Carriage Return], LF [Line Feed], or CRLF [a combination of the two]) into data that will be written into a log. Because a line break is a record-separator for log events, unexpected line breaks can cause issues with parsing logs, or can be used by attackers to forge log entries.

NB: you may have seen an LF character expressed as "\n", and a CR character expressed as "\r" in various programming guides. These are escape codes that represent the single LF and CR characters.
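
To make the note above concrete, here is a minimal Java sketch showing that each escape code stands for exactly one character. The class and method names are illustrative only and are not part of the example application.

```java
// Shows that "\r" and "\n" are each a single character (CR = 13, LF = 10).
class LineEndingChars {
    /** Returns true if the text contains a CR or an LF character. */
    static boolean containsLineBreak(String text) {
        return text.indexOf('\r') >= 0 || text.indexOf('\n') >= 0;
    }

    public static void main(String[] args) {
        String crlf = "\r\n";
        System.out.println(crlf.length());                  // 2: one CR plus one LF
        System.out.println((int) crlf.charAt(0));           // 13 (0x0D, CR)
        System.out.println((int) crlf.charAt(1));           // 10 (0x0A, LF)
        System.out.println(containsLineBreak("jdoe"));      // false
        System.out.println(containsLineBreak("jdoe\r\nx")); // true
    }
}
```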

Here is what can happen if an application fails to handle CRLFs correctly while processing data. Consider a login form and the code that logs each authentication attempt:

<form action="/user/login" method="post">
        <input type="text" placeholder="Enter Username" name="username">
        <input type="password" placeholder="Enter Password" name="password">
        <button type="submit">Login</button>
</form>
final static Logger logger = Logger.getLogger(UserDetailsService.class);
...
if(logger.isDebugEnabled()){
    logger.debug("Authenticating " + username);
    ...
}

This login page behaves just like any other: if a user types valid credentials, the site logs them in and directs them to another page. Meanwhile, a logger records the event and its outcome. Attackers anticipate that such logs are kept, and that the logs will contain evidence of a crime. For example, an attacker who knows their breach will probably be investigated may want to misdirect attention toward another suspect by populating the log file with false information. If the login form allows CRLFs inside its input, the attacker can use them to force parts of what they type onto extra lines when the logger records it, and so write fraudulent entries to the log.

The attacker has to know what a valid log entry looks like, as well as a failed one. For this example, we'll assume they've made an educated guess based on prior experience with common loggers. They must then design a "payload": data crafted to cause malicious effect upon entry into some context.

Whenever a failed login occurs, these three lines are written to the log; the username in the first line is the only variable. The user can enter whatever they want as a username (here we use jdoe), so they control the data in this bounded space.

[INFO] com.veracode.example.UserDetailsService - Authenticating jdoe
[INFO] - Interactive login attempt was unsuccessful.
[INFO] - Cancelling cookie

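The attacker enters the following malicious data into the username field (visual line breaks are added here for readability; the attacker's actual line breaks are encoded as %0D%0A, see below):

scapegoat%0D%0A
[INFO]%20com.veracode.example.CustomPersistentRememberMeServices%20-%20Login%20successful.%20Creating%20new%20persistent%20login%20for%20user%20scapegoat.%0D%0A
[INFO]%20-%20Authenticating%20jdoe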

NB: form fields are URL-encoded, even when part of a POST body. That means the %0D%0A is an encoded CR followed by an encoded LF. The %20 is a space, and so on.
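
The decoding step can be sketched with the standard java.net.URLDecoder class. In a real application the servlet container normally performs this decoding before the code sees the parameter; the class and method names below are hypothetical and the input string is a shortened stand-in, not the attacker's payload.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class FormDecodingDemo {
    /** Decodes one URL-encoded form value into the text the application receives. */
    static String decodeFormValue(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform, so this should not happen.
            throw new IllegalStateException("UTF-8 is not supported", e);
        }
    }

    public static void main(String[] args) {
        String decoded = decodeFormValue("first%20line%0D%0Asecond%20line");
        // %20 became a space and %0D%0A became a real CR LF pair,
        // so the value now spans two lines when printed.
        System.out.println(decoded);
    }
}
```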

Because this encoded data contains CRLF line breaks, and the code does nothing to strip them, the decoded data is written directly to the log. Everything from scapegoat on the first line through jdoe on the third line is attacker-supplied:

[INFO] com.veracode.example.UserDetailsService - Authenticating scapegoat
[INFO] com.veracode.example.CustomPersistentRememberMeServices - Login successful. Creating new persistent login for user scapegoat.
[INFO] - Authenticating jdoe
[INFO] - Interactive login attempt was unsuccessful.
[INFO] - Cancelling cookie

  • The payload begins with the value of username. This will be part of the first line that is written to the log after a failed login (which this payload will generate).
  • For the username itself, the attacker picks scapegoat - another user whom they wish to frame.
  • Next, they type a CRLF (a CR followed by an LF, URL-encoded as %0D%0A) to force the rest of the data onto a new line in the log.
  • On the apparent next line, a fake record of a successful login from scapegoat follows, making it look as if scapegoat logged in at this time. Another CRLF adds another line break.
  • After the payload is written to the log, we know that two more lines will follow because a failed login has occurred. If these lines appear without the first line of the triplet, it will be obvious that something is wrong with the logs. So the attacker adds the text of the first line of a login attempt for a different user, jdoe, which compensates for the two lines that will follow the payload.
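
The steps above can be sketched in Java. This is a simulation under stated assumptions, not the application's real code: failedLoginEntries is a hypothetical stand-in for the vulnerable logging path, and the log text is copied from the example entries in this article.

```java
class LogForgeryDemo {
    static final String CRLF = "\r\n";

    /** Assembles the decoded username payload following the steps listed above. */
    static String buildPayload() {
        return "scapegoat" + CRLF
            + "[INFO] com.veracode.example.CustomPersistentRememberMeServices - "
            + "Login successful. Creating new persistent login for user scapegoat." + CRLF
            + "[INFO] - Authenticating jdoe";
    }

    /** Hypothetical stand-in for the vulnerable code: the username is concatenated unchanged. */
    static String failedLoginEntries(String username) {
        return "[INFO] com.veracode.example.UserDetailsService - Authenticating " + username + CRLF
            + "[INFO] - Interactive login attempt was unsuccessful." + CRLF
            + "[INFO] - Cancelling cookie" + CRLF;
    }

    public static void main(String[] args) {
        // An honest username yields three log lines; the payload yields five.
        System.out.print(failedLoginEntries(buildPayload()));
    }
}
```

A single failed login with this payload produces five lines instead of three, and the second line is the forged success record for scapegoat.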

Now it will look as if there were two sequential login attempts: a success by scapegoat and a failure by jdoe. Scapegoat appears to have logged in at the time of the actual breach, so they become a suspect in the subsequent investigation. If it later emerges that someone tampered with the logs, that might clear their name, but it would also call the integrity of every log file into question. How would you know which logs haven't been altered? This effect is made possible by CRLF injection.


Fortunately, the fix is simple: prevent the two characters of a CRLF sequence from being written to the log unencoded. Veracode's recommended approach is to encode any input from users that is written to the log (although there are other fixes). In Java applications, it’s easy to do so by using the logging interface of the ESAPI library from OWASP. ESAPI’s logging interface is a simple abstraction designed for easy integration with existing projects and loggers. Implementations exist for both Log4j and the native Java logging package. As the ESAPI logger writes log data, it encodes CRLF characters to prevent log injection attacks.
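
For comparison, one of the other fixes can be sketched without any library: neutralize CR and LF in the value before it reaches the logger. LogSanitizer and neutralize are hypothetical names, and this is not how ESAPI implements its encoding. It is a minimal illustration that handles only CR and LF.

```java
class LogSanitizer {
    /** Replaces CR and LF with visible escape text so the value stays on one log line. */
    static String neutralize(String value) {
        if (value == null) {
            return "null";
        }
        return value.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        String username = "scapegoat\r\n[INFO] - Authenticating jdoe";
        // The forged entry can no longer start a new line in the log.
        System.out.println("Authenticating " + neutralize(username));
    }
}
```

Replacing the characters with visible text, rather than deleting them, keeps the tampering attempt readable to whoever reviews the log later.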

Here's how to use it:

-   final static Logger logger = Logger.getLogger(UserDetailsService.class);
+   final static Logger logger = ESAPI.getLogger(UserDetailsService.class);
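    if(logger.isDebugEnabled()) {
        // ESAPI log methods take an event type first; EVENT_UNSPECIFIED is a neutral choice here
        logger.debug(Logger.EVENT_UNSPECIFIED, "Authenticating " + username);
    }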

That’s it. Apart from the event type that ESAPI’s log methods take as their first argument, you can continue to use the code you’ve already written by changing the logging class.

This example is based on log file injection, but CRLF injection can also appear in other forms, such as HTTP Response Splitting (CWE 113). This flaw rarely appears in a readily exploitable form, but if a fix is required, the same strategy could be used: encoding CRLF characters before processing them. ESAPI also provides a suite of encoding methods that can handle this.


