CWE 73: External Control of File Name or Path

Flaw

CWE 73: External Control of File Name or Path is a security flaw in which user-supplied input determines the file name or path used in a file system operation, which lets users reach resources in restricted locations on the file system. It is commonly exploited through path traversal. An attacker who performs a path traversal attack successfully could view sensitive files or other confidential information.

For example:

<h1>Path Traversal Lab</h1>
<h4>Welcome to our free library. Please select any book from the shelf. Take care to not overreach...</h4>
<form action="/EBook/Download" method="GET">
    <select name="ebookName">
        <option value="Anna Karenina.epub">Anna Karenina (Leo Tolstoy)</option>
        <option value="Leaves of Grass.epub">Leaves of Grass (Walt Whitman)</option>
        <option value="Pygmalion.epub">Pygmalion (Bernard Shaw)</option>
        <option value="Siddhartha.epub">Siddhartha (Hermann Hesse)</option>
        <option value="The Picture of Dorian Gray.epub">The Picture of Dorian Gray (Oscar Wilde)</option>
    </select>
    <input type="submit" value="Submit" />
</form>

The following C# ASP.NET controller code processes the form above and sends the selected file to the browser:

public class EBookController : Controller
{
    ...
    public ActionResult Download(string ebookName)
    {
        // The trailing separator is required because the file name is appended directly.
        const string PATH = "d:\\app\\web\\ebooks\\";
        ActionResult result = new HttpNotFoundResult("ebook not found");

        string epub = PATH + ebookName;

        if (System.IO.File.Exists(epub))
        {
            // FileStreamResult disposes the stream after writing the response,
            // so the stream must not be closed here.
            var file = System.IO.File.OpenRead(epub);
            result = new FileStreamResult(file, "application/octet-stream");
        }
        return result;
    }
    ...
}

The page displayed in the browser offers a drop-down list for selecting a book. If we select "Anna Karenina (Leo Tolstoy)", the value of ebookName is:

Anna Karenina.epub

This means that epub, the full path of the file returned, is:

d:\app\web\ebooks\Anna Karenina.epub

The correct epub file is returned.

Since the ebookName value is not validated, a malicious user could ask for other files on the system. Presenting the choices as <select> options doesn't help here, because malicious users can easily craft their own GET request with a tool, intercept and modify the request as it leaves the browser, or otherwise bypass the browser's restrictions.
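To illustrate how little the drop-down constrains the request, here is a minimal sketch of building the download URL by hand. The class and method names are hypothetical; only the /EBook/Download route and the ebookName parameter come from the form above.

```csharp
using System;

// Hypothetical helper, for illustration only: builds the same request URL
// the form would produce, but with an arbitrary value for ebookName.
public static class RequestDemo
{
    public static string BuildDownloadUrl(string ebookName)
    {
        // Uri.EscapeDataString percent-encodes characters such as '\' and ' '.
        return "/EBook/Download?ebookName=" + Uri.EscapeDataString(ebookName);
    }
}
```

Calling BuildDownloadUrl with a value that is not one of the five options produces a perfectly valid request, and nothing in the browser prevents anyone from sending it.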

Suppose an attacker guesses that you are running an ASP.NET application on Windows and supplies the following for ebookName:

..\web.config

This would result in epub being:

d:\app\web\ebooks\..\web.config

The .. segment above means "the parent of the current directory", or "go one level up". Instead of staying in the ebooks folder, the application goes one folder up to the web directory and retrieves the web.config file, which contains information that could help the attacker. Automated tools let attackers use this technique to search for many common sensitive files with very little effort. Any file the application can access, the attacker can obtain.
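The resolution step can be observed directly. The following is a small sketch, not part of the application; the class and method names are hypothetical. It mirrors the vulnerable concatenation and returns the path the file system would actually use.

```csharp
using System.IO;

// Hypothetical demo class, for illustration only.
public static class TraversalDemo
{
    // Concatenates the base directory and the untrusted name, as the
    // vulnerable controller does, then canonicalizes the result so the
    // effect of any ".." segment becomes visible.
    public static string Resolve(string baseDir, string untrustedName)
    {
        return Path.GetFullPath(baseDir + untrustedName);
    }
}
```

On Windows, Resolve(@"d:\app\web\ebooks\", @"..\web.config") resolves to d:\app\web\web.config, a file outside the ebooks directory.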

To summarize, if an attacker can specify all or part of the filename, they may be able to gain unauthorized access to files on the server, including those outside the webroot, which would normally be inaccessible to end users. The level of exposure depends on the effectiveness of input validation routines, if any.

Fix

Whichever approach you choose, the fix cannot live in client-side code. Users can trivially circumvent client-side validation, so you can never guarantee that what the server receives is trustworthy. You must address the issue in server-side code.

There are three basic patterns for fixing Path Traversal flaws, each of which validates the input coming from the client. From best solution to worst, they are:

  1. Indirect references
  2. Pattern whitelisting
  3. Pattern blacklisting

Indirect References

In this example, the set of acceptable filenames is already known; the page intends to have only five possible options from a drop-down box. If the corresponding filenames on the server are also known, there is no need for the client to submit them. Instead, they can send a numeric identifier (or UUID, or any similar identifier) to indicate their choice; the server code can then map these to a filename. This way, it does not matter what value the user sends: only approved values can return a file, and all other inputs are rejected.

The following is a simple implementation of the indirect reference solution.

public class EBookController : Controller
{
    ...
    public ActionResult Download(int ebookID)
    {
        const string PATH = "d:\\app\\web\\ebooks\\";
        ActionResult result = new HttpNotFoundResult("ebook not found");

        // Server-side map from client-supplied identifiers to known file names.
        Dictionary<int, string> ebooks = new Dictionary<int, string>
        {
            {1, "Anna Karenina.epub"},
            {2, "Leaves of Grass.epub"},
            {3, "Pygmalion.epub"},
            {4, "Siddhartha.epub"},
            {5, "The Picture of Dorian Gray.epub"}
        };

        // Only identifiers present in the map can ever reach the file system.
        if (ebooks.TryGetValue(ebookID, out string epub))
        {
            string epubFile = PATH + epub;
            if (System.IO.File.Exists(epubFile))
            {
                // FileStreamResult disposes the stream after writing the response,
                // so the stream must not be closed here.
                var file = System.IO.File.OpenRead(epubFile);
                result = new FileStreamResult(file, "application/octet-stream");
            }
        }
        return result;
    }
    ...
}

NB: We hard-coded the values in this example for clarity. In a live application, instead of hard-coding the items in a Dictionary<int,string>, you would probably look up the value from a key-value store, such as a database, properties file, or similar source.
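As a rough sketch of that idea, the lookup can be moved behind a small catalog class that receives its entries from wherever they are stored. EBookCatalog and TryGetPath are hypothetical names, and how the dictionary gets populated (database, properties file, or similar) is left as an assumption.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical catalog: maps opaque identifiers to file names the server controls.
public sealed class EBookCatalog
{
    private readonly string _baseDir;
    private readonly IReadOnlyDictionary<int, string> _fileNamesById;

    // fileNamesById is assumed to be loaded from a trusted key-value store.
    public EBookCatalog(string baseDir, IReadOnlyDictionary<int, string> fileNamesById)
    {
        _baseDir = baseDir;
        _fileNamesById = fileNamesById;
    }

    // Returns true and the full path only for identifiers present in the catalog;
    // client input is never used to build the path.
    public bool TryGetPath(int ebookId, out string fullPath)
    {
        string fileName;
        if (_fileNamesById.TryGetValue(ebookId, out fileName))
        {
            fullPath = Path.Combine(_baseDir, fileName);
            return true;
        }
        fullPath = null;
        return false;
    }
}
```

The controller would then call TryGetPath(ebookID, out path) and return the "not found" result whenever it returns false.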

Pattern Whitelisting

If you are unable to make an indirect reference, you can instead create a pattern or list of known good characters (a whitelist) in a valid filename, and ensure that any submitted data matches that list.

In our example, valid filenames consist only of "word characters" (which include underscores), spaces, dots, and dashes, and they all end in '.epub'. The pattern does not include separator characters such as \ or /, so if an ebookName matches the pattern, it is likely to be safe. Be careful! An overly broad pattern will provide you with no protection.
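A minimal sketch of such a whitelist check follows. The class and method names are hypothetical, and the pattern is one possible reading of the rules above: the hyphen is placed last in the character class so it is unambiguously a literal, and \A and \z are used so that the whole string must match.

```csharp
using System.Text.RegularExpressions;

// Hypothetical validator, for illustration only.
public static class EBookNameValidator
{
    // Word characters, dots, spaces, and hyphens, ending in ".epub".
    // Unlike $, \z does not match before a trailing newline.
    private static readonly Regex AllowedName =
        new Regex(@"\A[\w. -]+\.epub\z", RegexOptions.CultureInvariant);

    public static bool IsAllowed(string ebookName)
    {
        // Regex.IsMatch throws on null input, so reject null explicitly.
        return ebookName != null && AllowedName.IsMatch(ebookName);
    }
}
```

With this pattern, a name such as "Anna Karenina.epub" is accepted, while "..\web.config" is rejected both because it contains a backslash and because it does not end in '.epub'.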

As an additional verification, we also canonicalize the resulting path, which means converting it to its absolute, normalized form. The call to System.IO.Path.GetFullPath() resolves relative segments such as . and .. in the constructed path. We recommend that you then check that the path still meets expectations, which, in our example, means that the file is inside d:\app\web\ebooks\ or a subdirectory below it.

The following is a possible implementation of this solution.

    public ActionResult Download(string ebookName)
    {
        const string PATH = "d:\\app\\web\\ebooks\\";
        ActionResult result = new HttpNotFoundResult("ebook not found");

        // Allow only word characters, dots, spaces, and dashes, ending in ".epub".
        var regex = new System.Text.RegularExpressions.Regex(@"\A[\w. -]+\.epub\z");
        if (ebookName != null && regex.IsMatch(ebookName))
        {
            // Canonicalize, then confirm the file is still inside PATH.
            string epub = System.IO.Path.GetFullPath(PATH + ebookName);
            if (epub.StartsWith(PATH, System.StringComparison.OrdinalIgnoreCase)
                && System.IO.File.Exists(epub))
            {
                // FileStreamResult disposes the stream after writing the response,
                // so the stream must not be closed here.
                var file = System.IO.File.OpenRead(epub);
                result = new FileStreamResult(file, "application/octet-stream");
            }
        }
        return result;
    }
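The canonicalize-and-check step described above can also be factored into a reusable helper. The following is a sketch under stated assumptions: PathGuard and IsInsideDirectory are hypothetical names, and the comparison is case-insensitive on the assumption of a Windows-style file system.

```csharp
using System;
using System.IO;

// Hypothetical helper, for illustration only.
public static class PathGuard
{
    // Returns true when candidatePath, once canonicalized, lies inside baseDir.
    public static bool IsInsideDirectory(string baseDir, string candidatePath)
    {
        string fullBase = Path.GetFullPath(baseDir);
        string separator = Path.DirectorySeparatorChar.ToString();

        // Ensure a trailing separator so "ebooks" does not also match "ebooks-private".
        if (!fullBase.EndsWith(separator, StringComparison.Ordinal))
        {
            fullBase += separator;
        }

        string fullCandidate = Path.GetFullPath(candidatePath);

        // Assumption: case-insensitive file system, as is typical on Windows.
        return fullCandidate.StartsWith(fullBase, StringComparison.OrdinalIgnoreCase);
    }
}
```

The trailing separator matters: without it, a prefix check against d:\app\web\ebooks would also accept a sibling directory whose name merely starts with "ebooks".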

Pattern Blacklisting

If you do not have clear rules about filename patterns, you might have to resort to making a list of characters you know you wish to disallow, called a blacklist.

For example, you could choose to disallow any ebookName that includes .., \, or /, which would make it difficult to obtain files outside of the directory specified by the constant PATH.

As an additional verification (as with pattern whitelisting), we also canonicalize the resulting path, which means converting it to its absolute, normalized form. The call to System.IO.Path.GetFullPath() resolves relative segments such as . and .. in the constructed path. We recommend that you then check that the path still meets expectations, which, in our example, means that the file is inside d:\app\web\ebooks\ or a subdirectory below it.

The following is a possible implementation of this solution.

    public ActionResult Download(string ebookName)
    {
        const string PATH = "d:\\app\\web\\ebooks\\";
        ActionResult result = new HttpNotFoundResult("ebook not found");

        // Reject any name containing "..", a backslash, or a forward slash.
        var regex = new System.Text.RegularExpressions.Regex(@"\.\.|\\|/");
        if (ebookName != null && !regex.IsMatch(ebookName))
        {
            // Canonicalize, then confirm the file is still inside PATH.
            string epub = System.IO.Path.GetFullPath(PATH + ebookName);
            if (epub.StartsWith(PATH, System.StringComparison.OrdinalIgnoreCase)
                && System.IO.File.Exists(epub))
            {
                // FileStreamResult disposes the stream after writing the response,
                // so the stream must not be closed here.
                var file = System.IO.File.OpenRead(epub);
                result = new FileStreamResult(file, "application/octet-stream");
            }
        }
        return result;
    }
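To see why blacklisting ranks last, it helps to isolate the check and try a few inputs. This is a sketch only; the class and method names are hypothetical, and notes.txt stands in for any non-ebook file that might sit in the ebooks directory.

```csharp
using System.Text.RegularExpressions;

// Hypothetical demo class, for illustration only.
public static class EBookNameBlacklist
{
    // Matches names containing "..", a backslash, or a forward slash.
    private static readonly Regex Forbidden = new Regex(@"\.\.|\\|/");

    public static bool IsRejected(string ebookName)
    {
        // Treat null as rejected; Regex.IsMatch would throw on it.
        return ebookName == null || Forbidden.IsMatch(ebookName);
    }
}
```

The blacklist rejects "..\web.config" but lets "notes.txt" through, because it only knows what to refuse. The whitelist pattern from the previous section would reject that name, since it does not end in '.epub'.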
