While researching Linux rootkits, I read about several hooking techniques that rootkits use to tamper with the behavior of a syscall, such as VFS hooking and syscall table hijacking. One technique that caught my attention was the ftrace hook. It works quite differently from the others, as I explain later in this article, and it seems to work fine on Linux kernel ≤ 5.8. Interestingly, there is an existing tool in the bcc toolset (and probably among the BPF CO-RE tools as well) that we can modify to detect the ftrace hook.
The ftrace utility is the Linux kernel’s own tracing infrastructure that provides static as well as dynamic tracing, depending on how it is used.
First, let’s understand the stacksnoop tool.
The stacksnoop tool
The tool attaches a kprobe to the kernel function whose name is passed as an argument. The trace_stack function shown below is then called whenever the probed function is called in any process.
Let’s explore the BPF program inside stacksnoop, which includes the trace_stack() function.
The data structure: The eBPF program declares a structure of fields to be shared with user space. The fields are the stack id, the pid of the process that calls the syscall (the function given as argument), and comm, the process name, whose length is 16 bytes in our case.
The maps: The program uses two bcc macros. BPF_STACK_TRACE creates a stack trace map named stack_traces with 128 entries, and BPF_PERF_OUTPUT creates a BPF table for pushing custom event data to user space via a perf ring buffer.
The trace_stack function: Inside trace_stack, bpf_get_current_pid_tgid() returns the id of the current process (the one making the syscall), which is stored in the pid field of the data structure. The stack id is then obtained with stack_traces.get_stackid(), a helper that walks the stack found via the struct pt_regs in ctx, saves it in the stack trace map (stack_traces), and returns a unique ID for that stack trace.
The output: The data structure is populated and pushed to the events table.
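To see how those pieces fit together, here is a sketch of the program as it could be embedded in a bcc Python script. It is reconstructed from the description in this article, so the exact text may differ from the stacksnoop shipped with your bcc version. The wrapper name attach_stack_probe is hypothetical, and actually calling it requires bcc and root privileges.

```python
# BPF C program carried as a string, the way bcc tools usually embed it.
# Reconstructed from the article's description; not copied from stacksnoop.
bpf_text = r"""
#include <uapi/linux/ptrace.h>
#include <linux/sched.h>

// Record shared with user space: stack id, pid and process name.
struct data_t {
    u64 stack_id;
    u32 pid;
    char comm[TASK_COMM_LEN];   // TASK_COMM_LEN is 16
};

BPF_STACK_TRACE(stack_traces, 128);   // stack trace map with 128 entries
BPF_PERF_OUTPUT(events);              // perf ring buffer towards user space

void trace_stack(struct pt_regs *ctx) {
    struct data_t data = {};
    // Upper 32 bits hold the process id as seen from user space.
    data.pid = bpf_get_current_pid_tgid() >> 32;
    // Walk the kernel stack, store it in the map, get its id back.
    data.stack_id = stack_traces.get_stackid(ctx, 0);
    bpf_get_current_comm(&data.comm, sizeof(data.comm));
    events.perf_submit(ctx, &data, sizeof(data));
}
"""


def attach_stack_probe(function_name):
    """Compile bpf_text and attach trace_stack as a kprobe (hypothetical wrapper)."""
    from bcc import BPF  # imported lazily: needs bcc installed and root
    b = BPF(text=bpf_text)
    b.attach_kprobe(event=function_name, fn_name="trace_stack")
    return b
```

With bcc available, attach_stack_probe("__x64_sys_mkdir") would set up the same probe that the article uses later.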
Well! That’s mainly how the eBPF program works in kernel mode.
Filtering (for example by pid) is best done inside the eBPF program, so that only relevant events cross into user space. The user-space side of stacksnoop then just prints the data it receives through the events table, as shown in the snippet below.
The ksym() helper function translates a kernel memory address into a kernel function name.
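The printing step can be sketched as a small formatting function. This is an illustration rather than stacksnoop's own code: format_event and fake_ksym are hypothetical names, and the addresses are made up. In a real bcc script the resolver would be b.ksym and the addresses would come from walking the stack_traces map with the event's stack id.

```python
def format_event(pid, comm, addresses, resolve):
    """Return printable lines for one event: a header, then one line per frame.

    resolve is any callable mapping a kernel address to a symbol name;
    with bcc this role is played by b.ksym.
    """
    lines = ["%d %s" % (pid, comm)]
    for addr in addresses:
        lines.append("    " + resolve(addr))
    return lines


# Made-up addresses and symbol table, standing in for the live kernel.
fake_symbols = {
    0x1000: "__x64_sys_mkdir",
    0x2000: "do_syscall_64",
    0x3000: "entry_SYSCALL_64_after_hwframe",
}


def fake_ksym(addr):
    return fake_symbols.get(addr, "[unknown]")


lines = format_event(9240, "mkdir", [0x1000, 0x2000, 0x3000], fake_ksym)
print("\n".join(lines))
```

Passing the resolver in as a parameter keeps the formatting logic testable without a running kernel probe.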
Stack trace and stack frame
Let’s briefly discuss stack traces and stack frames. A stack trace is the collection of active stack frames at a certain point in time during the execution of a program.
A stack frame is the part of a stack trace that corresponds to a call to a subroutine that has not yet returned.
Arguments that are not passed in registers are pushed onto the stack first, then the call pushes the function’s return address, and finally the called function reserves space for its local variables. Together, they make the “frame”.
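To make the idea of active frames concrete, here is a small user-space analogy in Python (not kernel code): every function that has been called but has not yet returned contributes one frame to the trace. The function names outer and inner are made up for the illustration.

```python
import traceback


def inner():
    # Both outer() and inner() are still active here, so both appear as frames.
    return [frame.name for frame in traceback.extract_stack()]


def outer():
    return inner()


names = outer()
# The innermost frame is listed last.
print(names[-2:])
```

A kernel stack trace for a syscall reads the same way, only with kernel function names as frames.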
What’s in the syscall’s stack trace
- Generally, when an application program calls an API, for example execve(), the wrapper function in the C library (glibc) gets called.
- The wrapper function copies the arguments of the API, along with the syscall number, into registers, because the kernel expects them there.
- The wrapper function then executes a trap instruction (int 0x80 or sysenter on 32-bit x86, syscall on x86-64), which causes the processor to switch from user mode to kernel mode.
- In response to the trap, the kernel invokes its syscall entry routine (system_call() in entry.S on older 32-bit kernels, entry_SYSCALL_64 in entry_64.S on x86-64).
- The entry routine saves the register values onto the kernel stack and then looks up the sys_call_table array to invoke the appropriate service routine.
- For execve(), the __x64_sys_execve() service routine gets invoked.
The whole picture of the above points is shown below:
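The steps above can also be written down as a toy model in Python. This is only a simulation of the dispatch logic, not kernel code: the regs dictionary stands in for the saved registers, and libc_mkdir and the return strings are hypothetical. The register roles (syscall number in rax, first two arguments in rdi and rsi) and the number 83 for mkdir are those of x86-64.

```python
SYS_MKDIR = 83   # syscall number of mkdir on x86-64
ENOSYS = 38      # "function not implemented"


def sys_mkdir(regs):
    """Stand-in for the __x64_sys_mkdir service routine."""
    return "mkdir(%s, %o)" % (regs["di"], regs["si"])


# Table mapping syscall numbers to service routines.
sys_call_table = {SYS_MKDIR: sys_mkdir}


def do_syscall_64(regs):
    """Look up the service routine, call it, store the result in ax."""
    handler = sys_call_table.get(regs["ax"])
    if handler is None:
        regs["ax"] = -ENOSYS
    else:
        regs["ax"] = handler(regs)
    return regs["ax"]


def libc_mkdir(path, mode):
    """Stand-in for the glibc wrapper: load registers, then trap."""
    regs = {"ax": SYS_MKDIR, "di": path, "si": mode}
    return do_syscall_64(regs)


result = libc_mkdir("lmg", 0o755)
print(result)
```

In this picture, a syscall table hijack would swap an entry of sys_call_table, while the ftrace hook discussed later leaves the table untouched.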
Now that we understand how a system call goes through multiple stages, let’s monitor this activity via the stacksnoop tool.
We run the tool via command: python2 stacksnoop.py -v __x64_sys_mkdir
In the image above, the trace follows a clear pattern. For every process (pid) that makes the syscall, the entry_SYSCALL_64_after_hwframe frame is present in the stack trace. If we look at the source of entry_64.S (the entry code that runs after the trap), we can see that the code at entry_SYSCALL_64_after_hwframe pushes rax and the other registers onto the stack and then calls do_syscall_64().
Later, do_syscall_64() calls do_syscall_x64(), which looks up the actual service routine (sys_<syscall name>) in sys_call_table, calls it, and stores its return value in the ax register slot (regs->ax).
The ftrace hook technique
The ftrace hook method is a bit different from other rootkit hooking techniques, as it does not hijack the syscall table. Instead, we attach (register) a kprobe to kallsyms_lookup_name to retrieve kallsyms_lookup_name’s own address, and later use that function to resolve the address of the target syscall by name.
After the target syscall’s symbol is resolved (in our case the service routine for mkdir()), its address is saved in the address field of the ftrace_hook struct. The other important field in the ftrace_hook struct is the ops struct. It contains a .func field, which holds the callback that ftrace invokes whenever our target function (__x64_sys_mkdir in this case) gets called. Hence we assign fh_ftrace_thunk (our callback) to .func, as shown:
Inside fh_ftrace_thunk, the instruction pointer (RIP) is changed to point at the hook function that we want executed whenever the __x64_sys_mkdir() syscall is made in the system.
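The redirection can be illustrated with a toy model in Python. It is a simulation, not the rootkit's C code: call_traced, ftrace_ops and hook_mkdir are hypothetical names, and the check on the caller is an assumption modelled on the recursion guard that ftrace hooking code typically needs so that the hook can still reach the original function.

```python
# function -> list of callbacks that run at function entry
ftrace_ops = {}


def call_traced(func, path, caller):
    """Call func the way a traced kernel function is entered."""
    regs = {"ip": func, "parent": caller}
    for callback in ftrace_ops.get(func, []):
        callback(regs)          # a callback may rewrite regs["ip"]
    return regs["ip"](path)     # continue at whatever ip now points to


def sys_mkdir(path):
    """Stand-in for the real __x64_sys_mkdir."""
    return "mkdir " + path


def hook_mkdir(path):
    """Stand-in for the rootkit's hook: run a payload, then call the original."""
    return "hooked:" + call_traced(sys_mkdir, path, caller="rootkit")


def fh_ftrace_thunk(regs):
    # Assumed recursion guard: do not redirect calls coming from the hook itself.
    if regs["parent"] != "rootkit":
        regs["ip"] = hook_mkdir


ftrace_ops[sys_mkdir] = [fh_ftrace_thunk]

result = call_traced(sys_mkdir, "lmg", caller="do_syscall_64")
print(result)
```

Note that sys_mkdir is entered twice in this model, once from the syscall path and once from the hook, which is the kind of behavior a probe on the function entry can observe.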
As an example, our hook function saves a list of the running processes in the system to the /tmp folder.
We load our rootkit and call mkdir. As a result, the process list gets saved in /tmp as shown below.
NOTE: This is not limited to __x64_sys_mkdir(). One can pass any function name, such as tcp4_seq_show, to kallsyms_lookup_name to resolve that function’s address and later hook it, for example to hide network ports.
Adding our detection code to stacksnoop to detect ftrace hook
Now let’s look at how we can detect this hook activity using the stacksnoop.
First we make two lists for storing the stack_id and pid of each syscall event (mkdir in our case).
When data for our monitored syscall arrives in the events table, the pid and stack id are appended to these lists, as shown below.
Now we add a condition (image below) to the stacksnoop program: if the same pid shows up again with a stack trace id that differs from the one recorded earlier for it, then something suspicious must be going on.
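A minimal sketch of that check is given below. It follows the two-list approach described above, but the class name HookDetector and the method name observe are hypothetical and are not part of stacksnoop. In the real tool, observe would be called from the perf buffer callback with the pid and stack id of each event.

```python
class HookDetector:
    """Flags a pid that produces more than one distinct stack id."""

    def __init__(self):
        self.pids = []        # pid of every event seen so far
        self.stack_ids = []   # stack id of the event at the same index

    def observe(self, pid, stack_id):
        """Record one event; return True if it looks suspicious."""
        suspicious = any(
            p == pid and s != stack_id
            for p, s in zip(self.pids, self.stack_ids)
        )
        self.pids.append(pid)
        self.stack_ids.append(stack_id)
        return suspicious


detector = HookDetector()
first = detector.observe(9240, 5)    # normal syscall path
second = detector.observe(9240, 7)   # same pid, different stack: flagged
print(first, second)
```

Because pids are reused over time, a long-running monitor built this way would probably want to expire old entries; that refinement is left out here.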
We run the stacksnoop monitor in one terminal, and in another terminal we load our rootkit.
Command for monitor: sudo python2 stacksnoop.py -v __x64_sys_mkdir
We can see above that after the rootkit is loaded and ‘mkdir lmg’ is called, a second trace appears for ‘mkdir lmg’ right after the normal stack trace. Its frames differ from the first one even though the pid is the same, i.e. 9240 (in red).
Two different stack traces for the same syscall from the same pid, only a tiny fraction of a second apart, are a clear sign of suspicious activity.
Why entry_SYSCALL_64_after_hwframe is absent in the callback stack frame
As we know, the ftrace utility is used for dynamic tracing in Linux, that is, it lets us trace function calls. When we start tracing a kernel function with ftrace, the function’s code is modified: the kernel patches in instructions that notify the tracing system.
In other words, at build time a few bytes are reserved at the start of every traceable function, and they hold NOP instructions while tracing is off. When ftrace is attached to the target function, those NOPs are replaced with a call into the ftrace code (the __fentry__() call site). From then on, whenever the target function (say mkdir()) gets called, our registered callback runs as well, at function entry. The callback changes the value of the saved instruction pointer (RIP), which passes control to a new address (our hook).
In our case, our hook gets invoked when the mkdir() function gets called. But the call to our hook was actually not made by the kernel’s do_syscall_64() but by our callback (fh__x64_sys_mkdir()). Therefore, for our hook there is no entry for entry_SYSCALL_64_after_hwframe (including the usual argument setup) in the stack trace, as the call was not made via do_syscall_x64() but via fh__x64_sys_mkdir. Moreover, the stack_id of the hooked stack trace differs from that of the actual syscall (we used the id as the condition in our detection code).
Can we detect a rootkit that does hooking via syscall table hijacking?
Yes, stack tracing also makes it possible to detect a rootkit that hooks via syscall table hijacking. By monitoring and comparing the frames of a given syscall event, we can detect anomalies in the stack frames, similar to what we did for ftrace. More details are explained in the article “Detecting Kernel Hooking using eBPF”.
We monitored hook activity for a single syscall only. Monitoring all syscalls at once would be impractical in production, so a better approach is to put a selected set of syscalls under monitoring.
NOTE: Currently the Hookdetect tool monitors the number of entry frames in a stack trace (possibly flagging count > 3). With some modifications it can also be used to detect ftrace hooks. As an example, the figure below shows ftrace hook detection for __x64_sys_getdents() via the Hookdetect tool.
References
Debugging the kernel using Ftrace - part 1
GitHub - iovisor/bcc: BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more
Hooking Linux Kernel Functions, Part 2: How to Hook Functions with Ftrace
Call stack - Wikipedia
Detecting Kernel Hooking using eBPF
linux/entry_64.S at master · torvalds/linux