Friday, July 19, 2019

Dynamic Mac Malware Analysis using DTrace


Dynamic malware analysis on Mac is, well, very hard to do, because there are not enough tools or sandboxes on the market. For a small-time malware researcher like me there is little hope, but there is hope nonetheless. The hope’s name is DTrace.

DTrace is short for dynamic tracing. According to dtrace.org:
“The name is short for Dynamic Tracing: an instrumentation technique pioneered by DTrace which dynamically patches live running instructions with instrumentation code”

This post looks at some aspects of DTrace that can be used for dynamic analysis of malware on Mac OS X. So let’s dive in.

What is DTrace?

DTrace is a dynamic tracing framework developed by Sun Microsystems, originally for debugging and troubleshooting. DTrace lets us dynamically instrument the system and user processes to record data at certain locations, which are referred to as probes. A DTrace request binds a set of actions to a probe. When the probe is activated, or fires, DTrace performs those actions, gathers the data, and returns it.

Probes

DTrace probes come from a set of kernel modules called providers. Each provider offers multiple probes, and the information a probe provides depends on the provider and the probe. To see all the probes and providers, use the dtrace -l command, which lists every probe with its ID, provider, module name, function, and probe name.
$ sudo dtrace -l
   ID   PROVIDER   MODULE        FUNCTION       NAME
    1   dtrace                                  BEGIN
    2   dtrace                                  END
    3   dtrace                                  ERROR
    4   lockstat   mach_kernel   lck_mtx_lock   adaptive-acquire
    5   lockstat   mach_kernel   lck_mtx_lock   adaptive-spin
    6   lockstat   mach_kernel   lck_mtx_lock   adaptive-block
...

ID: ID number of the probe
PROVIDER: DTrace kernel module that performs the instrumentation to enable the probe
MODULE: Name of the kernel or user module where the probe is located
FUNCTION: Name of the function in which the probe is located
NAME: Name of the probe

Based on the above, the full probe name has the form
provider:module:function:name
So the probe with ID 4 is:
lockstat:mach_kernel:lck_mtx_lock:adaptive-acquire
And the probe with ID 1 is:
dtrace:::BEGIN

All fields in a probe name are optional. An empty field acts as a wildcard, so if nothing is specified, DTrace matches all probes. If only the name is specified, DTrace matches that probe name in every module of every provider.
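To make the field rule concrete, here is a small Python sketch. It is not part of DTrace, and parse_probe and ProbeDescription are names made up for this illustration. It splits a probe description into its four fields, assuming DTrace's documented rule that when fewer than four fields are given they are read from right to left, so a lone BEGIN is taken as the probe name.

```python
from typing import NamedTuple


class ProbeDescription(NamedTuple):
    """The four fields of a DTrace probe description (hypothetical helper type)."""
    provider: str
    module: str
    function: str
    name: str


def parse_probe(description: str) -> ProbeDescription:
    """Split 'provider:module:function:name' into fields.

    Missing leading fields are filled with "", which stands for
    "match anything" in a probe description.
    """
    parts = description.split(":")
    if len(parts) > 4:
        raise ValueError("a probe description has at most four fields")
    # Fewer than four fields: interpret them from right to left.
    padded = [""] * (4 - len(parts)) + parts
    return ProbeDescription(*padded)


if __name__ == "__main__":
    for text in ("lockstat:mach_kernel:lck_mtx_lock:adaptive-acquire",
                 "dtrace:::BEGIN",
                 "syscall::read:entry",
                 "BEGIN"):
        print(text, "->", parse_probe(text))
```

An empty string in the result marks a field that was left unspecified.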

DTrace requires superuser privileges, so don’t forget to use sudo. The number of probes differs depending on the OS and version you are using; to check it, use the dtrace -l | wc -l command.
$ sudo dtrace -l | wc -l
  141201

I will not delve further into DTrace itself, because that is out of scope here; this post gives just a few pointers on how it can be used. More information about DTrace can be found online.

DTrace Scripting

Now that we have some insight into probes and providers, let’s see how to use them in a DTrace script to automate things.

Writing a DTrace script is like writing a C program. The syntax is almost the same, but the structure is different.
A sample DTrace script, or D script, looks like this:
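syscall::read:entry
/execname == "Terminal"/
{
            /* At entry, arg0 is the file descriptor and arg2 the requested byte count; the buffer in arg1 is not filled yet. */
            printf("Terminal with pid: %d, reads up to %d bytes from fd %d\n", pid, arg2, arg0);
}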

This is a standard script clause. It is divided into 3 parts:
·      Probe Description: This is the full probe name that tells dtrace which probe to enable and trace. In the above example syscall::read:entry is the probe description. It says to trace only the read syscall at entry, meaning the probe fires when the read syscall is called, before it has executed.
·      Predicate: Predicates control the tracing of data using logical expressions. Here /execname == "Terminal"/ is the predicate. A predicate is always written between two forward slashes and is a conditional statement. In this case it dictates that the actions run only when execname is "Terminal". execname is a built-in variable of the D language that holds the name of the current executable.
·      Actions: Actions, as the name suggests, do stuff. When a probe matches the description and the predicate is true, the actions are performed. So if the Terminal executable calls the read syscall, the D script does whatever is listed in the actions.
We can save the above clause in a file called read.d and run it using dtrace:
sudo dtrace -s read.d

See for yourself what output you get.
We can include any number of clauses in a D script to trace various activities in the OS, including whatever a piece of malware is doing. In the next section we’ll see how to use a D script to do some dynamic malware analysis on Mac.

Dynamic Analysis of Malware using DTrace

What I’ve observed in malware analysis on Mac and Linux is that no matter what language the malware was written in, it always has to use syscalls to do anything meaningful on the system. Otherwise there is nothing it can do.

So, for malware analysis, the probes we want to focus on belong to the syscall provider. You can list all the syscall probes using sudo dtrace -l | grep syscall. Out of all these, only some are really meaningful for malware analysis. A few of those are:
·      execve
·      fork
·      open
·      posix_spawn
·      write
·      link
·      ptrace
·      sendmsg
·      accept
·      kill
·      getlogin
·      reboot
·      rename
·      vfork
·      mkdir
·      rmdir
And many more, depending on how deep you are diving.
So how do we create the script? By following the same pattern as the example in the previous section. Let me show:
syscall::execve:entry
/pid == $target/
{
    /* The return value is not known at entry, so log a placeholder. */
    this->retval = 0;
    /* arg0 is the path. arg1 and arg2 point to the argv and envp arrays, so copyinstr() on them prints raw pointer bytes, not strings. */
    /* walltimestamp is in nanoseconds; dividing by 1000 gives microseconds. */
    printf("{\"execname\":\"%s\", \"api\":\"%s\", \"args\":[\"%S\", %S, %S], \"retval\":%d, \"walltimestamp\":%d, \"pid\":%d, \"ppid\":%d, \"tid\":%d, \"errno\":%d}\n",
        execname, probefunc,
        arg0 != (int64_t)NULL ? copyinstr(arg0) : "", arg1 != (int64_t)NULL ? copyinstr(arg1) : "", arg2 != (int64_t)NULL ? copyinstr(arg2) : "",
        (int)this->retval,
        (int64_t)walltimestamp/1000, pid, ppid, tid, errno);
}

syscall::fork:entry
/pid == $target/
{
    /* fork takes no arguments, so the args field is a fixed placeholder. */
    this->retval = 0;
    printf("{\"execname\":\"%s\", \"api\":\"%s\", \"args\":[None], \"retval\":%d, \"walltimestamp\":%d, \"pid\":%d, \"ppid\":%d, \"tid\":%d, \"errno\":%d}\n",
        execname, probefunc,
        (int)this->retval,
        (int64_t)walltimestamp/1000, pid, ppid, tid, errno);
}

The above two clauses trace the execve and fork syscalls: whenever either is called, we log it. The predicate requires the pid to be that of the target process, where the target is the process I launch through this script. So, assume the above two clauses are in a D script file called syscall.d and run the following command in the terminal:
$ sudo dtrace -s syscall.d -c ./malware >> d_output.txt

This command runs dtrace and executes the script syscall.d. The -c switch tells dtrace to run the specified command and exit upon its completion. The pid of the process started by that command is passed to the D script in the $target variable.

The action in the above clauses is nothing but output formatting, plus some handling of null arguments using conditional expressions. These are well documented online; one good resource, http://dtrace.org/guide/bookinfo.html, provides very good information on dtrace.

Now what we have to do is create clauses for all the syscalls we want to monitor via dtrace. Once all of those are created, we have a basic dynamic malware analysis script for Mac ready.
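Writing one clause per syscall by hand gets repetitive, so one option is to generate them. The following Python sketch is an assumption of mine rather than the script used in this post: build_script and CLAUSE_TEMPLATE are hypothetical names, and the generated clauses log only the fields that are common to every syscall (no arguments, since argument types differ per call).

```python
# Template for one D clause. It is a raw string so the backslashes reach the
# D script unchanged; SYSCALL is a placeholder replaced for each syscall name.
CLAUSE_TEMPLATE = r'''syscall::SYSCALL:entry
/pid == $target/
{
    printf("{\"execname\":\"%s\", \"api\":\"%s\", \"walltimestamp\":%d, \"pid\":%d, \"ppid\":%d, \"tid\":%d}\n",
        execname, probefunc, (int64_t)walltimestamp / 1000, pid, ppid, tid);
}
'''


def build_script(syscalls):
    """Return D source text with one entry clause per syscall name."""
    return "\n".join(CLAUSE_TEMPLATE.replace("SYSCALL", name) for name in syscalls)


if __name__ == "__main__":
    # A few of the syscalls listed earlier; extend the list as needed.
    print(build_script(["execve", "fork", "open", "posix_spawn", "write"]))
```

The printed text can be redirected to a file such as syscall.d and passed to dtrace -s as before. Clauses that need to print arguments, like the execve one above, still have to be written by hand.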


Multithreading

You may have noticed that there is not much to writing dtrace scripts for analysis. But there is a reason I included the execve and fork syscall probes in the above clauses. Calls like fork (and some others, such as vfork and posix_spawn) create another process, and execve is usually what loads a new program into it. Since our predicate only compares pid with $target, how will we trace the newly spawned processes?

Again the answer is dtrace. Let’s look at another clause, which goes in the same script file as the main D script.
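proc:::start
/ppid == $target/
{
    /* Stop the new child so it cannot run before it is being traced. */
    stop();
    /* Start a second dtrace on the child; its pid is passed as the first argument ($1). */
    system("sudo dtrace -w -s child.d %d\n", pid);
}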

In this clause the probe is proc:::start, which fires at the start of every process. The predicate /ppid == $target/ is very important here; ppid is a dtrace built-in variable that gives the parent pid. So whenever a process starts and its parent is our target process, we stop the new process, then use the system action to run the same or another dtrace script against its pid. Remember that in the child script the pid arrives as the argument $1, not as $target. The stopped process stays stopped until something resumes it, so the child script has to take care of that once it is attached.

Also, stop is a destructive action in dtrace (and so is system), so we have to use the -w switch when executing the dtrace script. Now you will be able to monitor the child processes of the malware as well.

This approach is nowhere near perfect and can be improved a lot. But at least it gives an idea of what a program is doing with the system, without debugging it.

So, I ran the script on a custom sample that I created. The sample does the following:
-       Opens a dialog box with OK, Cancel and Test buttons
-       Creates a dummy file named “dropped_file”
-       Does multiple forks and execves
-       Spawns Calculator

The D script gives me the following output:
This is parent process, pid is 3959 and Child 1 pid:3960
exiting....Process Started:3960
Process Started:3963
Process Started:3968
{"execname":"sh", "api":"open", "args":["/dev/dtracehelper\0"], "retval":0, "walltimestamp":1563538678751070, "pid":3968, "ppid":3963, "tid":120791, "errno":0}
{"execname":"sh", "api":"open", "args":["/dev/tty\0"], "retval":0, "walltimestamp":1563538678753491, "pid":3968, "ppid":3963, "tid":120791, "errno":0}
{"execname":"sh", "api":"execve", "args":["/Applications/Calculator.app/Contents/MacOS/Calculator\0", `s\300\262\344\177\0, \220H\300\262\344\177\0], "retval":0, "walltimestamp":1563538678754169, "pid":3968, "ppid":3963, "tid":120791, "errno":0}
{"execname":"Calculator", "api":"open", "args":["/dev/dtracehelper\0"], "retval":0, "walltimestamp":1563538678757921, "pid":3968, "ppid":3963, "tid":120791, "errno":0}
{"execname":"Calculator", "api":"open", "args":["/dev/dtracehelper\0"], "retval":0, "walltimestamp":1563538678847207, "pid":3968, "ppid":3963, "tid":120791, "errno":0}
{"execname":"Calculator", "api":"open", "args":["/var/root/Library/Saved Application State/com.apple.calculator.savedState/restorecount.plist\0"], "retval":0, "walltimestamp":1563538678870489, "pid":3968, "ppid":3963, "tid":120804, "errno":2}
Hello, World!

{"execname":"malware", "api":"posix_spawn", "args":["/bin/sh\0"], "retval":0, "walltimestamp":1563538677334949, "pid":3963, "ppid":3960, "tid":120771, "errno":0}

{"execname":"malware", "api":"fork", "args":[None], "retval":0, "walltimestamp":1563538675935390, "pid":3960, "ppid":1, "tid":120762, "errno":0}
{"execname":"malware", "api":"execve", "args":["dropped_file\0", }\016i\004\001\0, j\035W[\377\177\0], "retval":0, "walltimestamp":1563538675936119, "pid":3960, "ppid":1, "tid":120762, "errno":0}
{"execname":"dropped_file", "api":"open", "args":["/dev/dtracehelper\0"], "retval":0, "walltimestamp":1563538675937490, "pid":3960, "ppid":1, "tid":120762, "errno":0}

{"execname":"malware", "api":"open", "args":["dropped_file\0"], "retval":0, "walltimestamp":1563538675529687, "pid":3959, "ppid":3958, "tid":120760, "errno":0}
{"execname":"malware", "api":"write", "args":["\317\372\355\376\a\0"], "retval":0, "walltimestamp":1563538675529861, "pid":3959, "ppid":3958, "tid":120760, "errno":0}
{"execname":"malware", "api":"fork", "args":[None], "retval":0, "walltimestamp":1563538675530000, "pid":3959, "ppid":3958, "tid":120760, "errno":0}
This is child process, pid is 3960 and ppip is 1
Child is creating process using execve


As you can see, the syscall activities are visible, even though they are not in order (that is why I have a timestamp in the output). We can also follow the pid and ppid to see which process was spawned by which. I didn’t get time to beautify the output further, so you can try that.
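As a starting point for that cleanup, here is a minimal Python sketch that puts the events back in time order. It is not part of the original tooling, and Event, parse_line and timeline are hypothetical names. It pulls fields out with regular expressions instead of a JSON parser, because the lines above are only JSON-like (for example the \0 escapes and the unquoted execve arguments), and it skips lines that are plain program output.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class Event(NamedTuple):
    timestamp_us: int  # the "walltimestamp" field, in microseconds
    execname: str
    api: str
    pid: int
    ppid: int


def _text(line: str, key: str) -> Optional[str]:
    """Return the quoted string value stored under key, if present."""
    match = re.search(r'"%s":"([^"]*)"' % re.escape(key), line)
    return match.group(1) if match else None


def _number(line: str, key: str) -> Optional[int]:
    """Return the integer value stored under key, if present."""
    match = re.search(r'"%s":(\d+)' % re.escape(key), line)
    return int(match.group(1)) if match else None


def parse_line(line: str) -> Optional[Event]:
    """Parse one trace line; return None for anything that is not an event."""
    values = (_number(line, "walltimestamp"), _text(line, "execname"),
              _text(line, "api"), _number(line, "pid"), _number(line, "ppid"))
    if any(value is None for value in values):
        return None
    return Event(*values)


def timeline(lines: Iterable[str]) -> List[Event]:
    """Return all events found in lines, sorted by timestamp."""
    events = [event for event in map(parse_line, lines) if event is not None]
    return sorted(events, key=lambda event: event.timestamp_us)


if __name__ == "__main__":
    sample = [
        r'{"execname":"malware", "api":"posix_spawn", "args":["/bin/sh\0"], "retval":0, "walltimestamp":1563538677334949, "pid":3963, "ppid":3960, "tid":120771, "errno":0}',
        "Hello, World!",
        r'{"execname":"malware", "api":"fork", "args":[None], "retval":0, "walltimestamp":1563538675530000, "pid":3959, "ppid":3958, "tid":120760, "errno":0}',
    ]
    for event in timeline(sample):
        print(event.timestamp_us, event.ppid, "->", event.pid, event.execname, event.api)
```

To run it on real output, read the lines from the file the trace was redirected to (d_output.txt in the command used earlier) and pass them to timeline.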

If we use this on malware, it can give us an idea of the execution pattern of the malware without going through the long and tedious process of static analysis and debugging.

Once again, please refer to http://dtrace.org/guide/bookinfo.html for more info on DTrace. We can create highly complex scripts that can aid in dynamic malware analysis on Mac.