Malware Mutants Ninja osquery (and PolyLogyx)

Now that I have lost most of the hair from my head, and whatever is left is turning grey, I think I can start my blogs in a patronizing way saying ‘Back when I was young …’.

So, back when I was young and had just started my career, it was all about programming in COBOL.

Yup, kind of a boring language, with simple English-like statements to code in, e.g. “Perform this or that” or “add some number into some other number”. I mean, it hardly felt like programming, till I moved to the badass world of C and system programming.

Now that’s a complex world, right? With mallocs and frees and IPCs and syscalls and multi-threading and synchronization objects like semaphores and spin locks and mutexes.

A-ha, mutexes, that for some strange reason are called ‘Mutants’ in Windows.

(And no, it’s got nothing to do with X-Men or turtles).

A mutex (short for ‘mutual exclusion’) is an object, created in the memory of a running program, that is used to serialize access to shared resources. Now malware is also a software program, and since the great minds of the “security world” are constantly looking for means to identify malware, it turned out that peeping into a program’s memory and identifying the ‘mutants’ it creates can be an interesting way of identifying malware. Why would malware use a mutex? For pretty much the same reasons as a legitimate program, e.g. synchronization, or to detect a ‘re-infection’ (checking whether it is already running on the machine), or at times as an evasion technique.

For example, let’s look at the malware with the hash 1EFEB85C8EC2C07DC0517CCCA7E8D743. If you look it up on VirusTotal.com, you will see that it is a pretty nasty piece of malware that creates and opens many mutexes, e.g. “Had3yghhuju98gggd9G6790hfv3.4”.

Surely there is nothing new in what I have said so far. The shelf is full of tools and technologies that can assist with this purpose. The question, however, is how operationally efficient these tools and technologies are for a security analyst working out of a SOC. The analyst can push a tool like handle.exe or Process Hacker to all the endpoints, run it remotely, grab the output, bring it all back, and then search through it like looking for a needle in a haystack, while paying for all that storage. Well, it will surely work. A more sophisticated way of solving this could be to use a memory forensics framework like Volatility or GRR, or any of the commercial ones, that can grab the memory images of the running processes from the endpoints, bring it all back, parse through the data and search. Except that the volume of data is even larger, so the cost is higher and the efficiency lower.

Wouldn’t it be much easier if all you had to do was ask the endpoint a question like ‘Well, can you give me all the mutants in this process?’ And that’s it. No fancy tools, no memory parsers, no big data acquisitions, and it could all be done from a central console? And while we are wishing for things, why not also wish for the ability to apply filters on mutant names or process names (or PIDs), so as to get very targeted and precise information at a low cost and high efficiency.

Well, it seems like this might be your lucky day.

With our latest extension to osquery, we have added a new table called ‘win_process_handles’, which gives you the open handles and their names. The table schema looks something like:

osquery> .schema win_process_handles

CREATE TABLE win_process_handles(`pid` BIGINT, `handle_type` TEXT, `object_name` TEXT, `access_mask` BIGINT);

osquery>

where ‘handle_type’ could be a Registry Key, ALPC Port, Event, Section, File, Directory, Semaphore, Mutant or any other object for which Windows allows a handle.

As already mentioned, tools like handle.exe (from Sysinternals) and other products can give you the same information, so it’s not like we are the only ones to find ‘water on Mars’. But we wired it up with osquery, which already gives you a great number of tables, and the magic of SQL lets simple questions get their answers. For example, look at the following query:

osquery> select * from win_process_handles where handle_type='Mutant' and pid in (select pid from processes where name like '%svchost%');

+-------+-------------+---------------------------------------------------------+-------------+
| pid   | handle_type | object_name                                             | access_mask |
+-------+-------------+---------------------------------------------------------+-------------+
| 1296  | Mutant      | \Sessions\2\BaseNamedObjects\SM0:1296:304:WilStaging_02 | 2031617     |
<snip>
| 14944 | Mutant      | \Sessions\2\BaseNamedObjects\SM0:14944:120:WilError_01  | 2031617     |
+-------+-------------+---------------------------------------------------------+-------------+

I told osquery to give me a list of all the running processes named ‘svchost.exe’, which I fed into the other table to get a list of all their “Mutant” objects. But the processes table in osquery will only give the PIDs of user processes. What about system services, which run with much higher privileges and need additional privileges to peep into?

Well, like ‘processes’, osquery has a table for ‘services’ and we can just do the same thing there:

osquery> select * from win_process_handles where handle_type='Mutant' and pid in (select pid from services);

+-------+-------------+-----------------------------------------------+-------------+
| pid   | handle_type | object_name                                   | access_mask |
+-------+-------------+-----------------------------------------------+-------------+
| 68    | Mutant      | \BaseNamedObjects\SM0:68:304:WilStaging_02    | 2031617     |
<snip>
| 50800 | Mutant      | \BaseNamedObjects\SM0:50800:304:WilStaging_02 | 2031617     |
+-------+-------------+-----------------------------------------------+-------------+

Cool, isn’t it? Now get funky and create other queries, not restricted to svchost or mutants or, for that matter, even these tables. All this without having to haul a ton of data from the endpoints into your storage buckets (Amazon is already very rich, trust me) and then search through it, only to find out whether there was a bad mutant with a name like ‘%Had3yghhuju98gggd9G6790hfv3%’ on your endpoint.
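
If you want to play with the shape of that last hunt without an endpoint at hand, here is a rough sketch in Python. It fakes the two tables in an in-memory SQLite database (osquery's SQL engine is SQLite, so the query reads the same) and looks for the bad mutant by name. Everything about the sample data is an assumption for illustration: PID 4242, the process name evil.exe and its file handle are made up, and the processes table is cut down to two columns. Only the win_process_handles columns and the mutant names come from the text above.

```python
import sqlite3

# Hypothetical sample rows. On a real endpoint osquery fills these tables itself.
HANDLES = [
    (1296, "Mutant", r"\Sessions\2\BaseNamedObjects\SM0:1296:304:WilStaging_02", 2031617),
    (4242, "Mutant", r"\Sessions\1\BaseNamedObjects\Had3yghhuju98gggd9G6790hfv3.4", 2031617),
    (4242, "File", r"\Device\HarddiskVolume2\Users\demo\evil.exe", 1179785),
]
PROCESSES = [(1296, "svchost.exe"), (4242, "evil.exe")]

# The hunt: every process holding a Mutant handle whose name matches a pattern.
HUNT_QUERY = """
SELECT p.pid, p.name, h.object_name
FROM win_process_handles AS h
JOIN processes AS p ON p.pid = h.pid
WHERE h.handle_type = 'Mutant'
  AND h.object_name LIKE ?
"""


def build_demo_db():
    """Create an in-memory stand-in for the two osquery tables."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE win_process_handles("
        "pid BIGINT, handle_type TEXT, object_name TEXT, access_mask BIGINT)"
    )
    # Simplified: the real processes table has many more columns.
    db.execute("CREATE TABLE processes(pid BIGINT, name TEXT)")
    db.executemany("INSERT INTO win_process_handles VALUES (?, ?, ?, ?)", HANDLES)
    db.executemany("INSERT INTO processes VALUES (?, ?)", PROCESSES)
    return db


def hunt_mutant(db, name_fragment):
    """Return (pid, process name, mutant name) rows matching the fragment."""
    return db.execute(HUNT_QUERY, (f"%{name_fragment}%",)).fetchall()


if __name__ == "__main__":
    for row in hunt_mutant(build_demo_db(), "Had3yghhuju98gggd9G6790hfv3"):
        print(row)
```

In the osquery shell itself there is no ‘?’ placeholder, so you would write the pattern straight into the query, e.g. object_name like '%Had3yghhuju98gggd9G6790hfv3%'.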