Scan Providers

Version 2.5.0 is close to being released and comes with the last type of extension exposed to Python: scan providers. Scan provider extensions are not only the most complex type of extension, but also the most powerful, as they make it possible to add support for new file formats entirely from Python!

This feature required exposing a lot more of the SDK to Python and can’t be completely discussed in one post. This post introduces the topic, while future posts will show real-life examples.

Let’s start with the list of Python scan providers under Extensions -> Scan providers:

Scan provider extensions

This list is retrieved from the configuration file ‘scanp.cfg’. Here’s an example entry:

[TEST]
label = Test scan provider
ext = test2,test3
group = db
file = Test.py
allocator = allocator

The name of the section has two purposes: it specifies the name of the format being supported (in this case ‘TEST’) and also the name of the extension that is automatically associated with that format (in this case ‘.test’, case insensitive). The hard limit for format names is currently 9 characters; this may change in the future if more are needed. The label is the description. The ext parameter is optional and specifies additional extensions to be associated with the format. group specifies the type of file being supported; the available groups are: img, video, audio, doc, font, exe, manexe, arch, db, sys, cert, script. file specifies the Python source file and allocator the function which returns a new instance of the scan provider class.
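To make the association rules above concrete, here is a minimal sketch that reads an entry like the one shown and lists the extensions it ends up associated with. It uses Python's standard configparser purely as a stand-in: how the application itself parses ‘scanp.cfg’ is not covered here, and the helper name associated_extensions is hypothetical.

```python
import configparser

MAX_FORMAT_NAME = 9  # hard limit on format names mentioned in the text

SAMPLE = """
[TEST]
label = Test scan provider
ext = test2,test3
group = db
file = Test.py
allocator = allocator
"""


def associated_extensions(cfg_text):
    """Map each format name to the extensions associated with it."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    result = {}
    for name in parser.sections():
        if len(name) > MAX_FORMAT_NAME:
            raise ValueError("format name too long: " + name)
        # The section name doubles as an extension (case insensitive).
        exts = [name.lower()]
        # 'ext' is optional and lists additional extensions.
        extra = parser.get(name, "ext", fallback="")
        exts += [e.strip().lower() for e in extra.split(",") if e.strip()]
        result[name] = exts
    return result


print(associated_extensions(SAMPLE))
```

For the sample entry, the format ‘TEST’ ends up associated with ‘test’, ‘test2’ and ‘test3’.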

Let’s start with the allocator:

def allocator():
    return TestScanProvider()

It just returns a new instance of TestScanProvider, which is a class derived from ScanProvider:

class TestScanProvider(ScanProvider):

    def __init__(self):
        super(TestScanProvider, self).__init__()
        self.obj = None

Every scan provider has some mandatory methods it must override. Let’s begin with the first ones:

    def _clear(self):
        self.obj = None

    def _getObject(self):
        return self.obj

    def _initObject(self):
        self.obj = TestObject()
        self.obj.Load(self.getStream())
        return self.SCAN_RESULT_OK

_clear gives the provider a chance to free internal resources when they’re no longer used. In Python this is usually not important, as member objects are automatically freed when their reference count reaches zero.

_getObject must return the internal instance of the object being parsed. This must return an instance of a CFFObject derived class.

_initObject creates the object instance and loads the data stream into it. The sample above assumes that loading succeeds; otherwise, we would have to return SCAN_RESULT_ERROR. This method is not called by the main thread, so that it doesn’t block the UI during long parse operations.
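The failure path mentioned above could look roughly like the following sketch. It runs outside the application, so StandInObject and ProviderSketch are stand-ins rather than SDK classes, the constant values are placeholders, and it assumes that Load reports success as a boolean.

```python
class StandInObject:
    """Stand-in for a CFFObject-derived class; not the real SDK type."""

    def Load(self, stream):
        # Assumption: Load reports success as a boolean.
        return len(stream) > 0


class ProviderSketch:
    # Placeholder values; the real constants come from ScanProvider.
    SCAN_RESULT_ERROR = 0
    SCAN_RESULT_OK = 1

    def __init__(self, stream):
        self.obj = None
        self._stream = stream

    def getStream(self):
        return self._stream

    def _initObject(self):
        obj = StandInObject()
        if not obj.Load(self.getStream()):
            # Leave self.obj unset so a failed load is never exposed.
            return self.SCAN_RESULT_ERROR
        self.obj = obj
        return self.SCAN_RESULT_OK
```

Assigning self.obj only after a successful load means _getObject never hands out a half-initialised object.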

Let’s take a look at the TestObject class:

class TestObject(CFFObject):

    def __init__(self):
        super(TestObject, self).__init__()
        self.SetObjectFormatName("TEST")
        self.SetDefaultEndianness(ENDIANNESS_LITTLE)

This is a minimal implementation of a CFFObject derived class. Usually it should contain at least an override of the CustomLoad method, which gives the object the opportunity to fail when the data stream is first loaded through the Load method. SetDefaultEndianness isn’t strictly necessary here, as every object defaults to little endian. SetObjectFormatName, on the other hand, is very important, as it sets the internal format name of the object.
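As a rough illustration of the CustomLoad idea, the sketch below rejects any stream that doesn’t start with a hypothetical 4-byte signature. StandInCFFObject only mimics the behaviour described above (Load giving CustomLoad the chance to fail); the exact signatures of CustomLoad and Read in the SDK are assumptions here.

```python
class StandInCFFObject:
    """Stand-in for CFFObject so the sketch runs outside the application."""

    def __init__(self):
        self._data = b""

    def Load(self, data):
        # Assumed behaviour: store the stream, then let CustomLoad veto it.
        self._data = data
        return self.CustomLoad()

    def CustomLoad(self):
        return True

    def Read(self, offset, size):
        return self._data[offset:offset + size]


class TestObject(StandInCFFObject):

    def CustomLoad(self):
        # Hypothetical check: the stream must begin with b"test".
        return self.Read(0, 4) == b"test"


print(TestObject().Load(b"test payload"))
print(TestObject().Load(b"other"))
```

The first call succeeds and the second fails, which is the point at which a provider would return SCAN_RESULT_ERROR from _initObject.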

Let’s now take a look at how we scan a file:

    def _startScan(self):
        return self.SCAN_RESULT_OK
        
    def _threadScan(self):
        e = ScanEntryData()
        e.category = SEC_Warn
        e.type = CT_NativeCode
        self.addEntry(e)

The code above will issue a single warning concerning native code. When _startScan returns SCAN_RESULT_OK, _threadScan will be called from a thread other than the main UI one. The logic behind this is that _startScan is called from the main thread: if the scan of the file doesn’t require complex operations, as in the case above, the method could return SCAN_RESULT_FINISHED, and _threadScan won’t be called at all. During a threaded scan, an abort by the user can be detected via the isAborted method.
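A longer threaded scan would typically poll isAborted between units of work. The following sketch shows that pattern with a stand-in class; only the method names isAborted, addEntry and _threadScan come from the text above, while the chunk list, the abort simulation and the tuple entries (used in place of ScanEntryData) are made up for illustration.

```python
class ProviderSketch:
    """Stand-in exposing only the methods the sketch needs."""

    def __init__(self, chunks, abort_after=None):
        self.chunks = chunks
        self.entries = []
        self._checks = 0
        self._abort_after = abort_after

    def isAborted(self):
        # Simulate the user aborting after a given number of checks.
        self._checks += 1
        return self._abort_after is not None and self._checks > self._abort_after

    def addEntry(self, entry):
        self.entries.append(entry)

    def _threadScan(self):
        for index, chunk in enumerate(self.chunks):
            if self.isAborted():
                return  # stop early; the user cancelled the scan
            if b"\x90\x90" in chunk:
                # A tuple stands in for a ScanEntryData instance.
                self.addEntry(("warn", index))
```

Checking once per chunk keeps the scan responsive to cancellation without testing the flag on every byte.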

From the UI point of view, when a scan entry is clicked in the summary, the scan provider is supposed to return UI information.

    def _scanViewData(self, xml, dnode, sdata):
        if sdata.type == CT_NativeCode:
            sdata.setViews(SCANVIEW_TEXT)
            sdata.data.setData("Hello, world!")
            return True
        return False

This will display a text field with predefined content when the user clicks the scan entry in the summary. This is fairly easy, but what happens when we have several entries of the same type and need to differentiate between them? That’s where the data member of ScanEntryData comes in: it is a string which is included in the report XML and passed back to _scanViewData as an XML node.

For instance:

e.data = "<o>1234</o>"

Becomes this in the final XML report:

<d>
    <o>1234</o>
</d>

The dnode argument of _scanViewData points to the ‘d’ node and its first child will be the ‘o’ node we passed. The xml argument represents an instance of the NTXml class, which can be used to retrieve the children of the dnode.
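The shape of that lookup can be sketched with Python's standard xml.etree.ElementTree. This is only an analogy for the ‘d’ node and its ‘o’ child: the NTXml API is not shown in this post, so no NTXml calls are used, and the helper name entry_payload is hypothetical.

```python
import xml.etree.ElementTree as ET


def entry_payload(dnode):
    """Return the text of the first child of the <d> node, or None."""
    first = next(iter(dnode), None)
    return first.text if first is not None else None


# The <d> node as it would appear in the report for e.data = "<o>1234</o>".
dnode = ET.fromstring("<d><o>1234</o></d>")
print(entry_payload(dnode))
```

The value recovered this way (here the string 1234) is what lets a provider tell apart several entries of the same type.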

But this is only half of the story: some of the scan entries may represent embedded files (category SEC_File), in which case the _scanViewData method must return the data representing the file.

Apart from scan entries, we may also want the user to explore the format of the file. To do that we must return a tree representing the structure of our file:

    def _getFormat(self):
        ft = FormatTree()
        ft.enableIDs(True)
        fi = ft.appendChild(None, 1)
        ft.appendChild(fi, 2)
        return ft

The enableIDs method must be called right after creating a new FormatTree instance. The code above creates a format item with id 1 and gives it a child item with id 2, which results in the following:

Format tree

But of course, we have specified neither labels nor icons in the function above. This information is retrieved for each item when required through the following method:

    def _formatViewInfo(self, finfo):
        if finfo.fid == 1:
            finfo.text = "directory"
            finfo.icon = PubIcon_Dir
            return True
        elif finfo.fid == 2:
            finfo.text = "entry"
            return True
        return False

The various items are identified by their id, which was specified during the creation of the tree.
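When a format has many items, the same lookup by id can be driven by a table instead of a chain of if/elif branches. The sketch below is one possible arrangement, not something the SDK requires: SimpleNamespace stands in for the finfo object, and PubIcon_Dir is replaced by a placeholder string so that the code runs outside the application.

```python
from types import SimpleNamespace

PubIcon_Dir = "PubIcon_Dir"  # placeholder for the real icon constant

# One row per format item id; only the fields that differ are listed.
ITEMS = {
    1: {"text": "directory", "icon": PubIcon_Dir},
    2: {"text": "entry"},
}


def format_view_info(finfo):
    """Fill in finfo from the table; return False for unknown ids."""
    item = ITEMS.get(finfo.fid)
    if item is None:
        return False
    for name, value in item.items():
        setattr(finfo, name, value)
    return True


finfo = SimpleNamespace(fid=1, text="", icon=None)
print(format_view_info(finfo), finfo.text)
```

Keeping the ids in a single table also makes it easier to avoid changing them by accident.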

The UI data for each item is retrieved through the _formatViewData method:

    def _formatViewData(self, sdata):
        if sdata.fid == 1:
            sdata.setViews(SCANVIEW_CUSTOM)
            sdata.data.setData("<ui><hsplitter csizes='40-*'><table id='1'/><hex id='2'/></hsplitter></ui>")
            sdata.setCallback(cb, None)
            return True
        return False

This will display a custom view with a table and a hex view separated by a splitter:

Custom view

Of course, we have also specified a callback for our custom view:

def cb(cv, ud, code, view, data):
    if code == pvnInit:
        return 1
    return 0

It is good to remember that format item IDs and IDs used in custom views are used to encode bookmark jumps. So if they change, saved bookmark jumps become invalid.

And here again is the whole code for a better overview:

from Pro.Core import *
from Pro.UI import pvnInit, PubIcon_Dir

class TestObject(CFFObject):

    def __init__(self):
        super(TestObject, self).__init__()
        self.SetObjectFormatName("TEST")
        self.SetDefaultEndianness(ENDIANNESS_LITTLE)

def cb(cv, ud, code, view, data):
    if code == pvnInit:
        return 1
    return 0

class TestScanProvider(ScanProvider):

    def __init__(self):
        super(TestScanProvider, self).__init__()
        self.obj = None

    def _clear(self):
        self.obj = None

    def _getObject(self):
        return self.obj

    def _initObject(self):
        self.obj = TestObject()
        self.obj.Load(self.getStream())
        return self.SCAN_RESULT_OK

    def _startScan(self):
        return self.SCAN_RESULT_OK

    def _threadScan(self):
        print("thread msg")
        e = ScanEntryData()
        e.category = SEC_Warn
        e.type = CT_NativeCode
        self.addEntry(e)

    def _scanViewData(self, xml, dnode, sdata):
        if sdata.type == CT_NativeCode:
            sdata.setViews(SCANVIEW_TEXT)
            sdata.data.setData("Hello, world!")
            return True
        return False

    def _getFormat(self):
        ft = FormatTree()
        ft.enableIDs(True)
        fi = ft.appendChild(None, 1)
        ft.appendChild(fi, 2)
        return ft

    def _formatViewInfo(self, finfo):
        if finfo.fid == 1:
            finfo.text = "directory"
            finfo.icon = PubIcon_Dir
            return True
        elif finfo.fid == 2:
            finfo.text = "entry"
            return True
        return False

    def _formatViewData(self, sdata):
        if sdata.fid == 1:
            sdata.setViews(SCANVIEW_CUSTOM)
            sdata.data.setData("<ui><hsplitter csizes='40-*'><table id='1'/><hex id='2'/></hsplitter></ui>")
            sdata.setCallback(cb, None)
            return True
        return False

def allocator():
    return TestScanProvider()

As you may have noticed from the screenshot above, the analysed file is called ‘a.t’ and so its extension isn’t automatically associated with our ‘test’ format. So how is it identified anyway?

Clearly Profiler doesn’t rely on extensions alone to identify the format of a file. For external scan providers a signature mechanism based on YARA has been introduced. In the config directory of the user, you can create a file named ‘yara.plain’ and insert your identification rules in it, e.g.:

rule test
{
    strings:
        $sig = "test"

    condition:
        $sig at 0
}

This rule will identify the format as ‘test’ if the first 4 bytes of the file match the string ‘test’: the name of the rule identifies the format.

The file ‘yara.plain’ will be compiled to the binary ‘yara.rules’ file at the first run. In order to refresh ‘yara.rules’, you must delete it.

One important thing to remember is that a rule isn’t matched against an entire file, but only against the first 512 bytes.

Of course, our provider behaves 100% like all other providers and can be used to load embedded files:

Embedded files

Our new provider is used automatically when an embedded file is identified as matching our format.

Author: Erik Pistelli. Posted on September 21, 2015; updated April 1, 2021. Categories: Suite Standard.
finfo.text = "directory" finfo.icon = PubIcon_Dir return True elif finfo.fid == 2: finfo.text = "entry" return True return False def _formatViewData(self, sdata): if sdata.fid == 1: sdata.setViews(SCANVIEW_CUSTOM) sdata.data.setData("<ui><hsplitter csizes="40-*"></hsplitter></ui></pre></hex><table id="1"></table><hex id="2">") sdata.setCallback(cb, None) return True return False def allocator(): return TestScanProvider() <p>If you have noticed from the screen-shot above, the analysed file is called ‘a.t’ and as such doesn’t automatically associate to our ‘test’ format. So how does it associate anyway?</p> <p>Clearly Profiler doesn’t rely on extensions alone to identify the format of a file. For external scan providers a signature mechanism based on YARA has been introduced. In the <strong>config</strong> directory of the user, you can create a file named ‘yara.plain’ and insert your identification rules in it, e.g.:</p> <pre lang="text">rule test { strings: $sig = "test" condition: $sig at 0 }</pre> <p>This rule will identify the format as ‘test’ if the first 4 bytes of the file match the string ‘test’: the name of the rule identifies the format.</p> <p>The file ‘yara.plain’ will be compiled to the binary ‘yara.rules’ file at the first run. 
In order to refresh ‘yara.rules’, you must delete it.</p> <p>One important thing to remember is that a rule isn’t matched against an entire file, but only against the first 512 bytes.</p> <p>Of course, our provider behaves 100% like all other providers and can be used to load embedded files:</p> <p><a href="/wp-content/uploads/2015/09/scanp/embfiles.png"><img decoding="async" class="post-img" src="/wp-content/uploads/2015/09/scanp/embfiles.png" alt="Embedded files"></a></p> <p>Our new provider is used automatically when an embedded file is identified as matching our format.</p><footer class="entry-footer"> <span class="byline"><img alt="" src="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=49&d=mm&r=g" srcset="https://secure.gravatar.com/avatar/7a86aa69922858b8d41989621fc1ea364aae1e027546f88a54d94ab1ec2187fc?s=98&d=mm&r=g 2x" class="avatar avatar-49 photo" height="49" width="49" decoding="async"><span class="screen-reader-text">Author </span><span class="author vcard"><a class="url fn n" href="https://blog.cerbero.io/author/cerbero/">Erik Pistelli</a></span></span><span class="posted-on"><span class="screen-reader-text">Posted on </span><a href="https://blog.cerbero.io/scan-providers/" rel="bookmark"><time class="entry-date published" datetime="2015-09-21T22:13:50+00:00">September 21, 2015</time><time class="updated" datetime="2021-04-01T16:32:53+00:00">April 1, 2021</time></a></span><span class="cat-links"><span class="screen-reader-text">Categories </span><a href="https://blog.cerbero.io/category/suite-standard/" rel="category tag">Suite Standard</a></span> </footer> <div id="comments" class="comments-area"> <div id="respond" class="comment-respond"> <h2 id="reply-title" class="comment-reply-title">Leave a Reply <small><a rel="nofollow" id="cancel-comment-reply-link" href="/scan-providers/#respond" style="display:none;">Cancel reply</a></small></h2><form action="https://blog.cerbero.io/wp-comments-post.php" 
method="post" id="commentform" class="comment-form" novalidate=""></form><p class="comment-notes"><span id="email-notes">Your email address will not be published.</span> <span class="required-field-message">Required fields are marked <span class="required">*</span></span></p><p class="comment-form-comment"><label for="comment">Comment <span class="required">*</span></label> <textarea autocomplete="new-password" id="comment" name="d38951e234" cols="45" rows="8" maxlength="65525" required=""></textarea><textarea id="a48de96393fbe452c9de40e41628280f" aria-label="hp-comment" aria-hidden="true" name="comment" autocomplete="new-password" style="padding:0 !important;clip:rect(1px, 1px, 1px, 1px) !important;position:absolute !important;white-space:nowrap !important;height:1px !important;width:1px !important;overflow:hidden !important;" tabindex="-1"></textarea><script data-noptimize="">document.getElementById("comment").setAttribute("id", "a48de96393fbe452c9de40e41628280f");document.getElementById("d38951e234").setAttribute("id", "comment");</script></p><p class="comment-form-author"><label for="author">Name <span class="required">*</span></label> <input id="author" name="author" type="text" value="" size="30" maxlength="245" autocomplete="name" required=""></p> <p class="comment-form-email"><label for="email">Email <span class="required">*</span></label> <input id="email" name="email" type="email" value="" size="30" maxlength="100" aria-describedby="email-notes" autocomplete="email" required=""></p> <p class="comment-form-url"><label for="url">Website</label> <input id="url" name="url" type="url" value="" size="30" maxlength="200" autocomplete="url"></p> <p class="comment-form-cookies-consent"><input id="wp-comment-cookies-consent" name="wp-comment-cookies-consent" type="checkbox" value="yes"> <label for="wp-comment-cookies-consent">Save my name, email, and website in this browser for the next time I comment.</label></p> <input type="hidden" class="hcaptcha-widget-id" 
name="hcaptcha-widget-id" value="eyJzb3VyY2UiOlsiV29yZFByZXNzIl0sImZvcm1faWQiOiIxNTUxIn0=-2bc8715768fd30dc95101e48f992fcbf"> <h-captcha class="h-captcha" data-sitekey="0d172a02-d673-47d9-af17-346c4cd386d9" data-theme="light" data-size="normal" data-auto="false" data-ajax="false" data-force="false"> </h-captcha> <input type="hidden" id="hcaptcha_comment_nonce" name="hcaptcha_comment_nonce" value="377ea2a271"><input type="hidden" name="_wp_http_referer" value="/scan-providers/"><p class="form-submit"><input name="submit" type="submit" id="submit" class="submit" value="Post Comment"> <input type="hidden" name="comment_post_ID" value="1551" id="comment_post_ID"> <input type="hidden" name="comment_parent" id="comment_parent" value="0"> </p></div></div><nav class="navigation post-navigation" aria-label="Posts"> <h2 class="screen-reader-text">Post navigation</h2> <div class="nav-links"><div class="nav-previous"><a href="https://blog.cerbero.io/profiler-2-4/" rel="prev"><span class="meta-nav" aria-hidden="true">Previous</span> <span class="screen-reader-text">Previous post:</span> <span class="post-title">Profiler 2.4</span></a></div><div class="nav-next"><a href="https://blog.cerbero.io/torrent-support/" rel="next"><span class="meta-nav" aria-hidden="true">Next</span> <span class="screen-reader-text">Next post:</span> <span class="post-title">Torrent Support</span></a></div></div></nav> <aside id="secondary" class="sidebar widget-area"> <section id="search-2" class="widget widget_search"> <form role="search" method="get" class="search-form" action="https://blog.cerbero.io/"></form> <label> <span class="screen-reader-text"> Search for: </span> <input type="search" class="search-field" placeholder="Search …" value="" name="s"> </label> <button type="submit" class="search-submit"><span class="screen-reader-text"> Search </span></button> </section> <section id="recent-posts-2" class="widget widget_recent_entries"> <h2 class="widget-title">Recent Posts</h2><nav 
aria-label="Recent Posts"> <ul> <li> <a href="https://blog.cerbero.io/wim-format-package/" aria-current="page">WIM Format Package</a> </li> <li> <a href="https://blog.cerbero.io/hfs-file-system/">HFS+ File System</a> </li> <li> <a href="https://blog.cerbero.io/ext-file-systems/">EXT File Systems</a> </li> <li> <a href="https://blog.cerbero.io/ntfs-file-system/">NTFS File System</a> </li> <li> <a href="https://blog.cerbero.io/exfat-file-system/">ExFAT File System</a> </li> <li> <a href="https://blog.cerbero.io/disk-format-package/">Disk Format Package</a> </li> <li> <a href="https://blog.cerbero.io/fat-file-system/">FAT File System</a> </li> <li> <a href="https://blog.cerbero.io/prototype-memory-services/">Prototype Memory & Services</a> </li> <li> <a href="https://blog.cerbero.io/iso-format-2-0-package/">ISO Format 2.0 Package</a> </li> <li> <a href="https://blog.cerbero.io/memory-decompression-pagefiles/">Memory Decompression & Pagefiles</a> </li> </ul> </nav></section><section id="archives-4" class="widget widget_archive"><h2 class="widget-title">Archives</h2> <label class="screen-reader-text" for="archives-dropdown-4">Archives</label> <select id="archives-dropdown-4" name="archive-dropdown"> <option value="">Select Month</option> <option value="https://blog.cerbero.io/2025/06/"> June 2025 (1)</option> <option value="https://blog.cerbero.io/2025/05/"> May 2025 (7)</option> <option value="https://blog.cerbero.io/2025/04/"> April 2025 (4)</option> <option value="https://blog.cerbero.io/2025/03/"> March 2025 (2)</option> <option value="https://blog.cerbero.io/2024/10/"> October 2024 (3)</option> <option value="https://blog.cerbero.io/2024/09/"> September 2024 (1)</option> <option value="https://blog.cerbero.io/2024/08/"> August 2024 (3)</option> <option value="https://blog.cerbero.io/2024/07/"> July 2024 (5)</option> <option value="https://blog.cerbero.io/2024/06/"> June 2024 (2)</option> <option value="https://blog.cerbero.io/2024/04/"> April 2024 (4)</option> 
<option value="https://blog.cerbero.io/2024/03/"> March 2024 (1)</option> <option value="https://blog.cerbero.io/2024/02/"> February 2024 (1)</option> <option value="https://blog.cerbero.io/2024/01/"> January 2024 (4)</option> <option value="https://blog.cerbero.io/2023/12/"> December 2023 (3)</option> <option value="https://blog.cerbero.io/2023/11/"> November 2023 (7)</option> <option value="https://blog.cerbero.io/2023/10/"> October 2023 (3)</option> <option value="https://blog.cerbero.io/2023/09/"> September 2023 (1)</option> <option value="https://blog.cerbero.io/2023/07/"> July 2023 (1)</option> <option value="https://blog.cerbero.io/2023/05/"> May 2023 (11)</option> <option value="https://blog.cerbero.io/2023/03/"> March 2023 (9)</option> <option value="https://blog.cerbero.io/2023/02/"> February 2023 (3)</option> <option value="https://blog.cerbero.io/2023/01/"> January 2023 (1)</option> <option value="https://blog.cerbero.io/2022/11/"> November 2022 (1)</option> <option value="https://blog.cerbero.io/2022/09/"> September 2022 (2)</option> <option value="https://blog.cerbero.io/2022/08/"> August 2022 (2)</option> <option value="https://blog.cerbero.io/2022/07/"> July 2022 (3)</option> <option value="https://blog.cerbero.io/2022/06/"> June 2022 (2)</option> <option value="https://blog.cerbero.io/2022/05/"> May 2022 (5)</option> <option value="https://blog.cerbero.io/2022/04/"> April 2022 (3)</option> <option value="https://blog.cerbero.io/2022/03/"> March 2022 (4)</option> <option value="https://blog.cerbero.io/2022/02/"> February 2022 (6)</option> <option value="https://blog.cerbero.io/2022/01/"> January 2022 (1)</option> <option value="https://blog.cerbero.io/2021/11/"> November 2021 (4)</option> <option value="https://blog.cerbero.io/2021/10/"> October 2021 (5)</option> <option value="https://blog.cerbero.io/2021/09/"> September 2021 (7)</option> <option value="https://blog.cerbero.io/2021/06/"> June 2021 (1)</option> <option 
value="https://blog.cerbero.io/2021/04/"> April 2021 (1)</option> <option value="https://blog.cerbero.io/2021/03/"> March 2021 (4)</option> <option value="https://blog.cerbero.io/2021/02/"> February 2021 (1)</option> <option value="https://blog.cerbero.io/2020/12/"> December 2020 (1)</option> <option value="https://blog.cerbero.io/2020/11/"> November 2020 (1)</option> <option value="https://blog.cerbero.io/2020/10/"> October 2020 (1)</option> <option value="https://blog.cerbero.io/2020/09/"> September 2020 (2)</option> <option value="https://blog.cerbero.io/2020/07/"> July 2020 (2)</option> <option value="https://blog.cerbero.io/2020/01/"> January 2020 (1)</option> <option value="https://blog.cerbero.io/2019/09/"> September 2019 (1)</option> <option value="https://blog.cerbero.io/2019/08/"> August 2019 (2)</option> <option value="https://blog.cerbero.io/2019/07/"> July 2019 (1)</option> <option value="https://blog.cerbero.io/2019/06/"> June 2019 (1)</option> <option value="https://blog.cerbero.io/2019/05/"> May 2019 (3)</option> <option value="https://blog.cerbero.io/2019/04/"> April 2019 (2)</option> <option value="https://blog.cerbero.io/2018/06/"> June 2018 (1)</option> <option value="https://blog.cerbero.io/2018/04/"> April 2018 (1)</option> <option value="https://blog.cerbero.io/2018/03/"> March 2018 (1)</option> <option value="https://blog.cerbero.io/2018/01/"> January 2018 (1)</option> <option value="https://blog.cerbero.io/2017/11/"> November 2017 (2)</option> <option value="https://blog.cerbero.io/2017/03/"> March 2017 (5)</option> <option value="https://blog.cerbero.io/2016/07/"> July 2016 (2)</option> <option value="https://blog.cerbero.io/2016/05/"> May 2016 (2)</option> <option value="https://blog.cerbero.io/2016/04/"> April 2016 (1)</option> <option value="https://blog.cerbero.io/2015/10/"> October 2015 (2)</option> <option value="https://blog.cerbero.io/2015/09/"> September 2015 (2)</option> <option value="https://blog.cerbero.io/2015/06/"> June 2015 
(2)</option> <option value="https://blog.cerbero.io/2014/12/"> December 2014 (2)</option> <option value="https://blog.cerbero.io/2014/10/"> October 2014 (1)</option> <option value="https://blog.cerbero.io/2014/09/"> September 2014 (3)</option> <option value="https://blog.cerbero.io/2014/08/"> August 2014 (1)</option> <option value="https://blog.cerbero.io/2014/07/"> July 2014 (1)</option> <option value="https://blog.cerbero.io/2013/12/"> December 2013 (2)</option> <option value="https://blog.cerbero.io/2013/11/"> November 2013 (5)</option> <option value="https://blog.cerbero.io/2013/10/"> October 2013 (5)</option> <option value="https://blog.cerbero.io/2013/09/"> September 2013 (6)</option> <option value="https://blog.cerbero.io/2013/08/"> August 2013 (6)</option> <option value="https://blog.cerbero.io/2013/07/"> July 2013 (1)</option> <option value="https://blog.cerbero.io/2013/06/"> June 2013 (4)</option> <option value="https://blog.cerbero.io/2013/05/"> May 2013 (7)</option> <option value="https://blog.cerbero.io/2013/04/"> April 2013 (5)</option> <option value="https://blog.cerbero.io/2013/03/"> March 2013 (3)</option> <option value="https://blog.cerbero.io/2013/02/"> February 2013 (4)</option> <option value="https://blog.cerbero.io/2013/01/"> January 2013 (3)</option> <option value="https://blog.cerbero.io/2012/12/"> December 2012 (3)</option> <option value="https://blog.cerbero.io/2012/11/"> November 2012 (5)</option> <option value="https://blog.cerbero.io/2012/10/"> October 2012 (3)</option> <option value="https://blog.cerbero.io/2012/09/"> September 2012 (1)</option> <option value="https://blog.cerbero.io/2012/08/"> August 2012 (2)</option> <option value="https://blog.cerbero.io/2012/07/"> July 2012 (2)</option> <option value="https://blog.cerbero.io/2012/06/"> June 2012 (2)</option> <option value="https://blog.cerbero.io/2012/05/"> May 2012 (2)</option> <option value="https://blog.cerbero.io/2012/04/"> April 2012 (1)</option> <option 
value="https://blog.cerbero.io/2012/03/"> March 2012 (6)</option> <option value="https://blog.cerbero.io/2012/02/"> February 2012 (5)</option> <option value="https://blog.cerbero.io/2012/01/"> January 2012 (8)</option> <option value="https://blog.cerbero.io/2011/11/"> November 2011 (1)</option> <option value="https://blog.cerbero.io/2011/08/"> August 2011 (1)</option> </select> <script>(function(){ var dropdown=document.getElementById("archives-dropdown-4"); function onSelectChange(){ if(dropdown.options[ dropdown.selectedIndex ].value!==''){ document.location.href=this.options[ this.selectedIndex ].value; }} dropdown.onchange=onSelectChange; })();</script> </section> </aside><footer id="colophon" class="site-footer"> <nav class="main-navigation" aria-label="Footer Primary Menu"> <div class="menu-main-container"><ul id="menu-main-1" class="primary-menu"><li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-1923"><a href="https://cerbero.io">Home</a></li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-has-children menu-item-2790"><a href="#">Products</a> <ul class="sub-menu"> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-2181"><a href="https://cerbero.io/suite/">Cerbero Suite</a></li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-2183"><a href="https://cerbero.io/engine/">Cerbero Engine</a></li> </ul> </li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-2567"><a href="https://cerbero.io/packages/">Packages</a></li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-2430"><a href="https://cerbero.io/e-zine/">E-Zine</a></li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-1926"><a href="/">Blog</a></li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-has-children menu-item-2791"><a href="#">Support</a> <ul class="sub-menu"> <li 
class="menu-item menu-item-type-custom menu-item-object-custom menu-item-3000"><a href="https://cerbero.io/manual/">User Manual</a></li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-2165"><a href="https://sdk.cerbero.io/">SDK Documentation</a></li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-2514"><a href="https://cerbero.io/faq/">FAQ</a></li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-1927"><a href="https://cerbero.io/resources/">Resources</a></li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-1930"><a href="https://cerbero.io/contact/">Contact</a></li> </ul> </li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-has-children menu-item-2792"><a href="https://cerbero.io/shop/">Shop</a> <ul class="sub-menu"> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-1928"><a href="https://cerbero.io/my-account/">My account</a></li> <li class="menu-item menu-item-type-custom menu-item-object-custom menu-item-1929"><a href="https://cerbero.io/cart/">Cart</a></li> </ul> </li> </ul></div></nav> <div class="site-info"> <span class="site-title"><a href="https://blog.cerbero.io/" rel="home">Cerbero Blog</a></span> <a href="https://wordpress.org/" class="imprint"> Proudly powered by WordPress </a></div></footer><script type="speculationrules">{"prefetch":[{"source":"document","where":{"and":[{"href_matches":"\/*"},{"not":{"href_matches":["\/wp-*.php","\/wp-admin\/*","\/wp-content\/uploads\/*","\/wp-content\/*","\/wp-content\/plugins\/*","\/wp-content\/themes\/twentysixteen-child\/*","\/wp-content\/themes\/twentysixteen\/*","\/*\\?(.+)"]}},{"not":{"selector_matches":"a[rel~=\"nofollow\"]"}},{"not":{"selector_matches":".no-prefetch, .no-prefetch a"}}]},"eagerness":"conservative"}]}</script> <script>(()=>{'use strict';let loaded=!1,scrolled=!1,timerId;function load(){if(loaded){return} 
loaded=!0;clearTimeout(timerId);window.removeEventListener('touchstart',load);document.body.removeEventListener('mouseenter',load);document.body.removeEventListener('click',load);window.removeEventListener('keydown',load);window.removeEventListener('scroll',scrollHandler);const t=document.getElementsByTagName('script')[0];const s=document.createElement('script');s.type='text/javascript';s.id='hcaptcha-api';s.src='https://js.hcaptcha.com/1/api.js?onload=hCaptchaOnLoad&render=explicit';s.async=!0;t.parentNode.insertBefore(s,t)} function scrollHandler(){if(!scrolled){scrolled=!0;return} load()} document.addEventListener('hCaptchaBeforeAPI',function(){const delay=0;if(delay>=0){timerId=setTimeout(load,delay)} window.addEventListener('touchstart',load);document.body.addEventListener('mouseenter',load);document.body.addEventListener('click',load);window.addEventListener('keydown',load);window.addEventListener('scroll',scrollHandler)})})()</script> <script src="//blog.cerbero.io/wp-content/cache/wpfc-minified/2rmead0w/a7b63.js"></script> <script id="enlighterjs-js-after">!function(e,n){if("undefined"!=typeof EnlighterJS){var o={"selectors":{"block":"pre","inline":"code"},"options":{"indent":4,"ampersandCleanup":true,"linehover":false,"rawcodeDbclick":false,"textOverflow":"scroll","linenumbers":false,"theme":"enlighter","language":"generic","retainCssClasses":false,"collapse":false,"toolbarOuter":"","toolbarTop":"{BTN_RAW}{BTN_COPY}{BTN_WINDOW}{BTN_WEBSITE}","toolbarBottom":""}};(e.EnlighterJSINIT=function(){EnlighterJS.init(o.selectors.block,o.selectors.inline,o.options)})()}else{(n&&(n.error||n.log)||function(){})("Error: EnlighterJS resources not loaded yet!")}}(window,console);</script></hex><table id="1"></table></hsplitter></ui>
    def _formatViewData(self, sdata):
        if sdata.fid == 1:
            sdata.setViews(SCANVIEW_CUSTOM)
            sdata.data.setData('<ui><hsplitter csizes="40-*"><table id="1"/><hex id="2"/></hsplitter></ui>')
            sdata.setCallback(cb, None)
            return True
        return False 

This will display a custom view with a table and a hex view separated by a splitter:

Custom view

Of course, we have also specified the callback for our custom view:

def cb(cv, ud, code, view, data):
    if code == pvnInit:
        return 1
    return 0

Keep in mind that format item IDs and the IDs used in custom views are used to encode bookmark jumps, so if they change, saved bookmark jumps become invalid.
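One way to keep these IDs from drifting is to define them once as named constants and look them up from there. The following is only a minimal sketch in plain Python: the names `FID_DIRECTORY`, `FID_ENTRY`, `FORMAT_ITEMS` and `format_item_text` are hypothetical and not part of the SDK, which only ever sees the integer values.

```python
# Hypothetical constant names; only the integer values matter to the SDK.
# Changing a value here would invalidate bookmark jumps saved with the old ID.
FID_DIRECTORY = 1
FID_ENTRY = 2

# Single place where each format item ID is mapped to its label.
FORMAT_ITEMS = {
    FID_DIRECTORY: "directory",
    FID_ENTRY: "entry",
}

def format_item_text(fid):
    """Return the label for a format item ID, or None if the ID is unknown."""
    return FORMAT_ITEMS.get(fid)
```

A method such as `_formatViewInfo` could then compare `finfo.fid` against these constants instead of bare numbers.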

And here again the whole code for a better overview:

from Pro.Core import *
from Pro.UI import pvnInit, PubIcon_Dir

class TestObject(CFFObject):

    def __init__(self):
        super(TestObject, self).__init__()
        self.SetObjectFormatName("TEST")
        self.SetDefaultEndianness(ENDIANNESS_LITTLE)

def cb(cv, ud, code, view, data):
    if code == pvnInit:
        return 1
    return 0

class TestScanProvider(ScanProvider):

    def __init__(self):
        super(TestScanProvider, self).__init__()
        self.obj = None

    def _clear(self):
        self.obj = None

    def _getObject(self):
        return self.obj

    def _initObject(self):
        self.obj = TestObject()
        self.obj.Load(self.getStream())
        return self.SCAN_RESULT_OK

    def _startScan(self):
        return self.SCAN_RESULT_OK
        
    def _threadScan(self):
        print("thread msg")
        e = ScanEntryData()
        e.category = SEC_Warn
        e.type = CT_NativeCode
        self.addEntry(e)

    def _scanViewData(self, xml, dnode, sdata):
        if sdata.type == CT_NativeCode:
            sdata.setViews(SCANVIEW_TEXT)
            sdata.data.setData("Hello, world!")
            return True
        return False
        
    def _getFormat(self):
        ft = FormatTree()
        ft.enableIDs(True)
        fi = ft.appendChild(None, 1)
        ft.appendChild(fi, 2)
        return ft
        
    def _formatViewInfo(self, finfo):
        if finfo.fid == 1:
            finfo.text = "directory"
            finfo.icon = PubIcon_Dir
            return True
        elif finfo.fid == 2:
            finfo.text = "entry"
            return True
        return False
        
    def _formatViewData(self, sdata):
        if sdata.fid == 1:
            sdata.setViews(SCANVIEW_CUSTOM)
            sdata.data.setData('<ui><hsplitter csizes="40-*"><table id="1"/><hex id="2"/></hsplitter></ui>')
            sdata.setCallback(cb, None)
            return True
        return False

def allocator():
    return TestScanProvider()

As you may have noticed in the screenshot above, the analysed file is called ‘a.t’, so its extension isn’t automatically associated with our ‘test’ format. So how is it associated anyway?

Clearly Profiler doesn’t rely on extensions alone to identify the format of a file. For external scan providers a signature mechanism based on YARA has been introduced. In the user’s config directory, you can create a file named ‘yara.plain’ and insert your identification rules in it, e.g.:

rule test
{
    strings:
        $sig = "test"

    condition:
        $sig at 0
}

This rule will identify the format as ‘test’ if the first 4 bytes of the file match the string ‘test’: the name of the rule identifies the format.

The file ‘yara.plain’ is compiled to the binary file ‘yara.rules’ on the first run. To refresh ‘yara.rules’, you must delete it.
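If you edit ‘yara.plain’ often, deleting the compiled file can be scripted. The helper below is a minimal sketch: the function name `refresh_yara_rules` is hypothetical, and it assumes that ‘yara.rules’ sits in the same config directory as ‘yara.plain’, which you pass in yourself.

```python
import os

def refresh_yara_rules(config_dir):
    """Delete the compiled 'yara.rules' so it can be rebuilt from 'yara.plain'.

    Assumption: 'yara.rules' is stored in the given config directory.
    Returns True if a compiled file was removed, False if there was none.
    """
    compiled = os.path.join(config_dir, "yara.rules")
    if os.path.isfile(compiled):
        os.remove(compiled)
        return True
    return False
```

The rules should then be recompiled on the next run.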

One important thing to remember is that a rule isn’t matched against an entire file, but only against the first 512 bytes.
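To make the consequence of this limit concrete, here is a small emulation in plain Python of a `$sig at <offset>` condition evaluated over a 512-byte window. It does not use YARA and is not how Profiler evaluates rules internally; the names `SCAN_WINDOW` and `signature_in_window` are made up for this sketch.

```python
SCAN_WINDOW = 512  # per the text above: only the first 512 bytes are matched

def signature_in_window(data, sig, offset=0):
    """Emulate `$sig at offset`, but only over the first SCAN_WINDOW bytes."""
    window = data[:SCAN_WINDOW]
    return window[offset:offset + len(sig)] == sig

# The rule from this post: "test" at offset 0.
print(signature_in_window(b"test" + b"\x00" * 16, b"test"))

# A signature placed beyond the window is never seen, even though the
# file contains it.
print(signature_in_window(b"\x00" * 600 + b"test", b"test", 600))
```

In practice this means identification rules should rely on headers or magic values near the start of the file.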

Of course, our provider behaves 100% like all other providers and can be used to load embedded files:

Embedded files

Our new provider is used automatically when an embedded file is identified as matching our format.
