Custom filters: Lua and misc/basic

Last year filters have been introduced and among them them the very useful ‘misc/basic‘. The upcoming 0.9.5 version of the Profiler improves this filter introducing the condition parameter.

For instance, let’s take the following filter:

Conditional misc/basic

It xors every byte if different than 0xFF and 0. The ‘misc/basic‘ filter can be used to express even more complex operations such as:

Advanced misc/basic

In this case the the filter xors every third dword with 0xAABBCCDD, following the pattern ‘xor skip skip’, in little endian mode and only if the value is different than 0 and 0xAABBCCDD. While lots of operations can be expressed with this filter, there are limits.

This is why Lua filters have been introduced. Right now there are two such filters available: ‘lua/custom‘ and ‘lua/loop‘. Let’s start with the second one which is just a shortcut.


if e ~= 0 and e ~= 0xFF then e = bit.bxor(e, 0xFF) end

This script does the exact same thing as the first example of the ‘misc/basic‘ filter: it xors every byte if different than 0xFF and 0. In this specific case there’s no reason to use a Lua filter. In fact, Lua filters are considerably slower than native filters. Thus, they should be used only when the operation is too complex to be expressed with any of the default filters.

While ‘lua/loop‘ is inteded for simple loop operations, ‘lua/custom‘, as the name suggests, can be used to implement a custom filter logic. Here’s an example, which again does the same thing as the previous example:

function run(filter)
    local c = filter:container()
    local size = c:size()
    local offset = 0
    local bsize = 16384
    while size ~= 0 do
        if bsize > size then bsize = size end
        local block = c:read(offset, bsize)
        local boffs = 0
        while boffs < bsize do
            local e = block:readU8(boffs)
            if e ~= 0 and e ~= 0xFF then e = bit.bxor(e, 0xFF) end
            block:writeU8(boffs, e)
            boffs = boffs + 1
        c:write(offset, block)
        offset = offset + bsize
        size = size - bsize
    return Base.FilterErr_None

The security of these scripting filters is very high. They run in a special sandboxed environment, have access only to a minimum set of secure functions, are limited in memory consumption (2 MBs by default, but it can be configured from the settings) and can be interrupted at any time by the user.

If you still don't wish to allow script filters, they can be disabled from the settings.

Lua filters settings

The Lua VM is almost vanilla, the only difference is that it allows for 64-bit numbers. As you can observe from the examples, the Lua library for bitwise operations has been renamed from 'bit32' to 'bit'.

We'll see some practical usage samples in the near future. Stay tuned!

Disasm options & filters

The upcoming version 0.9.4 of the Profiler introduces improvements to several disasm engines: ActionScript3, Dalvik, Java, MSIL. In particular it adds options, so that the user can decide whether to include file offsets and opcodes in the output.

Disasm options

The code indentation can be changed as well.

Another important addition is that these engines have been exposed as filters. This is especially noteworthy since byte code can sometimes be injected or stored outside of a method body, so that it is necessary to be able to disassemble raw data.

Disasm filters

Of course these filters can be used from Python too.

from Pro.Core import *

sp = proContext().currentScanProvider()
c = sp.getObjectStream()
c.setRange(0x2570, 0x10)

fstr = ""
c = applyFilters(c, fstr)

s =, c.size()).decode("utf-8")


/* 00000000 1A 00 8A+ */ const-string v0,  // string@018a (394)
/* 00000004 12 01     */ const/4 v1, #int 0 // #0
/* 00000006 12 22     */ const/4 v2, #int 2 // #2
/* 00000008 70 52 42+ */ invoke-direct {v3, v4, v0, v1, v2},  // method@0042 (66)
/* 0000000E 0E 00     */ return-void

In the future it will be possible to output a filter directly to NTTextStream, avoiding the need to read from NTContainer.

Stay tuned!

Exposing the Core (part 1)

The main feature of the upcoming 0.9.3 version of the Profiler is the expansion of the public SDK. This basically means that a consistent subset of the internal classes will be exposed. Although it’s a subset, there’s no way to document all methods and functions. Fortunately, many of them should be quite intuitive.

Some of the most common important classes are:

  • NTContainer: this is a generic container which is used to encapsulate data such as files and memory. It’s an extremely important class, since it’s used extensively. Containers can for the time being be created through SDK functions such as: createContainerFromFile/newContainer.
  • NTBuffer/NTContainerBuffer/CFFBuffer/etc.: used to efficiently read iteratively small amounts of data from a source.
  • NTTextStream/NTTextBuffer/NTTextStringBuffer: used to output text. Indentation can be specified.
  • NTXml: used to parse XML. Fast and secure. This class is based on RapidXML.
  • CFFObject: the class from which every format class inherits (ZipObject, PEObject, etc). A very small subset of this class is exposed for now. This will change in the future.
  • CFFStruct: representation of a file format structure.
  • CFFFlags: representation of flags in a CFFStruct.

One of the new additions is that Python can now use filters as well. Do you remember the post about Widget and Views? Let’s use the same code base and change just a few lines:

from Pro import *
from PySide import QtCore, QtGui
class MixedWidget(QtGui.QSplitter):
    def __init__(self, parent=None):
        super(MixedWidget, self).__init__(parent)
        self.setWindowTitle("Mixed widget")
        self.model = QtGui.QDirModel()
        tree = QtGui.QTreeView()
        ctx = proContext()
        self.hex = ctx.createView(ProView.Type_Hex, "")
    def updateFile(self, idx):
        if self.model.isDir(idx) == True:
            # modified lines
            name = self.model.filePath(idx)
            c = createContainerFromFile(name)
            fstr = ""
            c = applyFilters(c, fstr)
            # end
ctx = proContext()
w = MixedWidget()
v = ctx.createViewFromWidget(w)

With just three of the modified lines we are xoring all opened files with the value 0xCC and then show the resulting data in the hex view. The Profiler provides a huge number of filters for any kind of operation and they can be chained, so we could easily compress and then encrypt a file with AES by just replacing one line in the sample above. The function applyFilters displays an optional default wait dialog to the user to interrupt the operation (if it is executing in the main thread). Please remember that the easiest way to obtain the needed filters XML string is to use the UI view and use the export command from the list (context menu->Export…).

NTBuffer generates an exception when a read operations fail. Thus, it should be used as follows:

ctx = proContext()
v = ctx.getCurrentView()
d = v.getData()
b = NTContainerBuffer(d, ENDIANNESS_LITTLE, 0)
try: # or b.u8(), b.u16(), etc.
except IndexError as e:

A small snippet to show how to use NTXml:

x = NTXml()
ret = x.parse("")
if ret == NTXml_ErrNone:
    n = x.findChild(None, "r")
    if n != None:
        n = x.findChild(n, "e")
        if n != None:
            a = x.findAttribute(n, "t")
            if a != None:

Along with the core, several of the file objects will be exposed. A text dump of a structure could be as easy as:

c = createContainerFromFile(fname)
pe = PEObject()
out = NTTextStringBuffer()
pe.DosHeader().Dump(out) # CFFStruct::Dump

Please notice that the code above misses several checks. We need to make sure that c is valid and Load succeds. I’ll omit these checks here to keep the code minimal.

You might say that printing out a single structure is an easy task. So let’s take a look at another cooler sample:

c = createContainerFromFile(fname)
pe = PEObject()
out = NTTextStringBuffer()
tables = pe.MDTables("#~") # 'tables' references all .NET metadata tables
pe.DisassembleMSIL(out, 0x06000001) # .NET token (MethodDef | index)

These few lines output an entire .NET method such as:

private static void Main(string [] args)
 locals: int local_0,
         int local_1

 stloc_0 // int local_0
 ldloc_0 // int local_0
 stloc_1 // int local_1
 ldloc_1 // int local_1
  goto loc_22
  goto loc_60
 br_s loc_71
  ldstr "h"
  call System.Console::WriteLine(string) // returns void
  leave_s loc_81
 catch (System.ArgumentNullException)
  ldstr "null"
  call System.Console::WriteLine(string) // returns void
  leave_s loc_81
 catch (System.ArgumentException)
  ldstr "error"
  call System.Console::WriteLine(string) // returns void
  leave_s loc_81
 ldstr "k"
 call System.Console::WriteLine(string) // returns void
 ldstr "c"
 call System.Console::WriteLine(string) // returns void

Nice, isn’t it? Remember we can change the indentation programmatically.

Of course, it will also be possible to get the object currently being analyzed and similar stuff. But we’ll see how to do that in another post.

If you’re wondering why the case convention for methods is not always the same, the reason is simple. CFFObject/CFFStruct/etc are based on older code which followed the Win32-like convention. Consequently all derived classes like PEObject follow this convention. All other classes use the camel-case convention.

Damaged Zip archive (video)

In this video we can see how to inspect a damaged Zip archive using the Profiler in a real-world scenario. Although soon the automatic recovery of damaged Zip archives will be available and it will be possible to perform this sort of task programmatically, it’s still useful to see how to do this kind of thing manually.

Filters with range parameters

The upcoming 0.8.9 release improves filters and introduces range parameters. If you don’t know what filters are you can take a look at the original introductory post.

What is now possible to do is to specify an optional range for the filter to be applied to. This is extremely useful, since many times we need to modify only a portion of the input data. Take, for instance, a file in which a portion of data is encrypted (or compressed) and we want to keep the outer parts not affected by the filter. This would be the layout of the input data:

Data layout

To handle the outer parts (before and after) we have an extra parameter called ‘trim’. This parameter can be set to one of the following values:

  • no: the outer parts are not trimmed and kept in the output data
  • left: the left part is trimmed while the right one is kept
  • right: the right part is trimmed while the left one is kept
  • both: both outer parts are trimmed and the output data will contain only the output of the filter

Let’s try to understand this better with a practical example. Let’s consider a compressed SWF file. The Profiler handles the decompression automatically, but just for the sake of this post we’ll decompress it manually. In case you don’t know, the format of a compressed SWF is as following:

3 bytes: CWS signature (would be FSW uncompressed files)
4 bytes: uncompressed file size
1 byte: version

To dig deeper into the file format you may want to take a look at a new web page we are preparing at But for the sake of our example you need only to know the initial fields of the header. To decompress a SWF we need to decompress the data following the header and then to fix the first signature byte and we’ll do this manually and graphically with filters.

Let’s start by selecting the compressed input data and then by trying to add the unpack/zlib filter.

ZLib filter

As you can see we are prompted with a dialog asking us if we want to use the range defined by the selection of the hex editor on the left (input data). We accept and specify that we want to keep the data on the left (although in this case we could have just disabled the trimming completely, let’s just pretend there’s some data on the right we want to discard).

Now we need to fix the signature byte. To do this we use the misc/basic filter (set operation, offset 0, size 1 and no trimming).

By running the preview we’ll have the decompressed SWF.

The exported filters will look like this:


While this was a very simple case, much more complicated cases can be handled. Also remember that filters can be used to specify how to load embedded files, which means that, for instance, it’s easy to decrypt a file contained into another file and inspect it without ever having to save the decrypted data to disk.

I hope this post offers some understanding of an advanced use of filters.

An introduction to Filters

While filters have always been a part of the Profiler, they are now partly exposed only with the upcoming 0.8.7 version. Shortly explained: filters are algorithms of different nature which can be applied to portions of data. The starting point to interact with filters is the filter view. This view is triggered with the Ctrl+T shortcut. Let’s start with some step by step screenshots.

This is a Zip file and the gray area marks a compressed file. Let’s select this area with Ctrl+Alt+A.

Zipped file

Now let’s press Ctrl+T and take a look at the filter view.

Filter View

The tree at the top-left displays the available filters. The property editor next to it contains the available parameters for the currently selected filter and the list beneath the filters is the stack of filters which need to be applied to the input data. The hex view on the left contains the data which we have previously selected in the hex view (meaning the compressed file) and represents the input data.

Now if we want to decompress the data, let’s add the unpack/zlib filter (with the raw parameter checked) to the stack by clicking on “Add”.

Filter stack

And at this point we can click on the “Preview” button.

Unzipped file

But what if we want to retrieve the SHA1 of the decompressed data? Just add the hash/sha1 filter to the stack.

Filter stack

This time we’ll have only the SHA1 hash in the output view.


The filter stack can be imported and exported. Just right-click inside the stack list. The exported filters of the current stack will look like this:

If we had opened the filter view without a selection we would have had an editable hex view on the left, which we could fill with our custom data. Also, the hex views in the filter view have all the benefits of a regular hex view (that is if the filter view is not triggered while loading a file, as we’ll see later); this means that it’s possible to trigger a new filter view by selecting something in the hex view containing the filtered output data.

What I’ve showed until now isn’t an every-day task, but it serves the purpose of introducing the topic. We’ll see some real-word cases after I have introduced the misc/basic filter. But before of that let me shortly mention text output filters.

Right now the Profiler features two filters with text output: disasm/x86 and disasm/x64. When a filter outputs text, we’ll be shown by default a text view instead of the hex view on the right.

x86 filter

This can come handy when decrypting shellcode. What we also most probably need is the misc/basic filter, which I think will be one of the most commonly used filters. Let’s look at its parameters.

basic filter params

operation specifies what is going to be performed on the input data. Available operations are: set(fill), add, sub, mul, div, mod, and, or, xor, shl, shr, rol, ror. bits specifies the size in bits of each element the operation has to be performed on (values up to 64 bit are supported). endianness applies only to elements greater than 8 bits. radix is related to the input format of value. value contains the data used for the operation.

If I now inserted “FF” as value, then I would be xoring every byte of the input data with that 1-byte value. If, however, I inserted “FF 00”, then I would be xoring every even byte with “FF” and every uneven one with “00”. I may even only modify every third byte by using wildcards and leaving the first two unchanged, I only have to insert “* * FF”. Wildcards and arrays can be used even in the context of values bigger than 1 byte of course.

Now that I’ve introduced the basic filter, let’s see our first real-world case. What you see is a PDF containing an encrypted malware.

Malware PDF

At the caret position an encrypted Portable Executable starts, it’s quite easy to figure out because of the repetitive sequence of bytes (a PE header is full of zeroes). I select the data with the go to dialog (Ctrl+G).

Select encrypted malware

Now instead of triggering the filter view directly, I’m going to load the embedded file with Ctrl+E. This triggers the load file dialog.

Load malware

As you can see I specified the format of the file to load (PE) and inserted an optional name for it, but now it’s necessary to tell the Profiler how to decrypt the file, otherwise it won’t be able to load it. So I’ll just click on the “Filters” button at the bottom-right and a filter view dialog will pop up.

Malware filters

The decryption sequence is “EB FF FD FC EB FF 23 95”. It can be expressed like I did as an array.

But since it amounts to a 64-bit value, it can also be expressed like this.

Ok, back to the malware. After having added the decryption filter to the stack I just close the dialog and confirm the load file dialog. We’re now able to navigate the decrypted PE.

Decrypted malware

Now I can just save the project (which is under 1KB) and send it to a colleague who will be able to inspect the decrypted file.

The last real-world case I’m going to present I made it myself, because it was easier than to start looking for one. I just took a cursor resource from a PE and appended another RC4-encrypted PE to it. Then replaced the original resource and opened the host file with the Profiler.

Fake resource

The Profiler tells us that the are problems in the resource and also shows us the file. In fact, we can see where the foreign data begins. I just select the foreign data (I can do this both from the resource directory or from the DIBCUR embedded file), trigger the load file dialog and apply the decryption filter.

In upcoming updates there will be many improvements to filters: many more will be added and the concept will be extended and made customizable. Clearly this post doesn’t cover all of that, but I think for now this is enough to keep the introduction simple.