News for version 0.9.7

The new 0.9.7 version of the Profiler is out with the following news:

– introduced C++ class/struct parsing with Clang
– introduced headers, layouts and manual analysis in hex mode
– exposed all the above to the Python SDK
– added capability to turn into a portable application
– added SHA-3 hashes
– updated Qt to 4.8.5
– updated OpenSSL
– behavior change: displaying table flags now requires a double click

Enjoy!

Dissecting an ELF with C++ Types

While there are more interesting targets which could be manually analyzed with the new features provided in the Profiler, I decided to write a small post about ELF, also because official support for ELF will be added sooner or later.

Let’s start by importing the types contained in ‘elf.h’. You’ll probably find this header in ‘/usr/include’. Everything we’re interested in is in this file, so we can avoid importing other stuff. I added some predefines in order to avoid includes:

#define int8_t char
#define uint8_t unsigned char
#define int16_t short
#define uint16_t unsigned short
#define int32_t int
#define uint32_t unsigned int
#define int64_t long long
#define uint64_t unsigned long long

Then I pasted ‘elf.h’ into the Header Manager after the HEADER_START directive and clicked on ‘Import’.

ELF types import

We now have a header (elf) with all the types we need to start the manual analysis.

Since this is just a demonstration I didn’t do a full analysis of the ELF format. I limited the scope to finding the imported symbols and their strings.

ELF analysis

Every ELF starts with an _Elf64_Ehdr header (Elf32_Ehdr for 32-bit files; in this case it’s a 64-bit ELF). The header specifies the offset, number and size of the section headers (we’ll just assume the standard 0x40 size here). The ‘name’ field of a section is just an index into a ‘SHT_STRTAB’ section whose index is specified by the header. The contents of a section are specified by its type, so finding the symbol table is pretty straightforward. In this ELF we have a SHT_DYNSYM section. This section is just an array of _Elf64_Sym structures. Again, their ‘st_name’ field is just an index into another SHT_STRTAB section (the interval in the screenshot named ‘.dynstr’).
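For readers who want to follow the same walk outside the Profiler, here is a minimal sketch that uses only Python’s `struct` module. It is not how the Profiler does it: the helper names (`parse_ehdr`, `parse_shdrs`, `section_name`) and the tiny synthetic image are made up for the example, and it assumes a little-endian 64-bit ELF.

```python
import struct

# Field layouts from elf.h (little-endian, 64-bit).
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")   # Elf64_Ehdr, 0x40 bytes
SHDR = struct.Struct("<IIQQQQIIQQ")         # Elf64_Shdr, 0x40 bytes
SHT_STRTAB, SHT_DYNSYM = 3, 11

def parse_ehdr(data):
    """Return the section table offset, entry size, count and string table index."""
    f = EHDR.unpack_from(data, 0)
    if f[0][:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    return {"e_shoff": f[6], "e_shentsize": f[11], "e_shnum": f[12], "e_shstrndx": f[13]}

def parse_shdrs(data, ehdr):
    """Return (sh_name, sh_type, sh_offset, sh_size) for every section header."""
    out = []
    for i in range(ehdr["e_shnum"]):
        f = SHDR.unpack_from(data, ehdr["e_shoff"] + i * ehdr["e_shentsize"])
        out.append((f[0], f[1], f[4], f[5]))
    return out

def section_name(data, shdrs, ehdr, sh_name):
    """Look up a name in the SHT_STRTAB section selected by e_shstrndx."""
    _, _, off, size = shdrs[ehdr["e_shstrndx"]]
    table = bytes(data[off:off + size])
    end = table.index(b"\0", sh_name)       # names are NUL-terminated
    return table[sh_name:end].decode("ascii")

# Synthetic image: header, section name table at 0x40, three section headers at 0x60.
names = b"\0.dynsym\0.shstrtab\0"
e_shoff = 0x60
image = bytearray(e_shoff + 3 * SHDR.size)
EHDR.pack_into(image, 0, b"\x7fELF\x02\x01\x01", 0, 0, 1, 0, 0,
               e_shoff, 0, EHDR.size, 0, 0, SHDR.size, 3, 2)
image[0x40:0x40 + len(names)] = names
SHDR.pack_into(image, e_shoff + 1 * SHDR.size, 1, SHT_DYNSYM, 0, 0, 0, 0, 0, 0, 0, 0)
SHDR.pack_into(image, e_shoff + 2 * SHDR.size, 9, SHT_STRTAB, 0, 0, 0x40, len(names), 0, 0, 0, 0)

ehdr = parse_ehdr(image)
shdrs = parse_shdrs(image, ehdr)
for sh_name, sh_type, off, size in shdrs:
    print(repr(section_name(image, shdrs, ehdr, sh_name)), sh_type)
```

Unlike the analysis above, the sketch reads e_shentsize from the header instead of assuming 0x40, which costs nothing and avoids surprises with unusual files.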

As already mentioned in the previous post, we can create a layout programmatically as well:

from Pro.Core import *
from Pro.UI import *

def buildElfLayout(obj, l):
    hname = "elf"
    hdr = CFFHeader()
    if hdr.LoadFromFile(hname) == False:
        return
    sopts = CFFSO_GCC | CFFSO_Pack1
    d = LayoutData()
    d.setTypeOptions(sopts)
    
    # add header
    ehdr = obj.MakeStruct(hdr, "_Elf64_Ehdr", 0, sopts)
    d.setColor(ntRgba(255, 0, 0, 70))
    d.setStruct(hname, "_Elf64_Ehdr")
    l.add(0, ehdr.Size(), d)

    # add sections (we assume that e_shentsize is 0x40)
    e_shoff = ehdr.Num("e_shoff")
    e_shnum = ehdr.Num("e_shnum")
    esects = obj.MakeStructArray(hdr, "_Elf64_Shdr", e_shoff, e_shnum, sopts)
    d.setStruct(hname, "_Elf64_Shdr")
    d.setArraySize(e_shnum)
    l.add(e_shoff, esects.TotalSize(), d)

hv = proContext().getCurrentView()
if hv.isValid() and hv.type() == ProView.Type_Hex:
    c = hv.getData()
    obj = CFFObject()
    obj.Load(c)
    lname = "ELF_ANALYSIS" # we could make the name unique
    l = proContext().getLayout(lname) 
    buildElfLayout(obj, l)
    # apply the layout to the current hex view
    hv.setLayoutName(lname)

Moreover, the imported types can be used for other operations not related to layouts. For instance, let’s write a few lines of code to print out the symbol names for this ELF:

from Pro.Core import *

obj = proCoreContext().currentScanProvider().getObject()

hdr = CFFHeader()
if hdr.LoadFromFile("elf"):
    syms = obj.MakeStructArray(hdr, "_Elf64_Sym", 0x39A0, 2179, CFFSO_GCC | CFFSO_Pack1)
    it = syms.iterator()
    while it.hasNext():
        s = it.next()
        name_offs = s.Num(0) + 0x105E8 # .dynstr offset
        name = obj.ReadUInt8String(name_offs, 0x1000)[0].decode("utf-8")
        print(name)

The output will be:

endgrent
__ctype_toupper_loc
iswlower
sigprocmask
__snprintf_chk
getservent
wcscmp
putchar
strcasecmp
localtime
mblen
__vfprintf_chk
; etc.
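To make the lookup itself explicit, here is a rough stdlib-only equivalent of the loop above, run on synthetic data. Each 24-byte Elf64_Sym begins with st_name, which is added to the string table offset to find a NUL-terminated name. The function name `symbol_names` and the two-entry table are assumptions for the sake of the example; the offsets 0x39A0 and 0x105E8 used above belong to the specific file I analyzed and are not reused here.

```python
import struct

# Elf64_Sym: st_name, st_info, st_other, st_shndx, st_value, st_size
SYM = struct.Struct("<IBBHQQ")

def symbol_names(data, symtab_off, count, strtab_off):
    """Yield the name of each symbol; st_name is an offset into the string table."""
    for i in range(count):
        st_name = SYM.unpack_from(data, symtab_off + i * SYM.size)[0]
        start = strtab_off + st_name
        end = data.index(b"\0", start)      # names are NUL-terminated
        yield data[start:end].decode("utf-8")

# Synthetic data: a two-entry symbol table followed by its string table.
strtab = b"\0endgrent\0putchar\0"
symtab = SYM.pack(1, 0x12, 0, 0, 0, 0) + SYM.pack(10, 0x12, 0, 0, 0, 0)
data = symtab + strtab

names = list(symbol_names(data, 0, 2, len(symtab)))
print(names)
```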

Remember that the advantages of CFFStructs lie not only in their dynamism and the ease of displaying them graphically, but also in their safety. Unlike a structure pointer in C, accessing a member of a CFFStruct carries no risk of a crash.

Today some final tests will be performed on the new version and if everything goes well, it will be released tomorrow or the day after. So stay tuned!

C++ Types: Under the Hood

In this post we’re going to explore the part of the Profiler SDK that deals with imported structures, as well as the C++ internals connected to the layout creation of structures/classes.

At first I thought about splitting the material into several posts, but in the end it’s probably better to have it all together for future reference.

Layouts

In the SDK a Layout is the class to be used when we need to create a graphical analysis of raw data. While we can create and handle headers from the UI, it is also possible to do it programmatically.

class LayoutInterval

    end
    start

class LayoutData

    arraySize() -> UInt32
    getColor() -> NTRgb
    getDescription() -> NTUTF8String
    getHeader() -> NTUTF8String
    getType() -> NTUTF8String
    setArraySize(UInt32 n)
    setColor(NTRgb rgba)
    setDescription(NTUTF8String const & description)
    setStruct(NTUTF8String const & hdr, NTUTF8String const & type)
    setTypeOptions(UInt32 opt)
    typeOptions() -> UInt32

class LayoutPair

    first
    second

class Layout

    add(MaxUInt offset, MaxUInt size, LayoutData data)
    add(LayoutInterval interval, LayoutData data)
    at(UInt32 i) -> LayoutPair
    at(LayoutPair const & lp) -> UInt32
    at(LayoutInterval interval) -> UInt32
    count() -> UInt32
    fromXml(NTUTF8String const & xstr) -> bool
    getMatches(MaxUInt offset, MaxUInt size) -> LayoutPairList
    getOverlappingWith(MaxUInt offset, MaxUInt size) -> LayoutPairList
    isModified() -> bool
    isNull() -> bool
    isValid() -> bool
    layoutName() -> NTString
    remove(MaxUInt offset, MaxUInt size)
    remove(LayoutInterval interval)
    renameLayout(NTString const & name) -> bool
    saveIfModified()
    setModified(bool b)
    toXml() -> NTUTF8String

Creating a layout is straightforward:

from Pro.Core import *

# create a new layout or retrieve an existing one from the project
layout = proCoreContext().getLayout("LAYOUT_NAME")
# create data
data = LayoutData()
data.setDescription("text")
data.setColor(ntRgba(0xFF, 0, 0, 0x70))
# add interval
layout.add(70, 30, data)

The data can be associated with a structure (or array of structures) as well. Please remember that the name of a header is always relative to the header sub-directory of the user directory. Saving the layout is not necessary: it’s automatically saved in the project.

Attaching a layout to a hex view is also very easy:

from Pro.UI import *

hv = proContext().getCurrentView()
if hv.type() == ProView.Type_Hex:
    hv.setLayoutName("LAYOUT_NAME")

Of course, layouts can be used for operations not related to graphical analysis as well.

Headers

Headers are part of the CFF Core and as such the naming convention of the CFFHeader class isn’t camel-case.

class CFFHeaderAliasData

    category
    name
    type
    value
    vtype

class CFFHeaderStructData

    name
    schema
    type

class CFFHeaderTypeDefData

    name
    type

class CFFHeader

    AC_Define
    AC_Enum
    AC_Last
    AVT_Integer
    AVT_Last
    AVT_Real
    AVT_String

    BeginEdit()
    Close()
    EndEdit()
    Equals(CFFHeader s) -> bool
    static GetACName(int category) -> char const *
    static GetAVTName(int vtype) -> char const *
    GetAliasCount() -> UInt32
    GetAliasData(UInt32 i) -> CFFHeaderAliasData
    GetStructBaseData(UInt32 i) -> CFFHeaderStructData
    GetStructCount() -> UInt32
    GetStructData(UInt32 i) -> CFFHeaderStructData
    GetStructData(char const * name) -> CFFHeaderStructData
    GetTypeDefCount() -> UInt32
    GetTypeDefData(UInt32 i) -> CFFHeaderTypeDefData
    InsertAlias(char const * name, int category, char const * type, int vtype, char const * value)
    InsertStruct(char const * name, char const * type, char const * schema)
    InsertTypeDef(char const * name, char const * type)
    IsModified() -> bool
    IsNull() -> bool
    IsValid() -> bool
    LoadFromFile(NTString const & name) -> bool
    LoadFromXml(NTXml xml) -> bool
    LoadFromXml(NTUTF8String const & xml) -> bool
    SetModified(bool b)

A CFFHeader represents an abstract database in which structures/classes and other things are stored. While we won’t use most of its methods, some of them are very useful for common operations.

Let’s say we want to retrieve a specific structure from a header and use it.

from Pro.Core import *

def output(s):
    out = proTextStream()
    s.Dump(out)
    print(out.buffer)

obj = proCoreContext().currentScanProvider().getObject()
hdr = CFFHeader()
if hdr.LoadFromFile("WinNT"):
    s = obj.MakeStruct(hdr, "_IMAGE_DOS_HEADER", 0, CFFSO_Pack1)
    output(s)

The output of this snippet is:

e_magic   : 5A4D
e_cblp    : 0090
e_cp      : 0003
e_crlc    : 0000
e_cparhdr : 0004
e_minalloc: 0000
e_maxalloc: FFFF
e_ss      : 0000
e_sp      : 00B8
e_csum    : 0000
e_ip      : 0000
e_cs      : 0000
e_lfarlc  : 0040
e_ovno    : 0000
e_res.0   : 0000
e_res.1   : 0000
e_res.2   : 0000
e_res.3   : 0000
e_oemid   : 0000
e_oeminfo : 0000
e_res2.0  : 0000
e_res2.1  : 0000
e_res2.2  : 0000
e_res2.3  : 0000
e_res2.4  : 0000
e_res2.5  : 0000
e_res2.6  : 0000
e_res2.7  : 0000
e_res2.8  : 0000
e_res2.9  : 0000
e_lfanew  : 000000F8

We can specify the following options when retrieving a structure:

CFFSO_EndianDefault
CFFSO_EndianLittle
CFFSO_EndianBig
CFFSO_EndiannessDefault
CFFSO_EndiannessLittle
CFFSO_EndiannessBig

CFFSO_PointerDefault
CFFSO_Pointer16
CFFSO_Pointer32
CFFSO_Pointer64

CFFSO_PackNone
CFFSO_Pack1
CFFSO_Pack2
CFFSO_Pack4
CFFSO_Pack8
CFFSO_Pack16

CFFSO_NoCompiler
CFFSO_VC
CFFSO_GCC
CFFSO_Clang

These are the same options which are available from the UI when adding a structure to a layout.

When options are not specified, they default to the default structure options of the object. It’s possible to specify the default structure options with this method:

SetDefaultStructOptions(UInt32 options)

We’ll see later the implications of the various flags.

When I said that a CFFHeader represents an abstract database, I meant that it is not really bound to a specific format internally. All it cares about is that data is retrieved or set. The standard format used by headers is SQLite and you’ll need to use that format when creating layouts associated with structures. However, when using structures from Python it can be handy to avoid an associated header file. When the number of structures is very limited and you don’t need to write to them or perform other complex operations, structures can be stored in an XML string. In fact, the internal format of structures is XML. Let’s take a look at one:

<r id="_IMAGE_DOS_HEADER" type="struct">
  <f id="e_magic" type="unsigned short"/>
  <f id="e_cblp" type="unsigned short"/>
  <f id="e_cp" type="unsigned short"/>
  <f id="e_crlc" type="unsigned short"/>
  <f id="e_cparhdr" type="unsigned short"/>
  <f id="e_minalloc" type="unsigned short"/>
  <f id="e_maxalloc" type="unsigned short"/>
  <f id="e_ss" type="unsigned short"/>
  <f id="e_sp" type="unsigned short"/>
  <f id="e_csum" type="unsigned short"/>
  <f id="e_ip" type="unsigned short"/>
  <f id="e_cs" type="unsigned short"/>
  <f id="e_lfarlc" type="unsigned short"/>
  <f id="e_ovno" type="unsigned short"/>
  <f id="e_res" type="unsigned short [4]"/>
  <f id="e_oemid" type="unsigned short"/>
  <f id="e_oeminfo" type="unsigned short"/>
  <f id="e_res2" type="unsigned short [10]"/>
  <f id="e_lfanew" type="long"/>
</r>

We can inspect the format of a structure stored in a header from the Header Manager in the Explore tab by double clicking on it. But we can also avoid creating a header altogether and output the schema of parsed structures directly when importing them from C++. Just check ‘Test mode’ and as ‘Output’ select ‘schemas’.

Output schemas

Let’s import a simple structure such as:

struct A
{
    int a;
};

The output will be:

<r id="A" type="struct">
  <f id="a" type="int"/>
</r>

To use this structure from Python we can write the following code:

schema = """
<header>
  <r id="A" type="struct">
    <f id="a" type="int"/>
  </r>
</header>
"""

hdr = CFFHeader()
if hdr.LoadFromXml(schema):
    s = obj.MakeStruct(hdr, "A", 0)
    output(s)

As you can see it’s very simple. I’ll use this method for the examples in the rest of the post, because they’re just examples and there’s no point in creating a header file for them.
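Since these schemas are plain XML, they can also be inspected with Python’s standard library before handing them to the SDK. The following is only a sketch: `list_structs` is a name I made up, and I’m assuming that field elements are written as self-closing `<f .../>` tags inside each `<r>` record.

```python
import xml.etree.ElementTree as ET

schema = """
<header>
  <r id="A" type="struct">
    <f id="a" type="int"/>
  </r>
</header>
"""

def list_structs(xml_text):
    """Map each record id to (kind, [(field id, field type), ...])."""
    root = ET.fromstring(xml_text.strip())
    return {
        r.get("id"): (r.get("type"),
                      [(f.get("id"), f.get("type")) for f in r.iter("f")])
        for r in root.iter("r")
    }

print(list_structs(schema))
```

A quick check like this catches malformed XML early, since `ET.fromstring` raises a `ParseError` on a schema that isn’t well-formed.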

Pointers

CFFSO_Pointer16
CFFSO_Pointer32
CFFSO_Pointer64

As a rule of thumb, if a structure contains a pointer (or a vtable pointer) it is always a good idea to specify the desired size. When the size is omitted both in the explicit options and in the default structure options, the size will be set to the default pointer size of the object, which, apart from PEObjects and MachObjects, will always be 32 bits.
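To see why the pointer size matters, consider a hypothetical `struct P { int a; void *p; }` packed to 1 byte. The sketch below uses Python’s `struct` module rather than the SDK, and the structure and sample bytes are invented for illustration: the same raw data yields a different size and a different pointer value for each pointer width.

```python
import struct

# struct P { int a; void *p; } with 1-byte packing, little-endian.
LAYOUTS = {
    16: struct.Struct("<iH"),
    32: struct.Struct("<iI"),
    64: struct.Struct("<iQ"),
}

raw = bytes(range(1, 13))  # 12 bytes of sample data: 01 02 ... 0C

for bits, layout in LAYOUTS.items():
    a, p = layout.unpack_from(raw, 0)
    print(f"{bits}-bit pointers: size={layout.size} p={p:#x}")
```

Any field that follows the pointer would also shift, which is why a wrong pointer size tends to corrupt everything after it.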

Endianness

CFFSO_EndianLittle
CFFSO_EndianBig
# or
CFFSO_EndiannessLittle
CFFSO_EndiannessBig

When endianness is not specified it will be set to the default of the object. While internally it’s already possible to have individual fields with different endianness, an extra XML field attribute to specify it will be added in the future.
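As a quick reminder of what the flag changes, here is a stdlib illustration (not SDK code) that reads the first four bytes of the DOS header dumped earlier (e_magic 5A4D, e_cblp 0090) as a single 32-bit value with both byte orders:

```python
import struct

raw = bytes.fromhex("4D5A9000")  # e_magic and e_cblp as they appear on disk

little = struct.unpack("<I", raw)[0]
big = struct.unpack(">I", raw)[0]
print(f"little: {little:08X}")  # 00905A4D
print(f"big:    {big:08X}")     # 4D5A9000
```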

Arrays

The first thing to say is that there’s a difference between an array of top level structures and an array of fields. Creating a top level array of structures is easy:

s = obj.MakeStructArray(hdr, "A", 0, 10)

The support of arrays is somewhat limited. Multidimensional arrays are only partially supported, in the sense that they will be converted to a single dimension. For instance:

struct A
{
    int a[10][10];
};

Or in XML:

<r id="A" type="struct">
  <f id="a" type="int [10][10]"/>
</r>

This will be converted to:

a.0 : 00905A4D
a.1 : 00000003
a.2 : 00000004
a.3 : 0000FFFF
a.4 : 000000B8
a.5 : 00000000
a.6 : 00000040
a.7 : 00000000
a.8 : 00000000
a.9 : 00000000
a.10: 00000000
a.11: 00000000
a.12: 00000000

; etc.

Also notice that to access an array element in a CFFStruct the syntax to use is not “a[15]” but “a.15”, e.g.:

print(s.Str("a.15"))
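If you start from a C expression such as a[1][5], you need the flattened index to build the member name. Assuming the flattening simply follows C memory order (row-major), which is what I’d expect but is an assumption here, the conversion can be sketched like this; `flat_name` is a hypothetical helper, not part of the SDK:

```python
def flat_name(field, indices, dims):
    """Return the 'field.N' name for a multidimensional index, row-major."""
    flat = 0
    for idx, dim in zip(indices, dims):
        if not 0 <= idx < dim:
            raise IndexError(f"index {idx} out of range for dimension {dim}")
        flat = flat * dim + idx
    return f"{field}.{flat}"

print(flat_name("a", (1, 5), (10, 10)))   # a.15
print(flat_name("a", (9, 9), (10, 10)))   # a.99
```

The resulting string can then be passed to an accessor that takes a member name.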

Sub-structures

The only thing to mention about Sub-structures is that complex sub-types are always dumped separately, e.g.:

struct A
{
    int a;
    struct SUB
    {
        int sub;
    } b;
};

In XML:

<r id="A::SUB" type="struct">
  <f id="sub" type="int"/>
</r>

<r id="A" type="struct">
  <f id="a" type="int"/>
  <f id="b" type="struct A::SUB"/>
</r>

In Python:

schema = """
<header>
  <r id="A::SUB" type="struct">
    <f id="sub" type="int"/>
  </r>

  <r id="A" type="struct">
    <f id="a" type="int"/>
    <f id="b" type="struct A::SUB"/>
  </r>
</header>
"""

hdr = CFFHeader()
if hdr.LoadFromXml(schema):
    s = obj.MakeStruct(hdr, "A", 0)
    output(s)

The output:

a    : 00905A4D
b.sub: 00000003

Since it is a separate type, we can also use ‘A::SUB’ without its parent.

A new thing we’ve just seen is the presence of multiple structures in a single XML header. I’ve pasted the whole Python code once again just for clarity. In the next examples I won’t repeat it, since the Python code never changes; only the header string does.
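The dotted names in the dump (‘b.sub’) come from walking the nested types. To show the idea without the SDK, here is a toy walker over a hand-written type table. Everything in it (`TYPES`, `dump`, the int-only handling, little-endian with 1-byte packing) is an assumption made for the example and says nothing about how CFFStructs are implemented.

```python
import struct

# Hypothetical description of: struct A { int a; struct SUB { int sub; } b; }
TYPES = {
    "A::SUB": [("sub", "int")],
    "A": [("a", "int"), ("b", "A::SUB")],
}

def dump(type_name, data, offset=0, prefix=""):
    """Return ([(dotted name, value), ...], next offset)."""
    rows = []
    for name, ftype in TYPES[type_name]:
        if ftype in TYPES:
            # Nested structure: recurse with a dotted prefix.
            sub, offset = dump(ftype, data, offset, prefix + name + ".")
            rows.extend(sub)
        else:
            # Only 'int' is handled in this sketch.
            rows.append((prefix + name, struct.unpack_from("<i", data, offset)[0]))
            offset += 4
    return rows, offset

raw = bytes.fromhex("4D5A900003000000")
rows, size = dump("A", raw)
for name, value in rows:
    print(f"{name:<5}: {value:08X}")
```

Because ‘A::SUB’ has its own entry in the table, `dump("A::SUB", raw)` works on its own as well, mirroring the point about using the sub-type without its parent.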

Unions

Unions, just like sub-structures, are fully supported. The only thing to keep in mind is that when we have a top level union, meaning one not contained in another structure, such as:

union A
{
    int a;
    short b;
};

Then to access its members it is necessary to add a ‘u.’ prefix. The reason for this is that CFFStructs support unions only as members, so the union above will result in a CFFStruct with a union member called ‘u’.

u.a: 00905A4D
u.b: 5A4D
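
To see why both members show the same leading bytes, here is a minimal sketch outside the Profiler, using only Python's standard struct module. It assumes the data is the first four bytes of a typical PE file (4D 5A 90 00) read as little-endian; the buffer and variable names are purely illustrative and not part of the SDK.

```python
import struct

# Assumed sample data: the first bytes of a typical PE file ("MZ" header).
data = bytes([0x4D, 0x5A, 0x90, 0x00])

# Every union member starts at offset 0, so the two reads overlap.
(a,) = struct.unpack_from("<I", data, 0)  # int a   (4 bytes, little-endian)
(b,) = struct.unpack_from("<H", data, 0)  # short b (2 bytes, little-endian)

print("u.a: %08X" % a)
print("u.b: %04X" % b)
```

Since ‘b’ is only two bytes wide, it sees just the low half of the value read by ‘a’.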

Anonymous types

Anonymous types are only partially supported in the sense that they are given a name when imported. A type such as the following:

struct A
{
    union
    {
        int a;
        int b;
    } u;
};

Results in the following XML:

<r id="A::_Union_0" type="union">
  <f id="a" type="int"/>
  <f id="b" type="int"/>
</r>

<r id="A" type="struct">
  <f id="u" type="union A::_Union_0"/>
</r>

As you can see, a ‘_Type_’ + number naming convention has been used to rename anonymous types (here the type is a union, hence ‘_Union_0’). The first character (‘_’) in the name represents the default anonymous prefix. This prefix is customizable. If a typedef is found for an anonymous type, then the new name for that type will be created by using the anonymous prefix + the typedef name.

Bit-fields

Bit-fields are fully supported.

struct A
{
    int a : 1;
    int b : 4;
};

XML:

<r id="A" type="struct">
  <f id="a" type="int" bits="1"/>
  <f id="b" type="int" bits="4"/>
</r>

Output:

a: 01
b: 06
 : 0482D2

The unnamed field at the end represents the unused bits given the field size: in this case we have an ‘int’ type and we’ve used only 5 bits of it, so 27 bits remain.
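
The numbers in the output can be reproduced by hand. The following is only a sketch, not the Profiler's implementation: it assumes a 32-bit ‘int’, bits allocated starting from the least significant one (which is what the output above suggests), and raw unsigned values. The function name ‘split_bitfields’ is made up for this illustration.

```python
def split_bitfields(value, widths, total_bits=32):
    """Split an unsigned integer into bit-fields, lowest bits first.

    Returns one (width, value) pair per field, plus a final pair for
    the unused bits when the widths do not fill total_bits.
    """
    fields = []
    shift = 0
    for width in widths:
        mask = (1 << width) - 1
        fields.append((width, (value >> shift) & mask))
        shift += width
    if shift < total_bits:
        rest_width = total_bits - shift
        rest_mask = (1 << rest_width) - 1
        fields.append((rest_width, (value >> shift) & rest_mask))
    return fields

# Assumed sample: the same 32-bit value seen in the earlier outputs.
fields = split_bitfields(0x00905A4D, [1, 4])
for name, (width, val) in zip(["a", "b", ""], fields):
    print("%s: %X (%d bits)" % (name, val, width))
```

With 0x00905A4D as input, the lowest bit gives ‘a’, the next four bits give ‘b’, and the remaining 27 bits form the unnamed field.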

There are significant differences in how compilers handle bit-fields. Visual C++ behaves differently than GCC/Clang. Some of the differences are summarized in this message by Richard W.M. Jones.

Another important difference I noticed is how bit fields are coalesced when the type changes, e.g.:

struct A
{
    int a : 1;
    short b : 1;
    int c : 1;
};

Without going into how they are coalesced, the thing to remember is that the Profiler handles all these cases, but you need to specify the compiler to obtain the correct result.

Namespaces

Namespaces are fully supported.

namespace N
{

struct A
{
    int a;
};

}

Results in:

<r id="N::A" type="struct">
  <f id="a" type="int"/>
</r>

Moreover, just as in C++, we can use namespaces to encapsulate #include directives.

namespace N
{

#include <Something>

}

This will cause all the types declared in ‘Something’ to be prefixed by the namespace (‘N::’). This can be very handy when we want to include types with the same name into the same header file.

Inheritance

Inheritance is fully supported.

struct A
{
    int a;
};

struct B : public A
{
    int b;
};

XML:

<r id="A" type="struct">
  <f id="a" type="int"/>
</r>

<r id="B" type="struct">
  <b>
    <b type="struct A" access="public"/>
  </b>
  <f id="b" type="int"/>
</r>

Output:

a: 00905A4D
b: 00000003

Same with multiple inheritance:

<r id="A" type="struct">
  <f id="a" type="int"/>
</r>

<r id="B" type="struct">
  <f id="b" type="int"/>
</r>

<r id="C" type="struct">
  <b>
    <b type="struct A" access="public"/>
    <b type="struct B" access="public"/>
  </b>
  <f id="c" type="int"/>
</r>

Output:

a: 00905A4D
b: 00000003
c: 00000004
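
The order of the members in these outputs follows from placing base classes first and the type's own fields after them. Here is a small sketch of that idea in plain Python. It is not how the Profiler works internally: the ‘TYPES’ dictionary and the ‘flatten’ function are hypothetical, and the sketch assumes 4-byte little-endian fields, no virtual functions and no padding.

```python
import struct

# Hypothetical description of the types above: base classes, then own fields.
TYPES = {
    "A": {"bases": [], "fields": ["a"]},
    "B": {"bases": [], "fields": ["b"]},
    "C": {"bases": ["A", "B"], "fields": ["c"]},
}

def flatten(name):
    """Return member names in memory order: bases first, then own fields."""
    members = []
    for base in TYPES[name]["bases"]:
        members.extend(flatten(base))
    members.extend(TYPES[name]["fields"])
    return members

# Assumed sample data: three little-endian 32-bit values.
data = bytes.fromhex("4D5A9000" "03000000" "04000000")
names = flatten("C")
values = struct.unpack_from("<%dI" % len(names), data, 0)
for n, v in zip(names, values):
    print("%s: %08X" % (n, v))
```

As soon as virtual functions enter the picture this simple model no longer holds, which is what the next section is about.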

VTables

The presence of virtual table pointers in structures which require them is fully supported. Let’s take for instance:

struct A
{
    virtual void v() { }
    int a;
};

XML:

<r id="A" type="struct">
  <mv id="v" type="void (void)"/>
  <f id="a" type="int"/>
  <m id="operator=" type="struct A &(const struct A &)"/>
  <m id="operator=" type="struct A &(struct A &&)"/>
  <m id="~A" type="void (void)"/>
</r>

Output:

__vtable_ptr_0: 00905A4D
a             : 00000003

Let’s see an example with multiple inheritance:

struct A
{
    virtual void va() { }
    int a;
};

struct B
{
    virtual void vb() { }
    int b;
};

struct C : public A, public B
{
    int c;
};

Output:

__vtable_ptr_0: 00905A4D
__vtable_ptr_1: 00000003
a             : 00000004
b             : 0000FFFF
c             : 000000B8

When virtual tables are involved it is very important to specify the compiler, because things can vary a great deal between VC++ and GCC/Clang.

Virtual Inheritance

Virtual inheritance is fully supported. It is a C++ feature meant for scenarios which involve multiple inheritance with a common base class.

Let’s take the complex case of:

struct A
{
    int a;
    virtual void va() {}
};

struct B : public virtual A
{
    virtual void vb() {}
};

struct B2
{
    virtual void vb2() {}
};

struct C : public virtual A, public B
{
    int b;
    virtual void vc() {}
};

struct TOP
{
    int top;
    C c;
    virtual void vtop() {}
};

Output (Visual C++):

__vtable_ptr_0  : 00905A4D
top             : 00000003
c.__vtable_ptr_0: 00000004
c.__vtable_ptr_1: 0000FFFF
c.__vtable_ptr_2: 000000B8
c.b             : 00000000
c.a             : 00000040

Output (GCC):

__vtable_ptr_0  : 00905A4D
top             : 00000003
c.__vtable_ptr_0: 00000004
c.b             : 0000FFFF
c.a             : 000000B8

As you can see the layout differs from Visual C++ to GCC. Another thing to notice is that members of virtual base classes are appended at the end. There’s a very good presentation by Igor Skochinsky on C++ decompilation you can watch for more information.

Field alignment

Field alignment is an important factor. Structures which are not subject to packing constraints are aligned to the size of their biggest native member. It’s more complex than this, because sub-structures influence parent structures but not vice versa. Suffice it to say that there are some internal gotchas, but the Profiler should handle all cases correctly.
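
The "sub-structures influence parent structures but not vice versa" part can be observed with Python's standard ctypes module, which follows the native C alignment rules of the platform. This is only an illustration outside the Profiler; the structure names are made up, and the sizes mentioned below assume a platform where a 32-bit integer is aligned to 4 bytes (the common case).

```python
import ctypes

class Inner(ctypes.Structure):
    _fields_ = [("value", ctypes.c_uint32)]

class Outer(ctypes.Structure):
    # The 4-byte member of Inner forces padding after 'tag'.
    _fields_ = [("tag", ctypes.c_uint8), ("inner", Inner)]

class Small(ctypes.Structure):
    _fields_ = [("flag", ctypes.c_uint8)]

class Parent(ctypes.Structure):
    # Parent is aligned to 4 bytes, but Small keeps its own size.
    _fields_ = [("value", ctypes.c_uint32), ("small", Small)]

print("Outer.inner offset:", Outer.inner.offset)
print("sizeof(Outer):", ctypes.sizeof(Outer))
print("sizeof(Small):", ctypes.sizeof(Small))
print("sizeof(Parent):", ctypes.sizeof(Parent))
```

‘Outer’ grows because of the alignment required by its sub-structure, while ‘Small’ stays one byte in size even though its parent is aligned to four.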

Packing

CFFSO_Pack1
CFFSO_Pack2
CFFSO_Pack4
CFFSO_Pack8
CFFSO_Pack16

When a packing constraint is applied, fields are aligned to either the field size or the packing, whichever is less. A packing constraint of 1 is essential if we want to read raw data without any kind of padding between fields. For instance, PE structures in WinNT.h are all pragma packed to 1, so we must specify the same packing when using them.
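
The rule can be written down in a few lines. What follows is a simplified sketch, not the Profiler's code: it only handles scalar fields whose natural alignment equals their size, and the function name ‘layout’ and the sample struct are hypothetical.

```python
def layout(field_sizes, pack=None):
    """Return (offsets, total_size) for a list of scalar field sizes.

    Assumes each field's natural alignment equals its size; with a
    packing constraint the alignment becomes min(size, pack).
    """
    offsets = []
    offset = 0
    struct_align = 1
    for size in field_sizes:
        align = size if pack is None else min(size, pack)
        struct_align = max(struct_align, align)
        offset = (offset + align - 1) // align * align  # round up
        offsets.append(offset)
        offset += size
    # The total size is rounded up to the structure's own alignment.
    total = (offset + struct_align - 1) // struct_align * struct_align
    return offsets, total

# Hypothetical struct: char, int, short
sizes = [1, 4, 2]
for pack in (None, 1, 2, 4):
    print(pack, layout(sizes, pack))
```

With a packing of 1 the three fields sit back to back and the structure is 7 bytes long; without packing the same fields take 12 bytes.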

Templates

And to finish, a little treat: C++ templates. Let’s take for instance:

template <typename T>
struct A
{
    T a;
};

template <typename T>
struct B
{
    T b;
};

XML:

<r id="A" type="struct" tparams="T">
  <f id="a" type="T"/>
</r>

<r id="B" type="struct" tparams="T">
  <f id="b" type="T"/>
</r>

We can specify template parameters following the C++ syntax:

s = obj.MakeStruct(hdr, "B<A<int>>", 0)

Output:

b.a: 00905A4D

So, even nested templates are supported. 😉

C++ Types: Introduction

As announced previously, the upcoming 0.9.7 version of the Profiler represents a milestone in the development road map. We’re excited to present to you an awesome set of new features. In fact, the ground to cover is so vast that one post is not nearly enough. Throughout this week I’ll write some posts to cover the basics and this will allow for enough time to beta test the new version before reaching a release candidate.

Let’s start with an awesome image:

Presentation

Does it look like a Clang based tool to parse C++ sources and extract type information? If yes, then that’s exactly it!

To sum it up very briefly, the Profiler is now able to extract C++ types such as classes and structures and use these types both in the UI and in Python.

Add structure dialog

Of course, there’s much more to it. The layout of C++ types is a complex matter and doesn’t just involve supporting simple data structures. This post is just an introduction; the next ones will focus on topics such as: endianness, pointers, arrays, sub-structures, unions, bit-fields, inheritance, virtual tables, virtual inheritance, anonymous types, alignment, packing and templates. Yes, you read correctly: templates. 🙂

And apart from the implications of C++ types themselves, there’s the SDK part of the Profiler which will also require some dedicated posts. In this introduction I’m going to show a very simple flow and one of the many possible use cases.

You probably have noticed that the code in the screenshot above belongs to WinNT.h. Let’s see how to import the types in this header quickly. Usually we could parse all the headers of a framework with a few clicks, but while Clang is ideal to parse both Linux and OS X sources, it has difficulty with some Visual C++ extensions which are completely invalid C++ code. So rather than importing the whole Windows SDK we just limit ourselves to a part of WinNT.h.

I have added some predefines for Windows types (we could also include WinDef.h):

#define BYTE unsigned char
#define WORD unsigned short
#define DWORD unsigned int
#define __int64 long long
#define LONG long
#define CHAR char
#define WCHAR short
#define ULONGLONG unsigned long long
#define UNALIGNED
#define SHORT short
#define NTAPI
#define VOID void
#define PVOID void *
#define BOOL unsigned int
#define BOOLEAN unsigned int

Then I just copied the header into the import tool. Usually this isn’t necessary, because we can set up the include directories from the UI and then just use #include directives, but since we need to modify the header to remove invalid C++ extensions, it makes sense to paste it.

The beginning of the code:

HEADER_START("WinNT");

typedef struct _GUID {
    unsigned long  Data1;
    unsigned short Data2;
    unsigned short Data3;
    unsigned char  Data4[ 8 ];
} GUID;

typedef GUID CLSID;

typedef struct _IMAGE_DOS_HEADER {      // DOS .EXE header
    WORD   e_magic;                     // Magic number
    WORD   e_cblp;                      // Bytes on last page of file
    WORD   e_cp;                        // Pages in file
    WORD   e_crlc;                      // Relocations
    WORD   e_cparhdr;                   // Size of header in paragraphs
    WORD   e_minalloc;                  // Minimum extra paragraphs needed
    WORD   e_maxalloc;                  // Maximum extra paragraphs needed
    WORD   e_ss;                        // Initial (relative) SS value
    WORD   e_sp;                        // Initial SP value
    WORD   e_csum;                      // Checksum
    WORD   e_ip;                        // Initial IP value
    WORD   e_cs;                        // Initial (relative) CS value
    WORD   e_lfarlc;                    // File address of relocation table
    WORD   e_ovno;                      // Overlay number
    WORD   e_res[4];                    // Reserved words
    WORD   e_oemid;                     // OEM identifier (for e_oeminfo)
    WORD   e_oeminfo;                   // OEM information; e_oemid specific
    WORD   e_res2[10];                  // Reserved words
    LONG   e_lfanew;                    // File address of new exe header
  } IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;

// etc. etc.

Did you notice the HEADER_START macro?

HEADER_START("WinNT");

This tells our parser that the C++ types following this directive will be dumped into the header “WinNT.cphdr”. This file is relative to the header directory, a sub-directory of the user data directory. A HEADER_END directive also exists; it is equivalent to invoking the start directive with an empty string. To give you a better idea of how these directives work, take a look at this snippet:

// the types in A.h won't be dumped to a header file

#include <A.h>

HEADER_START("BC");

// the types of B.h and C.h will end up in BC.cphdr
#include <B.h>
#include <C.h>

HEADER_END();

// what follows is not dumped to a header file

If you specify the “#” string in the start directive, the types which follow will be dumped to the ‘this’ header. This is a special header which lives in the current project, so that you can pass the Profiler project to a colleague and it will already contain the necessary types without having to send extra files.

Back to the importing process, we click on ‘Import’ and that’s it. If Clang encounters C++ errors, we can fix them thanks to the diagnostic information:

Diagnostic information

We can explore the created header file from the ‘Explore’ tab.

Explore header

Now let’s use the header to analyze a PE file inside of a Zip archive.

Add structure to layout

Please notice that I’m adding the types with a packing of 1: PE structures are pragma packed to 1.

What you see applied to the hex view is a layout. In a layout you can insert structures or intervals (a segment of data with a description and a color).

A layout can even be created programmatically and be attached to a hex view as we’ll see in some other post. The implementation of layouts in the Profiler is quite cool, because they are standalone objects. Layouts are not really bound to a hex view: a view just chooses to be attached to a layout. This means that you can share a single layout among different hex views and changes will reflect in all attached views.

Multi-view layout

And while I didn’t mention it, the table view below on the left is the layout inspector. Its purpose is to let you inspect the structures associated to a layout at a particular position. Since layouts allow for overlapping structures, the inspector shows all structures associated in the current range.

Multi-structure inspection

But what if you go somewhere else and return to the hex view? The layout will be gone. Of course, you could press Ctrl+Alt+L and re-attach the layout to the view. There are two other options: navigate back or create a bookmark!

Bookmark

The created bookmark when activated will jump to the right entry and associate the layout for us. Remember that changing the name of a layout invalidates the bookmark.

That’s all for now. And we’ve only scratched the surface… 🙂

Portable Application

This is a very small addition to the upcoming 0.9.7 version of the Profiler, but nonetheless it can be handy. There are occasions in which it is necessary to copy the Profiler from one Windows system to another. Currently this involves copying the user settings: the ones stored in AppData and those in the Windows registry. In this post we’ll see how to create a standalone version of the Profiler which leaves no stuff around in the system. Here’s how:

1 – Copy the directory of the Profiler from the installation path to a writable location.
2 – Create a sub-directory named “user”.
3 – Run the Profiler.

That’s it. Now all your settings will be stored under the user directory.

Remember to create your config files for extensions, actions, key providers, etc. under user/config/, as the config directory under the root one contains files which might be overwritten during an update.

Python can be stored as a sub-directory and its path, once set, will automatically be recognized as a relative one.

There won’t be many posts this month: a major update is under development and we hope it will be ready at the end of the month. Stay tuned as something extremely cool is coming. 😉

News for version 0.9.6

The new 0.9.6 version of the Profiler is out. The main new feature is the support for Mach-O files. Since this feature stands on its own, it did make sense to postpone other features to the next version and in the meanwhile let our users benefit from this addition.

Here’s the changelist:

added support for Mach-O files
added support for fat/universal binaries
added support for Apple code signatures
– exposed DemangleSymbolName to Python

The DemangleSymbolName function demangles both VC++ and GCC symbols. Its use is straightforward:

from Pro.Core import DemangleSymbolName
demangled = DemangleSymbolName("__ZNK8OSObject14getRetainCountEv")
print(demangled)
# outputs: OSObject::getRetainCount() const

Mach-O support (including Universal Binaries and Apple Code Signatures)

The reason behind this addition is that before undertaking the next big step in the road map of the Profiler there was some spare time to dedicate to some extra features for the upcoming 0.9.6 version. There have also been some customer requests for Mach-O support, so we hope that this will satisfy their request. While there are still some things left which would be useful and nice to add to the Mach-O support, they are not many.

Layout

The first screenshot as you can see features the Mach-O layout.

The logic of Mach-Os starts with their load commands which describe everything else:

Load commands
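
For readers who have never looked at the format, here is a rough idea of what walking the load commands means. This is a sketch based on the public Mach-O structures (mach_header_64 and load_command), not on the Profiler SDK; it only handles little-endian 64-bit images, and the sample image is synthetic and built just for this illustration.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF
LC_SYMTAB = 0x2
LC_MAIN = 0x80000028

HEADER_FORMAT = "<IiiIIIII"  # mach_header_64: 8 fields, 32 bytes

def walk_load_commands(data):
    """Yield (offset, cmd, cmdsize) for each load command of a
    little-endian 64-bit Mach-O image."""
    fields = struct.unpack_from(HEADER_FORMAT, data, 0)
    magic, ncmds = fields[0], fields[4]
    if magic != MH_MAGIC_64:
        raise ValueError("not a little-endian 64-bit Mach-O")
    offset = struct.calcsize(HEADER_FORMAT)
    for _ in range(ncmds):
        # Every load command starts with its type and its total size.
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        yield offset, cmd, cmdsize
        offset += cmdsize

# Synthetic image for illustration only: a header followed by two commands.
symtab = struct.pack("<IIIIII", LC_SYMTAB, 24, 0, 0, 0, 0)
main = struct.pack("<IIQQ", LC_MAIN, 24, 0x1000, 0)
header = struct.pack(HEADER_FORMAT, MH_MAGIC_64, 0x01000007, 3, 2, 2,
                     len(symtab) + len(main), 0, 0)

commands = list(walk_load_commands(header + symtab + main))
for off, cmd, size in commands:
    print("offset %d: cmd 0x%X, size %d" % (off, cmd, size))
```

Each command carries its own size, so the list can be walked without understanding every command type; segments, symbols and entry points are then found by interpreting the individual commands.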

Segments and sections:

Segments

Entry points (LC_MAIN, LC_UNIXTHREAD):

Entry points

Symbols:

Symbols

Then the LC_DYLD_INFO load command can describe some VM operations for rebasing and binding:

Rebase

Binding:

Bind

Also, the DyldInfo export section is represented as a tree, just as it is in the file:

Export

Function starts:

Function starts

Of course, Mach-O support makes little sense without Fat/Universal Binary support:

Fat/Universal Binary

While the upcoming version won’t yet support validation of Apple Code Signatures embedded in Mach-Os, it’s already possible to inspect their format and the embedded certificates.

Apple Code Signature

As usual all the formats added have been exposed to Python as well. I paste some of the SDK class documentation here excluding constants, which are just too many.

class MachObject
    : CFFObject

    AddressToOffset(MaxUInt address) -> MaxUInt
    AddressToSection(MaxUInt address) -> CFFStruct
    AddressToSegment(MaxUInt address) -> CFFStruct
    BuildSymbolsValueHash(CFFStruct symtablc) -> NTHash< MaxUInt,UInt32 >
    CertificateLCs() -> NTUIntVector
    DyLibModules(CFFStruct dysymtablc) -> CFFStruct
    DySymTableLC() -> CFFStruct
    DyTableOfContents(CFFStruct dysymtablc) -> CFFStruct
    DyldDisassembleBind(NTTextStream out, MaxUInt offset, UInt32 size)
    DyldDisassembleBind(NTTextStream out, CFFStruct dyldinfo)
    DyldDisassembleLazyBind(NTTextStream out, CFFStruct dyldinfo)
    DyldDisassembleRebase(NTTextStream out, MaxUInt offset, UInt32 size)
    DyldDisassembleRebase(NTTextStream out, CFFStruct dyldinfo)
    DyldDisassembleWeakBind(NTTextStream out, CFFStruct dyldinfo)
    DyldFindExportedSymbol(CFFStruct dyldinfo, char const * symbol) -> MaxUInt
    DyldInfoLC() -> CFFStruct
    EntryPointAddress(CFFStruct lc) -> MaxUInt
    EntryPointLCs() -> NTUIntVector
    ExternalSymbolReferences(CFFStruct dysymtablc) -> CFFStruct
    FunctionStartsLC() -> CFFStruct
    FunctionStartsOffsetsAndValues(CFFStruct funcstartslc, NTVector< MaxUInt > & values) -> NTUIntVector
    GetLC(LoadCmdInfo info) -> CFFStruct
    GetLC(UInt32 index) -> CFFStruct
    GetLCCount() -> UInt32
    GetLCDescription(CFFStruct s) -> NTString
    GetLCDescription(UInt32 index) -> NTString
    GetLCInfo(UInt32 index) -> LoadCmdInfo
    GetLCInfoFromOffset(MaxUInt offset) -> LoadCmdInfo
    static GetLCName(UInt32 cmd) -> NTString
    IndirectSymbolTable(CFFStruct dysymtablc) -> CFFStruct
    IsMachO64() -> bool
    MachHeader() -> CFFStruct
    OffsetToAddress(MaxUInt offset) -> MaxUInt
    OffsetToSection(MaxUInt offset) -> CFFStruct
    OffsetToSegment(MaxUInt offset) -> CFFStruct
    ProcessLoadCommands() -> bool
    ReadSLEB128(NTBuffer b) -> Int64
    ReadSLEB128(MaxUInt offset, UInt32 & size) -> Int64
    ReadULEB128(NTBuffer b) -> UInt64
    ReadULEB128(MaxUInt offset, UInt32 & size) -> UInt64
    SectionFromOffset(UInt32 cmd, MaxUInt offset) -> CFFStruct
    SegmentSections(CFFStruct seg) -> CFFStruct
    SymTableLC() -> CFFStruct
    SymbolNList(CFFStruct symtablc) -> CFFStruct

class FatObject
    : CFFObject

    Architectures() -> CFFStruct

class AppleCodeSignatureObject
    : CFFObject

    BlobFromOffset(UInt32 offset) -> CFFStruct
    BlobIndexes(CFFStruct supblob) -> CFFStruct
    BlobName(UInt32 magic) -> NTString
    BlobName(CFFStruct blob) -> NTString
    IsSuperBlob(UInt32 magic) -> bool
    IsSuperBlob(CFFStruct blob) -> bool
    TopBlob() -> CFFStruct

Given the SDK capabilities, it’s easy to perform custom scans on Mach-Os or to create plugins.
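As a rough illustration, the listing above is enough to sketch a small helper that enumerates load commands. It relies only on GetLCCount and GetLCDescription as they appear in the listing. How a MachObject is created and loaded is not shown here, so the StubMachObject class and its sample descriptions are hypothetical stand-ins that make the sketch runnable outside the Profiler.

```python
def describe_load_commands(macho):
    """Return the description of every load command of a MachObject-like object."""
    return [macho.GetLCDescription(i) for i in range(macho.GetLCCount())]


class StubMachObject:
    """Hypothetical stand-in for MachObject; not part of the SDK."""

    def __init__(self, descriptions):
        self._descriptions = list(descriptions)

    def GetLCCount(self):
        return len(self._descriptions)

    def GetLCDescription(self, index):
        return self._descriptions[index]


# The descriptions below are made-up sample values.
stub = StubMachObject(["LC_SEGMENT_64", "LC_SYMTAB"])
for line in describe_load_commands(stub):
    print(line)
```

Inside the Profiler the same helper would presumably receive a real MachObject instead of the stub.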

That’s all. Hope you enjoyed it, and don’t be shy if you have feature requests or suggestions. 😉

News for version 0.9.5

We’re happy to present to you the new version of the Profiler with the following news:

introduced Lua filters: lua/custom and lua/loop
added optional condition to misc/basic
added JavaScript execute action
added JavaScript debugger
– simplified save report/project logic
– included actions among the extensions views
– improved detection of shellcodes
introduced max file size option for shellcode detection
improved OLE Streams parsing and extraction from RTFs
exposed getHash method in ScanProvider to Python
– added text replace functionality to text controls

While most of the items in the list have been discussed in previous posts, some of them need a brief introduction.

Max file size for shellcode detection

While shellcode detection applies by default to files of any size, you might want to specify a threshold.

Shellcodes scan options

This is useful if you want to speed up the analysis of large files.

The ‘getHash’ method

This method should be used by hooks to retrieve a hash for the currently scanned file. The syntax is very simple:

sp.getHash("md5")

Of course one could use a filter to hash the file, but the advantage of this method is that once a particular hash type has been computed it won’t be computed again if requested by another hook.
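The caching behaviour described above can be sketched in plain Python. This is only an illustration of the idea, not the Profiler's implementation: the HashCache class, its get_hash method and the computed counter are hypothetical names, and hashlib stands in for whatever hashing the Profiler uses internally.

```python
import hashlib


class HashCache:
    """Hypothetical helper: computes each hash type at most once per file."""

    def __init__(self, data):
        self._data = data
        self._cache = {}
        self.computed = 0  # number of real hash computations, for illustration

    def get_hash(self, name):
        # Compute the digest only the first time this hash type is requested.
        if name not in self._cache:
            self._cache[name] = hashlib.new(name, self._data).hexdigest()
            self.computed += 1
        return self._cache[name]


cache = HashCache(b"sample file contents")
first = cache.get_hash("md5")   # computed here
second = cache.get_hash("md5")  # served from the cache
print(first == second, cache.computed)
```

Two hooks asking for the same hash type share one computation, while a different hash type triggers a new one.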

Improved OLE Streams parsing and extraction from RTFs

In one of the previous use cases we’ve analyzed a huge set of malicious RTF documents. Some of them were not recognized correctly and some of them showed problems in the automatic extraction of OLE streams. This release fixes these issues.

RTF set

As you can see all RTFs are now correctly parsed and their OLE stream has been extracted. Some of the OLE objects though are not extracted correctly. After looking into it, it seems to be a problem with the malicious files themselves. OLE streams are encoded as hex strings into the RTF and in some of these files there’s an extra byte which invalidates the sequence.

01 05 00 00 02 00 00 00 1B 00 00 00 A 4D

That ‘A’ character between 00 and 4D turns the sequence into 00 A4 D, which is incorrect. Our guess is that the malware generator which produced these RTFs output some invalid ones by inserting an ‘A’ character instead of a 0x0A newline.
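The effect of the stray character can be reproduced with a few lines of Python. This is a minimal sketch working on the short sample above, not on a real RTF: decode_hex_stream and drop_char are hypothetical helper names, and the position of the stray ‘A’ is simply looked up in the sample string.

```python
def decode_hex_stream(text):
    """Decode a hex string, ignoring whitespace; odd digit counts are rejected."""
    digits = "".join(text.split())
    if len(digits) % 2:
        raise ValueError("odd number of hex digits")
    return bytes.fromhex(digits)


def drop_char(text, index):
    """Remove the single character at index (hypothetical patch step)."""
    return text[:index] + text[index + 1:]


broken = "01 05 00 00 02 00 00 00 1B 00 00 00 A 4D"

try:
    decode_hex_stream(broken)
except ValueError as err:
    print("broken sample:", err)

stray = broken.index(" A ") + 1  # position of the stray 'A' in this sample
patched = drop_char(broken, stray)
print("patched sample:", decode_hex_stream(patched).hex())
```

In this truncated sample the extra digit makes the digit count odd. In a full stream the count may happen to be even, but every byte after the stray character would still be shifted by one nibble.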

While RTF readers are not able to parse these objects either, it’s still interesting for our analysis to be able to inspect them. So we just load the RTF files, patching the ‘A’ character with a filter as in the screenshot below.

Fixing a broken OLE stream

That fixes it and we are now able to inspect the embedded OLE object and its threats. As you can see, we get the shellcode disassembly directly from the automatic analysis.

Fixed OLE stream

Enjoy!

JavaScript Analysis

The upcoming 0.9.5 version of the Profiler introduces tools to interactively analyze JavaScript code. In a few words it adds the capability to execute snippets of code or to debug them. The JavaScript engine used is the one in WebKit.

Let’s take a look at the newly introduced actions:

JavaScript actions

The ‘Execute JavaScript’ action executes a script and lets the user decide whether to process ‘eval’ calls or not.

Execute JavaScript

Even when ‘eval’ calls are not being processed, the argument is still printed out for the user to inspect. If ‘eval’ calls are performed, the result (if any) is printed out as well.

js_eval: print('hello world'); 1 + 1
js_print: hello world
js_eval_result: 2

Let’s take a look at the same code under the JavaScript debugger. Given the JavaScript debug capabilities already in Qt, it was easy to integrate a full-fledged debugger:

JavaScript Debugger

The debugger can be executed as a stand-alone utility (jsdbg.exe) as well.

It shouldn’t take long before the new version is ready and then we’ll see these features in action against some real world samples. Stay tuned!

Custom filters: Lua and misc/basic

Filters were introduced last year, and among them is the very useful ‘misc/basic’. The upcoming 0.9.5 version of the Profiler improves this filter by introducing the condition parameter.

For instance, let’s take the following filter:

Conditional misc/basic

<flts><f name="misc/basic" operation="xor" check="!=" bits="8" value="FF" cvalue="FF 0"></f></flts>

The filter xors every byte with 0xFF, unless the byte is 0xFF or 0. The ‘misc/basic’ filter can be used to express even more complex operations, such as:

Advanced misc/basic

<flts><f name="misc/basic" operation="xor" check="!=" bits="32" value="AABBCCDD * *" endianness="little" cvalue="AABBCCDD 0"></f></flts>

In this case the filter xors every third dword with 0xAABBCCDD, following the pattern ‘xor skip skip’, in little-endian mode, and only if the value is different from 0 and 0xAABBCCDD. While lots of operations can be expressed with this filter, there are limits.
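To make the ‘xor skip skip’ pattern concrete, here is a minimal Python sketch of the same operation. It is an approximation based on the description above, not the filter's actual code: the function name is hypothetical, and leaving trailing bytes that don't fill a whole dword untouched is an assumption.

```python
import struct


def xor_every_third_dword(data, key=0xAABBCCDD):
    """Apply 'xor skip skip' over little-endian dwords, leaving 0 and key as-is."""
    out = bytearray(data)
    for i in range(0, len(out) - 3, 4):
        if (i // 4) % 3 != 0:
            continue  # the two 'skip' slots of the pattern
        (value,) = struct.unpack_from("<I", out, i)
        if value != 0 and value != key:
            struct.pack_into("<I", out, i, value ^ key)
    return bytes(out)


sample = struct.pack("<4I", 0x11111111, 1, 2, 0)
print(xor_every_third_dword(sample).hex())
```

Only the first dword of the sample changes: the second and third fall on the skip slots, and the fourth is in an xor slot but equals 0. Anything beyond this kind of fixed pattern and comparison is where the filter reaches its limits.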

This is why Lua filters have been introduced. Right now there are two such filters available: ‘lua/custom‘ and ‘lua/loop‘. Let’s start with the second one which is just a shortcut.

lua/loop

if e ~= 0 and e ~= 0xFF then e = bit.bxor(e, 0xFF) end

This script does the exact same thing as the first example of the ‘misc/basic’ filter: it xors every byte with 0xFF unless the byte is 0xFF or 0. In this specific case there’s no reason to use a Lua filter. In fact, Lua filters are considerably slower than native filters. Thus, they should be used only when the operation is too complex to be expressed with any of the default filters.
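For reference, the per-byte rule shared by the ‘misc/basic’ and ‘lua/loop’ examples can be sketched in Python as well. The function name and its parameters are hypothetical; the sketch only mirrors the rule as described, and says nothing about how the native filter is implemented.

```python
def conditional_xor(data, key=0xFF, skip=(0xFF, 0x00)):
    """Xor each byte with key unless the byte equals one of the skip values."""
    return bytes(b if b in skip else b ^ key for b in data)


sample = bytes([0x00, 0xFF, 0x41, 0xBE])
print(conditional_xor(sample).hex())
```

Applying the function twice restores the input, because xoring a byte other than 0 or 0xFF with 0xFF can never produce 0 or 0xFF.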

While ‘lua/loop’ is intended for simple loop operations, ‘lua/custom’, as the name suggests, can be used to implement custom filter logic. Here’s an example, which again does the same thing as the previous example:

function run(filter)
    local c = filter:container()
    local size = c:size()
    local offset = 0
    local bsize = 16384
    while size ~= 0 do
        if bsize > size then bsize = size end
        local block = c:read(offset, bsize)
        local boffs = 0
        while boffs < bsize do
            local e = block:readU8(boffs)
            if e ~= 0 and e ~= 0xFF then e = bit.bxor(e, 0xFF) end
            block:writeU8(boffs, e)
            boffs = boffs + 1
        end
        c:write(offset, block)
        offset = offset + bsize
        size = size - bsize
    end
    return Base.FilterErr_None
end

These scripting filters are designed to be safe to run. They execute in a special sandboxed environment, have access only to a minimal set of secure functions, are limited in memory consumption (2 MB by default, configurable from the settings) and can be interrupted at any time by the user.

If you still don't wish to allow script filters, they can be disabled from the settings.

Lua filters settings

The Lua VM is almost vanilla; the only difference is that it allows for 64-bit numbers. As you can observe from the examples, the Lua library for bitwise operations has been renamed from 'bit32' to 'bit'.

We'll see some practical usage samples in the near future. Stay tuned!