PDF JavaScript Extraction Demo Package

We have already shown how easy it is to extract JavaScript from PDF documents with the Cerbero SDK using a simple hook.

In this post we’ll use a package to deploy the demo code.

The advantage of an installable package is that it minimizes the effort needed to test the code. The deployment method is also compatible with both Cerbero Suite and Cerbero Engine.

In case you missed it, we explained how packages work in a previous post.

The demo code is the following:

from Pro.Core import *

def printJSEntry(sp, xml, tnode):
    # data node
    dnode = xml.findChild(tnode, "d")
    if not dnode:
        return
    # we let Cerbero extract the JavaScript for us
    params = NTStringVariantHash()
    params.insert("op", "js")
    idnode = xml.findChild(dnode, "id")
    if idnode:
        params.insert("id", int(xml.value(idnode), 16))
    ridnode = xml.findChild(dnode, "rid")
    if ridnode:
        params.insert("rid", int(xml.value(ridnode), 16))
    js = sp.customOperation(params)
    # print out the JavaScript
    print("JS CODE")
    print("-------")
    print(js)

def pdfExtractJS(sp, ud):
    xml = sp.getReportXML()
    # object node
    onode = xml.findChild(None, "o")
    if onode:
        # scan node
        snode = xml.findChild(onode, "s")
        if snode:
            # enumerate scan entries
            tchild = xml.firstChild(snode)
            while tchild:
                if xml.name(tchild) == "t":
                    # type attribute
                    tattr = xml.findAttribute(tchild, "t")
                    # check if it's a JavaScript entry
                    if tattr and int(xml.value(tattr)) == CT_JavaScript:
                        printJSEntry(sp, xml, tchild)
                tchild = xml.nextSibling(tchild)
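
To make the traversal easier to follow outside of Cerbero, here is a minimal standalone sketch that walks a mock report with Python's standard xml.etree module. The element names (o, s, t, d, id, rid) mirror the ones the hook looks up, but everything else is an assumption made for illustration: the sample document, the idea that "o" is the top-level element, and the numeric value standing in for CT_JavaScript are not taken from the SDK.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the SDK constant; the real value comes from Pro.Core
MOCK_CT_JAVASCRIPT = 1

# Hypothetical report, shaped only after the node names used by the hook
MOCK_REPORT = """
<o>
  <s>
    <t t="1"><d><id>1a</id><rid>2b</rid></d></t>
    <t t="7"><d><id>3c</id></d></t>
    <t t="1"></t>
  </s>
</o>
"""

def collect_js_targets(report_xml, js_type=MOCK_CT_JAVASCRIPT):
    """Return the parameters the hook would build for each JavaScript entry."""
    root = ET.fromstring(report_xml.strip())
    targets = []
    if root.tag != "o":
        return targets
    scan = root.find("s")
    if scan is None:
        return targets
    for entry in scan.findall("t"):
        # skip scan entries of any other type
        if int(entry.get("t", "-1")) != js_type:
            continue
        data = entry.find("d")
        if data is None:
            continue
        params = {"op": "js"}
        # id and rid are optional and stored as hexadecimal text
        for key in ("id", "rid"):
            node = data.find(key)
            if node is not None and node.text:
                params[key] = int(node.text, 16)
        targets.append(params)
    return targets

print(collect_js_targets(MOCK_REPORT))
```

In the real hook these parameters are passed to sp.customOperation, which performs the actual extraction.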

And the configuration for the hook extension is the following:

[PDF JavaScript Extraction Demo]
file = pdf_js_extract_demo.py
scanned = pdfExtractJS
formats = PDF
enable = yes
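
The hook configuration reads like a plain INI section. As a rough sketch, and assuming standard INI syntax (Cerbero's own parser may behave differently), the fields can be inspected with Python's configparser. The read_hooks helper is hypothetical and not part of the SDK.

```python
import configparser

HOOK_CFG = """
[PDF JavaScript Extraction Demo]
file = pdf_js_extract_demo.py
scanned = pdfExtractJS
formats = PDF
enable = yes
"""

def read_hooks(cfg_text):
    """Map each hook name to its settings (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    hooks = {}
    for name in parser.sections():
        section = parser[name]
        hooks[name] = {
            "file": section.get("file"),        # script containing the hook
            "scanned": section.get("scanned"),  # function called after a scan
            "formats": section.get("formats", ""),
            "enabled": section.getboolean("enable", fallback=False),
        }
    return hooks

for name, settings in read_hooks(HOOK_CFG).items():
    print(name, settings)
```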

Out of these two parts we created a package with an automatic setup, which you can download from here.

The package can be installed with a few clicks. In fact, on Windows it can be installed directly from the shell context menu.

The setup dialog informs you that the package is verified as it was signed by Cerbero. Do not install the package if the signature couldn’t be verified!

Once installed, the package is visible in the list of installed packages. From there it can be uninstalled.

While the package is installed, it will print out the JavaScript code contained in PDF documents even if such documents are encrypted.

Packages are not only a great way to deploy tools and plugins for Cerbero Suite and Cerbero Engine, but they also enable the secure deployment of demonstration snippets and other data.

Cerbero Suite 5.1 is out!

We’re happy to announce the release of Cerbero Suite 5.1 and Cerbero Engine 2.1!

This release comes packed with features and improvements. In this post we summarize the most important ones.

Installable Packages

While there are many interesting new features in this release, we consider the most important one to be the introduction of installable packages.

Packages enable developers to create plugins that can be easily installed by the user with just a few clicks. Not only that, but the same package is compatible with both Cerbero Suite and Cerbero Engine.

Packages can be encrypted and signed. When a package is not signed or its signature cannot be trusted, the installation dialog says so.

We wrote an in-depth article about packages if you’re interested in learning more.

Improved Decompiler

We have introduced some improvements in the decompiler output. The most interesting of these improvements is the support of indirect string literal references.

For more information, see our post on this topic.

Local Carbon Structures

Previously, imported structures were shared among Carbon disassemblies in the same project. In Cerbero Suite 5.1 every disassembly in a project can have its own local structures.

This is especially useful when importing data structures from PDB files.

Of course, shared structures are also supported.

Improved CFBF Format View

We have simplified the analysis of Microsoft Office legacy documents that contain text controls by previewing their name in the format view.

We have published a 150-second video analysis of an Emotet sample that uses text controls as part of its obfuscation strategy.

Improved XLSB Support

We have improved support for the Microsoft Excel XLSB format.

We’ll soon publish a malware analysis to showcase these improvements.

Improved Silicon Excel Emulator

We have added support for the FORMULA.ARRAY macro, since this macro is often used by malicious Excel documents.

Hierarchy View Size Column

We received this feature request on Twitter: now the hierarchy view also shows the size of files.

This can be useful when prioritizing the analysis of embedded files.

Improved File Dialogs

We disabled the preview of actual file icons in all file dialogs. This makes opening folders with thousands of files blazingly fast and it’s also better for security.

This may seem like a minor problem, but the devil is in the details…

Grid Layouts in Custom Views

We have added a new type of layout in custom views: grid layouts. This new layout type is already documented in our latest official SDK documentation.

Additionally, this new version comes with minor speed optimizations and bug fixes.

Installable Packages

In the upcoming Cerbero Suite 5.1 and Cerbero Engine 2.1 we have introduced installable packages for extensions.

This means that from now on installing a plugin in Cerbero Suite or Cerbero Engine might require only a few clicks or a command in the terminal.

Packages can be managed in Cerbero Suite from the command line, using the Python SDK and of course from the UI. On Windows they can be installed from the shell context menu as well.

From the command line packages can be managed using the following syntax:

-pkg-create : Create Package
    Syntax: -pkg-create input.zip output.cppkg
    --name : The unique name of the package
    --author : The author of the package
    --version : The version of the package. E.g.: --version "1.0.1"
    --descr : A description of the package
    --sign : The key to sign the package. E.g.: --sign private_key.pem

-pkg-install : Install Package
    Syntax: -pkg-install package_to_install.cppkg
    --force : Silently installs unverified packages

-pkg-uninstall : Uninstall Package
    Syntax: -pkg-uninstall "Package Name"

-pkg-verify : Verify Package
    Syntax: -pkg-verify package_to_verify.cppkg
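
When these commands are driven from a script, it can help to build the argument list in one place. The following is a minimal sketch using only the options listed above. The pkg_command helper and the placeholder application path are hypothetical, and placing --force after the package path is an assumption.

```python
import subprocess

# Placeholder: replace with the actual path of the Cerbero Suite executable
cerbero_app = r"[CERBERO_APP_PATH]"

def pkg_command(app, action, target, force=False):
    """Build the argument list for one package-management action."""
    actions = {
        "install": "-pkg-install",
        "uninstall": "-pkg-uninstall",
        "verify": "-pkg-verify",
    }
    if action not in actions:
        raise ValueError("unsupported action: " + action)
    args = [app, actions[action], target]
    if force:
        # --force is only documented for -pkg-install
        if action != "install":
            raise ValueError("--force only applies to -pkg-install")
        args.append("--force")
    return args

def run_pkg_command(app, action, target, force=False):
    """Run the command and return its exit code."""
    return subprocess.run(pkg_command(app, action, target, force)).returncode

print(pkg_command(cerbero_app, "verify", "package_to_verify.cppkg"))
```

Verifying a package before installing it, and reserving --force for packages you built yourself, keeps the signature check meaningful.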

Similarly, packages can be installed, uninstalled and verified from Cerbero Engine using the ProManage.py script inside the local ‘python’ directory. E.g.:

python ProManage.py -pkg-install /path/to/package.cppkg

Packages can be signed. When a package is unsigned or its signature cannot be trusted, the installation dialog says so.

A key pair for signing and verifying packages can be generated as follows:

# create the private key
openssl genrsa -out private.pem 4096

# extract the public key
openssl rsa -in private.pem -outform PEM -pubout -out public.pem

The public key must be added to the list of trusted signers. This can be done by placing the generated file with the name of the issuer in the ‘certs/pkg’ directory or by using the UI.

Since packages have their own format, they can be inspected using Cerbero Suite like any other supported file format.

Like the rest of the functionality related to packages, the class to parse packages is located inside ‘Pro.Package’.

Packages must have a unique name, an author, a version number with at most four parts and a description. Packages are created from Zip archives and they can operate in three different ways:

  1. Relying on the automatic setup, without a setup script.
  2. Relying on a setup script.
  3. Relying on both the automatic setup and a setup script.

Out of the three ways, the first one is certainly the most intuitive: all the files in the Zip archive are installed following the same directory structure as in the archive.

This means that if the archive contains a file called:

plugins/python/CustomFolder/Code.py

It will be installed in the same directory under the user folder of Cerbero Suite or Cerbero Engine.

This is true for all files, except files in the ‘config’ directory. Those files are treated specially and their contents will be appended or removed from the configuration files of the user.

So, for instance, if the following configuration for an action must be installed:

[TestAction]
category = Test
label = Text label
file = TestCode.py
context = hex

It only needs to be stored in the archive under config/actions.cfg: the automatic installation/uninstallation process takes care of the rest.
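
As an illustration of the resulting layout, the sketch below assembles such an archive in memory with Python's zipfile module. The two paths are the ones used as examples above. The build_archive helper and the placeholder plugin content are hypothetical.

```python
import io
import zipfile

ACTIONS_CFG = """[TestAction]
category = Test
label = Text label
file = TestCode.py
context = hex
"""

def build_archive(files):
    """Return the bytes of a Zip archive holding {archive path: text content}."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for path, content in files.items():
            zf.writestr(path, content)
    return buf.getvalue()

archive = build_archive({
    # merged into the user's configuration by the automatic setup
    "config/actions.cfg": ACTIONS_CFG,
    # copied to the same relative path under the user folder
    "plugins/python/CustomFolder/Code.py": "# plugin code goes here\n",
})

with zipfile.ZipFile(io.BytesIO(archive)) as zf:
    print(zf.namelist())
```

The resulting Zip file is what gets passed as the input of -pkg-create.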

Sometimes, however, an automatic installation might not be enough to install an extension. In that case a setup script called ‘setup.py’ can be provided in the archive:

def install(sctx):
    # custom operations
    return True
    
def uninstall(sctx):
    # custom operations
    return True

However, installing everything manually might also not be ideal. In many cases the optimal solution would be an automatic installation with only a few custom operations:

def install(sctx):
    # custom operations
    return sctx.autoInstall()
    
def uninstall(sctx):
    # custom operations
    return sctx.autoUninstall()

To store files in the archive which should be ignored by the automatic setup, they must be placed under a folder called ‘setup’.

Alternatively, files can be individually installed and uninstalled relying on the automatic setup using the ‘installFile’ and ‘uninstallFile’ methods of the setup context, which is passed to the functions in the setup script.

Custom extraction operations can be performed using the ‘extract’ method of the setup context.

An important thing to consider is that if the package is called ‘Test Package’, it will not make any difference if files are placed in the archive at the top level or under a root directory called ‘Test Package’.

For instance:

config/actions.cfg
setup.py

And:

Test Package/config/actions.cfg
Test Package/setup.py

are considered to be the same. This way the Zip archive can be created directly from a directory with the same name as the package.
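
The layout rules described so far can be summarized in a small sketch that groups archive paths by how the automatic setup is described to treat them. This is only a model of the rules as stated in this post, not the actual installer logic. The classify_entries helper is hypothetical, and special files not covered by those rules are not modelled.

```python
def classify_entries(names, package_name):
    """Group archive paths by their described treatment (hypothetical helper)."""
    groups = {"config": [], "ignored": [], "script": [], "copied": []}
    prefix = package_name + "/"
    for name in names:
        if name.endswith("/"):
            continue  # directory entries carry no content
        # an optional root directory named after the package is transparent
        rel = name[len(prefix):] if name.startswith(prefix) else name
        top = rel.split("/", 1)[0]
        if rel == "setup.py":
            groups["script"].append(rel)    # the setup script itself
        elif top == "config" and "/" in rel:
            groups["config"].append(rel)    # merged into the user's configuration
        elif top == "setup" and "/" in rel:
            groups["ignored"].append(rel)   # skipped by the automatic setup
        else:
            groups["copied"].append(rel)    # installed at the same relative path
    return groups

flat = classify_entries(["config/actions.cfg", "setup.py"], "Test Package")
nested = classify_entries(
    ["Test Package/config/actions.cfg", "Test Package/setup.py"], "Test Package"
)
print(flat == nested)
```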

Having a verified signature is not only good for security purposes, but also allows the package to show a custom icon in the installation dialog. The icon must be called ‘pkgicon.png’ and regardless of its size, it will be resized to a 48×48 icon when shown to the user.

What follows is an easy-to-adapt Python script to create packages using the command line of Cerbero Suite. It uses the “-c” parameter to avoid displaying message boxes.

import os, sys, shutil, subprocess

cerbero_app = r"[CERBERO_APP_PATH]"

private_key = r"[OPTIONAL_PRIVATE_KEY_PATH]"

pkg_dir = r"C:\MyPackage\TestPackage"
pkg_out = r"C:\MyPackage\TestPackage.cppkg"

pkg_name = "Test Package"
pkg_author = "Test Author"
pkg_version = "1.0.1"
pkg_descr = "Description."

shutil.make_archive(pkg_dir, "zip", pkg_dir)

args = [cerbero_app, "-c", "-pkg-create", pkg_dir + ".zip", pkg_out, "--name", pkg_name, "--author", pkg_author, "--version", pkg_version, "--descr", pkg_descr]
if private_key:
    args.append("--sign")
    args.append(private_key)

ret = subprocess.run(args).returncode
os.remove(pkg_dir + ".zip")

print("Package successfully created!" if ret == 0 else "Couldn't create package!")
sys.exit(ret)
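
Before running a script like the one above, the metadata requirements mentioned earlier can be checked locally. The sketch below is one possible pre-flight check. The check_metadata helper is hypothetical, and treating the version as dot-separated non-negative integers is an assumption based on the "1.0.1" example. Name uniqueness cannot be verified this way.

```python
def check_metadata(name, author, version, descr):
    """Return a list of problems found in the package metadata (empty if none)."""
    problems = []
    for label, value in (("name", name), ("author", author), ("description", descr)):
        if not value.strip():
            problems.append(label + " is empty")
    # assumption: version parts are separated by dots, as in "1.0.1"
    parts = version.split(".")
    if len(parts) > 4:
        problems.append("version has more than 4 parts")
    elif not all(part.isdigit() for part in parts):
        problems.append("version parts must be non-negative integers")
    return problems

print(check_metadata("Test Package", "Test Author", "1.0.1", "Description."))
```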