Part 2: Creating a Simple Parser

Warning

This tutorial is deprecated, and will be rewritten to account for JEB 4 API changes.

JEB Plugin Development Tutorial part 2/8

The source code for part 2 of this sample plugin is located on GitHub:

  • Clone this repo: git clone https://github.com/pnfsoftware/jeb2-plugindemo-js.git
  • Switch to the tutorial2 branch: git checkout tutorial2

Sample file

In this parser development tutorial, we will consider a simple JavaScript parser. Let's start by defining a test file:

#Javascript

function a() {
  alert("a called");
}

function b() {
  var b = "b called";
  alert(b);
}

Parser Objective

Our parser will split the input JavaScript file into several parts, one for each JavaScript function().

Defined variables will be displayed in a table, in a supplementary tab.

The initial goal is to detect the file and display its contents. The secondary goal is to define and manage several views, and to show how to use the delegation mechanism.
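To make the objective concrete, here is a minimal stand-alone sketch of the splitting step, written against plain Java with no JEB dependency. The class `JsFunctionSplitter` and its regular expression are hypothetical and only handle the simple layout of the test file above (each function starts at the beginning of a line); this is not how the tutorial's plugin is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the JEB API.
final class JsFunctionSplitter {
    // Matches "function name(" at the start of a line.
    private static final Pattern FUNCTION_START =
            Pattern.compile("^function\\s+(\\w+)\\s*\\(", Pattern.MULTILINE);

    /** Returns the names of the functions found, in file order. */
    static List<String> functionNames(String source) {
        List<String> names = new ArrayList<>();
        Matcher m = FUNCTION_START.matcher(source);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    /** Returns one chunk of source text per function, in file order. */
    static List<String> split(String source) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = FUNCTION_START.matcher(source);
        while (m.find()) {
            starts.add(m.start());
        }
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < starts.size(); i++) {
            // A function ends where the next one starts, or at end of file.
            int end = (i + 1 < starts.size()) ? starts.get(i + 1) : source.length();
            parts.add(source.substring(starts.get(i), end).trim());
        }
        return parts;
    }

    public static void main(String[] args) {
        String js = "#Javascript\n\nfunction a() {\n  alert(\"a called\");\n}\n\n"
                + "function b() {\n  var b = \"b called\";\n  alert(b);\n}\n";
        System.out.println(functionNames(js));
        for (String part : split(js)) {
            System.out.println("----");
            System.out.println(part);
        }
    }
}
```

A regular expression is enough for this sample file, but it would not cope with nested functions or function expressions.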

Generic Parser

First, let's open this js file without modifying anything in JEB.

JEB analyzes the input artifact and builds the project as follows:

  • The top-level project represents your current workspace.
  • The artifact represents the file that was opened. (If you choose "File, Add an Artifact..." from the menu, a new artifact will be added at the same level.)
  • The children of the artifact are the units. A unit can specify its own icon; if it does not, the default icon is shown.

JEB provides a default generic parser that displays any file. It contains two fragments:

  • a Description panel with generic information about the unit
  • a Hex Dump view, which displays the content of the object in hexadecimal.

Detect the js file

Upon loading a file, the first step JEB takes is to ask all parsers whether they are able to handle the input artifact (in our case, a file). So, our first step is to add detection for this JavaScript (js) file.

For the sake of making this tutorial simple, we will suppose that all our js files start with a "#Javascript" tag.
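The "ask every parser" step can be pictured with the following stand-alone sketch. It is an assumption-laden illustration, not JEB code: the `Identifier` interface, `PrefixIdentifier`, and `identify` are hypothetical names, and the sketch simply returns the first identifier that accepts the input, whereas the order or priority JEB actually uses is not covered here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical model of identifier dispatch; not the JEB API.
final class IdentifierDispatch {
    interface Identifier {
        String formatType();
        boolean canIdentify(byte[] data);
    }

    /** Accepts any input that begins with a fixed ASCII tag. */
    static final class PrefixIdentifier implements Identifier {
        private final String type;
        private final byte[] prefix;

        PrefixIdentifier(String type, String prefix) {
            this.type = type;
            this.prefix = prefix.getBytes(StandardCharsets.US_ASCII);
        }

        @Override
        public String formatType() {
            return type;
        }

        @Override
        public boolean canIdentify(byte[] data) {
            if (data.length < prefix.length) {
                return false;
            }
            return Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
        }
    }

    /** Returns the type of the first identifier that accepts the data. */
    static Optional<String> identify(List<Identifier> identifiers, byte[] data) {
        for (Identifier id : identifiers) {
            if (id.canIdentify(data)) {
                return Optional.of(id.formatType());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<Identifier> ids = Arrays.asList(
                new PrefixIdentifier("zip", "PK"),
                new PrefixIdentifier("js", "#Javascript"));
        byte[] data = "#Javascript\nfunction a() {}".getBytes(StandardCharsets.US_ASCII);
        System.out.println(identify(ids, data).orElse("generic"));
    }
}
```

When no identifier accepts the input, the sketch falls back to "generic", which mirrors the default generic parser described earlier.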

In part 1, we learned that IPlugin is the entry point of the plugin; however, it only defines a getPluginInformation() method.

Let's start by checking all the classes and interfaces that extend IPlugin (in Eclipse, right-click on IPlugin, then Open Type Hierarchy).

Here is what we have:

  • our own plugin SamplePlugin class
  • IEnginesPlugin for Engines plugins (refer to separate tutorial)
  • IUnitIdentifier for Parser plugins. To free developers from the task of implementing all methods of IUnitIdentifier, the abstract class AbstractUnitIdentifier is provided. Let's extend it instead of IPlugin or IUnitIdentifier.

We see that only two methods are not implemented:

  • canIdentify: used to detect whether the parser should be used (the one we were looking for!)
  • prepare: used to create an IUnit that will perform the processing

Let's check the parameters of canIdentify in the API:

  • IInput is a reference to the input file or stream; we can read data from it.
  • IUnitCreator is the parent that created this unit (parsers produce units of type IUnit). We will use this parameter in the tutorial explaining how delegation works.

canIdentify could use IInput.getHeader, IInput.getStream, or another method to detect a specific file type. Here, we will use AbstractUnitIdentifier.checkBytes, which provides an easy way to check the header:

private final static byte[] JS_HEADER = "#Javascript".getBytes();

@Override
public boolean canIdentify(IInput input, IUnitCreator parent) {
    return checkBytes(input, 0, JS_HEADER);
}
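For readers who want to see what such a header check boils down to, here is a stand-alone sketch on a plain byte array. `HeaderCheck.matchesAt` is a hypothetical helper written for this illustration; it is not the actual implementation of checkBytes, whose internals are not shown in this tutorial.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone equivalent of a header check; not JEB code.
final class HeaderCheck {
    static final byte[] JS_HEADER = "#Javascript".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if data contains exactly the expected bytes at offset. */
    static boolean matchesAt(byte[] data, int offset, byte[] expected) {
        // Reject inputs that are too short to hold the expected bytes.
        if (offset < 0 || data.length - offset < expected.length) {
            return false;
        }
        for (int i = 0; i < expected.length; i++) {
            if (data[offset + i] != expected[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] tagged = "#Javascript\nfunction a() {}".getBytes(StandardCharsets.US_ASCII);
        byte[] untagged = "function a() {}".getBytes(StandardCharsets.US_ASCII);
        System.out.println(matchesAt(tagged, 0, JS_HEADER));
        System.out.println(matchesAt(untagged, 0, JS_HEADER));
    }
}
```

The sketch passes an explicit charset to getBytes so that the header bytes do not depend on the platform default encoding.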

We are now ready; let's start JEB.

Note: you may see the following error in the logger: "Unit plugin class com.jeb.sample.SamplePlugin must have a public no-argument constructor". This means that the class does not provide a public constructor that takes no arguments. Let's add this one:

public SamplePlugin() {
    super("Javascript", 0);
}
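The error message suggests that plugin classes are instantiated reflectively, although this tutorial does not confirm how JEB does it. Under that assumption, the following stand-alone sketch shows how a loader could test for a public no-argument constructor using standard Java reflection. All class names here are hypothetical.

```java
// Hypothetical demonstration of the constructor requirement; not JEB code.
final class NoArgConstructorDemo {
    /** Has a public constructor that takes no arguments. */
    static class WithNoArg {
        public WithNoArg() {
        }
    }

    /** Only has a constructor that requires an argument. */
    static class WithoutNoArg {
        public WithoutNoArg(String name) {
        }
    }

    /** Returns true if the class declares a public no-argument constructor. */
    static boolean hasPublicNoArgConstructor(Class<?> type) {
        try {
            // getConstructor() only finds public constructors.
            type.getConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasPublicNoArgConstructor(WithNoArg.class));
        System.out.println(hasPublicNoArgConstructor(WithoutNoArg.class));
    }
}
```

A class that only declares constructors with parameters gets no implicit default constructor from the compiler, which is why one has to be written explicitly.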

Open test.js. You should see this:

The default parser did not create the generic unit for the js file! That's because nothing is defined in the prepare method yet. Keep on reading.

Build a unit

Look back at the prepare method. It should return an IUnit.

The IUnit will be the main processing class, so it is recommended to build a dedicated class. Just like we did for IPlugin, we will use one of the default convenience abstract classes provided: AbstractUnit.

public class SampleUnit extends AbstractUnit {
    public SampleUnit(String name, IInput input, IUnitProcessor unitProcessor, IUnitCreator parent, IPropertyDefinitionManager pdm) {
        super("js", name, unitProcessor, parent, pdm);
    }

    @Override
    public boolean process() {
        // indicates that the processing is already done: process won't be called again in future
        setProcessed(true);
        // default is false. True indicates that processing is successful.
        return true;
    }
}

And here is the code of the caller:

@Override
public IUnit prepare(String name, IInput input, IUnitProcessor unitProcessor, IUnitCreator parent) {
    return new SampleUnit(name, input, unitProcessor, parent, pdm);
}

You can test your parser:

Not very exciting. The content of the file is not even displayed. AbstractUnit is a simple unit that only provides the Description panel. Its constructor does not use the IInput.

Let's try something else. Which other subclasses are available?

Let's use another default implementation: AbstractBinaryUnit.

public class SampleUnit extends AbstractBinaryUnit {
    public SampleUnit(String name, IInput input, IUnitProcessor unitProcessor, IUnitCreator parent, IPropertyDefinitionManager pdm) {
        super(null, input, "js", name, unitProcessor, parent, pdm);
    }
}

So, now we have 2 tabs for a single unit. Each tab is a representation of a JEB Document. Default documents for a Binary Unit are Description and Hex Dump. They both display text.

Note: the description Document is a default document that is always attached to a unit. You can modify its content by overriding IUnit.getDescription.

Add a Document

JEB provides the ability for units to produce all sorts of documents to be represented by clients.

There are three types of documents, and default implementations are provided:

  • Text buffers (for arbitrarily long line-based documents)
  • Tables
  • Trees and table trees

Let's start with the simplest document: a text document. We will use the provided AsciiDocument implementation.

The display is delegated to an IUnitFormatter. This display can be modified by overriding the IUnit.getFormatter method.

@Override
public IUnitFormatter getFormatter() {
    return new UnitFormatterAdapter(new AbstractUnitRepresentation("javascript raw code", true) {
        @Override
        public IGenericDocument getDocument() {
            return new AsciiDocument(getInput());
        }
    });
}

The result is what we expected: