public class EIRCompiler extends Object
java.lang.Object
   ↳ com.pnfsoftware.jeb.core.units.code.asm.decompiler.ir.compiler.EIRCompiler

Class Overview

Compiler of IR expressions, IR statements, IR CFG, IR routines, and IR programs (code and data).

Current limitations:

  • The compilation of IEUntranslatedInstruction is not supported.
  • Wildcard types are supported in << ... >> brackets. Leave the type info empty (<<>>) for an EImm to specify that it should be mutable but not carry any type information. This is different from <<?>>, which specifies a wildcard type of 1 slot with no information.

Rules and syntax:

  • The syntax is similar to the IR formatting, except that bitsizes must be present and etypes must be absent (to be supported at a later time), eg s32:var1 = i32:01h
  • Unless a wanted size is specified, statements are auto-assigned a size of 1
  • Optional wanted size for statements: "/SIZE: ...", eg /2: nop will create an ENop of size 2 instead of the default size 1
  • Optional wanted native address (hex) for statements: "ADDRESS: ...", eg 1A00: nop will create an ENop statement whose mapping to a hypothetical native address is set to 0x1A00
  • Literal labels for EJump are supported, eg: @label1 ... goto @label1
  • Spaces between tokens and parentheses around expressions are mostly mandatory; to avoid compilation errors, it is better to use them systematically
  • Comments: // or ;
  • EVar are created automatically upon first encounter unless they already exist in the routine-context (checked first), or the global-context (checked second)
  • The class of EVar used for creation depends on its name prefix; by default, special locals (negative range [2, 0x10000[) are used
  • Standardized prefixes:
       global-context
          rID            -> physical register EVar if possible
          RID            -> virtual register EVar if possible
          gADDR          -> memory-mapped global EVar if possible
          ptr_gAAA       -> global symbol EVar if possible
       routine-context
          vID            -> virtual routine-context EVar if possible (similar to global-context's R..)
          varADDR        -> memory-mapped local stack EVar variable (negative stack offset rel.to SP0)
          parADDR        -> memory-mapped local stack EVar variable (positive or null stack offset rel.to SP0)
          ptr_varADDR    -> pointer (reference) to a local memory-mapped stack variable
          ptr_parADDR    -> pointer (reference) to a local memory-mapped stack parameter
          $r..           -> copy of var
          $r..$N         -> additional copy of var (N>=1)
          $r.._r..       -> copy of var pair
          $r.._r..$N     -> additional copy of var pair (N>=1)
          $r..loX        -> copy of var, truncated (LSB part)
          $r..hiX        -> copy of var, truncated (MSB part)
          $r..loX$N      -> additional copy of var, truncated (LSB part)
          $r..hiX$N      -> additional copy of var, truncated (MSB part)
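
The size and address prefixes described in the rules above can be generated mechanically. The following is a minimal sketch, not part of the JEB API: the class IrStatementText and its methods are hypothetical names, and they only build source text. It assumes the resulting strings would then be handed to a method such as compileStatement(String).

```java
import java.util.Locale;

// Hypothetical helper (not part of JEB): it only formats IR source text.
final class IrStatementText {
    private IrStatementText() {}

    // "/SIZE: stmt" requests a statement size other than the default of 1.
    static String withSize(int size, String stmt) {
        return "/" + size + ": " + stmt;
    }

    // "ADDRESS: stmt" maps the statement to a hypothetical native address,
    // written in hex without a 0x prefix (as in the documented "1A00: nop").
    static String atAddress(long nativeAddress, String stmt) {
        String hex = Long.toHexString(nativeAddress).toUpperCase(Locale.ROOT);
        return hex + ": " + stmt;
    }
}
```

For instance, IrStatementText.withSize(2, "nop") yields the documented form /2: nop, and IrStatementText.atAddress(0x1A00, "nop") yields 1A00: nop.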
     

Specific rules for expression and statement compilation:
- PC-assigns can receive additional information, to be provided as end-of-line tags enclosed in brackets:
- [BRANCH] -> means the PC-assign should be generated as if it came from a normal branching instruction
- [SUB] -> means the PC-assign should be generated as if it came from call-to-sub instruction
- [BRANCH_HINTS:offsets] -> provide pseudo-native target hints for the branching instructions; offsets must be a comma-separated list of pseudo-native offsets (not IR offsets)
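
As an illustration of the end-of-line tags listed above, the sketch below appends a tag to an already-written PC-assign statement. It is a hypothetical helper, not part of JEB. Two points are assumptions: the tag is separated from the statement by a single space, and the offsets are passed as pre-formatted strings because the expected number base is not stated here.

```java
import java.util.List;

// Hypothetical helper (not part of JEB): appends end-of-line tags to a PC-assign.
final class PcAssignTags {
    private PcAssignTags() {}

    // PC-assign generated as if it came from a normal branching instruction.
    static String branch(String pcAssign) {
        return pcAssign + " [BRANCH]";
    }

    // PC-assign generated as if it came from a call-to-sub instruction.
    static String sub(String pcAssign) {
        return pcAssign + " [SUB]";
    }

    // Pseudo-native (not IR) target offsets, joined with commas.
    // Assumption: offsets are already formatted by the caller.
    static String branchHints(String pcAssign, List<String> offsets) {
        return pcAssign + " [BRANCH_HINTS:" + String.join(",", offsets) + "]";
    }
}
```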

Specific rules for CFG compilation:
- N/A

Specific rules for routine compilation:
- routines may or may not be enclosed in PROC/ENDP

Specific rules for program compilation:
- routines must be enclosed in PROC/ENDP, using the syntax shown below. The wanted name, wanted pseudo start address (native), and IR prototype are all optional.
- data elements: see below.

 PROC Name @NativeAddress :Prototype
 ...
 ...
 ENDP
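
The PROC/ENDP layout above can be assembled programmatically. The sketch below is a hypothetical helper, not part of JEB, that wraps a routine body and omits whichever optional header fields are null. The address and prototype are passed as pre-formatted strings, since their exact textual form is an assumption left to the caller.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of JEB): builds the source lines of one routine.
final class IrRoutineText {
    private IrRoutineText() {}

    // name, nativeAddress and prototype are all optional; pass null to omit one.
    static List<String> proc(String name, String nativeAddress,
                             String prototype, List<String> body) {
        StringBuilder header = new StringBuilder("PROC");
        if (name != null) header.append(' ').append(name);
        if (nativeAddress != null) header.append(" @").append(nativeAddress);
        if (prototype != null) header.append(" :").append(prototype);

        List<String> lines = new ArrayList<>();
        lines.add(header.toString());
        lines.addAll(body);   // statements, one per line
        lines.add("ENDP");
        return lines;
    }
}
```

The returned list has the shape expected by compileProgram(List<String>); whether a given body compiles depends on the statement rules described earlier.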
 

Defining references: simulate dynamically resolved references to routines and data imported into the module, but physically located in an external component.

 IMPORT CODE MethodName [:OptionalPrototype]
 IMPORT DATA FieldName [:OptionalType]
 

Defining data elements (native memory):

 - syntax for raw bytes (does not create variable object, just memory init.)
     DB/DW/DD/DQ/DS @Address Value
     B,W,D,Q=BYTE, WORD, DWORD, QWORD (1, 2, 4, 8 bytes), hex or decimal value
     endianness for memory-encoding matches the processor's (referenced in the native context, held by the global context provided to the compiler)
     DB can also be used for byte sequences: Value is an arbitrarily long hex-encoded byte sequence or an escaped string - no zero terminator is appended
     
   examples:
     DB @100 0x11
     DW @100 0x1122
     DD @100 0x11223344
     DQ @100 0x1122334455667788
     DB @100 '11aabb660099ff414141141'  <--- hex-encoded string (note the single-quotes, vs double-quotes for strings)
     DB @100 "Hello World!"             <--- encode to ASCII
     [NOT SUPPORTED YET] DB @100 U"Hello World!"            <--- encode to UTF8
     [NOT SUPPORTED YET] DB @100 L"Hello World!"            <--- encode to UTF16LE (note little-endian)
 
 - syntax for regular data items:
     DV Name @Address :TypeName [OptionalValue]
   where OptionalValue is an optional hex-encoded string whose length must be less than or equal to the size of the variable's type
   
 - syntax for string (ascii-encoded, 0-terminated) data items:
     DS Name @Address "Hello"
   the zero terminator is added implicitly, so the above string would translate to 6 bytes, not 5 
   
 - syntax for imported references:
     DR Name @Address &ImportName
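
To make the DB and DS rules above concrete, the sketch below hex-encodes a byte array into a DB directive and computes how many bytes a DS item occupies once the implicit zero terminator is counted. The class IrDataText is a hypothetical helper, not part of JEB. Addresses are passed as pre-formatted strings, and no string escaping is performed; both are simplifying assumptions.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of JEB): formats data-element directives.
final class IrDataText {
    private IrDataText() {}

    // Raw bytes as a hex-encoded DB directive (single quotes, no terminator appended).
    static String dbHex(String address, byte[] bytes) {
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return "DB @" + address + " '" + hex + "'";
    }

    // DS directive for an ASCII string; the caller must not need escaping.
    static String ds(String name, String address, String text) {
        return "DS " + name + " @" + address + " \"" + text + "\"";
    }

    // Bytes occupied by a DS item: the ASCII bytes plus the implicit zero terminator.
    static int dsByteCount(String text) {
        return text.getBytes(StandardCharsets.US_ASCII).length + 1;
    }
}
```

With this helper, IrDataText.dsByteCount("Hello") returns 6, matching the 6-byte figure given for the DS example.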
 
 

Summary

Nested Classes
class EIRCompiler.CompiledExpression A compiled expression. 
class EIRCompiler.CompiledField  
class EIRCompiler.CompiledProgram A compiled program. 
class EIRCompiler.CompiledRoutine A compiled routine. 
class EIRCompiler.CompiledStatement A compiled statement. 
Public Constructors
EIRCompiler(IEGlobalContext gctx)
Create an IR compiler.
Public Methods
static <T extends IEGeneric> T cc(String s, IEGlobalContext gctx, Class<T> clazz)
Convenience method to parse an IR expression or statement.
static IEGeneric cc(String s, IEGlobalContext gctx)
Convenience method to parse an IR expression or statement.
CFG<IEStatement> compileCfg(String... slist)
Compile a sequence of statements and return the CFG.
CFG<IEStatement> compileCfg(IERoutineContext ctx, String... slist)
Compile a sequence of statements and return the CFG.
EIRCompiler.CompiledExpression compileExpression(IERoutineContext ctx, String s)
Compile a non-statement expression.
EIRCompiler.CompiledExpression compileExpression(String s)
Compile a non-statement expression.
EIRCompiler.CompiledProgram compileProgram(File file)
Compile an IR program made of 1 or more routines.
EIRCompiler.CompiledProgram compileProgram(List<String> slist)
Compile an IR program made of 1 or more routines.
EIRCompiler.CompiledProgram compileProgram(String... slist)
Compile an IR program made of 1 or more routines.
EIRCompiler.CompiledRoutine compileRoutine(String... slist)
Compile an IR routine.
EIRCompiler.CompiledStatement compileStatement(IERoutineContext ctx, String s)
Compile a single statement.
EIRCompiler.CompiledStatement compileStatement(String s)
Compile a single statement.
void reset()
Reset this compiler's state.
Inherited Methods
From class java.lang.Object

Public Constructors

public EIRCompiler (IEGlobalContext gctx)

Create an IR compiler.

Parameters
gctx global IR context - one can be provided by getGlobalContext() or, if no converter is available, can be created ad-hoc

Public Methods

public static <T extends IEGeneric> T cc (String s, IEGlobalContext gctx, Class<T> clazz)

Convenience method to parse an IR expression or statement.

public static IEGeneric cc (String s, IEGlobalContext gctx)

Convenience method to parse an IR expression or statement.

public CFG<IEStatement> compileCfg (String... slist)

Compile a sequence of statements and return the CFG.

Parameters
slist statement list
Returns
  • IR CFG

public CFG<IEStatement> compileCfg (IERoutineContext ctx, String... slist)

Compile a sequence of statements and return the CFG.

Parameters
ctx optional routine context to be used; if null, a fresh context will be created
slist statement list
Returns
  • IR CFG

public EIRCompiler.CompiledExpression compileExpression (IERoutineContext ctx, String s)

Compile a non-statement expression.

Parameters
ctx optional routine context to be used; if null, a fresh context will be created
s pure expression string (not a statement)
Returns
  • the compiled expression

public EIRCompiler.CompiledExpression compileExpression (String s)

Compile a non-statement expression.

Parameters
s pure expression string (not a statement)
Returns
  • the compiled expression

public EIRCompiler.CompiledProgram compileProgram (File file)

Compile an IR program made of 1 or more routines.

Parameters
file UTF-8 encoded source file
Returns
  • the compiled program
Throws
IOException

public EIRCompiler.CompiledProgram compileProgram (List<String> slist)

Compile an IR program made of 1 or more routines.

Parameters
slist program source strings
Returns
  • the compiled program

public EIRCompiler.CompiledProgram compileProgram (String... slist)

Compile an IR program made of 1 or more routines.

Parameters
slist program source strings
Returns
  • the compiled program

public EIRCompiler.CompiledRoutine compileRoutine (String... slist)

Compile an IR routine.

Parameters
slist routine source
Returns
  • the compiled routine

public EIRCompiler.CompiledStatement compileStatement (IERoutineContext ctx, String s)

Compile a single statement.

Parameters
ctx optional routine context to be used; if null, a fresh context will be created
s statement string
Returns
  • the compiled statement

public EIRCompiler.CompiledStatement compileStatement (String s)

Compile a single statement.

Parameters
s statement string
Returns
  • the compiled statement

public void reset ()

Reset this compiler's state. Note that the global IR context (IEGlobalContext) is not reset.