public class

CFG

extends Object
implements IControlFlowGraph<InsnType extends IInstruction> Iterable<T>
java.lang.Object
   ↳ com.pnfsoftware.jeb.core.units.code.asm.cfg.CFG<InsnType extends com.pnfsoftware.jeb.core.units.code.IInstruction>

Class Overview

This class represents a Control Flow Graph for a routine/method, or more generally, any body of code. The CFG can be typed to handle specific instructions.

This class provides basic Data Flow Analysis support. A client can request simple and full du- and ud-chains. Simple chains are for intra-block use and reference instructions by index; Full chains contain precise cross-block information and reference instructions by address.

There are two ways to build a CFG:

Summary

Constants
int FLAG_ALLOW_ARTIFICIAL_BLOCK_END
int FLAG_SUBROUTINE_CALL_NOT_BREAKING
Public Constructors
CFG(long entry, List<BasicBlock<InsnType>> blocks)
Create a standard CFG using a pre-constructed list of basic blocks.
CFG(List<? extends InsnType> insns, List<IrregularFlowData> irrdata)
Convenience constructor.
CFG(List<? extends InsnType> insns, List<IrregularFlowData> irrdata, IInstructionAugmenter augmenter, long base, long entry, int flags)
Create a CFG by recursively processing a list of sequential instructions.
CFG(Map<Long, InsnType> offsetToInsn, List<IrregularFlowData> irrdata, IInstructionAugmenter augmenter, long entry, int flags)
Create a CFG by recursively processing a collection of instructions.
Public Methods
boolean addEdge(BasicBlock<InsnType> x, BasicBlock<InsnType> y, boolean keepDFA)
Add a regular edge X->Y.
Iterable<AddressableInstruction<InsnType>> addressableInstructions()
void deleteAllEdges()
Remove all block edges.
boolean deleteDuplicateEdge(BasicBlock<InsnType> x, BasicBlock<InsnType> y, boolean keepDFA)
Delete one duplicate edge X->Y.
boolean deleteEdge(BasicBlock<InsnType> x, BasicBlock<InsnType> y, boolean keepDFA)
Delete a regular edge X->Y.
void doDataFlowAnalysis()
Perform data flow analysis on the CFG.
void doDataFlowAnalysis(boolean redo)
Perform data flow analysis on the CFG.
void doDataFlowAnalysis(boolean redo, boolean integrateCalculatedInputRegisters)
Perform data flow analysis on the CFG.
String format(boolean formatAddresses, int formatChains, boolean formatInOut, IFormattingContextFactory<InsnType> fcf)
Format the CFG into a printable string, with optional data chains.
String format(boolean formatAddresses, int formatChains, boolean formatInOut)
Format the CFG into a printable string, with optional data chains.
String format()
Format this CFG as an assembly-code listing with simple data chains and input/output registers

Consider using a CFGFormatter instead of this method.

String formatSimple()
Format this CFG as an assembly-code listing without any DFA data.
BasicBlock<InsnType> get(int index)
Retrieve a basic block.
Map<Long, BasicBlock<InsnType>> getAddressBlockMap()
Get a complete map of the basic blocks and their addresses in the CFG.
BasicBlock<InsnType> getBlockAt(long base)
Get the basic block that starts at the provided address or offset.
BasicBlock<InsnType> getBlockContaining(long address)
Get the basic block that contains the provided address.
List<BasicBlock<InsnType>> getBlocks()
Get a copy of the block list of the CFG.
int getEffectiveSize()
Calculate the 'effective' size of this CFG, that is, the sum of the size of each basic block.
long getEndAddress()
Routine highest address (exclusive).
long getEntryAddress()
Routine entry-point address.
BasicBlock<InsnType> getEntryBlock()
Get the entry block.
List<BasicBlock<InsnType>> getExitBlocks()
Get the ordered list of exit blocks.
long getFirstAddress()
Routine lowest address (inclusive).
int getFlags()
List<Couple<Long, Long>> getGaps()
Get a list of gaps found in the routine CFG.
void getGraphRepresentation(List<int[]> edges, List<int[]> irregularEdges)
Get the a graph representation of the CFG.
Map<Integer, Set<Long>> getInputMap()
InsnType getInstruction(long address)
Get the instruction located at the exact address.
int getInstructionCount()
Get the total number of instructions in the CFG.
Couple<BasicBlock<InsnType>, Integer> getInstructionLocation(long address)
List<InsnType> getInstructions()
Get the instruction list of this CFG by aggregating each instruction of every block.
Map<Long, InsnType> getInstructionsMap()
Get the map of addresses to instructions that compose this CFG.
long getLastAddress()
Routine highest address (inclusive).
Map<Integer, Set<Long>> getOuputMap()
IVariableInformationProvider getVariableInformationProvider()
boolean hasExit()
Determine if this CFG has exit blocks, that is, blocks without out-edges.
boolean hasNoExit()
Determine if this CFG does not have any exit block.
Iterable<InsnType> instructions()
void invalidateDataFlowAnalysis()
Explicitly invalidate a previously performed data flow analysis.
Iterator<BasicBlock<InsnType>> iterator()
int pruneOrphans()
Remove orphan basic blocks, i.e.
int reconnectEdge(BasicBlock<InsnType> x, BasicBlock<InsnType> y, BasicBlock<InsnType> z, boolean keepDFA)
This method regularly reconnects a block X from Y to Z.
void removeBlock(BasicBlock<InsnType> b, boolean keepDFA)
Remove a block, and update the fixtures.
InsnType replaceInstruction(long address, InsnType insn)
Replace an instruction and invalidate the DFA.
InsnType replaceInstruction(long address, InsnType insn, boolean keepDFA)
Replace an instruction.
IVariableInformationProvider setVariableInformationProvider(IVariableInformationProvider prv)
CFG<InsnType> shallowCopy(boolean copyBlockReferences)
Shallow duplication of a CFG:
- Block references are optionally copied
- DFA data is not copied
int simplify(boolean mergeOnCalls, boolean mergeOnJumps, boolean mergeNonConsecutive)
Merge the blocks that can be merged, without changing the flow of the graph.
int size()
Get the number of blocks.
String toString()
[Expand]
Inherited Methods
From class java.lang.Object
From interface com.pnfsoftware.jeb.core.units.code.IControlFlowGraph
From interface java.lang.Iterable

Constants

public static final int FLAG_ALLOW_ARTIFICIAL_BLOCK_END

Constant Value: 2 (0x00000002)

public static final int FLAG_SUBROUTINE_CALL_NOT_BREAKING

Constant Value: 1 (0x00000001)

Public Constructors

public CFG (long entry, List<BasicBlock<InsnType>> blocks)

Create a standard CFG using a pre-constructed list of basic blocks.

public CFG (List<? extends InsnType> insns, List<IrregularFlowData> irrdata)

Convenience constructor. Same as CFG(insns, irrdata, null, 0, 0, 0).

Parameters
insns list of sequential instructions; they must be contiguous (no gap), which means that the first instruction is at address `base`, the second at `base`+sizeofFirstInstruction, etc.
irrdata irregular control flow information

public CFG (List<? extends InsnType> insns, List<IrregularFlowData> irrdata, IInstructionAugmenter augmenter, long base, long entry, int flags)

Create a CFG by recursively processing a list of sequential instructions.

Delay-slot instruction sets are not supported by this constructor.

Parameters
insns list of sequential instructions; they must be contiguous (no gap), which means that the first instruction is at address `base`, the second at `base`+sizeofFirstInstruction, etc.
irrdata irregular control flow information
augmenter optional instruction augmenter
base routine base address
entry routine entry-point address
flags parsing flags, see FLAG_xxx constants

public CFG (Map<Long, InsnType> offsetToInsn, List<IrregularFlowData> irrdata, IInstructionAugmenter augmenter, long entry, int flags)

Create a CFG by recursively processing a collection of instructions.

Delay-slot instruction sets are not supported by this constructor.

Parameters
offsetToInsn map of address-to-instructions
irrdata irregular control flow information
augmenter optional instruction augmenter
entry routine entry-point address
flags parsing flags, see FLAG_xxx constants

Public Methods

public boolean addEdge (BasicBlock<InsnType> x, BasicBlock<InsnType> y, boolean keepDFA)

Add a regular edge X->Y. This method fails if the edge already exists. Both blocks must be part of the CFG.

Parameters
x source block
y destination block
keepDFA if true the DFA is considered to be still valid after the edge addition
Returns
  • true if x->y was added

public Iterable<AddressableInstruction<InsnType>> addressableInstructions ()

public void deleteAllEdges ()

Remove all block edges.

public boolean deleteDuplicateEdge (BasicBlock<InsnType> x, BasicBlock<InsnType> y, boolean keepDFA)

Delete one duplicate edge X->Y.

Parameters
x source block
y destination block
keepDFA if true, if true the DFA is considered to be still valid after the edge removal
Returns
  • true if x->y existed and was deleted

public boolean deleteEdge (BasicBlock<InsnType> x, BasicBlock<InsnType> y, boolean keepDFA)

Delete a regular edge X->Y. Warning! The method throws if duplicate edges x->y are found.

Parameters
x source block
y destination block
keepDFA if true, if true the DFA is considered to be still valid after the edge removal
Returns
  • true if x->y existed and was deleted

public void doDataFlowAnalysis ()

Perform data flow analysis on the CFG.

This method does not force a new analysis: Same as doDataFlowAnalysis(false).

public void doDataFlowAnalysis (boolean redo)

Perform data flow analysis on the CFG. Input registers are integrated into

This method does not propagate input registers: Same as doDataFlowAnalysis(redo, true).

Parameters
redo if true, force a new analysis even if one is not required

public void doDataFlowAnalysis (boolean redo, boolean integrateCalculatedInputRegisters)

Perform data flow analysis on the CFG.

Parameters
redo if true, force a new analysis even if one is not required
integrateCalculatedInputRegisters if true, the live registers determined after analysis will be integrated in the use-def chains

public String format (boolean formatAddresses, int formatChains, boolean formatInOut, IFormattingContextFactory<InsnType> fcf)

Format the CFG into a printable string, with optional data chains.

Consider using a CFGFormatter instead of this method.

Parameters
formatAddresses true to prepend instructions by their address
formatChains 0=no, 1=simple chains, 2=full chains
formatInOut true to format the input(live) and output(reaching) registers
Returns
  • the formatted CFG

public String format (boolean formatAddresses, int formatChains, boolean formatInOut)

Format the CFG into a printable string, with optional data chains.

Consider using a CFGFormatter instead of this method.

public String format ()

Format this CFG as an assembly-code listing with simple data chains and input/output registers

Consider using a CFGFormatter instead of this method.

Returns
  • the formatted CFG

public String formatSimple ()

Format this CFG as an assembly-code listing without any DFA data.

Consider using a CFGFormatter instead of this method.

public BasicBlock<InsnType> get (int index)

Retrieve a basic block.

public Map<Long, BasicBlock<InsnType>> getAddressBlockMap ()

Get a complete map of the basic blocks and their addresses in the CFG. This method is recommended for use when multiple, repeated invocations of #getBlockAt(int) are to be made.

Returns
  • a map of address to block

public BasicBlock<InsnType> getBlockAt (long base)

Get the basic block that starts at the provided address or offset.

Parameters
base the block address/offset
Returns
  • basic block, or null if none starts at that address

public BasicBlock<InsnType> getBlockContaining (long address)

Get the basic block that contains the provided address.

Note that the address just needs to be in the block address range; it does not need to point to the beginning of an instruction within the block

Parameters
address an address within the block

public List<BasicBlock<InsnType>> getBlocks ()

Get a copy of the block list of the CFG. The list is ordered by ascending block address. Modifying the list does not impact the CFG.

Returns
  • a copy of the list of blocks

public int getEffectiveSize ()

Calculate the 'effective' size of this CFG, that is, the sum of the size of each basic block.

Returns
  • the CFG size

public long getEndAddress ()

Routine highest address (exclusive).

public long getEntryAddress ()

Routine entry-point address. Note that this address may not be the lowest one in the CFG.

public BasicBlock<InsnType> getEntryBlock ()

Get the entry block. The entry block is unique.

Returns
  • never null

public List<BasicBlock<InsnType>> getExitBlocks ()

Get the ordered list of exit blocks. A CFG may have zero or more exit blocks; CFG representing standard routines will have at least one (generally one) exit block.

Returns
  • a collection of blocks, possibly empty

public long getFirstAddress ()

Routine lowest address (inclusive). Note that this address may not be the entry-point address for this routine.

public int getFlags ()

public List<Couple<Long, Long>> getGaps ()

Get a list of gaps found in the routine CFG. Gaps are memory spaces between blocks that do not belong to the CFG itself.

Optimizing compilers and obfuscators can produce routines with gaps.

Returns
  • a list of address ranges [begin, end) of gaps

public void getGraphRepresentation (List<int[]> edges, List<int[]> irregularEdges)

Get the a graph representation of the CFG. The list of edges return use a 1-based node numbering scheme.

Parameters
edges (output) array of regular edges, eg: {{1,2},{1,3},{2,3}}
irregularEdges (output) array of irregular edges

public Map<Integer, Set<Long>> getInputMap ()

public InsnType getInstruction (long address)

Get the instruction located at the exact address.

Returns
  • an instruction, null if none

public int getInstructionCount ()

Get the total number of instructions in the CFG. This method sums the number of instructions of each block of the CFG.

public Couple<BasicBlock<InsnType>, Integer> getInstructionLocation (long address)

public List<InsnType> getInstructions ()

Get the instruction list of this CFG by aggregating each instruction of every block. The list is ordered by ascending address/offset.

public Map<Long, InsnType> getInstructionsMap ()

Get the map of addresses to instructions that compose this CFG.

public long getLastAddress ()

Routine highest address (inclusive).

public Map<Integer, Set<Long>> getOuputMap ()

public IVariableInformationProvider getVariableInformationProvider ()

public boolean hasExit ()

Determine if this CFG has exit blocks, that is, blocks without out-edges.

public boolean hasNoExit ()

Determine if this CFG does not have any exit block.

public Iterable<InsnType> instructions ()

public void invalidateDataFlowAnalysis ()

Explicitly invalidate a previously performed data flow analysis.

To be used for instance after a CFG (or parts of it) is modified, eg, an instruction is pulled out or optimized (changing its register usage), etc.

public Iterator<BasicBlock<InsnType>> iterator ()

public int pruneOrphans ()

Remove orphan basic blocks, i.e. basic blocks with no parents which are not the entry block, and all their descendants.

Returns
  • number of remove basic blocks

public int reconnectEdge (BasicBlock<InsnType> x, BasicBlock<InsnType> y, BasicBlock<InsnType> z, boolean keepDFA)

This method regularly reconnects a block X from Y to Z. (x->y becomes x->z). Warning! The method throws if duplicate edges x->y are found.

Parameters
x the original source block
y the original destination block
z the new destination block
keepDFA if true the DFA is considered to be still valid after the reconnect
Returns
  • +1: success
    0: failure, x->y does not exist
    -1: failure, a edge x->z already exists, and duplicate edges are never allowed

public void removeBlock (BasicBlock<InsnType> b, boolean keepDFA)

Remove a block, and update the fixtures. This method is content-agnostic.

A CFG should never contain empty blocks or orphan blocks (no parents). However, when the CFG is optimized, instructions and edges are removed, and some nodes might end up empty or orphans. This transient, stale state is fine AS LONG AS the client knows what it is doing and removes the blocks as soon as they're done, before passing it down the processing chain. This method removes such blocks and updates the edges of connected and connecting blocks.

Caveats:

  • 1st caveat: the block's out-degree is one, OR the block's in-degree is one or zero. (Otherwise, stitching up the edges would not be possible.)
  • 2nd caveat: duplicate edges can be introduced. The client should remove them if it doesn't like that.
  • 3rd caveat: the fixtures update is instruction-agnostic. This means that the client is solely responsible regarding the decision to remove a block. For instance, the last instruction of the predecessor block should be checked and the client should make sure that removing the block and updating the fixtures is an operation compatible with the semantics of that last instruction.

Data flow analysis: not used; invalidation depends on keepDFA parameter.

Parameters
b block to be removed
keepDFA if true the DFA is considered still valid after the removal (BE CAUTIOUS!!)

public InsnType replaceInstruction (long address, InsnType insn)

Replace an instruction and invalidate the DFA. An existing instruction must exist at the provided address. No adjustment is done (eg, the replaced instruction size is not modified). It is up to the caller to adjust the instruction in order to keep a valid CFG obect.

Parameters
address the target address
insn the instruction that will replace the instruction at the provided address
Returns
  • null on error, else the instruction that was replaced

public InsnType replaceInstruction (long address, InsnType insn, boolean keepDFA)

Replace an instruction. An existing instruction must exist at the provided address. No adjustment is done (eg, the replaced instruction size is not modified). It is up to the caller to adjust the instruction in order to keep a valid CFG obect.

Parameters
address the target address
insn the instruction that will replace the instruction at the provided address
keepDFA whether or not existing data flow analysis should be invalidated upon a successful replacement
Returns
  • null on error, else the instruction that was replaced

public IVariableInformationProvider setVariableInformationProvider (IVariableInformationProvider prv)

public CFG<InsnType> shallowCopy (boolean copyBlockReferences)

Shallow duplication of a CFG:
- Block references are optionally copied
- DFA data is not copied

Returns
  • a new CFG

public int simplify (boolean mergeOnCalls, boolean mergeOnJumps, boolean mergeNonConsecutive)

Merge the blocks that can be merged, without changing the flow of the graph.

Necessary conditions include: adjacent blocks, first block falls thru the second one (possibly through two duplicate edges), no block shall have irregular outputs, and only the first block may have irregular inputs.

Data flow analysis: not used; invalidated (if simplifications were performed).

This method uses getBreakingFlow(long) and getRoutineCall(long) to determine block ends.

Parameters
mergeOnCalls if true, this optimization will merge blocks ending by a call to a sub-routine
mergeOnJumps if true, this optimization will merge blocks ending by a simple branching instruction
mergeNonConsecutive allow to merge non strictly consecutive blocks. There are some conditions to respect: the order must be correct, there must NOT exist a block between two consecutive blocks AND the second block must have no outblock. Be careful that there may remain a non-sense jump instruction and it must be necessary to add a jump to previously fallthrough address (caller has to manage it).
Returns
  • number of mergers performed

public int size ()

Get the number of blocks.

public String toString ()