java.lang.Object | |
↳ | com.pnfsoftware.jeb.core.units.code.asm.cfg.CFG<InsnType extends com.pnfsoftware.jeb.core.units.code.IInstruction> |
This class represents a Control Flow Graph for a method (routine) or any body of code. The CFG
can be typed to handle specific instructions
.
This class provides basic Data Flow Analysis support. A client can request simple and full du- and ud-chains. Simple chains are for intra-block use and reference instructions by index; Full chains contain precise cross-block information and reference instructions by address.
There are two ways to build a CFG:
CFG(long, List)
and providing a list of pre-parsed basic
blocks
.CFG(List, List)
and providing a flat list of instructions.Constants | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
int | FLAG_ALLOW_ARTIFICIAL_BLOCK_END | This flag is used to indicate that blocks may legally end on instructions that are not terminators. | |||||||||
int | FLAG_ALLOW_UNREACHABLE_BLOCKS | This flag is used to indicate that the CFG may legally contain blocks not reachable from the entry-point. | |||||||||
int | FLAG_SUBROUTINE_CALL_NOT_BREAKING | This flag is used to specify that call-to-subroutine statements are not basic block
terminators. |
Fields | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
public int | dfaTotalCount | ||||||||||
public long | dfaTotalTimeMs |
Public Constructors | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
CFG(long entry, List<BasicBlock<InsnType>> blocks)
Create a standard CFG using a pre-constructed list of basic blocks.
| |||||||||||
CFG(List<? extends InsnType> insns, List<IrregularFlowData> irrdata)
Convenience constructor.
| |||||||||||
CFG(List<? extends InsnType> insns, List<IrregularFlowData> irrdata, IInstructionAugmenter augmenter, long base, long entry, int flags)
Create a CFG by recursively processing a list of sequential instructions.
| |||||||||||
CFG(Map<Long, InsnType> offsetToInsn, List<IrregularFlowData> irrdata, IInstructionAugmenter augmenter, long entry, int flags)
Create a CFG by recursively processing a collection of instructions.
|
Public Methods | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
void |
addBlock(BasicBlock<InsnType> b)
Add a block, does nothing else.
| ||||||||||
void |
addBlock(int index, BasicBlock<InsnType> b)
Insert a block, does nothing else.
| ||||||||||
boolean |
addEdge(BasicBlock<InsnType> x, BasicBlock<InsnType> y)
TODO: rename to connect() and SHOULD NOT USE.
| ||||||||||
int |
addEdge(BasicBlock<InsnType> x, BasicBlock<InsnType> y, int index)
Add a regular edge X->Y.
| ||||||||||
int | addIrregularEdge(BasicBlock<InsnType> x, BasicBlock<InsnType> y, int index) | ||||||||||
boolean |
addIrregularEdge(BasicBlock<InsnType> x, BasicBlock<InsnType> y)
TODO: Rename to connectIrregular() and SHOULD NOT USE.
| ||||||||||
Iterable<AddressableInstruction<InsnType>> |
addressableInstructions()
Get an instruction-with-address iterator.
| ||||||||||
IDFA<InsnType> |
createDataFlowAnalysisHelperObject()
Create a fresh DFA helper object.
| ||||||||||
void |
deleteAllEdges()
Remove all block edges.
| ||||||||||
boolean |
deleteDuplicateEdge(BasicBlock<InsnType> x, BasicBlock<InsnType> y)
Delete one duplicate edge X->Y.
| ||||||||||
int |
deleteEdge(BasicBlock<InsnType> x, BasicBlock<InsnType> y, int xToYPos)
Delete an edge x->y.
| ||||||||||
boolean |
deleteEdge(BasicBlock<InsnType> x, BasicBlock<InsnType> y)
Delete an edge x->y.
| ||||||||||
int | deleteEdges(BasicBlock<InsnType> x, BasicBlock<InsnType> y) | ||||||||||
int |
deleteIrregularEdge(BasicBlock<InsnType> x, BasicBlock<InsnType> y, int xToYPos)
Delete an irregular edge x->y.
| ||||||||||
boolean |
deleteIrregularEdge(BasicBlock<InsnType> x, BasicBlock<InsnType> y)
Delete an irregular edge x->y.
| ||||||||||
int | deleteIrregularEdges(BasicBlock<InsnType> x, BasicBlock<InsnType> y) | ||||||||||
void |
deleteIrregularOutEdges(BasicBlock<InsnType> b)
Delete all irregular output edges for the provided block.
| ||||||||||
void |
deleteOutEdges(BasicBlock<InsnType> b)
Delete all regular output edges for the provided block.
| ||||||||||
IDFA<InsnType> |
doDataFlowAnalysis(boolean redo, int varCollectionFlags)
Perform data flow analysis on the CFG.
| ||||||||||
IDFA<InsnType> |
doDataFlowAnalysis()
Perform data flow analysis on the CFG using standard DFA settings.
| ||||||||||
IDFA<InsnType> |
doDataFlowAnalysis(boolean redo)
Perform data flow analysis on the CFG.
| ||||||||||
IDFA<InsnType> |
doDataFlowAnalysis(boolean redo, int varCollectionFlags, boolean integrateCalculatedInputRegisters)
Perform data flow analysis on the CFG.
| ||||||||||
AddressableInstruction<InsnType> |
findInstruction(InsnType insn)
Locate an instruction.
| ||||||||||
String |
format()
Format this CFG as an assembly-code listing.
| ||||||||||
String |
formatInstructions()
Format this CFG as an assembly-code listing.
| ||||||||||
BasicBlock<InsnType> |
get(int index)
Retrieve a basic block.
| ||||||||||
Map<Long, BasicBlock<InsnType>> |
getAddressBlockMap()
Get a complete map of the basic blocks and their addresses in the CFG.
| ||||||||||
AddressableInstruction<InsnType> | getAddressableInstruction(long address) | ||||||||||
BasicBlock<InsnType> | getBlock(int index) | ||||||||||
BasicBlock<InsnType> |
getBlockAt(long address)
Get the basic block that starts at the provided address or offset.
| ||||||||||
BasicBlock<InsnType> | getBlockByLastAddress(long lastAddress) | ||||||||||
BasicBlock<InsnType> |
getBlockContaining(long address)
Get the basic block that contains the provided address.
| ||||||||||
BasicBlock<InsnType> |
getBlockEndingAt(long address)
Get the basic block that ends on the provided address.
| ||||||||||
List<BasicBlock<InsnType>> |
getBlocks()
Get a copy of the block list of the CFG.
| ||||||||||
List<BasicBlock<InsnType>> |
getBlocksView()
Get a read-only view of the list of blocks for this CFG.
| ||||||||||
List<IDFA<InsnType>> |
getCurrentDFAs()
Retrieve a list of all current data flow analyses.
| ||||||||||
int |
getDFADefaultCollectionFlags()
Note that the initial value is set to
STANDARD_COLLECTION_FLAGS . | ||||||||||
IDFA<InsnType> |
getDataFlowAnalysis()
Retrieve a valid DFA object done with standard parameters (conservative, with inputs, no
copies).
| ||||||||||
IDFA<InsnType> |
getDataFlowAnalysis(int varCollectionFlags, boolean integrateCalculatedInputRegisters)
Retrieve a valid DFA object matching the provided parameters.
| ||||||||||
int |
getEffectiveSize()
Calculate the 'effective' size of this CFG, that is, the sum of the size of each basic block.
| ||||||||||
long |
getEndAddress()
Routine highest address (exclusive).
| ||||||||||
long |
getEntryAddress()
Routine entry-point address.
| ||||||||||
BasicBlock<InsnType> |
getEntryBlock()
Get the entry block.
| ||||||||||
List<BasicBlock<InsnType>> |
getExitBlocks()
Get the ordered list of exit blocks.
| ||||||||||
long |
getFirstAddress()
Routine lowest address (inclusive).
| ||||||||||
int |
getFlags()
Get the CFG flags.
| ||||||||||
List<Couple<Long, Long>> |
getGaps()
Get a list of gaps found in the routine CFG.
| ||||||||||
void |
getGraphRepresentation(List<int[]> edges, List<int[]> irregularEdges)
Get the a graph representation of the CFG.
| ||||||||||
InsnType |
getInstruction(long address)
Get the instruction located at the exact address.
| ||||||||||
int |
getInstructionCount()
Get the total number of instructions in the CFG.
| ||||||||||
Couple<BasicBlock<InsnType>, Integer> |
getInstructionLocation(long address)
Locate an instruction.
| ||||||||||
Couple<BasicBlock<InsnType>, Integer> |
getInstructionLocation(InsnType insn)
Locate an instruction.
| ||||||||||
List<InsnType> |
getInstructions()
Get the instruction list of this CFG by aggregating each instruction of every block.
| ||||||||||
Map<Long, InsnType> |
getInstructionsMap()
Get the map of addresses to instructions that compose this CFG.
| ||||||||||
long |
getLastAddress()
Routine highest address (inclusive).
| ||||||||||
TimedOperationVerifier | getTimedOperationVerifier() | ||||||||||
IVariableInformationProvider |
getVariableInformationProvider()
Retrieve the current variable information provider.
| ||||||||||
boolean |
hasExit()
Determine if this CFG has exit blocks, that is, blocks without out-edges.
| ||||||||||
boolean |
hasNoExit()
Determine if this CFG does not have any exit block.
| ||||||||||
int | indexOf(BasicBlock<InsnType> b) | ||||||||||
Iterable<Couple<Integer, BasicBlock<InsnType>>> |
indexedBlocks()
Get an indexed-block iterator.
| ||||||||||
Iterable<InsnType> |
instructions()
Get an instruction iterator.
| ||||||||||
void |
invalidateDataFlowAnalysis(long addressOfInstructionChange)
Partially invalidate data flow analyses.
| ||||||||||
void |
invalidateDataFlowAnalysis()
Invalidate all previously performed data flow analyses.
| ||||||||||
boolean |
isDFADefaultIntegrateInputs()
Note that the initial value is set to
STANDARD_INTEGRATE_INPUTS . | ||||||||||
Iterator<BasicBlock<InsnType>> |
iterator()
Get a block iterator.
| ||||||||||
int |
reconnectEdge(BasicBlock<InsnType> x, BasicBlock<InsnType> y, BasicBlock<InsnType> z, Integer xToYPos)
This method reconnects a block x from y to z, i.e.
| ||||||||||
int |
reconnectEdge(BasicBlock<InsnType> x, BasicBlock<InsnType> y, BasicBlock<InsnType> z)
This method reconnects a block x from y to z, i.e.
| ||||||||||
int |
reconnectIrregularEdge(BasicBlock<InsnType> x, BasicBlock<InsnType> y, BasicBlock<InsnType> z)
This method irregularly reconnects a block x from y to z, i.e.
| ||||||||||
int |
reconnectIrregularEdge(BasicBlock<InsnType> x, BasicBlock<InsnType> y, BasicBlock<InsnType> z, Integer xToYPos)
This method irregularly reconnects a block x from y to z, i.e.
| ||||||||||
int | reconnectIrregularEdges(BasicBlock<InsnType> x, BasicBlock<InsnType> y, BasicBlock<InsnType> z) | ||||||||||
void |
removeBlock(BasicBlock<InsnType> b)
Remove a block, and update the fixtures.
| ||||||||||
int |
removeUnreachableBlocks(boolean forced)
Remove all blocks not reachable from the entry-point.
| ||||||||||
int |
removeUnreachableBlocks()
Remove all blocks not reachable from the entry-point.
| ||||||||||
InsnType |
replaceInstruction(long address, InsnType insn)
Replace an instruction.
| ||||||||||
boolean |
replaceInstructionsInBlock(long address, int cnt, Collection<InsnType> insns)
Replace a sequence of instructions contained in a single basic block.
| ||||||||||
int |
setDFADefaultCollectionFlags(int collectionFlags)
Note that the initial value is set to
STANDARD_COLLECTION_FLAGS . | ||||||||||
boolean |
setDFADefaultIntegrateInputs(boolean integrateInputs)
Note that the initial value is set to
STANDARD_INTEGRATE_INPUTS . | ||||||||||
void |
setTimedOperationVerifier(TimedOperationVerifier tov)
Set an optional timer verifier that will be checked during long operations, including DFA
computations.
| ||||||||||
IVariableInformationProvider |
setVariableInformationProvider(IVariableInformationProvider prv)
Set the variable information provider.
| ||||||||||
CFG<InsnType> |
shallowCopy(boolean copyBlockReferences)
Perform a shallow duplication of a CFG:
- Block references are optionally copied - DFA data is not copied | ||||||||||
int |
simplify()
Merge consecutive blocks that can be safely merged.
| ||||||||||
int |
simplify(boolean mergeOnCalls, boolean mergeOnJumps, boolean mergeNonConsecutive)
Usage of this function is not recommended.
| ||||||||||
int |
size()
Get the number of blocks.
| ||||||||||
BasicBlock<InsnType> |
splitBlock(BasicBlock<InsnType> b, int index)
Split a block into two blocks.
| ||||||||||
String | toString() |
[Expand]
Inherited Methods | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
From class
java.lang.Object
| |||||||||||
From interface
com.pnfsoftware.jeb.core.units.code.IControlFlowGraph
| |||||||||||
From interface
java.lang.Iterable
|
This flag is used to indicate that blocks may legally end on instructions that are not terminators.
This flag is used to indicate that the CFG may legally contain blocks not reachable from the entry-point.
This flag is used to specify that call-to-subroutine
statements are not basic block
terminators.
Create a standard CFG using a pre-constructed list of basic blocks.
Convenience constructor. Same as
CFG(insns, irrdata, null, 0,
0, 0)
.
insns | list of sequential instructions; they must be contiguous (no gap), which means that the first instruction is at address `base`, the second at `base`+sizeofFirstInstruction, etc. |
---|---|
irrdata | irregular control flow information |
Create a CFG by recursively processing a list of sequential instructions.
Delay-slot instruction sets are not supported by this constructor.
insns | list of sequential instructions; they must be contiguous (no gap), which means that the first instruction is at address `base`, the second at `base`+sizeofFirstInstruction, etc. |
---|---|
irrdata | irregular control flow information |
augmenter | optional instruction augmenter |
base | routine base address |
entry | routine entry-point address |
flags | parsing flags, see FLAG_xxx constants
|
Create a CFG by recursively processing a collection of instructions.
Delay-slot instruction sets are not supported by this constructor.
offsetToInsn | map of address-to-instructions |
---|---|
irrdata | irregular control flow information |
augmenter | optional instruction augmenter |
entry | routine entry-point address |
flags | parsing flags, see FLAG_xxx constants
|
Add a block, does nothing else. The block is not connected, it is up to client code to connect it.
Insert a block, does nothing else. The block is not connected, it is up to client code to connect it.
TODO: rename to connect() and SHOULD NOT USE. This method does nothing if such an edge already exists.
Add a regular edge X->Y.
x | source block |
---|---|
y | destination block |
TODO: Rename to connectIrregular() and SHOULD NOT USE.
Get an instruction-with-address iterator. Also see instructions()
.
remove
)
Create a fresh DFA helper object. Clients should use doDataFlowAnalysis()
methods
instead of using this object directly.
Remove all block edges.
Delete one duplicate edge X->Y.
x | source block |
---|---|
y | destination block |
Delete an edge x->y.
Delete an edge x->y. If duplicates exist, the first one in the list of out-edges deleted.
x | source block |
---|---|
y | destination block |
Delete an irregular edge x->y.
Delete an irregular edge x->y. If duplicates exist, the first one in the list of irrout-edges is deleted.
x | source block |
---|---|
y | destination block |
Delete all irregular output edges for the provided block.
b | block |
---|
Delete all regular output edges for the provided block.
b | block |
---|
Perform data flow analysis on the CFG. Default settings are used if not specified.
redo | if true, force a new analysis even if one is not required |
---|---|
varCollectionFlags | variable collection flags |
Perform data flow analysis on the CFG using standard DFA settings. Input registers are integrated into the ud-chains. If possible, the analysis is conservative. Variable-copies are cleaned.
This method does not force a new analysis unless the settings of the last valid analysis do
not match the required settings: it is the same as invoking
doDataFlowAnalysis(false)
.
Perform data flow analysis on the CFG. Default settings are used if not specified.
redo | if true, force a new analysis even if one is not required |
---|
Perform data flow analysis on the CFG.
redo | if true, force a new analysis even if one is not required |
---|---|
varCollectionFlags | variable collection flags |
integrateCalculatedInputRegisters | if true, the live registers determined after analysis will be integrated in the use-def chains |
Format this CFG as an assembly-code listing.
Consider using a CFGFormatter
instead of this method.
Format this CFG as an assembly-code listing.
Consider using a CFGFormatter
instead of this method.
Get a complete map of the basic blocks and their addresses in the CFG. This method is recommended for use when multiple, repeated invocations of #getBlockAt(int) are to be made.
Get the basic block that starts at the provided address or offset.
address | the block address/offset |
---|
Get the basic block that contains the provided address.
Note that the address just needs to be in the block address range; it does not need to point to the beginning of an instruction within the block
address | an address within the block |
---|
Get the basic block that ends on the provided address.
address | wanted block end address (exclusive) |
---|
Get a copy of the block list of the CFG. The list is ordered by ascending block address. Modifying the list does not impact the CFG.
Get a read-only view of the list of blocks for this CFG. The list is ordered by ascending block address.
Note that the initial value is set to STANDARD_COLLECTION_FLAGS
.
Retrieve a valid DFA object done with standard parameters (conservative, with inputs, no copies).
Retrieve a valid DFA object matching the provided parameters.
varCollectionFlags | variable collection flags |
---|---|
integrateCalculatedInputRegisters | if true, the live registers determined after analysis will be integrated in the use-def chains |
Calculate the 'effective' size of this CFG, that is, the sum of the size of each basic block.
Routine highest address (exclusive).
Routine entry-point address. Note that this address may not be the lowest one in the CFG.
Get the entry block. The entry block is unique.
Get the ordered list of exit blocks. A CFG may have zero or more exit blocks; CFG representing standard routines will have at least one (generally one) exit block.
Routine lowest address (inclusive). Note that this address may not be the entry-point address for this routine.
Get the CFG flags. See FLAG_xxx
in this class.
Get a list of gaps found in the routine CFG. Gaps are memory spaces between blocks that do not belong to the CFG itself.
Optimizing compilers and obfuscators can produce routines with gaps.
Get the a graph representation of the CFG. The list of edges use a 1-based node numbering scheme.
edges | (output) array of regular edges, eg: {{1,2},{1,3},{2,3}} |
---|---|
irregularEdges | (output) array of irregular edges |
Get the instruction located at the exact address.
Get the total number of instructions in the CFG. This method sums the number of instructions of each block of the CFG.
Locate an instruction.
address | instruction address |
---|
Locate an instruction.
Get the instruction list of this CFG by aggregating each instruction of every block. The list is ordered by ascending address/offset.
Get the map of addresses to instructions that compose this CFG.
Routine highest address (inclusive).
Retrieve the current variable information provider.
Determine if this CFG has exit blocks, that is, blocks without out-edges.
Determine if this CFG does not have any exit block.
Get an indexed-block iterator.
Get an instruction iterator. Also see addressableInstructions()
.
remove
)
Partially invalidate data flow analyses. This method can be used after modifying an instruction of the CFG.
addressOfInstructionChange | address |
---|
Invalidate all previously performed data flow analyses. This method can be used after a CFG modification that would render DFA objects obsoletes.
Note that the initial value is set to STANDARD_INTEGRATE_INPUTS
.
Get a block iterator.
remove
)
This method reconnects a block x from y to z, i.e. x->y becomes x->z. If z is null, this method deletes the edge x>y.
x | source block |
---|---|
y | original destination block |
z | new destination block; null to indicate that the edge x>y should be deleted |
xToYPos | optional index specifying which x>y edge (if there are more than one) should be picked (this index is not an index into the full out-list of x); if null, this method will fail on duplicate edges (see return codes); if non-null, this method can introduce duplicate edges |
This method reconnects a block x from y to z, i.e. x->y becomes x->z. Duplicates edges are forbidden.
See reconnectEdge(BasicBlock, BasicBlock, BasicBlock, Integer)
.
This method irregularly reconnects a block x from y to z, i.e. x->y becomes x->z. Duplicates edges are forbidden.
See reconnectIrregularEdge(BasicBlock, BasicBlock, BasicBlock, Integer)
.
This method irregularly reconnects a block x from y to z, i.e. x->y becomes x->z. If z is null, this method deletes the irregular edge x>y.
x | source block |
---|---|
y | original destination block |
z | new destination block; null to indicate that the irregular edge x>y should be deleted |
xToYPos | optional index specifying which x>y irregular edge (if there are more than one) should be picked (this index is not an index into the full irrout-list of x); if null, this method will fail on duplicate irregular edges (see return codes) |
Remove a block, and update the fixtures. This method is content-agnostic.
A CFG should never contain empty blocks or orphan blocks (no parents). However, when the CFG is optimized, instructions and edges are removed, and some nodes might end up empty or orphans. This transient, stale state is fine AS LONG AS the client knows what it is doing and removes the blocks as soon as they're done, before passing it down the processing chain. This method removes such blocks and updates the edges of connected and connecting blocks.
Caveats:
b | block to be removed |
---|
Remove all blocks not reachable from the entry-point.
If the CFG is composed of IResizableInstruction
, instruction sizes may be adjusted to
avoid the introduction of gaps between blocks.
forced | if true, unreachable blocks will be removed even if the CFG's flags specify that unreachable blocks are allowed |
---|
Remove all blocks not reachable from the entry-point.
Replace an instruction. An existing instruction must exist at the provided address. No adjustment is done (eg, the replaced instruction size is not modified). It is up to the caller to adjust the instruction in order to keep a valid CFG object.
address | the target address |
---|---|
insn | the instruction that will replace the instruction at the provided address |
Replace a sequence of instructions contained in a single basic block.
address | start address for replacement |
---|---|
cnt | number of instructions to be replaced |
insns | new instructions; the total size must be equal to the size of the instructions being replaced |
Note that the initial value is set to STANDARD_COLLECTION_FLAGS
.
Note that the initial value is set to STANDARD_INTEGRATE_INPUTS
.
Set an optional timer verifier that will be checked during long operations, including DFA computations.
Set the variable information provider.
The provider is used by two components:
- The data flow analysis components: the provider is used to determine which variables are
'copies' of other variables, and can selectively block their propagation to avoid cluttering
live-var data and reaching-var data (especially imporntant to have clean reaching -output
information)
- The CFG formatter: when rendering a CFG with data chains: variables are rendered using
their names instead of their ids
prv | Optional provider |
---|
Perform a shallow duplication of a CFG:
- Block references are optionally copied
- DFA data is not copied
Merge consecutive blocks that can be safely merged.
Usage of this function is not recommended.
Merge the blocks that can be merged, without changing the flow of the graph. WATCH OUT! When merging with mergeOnCalls or mergeOnJumps set to true, the branching instructions are not removed from the merged blocks. It is the responsibility of the client to remove unnecessary branching instructions in those cases.
Necessary conditions include: adjacent blocks, first block falls thru the second one (possibly through two duplicate edges), no block shall have irregular outputs, and only the first block may have irregular inputs.
Data flow analysis: not used; invalidated (if simplifications were performed).
This method uses getBreakingFlow(long)
and
getRoutineCall(long)
to determine block ends.
mergeOnCalls | if true, this optimization will merge blocks ending by a call to a sub-routine |
---|---|
mergeOnJumps | if true, this optimization will merge blocks ending by a simple branching instruction |
mergeNonConsecutive | allow to merge non strictly consecutive blocks. There are some conditions to respect: the order must be correct, there must NOT exist a block between two consecutive blocks AND the second block must have no outblock. Be careful that there may remain a non-sense jump instruction and it must be necessary to add a jump to previously fallthrough address (caller has to manage it). |
Get the number of blocks.
Split a block into two blocks.
b | block to split |
---|---|
index | index of the "split instruction" in the block; that instruction will be the first instruction of the newly-created block |