Class DUtil
dexdec IR access and manipulation utility methods.
Constructor Summary
Constructors
DUtil()
Method Summary
Modifier and Type, Method: Description
static int calculateComplexity: Calculate the trivial complexity metric for the provided IR expression.
static boolean canHandlerCatchException(IDMethodContext ctx, IDExceptionHandler h, String exceptionSig)
static boolean canThrow(BasicBlock<IDInstruction> b, int from): Determine whether the trailing instructions of a block can throw.
static boolean canThrow(BasicBlock<IDInstruction> b, int from, int to): Determine whether a provided range of instructions of a block can throw.
static boolean checkBlock(BasicBlock<IDInstruction> b, DOpcodeType... opcodes): Check for instruction matching in the provided block.
static int checkSequence(CFG<IDInstruction> cfg, int blkindex, DOpcodeType... opcodes): Check for instruction matching in a sequence of blocks.
static int cleanGraph: Clean the graph of an IR method context.
collectUniqueVarIds: Collect the unique list of variable ids making up this expression, including the expression itself if it is an IDVar.
collectVars: Collect all the variables making up this expression, including the expression itself if it is a variable.
static CFG<IDInstruction> copyGraph(CFG<IDInstruction> cfg, boolean deepCopy, IDMethodContext newContext): Copy an IR-CFG.
static int countVariable(IDExpression e, IDVar var): Count the number of occurrences of a given variable in the provided IR expression.
static int createRegisterVarId(int registerBase, boolean dualSlot): Create the primary variable mapping to a register or pair of registers.
static Collection<BasicBlock<IDInstruction>> determineInterval(BasicBlock<IDInstruction> hdrblk): Find the graph interval rooted in the provided header block.
static void dump(CFG<IDInstruction> cfg, String filename): Dump an IR-CFG to a Graphviz dot file in the user's temporary folder.
static void dump(CFG<IDInstruction> cfg, String filename, String title): Dump an IR-CFG to a Graphviz dot file in the user's temporary folder.
static String formatVarId(int varid): Generate the standard variable name associated with the provided variable id.
static String formatVarIds(Collection<Integer> varids): A convenience method to format a collection of variable ids into a comma-separated string.
generateBlockOffsetMap: Generate an IR-to-native offset conversion map.
static String generateNativeAddress(IDMethodContext ctx, IDExpression elt)
static ICodeCoordinates generateNativeCoordinates
static ContextAccessType getCAT(IDInstruction insn, boolean inclAssigDst): Determine the aggregated context type of an instruction.
static String getExceptionSignature
static Collection<BasicBlock<IDInstruction>> getReachableBlocks: Get the list of reachable blocks of the provided IR-CFG.
static boolean hasInvokeInfo: Determine whether the IR expression makes invocations (general invocations, new, new-array).
static boolean hasVariable(IDExpression e, int varid): Determine whether the provided expression is or contains the provided variable.
static boolean hasVariables: Determine whether the provided expression contains variables.
static void insertHeaderBlock(IDMethodContext ctx, int reqInsnCount, int reqInsnSize): Insert a header block filled with NOP instructions.
static boolean isDoubleSlotVarId(int varid)
static boolean isImmNonZero(IDExpression exp): Determine if an expression is an IR immediate not holding the value 0.
static boolean isImmValue(IDExpression exp, long value): Determine if an expression is an IR immediate holding the provided value.
static boolean isImmZero(IDExpression exp): Determine if an expression is an IR immediate holding the value 0.
static boolean isLegalVarId(int varid)
static boolean isProtectedByCatchAll: Determine whether the provided block is protected by a catch-all (catching Throwable).
static boolean isRegisterVarId(int varid)
static boolean isSingleSlotVarId(int varid)
static boolean isUsingCaughtException
static boolean isVirtualVarId(int varid)
static boolean mayBeFinal(IDField f, IDexUnit dex): Determine whether the provided field is or appears to be final.
static boolean mergeBlocks(IDMethodContext ctx, BasicBlock<IDInstruction> firstBlock): Merge two blocks.
static boolean modifyInstructionSize(IDMethodContext ctx, IDInstruction targetInstruction, int minWantedSize): Modify the size of an instruction within a CFG.
static int modifyInstructionSizes(IDMethodContext ctx, Function<IDInstruction, Integer> modifier): Update the instruction sizes of a CFG.
static IDInstruction nextInstruction(CFG<IDInstruction> cfg, BasicBlock<IDInstruction> b, int i): Determine the meaningful next instruction in the execution flow, assuming no exception was raised.
static int normalizeGraph: Normalize the IR by resetting all instructions to the unit size (1).
static int normalizeGraph(IDMethodContext ctx, int wantedInstructionSize): Normalize the IR by resetting all instructions to the required size.
static int parseVarIdFromStandardName(String varname, boolean softFail): Parse a standard variable name (as generated by formatVarId(int)) and retrieve the associated variable id.
static int removeGaps(CFG<IDInstruction> cfg): Remove all gaps from the provided IR-CFG.
static boolean removeInstruction(BasicBlock<IDInstruction> b, int idx): Remove an instruction in a block.
static int removeUnreachableBlocks(CFG<IDInstruction> cfg, IDTryData ex): Remove unreachable blocks from an IR-CFG.
static int removeUnreachableBlocks: Remove unreachable blocks from an IR-CFG.
static boolean removeUnreachableTrampoline(CFG<IDInstruction> cfg, BasicBlock<IDInstruction> b): Remove an unreachable trampoline block (ending on a JCOND or JUMP).
static boolean replaceInExpression(IDExpression e, IDExpression target, IDExpression repl): Replace an IR expression located inside another IR.
static int simplifyJCondsAndSwitches: Simplify DOpcodeType.JCOND and DOpcodeType.IR_SWITCH IR instructions:
- JCOND: the target branch must not point to the fallthrough instruction; if it does, the JCOND must be replaced by a JUMP (while taking care of potential side-effects introduced by the evaluation of the JCOND predicate)
- SWITCH: the case targets must not point to the fallthrough instruction; if any does, the case entry must be removed (and will be taken care of by the switch's default entry)
static BasicBlock<IDInstruction> splitBlock(IDMethodContext ctx, BasicBlock<IDInstruction> blk, int index): Split a block of the context's CFG into two blocks.
static boolean unprotectBlock(CFG<IDInstruction> cfg, IDTryData exdata, int blkAddress): Unprotect a basic block.
static List<IDInstruction> unroll(IDMethodContext ctx, IDVar pivotVar, long addrStart, long addrEnd, int maxProcessedInsnCount, int maxUnrolledInsnCount, int maxLoopIterCnt): Attempt to unroll some code whose pivot is a single variable.
static int updateTargets(IDMethodContext ctx, Map<Integer, Integer> oldToNewOffsets)
static boolean usesReferences: Determine whether the IR expression uses references, defined as any of: accessing fields, accessing array elements, creating arrays.
static void verifyGraph: Verify an IR-CFG from a method context.
static void verifyGraph(IDMethodContext ctx, CFG<IDInstruction> cfg, IDTryData exdata, String dumpFilename, String info): Verify an IR-CFG.
static void verifyGraph(IDMethodContext ctx, String info): Verify an IR-CFG from a method context.
static void verifyGraph(IDMethodContext ctx, String dumpFilename, String info): Verify an IR-CFG.
static int verifyUnicity(CFG<IDInstruction> cfg): Verify IR unicity within a full graph.
static int verifyUnicity: Verify IR unicity within an expression.
-
Constructor Details
-
DUtil
public DUtil()
-
-
Method Details
-
createRegisterVarId
public static int createRegisterVarId(int registerBase, boolean dualSlot)
Create the primary variable mapping to a register or pair of registers. A primary variable has an id that can be mapped directly to the underlying Dalvik register id(s).
Parameters:
registerBase - the register id of the Dalvik variable, or the first register of the pair if the variable is dual-slot; must be in [0, 0xFFFF] (at most 0xFFFE for a dual-slot variable)
dualSlot
-
isRegisterVarId
public static boolean isRegisterVarId(int varid)
-
isVirtualVarId
public static boolean isVirtualVarId(int varid)
-
isLegalVarId
public static boolean isLegalVarId(int varid)
-
isSingleSlotVarId
public static boolean isSingleSlotVarId(int varid)
-
isDoubleSlotVarId
public static boolean isDoubleSlotVarId(int varid)
-
formatVarId
Generate the standard variable name associated with the provided variable id.
Parameters:
varid - a variable id
Returns:
the variable name; on error, the method throws IllegalArgumentException
-
parseVarIdFromStandardName
Parse a standard variable name (as generated by formatVarId(int)) and retrieve the associated variable id.
Parameters:
varname - a variable name
softFail - if true, return -1 on error; else, the method will throw IllegalArgumentException on error
Returns:
the id
-
formatVarIds
A convenience method to format a collection of variable ids into a comma-separated string.
Parameters:
varids - collection of variable ids
Returns:
a string
-
replaceInExpression
Replace an IR expression located inside another IR. This method is a deep version of IDExpression.replaceSubExpression(IDExpression, IDExpression). Comparison is done by identity.
Parameters:
e - parent IR
target - the target to be replaced; should not be the same as `e`
repl - the replacement IR
Returns:
true if the target was successfully replaced
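To illustrate what "deep" and "by identity" mean here, the following is a minimal sketch on a hypothetical `Node` type. It is not the dexdec implementation, and `Node`, `replaceDeep`, and `demo` are illustrative names only. The search descends through all levels of the parent, and a candidate matches only when it is the very same object as the target (`==`), not merely an equal-looking one.

```java
import java.util.ArrayList;
import java.util.List;

final class DeepReplaceSketch {
    /** Hypothetical mutable expression node; a stand-in, not a dexdec type. */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();
        Node(String label, Node... kids) {
            this.label = label;
            children.addAll(List.of(kids));
        }
    }

    /** Replaces the first descendant of e that is the same object as target, at any depth. */
    static boolean replaceDeep(Node e, Node target, Node repl) {
        for (int i = 0; i < e.children.size(); i++) {
            Node c = e.children.get(i);
            if (c == target) {            // identity comparison, not equals()
                e.children.set(i, repl);
                return true;
            }
            if (replaceDeep(c, target, repl)) {
                return true;
            }
        }
        return false;
    }

    /** Builds (a * b) + c, then replaces the nested node a by x. */
    static boolean demo() {
        Node a = new Node("a");
        Node mul = new Node("*", a, new Node("b"));
        Node root = new Node("+", mul, new Node("c"));
        Node x = new Node("x");
        return replaceDeep(root, a, x) && mul.children.get(0) == x;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch, passing a fresh `new Node("b")` as the target finds nothing, because it is a different object from the `b` node stored in the tree.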
-
countVariable
Count the number of occurrences of a given variable in the provided IR expression.
Parameters:
e - an IR expression
var
-
usesReferences
Determine whether the IR expression uses references, defined as any of: accessing fields, accessing array elements, creating arrays.
Parameters:
e - an IR expression
-
collectVars
Collect all the variables making up this expression, including the expression itself if it is a variable. The returned collection may contain duplicates. Example: if the provided expression is ((var1 + var2 + 5) / var1), this method will return [var1, var2, var1]. Note that the collection is done in post-order, so ordering is predictable.
Parameters:
e - an IR expression
Returns:
a list of all vars used in the expression
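The post-order collection described above can be sketched on a small stand-in expression tree. This is only an illustration under assumed types: `Expr`, `Var`, `Imm`, and `Op` are hypothetical classes, not dexdec's IR classes, and the addition is modeled as nested binary operations.

```java
import java.util.ArrayList;
import java.util.List;

final class CollectVarsSketch {
    /** Hypothetical stand-in for an IR expression; not a dexdec type. */
    static abstract class Expr {
        /** Sub-expressions in left-to-right order (empty for leaves). */
        List<Expr> children() { return List.of(); }
    }

    static final class Var extends Expr {
        final String name;
        Var(String name) { this.name = name; }
    }

    static final class Imm extends Expr {
        final long value;
        Imm(long value) { this.value = value; }
    }

    static final class Op extends Expr {
        final String operator;
        final List<Expr> operands;
        Op(String operator, Expr... operands) {
            this.operator = operator;
            this.operands = List.of(operands);
        }
        @Override List<Expr> children() { return operands; }
    }

    /** Post-order walk: children first, then the node itself. Duplicates are kept. */
    static List<String> collectVars(Expr e) {
        List<String> out = new ArrayList<>();
        walk(e, out);
        return out;
    }

    private static void walk(Expr e, List<String> out) {
        for (Expr c : e.children()) {
            walk(c, out);
        }
        if (e instanceof Var) {
            out.add(((Var) e).name);
        }
    }

    /** Builds ((var1 + var2 + 5) / var1) and collects its variables. */
    static List<String> demo() {
        Expr sum = new Op("+", new Op("+", new Var("var1"), new Var("var2")), new Imm(5));
        return collectVars(new Op("/", sum, new Var("var1")));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [var1, var2, var1]
    }
}
```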
-
collectUniqueVarIds
Collect the unique list of variable ids making up this expression, including the expression itself if it is an IDVar.
Parameters:
e - an IR expression
Returns:
a set of variable ids used in the expression
-
calculateComplexity
Calculate the trivial complexity metric for the provided IR expression. Every IR element has a weight of 1.
Parameters:
e - an IR expression
Returns:
the complexity score
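A weight-of-1 metric amounts to counting the elements of the expression tree. The sketch below shows that idea on a hypothetical `Node` type. It assumes that operators, variables, and immediates each count as one element, which the description above suggests but does not spell out.

```java
import java.util.List;

final class ComplexitySketch {
    /** Hypothetical expression node; a stand-in, not a dexdec type. */
    static final class Node {
        final String label;
        final List<Node> children;
        Node(String label, Node... children) {
            this.label = label;
            this.children = List.of(children);
        }
    }

    /** Assumption: every node (operator, variable, immediate) contributes a weight of 1. */
    static int complexity(Node e) {
        int score = 1;
        for (Node c : e.children) {
            score += complexity(c);
        }
        return score;
    }

    /** Scores ((var1 + var2) + 5) / var1, which has 3 operators and 4 leaves. */
    static int demo() {
        Node sum = new Node("+", new Node("+", new Node("var1"), new Node("var2")), new Node("5"));
        return complexity(new Node("/", sum, new Node("var1")));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 7
    }
}
```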
-
hasInvokeInfo
Determine whether the IR expression makes invocations (general invocations, new, new-array).
Parameters:
e - an IR expression
-
hasVariables
Determine whether the provided expression contains variables.
Parameters:
e - an IR expression
-
hasVariable
Determine whether the provided expression is or contains the provided variable.
Parameters:
e - an expression
varid - the searched variable id
-
checkBlock
Check for instruction matching in the provided block. The block must be fully matched.
Parameters:
b
opcodes
-
checkSequence
Check for instruction matching in a sequence of blocks. Blocks must be fully contiguous and fully matched.
This is an extended version of checkBlock(BasicBlock, DOpcodeType...), which performs matching on a single block.
Parameters:
cfg
blkindex
opcodes
Returns:
the number of blocks containing the sequence; on failure
-
mayBeFinal
Determine whether the provided field is or appears to be final.
Parameters:
f
dex
-
cleanGraph
Clean the graph of an IR method context.
Returns:
the number of clean-up operations performed
-
removeGaps
Remove all gaps from the provided IR-CFG.
Gaps are illegal in an IR-CFG. They are tolerated during an optimization phase; however, an optimizer, or any code modifying the IR, must provide a legal IR to subsequent components.
Parameters:
cfg
Returns:
the number of gaps that were removed
-
removeUnreachableBlocks
Remove unreachable blocks from an IR-CFG. They are blocks that cannot be reached, regularly or irregularly, from the entry-point block (i.e. the first block, at offset 0).
Unreachable blocks are illegal in an IR-CFG. They are tolerated during an optimization phase; however, an optimizer, or any code modifying the IR, must provide a legal IR to subsequent components.
Parameters:
ctx
-
removeUnreachableBlocks
Remove unreachable blocks from an IR-CFG. They are blocks that cannot be reached, regularly or irregularly, from the entry-point block (i.e. the first block, at offset 0).
Unreachable blocks are illegal in an IR-CFG. They are tolerated during an optimization phase; however, an optimizer, or any code modifying the IR, must provide a legal IR to subsequent components.
Parameters:
cfg
ex
-
getReachableBlocks
Get the list of reachable blocks of the provided IR-CFG. They are the blocks that can be reached, regularly or irregularly, from the entry-point block (i.e. the first block, at offset 0).
Parameters:
cfg
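Reachability "regularly or irregularly" can be pictured as a graph traversal that follows both kinds of outgoing edges from the entry block. The sketch below is a generic breadth-first search over a hypothetical `Block` type with separate `regular` and `irregular` successor lists. Those names are assumptions made for illustration and do not reflect the actual CFG classes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class ReachabilitySketch {
    /** Hypothetical basic block; a stand-in, not a dexdec type. */
    static final class Block {
        final int offset;
        final List<Block> regular = new ArrayList<>();   // fall-through and jump edges
        final List<Block> irregular = new ArrayList<>(); // exception-handler edges
        Block(int offset) { this.offset = offset; }
    }

    /** Returns the offsets of all blocks reachable from the entry, in discovery order. */
    static List<Integer> reachableOffsets(Block entry) {
        Set<Block> seen = new LinkedHashSet<>();
        Deque<Block> work = new ArrayDeque<>();
        seen.add(entry);
        work.add(entry);
        while (!work.isEmpty()) {
            Block b = work.remove();
            for (Block s : b.regular) {
                if (seen.add(s)) work.add(s);
            }
            for (Block s : b.irregular) {
                if (seen.add(s)) work.add(s);
            }
        }
        List<Integer> offsets = new ArrayList<>();
        for (Block b : seen) {
            offsets.add(b.offset);
        }
        return offsets;
    }

    static List<Integer> demo() {
        Block b0 = new Block(0), b1 = new Block(1), b2 = new Block(2), b3 = new Block(3);
        b0.regular.add(b1);
        b1.irregular.add(b3); // b3 acts as a handler for b1
        b2.regular.add(b3);   // b2 has an outgoing edge, but nothing leads to it
        return reachableOffsets(b0);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [0, 1, 3]
    }
}
```

Here block 2 is the kind of block the removal methods above would discard: it has successors but no path from the entry block.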
-
simplifyJCondsAndSwitches
Simplify DOpcodeType.JCOND and DOpcodeType.IR_SWITCH IR instructions:
- JCOND: the target branch must not point to the fallthrough instruction; if it does, the JCOND must be replaced by a JUMP (while taking care of potential side-effects introduced by the evaluation of the JCOND predicate)
- SWITCH: the case targets must not point to the fallthrough instruction; if any does, the case entry must be removed (and will be taken care of by the switch's default entry)
Illegal JCOND/SWITCH are tolerated during an optimization phase; however, an optimizer, or any code modifying the IR, must provide a legal IR to subsequent components.
Parameters:
ctx
-
getExceptionSignature
Parameters:
ctx
h
-
canHandlerCatchException
public static boolean canHandlerCatchException(IDMethodContext ctx, IDExceptionHandler h, String exceptionSig)
Parameters:
ctx
h
exceptionSig
-
unprotectBlock
Unprotect a basic block. The CFG is adjusted and cleaned afterward.
Parameters:
cfg
exdata
blkAddress - block address
Returns:
true if the block was protected and is now unprotected; false if the block was not protected
-
getCAT
Determine the aggregated context type of an instruction.
Parameters:
insn - instruction
inclAssigDst - if true and the instruction is an assignment, the destination is included when calculating the CAT; else, only the right-hand side is considered
Returns:
aggregated CAT
-
isUsingCaughtException
Parameters:
ctx
b
-
generateNativeAddress
Parameters:
ctx
elt
-
generateNativeCoordinates
Parameters:
ctx
elt
-
verifyGraph
Verify an IR-CFG from a method context. On failure, the CFG will be dumped to "failed.dot".
Parameters:
ctx - IR context, from which the CFG and optional exception-info object will be pulled
Throws:
RuntimeException - (or derived, such as IllegalArgumentException or IllegalStateException) if the verification failed
-
verifyGraph
Verify an IR-CFG from a method context. On failure, the CFG will be dumped to "failed.dot".
Parameters:
ctx - IR context, from which the CFG and optional exception-info object will be pulled
info - optional string added to the generated DOT file's header containing the CFG (if verification failed)
Throws:
RuntimeException - (or derived, such as IllegalArgumentException or IllegalStateException) if the verification failed
-
verifyGraph
Verify an IR-CFG.
Parameters:
ctx - context containing the CFG
dumpFilename - optional path to the file where the CFG will be dumped if the verification failed
info - optional string added to the generated DOT file's header containing the CFG (if verification failed)
Throws:
RuntimeException - (or derived, such as IllegalArgumentException or IllegalStateException) if the verification failed
-
verifyGraph
public static void verifyGraph(IDMethodContext ctx, CFG<IDInstruction> cfg, IDTryData exdata, String dumpFilename, String info)
Verify an IR-CFG.
Parameters:
ctx
cfg - CFG
exdata - optional exception-info object
dumpFilename - optional path to the file where the CFG will be dumped if the verification failed
info - optional string added to the generated DOT file's header containing the CFG (if verification failed)
Throws:
RuntimeException - (or derived, such as IllegalArgumentException or IllegalStateException) if the verification failed
-
verifyUnicity
Verify IR unicity within a full graph.
Parameters:
cfg
Returns:
(informative) count of collected expressions
Throws:
IllegalStateException - if a duplicate expression is found
-
verifyUnicity
Verify IR unicity within an expression.
Parameters:
exp - an IR expression
Returns:
(informative) count of collected expressions
Throws:
IllegalStateException - if a duplicate expression is found
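One plausible reading of "unicity" is that no expression object may appear twice in the same tree. Under that assumption, the check can be sketched with an identity-based set: each node is recorded as it is visited, and meeting the same object again is an error. `Node` is a hypothetical stand-in, and the real verification may check more than this.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class UnicitySketch {
    /** Hypothetical expression node; a stand-in, not a dexdec type. */
    static final class Node {
        final String label;
        final List<Node> children;
        Node(String label, Node... children) {
            this.label = label;
            this.children = List.of(children);
        }
    }

    /** Returns the number of collected nodes; throws if one object is met twice. */
    static int verifyUnicity(Node root) {
        // Identity-based set: two distinct nodes with equal labels are NOT duplicates.
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        collect(root, seen);
        return seen.size();
    }

    private static void collect(Node n, Set<Node> seen) {
        if (!seen.add(n)) {
            throw new IllegalStateException("duplicate expression: " + n.label);
        }
        for (Node c : n.children) {
            collect(c, seen);
        }
    }

    public static void main(String[] args) {
        // Two separate objects named var1: legal in this sketch.
        System.out.println(verifyUnicity(new Node("+", new Node("var1"), new Node("var1"))));
        // The same object used twice: rejected.
        Node shared = new Node("var1");
        try {
            verifyUnicity(new Node("+", shared, shared));
        } catch (IllegalStateException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```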
-
dump
Dump an IR-CFG to a Graphviz dot file in the user's temporary folder.
Parameters:
cfg
filename
-
dump
Dump an IR-CFG to a Graphviz dot file in the user's temporary folder.
Parameters:
cfg - CFG
filename - filename (the ".dot" extension will be appended if necessary)
title - optional title or header string
-
removeInstruction
Remove an instruction in a block. Throws on error (watch out: the boolean return value does not indicate failure).
Parameters:
b - basic block
idx - index of the instruction to remove in this block
Returns:
true if the instruction was truly removed, false if it was removed and replaced by an unconditional jump to avoid creating an empty block
-
removeUnreachableTrampoline
public static boolean removeUnreachableTrampoline(CFG<IDInstruction> cfg, BasicBlock<IDInstruction> b)
Remove an unreachable trampoline block (ending on a JCOND or JUMP).
Parameters:
cfg
b - the block must be an unreachable (no input) trampoline (single-instruction jump/jcc); if the block is the first block in the CFG, the method will fail
Returns:
true if the block was removed
-
copyGraph
public static CFG<IDInstruction> copyGraph(CFG<IDInstruction> cfg, boolean deepCopy, IDMethodContext newContext)
Copy an IR-CFG.
Parameters:
cfg - a CFG
deepCopy - if false, the CFG is only structurally duplicated (that is, the copy is shallow: the instructions of the original CFG are reused in the new CFG); if true, the instructions are duplicated as well
newContext - if performing a deep copy, the optional new context to be set up for the IR instructions
Returns:
the CFG copy (deep or shallow)
-
normalizeGraph
Normalize the IR by resetting all instructions to the unit size (1). Exception data is updated as well. The modifications are done in place: the CFG and other object references remain valid.
Parameters:
ctx - IR context
Returns:
the number of modifications performed; if nothing has changed, the return value is 0
-
normalizeGraph
Normalize the IR by resetting all instructions to the required size. Exception data is updated as well. The modifications are done in place: the CFG and other object references remain valid.
Parameters:
ctx - IR context
wantedInstructionSize - required instruction size, must be 1 or above
Returns:
the number of modifications performed; if nothing has changed, the return value is 0
-
modifyInstructionSizes
public static int modifyInstructionSizes(IDMethodContext ctx, Function<IDInstruction, Integer> modifier)
Update the instruction sizes of a CFG. This method is a general version of normalizeGraph(). No CFG block is inserted or deleted. Only instructions' sizes and offsets are updated. Note that the CFG is modified, but the CFG reference within the context is not changed (i.e. no new CFG is created).
Parameters:
ctx - IR context
modifier - instruction size provider; the function may return null to indicate that the instruction size should not change
Returns:
the number of modifications
-
modifyInstructionSize
public static boolean modifyInstructionSize(IDMethodContext ctx, IDInstruction targetInstruction, int minWantedSize)
Modify the size of an instruction within a CFG. No CFG block is inserted or deleted. Only instructions' sizes and offsets are updated. Note that the CFG is modified, but the CFG reference within the context is not changed (i.e. no new CFG is created).
Parameters:
ctx - IR context
targetInstruction - the target instruction
minWantedSize - the minimal wanted size
Returns:
true if the instruction was modified, false if no modification was required
-
updateTargets
Parameters:
ctx - IR context
oldToNewOffsets - a mapping of the old offsets to the corresponding new offsets
Returns:
the number of modifications
-
insertHeaderBlock
Insert a header block filled with NOP instructions. Essentially, the CFG is shifted downward to accommodate the insertion of a number of lead instructions (to be executed before the entry point).
This method never fails. The CFG and IR context information are updated to maintain consistency.
Parameters:
ctx - an IR context
reqInsnCount - number of NOP instructions to be prepended
reqInsnSize - size set for each individual instruction
-
splitBlock
public static BasicBlock<IDInstruction> splitBlock(IDMethodContext ctx, BasicBlock<IDInstruction> blk, int index)
Split a block of the context's CFG into two blocks. The newly-created block will be set up to have the same irregular outputs as the original block. The context's exception information data will be updated accordingly.
Parameters:
ctx - IR context
blk - block to split
index - index of the "split instruction" in the block; that instruction will be the first instruction of the newly-created block
Returns:
the newly-created block
-
mergeBlocks
Merge two blocks.
Parameters:
ctx - IR context
firstBlock - the first block, to be merged with the subsequent (fall-through) block
Returns:
success indicator
-
generateBlockOffsetMap
Generate an IR-to-native offset conversion map.
Returns:
a map of (IR block offset -> Dalvik block offset)
-
isImmZero
Determine if an expression is an IR immediate holding the value 0.
Parameters:
exp - an IR expression
Returns:
true if the provided expression is an immediate equivalent to the value 0
-
isImmValue
Determine if an expression is an IR immediate holding the provided value. The immediate value is determined according to its type. If the type of the expression is undetermined, a signed type is assumed.
Parameters:
exp - an IR expression
value - the value to check against
Returns:
true if the provided expression is an immediate that evaluates to the provided value
-
isImmNonZero
Determine if an expression is an IR immediate not holding the value 0.
Parameters:
exp - an IR expression
Returns:
true if the provided expression is an immediate not equivalent to the value 0
-
nextInstruction
public static IDInstruction nextInstruction(CFG<IDInstruction> cfg, BasicBlock<IDInstruction> b, int i)
Determine the meaningful next instruction in the execution flow, assuming no exception was raised. Jump targets are followed, no-operations are skipped.
Parameters:
cfg - current CFG
b - current block
i - index of the current instruction in the block
Returns:
the next instruction, or null if it cannot be determined accurately
-
unroll
public static List<IDInstruction> unroll(IDMethodContext ctx, IDVar pivotVar, long addrStart, long addrEnd, int maxProcessedInsnCount, int maxUnrolledInsnCount, int maxLoopIterCnt)
Attempt to unroll some code whose pivot is a single variable. The generated code will not contain branching instructions (JUMP, JCOND, SWITCH, RETURN, THROW, etc.).
Parameters:
ctx - IR method context (which references the CFG to operate on)
pivotVar - pivot variable
addrStart - start address (inclusive); generally, the instruction at that address initializes the pivot variable, although it is not a requirement
addrEnd - end address (exclusive), will stop the unrolling process
maxProcessedInsnCount - maximum number of instructions to process until the end is reached
maxUnrolledInsnCount - maximum number of unrolled instructions that may be generated
maxLoopIterCnt - maximum count of iterations to be unrolled (note: keep in mind that for values greater than 1, the resulting IR is likely to be larger than the original IR)
Returns:
null on error or failure, else the list of unrolled instructions (the list may contain duplicates; client code needs to use IDInstruction.duplicate() to create unique instructions)
-
determineInterval
public static Collection<BasicBlock<IDInstruction>> determineInterval(BasicBlock<IDInstruction> hdrblk)
Find the graph interval rooted in the provided header block.
An interval is defined as: Given a header node H, an interval I(H) is the maximal, single-entry subgraph in which H is the only entry node, and in which all closed paths contain H.
Parameters:
hdrblk - header block
Returns:
interval nodes (the first node is the header)
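The interval definition above matches the classic construction from interval analysis: start with I(H) = {H}, then keep adding any node whose predecessors all already belong to I(H) until nothing more can be added. The sketch below applies that textbook procedure to a graph of integer node ids. It is an illustration under assumed names (`IntervalSketch`, `interval`), not the library's code, and whether dexdec uses this exact procedure is not stated in the source.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

final class IntervalSketch {
    /** succ.get(n) lists the successors of node n; node ids are hypothetical integers. */
    static List<Integer> interval(Map<Integer, List<Integer>> succ, int header) {
        // Derive predecessor lists from the successor lists.
        Map<Integer, List<Integer>> pred = new HashMap<>();
        for (Map.Entry<Integer, List<Integer>> e : succ.entrySet()) {
            for (int s : e.getValue()) {
                pred.computeIfAbsent(s, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        LinkedHashSet<Integer> in = new LinkedHashSet<>();
        in.add(header); // the header is always the first node of its interval
        boolean grown = true;
        while (grown) {
            grown = false;
            for (int n : new ArrayList<>(in)) {
                for (int s : succ.getOrDefault(n, List.of())) {
                    // Add s only when every predecessor of s is already in the interval.
                    if (!in.contains(s) && in.containsAll(pred.get(s))) {
                        in.add(s);
                        grown = true;
                    }
                }
            }
        }
        return new ArrayList<>(in);
    }

    /** Graph: 1->2, 2->3, 2->4, 3->5, 4->5, 5->2 (loop back edge), 5->6. */
    static Map<Integer, List<Integer>> demoGraph() {
        return Map.of(
            1, List.of(2),
            2, List.of(3, 4),
            3, List.of(5),
            4, List.of(5),
            5, List.of(2, 6));
    }

    public static void main(String[] args) {
        System.out.println(interval(demoGraph(), 1)); // prints [1]
        System.out.println(interval(demoGraph(), 2)); // prints [2, 3, 4, 5, 6]
    }
}
```

In the example graph, node 2 cannot join I(1) because one of its predecessors (node 5) lies outside it, so node 2 becomes the header of its own interval, which contains the whole loop.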
-
canThrow
Determine whether the trailing instructions of a block can throw.
Parameters:
b - block
from - starting instruction index
Returns:
true if the instructions may throw
-
canThrow
Determine whether a provided range of instructions of a block can throw.
Parameters:
b
from - starting index
to - final index (exclusive)
Returns:
true if the instructions may throw
-
isProtectedByCatchAll
Determine whether the provided block is protected by a catch-all (catching Throwable).
Parameters:
ctx
blk
-