Class DUtil
dexdec IR access and manipulation utility methods.
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
static int | calculateComplexity | Calculate the trivial complexity metric for the provided IR expression.
static boolean | canHandlerCatchException(IDMethodContext ctx, IDExceptionHandler h, String exceptionSig) |
static boolean | canThrow(BasicBlock<IDInstruction> b, int from) | Determine whether the trailing instructions of a block can throw.
static boolean | canThrow(BasicBlock<IDInstruction> b, int from, int to) | Determine whether a provided range of instructions of a block can throw.
static boolean | checkBlock(BasicBlock<IDInstruction> b, DOpcodeType... opcodes) | Check for instruction matching in the provided block.
static int | checkSequence(CFG<IDInstruction> cfg, int blkindex, DOpcodeType... opcodes) | Check for instruction matching in a sequence of blocks.
static int | cleanGraph | Clean the graph of an IR method context.
 | collectUniqueVarIds | Collect the unique list of variable ids making up this expression, including the expression itself if it is an IDVar.
 | collectVars | Collect all the variables making up this expression, including the expression itself if it is a variable.
static CFG<IDInstruction> | copyGraph(CFG<IDInstruction> cfg, boolean deepCopy, IDMethodContext newContext) | Copy an IR-CFG.
static int | countVariable(IDExpression e, IDVar var) | Count the number of occurrences of a given variable in the provided IR expression.
static int | createRegisterVarId(int registerBase, boolean dualSlot) | Create the primary variable mapping to a register or couple of registers.
static Collection<BasicBlock<IDInstruction>> | determineInterval(BasicBlock<IDInstruction> hdrblk) | Find the graph interval rooted in the provided header block.
static void | dump(CFG<IDInstruction> cfg, String filename) | Dump an IR-CFG to a Graphviz dot file in the user's temporary folder.
static void | dump(CFG<IDInstruction> cfg, String filename, String title) | Dump an IR-CFG to a Graphviz dot file in the user's temporary folder.
static String | formatVarId(int varid) | Generate the standard variable name associated with the provided variable id.
static String | formatVarIds(Collection<Integer> varids) | A convenience method to format a collection of variable ids into a comma-separated string.
 | generateBlockOffsetMap | Generate an IR-to-native offset conversion map.
static String | generateNativeAddress(IDMethodContext ctx, IDExpression elt) |
static ICodeCoordinates | generateNativeCoordinates |
static ContextAccessType | getCAT(IDInstruction insn, boolean inclAssigDst) | Determine the aggregated context type of an instruction.
static String | getExceptionSignature |
static Collection<BasicBlock<IDInstruction>> | getReachableBlocks | Get the list of reachable blocks of the provided IR-CFG.
static boolean | hasInvokeInfo | Determine whether the IR expression makes invocations (general invocations, new, new-array).
static boolean | hasVariable(IDExpression e, int varid) | Determine whether the provided expression is or contains the provided variable.
static boolean | hasVariables | Determine whether the provided expression contains variables.
static void | insertHeaderBlock(IDMethodContext ctx, int reqInsnCount, int reqInsnSize) | Insert a header block filled with NOP instructions.
static boolean | isDoubleSlotVarId(int varid) |
static boolean | isImmNonZero(IDExpression exp) | Determine if an expression is an IR immediate not holding the value 0.
static boolean | isImmValue(IDExpression exp, long value) | Determine if an expression is an IR immediate holding the provided value.
static boolean | isImmZero(IDExpression exp) | Determine if an expression is an IR immediate holding the value 0.
static boolean | isLegalVarId(int varid) |
static boolean | isProtectedByCatchAll | Determine whether the provided block is protected by a catch-all (catching Throwable).
static boolean | isRegisterVarId(int varid) |
static boolean | isSingleSlotVarId(int varid) |
static boolean | isUsingCaughtException |
static boolean | isVirtualVarId(int varid) |
static boolean | mayBeFinal(IDField f, IDexUnit dex) | Determine whether the provided field is or appears to be final.
static boolean | mergeBlocks(IDMethodContext ctx, BasicBlock<IDInstruction> firstBlock) | Merge two blocks.
static boolean | modifyInstructionSize(IDMethodContext ctx, IDInstruction targetInstruction, int minWantedSize) | Modify the size of an instruction within a CFG.
static int | modifyInstructionSizes(IDMethodContext ctx, Function<IDInstruction, Integer> modifier) | Update the instruction sizes of a CFG.
static IDInstruction | nextInstruction(CFG<IDInstruction> cfg, BasicBlock<IDInstruction> b, int i) | Determine the meaningful next instruction in the execution flow, assuming no exception was raised.
static int | normalizeGraph | Normalize the IR by resetting all instructions to the unit size (1).
static int | normalizeGraph(IDMethodContext ctx, int wantedInstructionSize) | Normalize the IR by resetting all instructions to the required size.
static int | parseVarIdFromStandardName(String varname, boolean softFail) | Parse a standard variable name (as generated by formatVarId(int)) and retrieve the associated variable id.
static int | removeGaps(CFG<IDInstruction> cfg) | Remove all gaps from the provided IR-CFG.
static boolean | removeInstruction(BasicBlock<IDInstruction> b, int idx) | Remove an instruction in a block.
static int | removeUnreachableBlocks(CFG<IDInstruction> cfg, IDTryData ex) | Remove unreachable blocks from an IR-CFG.
static int | removeUnreachableBlocks | Remove unreachable blocks from an IR-CFG.
static boolean | removeUnreachableTrampoline(CFG<IDInstruction> cfg, BasicBlock<IDInstruction> b) | Remove an unreachable trampoline block (ending on a JCOND or JUMP).
static boolean | replaceInExpression(IDExpression e, IDExpression target, IDExpression repl) | Replace an IR expression located inside another IR.
static int | simplifyJCondsAndSwitches | Simplify DOpcodeType#JCOND and DOpcodeType.IR_SWITCH IR instructions. JCOND: the target branch must not point to the fallthrough instruction; if it does, the JCOND must be replaced by a JUMP (while taking care of potential side effects introduced by the evaluation of the JCOND predicate). SWITCH: the case targets must not point to the fallthrough instruction; if any does, the case entry must be removed (it will be handled by the switch's default entry).
static BasicBlock<IDInstruction> | splitBlock(IDMethodContext ctx, BasicBlock<IDInstruction> blk, int index) | Split a block of the context's CFG into two blocks.
static boolean | unprotectBlock(CFG<IDInstruction> cfg, IDTryData exdata, int blkAddress) | Unprotect a basic block.
static List<IDInstruction> | unroll(IDMethodContext ctx, IDVar pivotVar, long addrStart, long addrEnd, int maxProcessedInsnCount, int maxUnrolledInsnCount, int maxLoopIterCnt) | Attempt to unroll some code whose pivot is a single variable.
static int | updateTargets(IDMethodContext ctx, Map<Integer, Integer> oldToNewOffsets) |
static boolean | usesReferences | Determine whether the IR expression uses references, defined as any of: accessing fields, accessing array elements, creating arrays.
static void | verifyDataFlow(CFG<IDInstruction> cfg) | Perform data flow verifications on the graph.
static void | verifyGraph | Verify an IR-CFG from a method context.
static void | verifyGraph(IDMethodContext ctx, CFG<IDInstruction> cfg, IDTryData exdata, String dumpFilename, String info) | Verify an IR-CFG.
static void | verifyGraph(IDMethodContext ctx, String info) | Verify an IR-CFG from a method context.
static void | verifyGraph(IDMethodContext ctx, String dumpFilename, String info) | Verify an IR-CFG.
static int | verifyUnicity(CFG<IDInstruction> cfg) | Verify IR unicity within a full graph.
static int | verifyUnicity | Verify IR unicity within an expression.
-
Constructor Details
-
DUtil
public DUtil()
-
-
Method Details
-
createRegisterVarId
public static int createRegisterVarId(int registerBase, boolean dualSlot)
Create the primary variable mapping to a register or couple of registers. A primary variable has an id that can be mapped directly to the underlying Dalvik register id(s).
- Parameters:
registerBase - the register id of the Dalvik variable, or the first register of the pair if the variable is dual-slot; must be in [0, 0xFFFF] (at most 0xFFFE for a dual-slot variable)
dualSlot -
- Returns:
-
isRegisterVarId
public static boolean isRegisterVarId(int varid) -
isVirtualVarId
public static boolean isVirtualVarId(int varid) -
isLegalVarId
public static boolean isLegalVarId(int varid) -
isSingleSlotVarId
public static boolean isSingleSlotVarId(int varid) -
isDoubleSlotVarId
public static boolean isDoubleSlotVarId(int varid) -
formatVarId
Generate the standard variable name associated with the provided variable id.
- Parameters:
varid - a variable id
- Returns:
- the variable name; on error, the method throws IllegalArgumentException
-
parseVarIdFromStandardName
Parse a standard variable name (as generated by formatVarId(int)) and retrieve the associated variable id.
- Parameters:
varname - a variable name
softFail - if true, return -1 on error; else, the method will throw IllegalArgumentException on error
- Returns:
- the id
-
formatVarIds
A convenience method to format a collection of variable ids into a comma-separated string.
- Parameters:
varids - collection of variable ids
- Returns:
- a string
-
replaceInExpression
Replace an IR expression located inside another IR. This method is a deep version of IDExpression.replaceSubExpression(IDExpression, IDExpression). Comparison is done by identity.
- Parameters:
e - parent IR
target - the target to be replaced; should not be the same as `e`
repl - the replacement IR
- Returns:
- true if the target was successfully replaced
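The "deep, by identity" behaviour can be illustrated with a small stand-alone model. This is only a sketch: `Node` and `replaceDeep` are hypothetical stand-ins, not the dexdec `IDExpression` API, and stopping at the first match is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DeepReplaceSketch {
    // Hypothetical stand-in for an IR expression node (not the dexdec IDExpression type).
    static final class Node {
        final String label;
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = new ArrayList<>(Arrays.asList(children));
        }
    }

    // Replaces `target` with `repl` anywhere below `parent`; nodes are compared with ==.
    // Assumption: the search stops at the first match.
    static boolean replaceDeep(Node parent, Node target, Node repl) {
        for (int i = 0; i < parent.children.size(); i++) {
            Node child = parent.children.get(i);
            if (child == target) {
                parent.children.set(i, repl);
                return true;
            }
            if (replaceDeep(child, target, repl)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Node x = new Node("x");
        Node twin = new Node("x"); // same label, different object
        Node root = new Node("*", new Node("+", x, new Node("1")), twin);
        System.out.println(replaceDeep(root, x, new Node("y"))); // x is located by identity
        System.out.println(root.children.get(1) == twin);        // the twin is left untouched
    }
}
```

Because the comparison uses `==`, a structurally equal but distinct node (the `twin` above) is never replaced, and the parent itself cannot be the target.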
-
countVariable
Count the number of occurrences of a given variable in the provided IR expression.
- Parameters:
e - an IR expression
var -
- Returns:
-
usesReferences
Determine whether the IR expression uses references, defined as any of: accessing fields, accessing array elements, creating arrays.
- Parameters:
e - an IR expression
- Returns:
-
collectVars
Collect all the variables making up this expression, including the expression itself if it is a variable. The returned collection may contain duplicates. Example: if the provided expression is ((var1 + var2 + 5) / var1), this method will return [var1, var2, var1]. Note that the collection is done in post-order, so ordering is predictable.
- Parameters:
e - an IR expression
- Returns:
- a list of all vars used in the expression
-
collectUniqueVarIds
Collect the unique list of variable ids making up this expression, including the expression itself if it is an IDVar.
- Parameters:
e - an IR expression
- Returns:
- a set of variable ids used in the expression
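The two collection methods above can be modelled on a toy expression tree. The sketch below is an illustration under assumptions: `Expr`, `variable`, `imm` and `op` are hypothetical helpers rather than dexdec types, and the insertion-ordered set is a choice made here, not a documented property of the real method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class CollectVarsSketch {
    // Hypothetical expression node: a variable when varId >= 0, otherwise an operator or immediate.
    static final class Expr {
        final String label;
        final int varId;
        final List<Expr> operands;

        Expr(String label, int varId, Expr... operands) {
            this.label = label;
            this.varId = varId;
            this.operands = Arrays.asList(operands);
        }

        static Expr variable(int id) { return new Expr("var" + id, id); }
        static Expr imm(long value) { return new Expr(Long.toString(value), -1); }
        static Expr op(String symbol, Expr left, Expr right) { return new Expr(symbol, -1, left, right); }
    }

    // Post-order walk: operands first, then the node itself; duplicates are kept.
    static List<Expr> collectVars(Expr e) {
        List<Expr> out = new ArrayList<>();
        collect(e, out);
        return out;
    }

    private static void collect(Expr e, List<Expr> out) {
        for (Expr operand : e.operands) {
            collect(operand, out);
        }
        if (e.varId >= 0) {
            out.add(e);
        }
    }

    // Same walk, reduced to distinct ids (insertion-ordered here for readability).
    static Set<Integer> collectUniqueVarIds(Expr e) {
        Set<Integer> ids = new LinkedHashSet<>();
        for (Expr v : collectVars(e)) {
            ids.add(v.varId);
        }
        return ids;
    }

    public static void main(String[] args) {
        // (var1 + var2 + 5) / var1
        Expr e = Expr.op("/",
                Expr.op("+", Expr.op("+", Expr.variable(1), Expr.variable(2)), Expr.imm(5)),
                Expr.variable(1));
        StringBuilder names = new StringBuilder();
        for (Expr v : collectVars(e)) {
            names.append(v.label).append(' ');
        }
        System.out.println(names.toString().trim()); // var1 var2 var1
        System.out.println(collectUniqueVarIds(e));  // [1, 2]
    }
}
```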
-
calculateComplexity
Calculate the trivial complexity metric for the provided IR expression. Every IR element has a weight of 1.
- Parameters:
e - an IR expression
- Returns:
- the complexity score
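One plausible reading of "every IR element has a weight of 1" is a plain node count over the expression tree. The sketch below assumes that reading; `Expr` is a hypothetical stand-in, and the real metric may count elements differently.

```java
class ComplexitySketch {
    // Hypothetical expression node with zero or more operands.
    static final class Expr {
        final String label;
        final Expr[] operands;

        Expr(String label, Expr... operands) {
            this.label = label;
            this.operands = operands;
        }
    }

    // Each node contributes 1; the score of a tree is the number of nodes it contains.
    static int complexity(Expr e) {
        int score = 1;
        for (Expr operand : e.operands) {
            score += complexity(operand);
        }
        return score;
    }

    public static void main(String[] args) {
        // (var1 + var2 + 5) / var1: three operators, three variables, one immediate
        Expr e = new Expr("/",
                new Expr("+", new Expr("+", new Expr("var1"), new Expr("var2")), new Expr("5")),
                new Expr("var1"));
        System.out.println(complexity(e)); // 7
    }
}
```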
-
hasInvokeInfo
Determine whether the IR expression makes invocations (general invocations, new, new-array).
- Parameters:
e - an IR expression
- Returns:
-
hasVariables
Determine whether the provided expression contains variables.
- Parameters:
e - an IR expression
- Returns:
-
hasVariable
Determine whether the provided expression is or contains the provided variable.
- Parameters:
e - an expression
varid - the searched variable id
- Returns:
-
checkBlock
Check for instruction matching in the provided block. The block must be fully matched.
- Parameters:
b -
opcodes -
- Returns:
-
checkSequence
Check for instruction matching in a sequence of blocks. Blocks must be fully contiguous and fully matched.
This is an extended version of checkBlock(BasicBlock, DOpcodeType...), which performs matching on a single block.
- Parameters:
cfg -
blkindex -
opcodes -
- Returns:
- the number of blocks containing the sequence; on failure
-
mayBeFinal
Determine whether the provided field is or appears to be final.
- Parameters:
f -
dex -
- Returns:
-
cleanGraph
Clean the graph of an IR method context.
- Returns:
- the number of clean-up operations performed
-
removeGaps
Remove all gaps from the provided IR-CFG.
Gaps are illegal in an IR-CFG. They are tolerated during an optimization phase; however, an optimizer, or any code modifying the IR, must provide a legal IR to subsequent components.
- Parameters:
cfg -
- Returns:
- the number of gaps that were removed
-
removeUnreachableBlocks
Remove unreachable blocks from an IR-CFG. They are blocks that cannot be reached, regularly or irregularly, from the entry-point block (i.e. the first block, at offset 0).
Unreachable blocks are illegal in an IR-CFG. They are tolerated during an optimization phase; however, an optimizer, or any code modifying the IR, must provide a legal IR to subsequent components.
- Parameters:
ctx -
- Returns:
-
removeUnreachableBlocks
Remove unreachable blocks from an IR-CFG. They are blocks that cannot be reached, regularly or irregularly, from the entry-point block (i.e. the first block, at offset 0).
Unreachable blocks are illegal in an IR-CFG. They are tolerated during an optimization phase; however, an optimizer, or any code modifying the IR, must provide a legal IR to subsequent components.
- Parameters:
cfg -
ex -
- Returns:
-
getReachableBlocks
Get the list of reachable blocks of the provided IR-CFG. They are the blocks that can be reached, regularly or irregularly, from the entry-point block (i.e. the first block, at offset 0).
- Parameters:
cfg -
- Returns:
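Reachability "regularly or irregularly" amounts to a graph traversal from the entry block that follows both normal and exception edges. The following is a minimal sketch on a toy graph: `Block`, `regularOut` and `irregularOut` are hypothetical names, and returning the number of removed blocks is an assumption, since the return value is not documented here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ReachabilitySketch {
    // Hypothetical basic block with regular (flow) and irregular (exception) successors.
    static final class Block {
        final int offset;
        final List<Block> regularOut = new ArrayList<>();
        final List<Block> irregularOut = new ArrayList<>();

        Block(int offset) {
            this.offset = offset;
        }
    }

    // Worklist traversal from the entry block, following both kinds of edges.
    static Set<Block> reachable(Block entry) {
        Set<Block> seen = new LinkedHashSet<>();
        Deque<Block> work = new ArrayDeque<>();
        work.add(entry);
        while (!work.isEmpty()) {
            Block b = work.poll();
            if (!seen.add(b)) {
                continue; // already visited
            }
            work.addAll(b.regularOut);
            work.addAll(b.irregularOut);
        }
        return seen;
    }

    // Drops every block not reachable from the first block; returns how many were dropped (assumed contract).
    static int removeUnreachable(List<Block> blocks) {
        int before = blocks.size();
        blocks.retainAll(reachable(blocks.get(0)));
        return before - blocks.size();
    }

    public static void main(String[] args) {
        Block entry = new Block(0), body = new Block(1), orphan = new Block(2), handler = new Block(3);
        entry.regularOut.add(body);
        entry.irregularOut.add(handler); // reachable only through an exception edge
        List<Block> blocks = new ArrayList<>(Arrays.asList(entry, body, orphan, handler));
        System.out.println(removeUnreachable(blocks)); // 1
    }
}
```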
-
simplifyJCondsAndSwitches
Simplify DOpcodeType#JCOND and DOpcodeType.IR_SWITCH IR instructions:
- JCOND: the target branch must not point to the fallthrough instruction; if it does, the JCOND must be replaced by a JUMP (while taking care of potential side effects introduced by the evaluation of the JCOND predicate)
- SWITCH: the case targets must not point to the fallthrough instruction; if any does, the case entry must be removed (it will be handled by the switch's default entry)
Illegal JCOND/SWITCH instructions are tolerated during an optimization phase; however, an optimizer, or any code modifying the IR, must provide a legal IR to subsequent components.
- Parameters:
ctx -
- Returns:
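The two rules can be sketched on a simplified instruction model. Everything below is hypothetical (`Insn`, `Op`, `simplify`), and the sketch deliberately ignores the predicate side effects that a real implementation has to preserve.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

class BranchSimplifySketch {
    enum Op { JCOND, JUMP, SWITCH }

    // Hypothetical instruction: offsets are plain ints; `cases` is only used by SWITCH.
    static final class Insn {
        Op op;
        final int offset;
        final int size;
        int target; // branch target offset for JCOND and JUMP
        final Map<Integer, Integer> cases = new LinkedHashMap<>(); // case value -> target offset

        Insn(Op op, int offset, int size, int target) {
            this.op = op;
            this.offset = offset;
            this.size = size;
            this.target = target;
        }

        int fallthrough() {
            return offset + size;
        }
    }

    // Returns the number of simplifications applied to the instruction.
    static int simplify(Insn insn) {
        int changes = 0;
        if (insn.op == Op.JCOND && insn.target == insn.fallthrough()) {
            // Both outcomes land on the same instruction; the predicate is dropped here,
            // whereas real code must keep any side effects of evaluating it.
            insn.op = Op.JUMP;
            changes++;
        } else if (insn.op == Op.SWITCH) {
            Iterator<Map.Entry<Integer, Integer>> it = insn.cases.entrySet().iterator();
            while (it.hasNext()) {
                if (it.next().getValue() == insn.fallthrough()) {
                    it.remove(); // the default entry already covers this target
                    changes++;
                }
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        Insn sw = new Insn(Op.SWITCH, 10, 1, -1);
        sw.cases.put(1, 11); // points to the fallthrough
        sw.cases.put(2, 20);
        System.out.println(simplify(sw) + " " + sw.cases); // 1 {2=20}
    }
}
```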
-
getExceptionSignature
- Parameters:
ctx -
h -
- Returns:
-
canHandlerCatchException
public static boolean canHandlerCatchException(IDMethodContext ctx, IDExceptionHandler h, String exceptionSig)
- Parameters:
ctx -
h -
exceptionSig -
- Returns:
-
unprotectBlock
Unprotect a basic block. The CFG is adjusted and cleaned afterward.
- Parameters:
cfg -
exdata -
blkAddress - block address
- Returns:
- true if the block was protected and is now unprotected; false if the block was not protected
-
getCAT
Determine the aggregated context type of an instruction.
- Parameters:
insn - instruction
inclAssigDst - if true and the instruction is an assignment, the destination is included when calculating the CAT; else, only the right-hand side is considered
- Returns:
- aggregated CAT
-
isUsingCaughtException
- Parameters:
ctx -
b -
- Returns:
-
generateNativeAddress
- Parameters:
ctx -
elt -
- Returns:
-
generateNativeCoordinates
- Parameters:
ctx -
elt -
- Returns:
-
verifyGraph
Verify an IR-CFG from a method context. On failure, the CFG will be dumped to "failed.dot".
- Parameters:
ctx - IR context, from which the CFG and optional exception-info object will be pulled
- Throws:
RuntimeException - (or derived, such as IllegalArgumentException or IllegalStateException) if the verification failed
-
verifyGraph
Verify an IR-CFG from a method context. On failure, the CFG will be dumped to "failed.dot".
- Parameters:
ctx - IR context, from which the CFG and optional exception-info object will be pulled
info - optional string added to the generated DOT file's header containing the CFG (if verification failed)
- Throws:
RuntimeException - (or derived, such as IllegalArgumentException or IllegalStateException) if the verification failed
-
verifyGraph
Verify an IR-CFG.
- Parameters:
ctx - context containing the CFG
dumpFilename - optional path to the file where the CFG will be dumped if the verification failed
info - optional string added to the generated DOT file's header containing the CFG (if verification failed)
- Throws:
RuntimeException - (or derived, such as IllegalArgumentException or IllegalStateException) if the verification failed
-
verifyGraph
public static void verifyGraph(IDMethodContext ctx, CFG<IDInstruction> cfg, IDTryData exdata, String dumpFilename, String info)
Verify an IR-CFG.
- Parameters:
ctx -
cfg - CFG
exdata - optional exception-info object
dumpFilename - optional path to the file where the CFG will be dumped if the verification failed
info - optional string added to the generated DOT file's header containing the CFG (if verification failed)
- Throws:
RuntimeException - (or derived, such as IllegalArgumentException or IllegalStateException) if the verification failed
-
verifyUnicity
Verify IR unicity within a full graph.
- Parameters:
cfg -
- Returns:
- (informative) count of collected expressions
- Throws:
IllegalStateException - if a duplicate expression is found
-
verifyUnicity
Verify IR unicity within an expression.
- Parameters:
exp - an IR expression
- Returns:
- (informative) count of collected expressions
- Throws:
IllegalStateException - if a duplicate expression is found
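The page does not define "unicity". A plausible reading, assumed in this sketch, is that no expression object may be referenced from two places in the same tree. Under that assumption the check is an identity-based duplicate detection; `Expr` is a hypothetical stand-in for the IR expression type.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

class UnicitySketch {
    // Hypothetical expression node.
    static final class Expr {
        final String label;
        final Expr[] operands;

        Expr(String label, Expr... operands) {
            this.label = label;
            this.operands = operands;
        }
    }

    // Walks the tree and fails if the same object is encountered twice; returns the number of nodes seen.
    static int verifyUnicity(Expr root) {
        Set<Expr> seen = Collections.newSetFromMap(new IdentityHashMap<Expr, Boolean>());
        visit(root, seen);
        return seen.size();
    }

    private static void visit(Expr e, Set<Expr> seen) {
        if (!seen.add(e)) {
            throw new IllegalStateException("Duplicate expression object: " + e.label);
        }
        for (Expr operand : e.operands) {
            visit(operand, seen);
        }
    }

    public static void main(String[] args) {
        Expr ok = new Expr("+", new Expr("x"), new Expr("x")); // equal labels, distinct objects
        System.out.println(verifyUnicity(ok)); // 3

        Expr shared = new Expr("x");
        try {
            verifyUnicity(new Expr("+", shared, shared)); // the same object is used twice
        } catch (IllegalStateException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```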
-
verifyDataFlow
Perform data flow verifications on the graph. Warning: on large graphs, this verification can be very costly. Currently implemented:
- verify that live variables at a block entry match the reached variables at the exit points of all input blocks
- Parameters:
cfg - CFG
- Throws:
IllegalArgumentException - if a data flow inconsistency is detected
-
dump
Dump an IR-CFG to a Graphviz dot file in the user's temporary folder.
- Parameters:
cfg -
filename -
-
dump
Dump an IR-CFG to a Graphviz dot file in the user's temporary folder.
- Parameters:
cfg - CFG
filename - filename (the ".dot" extension will be appended if necessary)
title - optional title or header string
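The filename handling described above (temporary folder, ".dot" appended when missing) can be sketched with standard Java I/O. The names `resolveDumpPath` and `toDot`, the edge-list input and the generated DOT layout are all assumptions made for illustration; the actual output of `dump` is not documented on this page.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class DotDumpSketch {
    // Resolves the output file in the JVM's temporary folder, appending ".dot" if needed.
    static Path resolveDumpPath(String filename) {
        String name = filename.endsWith(".dot") ? filename : filename + ".dot";
        return Paths.get(System.getProperty("java.io.tmpdir"), name);
    }

    // Renders a list of (from, to) block-index pairs as a Graphviz digraph.
    // The title is assumed to contain no double quotes.
    static String toDot(int[][] edges, String title) {
        StringBuilder sb = new StringBuilder("digraph cfg {\n");
        if (title != null) {
            sb.append("  label=\"").append(title).append("\";\n");
        }
        for (int[] edge : edges) {
            sb.append("  b").append(edge[0]).append(" -> b").append(edge[1]).append(";\n");
        }
        return sb.append("}\n").toString();
    }

    public static void main(String[] args) throws IOException {
        Path out = resolveDumpPath("example-cfg");
        Files.write(out, toDot(new int[][] {{0, 1}, {1, 2}, {1, 0}}, "demo").getBytes(StandardCharsets.UTF_8));
        System.out.println("Wrote " + out);
    }
}
```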
-
removeInstruction
Remove an instruction in a block. Throws on error (watch out: the boolean return value does not indicate failure).
- Parameters:
b - basic block
idx - index of the instruction to remove in this block
- Returns:
- true if the instruction was truly removed, false if it was removed and replaced by an unconditional jump to avoid creating an empty block
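The return contract is easy to misread, so here is a toy model of it. The block is reduced to a list of strings, and the assumption is that the "replace by a jump" path is taken when the block holds a single instruction; `JUMP_PLACEHOLDER` is a hypothetical marker, not an IR opcode.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveInsnSketch {
    static final String JUMP_PLACEHOLDER = "jump-to-next";

    // true: the instruction is gone; false: it was swapped for a jump so the block does not become empty.
    // Errors are reported by exception, never by the return value.
    static boolean removeInstruction(List<String> block, int idx) {
        if (idx < 0 || idx >= block.size()) {
            throw new IndexOutOfBoundsException("No instruction at index " + idx);
        }
        if (block.size() == 1) {
            block.set(0, JUMP_PLACEHOLDER);
            return false;
        }
        block.remove(idx);
        return true;
    }

    public static void main(String[] args) {
        List<String> block = new ArrayList<>(Arrays.asList("a = 1", "b = 2"));
        System.out.println(removeInstruction(block, 0)); // true
        System.out.println(removeInstruction(block, 0)); // false
        System.out.println(block);                       // [jump-to-next]
    }
}
```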
-
removeUnreachableTrampoline
public static boolean removeUnreachableTrampoline(CFG<IDInstruction> cfg, BasicBlock<IDInstruction> b)
Remove an unreachable trampoline block (ending on a JCOND or JUMP).
- Parameters:
cfg -
b - the block must be an unreachable (no input) trampoline (single-instruction jump/jcc); if the block is the first block in the CFG, the method will fail
- Returns:
- true if the block was removed
-
copyGraph
public static CFG<IDInstruction> copyGraph(CFG<IDInstruction> cfg, boolean deepCopy, IDMethodContext newContext)
Copy an IR-CFG.
- Parameters:
cfg - a CFG
deepCopy - if false, the CFG is only structurally duplicated (that is, the copy is shallow: the instructions of the original CFG are reused in the new CFG); if true, the instructions are duplicated as well
newContext - if performing a deep copy, the optional new context to be set up for the IR instructions
- Returns:
- the CFG copy (deep or shallow)
-
normalizeGraph
Normalize the IR by resetting all instructions to the unit size (1). Exception data is updated as well. The modifications are done in-place: the CFG and other object references remain valid.
- Parameters:
ctx - IR context
- Returns:
- the number of modifications performed; if nothing has changed, the return value is 0
-
normalizeGraph
Normalize the IR by resetting all instructions to the required size. Exception data is updated as well. The modifications are done in-place: the CFG and other object references remain valid.
- Parameters:
ctx - IR context
wantedInstructionSize - required instruction size, must be 1 or above
- Returns:
- the number of modifications performed; if nothing has changed, the return value is 0
-
modifyInstructionSizes
public static int modifyInstructionSizes(IDMethodContext ctx, Function<IDInstruction, Integer> modifier)
Update the instruction sizes of a CFG. This method is a general version of normalizeGraph(). No CFG block is inserted or deleted. Only instructions' sizes and offsets are updated. Note that the CFG is modified, but the CFG reference within the context is not changed (i.e. no new CFG is created).
- Parameters:
ctx - IR context
modifier - instruction size provider; the function may return null to indicate that the instruction size should not change
- Returns:
- the number of modifications
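Resizing instructions implies recomputing offsets and then rewriting branch targets through an old-to-new offset map, which is the kind of map `updateTargets` takes. The sketch below models this on a flat, gap-free instruction list starting at offset 0. `Insn` and `modifySizes` are hypothetical, and counting only the instructions whose size actually changed is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class ResizeSketch {
    // Hypothetical instruction: `target` is a branch target offset, or null for non-branching code.
    static final class Insn {
        int offset;
        int size;
        Integer target;

        Insn(int offset, int size, Integer target) {
            this.offset = offset;
            this.size = size;
            this.target = target;
        }
    }

    // Applies the size provider, lays instructions out again from offset 0, then remaps branch targets.
    static int modifySizes(List<Insn> insns, Function<Insn, Integer> modifier) {
        Map<Integer, Integer> oldToNew = new HashMap<>();
        int changed = 0;
        int next = 0;
        for (Insn insn : insns) {
            Integer wanted = modifier.apply(insn); // null means "keep the current size"
            oldToNew.put(insn.offset, next);
            insn.offset = next;
            if (wanted != null && wanted != insn.size) {
                insn.size = wanted;
                changed++;
            }
            next += insn.size;
        }
        for (Insn insn : insns) {
            if (insn.target != null) {
                insn.target = oldToNew.getOrDefault(insn.target, insn.target);
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        List<Insn> code = new ArrayList<>(Arrays.asList(
                new Insn(0, 2, 5), new Insn(2, 3, null), new Insn(5, 1, 0)));
        System.out.println(modifySizes(code, insn -> 1)); // 2 (the last instruction already had size 1)
        System.out.println(code.get(0).target);           // 2 (old offset 5 is now offset 2)
    }
}
```

Passing `insn -> 1` mimics the unit-size normalization; a provider that returns null for every instruction leaves the list unchanged.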
-
modifyInstructionSize
public static boolean modifyInstructionSize(IDMethodContext ctx, IDInstruction targetInstruction, int minWantedSize)
Modify the size of an instruction within a CFG. No CFG block is inserted or deleted. Only instructions' sizes and offsets are updated. Note that the CFG is modified, but the CFG reference within the context is not changed (i.e. no new CFG is created).
- Parameters:
ctx - IR context
targetInstruction - the target instruction
minWantedSize - the minimal wanted size
- Returns:
- true if the instruction was modified, false if no modification was required
-
updateTargets
- Parameters:
ctx - IR context
oldToNewOffsets - a mapping of the old offsets to the corresponding new offsets
- Returns:
- the number of modifications
-
insertHeaderBlock
Insert a header block filled with NOP instructions. Essentially, the CFG is shifted downward to accommodate the insertion of a number of lead instructions (to be executed before the entry point).
This method never fails. The CFG and IR context information are updated to maintain consistency.
- Parameters:
ctx - an IR context
reqInsnCount - number of NOP instructions to be prepended
reqInsnSize - size set for each individual instruction
-
splitBlock
public static BasicBlock<IDInstruction> splitBlock(IDMethodContext ctx, BasicBlock<IDInstruction> blk, int index)
Split a block of the context's CFG into two blocks. The newly-created block will be set up to have the same irregular outputs as the original block. The context's exception information data will be updated accordingly.
- Parameters:
ctx - IR context
blk - block to split
index - index of the "split instruction" in the block; that instruction will be the first instruction of the newly-created block
- Returns:
- the newly-created block
-
mergeBlocks
Merge two blocks.
- Parameters:
ctx - IR context
firstBlock - the first block, to be merged with the subsequent (fall-through) block
- Returns:
- success indicator
-
generateBlockOffsetMap
Generate an IR-to-native offset conversion map.
- Returns:
- a map of (IR block offset -> Dalvik block offset)
-
isImmZero
Determine if an expression is an IR immediate holding the value 0.
- Parameters:
exp - an IR expression
- Returns:
- true if the provided expression is an immediate equivalent to the value 0
-
isImmValue
Determine if an expression is an IR immediate holding the provided value. The immediate value is determined according to its type. If the type of the expression is undetermined, a signed type is assumed.
- Parameters:
exp - an IR expression
value - the value to check against
- Returns:
- true if the provided expression is an immediate that evaluates to the provided value
-
isImmNonZero
Determine if an expression is an IR immediate not holding the value 0.
- Parameters:
exp - an IR expression
- Returns:
- true if the provided expression is an immediate not equivalent to the value 0
-
nextInstruction
public static IDInstruction nextInstruction(CFG<IDInstruction> cfg, BasicBlock<IDInstruction> b, int i)
Determine the meaningful next instruction in the execution flow, assuming no exception was raised. Jump targets are followed, no-operations are skipped.
- Parameters:
cfg - current CFG
b - current block
i - index of the current instruction in the block
- Returns:
- the next instruction, or null if it cannot be determined accurately
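The "follow jumps, skip no-operations" rule can be sketched over a flat offset-to-instruction map instead of a CFG and block index. All names below are hypothetical, and treating JCOND, RETURN and jump cycles as "cannot be determined" is an assumption about when null is returned.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class NextInsnSketch {
    enum Op { NOP, JUMP, JCOND, ASSIGN, RETURN }

    // Hypothetical instruction; `target` is only meaningful for JUMP and JCOND.
    static final class Insn {
        final Op op;
        final int offset;
        final int size;
        final int target;

        Insn(Op op, int offset, int size, int target) {
            this.op = op;
            this.offset = offset;
            this.size = size;
            this.target = target;
        }
    }

    // Builds an offset -> instruction lookup.
    static Map<Integer, Insn> index(Insn... insns) {
        Map<Integer, Insn> code = new HashMap<>();
        for (Insn insn : insns) {
            code.put(insn.offset, insn);
        }
        return code;
    }

    // Returns the next meaningful instruction, or null when it cannot be determined.
    static Insn next(Map<Integer, Insn> code, Insn current) {
        if (current.op == Op.JCOND || current.op == Op.RETURN) {
            return null; // two possible successors, or none
        }
        int at = current.op == Op.JUMP ? current.target : current.offset + current.size;
        Set<Integer> visited = new HashSet<>();
        while (visited.add(at)) {
            Insn candidate = code.get(at);
            if (candidate == null) {
                return null; // ran off the known code
            }
            if (candidate.op == Op.NOP) {
                at = candidate.offset + candidate.size;
            } else if (candidate.op == Op.JUMP) {
                at = candidate.target;
            } else {
                return candidate;
            }
        }
        return null; // cycle made only of jumps and no-operations
    }

    public static void main(String[] args) {
        Insn start = new Insn(Op.ASSIGN, 0, 1, -1);
        Map<Integer, Insn> code = index(start,
                new Insn(Op.NOP, 1, 1, -1), new Insn(Op.JUMP, 2, 1, 5),
                new Insn(Op.NOP, 5, 1, -1), new Insn(Op.ASSIGN, 6, 1, -1));
        System.out.println(next(code, start).offset); // 6
    }
}
```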
-
unroll
public static List<IDInstruction> unroll(IDMethodContext ctx, IDVar pivotVar, long addrStart, long addrEnd, int maxProcessedInsnCount, int maxUnrolledInsnCount, int maxLoopIterCnt)
Attempt to unroll some code whose pivot is a single variable. The generated code will not contain branching instructions (JUMP, JCOND, SWITCH, RETURN, THROW, etc.).
- Parameters:
ctx - IR method context (which references the CFG to operate on)
pivotVar - pivot variable
addrStart - start address (inclusive); generally, the instruction at that address initializes the pivot variable, although it is not a requirement
addrEnd - end address (exclusive), will stop the unrolling process
maxProcessedInsnCount - maximum number of instructions to process until the end is reached
maxUnrolledInsnCount - maximum number of unrolled instructions that may be generated
maxLoopIterCnt - maximum count of iterations to be unrolled (note: keep in mind that for values greater than 1, the resulting IR is likely to be larger than the original IR)
- Returns:
- null on error or failure, else the list of unrolled instructions (the list may contain duplicates; client code needs to use IDInstruction.duplicate() to create unique instructions)
-
determineInterval
public static Collection<BasicBlock<IDInstruction>> determineInterval(BasicBlock<IDInstruction> hdrblk)
Find the graph interval rooted in the provided header block.
An interval is defined as: Given a header node H, an interval I(H) is the maximal, single-entry subgraph in which H is the only entry node, and in which all closed paths contain H.
- Parameters:
hdrblk - header block
- Returns:
- interval nodes (the first node is the header)
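The definition above matches the classical interval construction: start with {H}, then keep adding any node whose predecessors all belong to the interval already. The sketch below implements that textbook procedure on a toy graph. `Block` and `link` are hypothetical, and whether dexdec computes intervals this way internally is not stated on this page.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class IntervalSketch {
    // Hypothetical basic block with predecessor and successor lists.
    static final class Block {
        final String name;
        final List<Block> preds = new ArrayList<>();
        final List<Block> succs = new ArrayList<>();

        Block(String name) {
            this.name = name;
        }
    }

    static void link(Block from, Block to) {
        from.succs.add(to);
        to.preds.add(from);
    }

    // Grows I(H) until no successor with all predecessors inside the interval remains.
    static List<Block> determineInterval(Block header) {
        Set<Block> interval = new LinkedHashSet<>();
        interval.add(header);
        boolean grew = true;
        while (grew) {
            grew = false;
            for (Block member : new ArrayList<>(interval)) {
                for (Block succ : member.succs) {
                    if (!interval.contains(succ) && interval.containsAll(succ.preds)) {
                        interval.add(succ);
                        grew = true;
                    }
                }
            }
        }
        return new ArrayList<>(interval); // the header comes first
    }

    public static void main(String[] args) {
        Block h = new Block("H"), a = new Block("A"), b = new Block("B"), c = new Block("C"), d = new Block("D");
        link(h, a);
        link(h, b);
        link(a, c);
        link(b, c);
        link(c, h); // closed path through H: allowed
        link(c, d);
        link(d, d); // closed path avoiding H: D stays outside I(H)
        StringBuilder names = new StringBuilder();
        for (Block blk : determineInterval(h)) {
            names.append(blk.name).append(' ');
        }
        System.out.println(names.toString().trim()); // H A B C
    }
}
```

The self-loop on D shows the "all closed paths contain H" condition at work: D has itself as a predecessor, so it can never satisfy the all-predecessors-inside test.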
-
canThrow
Determine whether the trailing instructions of a block can throw.
- Parameters:
b - block
from - starting instruction index
- Returns:
- true if the instructions may throw
-
canThrow
Determine whether a provided range of instructions of a block can throw.
- Parameters:
b -
from - starting index
to - final index (exclusive)
- Returns:
- true if the instructions may throw
-
isProtectedByCatchAll
Determine whether the provided block is protected by a catch-all (catching Throwable).
- Parameters:
ctx -
blk -
- Returns:
-