ROSE 0.11.145.192
Rose::BinaryAnalysis::InstructionSemantics::SourceAstSemantics Namespace Reference

Description

Generate C source AST from binary AST.

This semantic domain is used by the BinaryToSource analysis to generate low-level C source code from a binary. The semantic values of this domain are C expressions stored as source code strings. When a RISC operator such as "add" is invoked on two semantic values, say the C expressions "123" and "x", the result is a new value that holds a larger C expression, such as "(123 + x)". The concept is simple, but in practice this domain needs to handle three additional things:

Sizes other than 8, 16, 32, and 64: The semantic values know their exact size in bits and generate C code that uses the smallest allowable type to represent the value, one of uint8_t, uint16_t, uint32_t, or uint64_t. All values are unsigned for consistency, and operations such as sign extension are coded explicitly (this is how it happens in the instruction semantics layers, and the C code is a reflection of those operations). The generated C code uses masking (bit-wise AND) to ensure that unused high-order bits of the C value are zero (e.g., when storing a 5-bit value in a uint8_t the value will be masked with 0x1f).

Undefined behavior of C shift operations: The C language does not define the behavior of the shift operators when the shift count is greater than or equal to the width of the left-hand operand. But since the CPU defines these operations, and since one purpose of this translation is to be able to recompile a binary specimen for a different architecture, the translation needs to generate well-defined behavior in these cases. Therefore, all shift operations are protected with conditional code in the C output.

Multi-state vs. single-state: Instruction semantics can operate on multiple machine states at once. For instance, an x86 PUSH instruction might update the stack pointer register before writing to the stack, but then use the previous stack pointer value during the write operations. Some other operation might update the stack pointer and then use the new value. The generated C program has only a single state object: the global register variables and the global memory variable. Therefore, the generated code performs all calculations up front using static single assignment (SSA) and then generates the side effects that update the C program state.
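
The masking and shift-guarding points above can be illustrated with a minimal sketch. It is not the code that this domain emits: the helper names lowMask, maskToWidth, shiftLeft, and shiftRight are hypothetical, every value is held in a 64-bit integer rather than the smallest fitting type, and returning zero for an oversized shift count is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Mask that keeps the low nbits bits (1 <= nbits <= 64). The nbits == 64
// case is handled separately because shifting a 64-bit value by 64 is
// itself undefined.
constexpr std::uint64_t lowMask(unsigned nbits) {
    return nbits >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << nbits) - 1;
}

// Clear the unused high-order bits of an nbits-wide value.
constexpr std::uint64_t maskToWidth(std::uint64_t value, unsigned nbits) {
    return value & lowMask(nbits);
}

// Left shift of an nbits-wide value that is well defined for any count.
// Assumption: a count of nbits or more yields zero.
constexpr std::uint64_t shiftLeft(std::uint64_t value, unsigned nbits,
                                  std::uint64_t count) {
    return count >= nbits ? 0 : maskToWidth(value << count, nbits);
}

// Logical right shift with the same guard.
constexpr std::uint64_t shiftRight(std::uint64_t value, unsigned nbits,
                                   std::uint64_t count) {
    return count >= nbits ? 0 : maskToWidth(value, nbits) >> count;
}

// A 5-bit value is masked with 0x1f, as described in the text above.
inline void checkFiveBitExamples() {
    assert(lowMask(5) == 0x1f);
    assert(maskToWidth(0xff, 5) == 0x1f);
    assert(shiftLeft(1, 5, 5) == 0);   // count equals width: guarded
}
```

The guard compares the count against the logical width of the value rather than the width of the C type, so a 5-bit value shifted by 5 takes the guarded branch even though a shift by 5 is legal for the underlying integer type.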

For clues about how to use this domain, see Rose::BinaryAnalysis::BinaryToSource. In general, one constructs the domain and processes one instruction at a time. For each instruction, the domain's state is reset to an initial value, then the instruction is processed, then the side effect list is examined to generate the C code for the instruction.
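
The per-instruction workflow described above (reset, process, then read the side effect list) can be sketched with a toy model. Nothing here is ROSE API: Expr, Translator, and translatePush are hypothetical names, the register names and the mem[] array are placeholders, and only widths of 8, 16, 32, and 64 bits are handled. The sketch shows string-valued expressions, single-assignment temporaries computed up front, and side effects applied afterward.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a semantic value: a C expression and its width.
struct Expr {
    std::string text;
    unsigned nbits;
};

// Hypothetical stand-in for the per-instruction translation state.
class Translator {
    std::vector<std::string> temps_;  // single-assignment temporaries, in order
    std::vector<std::pair<std::string, std::string>> sideEffects_;  // (location, temp)

    // Bind an expression to a fresh temporary that is assigned exactly once.
    Expr save(const Expr &e) {
        std::string name = "t" + std::to_string(temps_.size());
        temps_.push_back("uint" + std::to_string(e.nbits) + "_t " + name +
                         " = " + e.text + ";");
        return Expr{name, e.nbits};
    }

public:
    void reset() { temps_.clear(); sideEffects_.clear(); }

    Expr constant(unsigned long long v, unsigned nbits) {
        return Expr{std::to_string(v), nbits};
    }
    Expr readRegister(const std::string &name, unsigned nbits) {
        return Expr{name, nbits};
    }
    Expr add(const Expr &a, const Expr &b) {
        return Expr{"(" + a.text + " + " + b.text + ")", a.nbits};
    }
    Expr subtract(const Expr &a, const Expr &b) {
        return Expr{"(" + a.text + " - " + b.text + ")", a.nbits};
    }

    // Writes are recorded, not applied, so later reads still see old state.
    void writeRegister(const std::string &name, const Expr &value) {
        Expr v = save(value);
        sideEffects_.push_back(std::make_pair(name, v.text));
    }
    void writeMemory(const Expr &address, const Expr &value) {
        Expr a = save(address);
        Expr v = save(value);
        sideEffects_.push_back(std::make_pair("mem[" + a.text + "]", v.text));
    }

    // All temporaries first, then the side effects that update program state.
    std::vector<std::string> emit() const {
        std::vector<std::string> lines = temps_;
        for (std::size_t i = 0; i < sideEffects_.size(); ++i)
            lines.push_back(sideEffects_[i].first + " = " + sideEffects_[i].second + ";");
        return lines;
    }
};

// A PUSH-like instruction: decrement the stack pointer, then store a register.
inline std::vector<std::string> translatePush() {
    Translator t;
    t.reset();  // each instruction starts from a fresh state
    assert(t.add(t.constant(123, 32), t.readRegister("x", 32)).text == "(123 + x)");
    Expr newSp = t.subtract(t.readRegister("esp", 32), t.constant(4, 32));
    t.writeRegister("esp", newSp);
    t.writeMemory(newSp, t.readRegister("eax", 32));
    return t.emit();
}
```

In this sketch the store address is computed from the old value of esp even though the register write was recorded first, because every temporary is evaluated before any side effect is applied.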

Classes

class  RiscOperators
 Basic semantic operations.
 
class  SValue
 Semantic values for generating C source code ASTs.
 

Typedefs

typedef Sawyer::SharedPointer< class SValue > SValuePtr
 Shared-ownership pointer for a binary-to-source semantic value.
 
typedef BaseSemantics::RegisterStateGeneric RegisterState
 Register state used by this domain.
 
typedef BaseSemantics::RegisterStateGenericPtr RegisterStatePtr
 Pointer to register states used by this domain.
 
typedef NullSemantics::MemoryState MemoryState
 Memory state used by this domain.
 
typedef NullSemantics::MemoryStatePtr MemoryStatePtr
 Pointer to memory states used by this domain.
 
typedef BaseSemantics::State State
 State used by this domain.
 
typedef BaseSemantics::StatePtr StatePtr
 Pointer to states used by this domain.
 
typedef boost::shared_ptr< class RiscOperators > RiscOperatorsPtr
 Shared-ownership pointer for basic semantic operations.
 

Typedef Documentation

◆ SValuePtr

Shared-ownership pointer for a binary-to-source semantic value.

Definition at line 57 of file SourceAstSemantics.h.

◆ RegisterState

Register state used by this domain.

Definition at line 160 of file SourceAstSemantics.h.

◆ RegisterStatePtr

Pointer to register states used by this domain.

Definition at line 161 of file SourceAstSemantics.h.

◆ MemoryState

Memory state used by this domain.

Definition at line 163 of file SourceAstSemantics.h.

◆ MemoryStatePtr

Pointer to memory states used by this domain.

Definition at line 164 of file SourceAstSemantics.h.

◆ State

State used by this domain.

Definition at line 166 of file SourceAstSemantics.h.

◆ StatePtr

Pointer to states used by this domain.

Definition at line 167 of file SourceAstSemantics.h.

◆ RiscOperatorsPtr

Shared-ownership pointer for basic semantic operations.

Definition at line 175 of file SourceAstSemantics.h.