ROSE 0.11.145.147
Generate C source AST from binary AST.
This semantic domain is used by the BinaryToSource analysis to generate low-level C source code from a binary. The semantic values of this domain are C expressions stored as source code strings. When a RISC operator, such as "add", is invoked on two semantic values, say the C expressions "123" and "x", the result is a new value that holds a larger C expression, such as "(123 + x)". The concept is quite simple, but in practice this domain needs to handle three additional things:
Sizes other than 8, 16, 32, and 64: The semantic values know their exact size in bits and generate C code that uses the smallest allowable type to represent the value, one of uint8_t, uint16_t, uint32_t, or uint64_t. All values are unsigned for consistency, and operations such as sign extension are coded explicitly (this is how it happens in the instruction semantics layers, and the C code is a reflection of those operations). The generated C code uses masking (bit-wise AND) to ensure that unused high-order bits of the C value are zero (e.g., when storing a 5-bit value in a uint8_t the value will be masked with 0x1f).
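The type selection and masking described above can be sketched as follows. This is only an illustration under stated assumptions: the helper names cTypeForWidth, maskForWidth, and addExpr are hypothetical and are not part of the ROSE API, and the exact text that ROSE emits may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: name of the smallest unsigned C type that can hold
// a value of nBits bits (1 through 64).
std::string cTypeForWidth(std::size_t nBits) {
    if (nBits == 0 || nBits > 64)
        throw std::invalid_argument("width must be 1..64 bits");
    if (nBits <= 8)  return "uint8_t";
    if (nBits <= 16) return "uint16_t";
    if (nBits <= 32) return "uint32_t";
    return "uint64_t";
}

// Hypothetical helper: hexadecimal literal with the low nBits bits set.
std::string maskForWidth(std::size_t nBits) {
    if (nBits == 0 || nBits > 64)
        throw std::invalid_argument("width must be 1..64 bits");
    // Shifting a 64-bit value by 64 is itself undefined, so handle it apart.
    const std::uint64_t mask =
        nBits == 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << nBits) - 1;
    std::ostringstream out;
    out << "0x" << std::hex << mask;
    return out.str();
}

// Hypothetical "add" operator: combine two C expression strings into a
// larger expression and mask away the unused high-order bits.
std::string addExpr(const std::string &a, const std::string &b, std::size_t nBits) {
    return "((" + a + " + " + b + ") & " + maskForWidth(nBits) + ")";
}
```

Under these assumptions, a 5-bit value is declared as uint8_t, and adding "123" and "x" at that width produces the text "((123 + x) & 0x1f)".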
Undefined behavior of C shift operations: The C language does not define the behavior of the shift operators when the shift count is greater than or equal to the width of the (promoted) left-hand operand. But since the CPU defines these operations, and since one purpose of this translation is to be able to recompile a binary specimen for a different architecture, the translation needs to generate well-defined behavior in these cases. Therefore, all shift operations are protected with conditional code in the C output.
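One way to protect a shift is sketched below. It assumes zero-fill semantics, so that a left shift by a count at least as large as the value's width yields zero. The names lowMask, shiftLeftDefined, and guardedShiftLeftText are hypothetical, and the conditional code that ROSE actually emits may look different.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Value with the low nBits bits set (nBits in 0..64).
std::uint64_t lowMask(unsigned nBits) {
    return nBits >= 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << nBits) - 1;
}

// Reference semantics for an nBits-wide left shift: counts that are as wide
// as or wider than the value give zero instead of undefined behavior.
std::uint64_t shiftLeftDefined(std::uint64_t value, std::uint64_t count, unsigned nBits) {
    if (count >= nBits)
        return 0;                               // guard: never reaches the raw shift
    return (value << count) & lowMask(nBits);   // count < nBits <= 64 here
}

// Hypothetical generator for the equivalent C text. The value is widened to
// uint64_t before shifting so that integer promotion to a signed int cannot
// cause overflow, and the result is masked back to nBits.
std::string guardedShiftLeftText(const std::string &value, const std::string &count,
                                 unsigned nBits) {
    std::ostringstream mask;
    mask << "0x" << std::hex << lowMask(nBits);
    return "((" + count + ") < " + std::to_string(nBits) +
           " ? (((uint64_t)(" + value + ") << (" + count + ")) & " + mask.str() +
           ") : 0)";
}
```

The guard is evaluated in the generated program at run time, so the same text is safe whether or not the count is a compile-time constant.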
Multi-state vs. single-state: Instruction semantics can operate on multiple machine states at once. For instance, an x86 PUSH instruction might update the stack pointer register before writing to the stack, but then use the original stack pointer value during the write operation. In some other operation it might update the stack pointer and then use the new value. The generated C program has only a single state object: the global variables for the registers and the global memory variable. Therefore, the generated code performs all calculations up front using static single assignment (SSA) and then generates the side effects that update the C program state.
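The "calculate first, then apply side effects" ordering can be illustrated with a small stand-alone sketch. The Machine struct and the pushSp function are hypothetical. A struct is used in place of the global register and memory variables only to keep the example testable, and the PUSH-like operation shown (storing the pre-decrement stack pointer) is an assumption made for illustration.

```cpp
#include <cstdint>
#include <map>

// Hypothetical single-state machine: one stack pointer and a sparse memory.
struct Machine {
    std::uint64_t sp;
    std::map<std::uint64_t, std::uint64_t> mem;
};

// A PUSH-like operation that stores the old stack pointer at the new top of
// the stack.
void pushSp(Machine &m) {
    // Phase 1: every intermediate value is assigned exactly once (SSA) and
    // is computed only from the state as it was before the instruction.
    const std::uint64_t t0 = m.sp;      // old stack pointer
    const std::uint64_t t1 = t0 - 8;    // new stack pointer

    // Phase 2: side effects. Because t0 and t1 are already fixed, the order
    // of these two writes cannot change the result.
    m.mem[t1] = t0;
    m.sp = t1;
}
```

If the writes were interleaved with the calculations instead, an expression that reads the stack pointer after it has been updated would silently see the new value.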
For clues about how to use this domain, see Rose::BinaryAnalysis::BinaryToSource. In general, one constructs the domain and processes one instruction at a time. For each instruction, the domain's state is reset to an initial value, then the instruction is processed, then the side effect list is examined to generate the C code for the instruction.
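The reset, process, and examine cycle described above might be organized as in the following sketch. It does not use the ROSE API: ToyDomain, SideEffect, ToyInstruction, and translate are hypothetical stand-ins, and an instruction is modeled simply as a callback that records its side effects.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical side effect: a C lvalue and the C expression assigned to it.
struct SideEffect {
    std::string location;
    std::string expression;
};

// Hypothetical stand-in for the semantic domain.
class ToyDomain {
public:
    void reset() { effects_.clear(); }   // return to the initial state
    void recordWrite(const std::string &location, const std::string &expression) {
        effects_.push_back({location, expression});
    }
    const std::vector<SideEffect> &sideEffects() const { return effects_; }

private:
    std::vector<SideEffect> effects_;
};

// Hypothetical instruction: a callback that drives the domain.
using ToyInstruction = std::function<void(ToyDomain &)>;

// Translate one instruction at a time: reset, process, then turn the
// recorded side effects into C assignment statements.
std::string translate(const std::vector<ToyInstruction> &instructions) {
    ToyDomain domain;
    std::string out;
    for (const ToyInstruction &insn : instructions) {
        domain.reset();
        insn(domain);
        for (const SideEffect &effect : domain.sideEffects())
            out += effect.location + " = " + effect.expression + ";\n";
    }
    return out;
}
```

Resetting the domain before each instruction keeps side effects recorded for one instruction from leaking into the code generated for the next.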
Classes
class | RiscOperators
Basic semantic operations.
class | SValue
Semantic values for generating C source code ASTs.
Typedefs
typedef Sawyer::SharedPointer< class SValue > | SValuePtr
Shared-ownership pointer for a binary-to-source semantic value.
typedef BaseSemantics::RegisterStateGeneric | RegisterState
Register state used by this domain.
typedef BaseSemantics::RegisterStateGenericPtr | RegisterStatePtr
Pointer to register states used by this domain.
typedef NullSemantics::MemoryState | MemoryState
Memory state used by this domain.
typedef NullSemantics::MemoryStatePtr | MemoryStatePtr
Pointer to memory states used by this domain.
typedef BaseSemantics::State | State
State used by this domain.
typedef BaseSemantics::StatePtr | StatePtr
Pointer to states used by this domain.
typedef boost::shared_ptr< class RiscOperators > | RiscOperatorsPtr
Shared-ownership pointer for basic semantic operations.
typedef Sawyer::SharedPointer<class SValue> Rose::BinaryAnalysis::InstructionSemantics::SourceAstSemantics::SValuePtr
Shared-ownership pointer for a binary-to-source semantic value.
Definition at line 57 of file SourceAstSemantics.h.
typedef BaseSemantics::RegisterStateGeneric Rose::BinaryAnalysis::InstructionSemantics::SourceAstSemantics::RegisterState
Register state used by this domain.
Definition at line 160 of file SourceAstSemantics.h.
typedef BaseSemantics::RegisterStateGenericPtr Rose::BinaryAnalysis::InstructionSemantics::SourceAstSemantics::RegisterStatePtr
Pointer to register states used by this domain.
Definition at line 161 of file SourceAstSemantics.h.
typedef NullSemantics::MemoryState Rose::BinaryAnalysis::InstructionSemantics::SourceAstSemantics::MemoryState
Memory state used by this domain.
Definition at line 163 of file SourceAstSemantics.h.
typedef NullSemantics::MemoryStatePtr Rose::BinaryAnalysis::InstructionSemantics::SourceAstSemantics::MemoryStatePtr
Pointer to memory states used by this domain.
Definition at line 164 of file SourceAstSemantics.h.
typedef BaseSemantics::State Rose::BinaryAnalysis::InstructionSemantics::SourceAstSemantics::State
State used by this domain.
Definition at line 166 of file SourceAstSemantics.h.
typedef BaseSemantics::StatePtr Rose::BinaryAnalysis::InstructionSemantics::SourceAstSemantics::StatePtr
Pointer to states used by this domain.
Definition at line 167 of file SourceAstSemantics.h.
typedef boost::shared_ptr<class RiscOperators> Rose::BinaryAnalysis::InstructionSemantics::SourceAstSemantics::RiscOperatorsPtr
Shared-ownership pointer for basic semantic operations.
Definition at line 175 of file SourceAstSemantics.h.