ROSE 0.11.145.147
|
Engine for specimens containing machine instructions.
This engine is responsible for creating a partitioner for a specimen that has machine instructions such as the Intel x86 family of instruction sets, Arm instruction sets, PowerPC instruction sets, Motorola instruction sets, MIPS instruction sets, etc. It is specifically not to be used for byte code targeting the likes of the Java Virtual Machine (JVM) or the Common Language Runtime (CLR).
This engine provides a static member function, instance, that instantiates an engine of this type on the heap and returns a shared-ownership pointer to the instance. Refer to the base class, Partitioner2::Engine, to learn how to instantiate engines from factories.
This engine uses a hybrid approach combining linear and recursive disassembly. Linear disassembly progresses by starting at some low address in the specimen address space, disassembling one instruction, and then moving on to the next (fallthrough) address and repeating. This approach is quite good at disassembling everything (especially for fixed length instructions) but makes no attempt to organize instructions according to flow of control. On the other hand, recursive disassembly uses a work list containing known instruction addresses, disassembles an instruction from the work list, determines its control flow successors, and adds those addresses to the work list. As a side effect, it produces a control flow graph. ROSE's hybrid approach uses linear disassembly to find starting points using heuristics such as common compiler function prologues and epilogues, references from symbol tables of various types, and other available data. Once starting points are known, ROSE uses recursive disassembly to follow the control flow. Various kinds of analysis and heuristics are used to control the finer points of recursive disassembly. Part of the trick to a successful and accurate disassembly, a problem that is essentially equivalent to the halting problem, is finding the right balance between the pure recursive approach and the heuristics and analyses. This balance is often different for each kind of specimen.
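The difference between the two strategies can be sketched on a toy instruction set. The following is only an illustration: the opcodes, the Insn structure, and the functions decode, linearSweep, and recursiveDescent are hypothetical names invented here and are not part of ROSE.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical toy opcodes, invented for this sketch (not a real ISA or a ROSE type).
const std::uint8_t NOP = 0x01;  // 1 byte, falls through
const std::uint8_t JMP = 0x02;  // 2 bytes: opcode, absolute target address
const std::uint8_t BR  = 0x03;  // 2 bytes: opcode, target; may also fall through
const std::uint8_t RET = 0x04;  // 1 byte, no successors

struct Insn {
    std::size_t size;                     // encoded size in bytes
    std::vector<std::size_t> successors;  // control flow successor addresses
};

// Decode the instruction at va. Returns false if the bytes are not valid code.
bool decode(const std::vector<std::uint8_t> &mem, std::size_t va, Insn &insn) {
    if (va >= mem.size())
        return false;
    insn.successors.clear();
    const std::uint8_t op = mem[va];
    if (op == NOP) {
        insn.size = 1;
        insn.successors.push_back(va + 1);
    } else if (op == RET) {
        insn.size = 1;
    } else if ((op == JMP || op == BR) && va + 1 < mem.size()) {
        insn.size = 2;
        insn.successors.push_back(mem[va + 1]);
        if (op == BR)
            insn.successors.push_back(va + 2);
    } else {
        return false;
    }
    return true;
}

// Linear disassembly: decode at the lowest address, then keep moving to the next address.
std::vector<std::size_t> linearSweep(const std::vector<std::uint8_t> &mem) {
    std::vector<std::size_t> found;
    std::size_t va = 0;
    while (va < mem.size()) {
        Insn insn;
        if (decode(mem, va, insn)) {
            found.push_back(va);
            va += insn.size;
        } else {
            ++va;  // skip a byte that does not decode
        }
    }
    return found;
}

// Recursive disassembly: follow successors from a work list. The result maps each
// discovered instruction address to its successors, which is a control flow graph.
std::map<std::size_t, std::vector<std::size_t> >
recursiveDescent(const std::vector<std::uint8_t> &mem, std::size_t entry) {
    std::map<std::size_t, std::vector<std::size_t> > cfg;
    std::vector<std::size_t> workList(1, entry);
    while (!workList.empty()) {
        const std::size_t va = workList.back();
        workList.pop_back();
        Insn insn;
        if (cfg.count(va) != 0 || !decode(mem, va, insn))
            continue;  // already processed, or not decodable
        cfg[va] = insn.successors;
        for (std::size_t i = 0; i < insn.successors.size(); ++i)
            workList.push_back(insn.successors[i]);
    }
    return cfg;
}
```

For the bytes {0x02, 0x04, 0x01, 0xFF, 0x03, 0x07, 0x04, 0x04}, linearSweep reports address 2, a stray data byte that happens to decode as NOP, whereas recursiveDescent starting at address 0 never visits it because no control flow leads there.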
Definition at line 48 of file EngineBinary.h.
#include <Rose/BinaryAnalysis/Partitioner2/EngineBinary.h>
Public Types | |
using | Ptr = EngineBinaryPtr |
Shared ownership pointer. | |
Public Types inherited from Rose::BinaryAnalysis::Partitioner2::Engine | |
using | Ptr = EnginePtr |
Shared ownership pointer. | |
Public Member Functions | |
virtual void | loadVxCore (const std::string &spec) |
Parses a vxcore specification and initializes memory. | |
virtual void | loadContainers (const std::vector< std::string > &fileNames) |
Loads memory from binary containers. | |
virtual void | loadNonContainers (const std::vector< std::string > &names) |
Loads memory from non-containers. | |
virtual PartitionerPtr | createTunedPartitioner () |
Create a tuned partitioner. | |
virtual PartitionerPtr | createPartitionerFromAst (SgAsmInterpretation *) |
Create a partitioner from an AST. | |
virtual bool | partitionCilSections (const PartitionerPtr &) |
Partition any sections containing CIL code. | |
bool | hasCilCodeSection () |
Determine whether the interpretation header contains a CIL code section. | |
virtual std::vector< FunctionPtr > | makeEntryFunctions (const PartitionerPtr &, SgAsmInterpretation *) |
Make functions at specimen entry addresses. | |
virtual std::vector< FunctionPtr > | makeErrorHandlingFunctions (const PartitionerPtr &, SgAsmInterpretation *) |
Make functions at error handling addresses. | |
virtual std::vector< FunctionPtr > | makeImportFunctions (const PartitionerPtr &, SgAsmInterpretation *) |
Make functions at import trampolines. | |
virtual std::vector< FunctionPtr > | makeExportFunctions (const PartitionerPtr &, SgAsmInterpretation *) |
Make functions at export addresses. | |
virtual std::vector< FunctionPtr > | makeSymbolFunctions (const PartitionerPtr &, SgAsmInterpretation *) |
Make functions for symbols. | |
virtual std::vector< FunctionPtr > | makeContainerFunctions (const PartitionerPtr &, SgAsmInterpretation *) |
Make functions based on specimen container. | |
virtual std::vector< FunctionPtr > | makeInterruptVectorFunctions (const PartitionerPtr &, const AddressInterval &vector) |
Make functions from an interrupt vector. | |
virtual std::vector< FunctionPtr > | makeUserFunctions (const PartitionerPtr &, const std::vector< rose_addr_t > &) |
Make a function at each specified address. | |
virtual void | discoverBasicBlocks (const PartitionerPtr &) |
Discover as many basic blocks as possible. | |
virtual FunctionPtr | makeNextDataReferencedFunction (const PartitionerConstPtr &, rose_addr_t &startVa) |
Scan read-only data to find function pointers. | |
virtual FunctionPtr | makeNextCodeReferencedFunction (const PartitionerConstPtr &) |
Scan instruction ASTs to find function pointers. | |
virtual std::vector< FunctionPtr > | makeCalledFunctions (const PartitionerPtr &) |
Make functions for function call edges. | |
virtual std::vector< FunctionPtr > | makeFunctionFromInterFunctionCalls (const PartitionerPtr &, rose_addr_t &startVa) |
Make functions from inter-function calls. | |
virtual void | discoverFunctions (const PartitionerPtr &) |
Discover as many functions as possible. | |
virtual std::set< rose_addr_t > | attachDeadCodeToFunction (const PartitionerPtr &, const FunctionPtr &, size_t maxIterations=size_t(-1)) |
Attach dead code to function. | |
virtual DataBlockPtr | attachPaddingToFunction (const PartitionerPtr &, const FunctionPtr &) |
Attach function padding to function. | |
virtual std::vector< DataBlockPtr > | attachPaddingToFunctions (const PartitionerPtr &) |
Attach padding to all functions. | |
virtual size_t | attachAllSurroundedCodeToFunctions (const PartitionerPtr &) |
Attach all possible intra-function basic blocks to functions. | |
virtual size_t | attachSurroundedCodeToFunctions (const PartitionerPtr &) |
Attach intra-function basic blocks to functions. | |
virtual void | attachBlocksToFunctions (const PartitionerPtr &) |
Attach basic blocks to functions. | |
virtual std::set< rose_addr_t > | attachDeadCodeToFunctions (const PartitionerPtr &, size_t maxIterations=size_t(-1)) |
Attach dead code to functions. | |
virtual std::vector< DataBlockPtr > | attachSurroundedDataToFunctions (const PartitionerPtr &) |
Attach intra-function data to functions. | |
virtual bool | makeNextCallReturnEdge (const PartitionerPtr &, boost::logic::tribool assumeCallReturns) |
Insert a call-return edge and discover its basic block. | |
virtual BasicBlockPtr | makeNextBasicBlockFromPlaceholder (const PartitionerPtr &) |
Discover basic block at next placeholder. | |
virtual BasicBlockPtr | makeNextBasicBlock (const PartitionerPtr &) |
Discover a basic block. | |
virtual bool | matchFactory (const std::vector< std::string > &specimen) const override |
Predicate for matching a concrete engine factory by settings and specimen. | |
virtual EnginePtr | instanceFromFactory (const Settings &) override |
Virtual constructor for factories. | |
virtual void | reset () override |
Reset the engine to its initial state. | |
SgAsmBlock * | frontend (const std::vector< std::string > &args, const std::string &purpose, const std::string &description) override |
Most basic usage of the partitioner. | |
virtual SgAsmInterpretation * | parseContainers (const std::vector< std::string > &fileNames) override |
Parse specimen binary containers. | |
virtual MemoryMapPtr | loadSpecimens (const std::vector< std::string > &fileNames=std::vector< std::string >()) override |
Load and/or link interpretation. | |
virtual PartitionerPtr | partition (const std::vector< std::string > &fileNames=std::vector< std::string >()) override |
Partition instructions into basic blocks and functions. | |
virtual SgAsmBlock * | buildAst (const std::vector< std::string > &fileNames=std::vector< std::string >()) override |
Obtain an abstract syntax tree. | |
virtual std::list< Sawyer::CommandLine::SwitchGroup > | commandLineSwitches () override |
Command-line switches for a particular engine. | |
virtual std::pair< std::string, std::string > | specimenNameDocumentation () override |
Documentation about how the specimen is specified. | |
virtual bool | isNonContainer (const std::string &) override |
Determine whether a specimen name is a non-container. | |
virtual bool | areContainersParsed () const override |
Returns true if containers are parsed. | |
virtual PartitionerPtr | createPartitioner () override |
Create partitioner. | |
virtual void | runPartitionerInit (const PartitionerPtr &) override |
Finds interesting things to work on initially. | |
virtual void | runPartitionerRecursive (const PartitionerPtr &) override |
Runs the recursive part of partitioning. | |
virtual void | runPartitionerFinal (const PartitionerPtr &) override |
Runs the final parts of partitioning. | |
virtual SgProject * | roseFrontendReplacement (const std::vector< boost::filesystem::path > &fileNames) override |
SgAsmBlock * | frontend (int argc, char *argv[], const std::string &purpose, const std::string &description) |
Most basic usage of the partitioner. | |
virtual SgAsmBlock * | frontend (const std::vector< std::string > &args, const std::string &purpose, const std::string &description)=0 |
Most basic usage of the partitioner. | |
virtual SgAsmInterpretation * | parseContainers (const std::vector< std::string > &fileNames)=0 |
Parse specimen binary containers. | |
SgAsmInterpretation * | parseContainers (const std::string &fileName) |
virtual MemoryMapPtr | loadSpecimens (const std::vector< std::string > &fileNames=std::vector< std::string >())=0 |
Load and/or link interpretation. | |
MemoryMapPtr | loadSpecimens (const std::string &fileName) |
virtual PartitionerPtr | partition (const std::vector< std::string > &fileNames=std::vector< std::string >())=0 |
Partition instructions into basic blocks and functions. | |
PartitionerPtr | partition (const std::string &fileName) |
virtual SgAsmBlock * | buildAst (const std::vector< std::string > &fileNames=std::vector< std::string >())=0 |
Obtain an abstract syntax tree. | |
SgAsmBlock * | buildAst (const std::string &fileName) |
virtual BinaryLoaderPtr | obtainLoader (const BinaryLoaderPtr &hint) |
Obtain a binary loader. | |
virtual BinaryLoaderPtr | obtainLoader () |
Obtain a binary loader. | |
virtual std::vector< FunctionPtr > | makeNextPrologueFunction (const PartitionerPtr &, rose_addr_t startVa) |
Make function at prologue pattern. | |
virtual std::vector< FunctionPtr > | makeNextPrologueFunction (const PartitionerPtr &, rose_addr_t startVa, rose_addr_t &lastSearchedVa) |
Make function at prologue pattern. | |
BinaryLoaderPtr | binaryLoader () const |
Property: binary loader. | |
virtual void | binaryLoader (const BinaryLoaderPtr &) |
Property: binary loader. | |
ThunkPredicatesPtr | functionMatcherThunks () const |
Property: Predicate for finding functions that are thunks. | |
virtual void | functionMatcherThunks (const ThunkPredicatesPtr &) |
Property: Predicate for finding functions that are thunks. | |
ThunkPredicatesPtr | functionSplittingThunks () const |
Property: Predicate for finding thunks at the start of functions. | |
virtual void | functionSplittingThunks (const ThunkPredicatesPtr &) |
Property: Predicate for finding thunks at the start of functions. | |
Public Member Functions inherited from Rose::BinaryAnalysis::Partitioner2::Engine | |
std::list< Sawyer::CommandLine::SwitchGroup > | allCommandLineSwitches () |
List of command-line switches for all engines. | |
virtual void | addToParser (Sawyer::CommandLine::Parser &) |
Add switches and sections to command-line parser. | |
void | addAllToParser (Sawyer::CommandLine::Parser &) |
Add switches and sections to command-line parser. | |
virtual Sawyer::CommandLine::Parser | commandLineParser (const std::string &purpose, const std::string &description) |
Creates a command-line parser. | |
bool | isFactory () const |
Returns true if this object is a factory. | |
virtual void | savePartitioner (const PartitionerConstPtr &, const boost::filesystem::path &, SerialIo::Format=SerialIo::BINARY) |
virtual PartitionerPtr | loadPartitioner (const boost::filesystem::path &, SerialIo::Format=SerialIo::BINARY) |
virtual void | checkSettings () |
Check settings after command-line is processed. | |
virtual bool | isRbaFile (const std::string &) |
Determine whether a specimen is an RBA file. | |
virtual bool | areSpecimensLoaded () const |
Returns true if specimens are loaded. | |
virtual void | adjustMemoryMap () |
Adjust memory map post-loading. | |
virtual void | checkCreatePartitionerPrerequisites () const |
Check that we have everything necessary to create a partitioner. | |
virtual PartitionerPtr | createBarePartitioner () |
Create a bare partitioner. | |
virtual void | runPartitioner (const PartitionerPtr &) |
Partitions instructions into basic blocks and functions. | |
virtual void | labelAddresses (const PartitionerPtr &, const Configuration &) |
Label addresses. | |
virtual std::vector< DataBlockPtr > | makeConfiguredDataBlocks (const PartitionerPtr &, const Configuration &) |
Make data blocks based on configuration. | |
virtual std::vector< FunctionPtr > | makeConfiguredFunctions (const PartitionerPtr &, const Configuration &) |
Make functions based on configuration information. | |
virtual void | updateAnalysisResults (const PartitionerPtr &) |
Runs various analysis passes. | |
Architecture::BaseConstPtr | architecture () |
Property: Architecture. | |
SgAsmBlock * | frontend (int argc, char *argv[], const std::string &purpose, const std::string &description) |
Most basic usage of the partitioner. | |
Sawyer::CommandLine::ParserResult | parseCommandLine (int argc, char *argv[], const std::string &purpose, const std::string &description) |
Parse the command-line. | |
virtual Sawyer::CommandLine::ParserResult | parseCommandLine (const std::vector< std::string > &args, const std::string &purpose, const std::string &description) |
Parse the command-line. | |
SgAsmBlock * | buildAst (const std::string &fileName) |
Obtain an abstract syntax tree. | |
SgAsmInterpretation * | parseContainers (const std::string &fileName) |
Parse specimen binary containers. | |
MemoryMapPtr | loadSpecimens (const std::string &fileName) |
Load and/or link interpretation. | |
PartitionerPtr | partition (const std::string &fileName) |
Partition instructions into basic blocks and functions. | |
MemoryMapPtr | memoryMap () const |
Property: memory map. | |
virtual void | memoryMap (const MemoryMapPtr &) |
Property: memory map. | |
virtual Architecture::BaseConstPtr | obtainArchitecture () |
Determine the architecture. | |
virtual Architecture::BaseConstPtr | obtainArchitecture (const Architecture::BaseConstPtr &hint) |
Determine the architecture. | |
const std::string & | name () const |
Property: Name. | |
void | name (const std::string &) |
Property: Name. | |
const Settings & | settings () const |
Property: All settings. | |
Settings & | settings () |
Property: All settings. | |
void | settings (const Settings &) |
Property: All settings. | |
BasicBlockWorkList::Ptr | basicBlockWorkList () const |
Property: BasicBlock work list. | |
void | basicBlockWorkList (const BasicBlockWorkList::Ptr &) |
Property: BasicBlock work list. | |
CodeConstants::Ptr | codeFunctionPointers () const |
Property: Instruction AST constants. | |
void | codeFunctionPointers (const CodeConstants::Ptr &) |
Property: Instruction AST constants. | |
SgAsmInterpretation * | interpretation () const |
Property: interpretation. | |
virtual void | interpretation (SgAsmInterpretation *) |
Property: interpretation. | |
ProgressPtr | progress () const |
Property: progress reporting. | |
virtual void | progress (const ProgressPtr &) |
Property: progress reporting. | |
const std::vector< std::string > & | specimen () const |
Property: specimen. | |
virtual void | specimen (const std::vector< std::string > &) |
Property: specimen. | |
Public Member Functions inherited from Sawyer::SharedObject | |
SharedObject () | |
Default constructor. | |
SharedObject (const SharedObject &) | |
Copy constructor. | |
virtual | ~SharedObject () |
Virtual destructor. | |
SharedObject & | operator= (const SharedObject &) |
Assignment. | |
Public Member Functions inherited from Sawyer::SharedFromThis< Engine > | |
SharedPointer< Engine > | sharedFromThis () |
Create a shared pointer from this. | |
SharedPointer< const Engine > | sharedFromThis () const |
Create a shared pointer from this. | |
Static Public Member Functions | |
static Ptr | instance () |
Allocating constructor. | |
static Ptr | instance (const Settings &) |
Allocating constructor with settings. | |
static Ptr | factory () |
Allocate a factory. | |
static Sawyer::CommandLine::SwitchGroup | engineSwitches (EngineSettings &) |
Command-line switches related to the general engine behavior. | |
static Sawyer::CommandLine::SwitchGroup | loaderSwitches (LoaderSettings &) |
Command-line switches related to loading specimen into memory. | |
static Sawyer::CommandLine::SwitchGroup | disassemblerSwitches (DisassemblerSettings &) |
Command-line switches related to decoding instructions. | |
static Sawyer::CommandLine::SwitchGroup | partitionerSwitches (PartitionerSettings &) |
Command-line switches related to partitioning instructions. | |
static Sawyer::CommandLine::SwitchGroup | astConstructionSwitches (AstConstructionSettings &) |
Command-line switches related to constructing an AST from the partitioner. | |
Static Public Member Functions inherited from Rose::BinaryAnalysis::Partitioner2::Engine | |
static EngineBinaryPtr | instance () |
static std::list< std::pair< std::string, std::string > > | allSpecimenNameDocumentation () |
Documentation for all specimen specifications. | |
static void | registerFactory (const EnginePtr &factory) |
Register an engine as a factory. | |
static bool | deregisterFactory (const EnginePtr &factory) |
Remove a concrete engine factory from the registry. | |
static std::vector< EnginePtr > | registeredFactories () |
List of all registered factories. | |
static void | disassembleForRoseFrontend (SgAsmInterpretation *) |
static EnginePtr | forge (const std::vector< std::string > &specimen) |
Creates a suitable engine based on the specimen. | |
static EnginePtr | forge (const std::string &specimen) |
Creates a suitable engine based on the specimen. | |
static EnginePtr | forge (const std::vector< std::string > &arguments, Sawyer::CommandLine::Parser &, const PositionalArgumentParser &, const Settings &) |
Creates a suitable engine based on the specimen. | |
static EnginePtr | forge (const std::vector< std::string > &arguments, Sawyer::CommandLine::Parser &, const PositionalArgumentParser &) |
Creates a suitable engine based on the specimen. | |
static EnginePtr | forge (const std::vector< std::string > &arguments, Sawyer::CommandLine::Parser &, const Settings &) |
Creates a suitable engine based on the specimen. | |
static EnginePtr | forge (const std::vector< std::string > &arguments, Sawyer::CommandLine::Parser &) |
Creates a suitable engine based on the specimen. | |
static EnginePtr | forge (int argc, char *argv[], Sawyer::CommandLine::Parser &, const PositionalArgumentParser &, const Settings &) |
Creates a suitable engine based on the specimen. | |
static EnginePtr | forge (int argc, char *argv[], Sawyer::CommandLine::Parser &, const PositionalArgumentParser &) |
Creates a suitable engine based on the specimen. | |
static EnginePtr | forge (int argc, char *argv[], Sawyer::CommandLine::Parser &, const Settings &) |
Creates a suitable engine based on the specimen. | |
static EnginePtr | forge (int argc, char *argv[], Sawyer::CommandLine::Parser &) |
Creates a suitable engine based on the specimen. | |
Protected Member Functions | |
EngineBinary ()=delete | |
Default constructor. | |
EngineBinary (const Settings &) | |
Protected Member Functions inherited from Rose::BinaryAnalysis::Partitioner2::Engine | |
Engine ()=delete | |
Default constructor. | |
Engine (const Engine &)=delete | |
Engine & | operator= (const Engine &)=delete |
Engine (const std::string &name, const Settings &settings) | |
Allocating instance constructors are implemented by the non-abstract subclasses. | |
Shared ownership pointer.
Definition at line 54 of file EngineBinary.h.
|
protected delete |
Default constructor.
Deleted, use factory method instance() instead.
|
static |
Command-line switches related to the general engine behavior.
The switches are configured to adjust the specified settings object when parsed.
|
static |
Command-line switches related to loading specimen into memory.
The switches are configured to adjust the specified settings object when parsed.
|
static |
Command-line switches related to decoding instructions.
The switches are configured to adjust the specified settings object when parsed.
|
static |
Command-line switches related to partitioning instructions.
The switches are configured to adjust the specified settings object when parsed.
|
static |
Command-line switches related to constructing an AST from the partitioner.
The switches are configured to adjust the specified settings object when parsed.
|
virtual |
Parses a vxcore specification and initializes memory.
Parses a VxWorks core dump in the format defined by Jim Leek and loads the data into ROSE's analysis memory. The argument should be everything after the first colon in the URL "vxcore:[MEMORY_ATTRS]:[FILE_ATTRS]:FILE_NAME".
|
virtual |
Obtain a binary loader.
Find a suitable binary loader by one of the following methods (in this order):
If a hint is supplied, use it.
Otherwise, fail by throwing std::runtime_error.
In any case, the binaryLoader property is set to this method's return value.
|
virtual |
Obtain a binary loader.
Find a suitable binary loader by one of the following methods (in this order):
If a hint is supplied, use it.
Otherwise, fail by throwing std::runtime_error.
In any case, the binaryLoader property is set to this method's return value.
|
virtual |
Loads memory from binary containers.
If the engine has an interpretation whose memory map is missing or empty, then the engine obtains a binary loader via obtainLoader and invokes its load method on the interpretation. It then copies the interpretation's memory map into the engine (if present, or leaves it as is).
|
virtual |
Loads memory from non-containers.
Processes each non-container string (as determined by isNonContainer) and modifies the memory map according to the string.
|
virtual |
Create a tuned partitioner.
Returns a partitioner that is tuned to operate on a specific instruction set architecture. A memoryMap must be assigned already, either explicitly or as the result of earlier steps.
|
virtual |
Create a partitioner from an AST.
Partitioner data structures are often more useful and more efficient for analysis than an AST. This method initializes the engine and a new partitioner with information from the AST.
|
virtual |
Partition any sections containing CIL code.
Decodes and partitions any sections of type SgAsmCliHeader. These sections contain CIL byte code.
Returns true if a section containing CIL code was found, false otherwise.
bool Rose::BinaryAnalysis::Partitioner2::EngineBinary::hasCilCodeSection ()
Determine whether the interpretation header contains a CIL code section.
When the interpretation has a header with a section named "CLR Runtime Header", it contains CIL code. This predicate returns true for such interpretations.
|
virtual |
Make functions at specimen entry addresses.
Creates a function at each specimen entry address for all headers in the specified interpretation and adds the functions to the specified partitioner's CFG/AUM.
Returns a list of such functions, some of which may have existed prior to this call.
|
virtual |
Make functions at error handling addresses.
Makes a function at each error handling address in the specified interpretation and inserts the function into the specified partitioner's CFG/AUM.
Returns the list of such functions, some of which may have existed prior to this call.
|
virtual |
Make functions at import trampolines.
Makes a function at each import trampoline and inserts them into the specified partitioner's CFG/AUM. An import trampoline is a thunk that branches to a dynamically loaded/linked function. Since ROSE does not necessarily load/link dynamic functions, they often don't appear in the executable. Therefore, this function can be called to create functions from the trampolines and give them the same name as the function they would have called had the link step been performed.
Returns a list of such functions, some of which may have existed prior to this call.
|
virtual |
Make functions at export addresses.
Makes a function at each address that is indicated as being an exported function, and inserts them into the specified partitioner's CFG/AUM.
Returns a list of such functions, some of which may have existed prior to this call.
|
virtual |
Make functions for symbols.
Makes a function for each function symbol in the various symbol tables under the specified interpretation and inserts them into the specified partitioner's CFG/AUM.
Returns a list of such functions, some of which may have existed prior to this call.
|
virtual |
Make functions based on specimen container.
Traverses the specified interpretation parsed from, for example, related ELF or PE containers, and makes functions at certain addresses that correspond to specimen entry points, imports and exports, symbol tables, etc. This method only calls many of the other "make*Functions" methods and accumulates their results.
Returns a list of such functions, some of which may have existed prior to this call.
|
virtual |
Make functions from an interrupt vector.
Reads the interrupt vector and builds functions for its entries. The functions are inserted into the partitioner's CFG/AUM.
Returns the list of such functions, some of which may have existed prior to this call.
|
virtual |
Make a function at each specified address.
A function is created at each address and is attached to the partitioner's CFG/AUM. Returns a list of such functions, some of which may have existed prior to this call.
|
virtual |
Discover as many basic blocks as possible.
Processes the "undiscovered" work list until the list becomes empty. This list is the list of basic block placeholders for which no attempt has been made to discover instructions. This method implements a recursive descent disassembler, although it does not process the control flow edges in any particular order. Subclasses are expected to override this to implement a more directed approach to discovering basic blocks.
|
virtual |
Scan read-only data to find function pointers.
Scans read-only data beginning at the specified address in order to find pointers to code, and makes a new function when one is found. The pointer must be word aligned and located in memory that's mapped read-only (not writable and not executable), and it must not point to an unknown instruction or an instruction that overlaps with any instruction that's already in the CFG/AUM.
Returns a pointer to a newly-allocated function that has not yet been attached to the CFG/AUM, or a null pointer if no function was found. In any case, the startVa is updated so it points to the next read-only address to check.
Functions created in this manner have the SgAsmFunction::FUNC_SCAN_RO_DATA reason.
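The scan described above can be sketched roughly as follows. This is a simplified model, not ROSE's implementation: the ReadOnlySegment type, the function nextCodePointer, the 4-byte little-endian word size, and the set of already-used addresses are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical model of a read-only, non-executable data segment (not a ROSE type).
struct ReadOnlySegment {
    std::uint32_t base;               // virtual address of bytes[0], assumed word aligned
    std::vector<std::uint8_t> bytes;  // segment contents
};

// Read a 4-byte little-endian word (the word size and byte order are assumptions).
std::uint32_t readWord(const std::vector<std::uint8_t> &bytes, std::size_t i) {
    return static_cast<std::uint32_t>(bytes[i]) |
           static_cast<std::uint32_t>(bytes[i + 1]) << 8 |
           static_cast<std::uint32_t>(bytes[i + 2]) << 16 |
           static_cast<std::uint32_t>(bytes[i + 3]) << 24;
}

// Scan word-aligned addresses starting at startVa for a value that points into the
// executable range [codeBegin, codeEnd) and is not already used by a known instruction.
// Returns true and sets target when a candidate is found. In either case startVa is
// advanced past the words that were checked, so the caller can resume the scan.
bool nextCodePointer(const ReadOnlySegment &ro, std::uint32_t codeBegin, std::uint32_t codeEnd,
                     const std::set<std::uint32_t> &usedAddresses,
                     std::uint32_t &startVa, std::uint32_t &target) {
    const std::uint32_t wordSize = 4;
    const std::uint32_t end = ro.base + static_cast<std::uint32_t>(ro.bytes.size());
    if (startVa < ro.base)
        startVa = ro.base;
    startVa = (startVa + wordSize - 1) / wordSize * wordSize;  // round up to word alignment
    while (startVa + wordSize <= end) {
        const std::uint32_t value = readWord(ro.bytes, startVa - ro.base);
        startVa += wordSize;
        if (value >= codeBegin && value < codeEnd && usedAddresses.count(value) == 0) {
            target = value;
            return true;
        }
    }
    return false;
}
```

A caller would invoke this repeatedly with the same startVa variable until it returns false, creating one function per reported target.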
|
virtual |
Scan instruction ASTs to function pointers.
Scans each instruction to find pointers to code and makes a new function when one is found. The pointer must be word aligned and located in memory that's mapped read-only (not writable and not executable), and it must not point to an unknown instruction or an instruction that overlaps with any instruction that's already in the CFG/AUM.
This function requires that the partitioner has been initialized to track instruction ASTs as they are added to and removed from the CFG/AUM.
Returns a pointer to a newly-allocated function that has not yet been attached to the CFG/AUM, or a null pointer if no function was found.
Functions created in this manner have the SgAsmFunction::FUNC_INSN_RO_DATA reason.
|
virtual |
Make functions for function call edges.
Scans the partitioner's CFG to find edges that are marked as function calls and makes a function at each target address that is concrete. The function is added to the specified partitioner's CFG/AUM.
Returns a list of such functions, some of which may have existed prior to this call.
|
virtual |
Make function at prologue pattern.
Scans executable memory starting at the specified address and which is not represented in the CFG/AUM and looks for byte patterns and/or instruction patterns that indicate the start of a function. When a pattern is found a function (or multiple functions, depending on the type of matcher) is created and inserted into the specified partitioner's CFG/AUM.
Patterns are found by calling the Partitioner::nextFunctionPrologue method, which most likely invokes a variety of predefined and user-defined callbacks to search for the next pattern.
Returns a vector of non-null function pointers for the newly inserted functions, otherwise returns an empty vector. If the lastSearchedVa is provided, it will be set to the highest address at which a function prologue was searched.
|
virtual |
Make function at prologue pattern.
Scans executable memory starting at the specified address and which is not represented in the CFG/AUM and looks for byte patterns and/or instruction patterns that indicate the start of a function. When a pattern is found a function (or multiple functions, depending on the type of matcher) is created and inserted into the specified partitioner's CFG/AUM.
Patterns are found by calling the Partitioner::nextFunctionPrologue method, which most likely invokes a variety of predefined and user-defined callbacks to search for the next pattern.
Returns a vector of non-null function pointers for the newly inserted functions, otherwise returns an empty vector. If the lastSearchedVa is provided, it will be set to the highest address at which a function prologue was searched.
|
virtual |
Make functions from inter-function calls.
This method scans the unused executable areas between existing functions to look for additional function calls and creates new functions for those calls. It starts the scan at startVa, which is updated upon return to be the next address that needs to be scanned. The startVa is never incremented past the end of the address space (i.e., it never wraps back around to zero), so care should be taken to not call this in an infinite loop when the end of the address space is reached.
The scanner tries to discover new basic blocks in the unused portion of the address space. These basic blocks are not allowed to overlap with existing, attached basic blocks, data blocks, or functions since that is an indication that we accidentally disassembled non-code. If the basic block looks like a function call and none of its target addresses points into the middle of an existing basic block, data block, or function, then a new function is created at the target address. The basic blocks which were scanned are not explicitly attached to the partitioner's CFG since we cannot be sure we found their starting address, but they might be later attached by following the control flow from the functions we did discover.
Returns the new function(s) for the first basic block that satisfied the requirements outlined above, and updates startVa to be a greater address which is not part of the basic block that was scanned.
|
virtual |
Discover as many functions as possible.
Discover as many functions as possible by discovering as many basic blocks as possible (discoverBasicBlocks). Each time we run out of basic blocks to try, we look for another function prologue pattern at the lowest possible address and then recursively discover more basic blocks. When this procedure is exhausted, a call to attachBlocksToFunctions tries to attach each basic block to a function.
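The alternation between prologue search and control flow following can be sketched with a deliberately tiny model. Everything here is hypothetical: the one-byte opcodes, the Discovery structure, and the function names are invented for illustration, and the control flow step is reduced to straight-line fallthrough so the outer loop stays visible.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical one-byte toy opcodes, invented for this sketch.
const std::uint8_t OP_NOP   = 0x01;  // falls through
const std::uint8_t OP_RET   = 0x04;  // ends the function
const std::uint8_t OP_ENTER = 0x10;  // stands in for a function prologue pattern

struct Discovery {
    std::set<std::size_t> functionEntries;  // addresses where a prologue was matched
    std::set<std::size_t> instructions;     // addresses reached by following control flow
};

// Simplified stand-in for recursive discovery: follow fallthrough from va until a
// RET, an undecodable byte, or an already-discovered instruction is reached.
void followControlFlow(const std::vector<std::uint8_t> &mem, std::size_t va, Discovery &d) {
    while (va < mem.size() && d.instructions.count(va) == 0) {
        const std::uint8_t op = mem[va];
        if (op != OP_NOP && op != OP_RET && op != OP_ENTER)
            break;
        d.instructions.insert(va);
        if (op == OP_RET)
            break;
        ++va;
    }
}

Discovery discoverFunctionsSketch(const std::vector<std::uint8_t> &mem) {
    Discovery d;
    std::size_t searchVa = 0;
    while (searchVa < mem.size()) {
        // Linear part: find the lowest unused address that matches the prologue pattern.
        while (searchVa < mem.size() &&
               (mem[searchVa] != OP_ENTER || d.instructions.count(searchVa) != 0))
            ++searchVa;
        if (searchVa >= mem.size())
            break;
        d.functionEntries.insert(searchVa);
        // Recursive part: discover everything reachable from the new entry point.
        followControlFlow(mem, searchVa, d);
        ++searchVa;
    }
    return d;
}
```

In this model, bytes that lie between a RET and the next prologue are never reached by control flow and therefore stay unattached, which is the kind of leftover that the later attach steps are meant to handle.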
|
virtual |
Attach dead code to function.
Examines the ghost edges for the basic blocks that belong to the specified function in order to discover basic blocks that are not reachable according to the CFG, adds placeholders for those basic blocks, and causes the function to own those blocks.
If maxIterations is larger than one then multiple iterations are performed. Between each iteration makeNextBasicBlock is called repeatedly to recursively discover instructions for all pending basic blocks, and then the CFG is traversed to add function-reachable basic blocks to the function. The loop terminates when the maximum number of iterations is reached, or when no more dead code can be found within this function.
Returns the set of newly discovered addresses for unreachable code. These are the ghost edge target addresses discovered at each iteration of the loop and do not include addresses of basic blocks that are reachable from the ghost target blocks.
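The iteration described above can be sketched on a toy graph. Everything here is an assumption made for illustration: ToyBlock, ToyCfg, addReachable, and attachDeadCode are hypothetical names, and a ghost edge is modeled simply as a target address that the block's instructions mention but that the CFG leaves out.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

using Address = std::uint64_t;

// Toy basic block: real CFG successors plus ghost-edge targets (assumed model).
struct ToyBlock {
    std::vector<Address> successors;
    std::vector<Address> ghostTargets;
};
using ToyCfg = std::map<Address, ToyBlock>;

// Adds to `owned` every block reachable from `from` over real CFG edges.
void addReachable(const ToyCfg &cfg, Address from, std::set<Address> &owned) {
    std::vector<Address> work{from};
    while (!work.empty()) {
        const Address va = work.back();
        work.pop_back();
        if (!owned.insert(va).second)
            continue;                       // already owned
        const auto it = cfg.find(va);
        if (it == cfg.end())
            continue;                       // placeholder with no known body
        for (Address s : it->second.successors)
            work.push_back(s);
    }
}

// Returns only the ghost-edge targets found in each iteration, not the blocks
// that later become reachable from them.
std::set<Address> attachDeadCode(const ToyCfg &cfg, std::set<Address> &owned,
                                 std::size_t maxIterations) {
    std::set<Address> discovered;
    for (std::size_t i = 0; i < maxIterations; ++i) {
        std::set<Address> ghosts;
        for (Address va : owned) {
            const auto it = cfg.find(va);
            if (it == cfg.end())
                continue;
            for (Address g : it->second.ghostTargets)
                if (owned.count(g) == 0)
                    ghosts.insert(g);
        }
        if (ghosts.empty())
            break;                          // no more dead code in this function
        discovered.insert(ghosts.begin(), ghosts.end());
        for (Address g : ghosts)
            addReachable(cfg, g, owned);
    }
    return discovered;
}
```

With a single iteration only the ghost targets of the original blocks are found. A larger limit also picks up ghost targets of the dead code itself.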
|
virtual |
Attach function padding to function.
Examines the memory immediately prior to the specified function's entry address to determine if it is alignment padding. If so, it creates a data block for the padding and adds it to the function.
Returns the padding data block, which might have existed prior to this call. Returns null if the function apparently has no padding.
|
virtual |
Attach padding to all functions.
Invokes attachPaddingToFunction for each known function and returns the set of data blocks that were returned by the individual calls.
|
virtual |
Attach all possible intra-function basic blocks to functions.
This is similar to attachSurroundedCodeToFunctions except it calls that method repeatedly until it cannot do anything more. Between each call it also follows the CFG for the newly discovered blocks to discover as many blocks as possible, creates more functions by looking for function calls, and attaches additional basic blocks to functions by following the CFG for each function.
This method is called automatically by Engine::runPartitioner if the PartitionerSettings::findingIntraFunctionCode property is set.
Returns the sum from all the calls to attachSurroundedCodeToFunctions.
|
virtual |
Attach intra-function basic blocks to functions.
This method scans the unused address intervals (those addresses that are not represented by the CFG/AUM). For each unused interval, if the interval is immediately surrounded by a single function then a basic block placeholder is created at the beginning of the interval and added to the function.
Returns the number of new placeholders created.
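The "surrounded by a single function" test can be sketched over a sorted list of used intervals. This is a simplified model, not the ROSE implementation: UsedInterval and surroundedGaps are hypothetical names, and each used interval is assumed to have exactly one owning function.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

using Address = std::uint64_t;

// One used address interval [first, last] and the function that owns it.
struct UsedInterval {
    Address first;
    Address last;
    std::string owner;
};

// `used` must be sorted by address and non-overlapping. Returns a map from the
// first address of each gap to the owning function, for every gap whose
// neighbors immediately below and above belong to the same function.
std::map<Address, std::string> surroundedGaps(const std::vector<UsedInterval> &used) {
    std::map<Address, std::string> placeholders;
    for (std::size_t i = 1; i < used.size(); ++i) {
        const UsedInterval &below = used[i - 1];
        const UsedInterval &above = used[i];
        const bool hasGap = above.first > below.last && above.first - below.last > 1;
        if (hasGap && below.owner == above.owner)
            placeholders[below.last + 1] = below.owner;
    }
    return placeholders;
}
```

The size of the returned map corresponds to the number of new placeholders in this model.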
|
virtual |
Attach basic blocks to functions.
Calls Partitioner::discoverFunctionBasicBlocks once for each known function in the partitioner's CFG/AUM in a sophomoric attempt to assign existing basic blocks to functions.
|
virtual |
Attach dead code to functions.
Calls attachDeadCodeToFunction once for each function that exists in the specified partitioner's CFG/AUM, passing along maxIterations each time.
Returns the union of the dead code addresses discovered for each function.
|
virtual |
Attach intra-function data to functions.
Looks for addresses that are not part of the partitioner's CFG/AUM and which are surrounded immediately below and above by the same function, and adds each such address interval as a data block to the surrounding function. Returns the list of data blocks added.
|
virtual |
Insert a call-return edge and discover its basic block.
Inserts a call-return (E_CALL_RETURN) edge for some function call that lacks such an edge and for which the callee may return. The assumeCallReturns parameter determines whether a call-return edge should be added or not for callees whose may-return analysis is indeterminate. If assumeCallReturns is true then an indeterminate callee will have a call-return edge added; if false then no call-return edge is added; if indeterminate then no call-return edge is added at this time but the vertex is saved so it can be reprocessed later.
Returns true if a new call-return edge was added to some call, or false if no such edge could be added. A postcondition for a false return is that the pendingCallReturn list is empty.
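The three-way decision described above can be written as a small table-like function. This is an illustrative sketch only: the Tri, MayReturn, and Action enumerations and the decideCallReturn function are hypothetical and do not reflect the types ROSE actually uses for these values.

```cpp
// Three-valued setting for assumeCallReturns (hypothetical representation).
enum class Tri { False, True, Indeterminate };

// Result of may-return analysis on the callee (hypothetical representation).
enum class MayReturn { Yes, No, Indeterminate };

// What to do with the call site.
enum class Action { AddEdge, Skip, Defer };

// A definite analysis result wins. The assumeCallReturns setting matters only
// when the analysis is indeterminate.
Action decideCallReturn(MayReturn callee, Tri assumeCallReturns) {
    switch (callee) {
        case MayReturn::Yes:
            return Action::AddEdge;
        case MayReturn::No:
            return Action::Skip;
        case MayReturn::Indeterminate:
            break;
    }
    switch (assumeCallReturns) {
        case Tri::True:
            return Action::AddEdge;
        case Tri::False:
            return Action::Skip;
        case Tri::Indeterminate:
            break;
    }
    return Action::Defer;   // save the vertex so it can be reprocessed later
}
```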
|
virtual |
Discover basic block at next placeholder.
Discovers a basic block at some arbitrary placeholder. Returns a pointer to the new basic block if a block was discovered, or null if no block is discovered. A postcondition for a null return is that the CFG has no edges coming into the "undiscovered" vertex.
|
virtual |
Discover a basic block.
Discovers another basic block if possible. A variety of methods will be used to determine where to discover the next basic block:
Returns the basic block that was discovered, or the null pointer if there are no pending undiscovered blocks.
BinaryLoaderPtr Rose::BinaryAnalysis::Partitioner2::EngineBinary::binaryLoader() const
Property: binary loader.
The binary loader that maps a binary container's sections into simulated memory and optionally performs dynamic linking and relocation fixups. If none is specified then the engine will choose one based on the container.
|
virtual |
Property: binary loader.
The binary loader that maps a binary container's sections into simulated memory and optionally performs dynamic linking and relocation fixups. If none is specified then the engine will choose one based on the container.
ThunkPredicatesPtr Rose::BinaryAnalysis::Partitioner2::EngineBinary::functionMatcherThunks() const
Property: Predicate for finding functions that are thunks.
This collective predicate is used when searching for function prologues in order to create new functions. Its purpose is to try to match sequences of instructions that look like thunks and then create a function at that address. A suitable default list of predicates is created when the engine is initialized, and can either be replaced by a new list, an empty list, or the list itself can be adjusted. The list is consulted only when PartitionerSettings::findingThunks is set.
|
virtual |
Property: Predicate for finding functions that are thunks.
This collective predicate is used when searching for function prologues in order to create new functions. Its purpose is to try to match sequences of instructions that look like thunks and then create a function at that address. A suitable default list of predicates is created when the engine is initialized, and can either be replaced by a new list, an empty list, or the list itself can be adjusted. The list is consulted only when PartitionerSettings::findingThunks is set.
ThunkPredicatesPtr Rose::BinaryAnalysis::Partitioner2::EngineBinary::functionSplittingThunks() const
Property: Predicate for finding thunks at the start of functions.
This collective predicate is used when searching for thunks at the beginnings of existing functions in order to split those thunk instructions into their own separate function. A suitable default list of predicates is created when the engine is initialized, and can either be replaced by a new list, an empty list, or the list itself can be adjusted. The list is consulted only when PartitionerSettings::splittingThunks is set.
|
virtual |
Property: Predicate for finding thunks at the start of functions.
This collective predicate is used when searching for thunks at the beginnings of existing functions in order to split those thunk instructions into their own separate function. A suitable default list of predicates is created when the engine is initialized, and can either be replaced by a new list, an empty list, or the list itself can be adjusted. The list is consulted only when PartitionerSettings::splittingThunks is set.
|
overridevirtual |
Predicate for matching a concrete engine factory by settings and specimen.
Implements Rose::BinaryAnalysis::Partitioner2::Engine.
|
overridevirtual |
Virtual constructor for factories.
This creates a new object by calling the class method instance for the class of which this is a type. All arguments are passed to instance.
Implements Rose::BinaryAnalysis::Partitioner2::Engine.
|
overridevirtual |
Reset the engine to its initial state.
This does not reset the settings properties since that can be done easily by constructing a new engine. It only resets the interpretation, binary loader, and memory map so all the top-level steps get executed again. This is a useful way to re-use the same partitioner to process multiple specimens.
Reimplemented from Rose::BinaryAnalysis::Partitioner2::Engine.
|
overridevirtual |
Most basic usage of the partitioner.
This method does everything from parsing the command-line to generating an abstract syntax tree. If all is successful, then an abstract syntax tree is returned. The return value is a SgAsmBlock node that contains all the detected functions. If the specimen consisted of an ELF or PE container then the parent nodes of the returned AST will lead eventually to an SgProject node.
The command-line can be provided as a typical argc and argv pair, or as a vector of arguments. In the latter case, the vector should not include argv[0] or argv[argc] (which is always a null pointer).
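A caller that starts from argc and argv but wants the vector form can build the vector as sketched below. The helper name switchArguments is hypothetical. The only point is which elements are kept: everything after argv[0] and before the terminating null pointer at argv[argc].

```cpp
#include <string>
#include <vector>

// Copies argv[1] .. argv[argc-1] into a vector, omitting the program name in
// argv[0] and the null pointer in argv[argc].
std::vector<std::string> switchArguments(int argc, char *argv[]) {
    if (argc <= 1)
        return {};
    return std::vector<std::string>(argv + 1, argv + argc);
}
```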
The command-line supports a "--help" (or "-h") switch to describe all other switches and arguments, essentially generating output like a Unix man(1) page.
The purpose should be a single-line string that will be shown in the title of the man page and should not start with an upper-case letter, a hyphen, white space, or the name of the command. E.g., a disassembler tool might specify the purpose as "disassembles a binary specimen".
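The constraints on the purpose string can be checked mechanically before it is passed along. The function below is a hypothetical helper, not something ROSE provides, and rejecting an empty string is an extra assumption that the text above does not state.

```cpp
#include <cctype>
#include <string>

// Returns true if `purpose` is a single line that does not start with an
// upper-case letter, a hyphen, white space, or the command name.
// Assumption: an empty purpose is also rejected.
bool isAcceptablePurpose(const std::string &purpose, const std::string &commandName) {
    if (purpose.empty() || purpose.find('\n') != std::string::npos)
        return false;
    const unsigned char first = static_cast<unsigned char>(purpose[0]);
    if (std::isupper(first) || std::isspace(first) || first == '-')
        return false;
    if (!commandName.empty() && purpose.compare(0, commandName.size(), commandName) == 0)
        return false;
    return true;
}
```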
The description is a full, multi-line description written in the Sawyer markup language where "@" characters have special meaning.
If a std::runtime_error is thrown and the EngineSettings::exitOnError property is set, then the exception is caught, its text is emitted to the partitioner's fatal error stream, and exit(1) is invoked.
Implements Rose::BinaryAnalysis::Partitioner2::Engine.
|
overridevirtual |
Parse specimen binary containers.
Parses the ELF and PE binary containers to create an abstract syntax tree (AST). If fileNames contains names that are recognized as raw data or other non-containers then they are skipped over at this stage but processed during the loadSpecimens stage.
This method tries to determine the specimen architecture. It also resets the interpretation to be the return value (see below), and clears the memory map.
Returns a binary interpretation (perhaps one of many). ELF files have only one interpretation; PE files have a DOS and a PE interpretation and this method will return the PE interpretation. The user may, at this point, select a different interpretation. If the list of names has nothing suitable for ROSE's frontend function (the thing that does the container parsing) then the null pointer is returned.
If a std::runtime_error is thrown and the EngineSettings::exitOnError property is set, then the exception is caught, its text is emitted to the partitioner's fatal error stream, and exit(1) is invoked.
Implements Rose::BinaryAnalysis::Partitioner2::Engine.
|
overridevirtual |
Load and/or link interpretation.
Loads and/or links the engine's interpretation according to the engine's binary loader with these steps:
Returns a reference to the engine's memory map.
If a std::runtime_error is thrown and the EngineSettings::exitOnError property is set, then the exception is caught, its text is emitted to the partitioner's fatal error stream, and exit(1) is invoked.
Implements Rose::BinaryAnalysis::Partitioner2::Engine.
|
overridevirtual |
Partition instructions into basic blocks and functions.
Disassembles and organizes instructions into basic blocks and functions with these steps:
Returns the partitioner that was used and which contains the results.
If a std::runtime_error is thrown and the EngineSettings::exitOnError property is set, then the exception is caught, its text is emitted to the partitioner's fatal error stream, and exit(1) is invoked.
Implements Rose::BinaryAnalysis::Partitioner2::Engine.
|
overridevirtual |
Obtain an abstract syntax tree.
Constructs a new abstract syntax tree (AST) from partitioner information with these steps:
If a std::runtime_error is thrown and the EngineSettings::exitOnError property is set, then the exception is caught, its text is emitted to the partitioner's fatal error stream, and exit(1) is invoked.
Implements Rose::BinaryAnalysis::Partitioner2::Engine.
|
overridevirtual |
Command-line switches for a particular engine.
Returns the list of switch groups that declare the command-line switches specific to a particular engine. Since every Engine subclass needs its own particular switches (possibly in addition to the base class switches), this is implemented in each subclass that needs switches. The base class returns a list of switch groups that are applicable to all engines, although the subclasses can refine this list, and the subclass implementations should augment what the base implementation returns.
In order to implement the "--help" switch to show the man page, we need a way to include the switch documentation for all possible engine subclasses at once. Therefore, the returned command line switch groups must have names and prefixes that are unique across all subclasses, and the descriptions should refer to the name of the subclass. For instance, the EngineBinary class, which returns many switch groups, will name the switch groups like "binary-load", "binary-dis", "binary-part", "binary-ast", etc. and will make it clear in each group description that these switches are intended for the binary engine.
See allCommandLineSwitches for details about how the "--help" man page is constructed.
Reimplemented from Rose::BinaryAnalysis::Partitioner2::Engine.
|
overridevirtual |
Documentation about how the specimen is specified.
The documentation string that's returned is expected to be used in a command-line parser description and thus may contain special formatting constructs. For most engine subclasses, this will be a description of those command-line positional arguments that describe the specimen. For instance, the EngineJvm subclass would probably document that the specimen consists of one or more file names ending with the string ".class".
In order to support the "--help" switch that generates the man page, it must be possible to include the documentation for all subclasses concurrently. Therefore, each subclass returns both a section title and the section documentation string. The section title and documentation string should make it clear that this part of the documentation applies only to that particular subclass.
Implements Rose::BinaryAnalysis::Partitioner2::Engine.
|
overridevirtual |
Determine whether a specimen name is a non-container.
Certain strings are recognized as special instructions for how to adjust a memory map and are not intended to be passed to ROSE's frontend function. This predicate returns true for such strings.
Implements Rose::BinaryAnalysis::Partitioner2::Engine.
|
overridevirtual |
Returns true if containers are parsed.
Specifically, returns true if the engine has a non-null interpretation. If it has a null interpretation then parseContainers might have already been called but no binary containers specified, in which case calling it again with the same file names will have no effect.
Implements Rose::BinaryAnalysis::Partitioner2::Engine.
|
overridevirtual |
Create partitioner.
This is the method usually called to create a new partitioner.
Implements Rose::BinaryAnalysis::Partitioner2::Engine.
|
overridevirtual |
Finds interesting things to work on initially.
Seeds the partitioner with addresses and functions where recursive disassembly should begin.
Implements Rose::BinaryAnalysis::Partitioner2::Engine.
|
overridevirtual |
Runs the recursive part of partitioning.
This is the long-running guts of the partitioner.
Implements Rose::BinaryAnalysis::Partitioner2::Engine.
|
overridevirtual |
Runs the final parts of partitioning.
This does anything necessary after the main part of partitioning is finished. For instance, it might give names to some functions that don't have names yet.
Implements Rose::BinaryAnalysis::Partitioner2::Engine.
|
overridevirtual |
Implements Rose::BinaryAnalysis::Partitioner2::Engine.
SgAsmBlock * Rose::BinaryAnalysis::Partitioner2::Engine::frontend(int argc, char *argv[], const std::string &purpose, const std::string &description)
Most basic usage of the partitioner.
This method does everything from parsing the command-line to generating an abstract syntax tree. If all is successful, then an abstract syntax tree is returned. The return value is a SgAsmBlock node that contains all the detected functions. If the specimen consisted of an ELF or PE container then the parent nodes of the returned AST will lead eventually to an SgProject node.
The command-line can be provided as a typical argc and argv pair, or as a vector of arguments. In the latter case, the vector should not include argv[0] or argv[argc] (which is always a null pointer).
The command-line supports a "--help" (or "-h") switch to describe all other switches and arguments, essentially generating output like a Unix man(1) page.
The purpose should be a single-line string that will be shown in the title of the man page and should not start with an upper-case letter, a hyphen, white space, or the name of the command. E.g., a disassembler tool might specify the purpose as "disassembles a binary specimen".
The description is a full, multi-line description written in the Sawyer markup language where "@" characters have special meaning.
If a std::runtime_error is thrown and the EngineSettings::exitOnError property is set, then the exception is caught, its text is emitted to the partitioner's fatal error stream, and exit(1) is invoked.