ROSE 0.11.145.192
String encoding scheme.
A string encoding scheme indicates how a string (sequence of code points) is encoded as a sequence of octets and vice versa.
#include <Rose/BinaryAnalysis/String.h>
Public Types | |
typedef Sawyer::SharedPointer< StringEncodingScheme > | Ptr |
Shared ownership pointer to a StringEncodingScheme. | |
Public Member Functions | |
virtual std::string | name () const =0 |
Name of encoding. | |
virtual Ptr | clone () const =0 |
Create a new copy of this encoder. | |
virtual Octets | encode (const CodePoints &)=0 |
Encode a string into a sequence of octets. | |
State | state () const |
Decoder state. | |
virtual State | decode (Octet)=0 |
Decode one octet. | |
CodePoints | consume () |
Consume pending decoded code points. | |
const CodePoints & | codePoints () const |
Return pending decoded code points without consuming them. | |
size_t | length () const |
Number of code points decoded since reset. | |
virtual void | reset () |
Reset the state machine to an initial state. | |
CharacterEncodingForm::Ptr | characterEncodingForm () const |
Property: Character encoding format. | |
void | characterEncodingForm (const CharacterEncodingForm::Ptr &cef) |
Property: Character encoding format. | |
CharacterEncodingScheme::Ptr | characterEncodingScheme () const |
Property: Character encoding scheme. | |
void | characterEncodingScheme (const CharacterEncodingScheme::Ptr &ces) |
Property: Character encoding scheme. | |
CodePointPredicate::Ptr | codePointPredicate () const |
Property: Code point predicate. | |
void | codePointPredicate (const CodePointPredicate::Ptr &cpp) |
Property: Code point predicate. | |
Public Member Functions inherited from Sawyer::SharedObject | |
SharedObject () | |
Default constructor. | |
SharedObject (const SharedObject &) | |
Copy constructor. | |
virtual | ~SharedObject () |
Virtual destructor. | |
SharedObject & | operator= (const SharedObject &) |
Assignment. | |
Protected Member Functions | |
StringEncodingScheme (const CharacterEncodingForm::Ptr &cef, const CharacterEncodingScheme::Ptr &ces, const CodePointPredicate::Ptr &cpp) | |
Protected Attributes | |
State | state_ = INITIAL_STATE |
CodePoints | codePoints_ |
size_t | nCodePoints_ = 0 |
CharacterEncodingForm::Ptr | cef_ |
CharacterEncodingScheme::Ptr | ces_ |
CodePointPredicate::Ptr | cpp_ |
typedef Sawyer::SharedPointer<StringEncodingScheme> Rose::BinaryAnalysis::Strings::StringEncodingScheme::Ptr |
Shared ownership pointer to a StringEncodingScheme.
See Shared ownership.
|
pure virtual |
Name of encoding.
Implemented in Rose::BinaryAnalysis::Strings::LengthEncodedString, and Rose::BinaryAnalysis::Strings::TerminatedString.
|
pure virtual |
Create a new copy of this encoder.
Implemented in Rose::BinaryAnalysis::Strings::LengthEncodedString, and Rose::BinaryAnalysis::Strings::TerminatedString.
|
pure virtual |
Encode a string into a sequence of octets.
Implemented in Rose::BinaryAnalysis::Strings::LengthEncodedString, and Rose::BinaryAnalysis::Strings::TerminatedString.
|
inline |
Decode one octet.
Processes a single octet and updates the decoder state machine. Returns the new state. See documentation for Strings::State for restrictions on state transitions.
Implemented in Rose::BinaryAnalysis::Strings::LengthEncodedString, and Rose::BinaryAnalysis::Strings::TerminatedString.
CodePoints Rose::BinaryAnalysis::Strings::StringEncodingScheme::consume | ( | ) |
Consume pending decoded code points.
Returns code points that haven't been consumed yet and then removes them from the decoder. This can be called from any state because the caller should be able to consume code points as they are decoded, which differs slightly from how the consume methods operate in the decoders that return scalar values. A reset discards pending code points.
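The decode/consume/reset cycle described above can be illustrated with a small stand-alone sketch. It does not use ROSE: ToyTerminatedDecoder, ToyState, and the type aliases are hypothetical stand-ins whose member names merely mirror the documented interface, and the state names are assumptions rather than the values of Strings::State.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins; none of these are the ROSE definitions.
using Octet = std::uint8_t;
using CodePoint = unsigned;
using CodePoints = std::vector<CodePoint>;

enum class ToyState { Initial, Decoding, Completed, Error };

// Toy decoder for NUL-terminated strings of one-octet code points.
class ToyTerminatedDecoder {
    ToyState state_ = ToyState::Initial;
    CodePoints codePoints_;        // decoded but not yet consumed
    std::size_t nCodePoints_ = 0;  // decoded since the last reset

public:
    ToyState state() const { return state_; }
    std::size_t length() const { return nCodePoints_; }

    // Feed one octet and return the new state.
    ToyState decode(Octet octet) {
        if (state_ == ToyState::Completed || state_ == ToyState::Error) {
            state_ = ToyState::Error;      // no input is accepted after the end
        } else if (octet == 0) {
            state_ = ToyState::Completed;  // terminator reached
        } else {
            codePoints_.push_back(octet);
            ++nCodePoints_;
            state_ = ToyState::Decoding;
        }
        return state_;
    }

    // Return the pending code points and remove them from the decoder.
    // Allowed in any state, so callers can drain output incrementally.
    CodePoints consume() {
        // Pending code points are a subset of those decoded since reset.
        assert(codePoints_.size() <= nCodePoints_);
        CodePoints pending;
        pending.swap(codePoints_);
        return pending;
    }

    // Return to the initial state and discard pending code points.
    void reset() {
        state_ = ToyState::Initial;
        codePoints_.clear();
        nCodePoints_ = 0;
    }
};
```

In this sketch, calling consume() midway through a string returns only the code points decoded so far, while length() keeps counting everything decoded since the last reset.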
|
inline |
|
inline |
|
virtual |
Reset the state machine to an initial state.
Reimplemented in Rose::BinaryAnalysis::Strings::LengthEncodedString, and Rose::BinaryAnalysis::Strings::TerminatedString.
|
inline |
Property: Character encoding format.
The character encoding format is responsible for converting each code point to a sequence of code values. For instance, a UTF-16 encoding will convert each code point (a number between zero and about 1.1 million, that is, 0x10FFFF) into a sequence of 16-bit code values. Each code value will eventually be converted to a pair of octets by the character encoding scheme.
|
inline |
Property: Character encoding format.
The character encoding format is responsible for converting each code point to a sequence of code values. For instance, a UTF-16 encoding will convert each code point (a number between zero and about 1.1 million, that is, 0x10FFFF) into a sequence of 16-bit code values. Each code value will eventually be converted to a pair of octets by the character encoding scheme.
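As an illustration of the code-point-to-code-value step for UTF-16, here is a minimal stand-alone sketch. It does not use ROSE's CharacterEncodingForm classes; utf16CodeValues and the type aliases are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical aliases; not the ROSE definitions.
using CodePoint = std::uint32_t;
using CodeValue = std::uint16_t;

// Convert one code point to one or two 16-bit UTF-16 code values.
std::vector<CodeValue> utf16CodeValues(CodePoint cp) {
    assert(cp <= 0x10FFFF);              // largest Unicode code point
    assert(cp < 0xD800 || cp > 0xDFFF);  // surrogates are not encodable
    if (cp < 0x10000)
        return {static_cast<CodeValue>(cp)};  // fits in a single code value
    cp -= 0x10000;                            // 20 bits remain
    return {static_cast<CodeValue>(0xD800 + (cp >> 10)),     // high surrogate
            static_cast<CodeValue>(0xDC00 + (cp & 0x3FF))};  // low surrogate
}
```

For example, U+0041 yields the single code value 0x0041, while U+1F600 yields the surrogate pair 0xD83D, 0xDE00.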
|
inline |
Property: Character encoding scheme.
The character encoding scheme is responsible for converting each code value to a sequence of one or more octets. The code value is part of a sequence of code values generated by the character encoding format for a single code point. For instance, a character encoding scheme for UTF-16 will need to know whether the octets are stored in big- or little-endian order.
|
inline |
Property: Character encoding scheme.
The character encoding scheme is responsible for converting each code value to a sequence of one or more octets. The code value is part of a sequence of code values generated by the character encoding format for a single code point. For instance, a character encoding scheme for UTF-16 will need to know whether the octets are stored in big- or little-endian order.
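The byte-order choice for 16-bit code values can be sketched as follows. This is a stand-alone illustration, not ROSE's CharacterEncodingScheme; ByteOrder, codeValueOctets, and octetsToCodeValue are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical aliases and names; not the ROSE definitions.
using Octet = std::uint8_t;
using CodeValue = std::uint16_t;

enum class ByteOrder { Big, Little };

// Serialize one 16-bit code value as two octets in the requested order.
std::vector<Octet> codeValueOctets(CodeValue cv, ByteOrder order) {
    const Octet hi = static_cast<Octet>(cv >> 8);
    const Octet lo = static_cast<Octet>(cv & 0xFF);
    if (order == ByteOrder::Big)
        return {hi, lo};  // most significant octet first
    return {lo, hi};      // least significant octet first
}

// Reassemble a code value from exactly two octets.
CodeValue octetsToCodeValue(const std::vector<Octet>& octets, ByteOrder order) {
    assert(octets.size() == 2);
    const Octet hi = order == ByteOrder::Big ? octets[0] : octets[1];
    const Octet lo = order == ByteOrder::Big ? octets[1] : octets[0];
    return static_cast<CodeValue>((hi << 8) | lo);
}
```

With these helpers, the code value 0xD83D becomes the octets 0xD8, 0x3D in big-endian order and 0x3D, 0xD8 in little-endian order.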
|
inline |
Property: Code point predicate.
The code point predicate tests whether a specific code point is allowed as part of a string. For instance, when decoding NUL-terminated ASCII strings one might want to consider only those strings that contain printable characters and white space in order to limit the number of false positives when searching for strings in memory.
|
inline |
Property: Code point predicate.
The code point predicate tests whether a specific code point is allowed as part of a string. For instance, when decoding NUL-terminated ASCII strings one might want to consider only those strings that contain printable characters and white space in order to limit the number of false positives when searching for strings in memory.
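A predicate of the kind described, one that accepts only printable ASCII and white space, might look like the stand-alone sketch below. It is not ROSE's CodePointPredicate; isPrintableOrSpace and isPlausibleString are hypothetical names, and the minimum-length parameter is an assumption added for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical alias; not the ROSE definition.
using CodePoint = std::uint32_t;

// Accept printable ASCII (0x20..0x7E) and ASCII white space (0x09..0x0D).
bool isPrintableOrSpace(CodePoint cp) {
    const bool printable = cp >= 0x20 && cp <= 0x7E;
    const bool whiteSpace = cp >= 0x09 && cp <= 0x0D;
    return printable || whiteSpace;
}

// Treat a decoded sequence as a plausible string only if it is long enough
// and every code point satisfies the predicate.
bool isPlausibleString(const std::vector<CodePoint>& cps, std::size_t minLength) {
    assert(minLength > 0);
    return cps.size() >= minLength &&
           std::all_of(cps.begin(), cps.end(), isPrintableOrSpace);
}
```

Rejecting candidates that contain control characters or that are too short is one way to reduce false positives when scanning arbitrary memory.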