Description

String encoding scheme.

A string encoding scheme indicates how a string (sequence of code points) is encoded as a sequence of octets and vice versa.

Definition at line 533 of file String.h.

#include <Rose/BinaryAnalysis/String.h>

Public Types
typedef Sawyer::SharedPointer< StringEncodingScheme >	Ptr
	Shared ownership pointer to a StringEncodingScheme.

Public Member Functions
virtual std::string	name () const =0
	Name of encoding.

virtual Ptr	clone () const =0
	Create a new copy of this encoder.

virtual Octets	encode (const CodePoints &)=0
	Encode a string into a sequence of octets.

State	state () const
	Decoder state.

virtual State	decode (Octet)=0
	Decode one octet.

CodePoints	consume ()
	Consume pending decoded code points.

const CodePoints &	codePoints () const
	Return pending decoded code points without consuming them.

size_t	length () const
	Number of code points decoded since reset.

virtual void	reset ()
	Reset the state machine to an initial state.


CharacterEncodingForm::Ptr	characterEncodingForm () const
	Property: Character encoding format.

void	characterEncodingForm (const CharacterEncodingForm::Ptr &cef)
	Property: Character encoding format.


CharacterEncodingScheme::Ptr	characterEncodingScheme () const
	Property: Character encoding scheme.

void	characterEncodingScheme (const CharacterEncodingScheme::Ptr &ces)
	Property: Character encoding scheme.


CodePointPredicate::Ptr	codePointPredicate () const
	Property: Code point predicate.

void	codePointPredicate (const CodePointPredicate::Ptr &cpp)
	Property: Code point predicate.

Public Member Functions inherited from Sawyer::SharedObject
	SharedObject ()
	Default constructor.

	SharedObject (const SharedObject &)
	Copy constructor.

virtual	~SharedObject ()
	Virtual destructor.

SharedObject &	operator= (const SharedObject &)
	Assignment.

Protected Member Functions
	StringEncodingScheme (const CharacterEncodingForm::Ptr &cef, const CharacterEncodingScheme::Ptr &ces, const CodePointPredicate::Ptr &cpp)

Protected Attributes
State	state_ = INITIAL_STATE

CodePoints	codePoints_

size_t	nCodePoints_ = 0

CharacterEncodingForm::Ptr	cef_

CharacterEncodingScheme::Ptr	ces_

CodePointPredicate::Ptr	cpp_

Member Typedef Documentation

◆ Ptr

typedef Sawyer::SharedPointer<StringEncodingScheme> Rose::BinaryAnalysis::Strings::StringEncodingScheme::Ptr

Shared ownership pointer to a StringEncodingScheme.

See Shared ownership.

Definition at line 553 of file String.h.

Constructor & Destructor Documentation

◆ StringEncodingScheme() [1/2]

Rose::BinaryAnalysis::Strings::StringEncodingScheme::StringEncodingScheme ( )

inline protected

Definition at line 543 of file String.h.

◆ StringEncodingScheme() [2/2]

Rose::BinaryAnalysis::Strings::StringEncodingScheme::StringEncodingScheme	(	const CharacterEncodingForm::Ptr &	cef,
		const CharacterEncodingScheme::Ptr &	ces,
		const CodePointPredicate::Ptr &	cpp
	)

inline protected

Definition at line 545 of file String.h.

◆ ~StringEncodingScheme()

virtual Rose::BinaryAnalysis::Strings::StringEncodingScheme::~StringEncodingScheme ( )

inline virtual

Definition at line 550 of file String.h.

Member Function Documentation

◆ name()

virtual std::string Rose::BinaryAnalysis::Strings::StringEncodingScheme::name ( ) const

pure virtual

Name of encoding.

Implemented in Rose::BinaryAnalysis::Strings::LengthEncodedString, and Rose::BinaryAnalysis::Strings::TerminatedString.

◆ clone()

virtual Ptr Rose::BinaryAnalysis::Strings::StringEncodingScheme::clone ( ) const

pure virtual

Create a new copy of this encoder.

Implemented in Rose::BinaryAnalysis::Strings::LengthEncodedString, and Rose::BinaryAnalysis::Strings::TerminatedString.

◆ encode()

virtual Octets Rose::BinaryAnalysis::Strings::StringEncodingScheme::encode ( const CodePoints & )

pure virtual

Encode a string into a sequence of octets.

Implemented in Rose::BinaryAnalysis::Strings::LengthEncodedString, and Rose::BinaryAnalysis::Strings::TerminatedString.

◆ state()

State Rose::BinaryAnalysis::Strings::StringEncodingScheme::state ( ) const

inline

Decoder state.

Definition at line 565 of file String.h.

◆ decode()

virtual State Rose::BinaryAnalysis::Strings::StringEncodingScheme::decode ( Octet )

pure virtual

Decode one octet.

Processes a single octet and updates the decoder state machine. Returns the new state. See documentation for Strings::State for restrictions on state transitions.

Implemented in Rose::BinaryAnalysis::Strings::LengthEncodedString, and Rose::BinaryAnalysis::Strings::TerminatedString.
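The octet-at-a-time protocol described above can be illustrated with a minimal, self-contained sketch. It does not use the ROSE headers: SketchState, SketchTerminatedDecoder, and decodeAll are hypothetical stand-ins invented for this example, and the real state values and transition rules are those documented for Strings::State. The sketch assumes a NUL-terminated string of 7-bit ASCII characters.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins; the real State, Octet, and CodePoints types live in String.h.
enum class SketchState { Initial, InProgress, Completed, Error };
using Octet = std::uint8_t;
using CodePoints = std::vector<unsigned>;

// Toy decoder for NUL-terminated strings with one octet per character.
class SketchTerminatedDecoder {
    SketchState state_ = SketchState::Initial;
    CodePoints codePoints_;
public:
    SketchState state() const { return state_; }
    const CodePoints& codePoints() const { return codePoints_; }

    // Process a single octet, update the state machine, and return the new state.
    SketchState decode(Octet octet) {
        if (state_ == SketchState::Completed || state_ == SketchState::Error) {
            state_ = SketchState::Error;       // nothing is accepted after the end
        } else if (octet == 0) {
            state_ = SketchState::Completed;   // the terminator ends the string
        } else if (octet < 0x80) {
            codePoints_.push_back(octet);      // one more decoded code point
            state_ = SketchState::InProgress;
        } else {
            state_ = SketchState::Error;       // not 7-bit ASCII
        }
        return state_;
    }

    // Return to the initial state and drop anything decoded so far.
    void reset() { state_ = SketchState::Initial; codePoints_.clear(); }
};

// Feed octets one at a time until the decoder completes or fails.
inline SketchState decodeAll(SketchTerminatedDecoder& decoder, const std::vector<Octet>& octets) {
    for (Octet octet : octets) {
        const SketchState s = decoder.decode(octet);
        if (s == SketchState::Completed || s == SketchState::Error)
            return s;
    }
    return decoder.state();
}

// Usage: "hi" followed by a NUL octet decodes to two code points.
inline void sketchDecodeUsage() {
    SketchTerminatedDecoder decoder;
    const std::vector<Octet> octets = {0x68, 0x69, 0x00};
    const SketchState result = decodeAll(decoder, octets);
    assert(result == SketchState::Completed);
    assert(decoder.codePoints().size() == 2);
}
```

The caller drives the loop and inspects the returned state after every octet, which is what lets a string search stop scanning as soon as a candidate is rejected.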

◆ consume()

CodePoints Rose::BinaryAnalysis::Strings::StringEncodingScheme::consume ( )

Consume pending decoded code points.

Returns the code points that have not been consumed yet and then removes them from the decoder. This can be called from any state so that the caller can consume code points as they are decoded, which differs slightly from how the consume methods operate in the decoders that return scalar values. A reset discards pending code points.
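The difference between consuming and peeking can be sketched with a small stand-alone buffer. SketchPendingBuffer and its append() method are hypothetical names made up for this illustration, not part of the library. The sketch also assumes that consuming does not change the count reported by length(), which is one reading of "decoded since reset".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using CodePoints = std::vector<unsigned>;  // stand-in for the library's CodePoints type

// Hypothetical buffer that separates "pending" code points from the running total.
class SketchPendingBuffer {
    CodePoints pending_;         // decoded but not yet consumed
    std::size_t nDecoded_ = 0;   // total decoded since the last reset (assumed semantics)
public:
    // Called by a decoder each time it finishes a code point.
    void append(unsigned codePoint) { pending_.push_back(codePoint); ++nDecoded_; }

    // Peek: return the pending code points and leave them in place.
    const CodePoints& codePoints() const { return pending_; }

    // Consume: hand the pending code points to the caller and empty the buffer.
    CodePoints consume() {
        CodePoints result;
        result.swap(pending_);
        return result;
    }

    std::size_t length() const { return nDecoded_; }

    // Reset: discard pending code points and restart the count.
    void reset() { pending_.clear(); nDecoded_ = 0; }
};

// Usage: consume mid-string, then keep decoding.
inline void sketchConsumeUsage() {
    SketchPendingBuffer buffer;
    buffer.append(0x61);
    buffer.append(0x62);
    const CodePoints firstBatch = buffer.consume();
    assert(firstBatch.size() == 2);
    assert(buffer.codePoints().empty());
    assert(buffer.length() == 2);
}
```

Swapping the pending vector out, rather than copying it, keeps the cost of frequent consume calls low when a caller drains the decoder after every few octets.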

◆ codePoints()

const CodePoints & Rose::BinaryAnalysis::Strings::StringEncodingScheme::codePoints ( ) const

inline

Return pending decoded code points without consuming them.

Definition at line 582 of file String.h.

◆ length()

size_t Rose::BinaryAnalysis::Strings::StringEncodingScheme::length ( ) const

inline

Number of code points decoded since reset.

Definition at line 585 of file String.h.

◆ reset()

virtual void Rose::BinaryAnalysis::Strings::StringEncodingScheme::reset ( )

virtual

Reset the state machine to an initial state.

Reimplemented in Rose::BinaryAnalysis::Strings::LengthEncodedString, and Rose::BinaryAnalysis::Strings::TerminatedString.

◆ characterEncodingForm() [1/2]

CharacterEncodingForm::Ptr Rose::BinaryAnalysis::Strings::StringEncodingScheme::characterEncodingForm ( ) const

inline

Property: Character encoding format.

The character encoding format is responsible for converting each code point to a sequence of code values. For instance, a UTF-16 encoding will convert each code point (a number between zero and 0x10FFFF, about 1.1 million) into a sequence of one or two 16-bit code values. Each code value will eventually be converted to a pair of octets by the character encoding scheme.

Definition at line 598 of file String.h.
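As an illustration of what a character encoding form does, the following stand-alone sketch converts one code point to UTF-16 code values using the standard surrogate-pair rule. The function name sketchUtf16Encode is hypothetical and the code does not call into the library. Returning an empty vector for invalid input is a choice made for this sketch only.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Convert one code point to its UTF-16 code values. Returns an empty vector for
// surrogate code points and for values above 0x10FFFF (a choice made for this sketch).
inline std::vector<std::uint16_t> sketchUtf16Encode(std::uint32_t codePoint) {
    if (codePoint > 0x10FFFF || (codePoint >= 0xD800 && codePoint <= 0xDFFF))
        return {};
    if (codePoint < 0x10000)
        return {static_cast<std::uint16_t>(codePoint)};        // one code value
    const std::uint32_t v = codePoint - 0x10000;               // 20 significant bits
    return {static_cast<std::uint16_t>(0xD800 + (v >> 10)),    // high surrogate
            static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF))}; // low surrogate
}

// Usage: U+0041 needs one code value, U+1F600 needs a surrogate pair.
inline void sketchUtf16Usage() {
    const std::vector<std::uint16_t> latin = sketchUtf16Encode(0x41);
    assert(latin.size() == 1);
    const std::vector<std::uint16_t> emoji = sketchUtf16Encode(0x1F600);
    assert(emoji.size() == 2);
    assert(emoji[0] == 0xD83D);
    assert(emoji[1] == 0xDE00);
}
```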

◆ characterEncodingForm() [2/2]

void Rose::BinaryAnalysis::Strings::StringEncodingScheme::characterEncodingForm ( const CharacterEncodingForm::Ptr & cef )

inline

Property: Character encoding format.

The character encoding format is responsible for converting each code point to a sequence of code values. For instance, a UTF-16 encoding will convert each code point (a number between zero and 0x10FFFF, about 1.1 million) into a sequence of one or two 16-bit code values. Each code value will eventually be converted to a pair of octets by the character encoding scheme.

Definition at line 599 of file String.h.

◆ characterEncodingScheme() [1/2]

CharacterEncodingScheme::Ptr Rose::BinaryAnalysis::Strings::StringEncodingScheme::characterEncodingScheme ( ) const

inline

Property: Character encoding scheme.

The character encoding scheme is responsible for converting each code value to a sequence of one or more octets. The code value is part of a sequence of code values generated by the character encoding format for a single code point. For instance, a character encoding scheme for UTF-16 will need to know whether the octets are stored in big- or little-endian order.

Definition at line 610 of file String.h.
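The byte-order decision made by a character encoding scheme can be sketched as follows for a 16-bit code value. SketchByteOrder and sketchCodeValueToOctets are hypothetical names used only here. The library's own scheme classes may represent byte order differently.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical byte-order selector; not the library's type.
enum class SketchByteOrder { Big, Little };

// Split one 16-bit code value into two octets in the requested order.
inline std::array<std::uint8_t, 2> sketchCodeValueToOctets(std::uint16_t value, SketchByteOrder order) {
    const std::uint8_t hi = static_cast<std::uint8_t>(value >> 8);
    const std::uint8_t lo = static_cast<std::uint8_t>(value & 0xFF);
    std::array<std::uint8_t, 2> octets{};
    if (order == SketchByteOrder::Big) {
        octets[0] = hi;   // most significant octet first
        octets[1] = lo;
    } else {
        octets[0] = lo;   // least significant octet first
        octets[1] = hi;
    }
    return octets;
}

// Usage: the code value 0xD83D stored in each byte order.
inline void sketchByteOrderUsage() {
    const std::array<std::uint8_t, 2> be = sketchCodeValueToOctets(0xD83D, SketchByteOrder::Big);
    const std::array<std::uint8_t, 2> le = sketchCodeValueToOctets(0xD83D, SketchByteOrder::Little);
    assert(be[0] == 0xD8);
    assert(be[1] == 0x3D);
    assert(le[0] == 0x3D);
    assert(le[1] == 0xD8);
}
```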

◆ characterEncodingScheme() [2/2]

void Rose::BinaryAnalysis::Strings::StringEncodingScheme::characterEncodingScheme ( const CharacterEncodingScheme::Ptr & ces )

inline

Property: Character encoding scheme.

The character encoding scheme is responsible for converting each code value to a sequence of one or more octets. The code value is part of a sequence of code values generated by the character encoding format for a single code point. For instance, a character encoding scheme for UTF-16 will need to know whether the octets are stored in big- or little-endian order.

Definition at line 611 of file String.h.

◆ codePointPredicate() [1/2]

CodePointPredicate::Ptr Rose::BinaryAnalysis::Strings::StringEncodingScheme::codePointPredicate ( ) const

inline

Property: Code point predicate.

The code point predicate tests whether a specific code point is allowed as part of a string. For instance, when decoding NUL-terminated ASCII strings one might want to consider only those strings that contain printable characters and white space in order to limit the number of false positives when searching for strings in memory.

Definition at line 621 of file String.h.
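A predicate of the kind described above might look like the following stand-alone sketch. It accepts printable ASCII plus tab, line feed, and carriage return. Both function names are hypothetical, and the exact set of accepted white-space characters is an assumption of this example rather than the behavior of any predicate shipped with the library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical predicate: printable ASCII (0x20..0x7E) plus tab, LF, and CR.
inline bool sketchIsAllowed(std::uint32_t codePoint) {
    const bool printable = codePoint >= 0x20 && codePoint <= 0x7E;
    const bool whiteSpace = codePoint == 0x09 || codePoint == 0x0A || codePoint == 0x0D;
    return printable || whiteSpace;
}

// A candidate string is kept only if every one of its code points passes the predicate.
inline bool sketchIsPlausibleString(const std::vector<std::uint32_t>& codePoints) {
    return std::all_of(codePoints.begin(), codePoints.end(), sketchIsAllowed);
}

// Usage: "OK\n" is accepted; a string containing the control code 0x01 is rejected.
inline void sketchPredicateUsage() {
    const std::vector<std::uint32_t> text = {0x4F, 0x4B, 0x0A};
    const std::vector<std::uint32_t> noise = {0x4F, 0x01, 0x4B};
    assert(sketchIsPlausibleString(text));
    assert(!sketchIsPlausibleString(noise));
}
```

Rejecting a candidate at the first disallowed code point is what keeps arbitrary binary data from being reported as text when memory is scanned for strings.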

◆ codePointPredicate() [2/2]

void Rose::BinaryAnalysis::Strings::StringEncodingScheme::codePointPredicate ( const CodePointPredicate::Ptr & cpp )

inline

Property: Code point predicate.

The code point predicate tests whether a specific code point is allowed as part of a string. For instance, when decoding NUL-terminated ASCII strings one might want to consider only those strings that contain printable characters and white space in order to limit the number of false positives when searching for strings in memory.

Definition at line 622 of file String.h.

Member Data Documentation

◆ state_

State Rose::BinaryAnalysis::Strings::StringEncodingScheme::state_ = INITIAL_STATE

protected

Definition at line 535 of file String.h.

◆ codePoints_

CodePoints Rose::BinaryAnalysis::Strings::StringEncodingScheme::codePoints_

protected

Definition at line 536 of file String.h.

◆ nCodePoints_

size_t Rose::BinaryAnalysis::Strings::StringEncodingScheme::nCodePoints_ = 0

protected

Definition at line 537 of file String.h.

◆ cef_

CharacterEncodingForm::Ptr Rose::BinaryAnalysis::Strings::StringEncodingScheme::cef_

protected

Definition at line 538 of file String.h.

◆ ces_

CharacterEncodingScheme::Ptr Rose::BinaryAnalysis::Strings::StringEncodingScheme::ces_

protected

Definition at line 539 of file String.h.

◆ cpp_

CodePointPredicate::Ptr Rose::BinaryAnalysis::Strings::StringEncodingScheme::cpp_

protected

Definition at line 540 of file String.h.

The documentation for this class was generated from the following file:

String.h
