TSDuck Version 3.32-2769 (TSDuck - The MPEG Transport Stream Toolkit)
ts::DVBCharTable Class Reference (abstract)

Definition of a character set for DVB encoding.


Public Member Functions

virtual ~DVBCharTable () override
 Virtual destructor.

virtual bool canEncode (const UString &str, size_t start=0, size_t count=NPOS) const =0
 Check if a string can be encoded using the charset (i.e. all characters can be represented).

virtual bool decode (UString &str, const uint8_t *data, size_t size) const =0
 Decode a string from the specified byte buffer.

UString decoded (const uint8_t *data, size_t size) const
 Decode a string from the specified byte buffer and return a UString.

UString decodedWithByteLength (const uint8_t *&data, size_t &size) const
 Decode a string (preceded by its one-byte length) from the specified byte buffer.

bool decodeWithByteLength (UString &str, const uint8_t *&data, size_t &size) const
 Decode a string (preceded by its one-byte length) from the specified byte buffer.

virtual size_t encode (uint8_t *&buffer, size_t &size, const UString &str, size_t start=0, size_t count=NPOS) const =0
 Encode a C++ Unicode string.

ByteBlock encoded (const UString &str, size_t start=0, size_t count=NPOS) const
 Encode a C++ Unicode string as a ByteBlock.

ByteBlock encodedWithByteLength (const UString &str, size_t start=0, size_t count=NPOS) const
 Encode a C++ Unicode string as a ByteBlock (preceded by its one-byte length).

virtual size_t encodeTableCode (uint8_t *&buffer, size_t &size) const
 Encode the character set table code.

size_t encodeWithByteLength (uint8_t *&buffer, size_t &size, const UString &str, size_t start=0, size_t count=NPOS) const
 Encode a C++ Unicode string preceded by its one-byte length.

UString name () const
 Get the character set name.

uint32_t tableCode () const
 Get the DVB table code for the character set.

virtual void unregister () const override
 Unregister the character set from the repository of character sets.
 

Static Public Member Functions

static bool DecodeTableCode (uint32_t &code, size_t &codeSize, const uint8_t *dvb, size_t dvbSize)
 This static function gets the character coding table at the beginning of a DVB string.

static UStringList GetAllNames ()
 Find all registered character set names.

static const Charset * GetCharset (const UString &name)
 Get a character set by name.

static const DVBCharTable * GetTableFromLeadingCode (uint32_t code)
 Get a DVB character set by table code.
 

Static Public Attributes

static constexpr uint16_t DVB_CODEPOINT_CRLF = 0xE08A
 Code point for DVB-encoded CR/LF in two-byte character sets.
 
static constexpr uint8_t DVB_SINGLE_BYTE_CRLF = 0x8A
 DVB-encoded CR/LF in single-byte character sets.
 

Protected Member Functions

 DVBCharTable (const UChar *name, uint32_t tableCode)
 Protected constructor.
 

Detailed Description

Definition of a character set for DVB encoding.

It is important to understand the difference between DVBCharset and DVBCharTable.

Both classes are subclasses of Charset. So, they both have the capabilities to encode and decode binary strings in DVB representation. But this is the only similarity.

DVBCharset is the generic decoder and encoder for DVB strings. When decoding a DVB string, it recognizes the leading sequence and uses the appropriate character table (a DVBCharTable) to interpret the binary data. When encoding a string, DVBCharset selects the most appropriate DVB character table and encodes the string using that table, after inserting the appropriate leading sequence to indicate which character table was used.

DVBCharTable, on the other hand, is an abstract superclass for all DVB character tables. Each subclass of DVBCharTable implements one specific DVB character table (modified ISO 6937, ISO 8859-1, ISO 8859-2, etc.). When encoding or decoding strings, a subclass of DVBCharTable only decodes or encodes binary data for this specific DVB character table. No leading sequence is decoded or encoded.

Usage guidelines:

See also
ETSI EN 300 468, Annex A.
DVBCharset
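
The division of labor described above can be illustrated with a minimal standalone sketch. It does not use the TSDuck classes: `SketchTable`, `SketchLatin1`, and `SketchGenericDecoder` are hypothetical stand-ins, `std::u16string` replaces UString, and only one-byte and 0x10-prefixed leading codes are handled. The table code 0x100001 for ISO 8859-1 is taken from ETSI EN 300 468, Annex A.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for an abstract character table.
// It interprets the raw payload only and knows nothing about leading sequences.
class SketchTable {
public:
    virtual ~SketchTable() = default;
    virtual bool decode(std::u16string& out, const uint8_t* data, size_t size) const = 0;
};

// Hypothetical ISO 8859-1 table: each byte maps to the code point of the same value.
// DVB control codes in the range 0x80-0x9F are not handled in this sketch.
class SketchLatin1 : public SketchTable {
public:
    bool decode(std::u16string& out, const uint8_t* data, size_t size) const override {
        out.clear();
        for (size_t i = 0; i < size; ++i) {
            out.push_back(char16_t(data[i]));
        }
        return true;
    }
};

// Hypothetical generic decoder: recognizes the leading sequence, selects the
// matching table, and delegates the rest of the buffer to it.
class SketchGenericDecoder {
public:
    void add(uint32_t code, const SketchTable* table) {
        assert(table != nullptr);
        tables_[code] = table;
    }
    bool decode(std::u16string& out, const uint8_t* data, size_t size) const {
        out.clear();
        uint32_t code = 0;  // zero means "default table" (none registered in this sketch)
        size_t skip = 0;    // size of the leading sequence
        if (size > 0 && data[0] < 0x20) {
            if (data[0] == 0x1F) {
                return false;  // encoding_type_id form, not supported
            }
            if (data[0] == 0x10) {
                if (size < 3) {
                    return false;  // truncated leading sequence
                }
                code = 0x100000u | (uint32_t(data[1]) << 8) | data[2];
                skip = 3;
            }
            else {
                code = data[0];
                skip = 1;
            }
        }
        const auto it = tables_.find(code);
        return it != tables_.end() && it->second->decode(out, data + skip, size - skip);
    }
private:
    std::map<uint32_t, const SketchTable*> tables_;
};
```

Given the bytes `10 00 01 63 61 66 E9`, the generic decoder strips the three-byte leading sequence and yields "café", whereas calling the table directly on the same buffer treats all seven bytes as payload.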

Constructor & Destructor Documentation

◆ DVBCharTable()

ts::DVBCharTable::DVBCharTable ( const UChar *  name,
uint32_t  tableCode 
)
protected

Protected constructor.

Parameters
[in] name: Charset name.
[in] tableCode: DVB table code.

Member Function Documentation

◆ DecodeTableCode()

static bool ts::DVBCharTable::DecodeTableCode ( uint32_t &  code,
size_t &  codeSize,
const uint8_t *  dvb,
size_t  dvbSize 
)
static

This static function gets the character coding table at the beginning of a DVB string.

The character coding table is encoded on up to 3 bytes at the beginning of a DVB string. The following encodings are recognized, based on the first byte of the DVB string:

  • First byte >= 0x20: The first byte is a character. The default encoding is ISO-6937. We return zero as code.
  • First byte == 0x10: The next two bytes indicate an ISO-8859 encoding. We return 0x10xxyy as code.
  • First byte == 0x1F: The second byte is an encoding_type_id. This encoding is not supported here.
  • Other value: One byte encoding.
Parameters
[out] code: Returned character coding table value. Zero when no code is present (use the default character table). 0xFFFFFFFF in case of invalid data.
[out] codeSize: Size in bytes of character coding table in dvb.
[in] dvb: Address of a DVB string.
[in] dvbSize: Size in bytes of the DVB string.
Returns
True on success, false on error (truncated, unsupported format, etc.)
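
The rules listed above can be sketched as a standalone function. This is an illustration, not the TSDuck implementation: `sketchDecodeTableCode` is a hypothetical name, and the handling of an empty string (success, no code) and the value left in `codeSize` on error are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper mirroring the documented rules for the leading table code.
bool sketchDecodeTableCode(uint32_t& code, size_t& codeSize, const uint8_t* dvb, size_t dvbSize)
{
    assert(dvb != nullptr || dvbSize == 0);
    code = 0;
    codeSize = 0;
    if (dvbSize == 0) {
        return true;  // assumption: an empty string uses the default table
    }
    const uint8_t first = dvb[0];
    if (first >= 0x20) {
        return true;  // first byte is a character: default table, no code
    }
    if (first == 0x1F) {
        code = 0xFFFFFFFF;  // encoding_type_id form, not supported
        return false;
    }
    if (first == 0x10) {
        if (dvbSize < 3) {
            code = 0xFFFFFFFF;  // truncated three-byte code
            return false;
        }
        // Build 0x10xxyy from the three leading bytes.
        code = 0x100000u | (uint32_t(dvb[1]) << 8) | dvb[2];
        codeSize = 3;
        return true;
    }
    // Any other value below 0x20: one-byte code.
    code = first;
    codeSize = 1;
    return true;
}
```

After a successful call, the payload to hand to a character table starts at `dvb + codeSize` and is `dvbSize - codeSize` bytes long.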

◆ tableCode()

uint32_t ts::DVBCharTable::tableCode ( ) const
inline

Get the DVB table code for the character set.

Returns
DVB table code.

◆ GetTableFromLeadingCode()

static const DVBCharTable* ts::DVBCharTable::GetTableFromLeadingCode ( uint32_t  code)
static

Get a DVB character set by table code.

Parameters
[in] code: Table code of the requested character set.
Returns
Address of the character set, or a null pointer if not found.

◆ encodeTableCode()

virtual size_t ts::DVBCharTable::encodeTableCode ( uint8_t *&  buffer,
size_t &  size 
) const
virtual

Encode the character set table code.

Stop either when the specified number of characters are serialized or when the buffer is full, whichever comes first.

Parameters
[in,out] buffer: Address of the buffer. The address is updated to point after the encoded value.
[in,out] size: Size of the buffer. Updated to remaining size.
Returns
The number of serialized bytes.
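
The in/out convention for `buffer` and `size` can be illustrated with a standalone sketch that writes a table code in the one-byte or 0x10xxyy form. `sketchEncodeTableCode` is a hypothetical free function that takes the code as a parameter. The all-or-nothing behavior when the buffer is too small is an assumption, not a statement about TSDuck.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: serialize a leading table code, advance the buffer
// pointer, and reduce the remaining size. Returns the number of bytes written.
size_t sketchEncodeTableCode(uint32_t code, uint8_t*& buffer, size_t& size)
{
    assert(buffer != nullptr || size == 0);
    uint8_t bytes[3] = {0, 0, 0};
    size_t count = 0;
    if (code == 0) {
        return 0;  // default table: no leading sequence
    }
    else if (code < 0x20 && code != 0x10 && code != 0x1F) {
        bytes[0] = uint8_t(code);  // one-byte code
        count = 1;
    }
    else if ((code & 0xFFFF0000u) == 0x00100000u) {
        bytes[0] = 0x10;           // three-byte code 0x10 xx yy
        bytes[1] = uint8_t(code >> 8);
        bytes[2] = uint8_t(code);
        count = 3;
    }
    else {
        return 0;  // form not handled in this sketch
    }
    if (count > size) {
        return 0;  // assumption: write nothing if the code does not fit
    }
    for (size_t i = 0; i < count; ++i) {
        *buffer++ = bytes[i];
    }
    size -= count;
    return count;
}
```

Because both arguments are updated in place, a caller can chain this call with a subsequent payload encoding on the same `buffer` and `size` variables.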

◆ unregister()

virtual void ts::DVBCharTable::unregister ( ) const
override virtual

Unregister the character set from the repository of character sets.

This is done automatically when the object is destructed. Can be called earlier to make sure a character set is no longer referenced.

Reimplemented from ts::Charset.

◆ name()

UString ts::Charset::name ( ) const
inline inherited

Get the character set name.

Returns
The name.

◆ GetCharset()

static const Charset* ts::Charset::GetCharset ( const UString &  name)
static inherited

Get a character set by name.

Parameters
[in] name: Name of the requested character set.
Returns
Address of the character set or zero if not found.

◆ GetAllNames()

static UStringList ts::Charset::GetAllNames ( )
static inherited

Find all registered character set names.

Returns
List of all registered names.

◆ decode()

virtual bool ts::Charset::decode ( UString &  str,
const uint8_t *  data,
size_t  size 
) const
pure virtual inherited

Decode a string from the specified byte buffer.

Parameters
[out] str: Returned decoded string.
[in] data: Address of an encoded string.
[in] size: Size in bytes of the encoded string.
Returns
True on success, false on error (truncated, unsupported format, etc.)

Implemented in ts::DVBCharTableUTF8, ts::DVBCharTableUTF16, ts::DVBCharTableSingleByte, ts::DVBCharset, ts::DumpCharset, and ts::ARIBCharset.

◆ decoded()

UString ts::Charset::decoded ( const uint8_t *  data,
size_t  size 
) const
inherited

Decode a string from the specified byte buffer and return a UString.

Errors (truncation, unsupported format, etc.) are ignored.

Parameters
[in] data: Address of an encoded string.
[in] size: Size in bytes of the encoded string.
Returns
The decoded string.

◆ decodeWithByteLength()

bool ts::Charset::decodeWithByteLength ( UString &  str,
const uint8_t *&  data,
size_t &  size 
) const
inherited

Decode a string (preceded by its one-byte length) from the specified byte buffer.

Parameters
[out] str: Returned decoded string.
[in,out] data: Address of an encoded string. The address is updated to point after the decoded value.
[in,out] size: Size of the buffer. Updated to remaining size.
Returns
True on success, false on error (truncated, unsupported format, etc.)
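
The length-prefixed layout and the pointer/size update can be sketched without TSDuck as follows. `sketchDecodeWithByteLength` is a hypothetical name, the payload is kept as raw bytes in a `std::string` instead of being converted to a UString, and rejecting a truncated payload without consuming anything is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: read one length byte, extract that many payload bytes,
// then advance 'data' and reduce 'size' past the consumed field.
bool sketchDecodeWithByteLength(std::string& str, const uint8_t*& data, size_t& size)
{
    assert(data != nullptr || size == 0);
    str.clear();
    if (size == 0) {
        return false;  // no room for the length byte
    }
    const size_t len = data[0];
    if (len > size - 1) {
        return false;  // assumption: truncated payload, nothing is consumed
    }
    str.assign(reinterpret_cast<const char*>(data + 1), len);
    data += 1 + len;
    size -= 1 + len;
    return true;
}
```

Calling the function repeatedly on the same `data` and `size` variables walks through consecutive length-prefixed strings, which is how such fields are typically laid out in descriptors.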

◆ decodedWithByteLength()

UString ts::Charset::decodedWithByteLength ( const uint8_t *&  data,
size_t &  size 
) const
inherited

Decode a string (preceded by its one-byte length) from the specified byte buffer.

Errors (truncation, unsupported format, etc.) are ignored.

Parameters
[in,out] data: Address of an encoded string. The address is updated to point after the decoded value.
[in,out] size: Size of the buffer. Updated to remaining size.
Returns
The decoded string.

◆ canEncode()

virtual bool ts::Charset::canEncode ( const UString &  str,
size_t  start = 0,
size_t  count = NPOS 
) const
pure virtual inherited

Check if a string can be encoded using the charset (i.e. all characters can be represented).

Parameters
[in] str: The string to encode.
[in] start: Starting offset in str.
[in] count: Maximum number of characters to encode.
Returns
True if all characters can be encoded.

Implemented in ts::DVBCharTableUTF8, ts::DVBCharTableUTF16, ts::DVBCharTableSingleByte, ts::DVBCharset, ts::DumpCharset, and ts::ARIBCharset.

◆ encode()

virtual size_t ts::Charset::encode ( uint8_t *&  buffer,
size_t &  size,
const UString &  str,
size_t  start = 0,
size_t  count = NPOS 
) const
pure virtual inherited

Encode a C++ Unicode string.

Unrepresentable characters are skipped. Stop either when the specified number of characters are serialized or when the buffer is full, whichever comes first.

Parameters
[in,out] buffer: Address of the buffer. The address is updated to point after the encoded value.
[in,out] size: Size of the buffer. Updated to remaining size.
[in] str: The string to encode.
[in] start: Starting offset in str.
[in] count: Maximum number of characters to encode.
Returns
The number of serialized characters (which is usually not the same as the number of written bytes).

Implemented in ts::DVBCharTableUTF8, ts::DVBCharTableUTF16, ts::DVBCharTableSingleByte, ts::DVBCharset, ts::DumpCharset, and ts::ARIBCharset.
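
The `start`/`count` range, the skipping of unrepresentable characters, and the difference between serialized characters and written bytes can be illustrated with a toy ASCII-only table. All names here (`SKETCH_NPOS`, `sketchRangeEnd`, `sketchCanEncode`, `sketchEncode`) are hypothetical, `std::u16string` stands in for UString, and counting skipped characters in the return value is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

constexpr size_t SKETCH_NPOS = std::u16string::npos;

// Clamp the range [start, start + count) to the string and return its end index.
size_t sketchRangeEnd(const std::u16string& str, size_t start, size_t count)
{
    const size_t first = std::min(start, str.size());
    return first + std::min(count, str.size() - first);
}

// True if every character in the range is representable (ASCII in this toy table).
bool sketchCanEncode(const std::u16string& str, size_t start = 0, size_t count = SKETCH_NPOS)
{
    const size_t end = sketchRangeEnd(str, start, count);
    for (size_t i = std::min(start, str.size()); i < end; ++i) {
        if (str[i] > 0x7F) {
            return false;
        }
    }
    return true;
}

// Encode the range into the buffer, skipping unrepresentable characters.
// Stops at the end of the range or when the buffer is full.
size_t sketchEncode(uint8_t*& buffer, size_t& size, const std::u16string& str,
                    size_t start = 0, size_t count = SKETCH_NPOS)
{
    assert(buffer != nullptr || size == 0);
    const size_t end = sketchRangeEnd(str, start, count);
    size_t serialized = 0;
    for (size_t i = std::min(start, str.size()); i < end && size > 0; ++i) {
        if (str[i] <= 0x7F) {
            *buffer++ = uint8_t(str[i]);
            --size;
        }
        ++serialized;  // assumption: skipped characters count as processed
    }
    return serialized;
}
```

With the three-character input "aéb", the sketch reports three serialized characters but writes only two bytes, since "é" is not representable in the toy table.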

◆ encodeWithByteLength()

size_t ts::Charset::encodeWithByteLength ( uint8_t *&  buffer,
size_t &  size,
const UString &  str,
size_t  start = 0,
size_t  count = NPOS 
) const
inherited

Encode a C++ Unicode string preceded by its one-byte length.

Unrepresentable characters are skipped. Stop either when the specified number of characters are serialized or when the buffer is full, whichever comes first.

Parameters
[in,out] buffer: Address of the buffer. The address is updated to point after the encoded value.
[in,out] size: Size of the buffer. Updated to remaining size.
[in] str: The string to encode.
[in] start: Starting offset in str.
[in] count: Maximum number of characters to encode.
Returns
The number of serialized characters (which is usually not the same as the number of written bytes).
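
A one-byte length prefix limits the payload to 255 bytes, and the length byte can only be filled in once the payload size is known. The sketch below shows one way to do this with a toy table that accepts code points up to 0xFF. `sketchEncodeWithByteLength` is a hypothetical name, `std::u16string` stands in for UString, and the `start`/`count` parameters are omitted for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: reserve one byte for the length, encode the payload
// behind it, then go back and store the payload size in the reserved byte.
size_t sketchEncodeWithByteLength(uint8_t*& buffer, size_t& size, const std::u16string& str)
{
    assert(buffer != nullptr || size == 0);
    if (size == 0) {
        return 0;  // no room for the length byte
    }
    uint8_t* const lengthByte = buffer++;
    --size;
    // The payload is bounded by the remaining buffer and by the one-byte length.
    const size_t room = std::min<size_t>(size, 255);
    size_t written = 0;
    size_t serialized = 0;
    for (size_t i = 0; i < str.size() && written < room; ++i) {
        if (str[i] <= 0xFF) {
            buffer[written++] = uint8_t(str[i]);
        }
        ++serialized;  // assumption: skipped characters count as processed
    }
    *lengthByte = uint8_t(written);
    buffer += written;
    size -= written;
    return serialized;
}
```

Encoding "hello" into a four-byte buffer stores the length 3 followed by "hel" and returns 3, which shows how the buffer limit truncates the string.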

◆ encoded()

ByteBlock ts::Charset::encoded ( const UString &  str,
size_t  start = 0,
size_t  count = NPOS 
) const
inherited

Encode a C++ Unicode string as a ByteBlock.

Unrepresentable characters are skipped.

Parameters
[in] str: The string to encode.
[in] start: Starting offset in str.
[in] count: Maximum number of characters to encode.
Returns
A ByteBlock containing the encoded string.

◆ encodedWithByteLength()

ByteBlock ts::Charset::encodedWithByteLength ( const UString &  str,
size_t  start = 0,
size_t  count = NPOS 
) const
inherited

Encode a C++ Unicode string as a ByteBlock (preceded by its one-byte length).

Unrepresentable characters are skipped.

Parameters
[in] str: The string to encode.
[in] start: Starting offset in str.
[in] count: Maximum number of characters to encode.
Returns
A ByteBlock containing the encoded string.
