ucommon::utf8 Class Reference

A core class of utf8 encoded string functions.

#include <unicode.h>

Static Public Member Functions

static unsigned ccount (char *string, ucs4_t character)
 Count occurrences of a unicode character in string.
static size_t chars (ucs4_t character)
 Number of chars required to encode a given unicode character.
static size_t chars (unicode_t string)
 Number of chars required to encode a given wchar string.
static ucs4_t codepoint (char *encoded)
 Convert a utf8 encoded codepoint to a ucs4 character value.
static size_t count (char *string)
 Count utf8 encoded ucs4 codepoints in string.
static char * find (char *string, ucs4_t character, size_t start=0)
 Find first occurrence of character in string.
static ucs4_t get (CharacterProtocol &buffer)
 Get a unicode character from a character protocol.
static char * offset (char *string, ssize_t position)
 Get codepoint offset in a string.
static size_t pack (unicode_t unicode, CharacterProtocol &buffer, size_t size)
 Convert a utf8 string into a unicode data buffer.
static ucs4_t put (ucs4_t character, CharacterProtocol &buffer)
 Push a unicode character to a character protocol.
static char * rfind (char *string, ucs4_t character, size_t end=(size_t)-1l)
 Find last occurrence of character in string.
static unsigned size (char *codepoint)
 Compute character size of utf8 string codepoint.
static ucs4_t * udup (char *string)
 Dup a utf8 string into a ucs4_t string.
static size_t unpack (unicode_t string, CharacterProtocol &buffer)
 Convert a unicode string into utf8.
static ucs2_t * wdup (char *string)
 Dup a utf8 string into a ucs2_t representation.

Static Public Attributes

static char * nil
 A convenient NULL pointer value.
static unsigned ucsize
 Size of "unicode_t" character codes, may not be ucs4_t size.

Detailed Description

A core class of utf8 encoded string functions.

This is a foundation for all utf8 string processing.

Author:
David Sugar

Definition at line 62 of file unicode.h.


Member Function Documentation

static unsigned ucommon::utf8::ccount ( char *  string,
ucs4_t  character 
) [static]

Count occurrences of a unicode character in string.

Parameters:
string to search in.
character code to search for.
Returns:
count of occurrences.

static size_t ucommon::utf8::chars ( ucs4_t  character  )  [static]

Number of chars required to encode a given unicode character.

Parameters:
character to encode.
Returns:
number of chars required to encode given character.

static size_t ucommon::utf8::chars ( unicode_t  string  )  [static]

Number of chars required to encode a given wchar string.

Parameters:
string of ucs4 data.
Returns:
number of chars required to encode given string.
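To make the byte counts concrete, here is a minimal standalone sketch of how many chars a code point needs in UTF-8. It is not the ucommon implementation; the names utf8_char_length and utf8_string_length are hypothetical, std::uint32_t stands in for ucs4_t, and the ranges follow the standard UTF-8 encoding (1 to 4 bytes).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not a ucommon function): bytes needed to encode
// one code point as UTF-8, or 0 if the value cannot be encoded.
std::size_t utf8_char_length(std::uint32_t cp)
{
    if (cp < 0x80) return 1;                     // 7 bits: plain ASCII
    if (cp < 0x800) return 2;                    // up to 11 bits
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;  // UTF-16 surrogates
    if (cp < 0x10000) return 3;                  // up to 16 bits
    if (cp <= 0x10FFFF) return 4;                // up to 21 bits
    return 0;                                    // beyond the Unicode range
}

// Hypothetical helper: total bytes for a 0-terminated code point string,
// excluding the terminator. Values that cannot be encoded contribute 0.
std::size_t utf8_string_length(const std::uint32_t *text)
{
    assert(text != nullptr);
    std::size_t total = 0;
    for (; *text != 0; ++text)
        total += utf8_char_length(*text);
    return total;
}
```

Under these assumptions, the string U+0048 U+00E9 U+20AC U+1F600 needs 1 + 2 + 3 + 4 = 10 chars.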

static ucs4_t ucommon::utf8::codepoint ( char *  encoded  )  [static]

Convert a utf8 encoded codepoint to a ucs4 character value.

Parameters:
encoded utf8 codepoint.
Returns:
ucs4 character value, or 0 if invalid.
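As an illustration of what such a conversion involves, the following standalone sketch decodes one UTF-8 sequence into a code point value. It is an assumption-laden example, not the library's code: utf8_decode is a hypothetical name, std::uint32_t stands in for ucs4_t, and overlong encodings and surrogates are not rejected.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a ucommon function): decode the UTF-8 sequence
// starting at p. Returns 0 for an invalid lead byte or a broken sequence.
std::uint32_t utf8_decode(const char *p)
{
    assert(p != nullptr);
    const unsigned char *s = reinterpret_cast<const unsigned char *>(p);
    std::uint32_t cp = 0;
    unsigned extra = 0;                  // continuation bytes expected

    if (s[0] < 0x80)
        return s[0];                     // 0xxxxxxx: ASCII
    else if ((s[0] & 0xE0) == 0xC0) { cp = s[0] & 0x1F; extra = 1; }
    else if ((s[0] & 0xF0) == 0xE0) { cp = s[0] & 0x0F; extra = 2; }
    else if ((s[0] & 0xF8) == 0xF0) { cp = s[0] & 0x07; extra = 3; }
    else
        return 0;                        // stray continuation or bad lead

    for (unsigned i = 1; i <= extra; ++i) {
        // Each continuation byte must look like 10xxxxxx. A terminating
        // NUL fails this check, so the loop never reads past the end of
        // a NUL-terminated string.
        if ((s[i] & 0xC0) != 0x80)
            return 0;
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    return cp;
}
```

With this sketch, the three bytes E2 82 AC decode to U+20AC, while a lone C3 followed by the terminator yields 0.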

static size_t ucommon::utf8::count ( char *  string  )  [static]

Count utf8 encoded ucs4 codepoints in string.

Parameters:
string of utf8 data.
Returns:
codepoint count, 0 if empty or invalid.
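Counting codepoints rather than bytes can be sketched by skipping continuation bytes, since every codepoint contributes exactly one non-continuation byte. The function below is a hypothetical standalone example (utf8_count is not a ucommon name). Unlike the documented function, it does not validate its input, so malformed data is counted rather than reported as 0.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a ucommon function): count code points in a
// NUL-terminated UTF-8 string, assuming the input is well formed.
std::size_t utf8_count(const char *s)
{
    assert(s != nullptr);
    std::size_t n = 0;
    for (; *s != '\0'; ++s) {
        // Continuation bytes look like 10xxxxxx; any other byte starts
        // a new code point.
        if ((static_cast<unsigned char>(*s) & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

For example, the six-byte string "h\xC3\xA9llo" holds five codepoints under this scheme.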

static char* ucommon::utf8::find ( char *  string,
ucs4_t  character,
size_t  start = 0 
) [static]

Find first occurrence of character in string.

Parameters:
string to search in.
character code to search for.
start offset in string in codepoints.
Returns:
pointer to first instance or NULL if not found.
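One way to picture a search with a start offset given in codepoints is sketched below: skip the requested number of codepoints, encode the target character, then search for its bytes. This is a standalone illustration and not how ucommon necessarily does it; utf8_find and encode_utf8 are hypothetical names, std::uint32_t stands in for ucs4_t, and the text is assumed to be valid UTF-8 (which guarantees a byte match starts on a codepoint boundary).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: encode cp into out as a NUL-terminated UTF-8
// sequence. Returns the byte count, or 0 if cp is out of range.
static std::size_t encode_utf8(std::uint32_t cp, char out[5])
{
    std::size_t n = 0;
    if (cp < 0x80) {
        out[n++] = static_cast<char>(cp);
    } else if (cp < 0x800) {
        out[n++] = static_cast<char>(0xC0 | (cp >> 6));
        out[n++] = static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out[n++] = static_cast<char>(0xE0 | (cp >> 12));
        out[n++] = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out[n++] = static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {
        out[n++] = static_cast<char>(0xF0 | (cp >> 18));
        out[n++] = static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out[n++] = static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out[n++] = static_cast<char>(0x80 | (cp & 0x3F));
    }
    out[n] = '\0';
    return n;
}

// Hypothetical helper (not a ucommon function): first occurrence of cp
// at or after the code point index `start`, or nullptr if not found.
const char *utf8_find(const char *s, std::uint32_t cp, std::size_t start = 0)
{
    assert(s != nullptr);
    // Skip `start` code points: step over one lead byte, then over any
    // continuation bytes (10xxxxxx) that belong to it.
    while (start > 0 && *s != '\0') {
        ++s;
        while ((static_cast<unsigned char>(*s) & 0xC0) == 0x80)
            ++s;
        --start;
    }
    if (start > 0 || cp == 0)
        return nullptr;                  // offset past the end, or NUL target

    char needle[5];
    if (encode_utf8(cp, needle) == 0)
        return nullptr;                  // target cannot be encoded
    return std::strstr(s, needle);
}
```

In the string "a\xE2\x82\xAC-b\xE2\x82\xAC", searching for U+20AC from offset 0 returns a pointer to byte 1, and searching from offset 2 returns a pointer to byte 6.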

static ucs4_t ucommon::utf8::get ( CharacterProtocol &  buffer  )  [static]

Get a unicode character from a character protocol.

Parameters:
buffer of character protocol to read from.
Returns:
unicode character or EOF on error.

static char* ucommon::utf8::offset ( char *  string,
ssize_t  position 
) [static]

Get codepoint offset in a string.

Parameters:
string of utf8 data.
position of codepoint in string, negative offsets are from tail.
Returns:
offset of codepoint or NULL if invalid.
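The positive and negative position handling can be sketched as a walk from either end of the string. The example below is standalone and rests on assumptions that the text above does not confirm: position 0 is the first codepoint, -1 is the last one, and a position at or beyond the end yields a null pointer. utf8_offset and is_continuation are hypothetical names, and std::ptrdiff_t stands in for ssize_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// True for UTF-8 continuation bytes, which look like 10xxxxxx.
static bool is_continuation(char c)
{
    return (static_cast<unsigned char>(c) & 0xC0) == 0x80;
}

// Hypothetical helper (not a ucommon function): pointer to the code point
// at `position`, counting from the head (>= 0) or from the tail (< 0).
const char *utf8_offset(const char *s, std::ptrdiff_t position)
{
    assert(s != nullptr);
    if (position >= 0) {
        const char *p = s;
        while (position > 0) {
            if (*p == '\0')
                return nullptr;          // ran off the end of the string
            ++p;                         // step over the lead byte
            while (is_continuation(*p))
                ++p;                     // and its continuation bytes
            --position;
        }
        return *p != '\0' ? p : nullptr;
    }

    // Negative positions: start at the terminator and walk backwards,
    // stopping on each lead byte.
    const char *p = s + std::strlen(s);
    while (position < 0) {
        if (p == s)
            return nullptr;              // ran off the front of the string
        --p;
        while (p > s && is_continuation(*p))
            --p;
        ++position;
    }
    return p;
}
```

For the string "a\xC3\xA9z", position 2 and position -1 both resolve to the final 'z' at byte 3, while position -2 resolves to the two-byte codepoint at byte 1.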

static size_t ucommon::utf8::pack ( unicode_t  unicode,
CharacterProtocol &  buffer,
size_t  size 
) [static]

Convert a utf8 string into a unicode data buffer.

Parameters:
unicode data buffer.
buffer of character protocol to pack from.
size of unicode data buffer in codepoints.
Returns:
number of code points converted.

static ucs4_t ucommon::utf8::put ( ucs4_t  character,
CharacterProtocol &  buffer 
) [static]

Push a unicode character to a character protocol.

Parameters:
character to push to file.
buffer of character protocol to push character to.
Returns:
unicode character or EOF on error.

static char* ucommon::utf8::rfind ( char *  string,
ucs4_t  character,
size_t  end = (size_t)-1l 
) [static]

Find last occurrence of character in string.

Parameters:
string to search in.
character code to search for.
end offset to start from in codepoints.
Returns:
pointer to last instance or NULL if not found.

static unsigned ucommon::utf8::size ( char *  codepoint  )  [static]

Compute character size of utf8 string codepoint.

Parameters:
codepoint in string.
Returns:
size of codepoint as utf8 encoded data, 0 if invalid.
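The size of an encoded codepoint can be read from its lead byte alone. The standalone sketch below shows the idea for the 1 to 4 byte forms of standard UTF-8; utf8_sequence_size is a hypothetical name, and whether the library also accepts the obsolete 5 and 6 byte forms is not stated here.

```cpp
#include <cassert>

// Hypothetical helper (not a ucommon function): length in bytes of the
// UTF-8 sequence whose lead byte is *p, or 0 if *p is not a valid lead.
unsigned utf8_sequence_size(const char *p)
{
    assert(p != nullptr);
    const unsigned char lead = static_cast<unsigned char>(*p);
    if (lead < 0x80) return 1;             // 0xxxxxxx
    if ((lead & 0xE0) == 0xC0) return 2;   // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;   // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4;   // 11110xxx
    return 0;                              // continuation or invalid byte
}
```

A pointer into the middle of a sequence, such as a lone 0x80 byte, reports 0 in this sketch because a continuation byte cannot start a codepoint.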

static size_t ucommon::utf8::unpack ( unicode_t  string,
CharacterProtocol &  buffer 
) [static]

Convert a unicode string into utf8.

Parameters:
string of unicode data to convert.
buffer of character protocol to put data into.
Returns:
number of code points converted.


The documentation for this class was generated from the following file:
Generated on 14 Aug 2013 for UCommon by doxygen 1.4.7