java.lang.Object
  com.ibm.icu.text.ComposedCharIter
public final class ComposedCharIter
ComposedCharIter is an iterator class that returns all of the precomposed characters defined in the Unicode standard, along with their decomposed forms. This is often useful when building data tables (e.g. collation tables) which need to treat composed and decomposed characters equivalently.
For example, imagine that you have built a collation table with ordering rules for the canonically decomposed forms of all characters used in a particular language. When you process input text using this table, the text must first be decomposed so that it matches the form used in the table. This can impose a performance penalty that may be unacceptable in some situations.
You can avoid this problem by ensuring that the collation table contains rules for both the decomposed and composed versions of each character. To do so, use a ComposedCharIter to iterate through all of the composed characters in Unicode. If the decomposition for that character consists solely of characters that are listed in your ruleset, you can add a new rule for the composed character that makes it equivalent to its decomposition sequence.
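The procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it does not use ComposedCharIter itself but the JDK's java.text.Normalizer as a stand-in, and it scans only the Basic Multilingual Plane. The class name ComposedRuleSketch, the method equivalencesFor, and the sample rule set are hypothetical.

```java
import java.text.Normalizer;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; uses java.text.Normalizer as a stand-in for ComposedCharIter.
class ComposedRuleSketch {

    /**
     * Maps each composed BMP character to its canonical decomposition, but only
     * when every character of that decomposition is already in the rule set.
     */
    static Map<Character, String> equivalencesFor(Set<Character> ruleChars) {
        Map<Character, String> result = new LinkedHashMap<>();
        for (int cp = 0; cp <= 0xFFFF; cp++) {
            char c = (char) cp;
            if (Character.isSurrogate(c)) {
                continue; // lone surrogates are not characters
            }
            String composed = String.valueOf(c);
            String decomposed = Normalizer.normalize(composed, Normalizer.Form.NFD);
            if (decomposed.equals(composed)) {
                continue; // no canonical decomposition
            }
            boolean covered = true;
            for (int i = 0; i < decomposed.length(); i++) {
                if (!ruleChars.contains(decomposed.charAt(i))) {
                    covered = false;
                    break;
                }
            }
            if (covered) {
                result.put(c, decomposed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed rule set: the letter e and the combining acute accent.
        Set<Character> rules = new HashSet<>(Arrays.asList('e', (char) 0x0301));
        for (Map.Entry<Character, String> entry : equivalencesFor(rules).entrySet()) {
            StringBuilder sb = new StringBuilder();
            for (char d : entry.getValue().toCharArray()) {
                sb.append(String.format(" U+%04X", (int) d));
            }
            System.out.printf("U+%04X is equivalent to%s%n", (int) entry.getKey(), sb);
        }
    }
}
```

Each entry in the returned map corresponds to one rule that could be added to make the composed character equivalent to its decomposition sequence.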
Note that ComposedCharIter iterates over a static table of the composed characters in Unicode. If you want to iterate over the composed characters in a particular string, use Normalizer instead.
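To illustrate the string-based alternative, here is a minimal sketch that reports the composed characters found in one particular string. It assumes the JDK's java.text.Normalizer rather than the ICU Normalizer class referred to above; the class name ComposedInStringSketch and the method composedCodePoints are hypothetical.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; uses java.text.Normalizer, not com.ibm.icu.text.Normalizer.
class ComposedInStringSketch {

    /** Returns the code points in the input that have a canonical decomposition. */
    static List<Integer> composedCodePoints(String text) {
        List<Integer> found = new ArrayList<>();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            String single = new String(Character.toChars(cp));
            // A character that is not already in NFD has a canonical decomposition.
            if (!Normalizer.isNormalized(single, Normalizer.Form.NFD)) {
                found.add(cp);
            }
            i += Character.charCount(cp);
        }
        return found;
    }

    public static void main(String[] args) {
        for (int cp : composedCodePoints("caf\u00E9 \u00C5land")) {
            System.out.printf("U+%04X%n", cp);
        }
    }
}
```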
When constructing a ComposedCharIter there is one optional feature that you can enable or disable:
Normalizer.IGNORE_HANGUL - Do not iterate over the Hangul characters and their corresponding Jamo decompositions. This option is off by default (i.e. Hangul processing is enabled) since the Unicode standard specifies that Hangul to Jamo is a canonical decomposition.
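The effect of such an option can be sketched as below. This is an assumption-laden illustration, not the ICU implementation: it uses the JDK's java.text.Normalizer, a hypothetical boolean flag ignoreHangul in place of Normalizer.IGNORE_HANGUL, and the precomposed Hangul syllable range U+AC00 to U+D7A3.

```java
import java.text.Normalizer;

// Hypothetical helper modelling an "ignore Hangul" switch.
class HangulOptionSketch {
    static final int HANGUL_FIRST = 0xAC00; // first precomposed Hangul syllable
    static final int HANGUL_LAST = 0xD7A3;  // last precomposed Hangul syllable

    static boolean isHangulSyllable(int cp) {
        return cp >= HANGUL_FIRST && cp <= HANGUL_LAST;
    }

    /**
     * Returns the canonical decomposition of c, or null when c has none
     * or when it is a Hangul syllable and ignoreHangul is set.
     */
    static String decompose(char c, boolean ignoreHangul) {
        if (ignoreHangul && isHangulSyllable(c)) {
            return null;
        }
        String s = String.valueOf(c);
        String d = Normalizer.normalize(s, Normalizer.Form.NFD);
        return d.equals(s) ? null : d;
    }

    public static void main(String[] args) {
        char ga = (char) 0xAC00; // Hangul syllable GA
        String jamo = decompose(ga, false);
        System.out.println("Hangul processing enabled, Jamo count: " + jamo.length());
        System.out.println("Hangul ignored, result: " + decompose(ga, true));
    }
}
```

With Hangul processing enabled, the syllable is decomposed into its conjoining Jamo; with the flag set, it is skipped while other composed characters are still handled.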
ComposedCharIter is currently based on version 2.1.8 of the Unicode Standard. It will be updated as later versions of Unicode are released.
Field Summary | |
---|---|
static char | DONE Deprecated. ICU 2.2 |
Constructor Summary | |
---|---|
ComposedCharIter() | Deprecated. ICU 2.2 |
ComposedCharIter(boolean compat, int options) | Deprecated. ICU 2.2 |
Method Summary | |
---|---|
String | decomposition() Deprecated. ICU 2.2 |
boolean | hasNext() Deprecated. ICU 2.2 |
char | next() Deprecated. ICU 2.2 |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final char DONE
next() returns this value when there are no more composed characters over which to iterate.
Constructor Detail |
---|
public ComposedCharIter()
public ComposedCharIter(boolean compat, int options)
compat - false for canonical decompositions only; true for both canonical and compatibility decompositions.
options - Optional decomposition features. Currently, the only supported option is Normalizer.IGNORE_HANGUL, which causes this ComposedCharIter not to iterate over the Hangul characters and their corresponding Jamo decompositions.
Method Detail |
---|
public boolean hasNext()
Returns true if there are more composed characters to be returned by next().
public char next()
Returns the next composed character. After all composed characters have been returned, hasNext() will return false and further calls to next() will return DONE.
public String decomposition()
Returns the decomposition of the composed character most recently returned by next(). The resulting decomposition is affected by the settings of the options passed to the constructor.
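The hasNext(), next(), and decomposition() protocol can be illustrated with a small stand-in iterator. This is a sketch, not the ICU class: ComposedIterSketch is a hypothetical name, it is built on the JDK's java.text.Normalizer, it covers only the Basic Multilingual Plane, it models only the compat flag, and the sentinel value chosen for DONE is an assumption of this sketch.

```java
import java.text.Normalizer;

// Hypothetical stand-in that mimics the hasNext()/next()/decomposition() shape.
class ComposedIterSketch {
    /** Sentinel returned by next() once iteration is finished (assumed value). */
    static final char DONE = '\uFFFF';

    private final Normalizer.Form form;
    private int cursor = 0;          // next BMP code unit to examine
    private int pending = -1;        // composed character found but not yet returned
    private String current = null;   // decomposition of the last returned character

    /** compat = false: canonical only; true: canonical and compatibility. */
    ComposedIterSketch(boolean compat) {
        this.form = compat ? Normalizer.Form.NFKD : Normalizer.Form.NFD;
    }

    /** Returns the decomposition of c, or null if it has none under this form. */
    private String decompositionOf(char c) {
        String s = String.valueOf(c);
        String d = Normalizer.normalize(s, form);
        return d.equals(s) ? null : d;
    }

    /** Advances the cursor until a composed character is pending or the range ends. */
    private void findNext() {
        while (pending < 0 && cursor < DONE) {
            char c = (char) cursor++;
            if (!Character.isSurrogate(c) && decompositionOf(c) != null) {
                pending = c;
            }
        }
    }

    boolean hasNext() {
        findNext();
        return pending >= 0;
    }

    char next() {
        findNext();
        if (pending < 0) {
            return DONE;
        }
        char c = (char) pending;
        current = decompositionOf(c);
        pending = -1;
        return c;
    }

    /** Decomposition of the character most recently returned by next(). */
    String decomposition() {
        return current;
    }

    public static void main(String[] args) {
        ComposedIterSketch iter = new ComposedIterSketch(false);
        int count = 0;
        while (iter.hasNext()) {
            char c = iter.next();
            if (count++ < 3) {
                System.out.printf("U+%04X has a decomposition of length %d%n",
                        (int) c, iter.decomposition().length());
            }
        }
        System.out.println("composed characters found: " + count);
    }
}
```

Calling decomposition() immediately after next() keeps the character and its decomposition in step, which is the usage pattern the method descriptions above imply.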