com.ibm.icu.lang

Interface UProperty

public interface UProperty

Selection constants for Unicode properties.

These constants are used in functions like UCharacter.hasBinaryProperty(int) to select one of the Unicode properties.

The properties APIs are intended to reflect Unicode properties as defined in the Unicode Character Database (UCD) and Unicode Technical Reports (UTR).

For details about the properties see http://www.unicode.org.

For names of Unicode properties see the UCD file PropertyAliases.txt.

Important: If ICU is built with UCD files from Unicode versions below 3.2, then properties marked with "new" are unavailable or only partially available. Check UCharacter.getUnicodeVersion() to be sure.

Author: Syn Wee Quek

See Also: UCharacter

UNKNOWN: ICU 2.6

Nested Class Summary
static interface UProperty.NameChoice
Selector constants for UCharacter.getPropertyName() and UCharacter.getPropertyValueName().
Field Summary
static int AGE
String property Age.
static int ALPHABETIC

Binary property Alphabetic.

static int ASCII_HEX_DIGIT
Binary property ASCII_Hex_Digit (0-9 A-F a-f).
static int BIDI_CLASS
Enumerated property Bidi_Class.
static int BIDI_CONTROL

Binary property Bidi_Control.

Format controls which have specific functions in the Bidi Algorithm.

static int BIDI_MIRRORED

Binary property Bidi_Mirrored.

Characters that may change display in RTL text.

Property for UCharacter.isMirrored().

See Bidi Algorithm; UTR 9.

static int BIDI_MIRRORING_GLYPH
String property Bidi_Mirroring_Glyph.
static int BINARY_LIMIT

One more than the last constant for binary Unicode properties.

static int BINARY_START
First constant for binary Unicode properties.
static int BLOCK
Enumerated property Block.
static int CANONICAL_COMBINING_CLASS
Enumerated property Canonical_Combining_Class.
static int CASE_FOLDING
String property Case_Folding.
static int CASE_SENSITIVE

Binary property Case_Sensitive.

Either the source of a case mapping or _in_ the target of a case mapping.

static int DASH

Binary property Dash.

Variations of dashes.

static int DECOMPOSITION_TYPE
Enumerated property Decomposition_Type.
static int DEFAULT_IGNORABLE_CODE_POINT

Binary property Default_Ignorable_Code_Point (new).

static int DEPRECATED

Binary property Deprecated (new).

The usage of deprecated characters is strongly discouraged.

static int DIACRITIC

Binary property Diacritic.

Characters that linguistically modify the meaning of another character to which they apply.

static int DOUBLE_LIMIT
One more than the last constant for double Unicode properties.
static int DOUBLE_START
First constant for double Unicode properties.
static int EAST_ASIAN_WIDTH
Enumerated property East_Asian_Width.
static int EXTENDER

Binary property Extender.

Extend the value or shape of a preceding alphabetic character, e.g. length and iteration marks.

static int FULL_COMPOSITION_EXCLUSION

Binary property Full_Composition_Exclusion.

CompositionExclusions.txt + Singleton Decompositions + Non-Starter Decompositions.

static int GENERAL_CATEGORY
Enumerated property General_Category.
static int GENERAL_CATEGORY_MASK
Bitmask property General_Category_Mask.
static int GRAPHEME_BASE

Binary property Grapheme_Base (new).

For programmatic determination of grapheme cluster boundaries.

static int GRAPHEME_CLUSTER_BREAK
Enumerated property Grapheme_Cluster_Break (new in Unicode 4.1).
static int GRAPHEME_EXTEND

Binary property Grapheme_Extend (new).

For programmatic determination of grapheme cluster boundaries.

Me+Mn+Mc+Other_Grapheme_Extend-Grapheme_Link-CGJ

static int GRAPHEME_LINK

Binary property Grapheme_Link (new).

For programmatic determination of grapheme cluster boundaries.

static int HANGUL_SYLLABLE_TYPE
Enumerated property Hangul_Syllable_Type, new in Unicode 4.
static int HEX_DIGIT

Binary property Hex_Digit.

Characters commonly used for hexadecimal numbers.

static int HYPHEN

Binary property Hyphen.

Dashes used to mark connections between pieces of words, plus the Katakana middle dot.

static int IDEOGRAPHIC

Binary property Ideographic.

CJKV ideographs.

static int IDS_BINARY_OPERATOR

Binary property IDS_Binary_Operator (new).

For programmatic determination of Ideographic Description Sequences.

static int IDS_TRINARY_OPERATOR

Binary property IDS_Trinary_Operator (new).

static int ID_CONTINUE

Binary property ID_Continue.

Characters that can continue an identifier.

ID_Start+Mn+Mc+Nd+Pc

static int ID_START

Binary property ID_Start.

Characters that can start an identifier.

Lu+Ll+Lt+Lm+Lo+Nl

static int INT_LIMIT
One more than the last constant for enumerated/integer Unicode properties.
static int INT_START
First constant for enumerated/integer Unicode properties.
static int ISO_COMMENT
String property ISO_Comment.
static int JOINING_GROUP
Enumerated property Joining_Group.
static int JOINING_TYPE
Enumerated property Joining_Type.
static int JOIN_CONTROL

Binary property Join_Control.

Format controls for cursive joining and ligation.

static int LEAD_CANONICAL_COMBINING_CLASS
Enumerated property Lead_Canonical_Combining_Class.
static int LINE_BREAK
Enumerated property Line_Break.
static int LOGICAL_ORDER_EXCEPTION

Binary property Logical_Order_Exception (new).

Characters that do not use logical order and require special handling in most processing.

static int LOWERCASE

Binary property Lowercase.

Same as UCharacter.isULowercase(), different from UCharacter.isLowerCase().

Ll+Other_Lowercase

static int LOWERCASE_MAPPING
String property Lowercase_Mapping.
static int MASK_LIMIT
One more than the last constant for bit-mask Unicode properties.
static int MASK_START
First constant for bit-mask Unicode properties.
static int MATH

Binary property Math.

Sm+Other_Math

static int NAME
String property Name.
static int NFC_INERT
Binary property NFC_Inert.
static int NFC_QUICK_CHECK
Enumerated property NFC_Quick_Check.
static int NFD_INERT
Binary property NFD_Inert.
static int NFD_QUICK_CHECK
Enumerated property NFD_Quick_Check.
static int NFKC_INERT
Binary property NFKC_Inert.
static int NFKC_QUICK_CHECK
Enumerated property NFKC_Quick_Check.
static int NFKD_INERT
Binary property NFKD_Inert.
static int NFKD_QUICK_CHECK
Enumerated property NFKD_Quick_Check.
static int NONCHARACTER_CODE_POINT

Binary property Noncharacter_Code_Point.

Code points that are explicitly defined as illegal for the encoding of characters.

static int NUMERIC_TYPE
Enumerated property Numeric_Type.
static int NUMERIC_VALUE
Double property Numeric_Value.
static int PATTERN_SYNTAX
Binary property Pattern_Syntax (new in Unicode 4.1).
static int PATTERN_WHITE_SPACE
Binary property Pattern_White_Space (new in Unicode 4.1).
static int POSIX_ALNUM
Binary property alnum (a C/POSIX character class).
static int POSIX_BLANK
Binary property blank (a C/POSIX character class).
static int POSIX_GRAPH
Binary property graph (a C/POSIX character class).
static int POSIX_PRINT
Binary property print (a C/POSIX character class).
static int POSIX_XDIGIT
Binary property xdigit (a C/POSIX character class).
static int QUOTATION_MARK

Binary property Quotation_Mark.

static int RADICAL

Binary property Radical (new).

For programmatic determination of Ideographic Description Sequences.

static int SCRIPT
Enumerated property Script.
static int SEGMENT_STARTER
Binary property Segment_Starter.
static int SENTENCE_BREAK
Enumerated property Sentence_Break (new in Unicode 4.1).
static int SIMPLE_CASE_FOLDING
String property Simple_Case_Folding.
static int SIMPLE_LOWERCASE_MAPPING
String property Simple_Lowercase_Mapping.
static int SIMPLE_TITLECASE_MAPPING
String property Simple_Titlecase_Mapping.
static int SIMPLE_UPPERCASE_MAPPING
String property Simple_Uppercase_Mapping.
static int SOFT_DOTTED

Binary property Soft_Dotted (new).

Characters with a "soft dot", like i or j.

An accent placed on these characters causes the dot to disappear.

static int STRING_LIMIT
One more than the last constant for string Unicode properties.
static int STRING_START
First constant for string Unicode properties.
static int S_TERM
Binary property STerm (new in Unicode 4.0.1).
static int TERMINAL_PUNCTUATION

Binary property Terminal_Punctuation.

Punctuation characters that generally mark the end of textual units.

static int TITLECASE_MAPPING
String property Titlecase_Mapping.
static int TRAIL_CANONICAL_COMBINING_CLASS
Enumerated property Trail_Canonical_Combining_Class.
static int UNICODE_1_NAME
String property Unicode_1_Name.
static int UNIFIED_IDEOGRAPH

Binary property Unified_Ideograph (new).

For programmatic determination of Ideographic Description Sequences.

static int UPPERCASE

Binary property Uppercase.

Same as UCharacter.isUUppercase(), different from UCharacter.isUpperCase().

Lu+Other_Uppercase

static int UPPERCASE_MAPPING
String property Uppercase_Mapping.
static int VARIATION_SELECTOR
Binary property Variation_Selector (new in Unicode 4.0.1).
static int WHITE_SPACE

Binary property White_Space.

Same as UCharacter.isUWhiteSpace(), different from UCharacter.isSpace() and UCharacter.isWhitespace().

Space characters+TAB+CR+LF-ZWSP-ZWNBSP

static int WORD_BREAK
Enumerated property Word_Break (new in Unicode 4.1).
static int XID_CONTINUE

Binary property XID_Continue.

ID_Continue modified to allow closure under normalization forms NFKC and NFKD.

static int XID_START

Binary property XID_Start.

ID_Start modified to allow closure under normalization forms NFKC and NFKD.

Field Detail

AGE

public static final int AGE
String property Age. Corresponds to UCharacter.getAge(int).

UNKNOWN: ICU 2.4

ALPHABETIC

public static final int ALPHABETIC

Binary property Alphabetic.

Property for UCharacter.isUAlphabetic(), different from the property in UCharacter.isLetter().

Lu + Ll + Lt + Lm + Lo + Nl + Other_Alphabetic.

UNKNOWN: ICU 2.6
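The general-category part of the Alphabetic formula can be approximated with the JDK alone. The sketch below is an illustration, not ICU's implementation: the class and method names are hypothetical, it relies on java.lang.Character rather than UCharacter, and it leaves out Other_Alphabetic, which the JDK does not expose as a separate set.

```java
// Sketch only: uses java.lang.Character, not ICU4J.
final class AlphabeticSketch {
    /** True for the general categories Lu, Ll, Lt, Lm, Lo and Nl. */
    static boolean inLetterCategories(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.UPPERCASE_LETTER:  // Lu
            case Character.LOWERCASE_LETTER:  // Ll
            case Character.TITLECASE_LETTER:  // Lt
            case Character.MODIFIER_LETTER:   // Lm
            case Character.OTHER_LETTER:      // Lo
            case Character.LETTER_NUMBER:     // Nl
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        int romanOne = 0x2160;       // ROMAN NUMERAL ONE, category Nl
        int ypogegrammeni = 0x0345;  // COMBINING GREEK YPOGEGRAMMENI, category Mn
        System.out.println(inLetterCategories(romanOne));       // true
        System.out.println(inLetterCategories(ypogegrammeni));  // false
        // The JDK's own Alphabetic test also takes Other_Alphabetic into account,
        // so it may answer differently for marks such as U+0345.
        System.out.println(Character.isAlphabetic(ypogegrammeni));
    }
}
```

Code points that are alphabetic only through Other_Alphabetic are exactly where this approximation and the full property diverge.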

ASCII_HEX_DIGIT

public static final int ASCII_HEX_DIGIT
Binary property ASCII_Hex_Digit (0-9 A-F a-f).

UNKNOWN: ICU 2.6

BIDI_CLASS

public static final int BIDI_CLASS
Enumerated property Bidi_Class. Same as UCharacter.getDirection(int), returns UCharacterDirection values.

UNKNOWN: ICU 2.4

BIDI_CONTROL

public static final int BIDI_CONTROL

Binary property Bidi_Control.

Format controls which have specific functions in the Bidi Algorithm.

UNKNOWN: ICU 2.6

BIDI_MIRRORED

public static final int BIDI_MIRRORED

Binary property Bidi_Mirrored.

Characters that may change display in RTL text.

Property for UCharacter.isMirrored().

See Bidi Algorithm; UTR 9.

UNKNOWN: ICU 2.6

BIDI_MIRRORING_GLYPH

public static final int BIDI_MIRRORING_GLYPH
String property Bidi_Mirroring_Glyph. Corresponds to UCharacter.getMirror(int).

UNKNOWN: ICU 2.4

BINARY_LIMIT

public static final int BINARY_LIMIT

One more than the last constant for binary Unicode properties.

UNKNOWN: ICU 2.6

BINARY_START

public static final int BINARY_START
First constant for binary Unicode properties.

UNKNOWN: ICU 2.6
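The START/LIMIT pairs describe half-open ranges: a selector p belongs to a group when START <= p < LIMIT. The following sketch shows that convention with stand-in constants. The class name and all numeric values here are assumptions made for illustration; real code would read the values from UProperty.

```java
// Sketch only: the constants below are illustrative stand-ins, not ICU's values.
final class PropertyRangeSketch {
    static final int BINARY_START = 0;
    static final int BINARY_LIMIT = 3;    // hypothetical: three binary selectors
    static final int INT_START = 0x1000;
    static final int INT_LIMIT = 0x1002;  // hypothetical: two enumerated selectors

    /** A selector is binary when it lies in [BINARY_START, BINARY_LIMIT). */
    static boolean isBinary(int selector) {
        return BINARY_START <= selector && selector < BINARY_LIMIT;
    }

    /** A selector is enumerated/integer when it lies in [INT_START, INT_LIMIT). */
    static boolean isInt(int selector) {
        return INT_START <= selector && selector < INT_LIMIT;
    }

    /** Visits every binary selector; the limit itself is never visited. */
    static int countBinary() {
        int count = 0;
        for (int p = BINARY_START; p < BINARY_LIMIT; ++p) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(isBinary(BINARY_LIMIT));  // false
        System.out.println(countBinary());           // 3
    }
}
```

Because the limit is exclusive, a loop written this way keeps working when a later release adds selectors and raises the limit.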

BLOCK

public static final int BLOCK
Enumerated property Block. Same as UCharacter.UnicodeBlock.of(int), returns UCharacter.UnicodeBlock values.

UNKNOWN: ICU 2.4

CANONICAL_COMBINING_CLASS

public static final int CANONICAL_COMBINING_CLASS
Enumerated property Canonical_Combining_Class. Same as UCharacter.getCombiningClass(int), returns 8-bit numeric values.

UNKNOWN: ICU 2.4

CASE_FOLDING

public static final int CASE_FOLDING
String property Case_Folding. Corresponds to UCharacter.foldCase(String, boolean).

UNKNOWN: ICU 2.4

CASE_SENSITIVE

public static final int CASE_SENSITIVE

Binary property Case_Sensitive.

Either the source of a case mapping or _in_ the target of a case mapping. Not the same as the general category Cased_Letter.

UNKNOWN: ICU 2.6

DASH

public static final int DASH

Binary property Dash.

Variations of dashes.

UNKNOWN: ICU 2.6

DECOMPOSITION_TYPE

public static final int DECOMPOSITION_TYPE
Enumerated property Decomposition_Type. Returns UCharacter.DecompositionType values.

UNKNOWN: ICU 2.4

DEFAULT_IGNORABLE_CODE_POINT

public static final int DEFAULT_IGNORABLE_CODE_POINT

Binary property Default_Ignorable_Code_Point (new).

Indicates that a code point is ignorable in most processing.

Codepoints (2060..206F, FFF0..FFFB, E0000..E0FFF) + Other_Default_Ignorable_Code_Point + (Cf + Cc + Cs - White_Space)

UNKNOWN: ICU 2.6
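As a reading aid, the derivation above can be written out as set arithmetic over JDK character data. This is a rough sketch under several assumptions: the names are hypothetical, White_Space is approximated as Zs/Zl/Zp plus U+0009..U+000D and U+0085, and the Other_Default_Ignorable_Code_Point term is omitted because the JDK has no lookup for it. It follows the formula as written here, not necessarily what a given ICU or Unicode version computes.

```java
// Sketch only: uses java.lang.Character, not ICU4J.
final class DefaultIgnorableSketch {
    static boolean inRange(int cp, int lo, int hi) {
        return lo <= cp && cp <= hi;
    }

    /** Rough stand-in for White_Space: Zs/Zl/Zp plus TAB..CR and NEL. */
    static boolean isWhiteSpaceApprox(int cp) {
        return Character.isSpaceChar(cp) || inRange(cp, 0x0009, 0x000D) || cp == 0x0085;
    }

    /** (listed ranges) + (Cf + Cc + Cs - White_Space); the Other_ term is omitted. */
    static boolean isDefaultIgnorableApprox(int cp) {
        boolean listedRange = inRange(cp, 0x2060, 0x206F)
                || inRange(cp, 0xFFF0, 0xFFFB)
                || inRange(cp, 0xE0000, 0xE0FFF);
        int type = Character.getType(cp);
        boolean cfCcCs = type == Character.FORMAT
                || type == Character.CONTROL
                || type == Character.SURROGATE;
        return listedRange || (cfCcCs && !isWhiteSpaceApprox(cp));
    }

    public static void main(String[] args) {
        System.out.println(isDefaultIgnorableApprox(0x2060));  // true: inside a listed range
        System.out.println(isDefaultIgnorableApprox('\t'));    // false: Cc but White_Space
        System.out.println(isDefaultIgnorableApprox('A'));     // false
    }
}
```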

DEPRECATED

public static final int DEPRECATED

Binary property Deprecated (new).

The usage of deprecated characters is strongly discouraged.

UNKNOWN: ICU 2.6

DIACRITIC

public static final int DIACRITIC

Binary property Diacritic.

Characters that linguistically modify the meaning of another character to which they apply.

UNKNOWN: ICU 2.6

DOUBLE_LIMIT

public static final int DOUBLE_LIMIT
One more than the last constant for double Unicode properties.

UNKNOWN: ICU 2.4

DOUBLE_START

public static final int DOUBLE_START
First constant for double Unicode properties.

UNKNOWN: ICU 2.4

EAST_ASIAN_WIDTH

public static final int EAST_ASIAN_WIDTH
Enumerated property East_Asian_Width. See http://www.unicode.org/reports/tr11/. Returns UCharacter.EastAsianWidth values.

UNKNOWN: ICU 2.4

EXTENDER

public static final int EXTENDER

Binary property Extender.

Extend the value or shape of a preceding alphabetic character, e.g. length and iteration marks.

UNKNOWN: ICU 2.6

FULL_COMPOSITION_EXCLUSION

public static final int FULL_COMPOSITION_EXCLUSION

Binary property Full_Composition_Exclusion.

CompositionExclusions.txt + Singleton Decompositions + Non-Starter Decompositions.

UNKNOWN: ICU 2.6

GENERAL_CATEGORY

public static final int GENERAL_CATEGORY
Enumerated property General_Category. Same as UCharacter.getType(int), returns UCharacterCategory values.

UNKNOWN: ICU 2.4

GENERAL_CATEGORY_MASK

public static final int GENERAL_CATEGORY_MASK
Bitmask property General_Category_Mask. This is the General_Category property returned as a bit mask. When used in UCharacter.getIntPropertyValue(c), returns bit masks for UCharacterCategory values where exactly one bit is set. When used with UCharacter.getPropertyValueName() and UCharacter.getPropertyValueEnum(), a multi-bit mask is used for sets of categories like "Letters".

UNKNOWN: ICU 2.4
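The single-bit versus multi-bit distinction can be illustrated with plain JDK category numbers. In this sketch the class name and LETTER_MASK are hypothetical, and the bit positions come from java.lang.Character.getType; whether they coincide with UCharacterCategory values is not assumed.

```java
// Sketch only: bit positions follow java.lang.Character category numbers.
final class CategoryMaskSketch {
    /** Mask with exactly one bit set: the bit for the code point's category. */
    static int maskOf(int codePoint) {
        return 1 << Character.getType(codePoint);
    }

    /** Multi-bit mask for the set of categories Lu, Ll, Lt, Lm and Lo ("Letters"). */
    static final int LETTER_MASK =
            (1 << Character.UPPERCASE_LETTER)
            | (1 << Character.LOWERCASE_LETTER)
            | (1 << Character.TITLECASE_LETTER)
            | (1 << Character.MODIFIER_LETTER)
            | (1 << Character.OTHER_LETTER);

    /** One AND against the set mask replaces five category comparisons. */
    static boolean isLetter(int codePoint) {
        return (maskOf(codePoint) & LETTER_MASK) != 0;
    }

    public static void main(String[] args) {
        System.out.println(Integer.bitCount(maskOf('A')));  // 1
        System.out.println(isLetter('A'));                  // true
        System.out.println(isLetter('5'));                  // false
    }
}
```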

GRAPHEME_BASE

public static final int GRAPHEME_BASE

Binary property Grapheme_Base (new).

For programmatic determination of grapheme cluster boundaries.

[0..10FFFF]-Cc-Cf-Cs-Co-Cn-Zl-Zp-Grapheme_Link-Grapheme_Extend-CGJ

UNKNOWN: ICU 2.6

GRAPHEME_CLUSTER_BREAK

public static final int GRAPHEME_CLUSTER_BREAK
Enumerated property Grapheme_Cluster_Break (new in Unicode 4.1). Used in UAX #29: Text Boundaries (http://www.unicode.org/reports/tr29/). Returns UGraphemeClusterBreak values.

UNKNOWN: ICU 3.4 This API might change or be removed in a future release.

GRAPHEME_EXTEND

public static final int GRAPHEME_EXTEND

Binary property Grapheme_Extend (new).

For programmatic determination of grapheme cluster boundaries.

Me+Mn+Mc+Other_Grapheme_Extend-Grapheme_Link-CGJ

UNKNOWN: ICU 2.6

GRAPHEME_LINK

public static final int GRAPHEME_LINK

Binary property Grapheme_Link (new).

For programmatic determination of grapheme cluster boundaries.

UNKNOWN: ICU 2.6

HANGUL_SYLLABLE_TYPE

public static final int HANGUL_SYLLABLE_TYPE
Enumerated property Hangul_Syllable_Type, new in Unicode 4. Returns HangulSyllableType values.

UNKNOWN: ICU 2.6

HEX_DIGIT

public static final int HEX_DIGIT

Binary property Hex_Digit.

Characters commonly used for hexadecimal numbers.

UNKNOWN: ICU 2.6

HYPHEN

public static final int HYPHEN

Binary property Hyphen.

Dashes used to mark connections between pieces of words, plus the Katakana middle dot.

UNKNOWN: ICU 2.6

IDEOGRAPHIC

public static final int IDEOGRAPHIC

Binary property Ideographic.

CJKV ideographs.

UNKNOWN: ICU 2.6

IDS_BINARY_OPERATOR

public static final int IDS_BINARY_OPERATOR

Binary property IDS_Binary_Operator (new).

For programmatic determination of Ideographic Description Sequences.

UNKNOWN: ICU 2.6

IDS_TRINARY_OPERATOR

public static final int IDS_TRINARY_OPERATOR

Binary property IDS_Trinary_Operator (new).

UNKNOWN: ICU 2.6

ID_CONTINUE

public static final int ID_CONTINUE

Binary property ID_Continue.

Characters that can continue an identifier.

ID_Start+Mn+Mc+Nd+Pc

UNKNOWN: ICU 2.6

ID_START

public static final int ID_START

Binary property ID_Start.

Characters that can start an identifier.

Lu+Ll+Lt+Lm+Lo+Nl

UNKNOWN: ICU 2.6
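The category formulas given for ID_Start and ID_Continue translate directly into a small identifier check. The sketch below is a JDK-only approximation with hypothetical names; it implements just the general-category unions quoted above and does not consult ICU or any contributory properties.

```java
// Sketch only: uses java.lang.Character, not ICU4J.
final class IdentifierSketch {
    /** Lu+Ll+Lt+Lm+Lo+Nl */
    static boolean isIdStart(int cp) {
        switch (Character.getType(cp)) {
            case Character.UPPERCASE_LETTER:
            case Character.LOWERCASE_LETTER:
            case Character.TITLECASE_LETTER:
            case Character.MODIFIER_LETTER:
            case Character.OTHER_LETTER:
            case Character.LETTER_NUMBER:
                return true;
            default:
                return false;
        }
    }

    /** ID_Start+Mn+Mc+Nd+Pc */
    static boolean isIdContinue(int cp) {
        if (isIdStart(cp)) {
            return true;
        }
        switch (Character.getType(cp)) {
            case Character.NON_SPACING_MARK:        // Mn
            case Character.COMBINING_SPACING_MARK:  // Mc
            case Character.DECIMAL_DIGIT_NUMBER:    // Nd
            case Character.CONNECTOR_PUNCTUATION:   // Pc
                return true;
            default:
                return false;
        }
    }

    /** True if s is one ID_Start code point followed by ID_Continue code points. */
    static boolean isIdentifier(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!isIdStart(first)) {
            return false;
        }
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!isIdContinue(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isIdentifier("x_1"));  // true
        System.out.println(isIdentifier("_x"));   // false: '_' is Pc, not ID_Start
    }
}
```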

INT_LIMIT

public static final int INT_LIMIT
One more than the last constant for enumerated/integer Unicode properties.

UNKNOWN: ICU 2.4

INT_START

public static final int INT_START
First constant for enumerated/integer Unicode properties.

UNKNOWN: ICU 2.4

ISO_COMMENT

public static final int ISO_COMMENT
String property ISO_Comment. Corresponds to UCharacter.getISOComment(int).

UNKNOWN: ICU 2.4

JOINING_GROUP

public static final int JOINING_GROUP
Enumerated property Joining_Group. Returns UCharacter.JoiningGroup values.

UNKNOWN: ICU 2.4

JOINING_TYPE

public static final int JOINING_TYPE
Enumerated property Joining_Type. Returns UCharacter.JoiningType values.

UNKNOWN: ICU 2.4

JOIN_CONTROL

public static final int JOIN_CONTROL

Binary property Join_Control.

Format controls for cursive joining and ligation.

UNKNOWN: ICU 2.6

LEAD_CANONICAL_COMBINING_CLASS

public static final int LEAD_CANONICAL_COMBINING_CLASS
Enumerated property Lead_Canonical_Combining_Class. ICU-specific property for the ccc of the first code point of the decomposition, or lccc(c)=ccc(NFD(c)[0]). Useful for checking for canonically ordered text; see Normalizer.FCD and http://www.unicode.org/notes/tn5/#FCD . Returns 8-bit numeric values like CANONICAL_COMBINING_CLASS.

UNKNOWN: ICU 3.0 This API might change or be removed in a future release.

LINE_BREAK

public static final int LINE_BREAK
Enumerated property Line_Break. Returns UCharacter.LineBreak values.

UNKNOWN: ICU 2.4

LOGICAL_ORDER_EXCEPTION

public static final int LOGICAL_ORDER_EXCEPTION

Binary property Logical_Order_Exception (new).

Characters that do not use logical order and require special handling in most processing.

UNKNOWN: ICU 2.6

LOWERCASE

public static final int LOWERCASE

Binary property Lowercase.

Same as UCharacter.isULowercase(), different from UCharacter.isLowerCase().

Ll+Other_Lowercase

UNKNOWN: ICU 2.6

LOWERCASE_MAPPING

public static final int LOWERCASE_MAPPING
String property Lowercase_Mapping. Corresponds to UCharacter.toLowerCase(String).

UNKNOWN: ICU 2.4

MASK_LIMIT

public static final int MASK_LIMIT
One more than the last constant for bit-mask Unicode properties.

UNKNOWN: ICU 2.4

MASK_START

public static final int MASK_START
First constant for bit-mask Unicode properties.

UNKNOWN: ICU 2.4

MATH

public static final int MATH

Binary property Math.

Sm+Other_Math

UNKNOWN: ICU 2.6

NAME

public static final int NAME
String property Name. Corresponds to UCharacter.getName(int).

UNKNOWN: ICU 2.4

NFC_INERT

public static final int NFC_INERT
Binary property NFC_Inert. ICU-specific property for characters that are inert under NFC, i.e., they do not interact with adjacent characters. Used for example in normalizing transforms in incremental mode to find the boundary of safely normalizable text despite possible text additions.

See Also: NFD_INERT

UNKNOWN: ICU 3.0 This API might change or be removed in a future release.

NFC_QUICK_CHECK

public static final int NFC_QUICK_CHECK
Enumerated property NFC_Quick_Check. Returns numeric values compatible with Normalizer.QuickCheckResult.

UNKNOWN: ICU 3.0 This API might change or be removed in a future release.

NFD_INERT

public static final int NFD_INERT
Binary property NFD_Inert. ICU-specific property for characters that are inert under NFD, i.e., they do not interact with adjacent characters. Used for example in normalizing transforms in incremental mode to find the boundary of safely normalizable text despite possible text additions.

There is one such property per normalization form. These properties are computed as follows. An inert character is:

a) unassigned, or ALL of the following:
b) of combining class 0.
c) not decomposed by this normalization form.
AND if NFC or NFKC,
d) can never compose with a previous character.
e) can never compose with a following character.
f) can never change if another character is added.

Example: a-breve might satisfy all but f, but if you add an ogonek it changes to a-ogonek + breve.

See also com.ibm.text.UCD.NFSkippable in the ICU4J repository, and icu/source/common/unormimp.h .

UNKNOWN: ICU 3.0 This API might change or be removed in a future release.

NFD_QUICK_CHECK

public static final int NFD_QUICK_CHECK
Enumerated property NFD_Quick_Check. Returns numeric values compatible with Normalizer.QuickCheckResult.

UNKNOWN: ICU 3.0 This API might change or be removed in a future release.

NFKC_INERT

public static final int NFKC_INERT
Binary property NFKC_Inert. ICU-specific property for characters that are inert under NFKC, i.e., they do not interact with adjacent characters. Used for example in normalizing transforms in incremental mode to find the boundary of safely normalizable text despite possible text additions.

See Also: NFD_INERT

UNKNOWN: ICU 3.0 This API might change or be removed in a future release.

NFKC_QUICK_CHECK

public static final int NFKC_QUICK_CHECK
Enumerated property NFKC_Quick_Check. Returns numeric values compatible with Normalizer.QuickCheckResult.

UNKNOWN: ICU 3.0 This API might change or be removed in a future release.

NFKD_INERT

public static final int NFKD_INERT
Binary property NFKD_Inert. ICU-specific property for characters that are inert under NFKD, i.e., they do not interact with adjacent characters. Used for example in normalizing transforms in incremental mode to find the boundary of safely normalizable text despite possible text additions.

See Also: NFD_INERT

UNKNOWN: ICU 3.0 This API might change or be removed in a future release.

NFKD_QUICK_CHECK

public static final int NFKD_QUICK_CHECK
Enumerated property NFKD_Quick_Check. Returns numeric values compatible with Normalizer.QuickCheckResult.

UNKNOWN: ICU 3.0 This API might change or be removed in a future release.

NONCHARACTER_CODE_POINT

public static final int NONCHARACTER_CODE_POINT

Binary property Noncharacter_Code_Point.

Code points that are explicitly defined as illegal for the encoding of characters.

UNKNOWN: ICU 2.6

NUMERIC_TYPE

public static final int NUMERIC_TYPE
Enumerated property Numeric_Type. Returns UCharacter.NumericType values.

UNKNOWN: ICU 2.4

NUMERIC_VALUE

public static final int NUMERIC_VALUE
Double property Numeric_Value. Corresponds to UCharacter.getUnicodeNumericValue(int).

UNKNOWN: ICU 2.4

PATTERN_SYNTAX

public static final int PATTERN_SYNTAX
Binary property Pattern_Syntax (new in Unicode 4.1). See UAX #31 Identifier and Pattern Syntax (http://www.unicode.org/reports/tr31/).

UNKNOWN: ICU 3.4 This API might change or be removed in a future release.

PATTERN_WHITE_SPACE

public static final int PATTERN_WHITE_SPACE
Binary property Pattern_White_Space (new in Unicode 4.1). See UAX #31 Identifier and Pattern Syntax (http://www.unicode.org/reports/tr31/).

UNKNOWN: ICU 3.4 This API might change or be removed in a future release.

POSIX_ALNUM

public static final int POSIX_ALNUM
Binary property alnum (a C/POSIX character class). Implemented according to the UTS #18 Annex C Standard Recommendation. See the UCharacter class documentation.

UNKNOWN: ICU 3.4 This API might change or be removed in a future release.

POSIX_BLANK

public static final int POSIX_BLANK
Binary property blank (a C/POSIX character class). Implemented according to the UTS #18 Annex C Standard Recommendation. See the UCharacter class documentation.

UNKNOWN: ICU 3.4 This API might change or be removed in a future release.

POSIX_GRAPH

public static final int POSIX_GRAPH
Binary property graph (a C/POSIX character class). Implemented according to the UTS #18 Annex C Standard Recommendation. See the UCharacter class documentation.

UNKNOWN: ICU 3.4 This API might change or be removed in a future release.

POSIX_PRINT

public static final int POSIX_PRINT
Binary property print (a C/POSIX character class). Implemented according to the UTS #18 Annex C Standard Recommendation. See the UCharacter class documentation.

UNKNOWN: ICU 3.4 This API might change or be removed in a future release.

POSIX_XDIGIT

public static final int POSIX_XDIGIT
Binary property xdigit (a C/POSIX character class). Implemented according to the UTS #18 Annex C Standard Recommendation. See the UCharacter class documentation.

UNKNOWN: ICU 3.4 This API might change or be removed in a future release.

QUOTATION_MARK

public static final int QUOTATION_MARK

Binary property Quotation_Mark.

UNKNOWN: ICU 2.6

RADICAL

public static final int RADICAL

Binary property Radical (new).

For programmatic determination of Ideographic Description Sequences.

UNKNOWN: ICU 2.6

SCRIPT

public static final int SCRIPT
Enumerated property Script. Same as UScript.getScript(int), returns UScript values.

UNKNOWN: ICU 2.4

SEGMENT_STARTER

public static final int SEGMENT_STARTER
Binary property Segment_Starter. ICU-specific property for characters that are starters in terms of Unicode normalization and combining character sequences. They have ccc=0 and do not occur in non-initial position of the canonical decomposition of any character (like the combining diaeresis in NFD(a-umlaut) and a Jamo T in an NFD(Hangul LVT)). ICU uses this property for segmenting a string for generating a set of canonically equivalent strings, e.g. for canonical closure while processing collation tailoring rules.

UNKNOWN: ICU 3.0 This API might change or be removed in a future release.

SENTENCE_BREAK

public static final int SENTENCE_BREAK
Enumerated property Sentence_Break (new in Unicode 4.1). Used in UAX #29: Text Boundaries (http://www.unicode.org/reports/tr29/). Returns USentenceBreak values.

UNKNOWN: ICU 3.4 This API might change or be removed in a future release.

SIMPLE_CASE_FOLDING

public static final int SIMPLE_CASE_FOLDING
String property Simple_Case_Folding. Corresponds to UCharacter.foldCase(int, boolean).

UNKNOWN: ICU 2.4

SIMPLE_LOWERCASE_MAPPING

public static final int SIMPLE_LOWERCASE_MAPPING
String property Simple_Lowercase_Mapping. Corresponds to UCharacter.toLowerCase(int).

UNKNOWN: ICU 2.4

SIMPLE_TITLECASE_MAPPING

public static final int SIMPLE_TITLECASE_MAPPING
String property Simple_Titlecase_Mapping. Corresponds to UCharacter.toTitleCase(int).

UNKNOWN: ICU 2.4

SIMPLE_UPPERCASE_MAPPING

public static final int SIMPLE_UPPERCASE_MAPPING
String property Simple_Uppercase_Mapping. Corresponds to UCharacter.toUpperCase(int).

UNKNOWN: ICU 2.4

SOFT_DOTTED

public static final int SOFT_DOTTED

Binary property Soft_Dotted (new).

Characters with a "soft dot", like i or j.

An accent placed on these characters causes the dot to disappear.

UNKNOWN: ICU 2.6

STRING_LIMIT

public static final int STRING_LIMIT
One more than the last constant for string Unicode properties.

UNKNOWN: ICU 2.4

STRING_START

public static final int STRING_START
First constant for string Unicode properties.

UNKNOWN: ICU 2.4

S_TERM

public static final int S_TERM
Binary property STerm (new in Unicode 4.0.1). Sentence Terminal. Used in UAX #29: Text Boundaries (http://www.unicode.org/reports/tr29/).

UNKNOWN: ICU 3.0 This API might change or be removed in a future release.

TERMINAL_PUNCTUATION

public static final int TERMINAL_PUNCTUATION

Binary property Terminal_Punctuation.

Punctuation characters that generally mark the end of textual units.

UNKNOWN: ICU 2.6

TITLECASE_MAPPING

public static final int TITLECASE_MAPPING
String property Titlecase_Mapping. Corresponds to UCharacter.toTitleCase(String).

UNKNOWN: ICU 2.4

TRAIL_CANONICAL_COMBINING_CLASS

public static final int TRAIL_CANONICAL_COMBINING_CLASS
Enumerated property Trail_Canonical_Combining_Class. ICU-specific property for the ccc of the last code point of the decomposition, or tccc(c)=ccc(NFD(c)[last]). Useful for checking for canonically ordered text; see Normalizer.FCD and http://www.unicode.org/notes/tn5/#FCD . Returns 8-bit numeric values like CANONICAL_COMBINING_CLASS.

UNKNOWN: ICU 3.0 This API might change or be removed in a future release.
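The lead and trail properties are both defined through the canonical decomposition NFD(c): one looks at its first code point, the other at its last. The sketch below only locates those two code points using java.text.Normalizer. It does not compute combining classes, since the JDK offers no ccc lookup; the class and method names are hypothetical.

```java
import java.text.Normalizer;

// Sketch only: finds NFD(c)[0] and NFD(c)[last]; the ccc of each would come from ICU.
final class DecompositionEndsSketch {
    static String nfd(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    /** First code point of the canonical decomposition. */
    static int first(int codePoint) {
        return nfd(codePoint).codePointAt(0);
    }

    /** Last code point of the canonical decomposition. */
    static int last(int codePoint) {
        String d = nfd(codePoint);
        return d.codePointBefore(d.length());
    }

    public static void main(String[] args) {
        int aUmlaut = 0x00E4;  // decomposes to 'a' + U+0308 COMBINING DIAERESIS
        System.out.printf("first=U+%04X last=U+%04X%n", first(aUmlaut), last(aUmlaut));
        // prints "first=U+0061 last=U+0308"
    }
}
```

For a character that does not decompose, both ends are the character itself, so the lead and trail values coincide with its own combining class.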

UNICODE_1_NAME

public static final int UNICODE_1_NAME
String property Unicode_1_Name. Corresponds to UCharacter.getName1_0(int).

UNKNOWN: ICU 2.4

UNIFIED_IDEOGRAPH

public static final int UNIFIED_IDEOGRAPH

Binary property Unified_Ideograph (new).

For programmatic determination of Ideographic Description Sequences.

UNKNOWN: ICU 2.6

UPPERCASE

public static final int UPPERCASE

Binary property Uppercase.

Same as UCharacter.isUUppercase(), different from UCharacter.isUpperCase().

Lu+Other_Uppercase

UNKNOWN: ICU 2.6

UPPERCASE_MAPPING

public static final int UPPERCASE_MAPPING
String property Uppercase_Mapping. Corresponds to UCharacter.toUpperCase(String).

UNKNOWN: ICU 2.4

VARIATION_SELECTOR

public static final int VARIATION_SELECTOR
Binary property Variation_Selector (new in Unicode 4.0.1). Indicates all those characters that qualify as Variation Selectors. For details on the behavior of these characters, see StandardizedVariants.html and 15.6 Variation Selectors.

UNKNOWN: ICU 3.0 This API might change or be removed in a future release.

WHITE_SPACE

public static final int WHITE_SPACE

Binary property White_Space.

Same as UCharacter.isUWhiteSpace(), different from UCharacter.isSpace() and UCharacter.isWhitespace().

Space characters+TAB+CR+LF-ZWSP-ZWNBSP

UNKNOWN: ICU 2.6
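The difference between White_Space and a Java-style whitespace test shows up on a few code points. The sketch below approximates White_Space with the JDK as the separator categories (Zs, Zl, Zp) plus U+0009..U+000D and U+0085. That composition is an assumption based on the UCD list and may differ slightly from the shorthand formula above; the class and method names are hypothetical.

```java
// Sketch only: uses java.lang.Character, not ICU4J.
final class WhiteSpaceSketch {
    /** Approximation of White_Space: Zs/Zl/Zp plus TAB..CR and NEL. */
    static boolean isWhiteSpaceApprox(int cp) {
        return Character.isSpaceChar(cp)
                || (0x0009 <= cp && cp <= 0x000D)
                || cp == 0x0085;
    }

    public static void main(String[] args) {
        int nbsp = 0x00A0;           // NO-BREAK SPACE
        int fileSeparator = 0x001C;  // INFORMATION SEPARATOR FOUR
        // NBSP: a space character, but excluded by Java's isWhitespace.
        System.out.println(isWhiteSpaceApprox(nbsp));              // true
        System.out.println(Character.isWhitespace(nbsp));          // false
        // U+001C: Java's isWhitespace accepts it, the approximation does not.
        System.out.println(isWhiteSpaceApprox(fileSeparator));     // false
        System.out.println(Character.isWhitespace(fileSeparator)); // true
    }
}
```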

WORD_BREAK

public static final int WORD_BREAK
Enumerated property Word_Break (new in Unicode 4.1). Used in UAX #29: Text Boundaries (http://www.unicode.org/reports/tr29/). Returns UWordBreakValues values.

UNKNOWN: ICU 3.4 This API might change or be removed in a future release.

XID_CONTINUE

public static final int XID_CONTINUE

Binary property XID_Continue.

ID_Continue modified to allow closure under normalization forms NFKC and NFKD.

UNKNOWN: ICU 2.6

XID_START

public static final int XID_START

Binary property XID_Start.

ID_Start modified to allow closure under normalization forms NFKC and NFKD.

UNKNOWN: ICU 2.6
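"Closure under normalization" means that a character allowed at the start of an identifier should still form a valid identifier after NFKC. The sketch below tests that condition for a single code point with java.text.Normalizer and the category-based formulas for ID_Start and ID_Continue. It is an illustration of the idea with hypothetical names, not the derivation ICU or the UCD uses.

```java
import java.text.Normalizer;

// Sketch only: category-based ID_Start/ID_Continue plus an NFKC round trip.
final class XidClosureSketch {
    /** Lu+Ll+Lt+Lm+Lo+Nl */
    static boolean isIdStart(int cp) {
        int t = Character.getType(cp);
        return t == Character.UPPERCASE_LETTER || t == Character.LOWERCASE_LETTER
                || t == Character.TITLECASE_LETTER || t == Character.MODIFIER_LETTER
                || t == Character.OTHER_LETTER || t == Character.LETTER_NUMBER;
    }

    /** ID_Start+Mn+Mc+Nd+Pc */
    static boolean isIdContinue(int cp) {
        int t = Character.getType(cp);
        return isIdStart(cp) || t == Character.NON_SPACING_MARK
                || t == Character.COMBINING_SPACING_MARK
                || t == Character.DECIMAL_DIGIT_NUMBER
                || t == Character.CONNECTOR_PUNCTUATION;
    }

    /** True if NFKC(cp) still has the shape ID_Start ID_Continue*. */
    static boolean staysIdStartUnderNfkc(int cp) {
        String n = Normalizer.normalize(new String(Character.toChars(cp)), Normalizer.Form.NFKC);
        if (n.isEmpty()) {
            return false;
        }
        int first = n.codePointAt(0);
        if (!isIdStart(first)) {
            return false;
        }
        for (int i = Character.charCount(first); i < n.length(); ) {
            int c = n.codePointAt(i);
            if (!isIdContinue(c)) {
                return false;
            }
            i += Character.charCount(c);
        }
        return true;
    }

    public static void main(String[] args) {
        // U+037A GREEK YPOGEGRAMMENI is Lm, but its NFKC form is U+0020 U+0345,
        // which begins with a space.
        System.out.println(isIdStart(0x037A));              // true
        System.out.println(staysIdStartUnderNfkc(0x037A));  // false
        System.out.println(staysIdStartUnderNfkc('a'));     // true
    }
}
```

A code point that passes the first check but fails the second is the kind of case the XID properties exist to exclude.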