Package dk.brics.automaton
Class Datatypes
- java.lang.Object
-
- dk.brics.automaton.Datatypes
-
public final class Datatypes extends java.lang.Object
Basic automata for representing common datatypes related to Unicode, XML, and XML Schema.
-
-
Field Summary
Fields (Modifier and Type, Field)
private static java.util.Map<java.lang.String,Automaton> automata
private static java.util.Set<java.lang.String> unicodeblock_names
private static java.lang.String[] unicodeblock_names_array
private static java.util.Set<java.lang.String> unicodecategory_names
private static java.lang.String[] unicodecategory_names_array
private static Automaton ws
private static java.util.Set<java.lang.String> xml_names
private static java.lang.String[] xml_names_array
-
Constructor Summary
Constructors (Modifier, Constructor)
private Datatypes()
-
Method Summary
Methods (Modifier and Type, Method, Description)
private static void buildAll()
private static java.util.Map<java.lang.String,Automaton> buildMap(java.lang.String[] exps)
static boolean exists(java.lang.String name)
Checks whether a given automaton is available.
static Automaton get(java.lang.String name)
Returns pre-built automaton.
(package private) static Automaton getWhitespaceAutomaton()
static boolean isUnicodeBlockName(java.lang.String name)
Checks whether the given string is the name of a Unicode block (see get(String)).
static boolean isUnicodeCategoryName(java.lang.String name)
Checks whether the given string is the name of a Unicode category (see get(String)).
static boolean isXMLName(java.lang.String name)
Checks whether the given string is the name of an XML / XML Schema automaton (see get(String)).
private static Automaton load(java.lang.String name)
static void main(java.lang.String[] args)
Invoke during compilation to pre-build automata.
private static Automaton makeCodePoint(int cp)
private static void put(java.util.Map<java.lang.String,Automaton> map, java.lang.String name, Automaton a)
private static void putFrom(java.lang.String name, java.util.Map<java.lang.String,Automaton> from)
private static void putWith(java.lang.String[] exps, java.util.Map<java.lang.String,Automaton> use)
private static void store(java.lang.String name, Automaton a)
-
-
-
Field Detail
-
automata
private static final java.util.Map<java.lang.String,Automaton> automata
-
ws
private static final Automaton ws
-
unicodeblock_names
private static final java.util.Set<java.lang.String> unicodeblock_names
-
unicodecategory_names
private static final java.util.Set<java.lang.String> unicodecategory_names
-
xml_names
private static final java.util.Set<java.lang.String> xml_names
-
unicodeblock_names_array
private static final java.lang.String[] unicodeblock_names_array
-
unicodecategory_names_array
private static final java.lang.String[] unicodecategory_names_array
-
xml_names_array
private static final java.lang.String[] xml_names_array
-
-
Method Detail
-
main
public static void main(java.lang.String[] args)
Invoke during compilation to pre-build automata. Automata are stored in the directory specified by the system property dk.brics.automaton.datatypes. (Default: build, relative to the current working directory.)
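The property lookup described for main can be illustrated with a minimal, JDK-only sketch. The property name and the default value come from the description; the class DatatypesBuildDir and the helper resolveOutputDir are hypothetical names introduced for illustration and are not part of the library, and the way Datatypes itself reads the property is an assumption.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration; this class is not part of dk.brics.automaton.
final class DatatypesBuildDir {
    // Property name and default value as stated in the description of main.
    static final String PROPERTY = "dk.brics.automaton.datatypes";
    static final String DEFAULT_DIR = "build";

    // Assumed lookup: fall back to the default when the property is unset,
    // then resolve the result against the current working directory.
    static Path resolveOutputDir() {
        String dir = System.getProperty(PROPERTY, DEFAULT_DIR);
        return Paths.get(dir).toAbsolutePath().normalize();
    }

    public static void main(String[] args) {
        System.out.println("Automata would be written to: " + resolveOutputDir());
    }
}
```

On the command line, a system property of this kind is normally supplied with the JVM's -D option, for example -Ddk.brics.automaton.datatypes=out/automata, where the directory name is only an example.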
-
get
public static Automaton get(java.lang.String name)
Returns pre-built automaton. Automata are loaded as resources from the class loader of the Datatypes class. (Typically, the pre-built automata are stored in the same jar as this class.)
The following automata are available:
Name: Description
NCName: NCName from XML Namespaces 1.0
QName: QName from XML Namespaces 1.0
Char: Char from XML 1.0
NameChar: NameChar from XML 1.0
URI: URI from RFC2396 with amendments from RFC2373
anyname: optional URI enclosed by brackets, followed by NCName
noap: strings not containing '@' and '%'
whitespace: optional S from XML 1.0
whitespacechar: a single whitespace character from XML 1.0
string: string from XML Schema Part 2
boolean: boolean from XML Schema Part 2
decimal: decimal from XML Schema Part 2
float: float from XML Schema Part 2
integer: integer from XML Schema Part 2
duration: duration from XML Schema Part 2
dateTime: dateTime from XML Schema Part 2
time: time from XML Schema Part 2
date: date from XML Schema Part 2
gYearMonth: gYearMonth from XML Schema Part 2
gYear: gYear from XML Schema Part 2
gMonthDay: gMonthDay from XML Schema Part 2
gDay: gDay from XML Schema Part 2
hexBinary: hexBinary from XML Schema Part 2
base64Binary: base64Binary from XML Schema Part 2
NCName2: NCName from XML Schema Part 2
NCNames: list of NCNames from XML Schema Part 2
QName2: QName from XML Schema Part 2
Nmtoken2: NMTOKEN from XML Schema Part 2
Nmtokens: NMTOKENS from XML Schema Part 2
Name2: Name from XML Schema Part 2
Names: list of Names from XML Schema Part 2
language: language from XML Schema Part 2
BasicLatin: BasicLatin block from Unicode 3.1
Latin-1Supplement: Latin-1Supplement block from Unicode 3.1
LatinExtended-A: LatinExtended-A block from Unicode 3.1
LatinExtended-B: LatinExtended-B block from Unicode 3.1
IPAExtensions: IPAExtensions block from Unicode 3.1
SpacingModifierLetters: SpacingModifierLetters block from Unicode 3.1
CombiningDiacriticalMarks: CombiningDiacriticalMarks block from Unicode 3.1
Greek: Greek block from Unicode 3.1
Cyrillic: Cyrillic block from Unicode 3.1
Armenian: Armenian block from Unicode 3.1
Hebrew: Hebrew block from Unicode 3.1
Arabic: Arabic block from Unicode 3.1
Syriac: Syriac block from Unicode 3.1
Thaana: Thaana block from Unicode 3.1
Devanagari: Devanagari block from Unicode 3.1
Bengali: Bengali block from Unicode 3.1
Gurmukhi: Gurmukhi block from Unicode 3.1
Gujarati: Gujarati block from Unicode 3.1
Oriya: Oriya block from Unicode 3.1
Tamil: Tamil block from Unicode 3.1
Telugu: Telugu block from Unicode 3.1
Kannada: Kannada block from Unicode 3.1
Malayalam: Malayalam block from Unicode 3.1
Sinhala: Sinhala block from Unicode 3.1
Thai: Thai block from Unicode 3.1
Lao: Lao block from Unicode 3.1
Tibetan: Tibetan block from Unicode 3.1
Myanmar: Myanmar block from Unicode 3.1
Georgian: Georgian block from Unicode 3.1
HangulJamo: HangulJamo block from Unicode 3.1
Ethiopic: Ethiopic block from Unicode 3.1
Cherokee: Cherokee block from Unicode 3.1
UnifiedCanadianAboriginalSyllabics: UnifiedCanadianAboriginalSyllabics block from Unicode 3.1
Ogham: Ogham block from Unicode 3.1
Runic: Runic block from Unicode 3.1
Khmer: Khmer block from Unicode 3.1
Mongolian: Mongolian block from Unicode 3.1
LatinExtendedAdditional: LatinExtendedAdditional block from Unicode 3.1
GreekExtended: GreekExtended block from Unicode 3.1
GeneralPunctuation: GeneralPunctuation block from Unicode 3.1
SuperscriptsandSubscripts: SuperscriptsandSubscripts block from Unicode 3.1
CurrencySymbols: CurrencySymbols block from Unicode 3.1
CombiningMarksforSymbols: CombiningMarksforSymbols block from Unicode 3.1
LetterlikeSymbols: LetterlikeSymbols block from Unicode 3.1
NumberForms: NumberForms block from Unicode 3.1
Arrows: Arrows block from Unicode 3.1
MathematicalOperators: MathematicalOperators block from Unicode 3.1
MiscellaneousTechnical: MiscellaneousTechnical block from Unicode 3.1
ControlPictures: ControlPictures block from Unicode 3.1
OpticalCharacterRecognition: OpticalCharacterRecognition block from Unicode 3.1
EnclosedAlphanumerics: EnclosedAlphanumerics block from Unicode 3.1
BoxDrawing: BoxDrawing block from Unicode 3.1
BlockElements: BlockElements block from Unicode 3.1
GeometricShapes: GeometricShapes block from Unicode 3.1
MiscellaneousSymbols: MiscellaneousSymbols block from Unicode 3.1
Dingbats: Dingbats block from Unicode 3.1
BraillePatterns: BraillePatterns block from Unicode 3.1
CJKRadicalsSupplement: CJKRadicalsSupplement block from Unicode 3.1
KangxiRadicals: KangxiRadicals block from Unicode 3.1
IdeographicDescriptionCharacters: IdeographicDescriptionCharacters block from Unicode 3.1
CJKSymbolsandPunctuation: CJKSymbolsandPunctuation block from Unicode 3.1
Hiragana: Hiragana block from Unicode 3.1
Katakana: Katakana block from Unicode 3.1
Bopomofo: Bopomofo block from Unicode 3.1
HangulCompatibilityJamo: HangulCompatibilityJamo block from Unicode 3.1
Kanbun: Kanbun block from Unicode 3.1
BopomofoExtended: BopomofoExtended block from Unicode 3.1
EnclosedCJKLettersandMonths: EnclosedCJKLettersandMonths block from Unicode 3.1
CJKCompatibility: CJKCompatibility block from Unicode 3.1
CJKUnifiedIdeographsExtensionA: CJKUnifiedIdeographsExtensionA block from Unicode 3.1
CJKUnifiedIdeographs: CJKUnifiedIdeographs block from Unicode 3.1
YiSyllables: YiSyllables block from Unicode 3.1
YiRadicals: YiRadicals block from Unicode 3.1
HangulSyllables: HangulSyllables block from Unicode 3.1
CJKCompatibilityIdeographs: CJKCompatibilityIdeographs block from Unicode 3.1
AlphabeticPresentationForms: AlphabeticPresentationForms block from Unicode 3.1
ArabicPresentationForms-A: ArabicPresentationForms-A block from Unicode 3.1
CombiningHalfMarks: CombiningHalfMarks block from Unicode 3.1
CJKCompatibilityForms: CJKCompatibilityForms block from Unicode 3.1
SmallFormVariants: SmallFormVariants block from Unicode 3.1
ArabicPresentationForms-B: ArabicPresentationForms-B block from Unicode 3.1
Specials: Specials block from Unicode 3.1
HalfwidthandFullwidthForms: HalfwidthandFullwidthForms block from Unicode 3.1
Specials: Specials block from Unicode 3.1
OldItalic: OldItalic block from Unicode 3.1
Gothic: Gothic block from Unicode 3.1
Deseret: Deseret block from Unicode 3.1
ByzantineMusicalSymbols: ByzantineMusicalSymbols block from Unicode 3.1
MusicalSymbols: MusicalSymbols block from Unicode 3.1
MathematicalAlphanumericSymbols: MathematicalAlphanumericSymbols block from Unicode 3.1
CJKUnifiedIdeographsExtensionB: CJKUnifiedIdeographsExtensionB block from Unicode 3.1
CJKCompatibilityIdeographsSupplement: CJKCompatibilityIdeographsSupplement block from Unicode 3.1
Tags: Tags block from Unicode 3.1
Lu: Lu category from Unicode 3.1
Ll: Ll category from Unicode 3.1
Lt: Lt category from Unicode 3.1
Lm: Lm category from Unicode 3.1
Lo: Lo category from Unicode 3.1
L: L category from Unicode 3.1
Mn: Mn category from Unicode 3.1
Mc: Mc category from Unicode 3.1
Me: Me category from Unicode 3.1
M: M category from Unicode 3.1
Nd: Nd category from Unicode 3.1
Nl: Nl category from Unicode 3.1
No: No category from Unicode 3.1
N: N category from Unicode 3.1
Pc: Pc category from Unicode 3.1
Pd: Pd category from Unicode 3.1
Ps: Ps category from Unicode 3.1
Pe: Pe category from Unicode 3.1
Pi: Pi category from Unicode 3.1
Pf: Pf category from Unicode 3.1
Po: Po category from Unicode 3.1
P: P category from Unicode 3.1
Zs: Zs category from Unicode 3.1
Zl: Zl category from Unicode 3.1
Zp: Zp category from Unicode 3.1
Z: Z category from Unicode 3.1
Sm: Sm category from Unicode 3.1
Sc: Sc category from Unicode 3.1
Sk: Sk category from Unicode 3.1
So: So category from Unicode 3.1
S: S category from Unicode 3.1
Cc: Cc category from Unicode 3.1
Cf: Cf category from Unicode 3.1
Co: Co category from Unicode 3.1
Cn: Cn category from Unicode 3.1
C: C category from Unicode 3.1
Loaded automata are cached in memory.
- Parameters:
name - name of automaton
- Returns:
automaton
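The load-once-then-cache behaviour described for get can be sketched with plain JDK classes. This is an illustration of the general pattern only: NamedCache and its loader are hypothetical stand-ins, the values are strings rather than Automaton objects, and nothing here is taken from the library's actual implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical stand-in for a name-keyed cache; not the library's code.
final class NamedCache<V> {
    private final Map<String, V> cache = new ConcurrentHashMap<>();
    private final Function<String, V> loader;

    NamedCache(Function<String, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value, invoking the loader only on the first request for a name.
    V get(String name) {
        return cache.computeIfAbsent(name, loader);
    }

    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        // The loader here only builds a string; a real loader would read a resource.
        NamedCache<String> cache = new NamedCache<>(name -> {
            loads.incrementAndGet();
            return "automaton:" + name;
        });
        String first = cache.get("NCName");
        String second = cache.get("NCName");
        System.out.println("same instance: " + (first == second));
        System.out.println("loader calls: " + loads.get());
    }
}
```

With the library on the classpath, a typical call would presumably look like Datatypes.get("NCName").run("foo"), assuming the run(String) method of Automaton is used to test acceptance.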
-
isUnicodeBlockName
public static boolean isUnicodeBlockName(java.lang.String name)
Checks whether the given string is the name of a Unicode block (see get(String)).
-
isUnicodeCategoryName
public static boolean isUnicodeCategoryName(java.lang.String name)
Checks whether the given string is the name of a Unicode category (see get(String)).
-
isXMLName
public static boolean isXMLName(java.lang.String name)
Checks whether the given string is the name of an XML / XML Schema automaton (see get(String)).
-
exists
public static boolean exists(java.lang.String name)
Checks whether a given automaton is available.
- Parameters:
name - automaton name
- Returns:
true if the automaton is available
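The three name checks (isUnicodeBlockName, isUnicodeCategoryName, isXMLName) amount to membership tests against the name lists documented under get(String). The sketch below shows that idea with small excerpts of those lists. The class NameKinds and the method kindOf are hypothetical, the sets hold only a few sample names, and the case-sensitive lookup is an assumption of this sketch rather than documented behaviour.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration; this class is not part of dk.brics.automaton.
final class NameKinds {
    // Short excerpts of the documented name lists; the real lists are much longer.
    static final Set<String> BLOCKS =
            new HashSet<>(Arrays.asList("BasicLatin", "Greek", "Cyrillic"));
    static final Set<String> CATEGORIES =
            new HashSet<>(Arrays.asList("Lu", "Ll", "Nd"));
    static final Set<String> XML_NAMES =
            new HashSet<>(Arrays.asList("NCName", "QName", "dateTime"));

    // Classifies a name by exact (case-sensitive) set membership.
    static String kindOf(String name) {
        if (BLOCKS.contains(name)) return "Unicode block";
        if (CATEGORIES.contains(name)) return "Unicode category";
        if (XML_NAMES.contains(name)) return "XML / XML Schema";
        return "unknown";
    }

    public static void main(String[] args) {
        for (String name : new String[] {"Greek", "Lu", "dateTime", "greek"}) {
            System.out.println(name + " -> " + kindOf(name));
        }
    }
}
```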
-
load
private static Automaton load(java.lang.String name)
-
store
private static void store(java.lang.String name, Automaton a)
-
buildAll
private static void buildAll()
-
makeCodePoint
private static Automaton makeCodePoint(int cp)
-
buildMap
private static java.util.Map<java.lang.String,Automaton> buildMap(java.lang.String[] exps)
-
putWith
private static void putWith(java.lang.String[] exps, java.util.Map<java.lang.String,Automaton> use)
-
putFrom
private static void putFrom(java.lang.String name, java.util.Map<java.lang.String,Automaton> from)
-
put
private static void put(java.util.Map<java.lang.String,Automaton> map, java.lang.String name, Automaton a)
-
getWhitespaceAutomaton
static Automaton getWhitespaceAutomaton()
-
-