This class is how the parser talks to its input entities of all kinds.
The entities are kept in a stack.
For internal entities, the character arrays are referenced here
and read from as needed (they're read-only). External entities have
mutable buffers that are filled as needed.
Note: This maps CRLF (and CR) to LF without regard for
whether it's in an external (parsed) entity or not. The XML 1.0 spec
is inconsistent in explaining EOL handling; this is the sensible way.
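The end-of-line mapping described above can be sketched roughly as follows. This is a standalone illustration under stated assumptions, not the class's actual code: the names EolSketch and normalize are hypothetical, and the sketch works on a whole String rather than on an entity buffer.

```java
// Hypothetical helper for illustration only; not part of the documented class.
final class EolSketch {
    /** Maps each CRLF pair and each lone CR to a single LF. */
    static String normalize(String in) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            if (c == '\r') {
                // Swallow the LF of a CRLF pair so the pair yields one LF, not two.
                if (i + 1 < in.length() && in.charAt(i + 1) == '\n') {
                    i++;
                }
                out.append('\n');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

For example, EolSketch.normalize("a\r\nb\rc") yields the three lines a, b and c separated by LF only.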
getColumnNumber
public int getColumnNumber()
Returns -1, since maintaining column numbers hurts performance.
getEncoding
public String getEncoding()
Returns the name of the encoding in use, else null; the name
returned is in as standard a form as we can get.
getLineNumber
public int getLineNumber()
Returns the current line number in this input source
getName
public String getName()
getNameChar
public char getNameChar()
throws IOException,
SAXException
Returns the next name character, or NUL if the next character is not
a name character. This is faster than getc(), and the common
"name or nmtoken must be next" case won't need ungetc().
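A minimal sketch of the "name character or NUL" idea, to show why no pushback is needed when the next character is not part of a name. Everything here is an assumption for illustration: the class NameCharSketch, its readName() and position() helpers, and in particular isNameChar(), which is a simplified stand-in for the XML 1.0 NameChar production rather than the full definition.

```java
// Hypothetical illustration; not the documented class.
final class NameCharSketch {
    private final char[] buf;
    private int pos;

    NameCharSketch(String text) {
        buf = text.toCharArray();
    }

    // Simplified approximation of XML NameChar (assumption, not the full production).
    static boolean isNameChar(char c) {
        return Character.isLetterOrDigit(c) || c == '.' || c == '-' || c == '_' || c == ':';
    }

    /** Consumes and returns the next name character, or returns NUL (0) and consumes nothing. */
    char getNameChar() {
        if (pos < buf.length && isNameChar(buf[pos])) {
            return buf[pos++];
        }
        return 0;
    }

    /** Collects consecutive name characters; returns "" if none start here. */
    String readName() {
        StringBuilder sb = new StringBuilder();
        for (char c; (c = getNameChar()) != 0; ) {
            sb.append(c);
        }
        return sb.toString();
    }

    int position() {
        return pos;
    }
}
```

Because a non-name character is never consumed, the loop in readName() stops cleanly at the delimiter without having to un-read it.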
getPublicId
public String getPublicId()
Returns the public ID of this input source, if known
getSystemId
public String getSystemId()
Returns the system ID of this input source, if known
getc
public char getc()
throws IOException,
SAXException
Gets the next Java character. It might be part of an XML
text character represented by a surrogate pair, or it might be
the end of the entity.
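Since one XML text character can span two Java chars, a caller that cares about whole characters has to pair surrogates itself. The following sketch shows that pairing with the standard java.lang.Character methods; the class SurrogateSketch and its method are hypothetical names introduced only for this example.

```java
// Hypothetical illustration; not part of the documented class.
final class SurrogateSketch {
    /** Counts XML text characters (code points) in a run of Java chars. */
    static int countTextChars(char[] chars) {
        int count = 0;
        for (int i = 0; i < chars.length; i++) {
            boolean pair = Character.isHighSurrogate(chars[i])
                    && i + 1 < chars.length
                    && Character.isLowSurrogate(chars[i + 1]);
            if (pair) {
                i++; // the high/low pair together encode one text character
            }
            count++;
        }
        return count;
    }
}
```

An unpaired surrogate is counted as one character here; a real parser would more likely report it as an error.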
ignorableWhitespace
public boolean ignorableWhitespace(DTDEventListener handler)
throws IOException,
SAXException
Handles whitespace in markup (flagged to the application, discardable).
The handler's ignorableWhitespace() method
is called on all the whitespace found.
init
public void init(InputSource in,
String name,
InputEntity stack,
boolean isPE)
throws IOException,
SAXException
init
public void init(char[] b,
String name,
InputEntity stack,
boolean isPE)
throws SAXException
isDocument
public boolean isDocument()
isEOF
public boolean isEOF()
throws IOException,
SAXException
returns true iff there's no more data to consume ...
isInternal
public boolean isInternal()
isParameterEntity
public boolean isParameterEntity()
maybeWhitespace
public boolean maybeWhitespace()
throws IOException,
SAXException
Skips optional grammatical whitespace (discarded).
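A small sketch of skipping optional whitespace, assuming the XML 1.0 definition of whitespace (space, tab, CR, LF). The class WhitespaceSketch is hypothetical, and the meaning of the boolean result (true if any whitespace was skipped) is an assumption, since the description above does not state it.

```java
// Hypothetical illustration; not the documented class.
final class WhitespaceSketch {
    private final char[] buf;
    private int pos;

    WhitespaceSketch(String text) {
        buf = text.toCharArray();
    }

    /** XML 1.0 whitespace: space, tab, carriage return, line feed. */
    static boolean isXmlSpace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    /** Discards any whitespace at the current position; true if some was found (assumed semantics). */
    boolean maybeWhitespace() {
        int start = pos;
        while (pos < buf.length && isXmlSpace(buf[pos])) {
            pos++;
        }
        return pos > start;
    }

    int position() {
        return pos;
    }
}
```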
parsedContent
public boolean parsedContent(DTDEventListener docHandler)
throws IOException,
SAXException
Normal content; whitespace in markup may be handled
specially if the parser uses the content model.
Content terminates with markup delimiter characters,
namely ampersand (&) and left angle bracket (<).
The document handler's characters() method is called
on all the content found.
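The termination rule can be illustrated with a short scan for the first markup delimiter. This is only a sketch: ContentSketch and contentEnd are hypothetical names, and the real method also deals with buffer refills and handler callbacks, which are left out here.

```java
// Hypothetical illustration; not part of the documented class.
final class ContentSketch {
    /** Returns the index of the first '&' or '<' at or after start, or buf.length if none. */
    static int contentEnd(char[] buf, int start) {
        int i = start;
        while (i < buf.length && buf[i] != '&' && buf[i] != '<') {
            i++;
        }
        return i;
    }
}
```

The chars in [start, contentEnd) form one run of character content that could be handed to a characters() callback in a single call.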
peek
public boolean peek(String next,
char[] chars)
throws IOException,
SAXException
Returns false iff the upcoming input does not match the 'next'
string; otherwise skips that text and returns true.
NOTE: two alternative representations of the string are
both passed in, since one is faster.
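The match-then-skip behaviour can be sketched as below. PeekSketch is a hypothetical class, and for brevity its peek() takes only the String form rather than the two representations the documented method accepts.

```java
// Hypothetical illustration; not the documented class.
final class PeekSketch {
    private final char[] buf;
    private int pos;

    PeekSketch(String text) {
        buf = text.toCharArray();
    }

    /** If the input at the current position equals next, skips it and returns true; else leaves the position alone. */
    boolean peek(String next) {
        int len = next.length();
        if (pos + len > buf.length) {
            return false;
        }
        for (int i = 0; i < len; i++) {
            if (buf[pos + i] != next.charAt(i)) {
                return false;
            }
        }
        pos += len;
        return true;
    }

    int position() {
        return pos;
    }
}
```

A caller can chain such tests, for example trying "<!ATTLIST" and then "<!ELEMENT", because a failed match consumes nothing.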
peekc
public boolean peekc(char c)
throws IOException,
SAXException
Looks ahead one character.
rememberText
public String rememberText()
startRemembering
public void startRemembering()
ungetc
public void ungetc()
Two-character pushback is guaranteed.
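One way such a guarantee can be kept across buffer refills is to carry the last two consumed characters over to the front of the buffer. The sketch below shows that idea under stated assumptions: PushbackSketch, its fields, and the use of a String as a stand-in for the underlying stream are all hypothetical, and this is not claimed to be how the documented class does it.

```java
// Hypothetical illustration of a refillable buffer with two-character pushback.
final class PushbackSketch {
    private static final int KEEP = 2;   // pushback depth preserved across refills
    private final String source;         // stands in for the underlying stream
    private int srcPos;                  // next unread index in source
    private final char[] buf;
    private int start;                   // index of the next char to return
    private int finish;                  // one past the last valid char

    PushbackSketch(String source, int bufSize) {
        if (bufSize <= KEEP) {
            throw new IllegalArgumentException("buffer must hold more than " + KEEP + " chars");
        }
        this.source = source;
        this.buf = new char[bufSize];
    }

    /** Returns the next char, or 0 at end of input. */
    char getc() {
        if (start == finish && !fill()) {
            return 0;
        }
        return buf[start++];
    }

    /** Steps back over the most recently returned char. */
    void ungetc() {
        if (start == 0) {
            throw new IllegalStateException("nothing to push back");
        }
        start--;
    }

    private boolean fill() {
        // Move the last KEEP consumed chars to the front so ungetc() still works after the refill.
        int keep = Math.min(KEEP, start);
        System.arraycopy(buf, start - keep, buf, 0, keep);
        int n = Math.min(buf.length - keep, source.length() - srcPos);
        source.getChars(srcPos, srcPos + n, buf, keep);
        srcPos += n;
        start = keep;
        finish = keep + n;
        return n > 0;
    }
}
```

With a deliberately tiny buffer (size 3), reading four characters forces a refill, and two calls to ungetc() still succeed afterwards.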
unparsedContent
public boolean unparsedContent(DTDEventListener docHandler,
boolean ignorableWhitespace,
String whitespaceInvalidMessage)
throws IOException,
SAXException
CDATA -- character data, terminated by "]]>" and optionally
including unescaped markup delimiters (ampersand and left angle
bracket). This should otherwise be exactly like character data,
modulo differences in error report details.
The document handler's characters() or ignorableWhitespace()
methods are invoked on all the character data found
docHandler - gets callbacks for character data
ignorableWhitespace - if true, whitespace characters will
be reported using docHandler.ignorableWhitespace(); implicitly,
non-whitespace characters will cause validation errors
whitespaceInvalidMessage - if given, ignorable whitespace
causes a validity error report as well as a callback
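The CDATA termination rule can be shown with a short sketch: everything up to the first "]]>" is character data, including any unescaped & and <. CdataSketch and cdataBody are hypothetical names, and the sketch ignores the handler callbacks and the whitespace parameters described above.

```java
// Hypothetical illustration; not part of the documented class.
final class CdataSketch {
    /** Returns the text from start up to the first "]]>", or null if the terminator is missing. */
    static String cdataBody(String text, int start) {
        int end = text.indexOf("]]>", start);
        return end < 0 ? null : text.substring(start, end);
    }
}
```

Here start is assumed to point just past the opening "<![CDATA[" delimiter, which is nine characters long.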