This class is responsible for scanning the structure and content
of document fragments. The scanner acts as the source for the
document information which is communicated to the document handler.
This component requires the following features and properties from the
component manager that uses it:
- http://xml.org/sax/features/validation
- http://apache.org/xml/features/scanner/notify-char-refs
- http://apache.org/xml/features/scanner/notify-builtin-refs
- http://apache.org/xml/properties/internal/symbol-table
- http://apache.org/xml/properties/internal/error-reporter
- http://apache.org/xml/properties/internal/entity-manager
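The relationship between a component and its component manager can be sketched as follows. This is a minimal illustration only: `ToyManager` and `ToyScanner` are hypothetical stand-ins written for this example, not the Xerces types, and the assumption that an unrecognized identifier raises an exception is a choice made for the sketch. Only the identifier strings are taken from the list above.

```java
import java.util.HashMap;
import java.util.Map;

final class ComponentManagerSketch {
    // Identifiers copied from the list of required features and properties.
    static final String VALIDATION = "http://xml.org/sax/features/validation";
    static final String NOTIFY_CHAR_REFS =
            "http://apache.org/xml/features/scanner/notify-char-refs";
    static final String SYMBOL_TABLE =
            "http://apache.org/xml/properties/internal/symbol-table";

    /** Hypothetical stand-in for a component manager (not the Xerces type). */
    static final class ToyManager {
        final Map<String, Boolean> features = new HashMap<>();
        final Map<String, Object> properties = new HashMap<>();

        boolean getFeature(String id) {
            Boolean state = features.get(id);
            if (state == null) {
                throw new IllegalArgumentException("unrecognized feature: " + id);
            }
            return state;
        }

        Object getProperty(String id) {
            Object value = properties.get(id);
            if (value == null) {
                throw new IllegalArgumentException("unrecognized property: " + id);
            }
            return value;
        }
    }

    /** Hypothetical component that caches its settings when it is reset. */
    static final class ToyScanner {
        boolean validation;
        boolean notifyCharRefs;
        Object symbolTable;

        void reset(ToyManager manager) {
            // Features are boolean flags; properties are arbitrary objects.
            validation = manager.getFeature(VALIDATION);
            notifyCharRefs = manager.getFeature(NOTIFY_CHAR_REFS);
            symbolTable = manager.getProperty(SYMBOL_TABLE);
        }
    }
}
```

In this sketch the component reads everything it needs once, at reset time, so later scanning code can consult plain fields instead of performing a lookup per event.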
DEBUG_CONTENT_SCANNING
protected static final boolean DEBUG_CONTENT_SCANNING
Debug content dispatcher scanning.
ENTITY_RESOLVER
protected static final String ENTITY_RESOLVER
Property identifier: entity resolver.
NAMESPACES
protected static final String NAMESPACES
Feature identifier: namespaces.
NOTIFY_BUILTIN_REFS
protected static final String NOTIFY_BUILTIN_REFS
Feature identifier: notify built-in references.
SCANNER_STATE_CDATA
protected static final int SCANNER_STATE_CDATA
Scanner state: CDATA section.
SCANNER_STATE_COMMENT
protected static final int SCANNER_STATE_COMMENT
Scanner state: comment.
SCANNER_STATE_CONTENT
protected static final int SCANNER_STATE_CONTENT
Scanner state: content.
SCANNER_STATE_DOCTYPE
protected static final int SCANNER_STATE_DOCTYPE
Scanner state: DOCTYPE.
SCANNER_STATE_END_OF_INPUT
protected static final int SCANNER_STATE_END_OF_INPUT
Scanner state: end of input.
SCANNER_STATE_PI
protected static final int SCANNER_STATE_PI
Scanner state: processing instruction.
SCANNER_STATE_REFERENCE
protected static final int SCANNER_STATE_REFERENCE
Scanner state: reference.
SCANNER_STATE_ROOT_ELEMENT
protected static final int SCANNER_STATE_ROOT_ELEMENT
Scanner state: root element.
SCANNER_STATE_START_OF_MARKUP
protected static final int SCANNER_STATE_START_OF_MARKUP
Scanner state: start of markup.
SCANNER_STATE_TERMINATED
protected static final int SCANNER_STATE_TERMINATED
Scanner state: terminated.
SCANNER_STATE_TEXT_DECL
protected static final int SCANNER_STATE_TEXT_DECL
Scanner state: text declaration.
fAttributeQName
protected final org.apache.xerces.xni.QName fAttributeQName
Attribute QName.
fCurrentElement
protected org.apache.xerces.xni.QName fCurrentElement
Current element.
fDocumentHandler
protected org.apache.xerces.xni.XMLDocumentHandler fDocumentHandler
Document handler.
fElementQName
protected final org.apache.xerces.xni.QName fElementQName
Element QName.
fEntityStack
protected int[] fEntityStack
Entity stack.
fHasExternalDTD
protected boolean fHasExternalDTD
Has external DTD.
fInScanContent
protected boolean fInScanContent
SubScanner state: inside scanContent method.
fMarkupDepth
protected int fMarkupDepth
Markup depth.
fNotifyBuiltInRefs
protected boolean fNotifyBuiltInRefs
Notify built-in references.
fScannerState
protected int fScannerState
Scanner state.
fStandalone
protected boolean fStandalone
Standalone.
fTempString
protected final org.apache.xerces.xni.XMLString fTempString
String.
fTempString2
protected final org.apache.xerces.xni.XMLString fTempString2
String.
endEntity
public void endEntity(String name,
org.apache.xerces.xni.Augmentations augs)
throws org.apache.xerces.xni.XNIException
This method notifies the end of an entity. The DTD has the pseudo-name
of "[dtd]"; parameter entity names start with '%'; and general entities
are just specified by their name.
- endEntity in interface XMLEntityHandler
- endEntity in interface XMLScanner
name
- The name of the entity.
augs
- Additional information that may include infoset augmentations.
org.apache.xerces.xni.XNIException
- Thrown by handler to signal an error.
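The entity naming convention described for this method can be illustrated with a small classifier. This is a sketch under the assumption that only the three cases named in the description matter; the class and enum names are invented for the example and are not part of the scanner.

```java
final class EntityNameSketch {
    /** Hypothetical categories matching the three cases in the description. */
    enum Kind { DTD, PARAMETER, GENERAL }

    /** Classifies an entity name as reported to startEntity or endEntity. */
    static Kind classify(String name) {
        if ("[dtd]".equals(name)) {
            return Kind.DTD;            // pseudo-name of the DTD
        }
        if (name.startsWith("%")) {
            return Kind.PARAMETER;      // parameter entity names start with '%'
        }
        return Kind.GENERAL;            // general entities use their plain name
    }
}
```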
getDocumentHandler
public org.apache.xerces.xni.XMLDocumentHandler getDocumentHandler()
Returns the document handler.
getFeatureDefault
public Boolean getFeatureDefault(String featureId)
Returns the default state for a feature, or null if this
component does not want to report a default value for this
feature.
- getFeatureDefault in interface org.apache.xerces.xni.parser.XMLComponent
featureId
- The feature identifier.
getPropertyDefault
public Object getPropertyDefault(String propertyId)
Returns the default state for a property, or null if this
component does not want to report a default value for this
property.
- getPropertyDefault in interface org.apache.xerces.xni.parser.XMLComponent
propertyId
- The property identifier.
getRecognizedFeatures
public String[] getRecognizedFeatures()
Returns a list of feature identifiers that are recognized by
this component. This method may return null if no features
are recognized by this component.
- getRecognizedFeatures in interface org.apache.xerces.xni.parser.XMLComponent
getRecognizedProperties
public String[] getRecognizedProperties()
Returns a list of property identifiers that are recognized by
this component. This method may return null if no properties
are recognized by this component.
- getRecognizedProperties in interface org.apache.xerces.xni.parser.XMLComponent
getScannerStateName
protected String getScannerStateName(int state)
Returns the scanner state name.
handleEndElement
protected int handleEndElement(org.apache.xerces.xni.QName element,
boolean isEmpty)
throws org.apache.xerces.xni.XNIException
Handles the end element. This method will make sure that
the end element name matches the current element and notify
the handler about the end of the element and the end of any
relevant prefix mappings.
Note: This method uses the fQName variable.
The contents of this variable will be destroyed.
org.apache.xerces.xni.XNIException
- Thrown if the handler throws a SAX exception
upon notification.
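The name check described for handleEndElement can be sketched with a simple stack of open elements. This is an illustration only: the class below is hypothetical, it tracks raw names as strings rather than QName objects, and returning the remaining depth is an assumption made for the example rather than a statement about the real return value.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class EndElementSketch {
    private final Deque<String> open = new ArrayDeque<>();

    /** Records that an element has been opened. */
    void startElement(String rawName) {
        open.push(rawName);
    }

    /**
     * Checks that the end tag matches the current element, closes it,
     * and returns the number of elements that remain open.
     */
    int endElement(String rawName) {
        String current = open.peek();
        if (current == null || !current.equals(rawName)) {
            throw new IllegalStateException(
                    "end tag </" + rawName + "> does not match open element " + current);
        }
        open.pop();
        return open.size();
    }
}
```

A mismatch is reported before the stack is modified, so the sketch leaves its state intact when an error is raised.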
reset
public void reset(org.apache.xerces.xni.parser.XMLComponentManager componentManager)
throws org.apache.xerces.xni.parser.XMLConfigurationException
Resets the component. The component can query the component manager
about any features and properties that affect the operation of the
component.
- reset in interface org.apache.xerces.xni.parser.XMLComponent
- reset in interface XMLScanner
componentManager
- The component manager.
scanAttribute
protected void scanAttribute(org.apache.xerces.xni.XMLAttributes attributes)
throws IOException,
org.apache.xerces.xni.XNIException
Scans an attribute.
[41] Attribute ::= Name Eq AttValue
Note: This method assumes that the next
character on the stream is the first character of the attribute
name.
Note: This method uses the fAttributeQName and
fQName variables. The contents of these variables will be
destroyed.
attributes
- The attributes list for the scanned attribute.
scanCDATASection
protected boolean scanCDATASection(boolean complete)
throws IOException,
org.apache.xerces.xni.XNIException
Scans a CDATA section.
Note: This method uses the fTempString and
fStringBuffer variables.
complete
- True if the CDATA section is to be scanned
completely.
- True if CDATA is completely scanned.
scanCharReference
protected void scanCharReference()
throws IOException,
org.apache.xerces.xni.XNIException
Scans a character reference.
[66] CharRef ::= '&#' [0-9]+ ';' | '&#x' [0-9a-fA-F]+ ';'
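Production [66] of the XML 1.0 specification allows two forms: '&#' followed by decimal digits, or '&#x' followed by hexadecimal digits, each terminated by ';'. The following is a minimal sketch of that syntax check on a complete string. It is not the scanner's implementation (which reads from an entity stream), and it does not apply the separate rule that the referenced code point must be a legal XML character. The class and method names are hypothetical.

```java
final class CharRefSketch {
    /**
     * Parses a complete character reference such as "&#65;" or "&#x41;".
     * Returns the code point, or -1 if the text does not match production [66]
     * or the value exceeds the Unicode range.
     */
    static int parse(String ref) {
        if (!ref.startsWith("&#") || !ref.endsWith(";")) {
            return -1;
        }
        boolean hex = ref.startsWith("&#x");
        String digits = ref.substring(hex ? 3 : 2, ref.length() - 1);
        if (digits.isEmpty()) {
            return -1;                       // at least one digit is required
        }
        int radix = hex ? 16 : 10;
        int value = 0;
        for (int i = 0; i < digits.length(); i++) {
            int d = digit(digits.charAt(i), hex);
            if (d < 0) {
                return -1;                   // character outside the allowed set
            }
            value = value * radix + d;
            if (value > Character.MAX_CODE_POINT) {
                return -1;                   // stop before the int can overflow
            }
        }
        return value;
    }

    /** ASCII-only digit conversion, so that only [0-9a-fA-F] is accepted. */
    private static int digit(char c, boolean hex) {
        if (c >= '0' && c <= '9') return c - '0';
        if (hex && c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (hex && c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }
}
```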
scanComment
protected void scanComment()
throws IOException,
org.apache.xerces.xni.XNIException
Scans a comment.
[15] Comment ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'
Note: Called after scanning past '<!--'
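Production [15] amounts to two constraints on the text between '<!--' and '-->': it may not contain "--", and it may not end with '-'. A minimal sketch of that check on an already extracted comment body follows. The class name is hypothetical, and the check for legal XML characters (Char) is omitted.

```java
final class CommentBodySketch {
    /**
     * Returns true if the text between "<!--" and "-->" satisfies
     * production [15]: every '-' must be followed by a character
     * other than '-'.
     */
    static boolean isValidBody(String body) {
        return !body.contains("--") && !body.endsWith("-");
    }
}
```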
scanContent
protected int scanContent()
throws IOException,
org.apache.xerces.xni.XNIException
Scans element content.
- Returns the next character on the stream.
scanDocument
public boolean scanDocument(boolean complete)
throws IOException,
org.apache.xerces.xni.XNIException
Scans a document.
complete
- True if the scanner should scan the document
completely, pushing all events to the registered
document handler. A value of false indicates
that the scanner should only scan the next portion
of the document and return. A scanner instance is
permitted to completely scan a document if it does
not support this "pull" scanning model.
- True if there is more to scan, false otherwise.
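The "pull" model described for the complete parameter can be sketched as a loop that calls the scanner repeatedly until it reports that nothing is left. The types below are hypothetical stand-ins: `ToyScanner` works on a list of pre-split chunks rather than real XML input, and treating one chunk as one "portion" is an assumption made for the example.

```java
import java.util.Iterator;
import java.util.List;

final class PullScanSketch {
    /** Hypothetical stand-in for a scanner with a pull-style scanDocument. */
    interface Scanner {
        boolean scanDocument(boolean complete);
    }

    /** Toy scanner that reports one chunk per call when complete is false. */
    static final class ToyScanner implements Scanner {
        private final Iterator<String> chunks;
        private final List<String> events;

        ToyScanner(List<String> chunks, List<String> events) {
            this.chunks = chunks.iterator();
            this.events = events;
        }

        @Override
        public boolean scanDocument(boolean complete) {
            do {
                if (!chunks.hasNext()) {
                    return false;            // nothing more to scan
                }
                events.add(chunks.next());   // "push" one event to the handler
            } while (complete);              // keep going only in push mode
            return chunks.hasNext();
        }
    }

    /** Drives a scanner in pull mode and returns the number of calls made. */
    static int drive(Scanner scanner) {
        int calls = 0;
        boolean more;
        do {
            more = scanner.scanDocument(false);
            calls++;
        } while (more);
        return calls;
    }
}
```

Passing true instead would make the toy scanner consume every chunk in a single call, which mirrors the case of a scanner that does not support incremental scanning.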
scanEndElement
protected int scanEndElement()
throws IOException,
org.apache.xerces.xni.XNIException
Scans an end element.
[42] ETag ::= '</' Name S? '>'
Note: This method uses the fElementQName variable.
The contents of this variable will be destroyed. The caller should
copy the needed information out of this variable before calling
this method.
scanEntityReference
protected void scanEntityReference()
throws IOException,
org.apache.xerces.xni.XNIException
Scans an entity reference.
org.apache.xerces.xni.XNIException
- Thrown if handler throws exception upon
notification.
scanPIData
protected void scanPIData(String target,
org.apache.xerces.xni.XMLString data)
throws IOException,
org.apache.xerces.xni.XNIException
Scans processing instruction data. This is needed to handle the situation
where a document starts with a processing instruction whose
target name starts with "xml". (e.g. xmlfoo)
- scanPIData in interface XMLScanner
target
- The PI target
data
- The string to fill in with the data
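The situation mentioned for scanPIData arises because the XML 1.0 specification reserves only the exact target "xml" (in any letter case), while a longer name such as "xmlfoo" is an ordinary processing instruction target. A minimal sketch of that distinction follows; the class and method names are hypothetical and do not describe how the scanner itself makes the decision.

```java
final class PITargetSketch {
    /** True if the target is exactly "xml" in any combination of letter case. */
    static boolean isReserved(String target) {
        return target.length() == 3
                && (target.charAt(0) == 'x' || target.charAt(0) == 'X')
                && (target.charAt(1) == 'm' || target.charAt(1) == 'M')
                && (target.charAt(2) == 'l' || target.charAt(2) == 'L');
    }

    /** True if the target begins with "xml" but is a longer, ordinary name. */
    static boolean startsWithXmlOnly(String target) {
        return target.startsWith("xml") && !isReserved(target);
    }
}
```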
scanStartElement
protected boolean scanStartElement()
throws IOException,
org.apache.xerces.xni.XNIException
Scans a start element. This method will handle the binding of
namespace information and notifying the handler of the start
of the element.
[44] EmptyElemTag ::= '<' Name (S Attribute)* S? '/>'
[40] STag ::= '<' Name (S Attribute)* S? '>'
Note: This method assumes that the leading
'<' character has been consumed.
Note: This method uses the fElementQName and
fAttributes variables. The contents of these variables will be
destroyed. The caller should copy important information out of
these variables before calling this method.
- True if element is empty (i.e., it matches
production [44]).
scanStartElementAfterName
protected boolean scanStartElementAfterName()
throws IOException,
org.apache.xerces.xni.XNIException
Scans the remainder of a start or empty tag after the element name.
- True if element is empty.
scanStartElementName
protected void scanStartElementName()
throws IOException,
org.apache.xerces.xni.XNIException
Scans the name of an element in a start or empty tag.
scanXMLDeclOrTextDecl
protected void scanXMLDeclOrTextDecl(boolean scanningTextDecl)
throws IOException,
org.apache.xerces.xni.XNIException
Scans an XML or text declaration.
[23] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
[24] VersionInfo ::= S 'version' Eq ("'" VersionNum "'" | '"' VersionNum '"')
[80] EncodingDecl ::= S 'encoding' Eq ('"' EncName '"' | "'" EncName "'" )
[81] EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
[32] SDDecl ::= S 'standalone' Eq (("'" ('yes' | 'no') "'")
| ('"' ('yes' | 'no') '"'))
[77] TextDecl ::= '<?xml' VersionInfo? EncodingDecl S? '?>'
scanningTextDecl
- True if a text declaration is to
be scanned instead of an XML
declaration.
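Two points from these productions lend themselves to a short sketch: the shape of an encoding name in [81], and the difference between [23] and [77] in which parts are required. The code below is an illustration only. The class and method names are hypothetical, it works on values that have already been extracted, and it does not check the order of the parts.

```java
import java.util.regex.Pattern;

final class DeclSketch {
    // [81] EncName: a Latin letter followed by letters, digits, '.', '_' or '-'.
    private static final Pattern ENC_NAME =
            Pattern.compile("[A-Za-z][A-Za-z0-9._-]*");

    /** True if the string matches production [81]. */
    static boolean isEncName(String s) {
        return ENC_NAME.matcher(s).matches();
    }

    /**
     * True if the combination of parts is allowed.
     * XMLDecl [23]: version required; encoding and standalone optional.
     * TextDecl [77]: encoding required; version optional; no standalone.
     */
    static boolean hasAllowedParts(boolean scanningTextDecl, boolean hasVersion,
                                   boolean hasEncoding, boolean hasStandalone) {
        if (scanningTextDecl) {
            return hasEncoding && !hasStandalone;
        }
        return hasVersion;
    }
}
```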
setDocumentHandler
public void setDocumentHandler(org.apache.xerces.xni.XMLDocumentHandler documentHandler)
Sets the document handler.
setFeature
public void setFeature(String featureId,
boolean state)
throws org.apache.xerces.xni.parser.XMLConfigurationException
Sets the state of a feature. This method is called by the component
manager any time after reset when a feature changes state.
Note: Components should silently ignore features
that do not affect the operation of the component.
- setFeature in interface org.apache.xerces.xni.parser.XMLComponent
- setFeature in interface XMLScanner
featureId
- The feature identifier.
state
- The state of the feature.
setInputSource
public void setInputSource(org.apache.xerces.xni.parser.XMLInputSource inputSource)
throws IOException
Sets the input source.
inputSource
- The input source.
setProperty
public void setProperty(String propertyId,
Object value)
throws org.apache.xerces.xni.parser.XMLConfigurationException
Sets the value of a property. This method is called by the component
manager any time after reset when a property changes value.
Note: Components should silently ignore properties
that do not affect the operation of the component.
- setProperty in interface org.apache.xerces.xni.parser.XMLComponent
- setProperty in interface XMLScanner
propertyId
- The property identifier.
value
- The value of the property.
setScannerState
protected final void setScannerState(int state)
Sets the scanner state.
state
- The new scanner state.
startEntity
public void startEntity(String name,
org.apache.xerces.xni.XMLResourceIdentifier identifier,
String encoding,
org.apache.xerces.xni.Augmentations augs)
throws org.apache.xerces.xni.XNIException
This method notifies of the start of an entity. The DTD has the
pseudo-name of "[dtd]"; parameter entity names start with '%'; and
general entities are just specified by their name.
- startEntity in interface XMLEntityHandler
- startEntity in interface XMLScanner
name
- The name of the entity.
identifier
- The resource identifier.
encoding
- The auto-detected IANA encoding name of the entity
stream. This value will be null in those situations
where the entity encoding is not auto-detected (e.g.
internal entities or a document entity that is
parsed from a java.io.Reader).
augs
- Additional information that may include infoset augmentations.
org.apache.xerces.xni.XNIException
- Thrown by handler to signal an error.