Package | Description |
---|---|
org.htmlparser | The basic API classes which will be used by most developers when working with the HTML Parser. |
org.htmlparser.lexer | The lexer package is the base level I/O subsystem. |
Modifier and Type | Class and Description |
---|---|
class | PrototypicalNodeFactory: A node factory based on the prototype pattern. |
class | StringNodeFactory: Deprecated. Use PrototypicalNodeFactory#setTextPrototype(Text) instead. |

A more efficient way to affect all string nodes is to replace the Text node prototype in the PrototypicalNodeFactory. For example, if you were using:

    StringNodeFactory factory = new StringNodeFactory();
    factory.setDecode(true);

to decode all text issued from Text.toPlainTextString(), you would instead create a subclass of TextNode and set it as the prototype for text node generation:

    PrototypicalNodeFactory factory = new PrototypicalNodeFactory();
    factory.setTextPrototype(new TextNode() {
        public String toPlainTextString() {
            return org.htmlparser.util.Translate.decode(super.toPlainTextString());
        }
    });

Similar constructs apply to removing escapes and converting non-breaking spaces, which were the examples previously provided. Using a subclass avoids the wrapping and delegation inherent in the decorator pattern, with consequent improvements in processing speed and memory usage.
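The prototype approach described for PrototypicalNodeFactory can be illustrated without the library. The following is a minimal standalone sketch, not the org.htmlparser implementation: `SketchTextNode`, `SketchNodeFactory`, and `decode` are hypothetical stand-ins invented for this example, and the decoder handles only three entities. It shows why a subclassed prototype needs no wrapper object: every node the factory creates is a clone of the prototype, so it already carries the overridden method.

```java
// Hypothetical stand-ins for illustration; these are NOT the org.htmlparser classes.
class PrototypeFactorySketch {

    /** A text node that can clone itself, so it can serve as a prototype. */
    static class SketchTextNode implements Cloneable {
        private String text = "";

        void setText(String text) { this.text = text; }

        String toPlainTextString() { return text; }

        @Override
        public SketchTextNode clone() {
            try {
                // Object.clone() keeps the runtime class, including anonymous subclasses.
                return (SketchTextNode) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // unreachable: the class is Cloneable
            }
        }
    }

    /** A factory that creates text nodes by cloning a replaceable prototype. */
    static class SketchNodeFactory {
        private SketchTextNode textPrototype = new SketchTextNode();

        void setTextPrototype(SketchTextNode prototype) { this.textPrototype = prototype; }

        SketchTextNode createTextNode(String text) {
            SketchTextNode node = textPrototype.clone();
            node.setText(text);
            return node;
        }
    }

    /** Toy decoder for three entities only; "&amp;" is handled last on purpose. */
    static String decode(String s) {
        return s.replace("&lt;", "<").replace("&gt;", ">").replace("&amp;", "&");
    }

    public static void main(String[] args) {
        SketchNodeFactory factory = new SketchNodeFactory();
        // Replace the prototype with a subclass that decodes on the way out.
        factory.setTextPrototype(new SketchTextNode() {
            @Override
            String toPlainTextString() {
                return decode(super.toPlainTextString());
            }
        });
        SketchTextNode node = factory.createTextNode("a &lt; b &amp;&amp; c");
        System.out.println(node.toPlainTextString()); // prints: a < b && c
    }
}
```

In this sketch the decoding behavior travels with each cloned node, so there is one object per node rather than a node plus a decorator around it.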
Modifier and Type | Method and Description |
---|---|
NodeFactory | Parser.getNodeFactory(): Get the current node factory. |
Modifier and Type | Method and Description |
---|---|
void | Parser.setNodeFactory(NodeFactory factory): Set the current node factory. |
Modifier and Type | Class and Description |
---|---|
class | Lexer: This class parses the HTML stream into nodes. |
Modifier and Type | Field and Description |
---|---|
protected NodeFactory | Lexer.mFactory: The factory for new nodes. |
Modifier and Type | Method and Description |
---|---|
NodeFactory | Lexer.getNodeFactory(): Get the current node factory. |
Modifier and Type | Method and Description |
---|---|
void | Lexer.setNodeFactory(NodeFactory factory): Set the current node factory. |
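The relationship between a protected factory field and its getter and setter pair, as listed for Lexer, can be sketched in isolation. This is a hedged, standalone illustration and not the library's code: `SketchLexer`, `SketchFactory`, and the whitespace-splitting `lex` method are hypothetical, nodes are represented as plain strings, and rejecting a null factory is an assumption of this sketch rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are NOT the org.htmlparser classes.
class SwappableFactorySketch {

    /** Hypothetical factory interface: builds one node (here, a string) from text. */
    interface SketchFactory {
        String createNode(String text);
    }

    /** Hypothetical lexer that delegates node creation to its current factory. */
    static class SketchLexer {
        /** The factory for new nodes; protected so subclasses can reach it directly. */
        protected SketchFactory factory = text -> "Text(" + text + ")";

        SketchFactory getNodeFactory() { return factory; }

        void setNodeFactory(SketchFactory factory) {
            // Assumption of this sketch: fail fast instead of failing during lexing.
            if (factory == null) {
                throw new IllegalArgumentException("node factory cannot be null");
            }
            this.factory = factory;
        }

        /** Splits on whitespace and asks the current factory for one node per word. */
        List<String> lex(String input) {
            List<String> nodes = new ArrayList<>();
            for (String word : input.trim().split("\\s+")) {
                nodes.add(factory.createNode(word));
            }
            return nodes;
        }
    }

    public static void main(String[] args) {
        SketchLexer lexer = new SketchLexer();
        System.out.println(lexer.lex("hello world")); // prints: [Text(hello), Text(world)]

        // Wrap the current factory and install the replacement.
        SketchFactory original = lexer.getNodeFactory();
        lexer.setNodeFactory(text -> original.createNode(text.toUpperCase()));
        System.out.println(lexer.lex("hello world")); // prints: [Text(HELLO), Text(WORLD)]
    }
}
```

Reading the current factory before replacing it, as done here, lets the replacement reuse the existing behavior instead of reimplementing it.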
HTML Parser is an open source library released under LGPL.