sunlabs.brazil.template

Class ContentTemplate

public class ContentTemplate extends Template

Template class for extracting content out of remote HTML pages. This class is used by the TemplateHandler to extract the "content" from HTML documents for later integration with a look-and-feel template using one or more of SetTemplate, BSLTemplate, or sunlabs.brazil.filter.ReplaceFilter. The plan is to snag the title and the content, and put them into request properties. The resultant processed output is discarded. The following properties are gathered:
title
The document title
all
The entire content
bodyArgs
The attributes to the body tag, if any
content
The body, delimited by <content>...</content>. The text inside multiple <content> ... </content> pairs is concatenated together.
script
All "<script>"..."</script>" tags found in the document head
scriptSrcs
A white-space delimited list of all "src" attributes found in "script" tags.
style
All "<style>"..."</style>" tags found in the document head
meta-[name]
Every meta tag "name" and "content"
link-[rel]
Every link tag "rel" and "href"
user-agent
The origin user agent
referer
The user agent referrer (if any)
last-modified
The document last-modified time (if any) in standard format
content-length
The document content length, as fetched from the origin server
Properties:
prepend
Prepend this string to the property names defined above that are populated by this template (defaults to "").
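
The naming scheme above can be illustrated with a small stand-alone sketch. It is not the Brazil implementation: the class name ContentSketch, the method extract, and the regex-based parsing are all assumptions made for illustration, and only the "title" and "content" properties are covered. It shows the text of several <content> pairs being concatenated, the fallback to the whole body when no <content> tag is present, and the effect of the "prepend" string on the property names.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: a regex-based stand-in, not the Brazil template machinery. */
final class ContentSketch {
    private static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;
    private static final Pattern TITLE = Pattern.compile("<title[^>]*>(.*?)</title>", FLAGS);
    private static final Pattern CONTENT = Pattern.compile("<content[^>]*>(.*?)</content>", FLAGS);
    private static final Pattern BODY = Pattern.compile("<body[^>]*>(.*?)</body>", FLAGS);

    /** Returns "title" and "content" properties, each name prefixed with prepend. */
    static Properties extract(String html, String prepend) {
        Properties props = new Properties();
        Matcher title = TITLE.matcher(html);
        if (title.find()) {
            props.setProperty(prepend + "title", title.group(1));
        }
        // Concatenate the text of every <content> ... </content> pair.
        StringBuilder content = new StringBuilder();
        boolean sawContent = false;
        Matcher m = CONTENT.matcher(html);
        while (m.find()) {
            sawContent = true;
            content.append(m.group(1));
        }
        // With no content tags present, fall back to the entire body.
        if (!sawContent) {
            Matcher body = BODY.matcher(html);
            if (body.find()) {
                content.append(body.group(1));
            }
        }
        props.setProperty(prepend + "content", content.toString());
        return props;
    }

    public static void main(String[] args) {
        String page = "<html><head><title>Demo</title></head><body>"
            + "<content>Hello, </content>nav<content>world</content></body></html>";
        Properties p = extract(page, "page.");
        System.out.println(p.getProperty("page.title"));   // Demo
        System.out.println(p.getProperty("page.content")); // Hello, world
    }
}
```

With an empty "prepend" string the same values would appear under the plain names "title" and "content".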

Version: 2.2

Author: Stephen Uhler

Method Summary
boolean done(RewriteContext hr)
Extract useful properties out of the HTTP MIME headers.
boolean init(RewriteContext hr)
void tag_body(RewriteContext hr)
Grab the "body" attributes, and toss all output to this point.
void tag_content(RewriteContext hr)
Toss everything up to and including here, but turn on content accumulation.
void tag_link(RewriteContext hr)
Extract data out of link tags into the properties.
void tag_meta(RewriteContext hr)
Extract data out of meta tags into the properties.
void tag_script(RewriteContext hr)
Append all "script" code while in the head section.
void tag_slash_body(RewriteContext hr)
If no content tags are present, use the entire "body" instead.
void tag_slash_content(RewriteContext hr)
Save the content gathered so far, and turn off content accumulation.
void tag_slash_head(RewriteContext hr)
Mark end of head section.
void tag_slash_title(RewriteContext hr)
Gather up the title - no tags allowed between <title> ... </title>.
void tag_style(RewriteContext hr)
Append all "style" code while in the head section.
void tag_title(RewriteContext hr)
Toss everything up to and including this entity.

Method Detail

done

public boolean done(RewriteContext hr)
Extract useful properties out of the HTTP MIME headers.

init

public boolean init(RewriteContext hr)

tag_body

public void tag_body(RewriteContext hr)
Grab the "body" attributes, and toss all output to this point.

tag_content

public void tag_content(RewriteContext hr)
Toss everything up to and including here, but turn on content accumulation.

tag_link

public void tag_link(RewriteContext hr)
Extract data out of link tags into the properties. Prefix the "rel" attribute with "link-" to use as the property name.

tag_meta

public void tag_meta(RewriteContext hr)
Extract data out of meta tags into the properties. For "http-equiv" tags, set the corresponding HTTP response header.
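
As a rough illustration of this split, the sketch below scans a page for meta tags and files each one either under a "meta-[name]" property or as a response header. It is a stand-alone approximation, not the Brazil code: the class MetaSketch, its methods, the use of plain maps for properties and headers, and the restriction to double-quoted attributes are all assumptions. It also assumes that an "http-equiv" tag is routed only to the headers, which the description above does not state either way.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: hypothetical helper, not part of sunlabs.brazil. */
final class MetaSketch {
    private static final Pattern META =
        Pattern.compile("<meta\\b([^>]*)>", Pattern.CASE_INSENSITIVE);
    // Matches name="value" pairs; only double-quoted values are handled.
    private static final Pattern ATTR =
        Pattern.compile("([\\w-]+)\\s*=\\s*\"([^\"]*)\"");

    /** Parses the attributes of one tag into a map keyed by lower-cased name. */
    static Map<String, String> attributes(String tagBody) {
        Map<String, String> attrs = new LinkedHashMap<>();
        Matcher m = ATTR.matcher(tagBody);
        while (m.find()) {
            attrs.put(m.group(1).toLowerCase(Locale.ROOT), m.group(2));
        }
        return attrs;
    }

    /** Files each meta tag under "meta-[name]" or, for http-equiv, as a header. */
    static void scan(String html, Map<String, String> props,
                     Map<String, String> responseHeaders) {
        Matcher m = META.matcher(html);
        while (m.find()) {
            Map<String, String> a = attributes(m.group(1));
            String content = a.get("content");
            if (content == null) {
                continue; // nothing to record without a content attribute
            }
            if (a.containsKey("http-equiv")) {
                // Assumption: http-equiv tags only set the response header.
                responseHeaders.put(a.get("http-equiv"), content);
            } else if (a.containsKey("name")) {
                props.put("meta-" + a.get("name"), content);
            }
        }
    }
}
```

Under these assumptions, a tag such as `<meta name="author" content="Ann">` would yield the property "meta-author" with the value "Ann".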

tag_script

public void tag_script(RewriteContext hr)
Append all "script" code while in the head section. If the script has a "src" attribute, we'll put the "src" in a variable so the template can deal with it (them?). For now, ignore it.
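
One way to picture the "script" and "scriptSrcs" properties is the following stand-alone sketch. It is not the Brazil implementation: the class ScriptSketch, the method collect, and the regex parsing are hypothetical, and the choice to scan the whole document when no closing head tag exists is an assumption. It appends every script element found before the end of the head and gathers the "src" attributes into a white-space delimited list.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: hypothetical helper, not part of sunlabs.brazil. */
final class ScriptSketch {
    private static final Pattern HEAD_END =
        Pattern.compile("</head>", Pattern.CASE_INSENSITIVE);
    private static final Pattern SCRIPT = Pattern.compile(
        "<script\\b([^>]*)>(.*?)</script>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
    // Only double-quoted src values are handled in this sketch.
    private static final Pattern SRC = Pattern.compile(
        "(?:^|\\s)src\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    /** Returns a two-element array: { script, scriptSrcs }. */
    static String[] collect(String html) {
        // Script tags after the end of the head are left alone.
        Matcher end = HEAD_END.matcher(html);
        String head = end.find() ? html.substring(0, end.start()) : html;

        StringBuilder scripts = new StringBuilder();
        StringBuilder srcs = new StringBuilder();
        Matcher m = SCRIPT.matcher(head);
        while (m.find()) {
            scripts.append(m.group(0)); // the whole <script> ... </script> element
            Matcher src = SRC.matcher(m.group(1));
            if (src.find()) {
                if (srcs.length() > 0) {
                    srcs.append(' ');
                }
                srcs.append(src.group(1));
            }
        }
        return new String[] { scripts.toString(), srcs.toString() };
    }
}
```

For a head containing script tags with src values "a.js" and "b.js", the second element of the result would be "a.js b.js"; a script tag inside the body would contribute to neither value.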

tag_slash_body

public void tag_slash_body(RewriteContext hr)
If no content tags are present, use the entire "body" instead.

tag_slash_content

public void tag_slash_content(RewriteContext hr)
Save the content gathered so far, and turn off content accumulation.

tag_slash_head

public void tag_slash_head(RewriteContext hr)
Mark end of head section. All "script" content in the "body" is left alone.

tag_slash_title

public void tag_slash_title(RewriteContext hr)
Gather up the title - no tags allowed between <title> ... </title>.

tag_style

public void tag_style(RewriteContext hr)
Append all "style" code while in the head section.

tag_title

public void tag_title(RewriteContext hr)
Toss everything up to and including this entity.