java.lang.Object
  |
  +--javax.swing.text.html.HTMLEditorKit.ParserCallback
        |
        +--org.pricingnexus.tools.HtmlToDOMTransform
Tiny but nice helper class: from a stream or String [XXX: must be developed!], this object tries to read a complete HTML page, convert it into a JDOM Document object, and return a reference to that object. The core functionality is handled by the javax.swing.text.html.parser package, a more internal part of the Swing API, because it provides us with "well-formed" HTML, i.e. everything is opened and closed as in XML. The only remaining task is therefore to handle the callbacks and construct a Document object from them. The idea for this type of implementation came after reading the chapter about HTML parsing in the book "Java Network Programming" by E. R. Harold.
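To make the callback mechanism described above concrete, the following is a minimal sketch that only records the events the Swing parser delivers. It is not the source of HtmlToDOMTransform; the class name CallbackEventLog and the helper method record are hypothetical names chosen for illustration. It assumes the standard ParserDelegator from javax.swing.text.html.parser is used to drive the callbacks.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical illustration class: logs each parser callback as a string.
class CallbackEventLog extends HTMLEditorKit.ParserCallback {
    final List<String> events = new ArrayList<>();

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attributes, int position) {
        events.add("start:" + tag);
    }

    @Override
    public void handleEndTag(HTML.Tag tag, int position) {
        events.add("end:" + tag);
    }

    @Override
    public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attributes, int position) {
        // Empty elements such as <br> arrive here instead of as a start/end pair.
        events.add("simple:" + tag);
    }

    @Override
    public void handleText(char[] text, int position) {
        events.add("text:" + new String(text));
    }

    // Parses the given HTML string and returns the recorded events in order.
    static List<String> record(String html) {
        CallbackEventLog log = new CallbackEventLog();
        try (Reader reader = new StringReader(html)) {
            // The third argument tells the parser to ignore any charset directive.
            new ParserDelegator().parse(reader, log, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return log.events;
    }

    public static void main(String[] args) {
        record("<html><body><p>Hello<br>world</p></body></html>")
                .forEach(System.out::println);
    }
}
```

Running main prints one line per callback, which shows the order in which a ParserCallback subclass sees start tags, end tags, simple tags, and text.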
Fields inherited from class javax.swing.text.html.HTMLEditorKit.ParserCallback |
IMPLIED |
Constructor Summary | |
HtmlToDOMTransform()
Default constructor of class. |
Method Summary | |
void |
cleanUp()
Clean up the Document object |
void |
flush()
Probably do nothing when flushing occurs |
org.jdom.Document |
getJDOM()
If parsing was successful, a reference to the JDOM object can be retrieved via this method. |
void |
handleComment(char[] text,
int position)
We don't care about comments |
void |
handleEndOfLineString(java.lang.String theString)
Do nothing at EOL - is this correct? |
void |
handleEndTag(javax.swing.text.html.HTML.Tag tag,
int position)
The close tag is easy: using the getParent method of Element, we climb back one level in the tree |
void |
handleError(java.lang.String errorMessage,
int position)
Error handling is an open issue for further development |
void |
handleSimpleTag(javax.swing.text.html.HTML.Tag tag,
javax.swing.text.MutableAttributeSet attributes,
int position)
Simple tags become new elements with just the name of the tag as the element name. |
void |
handleStartTag(javax.swing.text.html.HTML.Tag tag,
javax.swing.text.MutableAttributeSet attributes,
int position)
The following six methods implement the callbacks from the Parser class. A new "open tag" event triggers: the creation of a new child below the currentElement; the assignment of the tag name to the element name; and finally the reassignment of the currentElement to this child. |
void |
handleText(char[] text,
int position)
Also quite straightforward: just add the text as new content to the currentElement |
org.w3c.dom.Document |
parseToDOM()
This one works like parseToJDOM() but returns an org.w3c.dom.Document instead |
org.jdom.Document |
parseToJDOM()
The important method: retrieves data via the stream and tries to parse it into a JDOM object. |
void |
setReader(java.io.Reader theReader)
The initializer method. |
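The tree-building steps summarized for the tag and text callbacks above can be sketched as follows. This is an assumption-laden illustration, not the actual implementation: it builds an org.w3c.dom.Document with the JDK's own DOM classes instead of JDOM so that it runs without extra libraries, and the names TreeBuildingSketch, current, and parse are hypothetical. Unknown tags, attributes, comments, and errors are deliberately ignored.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical sketch: W3C DOM stands in for JDOM; "current" plays the role of currentElement.
class TreeBuildingSketch extends HTMLEditorKit.ParserCallback {
    private final Document document;
    private Node current; // starts at the Document itself, before any root element exists

    TreeBuildingSketch() throws ParserConfigurationException {
        document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        current = document;
    }

    // A DOM Document accepts only one element child, so late top-level
    // content is attached below the existing root element instead.
    private Node parentForNewNode() {
        Element root = document.getDocumentElement();
        return (current == document && root != null) ? root : current;
    }

    @Override
    public void handleStartTag(HTML.Tag tag, MutableAttributeSet attributes, int position) {
        // Create a child named after the tag, then descend into it.
        Element child = document.createElement(tag.toString());
        parentForNewNode().appendChild(child);
        current = child;
    }

    @Override
    public void handleEndTag(HTML.Tag tag, int position) {
        // Climb back one level in the tree.
        if (current.getParentNode() != null) {
            current = current.getParentNode();
        }
    }

    @Override
    public void handleSimpleTag(HTML.Tag tag, MutableAttributeSet attributes, int position) {
        // A simple tag becomes a childless element; the current position does not change.
        parentForNewNode().appendChild(document.createElement(tag.toString()));
    }

    @Override
    public void handleText(char[] text, int position) {
        Node parent = parentForNewNode();
        if (parent != document) { // text cannot be a direct child of a Document
            parent.appendChild(document.createTextNode(new String(text)));
        }
    }

    // Parses an HTML string and returns the resulting DOM document.
    static Document parse(String html) {
        try (Reader reader = new StringReader(html)) {
            TreeBuildingSketch callback = new TreeBuildingSketch();
            new ParserDelegator().parse(reader, callback, true);
            return callback.document;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<html><body><p>Hello<br>world</p></body></html>");
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

Keeping a single "current" pointer is enough here because the parser reports start and end tags in nesting order, so each end tag simply moves the pointer back to the parent.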
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
public HtmlToDOMTransform()
Method Detail |
public void setReader(java.io.Reader theReader)
public void handleStartTag(javax.swing.text.html.HTML.Tag tag, javax.swing.text.MutableAttributeSet attributes, int position)
Overrides: handleStartTag in class javax.swing.text.html.HTMLEditorKit.ParserCallback
public void handleEndTag(javax.swing.text.html.HTML.Tag tag, int position)
Overrides: handleEndTag in class javax.swing.text.html.HTMLEditorKit.ParserCallback
public void handleSimpleTag(javax.swing.text.html.HTML.Tag tag, javax.swing.text.MutableAttributeSet attributes, int position)
Overrides: handleSimpleTag in class javax.swing.text.html.HTMLEditorKit.ParserCallback
public void handleText(char[] text, int position)
Overrides: handleText in class javax.swing.text.html.HTMLEditorKit.ParserCallback
public void handleComment(char[] text, int position)
Overrides: handleComment in class javax.swing.text.html.HTMLEditorKit.ParserCallback
public void handleError(java.lang.String errorMessage, int position)
Overrides: handleError in class javax.swing.text.html.HTMLEditorKit.ParserCallback
public void flush()
Overrides: flush in class javax.swing.text.html.HTMLEditorKit.ParserCallback
public void handleEndOfLineString(java.lang.String theString)
Overrides: handleEndOfLineString in class javax.swing.text.html.HTMLEditorKit.ParserCallback
public org.jdom.Document parseToJDOM() throws java.io.IOException
public org.w3c.dom.Document parseToDOM() throws java.io.IOException
public void cleanUp()
public org.jdom.Document getJDOM()