Linkable.Analysis
Class HTMLAnalyzer
java.lang.Object
|
+--Linkable.Analysis.HTMLAnalyzer
- All Implemented Interfaces:
- RefLinkAnalyzer
- public class HTMLAnalyzer
- extends java.lang.Object
- implements RefLinkAnalyzer
Field Summary |
private static boolean |
DEBUG
|
private java.io.BufferedReader |
in
|
private static java.lang.String |
ME
|
private java.lang.String |
pubDate
|
(package private) org.w3c.tidy.Tidy |
tidy
|
(package private) java.io.BufferedInputStream |
tidyIn
|
(package private) java.io.FileOutputStream |
tidyOut
|
(package private) XHTMLAnalyzer |
xa
|
Constructor Summary |
HTMLAnalyzer(java.lang.String url)
Constructor |
HTMLAnalyzer(java.lang.String localURL,
java.lang.String url)
|
Method Summary |
java.util.Vector |
buildCitationList(java.lang.String docURN)
|
java.lang.String |
buildLocalMetaData(java.lang.String DOI,
java.lang.String pubDate,
Creation c)
|
Reference[] |
buildRefList(BibData b)
|
java.lang.String |
getDate()
|
java.lang.String |
getLinkedText(Reference[] refList,
java.lang.String url)
getLinkedText emits XML for the linked body of the text and/or the
characters of the text body followed by reference-link data suitable
for separate presentation. |
java.lang.String |
getLinkedTextFinalize()
getLinkedTextFinalize emits XML for finishing off the Surrogate
linked text output. |
java.lang.String |
getLinkedTextInitialize()
getLinkedTextInitialize sets up to generate XML for our Surrogate,
but not the incantation. |
private boolean |
runTidy(java.lang.String url)
|
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, registerNatives, toString, wait, wait, wait |
ME
private static final java.lang.String ME
DEBUG
private static final boolean DEBUG
in
private java.io.BufferedReader in
tidy
org.w3c.tidy.Tidy tidy
tidyIn
java.io.BufferedInputStream tidyIn
tidyOut
java.io.FileOutputStream tidyOut
xa
XHTMLAnalyzer xa
pubDate
private java.lang.String pubDate
HTMLAnalyzer
public HTMLAnalyzer(java.lang.String url)
throws SurrogateException
- Constructor
- Parameters:
url
- name of the file that contains the HTML to be converted
- Throws:
SurrogateException
- if the url cannot be opened for analysis
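The constructor contract above (open the named source, report failure as a checked exception) can be illustrated with a minimal sketch. This is not the HTMLAnalyzer source. `HtmlSource` and `OpenFailedException` are hypothetical stand-ins for HTMLAnalyzer and SurrogateException, and opening the file with a `FileReader` is an assumption suggested only by the `java.io.BufferedReader in` field.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical stand-in for SurrogateException; assumed here to be a checked exception.
class OpenFailedException extends Exception {
    OpenFailedException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical stand-in for HTMLAnalyzer, reduced to the "open or fail" behavior.
class HtmlSource {
    private final BufferedReader in;

    // Opens the named file; any I/O failure is rethrown as OpenFailedException.
    HtmlSource(String fileName) throws OpenFailedException {
        try {
            in = new BufferedReader(new FileReader(fileName));
        } catch (IOException e) {
            throw new OpenFailedException("cannot open " + fileName + " for analysis", e);
        }
    }

    // Returns the first line of the source, or null if it is empty.
    String firstLine() throws IOException {
        return in.readLine();
    }

    void close() throws IOException {
        in.close();
    }
}
```

A caller would wrap construction in a `try`/`catch` for the checked exception, in the same way that code using HTMLAnalyzer has to handle SurrogateException.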
HTMLAnalyzer
public HTMLAnalyzer(java.lang.String localURL,
java.lang.String url)
throws SurrogateException
getDate
public java.lang.String getDate()
- Specified by:
getDate
in interface RefLinkAnalyzer
buildLocalMetaData
public java.lang.String buildLocalMetaData(java.lang.String DOI,
java.lang.String pubDate,
Creation c)
- Specified by:
buildLocalMetaData
in interface RefLinkAnalyzer
buildRefList
public Reference[] buildRefList(BibData b)
- Specified by:
buildRefList
in interface RefLinkAnalyzer
buildCitationList
public java.util.Vector buildCitationList(java.lang.String docURN)
- Specified by:
buildCitationList
in interface RefLinkAnalyzer
getLinkedText
public java.lang.String getLinkedText(Reference[] refList,
java.lang.String url)
throws SurrogateException
- getLinkedText emits XML for the linked body of the text and/or the
characters of the text body followed by reference-link data suitable
for separate presentation. Note that the reference-link data can be
constructed by this routine but saved for output by the
getLinkedTextFinalize routine.
- Specified by:
getLinkedText
in interface RefLinkAnalyzer
- Parameters:
refList
- the array of Reference objects belonging to this Surrogate.
url
- the net URL of the document, used as a base URL.
- Throws:
SurrogateException
- if the URL to be analyzed cannot be opened.
getLinkedTextInitialize
public java.lang.String getLinkedTextInitialize()
- getLinkedTextInitialize sets up to generate XML for our Surrogate,
but not the incantation.
- Specified by:
getLinkedTextInitialize
in interface RefLinkAnalyzer
getLinkedTextFinalize
public java.lang.String getLinkedTextFinalize()
- getLinkedTextFinalize emits XML for finishing off the Surrogate
linked text output. The main use for this routine is to emit the
linkage data elements for documents that are not expressed in HTML
or in XHTML.
- Specified by:
getLinkedTextFinalize
in interface RefLinkAnalyzer
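The three linked-text methods read as a protocol: initialize, emit the body, then finalize, with reference-link data optionally built during getLinkedText and held back until getLinkedTextFinalize. The sketch below illustrates that call order under stated assumptions. `LinkedTextSource` is a hypothetical, simplified stand-in for RefLinkAnalyzer: it uses plain strings instead of Reference objects and omits SurrogateException. The XML element names are invented for the example and are not the Surrogate format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the linked-text part of RefLinkAnalyzer.
interface LinkedTextSource {
    String getLinkedTextInitialize();
    String getLinkedText(String[] refList, String url);
    String getLinkedTextFinalize();
}

// Toy analyzer: returns the body immediately and buffers link data for later.
class DeferredLinkAnalyzer implements LinkedTextSource {
    private final String body;
    private final List<String> pendingLinks = new ArrayList<>();

    DeferredLinkAnalyzer(String body) {
        this.body = body;
    }

    public String getLinkedTextInitialize() {
        pendingLinks.clear();
        return "<linkedtext>"; // invented element name
    }

    public String getLinkedText(String[] refList, String url) {
        // Build the link data now, but do not emit it yet.
        // No XML escaping is done; inputs are assumed to be plain text.
        for (int i = 0; i < refList.length; i++) {
            pendingLinks.add("<reflink n=\"" + (i + 1) + "\" base=\"" + url + "\">"
                    + refList[i] + "</reflink>");
        }
        return "<body>" + body + "</body>";
    }

    public String getLinkedTextFinalize() {
        // Emit the buffered link data, then close the output.
        StringBuilder sb = new StringBuilder();
        for (String link : pendingLinks) {
            sb.append(link);
        }
        sb.append("</linkedtext>");
        return sb.toString();
    }
}

class LinkedTextDemo {
    // Calls the three methods in the documented order and joins their output.
    static String render(LinkedTextSource a, String[] refs, String url) {
        return a.getLinkedTextInitialize()
                + a.getLinkedText(refs, url)
                + a.getLinkedTextFinalize();
    }

    public static void main(String[] args) {
        LinkedTextSource a = new DeferredLinkAnalyzer("Some text");
        System.out.println(render(a, new String[] {"Ref A"}, "http://example.org/doc"));
    }
}
```

Buffering in getLinkedText and flushing in getLinkedTextFinalize keeps the body and the link data separable, which matches the stated use for documents that are not expressed in HTML or XHTML.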
runTidy
private boolean runTidy(java.lang.String url)