Linkable.Analysis
Class AuthorSection
java.lang.Object
|
+--Linkable.Analysis.AuthorSection
public class AuthorSection
extends java.lang.Object
This class handles parsing of the author section of a document.
It decides whether to stay in the author section, and whether to skip
text or to look for an author name. The constructor takes an array of
tag strings that are used to look for a potential author or author
list. The SAX events (handleStartTag, handleEndTag, and handleText)
are mirrored here in AuthorSection, which handles them specifically
for the author section. None of these routines is called unless it is
known that the parser is in the author section. Once one of these
routines returns false, none of them is called again for this paper.
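The calling contract described above can be sketched roughly as follows. This is an illustration only, not the Linkable code: `SectionHandler`, `AuthorSectionDriver`, and `CollectingHandler` are hypothetical names, the attribute list is omitted, and the sketch assumes that a `false` return means the author section is finished.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the three mirrored SAX-style callbacks. */
interface SectionHandler {
    boolean handleStartTag(String name);
    void handleEndTag(String name);
    boolean handleText(String textString);
}

/** Hypothetical driver: forwards events only while the handler is still active. */
class AuthorSectionDriver {
    private final SectionHandler handler;
    private boolean inAuthorSection = true;

    AuthorSectionDriver(SectionHandler handler) {
        this.handler = handler;
    }

    void startTag(String name) {
        if (inAuthorSection) inAuthorSection = handler.handleStartTag(name);
    }

    void endTag(String name) {
        if (inAuthorSection) handler.handleEndTag(name);
    }

    void text(String s) {
        // Once any routine has returned false, nothing is forwarded again.
        if (inAuthorSection) inAuthorSection = handler.handleText(s);
    }

    boolean isInAuthorSection() {
        return inAuthorSection;
    }
}

/** Toy handler: collects text until it sees "Abstract" (an assumed end marker). */
class CollectingHandler implements SectionHandler {
    final List<String> seen = new ArrayList<>();

    public boolean handleStartTag(String name) { return true; }

    public void handleEndTag(String name) { }

    public boolean handleText(String textString) {
        if (textString.trim().equalsIgnoreCase("Abstract")) return false;
        seen.add(textString.trim());
        return true;
    }
}
```

In this sketch the driver, not the handler, enforces the rule that no further events arrive after a `false` return.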
Method Summary
private boolean addText(char[] text, int offset, int length, java.lang.String textString)
- Given a hunk of text from the author section of a paper that is
not a header (as determined by the parser), parses it into one or
more author names.
Author[] getAuthors()
- Returns an Author[] array of the authors seen by this
AuthorSection object so far.
protected boolean gotAuthors()
private void handleAuthor(java.lang.String textString)
protected void handleEndTag(java.lang.String name)
- Handles an end tag.
protected boolean handleStartTag(java.lang.String name, org.xml.sax.AttributeList attrs)
- Given a tag and its attributes, determines whether this could be
the start of a new author list.
protected boolean handleText(char[] text, int offset, int length, java.lang.String textString)
- Given a string of text, parses it as possibly being the start of
the document body; otherwise, if grabAuthor is true, parses it as an
author name or author list.
private static boolean isEndOfAuthorSection(java.lang.String textString)
- Examines a hunk of text, which should be a header (as determined
by the parser), and returns true if this could be the start of the
body of the text.
private void putAuthor(java.lang.String authorName)
void setTable(boolean sw)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, registerNatives, toString, wait, wait, wait
Field Detail
ME
private static final java.lang.String ME
DEBUG
private static final boolean DEBUG
v
private java.util.Vector v
possibleStartTags
private java.lang.String[] possibleStartTags
startAuthorTags
private java.util.Vector startAuthorTags
index
private int index
cs
private ContextSection cs
notInTable
private boolean notInTable
grabAuthor
private boolean grabAuthor
Constructor Detail
AuthorSection
public AuthorSection(ContextSection _cs,
java.lang.String[] st)
Method Detail
setTable
public void setTable(boolean sw)
handleStartTag
protected boolean handleStartTag(java.lang.String name,
org.xml.sax.AttributeList attrs)
- Given a tag and its attributes, determine whether this could be
the start of a new author list. If we are just beginning, this
is also where the sequence of tags that starts up an author
list is remembered.
- Parameters:
name - The name of the tag that has been encountered, e.g. "p"
attrs - The AttributeList on that tag (relevant for HTML, TeX)
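One way the start-tag check and the remembered start-up sequence might fit together is sketched below. The class `StartTagTracker`, the `justBeginning` flag, and the `firstAuthorTextSeen` hook are hypothetical; only the field names `possibleStartTags` and `startAuthorTags` come from this page, and the case-insensitive match is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not the actual AuthorSection implementation. */
class StartTagTracker {
    // Tags that may open an author or author list (passed to the constructor).
    private final String[] possibleStartTags;
    // Sequence of tags seen before the first author text (the start-up sequence).
    private final List<String> startAuthorTags = new ArrayList<>();
    // Assumed flag: true until the first author text has been seen.
    private boolean justBeginning = true;

    StartTagTracker(String[] possibleStartTags) {
        this.possibleStartTags = possibleStartTags.clone();
    }

    /** Returns true if this tag could be the start of a new author list. */
    boolean handleStartTag(String name) {
        if (justBeginning) {
            startAuthorTags.add(name); // remember the start-up sequence
        }
        for (String tag : possibleStartTags) {
            if (tag.equalsIgnoreCase(name)) {
                return true;
            }
        }
        return false;
    }

    /** Hypothetical hook: stop recording once author text has arrived. */
    void firstAuthorTextSeen() {
        justBeginning = false;
    }

    /** Returns a copy of the remembered start-up sequence. */
    List<String> getStartAuthorTags() {
        return new ArrayList<>(startAuthorTags);
    }
}
```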
handleEndTag
protected void handleEndTag(java.lang.String name)
- Handles an end tag by backing out of properly nested tags.
When we have backed out all the way, we can again consider tags that
start new authors or a Context Section.
- Parameters:
name - The name of the element or environment that is being ended.
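Backing out of nested tags can be pictured with a simple stack. The following is a minimal sketch under assumptions: `TagNesting` and its method names are hypothetical, and the handling of stray or unclosed tags goes beyond what this page documents.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of tracking nesting depth; not the actual implementation. */
class TagNesting {
    // Currently open tags, innermost first.
    private final Deque<String> open = new ArrayDeque<>();

    void startTag(String name) {
        open.push(name);
    }

    void endTag(String name) {
        if (!open.contains(name)) {
            return; // stray end tag with no matching start: ignore it
        }
        // Pop until the matching start tag has been removed; this also
        // discards any inner tags that were never closed.
        String top;
        do {
            top = open.pop();
        } while (!top.equals(name));
    }

    /** True once every open tag has been closed again. */
    boolean backedOut() {
        return open.isEmpty();
    }
}
```

With properly nested input, `backedOut()` becomes true exactly when the outermost tag is closed, which is the point where new author start tags would be considered again.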
handleText
protected boolean handleText(char[] text,
int offset,
int length,
java.lang.String textString)
- Given a string of text, parse it as possibly being the start of the
document body, else -- if grabAuthor is true -- parse it as an author
name or author list.
- Parameters:
textString - The string of text, including newlines and whitespace
isEndOfAuthorSection
private static boolean isEndOfAuthorSection(java.lang.String textString)
- isEndOfAuthorSection examines this hunk of text, which should
be a header (as determined by the parser) and returns true if
this could be the start of the body of the text.
- Parameters:
textString - The hunk of text as a String
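The page does not say which headers count as the start of the body, so the following is only a guess at the shape of such a check. The class name `BodyStartHeuristic` and the keyword list ("abstract", "introduction", "summary") are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical heuristic; the real keyword list is not documented. */
class BodyStartHeuristic {
    // Assumed header keywords that usually open the body of a paper.
    private static final List<String> BODY_HEADERS =
            Arrays.asList("abstract", "introduction", "summary");

    static boolean isEndOfAuthorSection(String textString) {
        if (textString == null) {
            return false;
        }
        // Normalize: trim, lowercase, and drop leading numbering such as "1." or "1)".
        String s = textString.trim()
                .toLowerCase(Locale.ROOT)
                .replaceFirst("^[0-9]+[.)]?\\s*", "");
        for (String header : BODY_HEADERS) {
            if (s.startsWith(header)) {
                return true;
            }
        }
        return false;
    }
}
```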
addText
private boolean addText(char[] text,
int offset,
int length,
java.lang.String textString)
- addText is given a hunk of text from the author section of a
paper that is not a header (as determined by the parser), and
parses it into one or more author names.
- Parameters:
text - Hunk of text in char[] form
offset - Where in the text the author string starts
length - How long the string is
textString - Hunk of text in String form
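How a hunk of text is split into names is not documented here. A simple approach, shown purely as a sketch, is to split on common separators; the class `AuthorTextSplitter` and the separator set (commas, semicolons, ampersands, and the word "and") are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of splitting an author hunk; not the actual addText logic. */
class AuthorTextSplitter {
    static List<String> splitAuthors(char[] text, int offset, int length) {
        // Extract the author string from the character buffer.
        String hunk = new String(text, offset, length);
        List<String> names = new ArrayList<>();
        // Assumed separators: ",", ";", "&", and the standalone word "and".
        for (String part : hunk.split("\\s*(?:[,;&]|\\band\\b)\\s*")) {
            // Collapse internal runs of whitespace, including newlines.
            String name = part.trim().replaceAll("\\s+", " ");
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }
}
```

The word boundaries around "and" keep names such as "Anderson" or "Sandy" intact.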
getAuthors
public Author[] getAuthors()
- getAuthors() returns an Author[] array of the authors
seen by this AuthorSection object so far.
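Given the private Vector field `v`, the "so far" behaviour could look like the sketch below. `AuthorStub` stands in for the real Author class, whose constructor is not shown on this page, and the meaning given to `gotAuthors()` (true once at least one author is stored) is an assumption.

```java
import java.util.Vector;

/** Hypothetical stand-in for the Author class. */
class AuthorStub {
    final String name;

    AuthorStub(String name) {
        this.name = name;
    }
}

/** Hypothetical sketch of collecting authors and exposing them as an array. */
class AuthorCollector {
    // Authors accumulated so far, in the order they were seen.
    private final Vector<AuthorStub> v = new Vector<>();

    void putAuthor(String authorName) {
        v.addElement(new AuthorStub(authorName));
    }

    // Assumed meaning: true once at least one author has been stored.
    boolean gotAuthors() {
        return !v.isEmpty();
    }

    /** Returns a snapshot array; later additions do not change it. */
    AuthorStub[] getAuthors() {
        return v.toArray(new AuthorStub[0]);
    }
}
```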
handleAuthor
private void handleAuthor(java.lang.String textString)
putAuthor
private void putAuthor(java.lang.String authorName)
gotAuthors
protected boolean gotAuthors()